linux_dsm_epyc7002

mirror of https://github.com/AuxXxilium/linux_dsm_epyc7002.git synced 2024-12-15 20:06:43 +07:00

Author	SHA1	Message	Date
Sebastian Ott	81deca12c2	s390/pci: move io address mapping code to pci_insn.c This is a preparation patch for usage of new pci instructions. No functional change. Signed-off-by: Sebastian Ott <sebott@linux.ibm.com> Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>	2019-04-29 10:47:01 +02:00
Sebastian Ott	fbfe07d440	s390/pci: add parameter to force floating irqs Provide a kernel parameter to force the usage of floating interrupts. Signed-off-by: Sebastian Ott <sebott@linux.ibm.com> Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>	2019-04-29 10:47:01 +02:00
Sebastian Ott	07e3ec3acb	s390/pci: gather statistics for floating vs directed irqs Gather statistics to distinguish floating and directed interrupts. Signed-off-by: Sebastian Ott <sebott@linux.ibm.com> Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>	2019-04-29 10:47:01 +02:00
Sebastian Ott	914b7dd07e	s390: show statistics for MSI IRQs Improve /proc/interrupts on s390 to show statistics for individual MSI interrupts. Signed-off-by: Sebastian Ott <sebott@linux.vnet.ibm.com> Acked-by: Heiko Carstens <heiko.carstens@de.ibm.com> Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>	2019-04-29 10:47:01 +02:00
Sebastian Ott	e979ce7bce	s390/pci: provide support for CPU directed interrupts Up until now all interrupts on s390 have been floating. For MSI interrupts we've used a global summary bit vector (with a bit for each function) and a per-function interrupt bit vector (with a bit per MSI). This patch introduces a new IRQ delivery mode: CPU directed interrupts. In this new mode a per-CPU interrupt bit vector is used (with a bit per MSI per function). Further it is now possible to direct an IRQ to a specific CPU so we can finally support IRQ affinity. If an interrupt can't be delivered because the appointed CPU is occupied by a hypervisor the interrupt is delivered floating. For this a global summary bit vector is used (with a bit per CPU). Signed-off-by: Sebastian Ott <sebott@linux.ibm.com> Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>	2019-04-29 10:47:01 +02:00
Sebastian Ott	414cbd1e3d	s390/airq: provide cacheline aligned ivs Provide the ability to create cachesize aligned interrupt vectors. These will be used for per-CPU interrupt vectors. Signed-off-by: Sebastian Ott <sebott@linux.ibm.com> Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>	2019-04-29 10:47:01 +02:00
Sebastian Ott	30e63ef2ef	s390/airq: recognize directed interrupts Add an extra parameter for airq handlers to recognize floating vs. directed interrupts. Signed-off-by: Sebastian Ott <sebott@linux.ibm.com> Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>	2019-04-29 10:47:01 +02:00
Sebastian Ott	0a9fddfaa8	s390/sclp: detect DIRQ facility Detect the adapter CPU directed interruption facility. Signed-off-by: Sebastian Ott <sebott@linux.ibm.com> Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>	2019-04-29 10:47:01 +02:00
Sebastian Ott	c840927cf5	s390/pci: move everything irq related to pci_irq.c Move everything interrupt related from pci.c to pci_irq.c. No functional change. Signed-off-by: Sebastian Ott <sebott@linux.ibm.com> Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>	2019-04-29 10:47:01 +02:00
Philipp Rudo	c9896acc78	s390/ipl: Provide has_secure sysfs attribute Provide an interface for userspace so it can find out if a machine is capeable of doing secure boot. The interface is, for example, needed for zipl so it can find out which file format it can/should write to disk. Signed-off-by: Philipp Rudo <prudo@linux.ibm.com> Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>	2019-04-29 10:44:04 +02:00
Philipp Rudo	99feaa717e	s390/kexec_file: Create ipl report and pass to next kernel Signed-off-by: Philipp Rudo <prudo@linux.ibm.com> Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>	2019-04-29 10:44:02 +02:00
Philipp Rudo	e23a8020ce	s390/kexec_file: Signature verification prototype Add kernel signature verification to kexec_file. The verification is based on module signature verification and works with kernel images signed via scripts/sign-file. Signed-off-by: Philipp Rudo <prudo@linux.ibm.com> Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>	2019-04-29 10:44:01 +02:00
Philipp Rudo	653beba24d	s390/kexec_file: Load new kernel to absolute 0 The leading 64 kB of a kernel image doesn't contain any data needed to boot the new kernel when it was loaded via kexec_file. Thus kexec_file currently strips them off before loading the image. Keep the leading 64 kB in order to be able to pass a ipl_report to the next kernel. Signed-off-by: Philipp Rudo <prudo@linux.ibm.com> Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>	2019-04-29 10:44:00 +02:00
Philipp Rudo	8e49642613	s390/kexec_file: Unify loader code s390_image_load and s390_elf_load have the same code to load the different components. Combine this functionality in one shared function. While at it move kexec_file_update_kernel into the new function as well. Signed-off-by: Philipp Rudo <prudo@linux.ibm.com> Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>	2019-04-29 10:43:59 +02:00
Philipp Rudo	d0d249d75d	s390/kexec_file: Simplify parmarea access Access the parmarea in head.S via a struct instead of individual offsets. While at it make the fields in the parmarea .quads. Signed-off-by: Philipp Rudo <prudo@linux.ibm.com> Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>	2019-04-29 10:43:57 +02:00
Martin Schwidefsky	937347ac56	s390/ipl: add helper functions to create an IPL report PR: Adjusted to the use in kexec_file later. Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com> Signed-off-by: Philipp Rudo <prudo@linux.ibm.com> Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>	2019-04-26 12:34:05 +02:00
Martin Schwidefsky	9641b8cc73	s390/ipl: read IPL report at early boot Read the IPL Report block provided by secure-boot, add the entries of the certificate list to the system key ring and print the list of components. PR: Adjust to Vasilys bootdata_preserved patch set. Preserve ipl_cert_list for later use in kexec_file. Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com> Signed-off-by: Philipp Rudo <prudo@linux.ibm.com> Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>	2019-04-26 12:34:05 +02:00
Martin Schwidefsky	d29af5b7a8	s390/ipl: add definitions for the IPL report block To transport the information required for secure boot a new IPL report will be created at boot time. It will be written to memory right after the IPL parameter block. To work with the IPL report a couple of additional structure definitions are added the the uapi/ipl.h header. Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>	2019-04-26 12:34:05 +02:00
Martin Schwidefsky	5f1207fbe7	s390/ipl: provide uapi header for list directed IPL The IPL parameter block is used as an interface between Linux and the machine to query and change the boot device and boot options. To be able to create IPL parameter block in user space and pass it as segment to kexec provide an uapi header with proper structure definitions for the block. Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>	2019-04-26 12:34:05 +02:00
Martin Schwidefsky	86c74d869d	s390/ipl: make ipl_info less confusing The ipl_info union in struct ipl_parameter_block has the same name as the struct ipl_info. This does not help while reading the code and the union in struct ipl_parameter_block does not need to be named. Drop the name from the union. Reviewed-by: Hendrik Brueckner <brueckner@linux.ibm.com> Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>	2019-04-26 12:34:05 +02:00
Christian Borntraeger	8b905d28ee	KVM: s390: provide kvm_arch_no_poll function We do track the current steal time of the host CPUs. Let us use this value to disable halt polling if the steal time goes beyond a configured value. Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com> Acked-by: Paolo Bonzini <pbonzini@redhat.com> Reviewed-by: Cornelia Huck <cohuck@redhat.com> Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>	2019-04-26 09:08:17 +02:00
Martin Schwidefsky	a8fd61688d	s390: report new CPU capabilities Add hardware capability bits and features tags to /proc/cpuinfo for 4 new CPU features: "Vector-Enhancements Facility 2" (tag "vxe2", hwcap 2^15) "Vector-Packed-Decimal-Enhancement Facility" (tag "vxp", hwcap 2^16) "Enhanced-Sort Facility" (tag "sort", hwcap 2^17) "Deflate-Conversion Facility" (tag "dflt", hwcap 2^18) Reviewed-by: Heiko Carstens <heiko.carstens@de.ibm.com> Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>	2019-04-25 15:34:10 +02:00
Christian Borntraeger	8ec2fa52ea	KVM: s390: enable MSA9 keywrapping functions depending on cpu model Instead of adding a new machine option to disable/enable the keywrapping options of pckmo (like for AES and DEA) we can now use the CPU model to decide. As ECC is also wrapped with the AES key we need that to be enabled. Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com> Reviewed-by: David Hildenbrand <david@redhat.com>	2019-04-25 02:26:21 -04:00
Christian Borntraeger	4f45b90e1c	KVM: s390: add deflate conversion facilty to cpu model This enables stfle.151 and adds the subfunctions for DFLTCC. Bit 151 is added to the list of facilities that will be enabled when there is no cpu model involved as DFLTCC requires no additional handling from userspace, e.g. for migration. Please note that a cpu model enabled user space can and will have the final decision on the facility bits for a guests. Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com> Reviewed-by: Collin Walling <walling@linux.ibm.com> Reviewed-by: David Hildenbrand <david@redhat.com>	2019-04-25 02:24:17 -04:00
Martin Schwidefsky	c9f621524e	s390/mm: fix pxd_bad with folded page tables With git commit `d1874a0c28` "s390/mm: make the pxd_offset functions more robust" and a 2-level page table it can now happen that pgd_bad() gets asked to verify a large segment table entry. If the entry is marked as dirty pgd_bad() will incorrectly return true. Change the pgd_bad(), p4d_bad(), pud_bad() and pmd_bad() functions to first verify the table type, return false if the table level is lower than what the function is suppossed to check, return true if the table level is too high, and otherwise check the relevant region and segment table bits. pmd_bad() has to check against ~SEGMENT_ENTRY_BITS for normal page table pointers or ~SEGMENT_ENTRY_BITS_LARGE for large segment table entries. Same for pud_bad() which has to check against ~_REGION_ENTRY_BITS or ~_REGION_ENTRY_BITS_LARGE. Fixes: `d1874a0c28` ("s390/mm: make the pxd_offset functions more robust") Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>	2019-04-24 13:28:50 +02:00
Vasily Gorbik	01eb42afb4	s390/kasan: fix strncpy_from_user kasan checks arch/s390/lib/uaccess.c is built without kasan instrumentation. Kasan checks are performed explicitly in copy_from_user/copy_to_user functions. But since those functions could be inlined, calls from files like uaccess.c with instrumentation disabled won't generate kasan reports. This is currently the case with strncpy_from_user function which was revealed by newly added kasan test. Avoid inlining of copy_from_user/copy_to_user when the kernel is built with kasan support to make sure kasan checks are fully functional. Signed-off-by: Vasily Gorbik <gor@linux.ibm.com> Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>	2019-04-24 13:28:46 +02:00
Christoph Hellwig	c67fdc1f00	arch: mostly remove <asm/segment.h> A few architectures use <asm/segment.h> internally, but nothing in common code does. Remove all the empty or almost empty versions of it, including the asm-generic one. Signed-off-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Arnd Bergmann <arnd@arndb.de>	2019-04-23 21:51:40 +02:00
Martin Schwidefsky	1a42010cdc	s390/mm: convert to the generic get_user_pages_fast code Define the gup_fast_permitted to check against the asce_limit of the mm attached to the current task, then replace the s390 specific gup code with the generic implementation in mm/gup.c. Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>	2019-04-23 16:30:04 +02:00
Martin Schwidefsky	d1874a0c28	s390/mm: make the pxd_offset functions more robust Change the way how pgd_offset, p4d_offset, pud_offset and pmd_offset walk the page tables. pgd_offset now always calculates the index for the top-level page table and adds it to the pgd, this is either a segment table offset for a 2-level setup, a region-3 offset for 3-levels, region-2 offset for 4-levels, or a region-1 offset for a 5-level setup. The other three functions p4d_offset, pud_offset and pmd_offset will only add the respective offset if they dereference the passed pointer. With the new way of walking the page tables a sequence like this from mm/gup.c now works: pgdp = pgd_offset(current->mm, addr); pgd = READ_ONCE(pgdp); p4dp = p4d_offset(&pgd, addr); p4d = READ_ONCE(p4dp); pudp = pud_offset(&p4d, addr); pud = READ_ONCE(pudp); pmdp = pmd_offset(&pud, addr); pmd = READ_ONCE(pmdp); Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>	2019-04-23 16:30:03 +02:00
Christian Borntraeger	173aec2d5a	KVM: s390: add enhanced sort facilty to cpu model This enables stfle.150 and adds the subfunctions for SORTL. Bit 150 is added to the list of facilities that will be enabled when there is no cpu model involved as sortl requires no additional handling from userspace, e.g. for migration. Please note that a cpu model enabled user space can and will have the final decision on the facility bits for a guests. Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com> Reviewed-by: Collin Walling <walling@linux.ibm.com> Reviewed-by: David Hildenbrand <david@redhat.com>	2019-04-18 12:57:53 +02:00
Christian Borntraeger	13209ad039	KVM: s390: add MSA9 to cpumodel This enables stfle.155 and adds the subfunctions for KDSA. Bit 155 is added to the list of facilities that will be enabled when there is no cpu model involved as MSA9 requires no additional handling from userspace, e.g. for migration. Please note that a cpu model enabled user space can and will have the final decision on the facility bits for a guests. Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com> Acked-by: Janosch Frank <frankja@linux.ibm.com> Reviewed-by: Collin Walling <walling@linux.ibm.com> Reviewed-by: David Hildenbrand <david@redhat.com>	2019-04-18 10:14:11 +02:00
Arnd Bergmann	6e042492a2	s390: avoid __builtin_return_address(n) on clang llvm on s390 has problems with __builtin_return_address(n), with n>0, this results in a somewhat cryptic error message: fatal error: error in backend: Unsupported stack frame traversal count To work around it, use the direct return address directly. This is probably not ideal here, but gets things to compile and should only lead to inferior reporting, not to misbehavior of the generated code. Link: https://bugs.llvm.org/show_bug.cgi?id=41424 Signed-off-by: Arnd Bergmann <arnd@arndb.de> Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>	2019-04-11 13:37:29 +02:00
Arnd Bergmann	0a113efc3b	s390: make __load_psw_mask work with clang clang fails to use the %O and %R inline assembly modifiers the same way as gcc, leading to build failures with every use of __load_psw_mask(): /tmp/nmi-4a9f80.s: Assembler messages: /tmp/nmi-4a9f80.s:571: Error: junk at end of line: `+8(160(%r11))' /tmp/nmi-4a9f80.s:626: Error: junk at end of line: `+8(160(%r11))' Replace these with a more conventional way of passing the addresses that should work with both clang and gcc. Signed-off-by: Arnd Bergmann <arnd@arndb.de> Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>	2019-04-11 13:36:52 +02:00
Arnd Bergmann	efb150df1d	s390: syscall_wrapper: avoid clang warning Building system calls with clang results in a warning about an alias from a global function to a static one: ../fs/namei.c:3847:1: warning: unused function '__se_sys_mkdirat' [-Wunused-function] SYSCALL_DEFINE3(mkdirat, int, dfd, const char __user *, pathname, umode_t, mode) ^ ../include/linux/syscalls.h:219:36: note: expanded from macro 'SYSCALL_DEFINE3' #define SYSCALL_DEFINE3(name, ...) SYSCALL_DEFINEx(3, _##name, __VA_ARGS__) ^ ../include/linux/syscalls.h:228:2: note: expanded from macro 'SYSCALL_DEFINEx' __SYSCALL_DEFINEx(x, sname, __VA_ARGS__) ^ ../arch/s390/include/asm/syscall_wrapper.h:126:18: note: expanded from macro '__SYSCALL_DEFINEx' asmlinkage long __se_sys##name(__MAP(x,__SC_LONG,__VA_ARGS__)) \ ^ <scratch space>:31:1: note: expanded from here __se_sys_mkdirat ^ The only reference to the static __se_sys_mkdirat() here is the alias, but this only gets evaluated later. Making this function global as well avoids the warning. Signed-off-by: Arnd Bergmann <arnd@arndb.de> Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>	2019-04-11 13:36:51 +02:00
Martin Schwidefsky	7aa0055e06	s390: fine-tune stack switch helper The CALL_ON_STACK helper currently does not work with clang and for calls without arguments. It does not initialize r2 although the constraint is "+&d". Rework the CALL_FMT_x and the CALL_ON_STACK macros to work with clang and produce optimal code in all cases. Reported-by: Arnd Bergmann <arnd@arndb.de> Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>	2019-04-11 13:36:45 +02:00
Vasily Gorbik	5abb9351df	s390/uv: introduce guest side ultravisor code The Ultravisor Call Facility (stfle bit 158) defines an API to the Ultravisor (UV calls), a mini hypervisor located at machine level. With help of the Ultravisor, KVM will be able to run "protected" VMs, special VMs whose memory and management data are unavailable to KVM. The protected VMs can also request services from the Ultravisor. The guest api consists of UV calls to share and unshare memory with the kvm hypervisor. To enable this feature support PROTECTED_VIRTUALIZATION_GUEST kconfig option has been introduced. Co-developed-by: Janosch Frank <frankja@de.ibm.com> Signed-off-by: Janosch Frank <frankja@de.ibm.com> Signed-off-by: Vasily Gorbik <gor@linux.ibm.com> Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>	2019-04-10 17:47:21 +02:00
Vasily Gorbik	1e941d3949	s390: move ipl block to .boot.preserved.data section .boot.preserved.data is a better fit for ipl block than .boot.data which is discarded after init. Reusing .boot.preserved.data allows to simplify code a little bit and avoid copying data from .boot.data to persistent variables. Signed-off-by: Vasily Gorbik <gor@linux.ibm.com> Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>	2019-04-10 17:47:11 +02:00
Gerald Schaefer	bf9921a9c1	s390: introduce .boot.preserved.data section Introduce .boot.preserve.data section which is similar to .boot.data and "shared" between the decompressor code and the decompressed kernel. The decompressor will store values in it, and copy over to the decompressed image before starting it. This method allows to avoid using pre-defined addresses and other hacks to pass values between those boot phases. Unlike .boot.data section .boot.preserved.data is NOT a part of init data, and hence will be preserved for the kernel life time. Signed-off-by: Gerald Schaefer <gerald.schaefer@de.ibm.com> Signed-off-by: Vasily Gorbik <gor@linux.ibm.com> Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>	2019-04-10 17:47:09 +02:00
Ingo Molnar	54bbfe75cb	Merge branch 'linus' into locking/core, to pick up fixes Signed-off-by: Ingo Molnar <mingo@kernel.org>	2019-04-10 09:14:42 +02:00
Will Deacon	fdcd06a8ab	arch: Use asm-generic header for asm/mmiowb.h Hook up asm-generic/mmiowb.h to Kbuild for all architectures so that we can subsequently include asm/mmiowb.h from core code. Cc: Masahiro Yamada <yamada.masahiro@socionext.com> Cc: Arnd Bergmann <arnd@arndb.de> Acked-by: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Will Deacon <will.deacon@arm.com>	2019-04-08 11:59:43 +01:00
Steven Rostedt (VMware)	32d9258662	syscalls: Remove start and number from syscall_set_arguments() args After removing the start and count arguments of syscall_get_arguments() it seems reasonable to remove them from syscall_set_arguments(). Note, as of today, there are no users of syscall_set_arguments(). But we are told that there will be soon. But for now, at least make it consistent with syscall_get_arguments(). Link: http://lkml.kernel.org/r/20190327222014.GA32540@altlinux.org Cc: Oleg Nesterov <oleg@redhat.com> Cc: Kees Cook <keescook@chromium.org> Cc: Andy Lutomirski <luto@amacapital.net> Cc: Dominik Brodowski <linux@dominikbrodowski.net> Cc: Dave Martin <dave.martin@arm.com> Cc: "Dmitry V. Levin" <ldv@altlinux.org> Cc: x86@kernel.org Cc: linux-snps-arc@lists.infradead.org Cc: linux-kernel@vger.kernel.org Cc: linux-arm-kernel@lists.infradead.org Cc: linux-c6x-dev@linux-c6x.org Cc: uclinux-h8-devel@lists.sourceforge.jp Cc: linux-hexagon@vger.kernel.org Cc: linux-ia64@vger.kernel.org Cc: linux-mips@vger.kernel.org Cc: nios2-dev@lists.rocketboards.org Cc: openrisc@lists.librecores.org Cc: linux-parisc@vger.kernel.org Cc: linuxppc-dev@lists.ozlabs.org Cc: linux-riscv@lists.infradead.org Cc: linux-s390@vger.kernel.org Cc: linux-sh@vger.kernel.org Cc: sparclinux@vger.kernel.org Cc: linux-um@lists.infradead.org Cc: linux-xtensa@linux-xtensa.org Cc: linux-arch@vger.kernel.org Acked-by: Max Filippov <jcmvbkbc@gmail.com> # For xtensa changes Acked-by: Will Deacon <will.deacon@arm.com> # For the arm64 bits Reviewed-by: Thomas Gleixner <tglx@linutronix.de> # for x86 Reviewed-by: Dmitry V. Levin <ldv@altlinux.org> Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org>	2019-04-05 09:27:23 -04:00
Steven Rostedt (Red Hat)	b35f549df1	syscalls: Remove start and number from syscall_get_arguments() args At Linux Plumbers, Andy Lutomirski approached me and pointed out that the function call syscall_get_arguments() implemented in x86 was horribly written and not optimized for the standard case of passing in 0 and 6 for the starting index and the number of system calls to get. When looking at all the users of this function, I discovered that all instances pass in only 0 and 6 for these arguments. Instead of having this function handle different cases that are never used, simply rewrite it to return the first 6 arguments of a system call. This should help out the performance of tracing system calls by ptrace, ftrace and perf. Link: http://lkml.kernel.org/r/20161107213233.754809394@goodmis.org Cc: Oleg Nesterov <oleg@redhat.com> Cc: Kees Cook <keescook@chromium.org> Cc: Andy Lutomirski <luto@amacapital.net> Cc: Dominik Brodowski <linux@dominikbrodowski.net> Cc: Dave Martin <dave.martin@arm.com> Cc: "Dmitry V. Levin" <ldv@altlinux.org> Cc: x86@kernel.org Cc: linux-snps-arc@lists.infradead.org Cc: linux-kernel@vger.kernel.org Cc: linux-arm-kernel@lists.infradead.org Cc: linux-c6x-dev@linux-c6x.org Cc: uclinux-h8-devel@lists.sourceforge.jp Cc: linux-hexagon@vger.kernel.org Cc: linux-ia64@vger.kernel.org Cc: linux-mips@vger.kernel.org Cc: nios2-dev@lists.rocketboards.org Cc: openrisc@lists.librecores.org Cc: linux-parisc@vger.kernel.org Cc: linuxppc-dev@lists.ozlabs.org Cc: linux-riscv@lists.infradead.org Cc: linux-s390@vger.kernel.org Cc: linux-sh@vger.kernel.org Cc: sparclinux@vger.kernel.org Cc: linux-um@lists.infradead.org Cc: linux-xtensa@linux-xtensa.org Cc: linux-arch@vger.kernel.org Acked-by: Paul Burton <paul.burton@mips.com> # MIPS parts Acked-by: Max Filippov <jcmvbkbc@gmail.com> # For xtensa changes Acked-by: Will Deacon <will.deacon@arm.com> # For the arm64 bits Reviewed-by: Thomas Gleixner <tglx@linutronix.de> # for x86 Reviewed-by: Dmitry V. Levin <ldv@altlinux.org> Reported-by: Andy Lutomirski <luto@amacapital.net> Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org>	2019-04-05 09:26:43 -04:00
Waiman Long	46ad0840b1	locking/rwsem: Remove arch specific rwsem files As the generic rwsem-xadd code is using the appropriate acquire and release versions of the atomic operations, the arch specific rwsem.h files will not be that much faster than the generic code as long as the atomic functions are properly implemented. So we can remove those arch specific rwsem.h and stop building asm/rwsem.h to reduce maintenance effort. Currently, only x86, alpha and ia64 have implemented architecture specific fast paths. I don't have access to alpha and ia64 systems for testing, but they are legacy systems that are not likely to be updated to the latest kernel anyway. By using a rwsem microbenchmark, the total locking rates on a 4-socket 56-core 112-thread x86-64 system before and after the patch were as follows (mixed means equal # of read and write locks): Before Patch After Patch # of Threads wlock rlock mixed wlock rlock mixed ------------ ----- ----- ----- ----- ----- ----- 1 29,201 30,143 29,458 28,615 30,172 29,201 2 6,807 13,299 1,171 7,725 15,025 1,804 4 6,504 12,755 1,520 7,127 14,286 1,345 8 6,762 13,412 764 6,826 13,652 726 16 6,693 15,408 662 6,599 15,938 626 32 6,145 15,286 496 5,549 15,487 511 64 5,812 15,495 60 5,858 15,572 60 There were some run-to-run variations for the multi-thread tests. For x86-64, using the generic C code fast path seems to be a little bit faster than the assembly version with low lock contention. Looking at the assembly version of the fast paths, there are assembly to/from C code wrappers that save and restore all the callee-clobbered registers (7 registers on x86-64). The assembly generated from the generic C code doesn't need to do that. That may explain the slight performance gain here. The generic asm rwsem.h can also be merged into kernel/locking/rwsem.h with no code change as no other code other than those under kernel/locking needs to access the internal rwsem macros and functions. Signed-off-by: Waiman Long <longman@redhat.com> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Acked-by: Linus Torvalds <torvalds@linux-foundation.org> Cc: Andrew Morton <akpm@linux-foundation.org> Cc: Arnd Bergmann <arnd@arndb.de> Cc: Borislav Petkov <bp@alien8.de> Cc: Davidlohr Bueso <dave@stgolabs.net> Cc: H. Peter Anvin <hpa@zytor.com> Cc: Paul E. McKenney <paulmck@linux.vnet.ibm.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Tim Chen <tim.c.chen@linux.intel.com> Cc: Will Deacon <will.deacon@arm.com> Cc: linux-arm-kernel@lists.infradead.org Cc: linux-c6x-dev@linux-c6x.org Cc: linux-m68k@lists.linux-m68k.org Cc: linux-riscv@lists.infradead.org Cc: linux-um@lists.infradead.org Cc: linux-xtensa@linux-xtensa.org Cc: linuxppc-dev@lists.ozlabs.org Cc: nios2-dev@lists.rocketboards.org Cc: openrisc@lists.librecores.org Cc: uclinux-h8-devel@lists.sourceforge.jp Link: https://lkml.kernel.org/r/20190322143008.21313-2-longman@redhat.com Signed-off-by: Ingo Molnar <mingo@kernel.org>	2019-04-03 14:50:50 +02:00
Martin Schwidefsky	9de7d833e3	s390/tlb: Convert to generic mmu_gather No change in behavior intended. Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Cc: Andrew Morton <akpm@linux-foundation.org> Cc: Andy Lutomirski <luto@kernel.org> Cc: Borislav Petkov <bp@alien8.de> Cc: Dave Hansen <dave.hansen@linux.intel.com> Cc: H. Peter Anvin <hpa@zytor.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Rik van Riel <riel@surriel.com> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: aneesh.kumar@linux.vnet.ibm.com Cc: heiko.carstens@de.ibm.com Cc: linux@armlinux.org.uk Cc: npiggin@gmail.com Cc: will.deacon@arm.com Link: http://lkml.kernel.org/r/20180918125151.31744-3-schwidefsky@de.ibm.com Signed-off-by: Ingo Molnar <mingo@kernel.org>	2019-04-03 10:32:57 +02:00
Peter Zijlstra	ed6a79352c	asm-generic/tlb, arch: Provide CONFIG_HAVE_MMU_GATHER_PAGE_SIZE Move the mmu_gather::page_size things into the generic code instead of PowerPC specific bits. No change in behavior intended. Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Acked-by: Will Deacon <will.deacon@arm.com> Cc: Andrew Morton <akpm@linux-foundation.org> Cc: Andy Lutomirski <luto@kernel.org> Cc: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com> Cc: Borislav Petkov <bp@alien8.de> Cc: Dave Hansen <dave.hansen@linux.intel.com> Cc: H. Peter Anvin <hpa@zytor.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Nick Piggin <npiggin@gmail.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Rik van Riel <riel@surriel.com> Cc: Thomas Gleixner <tglx@linutronix.de> Signed-off-by: Ingo Molnar <mingo@kernel.org>	2019-04-03 10:32:40 +02:00
Linus Torvalds	bfed6d0ffc	s390 update with improvements and bug fixes for 5.1-rc2 - Fix early free of the channel program in vfio - On AP device removal make sure that all messages are flushed with the driver still attached that queued the message - Limit brk randomization to 32MB to reduce the chance that the heap of ld.so is placed after the main stack - Add a rolling average for the steal time of a CPU, this will be needed for KVM to decide when to do busy waiting - Fix a warning in the CPU-MF code - Add a notification handler for AP configuration change to react faster to new AP devices -----BEGIN PGP SIGNATURE----- Version: GnuPG v2 iQEcBAABCAAGBQJcnIq7AAoJEDjwexyKj9rgddUH/3VQP6BMvq2fwAsLqx8JeYgT 082xzP2nHli3tO6m8fFHmtqrSg5KTEDfuQVafqp92LeEMKUNWQI6kRu7rXeAVBct M6hx21mqkm9VNjAlAjSq8IAUXP2K6/K0BMD5mYInYYYVRvJm3on4sHnkEj0kvXbm OGxwnNBd9UnH5g6ti2vW4cyDvs0aqj1eDbSudy5KedumQz5J2XdFPn4f4Ej6p2+t nuvlZFDnZ2Z4rliE3RFCuKExZR+YFZgS1urm6pcklncfvbJRsqFJ+nvhurskDUI3 4gOp1Yv1tvGNv/cNVEtnz8g/Kg8/sI7evjQBtxhtEsV/W0sbZPnjCt+28Cf1DN4= =4nL7 -----END PGP SIGNATURE----- Merge tag 's390-5.1-2' of git://git.kernel.org/pub/scm/linux/kernel/git/s390/linux Pull s390 fixes from Martin Schwidefsky: "Improvements and bug fixes for 5.1-rc2: - Fix early free of the channel program in vfio - On AP device removal make sure that all messages are flushed with the driver still attached that queued the message - Limit brk randomization to 32MB to reduce the chance that the heap of ld.so is placed after the main stack - Add a rolling average for the steal time of a CPU, this will be needed for KVM to decide when to do busy waiting - Fix a warning in the CPU-MF code - Add a notification handler for AP configuration change to react faster to new AP devices" * tag 's390-5.1-2' of git://git.kernel.org/pub/scm/linux/kernel/git/s390/linux: s390/cpumf: Fix warning from check_processor_id zcrypt: handle AP Info notification from CHSC SEI command vfio: ccw: only free cp on final interrupt s390/vtime: steal time exponential moving average s390/zcrypt: revisit ap device remove procedure s390: limit brk randomization to 32MB	2019-03-28 08:35:32 -07:00
Dmitry V. Levin	16add41164	syscall_get_arch: add "struct task_struct *" argument This argument is required to extend the generic ptrace API with PTRACE_GET_SYSCALL_INFO request: syscall_get_arch() is going to be called from ptrace_request() along with syscall_get_nr(), syscall_get_arguments(), syscall_get_error(), and syscall_get_return_value() functions with a tracee as their argument. The primary intent is that the triple (audit_arch, syscall_nr, arg1..arg6) should describe what system call is being called and what its arguments are. Reverts: `5e937a9ae9` ("syscall_get_arch: remove useless function arguments") Reverts: `1002d94d30` ("syscall.h: fix doc text for syscall_get_arch()") Reviewed-by: Andy Lutomirski <luto@kernel.org> # for x86 Reviewed-by: Palmer Dabbelt <palmer@sifive.com> Acked-by: Paul Moore <paul@paul-moore.com> Acked-by: Paul Burton <paul.burton@mips.com> # MIPS parts Acked-by: Michael Ellerman <mpe@ellerman.id.au> (powerpc) Acked-by: Kees Cook <keescook@chromium.org> # seccomp parts Acked-by: Mark Salter <msalter@redhat.com> # for the c6x bit Cc: Elvira Khabirova <lineprinter@altlinux.org> Cc: Eugene Syromyatnikov <esyr@redhat.com> Cc: Oleg Nesterov <oleg@redhat.com> Cc: x86@kernel.org Cc: linux-alpha@vger.kernel.org Cc: linux-snps-arc@lists.infradead.org Cc: linux-arm-kernel@lists.infradead.org Cc: linux-c6x-dev@linux-c6x.org Cc: uclinux-h8-devel@lists.sourceforge.jp Cc: linux-hexagon@vger.kernel.org Cc: linux-ia64@vger.kernel.org Cc: linux-m68k@lists.linux-m68k.org Cc: linux-mips@vger.kernel.org Cc: nios2-dev@lists.rocketboards.org Cc: openrisc@lists.librecores.org Cc: linux-parisc@vger.kernel.org Cc: linuxppc-dev@lists.ozlabs.org Cc: linux-riscv@lists.infradead.org Cc: linux-s390@vger.kernel.org Cc: linux-sh@vger.kernel.org Cc: sparclinux@vger.kernel.org Cc: linux-um@lists.infradead.org Cc: linux-xtensa@linux-xtensa.org Cc: linux-arch@vger.kernel.org Cc: linux-audit@redhat.com Signed-off-by: Dmitry V. Levin <ldv@altlinux.org> Signed-off-by: Paul Moore <paul@paul-moore.com>	2019-03-20 21:12:36 -04:00
Linus Torvalds	28d747f266	Kbuild updates for v5.1 (2nd) - add more Build-Depends to Debian source package - prefix header search paths with $(srctree)/ - make modpost show verbose section mismatch warnings - avoid hard-coded CROSS_COMPILE for h8300 - fix regression for Debian make-kpkg command - add semantic patch to detect missing put_device() - fix some warnings of 'make deb-pkg' - optimize NOSTDINC_FLAGS evaluation - add warnings about redundant generic-y - clean up Makefiles and scripts -----BEGIN PGP SIGNATURE----- Version: GnuPG v1 iQIcBAABAgAGBQJcjm13AAoJED2LAQed4NsG9FoQALFscagW8R5LIDmzzRPmslhF W1qm9rEmtdnOHGg20QbYUnJwtGZjVN4lIZp6eQ3v6mhvm6IY2VhInGJpcLnwbojb o7y4wKcP9/ucIpfV/z32DrUfEM+qnQwztn56u7lJBxf4cTFEOIwIIS8v1KEnsNXX Zzvu1kSKsc4ZHHdE7h3dmr3iC5GOz/6EAJ9U33WcLy24tRTevIxcZsYvb/SOvDAT NYdPK8yptuVVO+odHObNwMVBidRcXRb49gWQGWLuAvfbklh33pomYarWkNe/Syif UeCHDNwvqzEmjSks73EomdCjME0roWhgKbm/dXJKXhe2hBzP1psMWNzRPSRa4yIj SHE7UfFPXCa+tNveJo2qzTOhpMw1DRiNgZD3EM2cRvwZ1ip8emJr70qFfL+RGpqq 4ZlLb9Tibb51ApLcn+r0AnOMrC8MkK1zC8dKNxgUwdJ7D4UqZ70348c2GXE54yfv kxst/gtLb9r6YEtaCsKbCk1XgR2y2QGtyYrVLKsI/v6fhPVBKxnDXIpsn0Q6NYFi UiYKojTpFKvEMl0tc1EaYrIGoq9ZH4wDna3q4lOSRiyrypUl8NfflWwDSIuYVP5Z Y2tIPYTcGeCxt3gyXu0riL6tvpy1KGVlByNB9V297rSrVenH4VcfYPLJhYAtqpRo gO2eyp64i9LduVZOrEEP =6GIM -----END PGP SIGNATURE----- Merge tag 'kbuild-v5.1-2' of git://git.kernel.org/pub/scm/linux/kernel/git/masahiroy/linux-kbuild Pull more Kbuild updates from Masahiro Yamada: - add more Build-Depends to Debian source package - prefix header search paths with $(srctree)/ - make modpost show verbose section mismatch warnings - avoid hard-coded CROSS_COMPILE for h8300 - fix regression for Debian make-kpkg command - add semantic patch to detect missing put_device() - fix some warnings of 'make deb-pkg' - optimize NOSTDINC_FLAGS evaluation - add warnings about redundant generic-y - clean up Makefiles and scripts * tag 'kbuild-v5.1-2' of git://git.kernel.org/pub/scm/linux/kernel/git/masahiroy/linux-kbuild: kconfig: remove stale lxdialog/.gitignore kbuild: force all architectures except um to include mandatory-y kbuild: warn redundant generic-y Revert "modsign: Abort modules_install when signing fails" kbuild: Make NOSTDINC_FLAGS a simply expanded variable kbuild: deb-pkg: avoid implicit effects coccinelle: semantic code search for missing put_device() kbuild: pkg: grep include/config/auto.conf instead of $KCONFIG_CONFIG kbuild: deb-pkg: introduce is_enabled and if_enabled_echo to builddeb kbuild: deb-pkg: add CONFIG_ prefix to kernel config options kbuild: add workaround for Debian make-kpkg kbuild: source include/config/auto.conf instead of ${KCONFIG_CONFIG} unicore32: simplify linker script generation for decompressor h8300: use cc-cross-prefix instead of hardcoding h8300-unknown-linux- kbuild: move archive command to scripts/Makefile.lib modpost: always show verbose warning for section mismatch ia64: prefix header search path with $(srctree)/ libfdt: prefix header search paths with $(srctree)/ deb-pkg: generate correct build dependencies	2019-03-17 13:25:26 -07:00
Masahiro Yamada	037fc3368b	kbuild: force all architectures except um to include mandatory-y Currently, every arch/*/include/uapi/asm/Kbuild explicitly includes the common Kbuild.asm file. Factor out the duplicated include directives to scripts/Makefile.asm-generic so that no architecture would opt out of the mandatory-y mechanism. um is not forced to include mandatory-y since it is a very exceptional case which does not support UAPI. Signed-off-by: Masahiro Yamada <yamada.masahiro@socionext.com>	2019-03-17 12:56:32 +09:00
Masahiro Yamada	7cbbbb8bc2	kbuild: warn redundant generic-y The generic-y is redundant under the following condition: - arch has its own implementation - the same header is added to generated-y - the same header is added to mandatory-y If a redundant generic-y is found, the warning like follows is displayed: scripts/Makefile.asm-generic:20: redundant generic-y found in arch/arm/include/asm/Kbuild: timex.h I fixed up arch Kbuild files found by this. Suggested-by: Sam Ravnborg <sam@ravnborg.org> Signed-off-by: Masahiro Yamada <yamada.masahiro@socionext.com>	2019-03-17 12:56:31 +09:00
Linus Torvalds	636deed6c0	ARM: some cleanups, direct physical timer assignment, cache sanitization for 32-bit guests s390: interrupt cleanup, introduction of the Guest Information Block, preparation for processor subfunctions in cpu models PPC: bug fixes and improvements, especially related to machine checks and protection keys x86: many, many cleanups, including removing a bunch of MMU code for unnecessary optimizations; plus AVIC fixes. Generic: memcg accounting -----BEGIN PGP SIGNATURE----- Version: GnuPG v2.0.22 (GNU/Linux) iQEcBAABAgAGBQJci+7XAAoJEL/70l94x66DUMkIAKvEefhceySHYiTpfefjLjIC 16RewgHa+9CO4Oo5iXiWd90fKxtXLXmxDQOS4VGzN0rxvLGRw/fyXIxL1MDOkaAO l8SLSNuewY4XBUgISL3PMz123r18DAGOuy9mEcYU/IMesYD2F+wy5lJ17HIGq6X2 RpoF1p3qO1jfkPTKOob6Ixd4H5beJNPKpdth7LY3PJaVhDxgouj32fxnLnATVSnN gENQ10fnt8BCjshRYW6Z2/9bF15JCkUFR1xdBW2/xh1oj+kvPqqqk2bEN1eVQzUy 2hT/XkwtpthqjSbX8NNavWRSFnOnbMLTRKQyIXmFVsM5VoSrwtiGsCFzBgcT++I= =XIzU -----END PGP SIGNATURE----- Merge tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm Pull KVM updates from Paolo Bonzini: "ARM: - some cleanups - direct physical timer assignment - cache sanitization for 32-bit guests s390: - interrupt cleanup - introduction of the Guest Information Block - preparation for processor subfunctions in cpu models PPC: - bug fixes and improvements, especially related to machine checks and protection keys x86: - many, many cleanups, including removing a bunch of MMU code for unnecessary optimizations - AVIC fixes Generic: - memcg accounting" * tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm: (147 commits) kvm: vmx: fix formatting of a comment KVM: doc: Document the life cycle of a VM and its resources MAINTAINERS: Add KVM selftests to existing KVM entry Revert "KVM/MMU: Flush tlb directly in the kvm_zap_gfn_range()" KVM: PPC: Book3S: Add count cache flush parameters to kvmppc_get_cpu_char() KVM: PPC: Fix compilation when KVM is not enabled KVM: Minor cleanups for kvm_main.c KVM: s390: add debug logging for cpu model subfunctions KVM: s390: implement subfunction processor calls arm64: KVM: Fix architecturally invalid reset value for FPEXC32_EL2 KVM: arm/arm64: Remove unused timer variable KVM: PPC: Book3S: Improve KVM reference counting KVM: PPC: Book3S HV: Fix build failure without IOMMU support Revert "KVM: Eliminate extra function calls in kvm_get_dirty_log_protect()" x86: kvmguest: use TSC clocksource if invariant TSC is exposed KVM: Never start grow vCPU halt_poll_ns from value below halt_poll_ns_grow_start KVM: Expose the initial start value in grow_halt_poll_ns() as a module parameter KVM: grow_halt_poll_ns() should never shrink vCPU halt_poll_ns KVM: x86/mmu: Consolidate kvm_mmu_zap_all() and kvm_mmu_zap_mmio_sptes() KVM: x86/mmu: WARN if zapping a MMIO spte results in zapping children ...	2019-03-15 15:00:28 -07:00
Tony Krowiak	0d9c038fef	zcrypt: handle AP Info notification from CHSC SEI command The current AP bus implementation periodically polls the AP configuration to detect changes. When the AP configuration is dynamically changed via the SE or an SCLP instruction, the changes will not be reflected to sysfs until the next time the AP configuration is polled. The CHSC architecture provides a Store Event Information (SEI) command to make notification of an AP configuration change. This patch introduces a handler to process notification from the CHSC SEI command by immediately kicking off an AP bus scan-after-event. Signed-off-by: Tony Krowiak <akrowiak@linux.ibm.com> Reviewed-by: Halil Pasic <pasic@linux.ibm.com> Reviewed-by: Sebastian Ott <sebott@linux.ibm.com> Reviewed-by: Harald Freudenberger <FREUDE@de.ibm.com> Reviewed-by: Cornelia Huck <cohuck@redhat.com> Signed-off-by: Sebastian Ott <sebott@linux.ibm.com> Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>	2019-03-11 10:16:42 -07:00
Linus Torvalds	8dcd175bc3	Merge branch 'akpm' (patches from Andrew) Merge misc updates from Andrew Morton: - a few misc things - ocfs2 updates - most of MM * emailed patches from Andrew Morton <akpm@linux-foundation.org>: (159 commits) tools/testing/selftests/proc/proc-self-syscall.c: remove duplicate include proc: more robust bulk read test proc: test /proc/*/maps, smaps, smaps_rollup, statm proc: use seq_puts() everywhere proc: read kernel cpu stat pointer once proc: remove unused argument in proc_pid_lookup() fs/proc/thread_self.c: code cleanup for proc_setup_thread_self() fs/proc/self.c: code cleanup for proc_setup_self() proc: return exit code 4 for skipped tests mm,mremap: bail out earlier in mremap_to under map pressure mm/sparse: fix a bad comparison mm/memory.c: do_fault: avoid usage of stale vm_area_struct writeback: fix inode cgroup switching comment mm/huge_memory.c: fix "orig_pud" set but not used mm/hotplug: fix an imbalance with DEBUG_PAGEALLOC mm/memcontrol.c: fix bad line in comment mm/cma.c: cma_declare_contiguous: correct err handling mm/page_ext.c: fix an imbalance with kmemleak mm/compaction: pass pgdat to too_many_isolated() instead of zone mm: remove zone_lru_lock() function, access ->lru_lock directly ...	2019-03-06 10:31:36 -08:00
Martin Schwidefsky	152e9b8676	s390/vtime: steal time exponential moving average To be able to judge the current overcommitment ratio for a CPU add a lowcore field with the exponential moving average of the steal time. The average is updated every tick. Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>	2019-03-06 14:59:50 +01:00
Martin Schwidefsky	cd479eccd2	s390: limit brk randomization to 32MB For a 64-bit process the randomization of the program break is quite large with 1GB. That is as big as the randomization of the anonymous mapping base, for a test case started with '/lib/ld64.so.1 <exec>' it can happen that the heap is placed after the stack. To avoid this limit the program break randomization to 32MB for 64-bit and keep 8MB for 31-bit. Reported-by: Stefan Liebler <stli@linux.ibm.com> Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>	2019-03-06 09:34:03 +01:00
Aneesh Kumar K.V	04a8645304	mm: update ptep_modify_prot_commit to take old pte value as arg Architectures like ppc64 require to do a conditional tlb flush based on the old and new value of pte. Enable that by passing old pte value as the arg. Link: http://lkml.kernel.org/r/20190116085035.29729-3-aneesh.kumar@linux.ibm.com Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.ibm.com> Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org> Cc: Heiko Carstens <heiko.carstens@de.ibm.com> Cc: "H. Peter Anvin" <hpa@zytor.com> Cc: Ingo Molnar <mingo@elte.hu> Cc: Martin Schwidefsky <schwidefsky@de.ibm.com> Cc: Michael Ellerman <mpe@ellerman.id.au> Cc: Nicholas Piggin <npiggin@gmail.com> Cc: Paul Mackerras <paulus@samba.org> Cc: Thomas Gleixner <tglx@linutronix.de> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2019-03-05 21:07:18 -08:00
Aneesh Kumar K.V	0cbe3e26ab	mm: update ptep_modify_prot_start/commit to take vm_area_struct as arg Patch series "NestMMU pte upgrade workaround for mprotect", v5. We can upgrade pte access (R -> RW transition) via mprotect. We need to make sure we follow the recommended pte update sequence as outlined in commit `bd5050e38a` ("powerpc/mm/radix: Change pte relax sequence to handle nest MMU hang") for such updates. This patch series does that. This patch (of 5): Some architectures may want to call flush_tlb_range from these helpers. Link: http://lkml.kernel.org/r/20190116085035.29729-2-aneesh.kumar@linux.ibm.com Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.ibm.com> Cc: Nicholas Piggin <npiggin@gmail.com> Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org> Cc: Paul Mackerras <paulus@samba.org> Cc: Michael Ellerman <mpe@ellerman.id.au> Cc: Heiko Carstens <heiko.carstens@de.ibm.com> Cc: Martin Schwidefsky <schwidefsky@de.ibm.com> Cc: Ingo Molnar <mingo@elte.hu> Cc: "H. Peter Anvin" <hpa@zytor.com> Cc: Thomas Gleixner <tglx@linutronix.de> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2019-03-05 21:07:18 -08:00
Linus Torvalds	b1b988a6a0	Merge branch 'timers-2038-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull year 2038 updates from Thomas Gleixner: "Another round of changes to make the kernel ready for 2038. After lots of preparatory work this is the first set of syscalls which are 2038 safe: 403 clock_gettime64 404 clock_settime64 405 clock_adjtime64 406 clock_getres_time64 407 clock_nanosleep_time64 408 timer_gettime64 409 timer_settime64 410 timerfd_gettime64 411 timerfd_settime64 412 utimensat_time64 413 pselect6_time64 414 ppoll_time64 416 io_pgetevents_time64 417 recvmmsg_time64 418 mq_timedsend_time64 419 mq_timedreceiv_time64 420 semtimedop_time64 421 rt_sigtimedwait_time64 422 futex_time64 423 sched_rr_get_interval_time64 The syscall numbers are identical all over the architectures" * 'timers-2038-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (36 commits) riscv: Use latest system call ABI checksyscalls: fix up mq_timedreceive and stat exceptions unicore32: Fix __ARCH_WANT_STAT64 definition asm-generic: Make time32 syscall numbers optional asm-generic: Drop getrlimit and setrlimit syscalls from default list 32-bit userspace ABI: introduce ARCH_32BIT_OFF_T config option compat ABI: use non-compat openat and open_by_handle_at variants y2038: add 64-bit time_t syscalls to all 32-bit architectures y2038: rename old time and utime syscalls y2038: remove struct definition redirects y2038: use time32 syscall names on 32-bit syscalls: remove obsolete __IGNORE_ macros y2038: syscalls: rename y2038 compat syscalls x86/x32: use time64 versions of sigtimedwait and recvmmsg timex: change syscalls to use struct __kernel_timex timex: use __kernel_timex internally sparc64: add custom adjtimex/clock_adjtime functions time: fix sys_timer_settime prototype time: Add struct __kernel_timex time: make adjtime compat handling available for 32 bit ...	2019-03-05 14:08:26 -08:00
Linus Torvalds	3591b19511	s390 updates for the 5.1 merge window - A copy of Arnds compat wrapper generation series - Pass information about the KVM guest to the host in form the control program code and the control program version code - Map IOV resources to support PCI physical functions on s390 - Add vector load and store alignment hints to improve performance - Use the "jdd" constraint with gcc 9 to make jump labels working again - Remove amode workaround for old z/VM releases from the DCSS code - Add support for in-kernel performance measurements using the CPU measurement counter facility - Introduce a new PMU device cpum_cf_diag to capture counters and store thenn as event raw data. - Bug fixes and cleanups -----BEGIN PGP SIGNATURE----- Version: GnuPG v2 iQEcBAABCAAGBQJcfh4QAAoJEDjwexyKj9rgXVAH/RzVbi3vznldujSNfCFTZKPu EmFFAZIfbhifW3szfylyOJL52pFhxjcWzY0hkFEkbs2t90sn8l1BNkDscYZtfNHC XvN3N9LsHyxOeyxvQuWLSio58qm+Lr1L0UrIhbMvqyAVkOLmIHvybFwi83OkMptm djoL8NbuNsAA2s26y2bZLNtU7FmOW5smJIlnt7H4dmK4SFylqZKS/EnUZxGDgn+7 UrrTTOQUir0QZ8vraANsP1M0/LqPcd2YusLmj4jOdZ5Muc2Ch2AA991FofqdKShO /8cGlsIzwHWGgdnP/YDea5gbetvonayYduixKy3EnYpWQ9iogiBjH4G7QNxcncs= =v26J -----END PGP SIGNATURE----- Merge tag 's390-5.1-1' of git://git.kernel.org/pub/scm/linux/kernel/git/s390/linux Pull s390 updates from Martin Schwidefsky: - A copy of Arnds compat wrapper generation series - Pass information about the KVM guest to the host in form the control program code and the control program version code - Map IOV resources to support PCI physical functions on s390 - Add vector load and store alignment hints to improve performance - Use the "jdd" constraint with gcc 9 to make jump labels working again - Remove amode workaround for old z/VM releases from the DCSS code - Add support for in-kernel performance measurements using the CPU measurement counter facility - Introduce a new PMU device cpum_cf_diag to capture counters and store thenn as event raw data. - Bug fixes and cleanups * tag 's390-5.1-1' of git://git.kernel.org/pub/scm/linux/kernel/git/s390/linux: (54 commits) Revert "s390/cpum_cf: Add kernel message exaplanations" s390/dasd: fix read device characteristic with CONFIG_VMAP_STACK=y s390/suspend: fix prefix register reset in swsusp_arch_resume s390: warn about clearing als implied facilities s390: allow overriding facilities via command line s390: clean up redundant facilities list setup s390/als: remove duplicated in-place implementation of stfle s390/cio: Use cpa range elsewhere within vfio-ccw s390/cio: Fix vfio-ccw handling of recursive TICs s390: vfio_ap: link the vfio_ap devices to the vfio_ap bus subsystem s390/cpum_cf: Handle EBUSY return code from CPU counter facility reservation s390/cpum_cf: Add kernel message exaplanations s390/cpum_cf_diag: Add support for s390 counter facility diagnostic trace s390/cpum_cf: add ctr_stcctm() function s390/cpum_cf: move common functions into a separate file s390/cpum_cf: introduce kernel_cpumcf_avail() function s390/cpu_mf: replace stcctm5() with the stcctm() function s390/cpu_mf: add store cpu counter multiple instruction support s390/cpum_cf: Add minimal in-kernel interface for counter measurements s390/cpum_cf: introduce kernel_cpumcf_alert() to obtain measurement alerts ...	2019-03-05 11:13:10 -08:00
Linus Torvalds	6456300356	Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-next Pull networking updates from David Miller: "Here we go, another merge window full of networking and #ebpf changes: 1) Snoop DHCPACKS in batman-adv to learn MAC/IP pairs in the DHCP range without dealing with floods of ARP traffic, from Linus Lüssing. 2) Throttle buffered multicast packet transmission in mt76, from Felix Fietkau. 3) Support adaptive interrupt moderation in ice, from Brett Creeley. 4) A lot of struct_size conversions, from Gustavo A. R. Silva. 5) Add peek/push/pop commands to bpftool, as well as bash completion, from Stanislav Fomichev. 6) Optimize sk_msg_clone(), from Vakul Garg. 7) Add SO_BINDTOIFINDEX, from David Herrmann. 8) Be more conservative with local resends due to local congestion, from Yuchung Cheng. 9) Allow vetoing of unsupported VXLAN FDBs, from Petr Machata. 10) Add health buffer support to devlink, from Eran Ben Elisha. 11) Add TXQ scheduling API to mac80211, from Toke Høiland-Jørgensen. 12) Add statistics to basic packet scheduler filter, from Cong Wang. 13) Add GRE tunnel support for mlxsw Spectrum-2, from Nir Dotan. 14) Lots of new IP tunneling forwarding tests, also from Nir Dotan. 15) Add 3ad stats to bonding, from Nikolay Aleksandrov. 16) Lots of probing improvements for bpftool, from Quentin Monnet. 17) Various nfp drive #ebpf JIT improvements from Jakub Kicinski. 18) Allow #ebpf programs to access gso_segs from skb shared info, from Eric Dumazet. 19) Add sock_diag support for AF_XDP sockets, from Björn Töpel. 20) Support 22260 iwlwifi devices, from Luca Coelho. 21) Use rbtree for ipv6 defragmentation, from Peter Oskolkov. 22) Add JMP32 instruction class support to #ebpf, from Jiong Wang. 23) Add spinlock support to #ebpf, from Alexei Starovoitov. 24) Support 256-bit keys and TLS 1.3 in ktls, from Dave Watson. 25) Add device infomation API to devlink, from Jakub Kicinski. 26) Add new timestamping socket options which are y2038 safe, from Deepa Dinamani. 27) Add RX checksum offloading for various sh_eth chips, from Sergei Shtylyov. 28) Flow offload infrastructure, from Pablo Neira Ayuso. 29) Numerous cleanups, improvements, and bug fixes to the PHY layer and many drivers from Heiner Kallweit. 30) Lots of changes to try and make packet scheduler classifiers run lockless as much as possible, from Vlad Buslov. 31) Support BCM957504 chip in bnxt_en driver, from Erik Burrows. 32) Add concurrency tests to tc-tests infrastructure, from Vlad Buslov. 33) Add hwmon support to aquantia, from Heiner Kallweit. 34) Allow 64-bit values for SO_MAX_PACING_RATE, from Eric Dumazet. And I would be remiss if I didn't thank the various major networking subsystem maintainers for integrating much of this work before I even saw it. Alexei Starovoitov, Daniel Borkmann, Pablo Neira Ayuso, Johannes Berg, Kalle Valo, and many others. Thank you!" * git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-next: (2207 commits) net/sched: avoid unused-label warning net: ignore sysctl_devconf_inherit_init_net without SYSCTL phy: mdio-mux: fix Kconfig dependencies net: phy: use phy_modify_mmd_changed in genphy_c45_an_config_aneg net: dsa: mv88e6xxx: add call to mv88e6xxx_ports_cmode_init to probe for new DSA framework selftest/net: Remove duplicate header sky2: Disable MSI on Dell Inspiron 1545 and Gateway P-79 net/mlx5e: Update tx reporter status in case channels were successfully opened devlink: Add support for direct reporter health state update devlink: Update reporter state to error even if recover aborted sctp: call iov_iter_revert() after sending ABORT team: Free BPF filter when unregistering netdev ip6mr: Do not call __IP6_INC_STATS() from preemptible context isdn: mISDN: Fix potential NULL pointer dereference of kzalloc net: dsa: mv88e6xxx: support in-band signalling on SGMII ports with external PHYs cxgb4/chtls: Prefix adapter flags with CXGB4 net-sysfs: Switch to bitmap_zalloc() mellanox: Switch to bitmap_zalloc() bpf: add test cases for non-pointer sanitiation logic mlxsw: i2c: Extend initialization by querying resources data ...	2019-03-05 08:26:13 -08:00
Linus Torvalds	736706bee3	get rid of legacy 'get_ds()' function Every in-kernel use of this function defined it to KERNEL_DS (either as an actual define, or as an inline function). It's an entirely historical artifact, and long long long ago used to actually read the segment selector valueof '%ds' on x86. Which in the kernel is always KERNEL_DS. Inspired by a patch from Jann Horn that just did this for a very small subset of users (the ones in fs/), along with Al who suggested a script. I then just took it to the logical extreme and removed all the remaining gunk. Roughly scripted with git grep -l '(get_ds())' -- :^tools/ \| xargs sed -i 's/(get_ds())/(KERNEL_DS)/' git grep -lw 'get_ds' -- :^tools/ \| xargs sed -i '/^#define get_ds()/d' plus manual fixups to remove a few unusual usage patterns, the couple of inline function cases and to fix up a comment that had become stale. The 'get_ds()' function remains in an x86 kvm selftest, since in user space it actually does something relevant. Inspired-by: Jann Horn <jannh@google.com> Inspired-by: Al Viro <viro@zeniv.linux.org.uk> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2019-03-04 10:50:14 -08:00
Paolo Bonzini	8f060f5355	KVM: s390: Features for 5.1 - Clarify KVM related kernel messages - Interrupt cleanup - Introduction of the Guest Information Block (GIB) - Preparation for processor subfunctions in cpu model -----BEGIN PGP SIGNATURE----- Version: GnuPG v1 iQIcBAABAgAGBQJcb9iJAAoJEONU5rjiOLn4mI8P/1J3IyN3imnZUkTM5JOMWhNM TSwh5obvn7dT6URZ6BgM1DWIz9E/FKrEb2kU4xr1hwf/a69Q1cYKVmHSnzzIpxHQ ZNjr7QbcBCsVJ8LtasOoMmgGnVvtBYKKHr4J8UcqeW9raP3YfPJmqyETufiE2lFy G50r8EBFr9rPh7nK7ImAabKC/7Q/qxZ0729m71cu729/uBb/Wf6frqaDmFlA8362 YZC7KY+xEHZbWqKQqAt/x1TWAOb7nA5dCzemeRckNrs5+FN7rSBrje6SbWApZPfn weteCVbJMLCoRMUTFjRy3YNz1x0gAC9VQT6Qz5Kz7dColVfJjTPWdYuKpbRsj+n1 PEv1uuDBNbDqdS29KG3Dk9cfzUgAU12g+Xsb+3168HsQbU7XU1v6gCoRaR8ccaoq 3k8Em0xusHa+uGI6K4knKmWboRrCA6FWHIaink4B2K7qIaVdWqTebhHaDiDx8qB8 JRNjxQDho92FpRzxHyajHtamFKPjGT/Guc0yWMIrPHBn97GktUnDD6E5AdhTRVxs aXTZv7XFq5j307lc3qWsdAf4zGEaPbi9f2nHgFK8hJf+z560CmNbye9Rw6L96Lil gy0rvSQgN+3xBtSKvq3DNrgoouupOS6kFu5iyYLBS8UUOztXttKEzTCs+M87/3AP fphwixKEEXsMRWR2SJvG =YeIb -----END PGP SIGNATURE----- Merge tag 'kvm-s390-next-5.1-1' of git://git.kernel.org/pub/scm/linux/kernel/git/kvms390/linux into kvm-next KVM: s390: Features for 5.1 - Clarify KVM related kernel messages - Interrupt cleanup - Introduction of the Guest Information Block (GIB) - Preparation for processor subfunctions in cpu model	2019-02-22 17:44:23 +01:00
Christian Borntraeger	346fa2f891	KVM: s390: implement subfunction processor calls While we will not implement interception for query functions yet, we can and should disable functions that have a control bit based on the given CPU model. Let us start with enabling the subfunction interface. Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com> Reviewed-by: David Hildenbrand <david@redhat.com> Reviewed-by: Janosch Frank <frankja@linux.vnet.ibm.com> Reviewed-by: Cornelia Huck <cohuck@redhat.com>	2019-02-22 11:04:35 +01:00
Thomas Richter	fe5908bccc	s390/cpum_cf_diag: Add support for s390 counter facility diagnostic trace Introduce a PMU device named cpum_cf_diag. It extracts the values of all counters in all authorized counter sets and stores them as event raw data. This is done with the STORE CPU COUNTER MULTIPLE instruction to speed up access. All counter sets fit into one buffer. The values of each counter are taken when the event is started on the performance sub-system and when the event is stopped. This results in counter values available at the start and at the end of the measurement time frame. The difference is calculated for each counter. The differences of all counters are then saved as event raw data in the perf.data file. The counter values are accompanied by the time stamps when the counter set was started and when the counter set was stopped. This data is part of a trailer entry which describes the time frame, counter set version numbers, CPU speed, and machine type for later analysis. Signed-off-by: Thomas Richter <tmricht@linux.ibm.com> Reviewed-by: Hendrik Brueckner <brueckner@linux.ibm.com> Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>	2019-02-22 09:19:56 +01:00
Hendrik Brueckner	86c0b75715	s390/cpum_cf: add ctr_stcctm() function Introduce the ctr_stcctm() function as wrapper function to extract counters from a particular counter set. Note that the counter set is part of the stcctm instruction opcode, few indirections are necessary to specify the counter set as variable. Signed-off-by: Hendrik Brueckner <brueckner@linux.ibm.com> Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>	2019-02-22 09:19:55 +01:00
Hendrik Brueckner	869f4f98fa	s390/cpum_cf: introduce kernel_cpumcf_avail() function A preparation to move out common CPU-MF counter facility support functions, first introduce a function that indicates whether the support is ready to use. Signed-off-by: Hendrik Brueckner <brueckner@linux.ibm.com> Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>	2019-02-22 09:19:54 +01:00
Hendrik Brueckner	346d034d7f	s390/cpu_mf: replace stcctm5() with the stcctm() function Remove the stcctm5() function to extract counters from the MT-diagnostic counter set with the stcctm() function. For readability, introduce an enum to map the counter sets names to respective numbers for the stcctm instruction. Signed-off-by: Hendrik Brueckner <brueckner@linux.ibm.com> Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>	2019-02-22 09:19:53 +01:00
Hendrik Brueckner	778fb10ccc	s390/cpu_mf: add store cpu counter multiple instruction support Add support for the STORE CPU COUNTER MULTIPLE instruction to extract a range of counters from a counter set. An assembler macro is used to create the instruction opcode because the counter set identifier is part of the instruction and, thus, cannot be easily specified as parameter. Signed-off-by: Hendrik Brueckner <brueckner@linux.ibm.com> Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>	2019-02-22 09:19:52 +01:00
Hendrik Brueckner	17bebcc68e	s390/cpum_cf: Add minimal in-kernel interface for counter measurements Introduce a minimal interface for doing counter measurements of small units of work within the kernel. Use the kernel_cpumcf_begin() function start a measurement session and, later, stop it with kernel_cpumcf_end(). During the measreument session, you can enable and start/stop counter sets by using ctr_set_* functions. To make these changes effective use the lcctl() function. You can then use the ecctr() function to extract counters from the different counter sets. Please note that you have to check whether the counter sets to be enabled are authorized. Note that when a measurement session is active, other users cannot perform counter measurements. In such cases, kernel_cpumcf_begin() indicates this with returning -EBUSY. If the counter facility is not available, kernel_cpumcf_begin() returns -ENODEV. Note that this interface is restricted to the current CPU and, thus, preemption must be turned off. Example: u32 state, err; u64 cycles, insn; err = kernel_cpumcf_begin(); if (err) goto out_busy; state = 0; ctr_set_enable(&state, CPUMF_CTR_SET_BASIC); ctr_set_start(&state, CPUMF_CTR_SET_BASIC); err = lcctl(state); if (err) goto ; /* ... do your work ... / ctr_set_stop(&state, CPUMF_CTR_SET_BASIC); err = lcctl(state); if (err) goto out; cycles = insn = 0; ecctr(0, &cycles); ecctr(1, &insn); / ... */ kernel_cpumcf_end(); out_busy: Signed-off-by: Hendrik Brueckner <brueckner@linux.ibm.com> Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>	2019-02-22 09:19:50 +01:00
Hendrik Brueckner	26b8317f51	s390/cpum_cf: introduce kernel_cpumcf_alert() to obtain measurement alerts During a __kernel_cpumcf_begin()/end() session, save measurement alerts for the counter facility in the per-CPU cpu_cf_events variable. Users can obtain and, optionally, clear the alerts by calling kernel_cpumcf_alert() to specifically handle alerts. Signed-off-by: Hendrik Brueckner <brueckner@linux.ibm.com> Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>	2019-02-22 09:19:50 +01:00
Hendrik Brueckner	f944bcdf5b	s390/cpu_mf: move struct cpu_cf_events and per-CPU variable to header file Make the struct cpu_cf_events and the respective per-CPU variable available to in-kernel users. Access to this per-CPU variable shall be done between the calls to __kernel_cpumcf_begin() and __kernel_cpumcf_end(). Signed-off-by: Hendrik Brueckner <brueckner@linux.ibm.com> Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>	2019-02-22 09:19:49 +01:00
Hendrik Brueckner	3d33345aa3	s390/cpum_cf: prepare for in-kernel counter measurements Prepare the counter facility support to be used by other in-kernel users. The first step introduces the __kernel_cpumcf_begin() and __kernel_cpumcf_end() functions to reserve the counter facility for doing measurements and to release after the measurements are done. Signed-off-by: Hendrik Brueckner <brueckner@linux.ibm.com> Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>	2019-02-22 09:19:47 +01:00
Hendrik Brueckner	30e145f811	s390/cpum_cf: move counter set controls to a new header file Move counter set specific controls and functions to the asm/cpu_mcf.h header file containg all counter facility support definitions. Also adapt few variable names and header file includes. No functional changes. Signed-off-by: Hendrik Brueckner <brueckner@linux.ibm.com> Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>	2019-02-22 09:19:46 +01:00
Sean Christopherson	152482580a	KVM: Call kvm_arch_memslots_updated() before updating memslots kvm_arch_memslots_updated() is at this point in time an x86-specific hook for handling MMIO generation wraparound. x86 stashes 19 bits of the memslots generation number in its MMIO sptes in order to avoid full page fault walks for repeat faults on emulated MMIO addresses. Because only 19 bits are used, wrapping the MMIO generation number is possible, if unlikely. kvm_arch_memslots_updated() alerts x86 that the generation has changed so that it can invalidate all MMIO sptes in case the effective MMIO generation has wrapped so as to avoid using a stale spte, e.g. a (very) old spte that was created with generation==0. Given that the purpose of kvm_arch_memslots_updated() is to prevent consuming stale entries, it needs to be called before the new generation is propagated to memslots. Invalidating the MMIO sptes after updating memslots means that there is a window where a vCPU could dereference the new memslots generation, e.g. 0, and incorrectly reuse an old MMIO spte that was created with (pre-wrap) generation==0. Fixes: `e59dbe09f8` ("KVM: Introduce kvm_arch_memslots_updated()") Cc: <stable@vger.kernel.org> Signed-off-by: Sean Christopherson <sean.j.christopherson@intel.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>	2019-02-20 22:48:32 +01:00
Ilya Leoshkevich	146448524b	s390/jump_label: Use "jdd" constraint on gcc9 [heiko.carstens@de.ibm.com]: ----- Laura Abbott reported that the kernel doesn't build anymore with gcc 9, due to the "X" constraint. Ilya provided the gcc 9 patch "S/390: Introduce jdd constraint" which introduces the new "jdd" constraint which fixes this. ----- The support for section anchors on S/390 introduced in gcc9 has changed the behavior of "X" constraint, which can now produce register references. Since existing constraints, in particular, "i", do not fit the intended use case on S/390, the new machine-specific "jdd" constraint was introduced. This patch makes jump labels use "jdd" constraint when building with gcc9. Reported-by: Laura Abbott <labbott@redhat.com> Signed-off-by: Ilya Leoshkevich <iii@linux.ibm.com> Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com> Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>	2019-02-20 09:48:26 +01:00
Thomas Gleixner	41ea39101d	y2038: Add time64 system calls This series finally gets us to the point of having system calls with 64-bit time_t on all architectures, after a long time of incremental preparation patches. There was actually one conversion that I missed during the summer, i.e. Deepa's timex series, which I now updated based the 5.0-rc1 changes and review comments. The following system calls are now added on all 32-bit architectures using the same system call numbers: 403 clock_gettime64 404 clock_settime64 405 clock_adjtime64 406 clock_getres_time64 407 clock_nanosleep_time64 408 timer_gettime64 409 timer_settime64 410 timerfd_gettime64 411 timerfd_settime64 412 utimensat_time64 413 pselect6_time64 414 ppoll_time64 416 io_pgetevents_time64 417 recvmmsg_time64 418 mq_timedsend_time64 419 mq_timedreceiv_time64 420 semtimedop_time64 421 rt_sigtimedwait_time64 422 futex_time64 423 sched_rr_get_interval_time64 Each one of these corresponds directly to an existing system call that includes a 'struct timespec' argument, or a structure containing a timespec or (in case of clock_adjtime) timeval. Not included here are new versions of getitimer/setitimer and getrusage/waitid, which are planned for the future but only needed to make a consistent API rather than for correct operation beyond y2038. These four system calls are based on 'timeval', and it has not been finally decided what the replacement kernel interface will use instead. So far, I have done a lot of build testing across most architectures, which has found a number of bugs. Runtime testing so far included testing LTP on 32-bit ARM with the existing system calls, to ensure we do not regress for existing binaries, and a test with a 32-bit x86 build of LTP against a modified version of the musl C library that has been adapted to the new system call interface [3]. This library can be used for testing on all architectures supported by musl-1.1.21, but it is not how the support is getting integrated into the official musl release. Official musl support is planned but will require more invasive changes to the library. Link: https://lore.kernel.org/lkml/20190110162435.309262-1-arnd@arndb.de/T/ Link: https://lore.kernel.org/lkml/20190118161835.2259170-1-arnd@arndb.de/ Link: https://git.linaro.org/people/arnd/musl-y2038.git/ [2] Signed-off-by: Arnd Bergmann <arnd@arndb.de> -----BEGIN PGP SIGNATURE----- Version: GnuPG v2 iQIcBAABCAAGBQJcXf7/AAoJEGCrR//JCVInPSUP/RhsQSCKMGtONB/vVICQhwep PybhzBSpHWFxszzTi6BEPN1zS9B069G9mDollRBYZCckyPqL/Bv6sI/vzQZdNk01 Q6Nw92OnNE1QP8owZ5TjrZhpbtopWdqIXjsbGZlloUemvuJP2JwvKovQUcn5CPTQ jbnqU04CVyFFJYVxAnGJ+VSeWNrjW/cm/m+rhLFjUcwW7Y3aodxsPqPP6+K9hY9P yIWfcH42WBeEWGm1RSBOZOScQl4SGCPUAhFydl/TqyEQagyegJMIyMOv9wZ5AuTT xK644bDVmNsrtJDZDpx+J8hytXCk1LrnKzkHR/uK80iUIraF/8D7PlaPgTmEEjko XcrywEkvkXTVU3owCm2/sbV+8fyFKzSPipnNfN1JNxEX71A98kvMRtPjDueQq/GA Yh81rr2YLF2sUiArkc2fNpENT7EGhrh1q6gviK3FB8YDgj1kSgPK5wC/X0uolC35 E7iC2kg4NaNEIjhKP/WKluCaTvjRbvV+0IrlJLlhLTnsqbA57ZKCCteiBrlm7wQN 4csUtCyxchR9Ac2o/lj+Mf53z68Zv74haIROp18K2dL7ZpVcOPnA3XHeauSAdoyp wy2Ek6ilNvlNB+4x+mRntPoOsyuOUGv7JXzB9JvweLWUd9G7tvYeDJQp/0YpDppb K4UWcKnhtEom0DgK08vY =IZVb -----END PGP SIGNATURE----- Merge tag 'y2038-new-syscalls' of git://git.kernel.org:/pub/scm/linux/kernel/git/arnd/playground into timers/2038 Pull y2038 - time64 system calls from Arnd Bergmann: This series finally gets us to the point of having system calls with 64-bit time_t on all architectures, after a long time of incremental preparation patches. There was actually one conversion that I missed during the summer, i.e. Deepa's timex series, which I now updated based the 5.0-rc1 changes and review comments. The following system calls are now added on all 32-bit architectures using the same system call numbers: 403 clock_gettime64 404 clock_settime64 405 clock_adjtime64 406 clock_getres_time64 407 clock_nanosleep_time64 408 timer_gettime64 409 timer_settime64 410 timerfd_gettime64 411 timerfd_settime64 412 utimensat_time64 413 pselect6_time64 414 ppoll_time64 416 io_pgetevents_time64 417 recvmmsg_time64 418 mq_timedsend_time64 419 mq_timedreceiv_time64 420 semtimedop_time64 421 rt_sigtimedwait_time64 422 futex_time64 423 sched_rr_get_interval_time64 Each one of these corresponds directly to an existing system call that includes a 'struct timespec' argument, or a structure containing a timespec or (in case of clock_adjtime) timeval. Not included here are new versions of getitimer/setitimer and getrusage/waitid, which are planned for the future but only needed to make a consistent API rather than for correct operation beyond y2038. These four system calls are based on 'timeval', and it has not been finally decided what the replacement kernel interface will use instead. So far, I have done a lot of build testing across most architectures, which has found a number of bugs. Runtime testing so far included testing LTP on 32-bit ARM with the existing system calls, to ensure we do not regress for existing binaries, and a test with a 32-bit x86 build of LTP against a modified version of the musl C library that has been adapted to the new system call interface [3]. This library can be used for testing on all architectures supported by musl-1.1.21, but it is not how the support is getting integrated into the official musl release. Official musl support is planned but will require more invasive changes to the library. Link: https://lore.kernel.org/lkml/20190110162435.309262-1-arnd@arndb.de/T/ Link: https://lore.kernel.org/lkml/20190118161835.2259170-1-arnd@arndb.de/ Link: https://git.linaro.org/people/arnd/musl-y2038.git/ [2]	2019-02-10 21:24:43 +01:00
Thomas Gleixner	fd659cc095	arch: System call unification and cleanup The system call tables have diverged a bit over the years, and a number of the recent additions never made it into all architectures, for one reason or another. This is an attempt to clean it up as far as we can without breaking compatibility, doing a number of steps: - Add system calls that have not yet been integrated into all architectures but that we definitely want there. This includes {,f}statfs64() and get{eg,eu,g,p,u,pp}id() on alpha, which have been missing traditionally. - The s390 compat syscall handling is cleaned up to be more like what we do on other architectures, while keeping the 31-bit pointer extension. This was merged as a shared branch by the s390 maintainers and is included here in order to base the other patches on top. - Add the separate ipc syscalls on all architectures that traditionally only had sys_ipc(). This version is done without support for IPC_OLD that is we have in sys_ipc. The new semtimedop_time64 syscall will only be added here, not in sys_ipc - Add syscall numbers for a couple of syscalls that we probably don't need everywhere, in particular pkey_* and rseq, for the purpose of symmetry: if it's in asm-generic/unistd.h, it makes sense to have it everywhere. I expect that any future system calls will get assigned on all platforms together, even when they appear to be specific to a single architecture. - Prepare for having the same system call numbers for any future calls. In combination with the generated tables, this hopefully makes it easier to add new calls across all architectures together. All of the above are technically separate from the y2038 work, but are done as preparation before we add the new 64-bit time_t system calls everywhere, providing a common baseline set of system calls. I expect that glibc and other libraries that want to use 64-bit time_t will require linux-5.1 kernel headers for building in the future, and at a much later point may also require linux-5.1 or a later version as the minimum kernel at runtime. Having a common baseline then allows the removal of many architecture or kernel version specific workarounds. Signed-off-by: Arnd Bergmann <arnd@arndb.de> -----BEGIN PGP SIGNATURE----- Version: GnuPG v2 iQIcBAABCAAGBQJcXf6XAAoJEGCrR//JCVInIm4P/AlkMmQRa/B2ziWMW6PifPoI v18r44017rA1BPENyZvumJUdM5mDvNofOW8F2DYQ7Uiys2YtXenwe/Cf8LHn2n6c TMXGQryQpvNmfDCyU+0UjF8m2+poFMrL4aRTXtjODh1YTsPNgeDC+KFMCAAtZmZd cVbXFudtbdYKD/pgCX4SI1CWAMBiXe2e+ukPdJVr+iqusCMTApf+GOuyvDBZY9s/ vURb+tIS87HZ/jehWfZFSuZt+Gu7b3ijUXNC8v9qSIxNYekw62vBNl6F09HE79uB Bv4OujAODqKvI9gGyydBzLJNzaMo0ryQdusyqcJHT7MY/8s+FwcYAXyTlQ3DbbB4 2u/c+58OwJ9Zk12p4LXZRA47U+vRhQt2rO4+zZWs2txNNJY89ZvCm/Z04KOiu5Xz 1Nnj607KGzthYRs2gs68AwzGGyf0uykIQ3RcaJLIBlX1Nd8BWO0ZgAguCvkXbQMX XNXJTd92HmeuKKpiO0n/M4/mCeP0cafBRPCZbKlHyTl0Jeqd/HBQEO9Z8Ifwyju3 mXz9JCR9VlPCkX605keATbjtPGZf3XQtaXlQnezitDudXk8RJ33EpPcbhx76wX7M Rux37ByqEOzk4wMGX9YQyNU7z7xuVg4sJAa2LlJqYeKXHtym+u3gG7SGP5AsYjmk 6mg2+9O2yZuLhQtOtrwm =s4wf -----END PGP SIGNATURE----- Merge tag 'y2038-syscall-cleanup' of git://git.kernel.org:/pub/scm/linux/kernel/git/arnd/playground into timers/2038 Pull preparatory work for y2038 changes from Arnd Bergmann: System call unification and cleanup The system call tables have diverged a bit over the years, and a number of the recent additions never made it into all architectures, for one reason or another. This is an attempt to clean it up as far as we can without breaking compatibility, doing a number of steps: - Add system calls that have not yet been integrated into all architectures but that we definitely want there. This includes {,f}statfs64() and get{eg,eu,g,p,u,pp}id() on alpha, which have been missing traditionally. - The s390 compat syscall handling is cleaned up to be more like what we do on other architectures, while keeping the 31-bit pointer extension. This was merged as a shared branch by the s390 maintainers and is included here in order to base the other patches on top. - Add the separate ipc syscalls on all architectures that traditionally only had sys_ipc(). This version is done without support for IPC_OLD that is we have in sys_ipc. The new semtimedop_time64 syscall will only be added here, not in sys_ipc - Add syscall numbers for a couple of syscalls that we probably don't need everywhere, in particular pkey_* and rseq, for the purpose of symmetry: if it's in asm-generic/unistd.h, it makes sense to have it everywhere. I expect that any future system calls will get assigned on all platforms together, even when they appear to be specific to a single architecture. - Prepare for having the same system call numbers for any future calls. In combination with the generated tables, this hopefully makes it easier to add new calls across all architectures together. All of the above are technically separate from the y2038 work, but are done as preparation before we add the new 64-bit time_t system calls everywhere, providing a common baseline set of system calls. I expect that glibc and other libraries that want to use 64-bit time_t will require linux-5.1 kernel headers for building in the future, and at a much later point may also require linux-5.1 or a later version as the minimum kernel at runtime. Having a common baseline then allows the removal of many architecture or kernel version specific workarounds.	2019-02-10 20:44:19 +01:00
Ursula Braun	41c80be24b	s390/net: move pnet constants There is no need to define these PNETID related constants in the pnet.h file, since they are just used locally within pnet.c. Just code cleanup, no functional change. Signed-off-by: Ursula Braun <ubraun@linux.ibm.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2019-02-07 18:06:18 -08:00
Martin Schwidefsky	142c52d7bc	s390: add alignment hints to vector load and store The z14 introduced alignment hints to increase the performance of vector loads and stores. The kernel uses an implicit alignmenet of 8 bytes for the vector registers, set the alignment hint to 3. Reviewed-by: Christian Borntraeger <borntraeger@de.ibm.com> Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>	2019-02-07 11:57:10 +01:00
Julian Wiedmann	bdf117674e	s390/qdio: make SBAL address array type-safe There is no need to use void pointers, all drivers are in agreement about the underlying data structure of the SBAL arrays. Signed-off-by: Julian Wiedmann <jwi@linux.ibm.com> Reviewed-by: Benjamin Block <bblock@linux.ibm.com> Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>	2019-02-07 11:57:07 +01:00
Arnd Bergmann	d33c577ccc	y2038: rename old time and utime syscalls The time, stime, utime, utimes, and futimesat system calls are only used on older architectures, and we do not provide y2038 safe variants of them, as they are replaced by clock_gettime64, clock_settime64, and utimensat_time64. However, for consistency it seems better to have the 32-bit architectures that still use them call the "time32" entry points (leaving the traditional handlers for the 64-bit architectures), like we do for system calls that now require two versions. Note: We used to always define __ARCH_WANT_SYS_TIME and __ARCH_WANT_SYS_UTIME and only set __ARCH_WANT_COMPAT_SYS_TIME and __ARCH_WANT_SYS_UTIME32 for compat mode on 64-bit kernels. Now this is reversed: only 64-bit architectures set __ARCH_WANT_SYS_TIME/UTIME, while we need __ARCH_WANT_SYS_TIME32/UTIME32 for 32-bit architectures and compat mode. The resulting asm/unistd.h changes look a bit counterintuitive. This is only a cleanup patch and it should not change any behavior. Signed-off-by: Arnd Bergmann <arnd@arndb.de> Acked-by: Geert Uytterhoeven <geert@linux-m68k.org> Acked-by: Heiko Carstens <heiko.carstens@de.ibm.com>	2019-02-07 00:13:28 +01:00
Arnd Bergmann	805089c2f7	syscalls: remove obsolete __IGNORE_ macros These are all for ignoring the lack of obsolete system calls, which have been marked the same way in scripts/checksyscall.sh, so these can be removed. Signed-off-by: Arnd Bergmann <arnd@arndb.de> Acked-by: Heiko Carstens <heiko.carstens@de.ibm.com>	2019-02-07 00:13:27 +01:00
Michael Mueller	9f30f62163	KVM: s390: add gib_alert_irq_handler() The patch implements a handler for GIB alert interruptions on the host. Its task is to alert guests that interrupts are pending for them. A GIB alert interrupt statistic counter is added as well: $ cat /proc/interrupts CPU0 CPU1 ... GAL: 23 37 [I/O] GIB Alert ... Signed-off-by: Michael Mueller <mimu@linux.ibm.com> Acked-by: Halil Pasic <pasic@linux.ibm.com> Reviewed-by: Pierre Morel <pmorel@linux.ibm.com> Message-Id: <20190131085247.13826-14-mimu@linux.ibm.com> Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>	2019-02-05 14:29:23 +01:00
Michael Mueller	6cff2e1046	KVM: s390: add functions to (un)register GISC with GISA Add the Interruption Alert Mask (IAM) to the architecture specific kvm struct. This mask in the GISA is used to define for which ISC a GIB alert will be issued. The functions kvm_s390_gisc_register() and kvm_s390_gisc_unregister() are used to (un)register a GISC (guest ISC) with a virtual machine and its GISA. Upon successful completion, kvm_s390_gisc_register() returns the ISC to be used for GIB alert interruptions. A negative return code indicates an error during registration. Theses functions will be used by other adapter types like AP and PCI to request pass-through interruption support. Signed-off-by: Michael Mueller <mimu@linux.ibm.com> Acked-by: Pierre Morel <pmorel@linux.ibm.com> Acked-by: Halil Pasic <pasic@linux.ibm.com> Reviewed-by: Cornelia Huck <cohuck@redhat.com> Message-Id: <20190131085247.13826-12-mimu@linux.ibm.com> Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>	2019-02-05 14:29:23 +01:00
Michael Mueller	25c84dbaec	KVM: s390: add kvm reference to struct sie_page2 Adding the kvm reference to struct sie_page2 will allow to determine the kvm a given gisa belongs to: container_of(gisa, struct sie_page2, gisa)->kvm This functionality will be required to process a gisa in gib alert interruption context. Signed-off-by: Michael Mueller <mimu@linux.ibm.com> Reviewed-by: Pierre Morel <pmorel@linux.ibm.com> Reviewed-by: Cornelia Huck <cohuck@redhat.com> Reviewed-by: David Hildenbrand <david@redhat.com> Reviewed-by: Halil Pasic <pasic@linux.ibm.com> Message-Id: <20190131085247.13826-11-mimu@linux.ibm.com> Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>	2019-02-05 14:29:23 +01:00
Michael Mueller	1282c21eb3	KVM: s390: add the GIB and its related life-cyle functions The Guest Information Block (GIB) links the GISA of all guests that have adapter interrupts pending. These interrupts cannot be delivered because all vcpus of these guests are currently in WAIT state or have masked the respective Interruption Sub Class (ISC). If enabled, a GIB alert is issued on the host to schedule these guests to run suitable vcpus to consume the pending interruptions. This mechanism allows to process adapter interrupts for currently not running guests. The GIB is created during host initialization and associated with the Adapter Interruption Facility in case an Adapter Interruption Virtualization Facility is available. The GIB initialization and thus the activation of the related code will be done in an upcoming patch of this series. Signed-off-by: Michael Mueller <mimu@linux.ibm.com> Reviewed-by: Janosch Frank <frankja@linux.ibm.com> Reviewed-by: Christian Borntraeger <borntraeger@de.ibm.com> Reviewed-by: Cornelia Huck <cohuck@redhat.com> Reviewed-by: Pierre Morel <pmorel@linux.ibm.com> Acked-by: Halil Pasic <pasic@linux.ibm.com> Message-Id: <20190131085247.13826-10-mimu@linux.ibm.com> Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>	2019-02-05 14:29:23 +01:00
Michael Mueller	3dec192217	s390/cio: add function chsc_sgib() This patch implements the Set Guest Information Block operation to request association or disassociation of a Guest Information Block (GIB) with the Adapter Interruption Facility. The operation is required to receive GIB alert interrupts for guest adapters in conjunction with AIV and GISA. Signed-off-by: Michael Mueller <mimu@linux.ibm.com> Reviewed-by: Sebastian Ott <sebott@linux.ibm.com> Reviewed-by: Pierre Morel <pmorel@linux.ibm.com> Reviewed-by: Christian Borntraeger <borntraeger@de.ibm.com> Acked-by: Halil Pasic <pasic@linux.ibm.com> Acked-by: Janosch Frank <frankja@linux.ibm.com> Acked-by: Cornelia Huck <cohuck@redhat.com> Message-Id: <20190131085247.13826-9-mimu@linux.ibm.com> Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>	2019-02-05 14:29:23 +01:00
Michael Mueller	982cff4259	KVM: s390: introduce struct kvm_s390_gisa_interrupt Use this struct analog to the kvm interruption structs for kvm emulated floating and local interruptions. GIB handling will add further fields to this structure as required. Signed-off-by: Michael Mueller <mimu@linux.ibm.com> Reviewed-by: Cornelia Huck <cohuck@redhat.com> Reviewed-by: Halil Pasic <pasic@linux.ibm.com> Message-Id: <20190131085247.13826-8-mimu@linux.ibm.com> Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>	2019-02-05 14:29:22 +01:00
Michael Mueller	246b72183b	KVM: s390: move bitmap idle_mask into arch struct top level The vcpu idle_mask state is used by but not specific to the emulated floating interruptions. The state is relevant to gisa related interruptions as well. Signed-off-by: Michael Mueller <mimu@linux.ibm.com> Reviewed-by: Pierre Morel <pmorel@linux.ibm.com> Reviewed-by: Cornelia Huck <cohuck@redhat.com> Acked-by: Halil Pasic <pasic@linux.ibm.com> Message-Id: <20190131085247.13826-4-mimu@linux.ibm.com> Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>	2019-02-05 14:29:22 +01:00
Michael Mueller	689bdf9e9c	KVM: s390: make bitmap declaration consistent Use a consistent bitmap declaration throughout the code. Signed-off-by: Michael Mueller <mimu@linux.ibm.com> Reviewed-by: Cornelia Huck <cohuck@redhat.com> Reviewed-by: Halil Pasic <pasic@linux.ibm.com> Message-Id: <20190131085247.13826-3-mimu@linux.ibm.com> Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>	2019-02-05 14:29:21 +01:00
Deepa Dinamani	2edfd8e061	arch: Use asm-generic/socket.h when possible Many architectures maintain an arch specific copy of the file even though there are no differences with the asm-generic one. Allow these architectures to use the generic one instead. Signed-off-by: Deepa Dinamani <deepa.kernel@gmail.com> Acked-by: Max Filippov <jcmvbkbc@gmail.com> Acked-by: Heiko Carstens <heiko.carstens@de.ibm.com> Acked-by: Willem de Bruijn <willemb@google.com> Cc: chris@zankel.net Cc: fenghua.yu@intel.com Cc: tglx@linutronix.de Cc: schwidefsky@de.ibm.com Cc: linux-ia64@vger.kernel.org Cc: linux-xtensa@linux-xtensa.org Cc: linux-s390@vger.kernel.org Signed-off-by: David S. Miller <davem@davemloft.net>	2019-02-03 11:17:30 -08:00
Greg Kroah-Hartman	d7f2f7c7fc	s390: pci: no need to check return value of debugfs_create functions When calling debugfs functions, there is no need to ever check the return value. The function can work or not, but the code logic should never do something different based on this. Cc: Martin Schwidefsky <schwidefsky@de.ibm.com> Cc: Heiko Carstens <heiko.carstens@de.ibm.com> Cc: Sebastian Ott <sebott@linux.ibm.com> Cc: Gerald Schaefer <gerald.schaefer@de.ibm.com> Cc: linux-s390@vger.kernel.org Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Signed-off-by: Sebastian Ott <sebott@linux.ibm.com> Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>	2019-01-28 15:58:54 +01:00
Collin Walling	4ad78b8651	s390/setup: set control program code via diag 318 The s390x diagnose 318 instruction sets the control program name code (CPNC) and control program version code (CPVC) to provide useful information regarding the OS during debugging. The CPNC is explicitly set to 4 to indicate a Linux/KVM environment. The CPVC is a 7-byte value containing: - 3-byte Linux version code, currently set to 0 - 3-byte unique value, currently set to 0 - 1-byte trailing null Signed-off-by: Collin Walling <walling@linux.ibm.com> Acked-by: Janosch Frank <frankja@linux.ibm.com> Acked-by: Heiko Carstens <heiko.carstens@de.ibm.com> Reviewed-by: David Hildenbrand <david@redhat.com> Reviewed-by: Cornelia Huck <cohuck@redhat.com> Message-Id: <1544135405-22385-2-git-send-email-walling@linux.ibm.com> [set version code to 0 until the structure is fully defined] Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com> Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>	2019-01-28 15:58:53 +01:00
David S. Miller	1d68101367	Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net	2019-01-27 10:43:17 -08:00
Arnd Bergmann	b41c51c8e1	arch: add pkey and rseq syscall numbers everywhere Most architectures define system call numbers for the rseq and pkey system calls, even when they don't support the features, and perhaps never will. Only a few architectures are missing these, so just define them anyway for consistency. If we decide to add them later to one of these, the system call numbers won't get out of sync then. Signed-off-by: Arnd Bergmann <arnd@arndb.de> Acked-by: Heiko Carstens <heiko.carstens@de.ibm.com> Acked-by: Geert Uytterhoeven <geert@linux-m68k.org>	2019-01-25 17:22:50 +01:00
Martin Schwidefsky	58661489a8	Merge branch 'compat' of git://git.kernel.org/pub/scm/linux/kernel/git/s390/linux into features Pull tip branch with Arnds compat system call wrapper rework.	2019-01-23 12:39:35 +01:00
Heiko Carstens	fb8bfca06c	s390: fix system call tracing When converting to autogenerated compat syscall wrappers all system call entry points got a different symbol name: they all got a __s390x_ prefix. This caused breakage with system call tracing, since an appropriate arch_syscall_match_sym_name() was not provided. Add this function, and while at it also add code to avoid compat system call tracing. s390 has different system call tables for native 64 bit system calls and compat system calls. This isn't really supported in the common code. However there are hardly any compat binaries left, therefore just ignore compat system calls, like x86 and arm64 also do for the same reason. Fixes: `aa0d6e70d3` ("s390: autogenerate compat syscall wrappers") Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com> Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>	2019-01-23 12:38:07 +01:00
Vasily Gorbik	7e0d92f002	s390/kasan: improve string/memory functions checks Avoid using arch specific implementations of string/memory functions with KASAN since gcc cannot instrument asm code memory accesses and many bugs could be missed. Acked-by: Heiko Carstens <heiko.carstens@de.ibm.com> Signed-off-by: Vasily Gorbik <gor@linux.ibm.com> Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>	2019-01-18 09:34:18 +01:00
Arnd Bergmann	aa0d6e70d3	s390: autogenerate compat syscall wrappers Any system call that takes a pointer argument on s390 requires a wrapper function to do a 31-to-64 zero-extension, these are currently generated in arch/s390/kernel/compat_wrapper.c. On arm64 and x86, we already generate similar wrappers for all system calls in the place of their definition, just for a different purpose (they load the arguments from pt_regs). We can do the same thing here, by adding an asm/syscall_wrapper.h file with a copy of all the relevant macros to override the generic version. Besides the addition of the compat entry point, these also rename the entry points with a __s390_ or __s390x_ prefix, similar to what we do on arm64 and x86. This in turn requires renaming a few things, and adding a proper ni_syscall() entry point. In order to still compile system call definitions that pass an loff_t argument, the __SC_COMPAT_CAST() macro checks for that and forces an -ENOSYS error, which was the best I could come up with. Those functions must obviously not get called from user space, but instead require hand-written compat_sys_*() handlers, which fortunately already exist. Link: https://lore.kernel.org/lkml/20190116131527.2071570-5-arnd@arndb.de Signed-off-by: Arnd Bergmann <arnd@arndb.de> Reviewed-by: Heiko Carstens <heiko.carstens@de.ibm.com> [heiko.carstens@de.ibm.com: compile fix for !CONFIG_COMPAT] Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com> Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>	2019-01-18 09:33:19 +01:00
Arnd Bergmann	fef747bab3	s390: use generic UID16 implementation s390 has an almost identical copy of the code in kernel/uid16.c. The problem here is that it requires calling the regular system calls, which the generic implementation handles correctly, but the internal interfaces are not declared in a global header for this. The best way forward here seems to be to just use the generic code and delete the s390 specific implementation. I keep the changes to uapi/asm/posix_types.h inside of an #ifdef check so user space does not observe any changes. As some of the system calls pass pointers, we also need wrappers in compat_wrapper.c, which I add for all calls with at least one argument. All those wrappers can be removed in a later step. Link: https://lore.kernel.org/lkml/20190116131527.2071570-4-arnd@arndb.de Signed-off-by: Arnd Bergmann <arnd@arndb.de> Reviewed-by: Heiko Carstens <heiko.carstens@de.ibm.com> Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com> Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>	2019-01-18 09:33:18 +01:00
David Herrmann	f5dd3d0c96	net: introduce SO_BINDTOIFINDEX sockopt This introduces a new generic SOL_SOCKET-level socket option called SO_BINDTOIFINDEX. It behaves similar to SO_BINDTODEVICE, but takes a network interface index as argument, rather than the network interface name. User-space often refers to network-interfaces via their index, but has to temporarily resolve it to a name for a call into SO_BINDTODEVICE. This might pose problems when the network-device is renamed asynchronously by other parts of the system. When this happens, the SO_BINDTODEVICE might either fail, or worse, it might bind to the wrong device. In most cases user-space only ever operates on devices which they either manage themselves, or otherwise have a guarantee that the device name will not change (e.g., devices that are UP cannot be renamed). However, particularly in libraries this guarantee is non-obvious and it would be nice if that race-condition would simply not exist. It would make it easier for those libraries to operate even in situations where the device-name might change under the hood. A real use-case that we recently hit is trying to start the network stack early in the initrd but make it survive into the real system. Existing distributions rename network-interfaces during the transition from initrd into the real system. This, obviously, cannot affect devices that are up and running (unless you also consider moving them between network-namespaces). However, the network manager now has to make sure its management engine for dormant devices will not run in parallel to these renames. Particularly, when you offload operations like DHCP into separate processes, these might setup their sockets early, and thus have to resolve the device-name possibly running into this race-condition. By avoiding a call to resolve the device-name, we no longer depend on the name and can run network setup of dormant devices in parallel to the transition off the initrd. The SO_BINDTOIFINDEX ioctl plugs this race. Reviewed-by: Tom Gundersen <teg@jklm.no> Signed-off-by: David Herrmann <dh.herrmann@gmail.com> Acked-by: Willem de Bruijn <willemb@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2019-01-17 14:55:51 -08:00
Vasily Gorbik	190f056fba	s390/vdso: correct vdso mapping for compat tasks While "s390/vdso: avoid 64-bit vdso mapping for compat tasks" fixed 64-bit vdso mapping for compat tasks under gdb it introduced another problem. "compat_mm" flag is not inherited during fork and when 31-bit process forks a child (but does not perform exec) it ends up with 64-bit vdso. To address that, init_new_context (which is called during fork and exec) now initialize compat_mm based on thread TIF_31BIT flag. Later compat_mm is adjusted in arch_setup_additional_pages, which is called during exec. Fixes: `d1befa6582` ("s390/vdso: avoid 64-bit vdso mapping for compat tasks") Reported-by: Stefan Liebler <stli@linux.ibm.com> Reviewed-by: Heiko Carstens <heiko.carstens@de.ibm.com> Cc: <stable@vger.kernel.org> # v4.20+ Signed-off-by: Vasily Gorbik <gor@linux.ibm.com> Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>	2019-01-11 17:12:02 +01:00
Martin Schwidefsky	a38662084c	s390/mm: always force a load of the primary ASCE on context switch The ASCE of an mm_struct can be modified after a task has been created, e.g. via crst_table_downgrade for a compat process. The active_mm logic to avoid the switch_mm call if the next task is a kernel thread can lead to a situation where switch_mm is called where 'prev == next' is true but 'prev->context.asce == next->context.asce' is not. This can lead to a situation where a CPU uses the outdated ASCE to run a task. The result can be a crash, endless loops and really subtle problem due to TLBs being created with an invalid ASCE. Cc: stable@kernel.org # v3.15+ Fixes: `53e857f308` ("s390/mm,tlb: race of lazy TLB flush vs. recreation") Reported-by: Heiko Carstens <heiko.carstens@de.ibm.com> Reviewed-by: Heiko Carstens <heiko.carstens@de.ibm.com> Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>	2019-01-11 17:12:02 +01:00
Masahiro Yamada	d6e4b3e326	arch: remove redundant UAPI generic-y defines Now that Kbuild automatically creates asm-generic wrappers for missing mandatory headers, it is redundant to list the same headers in generic-y and mandatory-y. Suggested-by: Sam Ravnborg <sam@ravnborg.org> Signed-off-by: Masahiro Yamada <yamada.masahiro@socionext.com> Acked-by: Sam Ravnborg <sam@ravnborg.org>	2019-01-06 10:22:15 +09:00
Masahiro Yamada	d4ce5458ea	arch: remove stale comments "UAPI Header export list" These comments are leftovers of commit `fcc8487d47` ("uapi: export all headers under uapi directories"). Prior to that commit, exported headers must be explicitly added to header-y. Now, all headers under the uapi/ directories are exported. Signed-off-by: Masahiro Yamada <yamada.masahiro@socionext.com>	2019-01-06 09:46:51 +09:00
Linus Torvalds	a65981109f	Merge branch 'akpm' (patches from Andrew) Merge more updates from Andrew Morton: - procfs updates - various misc bits - lib/ updates - epoll updates - autofs - fatfs - a few more MM bits * emailed patches from Andrew Morton <akpm@linux-foundation.org>: (58 commits) mm/page_io.c: fix polled swap page in checkpatch: add Co-developed-by to signature tags docs: fix Co-Developed-by docs drivers/base/platform.c: kmemleak ignore a known leak fs: don't open code lru_to_page() fs/: remove caller signal_pending branch predictions mm/: remove caller signal_pending branch predictions arch/arc/mm/fault.c: remove caller signal_pending_branch predictions kernel/sched/: remove caller signal_pending branch predictions kernel/locking/mutex.c: remove caller signal_pending branch predictions mm: select HAVE_MOVE_PMD on x86 for faster mremap mm: speed up mremap by 20x on large regions mm: treewide: remove unused address argument from pte_alloc functions initramfs: cleanup incomplete rootfs scripts/gdb: fix lx-version string output kernel/kcov.c: mark write_comp_data() as notrace kernel/sysctl: add panic_print into sysctl panic: add options to print system info when panic happens bfs: extra sanity checking and static inode bitmap exec: separate MM_ANONPAGES and RLIMIT_STACK accounting ...	2019-01-05 09:16:18 -08:00
Joel Fernandes (Google)	4cf5892495	mm: treewide: remove unused address argument from pte_alloc functions Patch series "Add support for fast mremap". This series speeds up the mremap(2) syscall by copying page tables at the PMD level even for non-THP systems. There is concern that the extra 'address' argument that mremap passes to pte_alloc may do something subtle architecture related in the future that may make the scheme not work. Also we find that there is no point in passing the 'address' to pte_alloc since its unused. This patch therefore removes this argument tree-wide resulting in a nice negative diff as well. Also ensuring along the way that the enabled architectures do not do anything funky with the 'address' argument that goes unnoticed by the optimization. Build and boot tested on x86-64. Build tested on arm64. The config enablement patch for arm64 will be posted in the future after more testing. The changes were obtained by applying the following Coccinelle script. (thanks Julia for answering all Coccinelle questions!). Following fix ups were done manually: * Removal of address argument from pte_fragment_alloc * Removal of pte_alloc_one_fast definitions from m68k and microblaze. // Options: --include-headers --no-includes // Note: I split the 'identifier fn' line, so if you are manually // running it, please unsplit it so it runs for you. virtual patch @pte_alloc_func_def depends on patch exists@ identifier E2; identifier fn =~ "^(__pte_alloc\|pte_alloc_one\|pte_alloc\|__pte_alloc_kernel\|pte_alloc_one_kernel)$"; type T2; @@ fn(... - , T2 E2 ) { ... } @pte_alloc_func_proto_noarg depends on patch exists@ type T1, T2, T3, T4; identifier fn =~ "^(__pte_alloc\|pte_alloc_one\|pte_alloc\|__pte_alloc_kernel\|pte_alloc_one_kernel)$"; @@ ( - T3 fn(T1, T2); + T3 fn(T1); \| - T3 fn(T1, T2, T4); + T3 fn(T1, T2); ) @pte_alloc_func_proto depends on patch exists@ identifier E1, E2, E4; type T1, T2, T3, T4; identifier fn =~ "^(__pte_alloc\|pte_alloc_one\|pte_alloc\|__pte_alloc_kernel\|pte_alloc_one_kernel)$"; @@ ( - T3 fn(T1 E1, T2 E2); + T3 fn(T1 E1); \| - T3 fn(T1 E1, T2 E2, T4 E4); + T3 fn(T1 E1, T2 E2); ) @pte_alloc_func_call depends on patch exists@ expression E2; identifier fn =~ "^(__pte_alloc\|pte_alloc_one\|pte_alloc\|__pte_alloc_kernel\|pte_alloc_one_kernel)$"; @@ fn(... -, E2 ) @pte_alloc_macro depends on patch exists@ identifier fn =~ "^(__pte_alloc\|pte_alloc_one\|pte_alloc\|__pte_alloc_kernel\|pte_alloc_one_kernel)$"; identifier a, b, c; expression e; position p; @@ ( - #define fn(a, b, c) e + #define fn(a, b) e \| - #define fn(a, b) e + #define fn(a) e ) Link: http://lkml.kernel.org/r/20181108181201.88826-2-joelaf@google.com Signed-off-by: Joel Fernandes (Google) <joel@joelfernandes.org> Suggested-by: Kirill A. Shutemov <kirill@shutemov.name> Acked-by: Kirill A. Shutemov <kirill@shutemov.name> Cc: Michal Hocko <mhocko@kernel.org> Cc: Julia Lawall <Julia.Lawall@lip6.fr> Cc: Kirill A. Shutemov <kirill@shutemov.name> Cc: William Kucharski <william.kucharski@oracle.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2019-01-04 13:13:47 -08:00
Matthew Wilcox	3fc2579e6f	fls: change parameter to unsigned int When testing in userspace, UBSAN pointed out that shifting into the sign bit is undefined behaviour. It doesn't really make sense to ask for the highest set bit of a negative value, so just turn the argument type into an unsigned int. Some architectures (eg ppc) already had it declared as an unsigned int, so I don't expect too many problems. Link: http://lkml.kernel.org/r/20181105221117.31828-1-willy@infradead.org Signed-off-by: Matthew Wilcox <willy@infradead.org> Acked-by: Thomas Gleixner <tglx@linutronix.de> Acked-by: Geert Uytterhoeven <geert@linux-m68k.org> Cc: <linux-arch@vger.kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2019-01-04 13:13:46 -08:00
Linus Torvalds	96d4f267e4	Remove 'type' argument from access_ok() function Nobody has actually used the type (VERIFY_READ vs VERIFY_WRITE) argument of the user address range verification function since we got rid of the old racy i386-only code to walk page tables by hand. It existed because the original 80386 would not honor the write protect bit when in kernel mode, so you had to do COW by hand before doing any user access. But we haven't supported that in a long time, and these days the 'type' argument is a purely historical artifact. A discussion about extending 'user_access_begin()' to do the range checking resulted this patch, because there is no way we're going to move the old VERIFY_xyz interface to that model. And it's best done at the end of the merge window when I've done most of my merges, so let's just get this done once and for all. This patch was mostly done with a sed-script, with manual fix-ups for the cases that weren't of the trivial 'access_ok(VERIFY_xyz' form. There were a couple of notable cases: - csky still had the old "verify_area()" name as an alias. - the iter_iov code had magical hardcoded knowledge of the actual values of VERIFY_{READ,WRITE} (not that they mattered, since nothing really used it) - microblaze used the type argument for a debug printout but other than those oddities this should be a total no-op patch. I tried to fix up all architectures, did fairly extensive grepping for access_ok() uses, and the changes are trivial, but I may have missed something. Any missed conversion should be trivially fixable, though. Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2019-01-03 18:57:57 -08:00
Linus Torvalds	04a17edeca	s390 updates for the 4.21 merge window - A larger update for the zcrypt / AP bus code + Update two inline assemblies in the zcrypt driver to make gcc happy + Add a missing reply code for invalid special commands for zcrypt + Allow AP device reset to be triggered from user space + Split the AP scan function into smaller, more readable functions - Updates for vfio-ccw and vfio-ap + Add maintainers and reviewer for vfio-ccw + Include facility.h in vfio_ap_drv.c to avoid fragile include chain + Simplicy vfio-ccw state machine - Use the common code version of bust_spinlocks - Make use of the DEFINE_SHOW_ATTRIBUTE - Fix three incorrect file permissions in the DASD driver - Remove bit spin-lock from the PCI interrupt handler - Fix GFP_ATOMIC vs GFP_KERNEL in the PCI code -----BEGIN PGP SIGNATURE----- Version: GnuPG v2 iQEcBAABCAAGBQJcLGoIAAoJEDjwexyKj9rgyN8IANaQvHbVBA3vz/Ssb6ZiR/K6 rTBoXjJQqyJ/cf6RZeFi1b4Douv4QWJw3s06KXbrdmK/ONm5rypXVfXlAhY71pg5 40BUb92MGXhJw6JFDQ50Udd6Z5r7r6RYR1puyg4tzHmBuNVL7FB5RqFm92UOkMOD ZI03G1sfA6/1XUKhNfCfNBB6Jt6V+iAAex8bgrp09wAeoGnAO20oFuis9u7pLlNm a5Cp9n7faXEN+qes1iBtVDr5o7opuhanwWKnhvsYTAbpOo7jGJ/47IPKT2Wfmurd wkMZBEC+Ntk/IfkaBzp7azeISZD5EbucTcgo/I9nzq/aWeflfXXeYl7My0aQB48= =Lqrh -----END PGP SIGNATURE----- Merge tag 's390-4.21-1' of git://git.kernel.org/pub/scm/linux/kernel/git/s390/linux Pull s390 updates from Martin Schwidefsky: - A larger update for the zcrypt / AP bus code: + Update two inline assemblies in the zcrypt driver to make gcc happy + Add a missing reply code for invalid special commands for zcrypt + Allow AP device reset to be triggered from user space + Split the AP scan function into smaller, more readable functions - Updates for vfio-ccw and vfio-ap + Add maintainers and reviewer for vfio-ccw + Include facility.h in vfio_ap_drv.c to avoid fragile include chain + Simplicy vfio-ccw state machine - Use the common code version of bust_spinlocks - Make use of the DEFINE_SHOW_ATTRIBUTE - Fix three incorrect file permissions in the DASD driver - Remove bit spin-lock from the PCI interrupt handler - Fix GFP_ATOMIC vs GFP_KERNEL in the PCI code * tag 's390-4.21-1' of git://git.kernel.org/pub/scm/linux/kernel/git/s390/linux: s390/zcrypt: rework ap scan bus code s390/zcrypt: make sysfs reset attribute trigger queue reset s390/pci: fix sleeping in atomic during hotplug s390/pci: remove bit_lock usage in interrupt handler s390/drivers: fix proc/debugfs file permissions s390: convert to DEFINE_SHOW_ATTRIBUTE MAINTAINERS/vfio-ccw: add Farhan and Eric, make Halil Reviewer vfio: ccw: Merge BUSY and BOXED states s390: use common bust_spinlocks() s390/zcrypt: improve special ap message cmd handling s390/ap: rework assembler functions to use unions for in/out register variables s390: vfio-ap: include <asm/facility> for test_facility()	2019-01-02 18:37:01 -08:00
Will Deacon	08861d33d6	preempt: Move PREEMPT_NEED_RESCHED definition into arch code PREEMPT_NEED_RESCHED is never used directly, so move it into the arch code where it can potentially be implemented using either a different bit in the preempt count or as an entirely separate entity. Cc: Robert Love <rml@tech9.net> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Martin Schwidefsky <schwidefsky@de.ibm.com> Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org> Signed-off-by: Will Deacon <will.deacon@arm.com>	2018-12-07 12:35:46 +00:00
Harald Freudenberger	be53479101	s390/zcrypt: improve special ap message cmd handling There exist very few ap messages which need to have the 'special' flag enabled. This flag tells the firmware layer to do some pre- and maybe postprocessing. However, it may happen that this special flag is enabled but the firmware is unable to deal with this kind of message and thus returns with reply code 0x41. For example older firmware may not know the newest messages triggered by the zcrypt device driver and thus react with reject and the named reply code. Unfortunately this reply code is not known to the zcrypt error routines and thus default behavior is to switch the ap queue offline. This patch now makes the ap error routine aware of the reply code and so userspace is informed about the bad processing result but the queue is not switched to offline state any more. Signed-off-by: Harald Freudenberger <freude@linux.ibm.com> Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>	2018-11-30 07:22:05 +01:00
Harald Freudenberger	159491f3b5	s390/ap: rework assembler functions to use unions for in/out register variables The inline assembler functions ap_aqic() and ap_qact() used two variables declared on the very same register. One variable was for input only, the other for output. Looks like newer versions of the gcc don't like this. Anyway it is a better coding to use one variable (which may have a union data type) on one register for input and output. So this patch introduces unions and uses only one variable now for input and output for GR1 for the PQAP(QACT) and PQAP(QIC) invocation. Signed-off-by: Harald Freudenberger <freude@linux.ibm.com> Acked-by: Ilya Leoshkevich <iii@linux.ibm.com> Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>	2018-11-30 07:22:05 +01:00
Linus Torvalds	3541833fd1	s390 updates for 4.20-rc2 - A fix for the pgtable_bytes misaccounting on s390. The patch changes common code part in regard to page table folding and adds extra checks to mm_[inc\|dec]_nr_[pmds\|puds]. - Add FORCE for all build targets using if_changed - Use non-loadable phdr for the .vmlinux.info section to avoid a segment overlap that confuses kexec - Cleanup the attribute definition for the diagnostic sampling - Increase stack size for CONFIG_KASAN=y builds - Export __node_distance to fix a build error - Correct return code of a PMU event init function - An update for the default configs -----BEGIN PGP SIGNATURE----- Version: GnuPG v2 iQEcBAABCAAGBQJb5TIMAAoJEDjwexyKj9rgIH8H/0daZTyxcLwY9gbigaq1Qs4R /ScmAJJc2U/Qj8b9UskhsmHAUuAufF2oljU16SquP7CBGhtkLRrjPtdh1AMiiZGM reVF7X5LU8MH0QUoNnKPWAL4DD1q2E99IAEH5TeGIODUG6srqvIHBNtXDWNLPtBf fpOhJ/NssgxyuYUXi/WnoEjIyP8KABeG6SlpcLzYbmY1hUOIXcixuv39UrL0G5OO P8ciL+W5rTcPZCnpJ1Xk9hKploT8gWXhMT5QhNnakgMF/25v80+TZy5xRZMuLAmQ T5SFP6B71o05nLK7fLi3VAIKPv/QibjiyJOEf9uUHdo1XZcD5uRu0EQ/LklLUBU= =4H06 -----END PGP SIGNATURE----- Merge tag 's390-4.20-2' of git://git.kernel.org/pub/scm/linux/kernel/git/s390/linux Pull s390 fixes from Martin Schwidefsky: - A fix for the pgtable_bytes misaccounting on s390. The patch changes common code part in regard to page table folding and adds extra checks to mm_[inc\|dec]_nr_[pmds\|puds]. - Add FORCE for all build targets using if_changed - Use non-loadable phdr for the .vmlinux.info section to avoid a segment overlap that confuses kexec - Cleanup the attribute definition for the diagnostic sampling - Increase stack size for CONFIG_KASAN=y builds - Export __node_distance to fix a build error - Correct return code of a PMU event init function - An update for the default configs * tag 's390-4.20-2' of git://git.kernel.org/pub/scm/linux/kernel/git/s390/linux: s390/perf: Change CPUM_CF return code in event init function s390: update defconfigs s390/mm: Fix ERROR: "__node_distance" undefined! s390/kasan: increase instrumented stack size to 64k s390/cpum_sf: Rework attribute definition for diagnostic sampling s390/mm: fix mis-accounting of pgtable_bytes mm: add mm_pxd_folded checks to pgtable_bytes accounting functions mm: introduce mm_[p4d\|pud\|pmd]_folded mm: make the __PAGETABLE_PxD_FOLDED defines non-empty s390: avoid vmlinux segments overlap s390/vdso: add missing FORCE to build targets s390/decompressor: add missing FORCE to build targets	2018-11-09 06:30:44 -06:00
Martin Schwidefsky	163c8d54a9	compiler: remove __no_sanitize_address_or_inline again The __no_sanitize_address_or_inline and __no_kasan_or_inline defines are almost identical. The only difference is that __no_kasan_or_inline does not have the 'notrace' attribute. To be able to replace __no_sanitize_address_or_inline with the older definition, add 'notrace' to __no_kasan_or_inline and change to two users of __no_sanitize_address_or_inline in the s390 code. The 'notrace' option is necessary for e.g. the __load_psw_mask function in arch/s390/include/asm/processor.h. Without the option it is possible to trace __load_psw_mask which leads to kernel stack overflow. Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com> Pointed-out-by: Andrey Ryabinin <aryabinin@virtuozzo.com> Acked-by: Steven Rostedt (VMware) <rostedt@goodmis.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2018-11-05 08:14:18 -08:00
Vasily Gorbik	9fed920e68	s390/kasan: increase instrumented stack size to 64k Increase kasan instrumented kernel stack size from 32k to 64k. Other architectures seems to get away with just doubling kernel stack size under kasan, but on s390 this appears to be not enough due to bigger frame size. The particular pain point is kasan inlined checks (CONFIG_KASAN_INLINE vs CONFIG_KASAN_OUTLINE). With inlined checks one particular case hitting stack overflow is fs sync on xfs filesystem: #0 [9a0681e8] 704 bytes check_usage at 34b1fc #1 [9a0684a8] 432 bytes check_usage at 34c710 #2 [9a068658] 1048 bytes validate_chain at 35044a #3 [9a068a70] 312 bytes __lock_acquire at 3559fe #4 [9a068ba8] 440 bytes lock_acquire at 3576ee #5 [9a068d60] 104 bytes _raw_spin_lock at 21b44e0 #6 [9a068dc8] 1992 bytes enqueue_entity at 2dbf72 #7 [9a069590] 1496 bytes enqueue_task_fair at 2df5f0 #8 [9a069b68] 64 bytes ttwu_do_activate at 28f438 #9 [9a069ba8] 552 bytes try_to_wake_up at 298c4c #10 [9a069dd0] 168 bytes wake_up_worker at 23f97c #11 [9a069e78] 200 bytes insert_work at 23fc2e #12 [9a069f40] 648 bytes __queue_work at 2487c0 #13 [9a06a1c8] 200 bytes __queue_delayed_work at 24db28 #14 [9a06a290] 248 bytes mod_delayed_work_on at 24de84 #15 [9a06a388] 24 bytes kblockd_mod_delayed_work_on at 153e2a0 #16 [9a06a3a0] 288 bytes __blk_mq_delay_run_hw_queue at 158168c #17 [9a06a4c0] 192 bytes blk_mq_run_hw_queue at 1581a3c #18 [9a06a580] 184 bytes blk_mq_sched_insert_requests at 15a2192 #19 [9a06a638] 1024 bytes blk_mq_flush_plug_list at 1590f3a #20 [9a06aa38] 704 bytes blk_flush_plug_list at 1555028 #21 [9a06acf8] 320 bytes schedule at 219e476 #22 [9a06ae38] 760 bytes schedule_timeout at 21b0aac #23 [9a06b130] 408 bytes wait_for_common at 21a1706 #24 [9a06b2c8] 360 bytes xfs_buf_iowait at fa1540 #25 [9a06b430] 256 bytes __xfs_buf_submit at fadae6 #26 [9a06b530] 264 bytes xfs_buf_read_map at fae3f6 #27 [9a06b638] 656 bytes xfs_trans_read_buf_map at 10ac9a8 #28 [9a06b8c8] 304 bytes xfs_btree_kill_root at e72426 #29 [9a06b9f8] 288 bytes xfs_btree_lookup_get_block at e7bc5e #30 [9a06bb18] 624 bytes xfs_btree_lookup at e7e1a6 #31 [9a06bd88] 2664 bytes xfs_alloc_ag_vextent_near at dfa070 #32 [9a06c7f0] 144 bytes xfs_alloc_ag_vextent at dff3ca #33 [9a06c880] 1128 bytes xfs_alloc_vextent at e05fce #34 [9a06cce8] 584 bytes xfs_bmap_btalloc at e58342 #35 [9a06cf30] 1336 bytes xfs_bmapi_write at e618de #36 [9a06d468] 776 bytes xfs_iomap_write_allocate at ff678e #37 [9a06d770] 720 bytes xfs_map_blocks at f82af8 #38 [9a06da40] 928 bytes xfs_writepage_map at f83cd6 #39 [9a06dde0] 320 bytes xfs_do_writepage at f85872 #40 [9a06df20] 1320 bytes write_cache_pages at 73dfe8 #41 [9a06e448] 208 bytes xfs_vm_writepages at f7f892 #42 [9a06e518] 88 bytes do_writepages at 73fe6a #43 [9a06e570] 872 bytes __writeback_single_inode at a20cb6 #44 [9a06e8d8] 664 bytes writeback_sb_inodes at a23be2 #45 [9a06eb70] 296 bytes __writeback_inodes_wb at a242e0 #46 [9a06ec98] 928 bytes wb_writeback at a2500e #47 [9a06f038] 848 bytes wb_do_writeback at a260ae #48 [9a06f388] 536 bytes wb_workfn at a28228 #49 [9a06f5a0] 1088 bytes process_one_work at 24a234 #50 [9a06f9e0] 1120 bytes worker_thread at 24ba26 #51 [9a06fe40] 104 bytes kthread at 26545a #52 [9a06fea8] kernel_thread_starter at 21b6b62 To be able to increase the stack size to 64k reuse LLILL instruction in __switch_to function to load 64k - STACK_FRAME_OVERHEAD - __PT_SIZE (65192) value as unsigned. Reported-by: Benjamin Block <bblock@linux.ibm.com> Reviewed-by: Heiko Carstens <heiko.carstens@de.ibm.com> Signed-off-by: Vasily Gorbik <gor@linux.ibm.com> Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>	2018-11-02 08:31:57 +01:00
Martin Schwidefsky	e12e4044ae	s390/mm: fix mis-accounting of pgtable_bytes In case a fork or a clone system fails in copy_process and the error handling does the mmput() at the bad_fork_cleanup_mm label, the following warning messages will appear on the console: BUG: non-zero pgtables_bytes on freeing mm: 16384 The reason for that is the tricks we play with mm_inc_nr_puds() and mm_inc_nr_pmds() in init_new_context(). A normal 64-bit process has 3 levels of page table, the p4d level and the pud level are folded. On process termination the free_pud_range() function in mm/memory.c will subtract 16KB from pgtable_bytes with a mm_dec_nr_puds() call, but there actually is not really a pud table. One issue with this is the fact that pgtable_bytes is usually off by a few kilobytes, but the more severe problem is that for a failed fork or clone the free_pgtables() function is not called. In this case there is no mm_dec_nr_puds() or mm_dec_nr_pmds() that go together with the mm_inc_nr_puds() and mm_inc_nr_pmds in init_new_context(). The pgtable_bytes will be off by 16384 or 32768 bytes and we get the BUG message. The message itself is purely cosmetic, but annoying. To fix this override the mm_pmd_folded, mm_pud_folded and mm_p4d_folded function to check for the true size of the address space. Reported-by: Li Wang <liwang@redhat.com> Tested-by: Li Wang <liwang@redhat.com> Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>	2018-11-02 08:31:55 +01:00
Nick Desaulniers	de0d22e50c	treewide: remove current_text_addr Prefer _THIS_IP_ defined in linux/kernel.h. Most definitions of current_text_addr were the same as _THIS_IP_, but a few archs had inline assembly instead. This patch removes the final call site of current_text_addr, making all of the definitions dead code. [akpm@linux-foundation.org: fix arch/csky/include/asm/processor.h] Link: http://lkml.kernel.org/r/20180911182413.180715-1-ndesaulniers@google.com Signed-off-by: Nick Desaulniers <ndesaulniers@google.com> Cc: Peter Zijlstra <peterz@infradead.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2018-10-31 08:54:12 -07:00
Linus Torvalds	0d1e8b8d2b	KVM updates for v4.20 ARM: - Improved guest IPA space support (32 to 52 bits) - RAS event delivery for 32bit - PMU fixes - Guest entry hardening - Various cleanups - Port of dirty_log_test selftest PPC: - Nested HV KVM support for radix guests on POWER9. The performance is much better than with PR KVM. Migration and arbitrary level of nesting is supported. - Disable nested HV-KVM on early POWER9 chips that need a particular hardware bug workaround - One VM per core mode to prevent potential data leaks - PCI pass-through optimization - merge ppc-kvm topic branch and kvm-ppc-fixes to get a better base s390: - Initial version of AP crypto virtualization via vfio-mdev - Improvement for vfio-ap - Set the host program identifier - Optimize page table locking x86: - Enable nested virtualization by default - Implement Hyper-V IPI hypercalls - Improve #PF and #DB handling - Allow guests to use Enlightened VMCS - Add migration selftests for VMCS and Enlightened VMCS - Allow coalesced PIO accesses - Add an option to perform nested VMCS host state consistency check through hardware - Automatic tuning of lapic_timer_advance_ns - Many fixes, minor improvements, and cleanups -----BEGIN PGP SIGNATURE----- iQEcBAABCAAGBQJb0FINAAoJEED/6hsPKofoI60IAJRS3vOAQ9Fav8cJsO1oBHcX 3+NexfnBke1bzrjIR3SUcHKGZbdnVPNZc+Q4JjIbPpPmmOMU5jc9BC1dmd5f4Vzh BMnQ0yCvgFv3A3fy/Icx1Z8NJppxosdmqdQLrQrNo8aD3cjnqY2yQixdXrAfzLzw XEgKdIFCCz8oVN/C9TT4wwJn6l9OE7BM5bMKGFy5VNXzMu7t64UDOLbbjZxNgi1g teYvfVGdt5mH0N7b2GPPWRbJmgnz5ygVVpVNQUEFrdKZoCm6r5u9d19N+RRXAwan ZYFj10W2T8pJOUf3tryev4V33X7MRQitfJBo4tP5hZfi9uRX89np5zP1CFE7AtY= =yEPW -----END PGP SIGNATURE----- Merge tag 'kvm-4.20-1' of git://git.kernel.org/pub/scm/virt/kvm/kvm Pull KVM updates from Radim Krčmář: "ARM: - Improved guest IPA space support (32 to 52 bits) - RAS event delivery for 32bit - PMU fixes - Guest entry hardening - Various cleanups - Port of dirty_log_test selftest PPC: - Nested HV KVM support for radix guests on POWER9. The performance is much better than with PR KVM. Migration and arbitrary level of nesting is supported. - Disable nested HV-KVM on early POWER9 chips that need a particular hardware bug workaround - One VM per core mode to prevent potential data leaks - PCI pass-through optimization - merge ppc-kvm topic branch and kvm-ppc-fixes to get a better base s390: - Initial version of AP crypto virtualization via vfio-mdev - Improvement for vfio-ap - Set the host program identifier - Optimize page table locking x86: - Enable nested virtualization by default - Implement Hyper-V IPI hypercalls - Improve #PF and #DB handling - Allow guests to use Enlightened VMCS - Add migration selftests for VMCS and Enlightened VMCS - Allow coalesced PIO accesses - Add an option to perform nested VMCS host state consistency check through hardware - Automatic tuning of lapic_timer_advance_ns - Many fixes, minor improvements, and cleanups" * tag 'kvm-4.20-1' of git://git.kernel.org/pub/scm/virt/kvm/kvm: (204 commits) KVM/nVMX: Do not validate that posted_intr_desc_addr is page aligned Revert "kvm: x86: optimize dr6 restore" KVM: PPC: Optimize clearing TCEs for sparse tables x86/kvm/nVMX: tweak shadow fields selftests/kvm: add missing executables to .gitignore KVM: arm64: Safety check PSTATE when entering guest and handle IL KVM: PPC: Book3S HV: Don't use streamlined entry path on early POWER9 chips arm/arm64: KVM: Enable 32 bits kvm vcpu events support arm/arm64: KVM: Rename function kvm_arch_dev_ioctl_check_extension() KVM: arm64: Fix caching of host MDCR_EL2 value KVM: VMX: enable nested virtualization by default KVM/x86: Use 32bit xor to clear registers in svm.c kvm: x86: Introduce KVM_CAP_EXCEPTION_PAYLOAD kvm: vmx: Defer setting of DR6 until #DB delivery kvm: x86: Defer setting of CR2 until #PF delivery kvm: x86: Add payload operands to kvm_multiple_exception kvm: x86: Add exception payload fields to kvm_vcpu_events kvm: x86: Add has_payload and payload to kvm_queued_exception KVM: Documentation: Fix omission in struct kvm_vcpu_events KVM: selftests: add Enlightened VMCS test ...	2018-10-25 17:57:35 -07:00
Linus Torvalds	4dcb9239da	Merge branch 'timers-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull timekeeping updates from Thomas Gleixner: "The timers and timekeeping departement provides: - Another large y2038 update with further preparations for providing the y2038 safe timespecs closer to the syscalls. - An overhaul of the SHCMT clocksource driver - SPDX license identifier updates - Small cleanups and fixes all over the place" * 'timers-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (31 commits) tick/sched : Remove redundant cpu_online() check clocksource/drivers/dw_apb: Add reset control clocksource: Remove obsolete CLOCKSOURCE_OF_DECLARE clocksource/drivers: Unify the names to timer-* format clocksource/drivers/sh_cmt: Add R-Car gen3 support dt-bindings: timer: renesas: cmt: document R-Car gen3 support clocksource/drivers/sh_cmt: Properly line-wrap sh_cmt_of_table[] initializer clocksource/drivers/sh_cmt: Fix clocksource width for 32-bit machines clocksource/drivers/sh_cmt: Fixup for 64-bit machines clocksource/drivers/sh_tmu: Convert to SPDX identifiers clocksource/drivers/sh_mtu2: Convert to SPDX identifiers clocksource/drivers/sh_cmt: Convert to SPDX identifiers clocksource/drivers/renesas-ostm: Convert to SPDX identifiers clocksource: Convert to using %pOFn instead of device_node.name tick/broadcast: Remove redundant check RISC-V: Request newstat syscalls y2038: signal: Change rt_sigtimedwait to use __kernel_timespec y2038: socket: Change recvmmsg to use __kernel_timespec y2038: sched: Change sched_rr_get_interval to use __kernel_timespec y2038: utimes: Rework #ifdef guards for compat syscalls ...	2018-10-25 11:14:36 -07:00
Linus Torvalds	ba9f6f8954	Merge branch 'siginfo-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/ebiederm/user-namespace Pull siginfo updates from Eric Biederman: "I have been slowly sorting out siginfo and this is the culmination of that work. The primary result is in several ways the signal infrastructure has been made less error prone. The code has been updated so that manually specifying SEND_SIG_FORCED is never necessary. The conversion to the new siginfo sending functions is now complete, which makes it difficult to send a signal without filling in the proper siginfo fields. At the tail end of the patchset comes the optimization of decreasing the size of struct siginfo in the kernel from 128 bytes to about 48 bytes on 64bit. The fundamental observation that enables this is by definition none of the known ways to use struct siginfo uses the extra bytes. This comes at the cost of a small user space observable difference. For the rare case of siginfo being injected into the kernel only what can be copied into kernel_siginfo is delivered to the destination, the rest of the bytes are set to 0. For cases where the signal and the si_code are known this is safe, because we know those bytes are not used. For cases where the signal and si_code combination is unknown the bits that won't fit into struct kernel_siginfo are tested to verify they are zero, and the send fails if they are not. I made an extensive search through userspace code and I could not find anything that would break because of the above change. If it turns out I did break something it will take just the revert of a single change to restore kernel_siginfo to the same size as userspace siginfo. Testing did reveal dependencies on preferring the signo passed to sigqueueinfo over si->signo, so bit the bullet and added the complexity necessary to handle that case. Testing also revealed bad things can happen if a negative signal number is passed into the system calls. Something no sane application will do but something a malicious program or a fuzzer might do. So I have fixed the code that performs the bounds checks to ensure negative signal numbers are handled" * 'siginfo-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/ebiederm/user-namespace: (80 commits) signal: Guard against negative signal numbers in copy_siginfo_from_user32 signal: Guard against negative signal numbers in copy_siginfo_from_user signal: In sigqueueinfo prefer sig not si_signo signal: Use a smaller struct siginfo in the kernel signal: Distinguish between kernel_siginfo and siginfo signal: Introduce copy_siginfo_from_user and use it's return value signal: Remove the need for __ARCH_SI_PREABLE_SIZE and SI_PAD_SIZE signal: Fail sigqueueinfo if si_signo != sig signal/sparc: Move EMT_TAGOVF into the generic siginfo.h signal/unicore32: Use force_sig_fault where appropriate signal/unicore32: Generate siginfo in ucs32_notify_die signal/unicore32: Use send_sig_fault where appropriate signal/arc: Use force_sig_fault where appropriate signal/arc: Push siginfo generation into unhandled_exception signal/ia64: Use force_sig_fault where appropriate signal/ia64: Use the force_sig(SIGSEGV,...) in ia64_rt_sigreturn signal/ia64: Use the generic force_sigsegv in setup_frame signal/arm/kvm: Use send_sig_mceerr signal/arm: Use send_sig_fault where appropriate signal/arm: Use force_sig_fault where appropriate ...	2018-10-24 11:22:39 +01:00
Linus Torvalds	0200fbdd43	Merge branch 'locking-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull locking and misc x86 updates from Ingo Molnar: "Lots of changes in this cycle - in part because locking/core attracted a number of related x86 low level work which was easier to handle in a single tree: - Linux Kernel Memory Consistency Model updates (Alan Stern, Paul E. McKenney, Andrea Parri) - lockdep scalability improvements and micro-optimizations (Waiman Long) - rwsem improvements (Waiman Long) - spinlock micro-optimization (Matthew Wilcox) - qspinlocks: Provide a liveness guarantee (more fairness) on x86. (Peter Zijlstra) - Add support for relative references in jump tables on arm64, x86 and s390 to optimize jump labels (Ard Biesheuvel, Heiko Carstens) - Be a lot less permissive on weird (kernel address) uaccess faults on x86: BUG() when uaccess helpers fault on kernel addresses (Jann Horn) - macrofy x86 asm statements to un-confuse the GCC inliner. (Nadav Amit) - ... and a handful of other smaller changes as well" * 'locking-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (57 commits) locking/lockdep: Make global debug_locks* variables read-mostly locking/lockdep: Fix debug_locks off performance problem locking/pvqspinlock: Extend node size when pvqspinlock is configured locking/qspinlock_stat: Count instances of nested lock slowpaths locking/qspinlock, x86: Provide liveness guarantee x86/asm: 'Simplify' GEN_*_RMWcc() macros locking/qspinlock: Rework some comments locking/qspinlock: Re-order code locking/lockdep: Remove duplicated 'lock_class_ops' percpu array x86/defconfig: Enable CONFIG_USB_XHCI_HCD=y futex: Replace spin_is_locked() with lockdep locking/lockdep: Make class->ops a percpu counter and move it under CONFIG_DEBUG_LOCKDEP=y x86/jump-labels: Macrofy inline assembly code to work around GCC inlining bugs x86/cpufeature: Macrofy inline assembly code to work around GCC inlining bugs x86/extable: Macrofy inline assembly code to work around GCC inlining bugs x86/paravirt: Work around GCC inlining bugs when compiling paravirt ops x86/bug: Macrofy the BUG table section handling, to work around GCC inlining bugs x86/alternatives: Macrofy lock prefixes to work around GCC inlining bugs x86/refcount: Work around GCC inlining bug x86/objtool: Use asm macros to work around GCC inlining bugs ...	2018-10-23 13:08:53 +01:00
Vasily Gorbik	cf3dbe5dac	s390/kasan: support preemptible kernel build When the kernel is built with: CONFIG_PREEMPT=y CONFIG_PREEMPT_COUNT=y "stfle" function used by kasan initialization code makes additional call to preempt_count_add/preempt_count_sub. To avoid removing kasan instrumentation from sched code where those functions leave split stfle function and provide __stfle variant without preemption handling to be used by Kasan. Reported-by: Benjamin Block <bblock@linux.ibm.com> Acked-by: Heiko Carstens <heiko.carstens@de.ibm.com> Signed-off-by: Vasily Gorbik <gor@linux.ibm.com> Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>	2018-10-22 08:37:45 +02:00
Ingo Franzki	fb1136d658	s390/pkey: Introduce new API for transforming key blobs Introduce a new ioctl API and in-kernel API to transform a variable length key blob of any supported type into a protected key. Transforming a secure key blob uses the already existing function pkey_sec2protk(). Transforming a protected key blob also verifies if the protected key is still valid. If not, -ENODEV is returned. Both APIs are described in detail in the header files arch/s390/include/asm/pkey.h and arch/s390/include/uapi/asm/pkey.h. Signed-off-by: Ingo Franzki <ifranzki@linux.ibm.com> Reviewed-by: Harald Freudenberger <freude@linux.ibm.com> Reviewed-by: Hendrik Brueckner <brueckner@linux.ibm.com> Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>	2018-10-10 07:37:19 +02:00
Ingo Franzki	cb26b9ff71	s390/pkey: Introduce new API for random protected key verification Introduce a new ioctl API and in-kernel API to verify if a random protected key is still valid. A protected key is invalid when its wrapping key verification pattern does not match the verification pattern of the LPAR. Each time an LPAR is activated, a new LPAR wrapping key is generated and the wrapping key verification pattern is updated. Both APIs are described in detail in the header files arch/s390/include/asm/pkey.h and arch/s390/include/uapi/asm/pkey.h. Signed-off-by: Ingo Franzki <ifranzki@linux.ibm.com> Reviewed-by: Harald Freudenberger <freude@linux.ibm.com> Reviewed-by: Hendrik Brueckner <brueckner@linux.ibm.com> Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>	2018-10-10 07:37:18 +02:00
Ingo Franzki	a45a5c7d36	s390/pkey: Introduce new API for random protected key generation This patch introduces a new ioctl API and in-kernel API to generate a random protected key. The protected key is generated in a way that the effective clear key is never exposed in clear. Both APIs are described in detail in the header files arch/s390/include/asm/pkey.h and arch/s390/include/uapi/asm/pkey.h. Signed-off-by: Ingo Franzki <ifranzki@linux.ibm.com> Reviewed-by: Harald Freudenberger <freude@linux.ibm.com> Reviewed-by: Hendrik Brueckner <brueckner@linux.ibm.com> Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>	2018-10-09 11:21:38 +02:00
Harald Freudenberger	ee410de890	s390/zcrypt: zcrypt device driver cleanup Some cleanup in the s390 zcrypt device driver: - Removed fragments of pcixx crypto card code. This code can't be reached anymore because the hardware detection function does not recognize crypto cards < CEX2 since commit f56545430736 ("s390/zcrypt: Introduce QACT support for AP bus devices.") - Rename of some files and driver names which where still reflecting pcixx support to cex2a/cex2c. - Removed all the zcrypt version strings in the file headers. There is only one place left - the zcrypt.h header file is now the only place for zcrypt device driver version info. - Zcrypt version pump up from 2.2.0 to 2.2.1. Signed-off-by: Harald Freudenberger <freude@linux.ibm.com> Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>	2018-10-09 11:21:35 +02:00
Vasily Gorbik	19733fe872	s390/head: avoid doubling early boot stack size under KASAN Early boot stack uses predefined 4 pages of memory 0x8000-0xC000. This stack is used to run not instumented decompressor/facilities verification C code. It doesn't make sense to double its size when the kernel is built with KASAN support. BOOT_STACK_ORDER is introduced to avoid that. Signed-off-by: Vasily Gorbik <gor@linux.ibm.com> Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>	2018-10-09 11:21:32 +02:00
Vasily Gorbik	5dff03813f	s390/kasan: add option for 4-level paging support By default 3-level paging is used when the kernel is compiled with kasan support. Add 4-level paging option to support systems with more then 3TB of physical memory and to cover 4-level paging specific code with kasan as well. Reviewed-by: Martin Schwidefsky <schwidefsky@de.ibm.com> Signed-off-by: Vasily Gorbik <gor@linux.ibm.com> Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>	2018-10-09 11:21:29 +02:00
Vasily Gorbik	135ff16393	s390/kasan: free early identity mapping structures Kasan initialization code is changed to populate persistent shadow first, save allocator position into pgalloc_freeable and proceed with early identity mapping creation. This way early identity mapping paging structures could be freed at once after switching to swapper_pg_dir when early identity mapping is not needed anymore. Reviewed-by: Martin Schwidefsky <schwidefsky@de.ibm.com> Signed-off-by: Vasily Gorbik <gor@linux.ibm.com> Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>	2018-10-09 11:21:29 +02:00
Vasily Gorbik	5e78596329	s390/kasan: enable stack and global variables access checks By defining KASAN_SHADOW_OFFSET in Kconfig stack and global variables memory access check instrumentation is enabled. gcc version 4.9.2 or newer is also required. Reviewed-by: Martin Schwidefsky <schwidefsky@de.ibm.com> Signed-off-by: Vasily Gorbik <gor@linux.ibm.com> Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>	2018-10-09 11:21:28 +02:00
Vasily Gorbik	ac1256f826	s390/kasan: reipl and kexec support Some functions from both arch/s390/kernel/ipl.c and arch/s390/kernel/machine_kexec.c are called without DAT enabled (or with and without DAT enabled code paths). There is no easy way to partially disable kasan for those files without a substantial rework. Disable kasan for both files for now. To avoid disabling kasan for arch/s390/kernel/diag.c DAT flag is enabled in diag308 call. pcpu_delegate which disables DAT is marked with __no_sanitize_address to disable instrumentation for that one function. Signed-off-by: Vasily Gorbik <gor@linux.ibm.com> Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>	2018-10-09 11:21:27 +02:00
Vasily Gorbik	9e8df6daed	s390/smp: kasan stack instrumentation support smp_start_secondary function is called without DAT enabled. To avoid disabling kasan instrumentation for entire arch/s390/kernel/smp.c smp_start_secondary has been split in 2 parts. smp_start_secondary has instrumentation disabled, it does minimal setup and enables DAT. Then instrumentated __smp_start_secondary is called to do the rest. __load_psw_mask function instrumentation has been disabled as well to be able to call it from smp_start_secondary. Reviewed-by: Martin Schwidefsky <schwidefsky@de.ibm.com> Signed-off-by: Vasily Gorbik <gor@linux.ibm.com> Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>	2018-10-09 11:21:26 +02:00
Vasily Gorbik	d58106c3ec	s390/kasan: use noexec and large pages To lower memory footprint and speed up kasan initialisation detect EDAT availability and use large pages if possible. As we know how much memory is needed for initialisation, another simplistic large page allocator is introduced to avoid memory fragmentation. Since facilities list is retrieved anyhow, detect noexec support and adjust pages attributes. Handle noexec kernel option to avoid inconsistent kasan shadow memory pages flags. Signed-off-by: Vasily Gorbik <gor@linux.ibm.com> Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>	2018-10-09 11:21:24 +02:00
Vasily Gorbik	7fef92ccad	s390/kasan: double the stack size Kasan stack instrumentation pads stack variables with redzones, which increases stack frames size significantly. Stack sizes are increased from 16k to 32k in the code, as well as for the kernel stack overflow detection option (CHECK_STACK). Reviewed-by: Martin Schwidefsky <schwidefsky@de.ibm.com> Signed-off-by: Vasily Gorbik <gor@linux.ibm.com> Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>	2018-10-09 11:21:21 +02:00
Vasily Gorbik	42db5ed860	s390/kasan: add initialization code and enable it Kasan needs 1/8 of kernel virtual address space to be reserved as the shadow area. And eventually it requires the shadow memory offset to be known at compile time (passed to the compiler when full instrumentation is enabled). Any value picked as the shadow area offset for 3-level paging would eat up identity mapping on 4-level paging (with 1PB shadow area size). So, the kernel sticks to 3-level paging when kasan is enabled. 3TB border is picked as the shadow offset. The memory layout is adjusted so, that physical memory border does not exceed KASAN_SHADOW_START and vmemmap does not go below KASAN_SHADOW_END. Due to the fact that on s390 paging is set up very late and to cover more code with kasan instrumentation, temporary identity mapping and final shadow memory are set up early. The shadow memory mapping is later carried over to init_mm.pgd during paging_init. For the needs of paging structures allocation and shadow memory population a primitive allocator is used, which simply chops off memory blocks from the end of the physical memory. Kasan currenty doesn't track vmemmap and vmalloc areas. Current memory layout (for 3-level paging, 2GB physical memory). ---[ Identity Mapping ]--- 0x0000000000000000-0x0000000000100000 ---[ Kernel Image Start ]--- 0x0000000000100000-0x0000000002b00000 ---[ Kernel Image End ]--- 0x0000000002b00000-0x0000000080000000 2G <- physical memory border 0x0000000080000000-0x0000030000000000 3070G PUD I ---[ Kasan Shadow Start ]--- 0x0000030000000000-0x0000030010000000 256M PMD RW X <- shadow for 2G memory 0x0000030010000000-0x0000037ff0000000 523776M PTE RO NX <- kasan zero ro page 0x0000037ff0000000-0x0000038000000000 256M PMD RW X <- shadow for 2G modules ---[ Kasan Shadow End ]--- 0x0000038000000000-0x000003d100000000 324G PUD I ---[ vmemmap Area ]--- 0x000003d100000000-0x000003e080000000 ---[ vmalloc Area ]--- 0x000003e080000000-0x000003ff80000000 ---[ Modules Area ]--- 0x000003ff80000000-0x0000040000000000 2G Acked-by: Martin Schwidefsky <schwidefsky@de.ibm.com> Signed-off-by: Vasily Gorbik <gor@linux.ibm.com> Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>	2018-10-09 11:21:20 +02:00
Vasily Gorbik	d0e2eb0a36	s390: add pgd_page primitive Add pgd_page primitive which is required by kasan common code. Also fixes typo in p4d_page definition. Reviewed-by: Martin Schwidefsky <schwidefsky@de.ibm.com> Signed-off-by: Vasily Gorbik <gor@linux.ibm.com> Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>	2018-10-09 11:21:19 +02:00
Vasily Gorbik	34377d3cfb	s390: introduce MAX_PTRS_PER_P4D Kasan common code requires MAX_PTRS_PER_P4D definition, which in case of s390 is always PTRS_PER_P4D. Reviewed-by: Martin Schwidefsky <schwidefsky@de.ibm.com> Signed-off-by: Vasily Gorbik <gor@linux.ibm.com> Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>	2018-10-09 11:21:18 +02:00
Vasily Gorbik	fb594ec13e	s390/kasan: replace some memory functions Follow the common kasan approach: "KASan replaces memory functions with manually instrumented variants. Original functions declared as weak symbols so strong definitions in mm/kasan/kasan.c could replace them. Original functions have aliases with '__' prefix in name, so we could call non-instrumented variant if needed." Signed-off-by: Vasily Gorbik <gor@linux.ibm.com> Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>	2018-10-09 11:21:18 +02:00
Vasily Gorbik	75f195420a	s390/mm: add missing pfn_to_kaddr helper kasan common code uses pfn_to_kaddr, which is defined by many other architectures. Adding it as well to avoid a build error. Reviewed-by: Martin Schwidefsky <schwidefsky@de.ibm.com> Signed-off-by: Vasily Gorbik <gor@linux.ibm.com> Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>	2018-10-09 11:21:15 +02:00
Vasily Gorbik	49698745e5	s390: move ipl block and cmd line handling to early boot phase To distinguish zfcpdump case and to be able to parse some of the kernel command line arguments early (e.g. mem=) ipl block retrieval and command line construction code is moved to the early boot phase. "memory_end" is set up correctly respecting "mem=" and hsa_size in case of the zfcpdump. arch/s390/boot/string.c is introduced to provide string handling and command line parsing functions to early boot phase code for the compressed kernel image case. Signed-off-by: Vasily Gorbik <gor@linux.ibm.com> Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>	2018-10-09 11:21:14 +02:00
Vasily Gorbik	b09decfd99	s390/sclp: introduce sclp_early_get_hsa_size Introduce sclp_early_get_hsa_size function to be used during early memory detection. This function allows to find a memory limit imposed during zfcpdump. Reviewed-by: Heiko Carstens <heiko.carstens@de.ibm.com> Reviewed-by: Martin Schwidefsky <schwidefsky@de.ibm.com> Signed-off-by: Vasily Gorbik <gor@linux.ibm.com> Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>	2018-10-09 11:21:13 +02:00
Vasily Gorbik	54c57795e8	s390/mem_detect: replace tprot loop with binary search In a situation when other memory detection methods are not available (no SCLP and no z/VM diag260), continuous online memory is assumed. Replacing tprot loop with faster binary search, as only online memory end has to be found. Reviewed-by: Heiko Carstens <heiko.carstens@de.ibm.com> Reviewed-by: Martin Schwidefsky <schwidefsky@de.ibm.com> Signed-off-by: Vasily Gorbik <gor@linux.ibm.com> Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>	2018-10-09 11:21:12 +02:00
Vasily Gorbik	cd45c99561	s390/mem_detect: use SCLP info for continuous memory detection When neither SCLP storage info, nor z/VM diag260 "storage configuration" are available assume a continuous online memory of size specified by SCLP info. Reviewed-by: Heiko Carstens <heiko.carstens@de.ibm.com> Reviewed-by: Martin Schwidefsky <schwidefsky@de.ibm.com> Signed-off-by: Vasily Gorbik <gor@linux.ibm.com> Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>	2018-10-09 11:21:11 +02:00
Vasily Gorbik	6e98e64329	s390/mem_detect: introduce z/VM specific diag260 call In the case when z/VM memory is defined with "define storage config" command, SCLP storage info is not available. Utilize diag260 "storage configuration" call, to get information about z/VM specific guest memory definitions with potential memory holes. Reviewed-by: Martin Schwidefsky <schwidefsky@de.ibm.com> Signed-off-by: Vasily Gorbik <gor@linux.ibm.com> Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>	2018-10-09 11:21:10 +02:00
Vasily Gorbik	fddbaa5c42	s390/mem_detect: introduce SCLP storage info SCLP storage info allows to detect continuous and non-continuous online memory under LPAR, z/VM and KVM, when standby memory is defined. Reviewed-by: Heiko Carstens <heiko.carstens@de.ibm.com> Reviewed-by: Martin Schwidefsky <schwidefsky@de.ibm.com> Signed-off-by: Vasily Gorbik <gor@linux.ibm.com> Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>	2018-10-09 11:21:09 +02:00
Vasily Gorbik	6966d604e2	s390/mem_detect: move tprot loop to early boot phase Move memory detection to early boot phase. To store online memory regions "struct mem_detect_info" has been introduced together with for_each_mem_detect_block iterator. mem_detect_info is later converted to memblock. Also introduces sclp_early_get_meminfo function to get maximum physical memory and maximum increment number. Reviewed-by: Heiko Carstens <heiko.carstens@de.ibm.com> Reviewed-by: Martin Schwidefsky <schwidefsky@de.ibm.com> Signed-off-by: Vasily Gorbik <gor@linux.ibm.com> Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>	2018-10-09 11:21:08 +02:00
Vasily Gorbik	17aacfbfa1	s390/sclp: move sclp_early_read_info to sclp_early_core.c To enable early online memory detection sclp_early_read_info has been moved to sclp_early_core.c. sclp_info_sccb has been made a part of .boot.data, which allows to reuse it later during early kernel startup and make sclp_early_read_info call just once. Reviewed-by: Heiko Carstens <heiko.carstens@de.ibm.com> Reviewed-by: Martin Schwidefsky <schwidefsky@de.ibm.com> Signed-off-by: Vasily Gorbik <gor@linux.ibm.com> Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>	2018-10-09 11:21:07 +02:00
Vasily Gorbik	d1b52a4388	s390: introduce .boot.data section Introduce .boot.data section which is "shared" between the decompressor code and the decompressed kernel. The decompressor will store values in it, and copy over to the decompressed image before starting it. This method allows to avoid using pre-defined addresses and other hacks to pass values between those boot phases. .boot.data section is a part of init data, and will be freed after kernel initialization is complete. For uncompressed kernel image, .boot.data section is basically the same as .init.data Reviewed-by: Martin Schwidefsky <schwidefsky@de.ibm.com> Signed-off-by: Vasily Gorbik <gor@linux.ibm.com> Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>	2018-10-09 11:21:06 +02:00
Vasily Gorbik	32ce55a659	s390: unify stack size definitions Remove STACK_ORDER and STACK_SIZE in favour of identical THREAD_SIZE_ORDER and THREAD_SIZE definitions. THREAD_SIZE and THREAD_SIZE_ORDER naming is misleading since it is used as general kernel stack size information. But both those definitions are used in the common code and throughout architectures specific code, so changing the naming is problematic. Reviewed-by: Martin Schwidefsky <schwidefsky@de.ibm.com> Signed-off-by: Vasily Gorbik <gor@linux.ibm.com> Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>	2018-10-09 11:20:58 +02:00
Martin Schwidefsky	ce3dc44749	s390: add support for virtually mapped kernel stacks With virtually mapped kernel stacks the kernel stack overflow detection is now fault based, every stack has a guard page in the vmalloc space. The panic_stack is renamed to nodat_stack and is used for all function that need to run without DAT, e.g. memcpy_real or do_start_kdump. The main effect is a reduction in the kernel image size as with vmap stacks the old style overflow checking that adds two instructions per function is not needed anymore. Result from bloat-o-meter: add/remove: 20/1 grow/shrink: 13/26854 up/down: 2198/-216240 (-214042) In regard to performance the micro-benchmark for fork has a hit of a few microseconds, allocating 4 pages in vmalloc space is more expensive compare to an order-2 page allocation. But with real workload I could not find a noticeable difference. Acked-by: Heiko Carstens <heiko.carstens@de.ibm.com> Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>	2018-10-09 11:20:57 +02:00
Martin Schwidefsky	ff340d2472	s390: add stack switch helper Reviewed-by: Heiko Carstens <heiko.carstens@de.ibm.com> Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>	2018-10-09 11:20:56 +02:00
Martin Schwidefsky	f689789a28	s390/appldata: pass parameter list pointer to appldata_asm In preparation for CONFIG_VMAP_STACK=y move the allocation of the struct appldata_parameter_list to the caller of appldata_asm(). Reviewed-by: Gerald Schaefer <gerald.schaefer@de.ibm.com> Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>	2018-10-09 11:20:50 +02:00
Christian Borntraeger	ed3054a302	Merge branch 'apv11' of git://git.kernel.org/pub/scm/linux/kernel/git/kvms390/linux into kernelorgnext	2018-10-08 12:14:54 +02:00
Julian Wiedmann	346e485d42	s390/ccwgroup: add get_ccwgroupdev_by_busid() Provide function to find a ccwgroup device by its busid. Acked-by: Sebastian Ott <sebott@linux.ibm.com> Signed-off-by: Julian Wiedmann <jwi@linux.ibm.com> Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>	2018-10-08 09:09:59 +02:00
Harald Freudenberger	00fab2350e	s390/zcrypt: multiple zcrypt device nodes support This patch is an extension to the zcrypt device driver to provide, support and maintain multiple zcrypt device nodes. The individual zcrypt device nodes can be restricted in terms of crypto cards, domains and available ioctls. Such a device node can be used as a base for container solutions like docker to control and restrict the access to crypto resources. The handling is done with a new sysfs subdir /sys/class/zcrypt. Echoing a name (or an empty sting) into the attribute "create" creates a new zcrypt device node. In /sys/class/zcrypt a new link will appear which points to the sysfs device tree of this new device. The attribute files "ioctlmask", "apmask" and "aqmask" in this directory are used to customize this new zcrypt device node instance. Finally the zcrypt device node can be destroyed by echoing the name into /sys/class/zcrypt/destroy. The internal structs holding the device info are reference counted - so a destroy will not hard remove a device but only marks it as removable when the reference counter drops to zero. The mask values are bitmaps in big endian order starting with bit 0. So adapter number 0 is the leftmost bit, mask is 0x8000... The sysfs attributes accept 2 different formats: * Absolute hex string starting with 0x like "0x12345678" does set the mask starting from left to right. If the given string is shorter than the mask it is padded with 0s on the right. If the string is longer than the mask an error comes back (EINVAL). * Relative format - a concatenation (done with ',') of the terms +<bitnr>[-<bitnr>] or -<bitnr>[-<bitnr>]. <bitnr> may be any valid number (hex, decimal or octal) in the range 0...255. Here are some examples: "+0-15,+32,-128,-0xFF" "-0-255,+1-16,+0x128" "+1,+2,+3,+4,-5,-7-10" A simple usage examples: # create new zcrypt device 'my_zcrypt': echo "my_zcrypt" >/sys/class/zcrypt/create # go into the device dir of this new device echo "my_zcrypt" >create cd my_zcrypt/ ls -l total 0 -rw-r--r-- 1 root root 4096 Jul 20 15:23 apmask -rw-r--r-- 1 root root 4096 Jul 20 15:23 aqmask -r--r--r-- 1 root root 4096 Jul 20 15:23 dev -rw-r--r-- 1 root root 4096 Jul 20 15:23 ioctlmask lrwxrwxrwx 1 root root 0 Jul 20 15:23 subsystem -> ../../../../class/zcrypt ... # customize this zcrypt node clone # enable only adapter 0 and 2 echo "0xa0" >apmask # enable only domain 6 echo "+6" >aqmask # enable all 256 ioctls echo "+0-255" >ioctls # now the /dev/my_zcrypt may be used # finally destroy it echo "my_zcrypt" >/sys/class/zcrypt/destroy Please note that a very similar 'filtering behavior' also applies to the parent z90crypt device. The two mask attributes apmask and aqmask in /sys/bus/ap act the very same for the z90crypt device node. However the implementation here is totally different as the ap bus acts on bind/unbind of queue devices and associated drivers but the effect is still the same. So there are two filters active for each additional zcrypt device node: The adapter/domain needs to be enabled on the ap bus level and it needs to be active on the zcrypt device node level. Signed-off-by: Harald Freudenberger <freude@linux.ibm.com> Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>	2018-10-08 09:09:58 +02:00
Pierre Morel	0e237e4469	KVM: s390: Tracing APCB changes kvm_arch_crypto_set_masks is a new function to centralize the setup the APCB masks inside the CRYCB SIE satellite. To trace APCB mask changes, we add KVM_EVENT() tracing to both kvm_arch_crypto_set_masks and kvm_arch_crypto_clear_masks. Signed-off-by: Pierre Morel <pmorel@linux.ibm.com> Message-Id: <1538728270-10340-2-git-send-email-pmorel@linux.ibm.com> Reviewed-by: Cornelia Huck <cohuck@redhat.com> Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>	2018-10-05 13:10:18 +02:00
Eric W. Biederman	f283801851	signal: Remove the need for __ARCH_SI_PREABLE_SIZE and SI_PAD_SIZE Rework the defintion of struct siginfo so that the array padding struct siginfo to SI_MAX_SIZE can be placed in a union along side of the rest of the struct siginfo members. The result is that we no longer need the __ARCH_SI_PREAMBLE_SIZE or SI_PAD_SIZE definitions. Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com>	2018-10-03 16:46:43 +02:00
Christian Borntraeger	55d09dd4c8	Merge branch 'apv11' of git://git.kernel.org/pub/scm/linux/kernel/git/kvms390/linux into kernelorgnext	2018-10-01 08:53:23 +02:00
Collin Walling	67d49d52ae	KVM: s390: set host program identifier A host program identifier (HPID) provides information regarding the underlying host environment. A level-2 (VM) guest will have an HPID denoting Linux/KVM, which is set during VCPU setup. A level-3 (VM on a VM) and beyond guest will have an HPID denoting KVM vSIE, which is set for all shadow control blocks, overriding the original value of the HPID. Signed-off-by: Collin Walling <walling@linux.ibm.com> Reviewed-by: Janosch Frank <frankja@linux.ibm.com> Message-Id: <1535734279-10204-4-git-send-email-walling@linux.ibm.com> Reviewed-by: Christian Borntraeger <borntraeger@de.ibm.com> Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>	2018-10-01 08:51:42 +02:00
Tony Krowiak	37940fb0b6	KVM: s390: device attrs to enable/disable AP interpretation Introduces two new VM crypto device attributes (KVM_S390_VM_CRYPTO) to enable or disable AP instruction interpretation from userspace via the KVM_SET_DEVICE_ATTR ioctl: * The KVM_S390_VM_CRYPTO_ENABLE_APIE attribute enables hardware interpretation of AP instructions executed on the guest. * The KVM_S390_VM_CRYPTO_DISABLE_APIE attribute disables hardware interpretation of AP instructions executed on the guest. In this case the instructions will be intercepted and pass through to the guest. Signed-off-by: Tony Krowiak <akrowiak@linux.ibm.com> Reviewed-by: Cornelia Huck <cohuck@redhat.com> Message-Id: <20180925231641.4954-25-akrowiak@linux.vnet.ibm.com> Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>	2018-09-28 15:50:11 +02:00
Tony Krowiak	258287c994	s390: vfio-ap: implement mediated device open callback Implements the open callback on the mediated matrix device. The function registers a group notifier to receive notification of the VFIO_GROUP_NOTIFY_SET_KVM event. When notified, the vfio_ap device driver will get access to the guest's kvm structure. The open callback must ensure that only one mediated device shall be opened per guest. Signed-off-by: Tony Krowiak <akrowiak@linux.ibm.com> Acked-by: Halil Pasic <pasic@linux.ibm.com> Tested-by: Michael Mueller <mimu@linux.ibm.com> Tested-by: Farhan Ali <alifm@linux.ibm.com> Tested-by: Pierre Morel <pmorel@linux.ibm.com> Acked-by: Pierre Morel <pmorel@linux.ibm.com> Reviewed-by: Cornelia Huck <cohuck@redhat.com> Message-Id: <20180925231641.4954-12-akrowiak@linux.vnet.ibm.com> Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>	2018-09-28 15:50:09 +02:00
Heiko Carstens	13ddb52c16	s390/jump_label: Switch to relative references Enable support for relative references in jump_label entries. Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com> Signed-off-by: Ard Biesheuvel <ard.biesheuvel@linaro.org> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org> Cc: linux-arm-kernel@lists.infradead.org Cc: linux-s390@vger.kernel.org Cc: Arnd Bergmann <arnd@arndb.de> Cc: Kees Cook <keescook@chromium.org> Cc: Will Deacon <will.deacon@arm.com> Cc: Catalin Marinas <catalin.marinas@arm.com> Cc: Steven Rostedt <rostedt@goodmis.org> Cc: Martin Schwidefsky <schwidefsky@de.ibm.com> Cc: Jessica Yu <jeyu@kernel.org> Link: https://lkml.kernel.org/r/20180919065144.25010-10-ard.biesheuvel@linaro.org	2018-09-27 17:56:49 +02:00
Tony Krowiak	42104598ef	KVM: s390: interface to clear CRYCB masks Introduces a new KVM function to clear the APCB0 and APCB1 in the guest's CRYCB. This effectively clears all bits of the APM, AQM and ADM masks configured for the guest. The VCPUs are taken out of SIE to ensure the VCPUs do not get out of sync. Signed-off-by: Tony Krowiak <akrowiak@linux.ibm.com> Acked-by: Halil Pasic <pasic@linux.ibm.com> Tested-by: Michael Mueller <mimu@linux.ibm.com> Tested-by: Farhan Ali <alifm@linux.ibm.com> Tested-by: Pierre Morel <pmorel@linux.ibm.com> Reviewed-by: Cornelia Huck <cohuck@redhat.com> Message-Id: <20180925231641.4954-11-akrowiak@linux.vnet.ibm.com> Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>	2018-09-26 21:02:59 +02:00
Tony Krowiak	e585b24aeb	KVM: s390: refactor crypto initialization This patch refactors the code that initializes and sets up the crypto configuration for a guest. The following changes are implemented via this patch: 1. Introduces a flag indicating AP instructions executed on the guest shall be interpreted by the firmware. This flag is used to set a bit in the guest's state description indicating AP instructions are to be interpreted. 2. Replace code implementing AP interfaces with code supplied by the AP bus to query the AP configuration. Signed-off-by: Tony Krowiak <akrowiak@linux.ibm.com> Reviewed-by: Halil Pasic <pasic@linux.ibm.com> Acked-by: Christian Borntraeger <borntraeger@de.ibm.com> Acked-by: Janosch Frank <frankja@linux.ibm.com> Reviewed-by: Cornelia Huck <cohuck@redhat.com> Tested-by: Michael Mueller <mimu@linux.ibm.com> Tested-by: Farhan Ali <alifm@linux.ibm.com> Message-Id: <20180925231641.4954-4-akrowiak@linux.vnet.ibm.com> Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>	2018-09-26 20:45:20 +02:00
David Hildenbrand	3194cdb711	KVM: s390: introduce and use KVM_REQ_VSIE_RESTART When we change the crycb (or execution controls), we also have to make sure that the vSIE shadow datastructures properly consider the changed values before rerunning the vSIE. We can achieve that by simply using a VCPU request now. This has to be a synchronous request (== handled before entering the (v)SIE again). The request will make sure that the vSIE handler is left, and that the request will be processed (NOP), therefore forcing a reload of all vSIE data (including rebuilding the crycb) when re-entering the vSIE interception handler the next time. Signed-off-by: David Hildenbrand <david@redhat.com> Signed-off-by: Tony Krowiak <akrowiak@linux.ibm.com> Reviewed-by: Pierre Morel <pmorel@linux.ibm.com> Reviewed-by: Cornelia Huck <cohuck@redhat.com> Reviewed-by: Janosch Frank <frankja@linux.ibm.com> Reviewed-by: Christian Borntraeger <borntraeger@de.ibm.com> Message-Id: <20180925231641.4954-3-akrowiak@linux.vnet.ibm.com> Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>	2018-09-26 09:13:20 +02:00
Julian Wiedmann	ccc413f621	s390/qdio: clean up AOB handling I've stumbled over this too many times now... AOBs are only ever used on Output Queues. So in qdio_kick_handler(), move the call to their handler into the Output-only path, and get rid of the convoluted contains_aobs() helper. No functional change. While at it, also remove 1. the unused sbal_state->aob field. For processing an async completion, upper-layer drivers get their AOB pointer from the CQ buffer. 2. an unused EXPORT for qdio_allocate_aob(). External users would have no way of passing an allocated AOB back into qdio.ko anyways... Signed-off-by: Julian Wiedmann <jwi@linux.ibm.com> Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>	2018-09-20 13:20:29 +02:00
Vasily Gorbik	d1befa6582	s390/vdso: avoid 64-bit vdso mapping for compat tasks vdso_fault used is_compat_task function (on s390 it tests "current" thread_info flags) to distinguish compat tasks and map 31-bit vdso pages. But "current" task might not correspond to mm context. When 31-bit compat inferior is executed under gdb, gdb does PTRACE_PEEKTEXT on vdso page, causing vdso_fault with "current" being 64-bit gdb process. So, 31-bit inferior ends up with 64-bit vdso mapped. To avoid this problem a new compat_mm flag has been introduced into mm context. This flag is used in vdso_fault and vdso_mremap instead of is_compat_task. Signed-off-by: Vasily Gorbik <gor@linux.ibm.com> Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>	2018-09-20 13:20:29 +02:00
Jan Höppner	6779df406b	s390/sclp: Allow to request adapter reset The SCLP event 24 "Adapter Error Notification" supports three different action qualifier of which 'adapter reset' is currently not enabled in the sysfs interface. However, userspace tools might want to be able to use the reset functionality as well. Enable the 'adapter reset' qualifier. Signed-off-by: Jan Höppner <hoeppner@linux.ibm.com> Reviewed-by: Sebastian Ott <sebott@linux.ibm.com> Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>	2018-09-20 13:20:27 +02:00
Gerald Schaefer	55a5542a54	s390/hibernate: fix error handling when suspend cpu != resume cpu The resume code checks if the resume cpu is the same as the suspend cpu. If not, and if it is also not possible to switch to the suspend cpu, an error message should be printed and the resume process should be stopped by loading a disabled wait psw. The current logic is broken in multiple ways, the message is never printed, and the disabled wait psw never loaded because the kernel panics before that: - sam31 and SIGP_SET_ARCHITECTURE to ESA mode is wrong, this will break on the first 64bit instruction in sclp_early_printk(). - The init stack should be used, but the stack pointer is not set up correctly (missing aghi %r15,-STACK_FRAME_OVERHEAD). - __sclp_early_printk() checks the sclp_init_state. If it is not sclp_init_state_uninitialized, it simply returns w/o printing anything. In the resumed kernel however, sclp_init_state will never be uninitialized. This patch fixes those issues by removing the sam31/ESA logic, adding a correct init stack pointer, and also introducing sclp_early_printk_force() to allow using sclp_early_printk() even when sclp_init_state is not uninitialized. Reviewed-by: Heiko Carstens <heiko.carstens@de.ibm.com> Signed-off-by: Gerald Schaefer <gerald.schaefer@de.ibm.com> Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>	2018-09-20 13:20:23 +02:00
Janosch Frank	df88f3181f	KVM: s390: Properly lock mm context allow_gmap_hpage_1m setting We have to do down_write on the mm semaphore to set a bitfield in the mm context. Signed-off-by: Janosch Frank <frankja@linux.ibm.com> Fixes: `a4499382` ("KVM: s390: Add huge page enablement control") Reviewed-by: Christian Borntraeger <borntraeger@de.ibm.com> Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>	2018-09-04 11:40:26 +02:00
Arnd Bergmann	4faea239e5	y2038: utimes: Rework #ifdef guards for compat syscalls After changing over to 64-bit time_t syscalls, many architectures will want compat_sys_utimensat() but not respective handlers for utime(), utimes() and futimesat(). This adds a new __ARCH_WANT_SYS_UTIME32 to complement __ARCH_WANT_SYS_UTIME. For now, all 64-bit architectures that support CONFIG_COMPAT set it, but future 64-bit architectures will not (tile would not have needed it either, but got removed). As older 32-bit architectures get converted to using CONFIG_64BIT_TIME, they will have to use __ARCH_WANT_SYS_UTIME32 instead of __ARCH_WANT_SYS_UTIME. Architectures using the generic syscall ABI don't need either of them as they never had a utime syscall. Since the compat_utimbuf structure is now required outside of CONFIG_COMPAT, I'm moving it into compat_time.h. Signed-off-by: Arnd Bergmann <arnd@arndb.de> --- changed from last version: - renamed __ARCH_WANT_COMPAT_SYS_UTIME to __ARCH_WANT_SYS_UTIME32	2018-08-29 15:42:23 +02:00
Arnd Bergmann	caf6f9c8a3	asm-generic: Remove unneeded __ARCH_WANT_SYS_LLSEEK macro The sys_llseek sytem call is needed on all 32-bit architectures and none of the 64-bit ones, so we can remove the __ARCH_WANT_SYS_LLSEEK guard and simplify the include/asm-generic/unistd.h header further. Since 32-bit tasks can run either natively or in compat mode on 64-bit architectures, we have to check for both !CONFIG_64BIT and CONFIG_COMPAT. There are a few 64-bit architectures that also reference sys_llseek in their 64-bit ABI (e.g. sparc), but I verified that those all select CONFIG_COMPAT, so the #if check is still correct here. It's a bit odd to include it in the syscall table though, as it's the same as sys_lseek() on 64-bit, but with strange calling conventions. Acked-by: Geert Uytterhoeven <geert@linux-m68k.org> Reviewed-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Arnd Bergmann <arnd@arndb.de>	2018-08-29 15:42:21 +02:00
Arnd Bergmann	fb37397594	asm-generic: Move common compat types to asm-generic/compat.h While converting compat system call handlers to work on 32-bit architectures, I found a number of types used in those handlers that are identical between all architectures. Let's move all the identical ones into asm-generic/compat.h to avoid having to add even more identical definitions of those types. For unknown reasons, mips defines __compat_gid32_t, __compat_uid32_t and compat_caddr_t as signed, while all others have them unsigned. This seems to be a mistake, but I'm leaving it alone here. The other types all differ by size or alignment on at least on architecture. compat_aio_context_t is currently defined in linux/compat.h but also needed for compat_sys_io_getevents(), so let's move it into the same place. While we still have not decided whether the 32-bit time handling will always use the compat syscalls, or in which form, I think this is a useful cleanup that we can merge regardless. Reviewed-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Arnd Bergmann <arnd@arndb.de>	2018-08-29 15:42:20 +02:00
Arnd Bergmann	82b355d161	y2038: Remove newstat family from default syscall set We have four generations of stat() syscalls: - the oldstat syscalls that are only used on the older architectures - the newstat family that is used on all 64-bit architectures but lacked support for large files on 32-bit architectures. - the stat64 family that is used mostly on 32-bit architectures to replace newstat - statx() to replace all of the above, adding 64-bit timestamps among other things. We already compile stat64 only on those architectures that need it, but newstat is always built, including on those that don't reference it. This adds a new __ARCH_WANT_NEW_STAT symbol along the lines of __ARCH_WANT_OLD_STAT and __ARCH_WANT_STAT64 to control compilation of newstat. All architectures that need it use an explict define, the others now get a little bit smaller, and future architecture (including 64-bit targets) won't ever see it. Acked-by: Geert Uytterhoeven <geert@linux-m68k.org> Signed-off-by: Arnd Bergmann <arnd@arndb.de>	2018-08-29 15:42:20 +02:00
Linus Torvalds	e1dbc5a410	Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/s390/linux Pull s390 updates from Martin Schwidefsky: - A couple of patches for the zcrypt driver: + Add two masks to determine which AP cards and queues are host devices, this will be useful for KVM AP device passthrough + Add-on patch to improve the parsing of the new apmask and aqmask + Some code beautification - Second try to reenable the GCC plugins, the first patch set had a patch to do this but the merge somehow missed this - Remove the s390 specific GCC version check and use the generic one - Three patches for kdump, two bug fixes and one cleanup - Three patches for the PCI layer, one bug fix and two cleanups * 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/s390/linux: s390: remove gcc version check (4.3 or newer) s390/zcrypt: hex string mask improvements for apmask and aqmask. s390/zcrypt: AP bus support for alternate driver(s) s390/zcrypt: code beautify s390/zcrypt: switch return type to bool for ap_instructions_available() s390/kdump: Remove kzalloc_panic s390/kdump: Fix memleak in nt_vmcoreinfo s390/kdump: Make elfcorehdr size calculation ABI compliant s390/pci: remove fmb address from debug output s390/pci: remove stale rc s390/pci: fix out of bounds access during irq setup s390/zcrypt: fix ap_instructions_available() returncodes s390: reenable gcc plugins for real	2018-08-24 09:31:34 -07:00
Linus Torvalds	7140ad3898	Updates for v4.19: - Restructure of lockdep and latency tracers This is the biggest change. Joel Fernandes restructured the hooks from irqs and preemption disabling and enabling. He got rid of a lot of the preprocessor #ifdef mess that they caused. He turned both lockdep and the latency tracers to use trace events inserted in the preempt/irqs disabling paths. But unfortunately, these started to cause issues in corner cases. Thus, parts of the code was reverted back to where lockde and the latency tracers just get called directly (without using the trace events). But because the original change cleaned up the code very nicely we kept that, as well as the trace events for preempt and irqs disabling, but they are limited to not being called in NMIs. - Have trace events use SRCU for "rcu idle" calls. This was required for the preempt/irqs off trace events. But it also had to not allow them to be called in NMI context. Waiting till Paul makes an NMI safe SRCU API. - New notrace SRCU API to allow trace events to use SRCU. - Addition of mcount-nop option support - SPDX headers replacing GPL templates. - Various other fixes and clean ups. - Some fixes are marked for stable, but were not fully tested before the merge window opened. -----BEGIN PGP SIGNATURE----- iIoEABYIADIWIQRRSw7ePDh/lE+zeZMp5XQQmuv6qgUCW3ruhRQccm9zdGVkdEBn b29kbWlzLm9yZwAKCRAp5XQQmuv6qiM7AP47NhYdSnCFCRUJfrt6PovXmQtuCHt3 c3QMoGGdvzh9YAEAqcSXwh7uLhpHUp1LjMAPkXdZVwNddf4zJQ1zyxQ+EAU= =vgEr -----END PGP SIGNATURE----- Merge tag 'trace-v4.19' of git://git.kernel.org/pub/scm/linux/kernel/git/rostedt/linux-trace Pull tracing updates from Steven Rostedt: - Restructure of lockdep and latency tracers This is the biggest change. Joel Fernandes restructured the hooks from irqs and preemption disabling and enabling. He got rid of a lot of the preprocessor #ifdef mess that they caused. He turned both lockdep and the latency tracers to use trace events inserted in the preempt/irqs disabling paths. But unfortunately, these started to cause issues in corner cases. Thus, parts of the code was reverted back to where lockdep and the latency tracers just get called directly (without using the trace events). But because the original change cleaned up the code very nicely we kept that, as well as the trace events for preempt and irqs disabling, but they are limited to not being called in NMIs. - Have trace events use SRCU for "rcu idle" calls. This was required for the preempt/irqs off trace events. But it also had to not allow them to be called in NMI context. Waiting till Paul makes an NMI safe SRCU API. - New notrace SRCU API to allow trace events to use SRCU. - Addition of mcount-nop option support - SPDX headers replacing GPL templates. - Various other fixes and clean ups. - Some fixes are marked for stable, but were not fully tested before the merge window opened. * tag 'trace-v4.19' of git://git.kernel.org/pub/scm/linux/kernel/git/rostedt/linux-trace: (44 commits) tracing: Fix SPDX format headers to use C++ style comments tracing: Add SPDX License format tags to tracing files tracing: Add SPDX License format to bpf_trace.c blktrace: Add SPDX License format header s390/ftrace: Add -mfentry and -mnop-mcount support tracing: Add -mcount-nop option support tracing: Avoid calling cc-option -mrecord-mcount for every Makefile tracing: Handle CC_FLAGS_FTRACE more accurately Uprobe: Additional argument arch_uprobe to uprobe_write_opcode() Uprobes: Simplify uprobe_register() body tracepoints: Free early tracepoints after RCU is initialized uprobes: Use synchronize_rcu() not synchronize_sched() tracing: Fix synchronizing to event changes with tracepoint_synchronize_unregister() ftrace: Remove unused pointer ftrace_swapper_pid tracing: More reverting of "tracing: Centralize preemptirq tracepoints and unify their usage" tracing/irqsoff: Handle preempt_count for different configs tracing: Partial revert of "tracing: Centralize preemptirq tracepoints and unify their usage" tracing: irqsoff: Account for additional preempt_disable trace: Use rcu_dereference_raw for hooks from trace-event subsystem tracing/kprobes: Fix within_notrace_func() to check only notrace functions ...	2018-08-20 18:32:00 -07:00
Harald Freudenberger	ac2b96f351	s390/zcrypt: code beautify Code beautify by following most of the checkpatch suggestions: - SPDX license identifier line complains by checkpatch - missing space or newline complains by checkpatch - octal numbers for permssions complains by checkpatch - renaming of static sysfs functions complains by checkpatch - fix of block comment complains by checkpatch - fix printf like calls where function name instead of %s __func__ was used - __packed instead of __attribute__((packed)) - init to zero for static variables removed - use of DEVICE_ATTR_RO and DEVICE_ATTR_RW macros No functional code changes or API changes! Signed-off-by: Harald Freudenberger <freude@linux.ibm.com> Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>	2018-08-20 16:02:11 +02:00
Harald Freudenberger	9b97e9f555	s390/zcrypt: switch return type to bool for ap_instructions_available() Function ap_instructions_available() had returntype int but in fact returned 1 for true and 0 for false. Changed returntype to bool. Signed-off-by: Harald Freudenberger <freude@linux.ibm.com> Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>	2018-08-20 16:02:10 +02:00
Linus Torvalds	e61cf2e3a5	Minor code cleanups for PPC. For x86 this brings in PCID emulation and CR3 caching for shadow page tables, nested VMX live migration, nested VMCS shadowing, an optimized IPI hypercall, and some optimizations. ARM will come next week. There is a semantic conflict because tip also added an .init_platform callback to kvm.c. Please keep the initializer from this branch, and add a call to kvmclock_init (added by tip) inside kvm_init_platform (added here). Also, there is a backmerge from 4.18-rc6. This is because of a refactoring that conflicted with a relatively late bugfix and resulted in a particularly hellish conflict. Because the conflict was only due to unfortunate timing of the bugfix, I backmerged and rebased the refactoring rather than force the resolution on you. -----BEGIN PGP SIGNATURE----- Version: GnuPG v2.0.22 (GNU/Linux) iQEcBAABAgAGBQJbdwNFAAoJEL/70l94x66DiPEH/1cAGZWGd85Y3yRu1dmTmqiz kZy0V+WTQ5kyJF4ZsZKKOp+xK7Qxh5e9kLdTo70uPZCHwLu9IaGKN9+dL9Jar3DR yLPX5bMsL8UUed9g9mlhdaNOquWi7d7BseCOnIyRTolb+cqnM5h3sle0gqXloVrS UQb4QogDz8+86czqR8tNfazjQRKW/D2HEGD5NDNVY1qtpY+leCDAn9/u6hUT5c6z EtufgyDh35UN+UQH0e2605gt3nN3nw3FiQJFwFF1bKeQ7k5ByWkuGQI68XtFVhs+ 2WfqL3ftERkKzUOy/WoSJX/C9owvhMcpAuHDGOIlFwguNGroZivOMVnACG1AI3I= =9Mgw -----END PGP SIGNATURE----- Merge tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm Pull first set of KVM updates from Paolo Bonzini: "PPC: - minor code cleanups x86: - PCID emulation and CR3 caching for shadow page tables - nested VMX live migration - nested VMCS shadowing - optimized IPI hypercall - some optimizations ARM will come next week" * tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm: (85 commits) kvm: x86: Set highest physical address bits in non-present/reserved SPTEs KVM/x86: Use CC_SET()/CC_OUT in arch/x86/kvm/vmx.c KVM: X86: Implement PV IPIs in linux guest KVM: X86: Add kvm hypervisor init time platform setup callback KVM: X86: Implement "send IPI" hypercall KVM/x86: Move X86_CR4_OSXSAVE check into kvm_valid_sregs() KVM: x86: Skip pae_root shadow allocation if tdp enabled KVM/MMU: Combine flushing remote tlb in mmu_set_spte() KVM: vmx: skip VMWRITE of HOST_{FS,GS}_BASE when possible KVM: vmx: skip VMWRITE of HOST_{FS,GS}_SEL when possible KVM: vmx: always initialize HOST_{FS,GS}_BASE to zero during setup KVM: vmx: move struct host_state usage to struct loaded_vmcs KVM: vmx: compute need to reload FS/GS/LDT on demand KVM: nVMX: remove a misleading comment regarding vmcs02 fields KVM: vmx: rename __vmx_load_host_state() and vmx_save_host_state() KVM: vmx: add dedicated utility to access guest's kernel_gs_base KVM: vmx: track host_state.loaded using a loaded_vmcs pointer KVM: vmx: refactor segmentation code in vmx_save_host_state() kvm: nVMX: Fix fault priority for VMX operations kvm: nVMX: Fix fault vector for VMX operation at CPL > 0 ...	2018-08-19 10:38:36 -07:00
Harald Freudenberger	2395103b3f	s390/zcrypt: fix ap_instructions_available() returncodes During review of KVM patches it was complained that the ap_instructions_available() function returns 0 if AP instructions are available and -ENODEV if not. The function acts like a boolean function to check for AP instructions available and thus should return 0 on failure and != 0 on success. Changed to the suggested behaviour and adapted the one and only caller of this function which is the ap bus core code. Signed-off-by: Harald Freudenberger <freude@linux.ibm.com> Acked-by: Cornelia Huck <cohuck@redhat.com> Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com>	2018-08-16 14:49:11 +02:00
Vasily Gorbik	d983c89cc9	s390/ftrace: Add -mfentry and -mnop-mcount support Utilize -mfentry and -mnop-mcount gcc options together with -mrecord-mcount to get compiler generated calls to the profiling functions as nops which are compatible with current -mhotpatch=0,3 approach. At the same time -mrecord-mcount enables __mcount_loc section generation by the compiler which allows to avoid using scripts/recordmcount.pl script. Link: http://lkml.kernel.org/r/patch-4.thread-aa7b8d.git-aa7b8dbf236f.your-ad-here.call-01533557518-ext-9465@work.hours Reviewed-by: Heiko Carstens <heiko.carstens@de.ibm.com> Signed-off-by: Vasily Gorbik <gor@linux.ibm.com> Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org>	2018-08-15 22:39:53 -04:00
Linus Torvalds	9a76aba02a	Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-next Pull networking updates from David Miller: "Highlights: - Gustavo A. R. Silva keeps working on the implicit switch fallthru changes. - Support 802.11ax High-Efficiency wireless in cfg80211 et al, From Luca Coelho. - Re-enable ASPM in r8169, from Kai-Heng Feng. - Add virtual XFRM interfaces, which avoids all of the limitations of existing IPSEC tunnels. From Steffen Klassert. - Convert GRO over to use a hash table, so that when we have many flows active we don't traverse a long list during accumluation. - Many new self tests for routing, TC, tunnels, etc. Too many contributors to mention them all, but I'm really happy to keep seeing this stuff. - Hardware timestamping support for dpaa_eth/fsl-fman from Yangbo Lu. - Lots of cleanups and fixes in L2TP code from Guillaume Nault. - Add IPSEC offload support to netdevsim, from Shannon Nelson. - Add support for slotting with non-uniform distribution to netem packet scheduler, from Yousuk Seung. - Add UDP GSO support to mlx5e, from Boris Pismenny. - Support offloading of Team LAG in NFP, from John Hurley. - Allow to configure TX queue selection based upon RX queue, from Amritha Nambiar. - Support ethtool ring size configuration in aquantia, from Anton Mikaev. - Support DSCP and flowlabel per-transport in SCTP, from Xin Long. - Support list based batching and stack traversal of SKBs, this is very exciting work. From Edward Cree. - Busyloop optimizations in vhost_net, from Toshiaki Makita. - Introduce the ETF qdisc, which allows time based transmissions. IGB can offload this in hardware. From Vinicius Costa Gomes. - Add parameter support to devlink, from Moshe Shemesh. - Several multiplication and division optimizations for BPF JIT in nfp driver, from Jiong Wang. - Lots of prepatory work to make more of the packet scheduler layer lockless, when possible, from Vlad Buslov. - Add ACK filter and NAT awareness to sch_cake packet scheduler, from Toke Høiland-Jørgensen. - Support regions and region snapshots in devlink, from Alex Vesker. - Allow to attach XDP programs to both HW and SW at the same time on a given device, with initial support in nfp. From Jakub Kicinski. - Add TLS RX offload and support in mlx5, from Ilya Lesokhin. - Use PHYLIB in r8169 driver, from Heiner Kallweit. - All sorts of changes to support Spectrum 2 in mlxsw driver, from Ido Schimmel. - PTP support in mv88e6xxx DSA driver, from Andrew Lunn. - Make TCP_USER_TIMEOUT socket option more accurate, from Jon Maxwell. - Support for templates in packet scheduler classifier, from Jiri Pirko. - IPV6 support in RDS, from Ka-Cheong Poon. - Native tproxy support in nf_tables, from Máté Eckl. - Maintain IP fragment queue in an rbtree, but optimize properly for in-order frags. From Peter Oskolkov. - Improvde handling of ACKs on hole repairs, from Yuchung Cheng" * git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-next: (1996 commits) bpf: test: fix spelling mistake "REUSEEPORT" -> "REUSEPORT" hv/netvsc: Fix NULL dereference at single queue mode fallback net: filter: mark expected switch fall-through xen-netfront: fix warn message as irq device name has '/' cxgb4: Add new T5 PCI device ids 0x50af and 0x50b0 net: dsa: mv88e6xxx: missing unlock on error path rds: fix building with IPV6=m inet/connection_sock: prefer _THIS_IP_ to current_text_addr net: dsa: mv88e6xxx: bitwise vs logical bug net: sock_diag: Fix spectre v1 gadget in __sock_diag_cmd() ieee802154: hwsim: using right kind of iteration net: hns3: Add vlan filter setting by ethtool command -K net: hns3: Set tx ring' tc info when netdev is up net: hns3: Remove tx ring BD len register in hns3_enet net: hns3: Fix desc num set to default when setting channel net: hns3: Fix for phy link issue when using marvell phy driver net: hns3: Fix for information of phydev lost problem when down/up net: hns3: Fix for command format parsing error in hclge_is_all_function_id_zero net: hns3: Add support for serdes loopback selftest bnxt_en: take coredump_record structure off stack ...	2018-08-15 15:04:25 -07:00
Linus Torvalds	85a0b791bc	Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/s390/linux Pull s390 updates from Heiko Carstens: "Since Martin is on vacation you get the s390 pull request from me: - Host large page support for KVM guests. As the patches have large impact on arch/s390/mm/ this series goes out via both the KVM and the s390 tree. - Add an option for no compression to the "Kernel compression mode" menu, this will come in handy with the rework of the early boot code. - A large rework of the early boot code that will make life easier for KASAN and KASLR. With the rework the bootable uncompressed image is not generated anymore, only the bzImage is available. For debuggung purposes the new "no compression" option is used. - Re-enable the gcc plugins as the issue with the latent entropy plugin is solved with the early boot code rework. - More spectre relates changes: + Detect the etoken facility and remove expolines automatically. + Add expolines to a few more indirect branches. - A rewrite of the common I/O layer trace points to make them consumable by 'perf stat'. - Add support for format-3 PCI function measurement blocks. - Changes for the zcrypt driver: + Add attributes to indicate the load of cards and queues. + Restructure some code for the upcoming AP device support in KVM. - Build flags improvements in various Makefiles. - A few fixes for the kdump support. - A couple of patches for gcc 8 compile warning cleanup. - Cleanup s390 specific proc handlers. - Add s390 support to the restartable sequence self tests. - Some PTR_RET vs PTR_ERR_OR_ZERO cleanup. - Lots of bug fixes" * 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/s390/linux: (107 commits) s390/dasd: fix hanging offline processing due to canceled worker s390/dasd: fix panic for failed online processing s390/mm: fix addressing exception after suspend/resume rseq/selftests: add s390 support s390: fix br_r1_trampoline for machines without exrl s390/lib: use expoline for all bcr instructions s390/numa: move initial setup of node_to_cpumask_map s390/kdump: Fix elfcorehdr size calculation s390/cpum_sf: save TOD clock base in SDBs for time conversion KVM: s390: Add huge page enablement control s390/mm: Add huge page gmap linking support s390/mm: hugetlb pages within a gmap can not be freed KVM: s390: Add skey emulation fault handling s390/mm: Add huge pmd storage key handling s390/mm: Clear skeys for newly mapped huge guest pmds s390/mm: Clear huge page storage keys on enable_skey s390/mm: Add huge page dirty sync support s390/mm: Add gmap pmd invalidation and clearing s390/mm: Add gmap pmd notification bit setting s390/mm: Add gmap pmd linking ...	2018-08-13 19:07:17 -07:00
Linus Torvalds	8603596a32	Merge branch 'perf-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull perf update from Thomas Gleixner: "The perf crowd presents: Kernel updates: - Removal of jprobes - Cleanup and consolidatation the handling of kprobes - Cleanup and consolidation of hardware breakpoints - The usual pile of fixes and updates to PMUs and event descriptors Tooling updates: - Updates and improvements all over the place. Nothing outstanding, just the (good) boring incremental grump work" * 'perf-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (103 commits) perf trace: Do not require --no-syscalls to suppress strace like output perf bpf: Include uapi/linux/bpf.h from the 'perf trace' script's bpf.h perf tools: Allow overriding MAX_NR_CPUS at compile time perf bpf: Show better message when failing to load an object perf list: Unify metric group description format with PMU event description perf vendor events arm64: Update ThunderX2 implementation defined pmu core events perf cs-etm: Generate branch sample for CS_ETM_TRACE_ON packet perf cs-etm: Generate branch sample when receiving a CS_ETM_TRACE_ON packet perf cs-etm: Support dummy address value for CS_ETM_TRACE_ON packet perf cs-etm: Fix start tracing packet handling perf build: Fix installation directory for eBPF perf c2c report: Fix crash for empty browser perf tests: Fix indexing when invoking subtests perf trace: Beautify the AF_INET & AF_INET6 'socket' syscall 'protocol' args perf trace beauty: Add beautifiers for 'socket''s 'protocol' arg perf trace beauty: Do not print NULL strarray entries perf beauty: Add a generator for IPPROTO_ socket's protocol constants tools include uapi: Grab a copy of linux/in.h perf tests: Fix complex event name parsing perf evlist: Fix error out while applying initial delay and LBR ...	2018-08-13 12:55:49 -07:00
Hendrik Brueckner	5223c67167	s390/cpum_sf: save TOD clock base in SDBs for time conversion Processing the samples in the AUX-area by perf requires the computation of respective time stamps. The time stamps used by perf are based on the monotonic clock. To convert the TOD clock value contained in an SDB to a monotonic clock value, the TOD clock base is required. Hence, also save the TOD clock base in the SDB. Suggested-by: Thomas Richter <tmricht@linux.ibm.com> Signed-off-by: Hendrik Brueckner <brueckner@linux.ibm.com> Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>	2018-07-31 11:02:27 +02:00
Martin Schwidefsky	03760d44b1	KVM: s390: initial host large page support - must be enabled via module parameter hpage=1 - cannot be used together with nested - does support migration - does support hugetlbfs - no THP yet -----BEGIN PGP SIGNATURE----- Version: GnuPG v2.0.22 (GNU/Linux) iQIcBAABAgAGBQJbX4AwAAoJEBF7vIC1phx85eMP/ifsNHwqfAOrZBdlJuLVPla5 47J8iY4i4DOKGhKI4YOTcJQhn1izKZhECXS8d8hghB/sQUCE2CLVr1X/r1Udy2Pq bpKG4apYtcJZBF6qn7yDMjBGkIRK4OCBD1pkuKEq2NyvUgPsHUVUgpuq2gngMTBk ZN9MIfRQMdIEJsT389D6T9as0lwABJ0MJap5AudkQwguN2dDhQGeZv8l0QYV8C2I WqRI2VsI1QEo3cJr1lJ5li/F9fC7q0l6QwlvPVocIHJAnq01zJvOekeAgQ4hzz16 JIoQckJq8m4d4PqZ7aWmAaMEemoQ9llmCavovspJNtFT79jho6cWWtBEvq+t0GLQ qTsG9Yi20hONZMWAw+JIdSdOuFMD0HCpOWdUtSMjENFRbr8LLHUr91dGIxRLjF8Z gv3vDJrbGzCQ+b9qPA8SrAN7U3VNCZG384MEmobwTuv5hxOopWp6chcK7RCriV/m 7cFDfO7+2pZymdW7D4DWlFiZl4mWpwOxip32C9tCt0CQveqeYSZsb5Qb9Pe+50vr JhpB74UL79Wffvd65InGlu5jx1SdGG0QAzmBOkdOsAhX+0WMmXRB1ddn4whu7HPU ssNtdKgLt9KkM/kIsB9RC/YLvUFK1lBVHrfnzUmLw3CBHP3QeO+V+arLwdVLVDjV PA/LPECBWtGtQtxGWb2H =Y0Wl -----END PGP SIGNATURE----- Merge tag 'hlp_stage1' of git://git.kernel.org/pub/scm/linux/kernel/git/kvms390/linux into features Pull hlp_stage1 from Christian Borntraeger with the following changes: KVM: s390: initial host large page support - must be enabled via module parameter hpage=1 - cannot be used together with nested - does support migration - does support hugetlbfs - no THP yet	2018-07-31 07:14:19 +02:00
Janosch Frank	2375846193	KVM: s390: initial host large page support - must be enabled via module parameter hpage=1 - cannot be used together with nested - does support migration - does support hugetlbfs - no THP yet -----BEGIN PGP SIGNATURE----- Version: GnuPG v2.0.22 (GNU/Linux) iQIcBAABAgAGBQJbX4AwAAoJEBF7vIC1phx85eMP/ifsNHwqfAOrZBdlJuLVPla5 47J8iY4i4DOKGhKI4YOTcJQhn1izKZhECXS8d8hghB/sQUCE2CLVr1X/r1Udy2Pq bpKG4apYtcJZBF6qn7yDMjBGkIRK4OCBD1pkuKEq2NyvUgPsHUVUgpuq2gngMTBk ZN9MIfRQMdIEJsT389D6T9as0lwABJ0MJap5AudkQwguN2dDhQGeZv8l0QYV8C2I WqRI2VsI1QEo3cJr1lJ5li/F9fC7q0l6QwlvPVocIHJAnq01zJvOekeAgQ4hzz16 JIoQckJq8m4d4PqZ7aWmAaMEemoQ9llmCavovspJNtFT79jho6cWWtBEvq+t0GLQ qTsG9Yi20hONZMWAw+JIdSdOuFMD0HCpOWdUtSMjENFRbr8LLHUr91dGIxRLjF8Z gv3vDJrbGzCQ+b9qPA8SrAN7U3VNCZG384MEmobwTuv5hxOopWp6chcK7RCriV/m 7cFDfO7+2pZymdW7D4DWlFiZl4mWpwOxip32C9tCt0CQveqeYSZsb5Qb9Pe+50vr JhpB74UL79Wffvd65InGlu5jx1SdGG0QAzmBOkdOsAhX+0WMmXRB1ddn4whu7HPU ssNtdKgLt9KkM/kIsB9RC/YLvUFK1lBVHrfnzUmLw3CBHP3QeO+V+arLwdVLVDjV PA/LPECBWtGtQtxGWb2H =Y0Wl -----END PGP SIGNATURE----- Merge tag 'hlp_stage1' of git://git.kernel.org/pub/scm/linux/kernel/git/kvms390/linux into kvms390/next KVM: s390: initial host large page support - must be enabled via module parameter hpage=1 - cannot be used together with nested - does support migration - does support hugetlbfs - no THP yet	2018-07-30 23:20:48 +02:00
Janosch Frank	a9e00d8349	s390/mm: Add huge page gmap linking support Let's allow huge pmd linking when enabled through the KVM_CAP_S390_HPAGE_1M capability. Also we can now restrict gmap invalidation and notification to the cases where the capability has been activated and save some cycles when that's not the case. Signed-off-by: Janosch Frank <frankja@linux.ibm.com> Reviewed-by: David Hildenbrand <david@redhat.com>	2018-07-30 23:13:38 +02:00
Janosch Frank	57cb198cfd	KVM: s390: Beautify skey enable check Let's introduce an explicit check if skeys have already been enabled for the vcpu, so we don't have to check the mm context if we don't have the storage key facility. This lets us check for enablement without having to take the mm semaphore and thus speedup skey emulation. Signed-off-by: Janosch Frank <frankja@linux.ibm.com> Reviewed-by: David Hildenbrand <david@redhat.com> Acked-by: Farhan Ali <alifm@linux.ibm.com> Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>	2018-07-30 17:05:52 +02:00
Janosch Frank	3afdfca698	s390/mm: Clear skeys for newly mapped huge guest pmds Similarly to the pte skey handling, where we set the storage key to the default key for each newly mapped pte, we have to also do that for huge pmds. With the PG_arch_1 flag we keep track if the area has already been cleared of its skeys. Signed-off-by: Janosch Frank <frankja@linux.ibm.com> Reviewed-by: Martin Schwidefsky <schwidefsky@de.ibm.com>	2018-07-30 11:20:18 +01:00
Janosch Frank	0959e16867	s390/mm: Add huge page dirty sync support To do dirty loging with huge pages, we protect huge pmds in the gmap. When they are written to, we unprotect them and mark them dirty. We introduce the function gmap_test_and_clear_dirty_pmd which handles dirty sync for huge pages. Signed-off-by: Janosch Frank <frankja@linux.ibm.com> Acked-by: David Hildenbrand <david@redhat.com>	2018-07-30 11:20:18 +01:00
Janosch Frank	6a3762778d	s390/mm: Add gmap pmd invalidation and clearing If the host invalidates a pmd, we also have to invalidate the corresponding gmap pmds, as well as flush them from the TLB. This is necessary, as we don't share the pmd tables between host and guest as we do with ptes. The clearing part of these three new functions sets a guest pmd entry to _SEGMENT_ENTRY_EMPTY, so the guest will fault on it and we will re-link it. Flushing the gmap is not necessary in the host's lazy local and csp cases. Both purge the TLB completely. Signed-off-by: Janosch Frank <frankja@linux.vnet.ibm.com> Reviewed-by: Martin Schwidefsky <schwidefsky@de.ibm.com> Acked-by: David Hildenbrand <david@redhat.com>	2018-07-30 11:20:17 +01:00
Janosch Frank	7c4b13a7c0	s390/mm: Add gmap pmd notification bit setting Like for ptes, we also need invalidation notification for pmds, to make sure the guest lowcore pages are always accessible and later addition of shadowed pmds. With PMDs we do not have PGSTEs or some other bits we could use in the host PMD. Instead we pick one of the free bits in the gmap PMD. Every time a host pmd will be invalidated, we will check if the respective gmap PMD has the bit set and in that case fire up the notifier. Signed-off-by: Janosch Frank <frankja@linux.ibm.com>	2018-07-30 11:20:17 +01:00
Janosch Frank	58b7e200d2	s390/mm: Add gmap pmd linking Let's allow pmds to be linked into gmap for the upcoming s390 KVM huge page support. Before this patch we copied the full userspace pmd entry. This is not correct, as it contains SW defined bits that might be interpreted differently in the GMAP context. Now we only copy over all hardware relevant information leaving out the software bits. Signed-off-by: Janosch Frank <frankja@linux.ibm.com> Reviewed-by: David Hildenbrand <david@redhat.com>	2018-07-30 11:20:17 +01:00
Janosch Frank	2c46e974dd	s390/mm: Abstract gmap notify bit setting Currently we use the software PGSTE bits PGSTE_IN_BIT and PGSTE_VSIE_BIT to notify before an invalidation occurs on a prefix page or a VSIE page respectively. Both bits are pgste specific, but are used when protecting a memory range. Let's introduce abstract GMAP_NOTIFY_* bits that will be realized into the respective bits when gmap DAT table entries are protected. Signed-off-by: Janosch Frank <frankja@linux.vnet.ibm.com> Reviewed-by: Christian Borntraeger <borntraeger@de.ibm.com> Reviewed-by: David Hildenbrand <david@redhat.com>	2018-07-30 11:20:17 +01:00
Christian Borntraeger	a3da7b4a3b	KVM: s390: add etoken support for guests We want to provide facility 156 (etoken facility) to our guests. This includes migration support (via sync regs) and VSIE changes. The tokens are being reset on clear reset. This has to be implemented by userspace (via sync regs). Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com> Reviewed-by: David Hildenbrand <david@redhat.com> Acked-by: Cornelia Huck <cohuck@redhat.com>	2018-07-19 12:59:36 +02:00
Sebastian Ott	ccaabeea02	s390/chsc: fix packed-not-aligned warnings Remove attribute packed where possible failing this add proper alignment information to fix warnings like the one below: drivers/s390/cio/chsc.c: In function 'chsc_siosl': drivers/s390/cio/chsc.c:1287:2: warning: alignment 1 of 'struct <anonymous>' is less than 4 [-Wpacked-not-aligned] } __attribute__ ((packed)) siosl_area; Note: this patch should be a nop since non of these structs use auto storage but allocated pages. However there are changes to the generated code because of additional padding at the end of some of the structs due to alignment when memset(foo, 0, sizeof(foo)) is used. Signed-off-by: Sebastian Ott <sebott@linux.ibm.com> Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>	2018-07-17 07:27:56 +02:00
Claudio Imbrenda	afdad61615	KVM: s390: Fix storage attributes migration with memory slots This is a fix for several issues that were found in the original code for storage attributes migration. Now no bitmap is allocated to keep track of dirty storage attributes; the extra bits of the per-memslot bitmap that are always present anyway are now used for this purpose. The code has also been refactored a little to improve readability. Fixes: `190df4a212` ("KVM: s390: CMMA tracking, ESSA emulation, migration mode") Fixes: `4036e3874a` ("KVM: s390: ioctls to get and set guest storage attributes") Acked-by: Janosch Frank <frankja@linux.vnet.ibm.com> Signed-off-by: Claudio Imbrenda <imbrenda@linux.vnet.ibm.com> Message-Id: <1525106005-13931-3-git-send-email-imbrenda@linux.vnet.ibm.com> Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>	2018-07-13 09:48:57 +02:00
Philipp Rudo	287d6070ac	s390/purgatory: Remove duplicate variable definitions Currently there are some variables in the purgatory (e.g. kernel_entry) which are defined twice, once in assembler- and once in c-code. The reason for this is that these variables are set during purgatory load, where sanity checks on the corresponding Elf_Sym's are made, while they are used in assembler-code. Thus adding a second definition in c-code is a handy workaround to guarantee correct Elf_Sym's are created. When the purgatory is compiled with -fcommon (default for gcc on s390) this is no problem because both symbols are merged by the linker. However this is not required by ISO C and when the purgatory is built with -fno-common the linker fails with errors like arch/s390/purgatory/purgatory.o:(.bss+0x18): multiple definition of `kernel_entry' arch/s390/purgatory/head.o:/.../arch/s390/purgatory/head.S:230: first defined here Thus remove the duplicate definitions and add the required size and type information to the assembler definition. Also add -fno-common to the command line options to prevent similar hacks in the future. Signed-off-by: Philipp Rudo <prudo@linux.ibm.com> Acked-by: Heiko Carstens <heiko.carstens@de.ibm.com> Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>	2018-07-06 08:47:51 +02:00

... 2 3 4 5 6 ...

2541 Commits