linux_dsm_epyc7002

mirror of https://github.com/AuxXxilium/linux_dsm_epyc7002.git synced 2024-12-12 10:46:47 +07:00

Author	SHA1	Message	Date
Tom Herbert	a87cb3e48e	net: Facility to report route quality of connected sockets This patch add the SO_CNX_ADVICE socket option (setsockopt only). The purpose is to allow an application to give feedback to the kernel about the quality of the network path for a connected socket. The value argument indicates the type of quality report. For this initial patch the only supported advice is a value of 1 which indicates "bad path, please reroute"-- the action taken by the kernel is to call dst_negative_advice which will attempt to choose a different ECMP route, reset the TX hash for flow label and UDP source port in encapsulation, etc. This facility should be useful for connected UDP sockets where only the application can provide any feedback about path quality. It could also be useful for TCP applications that have additional knowledge about the path outside of the normal TCP control loop. Signed-off-by: Tom Herbert <tom@herbertland.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2016-02-25 22:01:22 -05:00
Marcelo Tosatti	8577370fb0	KVM: Use simple waitqueue for vcpu->wq The problem: On -rt, an emulated LAPIC timer instances has the following path: 1) hard interrupt 2) ksoftirqd is scheduled 3) ksoftirqd wakes up vcpu thread 4) vcpu thread is scheduled This extra context switch introduces unnecessary latency in the LAPIC path for a KVM guest. The solution: Allow waking up vcpu thread from hardirq context, thus avoiding the need for ksoftirqd to be scheduled. Normal waitqueues make use of spinlocks, which on -RT are sleepable locks. Therefore, waking up a waitqueue waiter involves locking a sleeping lock, which is not allowed from hard interrupt context. cyclictest command line: This patch reduces the average latency in my tests from 14us to 11us. Daniel writes: Paolo asked for numbers from kvm-unit-tests/tscdeadline_latency benchmark on mainline. The test was run 1000 times on tip/sched/core 4.4.0-rc8-01134-g0905f04: ./x86-run x86/tscdeadline_latency.flat -cpu host with idle=poll. The test seems not to deliver really stable numbers though most of them are smaller. Paolo write: "Anything above ~10000 cycles means that the host went to C1 or lower---the number means more or less nothing in that case. The mean shows an improvement indeed." Before: min max mean std count 1000.000000 1000.000000 1000.000000 1000.000000 mean 5162.596000 2019270.084000 5824.491541 20681.645558 std 75.431231 622607.723969 89.575700 6492.272062 min 4466.000000 23928.000000 5537.926500 585.864966 25% 5163.000000 1613252.750000 5790.132275 16683.745433 50% 5175.000000 2281919.000000 5834.654000 23151.990026 75% 5190.000000 2382865.750000 5861.412950 24148.206168 max 5228.000000 4175158.000000 6254.827300 46481.048691 After min max mean std count 1000.000000 1000.00000 1000.000000 1000.000000 mean 5143.511000 2076886.10300 5813.312474 21207.357565 std 77.668322 610413.09583 86.541500 6331.915127 min 4427.000000 25103.00000 5529.756600 559.187707 25% 5148.000000 1691272.75000 5784.889825 17473.518244 50% 5160.000000 2308328.50000 5832.025000 23464.837068 75% 5172.000000 2393037.75000 5853.177675 24223.969976 max 5222.000000 3922458.00000 6186.720500 42520.379830 [Patch was originaly based on the swait implementation found in the -rt tree. Daniel ported it to mainline's version and gathered the benchmark numbers for tscdeadline_latency test.] Signed-off-by: Daniel Wagner <daniel.wagner@bmw-carit.de> Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org> Cc: linux-rt-users@vger.kernel.org Cc: Boqun Feng <boqun.feng@gmail.com> Cc: Marcelo Tosatti <mtosatti@redhat.com> Cc: Steven Rostedt <rostedt@goodmis.org> Cc: Paul Gortmaker <paul.gortmaker@windriver.com> Cc: Paolo Bonzini <pbonzini@redhat.com> Cc: "Paul E. McKenney" <paulmck@linux.vnet.ibm.com> Link: http://lkml.kernel.org/r/1455871601-27484-4-git-send-email-wagi@monom.org Signed-off-by: Thomas Gleixner <tglx@linutronix.de>	2016-02-25 11:27:16 +01:00
Heiko Carstens	993e068108	s390/oprofile: add z13/z13s model numbers Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com> Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>	2016-02-24 12:18:10 +01:00
Heiko Carstens	bb3aa61401	s390: add z13s model number to z13 elf platform Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com> Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>	2016-02-24 12:18:07 +01:00
Martin Schwidefsky	13c6a790d7	s390/mm: correct comment about segment table entries The comment describing the bit encoding for segment table entries is incorrect in regard to the read and write bits. The segment read bit is 0x0002 and write is 0x0001, not the other way around. Reported-by: Gerald Schaefer <gerald.schaefer@de.ibm.com> Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>	2016-02-23 08:56:20 +01:00
Heiko Carstens	758d39ebd3	s390/dumpstack: merge all four stack tracers We have four different stack tracers of which three had bugs. So it's time to merge them to a single stack tracer which allows to specify a call back function which will be called for each step. This patch changes behavior a bit: - the "nosched" and "in_sched_functions" check within save_stack_trace_tsk did work only for the last stack frame within a context. Now it considers the check for each stack frame like it should. - both the oprofile variant and the perf_events variant did save a return address twice if a zero back chain was detected, which indicates an interrupt frame. The new dump_trace function will call the oprofile and perf_events backends with the psw address that is contained within the corresponding pt_regs structure instead. - the original show_trace and save_context_stack functions did already use the psw address of the pt_regs structure if a zero back chain was detected. However now we ignore the psw address if it is a user space address. After all we trace the kernel stack and not the user space stack. This way we also get rid of the garbage user space address in case of warnings and / or panic call traces. So this should make life easier since now there is only one stack tracer left which we can break. Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com> Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>	2016-02-23 08:56:20 +01:00
Martin Schwidefsky	3c2c126a67	s390/mm: remove unnecessary indirection with pgste_update_all The first parameter of pgste_update_all is a pointer to a pte. Simplify the code by passing the pte value. Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>	2016-02-23 08:56:19 +01:00
Heiko Carstens	1f021ea0da	s390/dumpstack: use bit fields to decode psw mask in show_registers() Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com> Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>	2016-02-23 08:56:19 +01:00
Heiko Carstens	ca0fd75a54	s390/dumpstack: add missing ri bit to show_registers() output Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com> Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>	2016-02-23 08:56:18 +01:00
Heiko Carstens	76737ce17a	s390: add current_stack_pointer() helper function Implement current_stack_pointer() helper function and use it everywhere, instead of having several different inline assembly variants. Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com> Tested-by: Peter Oberparleiter <oberpar@linux.vnet.ibm.com> Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>	2016-02-23 08:56:18 +01:00
Heiko Carstens	77ec8a5451	s390/stacktrace: use nosched instead of savesched parameter Use the inversed "nosched" logic like all other architectures. Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com> Tested-by: Peter Oberparleiter <oberpar@linux.vnet.ibm.com> Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>	2016-02-23 08:56:17 +01:00
Martin Schwidefsky	007ccec53d	s390/pageattr: do a single TLB flush for change_page_attr The change of the access rights for an address range in the kernel address space is currently done with a loop of IPTE + a store of the modified PTE. Between the IPTE and the store the PTE will be invalid, this intermediate state can cause problems with concurrent accesses. Consider a change of a kernel area from read-write to read-only, a concurrent reader of that area should be fine but with the invalid PTE it might get an unexpected exception. Remove the IPTEs for each PTE and do a global flush after all PTEs have been modified. Reviewed-by: Heiko Carstens <heiko.carstens@de.ibm.com> Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>	2016-02-23 08:56:17 +01:00
Martin Schwidefsky	2cfc5f9ce7	s390/xor: optimized xor routing using the XC instruction Acked-by: Christian Borntraeger <borntraeger@de.ibm.com> Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>	2016-02-23 08:56:17 +01:00
Sebastian Ott	9a99649f2a	s390/pci: remove pdev pointer from arch data For each PCI function we need to maintain arch specific data in struct zpci_dev which also contains a pointer to struct pci_dev. When a function is registered or deregistered (which is triggered by PCI common code) we need to adjust that pointer which could interfere with the machine check handler (triggered by FW) using zpci_dev->pdev. Since multiple instances of the same pdev could exist at a time this can't be solved with locking. Fix that by ditching the pdev pointer and use a bus walk to reach struct pci_dev (only one instance of a pdev can be registered at the bus at a time). Signed-off-by: Sebastian Ott <sebott@linux.vnet.ibm.com> Reviewed-by: Gerald Schaefer <gerald.schaefer@de.ibm.com> Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>	2016-02-23 08:56:16 +01:00
Martin Schwidefsky	1b17cb796f	s390/fpu: signals vs. floating point control register git commit `904818e2f2` "s390/kernel: introduce fpu-internal.h with fpu helper functions" introduced the fpregs_store / fp_regs_load helper. These function fail to save and restore the floating pointer control registers. The effect is that the FPC is not correctly handled on signal delivery and signal return. Cc: stable@vger.kernel.org # 4.4 Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>	2016-02-22 09:29:35 +01:00
Martin Schwidefsky	342300cc9c	s390/compat: correct restore of high gprs on signal return git commit `8070361799` "s390: add support for vector extension" broke 31-bit compat processes in regard to signal handling. The restore_sigregs_ext32() function is used to restore the additional elements from the user space signal frame. Among the additional elements are the upper registers halves for 64-bit register support for 31-bit processes. The copy_from_user that is used to retrieve the high-gprs array from the user stack uses an incorrect length, 8 bytes instead of 64 bytes. This causes incorrect upper register halves to get loaded. Cc: stable@vger.kernel.org # 3.8+ Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>	2016-02-22 09:29:35 +01:00
Linus Torvalds	ff5f16820f	Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/s390/linux Pull s390 fixes from Martin Schwidefsky: "Several bug fixes: - There are four different stack tracers, and three of them have bugs. For 4.5 the bugs are fixed and we prepare a cleanup patch for the next merge window. - Three bug fixes for the dasd driver in regard to parallel access volumes and the new max_dev_sectors block device queue limit - The irq restore optimization needs a fixup for memcpy_real - The diagnose trace code has a conflict with lockdep" * 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/s390/linux: s390/dasd: fix performance drop s390/maccess: reduce stnsm instructions s390/diag: avoid lockdep recursion s390/dasd: fix refcount for PAV reassignment s390/dasd: prevent incorrect length error under z/VM after PAV changes s390: fix DAT off memory access, e.g. on kdump s390/oprofile: fix address range for asynchronous stack s390/perf_event: fix address range for asynchronous stack s390/stacktrace: add save_stack_trace_regs() s390/stacktrace: save full stack traces s390/stacktrace: add missing end marker s390/stacktrace: fix address ranges for asynchronous and panic stack s390/stacktrace: fix save_stack_trace_tsk() for current task	2016-02-19 08:33:12 -08:00
Linus Torvalds	705d43dbe1	Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jikos/livepatching Pull livepatching fixes from Jiri Kosina: - regression (from 4.4) fix for ordering issue, introduced by an earlier ftrace change, that broke live patching of modules. The fix replaces the ftrace module notifier by direct call in order to make the ordering guaranteed and well-defined. The patch, from Jessica Yu, has been acked both by Steven and Rusty - error message fix from Miroslav Benes * 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jikos/livepatching: ftrace/module: remove ftrace module notifier livepatch: change the error message in asm/livepatch.h header files	2016-02-18 16:34:15 -08:00
Dave Hansen	d61172b4b6	mm/core, x86/mm/pkeys: Differentiate instruction fetches As discussed earlier, we attempt to enforce protection keys in software. However, the code checks all faults to ensure that they are not violating protection key permissions. It was assumed that all faults are either write faults where we check PKRU[key].WD (write disable) or read faults where we check the AD (access disable) bit. But, there is a third category of faults for protection keys: instruction faults. Instruction faults never run afoul of protection keys because they do not affect instruction fetches. So, plumb the PF_INSTR bit down in to the arch_vma_access_permitted() function where we do the protection key checks. We also add a new FAULT_FLAG_INSTRUCTION. This is because handle_mm_fault() is not passed the architecture-specific error_code where we keep PF_INSTR, so we need to encode the instruction fetch information in to the arch-generic fault flags. Signed-off-by: Dave Hansen <dave.hansen@linux.intel.com> Reviewed-by: Thomas Gleixner <tglx@linutronix.de> Cc: Andrew Morton <akpm@linux-foundation.org> Cc: Andy Lutomirski <luto@amacapital.net> Cc: Borislav Petkov <bp@alien8.de> Cc: Brian Gerst <brgerst@gmail.com> Cc: Dave Hansen <dave@sr71.net> Cc: Denys Vlasenko <dvlasenk@redhat.com> Cc: H. Peter Anvin <hpa@zytor.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Rik van Riel <riel@redhat.com> Cc: linux-mm@kvack.org Link: http://lkml.kernel.org/r/20160212210224.96928009@viggo.jf.intel.com Signed-off-by: Ingo Molnar <mingo@kernel.org>	2016-02-18 19:46:29 +01:00
Dave Hansen	1b2ee1266e	mm/core: Do not enforce PKEY permissions on remote mm access We try to enforce protection keys in software the same way that we do in hardware. (See long example below). But, we only want to do this when accessing our own process's memory. If GDB set PKRU[6].AD=1 (disable access to PKEY 6), then tried to PTRACE_POKE a target process which just happened to have some mprotect_pkey(pkey=6) memory, we do not want to deny the debugger access to that memory. PKRU is fundamentally a thread-local structure and we do not want to enforce it on access to _another_ thread's data. This gets especially tricky when we have workqueues or other delayed-work mechanisms that might run in a random process's context. We can check that we only enforce pkeys when operating on our own mm, but delayed work gets performed when a random user context is active. We might end up with a situation where a delayed-work gup fails when running randomly under its "own" task but succeeds when running under another process. We want to avoid that. To avoid that, we use the new GUP flag: FOLL_REMOTE and add a fault flag: FAULT_FLAG_REMOTE. They indicate that we are walking an mm which is not guranteed to be the same as current->mm and should not be subject to protection key enforcement. Thanks to Jerome Glisse for pointing out this scenario. Signed-off-by: Dave Hansen <dave.hansen@linux.intel.com> Reviewed-by: Thomas Gleixner <tglx@linutronix.de> Cc: Alexey Kardashevskiy <aik@ozlabs.ru> Cc: Andrea Arcangeli <aarcange@redhat.com> Cc: Andrew Morton <akpm@linux-foundation.org> Cc: Andy Lutomirski <luto@amacapital.net> Cc: Andy Lutomirski <luto@kernel.org> Cc: Arnd Bergmann <arnd@arndb.de> Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org> Cc: Boaz Harrosh <boaz@plexistor.com> Cc: Borislav Petkov <bp@alien8.de> Cc: Brian Gerst <brgerst@gmail.com> Cc: Dan Williams <dan.j.williams@intel.com> Cc: Dave Chinner <dchinner@redhat.com> Cc: Dave Hansen <dave.hansen@linux.intel.com> Cc: David Gibson <david@gibson.dropbear.id.au> Cc: Denys Vlasenko <dvlasenk@redhat.com> Cc: Dominik Dingel <dingel@linux.vnet.ibm.com> Cc: Dominik Vogt <vogt@linux.vnet.ibm.com> Cc: Eric B Munson <emunson@akamai.com> Cc: Geliang Tang <geliangtang@163.com> Cc: Guan Xuetao <gxt@mprc.pku.edu.cn> Cc: H. Peter Anvin <hpa@zytor.com> Cc: Heiko Carstens <heiko.carstens@de.ibm.com> Cc: Hugh Dickins <hughd@google.com> Cc: Jan Kara <jack@suse.cz> Cc: Jason Low <jason.low2@hp.com> Cc: Jerome Marchand <jmarchan@redhat.com> Cc: Joerg Roedel <joro@8bytes.org> Cc: Kirill A. Shutemov <kirill.shutemov@linux.intel.com> Cc: Konstantin Khlebnikov <koct9i@gmail.com> Cc: Laurent Dufour <ldufour@linux.vnet.ibm.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Martin Schwidefsky <schwidefsky@de.ibm.com> Cc: Matthew Wilcox <willy@linux.intel.com> Cc: Mel Gorman <mgorman@suse.de> Cc: Michael Ellerman <mpe@ellerman.id.au> Cc: Michal Hocko <mhocko@suse.com> Cc: Mikulas Patocka <mpatocka@redhat.com> Cc: Minchan Kim <minchan@kernel.org> Cc: Oleg Nesterov <oleg@redhat.com> Cc: Paul Mackerras <paulus@samba.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Rik van Riel <riel@redhat.com> Cc: Sasha Levin <sasha.levin@oracle.com> Cc: Shachar Raindel <raindel@mellanox.com> Cc: Vlastimil Babka <vbabka@suse.cz> Cc: Xie XiuQi <xiexiuqi@huawei.com> Cc: iommu@lists.linux-foundation.org Cc: linux-arch@vger.kernel.org Cc: linux-kernel@vger.kernel.org Cc: linux-mm@kvack.org Cc: linux-s390@vger.kernel.org Cc: linuxppc-dev@lists.ozlabs.org Signed-off-by: Ingo Molnar <mingo@kernel.org>	2016-02-18 19:46:28 +01:00
Dave Hansen	33a709b25a	mm/gup, x86/mm/pkeys: Check VMAs and PTEs for protection keys Today, for normal faults and page table walks, we check the VMA and/or PTE to ensure that it is compatible with the action. For instance, if we get a write fault on a non-writeable VMA, we SIGSEGV. We try to do the same thing for protection keys. Basically, we try to make sure that if a user does this: mprotect(ptr, size, PROT_NONE); ptr = foo; they see the same effects with protection keys when they do this: mprotect(ptr, size, PROT_READ\|PROT_WRITE); set_pkey(ptr, size, 4); wrpkru(0xffffff3f); // access disable pkey 4 ptr = foo; The state to do that checking is in the VMA, but we also sometimes have to do it on the page tables only, like when doing a get_user_pages_fast() where we have no VMA. We add two functions and expose them to generic code: arch_pte_access_permitted(pte_flags, write) arch_vma_access_permitted(vma, write) These are, of course, backed up in x86 arch code with checks against the PTE or VMA's protection key. But, there are also cases where we do not want to respect protection keys. When we ptrace(), for instance, we do not want to apply the tracer's PKRU permissions to the PTEs from the process being traced. Signed-off-by: Dave Hansen <dave.hansen@linux.intel.com> Reviewed-by: Thomas Gleixner <tglx@linutronix.de> Cc: Alexey Kardashevskiy <aik@ozlabs.ru> Cc: Andrew Morton <akpm@linux-foundation.org> Cc: Andy Lutomirski <luto@amacapital.net> Cc: Andy Lutomirski <luto@kernel.org> Cc: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com> Cc: Arnd Bergmann <arnd@arndb.de> Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org> Cc: Boaz Harrosh <boaz@plexistor.com> Cc: Borislav Petkov <bp@alien8.de> Cc: Brian Gerst <brgerst@gmail.com> Cc: Dan Williams <dan.j.williams@intel.com> Cc: Dave Hansen <dave@sr71.net> Cc: David Gibson <david@gibson.dropbear.id.au> Cc: David Hildenbrand <dahi@linux.vnet.ibm.com> Cc: David Vrabel <david.vrabel@citrix.com> Cc: Denys Vlasenko <dvlasenk@redhat.com> Cc: Dominik Dingel <dingel@linux.vnet.ibm.com> Cc: Dominik Vogt <vogt@linux.vnet.ibm.com> Cc: Guan Xuetao <gxt@mprc.pku.edu.cn> Cc: H. Peter Anvin <hpa@zytor.com> Cc: Heiko Carstens <heiko.carstens@de.ibm.com> Cc: Hugh Dickins <hughd@google.com> Cc: Jason Low <jason.low2@hp.com> Cc: Jerome Marchand <jmarchan@redhat.com> Cc: Juergen Gross <jgross@suse.com> Cc: Kirill A. Shutemov <kirill.shutemov@linux.intel.com> Cc: Laurent Dufour <ldufour@linux.vnet.ibm.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Martin Schwidefsky <schwidefsky@de.ibm.com> Cc: Matthew Wilcox <willy@linux.intel.com> Cc: Mel Gorman <mgorman@suse.de> Cc: Michael Ellerman <mpe@ellerman.id.au> Cc: Michal Hocko <mhocko@suse.com> Cc: Mikulas Patocka <mpatocka@redhat.com> Cc: Minchan Kim <minchan@kernel.org> Cc: Paul Mackerras <paulus@samba.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Rik van Riel <riel@redhat.com> Cc: Sasha Levin <sasha.levin@oracle.com> Cc: Shachar Raindel <raindel@mellanox.com> Cc: Stephen Smalley <sds@tycho.nsa.gov> Cc: Toshi Kani <toshi.kani@hpe.com> Cc: Vlastimil Babka <vbabka@suse.cz> Cc: linux-arch@vger.kernel.org Cc: linux-kernel@vger.kernel.org Cc: linux-mm@kvack.org Cc: linux-s390@vger.kernel.org Cc: linuxppc-dev@lists.ozlabs.org Link: http://lkml.kernel.org/r/20160212210219.14D5D715@viggo.jf.intel.com Signed-off-by: Ingo Molnar <mingo@kernel.org>	2016-02-18 09:32:44 +01:00
Heiko Carstens	52499d93d6	s390/maccess: reduce stnsm instructions When fixing the DAT off bug ("s390: fix DAT off memory access, e.g. on kdump") both Christian and I missed that we can save an additional stnsm instruction. This saves us a couple of cycles which could improve the speed of memcpy_real. Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com> Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>	2016-02-17 09:05:04 +01:00
Stephan Mueller	49abc0d2e1	crypto: xts - fix compile errors Commit `28856a9e52` missed the addition of the crypto/xts.h include file for different architecture-specific AES implementations. Signed-off-by: Stephan Mueller <smueller@chronox.de> Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>	2016-02-17 15:53:02 +08:00
Stephan Mueller	28856a9e52	crypto: xts - consolidate sanity check for keys The patch centralizes the XTS key check logic into the service function xts_check_key which is invoked from the different XTS implementations. With this, the XTS implementations in ARM, ARM64, PPC and S390 have now a sanity check for the XTS keys similar to the other arches. In addition, this service function received a check to ensure that the key != the tweak key which is mandated by FIPS 140-2 IG A.9. As the check is not present in the standards defining XTS, it is only enforced in FIPS mode of the kernel. Signed-off-by: Stephan Mueller <smueller@chronox.de> Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>	2016-02-17 04:07:51 +08:00
Dave Hansen	d4edcf0d56	mm/gup: Switch all callers of get_user_pages() to not pass tsk/mm We will soon modify the vanilla get_user_pages() so it can no longer be used on mm/tasks other than 'current/current->mm', which is by far the most common way it is called. For now, we allow the old-style calls, but warn when they are used. (implemented in previous patch) This patch switches all callers of: get_user_pages() get_user_pages_unlocked() get_user_pages_locked() to stop passing tsk/mm so they will no longer see the warnings. Signed-off-by: Dave Hansen <dave.hansen@linux.intel.com> Reviewed-by: Thomas Gleixner <tglx@linutronix.de> Cc: Andrea Arcangeli <aarcange@redhat.com> Cc: Andrew Morton <akpm@linux-foundation.org> Cc: Andy Lutomirski <luto@amacapital.net> Cc: Borislav Petkov <bp@alien8.de> Cc: Brian Gerst <brgerst@gmail.com> Cc: Dave Hansen <dave@sr71.net> Cc: Denys Vlasenko <dvlasenk@redhat.com> Cc: H. Peter Anvin <hpa@zytor.com> Cc: Kirill A. Shutemov <kirill.shutemov@linux.intel.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Naoya Horiguchi <n-horiguchi@ah.jp.nec.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Rik van Riel <riel@redhat.com> Cc: Srikar Dronamraju <srikar@linux.vnet.ibm.com> Cc: Vlastimil Babka <vbabka@suse.cz> Cc: jack@suse.cz Cc: linux-mm@kvack.org Link: http://lkml.kernel.org/r/20160212210156.113E9407@viggo.jf.intel.com Signed-off-by: Ingo Molnar <mingo@kernel.org>	2016-02-16 10:11:12 +01:00
Heiko Carstens	f6c9b16023	s390/diag: avoid lockdep recursion The diagnose tracer will indirectly call back into the lockdep code when lockdep does not expect it (arch_spinlock). This causes lockdep to disable itself and therefore we don't have a working lock dependency validator anymore. This patch effectively disables tracing of diag 0x9c and 0x44 if lockdep is enabled. If however lockdep is enabled spinlocks are mainly implemented using a trylock variant, which will not issue any diag 0x9c or 0x44. So this change has hardly any effect on tracing except when arch_spinlock and friends are explicitly used. Reported-and-Tested-by: David Hildenbrand <dahi@linux.vnet.ibm.com> Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com> Reviewed-by: David Hildenbrand <dahi@linux.vnet.ibm.com> Acked-by: Martin Schwidefsky <schwidefsky@de.ibm.com> Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>	2016-02-11 13:05:56 +01:00
Christian Borntraeger	0986d97741	s390: fix DAT off memory access, e.g. on kdump commit `204ee2c564` ("s390/irqflags: optimize irq restore") optimized irqrestore to really only care about interrupts and adapted the remaining low level users. One spot (memcpy_real) was not touched, though - fix it. Otherwise a kdump kernel will fail while reading the old kernel. As we re-enable irqs with a non-standard function we have to tell lockdep about that. Fixes: `204ee2c564` ("s390/irqflags: optimize irq restore") Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com> Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>	2016-02-11 13:05:51 +01:00
Christian Borntraeger	1763f8d09d	KVM: s390: bail out early on fatal signal in dirty logging A KVM_GET_DIRTY_LOG ioctl might take a long time. This can result in fatal signals seemingly being ignored. Lets bail out during the dirty bit sync, if a fatal signal is pending. Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com> Reviewed-by: David Hildenbrand <dahi@linux.vnet.ibm.com> Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>	2016-02-10 13:12:57 +01:00
Christian Borntraeger	70c88a00fb	KVM: s390: do not block CPU on dirty logging When doing dirty logging on huge guests (e.g.600GB) we sometimes get rcu stall timeouts with backtraces like [ 2753.194083] ([<0000000000112fb2>] show_trace+0x12a/0x130) [ 2753.194092] [<0000000000113024>] show_stack+0x6c/0xe8 [ 2753.194094] [<00000000001ee6a8>] rcu_pending+0x358/0xa48 [ 2753.194099] [<00000000001f20cc>] rcu_check_callbacks+0x84/0x168 [ 2753.194102] [<0000000000167654>] update_process_times+0x54/0x80 [ 2753.194107] [<00000000001bdb5c>] tick_sched_handle.isra.16+0x4c/0x60 [ 2753.194113] [<00000000001bdbd8>] tick_sched_timer+0x68/0x90 [ 2753.194115] [<0000000000182a88>] __run_hrtimer+0x88/0x1f8 [ 2753.194119] [<00000000001838ba>] hrtimer_interrupt+0x122/0x2b0 [ 2753.194121] [<000000000010d034>] do_extint+0x16c/0x170 [ 2753.194123] [<00000000005e206e>] ext_skip+0x38/0x3e [ 2753.194129] [<000000000012157c>] gmap_test_and_clear_dirty+0xcc/0x118 [ 2753.194134] ([<00000000001214ea>] gmap_test_and_clear_dirty+0x3a/0x118) [ 2753.194137] [<0000000000132da4>] kvm_vm_ioctl_get_dirty_log+0xd4/0x1b0 [ 2753.194143] [<000000000012ac12>] kvm_vm_ioctl+0x21a/0x548 [ 2753.194146] [<00000000002b57f6>] do_vfs_ioctl+0x30e/0x518 [ 2753.194149] [<00000000002b5a9c>] SyS_ioctl+0x9c/0xb0 [ 2753.194151] [<00000000005e1ae6>] sysc_tracego+0x14/0x1a [ 2753.194153] [<000003ffb75f3972>] 0x3ffb75f3972 We should do a cond_resched in here. Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com> Reviewed-by: David Hildenbrand <dahi@linux.vnet.ibm.com> Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>	2016-02-10 13:12:57 +01:00
Christian Borntraeger	ab99a1cc7a	KVM: s390: do not take mmap_sem on dirty log query Dirty log query can take a long time for huge guests. Holding the mmap_sem for very long times can cause some unwanted latencies. Turns out that we do not need to hold the mmap semaphore. We hold the slots_lock for gfn->hva translation and walk the page tables with that address, so no need to look at the VMAs. KVM also holds a reference to the mm, which should prevent other things going away. During the walk we take the necessary ptl locks. Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com> Reviewed-by: David Hildenbrand <dahi@linux.vnet.ibm.com> Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>	2016-02-10 13:12:56 +01:00
David Hildenbrand	efa48163b8	KVM: s390: remove old fragment of vector registers Since commit `9977e886cb` ("s390/kernel: lazy restore fpu registers"), vregs in struct sie_page is unsed. We can safely remove the field and the definition. Signed-off-by: David Hildenbrand <dahi@linux.vnet.ibm.com> Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>	2016-02-10 13:12:54 +01:00
David Hildenbrand	9b0d721a07	KVM: s390: instruction-fetching exceptions on SIE faults On instruction-fetch exceptions, we have to forward the PSW by any valid ilc and correctly use that ilc when injecting the irq. Injection will already take care of rewinding the PSW if we injected a nullifying program irq, so we don't need special handling prior to injection. Until now, autodetection would have guessed an ilc of 0. Reviewed-by: Christian Borntraeger <borntraeger@de.ibm.com> Signed-off-by: David Hildenbrand <dahi@linux.vnet.ibm.com> Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>	2016-02-10 13:12:54 +01:00
David Hildenbrand	5631792053	KVM: s390: provide prog irq ilc on SIE faults On SIE faults, the ilc cannot be detected automatically, as the icptcode is 0. The ilc indicated in the program irq will always be 0. Therefore we have to manually specify the ilc in order to tell the guest which ilen was used when forwarding the PSW. Reviewed-by: Christian Borntraeger <borntraeger@de.ibm.com> Signed-off-by: David Hildenbrand <dahi@linux.vnet.ibm.com> Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>	2016-02-10 13:12:53 +01:00
David Hildenbrand	eaa4f41642	KVM: s390: irq delivery should not rely on icptcode Program irq injection during program irq intercepts is the last candidates that injects nullifying irqs and relies on delivery to do the right thing. As we should not rely on the icptcode during any delivery (because that value will not be migrated), let's add a flag, telling prog IRQ delivery to not rewind the PSW in case of nullifying prog IRQs. Signed-off-by: David Hildenbrand <dahi@linux.vnet.ibm.com> Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>	2016-02-10 13:12:53 +01:00
David Hildenbrand	f6af84e7e7	KVM: s390: clean up prog irq injection on prog irq icpts __extract_prog_irq() is used only once for getting the program check data in one place. Let's combine it with an injection function to avoid a memset and to prevent misuse on injection by simplifying the interface to only have the VCPU as parameter. Reviewed-by: Christian Borntraeger <borntraeger@de.ibm.com> Signed-off-by: David Hildenbrand <dahi@linux.vnet.ibm.com> Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>	2016-02-10 13:12:52 +01:00
David Hildenbrand	6597732275	KVM: s390: read the correct opcode on SIE faults Let's use our fresh new function read_guest_instr() to access guest storage via the correct addressing schema. Reviewed-by: Christian Borntraeger <borntraeger@de.ibm.com> Signed-off-by: David Hildenbrand <dahi@linux.vnet.ibm.com> Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>	2016-02-10 13:12:51 +01:00
David Hildenbrand	34346b9a93	KVM: s390: gaccess: implement instruction fetching mode When an instruction is to be fetched, special handling applies to secondary-space mode and access-register mode. The instruction is to be fetched from primary space. We can easily support this by selecting the right asce for translation. Access registers will never be used during translation, so don't include them in the interface. As we only want to read from the current PSW address for now, let's also hide that detail. Reviewed-by: Christian Borntraeger <borntraeger@de.ibm.com> Signed-off-by: David Hildenbrand <dahi@linux.vnet.ibm.com> Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>	2016-02-10 13:12:51 +01:00
David Hildenbrand	92c9632119	KVM: s390: gaccess: introduce access modes We will need special handling when fetching instructions, so let's introduce new guest access modes GACC_FETCH and GACC_STORE instead of a write flag. An additional patch will then introduce GACC_IFETCH. Reviewed-by: Christian Borntraeger <borntraeger@de.ibm.com> Signed-off-by: David Hildenbrand <dahi@linux.vnet.ibm.com> Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>	2016-02-10 13:12:50 +01:00
David Hildenbrand	634790b827	KVM: s390: migration / injection of prog irq ilc We have to migrate the program irq ilc and someday we will have to specify the ilc without KVM trying to autodetect the value. Let's reuse one of the spare fields in our program irq that should always be set to 0 by user space. Because we also want to make use of 0 ilcs ("not available"), we need a validity indicator. If no valid ilc is given, we try to autodetect the ilc via the current icptcode and icptstatus + parameter and store the valid ilc in the irq structure. This has a nice effect: QEMU's making use of KVM_S390_IRQ / KVM_S390_SET_IRQ_STATE / KVM_S390_GET_IRQ_STATE for migration will directly migrate the ilc without any changes. Please note that we use bit 0 as validity and bit 1,2 for the ilc, so by applying the ilc mask we directly get the ilen which is usually what we work with. Reviewed-by: Christian Borntraeger <borntraeger@de.ibm.com> Signed-off-by: David Hildenbrand <dahi@linux.vnet.ibm.com> Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>	2016-02-10 13:12:50 +01:00
David Hildenbrand	0e8bc06a2f	KVM: s390: PSW forwarding / rewinding / ilc rework We have some confusion about ilc vs. ilen in our current code. So let's correctly use the term ilen when dealing with (ilc << 1). Program irq injection didn't take care of the correct ilc in case of irqs triggered by EXECUTE functions, let's provide one function kvm_s390_get_ilen() to take care of all that. Also, manually specifying in intercept handlers the size of the instruction (and sometimes overwriting that value for EXECUTE internally) doesn't make too much sense. So also provide the functions: - kvm_s390_retry_instr to retry the currently intercepted instruction - kvm_s390_rewind_psw to rewind the PSW without internal overwrites - kvm_s390_forward_psw to forward the PSW Reviewed-by: Christian Borntraeger <borntraeger@de.ibm.com> Signed-off-by: David Hildenbrand <dahi@linux.vnet.ibm.com> Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>	2016-02-10 13:12:49 +01:00
David Hildenbrand	6fd8e67dd8	KVM: s390: sync of fp registers via kvm_run As we already store the floating point registers in the vector save area in floating point register format when we don't have MACHINE_HAS_VX, we can directly expose them to user space using a new sync flag. The floating point registers will be valid when KVM_SYNC_FPRS is set. The fpc will also be valid when KVM_SYNC_FPRS is set. Either KVM_SYNC_FPRS or KVM_SYNC_VRS will be enabled, never both. Let's also change two positions where we access vrs, making the code easier to read and one comment superfluous. Suggested-by: Christian Borntraeger <borntraeger@de.ibm.com> Signed-off-by: David Hildenbrand <dahi@linux.vnet.ibm.com> Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>	2016-02-10 13:12:49 +01:00
David Hildenbrand	f6aa6dc449	KVM: s390: allow sync of fp registers via vregs If we have MACHINE_HAS_VX, the floating point registers are stored in the vector register format, event if the guest isn't enabled for vector registers. So we can allow KVM_SYNC_VRS as soon as MACHINE_HAS_VX is available. This can in return be used by user space to support floating point registers via struct kvm_run when the machine has vector registers. Signed-off-by: David Hildenbrand <dahi@linux.vnet.ibm.com> Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>	2016-02-10 13:12:48 +01:00
Heiko Carstens	232f5dd785	s390/oprofile: fix address range for asynchronous stack git commit `dc7ee00d47` ("s390: lowcore stack pointer offsets") introduced a regression in regard to s390_backtrace(). The stack pointer for the asynchronous stack in the lowcore now has an additional offset applied. This offset needs to be taken into account in the calculation for the low and high address for the stack. This bug was already partially fixed with commit `9cc5c206d9` ("s390/dumpstack: fix address ranges for asynchronous and panic stack"). This patch fixes it also for the oprofile code. Fixes: `dc7ee00d47` ("s390: lowcore stack pointer offsets") Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com> Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>	2016-02-10 09:25:22 +01:00
Heiko Carstens	1f8cbb9c83	s390/perf_event: fix address range for asynchronous stack git commit `dc7ee00d47` ("s390: lowcore stack pointer offsets") introduced a regression in regard to perf_callchain_kernel(). The stack pointer for the asynchronous stack in the lowcore now has an additional offset applied. This offset needs to be taken into account in the calculation for the low and high address for the stack. This bug was already partially fixed with `9cc5c206d9` ("s390/dumpstack: fix address ranges for asynchronous and panic stack"). This patch fixes it also for the perf_event code. Fixes: `dc7ee00d47` ("s390: lowcore stack pointer offsets") Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com> Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>	2016-02-10 09:25:22 +01:00
Pratyush Anand	e0115875c0	s390/stacktrace: add save_stack_trace_regs() Implement save_stack_trace_regs, so that a stack trace of a kprobe event can be obtained. Without this we see following warning: "save_stack_trace_regs() not implemented yet." when we execute: echo stacktrace > /sys/kernel/debug/tracing/trace_options echo "p kfree" >> /sys/kernel/debug/tracing/kprobe_events echo 1 > /sys/kernel/debug/tracing/events/kprobes/enable Reported-by: Chunyu Hu <chuhu@redhat.com> Signed-off-by: Pratyush Anand <panand@redhat.com> [heiko.carstens@de.ibm.com]: changed patch to use __save_stack_trace() Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com> Tested-by: Peter Oberparleiter <oberpar@linux.vnet.ibm.com> Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>	2016-02-10 09:25:22 +01:00
Heiko Carstens	66adce8f1f	s390/stacktrace: save full stack traces save_stack_trace() only saves the stack trace of the current context (interrupt or process context). This is different to what other architectures like x86 do, which save the full stack trace across different contexts. Also extract a __save_stack_trace() helper function which will be used by a follow on patch. Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com> Tested-by: Peter Oberparleiter <oberpar@linux.vnet.ibm.com> Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>	2016-02-10 09:25:21 +01:00
Heiko Carstens	f6331aaccb	s390/stacktrace: add missing end marker save_stack_trace() did not write the ULONG_MAX end marker if there is enough space left. So simply follow x86 and arm64. Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com> Tested-by: Peter Oberparleiter <oberpar@linux.vnet.ibm.com> Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>	2016-02-10 09:25:21 +01:00
Heiko Carstens	9900c48c46	s390/stacktrace: fix address ranges for asynchronous and panic stack git commit `dc7ee00d47` ("s390: lowcore stack pointer offsets") introduced a regression in regard to save_stack_trace(). The stack pointer for the asynchronous and the panic stack in the lowcore now have an additional offset applied to them. This offset needs to be taken into account in the calculation for the low and high address for the stacks. This bug was already partially fixed with `9cc5c206d9` ("s390/dumpstack: fix address ranges for asynchronous and panic stack"). This patch fixes it also for the stacktrace code. Fixes: `dc7ee00d47` ("s390: lowcore stack pointer offsets") Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com> Tested-by: Peter Oberparleiter <oberpar@linux.vnet.ibm.com> Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>	2016-02-10 09:25:20 +01:00
Heiko Carstens	665ca9187c	s390/stacktrace: fix save_stack_trace_tsk() for current task The function save_stack_trace_tsk() did not consider that it can be used for tsk == current, for which the current stack pointer obviously cannot be found in the thread structure. Fix this and get the stack pointer with an inline assembly. This fixes e.g. the output of "cat /proc/self/stack". Before: [<0000000000000000>] (null) [<ffffffffffffffff>] 0xffffffffffffffff After: [<000000000011b3ee>] save_stack_trace_tsk+0x56/0x98 [<0000000000366cde>] proc_pid_stack+0xae/0x108 [<00000000003636f0>] proc_single_show+0x70/0xc0 [<0000000000311fbc>] seq_read+0xcc/0x448 [<00000000002e7716>] __vfs_read+0x36/0x100 [<00000000002e872e>] vfs_read+0x76/0x130 [<00000000002e975e>] SyS_read+0x66/0xd8 [<000000000089490e>] system_call+0xd6/0x264 [<ffffffffffffffff>] 0xffffffffffffffff Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com> Tested-by: Peter Oberparleiter <oberpar@linux.vnet.ibm.com> Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>	2016-02-10 09:25:20 +01:00
Andrey Ryabinin	06bea3dbfe	locking/lockdep: Eliminate lockdep_init() Lockdep is initialized at compile time now. Get rid of lockdep_init(). Signed-off-by: Andrey Ryabinin <aryabinin@virtuozzo.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Mike Krinkin <krinkin.m.u@gmail.com> Cc: Paul E. McKenney <paulmck@linux.vnet.ibm.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: linux-kernel@vger.kernel.org Cc: mm-commits@vger.kernel.org Signed-off-by: Ingo Molnar <mingo@kernel.org>	2016-02-09 12:03:25 +01:00
Toshi Kani	35d98e93fe	arch: Set IORESOURCE_SYSTEM_RAM flag for System RAM Set IORESOURCE_SYSTEM_RAM in flags of resource ranges with "System RAM", "Kernel code", "Kernel data", and "Kernel bss". Note that: - IORESOURCE_SYSRAM (i.e. modifier bit) is set in flags when IORESOURCE_MEM is already set. IORESOURCE_SYSTEM_RAM is defined as (IORESOURCE_MEM\|IORESOURCE_SYSRAM). - Some archs do not set 'flags' for children nodes, such as "Kernel code". This patch does not change 'flags' in this case. Signed-off-by: Toshi Kani <toshi.kani@hpe.com> Signed-off-by: Borislav Petkov <bp@suse.de> Cc: Andrew Morton <akpm@linux-foundation.org> Cc: Andy Lutomirski <luto@amacapital.net> Cc: Borislav Petkov <bp@alien8.de> Cc: Brian Gerst <brgerst@gmail.com> Cc: Denys Vlasenko <dvlasenk@redhat.com> Cc: H. Peter Anvin <hpa@zytor.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Luis R. Rodriguez <mcgrof@suse.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Toshi Kani <toshi.kani@hp.com> Cc: linux-arch@vger.kernel.org Cc: linux-arm-kernel@lists.infradead.org Cc: linux-mips@linux-mips.org Cc: linux-mm <linux-mm@kvack.org> Cc: linux-parisc@vger.kernel.org Cc: linux-s390@vger.kernel.org Cc: linux-sh@vger.kernel.org Cc: linuxppc-dev@lists.ozlabs.org Cc: sparclinux@vger.kernel.org Link: http://lkml.kernel.org/r/1453841853-11383-7-git-send-email-bp@alien8.de Signed-off-by: Ingo Molnar <mingo@kernel.org>	2016-01-30 09:49:57 +01:00
Linus Torvalds	6b292a8abd	Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/s390/linux Pull s390 updates from Martin Schwidefsky: "An optimization for irq-restore, the SSM instruction is quite a bit slower than an if-statement and a STOSM. The copy_file_range system all is added. Cleanup for PCI and CIO. And a couple of bug fixes" * 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/s390/linux: s390/cio: update measurement characteristics s390/cio: ensure consistent measurement state s390/cio: fix measurement characteristics memleak s390/zcrypt: Fix cryptographic device id in kernel messages s390/pci: remove iomap sanity checks s390/pci: set error state for unusable functions s390/pci: fix bar check s390/pci: resize iomap s390/pci: improve ZPCI_* macros s390/pci: provide ZPCI_ADDR macro s390/pci: adjust IOMAP_MAX_ENTRIES s390/numa: move numa_init_late() from device to arch_initcall s390: remove all usages of PSW_ADDR_INSN s390: remove all usages of PSW_ADDR_AMODE s390: wire up copy_file_range syscall s390: remove superfluous memblock_alloc() return value checks s390/numa: allocate memory with correct alignment s390/irqflags: optimize irq restore s390/mm: use TASK_MAX_SIZE where applicable	2016-01-29 16:05:18 -08:00
Linus Torvalds	f0ce3ff42e	s390 and POWER bug fixes, plus enabling the KVM-VFIO interface on s390. -----BEGIN PGP SIGNATURE----- Version: GnuPG v2.0.22 (GNU/Linux) iQEcBAABAgAGBQJWp5FIAAoJEL/70l94x66DozoIAKWFLnwi6PQ7Sm+Tvr0Sl/mp yGDM+PuVyG4C6bPS66xd4XkXEFIwfpIJQzq1Zt7Xg0m27t50DRtmInLJU4ql7MFD vYn3h6lOudtUR0i5kPYSy6SMehFjx8wS0GX+O+iX9tMlYI0vGWEJ+7C06FmnqpGP RUUjnVvljAMih8wsAHLOjH8rwZHnO+EfHNi+V7Q+eFBHJnn2R06IqK8z5DmTBeTQ ZecNdZslX5qZvMNulTo7nC2P6VShkdRuMJkLEFeSaTlCqx5WcnzgwEiMMavUeYI8 aIWuE51v0RMxRhsvRXUQOeRpRgU0AF2OYJEKXBRbO3P6IRikFohJq4QH67zN2HA= =54tv -----END PGP SIGNATURE----- Merge tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm Pull KVM fixes from Paolo Bonzini: "s390 and POWER bug fixes, plus enabling the KVM-VFIO interface on s390" * tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm: KVM doc: Fix KVM_SMI chapter number KVM: s390: fix memory overwrites when vx is disabled KVM: s390: Enable the KVM-VFIO device KVM: s390: fix guest fprs memory leak KVM: PPC: Fix ONE_REG AltiVec support KVM: PPC: Increase memslots to 512 KVM: PPC: Book3S PR: Remove unused variable 'vcpu_book3s' KVM: PPC: Fix emulation of H_SET_DABR/X on POWER8 KVM: PPC: Book3S HV: Handle unexpected traps in guest entry/exit code better	2016-01-27 10:50:42 -08:00
David Hildenbrand	9abc2a08a7	KVM: s390: fix memory overwrites when vx is disabled The kernel now always uses vector registers when available, however KVM has special logic if support is really enabled for a guest. If support is disabled, guest_fpregs.fregs will only contain memory for the fpu. The kernel, however, will store vector registers into that area, resulting in crazy memory overwrites. Simply extending that area is not enough, because the format of the registers also changes. We would have to do additional conversions, making the code even more complex. Therefore let's directly use one place for the vector/fpu registers + fpc (in kvm_run). We just have to convert the data properly when accessing it. This makes current code much easier. Please note that vector/fpu registers are now always stored to vcpu->run->s.regs.vrs. Although this data is visible to QEMU and used for migration, we only guarantee valid values to user space when KVM_SYNC_VRS is set. As that is only the case when we have vector register support, we are on the safe side. Fixes: `b5510d9b68` ("s390/fpu: always enable the vector facility if it is available") Cc: stable@vger.kernel.org # v4.4 `d9a3a09af5` s390/kvm: remove dependency on struct save_area definition Signed-off-by: David Hildenbrand <dahi@linux.vnet.ibm.com> Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com> [adopt to `d9a3a09af5`]	2016-01-26 15:40:21 +01:00
Dong Jia Shi	14b0b4ac37	KVM: s390: Enable the KVM-VFIO device The KVM-VFIO device is used by the QEMU VFIO device. It is used to record the list of in-use VFIO groups so that KVM can manipulate them. While we don't need this on s390 currently, let's try to be like everyone else. Signed-off-by: Dong Jia Shi <bjsdjshi@linux.vnet.ibm.com> Acked-by: Cornelia Huck <cornelia.huck@de.ibm.com> Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>	2016-01-26 15:40:17 +01:00
David Hildenbrand	9c7ebb613b	KVM: s390: fix guest fprs memory leak fprs is never freed, therefore resulting in a memory leak if kvm_vcpu_init() fails or the vcpu is destroyed. Fixes: `9977e886cb` ("s390/kernel: lazy restore fpu registers") Cc: stable@vger.kernel.org # v4.3+ Reported-by: Eric Farman <farman@linux.vnet.ibm.com> Signed-off-by: David Hildenbrand <dahi@linux.vnet.ibm.com> Reviewed-by: Eric Farman <farman@linux.vnet.ibm.com> Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>	2016-01-26 15:40:09 +01:00
Sebastian Ott	f5e44f82c1	s390/pci: remove iomap sanity checks Since each iomap_entry handles only one bar of one pci function (even when disjunct ranges of a bar are mapped) the sanity check in pci_iomap_range is not needed and can be removed. Also convert the remaining BUG_ONs to WARN_ONs. Signed-off-by: Sebastian Ott <sebott@linux.vnet.ibm.com> Reviewed-by: Gerald Schaefer <gerald.schaefer@de.ibm.com> Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>	2016-01-26 12:46:45 +01:00
Sebastian Ott	8ead7efb63	s390/pci: set error state for unusable functions We receive special notifications from firmware when an error was detected and a pci function became unusable. Set the error_state accordingly to give device drivers a hint that they don't need to try error recovery. Suggested-by: Alexander Schmidt <alexschm@de.ibm.com> Signed-off-by: Sebastian Ott <sebott@linux.vnet.ibm.com> Reviewed-by: Gerald Schaefer <gerald.schaefer@de.ibm.com> Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>	2016-01-26 12:46:28 +01:00
Sebastian Ott	c0cabaddee	s390/pci: fix bar check Fix the check which bar space we should map to allow available bars only. Signed-off-by: Sebastian Ott <sebott@linux.vnet.ibm.com> Reviewed-by: Gerald Schaefer <gerald.schaefer@de.ibm.com> Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>	2016-01-26 12:46:17 +01:00
Sebastian Ott	c506fff3d3	s390/pci: resize iomap On s390 we need to maintain a mapping between iomem addresses and arch specific function identifiers. Currently the mapping table is created as such that we could span the whole iomem address space. Since we can only map each bar space from each possible function we have an upper bound for the number of mapping entries. This reduces the size of the iomap from 256K to less than 4K (using the defconfig). Signed-off-by: Sebastian Ott <sebott@linux.vnet.ibm.com> Reviewed-by: Gerald Schaefer <gerald.schaefer@de.ibm.com> Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>	2016-01-26 12:45:58 +01:00
Sebastian Ott	bf19c94d5c	s390/pci: improve ZPCI_* macros Most of the constants defined in pci_io.h depend on each other and thus can be calculated. Signed-off-by: Sebastian Ott <sebott@linux.vnet.ibm.com> Reviewed-by: Gerald Schaefer <gerald.schaefer@de.ibm.com> Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>	2016-01-26 12:45:49 +01:00
Sebastian Ott	9e00caaea1	s390/pci: provide ZPCI_ADDR macro Provide and use a ZPCI_ADDR macro as the complement of ZPCI_IDX to get rid of some constants in the code. Signed-off-by: Sebastian Ott <sebott@linux.vnet.ibm.com> Reviewed-by: Gerald Schaefer <gerald.schaefer@de.ibm.com> Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>	2016-01-26 12:45:41 +01:00
Sebastian Ott	c2e1fcf3ec	s390/pci: adjust IOMAP_MAX_ENTRIES ZPCI_IOMAP_MAX_ENTRIES is off by one. Let's adjust this for the sake of correctness. Signed-off-by: Sebastian Ott <sebott@linux.vnet.ibm.com> Reviewed-by: Gerald Schaefer <gerald.schaefer@de.ibm.com> Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>	2016-01-26 12:45:33 +01:00
Michael Holzheu	2d0f76a6ca	s390/numa: move numa_init_late() from device to arch_initcall Commit `3e89e1c5ea` ("hugetlb: make mm and fs code explicitly non-modular") moves hugetlb_init() from module_init to subsys_initcall. The hugetlb_init()->hugetlb_register_node() code accesses "node->dev.kobj" which is initialized in numa_init_late(). Since numa_init_late() is a device_initcall which is called after subsys_initcall the above mentioned patch breaks NUMA on s390. So fix this and move numa_init_late() to arch_initcall. Fixes: `3e89e1c5ea` ("hugetlb: make mm and fs code explicitly non-modular") Reviewed-by: Heiko Carstens <heiko.carstens@de.ibm.com> Signed-off-by: Michael Holzheu <holzheu@linux.vnet.ibm.com> Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>	2016-01-26 12:45:24 +01:00
Al Viro	5955102c99	wrappers for ->i_mutex access parallel to mutex_{lock,unlock,trylock,is_locked,lock_nested}, inode_foo(inode) being mutex_foo(&inode->i_mutex). Please, use those for access to ->i_mutex; over the coming cycle ->i_mutex will become rwsem, with ->lookup() done with it held only shared. Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>	2016-01-22 18:04:28 -05:00
Christoph Hellwig	e1c7e32453	dma-mapping: always provide the dma_map_ops based implementation Move the generic implementation to <linux/dma-mapping.h> now that all architectures support it and remove the HAVE_DMA_ATTR Kconfig symbol now that everyone supports them. [valentinrothberg@gmail.com: remove leftovers in Kconfig] Signed-off-by: Christoph Hellwig <hch@lst.de> Cc: "David S. Miller" <davem@davemloft.net> Cc: Aurelien Jacquiot <a-jacquiot@ti.com> Cc: Chris Metcalf <cmetcalf@ezchip.com> Cc: David Howells <dhowells@redhat.com> Cc: Geert Uytterhoeven <geert@linux-m68k.org> Cc: Haavard Skinnemoen <hskinnemoen@gmail.com> Cc: Hans-Christian Egtvedt <egtvedt@samfundet.no> Cc: Helge Deller <deller@gmx.de> Cc: James Hogan <james.hogan@imgtec.com> Cc: Jesper Nilsson <jesper.nilsson@axis.com> Cc: Koichi Yasutake <yasutake.koichi@jp.panasonic.com> Cc: Ley Foon Tan <lftan@altera.com> Cc: Mark Salter <msalter@redhat.com> Cc: Mikael Starvik <starvik@axis.com> Cc: Steven Miao <realmz6@gmail.com> Cc: Vineet Gupta <vgupta@synopsys.com> Cc: Christian Borntraeger <borntraeger@de.ibm.com> Cc: Joerg Roedel <jroedel@suse.de> Cc: Sebastian Ott <sebott@linux.vnet.ibm.com> Signed-off-by: Valentin Rothberg <valentinrothberg@gmail.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2016-01-20 17:09:18 -08:00
Heiko Carstens	9cb1ccecb6	s390: remove all usages of PSW_ADDR_INSN Yet another leftover from the 31 bit era. The usual operation "y = x & PSW_ADDR_INSN" with the PSW_ADDR_INSN mask is a nop for CONFIG_64BIT. Therefore remove all usages and hope the code is a bit less confusing. Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com> Reviewed-by: David Hildenbrand <dahi@linux.vnet.ibm.com>	2016-01-19 12:14:03 +01:00
Heiko Carstens	fecc868a66	s390: remove all usages of PSW_ADDR_AMODE This is a leftover from the 31 bit area. For CONFIG_64BIT the usual operation "y = x \| PSW_ADDR_AMODE" is a nop. Therefore remove all usages of PSW_ADDR_AMODE and make the code a bit less confusing. Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com> Reviewed-by: David Hildenbrand <dahi@linux.vnet.ibm.com>	2016-01-19 12:14:02 +01:00
Heiko Carstens	696aafd369	s390: wire up copy_file_range syscall Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com> Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>	2016-01-19 12:14:02 +01:00
Heiko Carstens	cd17e153af	s390: remove superfluous memblock_alloc() return value checks memblock_alloc() and memblock_alloc_base() will panic on their own if they can't find free memory. Therefore remove some pointless checks. Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com> Acked-by: Michael Holzheu <holzheu@linux.vnet.ibm.com> Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>	2016-01-19 12:14:02 +01:00
Heiko Carstens	ef1f7fd7eb	s390/numa: allocate memory with correct alignment Allocating memory with a requested minimum alignment of 1 is wrong since pg_data_t contains a spinlock which requires an alignment of 4 bytes. Therefore fix this and ask for an alignment of 8 bytes like it is guarenteed for all kmalloc requests. Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com> Acked-by: Michael Holzheu <holzheu@linux.vnet.ibm.com> Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>	2016-01-19 12:14:01 +01:00
Christian Borntraeger	204ee2c564	s390/irqflags: optimize irq restore The ssm instruction takes longer that stnsm/stosm as it is often used to modify DAT and PER. We know that irqsave/irqrestore only deals with external and I/O interrupts and we know that irqrestore can transition only from disabled->disabled or disabled->enabled, so we can use the faster stosm. Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>	2016-01-19 12:14:01 +01:00
Dominik Dingel	a9d7ab9781	s390/mm: use TASK_MAX_SIZE where applicable To improve readability we can use TASK_MAX_SIZE when we just check for the upper limit. All places explicitly dealing with 3 vs 4 level pgtables were left unchanged. Signed-off-by: Dominik Dingel <dingel@linux.vnet.ibm.com> Reviewed-By: Sascha Silbe <silbe@linux.vnet.ibm.com>	2016-01-19 12:14:01 +01:00
Linus Torvalds	a200dcb346	virtio: barrier rework+fixes This adds a new kind of barrier, and reworks virtio and xen to use it. Plus some fixes here and there. Signed-off-by: Michael S. Tsirkin <mst@redhat.com> -----BEGIN PGP SIGNATURE----- Version: GnuPG v1 iQEcBAABAgAGBQJWlU2kAAoJECgfDbjSjVRpZ6IH/Ra19ecG8sCQo9zskr4zo22Z DZXC3u0sJDBYjjBAiw3IY1FKh7wx2Fr1RhUOj1bteBgcFCMCV1zInP5ITiCyzd1H YYh1w9C2tZaj2T4t9L4hIrAdtIF8fGS+oI2IojXPjOuDLEt6pfFBEjHp/sfl3UJq ZmZvw4OXviSNej7jBw8Xni3Uv18yfmLGXvMdkvMSPC1/XL29voGDqTVwhqJwxLVz k/ZLcKFOzIs9N7Nja0Jl1EiZtC2Y9cpItqweicNAzszlpkSL44vQxmCSefB+WyQ4 gt0O3+AxYkLfrxzCBhUA4IpRex3/XPW1b+1e/V1XjfR2n/FlyLe+AIa8uPJElFc= =ukaV -----END PGP SIGNATURE----- Merge tag 'for_linus' of git://git.kernel.org/pub/scm/linux/kernel/git/mst/vhost Pull virtio barrier rework+fixes from Michael Tsirkin: "This adds a new kind of barrier, and reworks virtio and xen to use it. Plus some fixes here and there" * tag 'for_linus' of git://git.kernel.org/pub/scm/linux/kernel/git/mst/vhost: (44 commits) checkpatch: add virt barriers checkpatch: check for __smp outside barrier.h checkpatch.pl: add missing memory barriers virtio: make find_vqs() checkpatch.pl-friendly virtio_balloon: fix race between migration and ballooning virtio_balloon: fix race by fill and leak s390: more efficient smp barriers s390: use generic memory barriers xen/events: use virt_xxx barriers xen/io: use virt_xxx barriers xenbus: use virt_xxx barriers virtio_ring: use virt_store_mb sh: move xchg_cmpxchg to a header by itself sh: support 1 and 2 byte xchg virtio_ring: update weak barriers to use virt_xxx Revert "virtio_ring: Update weak barriers to use dma_wmb/rmb" asm-generic: implement virt_xxx memory barriers x86: define __smp_xxx xtensa: define __smp_xxx tile: define __smp_xxx ...	2016-01-18 16:44:24 -08:00
Miroslav Benes	383bf44d1a	livepatch: change the error message in asm/livepatch.h header files If anyone includes asm/livepatch.h when CONFIG_LIVEPATCH=n the build fails with the existing error message. Change it to something saner. [jkosina@suse.cz: fixed changelog typo spotted by Josh] Suggested-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Miroslav Benes <mbenes@suse.cz> Acked-by: Josh Poimboeuf <jpoimboe@redhat.com> Signed-off-by: Jiri Kosina <jkosina@suse.cz>	2016-01-18 21:35:43 +01:00
Will Deacon	da48d094ce	Kconfig: remove HAVE_LATENCYTOP_SUPPORT As illustrated by commit `a3afe70b83` ("[S390] latencytop s390 support."), HAVE_LATENCYTOP_SUPPORT is defined by an architecture to advertise an implementation of save_stack_trace_tsk. However, as of `9212ddb5ea` ("stacktrace: provide save_stack_trace_tsk() weak alias") a dummy implementation is provided if STACKTRACE=y. Given that LATENCYTOP already depends on STACKTRACE_SUPPORT and selects STACKTRACE, we can remove HAVE_LATENCYTOP_SUPPORT altogether. Signed-off-by: Will Deacon <will.deacon@arm.com> Acked-by: Heiko Carstens <heiko.carstens@de.ibm.com> Cc: Vineet Gupta <vgupta@synopsys.com> Cc: Russell King <linux@arm.linux.org.uk> Cc: James Hogan <james.hogan@imgtec.com> Cc: Michal Simek <monstr@monstr.eu> Cc: Helge Deller <deller@gmx.de> Acked-by: Michael Ellerman <mpe@ellerman.id.au> Cc: "David S. Miller" <davem@davemloft.net> Cc: Guan Xuetao <gxt@mprc.pku.edu.cn> Cc: Ingo Molnar <mingo@redhat.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2016-01-16 11:17:23 -08:00
Dominik Dingel	fef8953ae4	s390/mm: enable fixup_user_fault retrying By passing a non-null flag we allow fixup_user_fault to retry, which enables userfaultfd. As during these retries we might drop the mmap_sem we need to check if that happened and redo the complete chain of actions. Signed-off-by: Dominik Dingel <dingel@linux.vnet.ibm.com> Reviewed-by: Andrea Arcangeli <aarcange@redhat.com> Cc: "Kirill A. Shutemov" <kirill.shutemov@linux.intel.com> Cc: Martin Schwidefsky <schwidefsky@de.ibm.com> Cc: Christian Borntraeger <borntraeger@de.ibm.com> Cc: "Jason J. Herne" <jjherne@linux.vnet.ibm.com> Cc: David Rientjes <rientjes@google.com> Cc: Eric B Munson <emunson@akamai.com> Cc: Naoya Horiguchi <n-horiguchi@ah.jp.nec.com> Cc: Mel Gorman <mgorman@suse.de> Cc: Heiko Carstens <heiko.carstens@de.ibm.com> Cc: Dominik Dingel <dingel@linux.vnet.ibm.com> Cc: Paolo Bonzini <pbonzini@redhat.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2016-01-15 17:56:32 -08:00
Dominik Dingel	4a9e1cda27	mm: bring in additional flag for fixup_user_fault to signal unlock During Jason's work with postcopy migration support for s390 a problem regarding gmap faults was discovered. The gmap code will call fixup_user_fault which will end up always in handle_mm_fault. Till now we never cared about retries, but as the userfaultfd code kind of relies on it. this needs some fix. This patchset does not take care of the futex code. I will now look closer at this. This patch (of 2): With the introduction of userfaultfd, kvm on s390 needs fixup_user_fault to pass in FAULT_FLAG_ALLOW_RETRY and give feedback if during the faulting we ever unlocked mmap_sem. This patch brings in the logic to handle retries as well as it cleans up the current documentation. fixup_user_fault was not having the same semantics as filemap_fault. It never indicated if a retry happened and so a caller wasn't able to handle that case. So we now changed the behaviour to always retry a locked mmap_sem. Signed-off-by: Dominik Dingel <dingel@linux.vnet.ibm.com> Reviewed-by: Andrea Arcangeli <aarcange@redhat.com> Cc: "Kirill A. Shutemov" <kirill.shutemov@linux.intel.com> Cc: Martin Schwidefsky <schwidefsky@de.ibm.com> Cc: Christian Borntraeger <borntraeger@de.ibm.com> Cc: "Jason J. Herne" <jjherne@linux.vnet.ibm.com> Cc: David Rientjes <rientjes@google.com> Cc: Eric B Munson <emunson@akamai.com> Cc: Naoya Horiguchi <n-horiguchi@ah.jp.nec.com> Cc: Mel Gorman <mgorman@suse.de> Cc: Heiko Carstens <heiko.carstens@de.ibm.com> Cc: Dominik Dingel <dingel@linux.vnet.ibm.com> Cc: Paolo Bonzini <pbonzini@redhat.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2016-01-15 17:56:32 -08:00
Kirill A. Shutemov	fecffad254	s390, thp: remove infrastructure for handling splitting PMDs With new refcounting we don't need to mark PMDs splitting. Let's drop code to handle this. pmdp_splitting_flush() is not needed too: on splitting PMD we will do pmdp_clear_flush() + set_pte_at(). pmdp_clear_flush() will do IPI as needed for fast_gup. Signed-off-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com> Cc: Sasha Levin <sasha.levin@oracle.com> Cc: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com> Cc: Jerome Marchand <jmarchan@redhat.com> Cc: Vlastimil Babka <vbabka@suse.cz> Cc: Andrea Arcangeli <aarcange@redhat.com> Cc: Hugh Dickins <hughd@google.com> Cc: Dave Hansen <dave.hansen@intel.com> Cc: Mel Gorman <mgorman@suse.de> Cc: Rik van Riel <riel@redhat.com> Cc: Naoya Horiguchi <n-horiguchi@ah.jp.nec.com> Cc: Steve Capper <steve.capper@linaro.org> Cc: Johannes Weiner <hannes@cmpxchg.org> Cc: Michal Hocko <mhocko@suse.cz> Cc: Christoph Lameter <cl@linux.com> Cc: David Rientjes <rientjes@google.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2016-01-15 17:56:32 -08:00
Kirill A. Shutemov	ddc58f27f9	mm: drop tail page refcounting Tail page refcounting is utterly complicated and painful to support. It uses ->_mapcount on tail pages to store how many times this page is pinned. get_page() bumps ->_mapcount on tail page in addition to ->_count on head. This information is required by split_huge_page() to be able to distribute pins from head of compound page to tails during the split. We will need ->_mapcount to account PTE mappings of subpages of the compound page. We eliminate need in current meaning of ->_mapcount in tail pages by forbidding split entirely if the page is pinned. The only user of tail page refcounting is THP which is marked BROKEN for now. Let's drop all this mess. It makes get_page() and put_page() much simpler. Signed-off-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com> Tested-by: Sasha Levin <sasha.levin@oracle.com> Tested-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com> Acked-by: Vlastimil Babka <vbabka@suse.cz> Acked-by: Jerome Marchand <jmarchan@redhat.com> Cc: Andrea Arcangeli <aarcange@redhat.com> Cc: Hugh Dickins <hughd@google.com> Cc: Dave Hansen <dave.hansen@intel.com> Cc: Mel Gorman <mgorman@suse.de> Cc: Rik van Riel <riel@redhat.com> Cc: Naoya Horiguchi <n-horiguchi@ah.jp.nec.com> Cc: Steve Capper <steve.capper@linaro.org> Cc: Johannes Weiner <hannes@cmpxchg.org> Cc: Michal Hocko <mhocko@suse.cz> Cc: Christoph Lameter <cl@linux.com> Cc: David Rientjes <rientjes@google.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2016-01-15 17:56:32 -08:00
Linus Torvalds	875fc4f5dd	Merge branch 'akpm' (patches from Andrew) Merge first patch-bomb from Andrew Morton: - A few hotfixes which missed 4.4 becasue I was asleep. cc'ed to -stable - A few misc fixes - OCFS2 updates - Part of MM. Including pretty large changes to page-flags handling and to thp management which have been buffered up for 2-3 cycles now. I have a lot of MM material this time. [ It turns out the THP part wasn't quite ready, so that got dropped from this series - Linus ] * emailed patches from Andrew Morton <akpm@linux-foundation.org>: (117 commits) zsmalloc: reorganize struct size_class to pack 4 bytes hole mm/zbud.c: use list_last_entry() instead of list_tail_entry() zram/zcomp: do not zero out zcomp private pages zram: pass gfp from zcomp frontend to backend zram: try vmalloc() after kmalloc() zram/zcomp: use GFP_NOIO to allocate streams mm: add tracepoint for scanning pages drivers/base/memory.c: fix kernel warning during memory hotplug on ppc64 mm/page_isolation: use macro to judge the alignment mm: fix noisy sparse warning in LIBCFS_ALLOC_PRE() mm: rework virtual memory accounting include/linux/memblock.h: fix ordering of 'flags' argument in comments mm: move lru_to_page to mm_inline.h Documentation/filesystems: describe the shared memory usage/accounting memory-hotplug: don't BUG() in register_memory_resource() hugetlb: make mm and fs code explicitly non-modular mm/swapfile.c: use list_for_each_entry_safe in free_swap_count_continuations mm: /proc/pid/clear_refs: no need to clear VM_SOFTDIRTY in clear_soft_dirty_pmd() mm: make sure isolate_lru_page() is never called for tail page vmstat: make vmstat_updater deferrable again and shut down on idle ...	2016-01-15 11:41:44 -08:00
Linus Torvalds	0f0836b7eb	Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jikos/livepatching Pull livepatching updates from Jiri Kosina: - RO/NX attribute fixes for patch module relocations from Josh Poimboeuf. As part of this effort, module.c has been cleaned up as well and livepatching is piggy-backing on this cleanup. Rusty is OK with this whole lot going through livepatching tree. - symbol disambiguation support from Chris J Arges. That series is also Reviewed-by: Miroslav Benes <mbenes@suse.cz> but this came in only after I've alredy pushed out. Didn't want to rebase because of that, hence I am mentioning it here. - symbol lookup fix from Miroslav Benes * 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jikos/livepatching: livepatch: Cleanup module page permission changes module: keep percpu symbols in module's symtab module: clean up RO/NX handling. module: use a structure to encapsulate layout. gcov: use within_module() helper. module: Use the same logic for setting and unsetting RO/NX livepatch: function,sympos scheme in livepatch sysfs directory livepatch: add sympos as disambiguator field to klp_reloc livepatch: add old_sympos as disambiguator field to klp_func	2016-01-14 16:38:02 -08:00
Jerome Marchand	eca56ff906	mm, shmem: add internal shmem resident memory accounting Currently looking at /proc/<pid>/status or statm, there is no way to distinguish shmem pages from pages mapped to a regular file (shmem pages are mapped to /dev/zero), even though their implication in actual memory use is quite different. The internal accounting currently counts shmem pages together with regular files. As a preparation to extend the userspace interfaces, this patch adds MM_SHMEMPAGES counter to mm_rss_stat to account for shmem pages separately from MM_FILEPAGES. The next patch will expose it to userspace - this patch doesn't change the exported values yet, by adding up MM_SHMEMPAGES to MM_FILEPAGES at places where MM_FILEPAGES was used before. The only user-visible change after this patch is the OOM killer message that separates the reported "shmem-rss" from "file-rss". [vbabka@suse.cz: forward-porting, tweak changelog] Signed-off-by: Jerome Marchand <jmarchan@redhat.com> Signed-off-by: Vlastimil Babka <vbabka@suse.cz> Acked-by: Konstantin Khlebnikov <khlebnikov@yandex-team.ru> Acked-by: Michal Hocko <mhocko@suse.com> Acked-by: Hugh Dickins <hughd@google.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2016-01-14 16:00:49 -08:00
Linus Torvalds	d080827f85	libnvdimm for 4.5 1/ Media error handling: The 'badblocks' implementation that originated in md-raid is up-levelled to a generic capability of a block device. This initial implementation is limited to being consulted in the pmem block-i/o path. Later, 'badblocks' will be consulted when creating dax mappings. 2/ Raw block device dax: For virtualization and other cases that want large contiguous mappings of persistent memory, add the capability to dax-mmap a block device directly. 3/ Increased /dev/mem restrictions: Add an option to treat all io-memory as IORESOURCE_EXCLUSIVE, i.e. disable /dev/mem access while a driver is actively using an address range. This behavior is controlled via the new CONFIG_IO_STRICT_DEVMEM option and can be overridden by the existing "iomem=relaxed" kernel command line option. 4/ Miscellaneous fixes include a 'pfn'-device huge page alignment fix, block device shutdown crash fix, and other small libnvdimm fixes. -----BEGIN PGP SIGNATURE----- Version: GnuPG v1 iQIcBAABAgAGBQJWlrhjAAoJEB7SkWpmfYgCFbAQALKsQfFwT6JFS+zlPgiNpbqw 2VMNKEH0AfGYGj96mT02j2q+vSUmXLMIDMTsbe0sDdtwFZtQbFmhmryzPWUVppSu KGTlLPW8vuEhQVs91+UI3BQKkvpi0+tbR8hPOh9W6QhjpRT+lyHFKnsNR5HZy5wB K4/VMaT5ffd5/pXRTjkYiPQYTwWyfcvNjICj0YtqhPvOwS031m77JpFsWJ8HSpEX K99VlzNUPMXd1pYkHmFNXWw52fhRGNhwAEomLeKMdQfKms+KnbKp8BOSA0aCqU8E kpujQcilDXJwykFQZOFI3Z5Dxvrv8lxFTU8HRMBvo3ESzfTWjfqcvyjGOjDUcruw ihESFSJtdZzhrBiMnf9RRqSpMFJvAT8MVT6Q4D3mZUHCMPbUqFJsQjMPt9hEH3ho 4F0D2lesOCkubUKFTZmjMoDb+szuKbVhYK8TeFVVEhizinc/Aj0NKuazJqi+CXB/ xh0ER4ZxD8wvzqFFWvS5UvR1G9I5fr7+3jGRUrqGLHlSdeXP9dkEg28ao3QbWk3x 1dPOen6ZqQ9WJ/E7eGmXbVEz2R4Xd79hMXQzdQwmKDk/KbxRoAp7hyU8BslAyrBf HCdmVt+RAgrxZYfFRXuLhqwEBThJnNrgZA3qu74FUpkpFg6xRUu1bAYBiF7N+bFi 82b5UbMkveBTtkXjJoiR =7V5r -----END PGP SIGNATURE----- Merge tag 'libnvdimm-for-4.5' of git://git.kernel.org/pub/scm/linux/kernel/git/nvdimm/nvdimm Pull libnvdimm updates from Dan Williams: "The bulk of this has appeared in -next and independently received a build success notification from the kbuild robot. The 'for-4.5/block- dax' topic branch was rebased over the weekend to drop the "block device end-of-life" rework that Al would like to see re-implemented with a notifier, and to address bug reports against the badblocks integration. There is pending feedback against "libnvdimm: Add a poison list and export badblocks" received last week. Linda identified some localized fixups that we will handle incrementally. Summary: - Media error handling: The 'badblocks' implementation that originated in md-raid is up-levelled to a generic capability of a block device. This initial implementation is limited to being consulted in the pmem block-i/o path. Later, 'badblocks' will be consulted when creating dax mappings. - Raw block device dax: For virtualization and other cases that want large contiguous mappings of persistent memory, add the capability to dax-mmap a block device directly. - Increased /dev/mem restrictions: Add an option to treat all io-memory as IORESOURCE_EXCLUSIVE, i.e. disable /dev/mem access while a driver is actively using an address range. This behavior is controlled via the new CONFIG_IO_STRICT_DEVMEM option and can be overridden by the existing "iomem=relaxed" kernel command line option. - Miscellaneous fixes include a 'pfn'-device huge page alignment fix, block device shutdown crash fix, and other small libnvdimm fixes" * tag 'libnvdimm-for-4.5' of git://git.kernel.org/pub/scm/linux/kernel/git/nvdimm/nvdimm: (32 commits) block: kill disk_{check\|set\|clear\|alloc}_badblocks libnvdimm, pmem: nvdimm_read_bytes() badblocks support pmem, dax: disable dax in the presence of bad blocks pmem: fail io-requests to known bad blocks libnvdimm: convert to statically allocated badblocks libnvdimm: don't fail init for full badblocks list block, badblocks: introduce devm_init_badblocks block: clarify badblocks lifetime badblocks: rename badblocks_free to badblocks_exit libnvdimm, pmem: move definition of nvdimm_namespace_add_poison to nd.h libnvdimm: Add a poison list and export badblocks nfit_test: Enable DSMs for all test NFITs md: convert to use the generic badblocks code block: Add badblock management for gendisks badblocks: Add core badblock management code block: fix del_gendisk() vs blkdev_ioctl crash block: enable dax for raw block devices block: introduce bdev_file_inode() restrict /dev/mem to idle io memory ranges arch: consolidate CONFIG_STRICT_DEVM in lib/Kconfig.debug ...	2016-01-13 19:15:14 -08:00
Linus Torvalds	cbd88cd4c0	Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/s390/linux Pull s390 updates from Martin Schwidefsky: "Among the traditional bug fixes and cleanups are some improvements: - A tool to generated the facility lists, generating the bit fields by hand has been a source of bugs in the past - The spinlock loop is reordered to avoid bursts of hypervisor calls - Add support for the open-for-business interface to the service element - The get_cpu call is added to the vdso - A set of tracepoints is defined for the common I/O layer - The deprecated sclp_cpi module is removed - Update default configuration" * 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/s390/linux: (56 commits) s390/sclp: fix possible control register corruption s390: fix normalization bug in exception table sorting s390/configs: update default configurations s390/vdso: optimize getcpu system call s390: drop smp_mb in vdso_init s390: rename struct _lowcore to struct lowcore s390/mem_detect: use unsigned longs s390/ptrace: get rid of long longs in psw_bits s390/sysinfo: add missing SYSIB 1.2.2 multithreading fields s390: get rid of CONFIG_SCHED_MC and CONFIG_SCHED_BOOK s390/Kconfig: remove pointless 64 bit dependencies s390/dasd: fix failfast for disconnected devices s390/con3270: testing return kzalloc retval s390/hmcdrv: constify hmcdrv_ftp_ops structs s390/cio: add NULL test s390/cio: Change I/O instructions from inline to normal functions s390/cio: Introduce common I/O layer tracepoints s390/cio: Consolidate inline assemblies and related data definitions s390/cio: Fix incorrect xsch opcode specification s390/cio: Remove unused inline assemblies ...	2016-01-13 13:16:16 -08:00
Linus Torvalds	aee3bfa330	Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-next Pull networking updates from Davic Miller: 1) Support busy polling generically, for all NAPI drivers. From Eric Dumazet. 2) Add byte/packet counter support to nft_ct, from Floriani Westphal. 3) Add RSS/XPS support to mvneta driver, from Gregory Clement. 4) Implement IPV6_HDRINCL socket option for raw sockets, from Hannes Frederic Sowa. 5) Add support for T6 adapter to cxgb4 driver, from Hariprasad Shenai. 6) Add support for VLAN device bridging to mlxsw switch driver, from Ido Schimmel. 7) Add driver for Netronome NFP4000/NFP6000, from Jakub Kicinski. 8) Provide hwmon interface to mlxsw switch driver, from Jiri Pirko. 9) Reorganize wireless drivers into per-vendor directories just like we do for ethernet drivers. From Kalle Valo. 10) Provide a way for administrators "destroy" connected sockets via the SOCK_DESTROY socket netlink diag operation. From Lorenzo Colitti. 11) Add support to add/remove multicast routes via netlink, from Nikolay Aleksandrov. 12) Make TCP keepalive settings per-namespace, from Nikolay Borisov. 13) Add forwarding and packet duplication facilities to nf_tables, from Pablo Neira Ayuso. 14) Dead route support in MPLS, from Roopa Prabhu. 15) TSO support for thunderx chips, from Sunil Goutham. 16) Add driver for IBM's System i/p VNIC protocol, from Thomas Falcon. 17) Rationalize, consolidate, and more completely document the checksum offloading facilities in the networking stack. From Tom Herbert. 18) Support aborting an ongoing scan in mac80211/cfg80211, from Vidyullatha Kanchanapally. 19) Use per-bucket spinlock for bpf hash facility, from Tom Leiming. * git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-next: (1375 commits) net: bnxt: always return values from _bnxt_get_max_rings net: bpf: reject invalid shifts phonet: properly unshare skbs in phonet_rcv() dwc_eth_qos: Fix dma address for multi-fragment skbs phy: remove an unneeded condition mdio: remove an unneed condition mdio_bus: NULL dereference on allocation error net: Fix typo in netdev_intersect_features net: freescale: mac-fec: Fix build error from phy_device API change net: freescale: ucc_geth: Fix build error from phy_device API change bonding: Prevent IPv6 link local address on enslaved devices IB/mlx5: Add flow steering support net/mlx5_core: Export flow steering API net/mlx5_core: Make ipv4/ipv6 location more clear net/mlx5_core: Enable flow steering support for the IB driver net/mlx5_core: Initialize namespaces only when supported by device net/mlx5_core: Set priority attributes net/mlx5_core: Connect flow tables net/mlx5_core: Introduce modify flow table command net/mlx5_core: Managing root flow table ...	2016-01-12 18:57:02 -08:00
Linus Torvalds	33caf82acf	Merge branch 'work.misc' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs Pull misc vfs updates from Al Viro: "All kinds of stuff. That probably should've been 5 or 6 separate branches, but by the time I'd realized how large and mixed that bag had become it had been too close to -final to play with rebasing. Some fs/namei.c cleanups there, memdup_user_nul() introduction and switching open-coded instances, burying long-dead code, whack-a-mole of various kinds, several new helpers for ->llseek(), assorted cleanups and fixes from various people, etc. One piece probably deserves special mention - Neil's lookup_one_len_unlocked(). Similar to lookup_one_len(), but gets called without ->i_mutex and tries to avoid ever taking it. That, of course, means that it's not useful for any directory modifications, but things like getting inode attributes in nfds readdirplus are fine with that. I really should've asked for moratorium on lookup-related changes this cycle, but since I hadn't done that early enough... I am asking for that for the coming cycle, though - I'm going to try and get conversion of i_mutex to rwsem with ->lookup() done under lock taken shared. There will be a patch closer to the end of the window, along the lines of the one Linus had posted last May - mechanical conversion of ->i_mutex accesses to inode_lock()/inode_unlock()/inode_trylock()/ inode_is_locked()/inode_lock_nested(). To quote Linus back then: ----- \| This is an automated patch using \| \| sed 's/mutex_lock(&\(.\)->i_mutex)/inode_lock(\1)/' \| sed 's/mutex_unlock(&\(.\)->i_mutex)/inode_unlock(\1)/' \| sed 's/mutex_lock_nested(&\(.\)->i_mutex,[ ]I_MUTEX_\([A-Z0-9_]\))/inode_lock_nested(\1, I_MUTEX_\2)/' \| sed 's/mutex_is_locked(&\(.\)->i_mutex)/inode_is_locked(\1)/' \| sed 's/mutex_trylock(&\(.\)->i_mutex)/inode_trylock(\1)/' \| \| with a very few manual fixups ----- I'm going to send that once the ->i_mutex-affecting stuff in -next gets mostly merged (or when Linus says he's about to stop taking merges)" 'work.misc' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs: (63 commits) nfsd: don't hold i_mutex over userspace upcalls fs:affs:Replace time_t with time64_t fs/9p: use fscache mutex rather than spinlock proc: add a reschedule point in proc_readfd_common() logfs: constify logfs_block_ops structures fcntl: allow to set O_DIRECT flag on pipe fs: __generic_file_splice_read retry lookup on AOP_TRUNCATED_PAGE fs: xattr: Use kvfree() [s390] page_to_phys() always returns a multiple of PAGE_SIZE nbd: use ->compat_ioctl() fs: use block_device name vsprintf helper lib/vsprintf: add %*pg format specifier fs: use gendisk->disk_name where possible poll: plug an unused argument to do_poll amdkfd: don't open-code memdup_user() cdrom: don't open-code memdup_user() rsxx: don't open-code memdup_user() mtip32xx: don't open-code memdup_user() [um] mconsole: don't open-code memdup_user_nul() [um] hostaudio: don't open-code memdup_user() ...	2016-01-12 17:11:47 -08:00
Linus Torvalds	1baa5efbeb	* s390: Support for runtime instrumentation within guests, support of 248 VCPUs. * ARM: rewrite of the arm64 world switch in C, support for 16-bit VM identifiers. Performance counter virtualization missed the boat. * x86: Support for more Hyper-V features (synthetic interrupt controller), MMU cleanups -----BEGIN PGP SIGNATURE----- Version: GnuPG v2.0.22 (GNU/Linux) iQEcBAABAgAGBQJWlSKwAAoJEL/70l94x66DY0UIAK5vp4zfQoQOJC4KP4Xgxwdu kpnK2Boz3/74o1b0y5+eJZoUZCsXCVLtmP5uhmMxUYWDgByFG2X8ZDhPFwB5FYLT 2dN+Lr4tsolgIfRdHZtrT6Svp9SDL039bWTdscnbR6l37/j9FRWvpKdhI3orloFD /i4CSW2dVIq1/9Xctwu/rtcOEesEx4Cad+6YV3/530eVAXFzE908nXfmqJNZTocY YCGcmrMVCOu0ng5QM4xSzmmYjKMLUcRs+QzZWkVBzdJtTgwZUr09yj7I2dZ1yj/i cxYrJy6shSwE74XkXsmvG+au3C5u3vX4tnXjBFErnPJ99oqzHatVnFWNRhj4dLQ= =PIj1 -----END PGP SIGNATURE----- Merge tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm Pull KVM updates from Paolo Bonzini: "PPC changes will come next week. - s390: Support for runtime instrumentation within guests, support of 248 VCPUs. - ARM: rewrite of the arm64 world switch in C, support for 16-bit VM identifiers. Performance counter virtualization missed the boat. - x86: Support for more Hyper-V features (synthetic interrupt controller), MMU cleanups" * tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm: (115 commits) kvm: x86: Fix vmwrite to SECONDARY_VM_EXEC_CONTROL kvm/x86: Hyper-V SynIC timers tracepoints kvm/x86: Hyper-V SynIC tracepoints kvm/x86: Update SynIC timers on guest entry only kvm/x86: Skip SynIC vector check for QEMU side kvm/x86: Hyper-V fix SynIC timer disabling condition kvm/x86: Reorg stimer_expiration() to better control timer restart kvm/x86: Hyper-V unify stimer_start() and stimer_restart() kvm/x86: Drop stimer_stop() function kvm/x86: Hyper-V timers fix incorrect logical operation KVM: move architecture-dependent requests to arch/ KVM: renumber vcpu->request bits KVM: document which architecture uses each request bit KVM: Remove unused KVM_REQ_KICK to save a bit in vcpu->requests kvm: x86: Check kvm_write_guest return value in kvm_write_wall_clock KVM: s390: implement the RI support of guest kvm/s390: drop unpaired smp_mb kvm: x86: fix comment about {mmu,nested_mmu}.gva_to_gpa KVM: x86: MMU: Use clear_page() instead of init_shadow_page_table() arm/arm64: KVM: Detect vGIC presence at runtime ...	2016-01-12 13:22:12 -08:00
Michael S. Tsirkin	779a6a3696	s390: more efficient smp barriers As per: lkml.kernel.org/r/20150921112252.3c2937e1@mschwide atomics imply a barrier on s390, so s390 should change smp_mb__before_atomic and smp_mb__after_atomic to barrier() instead of smp_mb() and hence should not use the generic versions. Suggested-by: Peter Zijlstra <peterz@infradead.org> Suggested-by: Martin Schwidefsky <schwidefsky@de.ibm.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com> Acked-by: Martin Schwidefsky <schwidefsky@de.ibm.com> Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org>	2016-01-12 20:47:05 +02:00
Michael S. Tsirkin	a677f48695	s390: use generic memory barriers The s390 kernel is SMP to 99.99%, we just didn't bother with a non-smp variant for the memory-barriers. If the generic header is used we'd get the non-smp version for free. It will save a small amount of text space for CONFIG_SMP=n. Suggested-by: Martin Schwidefsky <schwidefsky@de.ibm.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com> Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org>	2016-01-12 20:47:04 +02:00
Michael S. Tsirkin	82b44496ab	s390: define __smp_xxx This defines __smp_xxx barriers for s390, for use by virtualization. Some smp_xxx barriers are removed as they are defined correctly by asm-generic/barriers.h Note: smp_mb, smp_rmb and smp_wmb are defined as full barriers unconditionally on this architecture. Signed-off-by: Michael S. Tsirkin <mst@redhat.com> Acked-by: Arnd Bergmann <arnd@arndb.de> Acked-by: Martin Schwidefsky <schwidefsky@de.ibm.com> Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org>	2016-01-12 20:46:56 +02:00
Michael S. Tsirkin	21535aaed9	s390: reuse asm-generic/barrier.h On s390 read_barrier_depends, smp_read_barrier_depends smp_store_mb(), smp_mb__before_atomic and smp_mb__after_atomic match the asm-generic variants exactly. Drop the local definitions and pull in asm-generic/barrier.h instead. This is in preparation to refactoring this code area. Signed-off-by: Michael S. Tsirkin <mst@redhat.com> Acked-by: Arnd Bergmann <arnd@arndb.de> Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org>	2016-01-12 20:46:48 +02:00
Davidlohr Bueso	5a1b26d7c6	lcoking/barriers, arch: Use smp barriers in smp_store_release() With commit `b92b8b35a2` ("locking/arch: Rename set_mb() to smp_store_mb()") it was made clear that the context of this call (and thus set_mb) is strictly for CPU ordering, as opposed to IO. As such all archs should use the smp variant of mb(), respecting the semantics and saving a mandatory barrier on UP. Signed-off-by: Davidlohr Bueso <dbueso@suse.de> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Cc: <linux-arch@vger.kernel.org> Cc: Andrew Morton <akpm@linux-foundation.org> Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org> Cc: Heiko Carstens <heiko.carstens@de.ibm.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Paul E. McKenney <paulmck@linux.vnet.ibm.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Tony Luck <tony.luck@intel.com> Cc: dave@stgolabs.net Link: http://lkml.kernel.org/r/1445975631-17047-3-git-send-email-dave@stgolabs.net Signed-off-by: Ingo Molnar <mingo@kernel.org> Reviewed-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>	2016-01-12 20:46:46 +02:00
Linus Torvalds	24af98c4cf	Merge branch 'locking-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull locking updates from Ingo Molnar: "So we have a laundry list of locking subsystem changes: - continuing barrier API and code improvements - futex enhancements - atomics API improvements - pvqspinlock enhancements: in particular lock stealing and adaptive spinning - qspinlock micro-enhancements" * 'locking-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: futex: Allow FUTEX_CLOCK_REALTIME with FUTEX_WAIT op futex: Cleanup the goto confusion in requeue_pi() futex: Remove pointless put_pi_state calls in requeue() futex: Document pi_state refcounting in requeue code futex: Rename free_pi_state() to put_pi_state() futex: Drop refcount if requeue_pi() acquired the rtmutex locking/barriers, arch: Remove ambiguous statement in the smp_store_mb() documentation lcoking/barriers, arch: Use smp barriers in smp_store_release() locking/cmpxchg, arch: Remove tas() definitions locking/pvqspinlock: Queue node adaptive spinning locking/pvqspinlock: Allow limited lock stealing locking/pvqspinlock: Collect slowpath lock statistics sched/core, locking: Document Program-Order guarantees locking, sched: Introduce smp_cond_acquire() and use it locking/pvqspinlock, x86: Optimize the PV unlock code path locking/qspinlock: Avoid redundant read of next pointer locking/qspinlock: Prefetch the next node cacheline locking/qspinlock: Use _acquire/_release() versions of cmpxchg() & xchg() atomics: Add test for atomic operations with _relaxed variants	2016-01-11 14:18:38 -08:00
Ard Biesheuvel	bcb7825a77	s390: fix normalization bug in exception table sorting The normalization pass in the sorting routine of the relative exception table serves two purposes: - it ensures that the address fields of the exception table entries are fully ordered, so that no ambiguities arise between entries with identical instruction offsets (i.e., when two instructions that are exactly 8 bytes apart each have an exception table entry associated with them) - it ensures that the offsets of both the instruction and the fixup fields of each entry are relative to their final location after sorting. Commit `eb608fb366` ("s390/exceptions: switch to relative exception table entries") ported the relative exception table format from x86, but modified the sorting routine to only normalize the instruction offset field and not the fixup offset field. The result is that the fixup offset of each entry will be relative to the original location of the entry before sorting, likely leading to crashes when those entries are dereferenced. Fixes: `eb608fb366` ("s390/exceptions: switch to relative exception table entries") Signed-off-by: Ard Biesheuvel <ard.biesheuvel@linaro.org> Cc: <stable@vger.kernel.org> Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com> Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>	2016-01-11 13:02:28 +01:00
Heiko Carstens	cb14def6a9	s390/configs: update default configurations Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com> Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>	2016-01-11 13:01:28 +01:00
Martin Schwidefsky	249c543b97	s390/vdso: optimize getcpu system call Add the CPU number to the per-cpu vdso data page and add the __kernel_getcpu function to the vdso object to retrieve the CPU number in user space. Suggested-by: Heiko Carstens <heiko.carstens@de.ibm.com> Reviewed-by: Heiko Carstens <heiko.carstens@de.ibm.com> Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>	2016-01-11 13:01:24 +01:00
Michael S. Tsirkin	0dab3e0e59	s390: drop smp_mb in vdso_init The initial s390 vdso code is heavily influenced by the powerpc version which does have a smp_wmb in vdso_init right before the vdso_ready=1 assignment. s390 has no need for that. Signed-off-by: Michael S. Tsirkin <mst@redhat.com> Message-Id: <1452010645-25380-1-git-send-email-mst@redhat.com> Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com> Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>	2016-01-11 12:27:22 +01:00
Heiko Carstens	c667aeacc1	s390: rename struct _lowcore to struct lowcore Finally get rid of the leading underscore. I tried this already two or three years ago, however Michael Holzheu objected since this would break the crash utility (again). However Michael integrated support for the new name into the crash utility back then, so it doesn't break if the name will be changed now. So finally get rid of the ever confusing leading underscore. Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com> Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>	2016-01-11 12:27:15 +01:00
Heiko Carstens	423d5b364c	s390/mem_detect: use unsigned longs The memory detection code historically had to use unsigned long long since the machine reported the true memory size (>4GB) even if the virtual machine was running in ESA/390 mode. Since the old code is gone use unsigned long everywhere and also get rid of an unused ADDR2G define. (this patch converts all long longs within sclp_info to longs) There are many more possible conversions, however that can be done if somebody touches the corresponding code. Since people started to convert unrelated long types to long longs because of the types within struct sclp_info convert this now. Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com> Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>	2016-01-11 12:27:11 +01:00
Heiko Carstens	cb951785a7	s390/ptrace: get rid of long longs in psw_bits The long longs were introduced by me in order to have a working definition of the struct psw_bits also in 31 bit mode. Since that is gone also get rid of the long longs. Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com> Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>	2016-01-11 12:27:07 +01:00
Heiko Carstens	7022ec4960	s390/sysinfo: add missing SYSIB 1.2.2 multithreading fields Add missing multithreading fields of SYSIB 1.2.2 (Basic-Machine CPUs) to the output of /proc/sysinfo. Also use bitfields for SYSIB 2.2.2 to simplify the C code a bit. Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com> Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>	2016-01-11 12:27:00 +01:00
Dan Williams	21266be9ed	arch: consolidate CONFIG_STRICT_DEVM in lib/Kconfig.debug Let all the archs that implement devmem_is_allowed() opt-in to a common definition of CONFIG_STRICT_DEVM in lib/Kconfig.debug. Cc: Kees Cook <keescook@chromium.org> Cc: Russell King <linux@arm.linux.org.uk> Cc: Will Deacon <will.deacon@arm.com> Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org> Cc: Martin Schwidefsky <schwidefsky@de.ibm.com> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: "H. Peter Anvin" <hpa@zytor.com> Cc: Andrew Morton <akpm@linux-foundation.org> Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Cc: "David S. Miller" <davem@davemloft.net> Acked-by: Catalin Marinas <catalin.marinas@arm.com> Acked-by: Heiko Carstens <heiko.carstens@de.ibm.com> [heiko: drop 'default y' for s390] Acked-by: Ingo Molnar <mingo@redhat.com> Suggested-by: Arnd Bergmann <arnd@arndb.de> Signed-off-by: Dan Williams <dan.j.williams@intel.com>	2016-01-09 06:30:49 -08:00
Al Viro	bdb97e91e0	[s390] page_to_phys() always returns a multiple of PAGE_SIZE Acked-by: Martin Schwidefsky <schwidefsky@de.ibm.com> Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>	2016-01-09 02:16:04 -05:00
Paolo Bonzini	2860c4b167	KVM: move architecture-dependent requests to arch/ Since the numbers now overlap, it makes sense to enumerate them in asm/kvm_host.h rather than linux/kvm_host.h. Functions that refer to architecture-specific requests are also moved to arch/. Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>	2016-01-08 19:04:36 +01:00
Fan Zhang	c6e5f16637	KVM: s390: implement the RI support of guest This patch adds runtime instrumentation support for KVM guest. We need to setup a save area for the runtime instrumentation-controls control block(RICCB) and implement the necessary interfaces to live migrate the guest settings. We setup the sie control block in a way, that the runtime instrumentation instructions of a guest are handled by hardware. We also add a capability KVM_CAP_S390_RI to make this feature opt-in as it needs migration support. Signed-off-by: Fan Zhang <zhangfan@linux.vnet.ibm.com> Acked-by: Cornelia Huck <cornelia.huck@de.ibm.com> Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>	2016-01-07 14:48:26 +01:00
Michael S. Tsirkin	c57ee5faf4	kvm/s390: drop unpaired smp_mb smp_mb on vcpu destroy isn't paired with anything, violating pairing rules, and seems to be useless. Drop it. Signed-off-by: Michael S. Tsirkin <mst@redhat.com> Message-Id: <1452010811-25486-1-git-send-email-mst@redhat.com> Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>	2016-01-07 14:48:26 +01:00
Craig Gallek	538950a1b7	soreuseport: setsockopt SO_ATTACH_REUSEPORT_[CE]BPF Expose socket options for setting a classic or extended BPF program for use when selecting sockets in an SO_REUSEPORT group. These options can be used on the first socket to belong to a group before bind or on any socket in the group after bind. This change includes refactoring of the existing sk_filter code to allow reuse of the existing BPF filter validation checks. Signed-off-by: Craig Gallek <kraig@google.com> Acked-by: Alexei Starovoitov <ast@kernel.org> Signed-off-by: David S. Miller <davem@davemloft.net>	2016-01-04 22:49:59 -05:00
David S. Miller	c07f30ad68	Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net	2015-12-31 18:20:10 -05:00
Heiko Carstens	9236b4dd6b	s390: get rid of CONFIG_SCHED_MC and CONFIG_SCHED_BOOK Use CONFIG_TOPOLOGY which selects CONFIG_SCHED_* all over the place to reduce the random usage of the previous config options. Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com> Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>	2015-12-30 10:34:57 +01:00
Heiko Carstens	c095a94932	s390/Kconfig: remove pointless 64 bit dependencies s390 is always 64 bit. Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com> Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>	2015-12-30 10:34:47 +01:00
Daniel Borkmann	8b614aebec	bpf: move clearing of A/X into classic to eBPF migration prologue Back in the days where eBPF (or back then "internal BPF" ;->) was not exposed to user space, and only the classic BPF programs internally translated into eBPF programs, we missed the fact that for classic BPF A and X needed to be cleared. It was fixed back then via `83d5b7ef99` ("net: filter: initialize A and X registers"), and thus classic BPF specifics were added to the eBPF interpreter core to work around it. This added some confusion for JIT developers later on that take the eBPF interpreter code as an example for deriving their JIT. F.e. in `f75298f5c3` ("s390/bpf: clear correct BPF accumulator register"), at least X could leak stack memory. Furthermore, since this is only needed for classic BPF translations and not for eBPF (verifier takes care that read access to regs cannot be done uninitialized), more complexity is added to JITs as they need to determine whether they deal with migrations or native eBPF where they can just omit clearing A/X in their prologue and thus reduce image size a bit, see f.e. `cde66c2d88` ("s390/bpf: Only clear A and X for converted BPF programs"). In other cases (x86, arm64), A and X is being cleared in the prologue also for eBPF case, which is unnecessary. Lets move this into the BPF migration in bpf_convert_filter() where it actually belongs as long as the number of eBPF JITs are still few. It can thus be done generically; allowing us to remove the quirk from __bpf_prog_run() and to slightly reduce JIT image size in case of eBPF, while reducing code duplication on this matter in current(/future) eBPF JITs. Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Acked-by: Alexei Starovoitov <ast@kernel.org> Reviewed-by: Michael Holzheu <holzheu@linux.vnet.ibm.com> Tested-by: Michael Holzheu <holzheu@linux.vnet.ibm.com> Cc: Zi Shen Lim <zlim.lnx@gmail.com> Cc: Yang Shi <yang.shi@linaro.org> Acked-by: Yang Shi <yang.shi@linaro.org> Acked-by: Zi Shen Lim <zlim.lnx@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2015-12-18 16:04:51 -05:00
Peter Oberparleiter	2ab59de7c5	s390/cio: Consolidate inline assemblies and related data definitions Replace the current semi-arbitrary distribution of inline assemblies: - Inline assemblies used by CIO go into ioasm.h - Data definitions used by inline assemblies go into cio.h Beyond cleaning up the current structure this is also required for use of tracepoints in inline assemblies introduced by a follow-on patch. Signed-off-by: Peter Oberparleiter <oberpar@linux.vnet.ibm.com> Acked-by: Sebastian Ott <sebott@linux.vnet.ibm.com> Acked-by: Cornelia Huck <cornelia.huck@de.ibm.com> Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>	2015-12-18 14:59:34 +01:00
Christian Borntraeger	d7ae65aec0	s390/setup: cleanup machine flags Over time some machine flags got unused (e.g. MACHINE_FLAG_MVPG) or are available on all 64bit systems (MACHINE_FLAG_CSP, MACHINE_FLAG_IEEE) - let's remove them. Reorder the other ones to match the order of the MACHINE_HAS_* macros and renumber all bits to avoid holes. Also fix the comment about where the flags are detected. Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com> Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>	2015-12-18 14:59:32 +01:00
Heiko Carstens	3dbc78d3a1	s390/smp: save timestamp on external calls This is supposed to make debugging easier: if within a dump we can see that an external call or emergency signal IPI is pending but all cpus are idle, we have no idea for how long the interrupt is outstanding. Therefore save a timestamp into the per cpu pcpu array of the target cpu whenever such an IPI is sent. Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com> Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>	2015-12-18 14:59:31 +01:00
Christian Borntraeger	e527aec434	s390/sysinfo: Remove unused variables max_mnest and rc are never used. Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com> Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>	2015-12-18 14:59:30 +01:00
Christian Borntraeger	90f405e859	s390/extmem: remove unused variable findseg_scode is assigned, but never used. Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com> Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>	2015-12-18 14:59:29 +01:00
Christian Borntraeger	292d8d7157	s390/fault: remove unused variable address is assigned but never used. Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com> Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>	2015-12-18 14:59:28 +01:00
Christian Borntraeger	f9101e639c	s390/traps: Remove unused variable location is assigned but never used. Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com> Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>	2015-12-18 14:59:27 +01:00
Heiko Carstens	6527f6e0cb	s390: compile head.S always with -march=z900 head.S on s390 contains some sanity checks if the kernel will run on a machine or if the machine is too old, e.g. if the kernel contains instructions not available on the machine. If so, it will emit an error message to the console before it stops execution. Therefore head.S contains only instructions which are availanble with the earliest machine generation (z900). In order to make sure we don't accidently add instructions which are not available on z900, always compile with -march=z900. This makes sure compilation will fail if wrong instructions are used. Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com> Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>	2015-12-18 14:59:26 +01:00
Heiko Carstens	f3d05f9de7	s390/facilities: add z13 als bit If configured for z13 assume the kernel makes use of the instructions that are part of the load-and-zero-rightmost-byte facility and load/store-on-condition facility 2. Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com> Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>	2015-12-18 14:59:24 +01:00
Heiko Carstens	6ab6c31a54	s390/facilities: optimize test_facility() test_facility() can be optimized for bits which must be set anyway, due to the check in head.S. This removes a couple of superfluous runtime checks. Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com> Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>	2015-12-18 14:59:23 +01:00
Heiko Carstens	185edd4431	s390/facilities: remove unneeded facility bits The facility lists contain a lot of bits which are not necessary to run the kernel. Therefore remove them and keep only those bits which are required for the kernel. Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com> Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>	2015-12-18 14:59:22 +01:00
Heiko Carstens	0358ecf732	s390/facilities: make use of generated facility list Change head.S to make use of the generated facility list. Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com> Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>	2015-12-18 14:59:21 +01:00
Heiko Carstens	c30f6828fe	s390/facilities: add helper tool to generate facility lists Modifying the architecture level set facility lists was always very error prone. Given the numbering of the facility bits within the Principles of Operation, where the most significant bit number is 0, it happened a lot of times that wrong bits were set or cleared. Therefore this patch adds a tool "gen_facilities" which generates include/generated/facilites.h. The definition of the bits to be set is contained within arch/s390/include/asm/facilities_src.h and can be easily extended to e.g. also generate such lists for the KVM module. The generated file looks like this: #define FACILITIES_ALS _AC(0xc1006450f0040000,UL) #define FACILITIES_ALS_DWORDS 1 The facility bits defined in this patch match 1:1 to the current masks that can be found in head.S. That is if the tool gets executed with -march=z990 then the generated masks will equal the masks in head.S for CONFIG_MARCH_Z990. Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com> Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>	2015-12-18 14:59:20 +01:00
Heiko Carstens	76cdd44c2e	s390/facilities: always use lowcore's stfle field for storing facility bits head.s contains an stfle instruction which stores it result at the storage location that is assigned to the stfl instruction. This is currently no problem, since we only care about one double word. However if the number of double words in the ALS bitfield grows the current code is not very stable. E.g. before issuing the stfle command the memory to which it stores must be cleared, since the instruction may or may not clear memory contents where no bits are set. In order to simplify the code a bit always use the storage location that we reserved for the stfle result. Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com> Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>	2015-12-18 14:59:19 +01:00
Heiko Carstens	9552a66fe6	s390/facilities: use stfl mnemonic instead of insn magic Now that 31 bit support is gone, the assembler always knows about the stfl instruction. Therefore lets use a readable mnemonic. Also remove the not needed extable entry for the inline assembly and fix the output constraint. Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com> Reviewed-by: Christian Borntraeger <borntraeger@de.ibm.com> Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>	2015-12-18 14:59:18 +01:00
Michael Holzheu	272fa59ccb	s390/dis: Fix handling of format specifiers The print_insn() function returns strings like "lghi %r1,0". To escape the '%' character in sprintf() a second '%' is used. For example "lghi %%r1,0" is converted into "lghi %r1,0". After print_insn() the output string is passed to printk(). Because format specifiers like "%r" or "%f" are ignored by printk() this works by chance most of the time. But for instructions with control registers like "lctl %c6,%c6,780" this fails because printk() interprets "%c" as character format specifier. Fix this problem and escape the '%' characters twice. For example "lctl %%%%c6,%%%%c6,780" is then converted by sprintf() into "lctl %%c6,%%c6,780" and by printk() into "lctl %c6,%c6,780". Signed-off-by: Michael Holzheu <holzheu@linux.vnet.ibm.com> Cc: stable@vger.kernel.org Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>	2015-12-18 14:43:21 +01:00
Guenther Hutzl	32e6b236d2	KVM: s390: consider system MHA for guest storage Verify that the guest maximum storage address is below the MHA (maximum host address) value allowed on the host. Acked-by: Michael Holzheu <holzheu@linux.vnet.ibm.com> Reviewed-by: Cornelia Huck <cornelia.huck@de.ibm.com> Reviewed-by: David Hildenbrand <dahi@linux.vnet.ibm.com> Signed-off-by: Guenther Hutzl <hutzl@linux.vnet.ibm.com> Signed-off-by: Dominik Dingel <dingel@linux.vnet.ibm.com> [adopt to match recent limit,size changes] Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>	2015-12-15 17:08:22 +01:00
Dominik Dingel	a3a92c31bf	KVM: s390: fix mismatch between user and in-kernel guest limit While the userspace interface requests the maximum size the gmap code expects to get a maximum address. This error resulted in bigger page tables than necessary for some guest sizes, e.g. a 2GB guest used 3 levels instead of 2. At the same time we introduce KVM_S390_NO_MEM_LIMIT, which allows in a bright future that a guest spans the complete 64 bit address space. We also switch to TASK_MAX_SIZE for the initial memory size, this is a cosmetic change as the previous size also resulted in a 4 level pagetable creation. Reported-by: David Hildenbrand <dahi@linux.vnet.ibm.com> Reviewed-by: Cornelia Huck <cornelia.huck@de.ibm.com> Signed-off-by: Dominik Dingel <dingel@linux.vnet.ibm.com> Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>	2015-12-15 17:08:21 +01:00
Christian Borntraeger	8335713ad0	KVM: s390: obey kptr_restrict in traces The s390dbf and trace events provide a debugfs interface. If kptr_restrict is active, we should not expose kernel pointers. We can fence the debugfs output by using %pK instead of %p. Cc: Kees Cook <keescook@chromium.org> Reviewed-by: Cornelia Huck <cornelia.huck@de.ibm.com> Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>	2015-12-15 17:06:32 +01:00
Christian Borntraeger	7ec7c8c70b	KVM: s390: use assignment instead of memcpy Replace two memcpy with proper assignment. Suggested-by: Paolo Bonzini <pbonzini@redhat.com> Reviewed-by: David Hildenbrand <dahi@linux.vnet.ibm.com> Acked-by: Cornelia Huck <cornelia.huck@de.ibm.com> Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>	2015-12-15 16:06:48 +01:00
Rusty Russell	7523e4dc50	module: use a structure to encapsulate layout. Makes it easier to handle init vs core cleanly, though the change is fairly invasive across random architectures. It simplifies the rbtree code immediately, however, while keeping the core data together in the same cachline (now iff the rbtree code is enabled). Acked-by: Peter Zijlstra <peterz@infradead.org> Reviewed-by: Josh Poimboeuf <jpoimboe@redhat.com> Signed-off-by: Rusty Russell <rusty@rustcorp.com.au> Signed-off-by: Jiri Kosina <jkosina@suse.cz>	2015-12-04 22:46:25 +01:00
Davidlohr Bueso	d5a73cadf3	lcoking/barriers, arch: Use smp barriers in smp_store_release() With commit `b92b8b35a2` ("locking/arch: Rename set_mb() to smp_store_mb()") it was made clear that the context of this call (and thus set_mb) is strictly for CPU ordering, as opposed to IO. As such all archs should use the smp variant of mb(), respecting the semantics and saving a mandatory barrier on UP. Signed-off-by: Davidlohr Bueso <dbueso@suse.de> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Cc: <linux-arch@vger.kernel.org> Cc: Andrew Morton <akpm@linux-foundation.org> Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org> Cc: Heiko Carstens <heiko.carstens@de.ibm.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Paul E. McKenney <paulmck@linux.vnet.ibm.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Tony Luck <tony.luck@intel.com> Cc: dave@stgolabs.net Link: http://lkml.kernel.org/r/1445975631-17047-3-git-send-email-dave@stgolabs.net Signed-off-by: Ingo Molnar <mingo@kernel.org>	2015-12-04 11:39:51 +01:00
Christian Borntraeger	2f8a43d45d	KVM: s390: remove redudant assigment of error code rc already contains -ENOMEM, no need to assign it twice. Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com> Acked-by: Cornelia Huck <cornelia.huck@de.ibm.com> Reviewed-by: David Hildenbrand <dahi@linux.vnet.ibm.com>	2015-11-30 12:47:13 +01:00
Heiko Carstens	a6aacc3f87	KVM: s390: remove pointless test_facility(2) check This evaluates always to 'true'. Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com> Reviewed-by: David Hildenbrand <dahi@linux.vnet.ibm.com> Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>	2015-11-30 12:47:12 +01:00
David Hildenbrand	07197fd05f	KVM: s390: don't load kvm without virtualization support If we don't have support for virtualization (SIE), e.g. when running under a hypervisor not supporting execution of the SIE instruction, we should immediately abort loading the kvm module, as the SIE instruction cannot be enabled dynamically. Currently, the SIE instructions fails with an exception on a non-SIE host, resulting in the guest making no progress, instead of failing hard. Reviewed-by: Cornelia Huck <cornelia.huck@de.ibm.com> Acked-by: Martin Schwidefsky <schwidefsky@de.ibm.com> Signed-off-by: David Hildenbrand <dahi@linux.vnet.ibm.com> Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>	2015-11-30 12:47:12 +01:00
David Hildenbrand	7f16d7e787	s390: show virtualization support in /proc/cpuinfo This patch exposes the SIE capability (aka virtualization support) via /proc/cpuinfo -> "features" as "sie". As we don't want to expose this hwcap via elf, let's add a second, "internal"/non-elf capability list. The content is simply concatenated to the existing features when printing /proc/cpuinfo. We also add the defines to elf.h to keep the hwcap stuff at a common place. Acked-by: Martin Schwidefsky <schwidefsky@de.ibm.com> Signed-off-by: David Hildenbrand <dahi@linux.vnet.ibm.com> Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>	2015-11-30 12:47:11 +01:00
David Hildenbrand	8dfd523f85	s390/sclp: introduce check for SIE This patch adds a way to check if the SIE with zArchitecture support is available. Acked-by: Martin Schwidefsky <schwidefsky@de.ibm.com> Acked-by: Cornelia Huck <cornelia.huck@de.ibm.com> Signed-off-by: David Hildenbrand <dahi@linux.vnet.ibm.com> Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>	2015-11-30 12:47:11 +01:00
David Hildenbrand	4215825eeb	KVM: s390: don't switch to ESCA for ucontrol sca_add_vpcu is not called for ucontrol guests. We must also not apply the sca checking for sca_can_add_vcpu as ucontrol guests do not have to follow the sca limits. As common code already checks that id < KVM_MAX_VCPUS all other data structures are safe as well. Signed-off-by: David Hildenbrand <dahi@linux.vnet.ibm.com> Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>	2015-11-30 12:47:11 +01:00
David Hildenbrand	eaa78f3432	KVM: s390: cleanup sca_add_vcpu Now that we already have kvm and the VCPU id set for the VCPU, we can convert sda_add_vcpu to look much more like sda_del_vcpu. Signed-off-by: David Hildenbrand <dahi@linux.vnet.ibm.com> Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>	2015-11-30 12:47:10 +01:00
David Hildenbrand	10ce32d5b0	KVM: s390: always set/clear the SCA sda field Let's always set and clear the sda when enabling/disabling a VCPU. Dealing with sda being set to something else makes no sense anymore as we enable a VCPU in the SCA now after it has been registered at the VM. Signed-off-by: David Hildenbrand <dahi@linux.vnet.ibm.com> Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>	2015-11-30 12:47:10 +01:00
David Hildenbrand	2550882449	KVM: s390: fix SCA related races and double use If something goes wrong in kvm_arch_vcpu_create, the VCPU has already been added to the sca but will never be removed. Trying to create VCPUs with duplicate ids (e.g. after a failed attempt) is problematic. Also, when creating multiple VCPUs in parallel, we could theoretically forget to set the correct SCA when the switch to ESCA happens just before the VCPU is registered. Let's add the VCPU to the SCA in kvm_arch_vcpu_postcreate, where we can be sure that no duplicate VCPU with the same id is around and the VCPU has already been registered at the VM. We also have to make sure to update ECB at that point. Signed-off-by: David Hildenbrand <dahi@linux.vnet.ibm.com> Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>	2015-11-30 12:47:09 +01:00
David Hildenbrand	5f3fe620a5	KVM: s390: we always have a SCA Having no sca can never happen, even when something goes wrong when switching to ESCA. Otherwise we would have a serious bug. Let's remove this superfluous check. Acked-by: Dominik Dingel <dingel@linux.vnet.ibm.com> Signed-off-by: David Hildenbrand <dahi@linux.vnet.ibm.com> Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>	2015-11-30 12:47:09 +01:00
David Hildenbrand	2c1bb2be98	KVM: s390: fast path for sca_ext_call_pending If CPUSTAT_ECALL_PEND isn't set, we can't have an external call pending, so we can directly avoid taking the lock. Acked-by: Christian Borntraeger <borntraeger@de.ibm.com> Signed-off-by: David Hildenbrand <dahi@linux.vnet.ibm.com> Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>	2015-11-30 12:47:09 +01:00
Eugene (jno) Dvurechenski	fe0edcb731	KVM: s390: Enable up to 248 VCPUs per VM This patch allows s390 to have more than 64 VCPUs for a guest (up to 248 for memory usage considerations), if supported by the underlaying hardware (sclp.has_esca). Signed-off-by: Eugene (jno) Dvurechenski <jno@linux.vnet.ibm.com> Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>	2015-11-30 12:47:08 +01:00
Eugene (jno) Dvurechenski	5e04431523	KVM: s390: Introduce switching code This patch adds code that performs transparent switch to Extended SCA on addition of 65th VCPU in a VM. Disposal of ESCA is added too. The entier ESCA functionality, however, is still not enabled. The enablement will be provided in a separate patch. This patch also uses read/write lock protection of SCA and its subfields for possible disposal at the BSCA-to-ESCA transition. While only Basic SCA needs such a protection (for the swap), any SCA access is now guarded. Signed-off-by: Eugene (jno) Dvurechenski <jno@linux.vnet.ibm.com> Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>	2015-11-30 12:47:08 +01:00
Eugene (jno) Dvurechenski	7d43bafcff	KVM: s390: Make provisions for ESCA utilization This patch updates the routines (sca_) to provide transparent access to and manipulation on the data for both Basic and Extended SCA in use. The kvm.arch.sca is generalized to (void ) to handle BSCA/ESCA cases. Also the kvm.arch.use_esca flag is provided. The actual functionality is kept the same. Signed-off-by: Eugene (jno) Dvurechenski <jno@linux.vnet.ibm.com> Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>	2015-11-30 12:47:08 +01:00
Eugene (jno) Dvurechenski	bc784ccee5	KVM: s390: Introduce new structures This patch adds new structures and updates some existing ones to provide the base for Extended SCA functionality. The old sca_* structures were renamed to bsca_* to keep things uniform. The access to fields of SIGP controls were turned into bitfields instead of hardcoded bitmasks. Signed-off-by: Eugene (jno) Dvurechenski <jno@linux.vnet.ibm.com> Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>	2015-11-30 12:47:07 +01:00
Eugene (jno) Dvurechenski	a6e2f683e7	KVM: s390: Provide SCA-aware helpers for VCPU add/del This patch provides SCA-aware helpers to create/delete a VCPU. This is to prepare for upcoming introduction of Extended SCA support. Signed-off-by: Eugene (jno) Dvurechenski <jno@linux.vnet.ibm.com> Reviewed-by: David Hildenbrand <dahi@linux.vnet.ibm.com> Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>	2015-11-30 12:47:07 +01:00
Eugene (jno) Dvurechenski	a5bd764734	KVM: s390: Generalize access to SIGP controls This patch generalizes access to the SIGP controls, which is a part of SCA. This is to prepare for upcoming introduction of Extended SCA support. Signed-off-by: Eugene (jno) Dvurechenski <jno@linux.vnet.ibm.com> Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>	2015-11-30 12:47:06 +01:00
Eugene (jno) Dvurechenski	605145103a	KVM: s390: Generalize access to IPTE controls This patch generalizes access to the IPTE controls, which is a part of SCA. This is to prepare for upcoming introduction of Extended SCA support. Signed-off-by: Eugene (jno) Dvurechenski <jno@linux.vnet.ibm.com> Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>	2015-11-30 12:47:06 +01:00
Eugene (jno) Dvurechenski	f7ba1d3426	s390/sclp: introduce checks for ESCA and HVS Introduce sclp.has_hvs and sclp.has_esca to provide a way for kvm to check whether the extended-SCA and the home-virtual-SCA facilities are available. Signed-off-by: Eugene (jno) Dvurechenski <jno@linux.vnet.ibm.com> Reviewed-by: David Hildenbrand <dahi@linux.vnet.ibm.com> Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>	2015-11-30 12:47:06 +01:00
David Hildenbrand	71f116bfed	KVM: s390: rewrite vcpu_post_run and drop out early Let's rewrite this function to better reflect how we actually handle exit_code. By dropping out early we can save a few cycles. This especially speeds up sie exits caused by host irqs. Also, let's move the special -EOPNOTSUPP for intercepts to the place where it belongs and convert it to -EREMOTE. Reviewed-by: Dominik Dingel <dingel@linux.vnet.ibm.com> Reviewed-by: Cornelia Huck <cornelia.huck@de.ibm.com> Signed-off-by: David Hildenbrand <dahi@linux.vnet.ibm.com> Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>	2015-11-30 12:47:05 +01:00
David Hildenbrand	e09fefdeeb	KVM: Use common function for VCPU lookup by id Let's reuse the new common function for VPCU lookup by id. Reviewed-by: Christian Borntraeger <borntraeger@de.ibm.com> Reviewed-by: Dominik Dingel <dingel@linux.vnet.ibm.com> Signed-off-by: David Hildenbrand <dahi@linux.vnet.ibm.com> Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com> [split out the new function into a separate patch]	2015-11-30 12:47:04 +01:00
Martin Schwidefsky	419123f900	s390/spinlock: do not yield to a CPU in udelay/mdelay It does not make sense to try to relinquish the time slice with diag 0x9c to a CPU in a state that does not allow to schedule the CPU. The scenario where this can happen is a CPU waiting in udelay/mdelay while holding a spin-lock. Add a CIF bit to tag a CPU in enabled wait and use it to detect that the yield of a CPU will not be successful and skip the diagnose call. Reviewed-by: Heiko Carstens <heiko.carstens@de.ibm.com> Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>	2015-11-27 09:24:18 +01:00
Heiko Carstens	9eb31be33c	s390: remove is_32bit_task() helper is_32bit_task() used to be helpful when we still had CONFIG_32BIT. Since that is gone, it is nowadays identical to is_compat_task(). So remove it. Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com> Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>	2015-11-27 09:24:17 +01:00
Michael Holzheu	b8eecf36a4	s390: add 'install' target to 'make help' Signed-off-by: Michael Holzheu <holzheu@linux.vnet.ibm.com> Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>	2015-11-27 09:24:17 +01:00
Sascha Silbe	3f975df69d	s390/sclp: Add VT220 support to early sclp console When running under qemu with the default configuration (-nographic), there is only a VT220 SCLP console, no line-mode SCLP console. Add VT220 support to the early SCLP console so the user has a chance to see critical error messages during early boot. None of the existing users of _sclp_print_early() check the return code. Instead of trying to come up with return code semantics when printing to multiple consoles (any or all of which may fail), we just drop the return code entirely. Tested on z/VM (line mode console) and LPAR (VT220 and line mode console). Tested on qemu/KVM with VT220 console and / or line mode console. Signed-off-by: Sascha Silbe <silbe@linux.vnet.ibm.com> Reviewed-by: Peter Oberparleiter <oberpar@linux.vnet.ibm.com> Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>	2015-11-27 09:24:17 +01:00
Christian Borntraeger	561e103002	s390/dis: Fix printing of the register numbers Since commit `b006f19b05` ("lib/vsprintf.c: handle invalid format specifiers more robustly") I get errors like [...] Krnl Code: 00000000004e2410: c00400000000 brcl 0,4e2410 Please remove unsupported %r in format string [ 8.179483] ------------[ cut here ]------------ [ 8.179484] WARNING: at lib/vsprintf.c:1781 Turns out that our disassembler relied on %r not being used as format string. Let's do the proper escaping of our decode buffers. Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com> Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>	2015-11-27 09:24:16 +01:00
Gerald Schaefer	69eea95c48	s390/pci_dma: fix DMA table corruption with > 4 TB main memory DMA addresses returned from map_page() are calculated by using an iommu bitmap plus a start_dma offset. The size of this bitmap is based on the main memory size. If we have more than (4 TB - start_dma) main memory, the DMA address calculation will also produce addresses > 4 TB. Such addresses cannot be inserted in the 3-level DMA page table, instead the entries modulo 4 TB will be overwritten. Fix this by restricting the iommu bitmap size to (4 TB - start_dma). Also set zdev->end_dma to the actual end address of the usable range, instead of the theoretical maximum as reported by the hardware, which fixes a sanity check in dma_map() and also the IOMMU API domain geometry aperture calculation. Signed-off-by: Gerald Schaefer <gerald.schaefer@de.ibm.com> Reviewed-by: Sebastian Ott <sebott@linux.vnet.ibm.com> Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>	2015-11-27 09:24:15 +01:00
David Hildenbrand	406123517b	s390: get_user_pages_fast() might sleep Let's annotate it correctly, so we directly get a warning if we ever were to use it in atomic/preempt_disable/spinlock environment. Acked-by: Heiko Carstens <heiko.carstens@de.ibm.com> Signed-off-by: David Hildenbrand <dahi@linux.vnet.ibm.com> Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>	2015-11-27 09:24:15 +01:00
Martin Schwidefsky	db1c45154a	s390/spinlock: avoid diagnose loop The spinlock implementation calls the diagnose 0x9c / 0x44 immediately if the SIGP sense running reported the target CPU as not running. The diagnose 0x9c is a hint to the hypervisor to schedule the target CPU in preference to the source CPU that issued the diagnose. It can happen that on return from the diagnose the target CPU has not been scheduled yet, e.g. if the target logical CPU is on another physical CPU and the hypervisor did not want to migrate the logical CPU. Avoid the immediate repeat of the diagnose instruction, instead do the retry loop before the next invocation of diagnose 0x9c. Reviewed-by: Heiko Carstens <heiko.carstens@de.ibm.com> Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>	2015-11-27 09:24:15 +01:00
Martin Schwidefsky	1a2c5840ac	s390/dump: cleanup CPU save area handling Introduce save_area_alloc(), save_area_boot_cpu(), save_area_add_regs() and save_area_add_vxrs to deal with storing the CPU state in case of a system dump. Remove struct save_area and save_area_ext, and create a new struct save_area as a local definition to arch/s390/kernel/crash_dump.c. Copy each individual field from the hardware status area to the save area, storing the minimum of required data. Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>	2015-11-27 09:24:14 +01:00
Martin Schwidefsky	1a36a39e22	s390/dump: rework CPU register dump code To collect the CPU registers of the crashed system allocated a single page with memblock_alloc_base and use it as a copy buffer. Replace the stop-and-store-status sigp with a store-status-at-address sigp in smp_save_dump_cpus() and smp_store_status(). In both cases the target CPU is already stopped and store-status-at-address avoids the detour via the absolute zero page. For kexec simplify s390_reset_system and call store_status() before the prefix register of the boot CPU has been set to zero. Use STPX to store the prefix register and remove dump_prefix_page. Acked-by: Michael Holzheu <holzheu@linux.vnet.ibm.com> Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>	2015-11-27 09:24:14 +01:00
Martin Schwidefsky	f08b841463	s390/dump: remove SAVE_AREA_BASE Replace the SAVE_AREA_BASE offset calculations in reipl.S with the assembler constant for the location of each register status area. Use __LC_FPREGS_SAVE_AREA instead of SAVE_AREA_BASE in the three remaining code locations and remove the definition of SAVE_AREA_BASE. Acked-by: Michael Holzheu <holzheu@linux.vnet.ibm.com> Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>	2015-11-27 09:24:14 +01:00
Martin Schwidefsky	d9a3a09af5	s390/kvm: remove dependency on struct save_area definition Replace the offsets based on the struct area_area with the offset constants from asm-offsets.c based on the struct _lowcore. Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>	2015-11-27 09:24:13 +01:00
Martin Schwidefsky	df9694c797	s390/dump: streamline oldmem copy functions Introduce two copy functions for the memory of the dumped system, copy_oldmem_kernel() to copy to the virtual kernel address space and copy_oldmem_user() to copy to user space. Acked-by: Michael Holzheu <holzheu@linux.vnet.ibm.com> Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>	2015-11-27 09:24:12 +01:00
Martin Schwidefsky	8a07dd02d7	s390/kdump: remove code to create ELF notes in the crashed system The s390 architecture can store the CPU registers of the crashed system after the kdump kernel has been started and this is the preferred way. Remove the remaining code fragments that deal with storing CPU registers while the crashed system is still active. Acked-by: Michael Holzheu <holzheu@linux.vnet.ibm.com> Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>	2015-11-27 09:24:12 +01:00
Martin Schwidefsky	ffa52d02c5	s390/zcore: remove /sys/kernel/debug/zcore/mem New versions of the SCSI dumper use the /dev/vmcore interface instead of zcore mem. Remove the outdated interface. Acked-by: Michael Holzheu <holzheu@linux.vnet.ibm.com> Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>	2015-11-27 09:24:12 +01:00
Martin Schwidefsky	bbfed511c2	s390/zcore: copy vector registers into the image data The /sys/kernel/debug/zcore/mem interface delivers the memory of the old system with the CPU registers stored to the assigned locations in each prefix page. For the vector registers the prefix page of each CPU has an address of a 1024 byte save area at 0x11b0. But the /sys/kernel/debug/zcore/mem interface fails copy the vector registers saved at boot of the zfcpdump kernel into the dump image. Copy the saved vector registers of a CPU to the outout buffer if the memory area that is read via /sys/kernel/debug/zcore/mem intersects with the vector register save area of this CPU. Acked-by: Michael Holzheu <holzheu@linux.vnet.ibm.com> Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>	2015-11-27 09:24:11 +01:00
Paolo Bonzini	8bd142c016	KVM/ARM Fixes for v4.4-rc3. Includes some timer fixes, properly unmapping PTEs, an errata fix, and two tweaks to the EL2 panic code. -----BEGIN PGP SIGNATURE----- Version: GnuPG v1 iQEcBAABAgAGBQJWVJ7gAAoJEEtpOizt6ddyD5MH/3M/nhtZTnT6v0RPDvHWJo7s 5BQmITJYPHFkTO14OHWTVLXiGgLws8gPZnWHxC4jjHjpuJnL+/MM551FpCOqDDd7 vweYgVlSqD8ANH5nKbv1PPnzjrqhTVN+yi3ZItXy2pxsfvu63FC6Z43B2axelLvw XYmHoMZaeWBBw2gHi3djGfju3Yj/2SOe+ozuvAXpxA5+NhSiPHHnMefGy5k3wKnJ sETwshPdjiMeK4ItfMhveFTDRjl4uh9uQyORfaa5gqG0uePt3EalYynw+gEjZ6RX Bpc3nLwboIfRIa/WwyoHm+nmLIUYjU8dAgLwUOIbdeG0igpdALdvsB0aBHCgngk= =+7ED -----END PGP SIGNATURE----- Merge tag 'kvm-arm-for-v4.4-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/kvmarm/kvmarm into kvm-master KVM/ARM Fixes for v4.4-rc3. Includes some timer fixes, properly unmapping PTEs, an errata fix, and two tweaks to the EL2 panic code.	2015-11-24 19:34:40 +01:00
David Hildenbrand	152e9f65d6	KVM: s390: fix wrong lookup of VCPUs by array index For now, VCPUs were always created sequentially with incrementing VCPU ids. Therefore, the index in the VCPUs array matched the id. As sequential creation might change with cpu hotplug, let's use the correct lookup function to find a VCPU by id, not array index. Let's also use kvm_lookup_vcpu() for validation of the sending VCPU on external call injection. Reviewed-by: Christian Borntraeger <borntraeger@de.ibm.com> Signed-off-by: David Hildenbrand <dahi@linux.vnet.ibm.com> Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com> Cc: stable@vger.kernel.org # `db27a7a` KVM: Provide function for VCPU lookup by id	2015-11-19 14:47:43 +01:00
David Hildenbrand	b85de33a1a	KVM: s390: avoid memory overwrites on emergency signal injection Commit `383d0b0501` ("KVM: s390: handle pending local interrupts via bitmap") introduced a possible memory overwrite from user space. User space could pass an invalid emergency signal code (sending VCPU) and therefore exceed the bitmap. Let's take care of this case and check that the id is in the valid range. Reviewed-by: Dominik Dingel <dingel@linux.vnet.ibm.com> Cc: stable@vger.kernel.org # v3.19+ `db27a7a` KVM: Provide function for VCPU lookup by id Signed-off-by: David Hildenbrand <dahi@linux.vnet.ibm.com> Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>	2015-11-19 14:47:32 +01:00
Heiko Carstens	03c02807e2	KVM: s390: fix pfmf intercept handler The pfmf intercept handler should check if the EDAT 1 facility is installed in the guest, not if it is installed in the host. Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com> Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>	2015-11-19 11:08:17 +01:00
David Hildenbrand	5967c17b11	KVM: s390: enable SIMD only when no VCPUs were created We should never allow to enable/disable any facilities for the guest when other VCPUs were already created. kvm_arch_vcpu_(load\|put) relies on SIMD not changing during runtime. If somebody would create and run VCPUs and then decides to enable SIMD, undefined behaviour could be possible (e.g. vector save area not being set up). Acked-by: Christian Borntraeger <borntraeger@de.ibm.com> Acked-by: Cornelia Huck <cornelia.huck@de.ibm.com> Signed-off-by: David Hildenbrand <dahi@linux.vnet.ibm.com> Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com> Cc: stable@vger.kernel.org # 4.1+	2015-11-19 11:08:16 +01:00
Heiko Carstens	f52c74fee9	s390: remove SALIPL loader There is no known user, therefore remove the code. Acked-by: Rob Van Der Heij <robvdheij@nl.ibm.com> Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com> Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>	2015-11-16 12:51:11 +01:00
Heiko Carstens	932f608193	s390: wire up mlock2 system call Passes mlock2-tests test case in 64 bit and compat mode. Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com> Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>	2015-11-16 12:51:07 +01:00
Heiko Carstens	ddfd4a054b	s390: remove g5 elf platform support Remove dead code, since this could only happen on a 31 bit machine where the kernel wouldn't IPL. Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com> Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>	2015-11-16 12:04:41 +01:00
Martin Schwidefsky	c7e8b2c21c	s390: avoid cache aliasing under z/VM and KVM commit `1f6b83e5e4` ("s390: avoid z13 cache aliasing") checks for the machine type to optimize address space randomization and zero page allocation to avoid cache aliases. This check might fail under a hypervisor with migration support. z/VMs "Single System Image and Live Guest Relocation" facility will "fake" the machine type of the oldest system in the group. For example in a group of zEC12 and Z13 the guest appears to run on a zEC12 (architecture fencing within the relocation domain) Remove the machine type detection and always use cache aliasing rules that are known to work for all machines. These are the z13 aliasing rules. Suggested-by: Christian Borntraeger <borntraeger@de.ibm.com> Reviewed-by: Heiko Carstens <heiko.carstens@de.ibm.com> Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>	2015-11-16 12:04:18 +01:00
Sascha Silbe	f07f21b3e2	s390/sclp: _sclp_wait_int(): retain full PSW mask There's no reason to clear all PSW mask bits other than the addressing mode bits. Just use the previous PSW mask as-is. Signed-off-by: Sascha Silbe <silbe@linux.vnet.ibm.com> Acked-by: Heiko Carstens <heiko.carstens@de.ibm.com> Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>	2015-11-12 13:08:00 +01:00
Sebastian Ott	18e22a1772	s390: add support for ipl devices in subchannel sets > 0 Allow to ipl from CCW based devices residing in any subchannel set. Reviewed-by: Michael Holzheu <holzheu@linux.vnet.ibm.com> Signed-off-by: Sebastian Ott <sebott@linux.vnet.ibm.com> Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>	2015-11-11 13:56:27 +01:00
Sebastian Ott	e0bedada3a	s390/ipl: fix out of bounds access in scpdata_write The input buffer in reipl_fcp_scpdata_write is accessed out of bounds when an offset is specified. The problem is that the offset refers to the data we should write to and not to the buffer we read from. So instead of memcpy(scp_data, buf + off, count); we could just do memcpy(scp_data + off, buf, count); However we not only modify the data but also store its length. For this to work we'd need to remember a state per open FH. Since that's not possible with sysfs callbacks let's just fail when an offset is specified. Signed-off-by: Sebastian Ott <sebott@linux.vnet.ibm.com> Acked-by: Hendrik Brueckner <brueckner@linux.vnet.ibm.com> Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>	2015-11-11 09:07:06 +01:00
Sebastian Ott	52d43d8184	s390/pci_dma: improve debugging of errors during dma map Improve debugging to find out what went wrong during a failed dma map/unmap operation. Signed-off-by: Sebastian Ott <sebott@linux.vnet.ibm.com> Reviewed-by: Gerald Schaefer <gerald.schaefer@de.ibm.com> Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>	2015-11-09 09:10:49 +01:00
Sebastian Ott	66728eeea6	s390/pci_dma: handle dma table failures We use lazy allocation for translation table entries but don't handle allocation (and other) failures during translation table updates. Handle these failures and undo translation table updates when it's meaningful. Signed-off-by: Sebastian Ott <sebott@linux.vnet.ibm.com> Reviewed-by: Gerald Schaefer <gerald.schaefer@de.ibm.com> Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>	2015-11-09 09:10:49 +01:00
Sebastian Ott	4d5a6b7295	s390/pci_dma: unify label of invalid translation table entries Newly allocated translation table entries are flagged as invalid and protected. If an existing translation table entry is invalidated, the protection flag is left unchanged. If a page (with invalid and protection flag set) is accessed it's undefined which type of exception we'll receive. Make sure to always set the invalid flag only. Signed-off-by: Sebastian Ott <sebott@linux.vnet.ibm.com> Reviewed-by: Gerald Schaefer <gerald.schaefer@de.ibm.com> Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>	2015-11-09 09:10:49 +01:00
Heiko Carstens	86b68c3873	s390/syscalls: remove system call number calculation Explicitly write the system call number for each define instead of calculating it. This makes it easier to parse the file when generating system call tables for various tools and libraries. Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com> Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>	2015-11-09 09:10:48 +01:00
Martin Schwidefsky	230ccb370f	s390/diag: add a s390 prefix to the diagnose trace point Documentation/trace/tracepoints.txt states that the naming scheme for tracepoints is "subsys_event" to avoid collisions. Rename the 'diagnose' tracepoint to 's390_diagnose'. Reported-by: Peter Oberparleiter <oberpar@linux.vnet.ibm.com> Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>	2015-11-09 09:10:47 +01:00
Sascha Silbe	c6eafbf990	s390/head: fix error message on unsupported hardware startup calls the C function _sclp_print_early() if the machine we're running on is not supported by the kernel. sclp.c is getting built with -m64, so _sclp_print_early() expects the zSeries ELF ABI to be used. We previously called _sclp_print_early() using the S/390 ELF ABI, with a stack frame size of 96 bytes and while being in 31-bit address mode. This caused _sclp_wait_int() (called indirectly from _sclp_print_early()) to jump to an undefined address. While _sclp_wait_int() contained some code to deal with being called in 31-bit addressing mode, it didn't quite work. While fixing this is possible, the code would still only work by chance and could break any time. Ensure compliance with the zSeries ELF ABI by switching to 64-bit addressing mode early and using a minimum stack frame size of 160 bytes. Signed-off-by: Sascha Silbe <silbe@linux.vnet.ibm.com> Acked-by: Heiko Carstens <heiko.carstens@de.ibm.com> Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>	2015-11-09 09:10:47 +01:00
Linus Torvalds	933425fb00	s390: A bunch of fixes and optimizations for interrupt and time handling. PPC: Mostly bug fixes. ARM: No big features, but many small fixes and prerequisites including: - a number of fixes for the arch-timer - introducing proper level-triggered semantics for the arch-timers - a series of patches to synchronously halt a guest (prerequisite for IRQ forwarding) - some tracepoint improvements - a tweak for the EL2 panic handlers - some more VGIC cleanups getting rid of redundant state x86: quite a few changes: - support for VT-d posted interrupts (i.e. PCI devices can inject interrupts directly into vCPUs). This introduces a new component (in virt/lib/) that connects VFIO and KVM together. The same infrastructure will be used for ARM interrupt forwarding as well. - more Hyper-V features, though the main one Hyper-V synthetic interrupt controller will have to wait for 4.5. These will let KVM expose Hyper-V devices. - nested virtualization now supports VPID (same as PCID but for vCPUs) which makes it quite a bit faster - for future hardware that supports NVDIMM, there is support for clflushopt, clwb, pcommit - support for "split irqchip", i.e. LAPIC in kernel + IOAPIC/PIC/PIT in userspace, which reduces the attack surface of the hypervisor - obligatory smattering of SMM fixes - on the guest side, stable scheduler clock support was rewritten to not require help from the hypervisor. -----BEGIN PGP SIGNATURE----- Version: GnuPG v2.0.22 (GNU/Linux) iQEcBAABAgAGBQJWO2IQAAoJEL/70l94x66D/K0H/3AovAgYmJQToZlimsktMk6a f2xhdIqfU5lIQQh5uNBCfL3o9o8H9Py1ym7aEw3fmztPHHJYc91oTatt2UEKhmEw VtZHp/dFHt3hwaIdXmjRPEXiYctraKCyrhaUYdWmUYkoKi7lW5OL5h+S7frG2U6u p/hFKnHRZfXHr6NSgIqvYkKqtnc+C0FWY696IZMzgCksOO8jB1xrxoSN3tANW3oJ PDV+4og0fN/Fr1capJUFEc/fejREHneANvlKrLaa8ht0qJQutoczNADUiSFLcMPG iHljXeDsv5eyjMtUuIL8+MPzcrIt/y4rY41ZPiKggxULrXc6H+JJL/e/zThZpXc= =iv2z -----END PGP SIGNATURE----- Merge tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm Pull KVM updates from Paolo Bonzini: "First batch of KVM changes for 4.4. s390: A bunch of fixes and optimizations for interrupt and time handling. PPC: Mostly bug fixes. ARM: No big features, but many small fixes and prerequisites including: - a number of fixes for the arch-timer - introducing proper level-triggered semantics for the arch-timers - a series of patches to synchronously halt a guest (prerequisite for IRQ forwarding) - some tracepoint improvements - a tweak for the EL2 panic handlers - some more VGIC cleanups getting rid of redundant state x86: Quite a few changes: - support for VT-d posted interrupts (i.e. PCI devices can inject interrupts directly into vCPUs). This introduces a new component (in virt/lib/) that connects VFIO and KVM together. The same infrastructure will be used for ARM interrupt forwarding as well. - more Hyper-V features, though the main one Hyper-V synthetic interrupt controller will have to wait for 4.5. These will let KVM expose Hyper-V devices. - nested virtualization now supports VPID (same as PCID but for vCPUs) which makes it quite a bit faster - for future hardware that supports NVDIMM, there is support for clflushopt, clwb, pcommit - support for "split irqchip", i.e. LAPIC in kernel + IOAPIC/PIC/PIT in userspace, which reduces the attack surface of the hypervisor - obligatory smattering of SMM fixes - on the guest side, stable scheduler clock support was rewritten to not require help from the hypervisor" * tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm: (123 commits) KVM: VMX: Fix commit which broke PML KVM: x86: obey KVM_X86_QUIRK_CD_NW_CLEARED in kvm_set_cr0() KVM: x86: allow RSM from 64-bit mode KVM: VMX: fix SMEP and SMAP without EPT KVM: x86: move kvm_set_irq_inatomic to legacy device assignment KVM: device assignment: remove pointless #ifdefs KVM: x86: merge kvm_arch_set_irq with kvm_set_msi_inatomic KVM: x86: zero apic_arb_prio on reset drivers/hv: share Hyper-V SynIC constants with userspace KVM: x86: handle SMBASE as physical address in RSM KVM: x86: add read_phys to x86_emulate_ops KVM: x86: removing unused variable KVM: don't pointlessly leave KVM_COMPAT=y in non-KVM configs KVM: arm/arm64: Merge vgic_set_lr() and vgic_sync_lr_elrsr() KVM: arm/arm64: Clean up vgic_retire_lr() and surroundings KVM: arm/arm64: Optimize away redundant LR tracking KVM: s390: use simple switch statement as multiplexer KVM: s390: drop useless newline in debugging data KVM: s390: SCA must not cross page boundaries KVM: arm: Do not indent the arguments of DECLARE_BITMAP ...	2015-11-05 16:26:26 -08:00
Linus Torvalds	39cf7c3981	IOMMU Updates for Linux v4.4 This time including: * A new IOMMU driver for s390 pci devices * Common dma-ops support based on iommu-api for ARM64. The plan is to use this as a basis for ARM32 and hopefully other architectures as well in the future. * MSI support for ARM-SMMUv3 * Cleanups and dead code removal in the AMD IOMMU driver * Better RMRR handling for the Intel VT-d driver * Various other cleanups and small fixes -----BEGIN PGP SIGNATURE----- Version: GnuPG v2.0.22 (GNU/Linux) iQIcBAABAgAGBQJWOz7hAAoJECvwRC2XARrjbvYQALwtITTA5iTm0y/ApwNMxI7n pZpjZVPoBPNsGBc4t/MT8pVhUSdmpBOljbV4Y4CayL1mSSB6Bl2gooZjd66m7Z81 qMJYEVWhFQqVsIKkCSNOgaO7W5y+xt3rTgqN6vCu86/CCDfKrTPP/+CRl1T/z9bo 1J8ioM3KnZG9KzG8JuXYFg5wwbKToaBh6swSmj+O4U9hru7zV/ILP7ikcc9pyMji 12WbzCqchRORsJZD65xMRYAqRaPNN/3IlDejs00TOFhY3qpWgEgFUucyeRJBJ/+q K4U8T5vZsnr1a04l7/BeYbLmP7y/9Qv0N0xMGtTyoy/w/BieGqRWu4hHhqf/44NO EhCSXcEThMNCGTjP2VWC4dnQ/s7Y8OmSW9nCreUcFVxHoE5LfDoh8RngA2fpeNuS ixb3OwP+YXHN9Ck+1BQqQCeBznsPTLuDxlhRjCJsWntIfMSkXebOkz83YxyZ9b0Q gFvptfuknU7cotUwWa3dg8RiUB8kNlKJyEEByaVpWEbEOabnONKEMkstvuBx6Ots kA63wbe7QcPgbUYuq7g0nijDw6E2aEtf0nx2Xx4ZDL932qjg/xUkiBpmbDXHw4Gu nimNXVQtbCzF74SyTvxEtupiijOTm5eHtoKtg0mYnqPZ+V9eOwEvW8IHaFFf8XHD SecikoTtH1Q4RVtqOcAQ =jLlB -----END PGP SIGNATURE----- Merge tag 'iommu-updates-v4.4' of git://git.kernel.org/pub/scm/linux/kernel/git/joro/iommu Pull iommu updates from Joerg Roedel: "This time including: - A new IOMMU driver for s390 pci devices - Common dma-ops support based on iommu-api for ARM64. The plan is to use this as a basis for ARM32 and hopefully other architectures as well in the future. - MSI support for ARM-SMMUv3 - Cleanups and dead code removal in the AMD IOMMU driver - Better RMRR handling for the Intel VT-d driver - Various other cleanups and small fixes" * tag 'iommu-updates-v4.4' of git://git.kernel.org/pub/scm/linux/kernel/git/joro/iommu: (41 commits) iommu/vt-d: Fix return value check of parse_ioapics_under_ir() iommu/vt-d: Propagate error-value from ir_parse_ioapic_hpet_scope() iommu/vt-d: Adjust the return value of the parse_ioapics_under_ir iommu: Move default domain allocation to iommu_group_get_for_dev() iommu: Remove is_pci_dev() fall-back from iommu_group_get_for_dev iommu/arm-smmu: Switch to device_group call-back iommu/fsl: Convert to device_group call-back iommu: Add device_group call-back to x86 iommu drivers iommu: Add generic_device_group() function iommu: Export and rename iommu_group_get_for_pci_dev() iommu: Revive device_group iommu-ops call-back iommu/amd: Remove find_last_devid_on_pci() iommu/amd: Remove first/last_device handling iommu/amd: Initialize amd_iommu_last_bdf for DEV_ALL iommu/amd: Cleanup buffer allocation iommu/amd: Remove cmd_buf_size and evt_buf_size from struct amd_iommu iommu/amd: Align DTE flag definitions iommu/amd: Remove old alias handling code iommu/amd: Set alias DTE in do_attach/do_detach iommu/amd: WARN when __[attach\|detach]_device are called with irqs enabled ...	2015-11-05 16:12:10 -08:00
Linus Torvalds	e627078a0c	Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/s390/linux Pull s390 updates from Martin Schwidefsky: "There is only one new feature in this pull for the 4.4 merge window, most of it is small enhancements, cleanup and bug fixes: - Add the s390 backend for the software dirty bit tracking. This adds two new pgtable functions pte_clear_soft_dirty and pmd_clear_soft_dirty which is why there is a hit to arch/x86/include/asm/pgtable.h in this pull request. - A series of cleanup patches for the AP bus, this includes the removal of the support for two outdated crypto cards (PCICC and PCICA). - The irq handling / signaling on buffer full in the runtime instrumentation code is dropped. - Some micro optimizations: remove unnecessary memory barriers for a couple of functions: [smb_]rmb, [smb_]wmb, atomics, bitops, and for spin_unlock. Use the builtin bswap if available and make test_and_set_bit_lock more cache friendly. - Statistics and a tracepoint for the diagnose calls to the hypervisor. - The CPU measurement facility support to sample KVM guests is improved. - The vector instructions are now always enabled for user space processes if the hardware has the vector facility. This simplifies the FPU handling code. The fpu-internal.h header is split into fpu internals, api and types just like x86. - Cleanup and improvements for the common I/O layer. - Rework udelay to solve a problem with kprobe. udelay has busy loop semantics but still uses an idle processor state for the wait" * 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/s390/linux: (66 commits) s390: remove runtime instrumentation interrupts s390/cio: de-duplicate subchannel validation s390/css: unneeded initialization in for_each_subchannel s390/Kconfig: use builtin bswap s390/dasd: fix disconnected device with valid path mask s390/dasd: fix invalid PAV assignment after suspend/resume s390/dasd: fix double free in dasd_eckd_read_conf s390/kernel: fix ptrace peek/poke for floating point registers s390/cio: move ccw_device_stlck functions s390/cio: move ccw_device_call_handler s390/topology: reduce per_cpu() invocations s390/nmi: reduce size of percpu variable s390/nmi: fix terminology s390/nmi: remove casts s390/nmi: remove pointless error strings s390: don't store registers on disabled wait anymore s390: get rid of __set_psw_mask() s390/fpu: split fpu-internal.h into fpu internals, api, and type headers s390/dasd: fix list_del corruption after lcu changes s390/spinlock: remove unneeded serializations at unlock ...	2015-11-04 11:31:31 -08:00
Linus Torvalds	b0f85fa11a	Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-next Pull networking updates from David Miller: Changes of note: 1) Allow to schedule ICMP packets in IPVS, from Alex Gartrell. 2) Provide FIB table ID in ipv4 route dumps just as ipv6 does, from David Ahern. 3) Allow the user to ask for the statistics to be filtered out of ipv4/ipv6 address netlink dumps. From Sowmini Varadhan. 4) More work to pass the network namespace context around deep into various packet path APIs, starting with the netfilter hooks. From Eric W Biederman. 5) Add layer 2 TX/RX checksum offloading to qeth driver, from Thomas Richter. 6) Use usec resolution for SYN/ACK RTTs in TCP, from Yuchung Cheng. 7) Support Very High Throughput in wireless MESH code, from Bob Copeland. 8) Allow setting the ageing_time in switchdev/rocker. From Scott Feldman. 9) Properly autoload L2TP type modules, from Stephen Hemminger. 10) Fix and enable offload features by default in 8139cp driver, from David Woodhouse. 11) Support both ipv4 and ipv6 sockets in a single vxlan device, from Jiri Benc. 12) Fix CWND limiting of thin streams in TCP, from Bendik Rønning Opstad. 13) Fix IPSEC flowcache overflows on large systems, from Steffen Klassert. 14) Convert bridging to track VLANs using rhashtable entries rather than a bitmap. From Nikolay Aleksandrov. 15) Make TCP listener handling completely lockless, this is a major accomplishment. Incoming request sockets now live in the established hash table just like any other socket too. From Eric Dumazet. 15) Provide more bridging attributes to netlink, from Nikolay Aleksandrov. 16) Use hash based algorithm for ipv4 multipath routing, this was very long overdue. From Peter Nørlund. 17) Several y2038 cures, mostly avoiding timespec. From Arnd Bergmann. 18) Allow non-root execution of EBPF programs, from Alexei Starovoitov. 19) Support SO_INCOMING_CPU as setsockopt, from Eric Dumazet. This influences the port binding selection logic used by SO_REUSEPORT. 20) Add ipv6 support to VRF, from David Ahern. 21) Add support for Mellanox Spectrum switch ASIC, from Jiri Pirko. 22) Add rtl8xxxu Realtek wireless driver, from Jes Sorensen. 23) Implement RACK loss recovery in TCP, from Yuchung Cheng. 24) Support multipath routes in MPLS, from Roopa Prabhu. 25) Fix POLLOUT notification for listening sockets in AF_UNIX, from Eric Dumazet. 26) Add new QED Qlogic river, from Yuval Mintz, Manish Chopra, and Sudarsana Kalluru. 27) Don't fetch timestamps on AF_UNIX sockets, from Hannes Frederic Sowa. 28) Support ipv6 geneve tunnels, from John W Linville. 29) Add flood control support to switchdev layer, from Ido Schimmel. 30) Fix CHECKSUM_PARTIAL handling of potentially fragmented frames, from Hannes Frederic Sowa. 31) Support persistent maps and progs in bpf, from Daniel Borkmann. * git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-next: (1790 commits) sh_eth: use DMA barriers switchdev: respect SKIP_EOPNOTSUPP flag in case there is no recursion net: sched: kill dead code in sch_choke.c irda: Delete an unnecessary check before the function call "irlmp_unregister_service" net: dsa: mv88e6xxx: include DSA ports in VLANs net: dsa: mv88e6xxx: disable SA learning for DSA and CPU ports net/core: fix for_each_netdev_feature vlan: Invoke driver vlan hooks only if device is present arcnet/com20020: add LEDS_CLASS dependency bpf, verifier: annotate verbose printer with __printf dp83640: Only wait for timestamps for packets with timestamping enabled. ptp: Change ptp_class to a proper bitmask dp83640: Prune rx timestamp list before reading from it dp83640: Delay scheduled work. dp83640: Include hash in timestamp/packet matching ipv6: fix tunnel error handling net/mlx5e: Fix LSO vlan insertion net/mlx5e: Re-eanble client vlan TX acceleration net/mlx5e: Return error in case mlx5e_set_features() fails net/mlx5e: Don't allow more than max supported channels ...	2015-11-04 09:41:05 -08:00
Linus Torvalds	ccc9d4a6d6	Merge branch 'linus' of git://git.kernel.org/pub/scm/linux/kernel/git/herbert/crypto-2.6 Pull crypto update from Herbert Xu: "API: - Add support for cipher output IVs in testmgr - Add missing crypto_ahash_blocksize helper - Mark authenc and des ciphers as not allowed under FIPS. Algorithms: - Add CRC support to 842 compression - Add keywrap algorithm - A number of changes to the akcipher interface: + Separate functions for setting public/private keys. + Use SG lists. Drivers: - Add Intel SHA Extension optimised SHA1 and SHA256 - Use dma_map_sg instead of custom functions in crypto drivers - Add support for STM32 RNG - Add support for ST RNG - Add Device Tree support to exynos RNG driver - Add support for mxs-dcp crypto device on MX6SL - Add xts(aes) support to caam - Add ctr(aes) and xts(aes) support to qat - A large set of fixes from Russell King for the marvell/cesa driver" * 'linus' of git://git.kernel.org/pub/scm/linux/kernel/git/herbert/crypto-2.6: (115 commits) crypto: asymmetric_keys - Fix unaligned access in x509_get_sig_params() crypto: akcipher - Don't #include crypto/public_key.h as the contents aren't used hwrng: exynos - Add Device Tree support hwrng: exynos - Fix missing configuration after suspend to RAM hwrng: exynos - Add timeout for waiting on init done dt-bindings: rng: Describe Exynos4 PRNG bindings crypto: marvell/cesa - use __le32 for hardware descriptors crypto: marvell/cesa - fix missing cpu_to_le32() in mv_cesa_dma_add_op() crypto: marvell/cesa - use memcpy_fromio()/memcpy_toio() crypto: marvell/cesa - use gfp_t for gfp flags crypto: marvell/cesa - use dma_addr_t for cur_dma crypto: marvell/cesa - use readl_relaxed()/writel_relaxed() crypto: caam - fix indentation of close braces crypto: caam - only export the state we really need to export crypto: caam - fix non-block aligned hash calculation crypto: caam - avoid needlessly saving and restoring caam_hash_ctx crypto: caam - print errno code when hash registration fails crypto: marvell/cesa - fix memory leak crypto: marvell/cesa - fix first-fragment handling in mv_cesa_ahash_dma_last_req() crypto: marvell/cesa - rearrange handling for sw padded hashes ...	2015-11-04 09:11:12 -08:00
Paolo Bonzini	197a4f4b06	KVM/ARM Changes for v4.4-rc1 Includes a number of fixes for the arch-timer, introducing proper level-triggered semantics for the arch-timers, a series of patches to synchronously halt a guest (prerequisite for IRQ forwarding), some tracepoint improvements, a tweak for the EL2 panic handlers, some more VGIC cleanups getting rid of redundant state, and finally a stylistic change that gets rid of some ctags warnings. -----BEGIN PGP SIGNATURE----- Version: GnuPG v1 iQEcBAABAgAGBQJWOhgAAAoJEEtpOizt6ddyKS8H/2ZHMTPjo6yChnrusNWy4Qbr 6laPDlzL+g45oMQRwNL7GnM1deRftaxvT2Wi+X84D/6Y/BD6MPds4HgtBfuWcSZ1 CyRJ0Ot/zrxenucSuJuOjq+a9gdizdAczkbB1MfYDULJH8fb6D+7RYLo3zgh4Xo4 pla3L9U6gSWe+YopBjZtZH43m3fwiwSM/v+uHOTIcXrsbR+fEgx/EFSKmA/DUCuo P5cFO/ceUGu7nATCexu5V82TgR2hvurrsR7mqfwY8YcF6HRM+NEOoS29xWC77v5S u/F08TKuKQLv0YTEFTyLETI/oEeuC0cHtrRQBNf4+9kXEOzKyXaae0wR/I6X2Ss= =GMNk -----END PGP SIGNATURE----- Merge tag 'kvm-arm-for-4.4' of git://git.kernel.org/pub/scm/linux/kernel/git/kvmarm/kvmarm into HEAD KVM/ARM Changes for v4.4-rc1 Includes a number of fixes for the arch-timer, introducing proper level-triggered semantics for the arch-timers, a series of patches to synchronously halt a guest (prerequisite for IRQ forwarding), some tracepoint improvements, a tweak for the EL2 panic handlers, some more VGIC cleanups getting rid of redundant state, and finally a stylistic change that gets rid of some ctags warnings. Conflicts: arch/x86/include/asm/kvm_host.h	2015-11-04 16:24:17 +01:00
Martin Schwidefsky	b38feccd66	s390: remove runtime instrumentation interrupts The external interrupts for runtime instrumentation buffer-full and runtime instrumentation halted are unused and have no current user. Remove the support and ignore the second parameter of the s390_runtime_instr system call from now on. Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>	2015-11-03 14:40:51 +01:00
Christian Borntraeger	295d8fa961	s390/Kconfig: use builtin bswap Depending on the gcc version we can use builtin_bswap instead of architecture functions. Doing so is better than the inline assembly version of load reverse for two reasons: - the sequence of load reversed, apply constant mask, save reversed can be optimized to load, apply reversed mask, save - builtins are slightly better to optimize e.g. gcc instruction scheduler cannot optimize grouping on inline assemblies. To enable set we have to ARCH_USE_BUILTIN_BSWAP. bloat-o-meter results: add/remove: 1/1 grow/shrink: 75/533 up/down: 1711/-9394 (-7683) Suggested-by: David Woodhouse <David.Woodhouse@intel.com> Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com> Acked-by: Heiko Carstens <heiko.carstens@de.ibm.com> Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>	2015-11-03 14:40:48 +01:00
Martin Schwidefsky	55a423b6f1	s390/kernel: fix ptrace peek/poke for floating point registers git commit `155e839a81` "s390/kernel: dynamically allocate FP register save area" introduced a regression in regard to ptrace. If the vector register extension is not present or unused the ptrace peek of a floating pointer register return incorrect data and the ptrace poke to a floating pointer register overwrites the task structure starting at task->thread.fpu.fprs. Cc: stable@kernel.org # v4.3 Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>	2015-11-03 14:40:42 +01:00
Joerg Roedel	b67ad2f7c7	Merge branches 'x86/vt-d', 'arm/omap', 'arm/smmu', 's390', 'core' and 'x86/amd' into next Conflicts: drivers/iommu/amd_iommu_types.h	2015-11-02 20:03:34 +09:00
Christian Borntraeger	46b708ea87	KVM: s390: use simple switch statement as multiplexer We currently do some magic shifting (by exploiting that exit codes are always a multiple of 4) and a table lookup to jump into the exit handlers. This causes some calculations and checks, just to do an potentially expensive function call. Changing that to a switch statement gives the compiler the chance to inline and dynamically decide between jump tables or inline compare and branches. In addition it makes the code more readable. bloat-o-meter gives me a small reduction in code size: add/remove: 0/7 grow/shrink: 1/1 up/down: 986/-1334 (-348) function old new delta kvm_handle_sie_intercept 72 1058 +986 handle_prog 704 696 -8 handle_noop 54 - -54 handle_partial_execution 60 - -60 intercept_funcs 120 - -120 handle_instruction 198 - -198 handle_validity 210 - -210 handle_stop 316 - -316 handle_external_interrupt 368 - -368 Right now my gcc does conditional branches instead of jump tables. The inlining seems to give us enough cycles as some micro-benchmarking shows minimal improvements, but still in noise. Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com> Reviewed-by: Cornelia Huck <cornelia.huck@de.ibm.com>	2015-10-29 15:59:11 +01:00

... 2 3 4 5 6 ...

4525 Commits