Commit Graph

915790 Commits

Sean Christopherson
6129ed877d KVM: x86/mmu: Set mmio_value to '0' if reserved #PF can't be generated
Set the mmio_value to '0' instead of simply clearing the present bit to
squash a benign warning in kvm_mmu_set_mmio_spte_mask() that complains
about the mmio_value overlapping the lower GFN mask on systems with 52
bits of PA space.

Opportunistically clean up the code and comments.

Cc: stable@vger.kernel.org
Fixes: d43e2675e9 ("KVM: x86: only do L1TF workaround on affected processors")
Signed-off-by: Sean Christopherson <sean.j.christopherson@intel.com>
Message-Id: <20200527084909.23492-1-sean.j.christopherson@intel.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2020-05-27 13:06:45 -04:00
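
The decision above can be illustrated with a small standalone sketch; the
function name and the choice of bit 51 are only for illustration, not the
kernel's actual mask logic:

  #include <stdint.h>
  #include <stdio.h>

  /* Illustrative only: pick an MMIO marker value that lives in a bit the
   * CPU treats as reserved, or 0 if no such bit exists. */
  uint64_t pick_mmio_value(unsigned int maxphyaddr)
  {
      /* Bit 51 is the highest bit available for the marker; it only
       * triggers a reserved #PF if MAXPHYADDR is below 52. */
      if (maxphyaddr >= 52)
          return 0;

      return 1ull << 51;
  }

  int main(void)
  {
      printf("maxphyaddr 48: mmio_value = %#llx\n",
             (unsigned long long)pick_mmio_value(48));
      printf("maxphyaddr 52: mmio_value = %#llx\n",
             (unsigned long long)pick_mmio_value(52));
      return 0;
  }
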
Paolo Bonzini
d43e2675e9 KVM: x86: only do L1TF workaround on affected processors
KVM stores the gfn in MMIO SPTEs as a caching optimization.  These are split
in two parts, as in "[high 11111 low]", to thwart any attempt to use these bits
in an L1TF attack.  This works as long as there are 5 free bits between
MAXPHYADDR and bit 50 (inclusive), leaving bit 51 free so that the MMIO
access triggers a reserved-bit-set page fault.

The bit positions however were computed wrongly for AMD processors that have
encryption support.  In this case, x86_phys_bits is reduced (for example
from 48 to 43, to account for the C bit at position 47 and four bits used
internally to store the SEV ASID and other stuff) while x86_cache_bits
would remain set to 48, and _all_ bits between the reduced MAXPHYADDR
and bit 51 are set.  Then low_phys_bits would also cover some of the
bits that are set in the shadow_mmio_value, terribly confusing the gfn
caching mechanism.

To fix this, avoid splitting gfns as long as the processor does not have
the L1TF bug (which includes all AMD processors).  When there is no
splitting, low_phys_bits can be set to the reduced MAXPHYADDR removing
the overlap.  This fixes "npt=0" operation on EPYC processors.

Thanks to Maxim Levitsky for bisecting this bug.

Cc: stable@vger.kernel.org
Fixes: 52918ed5fc ("KVM: SVM: Override default MMIO mask if memory encryption is enabled")
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2020-05-19 05:47:06 -04:00
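
A standalone sketch of the "[high 11111 low]" split described above; the
boundaries are made up and the real kernel masks
(shadow_nonpresent_or_rsvd_mask and friends) are more involved:

  #include <stdint.h>
  #include <stdio.h>

  #define LOW_BITS    46                    /* gfn bits left in place */
  #define SEP_MASK    (0x1full << LOW_BITS) /* the five always-set bits */
  #define HIGH_SHIFT  5                     /* high gfn bits move up by 5 */

  static uint64_t encode(uint64_t gfn)
  {
      uint64_t low  = gfn & ((1ull << LOW_BITS) - 1);
      uint64_t high = gfn >> LOW_BITS;

      return low | SEP_MASK | (high << (LOW_BITS + HIGH_SHIFT));
  }

  static uint64_t decode(uint64_t spte)
  {
      uint64_t low  = spte & ((1ull << LOW_BITS) - 1);
      uint64_t high = spte >> (LOW_BITS + HIGH_SHIFT);

      return low | (high << LOW_BITS);
  }

  int main(void)
  {
      uint64_t gfn = 0x123456789abcdull;

      /* The round trip only works if the separator bits never overlap the
       * stored gfn halves -- exactly the overlap that broke on SEV hosts. */
      printf("gfn %#llx -> spte %#llx -> gfn %#llx\n",
             (unsigned long long)gfn,
             (unsigned long long)encode(gfn),
             (unsigned long long)decode(gfn));
      return 0;
  }
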
Jim Mattson
c4e0e4ab4c KVM: x86: Fix off-by-one error in kvm_vcpu_ioctl_x86_setup_mce
Bank_num is a one-based count of banks, not a zero-based index. It
overflows the allocated space only when strictly greater than
KVM_MAX_MCE_BANKS.

Fixes: a9e38c3e01 ("KVM: x86: Catch potential overrun in MCE setup")
Signed-off-by: Jue Wang <juew@google.com>
Signed-off-by: Jim Mattson <jmattson@google.com>
Reviewed-by: Peter Shier <pshier@google.com>
Message-Id: <20200511225616.19557-1-jmattson@google.com>
Reviewed-by: Vitaly Kuznetsov <vkuznets@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2020-05-15 13:48:56 -04:00
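
The corrected bound reads like the sketch below; KVM_MAX_MCE_BANKS is assumed
to be 32 here and the surrounding ioctl plumbing is omitted:

  #include <errno.h>
  #include <stdint.h>

  #define KVM_MAX_MCE_BANKS 32

  int setup_mce(uint64_t mcg_cap)
  {
      unsigned int bank_num = mcg_cap & 0xff;   /* one-based count, not an index */

      if (bank_num > KVM_MAX_MCE_BANKS)         /* exactly 32 banks is still fine */
          return -EINVAL;

      return 0;
  }
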
Paolo Bonzini
f6bfd9c8ff Merge branch 'kvm-amd-fixes' into HEAD
This topic branch will be included in both kvm/master and kvm/next
(for 5.8) in order to simplify testing of kvm/next.
2020-05-15 13:48:12 -04:00
Babu Moger
37486135d3 KVM: x86: Fix pkru save/restore when guest CR4.PKE=0, move it to x86.c
Though rdpkru and wrpkru are contingent upon CR4.PKE, the PKRU
resource isn't. It can be read with XSAVE and written with XRSTOR.
So, if we don't set the guest PKRU value here (kvm_load_guest_xsave_state),
the guest can read the host value.

In case of kvm_load_host_xsave_state, guest with CR4.PKE clear could
potentially use XRSTOR to change the host PKRU value.

While at it, move pkru state save/restore to common code and the
host_pkru field to kvm_vcpu_arch.  This will let SVM support protection keys.

Cc: stable@vger.kernel.org
Reported-by: Jim Mattson <jmattson@google.com>
Signed-off-by: Babu Moger <babu.moger@amd.com>
Message-Id: <158932794619.44260.14508381096663848853.stgit@naples-babu.amd.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2020-05-13 11:27:41 -04:00
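
A rough user-space model of the swap described above; rdpkru()/wrpkru() stand
in for the real instructions and the field names are assumptions, not the
kernel's code:

  #include <stdbool.h>
  #include <stdint.h>

  struct vcpu {
      uint32_t guest_pkru;
      uint32_t host_pkru;
      bool     cr4_pke;       /* guest CR4.PKE */
      bool     xsaves_pkru;   /* PKRU reachable via XSAVE/XRSTOR */
  };

  static uint32_t pkru_reg;                        /* models the register */
  static uint32_t rdpkru(void)       { return pkru_reg; }
  static void     wrpkru(uint32_t v) { pkru_reg = v; }

  void load_guest_xsave_state(struct vcpu *v)
  {
      /* Even with CR4.PKE=0 the guest can read PKRU through XSAVE, so
       * load the guest value whenever the state is guest-reachable. */
      if ((v->cr4_pke || v->xsaves_pkru) && v->guest_pkru != v->host_pkru)
          wrpkru(v->guest_pkru);
  }

  void load_host_xsave_state(struct vcpu *v)
  {
      if (v->cr4_pke || v->xsaves_pkru) {
          v->guest_pkru = rdpkru();      /* guest may have used XRSTOR */
          if (v->guest_pkru != v->host_pkru)
              wrpkru(v->host_pkru);
      }
  }
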
Suravee Suthikulpanit
7d611233b0 KVM: SVM: Disable AVIC before setting V_IRQ
The commit 64b5bd2704 ("KVM: nSVM: ignore L1 interrupt window
while running L2 with V_INTR_MASKING=1") introduced a WARN_ON,
which checks if AVIC is enabled when trying to set V_IRQ
in the VMCB for enabling irq window.

The following warning is triggered because the requesting vcpu
(to deactivate AVIC) does not get to process APICv update request
for itself until the next #vmexit.

WARNING: CPU: 0 PID: 118232 at arch/x86/kvm/svm/svm.c:1372 enable_irq_window+0x6a/0xa0 [kvm_amd]
 RIP: 0010:enable_irq_window+0x6a/0xa0 [kvm_amd]
 Call Trace:
  kvm_arch_vcpu_ioctl_run+0x6e3/0x1b50 [kvm]
  ? kvm_vm_ioctl_irq_line+0x27/0x40 [kvm]
  ? _copy_to_user+0x26/0x30
  ? kvm_vm_ioctl+0xb3e/0xd90 [kvm]
  ? set_next_entity+0x78/0xc0
  kvm_vcpu_ioctl+0x236/0x610 [kvm]
  ksys_ioctl+0x8a/0xc0
  __x64_sys_ioctl+0x1a/0x20
  do_syscall_64+0x58/0x210
  entry_SYSCALL_64_after_hwframe+0x44/0xa9

Fix this by sending an APICv update request to all other vcpus, and
immediately updating APICv for the requesting vcpu itself.

Signed-off-by: Suravee Suthikulpanit <suravee.suthikulpanit@amd.com>
Link: https://lkml.org/lkml/2020/5/2/167
Fixes: 64b5bd2704 ("KVM: nSVM: ignore L1 interrupt window while running L2 with V_INTR_MASKING=1")
Message-Id: <1588818939-54264-1-git-send-email-suravee.suthikulpanit@amd.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2020-05-08 07:44:32 -04:00
Suravee Suthikulpanit
54163a346d KVM: Introduce kvm_make_all_cpus_request_except()
This allows making a request to all other vcpus except the one
specified in the parameter.

Signed-off-by: Suravee Suthikulpanit <suravee.suthikulpanit@amd.com>
Message-Id: <1588771076-73790-2-git-send-email-suravee.suthikulpanit@amd.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2020-05-08 07:44:32 -04:00
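
The shape of the new helper is essentially the loop below; types and the
request/kick mechanics are simplified placeholders, not the real API:

  struct vcpu { unsigned long requests; };

  struct vm {
      struct vcpu *vcpus;
      int          nr_vcpus;
  };

  void make_all_cpus_request_except(struct vm *vm, unsigned int req,
                                    struct vcpu *except)
  {
      for (int i = 0; i < vm->nr_vcpus; i++) {
          struct vcpu *v = &vm->vcpus[i];

          if (v == except)            /* the caller handles itself directly */
              continue;
          v->requests |= 1ul << req;  /* real code also kicks the vCPU */
      }
  }
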
Paolo Bonzini
45981dedf5 KVM: VMX: pass correct DR6 for GD userspace exit
When KVM_EXIT_DEBUG is raised for the disabled-breakpoints case (DR7.GD),
DR6 was incorrectly copied from the value in the VM.  Instead,
DR6.BD should be set in order to catch this case.

On AMD this does not need any special code because the processor triggers
a #DB exception that is intercepted.  However, the testcase would fail
without the previous patch because both DR6.BS and DR6.BD would be set.

Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2020-05-08 07:44:31 -04:00
Paolo Bonzini
d67668e9dd KVM: x86, SVM: isolate vcpu->arch.dr6 from vmcb->save.dr6
There are two issues with KVM_EXIT_DEBUG on AMD, whose root cause is the
different handling of DR6 on intercepted #DB exceptions on Intel and AMD.

On Intel, #DB exceptions transmit the DR6 value via the exit qualification
field of the VMCS, and the exit qualification only contains the description
of the precise event that caused a vmexit.

On AMD, instead the DR6 field of the VMCB is filled in as if the #DB exception
was to be injected into the guest.  This has two effects when guest debugging
is in use:

* the guest DR6 is clobbered

* the kvm_run->debug.arch.dr6 field can accumulate more debug events, rather
than just the last one that happened (the testcase in the next patch covers
this issue).

This patch fixes both issues by emulating, so to speak, the Intel behavior
on AMD processors.  The important observation is that (after the previous
patches) the VMCB value of DR6 is only ever observable from the guest if
KVM_DEBUGREG_WONT_EXIT is set.  Therefore we can actually set vmcb->save.dr6
to any value we want as long as KVM_DEBUGREG_WONT_EXIT is clear, which it
will be if guest debugging is enabled.

Therefore it is possible to enter the guest with an all-zero DR6,
reconstruct the #DB payload from the DR6 we get at exit time, and let
kvm_deliver_exception_payload move the newly set bits into vcpu->arch.dr6.
Some extra bits may be included in the payload if KVM_DEBUGREG_WONT_EXIT
is set, but this is harmless.

This may not be the most optimized way to deal with this, but it is
simple and, being confined within SVM code, it gets rid of the set_dr6
callback and kvm_update_dr6.

Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2020-05-08 07:44:31 -04:00
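
The "enter with a clean DR6, rebuild the payload at exit" idea reduces to
something like the toy function below; DR6_FIXED_1 is KVM's constant for the
architecturally always-set bits, the rest is invented for the example:

  #include <stdint.h>

  #define DR6_FIXED_1 0xfffe0ff0ull

  struct vmcb_save { uint64_t dr6; };

  uint64_t db_payload_from_exit(struct vmcb_save *save)
  {
      /* Whatever the CPU set on top of the baseline is what this #DB
       * reported; it gets merged into vcpu->arch.dr6 by the payload
       * delivery path. */
      uint64_t payload = save->dr6 & ~DR6_FIXED_1;

      save->dr6 = 0;    /* start from scratch before the next entry */
      return payload;
  }
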
Paolo Bonzini
5679b803e4 KVM: SVM: keep DR6 synchronized with vcpu->arch.dr6
kvm_x86_ops.set_dr6 is only ever called with vcpu->arch.dr6 as the
second argument.  Ensure that the VMCB value is synchronized to
vcpu->arch.dr6 on #DB (both "normal" and nested) and nested vmentry, so
that the current value of DR6 is always available in vcpu->arch.dr6.
The get_dr6 callback can just access vcpu->arch.dr6 and becomes redundant.

Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2020-05-08 07:43:47 -04:00
Paolo Bonzini
2c19dba680 KVM: nSVM: trap #DB and #BP to userspace if guest debugging is on
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2020-05-07 07:45:16 -04:00
Peter Xu
449aa906e6 KVM: selftests: Add KVM_SET_GUEST_DEBUG test
Covers fundamental tests for KVM_SET_GUEST_DEBUG. It is very close to the debug
test in kvm-unit-test, but doing it from outside the guest.

Signed-off-by: Peter Xu <peterx@redhat.com>
Message-Id: <20200505205000.188252-4-peterx@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2020-05-07 06:13:42 -04:00
Peter Xu
d5d260c5ff KVM: X86: Fix single-step with KVM_SET_GUEST_DEBUG
When single-step triggered with KVM_SET_GUEST_DEBUG, we should fill in the pc
value with current linear RIP rather than the cached singlestep address.

Signed-off-by: Peter Xu <peterx@redhat.com>
Message-Id: <20200505205000.188252-3-peterx@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2020-05-07 06:13:41 -04:00
Peter Xu
13196638d5 KVM: X86: Set RTM for DB_VECTOR too for KVM_EXIT_DEBUG
RTM should always be set even with KVM_EXIT_DEBUG on #DB.

Signed-off-by: Peter Xu <peterx@redhat.com>
Message-Id: <20200505205000.188252-2-peterx@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2020-05-07 06:13:41 -04:00
Paolo Bonzini
4d5523cfd5 KVM: x86: fix DR6 delivery for various cases of #DB injection
Go through kvm_queue_exception_p so that the payload is correctly delivered
through the exit qualification, and add a kvm_update_dr6 call to
kvm_deliver_exception_payload that is needed on AMD.

Reported-by: Peter Xu <peterx@redhat.com>
Reviewed-by: Peter Xu <peterx@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2020-05-07 06:13:41 -04:00
Peter Xu
b9b2782cd5 KVM: X86: Declare KVM_CAP_SET_GUEST_DEBUG properly
KVM_CAP_SET_GUEST_DEBUG should be supported for x86; however, it's not declared
as supported.  My wild guess is that userspaces like QEMU are using "#ifdef
KVM_CAP_SET_GUEST_DEBUG" to check for the capability instead, but that could be
wrong because the compilation host may not be the runtime host.

Userspace might still want to keep the old "#ifdef" though, to avoid breaking
guest debug on old kernels.

Signed-off-by: Peter Xu <peterx@redhat.com>
Message-Id: <20200505154750.126300-1-peterx@redhat.com>
[Do the same for PPC and s390. - Paolo]
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2020-05-07 06:13:40 -04:00
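
The compile-time vs. run-time distinction made above, as a minimal user-space
probe (error handling trimmed):

  #include <fcntl.h>
  #include <stdio.h>
  #include <sys/ioctl.h>
  #include <linux/kvm.h>

  int main(void)
  {
      int kvm = open("/dev/kvm", O_RDWR);

      if (kvm < 0) {
          perror("open /dev/kvm");
          return 1;
      }

  #ifdef KVM_CAP_SET_GUEST_DEBUG
      /* The #ifdef only proves the build host's headers know the name;
       * only the running kernel can say whether it is supported. */
      int ret = ioctl(kvm, KVM_CHECK_EXTENSION, KVM_CAP_SET_GUEST_DEBUG);

      printf("KVM_CAP_SET_GUEST_DEBUG: %s\n",
             ret > 0 ? "supported" : "not supported");
  #else
      printf("headers too old to even ask\n");
  #endif
      return 0;
  }
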
Paolo Bonzini
2673cb6849 KVM: s390: Fix for running nested under z/VM
Merge tag 'kvm-s390-master-5.7-3' of git://git.kernel.org/pub/scm/linux/kernel/git/kvms390/linux into HEAD

KVM: s390: Fix for running nested under z/VM

There are circumstances when running nested under z/VM that would trigger a
WARN_ON_ONCE. Remove the WARN_ON_ONCE. Long term we certainly want to make this
code more robust and flexible, but just returning instead of WARNING makes
the guest bootable again.
2020-05-06 08:09:17 -04:00
Peter Xu
495907ec36 KVM: X86: Declare KVM_CAP_SET_GUEST_DEBUG properly
KVM_CAP_SET_GUEST_DEBUG should be supported for x86; however, it's not declared
as supported.  My wild guess is that userspaces like QEMU are using "#ifdef
KVM_CAP_SET_GUEST_DEBUG" to check for the capability instead, but that could be
wrong because the compilation host may not be the runtime host.

Userspace might still want to keep the old "#ifdef" though, to avoid breaking
guest debug on old kernels.

Signed-off-by: Peter Xu <peterx@redhat.com>
Message-Id: <20200505154750.126300-1-peterx@redhat.com>
[Do the same for PPC and s390. - Paolo]
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2020-05-06 06:51:38 -04:00
Peter Xu
8ffdaf9155 KVM: selftests: Fix build for evmcs.h
I got this error when building kvm selftests:

/usr/bin/ld: /home/xz/git/linux/tools/testing/selftests/kvm/libkvm.a(vmx.o):/home/xz/git/linux/tools/testing/selftests/kvm/include/evmcs.h:222: multiple definition of `current_evmcs'; /tmp/cco1G48P.o:/home/xz/git/linux/tools/testing/selftests/kvm/include/evmcs.h:222: first defined here
/usr/bin/ld: /home/xz/git/linux/tools/testing/selftests/kvm/libkvm.a(vmx.o):/home/xz/git/linux/tools/testing/selftests/kvm/include/evmcs.h:223: multiple definition of `current_vp_assist'; /tmp/cco1G48P.o:/home/xz/git/linux/tools/testing/selftests/kvm/include/evmcs.h:223: first defined here

I think it's because evmcs.h is included both in a test file and a lib file, so
the variables defined there end up with multiple definitions at link time.  After
all, it's not a good habit to define variables in header files.

Cc: Vitaly Kuznetsov <vkuznets@redhat.com>
Signed-off-by: Peter Xu <peterx@redhat.com>
Message-Id: <20200504220607.99627-1-peterx@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2020-05-06 06:51:36 -04:00
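
The usual cure for this class of error is to declare the object in the header
and define it in exactly one translation unit, as in the generic illustration
below (the actual selftests fix may differ in detail):

  /* shared header (e.g. evmcs.h): declaration only, safe to include
   * from every .c file */
  extern struct hv_enlightened_vmcs *current_evmcs;

  /* exactly one .c file: the single definition the linker will see */
  struct hv_enlightened_vmcs *current_evmcs;
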
Paolo Bonzini
139f7425fd kvm: x86: Use KVM CPU capabilities to determine CR4 reserved bits
Using CPUID data can be useful for the processor compatibility
check, but that's it.  Using it to compute guest-reserved bits
can have both false positives (such as LA57 and UMIP which we
are already handling) and false negatives: in particular, with
this patch we no longer allow a KVM guest to set CR4.PKE
when CR4.PKE is clear on the host.

Fixes: b9dd21e104 ("KVM: x86: simplify handling of PKRU")
Reported-by: Jim Mattson <jmattson@google.com>
Tested-by: Jim Mattson <jmattson@google.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2020-05-06 06:51:36 -04:00
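
A toy version of "derive the guest-reserved CR4 bits from what KVM supports,
not from raw host CPUID"; the capability check is stubbed out and only three
bits are shown:

  #include <stdbool.h>
  #include <stdint.h>

  #define CR4_UMIP  (1ull << 11)
  #define CR4_LA57  (1ull << 12)
  #define CR4_PKE   (1ull << 22)

  enum { F_UMIP, F_LA57, F_PKU };

  static bool kvm_cpu_cap_has(int feature)
  {
      (void)feature;
      return true;          /* stub: pretend everything is supported */
  }

  uint64_t cr4_guest_reserved_bits(void)
  {
      uint64_t reserved = 0;

      /* A bit is reserved for the guest unless KVM itself can virtualize
       * the feature, regardless of what raw host CPUID says. */
      if (!kvm_cpu_cap_has(F_UMIP))
          reserved |= CR4_UMIP;
      if (!kvm_cpu_cap_has(F_LA57))
          reserved |= CR4_LA57;
      if (!kvm_cpu_cap_has(F_PKU))
          reserved |= CR4_PKE;

      return reserved;
  }
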
Sean Christopherson
c7cb2d650c KVM: VMX: Explicitly clear RFLAGS.CF and RFLAGS.ZF in VM-Exit RSB path
Clear CF and ZF in the VM-Exit path after doing __FILL_RETURN_BUFFER so
that KVM doesn't interpret clobbered RFLAGS as a VM-Fail.  Filling the
RSB has always clobbered RFLAGS; its current incarnation just happens to
clear CF and ZF in the process.  Relying on the macro to clear CF and
ZF is extremely fragile, e.g. commit 089dd8e531 ("x86/speculation:
Change FILL_RETURN_BUFFER to work with objtool") tweaks the loop such
that the ZF flag is always set.

Reported-by: Qian Cai <cai@lca.pw>
Cc: Rick Edgecombe <rick.p.edgecombe@intel.com>
Cc: Peter Zijlstra (Intel) <peterz@infradead.org>
Cc: Josh Poimboeuf <jpoimboe@redhat.com>
Cc: stable@vger.kernel.org
Fixes: f2fde6a5bc ("KVM: VMX: Move RSB stuffing to before the first RET after VM-Exit")
Signed-off-by: Sean Christopherson <sean.j.christopherson@intel.com>
Message-Id: <20200506035355.2242-1-sean.j.christopherson@intel.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2020-05-06 06:51:35 -04:00
Kashyap Chamarthy
27abe57770 docs/virt/kvm: Document configuring and running nested guests
This is a rewrite of this[1] Wiki page with further enhancements.  The
doc also includes a section on debugging problems in nested
environments, among other improvements.

[1] https://www.linux-kvm.org/page/Nested_Guests

Signed-off-by: Kashyap Chamarthy <kchamart@redhat.com>
Message-Id: <20200505112839.30534-1-kchamart@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2020-05-06 05:45:47 -04:00
Christian Borntraeger
5615e74f48 KVM: s390: Remove false WARN_ON_ONCE for the PQAP instruction
In LPAR we will only get an intercept for FC==3 for the PQAP
instruction. Running nested under z/VM can result in other intercepts as
well, because ECA_APIE is an effective bit: if one hypervisor layer has
turned this bit off, the end result will be that we will get intercepts for
all function codes. Usually the first one will be a query like PQAP(QCI).
So the WARN_ON_ONCE is not right. Let us simply remove it.

Cc: Pierre Morel <pmorel@linux.ibm.com>
Cc: Tony Krowiak <akrowiak@linux.ibm.com>
Cc: stable@vger.kernel.org # v5.3+
Fixes: e5282de931 ("s390: ap: kvm: add PQAP interception for AQIC")
Link: https://lore.kernel.org/kvm/20200505083515.2720-1-borntraeger@de.ibm.com
Reported-by: Qian Cai <cailca@icloud.com>
Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>
Reviewed-by: David Hildenbrand <david@redhat.com>
Reviewed-by: Cornelia Huck <cohuck@redhat.com>
Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>
2020-05-05 11:15:05 +02:00
Paolo Bonzini
8be8f932e3 kvm: ioapic: Restrict lazy EOI update to edge-triggered interrupts
Commit f458d039db ("kvm: ioapic: Lazy update IOAPIC EOI") introduces
the following infinite loop:

BUG: stack guard page was hit at 000000008f595917 \
(stack is 00000000bdefe5a4..00000000ae2b06f5)
kernel stack overflow (double-fault): 0000 [#1] SMP NOPTI
RIP: 0010:kvm_set_irq+0x51/0x160 [kvm]
Call Trace:
 irqfd_resampler_ack+0x32/0x90 [kvm]
 kvm_notify_acked_irq+0x62/0xd0 [kvm]
 kvm_ioapic_update_eoi_one.isra.0+0x30/0x120 [kvm]
 ioapic_set_irq+0x20e/0x240 [kvm]
 kvm_ioapic_set_irq+0x5c/0x80 [kvm]
 kvm_set_irq+0xbb/0x160 [kvm]
 ? kvm_hv_set_sint+0x20/0x20 [kvm]
 irqfd_resampler_ack+0x32/0x90 [kvm]
 kvm_notify_acked_irq+0x62/0xd0 [kvm]
 kvm_ioapic_update_eoi_one.isra.0+0x30/0x120 [kvm]
 ioapic_set_irq+0x20e/0x240 [kvm]
 kvm_ioapic_set_irq+0x5c/0x80 [kvm]
 kvm_set_irq+0xbb/0x160 [kvm]
 ? kvm_hv_set_sint+0x20/0x20 [kvm]
....

The re-entrancy happens because the irq state is the OR of
the interrupt state and the resamplefd state.  That is, we don't
want to show the state as 0 until we've had a chance to set the
resamplefd.  But if the interrupt has _not_ gone low then
ioapic_set_irq is invoked again, causing an infinite loop.

This can only happen for a level-triggered interrupt, otherwise
irqfd_inject would immediately set the KVM_USERSPACE_IRQ_SOURCE_ID high
and then low.  Fortunately, in the case of level-triggered interrupts the VMEXIT already happens because
TMR is set.  Thus, fix the bug by restricting the lazy invocation
of the ack notifier to edge-triggered interrupts, the only ones that
need it.

Tested-by: Suravee Suthikulpanit <suravee.suthikulpanit@amd.com>
Reported-by: borisvk@bstnet.org
Suggested-by: Paolo Bonzini <pbonzini@redhat.com>
Link: https://www.spinics.net/lists/kvm/msg213512.html
Fixes: f458d039db ("kvm: ioapic: Lazy update IOAPIC EOI")
Bugzilla: https://bugzilla.kernel.org/show_bug.cgi?id=207489
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2020-05-04 12:29:05 -04:00
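
The fix boils down to a trigger-mode gate of the following shape; the names
are placeholders for the ioapic redirection-entry fields:

  #include <stdbool.h>

  enum trig_mode { EDGE_TRIG = 0, LEVEL_TRIG = 1 };

  struct redir_entry { enum trig_mode trig; };

  bool may_do_lazy_eoi_update(const struct redir_entry *e)
  {
      /* Level-triggered pins already force a vmexit via the TMR, and the
       * resamplefd re-entry loop described above only bites there. */
      return e->trig == EDGE_TRIG;
  }
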
Suravee Suthikulpanit
637543a8d6 KVM: x86: Fixes posted interrupt check for IRQs delivery modes
Current logic incorrectly uses the enum ioapic_irq_destination_types
to check the posted interrupt destination types. However, the value was
set using APIC_DM_XXX macros, which are left-shifted by 8 bits.

Fix this by using APIC_DM_FIXED and APIC_DM_LOWEST instead.

Fixes: fdcf756213 ("KVM: x86: Disable posted interrupts for non-standard IRQs delivery modes")
Cc: Alexander Graf <graf@amazon.com>
Signed-off-by: Suravee Suthikulpanit <suravee.suthikulpanit@amd.com>
Message-Id: <1586239989-58305-1-git-send-email-suravee.suthikulpanit@amd.com>
Reviewed-by: Maxim Levitsky <mlevitsk@redhat.com>
Tested-by: Maxim Levitsky <mlevitsk@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2020-05-04 12:16:51 -04:00
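
The comparison the fix switches to, with the APIC_DM_* values already shifted
into bits 8-10 as in the kernel's apicdef.h (a trimmed sketch, not the full
routine):

  #include <stdbool.h>
  #include <stdint.h>

  #define APIC_DM_FIXED   0x00000u
  #define APIC_DM_LOWEST  0x00100u

  bool posted_interrupt_can_deliver(uint32_t delivery_mode)
  {
      /* delivery_mode was set from APIC_DM_* macros, so compare against
       * those, not against the unshifted ioapic destination enum. */
      return delivery_mode == APIC_DM_FIXED ||
             delivery_mode == APIC_DM_LOWEST;
  }
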
Paolo Bonzini
7134fa0709 KVM/arm fixes for Linux 5.7, take #2
Merge tag 'kvmarm-fixes-5.7-2' of git://git.kernel.org/pub/scm/linux/kernel/git/kvmarm/kvmarm into kvm-master

KVM/arm fixes for Linux 5.7, take #2

- Fix compilation with Clang
- Correctly initialize GICv4.1 in the absence of a virtual ITS
- Move SP_EL0 save/restore to the guest entry/exit code
- Handle PC wrap around on 32bit guests, and narrow all 32bit
  registers on userspace access
2020-05-04 12:01:37 -04:00
Paolo Bonzini
9e5e19f585 KVM/arm fixes for Linux 5.7, take #1
Merge tag 'kvmarm-fixes-5.7-1' of git://git.kernel.org/pub/scm/linux/kernel/git/kvmarm/kvmarm into kvm-master

KVM/arm fixes for Linux 5.7, take #1

- Prevent the userspace API from interacting directly with the HW
  stage of the virtual GIC
- Fix a couple of vGIC memory leaks
- Tighten the rules around the use of the 32bit PSCI functions
  for 64bit guest, as well as the opposite situation (matches the
  specification)
2020-05-04 12:01:12 -04:00
Paolo Bonzini
dee919d15d KVM: SVM: fill in kvm_run->debug.arch.dr[67]
The corresponding code was added for VMX in commit 42dbaa5a05
("KVM: x86: Virtualize debug registers, 2008-12-15) but never for AMD.
Fix this.

Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2020-05-04 11:59:03 -04:00
Sean Christopherson
f9336e3281 KVM: nVMX: Replace a BUG_ON(1) with BUG() to squash clang warning
Use BUG() in the impossible-to-hit default case when switching on the
scope of INVEPT to squash a warning with clang 11 due to clang treating
the BUG_ON() as conditional.

  >> arch/x86/kvm/vmx/nested.c:5246:3: warning: variable 'roots_to_free'
     is used uninitialized whenever 'if' condition is false
     [-Wsometimes-uninitialized]
                   BUG_ON(1);

Reported-by: kbuild test robot <lkp@intel.com>
Fixes: ce8fe7b77b ("KVM: nVMX: Free only the affected contexts when emulating INVEPT")
Signed-off-by: Sean Christopherson <sean.j.christopherson@intel.com>
Message-Id: <20200504153506.28898-1-sean.j.christopherson@intel.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2020-05-04 11:58:55 -04:00
Marc Zyngier
0225fd5e0a KVM: arm64: Fix 32bit PC wrap-around
In the unlikely event that a 32bit vcpu traps into the hypervisor
on an instruction that is located right at the end of the 32bit
range, the emulation of that instruction is going to increment
PC past the 32bit range. This isn't great, as userspace can then
observe this value and get a bit confused.

Conversely, userspace can do things like (in the context of a 64bit
guest that is capable of 32bit EL0) setting PSTATE to AArch64-EL0,
set PC to a 64bit value, change PSTATE to AArch32-USR, and observe
that PC hasn't been truncated. More confusion.

Fix both by:
- truncating PC increments for 32bit guests
- sanitizing all 32bit regs every time a core reg is changed by
  userspace and PSTATE indicates a 32bit mode.

Cc: stable@vger.kernel.org
Acked-by: Will Deacon <will@kernel.org>
Signed-off-by: Marc Zyngier <maz@kernel.org>
2020-05-01 09:51:08 +01:00
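
The first half of the fix amounts to truncating the PC increment for 32-bit
guests, roughly as below (fixed 4-byte step, everything else simplified):

  #include <stdbool.h>
  #include <stdint.h>

  uint64_t advance_pc(uint64_t pc, bool is_32bit_guest)
  {
      uint64_t next = pc + 4;

      /* A 32-bit guest's PC must wrap within 32 bits instead of
       * spilling into bit 32 and above. */
      return is_32bit_guest ? (uint32_t)next : next;
  }
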
Marc Zyngier
958e8e14fd KVM: arm64: vgic-v4: Initialize GICv4.1 even in the absence of a virtual ITS
KVM now expects to be able to use HW-accelerated delivery of vSGIs
as soon as the guest has enabled them. Unfortunately, we only
initialize the GICv4 context if we have a virtual ITS exposed to
the guest.

Fix it by always initializing the GICv4.1 context if it is
available on the host.

Fixes: 2291ff2f2a ("KVM: arm64: GICv4.1: Plumb SGI implementation selection in the distributor")
Reviewed-by: Zenghui Yu <yuzenghui@huawei.com>
Signed-off-by: Marc Zyngier <maz@kernel.org>
2020-04-30 12:50:23 +01:00
Marc Zyngier
6e977984f6 KVM: arm64: Save/restore sp_el0 as part of __guest_enter
We currently save/restore sp_el0 in C code. This is a bit unsafe,
as a lot of the C code expects 'current' to be accessible from
there (and the opportunity to run kernel code in HYP is especially
great with VHE).

Instead, let's move the save/restore of sp_el0 to the assembly
code (in __guest_enter), making sure that sp_el0 is correct
very early on when we exit the guest, and is preserved as long
as possible to its host value when we enter the guest.

Reviewed-by: Andrew Jones <drjones@redhat.com>
Acked-by: Mark Rutland <mark.rutland@arm.com>
Signed-off-by: Marc Zyngier <maz@kernel.org>
2020-04-30 12:01:35 +01:00
Fangrui Song
6aea9e0503 KVM: arm64: Delete duplicated label in invalid_vector
SYM_CODE_START defines \label, so it is redundant to define \label again.
A redefinition at the same place is accepted by GNU as
(https://sourceware.org/git/?p=binutils-gdb.git;a=commit;h=159fbb6088f17a341bcaaac960623cab881b4981)
but rejected by the clang integrated assembler.

Fixes: 617a2f392c ("arm64: kvm: Annotate assembly using modern annoations")
Signed-off-by: Fangrui Song <maskray@google.com>
Signed-off-by: Marc Zyngier <maz@kernel.org>
Tested-by: Nick Desaulniers <ndesaulniers@google.com>
Reviewed-by: Nick Desaulniers <ndesaulniers@google.com>
Link: https://github.com/ClangBuiltLinux/linux/issues/988
Link: https://lore.kernel.org/r/20200413231016.250737-1-maskray@google.com
2020-04-30 11:17:00 +01:00
Marc Zyngier
446c0768f5 Merge branch 'kvm-arm64/vgic-fixes-5.7' into kvmarm-master/master 2020-04-23 16:27:33 +01:00
Marc Zyngier
66f63474da Merge branch 'kvm-arm64/psci-fixes-5.7' into kvmarm-master/master 2020-04-23 16:27:26 +01:00
Zenghui Yu
57bdb436ce KVM: arm64: vgic-its: Fix memory leak on the error path of vgic_add_lpi()
If we're going to fail out of vgic_add_lpi(), let's make sure the
allocated vgic_irq memory is also freed. Though it seems that both
cases are unlikely to fail.

Signed-off-by: Zenghui Yu <yuzenghui@huawei.com>
Signed-off-by: Marc Zyngier <maz@kernel.org>
Link: https://lore.kernel.org/r/20200414030349.625-3-yuzenghui@huawei.com
2020-04-23 16:26:56 +01:00
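
The generic shape of the leak fix, with placeholder names standing in for the
vgic structures and a stubbed-out lookup:

  #include <errno.h>
  #include <stdlib.h>

  struct lpi { int intid; };

  static int lookup_target_vcpu(int intid)
  {
      return intid < 8192 ? -EINVAL : 0;   /* stand-in for the real lookup */
  }

  struct lpi *add_lpi(int intid, int *err)
  {
      struct lpi *irq = calloc(1, sizeof(*irq));

      if (!irq) {
          *err = -ENOMEM;
          return NULL;
      }
      irq->intid = intid;

      *err = lookup_target_vcpu(intid);
      if (*err) {
          free(irq);        /* the allocation must not leak on this path */
          return NULL;
      }
      return irq;
  }
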
Zenghui Yu
969ce8b526 KVM: arm64: vgic-v3: Retire all pending LPIs on vcpu destroy
It's likely that the vcpu fails to handle all virtual interrupts if
userspace decides to destroy it, leaving the pending ones in the
ap_list. If the unhandled one is an LPI, its vgic_irq structure will
be eventually leaked because of an extra refcount increment in
vgic_queue_irq_unlock().

This was detected by kmemleak on almost every guest destroy, the
backtrace is as follows:

unreferenced object 0xffff80725aed5500 (size 128):
comm "CPU 5/KVM", pid 40711, jiffies 4298024754 (age 166366.512s)
hex dump (first 32 bytes):
00 00 00 00 00 00 00 00 08 01 a9 73 6d 80 ff ff ...........sm...
c8 61 ee a9 00 20 ff ff 28 1e 55 81 6c 80 ff ff .a... ..(.U.l...
backtrace:
[<000000004bcaa122>] kmem_cache_alloc_trace+0x2dc/0x418
[<0000000069c7dabb>] vgic_add_lpi+0x88/0x418
[<00000000bfefd5c5>] vgic_its_cmd_handle_mapi+0x4dc/0x588
[<00000000cf993975>] vgic_its_process_commands.part.5+0x484/0x1198
[<000000004bd3f8e3>] vgic_its_process_commands+0x50/0x80
[<00000000b9a65b2b>] vgic_mmio_write_its_cwriter+0xac/0x108
[<0000000009641ebb>] dispatch_mmio_write+0xd0/0x188
[<000000008f79d288>] __kvm_io_bus_write+0x134/0x240
[<00000000882f39ac>] kvm_io_bus_write+0xe0/0x150
[<0000000078197602>] io_mem_abort+0x484/0x7b8
[<0000000060954e3c>] kvm_handle_guest_abort+0x4cc/0xa58
[<00000000e0d0cd65>] handle_exit+0x24c/0x770
[<00000000b44a7fad>] kvm_arch_vcpu_ioctl_run+0x460/0x1988
[<0000000025fb897c>] kvm_vcpu_ioctl+0x4f8/0xee0
[<000000003271e317>] do_vfs_ioctl+0x160/0xcd8
[<00000000e7f39607>] ksys_ioctl+0x98/0xd8

Fix it by retiring all pending LPIs in the ap_list on the destroy path.

p.s. I can also reproduce it on a normal guest shutdown. It is because
userspace still sends LPIs to the vcpu (through the KVM_SIGNAL_MSI ioctl) while
the guest is being shut down and is unable to handle them. A little strange
though, and I haven't dug further...

Reviewed-by: James Morse <james.morse@arm.com>
Signed-off-by: Zenghui Yu <yuzenghui@huawei.com>
[maz: moved the distributor deallocation down to avoid an UAF splat]
Signed-off-by: Marc Zyngier <maz@kernel.org>
Link: https://lore.kernel.org/r/20200414030349.625-2-yuzenghui@huawei.com
2020-04-23 16:26:56 +01:00
Marc Zyngier
ba1ed9e17b KVM: arm: vgic-v2: Only use the virtual state when userspace accesses pending bits
There is no point in accessing the HW when writing to any of the
ISPENDR/ICPENDR registers from userspace, as only the guest should
be allowed to change the HW state.

Introduce new userspace-specific accessors that deal solely with
the virtual state. Note that the API differs from that of GICv3,
where userspace exclusively uses ISPENDR to set the state. Too
bad we can't reuse it.

Fixes: 82e40f558d ("KVM: arm/arm64: vgic-v2: Handle SGI bits in GICD_I{S,C}PENDR0 as WI")
Reviewed-by: James Morse <james.morse@arm.com>
Signed-off-by: Marc Zyngier <maz@kernel.org>
2020-04-23 16:26:31 +01:00
Marc Zyngier
41ee52ecbc KVM: arm: vgic: Only use the virtual state when userspace accesses enable bits
There is no point in accessing the HW when writing to any of the
ISENABLER/ICENABLER registers from userspace, as only the guest
should be allowed to change the HW state.

Introduce new userspace-specific accessors that deal solely with
the virtual state.

Reported-by: James Morse <james.morse@arm.com>
Tested-by: James Morse <james.morse@arm.com>
Reviewed-by: James Morse <james.morse@arm.com>
Signed-off-by: Marc Zyngier <maz@kernel.org>
2020-04-22 17:13:30 +01:00
Marc Zyngier
9a50ebbffa KVM: arm: vgic: Synchronize the whole guest on GIC{D,R}_I{S,C}ACTIVER read
When a guest tries to read the active state of its interrupts,
we currently just return whatever state we have in memory. This
means that if such an interrupt lives in a List Register on another
CPU, we fail to observe the latest active state for this interrupt.

In order to remedy this, stop all the other vcpus so that they exit
and we can observe the most recent value for the state. This is
similar to what we are doing for the write side of the same
registers, and results in new MMIO handlers for userspace (which
do not need to stop the guest, as it is supposed to be stopped
already).

Reported-by: Julien Grall <julien@xen.org>
Reviewed-by: Andre Przywara <andre.przywara@arm.com>
Signed-off-by: Marc Zyngier <maz@kernel.org>
2020-04-22 17:13:16 +01:00
Paolo Bonzini
00a6a5ef39 PPC KVM fix for 5.7
Merge tag 'kvm-ppc-fixes-5.7-1' of git://git.kernel.org/pub/scm/linux/kernel/git/paulus/powerpc into kvm-master

PPC KVM fix for 5.7

- Fix a regression introduced in the last merge window, which results
  in guests in HPT mode dying randomly.
2020-04-21 09:39:55 -04:00
Paolo Bonzini
3bda03865f KVM: s390: Fix for 5.7 and maintainer update
Merge tag 'kvm-s390-master-5.7-2' of git://git.kernel.org/pub/scm/linux/kernel/git/kvms390/linux into kvm-master

KVM: s390: Fix for 5.7 and maintainer update

- Silence false positive lockdep warning
- add Claudio as reviewer
2020-04-21 09:37:13 -04:00
Paul Mackerras
ae49dedaa9 KVM: PPC: Book3S HV: Handle non-present PTEs in page fault functions
Since cd758a9b57 "KVM: PPC: Book3S HV: Use __gfn_to_pfn_memslot in HPT
page fault handler", it's been possible in fairly rare circumstances to
load a non-present PTE in kvmppc_book3s_hv_page_fault() when running a
guest on a POWER8 host.

Because that case wasn't checked for, we could misinterpret the non-present
PTE as being a cache-inhibited PTE.  That could mismatch with the
corresponding hash PTE, which would cause the function to fail with -EFAULT
a little further down.  That would propagate up to the KVM_RUN ioctl()
generally causing the KVM userspace (usually qemu) to fall over.

This addresses the problem by catching that case and returning to the guest
instead.

For completeness, this fixes the radix page fault handler in the same
way.  For radix this didn't cause any obvious misbehaviour, because we
ended up putting the non-present PTE into the guest's partition-scoped
page tables, leading immediately to another hypervisor data/instruction
storage interrupt, which would go through the page fault path again
and fix things up.

Fixes: cd758a9b57 ("KVM: PPC: Book3S HV: Use __gfn_to_pfn_memslot in HPT page fault handler")
Bugzilla: https://bugzilla.redhat.com/show_bug.cgi?id=1820402
Reported-by: David Gibson <david@gibson.dropbear.id.au>
Tested-by: David Gibson <david@gibson.dropbear.id.au>
Signed-off-by: Paul Mackerras <paulus@ozlabs.org>
2020-04-21 09:23:41 +10:00
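
The added check has the following shape: if the PTE turns out to be
non-present, return to the guest and retry instead of interpreting its other
bits. Bit positions and names below are placeholders, not the Book3S HV
layout:

  #include <stdbool.h>
  #include <stdint.h>

  #define PTE_PRESENT (1ull << 63)   /* placeholder bit positions */
  #define PTE_CI      (1ull << 5)

  enum fault_action { RESUME_GUEST, USE_PTE };

  enum fault_action classify_pte(uint64_t pte, bool *cache_inhibited)
  {
      if (!(pte & PTE_PRESENT))
          return RESUME_GUEST;       /* retry the access, don't interpret */

      *cache_inhibited = (pte & PTE_CI) != 0;
      return USE_PTE;
  }
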
Josh Poimboeuf
7f4b5cde24 kvm: Disable objtool frame pointer checking for vmenter.S
Frame pointers are completely broken by vmenter.S because it clobbers
RBP:

  arch/x86/kvm/svm/vmenter.o: warning: objtool: __svm_vcpu_run()+0xe4: BP used as a scratch register

That's unavoidable, so just skip checking that file when frame pointers
are configured in.

On the other hand, ORC can handle that code just fine, so leave objtool
enabled in the !FRAME_POINTER case.

Reported-by: Randy Dunlap <rdunlap@infradead.org>
Signed-off-by: Josh Poimboeuf <jpoimboe@redhat.com>
Message-Id: <01fae42917bacad18be8d2cbc771353da6603473.1587398610.git.jpoimboe@redhat.com>
Tested-by: Randy Dunlap <rdunlap@infradead.org> # build-tested
Fixes: 199cd1d7b5 ("KVM: SVM: Split svm_vcpu_run inline assembly to separate file")
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2020-04-20 17:11:19 -04:00
Claudio Imbrenda
2a173ec993 MAINTAINERS: add a reviewer for KVM/s390
Signed-off-by: Claudio Imbrenda <imbrenda@linux.ibm.com>
Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>
Acked-by: Christian Borntraeger <borntraeger@de.ibm.com>
Acked-by: Cornelia Huck <cohuck@redhat.com>
Acked-by: Janosch Frank <frankja@linux.ibm.com>
Link: https://lore.kernel.org/r/20200417152936.772256-1-imbrenda@linux.ibm.com
2020-04-20 11:24:00 +02:00
Eric Farman
d47c4c454a KVM: s390: Fix PV check in deliverable_irqs()
The diag 0x44 handler, which handles a directed yield, goes into a
codepath that does a kvm_for_each_vcpu() and ultimately
deliverable_irqs().  The new check for kvm_s390_pv_cpu_is_protected()
contains an assertion that the vcpu->mutex is held, which isn't going
to be the case in this scenario.

The result is a plethora of these messages if the lock debugging
is enabled, and thus an implication that we have a problem.

  WARNING: CPU: 9 PID: 16167 at arch/s390/kvm/kvm-s390.h:239 deliverable_irqs+0x1c6/0x1d0 [kvm]
  ...snip...
  Call Trace:
   [<000003ff80429bf2>] deliverable_irqs+0x1ca/0x1d0 [kvm]
  ([<000003ff80429b34>] deliverable_irqs+0x10c/0x1d0 [kvm])
   [<000003ff8042ba82>] kvm_s390_vcpu_has_irq+0x2a/0xa8 [kvm]
   [<000003ff804101e2>] kvm_arch_dy_runnable+0x22/0x38 [kvm]
   [<000003ff80410284>] kvm_vcpu_on_spin+0x8c/0x1d0 [kvm]
   [<000003ff80436888>] kvm_s390_handle_diag+0x3b0/0x768 [kvm]
   [<000003ff80425af4>] kvm_handle_sie_intercept+0x1cc/0xcd0 [kvm]
   [<000003ff80422bb0>] __vcpu_run+0x7b8/0xfd0 [kvm]
   [<000003ff80423de6>] kvm_arch_vcpu_ioctl_run+0xee/0x3e0 [kvm]
   [<000003ff8040ccd8>] kvm_vcpu_ioctl+0x2c8/0x8d0 [kvm]
   [<00000001504ced06>] ksys_ioctl+0xae/0xe8
   [<00000001504cedaa>] __s390x_sys_ioctl+0x2a/0x38
   [<0000000150cb9034>] system_call+0xd8/0x2d8
  2 locks held by CPU 2/KVM/16167:
   #0: 00000001951980c0 (&vcpu->mutex){+.+.}, at: kvm_vcpu_ioctl+0x90/0x8d0 [kvm]
   #1: 000000019599c0f0 (&kvm->srcu){....}, at: __vcpu_run+0x4bc/0xfd0 [kvm]
  Last Breaking-Event-Address:
   [<000003ff80429b34>] deliverable_irqs+0x10c/0x1d0 [kvm]
  irq event stamp: 11967
  hardirqs last  enabled at (11975): [<00000001502992f2>] console_unlock+0x4ca/0x650
  hardirqs last disabled at (11982): [<0000000150298ee8>] console_unlock+0xc0/0x650
  softirqs last  enabled at (7940): [<0000000150cba6ca>] __do_softirq+0x422/0x4d8
  softirqs last disabled at (7929): [<00000001501cd688>] do_softirq_own_stack+0x70/0x80

Considering what's being done here, let's fix this by removing the
mutex assertion rather than acquiring the mutex for every other vcpu.

Fixes: 201ae986ea ("KVM: s390: protvirt: Implement interrupt injection")
Signed-off-by: Eric Farman <farman@linux.ibm.com>
Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>
Reviewed-by: Christian Borntraeger <borntraeger@de.ibm.com>
Reviewed-by: Cornelia Huck <cohuck@redhat.com>
Link: https://lore.kernel.org/r/20200415190353.63625-1-farman@linux.ibm.com
Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>
2020-04-20 11:23:45 +02:00
Linus Torvalds
ae83d0b416 Linux 5.7-rc2 2020-04-19 14:35:30 -07:00
Brian Geffon
dadbd85f2a mm: Fix MREMAP_DONTUNMAP accounting on VMA merge
When remapping a mapping where a portion of a VMA is remapped
into another portion of the VMA it can cause the VMA to become
split. During the copy_vma operation the VMA can actually
be remerged if it's an anonymous VMA whose pages have not yet
been faulted. This isn't normally a problem because at the end
of the remap the original portion is unmapped causing it to
become split again.

However, MREMAP_DONTUNMAP leaves that original portion in place which
means that the VMA which was split and then remerged is not actually
split at the end of the mremap. This patch fixes a bug where
we don't detect that the VMAs got remerged and we end up
putting back VM_ACCOUNT on the next mapping which is completely
unrelated. When that next mapping is unmapped it results in
incorrectly unaccounting for the memory which was never accounted,
and eventually we will underflow on the memory commitment.

There is also another, similar issue: we're currently
accounting for the number of pages in the new_vma, but that's wrong.
We need to account for the length of the remap operation as that's
all that is being added. If there was a mapping already at that
location its commitment would have been adjusted as part of
the munmap at the start of the mremap.

A really simple repro can be seen in:
https://gist.github.com/bgaff/e101ce99da7d9a8c60acc641d07f312c

Fixes: e346b38130 ("mm/mremap: add MREMAP_DONTUNMAP to mremap()")
Reported-by: syzbot <syzkaller@googlegroups.com>
Signed-off-by: Brian Geffon <bgeffon@google.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2020-04-19 14:07:10 -07:00
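
The accounting arithmetic for the second issue, spelled out with made-up sizes
(4 KiB pages assumed):

  #include <stdio.h>

  #define PAGE_SIZE 4096ul

  int main(void)
  {
      unsigned long vma_size  = 16 * PAGE_SIZE;  /* merged destination VMA */
      unsigned long remap_len =  4 * PAGE_SIZE;  /* what mremap() actually moved */

      printf("wrong charge (new_vma pages): %lu pages\n", vma_size / PAGE_SIZE);
      printf("right charge (remap length):  %lu pages\n", remap_len / PAGE_SIZE);
      return 0;
  }
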
Linus Torvalds
86cc339856 Two build fixes for a couple clk drivers and a fix for the Unisoc serial
clk where we want to keep it on for earlycon.

Merge tag 'clk-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/clk/linux

Pull clk fixes from Stephen Boyd:
 "Two build fixes for a couple clk drivers and a fix for the Unisoc
  serial clk where we want to keep it on for earlycon"

* tag 'clk-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/clk/linux:
  clk: sprd: don't gate uart console clock
  clk: mmp2: fix link error without mmp2
  clk: asm9260: fix __clk_hw_register_fixed_rate_with_accuracy typo
2020-04-19 13:59:06 -07:00
Linus Torvalds
0fe5f9ca22 A set of fixes for x86 and objtool:
Merge tag 'x86-urgent-2020-04-19' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip

Pull x86 and objtool fixes from Thomas Gleixner:
 "A set of fixes for x86 and objtool:

  objtool:

   - Ignore the double UD2 which is emitted in BUG() when
     CONFIG_UBSAN_TRAP is enabled.

   - Support clang non-section symbols in objtool ORC dump

   - Fix switch table detection in .text.unlikely

   - Make the BP scratch register warning more robust.

  x86:

   - Increase microcode maximum patch size for AMD to cope with new CPUs
     which have a larger patch size.

   - Fix a crash in the resource control filesystem when the removal of
     the default resource group is attempted.

   - Preserve Code and Data Prioritization enabled state across CPU
     hotplug.

   - Update split lock cpu matching to use the new X86_MATCH macros.

   - Change the split lock enumeration as Intel finally decided that the
     IA32_CORE_CAPABILITIES bits are not architectural contrary to what
     the SDM claims. !@#%$^!

   - Add Tremont CPU models to the split lock detection cpu match.

   - Add a missing static attribute to make sparse happy"

* tag 'x86-urgent-2020-04-19' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
  x86/split_lock: Add Tremont family CPU models
  x86/split_lock: Bits in IA32_CORE_CAPABILITIES are not architectural
  x86/resctrl: Preserve CDP enable over CPU hotplug
  x86/resctrl: Fix invalid attempt at removing the default resource group
  x86/split_lock: Update to use X86_MATCH_INTEL_FAM6_MODEL()
  x86/umip: Make umip_insns static
  x86/microcode/AMD: Increase microcode PATCH_MAX_SIZE
  objtool: Make BP scratch register warning more robust
  objtool: Fix switch table detection in .text.unlikely
  objtool: Support Clang non-section symbols in ORC generation
  objtool: Support Clang non-section symbols in ORC dump
  objtool: Fix CONFIG_UBSAN_TRAP unreachable warnings
2020-04-19 11:58:32 -07:00