linux_dsm_epyc7002/arch/x86
Paolo Bonzini 71cc849b70 KVM: x86: Fix split-irqchip vs interrupt injection window request
kvm_cpu_accept_dm_intr and kvm_vcpu_ready_for_interrupt_injection are
a hodge-podge of conditions, hacked together to get something that
more or less works.  But what is actually needed is much simpler;
in both cases the fundamental question is, do we have a place to stash
an interrupt if userspace does KVM_INTERRUPT?

In userspace irqchip mode, that is !vcpu->arch.interrupt.injected.
Currently kvm_event_needs_reinjection(vcpu) covers it, but it is
unnecessarily restrictive.

In split irqchip mode it's a bit more complicated, we need to check
kvm_apic_accept_pic_intr(vcpu) (the IRQ window exit is basically an INTACK
cycle and thus requires ExtINTs not to be masked) as well as
!pending_userspace_extint(vcpu).  However, there is no need to
check kvm_event_needs_reinjection(vcpu), since split irqchip keeps
pending ExtINT state separate from event injection state, and checking
kvm_cpu_has_interrupt(vcpu) is wrong too since ExtINT has higher
priority than APIC interrupts.  In fact the latter fixes a bug:
when userspace requests an IRQ window vmexit, an interrupt in the
local APIC can cause kvm_cpu_has_interrupt() to be true and thus
kvm_vcpu_ready_for_interrupt_injection() to return false.  When this
happens, vcpu_run does not exit to userspace but the interrupt window
vmexits keep occurring.  The VM loops without any hope of making progress.

Once we try to fix these with something like

     return kvm_arch_interrupt_allowed(vcpu) &&
-        !kvm_cpu_has_interrupt(vcpu) &&
-        !kvm_event_needs_reinjection(vcpu) &&
-        kvm_cpu_accept_dm_intr(vcpu);
+        (!lapic_in_kernel(vcpu)
+         ? !vcpu->arch.interrupt.injected
+         : (kvm_apic_accept_pic_intr(vcpu)
+            && !pending_userspace_extint(v)));

we realize two things.  First, thanks to the previous patch the complex
conditional can reuse !kvm_cpu_has_extint(vcpu).  Second, the interrupt
window request in vcpu_enter_guest()

        bool req_int_win =
                dm_request_for_irq_injection(vcpu) &&
                kvm_cpu_accept_dm_intr(vcpu);

should be kept in sync with kvm_vcpu_ready_for_interrupt_injection():
it is unnecessary to ask the processor for an interrupt window
if we would not be able to return to userspace.  Therefore,
kvm_cpu_accept_dm_intr(vcpu) is basically !kvm_cpu_has_extint(vcpu)
ANDed with the existing check for masked ExtINT.  It all makes sense:

- we can accept an interrupt from userspace if there is a place
  to stash it (and, for irqchip split, ExtINTs are not masked).
  Interrupts from userspace _can_ be accepted even if right now
  EFLAGS.IF=0.

- in order to tell userspace we will inject its interrupt ("IRQ
  window open" i.e. kvm_vcpu_ready_for_interrupt_injection), both
  KVM and the vCPU need to be ready to accept the interrupt.

... and this is what the patch implements.

Reported-by: David Woodhouse <dwmw@amazon.co.uk>
Analyzed-by: David Woodhouse <dwmw@amazon.co.uk>
Cc: stable@vger.kernel.org
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Reviewed-by: Nikos Tsironis <ntsironis@arrikto.com>
Reviewed-by: David Woodhouse <dwmw@amazon.co.uk>
Tested-by: David Woodhouse <dwmw@amazon.co.uk>
2020-11-27 09:27:28 -05:00
..
boot x86/boot/compressed/64: Check SEV encryption in 64-bit boot-path 2020-10-29 18:06:52 +01:00
configs * A defconfig fix, from Daniel Díaz. 2020-09-20 15:06:43 -07:00
crypto crypto: x86/poly1305 - add back a needed assignment 2020-10-24 09:38:32 +11:00
entry A couple of x86 fixes which missed rc1 due to my stupidity: 2020-10-27 14:39:29 -07:00
events These are the performance events changes for v5.10: 2020-10-12 14:14:35 -07:00
hyperv hyperv-fixes for 5.10-rc3 2020-11-05 11:32:03 -08:00
ia32 x86: remove address space overrides using set_fs() 2020-09-08 22:21:36 -04:00
include KVM: x86: Fix split-irqchip vs interrupt injection window request 2020-11-27 09:27:28 -05:00
kernel A set of x86 fixes: 2020-11-08 10:09:36 -08:00
kvm KVM: x86: Fix split-irqchip vs interrupt injection window request 2020-11-27 09:27:28 -05:00
lib x86/lib: Change .weak to SYM_FUNC_START_WEAK for arch/x86/lib/mem*_64.S 2020-11-04 12:30:20 +01:00
math-emu treewide: Use fallthrough pseudo-keyword 2020-08-23 17:36:59 -05:00
mm x86/head/64: Check SEV encryption before switching to kernel page-table 2020-10-29 18:09:59 +01:00
net bpf: x64: Do not emit sub/add 0, %rsp when !stack_depth 2020-09-29 16:47:39 -07:00
oprofile
pci pci-v5.10-changes 2020-10-22 12:41:00 -07:00
platform treewide: Convert macro and uses of __section(foo) to __section("foo") 2020-10-25 14:51:49 -07:00
power Kbuild updates for v5.9 2020-08-09 14:10:26 -07:00
purgatory treewide: Convert macro and uses of __section(foo) to __section("foo") 2020-10-25 14:51:49 -07:00
ras treewide: replace '---help---' in Kconfig files with 'help' 2020-06-14 01:57:21 +09:00
realmode x86/head/64: Don't call verify_cpu() on starting APs 2020-09-09 11:33:20 +02:00
tools x86/insn: Make inat-tables.c suitable for pre-decompression code 2020-09-07 19:45:24 +02:00
um arch/um: partially revert the conversion to __section() macro 2020-10-26 15:39:37 -07:00
video
xen treewide: Convert macro and uses of __section(foo) to __section("foo") 2020-10-25 14:51:49 -07:00
.gitignore
Kbuild
Kconfig This feature enhances the current guest memory encryption support 2020-10-14 10:21:34 -07:00
Kconfig.assembler
Kconfig.cpu treewide: replace '---help---' in Kconfig files with 'help' 2020-06-14 01:57:21 +09:00
Kconfig.debug x86, powerpc: Rename memcpy_mcsafe() to copy_mc_to_{user, kernel}() 2020-10-06 11:18:04 +02:00
Makefile x86/build: Warn on orphan section placement 2020-09-03 10:28:36 +02:00
Makefile_32.cpu
Makefile.um