linux_dsm_epyc7002

mirror of https://github.com/AuxXxilium/linux_dsm_epyc7002.git synced 2024-12-24 11:46:54 +07:00

Author	SHA1	Message	Date
Steve Rutherford	b053b2aef2	KVM: x86: Add EOI exit bitmap inference In order to support a userspace IOAPIC interacting with an in kernel APIC, the EOI exit bitmaps need to be configurable. If the IOAPIC is in userspace (i.e. the irqchip has been split), the EOI exit bitmaps will be set whenever the GSI Routes are configured. In particular, for the low MSI routes are reservable for userspace IOAPICs. For these MSI routes, the EOI Exit bit corresponding to the destination vector of the route will be set for the destination VCPU. The intention is for the userspace IOAPICs to use the reservable MSI routes to inject interrupts into the guest. This is a slight abuse of the notion of an MSI Route, given that MSIs classically bypass the IOAPIC. It might be worthwhile to add an additional route type to improve clarity. Compile tested for Intel x86. Signed-off-by: Steve Rutherford <srutherford@google.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>	2015-10-01 15:06:28 +02:00
Steve Rutherford	7543a635aa	KVM: x86: Add KVM exit for IOAPIC EOIs Adds KVM_EXIT_IOAPIC_EOI which allows the kernel to EOI level-triggered IOAPIC interrupts. Uses a per VCPU exit bitmap to decide whether or not the IOAPIC needs to be informed (which is identical to the EOI_EXIT_BITMAP field used by modern x86 processors, but can also be used to elide kvm IOAPIC EOI exits on older processors). [Note: A prototype using ResampleFDs found that decoupling the EOI from the VCPU's thread made it possible for the VCPU to not see a recent EOI after reentering the guest. This does not match real hardware.] Compile tested for Intel x86. Signed-off-by: Steve Rutherford <srutherford@google.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>	2015-10-01 15:06:27 +02:00
Steve Rutherford	49df6397ed	KVM: x86: Split the APIC from the rest of IRQCHIP. First patch in a series which enables the relocation of the PIC/IOAPIC to userspace. Adds capability KVM_CAP_SPLIT_IRQCHIP; KVM_CAP_SPLIT_IRQCHIP enables the construction of LAPICs without the rest of the irqchip. Compile tested for x86. Signed-off-by: Steve Rutherford <srutherford@google.com> Suggested-by: Andrew Honig <ahonig@google.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>	2015-10-01 15:06:26 +02:00
Paolo Bonzini	4ca7dd8ce4	KVM: x86: unify handling of interrupt window The interrupt window is currently checked twice, once in vmx.c/svm.c and once in dm_request_for_irq_injection. The only difference is the extra check for kvm_arch_interrupt_allowed in dm_request_for_irq_injection, and the different return value (EINTR/KVM_EXIT_INTR for vmx.c/svm.c vs. 0/KVM_EXIT_IRQ_WINDOW_OPEN for dm_request_for_irq_injection). However, dm_request_for_irq_injection is basically dead code! Revive it by removing the checks in vmx.c and svm.c's vmexit handlers, and fixing the returned values for the dm_request_for_irq_injection case. Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>	2015-10-01 15:06:26 +02:00
Paolo Bonzini	35754c987f	KVM: x86: introduce lapic_in_kernel Avoid pointer chasing and memory barriers, and simplify the code when split irqchip (LAPIC in kernel, IOAPIC/PIC in userspace) is introduced. Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>	2015-10-01 15:06:25 +02:00
Paolo Bonzini	3bb345f387	KVM: x86: store IOAPIC-handled vectors in each VCPU We can reuse the algorithm that computes the EOI exit bitmap to figure out which vectors are handled by the IOAPIC. The only difference between the two is for edge-triggered interrupts other than IRQ8 that have no notifiers active; however, the IOAPIC does not have to do anything special for these interrupts anyway. This again limits the interactions between the IOAPIC and the LAPIC, making it easier to move the former to userspace. Inspired by a patch from Steve Rutherford. Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>	2015-10-01 15:06:23 +02:00
Paolo Bonzini	bdaffe1d93	KVM: x86: set TMR when the interrupt is accepted Do not compute TMR in advance. Instead, set the TMR just before the interrupt is accepted into the IRR. This limits the coupling between IOAPIC and LAPIC. Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>	2015-10-01 15:06:22 +02:00
Radim Krčmář	9bac175d8e	Revert "KVM: x86: zero kvmclock_offset when vcpu0 initializes kvmclock system MSR" Shifting pvclock_vcpu_time_info.system_time on write to KVM system time MSR is a change of ABI. Probably only 2.6.16 based SLES 10 breaks due to its custom enhancements to kvmclock, but KVM never declared the MSR only for one-shot initialization. (Doc says that only one write is needed.) This reverts commit `b7e60c5aed`. And adds a note to the definition of PVCLOCK_COUNTS_FROM_ZERO. Signed-off-by: Radim Krčmář <rkrcmar@redhat.com> Acked-by: Marcelo Tosatti <mtosatti@redhat.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>	2015-09-28 13:06:37 +02:00
Paolo Bonzini	3afb112180	KVM: x86: trap AMD MSRs for the TSeg base and mask These have roughly the same purpose as the SMRR, which we do not need to implement in KVM. However, Linux accesses MSR_K8_TSEG_ADDR at boot, which causes problems when running a Xen dom0 under KVM. Just return 0, meaning that processor protection of SMRAM is not in effect. Reported-by: M A Young <m.a.young@durham.ac.uk> Cc: stable@vger.kernel.org Acked-by: Borislav Petkov <bp@suse.de> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>	2015-09-21 07:41:22 +02:00
Paolo Bonzini	62bea5bff4	KVM: add halt_attempted_poll to VCPU stats This new statistic can help diagnosing VCPUs that, for any reason, trigger bad behavior of halt_poll_ns autotuning. For example, say halt_poll_ns = 480000, and wakeups are spaced exactly like 479us, 481us, 479us, 481us. Then KVM always fails polling and wastes 10+20+40+80+160+320+480 = 1110 microseconds out of every 479+481+479+481+479+481+479 = 3359 microseconds. The VCPU then is consuming about 30% more CPU than it would use without polling. This would show as an abnormally high number of attempted polling compared to the successful polls. Acked-by: Christian Borntraeger <borntraeger@de.ibm.com< Reviewed-by: David Matlack <dmatlack@google.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>	2015-09-16 12:17:00 +02:00
Linus Torvalds	519f526d39	ARM: - Full debug support for arm64 - Active state switching for timer interrupts - Lazy FP/SIMD save/restore for arm64 - Generic ARMv8 target PPC: - Book3S: A few bug fixes - Book3S: Allow micro-threading on POWER8 x86: - Compiler warnings Generic: - Adaptive polling for guest halt -----BEGIN PGP SIGNATURE----- Version: GnuPG v2.0.22 (GNU/Linux) iQEcBAABAgAGBQJV7qd/AAoJEL/70l94x66DDBcH/2OLomKHjDOGXqJ/dpkqf4UU FYI1pVjs2zP4z3L7RYV/DeuEsD6XaWzS7EXQOS3mcb9d8GWahPrdofeVmpmhg/8y jmkuUEFHl2Ut6imk8qDlG3m42c86Mk8/1k38l1bp8S3lL0/Q7IyADyYAlHdwzpOx yEyOAE4VU4n+VyQH5dbnzc12QRTeHfRQc/dI3eQq238gf37SF/1qzOzeLIdbEa+N DCzqQ8SExbctiRaLzCY5Ogan+unZBQbFfhrDrUSryywrzo/8WRFVmbjuf5O5Ucxa +UTLMvmm1YgxvBvWhlcmA+HSzSVeWNvaHQ9illgE5+74G5CzaD2ukurmoz/+r+A= =XtrL -----END PGP SIGNATURE----- Merge tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm Pull more kvm updates from Paolo Bonzini: "ARM: - Full debug support for arm64 - Active state switching for timer interrupts - Lazy FP/SIMD save/restore for arm64 - Generic ARMv8 target PPC: - Book3S: A few bug fixes - Book3S: Allow micro-threading on POWER8 x86: - Compiler warnings Generic: - Adaptive polling for guest halt" * tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm: (49 commits) kvm: irqchip: fix memory leak kvm: move new trace event outside #ifdef CONFIG_KVM_ASYNC_PF KVM: trace kvm_halt_poll_ns grow/shrink KVM: dynamic halt-polling KVM: make halt_poll_ns per-vCPU Silence compiler warning in arch/x86/kvm/emulate.c kvm: compile process_smi_save_seg_64() only for x86_64 KVM: x86: avoid uninitialized variable warning KVM: PPC: Book3S: Fix typo in top comment about locking KVM: PPC: Book3S: Fix size of the PSPB register KVM: PPC: Book3S HV: Exit on H_DOORBELL if HOST_IPI is set KVM: PPC: Book3S HV: Fix race in starting secondary threads KVM: PPC: Book3S: correct width in XER handling KVM: PPC: Book3S HV: Fix preempted vcore stolen time calculation KVM: PPC: Book3S HV: Fix preempted vcore list locking KVM: PPC: Book3S HV: Implement H_CLEAR_REF and H_CLEAR_MOD KVM: PPC: Book3S HV: Fix bug in dirty page tracking KVM: PPC: Book3S HV: Fix race in reading change bit when removing HPTE KVM: PPC: Book3S HV: Implement dynamic micro-threading on POWER8 KVM: PPC: Book3S HV: Make use of unused threads when running guests ...	2015-09-10 16:42:49 -07:00
Alexander Kuleshov	efbb288afc	kvm: compile process_smi_save_seg_64() only for x86_64 The process_smi_save_seg_64() function called only in the process_smi_save_state_64() if the CONFIG_X86_64 is set. This patch adds #ifdef CONFIG_X86_64 around process_smi_save_seg_64() to prevent following warning message: arch/x86/kvm/x86.c:5946:13: warning: â€˜process_smi_save_seg_64â€™ defined but not used [-Wunused-function] static void process_smi_save_seg_64(struct kvm_vcpu vcpu, char buf, int n) ^ Signed-off-by: Alexander Kuleshov <kuleshovmail@gmail.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>	2015-09-06 16:26:22 +02:00
Linus Torvalds	5778077d03	Merge branch 'x86-asm-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull x86 asm changes from Ingo Molnar: "The biggest changes in this cycle were: - Revamp, simplify (and in some cases fix) Time Stamp Counter (TSC) primitives. (Andy Lutomirski) - Add new, comprehensible entry and exit handlers written in C. (Andy Lutomirski) - vm86 mode cleanups and fixes. (Brian Gerst) - 32-bit compat code cleanups. (Brian Gerst) The amount of simplification in low level assembly code is already palpable: arch/x86/entry/entry_32.S \| 130 +---- arch/x86/entry/entry_64.S \| 197 ++----- but more simplifications are planned. There's also the usual laudry mix of low level changes - see the changelog for details" * 'x86-asm-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (83 commits) x86/asm: Drop repeated macro of X86_EFLAGS_AC definition x86/asm/msr: Make wrmsrl() a function x86/asm/delay: Introduce an MWAITX-based delay with a configurable timer x86/asm: Add MONITORX/MWAITX instruction support x86/traps: Weaken context tracking entry assertions x86/asm/tsc: Add rdtscll() merge helper selftests/x86: Add syscall_nt selftest selftests/x86: Disable sigreturn_64 x86/vdso: Emit a GNU hash x86/entry: Remove do_notify_resume(), syscall_trace_leave(), and their TIF masks x86/entry/32: Migrate to C exit path x86/entry/32: Remove 32-bit syscall audit optimizations x86/vm86: Rename vm86->v86flags and v86mask x86/vm86: Rename vm86->vm86_info to user_vm86 x86/vm86: Clean up vm86.h includes x86/vm86: Move the vm86 IRQ definitions to vm86.h x86/vm86: Use the normal pt_regs area for vm86 x86/vm86: Eliminate 'struct kernel_vm86_struct' x86/vm86: Move fields from 'struct kernel_vm86_struct' to 'struct vm86' x86/vm86: Move vm86 fields out of 'thread_struct' ...	2015-09-01 08:40:25 -07:00
Linus Torvalds	44e98edcd1	A very small release for x86 and s390 KVM. s390: timekeeping changes, cleanups and fixes x86: support for Hyper-V MSRs to report crashes, and a bunch of cleanups. One interesting feature that was planned for 4.3 (emulating the local APIC in kernel while keeping the IOAPIC and 8254 in userspace) had to be delayed because Intel complained about my reading of the manual. -----BEGIN PGP SIGNATURE----- Version: GnuPG v2.0.22 (GNU/Linux) iQEcBAABAgAGBQJVznW4AAoJEL/70l94x66Dt+gH/3vydhh6kv+mKhnR+kADaGfM gaunw0CUpJLU6gkOkYOm5M32WGhsT9Hd3WtRTJO6PhSo7cQ88hMx24u4XAffoewo Os5tDwAaHeV2enVSTri6xX8e2F2mgPDghGcYJPUBwnmMjRzZ8tj2VHUcbxqVT6Pb pX3V8ZxOZ81+ACZU2tdNRzLUd2H1v4d74gtVS7ove1Vb0CvPOBdHf1KQuUCUa2Pi 73fvnaEuSaFYtSWZIP1PYxLnsQHpApH3Kco/5kHeqUPpYaGa/g2bnfncHRw20Svr gb3opwbfyiq91xfGbRVR3+E63Cw4G6aTl5MDNv9UFJ+xFKuj8WJ72xXXTSwzUi4= =HgT+ -----END PGP SIGNATURE----- Merge tag 'kvm-4.3-1' of git://git.kernel.org/pub/scm/virt/kvm/kvm Pull kvm updates from Paolo Bonzini: "A very small release for x86 and s390 KVM. - s390: timekeeping changes, cleanups and fixes - x86: support for Hyper-V MSRs to report crashes, and a bunch of cleanups. One interesting feature that was planned for 4.3 (emulating the local APIC in kernel while keeping the IOAPIC and 8254 in userspace) had to be delayed because Intel complained about my reading of the manual" * tag 'kvm-4.3-1' of git://git.kernel.org/pub/scm/virt/kvm/kvm: (42 commits) x86/kvm: Rename VMX's segment access rights defines KVM: x86/vPMU: Fix unnecessary signed extension for AMD PERFCTRn kvm: x86: Fix error handling in the function kvm_lapic_sync_from_vapic KVM: s390: Fix assumption that kvm_set_irq_routing is always run successfully KVM: VMX: drop ept misconfig check KVM: MMU: fully check zero bits for sptes KVM: MMU: introduce is_shadow_zero_bits_set() KVM: MMU: introduce the framework to check zero bits on sptes KVM: MMU: split reset_rsvds_bits_mask_ept KVM: MMU: split reset_rsvds_bits_mask KVM: MMU: introduce rsvd_bits_validate KVM: MMU: move FNAME(is_rsvd_bits_set) to mmu.c KVM: MMU: fix validation of mmio page fault KVM: MTRR: Use default type for non-MTRR-covered gfn before WARN_ON KVM: s390: host STP toleration for VMs KVM: x86: clean/fix memory barriers in irqchip_in_kernel KVM: document memory barriers for kvm->vcpus/kvm->online_vcpus KVM: x86: remove unnecessary memory barriers for shared MSRs KVM: move code related to KVM_SET_BOOT_CPU_ID to x86 KVM: s390: log capability enablement and vm attribute changes ...	2015-08-31 08:27:44 -07:00
Ingo Molnar	a5dd192496	Merge branch 'x86/urgent' into x86/asm to fix up conflicts and to pick up fixes Conflicts: arch/x86/entry/entry_64_compat.S arch/x86/math-emu/get_address.c Signed-off-by: Ingo Molnar <mingo@kernel.org>	2015-08-18 09:39:47 +02:00
Haozhong Zhang	d7add05458	KVM: x86: Use adjustment in guest cycles when handling MSR_IA32_TSC_ADJUST When kvm_set_msr_common() handles a guest's write to MSR_IA32_TSC_ADJUST, it will calcuate an adjustment based on the data written by guest and then use it to adjust TSC offset by calling a call-back adjust_tsc_offset(). The 3rd parameter of adjust_tsc_offset() indicates whether the adjustment is in host TSC cycles or in guest TSC cycles. If SVM TSC scaling is enabled, adjust_tsc_offset() [i.e. svm_adjust_tsc_offset()] will first scale the adjustment; otherwise, it will just use the unscaled one. As the MSR write here comes from the guest, the adjustment is in guest TSC cycles. However, the current kvm_set_msr_common() uses it as a value in host TSC cycles (by using true as the 3rd parameter of adjust_tsc_offset()), which can result in an incorrect adjustment of TSC offset if SVM TSC scaling is enabled. This patch fixes this problem. Signed-off-by: Haozhong Zhang <haozhong.zhang@intel.com> Cc: stable@vger.linux.org Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>	2015-08-07 13:28:03 +02:00
Paolo Bonzini	18c3626e3d	KVM: x86: zero IDT limit on entry to SMM The recent BlackHat 2015 presentation "The Memory Sinkhole" mentions that the IDT limit is zeroed on entry to SMM. This is not documented, and must have changed some time after 2010 (see http://www.ssi.gouv.fr/uploads/IMG/pdf/IT_Defense_2010_final.pdf). KVM was not doing it, but the fix is easy. Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>	2015-08-07 12:46:32 +02:00
Xiao Guangrong	a0a64f50aa	KVM: MMU: introduce rsvd_bits_validate These two fields, rsvd_bits_mask and bad_mt_xwr, in "struct kvm_mmu" are used to check if reserved bits set on guest ptes, move them to a data struct so that the approach can be applied to check host shadow page table entries as well Signed-off-by: Xiao Guangrong <guangrong.xiao@linux.intel.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>	2015-08-05 12:47:23 +02:00
Ingo Molnar	5b929bd11d	Merge branch 'x86/urgent' into x86/asm, before applying dependent patches Signed-off-by: Ingo Molnar <mingo@kernel.org>	2015-07-31 10:23:35 +02:00
Paolo Bonzini	71ba994c94	KVM: x86: clean/fix memory barriers in irqchip_in_kernel The memory barriers are trying to protect against concurrent RCU-based interrupt injection, but the IRQ routing table is not valid at the time kvm->arch.vpic is written. Fix this by writing kvm->arch.vpic last. kvm_destroy_pic then need not set kvm->arch.vpic to NULL; modify it to take a struct kvm_pic* and reuse it if the IOAPIC creation fails. Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>	2015-07-30 16:02:56 +02:00
Paolo Bonzini	c847fe8895	KVM: x86: remove unnecessary memory barriers for shared MSRs There is no smp_rmb matching the smp_wmb. shared_msr_update is called from hardware_enable, which in turn is called via on_each_cpu. on_each_cpu and must imply a read memory barrier (on x86 the rmb is achieved simply through asm volatile in native_apic_mem_write). Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>	2015-07-29 14:27:23 +02:00
Paolo Bonzini	d71ba78834	KVM: move code related to KVM_SET_BOOT_CPU_ID to x86 This is another remnant of ia64 support. Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>	2015-07-29 14:27:21 +02:00
Andrey Smetanin	2ce7918990	kvm/x86: add sending hyper-v crash notification to user space Sending of notification is done by exiting vcpu to user space if KVM_REQ_HV_CRASH is enabled for vcpu. At exit to user space the kvm_run structure contains system_event with type KVM_SYSTEM_EVENT_CRASH to notify about guest crash occurred. Signed-off-by: Andrey Smetanin <asmetanin@virtuozzo.com> Signed-off-by: Denis V. Lunev <den@openvz.org> Reviewed-by: Peter Hornyack <peterhornyack@google.com> CC: Paolo Bonzini <pbonzini@redhat.com> CC: Gleb Natapov <gleb@kernel.org> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>	2015-07-23 08:27:06 +02:00
Andrey Smetanin	e7d9513b60	kvm/x86: added hyper-v crash msrs into kvm hyperv context Added kvm Hyper-V context hv crash variables as storage of Hyper-V crash msrs. Signed-off-by: Andrey Smetanin <asmetanin@virtuozzo.com> Signed-off-by: Denis V. Lunev <den@openvz.org> Reviewed-by: Peter Hornyack <peterhornyack@google.com> CC: Paolo Bonzini <pbonzini@redhat.com> CC: Gleb Natapov <gleb@kernel.org> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>	2015-07-23 08:27:06 +02:00
Andrey Smetanin	e83d58874b	kvm/x86: move Hyper-V MSR's/hypercall code into hyperv.c file This patch introduce Hyper-V related source code file - hyperv.c and per vm and per vcpu hyperv context structures. All Hyper-V MSR's and hypercall code moved into hyperv.c. All Hyper-V kvm/vcpu fields moved into appropriate hyperv context structures. Copyrights and authors information copied from x86.c to hyperv.c. Signed-off-by: Andrey Smetanin <asmetanin@virtuozzo.com> Signed-off-by: Denis V. Lunev <den@openvz.org> Reviewed-by: Peter Hornyack <peterhornyack@google.com> CC: Paolo Bonzini <pbonzini@redhat.com> CC: Gleb Natapov <gleb@kernel.org> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>	2015-07-23 08:27:06 +02:00
Wanpeng Li	ee4100da16	kvm: x86: fix load xsave feature warning [ 68.196974] WARNING: CPU: 1 PID: 2140 at arch/x86/kvm/x86.c:3161 kvm_arch_vcpu_ioctl+0xe88/0x1340 [kvm]() [ 68.196975] Modules linked in: snd_hda_codec_hdmi i915 rfcomm bnep bluetooth i2c_algo_bit rfkill nfsd drm_kms_helper nfs_acl nfs drm lockd grace sunrpc fscache snd_hda_codec_realtek snd_hda_codec_generic snd_hda_intel snd_hda_codec snd_hda_core snd_hwdep snd_pcm snd_seq_dummy snd_seq_oss x86_pkg_temp_thermal snd_seq_midi kvm_intel snd_seq_midi_event snd_rawmidi kvm snd_seq ghash_clmulni_intel fuse snd_timer aesni_intel parport_pc ablk_helper snd_seq_device cryptd ppdev snd lp parport lrw dcdbas gf128mul i2c_core glue_helper lpc_ich video shpchp mfd_core soundcore serio_raw acpi_cpufreq ext4 mbcache jbd2 sd_mod crc32c_intel ahci libahci libata e1000e ptp pps_core [ 68.197005] CPU: 1 PID: 2140 Comm: qemu-system-x86 Not tainted 4.2.0-rc1+ #2 [ 68.197006] Hardware name: Dell Inc. OptiPlex 7020/0F5C5X, BIOS A03 01/08/2015 [ 68.197007] ffffffffa03b0657 ffff8800d984bca8 ffffffff815915a2 0000000000000000 [ 68.197009] 0000000000000000 ffff8800d984bce8 ffffffff81057c0a 00007ff6d0001000 [ 68.197010] 0000000000000002 ffff880211c1a000 0000000000000004 ffff8800ce0288c0 [ 68.197012] Call Trace: [ 68.197017] [<ffffffff815915a2>] dump_stack+0x45/0x57 [ 68.197020] [<ffffffff81057c0a>] warn_slowpath_common+0x8a/0xc0 [ 68.197022] [<ffffffff81057cfa>] warn_slowpath_null+0x1a/0x20 [ 68.197029] [<ffffffffa037bed8>] kvm_arch_vcpu_ioctl+0xe88/0x1340 [kvm] [ 68.197035] [<ffffffffa037aede>] ? kvm_arch_vcpu_load+0x4e/0x1c0 [kvm] [ 68.197040] [<ffffffffa03696a6>] kvm_vcpu_ioctl+0xc6/0x5c0 [kvm] [ 68.197043] [<ffffffff811252d2>] ? perf_pmu_enable+0x22/0x30 [ 68.197044] [<ffffffff8112663e>] ? perf_event_context_sched_in+0x7e/0xb0 [ 68.197048] [<ffffffff811a6882>] do_vfs_ioctl+0x2c2/0x4a0 [ 68.197050] [<ffffffff8107bf33>] ? finish_task_switch+0x173/0x220 [ 68.197053] [<ffffffff8123307f>] ? selinux_file_ioctl+0x4f/0xd0 [ 68.197055] [<ffffffff8122cac3>] ? security_file_ioctl+0x43/0x60 [ 68.197057] [<ffffffff811a6ad9>] SyS_ioctl+0x79/0x90 [ 68.197060] [<ffffffff81597e57>] entry_SYSCALL_64_fastpath+0x12/0x6a [ 68.197061] ---[ end trace 558a5ebf9445fc80 ]--- After commit (`0c4109bec0` 'x86/fpu/xstate: Fix up bad get_xsave_addr() assumptions'), there is no assumption an xsave bit is present in the hardware (pcntxt_mask) that it is always present in a given xsave buffer. An enabled state to be present on 'pcntxt_mask', but not in 'xstate_bv' could happen when the last 'xsave' did not request that this feature be saved (unlikely) or because the "init optimization" caused it to not be saved. This patch kill the assumption. Signed-off-by: Wanpeng Li <wanpeng.li@hotmail.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>	2015-07-10 13:26:45 +02:00
Paolo Bonzini	5544eb9b81	KVM: count number of assigned devices If there are no assigned devices, the guest PAT are not providing any useful information and can be overridden to writeback; VMX always does this because it has the "IPAT" bit in its extended page table entries, but SVM does not have anything similar. Hook into VFIO and legacy device assignment so that they provide this information to KVM. Reviewed-by: Alex Williamson <alex.williamson@redhat.com> Tested-by: Joerg Roedel <jroedel@suse.de> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>	2015-07-10 13:25:26 +02:00
Radim Krčmář	370777daab	KVM: VMX: fix vmwrite to invalid VMCS fpu_activate is called outside of vcpu_load(), which means it should not touch VMCS, but fpu_activate needs to. Avoid the call by moving it to a point where we know that the guest needs eager FPU and VMCS is loaded. This will get rid of the following trace vmwrite error: reg 6800 value 0 (err 1) [<ffffffff8162035b>] dump_stack+0x19/0x1b [<ffffffffa046c701>] vmwrite_error+0x2c/0x2e [kvm_intel] [<ffffffffa045f26f>] vmcs_writel+0x1f/0x30 [kvm_intel] [<ffffffffa04617e5>] vmx_fpu_activate.part.61+0x45/0xb0 [kvm_intel] [<ffffffffa0461865>] vmx_fpu_activate+0x15/0x20 [kvm_intel] [<ffffffffa0560b91>] kvm_arch_vcpu_create+0x51/0x70 [kvm] [<ffffffffa0548011>] kvm_vm_ioctl+0x1c1/0x760 [kvm] [<ffffffff8118b55a>] ? handle_mm_fault+0x49a/0xec0 [<ffffffff811e47d5>] do_vfs_ioctl+0x2e5/0x4c0 [<ffffffff8127abbe>] ? file_has_perm+0xae/0xc0 [<ffffffff811e4a51>] SyS_ioctl+0xa1/0xc0 [<ffffffff81630949>] system_call_fastpath+0x16/0x1b (Note: we also unconditionally activate FPU in vmx_vcpu_reset(), so the removed code added nothing.) Fixes: `c447e76b4c` ("kvm/fpu: Enable eager restore kvm FPU for MPX") Cc: <stable@vger.kernel.org> Reported-by: Vlastimil Holer <vlastimil.holer@gmail.com> Signed-off-by: Radim Krčmář <rkrcmar@redhat.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>	2015-07-10 13:25:25 +02:00
Andy Lutomirski	03b9730b76	x86/asm/tsc: Add rdtsc_ordered() and use it in trivial call sites rdtsc_barrier(); rdtsc() is an unnecessary mouthful and requires more thought than should be necessary. Add an rdtsc_ordered() helper and replace the trivial call sites with it. This should not change generated code. The duplication of the fence asm is temporary. Signed-off-by: Andy Lutomirski <luto@kernel.org> Signed-off-by: Borislav Petkov <bp@suse.de> Cc: Andy Lutomirski <luto@amacapital.net> Cc: Borislav Petkov <bp@alien8.de> Cc: Brian Gerst <brgerst@gmail.com> Cc: Denys Vlasenko <dvlasenk@redhat.com> Cc: H. Peter Anvin <hpa@zytor.com> Cc: Huang Rui <ray.huang@amd.com> Cc: John Stultz <john.stultz@linaro.org> Cc: Len Brown <lenb@kernel.org> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Ralf Baechle <ralf@linux-mips.org> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: kvm ML <kvm@vger.kernel.org> Link: http://lkml.kernel.org/r/dddbf98a2af53312e9aa73a5a2b1622fe5d6f52b.1434501121.git.luto@kernel.org Signed-off-by: Ingo Molnar <mingo@kernel.org>	2015-07-06 15:23:29 +02:00
Andy Lutomirski	4ea1636b04	x86/asm/tsc: Rename native_read_tsc() to rdtsc() Now that there is no paravirt TSC, the "native" is inappropriate. The function does RDTSC, so give it the obvious name: rdtsc(). Suggested-by: Borislav Petkov <bp@suse.de> Signed-off-by: Andy Lutomirski <luto@kernel.org> Signed-off-by: Borislav Petkov <bp@suse.de> Cc: Andy Lutomirski <luto@amacapital.net> Cc: Borislav Petkov <bp@alien8.de> Cc: Brian Gerst <brgerst@gmail.com> Cc: Denys Vlasenko <dvlasenk@redhat.com> Cc: H. Peter Anvin <hpa@zytor.com> Cc: Huang Rui <ray.huang@amd.com> Cc: John Stultz <john.stultz@linaro.org> Cc: Len Brown <lenb@kernel.org> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Ralf Baechle <ralf@linux-mips.org> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: kvm ML <kvm@vger.kernel.org> Link: http://lkml.kernel.org/r/fd43e16281991f096c1e4d21574d9e1402c62d39.1434501121.git.luto@kernel.org [ Ported it to v4.2-rc1. ] Signed-off-by: Ingo Molnar <mingo@kernel.org>	2015-07-06 15:23:28 +02:00
Andy Lutomirski	881d7bf843	x86/asm/tsc, kvm: Remove vget_cycles() The only caller was KVM's read_tsc(). The only difference between vget_cycles() and native_read_tsc() was that vget_cycles() returned zero instead of crashing on TSC-less systems. KVM already checks vclock_mode() before calling that function, so the extra check is unnecessary. Also, KVM (host-side) requires the TSC to exist. Signed-off-by: Andy Lutomirski <luto@kernel.org> Signed-off-by: Borislav Petkov <bp@suse.de> Acked-by: Paolo Bonzini <pbonzini@redhat.com> Cc: Andy Lutomirski <luto@amacapital.net> Cc: Borislav Petkov <bp@alien8.de> Cc: Brian Gerst <brgerst@gmail.com> Cc: Denys Vlasenko <dvlasenk@redhat.com> Cc: H. Peter Anvin <hpa@zytor.com> Cc: Huang Rui <ray.huang@amd.com> Cc: John Stultz <john.stultz@linaro.org> Cc: Len Brown <lenb@kernel.org> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Ralf Baechle <ralf@linux-mips.org> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: kvm ML <kvm@vger.kernel.org> Link: http://lkml.kernel.org/r/20615df14ae2eb713ea7a5f5123c1dc4c7ca993d.1434501121.git.luto@kernel.org Signed-off-by: Ingo Molnar <mingo@kernel.org>	2015-07-06 15:23:25 +02:00
Nicolas Iooss	b0996ae482	KVM: x86: remove data variable from kvm_get_msr_common Commit `609e36d372` ("KVM: x86: pass host_initiated to functions that read MSRs") modified kvm_get_msr_common function to use msr_info->data instead of data but missed one occurrence. Replace it and remove the unused local variable. Fixes: `609e36d372` ("KVM: x86: pass host_initiated to functions that read MSRs") Signed-off-by: Nicolas Iooss <nicolas.iooss_linux@m4x.org> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>	2015-07-03 18:55:19 +02:00
Linus Torvalds	4e241557fc	The bulk of the changes here is for x86. And for once it's not for silicon that no one owns: these are really new features for everyone. * ARM: several features are in progress but missed the 4.2 deadline. So here is just a smattering of bug fixes, plus enabling the VFIO integration. * s390: Some fixes/refactorings/optimizations, plus support for 2GB pages. * x86: 1) host and guest support for marking kvmclock as a stable scheduler clock. 2) support for write combining. 3) support for system management mode, needed for secure boot in guests. 4) a bunch of cleanups required for 2+3. 5) support for virtualized performance counters on AMD; 6) legacy PCI device assignment is deprecated and defaults to "n" in Kconfig; VFIO replaces it. On top of this there are also bug fixes and eager FPU context loading for FPU-heavy guests. * Common code: Support for multiple address spaces; for now it is used only for x86 SMM but the s390 folks also have plans. There are some x86 conflicts, one with the rc8 pull request and the rest with Ingo's FPU rework. -----BEGIN PGP SIGNATURE----- Version: GnuPG v2.0.22 (GNU/Linux) iQEcBAABAgAGBQJViYzhAAoJEL/70l94x66Dda0H/1IepMbfEy+o849d5G71fNTs F8Y8qUP2GZuL7T53FyFUGSBw+AX7kimu9ia4gR/PmDK+QYsdosYeEjwlsolZfTBf sHuzNtPoJhi5o1o/ur4NGameo0WjGK8f1xyzr+U8z74QDQyQv/QYCdK/4isp4BJL ugHNHkuROX6Zng4i7jc9rfaSRg29I3GBxQUYpMkEnD3eMYMUBWGm6Rs8pHgGAMvL vqzntgW00WNxehTqcAkmD/Wv+txxhkvIadZnjgaxH49e9JeXeBKTIR5vtb7Hns3s SuapZUyw+c95DIipXq4EznxxaOrjbebOeFgLCJo8+XMXZum8RZf/ob24KroYad0= =YsAR -----END PGP SIGNATURE----- Merge tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm Pull first batch of KVM updates from Paolo Bonzini: "The bulk of the changes here is for x86. And for once it's not for silicon that no one owns: these are really new features for everyone. Details: - ARM: several features are in progress but missed the 4.2 deadline. So here is just a smattering of bug fixes, plus enabling the VFIO integration. - s390: Some fixes/refactorings/optimizations, plus support for 2GB pages. - x86: * host and guest support for marking kvmclock as a stable scheduler clock. * support for write combining. * support for system management mode, needed for secure boot in guests. * a bunch of cleanups required for the above * support for virtualized performance counters on AMD * legacy PCI device assignment is deprecated and defaults to "n" in Kconfig; VFIO replaces it On top of this there are also bug fixes and eager FPU context loading for FPU-heavy guests. - Common code: Support for multiple address spaces; for now it is used only for x86 SMM but the s390 folks also have plans" * tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm: (124 commits) KVM: s390: clear floating interrupt bitmap and parameters KVM: x86/vPMU: Enable PMU handling for AMD PERFCTRn and EVNTSELn MSRs KVM: x86/vPMU: Implement AMD vPMU code for KVM KVM: x86/vPMU: Define kvm_pmu_ops to support vPMU function dispatch KVM: x86/vPMU: introduce kvm_pmu_msr_idx_to_pmc KVM: x86/vPMU: reorder PMU functions KVM: x86/vPMU: whitespace and stylistic adjustments in PMU code KVM: x86/vPMU: use the new macros to go between PMC, PMU and VCPU KVM: x86/vPMU: introduce pmu.h header KVM: x86/vPMU: rename a few PMU functions KVM: MTRR: do not map huge page for non-consistent range KVM: MTRR: simplify kvm_mtrr_get_guest_memory_type KVM: MTRR: introduce mtrr_for_each_mem_type KVM: MTRR: introduce fixed_mtrr_addr_* functions KVM: MTRR: sort variable MTRRs KVM: MTRR: introduce var_mtrr_range KVM: MTRR: introduce fixed_mtrr_segment table KVM: MTRR: improve kvm_mtrr_get_guest_memory_type KVM: MTRR: do not split 64 bits MSR content KVM: MTRR: clean up mtrr default type ...	2015-06-24 09:36:49 -07:00
Wei Huang	6912ac326d	KVM: x86/vPMU: Enable PMU handling for AMD PERFCTRn and EVNTSELn MSRs This patch enables AMD guest VM to access (R/W) PMU related MSRs, which include PERFCTR[0..3] and EVNTSEL[0..3]. Reviewed-by: Joerg Roedel <jroedel@suse.de> Tested-by: Joerg Roedel <jroedel@suse.de> Reviewed-by: Radim Krčmář <rkrcmar@redhat.com> Signed-off-by: Wei Huang <wei@redhat.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>	2015-06-23 14:12:15 +02:00
Wei Huang	474a5bb944	KVM: x86/vPMU: introduce pmu.h header This will be used for private function used by AMD- and Intel-specific PMU implementations. Signed-off-by: Wei Huang <wei@redhat.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>	2015-06-19 17:16:29 +02:00
Wei Huang	c6702c9dcf	KVM: x86/vPMU: rename a few PMU functions Before introducing a pmu.h header for them, make the naming more consistent. Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>	2015-06-19 17:16:29 +02:00
Xiao Guangrong	19efffa244	KVM: MTRR: sort variable MTRRs Sort all valid variable MTRRs based on its base address, it will help us to check a range to see if it's fully contained in variable MTRRs Signed-off-by: Xiao Guangrong <guangrong.xiao@linux.intel.com> [Fix list insertion sort, simplify var_mtrr_range_is_valid to just test the V bit. - Paolo] Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>	2015-06-19 17:16:28 +02:00
Xiao Guangrong	70109e7d9d	KVM: MTRR: remove mtrr_state.have_fixed vMTRR does not depend on any host MTRR feature and fixed MTRRs have always been implemented, so drop this field Signed-off-by: Xiao Guangrong <guangrong.xiao@linux.intel.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>	2015-06-19 17:16:27 +02:00
Xiao Guangrong	eb839917a7	KVM: MTRR: handle MSR_MTRRcap in kvm_mtrr_get_msr MSR_MTRRcap is a MTRR msr so move the handler to the common place, also add some comments to make the hard code more readable Signed-off-by: Xiao Guangrong <guangrong.xiao@linux.intel.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>	2015-06-19 17:16:27 +02:00
Xiao Guangrong	ff53604b40	KVM: x86: move MTRR related code to a separate file MTRR code locates in x86.c and mmu.c so that move them to a separate file to make the organization more clearer and it will be the place where we fully implement vMTRR Signed-off-by: Xiao Guangrong <guangrong.xiao@linux.intel.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>	2015-06-19 17:16:26 +02:00
Xiao Guangrong	b18d5431ac	KVM: x86: fix CR0.CD virtualization Currently, CR0.CD is not checked when we virtualize memory cache type for noncoherent_dma guests, this patch fixes it by : - setting UC for all memory if CR0.CD = 1 - zapping all the last sptes in MMU if CR0.CD is changed Signed-off-by: Xiao Guangrong <guangrong.xiao@linux.intel.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>	2015-06-19 17:16:26 +02:00
Paolo Bonzini	6d396b5520	KVM: x86: advertise KVM_CAP_X86_SMM ... and we're done. :) Because SMBASE is usually relocated above 1M on modern chipsets, and SMM handlers might indeed rely on 4G segment limits, we only expose it if KVM is able to run the guest in big real mode. This includes any of VMX+emulate_invalid_guest_state, VMX+unrestricted_guest, or SVM. Reviewed-by: Radim Krčmář <rkrcmar@redhat.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>	2015-06-05 17:26:38 +02:00
Paolo Bonzini	699023e239	KVM: x86: add SMM to the MMU role, support SMRAM address space This is now very simple to do. The only interesting part is a simple trick to find the right memslot in gfn_to_rmap, retrieving the address space from the spte role word. The same trick is used in the auditing code. The comment on top of union kvm_mmu_page_role has been stale forever, so remove it. Speaking of stale code, remove pad_for_nice_hex_output too: it was splitting the "access" bitfield across two bytes and thus had effectively turned into pad_for_ugly_hex_output. Reviewed-by: Radim Krčmář <rkrcmar@redhat.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>	2015-06-05 17:26:37 +02:00
Paolo Bonzini	9da0e4d5ac	KVM: x86: work on all available address spaces This patch has no semantic change, but it prepares for the introduction of a second address space for system management mode. A new function x86_set_memory_region (and the "slots_lock taken" counterpart __x86_set_memory_region) is introduced in order to operate on all address spaces when adding or deleting private memory slots. Reviewed-by: Radim Krčmář <rkrcmar@redhat.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>	2015-06-05 17:26:37 +02:00
Paolo Bonzini	54bf36aac5	KVM: x86: use vcpu-specific functions to read/write/translate GFNs We need to hide SMRAM from guests not running in SMM. Therefore, all uses of kvm_read_guest* and kvm_write_guest* must be changed to check whether the VCPU is in system management mode and use a different set of memslots. Switch from kvm_* to the newly-introduced kvm_vcpu_*, which call into kvm_arch_vcpu_memslots_id. Reviewed-by: Radim Krčmář <rkrcmar@redhat.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>	2015-06-05 17:26:36 +02:00
Paolo Bonzini	660a5d517a	KVM: x86: save/load state on SMM switch The big ugly one. This patch adds support for switching in and out of system management mode, respectively upon receiving KVM_REQ_SMI and upon executing a RSM instruction. Both 32- and 64-bit formats are supported for the SMM state save area. Reviewed-by: Radim Krčmář <rkrcmar@redhat.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>	2015-06-04 16:17:46 +02:00
Paolo Bonzini	cd7764fe9f	KVM: x86: latch INITs while in system management mode Do not process INITs immediately while in system management mode, keep it instead in apic->pending_events. Tell userspace if an INIT is pending when they issue GET_VCPU_EVENTS, and similarly handle the new field in SET_VCPU_EVENTS. Note that the same treatment should be done while in VMX non-root mode. Reviewed-by: Radim Krčmář <rkrcmar@redhat.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>	2015-06-04 16:01:51 +02:00
Paolo Bonzini	64d6067057	KVM: x86: stubs for SMM support This patch adds the interface between x86.c and the emulator: the SMBASE register, a new emulator flag, the RSM instruction. It also adds a new request bit that will be used by the KVM_SMI ioctl. Reviewed-by: Radim Krčmář <rkrcmar@redhat.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>	2015-06-04 16:01:45 +02:00
Paolo Bonzini	f077825a87	KVM: x86: API changes for SMM support This patch includes changes to the external API for SMM support. Userspace can predicate the availability of the new fields and ioctls on a new capability, KVM_CAP_X86_SMM, which is added at the end of the patch series. Reviewed-by: Radim Krčmář <rkrcmar@redhat.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>	2015-06-04 16:01:11 +02:00
Paolo Bonzini	a584539b24	KVM: x86: pass the whole hflags field to emulator and back The hflags field will contain information about system management mode and will be useful for the emulator. Pass the entire field rather than just the guest-mode information. Reviewed-by: Radim Krčmář <rkrcmar@redhat.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>	2015-06-04 16:01:05 +02:00

1 2 3 4 5 ...

1114 Commits