2013-01-21 06:28:06 +07:00
|
|
|
/*
|
|
|
|
* Copyright (C) 2012 - Virtual Open Systems and Columbia University
|
|
|
|
* Author: Christoffer Dall <c.dall@virtualopensystems.com>
|
|
|
|
*
|
|
|
|
* This program is free software; you can redistribute it and/or modify
|
|
|
|
* it under the terms of the GNU General Public License, version 2, as
|
|
|
|
* published by the Free Software Foundation.
|
|
|
|
*
|
|
|
|
* This program is distributed in the hope that it will be useful,
|
|
|
|
* but WITHOUT ANY WARRANTY; without even the implied warranty of
|
|
|
|
* MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
|
|
|
|
* GNU General Public License for more details.
|
|
|
|
*
|
|
|
|
* You should have received a copy of the GNU General Public License
|
|
|
|
* along with this program; if not, write to the Free Software
|
|
|
|
* Foundation, 51 Franklin Street, Fifth Floor, Boston, MA 02110-1301, USA.
|
|
|
|
*/
|
|
|
|
|
|
|
|
#ifndef __ARM_KVM_HOST_H__
|
|
|
|
#define __ARM_KVM_HOST_H__
|
|
|
|
|
2014-08-29 19:01:17 +07:00
|
|
|
#include <linux/types.h>
|
|
|
|
#include <linux/kvm_types.h>
|
2013-01-21 06:28:06 +07:00
|
|
|
#include <asm/kvm.h>
|
|
|
|
#include <asm/kvm_asm.h>
|
2013-01-21 06:43:58 +07:00
|
|
|
#include <asm/kvm_mmio.h>
|
KVM: ARM: World-switch implementation
Provides complete world-switch implementation to switch to other guests
running in non-secure modes. Includes Hyp exception handlers that
capture necessary exception information and stores the information on
the VCPU and KVM structures.
The following Hyp-ABI is also documented in the code:
Hyp-ABI: Calling HYP-mode functions from host (in SVC mode):
Switching to Hyp mode is done through a simple HVC #0 instruction. The
exception vector code will check that the HVC comes from VMID==0 and if
so will push the necessary state (SPSR, lr_usr) on the Hyp stack.
- r0 contains a pointer to a HYP function
- r1, r2, and r3 contain arguments to the above function.
- The HYP function will be called with its arguments in r0, r1 and r2.
On HYP function return, we return directly to SVC.
A call to a function executing in Hyp mode is performed like the following:
<svc code>
ldr r0, =BSYM(my_hyp_fn)
ldr r1, =my_param
hvc #0 ; Call my_hyp_fn(my_param) from HYP mode
<svc code>
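The register convention above can be modelled in plain C. This is only an illustrative sketch, not the kernel's vector code: struct hvc_regs, hyp_dispatch() and the body of my_hyp_fn() are hypothetical names introduced here, and the sketch assumes a function pointer fits in a uintptr_t.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical model of r0-r3 as seen by the Hyp vector on HVC #0. */
struct hvc_regs {
	uintptr_t r0;		/* pointer to the HYP function */
	uintptr_t r1, r2, r3;	/* arguments to that function */
};

typedef uintptr_t (*hyp_fn_t)(uintptr_t, uintptr_t, uintptr_t);

/*
 * Model of the dispatch step: the arguments passed in r1..r3 are
 * shifted down so that the HYP function receives them in r0..r2.
 */
static uintptr_t hyp_dispatch(const struct hvc_regs *regs)
{
	hyp_fn_t fn = (hyp_fn_t)regs->r0;

	assert(fn != 0);
	return fn(regs->r1, regs->r2, regs->r3);
}

/* Hypothetical HYP function: returns the sum of its first two arguments. */
static uintptr_t my_hyp_fn(uintptr_t a, uintptr_t b, uintptr_t c)
{
	(void)c;
	return a + b;
}
```

Filling r0 with (uintptr_t)my_hyp_fn and r1, r2 with two values, then calling hyp_dispatch(), invokes my_hyp_fn with those two values as its first and second arguments.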
Otherwise, the world-switch is pretty straightforward. All state that
can be modified by the guest is first backed up on the Hyp stack and the
VCPU values are loaded onto the hardware. State which is not loaded, but
is theoretically modifiable by the guest, is protected through the
virtualization features to generate a trap and cause software emulation.
When the guest returns, all state is saved from hardware onto the VCPU
struct and the original state is restored from the Hyp stack onto the
hardware.
SMP support using the VMPIDR calculated on the basis of the host MPIDR
and overriding the low bits with KVM vcpu_id contributed by Marc Zyngier.
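The VMPIDR calculation can be sketched as a simple mask-and-or. This is a minimal illustration under stated assumptions, not the code from the patch: make_vmpidr() and VCPU_ID_MASK are hypothetical names, and the sketch assumes the "low bits" are bits [7:0] of the register.

```c
#include <assert.h>
#include <stdint.h>

/* Assumption: the vcpu_id replaces bits [7:0] of the host MPIDR. */
#define VCPU_ID_MASK 0xffu

/* Build a guest-visible MPIDR from the host value and the KVM vcpu_id. */
static uint32_t make_vmpidr(uint32_t host_mpidr, uint32_t vcpu_id)
{
	/* The id must fit in the bits being overridden. */
	assert(vcpu_id <= VCPU_ID_MASK);
	return (host_mpidr & ~VCPU_ID_MASK) | vcpu_id;
}
```

With a host value of 0x80000f01 and vcpu_id 3, the upper bits are preserved and only the low byte changes, giving 0x80000f03.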
Reuse of VMIDs has been implemented by Antonios Motakis and adapted from
a separate patch into the appropriate patches introducing the
functionality. Note that the VMIDs are stored per VM as required by the ARM
architecture reference manual.
To support VFP/NEON we trap those instructions using the HCPTR. When
we trap, we switch the FPU. After a guest exit, the VFP state is
returned to the host. When disabling access to floating point
instructions, we also mask FPEXC_EN in order to avoid the guest
receiving Undefined instruction exceptions before we have a chance to
switch back the floating point state. We are reusing vfp_hard_struct,
so we depend on VFPv3 being enabled in the host kernel; if not, we still
trap cp10 and cp11 in order to inject an undefined instruction exception
whenever the guest tries to use VFP/NEON. VFP/NEON developed by
Antonios Motakis and Rusty Russell.
Aborts that are permission faults, and not stage-1 page table walk, do
not report the faulting address in the HPFAR. We have to resolve the
IPA, and store it just like the HPFAR register on the VCPU struct. If
the IPA cannot be resolved, it means another CPU is playing with the
page tables, and we simply restart the guest. This quirk was fixed by
Marc Zyngier.
Reviewed-by: Will Deacon <will.deacon@arm.com>
Reviewed-by: Marcelo Tosatti <mtosatti@redhat.com>
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
Signed-off-by: Antonios Motakis <a.motakis@virtualopensystems.com>
Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
Signed-off-by: Christoffer Dall <c.dall@virtualopensystems.com>
2013-01-21 06:47:42 +07:00
|
|
|
#include <asm/fpstate.h>
|
ARM: KVM: move GIC/timer code to a common location
As KVM/arm64 is looming on the horizon, it makes sense to move some
of the common code to a single location in order to reduce duplication.
The code could live anywhere. Actually, most of KVM is already built
with a bunch of ugly ../../.. hacks in the various Makefiles, so we're
not exactly talking about style here. But maybe it is time to start
moving into a less ugly direction.
The include files must be in a "public" location, as they are accessed
from non-KVM files (arch/arm/kernel/asm-offsets.c).
For this purpose, introduce two new locations:
- virt/kvm/arm/ : x86 and ia64 already share the ioapic code in
virt/kvm, so this could be seen as a (very ugly) precedent.
- include/kvm/ : there is already an include/xen, and while the
intent is slightly different, this seems as good a location as
any
Eventually, we should probably have independent Makefiles at every
level (just like everywhere else in the kernel), but this is just
the first step.
Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
Signed-off-by: Gleb Natapov <gleb@redhat.com>
2013-05-14 20:31:01 +07:00
|
|
|
#include <kvm/arm_arch_timer.h>
|
2013-01-21 06:28:06 +07:00
|
|
|
|
2015-03-04 17:14:34 +07:00
|
|
|
#define __KVM_HAVE_ARCH_INTC_INITIALIZED
|
|
|
|
|
2013-02-16 02:20:07 +07:00
|
|
|
#define KVM_USER_MEM_SLOTS 32
|
2013-01-21 06:28:10 +07:00
|
|
|
#define KVM_HAVE_ONE_REG
|
2015-09-18 17:34:53 +07:00
|
|
|
#define KVM_HALT_POLL_NS_DEFAULT 500000
|
2013-01-21 06:28:06 +07:00
|
|
|
|
2014-04-29 12:54:16 +07:00
|
|
|
#define KVM_VCPU_MAX_FEATURES 2
|
2013-01-21 06:28:06 +07:00
|
|
|
|
2013-05-14 20:31:01 +07:00
|
|
|
#include <kvm/arm_vgic.h>
|
2013-01-22 07:36:12 +07:00
|
|
|
|
2016-09-12 21:49:24 +07:00
|
|
|
|
|
|
|
#ifdef CONFIG_ARM_GIC_V3
|
|
|
|
#define KVM_MAX_VCPUS VGIC_V3_MAX_CPUS
|
|
|
|
#else
|
arm/arm64: KVM: Remove 'config KVM_ARM_MAX_VCPUS'
This patch removes the config option KVM_ARM_MAX_VCPUS
and, like other ARCHs, just chooses the maximum value allowed
by the hardware, for the following reasons:
1) from a distribution's point of view, the option has to be
defined as the maximum allowed value because it needs to
meet all kinds of virtualization applications and
needs to support most SoCs;
2) using a bigger value doesn't introduce extra memory
consumption, and the help text in Kconfig isn't accurate
because the kvm_vcpu structure isn't allocated until a request
to create a VCPU is sent from QEMU;
3) the main effect is that the vcpus[] field in 'struct kvm'
becomes a bit bigger (sizeof(void *) per vcpu) and needs more cache
lines to hold the structure, but 'struct kvm' is a generic struct,
and it has already worked well on other ARCHs this way. Also,
the world switch frequency is often low; for example, it is ~2000
when running a kernel build load in a VM on an APM xgene KVM host,
so the effect is very small, and the difference can't be observed
in my test at all.
Cc: Dann Frazier <dann.frazier@canonical.com>
Signed-off-by: Ming Lei <ming.lei@canonical.com>
Reviewed-by: Christoffer Dall <christoffer.dall@linaro.org>
Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
2015-09-02 13:31:21 +07:00
|
|
|
#define KVM_MAX_VCPUS VGIC_V2_MAX_CPUS
|
2016-09-12 21:49:24 +07:00
|
|
|
#endif
|
2015-09-02 13:31:21 +07:00
|
|
|
|
2017-06-04 19:43:58 +07:00
|
|
|
#define KVM_REQ_SLEEP \
|
2017-06-04 19:43:51 +07:00
|
|
|
KVM_ARCH_REQ_FLAGS(0, KVM_REQUEST_WAIT | KVM_REQUEST_NO_WAKEUP)
|
2017-06-04 19:43:59 +07:00
|
|
|
#define KVM_REQ_IRQ_PENDING KVM_ARCH_REQ(1)
|
2016-04-27 16:28:00 +07:00
|
|
|
|
2013-01-21 06:28:06 +07:00
|
|
|
u32 *kvm_vcpu_reg(struct kvm_vcpu *vcpu, u8 reg_num, u32 mode);
|
2014-08-26 21:13:20 +07:00
|
|
|
int __attribute_const__ kvm_target_cpu(void);
|
2013-01-21 06:28:06 +07:00
|
|
|
int kvm_reset_vcpu(struct kvm_vcpu *vcpu);
|
|
|
|
void kvm_reset_coprocs(struct kvm_vcpu *vcpu);
|
|
|
|
|
|
|
|
struct kvm_arch {
|
|
|
|
/* VTTBR value associated with below pgd and vmid */
|
|
|
|
u64 vttbr;
|
|
|
|
|
2016-10-19 00:37:49 +07:00
|
|
|
/* The last vcpu id that ran on each physical CPU */
|
|
|
|
int __percpu *last_vcpu_ran;
|
|
|
|
|
2013-01-21 06:28:06 +07:00
|
|
|
/*
|
|
|
|
* Anything that is not used directly from assembly code goes
|
|
|
|
* here.
|
|
|
|
*/
|
|
|
|
|
|
|
|
/* The VMID generation used for the virt. memory system */
|
|
|
|
u64 vmid_gen;
|
|
|
|
u32 vmid;
|
|
|
|
|
|
|
|
/* Stage-2 page table */
|
|
|
|
pgd_t *pgd;
|
2013-01-22 07:36:12 +07:00
|
|
|
|
|
|
|
/* Interrupt controller */
|
|
|
|
struct vgic_dist vgic;
|
2014-06-02 21:26:01 +07:00
|
|
|
int max_vcpus;
|
2013-01-21 06:28:06 +07:00
|
|
|
};
|
|
|
|
|
|
|
|
#define KVM_NR_MEM_OBJS 40
|
|
|
|
|
|
|
|
/*
|
|
|
|
* We don't want allocation failures within the mmu code, so we preallocate
|
|
|
|
* enough memory for a single page fault in a cache.
|
|
|
|
*/
|
|
|
|
struct kvm_mmu_memory_cache {
|
|
|
|
int nobjs;
|
|
|
|
void *objects[KVM_NR_MEM_OBJS];
|
|
|
|
};
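The preallocation idea described in the comment above can be illustrated in user space. This is a simplified sketch, not the kernel's MMU code: struct mem_cache, cache_topup(), cache_alloc() and cache_free_all() are hypothetical names, calloc() stands in for the kernel page allocator, and OBJ_SIZE is an assumed object size.

```c
#include <assert.h>
#include <stdlib.h>

#define NR_MEM_OBJS 40		/* mirrors KVM_NR_MEM_OBJS */
#define OBJ_SIZE 4096		/* assumption: one 4 KiB page per object */

struct mem_cache {
	int nobjs;
	void *objects[NR_MEM_OBJS];
};

/*
 * Fill the cache until it holds at least `min` objects. This step may
 * fail, so it is meant to run before entering the critical section.
 */
static int cache_topup(struct mem_cache *mc, int min)
{
	assert(min <= NR_MEM_OBJS);
	while (mc->nobjs < min) {
		void *obj = calloc(1, OBJ_SIZE);

		if (!obj)
			return -1;
		mc->objects[mc->nobjs++] = obj;
	}
	return 0;
}

/* Hand out a preallocated object; performs no allocation itself. */
static void *cache_alloc(struct mem_cache *mc)
{
	assert(mc->nobjs > 0);
	return mc->objects[--mc->nobjs];
}

/* Release whatever is still held in the cache. */
static void cache_free_all(struct mem_cache *mc)
{
	while (mc->nobjs > 0)
		free(mc->objects[--mc->nobjs]);
}
```

The design point is the split between the two phases: cache_topup() is the only place that can fail, so a fault handler can call it up front and then rely on cache_alloc() succeeding while locks are held.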
|
|
|
|
|
2012-09-18 01:27:09 +07:00
|
|
|
struct kvm_vcpu_fault_info {
|
|
|
|
u32 hsr; /* Hyp Syndrome Register */
|
|
|
|
u32 hxfar; /* Hyp Data/Inst. Fault Address Register */
|
|
|
|
u32 hpfar; /* Hyp IPA Fault Address Register */
|
|
|
|
};
|
|
|
|
|
2016-01-06 05:53:33 +07:00
|
|
|
/*
|
|
|
|
* 0 is reserved as an invalid value.
|
|
|
|
* Order should be kept in sync with the save/restore code.
|
|
|
|
*/
|
|
|
|
enum vcpu_sysreg {
|
|
|
|
__INVALID_SYSREG__,
|
|
|
|
c0_MPIDR, /* MultiProcessor ID Register */
|
|
|
|
c0_CSSELR, /* Cache Size Selection Register */
|
|
|
|
c1_SCTLR, /* System Control Register */
|
|
|
|
c1_ACTLR, /* Auxiliary Control Register */
|
|
|
|
c1_CPACR, /* Coprocessor Access Control */
|
|
|
|
c2_TTBR0, /* Translation Table Base Register 0 */
|
|
|
|
c2_TTBR0_high, /* TTBR0 top 32 bits */
|
|
|
|
c2_TTBR1, /* Translation Table Base Register 1 */
|
|
|
|
c2_TTBR1_high, /* TTBR1 top 32 bits */
|
|
|
|
c2_TTBCR, /* Translation Table Base Control R. */
|
|
|
|
c3_DACR, /* Domain Access Control Register */
|
|
|
|
c5_DFSR, /* Data Fault Status Register */
|
|
|
|
c5_IFSR, /* Instruction Fault Status Register */
|
|
|
|
c5_ADFSR, /* Auxiliary Data Fault Status R */
|
|
|
|
c5_AIFSR, /* Auxiliary Instruction Fault Status R */
|
|
|
|
c6_DFAR, /* Data Fault Address Register */
|
|
|
|
c6_IFAR, /* Instruction Fault Address Register */
|
|
|
|
c7_PAR, /* Physical Address Register */
|
|
|
|
c7_PAR_high, /* PAR top 32 bits */
|
|
|
|
c9_L2CTLR, /* Cortex A15/A7 L2 Control Register */
|
|
|
|
c10_PRRR, /* Primary Region Remap Register */
|
|
|
|
c10_NMRR, /* Normal Memory Remap Register */
|
|
|
|
c12_VBAR, /* Vector Base Address Register */
|
|
|
|
c13_CID, /* Context ID Register */
|
|
|
|
c13_TID_URW, /* Thread ID, User R/W */
|
|
|
|
c13_TID_URO, /* Thread ID, User R/O */
|
|
|
|
c13_TID_PRIV, /* Thread ID, Privileged */
|
|
|
|
c14_CNTKCTL, /* Timer Control Register (PL1) */
|
|
|
|
c10_AMAIR0, /* Auxiliary Memory Attribute Indirection Reg0 */
|
|
|
|
c10_AMAIR1, /* Auxiliary Memory Attribute Indirection Reg1 */
|
|
|
|
NR_CP15_REGS /* Number of regs (incl. invalid) */
|
|
|
|
};
|
|
|
|
|
2016-01-03 18:01:49 +07:00
|
|
|
struct kvm_cpu_context {
|
2016-01-03 18:26:01 +07:00
|
|
|
struct kvm_regs gp_regs;
|
2016-01-03 18:01:49 +07:00
|
|
|
struct vfp_hard_struct vfp;
|
2016-01-03 18:26:01 +07:00
|
|
|
u32 cp15[NR_CP15_REGS];
|
2016-01-03 18:01:49 +07:00
|
|
|
};
|
|
|
|
|
|
|
|
typedef struct kvm_cpu_context kvm_cpu_context_t;
|
2012-10-28 00:23:25 +07:00
|
|
|
|
2013-01-21 06:28:06 +07:00
|
|
|
struct kvm_vcpu_arch {
|
2016-01-03 18:01:49 +07:00
|
|
|
struct kvm_cpu_context ctxt;
|
|
|
|
|
2013-01-21 06:28:06 +07:00
|
|
|
int target; /* Processor target */
|
|
|
|
DECLARE_BITMAP(features, KVM_VCPU_MAX_FEATURES);
|
|
|
|
|
|
|
|
/* The CPU type we expose to the VM */
|
|
|
|
u32 midr;
|
|
|
|
|
2014-01-22 16:43:38 +07:00
|
|
|
/* HYP trapping configuration */
|
|
|
|
u32 hcr;
|
|
|
|
|
|
|
|
/* Interrupt related fields */
|
|
|
|
u32 irq_lines; /* IRQ and FIQ levels */
|
|
|
|
|
2013-01-21 06:28:06 +07:00
|
|
|
/* Exception Information */
|
2012-09-18 01:27:09 +07:00
|
|
|
struct kvm_vcpu_fault_info fault;
|
2013-01-21 06:28:06 +07:00
|
|
|
|
2013-04-08 22:47:19 +07:00
|
|
|
/* Host FP context */
|
|
|
|
kvm_cpu_context_t *host_cpu_context;
|
2013-01-21 06:47:42 +07:00
|
|
|
|
2013-01-22 07:36:12 +07:00
|
|
|
/* VGIC state */
|
|
|
|
struct vgic_cpu vgic_cpu;
|
2013-01-24 01:21:58 +07:00
|
|
|
struct arch_timer_cpu timer_cpu;
|
2013-01-22 07:36:12 +07:00
|
|
|
|
2013-01-21 06:47:42 +07:00
|
|
|
/*
|
|
|
|
* Anything that is not used directly from assembly code goes
|
|
|
|
* here.
|
|
|
|
*/
|
2013-01-21 06:28:09 +07:00
|
|
|
|
2015-09-26 04:41:14 +07:00
|
|
|
/* vcpu power-off state */
|
|
|
|
bool power_off;
|
2013-01-21 06:28:13 +07:00
|
|
|
|
2015-09-26 04:41:17 +07:00
|
|
|
/* Don't run the guest (internal implementation need) */
|
|
|
|
bool pause;
|
|
|
|
|
2013-01-21 06:43:58 +07:00
|
|
|
/* IO related fields */
|
|
|
|
struct kvm_decode mmio_decode;
|
|
|
|
|
2013-01-21 06:28:06 +07:00
|
|
|
/* Cache some mmu pages needed inside spinlock regions */
|
|
|
|
struct kvm_mmu_memory_cache mmu_page_cache;
|
2013-01-21 06:47:42 +07:00
|
|
|
|
|
|
|
/* Detect first run of a vcpu */
|
|
|
|
bool has_run_once;
|
2013-01-21 06:28:06 +07:00
|
|
|
};
|
|
|
|
|
|
|
|
struct kvm_vm_stat {
|
2016-08-02 11:03:22 +07:00
|
|
|
ulong remote_tlb_flush;
|
2013-01-21 06:28:06 +07:00
|
|
|
};
|
|
|
|
|
|
|
|
struct kvm_vcpu_stat {
|
2016-08-02 11:03:22 +07:00
|
|
|
u64 halt_successful_poll;
|
|
|
|
u64 halt_attempted_poll;
|
|
|
|
u64 halt_poll_invalid;
|
|
|
|
u64 halt_wakeup;
|
|
|
|
u64 hvc_exit_stat;
|
2015-11-26 17:09:43 +07:00
|
|
|
u64 wfe_exit_stat;
|
|
|
|
u64 wfi_exit_stat;
|
|
|
|
u64 mmio_exit_user;
|
|
|
|
u64 mmio_exit_kernel;
|
|
|
|
u64 exits;
|
2013-01-21 06:28:06 +07:00
|
|
|
};
|
|
|
|
|
2016-01-03 18:26:01 +07:00
|
|
|
#define vcpu_cp15(v,r) (v)->arch.ctxt.cp15[r]
|
|
|
|
|
2013-09-30 15:50:05 +07:00
|
|
|
int kvm_vcpu_preferred_target(struct kvm_vcpu_init *init);
|
2013-01-21 06:28:06 +07:00
|
|
|
unsigned long kvm_arm_num_regs(struct kvm_vcpu *vcpu);
|
|
|
|
int kvm_arm_copy_reg_indices(struct kvm_vcpu *vcpu, u64 __user *indices);
|
|
|
|
int kvm_arm_get_reg(struct kvm_vcpu *vcpu, const struct kvm_one_reg *reg);
|
|
|
|
int kvm_arm_set_reg(struct kvm_vcpu *vcpu, const struct kvm_one_reg *reg);
|
2016-01-06 19:10:58 +07:00
|
|
|
unsigned long kvm_call_hyp(void *hypfn, ...);
|
2013-01-21 06:47:42 +07:00
|
|
|
void force_vm_exit(const cpumask_t *mask);
|
2013-01-21 06:28:07 +07:00
|
|
|
|
|
|
|
#define KVM_ARCH_WANT_MMU_NOTIFIER
|
|
|
|
int kvm_unmap_hva(struct kvm *kvm, unsigned long hva);
|
|
|
|
int kvm_unmap_hva_range(struct kvm *kvm,
|
|
|
|
unsigned long start, unsigned long end);
|
|
|
|
void kvm_set_spte_hva(struct kvm *kvm, unsigned long hva, pte_t pte);
|
|
|
|
|
2013-01-21 06:28:10 +07:00
|
|
|
unsigned long kvm_arm_num_regs(struct kvm_vcpu *vcpu);
|
|
|
|
int kvm_arm_copy_reg_indices(struct kvm_vcpu *vcpu, u64 __user *indices);
|
2015-03-13 01:16:51 +07:00
|
|
|
int kvm_age_hva(struct kvm *kvm, unsigned long start, unsigned long end);
|
|
|
|
int kvm_test_age_hva(struct kvm *kvm, unsigned long hva);
|
2013-01-21 06:28:10 +07:00
|
|
|
|
2013-01-22 07:36:11 +07:00
|
|
|
struct kvm_vcpu *kvm_arm_get_running_vcpu(void);
|
|
|
|
struct kvm_vcpu __percpu **kvm_get_running_vcpus(void);
|
2016-04-27 16:28:00 +07:00
|
|
|
void kvm_arm_halt_guest(struct kvm *kvm);
|
|
|
|
void kvm_arm_resume_guest(struct kvm *kvm);
|
2013-01-22 07:36:11 +07:00
|
|
|
|
|
|
|
int kvm_arm_copy_coproc_indices(struct kvm_vcpu *vcpu, u64 __user *uindices);
|
|
|
|
unsigned long kvm_arm_num_coproc_regs(struct kvm_vcpu *vcpu);
|
|
|
|
int kvm_arm_coproc_get_reg(struct kvm_vcpu *vcpu, const struct kvm_one_reg *);
|
|
|
|
int kvm_arm_coproc_set_reg(struct kvm_vcpu *vcpu, const struct kvm_one_reg *);
|
|
|
|
|
2012-10-05 17:11:11 +07:00
|
|
|
int handle_exit(struct kvm_vcpu *vcpu, struct kvm_run *run,
|
|
|
|
int exception_index);
|
|
|
|
|
2016-07-01 00:40:45 +07:00
|
|
|
static inline void __cpu_init_hyp_mode(phys_addr_t pgd_ptr,
|
2012-10-05 21:10:44 +07:00
|
|
|
unsigned long hyp_stack_ptr,
|
|
|
|
unsigned long vector_ptr)
|
|
|
|
{
|
|
|
|
/*
|
ARM: KVM: switch to a dual-step HYP init code
Our HYP init code suffers from two major design issues:
- it cannot support CPU hotplug, as we tear down the idmap very early
- it cannot perform a TLB invalidation when switching from init to
runtime mappings, as pages are manipulated from PL1 exclusively
The hotplug problem mandates that we keep two sets of page tables
(boot and runtime). The TLB problem mandates that we're able to
transition from one PGD to another while in HYP, invalidating the TLBs
in the process.
To be able to do this, we need to share a page between the two page
tables. A page that will have the same VA in both configurations. All we
need is a VA that has the following properties:
- This VA can't be used to represent a kernel mapping.
- This VA will not conflict with the physical address of the kernel text
The vectors page seems to satisfy this requirement:
- The kernel never maps anything else there
- The kernel text being copied at the beginning of the physical memory,
it is unlikely to use the last 64kB (I doubt we'll ever support KVM
on a system with something like 4MB of RAM, but patches are very
welcome).
Let's call this VA the trampoline VA.
Now, we map our init page at 3 locations:
- idmap in the boot pgd
- trampoline VA in the boot pgd
- trampoline VA in the runtime pgd
The init scenario is now the following:
- We jump in HYP with four parameters: boot HYP pgd, runtime HYP pgd,
runtime stack, runtime vectors
- Enable the MMU with the boot pgd
- Jump to a target into the trampoline page (remember, this is the same
physical page!)
- Now switch to the runtime pgd (same VA, and still the same physical
page!)
- Invalidate TLBs
- Set stack and vectors
- Profit! (or eret, if you only care about the code).
Note that we keep the boot mapping permanently (it is not strictly an
idmap anymore) to allow for CPU hotplug in later patches.
Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
Signed-off-by: Christoffer Dall <cdall@cs.columbia.edu>
2013-04-13 01:12:06 +07:00
|
|
|
* Call initialization code, and switch to the full-blown HYP
|
|
|
|
* code. The init code doesn't need to preserve these
|
|
|
|
* registers as r0-r3 are already caller-saved according to
|
|
|
|
* the AAPCS.
|
2016-07-01 00:40:47 +07:00
|
|
|
* Note that we slightly misuse the prototype by casting the
|
2013-04-13 01:12:06 +07:00
|
|
|
* stack pointer to a void *.
|
|
|
|
|
2016-07-01 00:40:47 +07:00
|
|
|
* The PGD is always passed as the third argument, in order
|
|
|
|
* to be passed into r2-r3 to the init code (yes, this is
|
|
|
|
* compliant with the PCS!).
|
|
|
|
*/
|
2013-04-13 01:12:06 +07:00
|
|
|
|
|
|
|
kvm_call_hyp((void*)hyp_stack_ptr, vector_ptr, pgd_ptr);
|
2012-10-05 21:10:44 +07:00
|
|
|
}
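The argument order used in the kvm_call_hyp() call above can be checked against the AAPCS core-register rules. The following is a minimal sketch, not kernel code: aapcs_assign_core_regs() is a hypothetical helper that models only 4- and 8-byte integer arguments, and it assumes phys_addr_t is 64 bits wide (as with LPAE), so that the PGD needs an even/odd register pair.

```c
#include <assert.h>
#include <stddef.h>

/*
 * Hypothetical model of AAPCS core-register assignment (r0-r3) for
 * integer arguments of 4 or 8 bytes. For each argument, first_reg[i]
 * receives the index of the first register used, or -1 if the argument
 * is passed on the stack.
 */
static void aapcs_assign_core_regs(const size_t *sizes, size_t n, int *first_reg)
{
    int ncrn = 0; /* next core register number */

    for (size_t i = 0; i < n; i++) {
        int words;

        assert(sizes[i] == 4 || sizes[i] == 8);
        words = (int)(sizes[i] / 4);

        /* 8-byte aligned arguments start on an even register. */
        if (sizes[i] == 8 && (ncrn & 1))
            ncrn++;

        if (ncrn + words <= 4) {
            first_reg[i] = ncrn;
            ncrn += words;
        } else {
            /* Once an argument spills, all later ones use the stack. */
            first_reg[i] = -1;
            ncrn = 4;
        }
    }
}
```

With sizes {4, 4, 8} (stack pointer, vectors, PGD) the model yields r0, r1 and r2-r3. Placing the 64-bit value second, as in {4, 8, 4}, would skip r1 and push the last argument to the stack, which is why the 64-bit value goes last.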
|
|
|
|
|
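The trampoline idea from the dual-step HYP init commit message quoted above can be illustrated with a toy model. This is only a sketch under stated assumptions: struct toy_pgd and the toy_* helpers are hypothetical, page tables are flattened to a small array of page numbers, and nothing here touches real hardware or kernel APIs.

```c
#include <assert.h>
#include <stdint.h>

#define TOY_NPAGES   8            /* size of the toy address space, in pages */
#define TOY_UNMAPPED UINT32_MAX   /* marker for "no translation" */

/* Toy page table: index is a VA page number, value is a PA page number. */
struct toy_pgd {
    uint32_t pa_of_va[TOY_NPAGES];
};

static void toy_pgd_clear(struct toy_pgd *pgd)
{
    for (int i = 0; i < TOY_NPAGES; i++)
        pgd->pa_of_va[i] = TOY_UNMAPPED;
}

static uint32_t toy_translate(const struct toy_pgd *pgd, uint32_t va_page)
{
    assert(va_page < TOY_NPAGES);
    return pgd->pa_of_va[va_page];
}

/* Map the init page at the three locations listed in the commit message. */
static void toy_map_init_page(struct toy_pgd *boot, struct toy_pgd *runtime,
                              uint32_t init_pa, uint32_t trampoline_va)
{
    assert(init_pa < TOY_NPAGES && trampoline_va < TOY_NPAGES);
    boot->pa_of_va[init_pa] = init_pa;          /* idmap in the boot pgd */
    boot->pa_of_va[trampoline_va] = init_pa;    /* trampoline VA, boot pgd */
    runtime->pa_of_va[trampoline_va] = init_pa; /* trampoline VA, runtime pgd */
}

/*
 * Nonzero if code running at va_page keeps executing from the same
 * physical page across a switch from the boot pgd to the runtime pgd.
 */
static int toy_switch_is_safe(const struct toy_pgd *boot,
                              const struct toy_pgd *runtime,
                              uint32_t va_page)
{
    uint32_t pa = toy_translate(boot, va_page);

    return pa != TOY_UNMAPPED && pa == toy_translate(runtime, va_page);
}
```

In this model, switching tables while executing at the trampoline VA is safe, whereas switching while still executing at the idmap address is not, because the runtime table has no mapping there.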
2016-02-02 00:54:35 +07:00
|
|
|
static inline void __cpu_init_stage2(void)
|
|
|
|
{
|
2016-02-02 02:56:31 +07:00
|
|
|
kvm_call_hyp(__init_stage2_translation);
|
2016-02-02 00:54:35 +07:00
|
|
|
}
|
|
|
|
|
2016-07-15 18:43:25 +07:00
|
|
|
static inline int kvm_arch_dev_ioctl_check_extension(struct kvm *kvm, long ext)
|
2013-04-08 22:47:18 +07:00
|
|
|
{
|
|
|
|
return 0;
|
|
|
|
}
|
|
|
|
|
2013-03-05 10:18:00 +07:00
|
|
|
int kvm_perf_init(void);
|
|
|
|
int kvm_perf_teardown(void);
|
|
|
|
|
2015-01-16 06:58:56 +07:00
|
|
|
void kvm_mmu_wp_memory_region(struct kvm *kvm, int slot);
|
|
|
|
|
2014-06-02 20:37:13 +07:00
|
|
|
struct kvm_vcpu *kvm_mpidr_to_vcpu(struct kvm *kvm, unsigned long mpidr);
|
|
|
|
|
2014-08-28 20:13:02 +07:00
|
|
|
static inline void kvm_arch_hardware_unsetup(void) {}
|
|
|
|
static inline void kvm_arch_sync_events(struct kvm *kvm) {}
|
|
|
|
static inline void kvm_arch_vcpu_uninit(struct kvm_vcpu *vcpu) {}
|
|
|
|
static inline void kvm_arch_sched_in(struct kvm_vcpu *vcpu, int cpu) {}
|
2016-05-13 17:16:35 +07:00
|
|
|
static inline void kvm_arch_vcpu_block_finish(struct kvm_vcpu *vcpu) {}
|
2014-08-28 20:13:02 +07:00
|
|
|
|
2015-07-07 23:29:56 +07:00
|
|
|
static inline void kvm_arm_init_debug(void) {}
|
|
|
|
static inline void kvm_arm_setup_debug(struct kvm_vcpu *vcpu) {}
|
|
|
|
static inline void kvm_arm_clear_debug(struct kvm_vcpu *vcpu) {}
|
2015-07-07 23:30:00 +07:00
|
|
|
static inline void kvm_arm_reset_debug_ptr(struct kvm_vcpu *vcpu) {}
|
2017-11-16 22:39:19 +07:00
|
|
|
static inline bool kvm_arm_handle_step_debug(struct kvm_vcpu *vcpu,
|
|
|
|
struct kvm_run *run)
|
|
|
|
{
|
|
|
|
return false;
|
|
|
|
}
|
2017-05-02 20:17:59 +07:00
|
|
|
|
|
|
|
int kvm_arm_vcpu_arch_set_attr(struct kvm_vcpu *vcpu,
|
|
|
|
struct kvm_device_attr *attr);
|
|
|
|
int kvm_arm_vcpu_arch_get_attr(struct kvm_vcpu *vcpu,
|
|
|
|
struct kvm_device_attr *attr);
|
|
|
|
int kvm_arm_vcpu_arch_has_attr(struct kvm_vcpu *vcpu,
|
|
|
|
struct kvm_device_attr *attr);
|
2015-07-07 23:29:56 +07:00
|
|
|
|
arm64/sve: KVM: Prevent guests from using SVE
Until KVM has full SVE support, guests must not be allowed to
execute SVE instructions.
This patch enables the necessary traps, and also ensures that the
traps are disabled again on exit from the guest so that the host
can still use SVE if it wants to.
On guest exit, high bits of the SVE Zn registers may have been
clobbered as a side-effect of the execution of FPSIMD instructions in
the guest. The existing KVM host FPSIMD restore code is not
sufficient to restore these bits, so this patch explicitly marks
the CPU as not containing cached vector state for any task, thus
forcing a reload on the next return to userspace. This is an
interim measure, in advance of adding full SVE awareness to KVM.
This marking of cached vector state in the CPU as invalid is done
using __this_cpu_write(fpsimd_last_state, NULL) in fpsimd.c. Due
to the repeated use of this rather obscure operation, it makes
sense to factor it out as a separate helper with a clearer name.
This patch factors it out as fpsimd_flush_cpu_state(), and ports
all callers to use it.
As a side effect of this refactoring, a this_cpu_write() in
fpsimd_cpu_pm_notifier() is changed to __this_cpu_write(). This
should be fine, since cpu_pm_enter() is supposed to be called only
with interrupts disabled.
Signed-off-by: Dave Martin <Dave.Martin@arm.com>
Reviewed-by: Alex Bennée <alex.bennee@linaro.org>
Reviewed-by: Christoffer Dall <christoffer.dall@linaro.org>
Acked-by: Catalin Marinas <catalin.marinas@arm.com>
Acked-by: Marc Zyngier <marc.zyngier@arm.com>
Cc: Ard Biesheuvel <ard.biesheuvel@linaro.org>
Signed-off-by: Will Deacon <will.deacon@arm.com>
2017-10-31 22:51:16 +07:00
|
|
|
/* All host FP/SIMD state is restored on guest exit, so nothing to save: */
|
|
|
|
static inline void kvm_fpsimd_flush_cpu_state(void) {}
|
|
|
|
|
2013-01-21 06:28:06 +07:00
|
|
|
#endif /* __ARM_KVM_HOST_H__ */
|