These are the x86, MIPS and s390 changes; PPC and ARM will come in a

few days.
 
 MIPS and s390 have little going on this release; just bugfixes, some
 small, some larger.
 
 The highlights for x86 are nested VMX improvements (Jan Kiszka), optimizations
 for old processor (up to Nehalem, by me and Bandan Das), and a lot of x86
 emulator bugfixes (Nadav Amit).
 
 Stephen Rothwell reported a trivial conflict with the tracing branch.
 -----BEGIN PGP SIGNATURE-----
 Version: GnuPG v2
 
 iQIcBAABAgAGBQJT300XAAoJEBvWZb6bTYby3V8QAJz+XyajnhJ8wH55Vxczz22L
 i2gtUGmBLhEXsBcaVKO4BBfek88lLzg0SGLjfW5wCMQmKtxVlrwTCXNkBoPGjapd
 NwHtWkMKym44PDhRovn7zkSumkxC43uFIBR/ebrhP6Bvhh9s+MnkQUxfw9ILB+YV
 EeKyEG8sSgxFCciuHbp3mIXpDcO6r/ldy6I7009OdyhLoMY+Kvmk7kRe9wtAivdg
 CGJi60QvGOn2RGRPOCEtF6UWr8Ae8fe1t84o0hkXPv/j3jtabzAatXKJa4dYNbIs
 7Mp4NQpxaGV6rq3WCYVeZRxGs+UReGDAS3Il4Z8C9eTOTooSfxdVr8acpM8PY6I8
 UmLT6ECLGycc4ELXrETtR+QLmiXACyJqyVxz4aiLV3kWSWfamKD3hBeQK9NizNcE
 VoPDl+PyISvR1tW4KstBuzfUWAEXi+gO78cqqFr/VW6cl7HKpA1DFQaPfGkYKDae
 2CPwcLwI5/M6RtSgkyXTkEqNZLc2BjldqSeM1lmWjhZVW56X2iqePUL46Vab3Yvt
 U+sELtwEE560NLN3hbaHUsLR1tcUix5w8vTzcXPxgoHQBszHCcAZTWd1XHulr64F
 rp/cangqtkPKcu5j1mNhQs38oLjHI1MUsbQrqFoD4tmHjQ75iXHRFzYGoIVKXyHG
 AnGbQzJzBcdAANhm3LW0
 =UXxV
 -----END PGP SIGNATURE-----

Merge tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm

Pull KVM changes from Paolo Bonzini:
 "These are the x86, MIPS and s390 changes; PPC and ARM will come in a
  few days.

  MIPS and s390 have little going on this release; just bugfixes, some
  small, some larger.

  The highlights for x86 are nested VMX improvements (Jan Kiszka),
  optimizations for old processor (up to Nehalem, by me and Bandan Das),
  and a lot of x86 emulator bugfixes (Nadav Amit).

  Stephen Rothwell reported a trivial conflict with the tracing branch"

* tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm: (104 commits)
  x86/kvm: Resolve shadow warnings in macro expansion
  KVM: s390: rework broken SIGP STOP interrupt handling
  KVM: x86: always exit on EOIs for interrupts listed in the IOAPIC redir table
  KVM: vmx: remove duplicate vmx_mpx_supported() prototype
  KVM: s390: Fix memory leak on busy SIGP stop
  x86/kvm: Resolve shadow warning from min macro
  kvm: Resolve missing-field-initializers warnings
  Replace NR_VMX_MSR with its definition
  KVM: x86: Assertions to check no overrun in MSR lists
  KVM: x86: set rflags.rf during fault injection
  KVM: x86: Setting rflags.rf during rep-string emulation
  KVM: x86: DR6/7.RTM cannot be written
  KVM: nVMX: clean up nested_release_vmcs12 and code around it
  KVM: nVMX: fix lifetime issues for vmcs02
  KVM: x86: Defining missing x86 vectors
  KVM: x86: emulator injects #DB when RFLAGS.RF is set
  KVM: x86: Cleanup of rflags.rf cleaning
  KVM: x86: Clear rflags.rf on emulated instructions
  KVM: x86: popf emulation should not change RF
  KVM: x86: Clearing rflags.rf upon skipped emulated instruction
  ...
This commit is contained in:
Linus Torvalds 2014-08-04 12:16:46 -07:00
commit 8533ce7271
47 changed files with 1792 additions and 1446 deletions

View File

@ -297,6 +297,15 @@ struct kvm_regs {
__u64 rip, rflags;
};
/* mips */
struct kvm_regs {
/* out (KVM_GET_REGS) / in (KVM_SET_REGS) */
__u64 gpr[32];
__u64 hi;
__u64 lo;
__u64 pc;
};
4.12 KVM_SET_REGS
@ -378,7 +387,7 @@ struct kvm_translation {
4.16 KVM_INTERRUPT
Capability: basic
Architectures: x86, ppc
Architectures: x86, ppc, mips
Type: vcpu ioctl
Parameters: struct kvm_interrupt (in)
Returns: 0 on success, -1 on error
@ -423,6 +432,11 @@ c) KVM_INTERRUPT_SET_LEVEL
Note that any value for 'irq' other than the ones stated above is invalid
and incurs unexpected behavior.
MIPS:
Queues an external interrupt to be injected into the virtual CPU. A negative
interrupt number dequeues the interrupt.
4.17 KVM_DEBUG_GUEST
@ -512,7 +526,7 @@ struct kvm_cpuid {
4.21 KVM_SET_SIGNAL_MASK
Capability: basic
Architectures: x86
Architectures: all
Type: vcpu ioctl
Parameters: struct kvm_signal_mask (in)
Returns: 0 on success, -1 on error
@ -974,7 +988,7 @@ for vm-wide capabilities.
4.38 KVM_GET_MP_STATE
Capability: KVM_CAP_MP_STATE
Architectures: x86, ia64
Architectures: x86, ia64, s390
Type: vcpu ioctl
Parameters: struct kvm_mp_state (out)
Returns: 0 on success; -1 on error
@ -988,24 +1002,32 @@ uniprocessor guests).
Possible values are:
- KVM_MP_STATE_RUNNABLE: the vcpu is currently running
- KVM_MP_STATE_RUNNABLE: the vcpu is currently running [x86, ia64]
- KVM_MP_STATE_UNINITIALIZED: the vcpu is an application processor (AP)
which has not yet received an INIT signal
which has not yet received an INIT signal [x86,
ia64]
- KVM_MP_STATE_INIT_RECEIVED: the vcpu has received an INIT signal, and is
now ready for a SIPI
now ready for a SIPI [x86, ia64]
- KVM_MP_STATE_HALTED: the vcpu has executed a HLT instruction and
is waiting for an interrupt
is waiting for an interrupt [x86, ia64]
- KVM_MP_STATE_SIPI_RECEIVED: the vcpu has just received a SIPI (vector
accessible via KVM_GET_VCPU_EVENTS)
accessible via KVM_GET_VCPU_EVENTS) [x86, ia64]
- KVM_MP_STATE_STOPPED: the vcpu is stopped [s390]
- KVM_MP_STATE_CHECK_STOP: the vcpu is in a special error state [s390]
- KVM_MP_STATE_OPERATING: the vcpu is operating (running or halted)
[s390]
- KVM_MP_STATE_LOAD: the vcpu is in a special load/startup state
[s390]
This ioctl is only useful after KVM_CREATE_IRQCHIP. Without an in-kernel
irqchip, the multiprocessing state must be maintained by userspace.
On x86 and ia64, this ioctl is only useful after KVM_CREATE_IRQCHIP. Without an
in-kernel irqchip, the multiprocessing state must be maintained by userspace on
these architectures.
4.39 KVM_SET_MP_STATE
Capability: KVM_CAP_MP_STATE
Architectures: x86, ia64
Architectures: x86, ia64, s390
Type: vcpu ioctl
Parameters: struct kvm_mp_state (in)
Returns: 0 on success; -1 on error
@ -1013,8 +1035,9 @@ Returns: 0 on success; -1 on error
Sets the vcpu's current "multiprocessing state"; see KVM_GET_MP_STATE for
arguments.
This ioctl is only useful after KVM_CREATE_IRQCHIP. Without an in-kernel
irqchip, the multiprocessing state must be maintained by userspace.
On x86 and ia64, this ioctl is only useful after KVM_CREATE_IRQCHIP. Without an
in-kernel irqchip, the multiprocessing state must be maintained by userspace on
these architectures.
4.40 KVM_SET_IDENTITY_MAP_ADDR
@ -1774,122 +1797,151 @@ and architecture specific registers. Each have their own range of operation
and their own constants and width. To keep track of the implemented
registers, find a list below:
Arch | Register | Width (bits)
| |
PPC | KVM_REG_PPC_HIOR | 64
PPC | KVM_REG_PPC_IAC1 | 64
PPC | KVM_REG_PPC_IAC2 | 64
PPC | KVM_REG_PPC_IAC3 | 64
PPC | KVM_REG_PPC_IAC4 | 64
PPC | KVM_REG_PPC_DAC1 | 64
PPC | KVM_REG_PPC_DAC2 | 64
PPC | KVM_REG_PPC_DABR | 64
PPC | KVM_REG_PPC_DSCR | 64
PPC | KVM_REG_PPC_PURR | 64
PPC | KVM_REG_PPC_SPURR | 64
PPC | KVM_REG_PPC_DAR | 64
PPC | KVM_REG_PPC_DSISR | 32
PPC | KVM_REG_PPC_AMR | 64
PPC | KVM_REG_PPC_UAMOR | 64
PPC | KVM_REG_PPC_MMCR0 | 64
PPC | KVM_REG_PPC_MMCR1 | 64
PPC | KVM_REG_PPC_MMCRA | 64
PPC | KVM_REG_PPC_MMCR2 | 64
PPC | KVM_REG_PPC_MMCRS | 64
PPC | KVM_REG_PPC_SIAR | 64
PPC | KVM_REG_PPC_SDAR | 64
PPC | KVM_REG_PPC_SIER | 64
PPC | KVM_REG_PPC_PMC1 | 32
PPC | KVM_REG_PPC_PMC2 | 32
PPC | KVM_REG_PPC_PMC3 | 32
PPC | KVM_REG_PPC_PMC4 | 32
PPC | KVM_REG_PPC_PMC5 | 32
PPC | KVM_REG_PPC_PMC6 | 32
PPC | KVM_REG_PPC_PMC7 | 32
PPC | KVM_REG_PPC_PMC8 | 32
PPC | KVM_REG_PPC_FPR0 | 64
Arch | Register | Width (bits)
| |
PPC | KVM_REG_PPC_HIOR | 64
PPC | KVM_REG_PPC_IAC1 | 64
PPC | KVM_REG_PPC_IAC2 | 64
PPC | KVM_REG_PPC_IAC3 | 64
PPC | KVM_REG_PPC_IAC4 | 64
PPC | KVM_REG_PPC_DAC1 | 64
PPC | KVM_REG_PPC_DAC2 | 64
PPC | KVM_REG_PPC_DABR | 64
PPC | KVM_REG_PPC_DSCR | 64
PPC | KVM_REG_PPC_PURR | 64
PPC | KVM_REG_PPC_SPURR | 64
PPC | KVM_REG_PPC_DAR | 64
PPC | KVM_REG_PPC_DSISR | 32
PPC | KVM_REG_PPC_AMR | 64
PPC | KVM_REG_PPC_UAMOR | 64
PPC | KVM_REG_PPC_MMCR0 | 64
PPC | KVM_REG_PPC_MMCR1 | 64
PPC | KVM_REG_PPC_MMCRA | 64
PPC | KVM_REG_PPC_MMCR2 | 64
PPC | KVM_REG_PPC_MMCRS | 64
PPC | KVM_REG_PPC_SIAR | 64
PPC | KVM_REG_PPC_SDAR | 64
PPC | KVM_REG_PPC_SIER | 64
PPC | KVM_REG_PPC_PMC1 | 32
PPC | KVM_REG_PPC_PMC2 | 32
PPC | KVM_REG_PPC_PMC3 | 32
PPC | KVM_REG_PPC_PMC4 | 32
PPC | KVM_REG_PPC_PMC5 | 32
PPC | KVM_REG_PPC_PMC6 | 32
PPC | KVM_REG_PPC_PMC7 | 32
PPC | KVM_REG_PPC_PMC8 | 32
PPC | KVM_REG_PPC_FPR0 | 64
...
PPC | KVM_REG_PPC_FPR31 | 64
PPC | KVM_REG_PPC_VR0 | 128
PPC | KVM_REG_PPC_FPR31 | 64
PPC | KVM_REG_PPC_VR0 | 128
...
PPC | KVM_REG_PPC_VR31 | 128
PPC | KVM_REG_PPC_VSR0 | 128
PPC | KVM_REG_PPC_VR31 | 128
PPC | KVM_REG_PPC_VSR0 | 128
...
PPC | KVM_REG_PPC_VSR31 | 128
PPC | KVM_REG_PPC_FPSCR | 64
PPC | KVM_REG_PPC_VSCR | 32
PPC | KVM_REG_PPC_VPA_ADDR | 64
PPC | KVM_REG_PPC_VPA_SLB | 128
PPC | KVM_REG_PPC_VPA_DTL | 128
PPC | KVM_REG_PPC_EPCR | 32
PPC | KVM_REG_PPC_EPR | 32
PPC | KVM_REG_PPC_TCR | 32
PPC | KVM_REG_PPC_TSR | 32
PPC | KVM_REG_PPC_OR_TSR | 32
PPC | KVM_REG_PPC_CLEAR_TSR | 32
PPC | KVM_REG_PPC_MAS0 | 32
PPC | KVM_REG_PPC_MAS1 | 32
PPC | KVM_REG_PPC_MAS2 | 64
PPC | KVM_REG_PPC_MAS7_3 | 64
PPC | KVM_REG_PPC_MAS4 | 32
PPC | KVM_REG_PPC_MAS6 | 32
PPC | KVM_REG_PPC_MMUCFG | 32
PPC | KVM_REG_PPC_TLB0CFG | 32
PPC | KVM_REG_PPC_TLB1CFG | 32
PPC | KVM_REG_PPC_TLB2CFG | 32
PPC | KVM_REG_PPC_TLB3CFG | 32
PPC | KVM_REG_PPC_TLB0PS | 32
PPC | KVM_REG_PPC_TLB1PS | 32
PPC | KVM_REG_PPC_TLB2PS | 32
PPC | KVM_REG_PPC_TLB3PS | 32
PPC | KVM_REG_PPC_EPTCFG | 32
PPC | KVM_REG_PPC_ICP_STATE | 64
PPC | KVM_REG_PPC_TB_OFFSET | 64
PPC | KVM_REG_PPC_SPMC1 | 32
PPC | KVM_REG_PPC_SPMC2 | 32
PPC | KVM_REG_PPC_IAMR | 64
PPC | KVM_REG_PPC_TFHAR | 64
PPC | KVM_REG_PPC_TFIAR | 64
PPC | KVM_REG_PPC_TEXASR | 64
PPC | KVM_REG_PPC_FSCR | 64
PPC | KVM_REG_PPC_PSPB | 32
PPC | KVM_REG_PPC_EBBHR | 64
PPC | KVM_REG_PPC_EBBRR | 64
PPC | KVM_REG_PPC_BESCR | 64
PPC | KVM_REG_PPC_TAR | 64
PPC | KVM_REG_PPC_DPDES | 64
PPC | KVM_REG_PPC_DAWR | 64
PPC | KVM_REG_PPC_DAWRX | 64
PPC | KVM_REG_PPC_CIABR | 64
PPC | KVM_REG_PPC_IC | 64
PPC | KVM_REG_PPC_VTB | 64
PPC | KVM_REG_PPC_CSIGR | 64
PPC | KVM_REG_PPC_TACR | 64
PPC | KVM_REG_PPC_TCSCR | 64
PPC | KVM_REG_PPC_PID | 64
PPC | KVM_REG_PPC_ACOP | 64
PPC | KVM_REG_PPC_VRSAVE | 32
PPC | KVM_REG_PPC_LPCR | 64
PPC | KVM_REG_PPC_PPR | 64
PPC | KVM_REG_PPC_ARCH_COMPAT 32
PPC | KVM_REG_PPC_DABRX | 32
PPC | KVM_REG_PPC_WORT | 64
PPC | KVM_REG_PPC_TM_GPR0 | 64
PPC | KVM_REG_PPC_VSR31 | 128
PPC | KVM_REG_PPC_FPSCR | 64
PPC | KVM_REG_PPC_VSCR | 32
PPC | KVM_REG_PPC_VPA_ADDR | 64
PPC | KVM_REG_PPC_VPA_SLB | 128
PPC | KVM_REG_PPC_VPA_DTL | 128
PPC | KVM_REG_PPC_EPCR | 32
PPC | KVM_REG_PPC_EPR | 32
PPC | KVM_REG_PPC_TCR | 32
PPC | KVM_REG_PPC_TSR | 32
PPC | KVM_REG_PPC_OR_TSR | 32
PPC | KVM_REG_PPC_CLEAR_TSR | 32
PPC | KVM_REG_PPC_MAS0 | 32
PPC | KVM_REG_PPC_MAS1 | 32
PPC | KVM_REG_PPC_MAS2 | 64
PPC | KVM_REG_PPC_MAS7_3 | 64
PPC | KVM_REG_PPC_MAS4 | 32
PPC | KVM_REG_PPC_MAS6 | 32
PPC | KVM_REG_PPC_MMUCFG | 32
PPC | KVM_REG_PPC_TLB0CFG | 32
PPC | KVM_REG_PPC_TLB1CFG | 32
PPC | KVM_REG_PPC_TLB2CFG | 32
PPC | KVM_REG_PPC_TLB3CFG | 32
PPC | KVM_REG_PPC_TLB0PS | 32
PPC | KVM_REG_PPC_TLB1PS | 32
PPC | KVM_REG_PPC_TLB2PS | 32
PPC | KVM_REG_PPC_TLB3PS | 32
PPC | KVM_REG_PPC_EPTCFG | 32
PPC | KVM_REG_PPC_ICP_STATE | 64
PPC | KVM_REG_PPC_TB_OFFSET | 64
PPC | KVM_REG_PPC_SPMC1 | 32
PPC | KVM_REG_PPC_SPMC2 | 32
PPC | KVM_REG_PPC_IAMR | 64
PPC | KVM_REG_PPC_TFHAR | 64
PPC | KVM_REG_PPC_TFIAR | 64
PPC | KVM_REG_PPC_TEXASR | 64
PPC | KVM_REG_PPC_FSCR | 64
PPC | KVM_REG_PPC_PSPB | 32
PPC | KVM_REG_PPC_EBBHR | 64
PPC | KVM_REG_PPC_EBBRR | 64
PPC | KVM_REG_PPC_BESCR | 64
PPC | KVM_REG_PPC_TAR | 64
PPC | KVM_REG_PPC_DPDES | 64
PPC | KVM_REG_PPC_DAWR | 64
PPC | KVM_REG_PPC_DAWRX | 64
PPC | KVM_REG_PPC_CIABR | 64
PPC | KVM_REG_PPC_IC | 64
PPC | KVM_REG_PPC_VTB | 64
PPC | KVM_REG_PPC_CSIGR | 64
PPC | KVM_REG_PPC_TACR | 64
PPC | KVM_REG_PPC_TCSCR | 64
PPC | KVM_REG_PPC_PID | 64
PPC | KVM_REG_PPC_ACOP | 64
PPC | KVM_REG_PPC_VRSAVE | 32
PPC | KVM_REG_PPC_LPCR | 64
PPC | KVM_REG_PPC_PPR | 64
PPC | KVM_REG_PPC_ARCH_COMPAT | 32
PPC | KVM_REG_PPC_DABRX | 32
PPC | KVM_REG_PPC_WORT | 64
PPC | KVM_REG_PPC_TM_GPR0 | 64
...
PPC | KVM_REG_PPC_TM_GPR31 | 64
PPC | KVM_REG_PPC_TM_VSR0 | 128
PPC | KVM_REG_PPC_TM_GPR31 | 64
PPC | KVM_REG_PPC_TM_VSR0 | 128
...
PPC | KVM_REG_PPC_TM_VSR63 | 128
PPC | KVM_REG_PPC_TM_CR | 64
PPC | KVM_REG_PPC_TM_LR | 64
PPC | KVM_REG_PPC_TM_CTR | 64
PPC | KVM_REG_PPC_TM_FPSCR | 64
PPC | KVM_REG_PPC_TM_AMR | 64
PPC | KVM_REG_PPC_TM_PPR | 64
PPC | KVM_REG_PPC_TM_VRSAVE | 64
PPC | KVM_REG_PPC_TM_VSCR | 32
PPC | KVM_REG_PPC_TM_DSCR | 64
PPC | KVM_REG_PPC_TM_TAR | 64
PPC | KVM_REG_PPC_TM_VSR63 | 128
PPC | KVM_REG_PPC_TM_CR | 64
PPC | KVM_REG_PPC_TM_LR | 64
PPC | KVM_REG_PPC_TM_CTR | 64
PPC | KVM_REG_PPC_TM_FPSCR | 64
PPC | KVM_REG_PPC_TM_AMR | 64
PPC | KVM_REG_PPC_TM_PPR | 64
PPC | KVM_REG_PPC_TM_VRSAVE | 64
PPC | KVM_REG_PPC_TM_VSCR | 32
PPC | KVM_REG_PPC_TM_DSCR | 64
PPC | KVM_REG_PPC_TM_TAR | 64
| |
MIPS | KVM_REG_MIPS_R0 | 64
...
MIPS | KVM_REG_MIPS_R31 | 64
MIPS | KVM_REG_MIPS_HI | 64
MIPS | KVM_REG_MIPS_LO | 64
MIPS | KVM_REG_MIPS_PC | 64
MIPS | KVM_REG_MIPS_CP0_INDEX | 32
MIPS | KVM_REG_MIPS_CP0_CONTEXT | 64
MIPS | KVM_REG_MIPS_CP0_USERLOCAL | 64
MIPS | KVM_REG_MIPS_CP0_PAGEMASK | 32
MIPS | KVM_REG_MIPS_CP0_WIRED | 32
MIPS | KVM_REG_MIPS_CP0_HWRENA | 32
MIPS | KVM_REG_MIPS_CP0_BADVADDR | 64
MIPS | KVM_REG_MIPS_CP0_COUNT | 32
MIPS | KVM_REG_MIPS_CP0_ENTRYHI | 64
MIPS | KVM_REG_MIPS_CP0_COMPARE | 32
MIPS | KVM_REG_MIPS_CP0_STATUS | 32
MIPS | KVM_REG_MIPS_CP0_CAUSE | 32
MIPS | KVM_REG_MIPS_CP0_EPC | 64
MIPS | KVM_REG_MIPS_CP0_CONFIG | 32
MIPS | KVM_REG_MIPS_CP0_CONFIG1 | 32
MIPS | KVM_REG_MIPS_CP0_CONFIG2 | 32
MIPS | KVM_REG_MIPS_CP0_CONFIG3 | 32
MIPS | KVM_REG_MIPS_CP0_CONFIG7 | 32
MIPS | KVM_REG_MIPS_CP0_ERROREPC | 64
MIPS | KVM_REG_MIPS_COUNT_CTL | 64
MIPS | KVM_REG_MIPS_COUNT_RESUME | 64
MIPS | KVM_REG_MIPS_COUNT_HZ | 64
ARM registers are mapped using the lower 32 bits. The upper 16 of that
is the register group type, or coprocessor number:
@ -1928,6 +1980,22 @@ arm64 CCSIDR registers are demultiplexed by CSSELR value:
arm64 system registers have the following id bit patterns:
0x6030 0000 0013 <op0:2> <op1:3> <crn:4> <crm:4> <op2:3>
MIPS registers are mapped using the lower 32 bits. The upper 16 of that is
the register group type:
MIPS core registers (see above) have the following id bit patterns:
0x7030 0000 0000 <reg:16>
MIPS CP0 registers (see KVM_REG_MIPS_CP0_* above) have the following id bit
patterns depending on whether they're 32-bit or 64-bit registers:
0x7020 0000 0001 00 <reg:5> <sel:3> (32-bit)
0x7030 0000 0001 00 <reg:5> <sel:3> (64-bit)
MIPS KVM control registers (see above) have the following id bit patterns:
0x7030 0000 0002 <reg:16>
4.69 KVM_GET_ONE_REG
Capability: KVM_CAP_ONE_REG
@ -2415,7 +2483,7 @@ in VCPU matching underlying host.
4.84 KVM_GET_REG_LIST
Capability: basic
Architectures: arm, arm64
Architectures: arm, arm64, mips
Type: vcpu ioctl
Parameters: struct kvm_reg_list (in/out)
Returns: 0 on success; -1 on error
@ -2866,15 +2934,18 @@ The fields in each entry are defined as follows:
6. Capabilities that can be enabled
-----------------------------------
There are certain capabilities that change the behavior of the virtual CPU when
enabled. To enable them, please see section 4.37. Below you can find a list of
capabilities and what their effect on the vCPU is when enabling them.
There are certain capabilities that change the behavior of the virtual CPU or
the virtual machine when enabled. To enable them, please see section 4.37.
Below you can find a list of capabilities and what their effect on the vCPU or
the virtual machine is when enabling them.
The following information is provided along with the description:
Architectures: which instruction set architectures provide this ioctl.
x86 includes both i386 and x86_64.
Target: whether this is a per-vcpu or per-vm capability.
Parameters: what parameters are accepted by the capability.
Returns: the return value. General error numbers (EBADF, ENOMEM, EINVAL)
@ -2884,6 +2955,7 @@ The following information is provided along with the description:
6.1 KVM_CAP_PPC_OSI
Architectures: ppc
Target: vcpu
Parameters: none
Returns: 0 on success; -1 on error
@ -2898,6 +2970,7 @@ When this capability is enabled, KVM_EXIT_OSI can occur.
6.2 KVM_CAP_PPC_PAPR
Architectures: ppc
Target: vcpu
Parameters: none
Returns: 0 on success; -1 on error
@ -2917,6 +2990,7 @@ When this capability is enabled, KVM_EXIT_PAPR_HCALL can occur.
6.3 KVM_CAP_SW_TLB
Architectures: ppc
Target: vcpu
Parameters: args[0] is the address of a struct kvm_config_tlb
Returns: 0 on success; -1 on error
@ -2959,6 +3033,7 @@ For mmu types KVM_MMU_FSL_BOOKE_NOHV and KVM_MMU_FSL_BOOKE_HV:
6.4 KVM_CAP_S390_CSS_SUPPORT
Architectures: s390
Target: vcpu
Parameters: none
Returns: 0 on success; -1 on error
@ -2970,9 +3045,13 @@ handled in-kernel, while the other I/O instructions are passed to userspace.
When this capability is enabled, KVM_EXIT_S390_TSCH will occur on TEST
SUBCHANNEL intercepts.
Note that even though this capability is enabled per-vcpu, the complete
virtual machine is affected.
6.5 KVM_CAP_PPC_EPR
Architectures: ppc
Target: vcpu
Parameters: args[0] defines whether the proxy facility is active
Returns: 0 on success; -1 on error
@ -2998,7 +3077,17 @@ This capability connects the vcpu to an in-kernel MPIC device.
6.7 KVM_CAP_IRQ_XICS
Architectures: ppc
Target: vcpu
Parameters: args[0] is the XICS device fd
args[1] is the XICS CPU number (server ID) for this vcpu
This capability connects the vcpu to an in-kernel XICS device.
6.8 KVM_CAP_S390_IRQCHIP
Architectures: s390
Target: vm
Parameters: none
This capability enables the in-kernel irqchip for s390. Please refer to
"4.24 KVM_CREATE_IRQCHIP" for details.

View File

@ -359,13 +359,17 @@ enum emulation_result {
#define MIPS3_PG_FRAME 0x3fffffc0
#define VPN2_MASK 0xffffe000
#define TLB_IS_GLOBAL(x) (((x).tlb_lo0 & MIPS3_PG_G) && \
#define TLB_IS_GLOBAL(x) (((x).tlb_lo0 & MIPS3_PG_G) && \
((x).tlb_lo1 & MIPS3_PG_G))
#define TLB_VPN2(x) ((x).tlb_hi & VPN2_MASK)
#define TLB_ASID(x) ((x).tlb_hi & ASID_MASK)
#define TLB_IS_VALID(x, va) (((va) & (1 << PAGE_SHIFT)) \
? ((x).tlb_lo1 & MIPS3_PG_V) \
#define TLB_IS_VALID(x, va) (((va) & (1 << PAGE_SHIFT)) \
? ((x).tlb_lo1 & MIPS3_PG_V) \
: ((x).tlb_lo0 & MIPS3_PG_V))
#define TLB_HI_VPN2_HIT(x, y) ((TLB_VPN2(x) & ~(x).tlb_mask) == \
((y) & VPN2_MASK & ~(x).tlb_mask))
#define TLB_HI_ASID_HIT(x, y) (TLB_IS_GLOBAL(x) || \
TLB_ASID(x) == ((y) & ASID_MASK))
struct kvm_mips_tlb {
long tlb_mask;
@ -760,7 +764,7 @@ extern int kvm_mips_trans_mtc0(uint32_t inst, uint32_t *opc,
struct kvm_vcpu *vcpu);
/* Misc */
extern int kvm_mips_dump_stats(struct kvm_vcpu *vcpu);
extern void kvm_mips_dump_stats(struct kvm_vcpu *vcpu);
extern unsigned long kvm_mips_get_ramsize(struct kvm *kvm);

View File

@ -19,6 +19,9 @@
#include <asm/mipsmtregs.h>
#include <asm/uaccess.h> /* for segment_eq() */
extern void (*r4k_blast_dcache)(void);
extern void (*r4k_blast_icache)(void);
/*
* This macro return a properly sign-extended address suitable as base address
* for indexed cache operations. Two issues here:

View File

@ -5,9 +5,9 @@ common-objs = $(addprefix ../../../virt/kvm/, kvm_main.o coalesced_mmio.o)
EXTRA_CFLAGS += -Ivirt/kvm -Iarch/mips/kvm
kvm-objs := $(common-objs) kvm_mips.o kvm_mips_emul.o kvm_locore.o \
kvm_mips_int.o kvm_mips_stats.o kvm_mips_commpage.o \
kvm_mips_dyntrans.o kvm_trap_emul.o
kvm-objs := $(common-objs) mips.o emulate.o locore.o \
interrupt.o stats.o commpage.o \
dyntrans.o trap_emul.o
obj-$(CONFIG_KVM) += kvm.o
obj-y += kvm_cb.o kvm_tlb.o
obj-y += callback.o tlb.o

33
arch/mips/kvm/commpage.c Normal file
View File

@ -0,0 +1,33 @@
/*
* This file is subject to the terms and conditions of the GNU General Public
* License. See the file "COPYING" in the main directory of this archive
* for more details.
*
* commpage, currently used for Virtual COP0 registers.
* Mapped into the guest kernel @ 0x0.
*
* Copyright (C) 2012 MIPS Technologies, Inc. All rights reserved.
* Authors: Sanjay Lal <sanjayl@kymasys.com>
*/
#include <linux/errno.h>
#include <linux/err.h>
#include <linux/module.h>
#include <linux/vmalloc.h>
#include <linux/fs.h>
#include <linux/bootmem.h>
#include <asm/page.h>
#include <asm/cacheflush.h>
#include <asm/mmu_context.h>
#include <linux/kvm_host.h>
#include "commpage.h"
void kvm_mips_commpage_init(struct kvm_vcpu *vcpu)
{
struct kvm_mips_commpage *page = vcpu->arch.kseg0_commpage;
/* Specific init values for fields */
vcpu->arch.cop0 = &page->cop0;
}

24
arch/mips/kvm/commpage.h Normal file
View File

@ -0,0 +1,24 @@
/*
* This file is subject to the terms and conditions of the GNU General Public
* License. See the file "COPYING" in the main directory of this archive
* for more details.
*
* KVM/MIPS: commpage: mapped into get kernel space
*
* Copyright (C) 2012 MIPS Technologies, Inc. All rights reserved.
* Authors: Sanjay Lal <sanjayl@kymasys.com>
*/
#ifndef __KVM_MIPS_COMMPAGE_H__
#define __KVM_MIPS_COMMPAGE_H__
struct kvm_mips_commpage {
/* COP0 state is mapped into Guest kernel via commpage */
struct mips_coproc cop0;
};
#define KVM_MIPS_COMM_EIDI_OFFSET 0x0
extern void kvm_mips_commpage_init(struct kvm_vcpu *vcpu);
#endif /* __KVM_MIPS_COMMPAGE_H__ */

View File

@ -1,13 +1,13 @@
/*
* This file is subject to the terms and conditions of the GNU General Public
* License. See the file "COPYING" in the main directory of this archive
* for more details.
*
* KVM/MIPS: Binary Patching for privileged instructions, reduces traps.
*
* Copyright (C) 2012 MIPS Technologies, Inc. All rights reserved.
* Authors: Sanjay Lal <sanjayl@kymasys.com>
*/
* This file is subject to the terms and conditions of the GNU General Public
* License. See the file "COPYING" in the main directory of this archive
* for more details.
*
* KVM/MIPS: Binary Patching for privileged instructions, reduces traps.
*
* Copyright (C) 2012 MIPS Technologies, Inc. All rights reserved.
* Authors: Sanjay Lal <sanjayl@kymasys.com>
*/
#include <linux/errno.h>
#include <linux/err.h>
@ -18,7 +18,7 @@
#include <linux/bootmem.h>
#include <asm/cacheflush.h>
#include "kvm_mips_comm.h"
#include "commpage.h"
#define SYNCI_TEMPLATE 0x041f0000
#define SYNCI_BASE(x) (((x) >> 21) & 0x1f)
@ -28,9 +28,8 @@
#define CLEAR_TEMPLATE 0x00000020
#define SW_TEMPLATE 0xac000000
int
kvm_mips_trans_cache_index(uint32_t inst, uint32_t *opc,
struct kvm_vcpu *vcpu)
int kvm_mips_trans_cache_index(uint32_t inst, uint32_t *opc,
struct kvm_vcpu *vcpu)
{
int result = 0;
unsigned long kseg0_opc;
@ -47,12 +46,11 @@ kvm_mips_trans_cache_index(uint32_t inst, uint32_t *opc,
}
/*
* Address based CACHE instructions are transformed into synci(s). A little heavy
* for just D-cache invalidates, but avoids an expensive trap
* Address based CACHE instructions are transformed into synci(s). A little
* heavy for just D-cache invalidates, but avoids an expensive trap
*/
int
kvm_mips_trans_cache_va(uint32_t inst, uint32_t *opc,
struct kvm_vcpu *vcpu)
int kvm_mips_trans_cache_va(uint32_t inst, uint32_t *opc,
struct kvm_vcpu *vcpu)
{
int result = 0;
unsigned long kseg0_opc;
@ -72,8 +70,7 @@ kvm_mips_trans_cache_va(uint32_t inst, uint32_t *opc,
return result;
}
int
kvm_mips_trans_mfc0(uint32_t inst, uint32_t *opc, struct kvm_vcpu *vcpu)
int kvm_mips_trans_mfc0(uint32_t inst, uint32_t *opc, struct kvm_vcpu *vcpu)
{
int32_t rt, rd, sel;
uint32_t mfc0_inst;
@ -115,8 +112,7 @@ kvm_mips_trans_mfc0(uint32_t inst, uint32_t *opc, struct kvm_vcpu *vcpu)
return 0;
}
int
kvm_mips_trans_mtc0(uint32_t inst, uint32_t *opc, struct kvm_vcpu *vcpu)
int kvm_mips_trans_mtc0(uint32_t inst, uint32_t *opc, struct kvm_vcpu *vcpu)
{
int32_t rt, rd, sel;
uint32_t mtc0_inst = SW_TEMPLATE;

File diff suppressed because it is too large Load Diff

View File

@ -1,13 +1,13 @@
/*
* This file is subject to the terms and conditions of the GNU General Public
* License. See the file "COPYING" in the main directory of this archive
* for more details.
*
* KVM/MIPS: Interrupt delivery
*
* Copyright (C) 2012 MIPS Technologies, Inc. All rights reserved.
* Authors: Sanjay Lal <sanjayl@kymasys.com>
*/
* This file is subject to the terms and conditions of the GNU General Public
* License. See the file "COPYING" in the main directory of this archive
* for more details.
*
* KVM/MIPS: Interrupt delivery
*
* Copyright (C) 2012 MIPS Technologies, Inc. All rights reserved.
* Authors: Sanjay Lal <sanjayl@kymasys.com>
*/
#include <linux/errno.h>
#include <linux/err.h>
@ -20,7 +20,7 @@
#include <linux/kvm_host.h>
#include "kvm_mips_int.h"
#include "interrupt.h"
void kvm_mips_queue_irq(struct kvm_vcpu *vcpu, uint32_t priority)
{
@ -34,7 +34,8 @@ void kvm_mips_dequeue_irq(struct kvm_vcpu *vcpu, uint32_t priority)
void kvm_mips_queue_timer_int_cb(struct kvm_vcpu *vcpu)
{
/* Cause bits to reflect the pending timer interrupt,
/*
* Cause bits to reflect the pending timer interrupt,
* the EXC code will be set when we are actually
* delivering the interrupt:
*/
@ -51,12 +52,13 @@ void kvm_mips_dequeue_timer_int_cb(struct kvm_vcpu *vcpu)
kvm_mips_dequeue_irq(vcpu, MIPS_EXC_INT_TIMER);
}
void
kvm_mips_queue_io_int_cb(struct kvm_vcpu *vcpu, struct kvm_mips_interrupt *irq)
void kvm_mips_queue_io_int_cb(struct kvm_vcpu *vcpu,
struct kvm_mips_interrupt *irq)
{
int intr = (int)irq->irq;
/* Cause bits to reflect the pending IO interrupt,
/*
* Cause bits to reflect the pending IO interrupt,
* the EXC code will be set when we are actually
* delivering the interrupt:
*/
@ -83,11 +85,11 @@ kvm_mips_queue_io_int_cb(struct kvm_vcpu *vcpu, struct kvm_mips_interrupt *irq)
}
void
kvm_mips_dequeue_io_int_cb(struct kvm_vcpu *vcpu,
struct kvm_mips_interrupt *irq)
void kvm_mips_dequeue_io_int_cb(struct kvm_vcpu *vcpu,
struct kvm_mips_interrupt *irq)
{
int intr = (int)irq->irq;
switch (intr) {
case -2:
kvm_clear_c0_guest_cause(vcpu->arch.cop0, (C_IRQ0));
@ -111,9 +113,8 @@ kvm_mips_dequeue_io_int_cb(struct kvm_vcpu *vcpu,
}
/* Deliver the interrupt of the corresponding priority, if possible. */
int
kvm_mips_irq_deliver_cb(struct kvm_vcpu *vcpu, unsigned int priority,
uint32_t cause)
int kvm_mips_irq_deliver_cb(struct kvm_vcpu *vcpu, unsigned int priority,
uint32_t cause)
{
int allowed = 0;
uint32_t exccode;
@ -164,7 +165,6 @@ kvm_mips_irq_deliver_cb(struct kvm_vcpu *vcpu, unsigned int priority,
/* Are we allowed to deliver the interrupt ??? */
if (allowed) {
if ((kvm_read_c0_guest_status(cop0) & ST0_EXL) == 0) {
/* save old pc */
kvm_write_c0_guest_epc(cop0, arch->pc);
@ -195,9 +195,8 @@ kvm_mips_irq_deliver_cb(struct kvm_vcpu *vcpu, unsigned int priority,
return allowed;
}
int
kvm_mips_irq_clear_cb(struct kvm_vcpu *vcpu, unsigned int priority,
uint32_t cause)
int kvm_mips_irq_clear_cb(struct kvm_vcpu *vcpu, unsigned int priority,
uint32_t cause)
{
return 1;
}

View File

@ -1,14 +1,15 @@
/*
* This file is subject to the terms and conditions of the GNU General Public
* License. See the file "COPYING" in the main directory of this archive
* for more details.
*
* KVM/MIPS: Interrupts
* Copyright (C) 2012 MIPS Technologies, Inc. All rights reserved.
* Authors: Sanjay Lal <sanjayl@kymasys.com>
*/
* This file is subject to the terms and conditions of the GNU General Public
* License. See the file "COPYING" in the main directory of this archive
* for more details.
*
* KVM/MIPS: Interrupts
* Copyright (C) 2012 MIPS Technologies, Inc. All rights reserved.
* Authors: Sanjay Lal <sanjayl@kymasys.com>
*/
/* MIPS Exception Priorities, exceptions (including interrupts) are queued up
/*
* MIPS Exception Priorities, exceptions (including interrupts) are queued up
* for the guest in the order specified by their priorities
*/
@ -27,6 +28,9 @@
#define MIPS_EXC_MAX 12
/* XXXSL More to follow */
extern char mips32_exception[], mips32_exceptionEnd[];
extern char mips32_GuestException[], mips32_GuestExceptionEnd[];
#define C_TI (_ULCAST_(1) << 30)
#define KVM_MIPS_IRQ_DELIVER_ALL_AT_ONCE (0)

View File

@ -1,23 +0,0 @@
/*
* This file is subject to the terms and conditions of the GNU General Public
* License. See the file "COPYING" in the main directory of this archive
* for more details.
*
* KVM/MIPS: commpage: mapped into get kernel space
*
* Copyright (C) 2012 MIPS Technologies, Inc. All rights reserved.
* Authors: Sanjay Lal <sanjayl@kymasys.com>
*/
#ifndef __KVM_MIPS_COMMPAGE_H__
#define __KVM_MIPS_COMMPAGE_H__
struct kvm_mips_commpage {
struct mips_coproc cop0; /* COP0 state is mapped into Guest kernel via commpage */
};
#define KVM_MIPS_COMM_EIDI_OFFSET 0x0
extern void kvm_mips_commpage_init(struct kvm_vcpu *vcpu);
#endif /* __KVM_MIPS_COMMPAGE_H__ */

View File

@ -1,37 +0,0 @@
/*
* This file is subject to the terms and conditions of the GNU General Public
* License. See the file "COPYING" in the main directory of this archive
* for more details.
*
* commpage, currently used for Virtual COP0 registers.
* Mapped into the guest kernel @ 0x0.
*
* Copyright (C) 2012 MIPS Technologies, Inc. All rights reserved.
* Authors: Sanjay Lal <sanjayl@kymasys.com>
*/
#include <linux/errno.h>
#include <linux/err.h>
#include <linux/module.h>
#include <linux/vmalloc.h>
#include <linux/fs.h>
#include <linux/bootmem.h>
#include <asm/page.h>
#include <asm/cacheflush.h>
#include <asm/mmu_context.h>
#include <linux/kvm_host.h>
#include "kvm_mips_comm.h"
void kvm_mips_commpage_init(struct kvm_vcpu *vcpu)
{
struct kvm_mips_commpage *page = vcpu->arch.kseg0_commpage;
memset(page, 0, sizeof(struct kvm_mips_commpage));
/* Specific init values for fields */
vcpu->arch.cop0 = &page->cop0;
memset(vcpu->arch.cop0, 0, sizeof(struct mips_coproc));
return;
}

View File

@ -1,24 +0,0 @@
/*
* This file is subject to the terms and conditions of the GNU General Public
* License. See the file "COPYING" in the main directory of this archive
* for more details.
*
* Copyright (C) 2012 MIPS Technologies, Inc. All rights reserved.
* Authors: Sanjay Lal <sanjayl@kymasys.com>
*/
/*
* Define opcode values not defined in <asm/isnt.h>
*/
#ifndef __KVM_MIPS_OPCODE_H__
#define __KVM_MIPS_OPCODE_H__
/* COP0 Ops */
#define mfmcz_op 0x0b /* 01011 */
#define wrpgpr_op 0x0e /* 01110 */
/* COP0 opcodes (only if COP0 and CO=1): */
#define wait_op 0x20 /* 100000 */
#endif /* __KVM_MIPS_OPCODE_H__ */

View File

@ -16,7 +16,6 @@
#include <asm/stackframe.h>
#include <asm/asm-offsets.h>
#define _C_LABEL(x) x
#define MIPSX(name) mips32_ ## name
#define CALLFRAME_SIZ 32
@ -91,7 +90,10 @@ FEXPORT(__kvm_mips_vcpu_run)
LONG_S $24, PT_R24(k1)
LONG_S $25, PT_R25(k1)
/* XXXKYMA k0/k1 not saved, not being used if we got here through an ioctl() */
/*
* XXXKYMA k0/k1 not saved, not being used if we got here through
* an ioctl()
*/
LONG_S $28, PT_R28(k1)
LONG_S $29, PT_R29(k1)
@ -132,7 +134,10 @@ FEXPORT(__kvm_mips_vcpu_run)
/* Save the kernel gp as well */
LONG_S gp, VCPU_HOST_GP(k1)
/* Setup status register for running the guest in UM, interrupts are disabled */
/*
* Setup status register for running the guest in UM, interrupts
* are disabled
*/
li k0, (ST0_EXL | KSU_USER | ST0_BEV)
mtc0 k0, CP0_STATUS
ehb
@ -152,7 +157,6 @@ FEXPORT(__kvm_mips_vcpu_run)
mtc0 k0, CP0_STATUS
ehb
/* Set Guest EPC */
LONG_L t0, VCPU_PC(k1)
mtc0 t0, CP0_EPC
@ -165,7 +169,7 @@ FEXPORT(__kvm_mips_load_asid)
INT_ADDIU t1, k1, VCPU_GUEST_KERNEL_ASID /* (BD) */
INT_ADDIU t1, k1, VCPU_GUEST_USER_ASID /* else user */
1:
/* t1: contains the base of the ASID array, need to get the cpu id */
/* t1: contains the base of the ASID array, need to get the cpu id */
LONG_L t2, TI_CPU($28) /* smp_processor_id */
INT_SLL t2, t2, 2 /* x4 */
REG_ADDU t3, t1, t2
@ -229,9 +233,7 @@ FEXPORT(__kvm_mips_load_k0k1)
eret
VECTOR(MIPSX(exception), unknown)
/*
* Find out what mode we came from and jump to the proper handler.
*/
/* Find out what mode we came from and jump to the proper handler. */
mtc0 k0, CP0_ERROREPC #01: Save guest k0
ehb #02:
@ -239,7 +241,8 @@ VECTOR(MIPSX(exception), unknown)
INT_SRL k0, k0, 10 #03: Get rid of CPUNum
INT_SLL k0, k0, 10 #04
LONG_S k1, 0x3000(k0) #05: Save k1 @ offset 0x3000
INT_ADDIU k0, k0, 0x2000 #06: Exception handler is installed @ offset 0x2000
INT_ADDIU k0, k0, 0x2000 #06: Exception handler is
# installed @ offset 0x2000
j k0 #07: jump to the function
nop #08: branch delay slot
VECTOR_END(MIPSX(exceptionEnd))
@ -248,7 +251,6 @@ VECTOR_END(MIPSX(exceptionEnd))
/*
* Generic Guest exception handler. We end up here when the guest
* does something that causes a trap to kernel mode.
*
*/
NESTED (MIPSX(GuestException), CALLFRAME_SIZ, ra)
/* Get the VCPU pointer from DDTATA_LO */
@ -290,9 +292,7 @@ NESTED (MIPSX(GuestException), CALLFRAME_SIZ, ra)
LONG_S $30, VCPU_R30(k1)
LONG_S $31, VCPU_R31(k1)
/* We need to save hi/lo and restore them on
* the way out
*/
/* We need to save hi/lo and restore them on the way out */
mfhi t0
LONG_S t0, VCPU_HI(k1)
@ -321,8 +321,10 @@ NESTED (MIPSX(GuestException), CALLFRAME_SIZ, ra)
/* Save pointer to run in s0, will be saved by the compiler */
move s0, a0
/* Save Host level EPC, BadVaddr and Cause to VCPU, useful to
* process the exception */
/*
* Save Host level EPC, BadVaddr and Cause to VCPU, useful to
* process the exception
*/
mfc0 k0,CP0_EPC
LONG_S k0, VCPU_PC(k1)
@ -351,7 +353,6 @@ NESTED (MIPSX(GuestException), CALLFRAME_SIZ, ra)
LONG_L k0, VCPU_HOST_EBASE(k1)
mtc0 k0,CP0_EBASE
/* Now that the new EBASE has been loaded, unset BEV and KSU_USER */
.set at
and v0, v0, ~(ST0_EXL | KSU_USER | ST0_IE)
@ -369,7 +370,8 @@ NESTED (MIPSX(GuestException), CALLFRAME_SIZ, ra)
/* Saved host state */
INT_ADDIU sp, sp, -PT_SIZE
/* XXXKYMA do we need to load the host ASID, maybe not because the
/*
* XXXKYMA do we need to load the host ASID, maybe not because the
* kernel entries are marked GLOBAL, need to verify
*/
@ -383,9 +385,11 @@ NESTED (MIPSX(GuestException), CALLFRAME_SIZ, ra)
/* Jump to handler */
FEXPORT(__kvm_mips_jump_to_handler)
/* XXXKYMA: not sure if this is safe, how large is the stack??
/*
* XXXKYMA: not sure if this is safe, how large is the stack??
* Now jump to the kvm_mips_handle_exit() to see if we can deal
* with this in the kernel */
* with this in the kernel
*/
PTR_LA t9, kvm_mips_handle_exit
jalr.hb t9
INT_ADDIU sp, sp, -CALLFRAME_SIZ /* BD Slot */
@ -394,7 +398,8 @@ FEXPORT(__kvm_mips_jump_to_handler)
di
ehb
/* XXXKYMA: k0/k1 could have been blown away if we processed
/*
* XXXKYMA: k0/k1 could have been blown away if we processed
* an exception while we were handling the exception from the
* guest, reload k1
*/
@ -402,7 +407,8 @@ FEXPORT(__kvm_mips_jump_to_handler)
move k1, s1
INT_ADDIU k1, k1, VCPU_HOST_ARCH
/* Check return value, should tell us if we are returning to the
/*
* Check return value, should tell us if we are returning to the
* host (handle I/O etc)or resuming the guest
*/
andi t0, v0, RESUME_HOST
@ -521,8 +527,10 @@ __kvm_mips_return_to_host:
LONG_L $0, PT_R0(k1)
LONG_L $1, PT_R1(k1)
/* r2/v0 is the return code, shift it down by 2 (arithmetic)
* to recover the err code */
/*
* r2/v0 is the return code, shift it down by 2 (arithmetic)
* to recover the err code
*/
INT_SRA k0, v0, 2
move $2, k0
@ -566,7 +574,6 @@ __kvm_mips_return_to_host:
PTR_LI k0, 0x2000000F
mtc0 k0, CP0_HWRENA
/* Restore RA, which is the address we will return to */
LONG_L ra, PT_R31(k1)
j ra

View File

@ -7,7 +7,7 @@
*
* Copyright (C) 2012 MIPS Technologies, Inc. All rights reserved.
* Authors: Sanjay Lal <sanjayl@kymasys.com>
*/
*/
#include <linux/errno.h>
#include <linux/err.h>
@ -21,8 +21,8 @@
#include <linux/kvm_host.h>
#include "kvm_mips_int.h"
#include "kvm_mips_comm.h"
#include "interrupt.h"
#include "commpage.h"
#define CREATE_TRACE_POINTS
#include "trace.h"
@ -31,38 +31,41 @@
#define VECTORSPACING 0x100 /* for EI/VI mode */
#endif
#define VCPU_STAT(x) offsetof(struct kvm_vcpu, stat.x), KVM_STAT_VCPU
#define VCPU_STAT(x) offsetof(struct kvm_vcpu, stat.x)
struct kvm_stats_debugfs_item debugfs_entries[] = {
{ "wait", VCPU_STAT(wait_exits) },
{ "cache", VCPU_STAT(cache_exits) },
{ "signal", VCPU_STAT(signal_exits) },
{ "interrupt", VCPU_STAT(int_exits) },
{ "cop_unsuable", VCPU_STAT(cop_unusable_exits) },
{ "tlbmod", VCPU_STAT(tlbmod_exits) },
{ "tlbmiss_ld", VCPU_STAT(tlbmiss_ld_exits) },
{ "tlbmiss_st", VCPU_STAT(tlbmiss_st_exits) },
{ "addrerr_st", VCPU_STAT(addrerr_st_exits) },
{ "addrerr_ld", VCPU_STAT(addrerr_ld_exits) },
{ "syscall", VCPU_STAT(syscall_exits) },
{ "resvd_inst", VCPU_STAT(resvd_inst_exits) },
{ "break_inst", VCPU_STAT(break_inst_exits) },
{ "flush_dcache", VCPU_STAT(flush_dcache_exits) },
{ "halt_wakeup", VCPU_STAT(halt_wakeup) },
{ "wait", VCPU_STAT(wait_exits), KVM_STAT_VCPU },
{ "cache", VCPU_STAT(cache_exits), KVM_STAT_VCPU },
{ "signal", VCPU_STAT(signal_exits), KVM_STAT_VCPU },
{ "interrupt", VCPU_STAT(int_exits), KVM_STAT_VCPU },
{ "cop_unsuable", VCPU_STAT(cop_unusable_exits), KVM_STAT_VCPU },
{ "tlbmod", VCPU_STAT(tlbmod_exits), KVM_STAT_VCPU },
{ "tlbmiss_ld", VCPU_STAT(tlbmiss_ld_exits), KVM_STAT_VCPU },
{ "tlbmiss_st", VCPU_STAT(tlbmiss_st_exits), KVM_STAT_VCPU },
{ "addrerr_st", VCPU_STAT(addrerr_st_exits), KVM_STAT_VCPU },
{ "addrerr_ld", VCPU_STAT(addrerr_ld_exits), KVM_STAT_VCPU },
{ "syscall", VCPU_STAT(syscall_exits), KVM_STAT_VCPU },
{ "resvd_inst", VCPU_STAT(resvd_inst_exits), KVM_STAT_VCPU },
{ "break_inst", VCPU_STAT(break_inst_exits), KVM_STAT_VCPU },
{ "flush_dcache", VCPU_STAT(flush_dcache_exits), KVM_STAT_VCPU },
{ "halt_wakeup", VCPU_STAT(halt_wakeup), KVM_STAT_VCPU },
{NULL}
};
static int kvm_mips_reset_vcpu(struct kvm_vcpu *vcpu)
{
int i;
for_each_possible_cpu(i) {
vcpu->arch.guest_kernel_asid[i] = 0;
vcpu->arch.guest_user_asid[i] = 0;
}
return 0;
}
/* XXXKYMA: We are simulatoring a processor that has the WII bit set in Config7, so we
* are "runnable" if interrupts are pending
/*
* XXXKYMA: We are simulatoring a processor that has the WII bit set in
* Config7, so we are "runnable" if interrupts are pending
*/
int kvm_arch_vcpu_runnable(struct kvm_vcpu *vcpu)
{
@ -94,16 +97,17 @@ void kvm_arch_hardware_unsetup(void)
void kvm_arch_check_processor_compat(void *rtn)
{
int *r = (int *)rtn;
*r = 0;
return;
*(int *)rtn = 0;
}
static void kvm_mips_init_tlbs(struct kvm *kvm)
{
unsigned long wired;
/* Add a wired entry to the TLB, it is used to map the commpage to the Guest kernel */
/*
* Add a wired entry to the TLB, it is used to map the commpage to
* the Guest kernel
*/
wired = read_c0_wired();
write_c0_wired(wired + 1);
mtc0_tlbw_hazard();
@ -130,7 +134,6 @@ int kvm_arch_init_vm(struct kvm *kvm, unsigned long type)
on_each_cpu(kvm_mips_init_vm_percpu, kvm, 1);
}
return 0;
}
@ -185,8 +188,8 @@ void kvm_arch_destroy_vm(struct kvm *kvm)
}
}
long
kvm_arch_dev_ioctl(struct file *filp, unsigned int ioctl, unsigned long arg)
long kvm_arch_dev_ioctl(struct file *filp, unsigned int ioctl,
unsigned long arg)
{
return -ENOIOCTLCMD;
}
@ -207,20 +210,20 @@ void kvm_arch_memslots_updated(struct kvm *kvm)
}
int kvm_arch_prepare_memory_region(struct kvm *kvm,
struct kvm_memory_slot *memslot,
struct kvm_userspace_memory_region *mem,
enum kvm_mr_change change)
struct kvm_memory_slot *memslot,
struct kvm_userspace_memory_region *mem,
enum kvm_mr_change change)
{
return 0;
}
void kvm_arch_commit_memory_region(struct kvm *kvm,
struct kvm_userspace_memory_region *mem,
const struct kvm_memory_slot *old,
enum kvm_mr_change change)
struct kvm_userspace_memory_region *mem,
const struct kvm_memory_slot *old,
enum kvm_mr_change change)
{
unsigned long npages = 0;
int i, err = 0;
int i;
kvm_debug("%s: kvm: %p slot: %d, GPA: %llx, size: %llx, QVA: %llx\n",
__func__, kvm, mem->slot, mem->guest_phys_addr,
@ -238,21 +241,17 @@ void kvm_arch_commit_memory_region(struct kvm *kvm,
if (!kvm->arch.guest_pmap) {
kvm_err("Failed to allocate guest PMAP");
err = -ENOMEM;
goto out;
return;
}
kvm_debug("Allocated space for Guest PMAP Table (%ld pages) @ %p\n",
npages, kvm->arch.guest_pmap);
/* Now setup the page table */
for (i = 0; i < npages; i++) {
for (i = 0; i < npages; i++)
kvm->arch.guest_pmap[i] = KVM_INVALID_PAGE;
}
}
}
out:
return;
}
void kvm_arch_flush_shadow_all(struct kvm *kvm)
@ -270,8 +269,6 @@ void kvm_arch_flush_shadow(struct kvm *kvm)
struct kvm_vcpu *kvm_arch_vcpu_create(struct kvm *kvm, unsigned int id)
{
extern char mips32_exception[], mips32_exceptionEnd[];
extern char mips32_GuestException[], mips32_GuestExceptionEnd[];
int err, size, offset;
void *gebase;
int i;
@ -290,14 +287,14 @@ struct kvm_vcpu *kvm_arch_vcpu_create(struct kvm *kvm, unsigned int id)
kvm_debug("kvm @ %p: create cpu %d at %p\n", kvm, id, vcpu);
/* Allocate space for host mode exception handlers that handle
/*
* Allocate space for host mode exception handlers that handle
* guest mode exits
*/
if (cpu_has_veic || cpu_has_vint) {
if (cpu_has_veic || cpu_has_vint)
size = 0x200 + VECTORSPACING * 64;
} else {
else
size = 0x4000;
}
/* Save Linux EBASE */
vcpu->arch.host_ebase = (void *)read_c0_ebase();
@ -345,7 +342,10 @@ struct kvm_vcpu *kvm_arch_vcpu_create(struct kvm *kvm, unsigned int id)
local_flush_icache_range((unsigned long)gebase,
(unsigned long)gebase + ALIGN(size, PAGE_SIZE));
/* Allocate comm page for guest kernel, a TLB will be reserved for mapping GVA @ 0xFFFF8000 to this page */
/*
* Allocate comm page for guest kernel, a TLB will be reserved for
* mapping GVA @ 0xFFFF8000 to this page
*/
vcpu->arch.kseg0_commpage = kzalloc(PAGE_SIZE << 1, GFP_KERNEL);
if (!vcpu->arch.kseg0_commpage) {
@ -392,9 +392,8 @@ void kvm_arch_vcpu_destroy(struct kvm_vcpu *vcpu)
kvm_arch_vcpu_free(vcpu);
}
int
kvm_arch_vcpu_ioctl_set_guest_debug(struct kvm_vcpu *vcpu,
struct kvm_guest_debug *dbg)
int kvm_arch_vcpu_ioctl_set_guest_debug(struct kvm_vcpu *vcpu,
struct kvm_guest_debug *dbg)
{
return -ENOIOCTLCMD;
}
@ -431,8 +430,8 @@ int kvm_arch_vcpu_ioctl_run(struct kvm_vcpu *vcpu, struct kvm_run *run)
return r;
}
int
kvm_vcpu_ioctl_interrupt(struct kvm_vcpu *vcpu, struct kvm_mips_interrupt *irq)
int kvm_vcpu_ioctl_interrupt(struct kvm_vcpu *vcpu,
struct kvm_mips_interrupt *irq)
{
int intr = (int)irq->irq;
struct kvm_vcpu *dvcpu = NULL;
@ -459,23 +458,20 @@ kvm_vcpu_ioctl_interrupt(struct kvm_vcpu *vcpu, struct kvm_mips_interrupt *irq)
dvcpu->arch.wait = 0;
if (waitqueue_active(&dvcpu->wq)) {
if (waitqueue_active(&dvcpu->wq))
wake_up_interruptible(&dvcpu->wq);
}
return 0;
}
int
kvm_arch_vcpu_ioctl_get_mpstate(struct kvm_vcpu *vcpu,
struct kvm_mp_state *mp_state)
int kvm_arch_vcpu_ioctl_get_mpstate(struct kvm_vcpu *vcpu,
struct kvm_mp_state *mp_state)
{
return -ENOIOCTLCMD;
}
int
kvm_arch_vcpu_ioctl_set_mpstate(struct kvm_vcpu *vcpu,
struct kvm_mp_state *mp_state)
int kvm_arch_vcpu_ioctl_set_mpstate(struct kvm_vcpu *vcpu,
struct kvm_mp_state *mp_state)
{
return -ENOIOCTLCMD;
}
@ -632,10 +628,12 @@ static int kvm_mips_get_reg(struct kvm_vcpu *vcpu,
}
if ((reg->id & KVM_REG_SIZE_MASK) == KVM_REG_SIZE_U64) {
u64 __user *uaddr64 = (u64 __user *)(long)reg->addr;
return put_user(v, uaddr64);
} else if ((reg->id & KVM_REG_SIZE_MASK) == KVM_REG_SIZE_U32) {
u32 __user *uaddr32 = (u32 __user *)(long)reg->addr;
u32 v32 = (u32)v;
return put_user(v32, uaddr32);
} else {
return -EINVAL;
@ -728,8 +726,8 @@ static int kvm_mips_set_reg(struct kvm_vcpu *vcpu,
return 0;
}
long
kvm_arch_vcpu_ioctl(struct file *filp, unsigned int ioctl, unsigned long arg)
long kvm_arch_vcpu_ioctl(struct file *filp, unsigned int ioctl,
unsigned long arg)
{
struct kvm_vcpu *vcpu = filp->private_data;
void __user *argp = (void __user *)arg;
@ -739,6 +737,7 @@ kvm_arch_vcpu_ioctl(struct file *filp, unsigned int ioctl, unsigned long arg)
case KVM_SET_ONE_REG:
case KVM_GET_ONE_REG: {
struct kvm_one_reg reg;
if (copy_from_user(&reg, argp, sizeof(reg)))
return -EFAULT;
if (ioctl == KVM_SET_ONE_REG)
@ -773,6 +772,7 @@ kvm_arch_vcpu_ioctl(struct file *filp, unsigned int ioctl, unsigned long arg)
case KVM_INTERRUPT:
{
struct kvm_mips_interrupt irq;
r = -EFAULT;
if (copy_from_user(&irq, argp, sizeof(irq)))
goto out;
@ -791,9 +791,7 @@ kvm_arch_vcpu_ioctl(struct file *filp, unsigned int ioctl, unsigned long arg)
return r;
}
/*
* Get (and clear) the dirty memory log for a memory slot.
*/
/* Get (and clear) the dirty memory log for a memory slot. */
int kvm_vm_ioctl_get_dirty_log(struct kvm *kvm, struct kvm_dirty_log *log)
{
struct kvm_memory_slot *memslot;
@ -815,8 +813,8 @@ int kvm_vm_ioctl_get_dirty_log(struct kvm *kvm, struct kvm_dirty_log *log)
ga = memslot->base_gfn << PAGE_SHIFT;
ga_end = ga + (memslot->npages << PAGE_SHIFT);
printk("%s: dirty, ga: %#lx, ga_end %#lx\n", __func__, ga,
ga_end);
kvm_info("%s: dirty, ga: %#lx, ga_end %#lx\n", __func__, ga,
ga_end);
n = kvm_dirty_bitmap_bytes(memslot);
memset(memslot->dirty_bitmap, 0, n);
@ -843,16 +841,12 @@ long kvm_arch_vm_ioctl(struct file *filp, unsigned int ioctl, unsigned long arg)
int kvm_arch_init(void *opaque)
{
int ret;
if (kvm_mips_callbacks) {
kvm_err("kvm: module already exists\n");
return -EEXIST;
}
ret = kvm_mips_emulation_init(&kvm_mips_callbacks);
return ret;
return kvm_mips_emulation_init(&kvm_mips_callbacks);
}
void kvm_arch_exit(void)
@ -860,14 +854,14 @@ void kvm_arch_exit(void)
kvm_mips_callbacks = NULL;
}
int
kvm_arch_vcpu_ioctl_get_sregs(struct kvm_vcpu *vcpu, struct kvm_sregs *sregs)
int kvm_arch_vcpu_ioctl_get_sregs(struct kvm_vcpu *vcpu,
struct kvm_sregs *sregs)
{
return -ENOIOCTLCMD;
}
int
kvm_arch_vcpu_ioctl_set_sregs(struct kvm_vcpu *vcpu, struct kvm_sregs *sregs)
int kvm_arch_vcpu_ioctl_set_sregs(struct kvm_vcpu *vcpu,
struct kvm_sregs *sregs)
{
return -ENOIOCTLCMD;
}
@ -923,24 +917,25 @@ int kvm_arch_vcpu_dump_regs(struct kvm_vcpu *vcpu)
if (!vcpu)
return -1;
printk("VCPU Register Dump:\n");
printk("\tpc = 0x%08lx\n", vcpu->arch.pc);
printk("\texceptions: %08lx\n", vcpu->arch.pending_exceptions);
kvm_debug("VCPU Register Dump:\n");
kvm_debug("\tpc = 0x%08lx\n", vcpu->arch.pc);
kvm_debug("\texceptions: %08lx\n", vcpu->arch.pending_exceptions);
for (i = 0; i < 32; i += 4) {
printk("\tgpr%02d: %08lx %08lx %08lx %08lx\n", i,
kvm_debug("\tgpr%02d: %08lx %08lx %08lx %08lx\n", i,
vcpu->arch.gprs[i],
vcpu->arch.gprs[i + 1],
vcpu->arch.gprs[i + 2], vcpu->arch.gprs[i + 3]);
}
printk("\thi: 0x%08lx\n", vcpu->arch.hi);
printk("\tlo: 0x%08lx\n", vcpu->arch.lo);
kvm_debug("\thi: 0x%08lx\n", vcpu->arch.hi);
kvm_debug("\tlo: 0x%08lx\n", vcpu->arch.lo);
cop0 = vcpu->arch.cop0;
printk("\tStatus: 0x%08lx, Cause: 0x%08lx\n",
kvm_read_c0_guest_status(cop0), kvm_read_c0_guest_cause(cop0));
kvm_debug("\tStatus: 0x%08lx, Cause: 0x%08lx\n",
kvm_read_c0_guest_status(cop0),
kvm_read_c0_guest_cause(cop0));
printk("\tEPC: 0x%08lx\n", kvm_read_c0_guest_epc(cop0));
kvm_debug("\tEPC: 0x%08lx\n", kvm_read_c0_guest_epc(cop0));
return 0;
}
@ -980,14 +975,11 @@ static void kvm_mips_comparecount_func(unsigned long data)
kvm_mips_callbacks->queue_timer_int(vcpu);
vcpu->arch.wait = 0;
if (waitqueue_active(&vcpu->wq)) {
if (waitqueue_active(&vcpu->wq))
wake_up_interruptible(&vcpu->wq);
}
}
/*
* low level hrtimer wake routine.
*/
/* low level hrtimer wake routine */
static enum hrtimer_restart kvm_mips_comparecount_wakeup(struct hrtimer *timer)
{
struct kvm_vcpu *vcpu;
@ -1008,11 +1000,10 @@ int kvm_arch_vcpu_init(struct kvm_vcpu *vcpu)
void kvm_arch_vcpu_uninit(struct kvm_vcpu *vcpu)
{
return;
}
int
kvm_arch_vcpu_ioctl_translate(struct kvm_vcpu *vcpu, struct kvm_translation *tr)
int kvm_arch_vcpu_ioctl_translate(struct kvm_vcpu *vcpu,
struct kvm_translation *tr)
{
return 0;
}
@ -1023,8 +1014,7 @@ int kvm_arch_vcpu_setup(struct kvm_vcpu *vcpu)
return kvm_mips_callbacks->vcpu_setup(vcpu);
}
static
void kvm_mips_set_c0_status(void)
static void kvm_mips_set_c0_status(void)
{
uint32_t status = read_c0_status();
@ -1054,7 +1044,10 @@ int kvm_mips_handle_exit(struct kvm_run *run, struct kvm_vcpu *vcpu)
run->exit_reason = KVM_EXIT_UNKNOWN;
run->ready_for_interrupt_injection = 1;
/* Set the appropriate status bits based on host CPU features, before we hit the scheduler */
/*
* Set the appropriate status bits based on host CPU features,
* before we hit the scheduler
*/
kvm_mips_set_c0_status();
local_irq_enable();
@ -1062,7 +1055,8 @@ int kvm_mips_handle_exit(struct kvm_run *run, struct kvm_vcpu *vcpu)
kvm_debug("kvm_mips_handle_exit: cause: %#x, PC: %p, kvm_run: %p, kvm_vcpu: %p\n",
cause, opc, run, vcpu);
/* Do a privilege check, if in UM most of these exit conditions end up
/*
* Do a privilege check, if in UM most of these exit conditions end up
* causing an exception to be delivered to the Guest Kernel
*/
er = kvm_mips_check_privilege(cause, opc, run, vcpu);
@ -1081,9 +1075,8 @@ int kvm_mips_handle_exit(struct kvm_run *run, struct kvm_vcpu *vcpu)
++vcpu->stat.int_exits;
trace_kvm_exit(vcpu, INT_EXITS);
if (need_resched()) {
if (need_resched())
cond_resched();
}
ret = RESUME_GUEST;
break;
@ -1095,9 +1088,8 @@ int kvm_mips_handle_exit(struct kvm_run *run, struct kvm_vcpu *vcpu)
trace_kvm_exit(vcpu, COP_UNUSABLE_EXITS);
ret = kvm_mips_callbacks->handle_cop_unusable(vcpu);
/* XXXKYMA: Might need to return to user space */
if (run->exit_reason == KVM_EXIT_IRQ_WINDOW_OPEN) {
if (run->exit_reason == KVM_EXIT_IRQ_WINDOW_OPEN)
ret = RESUME_HOST;
}
break;
case T_TLB_MOD:
@ -1107,10 +1099,9 @@ int kvm_mips_handle_exit(struct kvm_run *run, struct kvm_vcpu *vcpu)
break;
case T_TLB_ST_MISS:
kvm_debug
("TLB ST fault: cause %#x, status %#lx, PC: %p, BadVaddr: %#lx\n",
cause, kvm_read_c0_guest_status(vcpu->arch.cop0), opc,
badvaddr);
kvm_debug("TLB ST fault: cause %#x, status %#lx, PC: %p, BadVaddr: %#lx\n",
cause, kvm_read_c0_guest_status(vcpu->arch.cop0), opc,
badvaddr);
++vcpu->stat.tlbmiss_st_exits;
trace_kvm_exit(vcpu, TLBMISS_ST_EXITS);
@ -1157,10 +1148,9 @@ int kvm_mips_handle_exit(struct kvm_run *run, struct kvm_vcpu *vcpu)
break;
default:
kvm_err
("Exception Code: %d, not yet handled, @ PC: %p, inst: 0x%08x BadVaddr: %#lx Status: %#lx\n",
exccode, opc, kvm_get_inst(opc, vcpu), badvaddr,
kvm_read_c0_guest_status(vcpu->arch.cop0));
kvm_err("Exception Code: %d, not yet handled, @ PC: %p, inst: 0x%08x BadVaddr: %#lx Status: %#lx\n",
exccode, opc, kvm_get_inst(opc, vcpu), badvaddr,
kvm_read_c0_guest_status(vcpu->arch.cop0));
kvm_arch_vcpu_dump_regs(vcpu);
run->exit_reason = KVM_EXIT_INTERNAL_ERROR;
ret = RESUME_HOST;
@ -1175,7 +1165,7 @@ int kvm_mips_handle_exit(struct kvm_run *run, struct kvm_vcpu *vcpu)
kvm_mips_deliver_interrupts(vcpu, cause);
if (!(ret & RESUME_HOST)) {
/* Only check for signals if not already exiting to userspace */
/* Only check for signals if not already exiting to userspace */
if (signal_pending(current)) {
run->exit_reason = KVM_EXIT_INTR;
ret = (-EINTR << 2) | RESUME_HOST;
@ -1196,11 +1186,13 @@ int __init kvm_mips_init(void)
if (ret)
return ret;
/* On MIPS, kernel modules are executed from "mapped space", which requires TLBs.
* The TLB handling code is statically linked with the rest of the kernel (kvm_tlb.c)
* to avoid the possibility of double faulting. The issue is that the TLB code
* references routines that are part of the the KVM module,
* which are only available once the module is loaded.
/*
* On MIPS, kernel modules are executed from "mapped space", which
* requires TLBs. The TLB handling code is statically linked with
* the rest of the kernel (tlb.c) to avoid the possibility of
* double faulting. The issue is that the TLB code references
* routines that are part of the the KVM module, which are only
* available once the module is loaded.
*/
kvm_mips_gfn_to_pfn = gfn_to_pfn;
kvm_mips_release_pfn_clean = kvm_release_pfn_clean;

22
arch/mips/kvm/opcode.h Normal file
View File

@ -0,0 +1,22 @@
/*
* This file is subject to the terms and conditions of the GNU General Public
* License. See the file "COPYING" in the main directory of this archive
* for more details.
*
* Copyright (C) 2012 MIPS Technologies, Inc. All rights reserved.
* Authors: Sanjay Lal <sanjayl@kymasys.com>
*/
/* Define opcode values not defined in <asm/isnt.h> */
#ifndef __KVM_MIPS_OPCODE_H__
#define __KVM_MIPS_OPCODE_H__
/* COP0 Ops */
#define mfmcz_op 0x0b /* 01011 */
#define wrpgpr_op 0x0e /* 01110 */
/* COP0 opcodes (only if COP0 and CO=1): */
#define wait_op 0x20 /* 100000 */
#endif /* __KVM_MIPS_OPCODE_H__ */

View File

@ -1,13 +1,13 @@
/*
* This file is subject to the terms and conditions of the GNU General Public
* License. See the file "COPYING" in the main directory of this archive
* for more details.
*
* KVM/MIPS: COP0 access histogram
*
* Copyright (C) 2012 MIPS Technologies, Inc. All rights reserved.
* Authors: Sanjay Lal <sanjayl@kymasys.com>
*/
* This file is subject to the terms and conditions of the GNU General Public
* License. See the file "COPYING" in the main directory of this archive
* for more details.
*
* KVM/MIPS: COP0 access histogram
*
* Copyright (C) 2012 MIPS Technologies, Inc. All rights reserved.
* Authors: Sanjay Lal <sanjayl@kymasys.com>
*/
#include <linux/kvm_host.h>
@ -63,20 +63,18 @@ char *kvm_cop0_str[N_MIPS_COPROC_REGS] = {
"DESAVE"
};
int kvm_mips_dump_stats(struct kvm_vcpu *vcpu)
void kvm_mips_dump_stats(struct kvm_vcpu *vcpu)
{
#ifdef CONFIG_KVM_MIPS_DEBUG_COP0_COUNTERS
int i, j;
printk("\nKVM VCPU[%d] COP0 Access Profile:\n", vcpu->vcpu_id);
kvm_info("\nKVM VCPU[%d] COP0 Access Profile:\n", vcpu->vcpu_id);
for (i = 0; i < N_MIPS_COPROC_REGS; i++) {
for (j = 0; j < N_MIPS_COPROC_SEL; j++) {
if (vcpu->arch.cop0->stat[i][j])
printk("%s[%d]: %lu\n", kvm_cop0_str[i], j,
vcpu->arch.cop0->stat[i][j]);
kvm_info("%s[%d]: %lu\n", kvm_cop0_str[i], j,
vcpu->arch.cop0->stat[i][j]);
}
}
#endif
return 0;
}

View File

@ -1,14 +1,14 @@
/*
* This file is subject to the terms and conditions of the GNU General Public
* License. See the file "COPYING" in the main directory of this archive
* for more details.
*
* KVM/MIPS TLB handling, this file is part of the Linux host kernel so that
* TLB handlers run from KSEG0
*
* Copyright (C) 2012 MIPS Technologies, Inc. All rights reserved.
* Authors: Sanjay Lal <sanjayl@kymasys.com>
*/
* This file is subject to the terms and conditions of the GNU General Public
* License. See the file "COPYING" in the main directory of this archive
* for more details.
*
* KVM/MIPS TLB handling, this file is part of the Linux host kernel so that
* TLB handlers run from KSEG0
*
* Copyright (C) 2012 MIPS Technologies, Inc. All rights reserved.
* Authors: Sanjay Lal <sanjayl@kymasys.com>
*/
#include <linux/sched.h>
#include <linux/smp.h>
@ -18,7 +18,6 @@
#include <linux/kvm_host.h>
#include <linux/srcu.h>
#include <asm/cpu.h>
#include <asm/bootinfo.h>
#include <asm/mmu_context.h>
@ -39,13 +38,13 @@ atomic_t kvm_mips_instance;
EXPORT_SYMBOL(kvm_mips_instance);
/* These function pointers are initialized once the KVM module is loaded */
pfn_t(*kvm_mips_gfn_to_pfn) (struct kvm *kvm, gfn_t gfn);
pfn_t (*kvm_mips_gfn_to_pfn)(struct kvm *kvm, gfn_t gfn);
EXPORT_SYMBOL(kvm_mips_gfn_to_pfn);
void (*kvm_mips_release_pfn_clean) (pfn_t pfn);
void (*kvm_mips_release_pfn_clean)(pfn_t pfn);
EXPORT_SYMBOL(kvm_mips_release_pfn_clean);
bool(*kvm_mips_is_error_pfn) (pfn_t pfn);
bool (*kvm_mips_is_error_pfn)(pfn_t pfn);
EXPORT_SYMBOL(kvm_mips_is_error_pfn);
uint32_t kvm_mips_get_kernel_asid(struct kvm_vcpu *vcpu)
@ -53,21 +52,17 @@ uint32_t kvm_mips_get_kernel_asid(struct kvm_vcpu *vcpu)
return vcpu->arch.guest_kernel_asid[smp_processor_id()] & ASID_MASK;
}
uint32_t kvm_mips_get_user_asid(struct kvm_vcpu *vcpu)
{
return vcpu->arch.guest_user_asid[smp_processor_id()] & ASID_MASK;
}
inline uint32_t kvm_mips_get_commpage_asid (struct kvm_vcpu *vcpu)
inline uint32_t kvm_mips_get_commpage_asid(struct kvm_vcpu *vcpu)
{
return vcpu->kvm->arch.commpage_tlb;
}
/*
* Structure defining an tlb entry data set.
*/
/* Structure defining an tlb entry data set. */
void kvm_mips_dump_host_tlbs(void)
{
@ -82,8 +77,8 @@ void kvm_mips_dump_host_tlbs(void)
old_entryhi = read_c0_entryhi();
old_pagemask = read_c0_pagemask();
printk("HOST TLBs:\n");
printk("ASID: %#lx\n", read_c0_entryhi() & ASID_MASK);
kvm_info("HOST TLBs:\n");
kvm_info("ASID: %#lx\n", read_c0_entryhi() & ASID_MASK);
for (i = 0; i < current_cpu_data.tlbsize; i++) {
write_c0_index(i);
@ -97,25 +92,26 @@ void kvm_mips_dump_host_tlbs(void)
tlb.tlb_lo1 = read_c0_entrylo1();
tlb.tlb_mask = read_c0_pagemask();
printk("TLB%c%3d Hi 0x%08lx ",
(tlb.tlb_lo0 | tlb.tlb_lo1) & MIPS3_PG_V ? ' ' : '*',
i, tlb.tlb_hi);
printk("Lo0=0x%09" PRIx64 " %c%c attr %lx ",
(uint64_t) mips3_tlbpfn_to_paddr(tlb.tlb_lo0),
(tlb.tlb_lo0 & MIPS3_PG_D) ? 'D' : ' ',
(tlb.tlb_lo0 & MIPS3_PG_G) ? 'G' : ' ',
(tlb.tlb_lo0 >> 3) & 7);
printk("Lo1=0x%09" PRIx64 " %c%c attr %lx sz=%lx\n",
(uint64_t) mips3_tlbpfn_to_paddr(tlb.tlb_lo1),
(tlb.tlb_lo1 & MIPS3_PG_D) ? 'D' : ' ',
(tlb.tlb_lo1 & MIPS3_PG_G) ? 'G' : ' ',
(tlb.tlb_lo1 >> 3) & 7, tlb.tlb_mask);
kvm_info("TLB%c%3d Hi 0x%08lx ",
(tlb.tlb_lo0 | tlb.tlb_lo1) & MIPS3_PG_V ? ' ' : '*',
i, tlb.tlb_hi);
kvm_info("Lo0=0x%09" PRIx64 " %c%c attr %lx ",
(uint64_t) mips3_tlbpfn_to_paddr(tlb.tlb_lo0),
(tlb.tlb_lo0 & MIPS3_PG_D) ? 'D' : ' ',
(tlb.tlb_lo0 & MIPS3_PG_G) ? 'G' : ' ',
(tlb.tlb_lo0 >> 3) & 7);
kvm_info("Lo1=0x%09" PRIx64 " %c%c attr %lx sz=%lx\n",
(uint64_t) mips3_tlbpfn_to_paddr(tlb.tlb_lo1),
(tlb.tlb_lo1 & MIPS3_PG_D) ? 'D' : ' ',
(tlb.tlb_lo1 & MIPS3_PG_G) ? 'G' : ' ',
(tlb.tlb_lo1 >> 3) & 7, tlb.tlb_mask);
}
write_c0_entryhi(old_entryhi);
write_c0_pagemask(old_pagemask);
mtc0_tlbw_hazard();
local_irq_restore(flags);
}
EXPORT_SYMBOL(kvm_mips_dump_host_tlbs);
void kvm_mips_dump_guest_tlbs(struct kvm_vcpu *vcpu)
{
@ -123,26 +119,27 @@ void kvm_mips_dump_guest_tlbs(struct kvm_vcpu *vcpu)
struct kvm_mips_tlb tlb;
int i;
printk("Guest TLBs:\n");
printk("Guest EntryHi: %#lx\n", kvm_read_c0_guest_entryhi(cop0));
kvm_info("Guest TLBs:\n");
kvm_info("Guest EntryHi: %#lx\n", kvm_read_c0_guest_entryhi(cop0));
for (i = 0; i < KVM_MIPS_GUEST_TLB_SIZE; i++) {
tlb = vcpu->arch.guest_tlb[i];
printk("TLB%c%3d Hi 0x%08lx ",
(tlb.tlb_lo0 | tlb.tlb_lo1) & MIPS3_PG_V ? ' ' : '*',
i, tlb.tlb_hi);
printk("Lo0=0x%09" PRIx64 " %c%c attr %lx ",
(uint64_t) mips3_tlbpfn_to_paddr(tlb.tlb_lo0),
(tlb.tlb_lo0 & MIPS3_PG_D) ? 'D' : ' ',
(tlb.tlb_lo0 & MIPS3_PG_G) ? 'G' : ' ',
(tlb.tlb_lo0 >> 3) & 7);
printk("Lo1=0x%09" PRIx64 " %c%c attr %lx sz=%lx\n",
(uint64_t) mips3_tlbpfn_to_paddr(tlb.tlb_lo1),
(tlb.tlb_lo1 & MIPS3_PG_D) ? 'D' : ' ',
(tlb.tlb_lo1 & MIPS3_PG_G) ? 'G' : ' ',
(tlb.tlb_lo1 >> 3) & 7, tlb.tlb_mask);
kvm_info("TLB%c%3d Hi 0x%08lx ",
(tlb.tlb_lo0 | tlb.tlb_lo1) & MIPS3_PG_V ? ' ' : '*',
i, tlb.tlb_hi);
kvm_info("Lo0=0x%09" PRIx64 " %c%c attr %lx ",
(uint64_t) mips3_tlbpfn_to_paddr(tlb.tlb_lo0),
(tlb.tlb_lo0 & MIPS3_PG_D) ? 'D' : ' ',
(tlb.tlb_lo0 & MIPS3_PG_G) ? 'G' : ' ',
(tlb.tlb_lo0 >> 3) & 7);
kvm_info("Lo1=0x%09" PRIx64 " %c%c attr %lx sz=%lx\n",
(uint64_t) mips3_tlbpfn_to_paddr(tlb.tlb_lo1),
(tlb.tlb_lo1 & MIPS3_PG_D) ? 'D' : ' ',
(tlb.tlb_lo1 & MIPS3_PG_G) ? 'G' : ' ',
(tlb.tlb_lo1 >> 3) & 7, tlb.tlb_mask);
}
}
EXPORT_SYMBOL(kvm_mips_dump_guest_tlbs);
static int kvm_mips_map_page(struct kvm *kvm, gfn_t gfn)
{
@ -152,7 +149,7 @@ static int kvm_mips_map_page(struct kvm *kvm, gfn_t gfn)
if (kvm->arch.guest_pmap[gfn] != KVM_INVALID_PAGE)
return 0;
srcu_idx = srcu_read_lock(&kvm->srcu);
srcu_idx = srcu_read_lock(&kvm->srcu);
pfn = kvm_mips_gfn_to_pfn(kvm, gfn);
if (kvm_mips_is_error_pfn(pfn)) {
@ -169,7 +166,7 @@ static int kvm_mips_map_page(struct kvm *kvm, gfn_t gfn)
/* Translate guest KSEG0 addresses to Host PA */
unsigned long kvm_mips_translate_guest_kseg0_to_hpa(struct kvm_vcpu *vcpu,
unsigned long gva)
unsigned long gva)
{
gfn_t gfn;
uint32_t offset = gva & ~PAGE_MASK;
@ -194,20 +191,20 @@ unsigned long kvm_mips_translate_guest_kseg0_to_hpa(struct kvm_vcpu *vcpu,
return (kvm->arch.guest_pmap[gfn] << PAGE_SHIFT) + offset;
}
EXPORT_SYMBOL(kvm_mips_translate_guest_kseg0_to_hpa);
/* XXXKYMA: Must be called with interrupts disabled */
/* set flush_dcache_mask == 0 if no dcache flush required */
int
kvm_mips_host_tlb_write(struct kvm_vcpu *vcpu, unsigned long entryhi,
unsigned long entrylo0, unsigned long entrylo1, int flush_dcache_mask)
int kvm_mips_host_tlb_write(struct kvm_vcpu *vcpu, unsigned long entryhi,
unsigned long entrylo0, unsigned long entrylo1,
int flush_dcache_mask)
{
unsigned long flags;
unsigned long old_entryhi;
volatile int idx;
int idx;
local_irq_save(flags);
old_entryhi = read_c0_entryhi();
write_c0_entryhi(entryhi);
mtc0_tlbw_hazard();
@ -240,12 +237,14 @@ kvm_mips_host_tlb_write(struct kvm_vcpu *vcpu, unsigned long entryhi,
if (flush_dcache_mask) {
if (entrylo0 & MIPS3_PG_V) {
++vcpu->stat.flush_dcache_exits;
flush_data_cache_page((entryhi & VPN2_MASK) & ~flush_dcache_mask);
flush_data_cache_page((entryhi & VPN2_MASK) &
~flush_dcache_mask);
}
if (entrylo1 & MIPS3_PG_V) {
++vcpu->stat.flush_dcache_exits;
flush_data_cache_page(((entryhi & VPN2_MASK) & ~flush_dcache_mask) |
(0x1 << PAGE_SHIFT));
flush_data_cache_page(((entryhi & VPN2_MASK) &
~flush_dcache_mask) |
(0x1 << PAGE_SHIFT));
}
}
@ -257,10 +256,9 @@ kvm_mips_host_tlb_write(struct kvm_vcpu *vcpu, unsigned long entryhi,
return 0;
}
/* XXXKYMA: Must be called with interrupts disabled */
int kvm_mips_handle_kseg0_tlb_fault(unsigned long badvaddr,
struct kvm_vcpu *vcpu)
struct kvm_vcpu *vcpu)
{
gfn_t gfn;
pfn_t pfn0, pfn1;
@ -270,7 +268,6 @@ int kvm_mips_handle_kseg0_tlb_fault(unsigned long badvaddr,
struct kvm *kvm = vcpu->kvm;
const int flush_dcache_mask = 0;
if (KVM_GUEST_KSEGX(badvaddr) != KVM_GUEST_KSEG0) {
kvm_err("%s: Invalid BadVaddr: %#lx\n", __func__, badvaddr);
kvm_mips_dump_host_tlbs();
@ -302,14 +299,15 @@ int kvm_mips_handle_kseg0_tlb_fault(unsigned long badvaddr,
}
entryhi = (vaddr | kvm_mips_get_kernel_asid(vcpu));
entrylo0 = mips3_paddr_to_tlbpfn(pfn0 << PAGE_SHIFT) | (0x3 << 3) | (1 << 2) |
(0x1 << 1);
entrylo1 = mips3_paddr_to_tlbpfn(pfn1 << PAGE_SHIFT) | (0x3 << 3) | (1 << 2) |
(0x1 << 1);
entrylo0 = mips3_paddr_to_tlbpfn(pfn0 << PAGE_SHIFT) | (0x3 << 3) |
(1 << 2) | (0x1 << 1);
entrylo1 = mips3_paddr_to_tlbpfn(pfn1 << PAGE_SHIFT) | (0x3 << 3) |
(1 << 2) | (0x1 << 1);
return kvm_mips_host_tlb_write(vcpu, entryhi, entrylo0, entrylo1,
flush_dcache_mask);
}
EXPORT_SYMBOL(kvm_mips_handle_kseg0_tlb_fault);
int kvm_mips_handle_commpage_tlb_fault(unsigned long badvaddr,
struct kvm_vcpu *vcpu)
@ -318,11 +316,10 @@ int kvm_mips_handle_commpage_tlb_fault(unsigned long badvaddr,
unsigned long flags, old_entryhi = 0, vaddr = 0;
unsigned long entrylo0 = 0, entrylo1 = 0;
pfn0 = CPHYSADDR(vcpu->arch.kseg0_commpage) >> PAGE_SHIFT;
pfn1 = 0;
entrylo0 = mips3_paddr_to_tlbpfn(pfn0 << PAGE_SHIFT) | (0x3 << 3) | (1 << 2) |
(0x1 << 1);
entrylo0 = mips3_paddr_to_tlbpfn(pfn0 << PAGE_SHIFT) | (0x3 << 3) |
(1 << 2) | (0x1 << 1);
entrylo1 = 0;
local_irq_save(flags);
@ -341,9 +338,9 @@ int kvm_mips_handle_commpage_tlb_fault(unsigned long badvaddr,
mtc0_tlbw_hazard();
tlbw_use_hazard();
kvm_debug ("@ %#lx idx: %2d [entryhi(R): %#lx] entrylo0 (R): 0x%08lx, entrylo1(R): 0x%08lx\n",
vcpu->arch.pc, read_c0_index(), read_c0_entryhi(),
read_c0_entrylo0(), read_c0_entrylo1());
kvm_debug("@ %#lx idx: %2d [entryhi(R): %#lx] entrylo0 (R): 0x%08lx, entrylo1(R): 0x%08lx\n",
vcpu->arch.pc, read_c0_index(), read_c0_entryhi(),
read_c0_entrylo0(), read_c0_entrylo1());
/* Restore old ASID */
write_c0_entryhi(old_entryhi);
@ -353,28 +350,33 @@ int kvm_mips_handle_commpage_tlb_fault(unsigned long badvaddr,
return 0;
}
EXPORT_SYMBOL(kvm_mips_handle_commpage_tlb_fault);
int
kvm_mips_handle_mapped_seg_tlb_fault(struct kvm_vcpu *vcpu,
struct kvm_mips_tlb *tlb, unsigned long *hpa0, unsigned long *hpa1)
int kvm_mips_handle_mapped_seg_tlb_fault(struct kvm_vcpu *vcpu,
struct kvm_mips_tlb *tlb,
unsigned long *hpa0,
unsigned long *hpa1)
{
unsigned long entryhi = 0, entrylo0 = 0, entrylo1 = 0;
struct kvm *kvm = vcpu->kvm;
pfn_t pfn0, pfn1;
if ((tlb->tlb_hi & VPN2_MASK) == 0) {
pfn0 = 0;
pfn1 = 0;
} else {
if (kvm_mips_map_page(kvm, mips3_tlbpfn_to_paddr(tlb->tlb_lo0) >> PAGE_SHIFT) < 0)
if (kvm_mips_map_page(kvm, mips3_tlbpfn_to_paddr(tlb->tlb_lo0)
>> PAGE_SHIFT) < 0)
return -1;
if (kvm_mips_map_page(kvm, mips3_tlbpfn_to_paddr(tlb->tlb_lo1) >> PAGE_SHIFT) < 0)
if (kvm_mips_map_page(kvm, mips3_tlbpfn_to_paddr(tlb->tlb_lo1)
>> PAGE_SHIFT) < 0)
return -1;
pfn0 = kvm->arch.guest_pmap[mips3_tlbpfn_to_paddr(tlb->tlb_lo0) >> PAGE_SHIFT];
pfn1 = kvm->arch.guest_pmap[mips3_tlbpfn_to_paddr(tlb->tlb_lo1) >> PAGE_SHIFT];
pfn0 = kvm->arch.guest_pmap[mips3_tlbpfn_to_paddr(tlb->tlb_lo0)
>> PAGE_SHIFT];
pfn1 = kvm->arch.guest_pmap[mips3_tlbpfn_to_paddr(tlb->tlb_lo1)
>> PAGE_SHIFT];
}
if (hpa0)
@ -385,11 +387,12 @@ kvm_mips_handle_mapped_seg_tlb_fault(struct kvm_vcpu *vcpu,
/* Get attributes from the Guest TLB */
entryhi = (tlb->tlb_hi & VPN2_MASK) | (KVM_GUEST_KERNEL_MODE(vcpu) ?
kvm_mips_get_kernel_asid(vcpu) : kvm_mips_get_user_asid(vcpu));
kvm_mips_get_kernel_asid(vcpu) :
kvm_mips_get_user_asid(vcpu));
entrylo0 = mips3_paddr_to_tlbpfn(pfn0 << PAGE_SHIFT) | (0x3 << 3) |
(tlb->tlb_lo0 & MIPS3_PG_D) | (tlb->tlb_lo0 & MIPS3_PG_V);
(tlb->tlb_lo0 & MIPS3_PG_D) | (tlb->tlb_lo0 & MIPS3_PG_V);
entrylo1 = mips3_paddr_to_tlbpfn(pfn1 << PAGE_SHIFT) | (0x3 << 3) |
(tlb->tlb_lo1 & MIPS3_PG_D) | (tlb->tlb_lo1 & MIPS3_PG_V);
(tlb->tlb_lo1 & MIPS3_PG_D) | (tlb->tlb_lo1 & MIPS3_PG_V);
kvm_debug("@ %#lx tlb_lo0: 0x%08lx tlb_lo1: 0x%08lx\n", vcpu->arch.pc,
tlb->tlb_lo0, tlb->tlb_lo1);
@ -397,6 +400,7 @@ kvm_mips_handle_mapped_seg_tlb_fault(struct kvm_vcpu *vcpu,
return kvm_mips_host_tlb_write(vcpu, entryhi, entrylo0, entrylo1,
tlb->tlb_mask);
}
EXPORT_SYMBOL(kvm_mips_handle_mapped_seg_tlb_fault);
int kvm_mips_guest_tlb_lookup(struct kvm_vcpu *vcpu, unsigned long entryhi)
{
@ -404,10 +408,9 @@ int kvm_mips_guest_tlb_lookup(struct kvm_vcpu *vcpu, unsigned long entryhi)
int index = -1;
struct kvm_mips_tlb *tlb = vcpu->arch.guest_tlb;
for (i = 0; i < KVM_MIPS_GUEST_TLB_SIZE; i++) {
if (((TLB_VPN2(tlb[i]) & ~tlb[i].tlb_mask) == ((entryhi & VPN2_MASK) & ~tlb[i].tlb_mask)) &&
(TLB_IS_GLOBAL(tlb[i]) || (TLB_ASID(tlb[i]) == (entryhi & ASID_MASK)))) {
if (TLB_HI_VPN2_HIT(tlb[i], entryhi) &&
TLB_HI_ASID_HIT(tlb[i], entryhi)) {
index = i;
break;
}
@ -418,21 +421,23 @@ int kvm_mips_guest_tlb_lookup(struct kvm_vcpu *vcpu, unsigned long entryhi)
return index;
}
EXPORT_SYMBOL(kvm_mips_guest_tlb_lookup);
int kvm_mips_host_tlb_lookup(struct kvm_vcpu *vcpu, unsigned long vaddr)
{
unsigned long old_entryhi, flags;
volatile int idx;
int idx;
local_irq_save(flags);
old_entryhi = read_c0_entryhi();
if (KVM_GUEST_KERNEL_MODE(vcpu))
write_c0_entryhi((vaddr & VPN2_MASK) | kvm_mips_get_kernel_asid(vcpu));
write_c0_entryhi((vaddr & VPN2_MASK) |
kvm_mips_get_kernel_asid(vcpu));
else {
write_c0_entryhi((vaddr & VPN2_MASK) | kvm_mips_get_user_asid(vcpu));
write_c0_entryhi((vaddr & VPN2_MASK) |
kvm_mips_get_user_asid(vcpu));
}
mtc0_tlbw_hazard();
@ -452,6 +457,7 @@ int kvm_mips_host_tlb_lookup(struct kvm_vcpu *vcpu, unsigned long vaddr)
return idx;
}
EXPORT_SYMBOL(kvm_mips_host_tlb_lookup);
int kvm_mips_host_tlb_inv(struct kvm_vcpu *vcpu, unsigned long va)
{
@ -460,7 +466,6 @@ int kvm_mips_host_tlb_inv(struct kvm_vcpu *vcpu, unsigned long va)
local_irq_save(flags);
old_entryhi = read_c0_entryhi();
write_c0_entryhi((va & VPN2_MASK) | kvm_mips_get_user_asid(vcpu));
@ -499,8 +504,9 @@ int kvm_mips_host_tlb_inv(struct kvm_vcpu *vcpu, unsigned long va)
return 0;
}
EXPORT_SYMBOL(kvm_mips_host_tlb_inv);
/* XXXKYMA: Fix Guest USER/KERNEL no longer share the same ASID*/
/* XXXKYMA: Fix Guest USER/KERNEL no longer share the same ASID */
int kvm_mips_host_tlb_inv_index(struct kvm_vcpu *vcpu, int index)
{
unsigned long flags, old_entryhi;
@ -510,7 +516,6 @@ int kvm_mips_host_tlb_inv_index(struct kvm_vcpu *vcpu, int index)
local_irq_save(flags);
old_entryhi = read_c0_entryhi();
write_c0_entryhi(UNIQUE_ENTRYHI(index));
@ -546,7 +551,6 @@ void kvm_mips_flush_host_tlb(int skip_kseg0)
int entry = 0;
int maxentry = current_cpu_data.tlbsize;
local_irq_save(flags);
old_entryhi = read_c0_entryhi();
@ -554,7 +558,6 @@ void kvm_mips_flush_host_tlb(int skip_kseg0)
/* Blast 'em all away. */
for (entry = 0; entry < maxentry; entry++) {
write_c0_index(entry);
mtc0_tlbw_hazard();
@ -565,9 +568,8 @@ void kvm_mips_flush_host_tlb(int skip_kseg0)
entryhi = read_c0_entryhi();
/* Don't blow away guest kernel entries */
if (KVM_GUEST_KSEGX(entryhi) == KVM_GUEST_KSEG0) {
if (KVM_GUEST_KSEGX(entryhi) == KVM_GUEST_KSEG0)
continue;
}
}
/* Make sure all entries differ. */
@ -591,17 +593,17 @@ void kvm_mips_flush_host_tlb(int skip_kseg0)
local_irq_restore(flags);
}
EXPORT_SYMBOL(kvm_mips_flush_host_tlb);
void
kvm_get_new_mmu_context(struct mm_struct *mm, unsigned long cpu,
struct kvm_vcpu *vcpu)
void kvm_get_new_mmu_context(struct mm_struct *mm, unsigned long cpu,
struct kvm_vcpu *vcpu)
{
unsigned long asid = asid_cache(cpu);
if (!((asid += ASID_INC) & ASID_MASK)) {
if (cpu_has_vtag_icache) {
asid += ASID_INC;
if (!(asid & ASID_MASK)) {
if (cpu_has_vtag_icache)
flush_icache_all();
}
kvm_local_flush_tlb_all(); /* start new asid cycle */
@ -639,6 +641,7 @@ void kvm_local_flush_tlb_all(void)
local_irq_restore(flags);
}
EXPORT_SYMBOL(kvm_local_flush_tlb_all);
/**
* kvm_mips_migrate_count() - Migrate timer.
@ -699,7 +702,10 @@ void kvm_arch_vcpu_load(struct kvm_vcpu *vcpu, int cpu)
}
if (!newasid) {
/* If we preempted while the guest was executing, then reload the pre-empted ASID */
/*
* If we preempted while the guest was executing, then reload
* the pre-empted ASID
*/
if (current->flags & PF_VCPU) {
write_c0_entryhi(vcpu->arch.
preempt_entryhi & ASID_MASK);
@ -708,9 +714,10 @@ void kvm_arch_vcpu_load(struct kvm_vcpu *vcpu, int cpu)
} else {
/* New ASIDs were allocated for the VM */
/* Were we in guest context? If so then the pre-empted ASID is no longer
* valid, we need to set it to what it should be based on the mode of
* the Guest (Kernel/User)
/*
* Were we in guest context? If so then the pre-empted ASID is
* no longer valid, we need to set it to what it should be based
* on the mode of the Guest (Kernel/User)
*/
if (current->flags & PF_VCPU) {
if (KVM_GUEST_KERNEL_MODE(vcpu))
@ -728,6 +735,7 @@ void kvm_arch_vcpu_load(struct kvm_vcpu *vcpu, int cpu)
local_irq_restore(flags);
}
EXPORT_SYMBOL(kvm_arch_vcpu_load);
/* ASID can change if another task is scheduled during preemption */
void kvm_arch_vcpu_put(struct kvm_vcpu *vcpu)
@ -739,7 +747,6 @@ void kvm_arch_vcpu_put(struct kvm_vcpu *vcpu)
cpu = smp_processor_id();
vcpu->arch.preempt_entryhi = read_c0_entryhi();
vcpu->arch.last_sched_cpu = cpu;
@ -754,11 +761,12 @@ void kvm_arch_vcpu_put(struct kvm_vcpu *vcpu)
local_irq_restore(flags);
}
EXPORT_SYMBOL(kvm_arch_vcpu_put);
uint32_t kvm_get_inst(uint32_t *opc, struct kvm_vcpu *vcpu)
{
struct mips_coproc *cop0 = vcpu->arch.cop0;
unsigned long paddr, flags;
unsigned long paddr, flags, vpn2, asid;
uint32_t inst;
int index;
@ -769,16 +777,12 @@ uint32_t kvm_get_inst(uint32_t *opc, struct kvm_vcpu *vcpu)
if (index >= 0) {
inst = *(opc);
} else {
index =
kvm_mips_guest_tlb_lookup(vcpu,
((unsigned long) opc & VPN2_MASK)
|
(kvm_read_c0_guest_entryhi
(cop0) & ASID_MASK));
vpn2 = (unsigned long) opc & VPN2_MASK;
asid = kvm_read_c0_guest_entryhi(cop0) & ASID_MASK;
index = kvm_mips_guest_tlb_lookup(vcpu, vpn2 | asid);
if (index < 0) {
kvm_err
("%s: get_user_failed for %p, vcpu: %p, ASID: %#lx\n",
__func__, opc, vcpu, read_c0_entryhi());
kvm_err("%s: get_user_failed for %p, vcpu: %p, ASID: %#lx\n",
__func__, opc, vcpu, read_c0_entryhi());
kvm_mips_dump_host_tlbs();
local_irq_restore(flags);
return KVM_INVALID_INST;
@ -793,7 +797,7 @@ uint32_t kvm_get_inst(uint32_t *opc, struct kvm_vcpu *vcpu)
} else if (KVM_GUEST_KSEGX(opc) == KVM_GUEST_KSEG0) {
paddr =
kvm_mips_translate_guest_kseg0_to_hpa(vcpu,
(unsigned long) opc);
(unsigned long) opc);
inst = *(uint32_t *) CKSEG0ADDR(paddr);
} else {
kvm_err("%s: illegal address: %p\n", __func__, opc);
@ -802,18 +806,4 @@ uint32_t kvm_get_inst(uint32_t *opc, struct kvm_vcpu *vcpu)
return inst;
}
EXPORT_SYMBOL(kvm_local_flush_tlb_all);
EXPORT_SYMBOL(kvm_mips_handle_mapped_seg_tlb_fault);
EXPORT_SYMBOL(kvm_mips_handle_commpage_tlb_fault);
EXPORT_SYMBOL(kvm_mips_dump_host_tlbs);
EXPORT_SYMBOL(kvm_mips_handle_kseg0_tlb_fault);
EXPORT_SYMBOL(kvm_mips_host_tlb_lookup);
EXPORT_SYMBOL(kvm_mips_flush_host_tlb);
EXPORT_SYMBOL(kvm_mips_guest_tlb_lookup);
EXPORT_SYMBOL(kvm_mips_host_tlb_inv);
EXPORT_SYMBOL(kvm_mips_translate_guest_kseg0_to_hpa);
EXPORT_SYMBOL(kvm_mips_dump_guest_tlbs);
EXPORT_SYMBOL(kvm_get_inst);
EXPORT_SYMBOL(kvm_arch_vcpu_load);
EXPORT_SYMBOL(kvm_arch_vcpu_put);

View File

@ -1,11 +1,11 @@
/*
* This file is subject to the terms and conditions of the GNU General Public
* License. See the file "COPYING" in the main directory of this archive
* for more details.
*
* Copyright (C) 2012 MIPS Technologies, Inc. All rights reserved.
* Authors: Sanjay Lal <sanjayl@kymasys.com>
*/
* This file is subject to the terms and conditions of the GNU General Public
* License. See the file "COPYING" in the main directory of this archive
* for more details.
*
* Copyright (C) 2012 MIPS Technologies, Inc. All rights reserved.
* Authors: Sanjay Lal <sanjayl@kymasys.com>
*/
#if !defined(_TRACE_KVM_H) || defined(TRACE_HEADER_MULTI_READ)
#define _TRACE_KVM_H
@ -17,9 +17,7 @@
#define TRACE_INCLUDE_PATH .
#define TRACE_INCLUDE_FILE trace
/*
* Tracepoints for VM eists
*/
/* Tracepoints for VM eists */
extern char *kvm_mips_exit_types_str[MAX_KVM_MIPS_EXIT_TYPES];
TRACE_EVENT(kvm_exit,

View File

@ -1,13 +1,13 @@
/*
* This file is subject to the terms and conditions of the GNU General Public
* License. See the file "COPYING" in the main directory of this archive
* for more details.
*
* KVM/MIPS: Deliver/Emulate exceptions to the guest kernel
*
* Copyright (C) 2012 MIPS Technologies, Inc. All rights reserved.
* Authors: Sanjay Lal <sanjayl@kymasys.com>
*/
* This file is subject to the terms and conditions of the GNU General Public
* License. See the file "COPYING" in the main directory of this archive
* for more details.
*
* KVM/MIPS: Deliver/Emulate exceptions to the guest kernel
*
* Copyright (C) 2012 MIPS Technologies, Inc. All rights reserved.
* Authors: Sanjay Lal <sanjayl@kymasys.com>
*/
#include <linux/errno.h>
#include <linux/err.h>
@ -16,8 +16,8 @@
#include <linux/kvm_host.h>
#include "kvm_mips_opcode.h"
#include "kvm_mips_int.h"
#include "opcode.h"
#include "interrupt.h"
static gpa_t kvm_trap_emul_gva_to_gpa_cb(gva_t gva)
{
@ -27,7 +27,7 @@ static gpa_t kvm_trap_emul_gva_to_gpa_cb(gva_t gva)
if ((kseg == CKSEG0) || (kseg == CKSEG1))
gpa = CPHYSADDR(gva);
else {
printk("%s: cannot find GPA for GVA: %#lx\n", __func__, gva);
kvm_err("%s: cannot find GPA for GVA: %#lx\n", __func__, gva);
kvm_mips_dump_host_tlbs();
gpa = KVM_INVALID_ADDR;
}
@ -37,7 +37,6 @@ static gpa_t kvm_trap_emul_gva_to_gpa_cb(gva_t gva)
return gpa;
}
static int kvm_trap_emul_handle_cop_unusable(struct kvm_vcpu *vcpu)
{
struct kvm_run *run = vcpu->run;
@ -46,9 +45,9 @@ static int kvm_trap_emul_handle_cop_unusable(struct kvm_vcpu *vcpu)
enum emulation_result er = EMULATE_DONE;
int ret = RESUME_GUEST;
if (((cause & CAUSEF_CE) >> CAUSEB_CE) == 1) {
if (((cause & CAUSEF_CE) >> CAUSEB_CE) == 1)
er = kvm_mips_emulate_fpu_exc(cause, opc, run, vcpu);
} else
else
er = kvm_mips_emulate_inst(cause, opc, run, vcpu);
switch (er) {
@ -83,9 +82,8 @@ static int kvm_trap_emul_handle_tlb_mod(struct kvm_vcpu *vcpu)
if (KVM_GUEST_KSEGX(badvaddr) < KVM_GUEST_KSEG0
|| KVM_GUEST_KSEGX(badvaddr) == KVM_GUEST_KSEG23) {
kvm_debug
("USER/KSEG23 ADDR TLB MOD fault: cause %#lx, PC: %p, BadVaddr: %#lx\n",
cause, opc, badvaddr);
kvm_debug("USER/KSEG23 ADDR TLB MOD fault: cause %#lx, PC: %p, BadVaddr: %#lx\n",
cause, opc, badvaddr);
er = kvm_mips_handle_tlbmod(cause, opc, run, vcpu);
if (er == EMULATE_DONE)
@ -95,20 +93,20 @@ static int kvm_trap_emul_handle_tlb_mod(struct kvm_vcpu *vcpu)
ret = RESUME_HOST;
}
} else if (KVM_GUEST_KSEGX(badvaddr) == KVM_GUEST_KSEG0) {
/* XXXKYMA: The guest kernel does not expect to get this fault when we are not
* using HIGHMEM. Need to address this in a HIGHMEM kernel
/*
* XXXKYMA: The guest kernel does not expect to get this fault
* when we are not using HIGHMEM. Need to address this in a
* HIGHMEM kernel
*/
printk
("TLB MOD fault not handled, cause %#lx, PC: %p, BadVaddr: %#lx\n",
cause, opc, badvaddr);
kvm_err("TLB MOD fault not handled, cause %#lx, PC: %p, BadVaddr: %#lx\n",
cause, opc, badvaddr);
kvm_mips_dump_host_tlbs();
kvm_arch_vcpu_dump_regs(vcpu);
run->exit_reason = KVM_EXIT_INTERNAL_ERROR;
ret = RESUME_HOST;
} else {
printk
("Illegal TLB Mod fault address , cause %#lx, PC: %p, BadVaddr: %#lx\n",
cause, opc, badvaddr);
kvm_err("Illegal TLB Mod fault address , cause %#lx, PC: %p, BadVaddr: %#lx\n",
cause, opc, badvaddr);
kvm_mips_dump_host_tlbs();
kvm_arch_vcpu_dump_regs(vcpu);
run->exit_reason = KVM_EXIT_INTERNAL_ERROR;
@ -134,9 +132,8 @@ static int kvm_trap_emul_handle_tlb_st_miss(struct kvm_vcpu *vcpu)
}
} else if (KVM_GUEST_KSEGX(badvaddr) < KVM_GUEST_KSEG0
|| KVM_GUEST_KSEGX(badvaddr) == KVM_GUEST_KSEG23) {
kvm_debug
("USER ADDR TLB LD fault: cause %#lx, PC: %p, BadVaddr: %#lx\n",
cause, opc, badvaddr);
kvm_debug("USER ADDR TLB LD fault: cause %#lx, PC: %p, BadVaddr: %#lx\n",
cause, opc, badvaddr);
er = kvm_mips_handle_tlbmiss(cause, opc, run, vcpu);
if (er == EMULATE_DONE)
ret = RESUME_GUEST;
@ -145,8 +142,9 @@ static int kvm_trap_emul_handle_tlb_st_miss(struct kvm_vcpu *vcpu)
ret = RESUME_HOST;
}
} else if (KVM_GUEST_KSEGX(badvaddr) == KVM_GUEST_KSEG0) {
/* All KSEG0 faults are handled by KVM, as the guest kernel does not
* expect to ever get them
/*
* All KSEG0 faults are handled by KVM, as the guest kernel does
* not expect to ever get them
*/
if (kvm_mips_handle_kseg0_tlb_fault
(vcpu->arch.host_cp0_badvaddr, vcpu) < 0) {
@ -154,9 +152,8 @@ static int kvm_trap_emul_handle_tlb_st_miss(struct kvm_vcpu *vcpu)
ret = RESUME_HOST;
}
} else {
kvm_err
("Illegal TLB LD fault address , cause %#lx, PC: %p, BadVaddr: %#lx\n",
cause, opc, badvaddr);
kvm_err("Illegal TLB LD fault address , cause %#lx, PC: %p, BadVaddr: %#lx\n",
cause, opc, badvaddr);
kvm_mips_dump_host_tlbs();
kvm_arch_vcpu_dump_regs(vcpu);
run->exit_reason = KVM_EXIT_INTERNAL_ERROR;
@ -185,11 +182,14 @@ static int kvm_trap_emul_handle_tlb_ld_miss(struct kvm_vcpu *vcpu)
kvm_debug("USER ADDR TLB ST fault: PC: %#lx, BadVaddr: %#lx\n",
vcpu->arch.pc, badvaddr);
/* User Address (UA) fault, this could happen if
* (1) TLB entry not present/valid in both Guest and shadow host TLBs, in this
* case we pass on the fault to the guest kernel and let it handle it.
* (2) TLB entry is present in the Guest TLB but not in the shadow, in this
* case we inject the TLB from the Guest TLB into the shadow host TLB
/*
* User Address (UA) fault, this could happen if
* (1) TLB entry not present/valid in both Guest and shadow host
* TLBs, in this case we pass on the fault to the guest
* kernel and let it handle it.
* (2) TLB entry is present in the Guest TLB but not in the
* shadow, in this case we inject the TLB from the Guest TLB
* into the shadow host TLB
*/
er = kvm_mips_handle_tlbmiss(cause, opc, run, vcpu);
@ -206,9 +206,8 @@ static int kvm_trap_emul_handle_tlb_ld_miss(struct kvm_vcpu *vcpu)
ret = RESUME_HOST;
}
} else {
printk
("Illegal TLB ST fault address , cause %#lx, PC: %p, BadVaddr: %#lx\n",
cause, opc, badvaddr);
kvm_err("Illegal TLB ST fault address , cause %#lx, PC: %p, BadVaddr: %#lx\n",
cause, opc, badvaddr);
kvm_mips_dump_host_tlbs();
kvm_arch_vcpu_dump_regs(vcpu);
run->exit_reason = KVM_EXIT_INTERNAL_ERROR;
@ -231,7 +230,7 @@ static int kvm_trap_emul_handle_addr_err_st(struct kvm_vcpu *vcpu)
kvm_debug("Emulate Store to MMIO space\n");
er = kvm_mips_emulate_inst(cause, opc, run, vcpu);
if (er == EMULATE_FAIL) {
printk("Emulate Store to MMIO space failed\n");
kvm_err("Emulate Store to MMIO space failed\n");
run->exit_reason = KVM_EXIT_INTERNAL_ERROR;
ret = RESUME_HOST;
} else {
@ -239,9 +238,8 @@ static int kvm_trap_emul_handle_addr_err_st(struct kvm_vcpu *vcpu)
ret = RESUME_HOST;
}
} else {
printk
("Address Error (STORE): cause %#lx, PC: %p, BadVaddr: %#lx\n",
cause, opc, badvaddr);
kvm_err("Address Error (STORE): cause %#lx, PC: %p, BadVaddr: %#lx\n",
cause, opc, badvaddr);
run->exit_reason = KVM_EXIT_INTERNAL_ERROR;
ret = RESUME_HOST;
}
@ -261,7 +259,7 @@ static int kvm_trap_emul_handle_addr_err_ld(struct kvm_vcpu *vcpu)
kvm_debug("Emulate Load from MMIO space @ %#lx\n", badvaddr);
er = kvm_mips_emulate_inst(cause, opc, run, vcpu);
if (er == EMULATE_FAIL) {
printk("Emulate Load from MMIO space failed\n");
kvm_err("Emulate Load from MMIO space failed\n");
run->exit_reason = KVM_EXIT_INTERNAL_ERROR;
ret = RESUME_HOST;
} else {
@ -269,9 +267,8 @@ static int kvm_trap_emul_handle_addr_err_ld(struct kvm_vcpu *vcpu)
ret = RESUME_HOST;
}
} else {
printk
("Address Error (LOAD): cause %#lx, PC: %p, BadVaddr: %#lx\n",
cause, opc, badvaddr);
kvm_err("Address Error (LOAD): cause %#lx, PC: %p, BadVaddr: %#lx\n",
cause, opc, badvaddr);
run->exit_reason = KVM_EXIT_INTERNAL_ERROR;
ret = RESUME_HOST;
er = EMULATE_FAIL;
@ -349,9 +346,9 @@ static int kvm_trap_emul_vcpu_setup(struct kvm_vcpu *vcpu)
uint32_t config1;
int vcpu_id = vcpu->vcpu_id;
/* Arch specific stuff, set up config registers properly so that the
* guest will come up as expected, for now we simulate a
* MIPS 24kc
/*
* Arch specific stuff, set up config registers properly so that the
* guest will come up as expected, for now we simulate a MIPS 24kc
*/
kvm_write_c0_guest_prid(cop0, 0x00019300);
kvm_write_c0_guest_config(cop0,
@ -373,14 +370,15 @@ static int kvm_trap_emul_vcpu_setup(struct kvm_vcpu *vcpu)
kvm_write_c0_guest_config2(cop0, MIPS_CONFIG2);
/* MIPS_CONFIG2 | (read_c0_config2() & 0xfff) */
kvm_write_c0_guest_config3(cop0,
MIPS_CONFIG3 | (0 << CP0C3_VInt) | (1 <<
CP0C3_ULRI));
kvm_write_c0_guest_config3(cop0, MIPS_CONFIG3 | (0 << CP0C3_VInt) |
(1 << CP0C3_ULRI));
/* Set Wait IE/IXMT Ignore in Config7, IAR, AR */
kvm_write_c0_guest_config7(cop0, (MIPS_CONF7_WII) | (1 << 10));
/* Setup IntCtl defaults, compatibilty mode for timer interrupts (HW5) */
/*
* Setup IntCtl defaults, compatibilty mode for timer interrupts (HW5)
*/
kvm_write_c0_guest_intctl(cop0, 0xFC000000);
/* Put in vcpu id as CPUNum into Ebase Reg to handle SMP Guests */

View File

@ -305,7 +305,6 @@ struct kvm_s390_local_interrupt {
struct list_head list;
atomic_t active;
struct kvm_s390_float_interrupt *float_int;
int timer_due; /* event indicator for waitqueue below */
wait_queue_head_t *wq;
atomic_t *cpuflags;
unsigned int action_bits;
@ -367,7 +366,6 @@ struct kvm_vcpu_arch {
s390_fp_regs guest_fpregs;
struct kvm_s390_local_interrupt local_int;
struct hrtimer ckc_timer;
struct tasklet_struct tasklet;
struct kvm_s390_pgm_info pgm;
union {
struct cpuid cpu_id;
@ -418,6 +416,7 @@ struct kvm_arch{
int css_support;
int use_irqchip;
int use_cmma;
int user_cpu_state_ctrl;
struct s390_io_adapter *adapters[MAX_S390_IO_ADAPTERS];
wait_queue_head_t ipte_wq;
spinlock_t start_stop_lock;

View File

@ -108,6 +108,7 @@
exit_code_ipa0(0xB2, 0x17, "STETR"), \
exit_code_ipa0(0xB2, 0x18, "PC"), \
exit_code_ipa0(0xB2, 0x20, "SERVC"), \
exit_code_ipa0(0xB2, 0x21, "IPTE"), \
exit_code_ipa0(0xB2, 0x28, "PT"), \
exit_code_ipa0(0xB2, 0x29, "ISKE"), \
exit_code_ipa0(0xB2, 0x2a, "RRBE"), \

View File

@ -176,7 +176,8 @@ static int __diag_ipl_functions(struct kvm_vcpu *vcpu)
return -EOPNOTSUPP;
}
kvm_s390_vcpu_stop(vcpu);
if (!kvm_s390_user_cpu_state_ctrl(vcpu->kvm))
kvm_s390_vcpu_stop(vcpu);
vcpu->run->s390_reset_flags |= KVM_S390_RESET_SUBSYSTEM;
vcpu->run->s390_reset_flags |= KVM_S390_RESET_IPL;
vcpu->run->s390_reset_flags |= KVM_S390_RESET_CPU_INIT;

View File

@ -56,32 +56,26 @@ static int handle_noop(struct kvm_vcpu *vcpu)
static int handle_stop(struct kvm_vcpu *vcpu)
{
int rc = 0;
unsigned int action_bits;
vcpu->stat.exit_stop_request++;
spin_lock_bh(&vcpu->arch.local_int.lock);
trace_kvm_s390_stop_request(vcpu->arch.local_int.action_bits);
if (vcpu->arch.local_int.action_bits & ACTION_STOP_ON_STOP) {
kvm_s390_vcpu_stop(vcpu);
vcpu->arch.local_int.action_bits &= ~ACTION_STOP_ON_STOP;
VCPU_EVENT(vcpu, 3, "%s", "cpu stopped");
rc = -EOPNOTSUPP;
}
action_bits = vcpu->arch.local_int.action_bits;
if (vcpu->arch.local_int.action_bits & ACTION_STORE_ON_STOP) {
vcpu->arch.local_int.action_bits &= ~ACTION_STORE_ON_STOP;
/* store status must be called unlocked. Since local_int.lock
* only protects local_int.* and not guest memory we can give
* up the lock here */
spin_unlock_bh(&vcpu->arch.local_int.lock);
if (!(action_bits & ACTION_STOP_ON_STOP))
return 0;
if (action_bits & ACTION_STORE_ON_STOP) {
rc = kvm_s390_vcpu_store_status(vcpu,
KVM_S390_STORE_STATUS_NOADDR);
if (rc >= 0)
rc = -EOPNOTSUPP;
} else
spin_unlock_bh(&vcpu->arch.local_int.lock);
return rc;
if (rc)
return rc;
}
if (!kvm_s390_user_cpu_state_ctrl(vcpu->kvm))
kvm_s390_vcpu_stop(vcpu);
return -EOPNOTSUPP;
}
static int handle_validity(struct kvm_vcpu *vcpu)

View File

@ -158,6 +158,9 @@ static void __reset_intercept_indicators(struct kvm_vcpu *vcpu)
LCTL_CR10 | LCTL_CR11);
vcpu->arch.sie_block->ictl |= (ICTL_STCTL | ICTL_PINT);
}
if (vcpu->arch.local_int.action_bits & ACTION_STOP_ON_STOP)
atomic_set_mask(CPUSTAT_STOP_INT, &vcpu->arch.sie_block->cpuflags);
}
static void __set_cpuflag(struct kvm_vcpu *vcpu, u32 flag)
@ -544,13 +547,13 @@ int kvm_cpu_has_interrupt(struct kvm_vcpu *vcpu)
int rc = 0;
if (atomic_read(&li->active)) {
spin_lock_bh(&li->lock);
spin_lock(&li->lock);
list_for_each_entry(inti, &li->list, list)
if (__interrupt_is_deliverable(vcpu, inti)) {
rc = 1;
break;
}
spin_unlock_bh(&li->lock);
spin_unlock(&li->lock);
}
if ((!rc) && atomic_read(&fi->active)) {
@ -585,88 +588,56 @@ int kvm_cpu_has_pending_timer(struct kvm_vcpu *vcpu)
int kvm_s390_handle_wait(struct kvm_vcpu *vcpu)
{
u64 now, sltime;
DECLARE_WAITQUEUE(wait, current);
vcpu->stat.exit_wait_state++;
if (kvm_cpu_has_interrupt(vcpu))
return 0;
__set_cpu_idle(vcpu);
spin_lock_bh(&vcpu->arch.local_int.lock);
vcpu->arch.local_int.timer_due = 0;
spin_unlock_bh(&vcpu->arch.local_int.lock);
/* fast path */
if (kvm_cpu_has_pending_timer(vcpu) || kvm_arch_vcpu_runnable(vcpu))
return 0;
if (psw_interrupts_disabled(vcpu)) {
VCPU_EVENT(vcpu, 3, "%s", "disabled wait");
__unset_cpu_idle(vcpu);
return -EOPNOTSUPP; /* disabled wait */
}
__set_cpu_idle(vcpu);
if (!ckc_interrupts_enabled(vcpu)) {
VCPU_EVENT(vcpu, 3, "%s", "enabled wait w/o timer");
goto no_timer;
}
now = get_tod_clock_fast() + vcpu->arch.sie_block->epoch;
if (vcpu->arch.sie_block->ckc < now) {
__unset_cpu_idle(vcpu);
return 0;
}
sltime = tod_to_ns(vcpu->arch.sie_block->ckc - now);
hrtimer_start(&vcpu->arch.ckc_timer, ktime_set (0, sltime) , HRTIMER_MODE_REL);
VCPU_EVENT(vcpu, 5, "enabled wait via clock comparator: %llx ns", sltime);
no_timer:
srcu_read_unlock(&vcpu->kvm->srcu, vcpu->srcu_idx);
spin_lock(&vcpu->arch.local_int.float_int->lock);
spin_lock_bh(&vcpu->arch.local_int.lock);
add_wait_queue(&vcpu->wq, &wait);
while (list_empty(&vcpu->arch.local_int.list) &&
list_empty(&vcpu->arch.local_int.float_int->list) &&
(!vcpu->arch.local_int.timer_due) &&
!signal_pending(current) &&
!kvm_s390_si_ext_call_pending(vcpu)) {
set_current_state(TASK_INTERRUPTIBLE);
spin_unlock_bh(&vcpu->arch.local_int.lock);
spin_unlock(&vcpu->arch.local_int.float_int->lock);
schedule();
spin_lock(&vcpu->arch.local_int.float_int->lock);
spin_lock_bh(&vcpu->arch.local_int.lock);
}
kvm_vcpu_block(vcpu);
__unset_cpu_idle(vcpu);
__set_current_state(TASK_RUNNING);
remove_wait_queue(&vcpu->wq, &wait);
spin_unlock_bh(&vcpu->arch.local_int.lock);
spin_unlock(&vcpu->arch.local_int.float_int->lock);
vcpu->srcu_idx = srcu_read_lock(&vcpu->kvm->srcu);
hrtimer_try_to_cancel(&vcpu->arch.ckc_timer);
return 0;
}
void kvm_s390_tasklet(unsigned long parm)
void kvm_s390_vcpu_wakeup(struct kvm_vcpu *vcpu)
{
struct kvm_vcpu *vcpu = (struct kvm_vcpu *) parm;
spin_lock(&vcpu->arch.local_int.lock);
vcpu->arch.local_int.timer_due = 1;
if (waitqueue_active(&vcpu->wq))
if (waitqueue_active(&vcpu->wq)) {
/*
* The vcpu gave up the cpu voluntarily, mark it as a good
* yield-candidate.
*/
vcpu->preempted = true;
wake_up_interruptible(&vcpu->wq);
spin_unlock(&vcpu->arch.local_int.lock);
}
}
/*
* low level hrtimer wake routine. Because this runs in hardirq context
* we schedule a tasklet to do the real work.
*/
enum hrtimer_restart kvm_s390_idle_wakeup(struct hrtimer *timer)
{
struct kvm_vcpu *vcpu;
vcpu = container_of(timer, struct kvm_vcpu, arch.ckc_timer);
vcpu->preempted = true;
tasklet_schedule(&vcpu->arch.tasklet);
kvm_s390_vcpu_wakeup(vcpu);
return HRTIMER_NORESTART;
}
@ -676,13 +647,13 @@ void kvm_s390_clear_local_irqs(struct kvm_vcpu *vcpu)
struct kvm_s390_local_interrupt *li = &vcpu->arch.local_int;
struct kvm_s390_interrupt_info *n, *inti = NULL;
spin_lock_bh(&li->lock);
spin_lock(&li->lock);
list_for_each_entry_safe(inti, n, &li->list, list) {
list_del(&inti->list);
kfree(inti);
}
atomic_set(&li->active, 0);
spin_unlock_bh(&li->lock);
spin_unlock(&li->lock);
/* clear pending external calls set by sigp interpretation facility */
atomic_clear_mask(CPUSTAT_ECALL_PEND, &vcpu->arch.sie_block->cpuflags);
@ -701,7 +672,7 @@ void kvm_s390_deliver_pending_interrupts(struct kvm_vcpu *vcpu)
if (atomic_read(&li->active)) {
do {
deliver = 0;
spin_lock_bh(&li->lock);
spin_lock(&li->lock);
list_for_each_entry_safe(inti, n, &li->list, list) {
if (__interrupt_is_deliverable(vcpu, inti)) {
list_del(&inti->list);
@ -712,7 +683,7 @@ void kvm_s390_deliver_pending_interrupts(struct kvm_vcpu *vcpu)
}
if (list_empty(&li->list))
atomic_set(&li->active, 0);
spin_unlock_bh(&li->lock);
spin_unlock(&li->lock);
if (deliver) {
__do_deliver_interrupt(vcpu, inti);
kfree(inti);
@ -758,7 +729,7 @@ void kvm_s390_deliver_pending_machine_checks(struct kvm_vcpu *vcpu)
if (atomic_read(&li->active)) {
do {
deliver = 0;
spin_lock_bh(&li->lock);
spin_lock(&li->lock);
list_for_each_entry_safe(inti, n, &li->list, list) {
if ((inti->type == KVM_S390_MCHK) &&
__interrupt_is_deliverable(vcpu, inti)) {
@ -770,7 +741,7 @@ void kvm_s390_deliver_pending_machine_checks(struct kvm_vcpu *vcpu)
}
if (list_empty(&li->list))
atomic_set(&li->active, 0);
spin_unlock_bh(&li->lock);
spin_unlock(&li->lock);
if (deliver) {
__do_deliver_interrupt(vcpu, inti);
kfree(inti);
@ -817,11 +788,11 @@ int kvm_s390_inject_program_int(struct kvm_vcpu *vcpu, u16 code)
VCPU_EVENT(vcpu, 3, "inject: program check %d (from kernel)", code);
trace_kvm_s390_inject_vcpu(vcpu->vcpu_id, inti->type, code, 0, 1);
spin_lock_bh(&li->lock);
spin_lock(&li->lock);
list_add(&inti->list, &li->list);
atomic_set(&li->active, 1);
BUG_ON(waitqueue_active(li->wq));
spin_unlock_bh(&li->lock);
spin_unlock(&li->lock);
return 0;
}
@ -842,11 +813,11 @@ int kvm_s390_inject_prog_irq(struct kvm_vcpu *vcpu,
inti->type = KVM_S390_PROGRAM_INT;
memcpy(&inti->pgm, pgm_info, sizeof(inti->pgm));
spin_lock_bh(&li->lock);
spin_lock(&li->lock);
list_add(&inti->list, &li->list);
atomic_set(&li->active, 1);
BUG_ON(waitqueue_active(li->wq));
spin_unlock_bh(&li->lock);
spin_unlock(&li->lock);
return 0;
}
@ -934,12 +905,10 @@ static int __inject_vm(struct kvm *kvm, struct kvm_s390_interrupt_info *inti)
}
dst_vcpu = kvm_get_vcpu(kvm, sigcpu);
li = &dst_vcpu->arch.local_int;
spin_lock_bh(&li->lock);
spin_lock(&li->lock);
atomic_set_mask(CPUSTAT_EXT_INT, li->cpuflags);
if (waitqueue_active(li->wq))
wake_up_interruptible(li->wq);
kvm_get_vcpu(kvm, sigcpu)->preempted = true;
spin_unlock_bh(&li->lock);
spin_unlock(&li->lock);
kvm_s390_vcpu_wakeup(kvm_get_vcpu(kvm, sigcpu));
unlock_fi:
spin_unlock(&fi->lock);
mutex_unlock(&kvm->lock);
@ -1081,7 +1050,7 @@ int kvm_s390_inject_vcpu(struct kvm_vcpu *vcpu,
mutex_lock(&vcpu->kvm->lock);
li = &vcpu->arch.local_int;
spin_lock_bh(&li->lock);
spin_lock(&li->lock);
if (inti->type == KVM_S390_PROGRAM_INT)
list_add(&inti->list, &li->list);
else
@ -1090,11 +1059,9 @@ int kvm_s390_inject_vcpu(struct kvm_vcpu *vcpu,
if (inti->type == KVM_S390_SIGP_STOP)
li->action_bits |= ACTION_STOP_ON_STOP;
atomic_set_mask(CPUSTAT_EXT_INT, li->cpuflags);
if (waitqueue_active(&vcpu->wq))
wake_up_interruptible(&vcpu->wq);
vcpu->preempted = true;
spin_unlock_bh(&li->lock);
spin_unlock(&li->lock);
mutex_unlock(&vcpu->kvm->lock);
kvm_s390_vcpu_wakeup(vcpu);
return 0;
}

View File

@ -166,7 +166,9 @@ int kvm_dev_ioctl_check_extension(long ext)
case KVM_CAP_IOEVENTFD:
case KVM_CAP_DEVICE_CTRL:
case KVM_CAP_ENABLE_CAP_VM:
case KVM_CAP_S390_IRQCHIP:
case KVM_CAP_VM_ATTRIBUTES:
case KVM_CAP_MP_STATE:
r = 1;
break;
case KVM_CAP_NR_VCPUS:
@ -595,7 +597,8 @@ static void kvm_s390_vcpu_initial_reset(struct kvm_vcpu *vcpu)
vcpu->arch.sie_block->pp = 0;
vcpu->arch.pfault_token = KVM_S390_PFAULT_TOKEN_INVALID;
kvm_clear_async_pf_completion_queue(vcpu);
kvm_s390_vcpu_stop(vcpu);
if (!kvm_s390_user_cpu_state_ctrl(vcpu->kvm))
kvm_s390_vcpu_stop(vcpu);
kvm_s390_clear_local_irqs(vcpu);
}
@ -647,8 +650,6 @@ int kvm_arch_vcpu_setup(struct kvm_vcpu *vcpu)
return rc;
}
hrtimer_init(&vcpu->arch.ckc_timer, CLOCK_REALTIME, HRTIMER_MODE_ABS);
tasklet_init(&vcpu->arch.tasklet, kvm_s390_tasklet,
(unsigned long) vcpu);
vcpu->arch.ckc_timer.function = kvm_s390_idle_wakeup;
get_cpu_id(&vcpu->arch.cpu_id);
vcpu->arch.cpu_id.version = 0xff;
@ -926,7 +927,7 @@ static int kvm_arch_vcpu_ioctl_set_initial_psw(struct kvm_vcpu *vcpu, psw_t psw)
{
int rc = 0;
if (!(atomic_read(&vcpu->arch.sie_block->cpuflags) & CPUSTAT_STOPPED))
if (!is_vcpu_stopped(vcpu))
rc = -EBUSY;
else {
vcpu->run->psw_mask = psw.mask;
@ -980,13 +981,34 @@ int kvm_arch_vcpu_ioctl_set_guest_debug(struct kvm_vcpu *vcpu,
int kvm_arch_vcpu_ioctl_get_mpstate(struct kvm_vcpu *vcpu,
struct kvm_mp_state *mp_state)
{
return -EINVAL; /* not implemented yet */
/* CHECK_STOP and LOAD are not supported yet */
return is_vcpu_stopped(vcpu) ? KVM_MP_STATE_STOPPED :
KVM_MP_STATE_OPERATING;
}
int kvm_arch_vcpu_ioctl_set_mpstate(struct kvm_vcpu *vcpu,
struct kvm_mp_state *mp_state)
{
return -EINVAL; /* not implemented yet */
int rc = 0;
/* user space knows about this interface - let it control the state */
vcpu->kvm->arch.user_cpu_state_ctrl = 1;
switch (mp_state->mp_state) {
case KVM_MP_STATE_STOPPED:
kvm_s390_vcpu_stop(vcpu);
break;
case KVM_MP_STATE_OPERATING:
kvm_s390_vcpu_start(vcpu);
break;
case KVM_MP_STATE_LOAD:
case KVM_MP_STATE_CHECK_STOP:
/* fall through - CHECK_STOP and LOAD are not supported yet */
default:
rc = -ENXIO;
}
return rc;
}
bool kvm_s390_cmma_enabled(struct kvm *kvm)
@ -1045,6 +1067,9 @@ static int kvm_s390_handle_requests(struct kvm_vcpu *vcpu)
goto retry;
}
/* nothing to do, just clear the request */
clear_bit(KVM_REQ_UNHALT, &vcpu->requests);
return 0;
}
@ -1284,7 +1309,13 @@ int kvm_arch_vcpu_ioctl_run(struct kvm_vcpu *vcpu, struct kvm_run *kvm_run)
if (vcpu->sigset_active)
sigprocmask(SIG_SETMASK, &vcpu->sigset, &sigsaved);
kvm_s390_vcpu_start(vcpu);
if (!kvm_s390_user_cpu_state_ctrl(vcpu->kvm)) {
kvm_s390_vcpu_start(vcpu);
} else if (is_vcpu_stopped(vcpu)) {
pr_err_ratelimited("kvm-s390: can't run stopped vcpu %d\n",
vcpu->vcpu_id);
return -EINVAL;
}
switch (kvm_run->exit_reason) {
case KVM_EXIT_S390_SIEIC:
@ -1413,11 +1444,6 @@ int kvm_s390_vcpu_store_status(struct kvm_vcpu *vcpu, unsigned long addr)
return kvm_s390_store_status_unloaded(vcpu, addr);
}
static inline int is_vcpu_stopped(struct kvm_vcpu *vcpu)
{
return atomic_read(&(vcpu)->arch.sie_block->cpuflags) & CPUSTAT_STOPPED;
}
static void __disable_ibs_on_vcpu(struct kvm_vcpu *vcpu)
{
kvm_check_request(KVM_REQ_ENABLE_IBS, vcpu);
@ -1451,7 +1477,7 @@ void kvm_s390_vcpu_start(struct kvm_vcpu *vcpu)
trace_kvm_s390_vcpu_start_stop(vcpu->vcpu_id, 1);
/* Only one cpu at a time may enter/leave the STOPPED state. */
spin_lock_bh(&vcpu->kvm->arch.start_stop_lock);
spin_lock(&vcpu->kvm->arch.start_stop_lock);
online_vcpus = atomic_read(&vcpu->kvm->online_vcpus);
for (i = 0; i < online_vcpus; i++) {
@ -1477,7 +1503,7 @@ void kvm_s390_vcpu_start(struct kvm_vcpu *vcpu)
* Let's play safe and flush the VCPU at startup.
*/
vcpu->arch.sie_block->ihcpu = 0xffff;
spin_unlock_bh(&vcpu->kvm->arch.start_stop_lock);
spin_unlock(&vcpu->kvm->arch.start_stop_lock);
return;
}
@ -1491,10 +1517,18 @@ void kvm_s390_vcpu_stop(struct kvm_vcpu *vcpu)
trace_kvm_s390_vcpu_start_stop(vcpu->vcpu_id, 0);
/* Only one cpu at a time may enter/leave the STOPPED state. */
spin_lock_bh(&vcpu->kvm->arch.start_stop_lock);
spin_lock(&vcpu->kvm->arch.start_stop_lock);
online_vcpus = atomic_read(&vcpu->kvm->online_vcpus);
/* Need to lock access to action_bits to avoid a SIGP race condition */
spin_lock(&vcpu->arch.local_int.lock);
atomic_set_mask(CPUSTAT_STOPPED, &vcpu->arch.sie_block->cpuflags);
/* SIGP STOP and SIGP STOP AND STORE STATUS has been fully processed */
vcpu->arch.local_int.action_bits &=
~(ACTION_STOP_ON_STOP | ACTION_STORE_ON_STOP);
spin_unlock(&vcpu->arch.local_int.lock);
__disable_ibs_on_vcpu(vcpu);
for (i = 0; i < online_vcpus; i++) {
@ -1512,7 +1546,7 @@ void kvm_s390_vcpu_stop(struct kvm_vcpu *vcpu)
__enable_ibs_on_vcpu(started_vcpu);
}
spin_unlock_bh(&vcpu->kvm->arch.start_stop_lock);
spin_unlock(&vcpu->kvm->arch.start_stop_lock);
return;
}

View File

@ -45,9 +45,9 @@ do { \
d_args); \
} while (0)
static inline int __cpu_is_stopped(struct kvm_vcpu *vcpu)
static inline int is_vcpu_stopped(struct kvm_vcpu *vcpu)
{
return atomic_read(&vcpu->arch.sie_block->cpuflags) & CPUSTAT_STOP_INT;
return atomic_read(&vcpu->arch.sie_block->cpuflags) & CPUSTAT_STOPPED;
}
static inline int kvm_is_ucontrol(struct kvm *kvm)
@ -129,9 +129,15 @@ static inline void kvm_s390_set_psw_cc(struct kvm_vcpu *vcpu, unsigned long cc)
vcpu->arch.sie_block->gpsw.mask |= cc << 44;
}
/* are cpu states controlled by user space */
static inline int kvm_s390_user_cpu_state_ctrl(struct kvm *kvm)
{
return kvm->arch.user_cpu_state_ctrl != 0;
}
int kvm_s390_handle_wait(struct kvm_vcpu *vcpu);
void kvm_s390_vcpu_wakeup(struct kvm_vcpu *vcpu);
enum hrtimer_restart kvm_s390_idle_wakeup(struct hrtimer *timer);
void kvm_s390_tasklet(unsigned long parm);
void kvm_s390_deliver_pending_interrupts(struct kvm_vcpu *vcpu);
void kvm_s390_deliver_pending_machine_checks(struct kvm_vcpu *vcpu);
void kvm_s390_clear_local_irqs(struct kvm_vcpu *vcpu);

View File

@ -125,8 +125,9 @@ static int __sigp_external_call(struct kvm_vcpu *vcpu, u16 cpu_addr)
return rc ? rc : SIGP_CC_ORDER_CODE_ACCEPTED;
}
static int __inject_sigp_stop(struct kvm_s390_local_interrupt *li, int action)
static int __inject_sigp_stop(struct kvm_vcpu *dst_vcpu, int action)
{
struct kvm_s390_local_interrupt *li = &dst_vcpu->arch.local_int;
struct kvm_s390_interrupt_info *inti;
int rc = SIGP_CC_ORDER_CODE_ACCEPTED;
@ -135,7 +136,13 @@ static int __inject_sigp_stop(struct kvm_s390_local_interrupt *li, int action)
return -ENOMEM;
inti->type = KVM_S390_SIGP_STOP;
spin_lock_bh(&li->lock);
spin_lock(&li->lock);
if (li->action_bits & ACTION_STOP_ON_STOP) {
/* another SIGP STOP is pending */
kfree(inti);
rc = SIGP_CC_BUSY;
goto out;
}
if ((atomic_read(li->cpuflags) & CPUSTAT_STOPPED)) {
kfree(inti);
if ((action & ACTION_STORE_ON_STOP) != 0)
@ -144,19 +151,17 @@ static int __inject_sigp_stop(struct kvm_s390_local_interrupt *li, int action)
}
list_add_tail(&inti->list, &li->list);
atomic_set(&li->active, 1);
atomic_set_mask(CPUSTAT_STOP_INT, li->cpuflags);
li->action_bits |= action;
if (waitqueue_active(li->wq))
wake_up_interruptible(li->wq);
atomic_set_mask(CPUSTAT_STOP_INT, li->cpuflags);
kvm_s390_vcpu_wakeup(dst_vcpu);
out:
spin_unlock_bh(&li->lock);
spin_unlock(&li->lock);
return rc;
}
static int __sigp_stop(struct kvm_vcpu *vcpu, u16 cpu_addr, int action)
{
struct kvm_s390_local_interrupt *li;
struct kvm_vcpu *dst_vcpu = NULL;
int rc;
@ -166,9 +171,8 @@ static int __sigp_stop(struct kvm_vcpu *vcpu, u16 cpu_addr, int action)
dst_vcpu = kvm_get_vcpu(vcpu->kvm, cpu_addr);
if (!dst_vcpu)
return SIGP_CC_NOT_OPERATIONAL;
li = &dst_vcpu->arch.local_int;
rc = __inject_sigp_stop(li, action);
rc = __inject_sigp_stop(dst_vcpu, action);
VCPU_EVENT(vcpu, 4, "sent sigp stop to cpu %x", cpu_addr);
@ -238,7 +242,7 @@ static int __sigp_set_prefix(struct kvm_vcpu *vcpu, u16 cpu_addr, u32 address,
if (!inti)
return SIGP_CC_BUSY;
spin_lock_bh(&li->lock);
spin_lock(&li->lock);
/* cpu must be in stopped state */
if (!(atomic_read(li->cpuflags) & CPUSTAT_STOPPED)) {
*reg &= 0xffffffff00000000UL;
@ -253,13 +257,12 @@ static int __sigp_set_prefix(struct kvm_vcpu *vcpu, u16 cpu_addr, u32 address,
list_add_tail(&inti->list, &li->list);
atomic_set(&li->active, 1);
if (waitqueue_active(li->wq))
wake_up_interruptible(li->wq);
kvm_s390_vcpu_wakeup(dst_vcpu);
rc = SIGP_CC_ORDER_CODE_ACCEPTED;
VCPU_EVENT(vcpu, 4, "set prefix of cpu %02x to %x", cpu_addr, address);
out_li:
spin_unlock_bh(&li->lock);
spin_unlock(&li->lock);
return rc;
}
@ -275,9 +278,9 @@ static int __sigp_store_status_at_addr(struct kvm_vcpu *vcpu, u16 cpu_id,
if (!dst_vcpu)
return SIGP_CC_NOT_OPERATIONAL;
spin_lock_bh(&dst_vcpu->arch.local_int.lock);
spin_lock(&dst_vcpu->arch.local_int.lock);
flags = atomic_read(dst_vcpu->arch.local_int.cpuflags);
spin_unlock_bh(&dst_vcpu->arch.local_int.lock);
spin_unlock(&dst_vcpu->arch.local_int.lock);
if (!(flags & CPUSTAT_STOPPED)) {
*reg &= 0xffffffff00000000UL;
*reg |= SIGP_STATUS_INCORRECT_STATE;
@ -338,10 +341,10 @@ static int sigp_check_callable(struct kvm_vcpu *vcpu, u16 cpu_addr)
if (!dst_vcpu)
return SIGP_CC_NOT_OPERATIONAL;
li = &dst_vcpu->arch.local_int;
spin_lock_bh(&li->lock);
spin_lock(&li->lock);
if (li->action_bits & ACTION_STOP_ON_STOP)
rc = SIGP_CC_BUSY;
spin_unlock_bh(&li->lock);
spin_unlock(&li->lock);
return rc;
}
@ -461,12 +464,7 @@ int kvm_s390_handle_sigp_pei(struct kvm_vcpu *vcpu)
dest_vcpu = kvm_get_vcpu(vcpu->kvm, cpu_addr);
BUG_ON(dest_vcpu == NULL);
spin_lock_bh(&dest_vcpu->arch.local_int.lock);
if (waitqueue_active(&dest_vcpu->wq))
wake_up_interruptible(&dest_vcpu->wq);
dest_vcpu->preempted = true;
spin_unlock_bh(&dest_vcpu->arch.local_int.lock);
kvm_s390_vcpu_wakeup(dest_vcpu);
kvm_s390_set_psw_cc(vcpu, SIGP_CC_ORDER_CODE_ACCEPTED);
return 0;
}

View File

@ -37,6 +37,7 @@ struct x86_instruction_info {
u8 modrm_reg; /* index of register used */
u8 modrm_rm; /* rm part of modrm */
u64 src_val; /* value of source operand */
u64 dst_val; /* value of destination operand */
u8 src_bytes; /* size of source operand */
u8 dst_bytes; /* size of destination operand */
u8 ad_bytes; /* size of src/dst address */
@ -194,6 +195,7 @@ struct x86_emulate_ops {
int (*set_dr)(struct x86_emulate_ctxt *ctxt, int dr, ulong value);
int (*set_msr)(struct x86_emulate_ctxt *ctxt, u32 msr_index, u64 data);
int (*get_msr)(struct x86_emulate_ctxt *ctxt, u32 msr_index, u64 *pdata);
int (*check_pmc)(struct x86_emulate_ctxt *ctxt, u32 pmc);
int (*read_pmc)(struct x86_emulate_ctxt *ctxt, u32 pmc, u64 *pdata);
void (*halt)(struct x86_emulate_ctxt *ctxt);
void (*wbinvd)(struct x86_emulate_ctxt *ctxt);
@ -231,7 +233,7 @@ struct operand {
union {
unsigned long val;
u64 val64;
char valptr[sizeof(unsigned long) + 2];
char valptr[sizeof(sse128_t)];
sse128_t vec_val;
u64 mm_val;
void *data;
@ -240,8 +242,8 @@ struct operand {
struct fetch_cache {
u8 data[15];
unsigned long start;
unsigned long end;
u8 *ptr;
u8 *end;
};
struct read_cache {
@ -286,30 +288,36 @@ struct x86_emulate_ctxt {
u8 opcode_len;
u8 b;
u8 intercept;
u8 lock_prefix;
u8 rep_prefix;
u8 op_bytes;
u8 ad_bytes;
u8 rex_prefix;
struct operand src;
struct operand src2;
struct operand dst;
bool has_seg_override;
u8 seg_override;
u64 d;
int (*execute)(struct x86_emulate_ctxt *ctxt);
int (*check_perm)(struct x86_emulate_ctxt *ctxt);
/*
* The following six fields are cleared together,
* the rest are initialized unconditionally in x86_decode_insn
* or elsewhere
*/
bool rip_relative;
u8 rex_prefix;
u8 lock_prefix;
u8 rep_prefix;
/* bitmaps of registers in _regs[] that can be read */
u32 regs_valid;
/* bitmaps of registers in _regs[] that have been written */
u32 regs_dirty;
/* modrm */
u8 modrm;
u8 modrm_mod;
u8 modrm_reg;
u8 modrm_rm;
u8 modrm_seg;
bool rip_relative;
u8 seg_override;
u64 d;
unsigned long _eip;
struct operand memop;
u32 regs_valid; /* bitmaps of registers in _regs[] that can be read */
u32 regs_dirty; /* bitmaps of registers in _regs[] that have been written */
/* Fields above regs are cleared together. */
unsigned long _regs[NR_VCPU_REGS];
struct operand *memopp;
@ -407,6 +415,7 @@ bool x86_page_table_writing_insn(struct x86_emulate_ctxt *ctxt);
#define EMULATION_OK 0
#define EMULATION_RESTART 1
#define EMULATION_INTERCEPTED 2
void init_decode_cache(struct x86_emulate_ctxt *ctxt);
int x86_emulate_insn(struct x86_emulate_ctxt *ctxt);
int emulator_task_switch(struct x86_emulate_ctxt *ctxt,
u16 tss_selector, int idt_index, int reason,

View File

@ -152,14 +152,16 @@ enum {
#define DR6_BD (1 << 13)
#define DR6_BS (1 << 14)
#define DR6_FIXED_1 0xffff0ff0
#define DR6_VOLATILE 0x0000e00f
#define DR6_RTM (1 << 16)
#define DR6_FIXED_1 0xfffe0ff0
#define DR6_INIT 0xffff0ff0
#define DR6_VOLATILE 0x0001e00f
#define DR7_BP_EN_MASK 0x000000ff
#define DR7_GE (1 << 9)
#define DR7_GD (1 << 13)
#define DR7_FIXED_1 0x00000400
#define DR7_VOLATILE 0xffff23ff
#define DR7_VOLATILE 0xffff2bff
/* apic attention bits */
#define KVM_APIC_CHECK_VAPIC 0
@ -448,7 +450,7 @@ struct kvm_vcpu_arch {
u64 tsc_offset_adjustment;
u64 this_tsc_nsec;
u64 this_tsc_write;
u8 this_tsc_generation;
u64 this_tsc_generation;
bool tsc_catchup;
bool tsc_always_catchup;
s8 virtual_tsc_shift;
@ -591,7 +593,7 @@ struct kvm_arch {
u64 cur_tsc_nsec;
u64 cur_tsc_write;
u64 cur_tsc_offset;
u8 cur_tsc_generation;
u64 cur_tsc_generation;
int nr_vcpus_matched_tsc;
spinlock_t pvclock_gtod_sync_lock;
@ -717,7 +719,7 @@ struct kvm_x86_ops {
int (*handle_exit)(struct kvm_vcpu *vcpu);
void (*skip_emulated_instruction)(struct kvm_vcpu *vcpu);
void (*set_interrupt_shadow)(struct kvm_vcpu *vcpu, int mask);
u32 (*get_interrupt_shadow)(struct kvm_vcpu *vcpu, int mask);
u32 (*get_interrupt_shadow)(struct kvm_vcpu *vcpu);
void (*patch_hypercall)(struct kvm_vcpu *vcpu,
unsigned char *hypercall_addr);
void (*set_irq)(struct kvm_vcpu *vcpu);
@ -1070,6 +1072,7 @@ void kvm_pmu_cpuid_update(struct kvm_vcpu *vcpu);
bool kvm_pmu_msr(struct kvm_vcpu *vcpu, u32 msr);
int kvm_pmu_get_msr(struct kvm_vcpu *vcpu, u32 msr, u64 *data);
int kvm_pmu_set_msr(struct kvm_vcpu *vcpu, struct msr_data *msr_info);
int kvm_pmu_check_pmc(struct kvm_vcpu *vcpu, unsigned pmc);
int kvm_pmu_read_pmc(struct kvm_vcpu *vcpu, unsigned pmc, u64 *data);
void kvm_handle_pmu_event(struct kvm_vcpu *vcpu);
void kvm_deliver_pmi(struct kvm_vcpu *vcpu);

View File

@ -51,6 +51,9 @@
#define CPU_BASED_MONITOR_EXITING 0x20000000
#define CPU_BASED_PAUSE_EXITING 0x40000000
#define CPU_BASED_ACTIVATE_SECONDARY_CONTROLS 0x80000000
#define CPU_BASED_ALWAYSON_WITHOUT_TRUE_MSR 0x0401e172
/*
* Definitions of Secondary Processor-Based VM-Execution Controls.
*/
@ -76,7 +79,7 @@
#define PIN_BASED_ALWAYSON_WITHOUT_TRUE_MSR 0x00000016
#define VM_EXIT_SAVE_DEBUG_CONTROLS 0x00000002
#define VM_EXIT_SAVE_DEBUG_CONTROLS 0x00000004
#define VM_EXIT_HOST_ADDR_SPACE_SIZE 0x00000200
#define VM_EXIT_LOAD_IA32_PERF_GLOBAL_CTRL 0x00001000
#define VM_EXIT_ACK_INTR_ON_EXIT 0x00008000
@ -89,7 +92,7 @@
#define VM_EXIT_ALWAYSON_WITHOUT_TRUE_MSR 0x00036dff
#define VM_ENTRY_LOAD_DEBUG_CONTROLS 0x00000002
#define VM_ENTRY_LOAD_DEBUG_CONTROLS 0x00000004
#define VM_ENTRY_IA32E_MODE 0x00000200
#define VM_ENTRY_SMM 0x00000400
#define VM_ENTRY_DEACT_DUAL_MONITOR 0x00000800

View File

@ -23,7 +23,10 @@
#define GP_VECTOR 13
#define PF_VECTOR 14
#define MF_VECTOR 16
#define AC_VECTOR 17
#define MC_VECTOR 18
#define XM_VECTOR 19
#define VE_VECTOR 20
/* Select x86 specific features in <linux/kvm.h> */
#define __KVM_HAVE_PIT

View File

@ -558,6 +558,7 @@
/* VMX_BASIC bits and bitmasks */
#define VMX_BASIC_VMCS_SIZE_SHIFT 32
#define VMX_BASIC_TRUE_CTLS (1ULL << 55)
#define VMX_BASIC_64 0x0001000000000000LLU
#define VMX_BASIC_MEM_TYPE_SHIFT 50
#define VMX_BASIC_MEM_TYPE_MASK 0x003c000000000000LLU

View File

@ -95,4 +95,12 @@ static inline bool guest_cpuid_has_gbpages(struct kvm_vcpu *vcpu)
best = kvm_find_cpuid_entry(vcpu, 0x80000001, 0);
return best && (best->edx & bit(X86_FEATURE_GBPAGES));
}
static inline bool guest_cpuid_has_rtm(struct kvm_vcpu *vcpu)
{
struct kvm_cpuid_entry2 *best;
best = kvm_find_cpuid_entry(vcpu, 7, 0);
return best && (best->ebx & bit(X86_FEATURE_RTM));
}
#endif

View File

@ -162,6 +162,10 @@
#define NoWrite ((u64)1 << 45) /* No writeback */
#define SrcWrite ((u64)1 << 46) /* Write back src operand */
#define NoMod ((u64)1 << 47) /* Mod field is ignored */
#define Intercept ((u64)1 << 48) /* Has valid intercept field */
#define CheckPerm ((u64)1 << 49) /* Has valid check_perm field */
#define NoBigReal ((u64)1 << 50) /* No big real mode */
#define PrivUD ((u64)1 << 51) /* #UD instead of #GP on CPL > 0 */
#define DstXacc (DstAccLo | SrcAccHi | SrcWrite)
@ -426,6 +430,7 @@ static int emulator_check_intercept(struct x86_emulate_ctxt *ctxt,
.modrm_reg = ctxt->modrm_reg,
.modrm_rm = ctxt->modrm_rm,
.src_val = ctxt->src.val64,
.dst_val = ctxt->dst.val64,
.src_bytes = ctxt->src.bytes,
.dst_bytes = ctxt->dst.bytes,
.ad_bytes = ctxt->ad_bytes,
@ -511,12 +516,6 @@ static u32 desc_limit_scaled(struct desc_struct *desc)
return desc->g ? (limit << 12) | 0xfff : limit;
}
static void set_seg_override(struct x86_emulate_ctxt *ctxt, int seg)
{
ctxt->has_seg_override = true;
ctxt->seg_override = seg;
}
static unsigned long seg_base(struct x86_emulate_ctxt *ctxt, int seg)
{
if (ctxt->mode == X86EMUL_MODE_PROT64 && seg < VCPU_SREG_FS)
@ -525,14 +524,6 @@ static unsigned long seg_base(struct x86_emulate_ctxt *ctxt, int seg)
return ctxt->ops->get_cached_segment_base(ctxt, seg);
}
static unsigned seg_override(struct x86_emulate_ctxt *ctxt)
{
if (!ctxt->has_seg_override)
return 0;
return ctxt->seg_override;
}
static int emulate_exception(struct x86_emulate_ctxt *ctxt, int vec,
u32 error, bool valid)
{
@ -651,7 +642,12 @@ static int __linearize(struct x86_emulate_ctxt *ctxt,
if (!fetch && (desc.type & 8) && !(desc.type & 2))
goto bad;
lim = desc_limit_scaled(&desc);
if ((desc.type & 8) || !(desc.type & 4)) {
if ((ctxt->mode == X86EMUL_MODE_REAL) && !fetch &&
(ctxt->d & NoBigReal)) {
/* la is between zero and 0xffff */
if (la > 0xffff || (u32)(la + size - 1) > 0xffff)
goto bad;
} else if ((desc.type & 8) || !(desc.type & 4)) {
/* expand-up segment */
if (addr.ea > lim || (u32)(addr.ea + size - 1) > lim)
goto bad;
@ -716,68 +712,71 @@ static int segmented_read_std(struct x86_emulate_ctxt *ctxt,
}
/*
* Fetch the next byte of the instruction being emulated which is pointed to
* by ctxt->_eip, then increment ctxt->_eip.
*
* Also prefetch the remaining bytes of the instruction without crossing page
* Prefetch the remaining bytes of the instruction without crossing page
* boundary if they are not in fetch_cache yet.
*/
static int do_insn_fetch_byte(struct x86_emulate_ctxt *ctxt, u8 *dest)
static int __do_insn_fetch_bytes(struct x86_emulate_ctxt *ctxt, int op_size)
{
struct fetch_cache *fc = &ctxt->fetch;
int rc;
int size, cur_size;
unsigned size;
unsigned long linear;
int cur_size = ctxt->fetch.end - ctxt->fetch.data;
struct segmented_address addr = { .seg = VCPU_SREG_CS,
.ea = ctxt->eip + cur_size };
if (ctxt->_eip == fc->end) {
unsigned long linear;
struct segmented_address addr = { .seg = VCPU_SREG_CS,
.ea = ctxt->_eip };
cur_size = fc->end - fc->start;
size = min(15UL - cur_size,
PAGE_SIZE - offset_in_page(ctxt->_eip));
rc = __linearize(ctxt, addr, size, false, true, &linear);
if (unlikely(rc != X86EMUL_CONTINUE))
return rc;
rc = ctxt->ops->fetch(ctxt, linear, fc->data + cur_size,
size, &ctxt->exception);
if (unlikely(rc != X86EMUL_CONTINUE))
return rc;
fc->end += size;
}
*dest = fc->data[ctxt->_eip - fc->start];
ctxt->_eip++;
size = 15UL ^ cur_size;
rc = __linearize(ctxt, addr, size, false, true, &linear);
if (unlikely(rc != X86EMUL_CONTINUE))
return rc;
size = min_t(unsigned, size, PAGE_SIZE - offset_in_page(linear));
/*
* One instruction can only straddle two pages,
* and one has been loaded at the beginning of
* x86_decode_insn. So, if not enough bytes
* still, we must have hit the 15-byte boundary.
*/
if (unlikely(size < op_size))
return X86EMUL_UNHANDLEABLE;
rc = ctxt->ops->fetch(ctxt, linear, ctxt->fetch.end,
size, &ctxt->exception);
if (unlikely(rc != X86EMUL_CONTINUE))
return rc;
ctxt->fetch.end += size;
return X86EMUL_CONTINUE;
}
static int do_insn_fetch(struct x86_emulate_ctxt *ctxt,
void *dest, unsigned size)
static __always_inline int do_insn_fetch_bytes(struct x86_emulate_ctxt *ctxt,
unsigned size)
{
int rc;
/* x86 instructions are limited to 15 bytes. */
if (unlikely(ctxt->_eip + size - ctxt->eip > 15))
return X86EMUL_UNHANDLEABLE;
while (size--) {
rc = do_insn_fetch_byte(ctxt, dest++);
if (rc != X86EMUL_CONTINUE)
return rc;
}
return X86EMUL_CONTINUE;
if (unlikely(ctxt->fetch.end - ctxt->fetch.ptr < size))
return __do_insn_fetch_bytes(ctxt, size);
else
return X86EMUL_CONTINUE;
}
/* Fetch next part of the instruction being emulated. */
#define insn_fetch(_type, _ctxt) \
({ unsigned long _x; \
rc = do_insn_fetch(_ctxt, &_x, sizeof(_type)); \
({ _type _x; \
\
rc = do_insn_fetch_bytes(_ctxt, sizeof(_type)); \
if (rc != X86EMUL_CONTINUE) \
goto done; \
(_type)_x; \
ctxt->_eip += sizeof(_type); \
_x = *(_type __aligned(1) *) ctxt->fetch.ptr; \
ctxt->fetch.ptr += sizeof(_type); \
_x; \
})
#define insn_fetch_arr(_arr, _size, _ctxt) \
({ rc = do_insn_fetch(_ctxt, _arr, (_size)); \
({ \
rc = do_insn_fetch_bytes(_ctxt, _size); \
if (rc != X86EMUL_CONTINUE) \
goto done; \
ctxt->_eip += (_size); \
memcpy(_arr, ctxt->fetch.ptr, _size); \
ctxt->fetch.ptr += (_size); \
})
/*
@ -1063,19 +1062,17 @@ static int decode_modrm(struct x86_emulate_ctxt *ctxt,
struct operand *op)
{
u8 sib;
int index_reg = 0, base_reg = 0, scale;
int index_reg, base_reg, scale;
int rc = X86EMUL_CONTINUE;
ulong modrm_ea = 0;
if (ctxt->rex_prefix) {
ctxt->modrm_reg = (ctxt->rex_prefix & 4) << 1; /* REX.R */
index_reg = (ctxt->rex_prefix & 2) << 2; /* REX.X */
ctxt->modrm_rm = base_reg = (ctxt->rex_prefix & 1) << 3; /* REG.B */
}
ctxt->modrm_reg = ((ctxt->rex_prefix << 1) & 8); /* REX.R */
index_reg = (ctxt->rex_prefix << 2) & 8; /* REX.X */
base_reg = (ctxt->rex_prefix << 3) & 8; /* REX.B */
ctxt->modrm_mod |= (ctxt->modrm & 0xc0) >> 6;
ctxt->modrm_mod = (ctxt->modrm & 0xc0) >> 6;
ctxt->modrm_reg |= (ctxt->modrm & 0x38) >> 3;
ctxt->modrm_rm |= (ctxt->modrm & 0x07);
ctxt->modrm_rm = base_reg | (ctxt->modrm & 0x07);
ctxt->modrm_seg = VCPU_SREG_DS;
if (ctxt->modrm_mod == 3 || (ctxt->d & NoMod)) {
@ -1093,7 +1090,7 @@ static int decode_modrm(struct x86_emulate_ctxt *ctxt,
if (ctxt->d & Mmx) {
op->type = OP_MM;
op->bytes = 8;
op->addr.xmm = ctxt->modrm_rm & 7;
op->addr.mm = ctxt->modrm_rm & 7;
return rc;
}
fetch_register_operand(op);
@ -1190,6 +1187,9 @@ static int decode_modrm(struct x86_emulate_ctxt *ctxt,
}
}
op->addr.mem.ea = modrm_ea;
if (ctxt->ad_bytes != 8)
ctxt->memop.addr.mem.ea = (u32)ctxt->memop.addr.mem.ea;
done:
return rc;
}
@ -1220,12 +1220,14 @@ static void fetch_bit_operand(struct x86_emulate_ctxt *ctxt)
long sv = 0, mask;
if (ctxt->dst.type == OP_MEM && ctxt->src.type == OP_REG) {
mask = ~(ctxt->dst.bytes * 8 - 1);
mask = ~((long)ctxt->dst.bytes * 8 - 1);
if (ctxt->src.bytes == 2)
sv = (s16)ctxt->src.val & (s16)mask;
else if (ctxt->src.bytes == 4)
sv = (s32)ctxt->src.val & (s32)mask;
else
sv = (s64)ctxt->src.val & (s64)mask;
ctxt->dst.addr.mem.ea += (sv >> 3);
}
@ -1315,8 +1317,7 @@ static int pio_in_emulated(struct x86_emulate_ctxt *ctxt,
in_page = (ctxt->eflags & EFLG_DF) ?
offset_in_page(reg_read(ctxt, VCPU_REGS_RDI)) :
PAGE_SIZE - offset_in_page(reg_read(ctxt, VCPU_REGS_RDI));
n = min(min(in_page, (unsigned int)sizeof(rc->data)) / size,
count);
n = min3(in_page, (unsigned int)sizeof(rc->data) / size, count);
if (n == 0)
n = 1;
rc->pos = rc->end = 0;
@ -1358,17 +1359,19 @@ static void get_descriptor_table_ptr(struct x86_emulate_ctxt *ctxt,
u16 selector, struct desc_ptr *dt)
{
const struct x86_emulate_ops *ops = ctxt->ops;
u32 base3 = 0;
if (selector & 1 << 2) {
struct desc_struct desc;
u16 sel;
memset (dt, 0, sizeof *dt);
if (!ops->get_segment(ctxt, &sel, &desc, NULL, VCPU_SREG_LDTR))
if (!ops->get_segment(ctxt, &sel, &desc, &base3,
VCPU_SREG_LDTR))
return;
dt->size = desc_limit_scaled(&desc); /* what if limit > 65535? */
dt->address = get_desc_base(&desc);
dt->address = get_desc_base(&desc) | ((u64)base3 << 32);
} else
ops->get_gdt(ctxt, dt);
}
@ -1422,6 +1425,7 @@ static int __load_segment_descriptor(struct x86_emulate_ctxt *ctxt,
ulong desc_addr;
int ret;
u16 dummy;
u32 base3 = 0;
memset(&seg_desc, 0, sizeof seg_desc);
@ -1538,9 +1542,14 @@ static int __load_segment_descriptor(struct x86_emulate_ctxt *ctxt,
ret = write_segment_descriptor(ctxt, selector, &seg_desc);
if (ret != X86EMUL_CONTINUE)
return ret;
} else if (ctxt->mode == X86EMUL_MODE_PROT64) {
ret = ctxt->ops->read_std(ctxt, desc_addr+8, &base3,
sizeof(base3), &ctxt->exception);
if (ret != X86EMUL_CONTINUE)
return ret;
}
load:
ctxt->ops->set_segment(ctxt, selector, &seg_desc, 0, seg);
ctxt->ops->set_segment(ctxt, selector, &seg_desc, base3, seg);
return X86EMUL_CONTINUE;
exception:
emulate_exception(ctxt, err_vec, err_code, true);
@ -1575,34 +1584,28 @@ static void write_register_operand(struct operand *op)
static int writeback(struct x86_emulate_ctxt *ctxt, struct operand *op)
{
int rc;
switch (op->type) {
case OP_REG:
write_register_operand(op);
break;
case OP_MEM:
if (ctxt->lock_prefix)
rc = segmented_cmpxchg(ctxt,
return segmented_cmpxchg(ctxt,
op->addr.mem,
&op->orig_val,
&op->val,
op->bytes);
else
return segmented_write(ctxt,
op->addr.mem,
&op->orig_val,
&op->val,
op->bytes);
else
rc = segmented_write(ctxt,
op->addr.mem,
&op->val,
op->bytes);
if (rc != X86EMUL_CONTINUE)
return rc;
break;
case OP_MEM_STR:
rc = segmented_write(ctxt,
op->addr.mem,
op->data,
op->bytes * op->count);
if (rc != X86EMUL_CONTINUE)
return rc;
return segmented_write(ctxt,
op->addr.mem,
op->data,
op->bytes * op->count);
break;
case OP_XMM:
write_sse_reg(ctxt, &op->vec_val, op->addr.xmm);
@ -1671,7 +1674,7 @@ static int emulate_popf(struct x86_emulate_ctxt *ctxt,
return rc;
change_mask = EFLG_CF | EFLG_PF | EFLG_AF | EFLG_ZF | EFLG_SF | EFLG_OF
| EFLG_TF | EFLG_DF | EFLG_NT | EFLG_RF | EFLG_AC | EFLG_ID;
| EFLG_TF | EFLG_DF | EFLG_NT | EFLG_AC | EFLG_ID;
switch(ctxt->mode) {
case X86EMUL_MODE_PROT64:
@ -1754,6 +1757,9 @@ static int em_pop_sreg(struct x86_emulate_ctxt *ctxt)
if (rc != X86EMUL_CONTINUE)
return rc;
if (ctxt->modrm_reg == VCPU_SREG_SS)
ctxt->interruptibility = KVM_X86_SHADOW_INT_MOV_SS;
rc = load_segment_descriptor(ctxt, (u16)selector, seg);
return rc;
}
@ -1991,6 +1997,9 @@ static int em_cmpxchg8b(struct x86_emulate_ctxt *ctxt)
{
u64 old = ctxt->dst.orig_val64;
if (ctxt->dst.bytes == 16)
return X86EMUL_UNHANDLEABLE;
if (((u32) (old >> 0) != (u32) reg_read(ctxt, VCPU_REGS_RAX)) ||
((u32) (old >> 32) != (u32) reg_read(ctxt, VCPU_REGS_RDX))) {
*reg_write(ctxt, VCPU_REGS_RAX) = (u32) (old >> 0);
@ -2017,6 +2026,7 @@ static int em_ret_far(struct x86_emulate_ctxt *ctxt)
{
int rc;
unsigned long cs;
int cpl = ctxt->ops->cpl(ctxt);
rc = emulate_pop(ctxt, &ctxt->_eip, ctxt->op_bytes);
if (rc != X86EMUL_CONTINUE)
@ -2026,6 +2036,9 @@ static int em_ret_far(struct x86_emulate_ctxt *ctxt)
rc = emulate_pop(ctxt, &cs, ctxt->op_bytes);
if (rc != X86EMUL_CONTINUE)
return rc;
/* Outer-privilege level return is not implemented */
if (ctxt->mode >= X86EMUL_MODE_PROT16 && (cs & 3) > cpl)
return X86EMUL_UNHANDLEABLE;
rc = load_segment_descriptor(ctxt, (u16)cs, VCPU_SREG_CS);
return rc;
}
@ -2044,8 +2057,10 @@ static int em_ret_far_imm(struct x86_emulate_ctxt *ctxt)
static int em_cmpxchg(struct x86_emulate_ctxt *ctxt)
{
/* Save real source value, then compare EAX against destination. */
ctxt->dst.orig_val = ctxt->dst.val;
ctxt->dst.val = reg_read(ctxt, VCPU_REGS_RAX);
ctxt->src.orig_val = ctxt->src.val;
ctxt->src.val = reg_read(ctxt, VCPU_REGS_RAX);
ctxt->src.val = ctxt->dst.orig_val;
fastop(ctxt, em_cmp);
if (ctxt->eflags & EFLG_ZF) {
@ -2055,6 +2070,7 @@ static int em_cmpxchg(struct x86_emulate_ctxt *ctxt)
/* Failure: write the value we saw to EAX. */
ctxt->dst.type = OP_REG;
ctxt->dst.addr.reg = reg_rmw(ctxt, VCPU_REGS_RAX);
ctxt->dst.val = ctxt->dst.orig_val;
}
return X86EMUL_CONTINUE;
}
@ -2194,7 +2210,7 @@ static int em_syscall(struct x86_emulate_ctxt *ctxt)
*reg_write(ctxt, VCPU_REGS_RCX) = ctxt->_eip;
if (efer & EFER_LMA) {
#ifdef CONFIG_X86_64
*reg_write(ctxt, VCPU_REGS_R11) = ctxt->eflags & ~EFLG_RF;
*reg_write(ctxt, VCPU_REGS_R11) = ctxt->eflags;
ops->get_msr(ctxt,
ctxt->mode == X86EMUL_MODE_PROT64 ?
@ -2202,14 +2218,14 @@ static int em_syscall(struct x86_emulate_ctxt *ctxt)
ctxt->_eip = msr_data;
ops->get_msr(ctxt, MSR_SYSCALL_MASK, &msr_data);
ctxt->eflags &= ~(msr_data | EFLG_RF);
ctxt->eflags &= ~msr_data;
#endif
} else {
/* legacy mode */
ops->get_msr(ctxt, MSR_STAR, &msr_data);
ctxt->_eip = (u32)msr_data;
ctxt->eflags &= ~(EFLG_VM | EFLG_IF | EFLG_RF);
ctxt->eflags &= ~(EFLG_VM | EFLG_IF);
}
return X86EMUL_CONTINUE;
@ -2258,7 +2274,7 @@ static int em_sysenter(struct x86_emulate_ctxt *ctxt)
break;
}
ctxt->eflags &= ~(EFLG_VM | EFLG_IF | EFLG_RF);
ctxt->eflags &= ~(EFLG_VM | EFLG_IF);
cs_sel = (u16)msr_data;
cs_sel &= ~SELECTOR_RPL_MASK;
ss_sel = cs_sel + 8;
@ -2964,7 +2980,7 @@ static int em_rdpmc(struct x86_emulate_ctxt *ctxt)
static int em_mov(struct x86_emulate_ctxt *ctxt)
{
memcpy(ctxt->dst.valptr, ctxt->src.valptr, ctxt->op_bytes);
memcpy(ctxt->dst.valptr, ctxt->src.valptr, sizeof(ctxt->src.valptr));
return X86EMUL_CONTINUE;
}
@ -3221,7 +3237,8 @@ static int em_lidt(struct x86_emulate_ctxt *ctxt)
static int em_smsw(struct x86_emulate_ctxt *ctxt)
{
ctxt->dst.bytes = 2;
if (ctxt->dst.type == OP_MEM)
ctxt->dst.bytes = 2;
ctxt->dst.val = ctxt->ops->get_cr(ctxt, 0);
return X86EMUL_CONTINUE;
}
@ -3496,7 +3513,7 @@ static int check_rdpmc(struct x86_emulate_ctxt *ctxt)
u64 rcx = reg_read(ctxt, VCPU_REGS_RCX);
if ((!(cr4 & X86_CR4_PCE) && ctxt->ops->cpl(ctxt)) ||
(rcx > 3))
ctxt->ops->check_pmc(ctxt, rcx))
return emulate_gp(ctxt, 0);
return X86EMUL_CONTINUE;
@ -3521,9 +3538,9 @@ static int check_perm_out(struct x86_emulate_ctxt *ctxt)
}
#define D(_y) { .flags = (_y) }
#define DI(_y, _i) { .flags = (_y), .intercept = x86_intercept_##_i }
#define DIP(_y, _i, _p) { .flags = (_y), .intercept = x86_intercept_##_i, \
.check_perm = (_p) }
#define DI(_y, _i) { .flags = (_y)|Intercept, .intercept = x86_intercept_##_i }
#define DIP(_y, _i, _p) { .flags = (_y)|Intercept|CheckPerm, \
.intercept = x86_intercept_##_i, .check_perm = (_p) }
#define N D(NotImpl)
#define EXT(_f, _e) { .flags = ((_f) | RMExt), .u.group = (_e) }
#define G(_f, _g) { .flags = ((_f) | Group | ModRM), .u.group = (_g) }
@ -3532,10 +3549,10 @@ static int check_perm_out(struct x86_emulate_ctxt *ctxt)
#define I(_f, _e) { .flags = (_f), .u.execute = (_e) }
#define F(_f, _e) { .flags = (_f) | Fastop, .u.fastop = (_e) }
#define II(_f, _e, _i) \
{ .flags = (_f), .u.execute = (_e), .intercept = x86_intercept_##_i }
{ .flags = (_f)|Intercept, .u.execute = (_e), .intercept = x86_intercept_##_i }
#define IIP(_f, _e, _i, _p) \
{ .flags = (_f), .u.execute = (_e), .intercept = x86_intercept_##_i, \
.check_perm = (_p) }
{ .flags = (_f)|Intercept|CheckPerm, .u.execute = (_e), \
.intercept = x86_intercept_##_i, .check_perm = (_p) }
#define GP(_f, _g) { .flags = ((_f) | Prefix), .u.gprefix = (_g) }
#define D2bv(_f) D((_f) | ByteOp), D(_f)
@ -3634,8 +3651,8 @@ static const struct opcode group6[] = {
};
static const struct group_dual group7 = { {
II(Mov | DstMem | Priv, em_sgdt, sgdt),
II(Mov | DstMem | Priv, em_sidt, sidt),
II(Mov | DstMem, em_sgdt, sgdt),
II(Mov | DstMem, em_sidt, sidt),
II(SrcMem | Priv, em_lgdt, lgdt),
II(SrcMem | Priv, em_lidt, lidt),
II(SrcNone | DstMem | Mov, em_smsw, smsw), N,
@ -3899,7 +3916,7 @@ static const struct opcode twobyte_table[256] = {
N, N,
N, N, N, N, N, N, N, N,
/* 0x40 - 0x4F */
X16(D(DstReg | SrcMem | ModRM | Mov)),
X16(D(DstReg | SrcMem | ModRM)),
/* 0x50 - 0x5F */
N, N, N, N, N, N, N, N, N, N, N, N, N, N, N, N,
/* 0x60 - 0x6F */
@ -4061,12 +4078,12 @@ static int decode_operand(struct x86_emulate_ctxt *ctxt, struct operand *op,
mem_common:
*op = ctxt->memop;
ctxt->memopp = op;
if ((ctxt->d & BitOp) && op == &ctxt->dst)
if (ctxt->d & BitOp)
fetch_bit_operand(ctxt);
op->orig_val = op->val;
break;
case OpMem64:
ctxt->memop.bytes = 8;
ctxt->memop.bytes = (ctxt->op_bytes == 8) ? 16 : 8;
goto mem_common;
case OpAcc:
op->type = OP_REG;
@ -4150,7 +4167,7 @@ static int decode_operand(struct x86_emulate_ctxt *ctxt, struct operand *op,
op->bytes = (ctxt->d & ByteOp) ? 1 : ctxt->op_bytes;
op->addr.mem.ea =
register_address(ctxt, reg_read(ctxt, VCPU_REGS_RSI));
op->addr.mem.seg = seg_override(ctxt);
op->addr.mem.seg = ctxt->seg_override;
op->val = 0;
op->count = 1;
break;
@ -4161,7 +4178,7 @@ static int decode_operand(struct x86_emulate_ctxt *ctxt, struct operand *op,
register_address(ctxt,
reg_read(ctxt, VCPU_REGS_RBX) +
(reg_read(ctxt, VCPU_REGS_RAX) & 0xff));
op->addr.mem.seg = seg_override(ctxt);
op->addr.mem.seg = ctxt->seg_override;
op->val = 0;
break;
case OpImmFAddr:
@ -4208,16 +4225,22 @@ int x86_decode_insn(struct x86_emulate_ctxt *ctxt, void *insn, int insn_len)
int mode = ctxt->mode;
int def_op_bytes, def_ad_bytes, goffset, simd_prefix;
bool op_prefix = false;
bool has_seg_override = false;
struct opcode opcode;
ctxt->memop.type = OP_NONE;
ctxt->memopp = NULL;
ctxt->_eip = ctxt->eip;
ctxt->fetch.start = ctxt->_eip;
ctxt->fetch.end = ctxt->fetch.start + insn_len;
ctxt->fetch.ptr = ctxt->fetch.data;
ctxt->fetch.end = ctxt->fetch.data + insn_len;
ctxt->opcode_len = 1;
if (insn_len > 0)
memcpy(ctxt->fetch.data, insn, insn_len);
else {
rc = __do_insn_fetch_bytes(ctxt, 1);
if (rc != X86EMUL_CONTINUE)
return rc;
}
switch (mode) {
case X86EMUL_MODE_REAL:
@ -4261,11 +4284,13 @@ int x86_decode_insn(struct x86_emulate_ctxt *ctxt, void *insn, int insn_len)
case 0x2e: /* CS override */
case 0x36: /* SS override */
case 0x3e: /* DS override */
set_seg_override(ctxt, (ctxt->b >> 3) & 3);
has_seg_override = true;
ctxt->seg_override = (ctxt->b >> 3) & 3;
break;
case 0x64: /* FS override */
case 0x65: /* GS override */
set_seg_override(ctxt, ctxt->b & 7);
has_seg_override = true;
ctxt->seg_override = ctxt->b & 7;
break;
case 0x40 ... 0x4f: /* REX */
if (mode != X86EMUL_MODE_PROT64)
@ -4314,6 +4339,13 @@ int x86_decode_insn(struct x86_emulate_ctxt *ctxt, void *insn, int insn_len)
if (ctxt->d & ModRM)
ctxt->modrm = insn_fetch(u8, ctxt);
/* vex-prefix instructions are not implemented */
if (ctxt->opcode_len == 1 && (ctxt->b == 0xc5 || ctxt->b == 0xc4) &&
(mode == X86EMUL_MODE_PROT64 ||
(mode >= X86EMUL_MODE_PROT16 && (ctxt->modrm & 0x80)))) {
ctxt->d = NotImpl;
}
while (ctxt->d & GroupMask) {
switch (ctxt->d & GroupMask) {
case Group:
@ -4356,49 +4388,59 @@ int x86_decode_insn(struct x86_emulate_ctxt *ctxt, void *insn, int insn_len)
ctxt->d |= opcode.flags;
}
ctxt->execute = opcode.u.execute;
ctxt->check_perm = opcode.check_perm;
ctxt->intercept = opcode.intercept;
/* Unrecognised? */
if (ctxt->d == 0 || (ctxt->d & NotImpl))
if (ctxt->d == 0)
return EMULATION_FAILED;
if (!(ctxt->d & EmulateOnUD) && ctxt->ud)
return EMULATION_FAILED;
ctxt->execute = opcode.u.execute;
if (mode == X86EMUL_MODE_PROT64 && (ctxt->d & Stack))
ctxt->op_bytes = 8;
if (unlikely(ctxt->d &
(NotImpl|EmulateOnUD|Stack|Op3264|Sse|Mmx|Intercept|CheckPerm))) {
/*
* These are copied unconditionally here, and checked unconditionally
* in x86_emulate_insn.
*/
ctxt->check_perm = opcode.check_perm;
ctxt->intercept = opcode.intercept;
if (ctxt->d & Op3264) {
if (mode == X86EMUL_MODE_PROT64)
if (ctxt->d & NotImpl)
return EMULATION_FAILED;
if (!(ctxt->d & EmulateOnUD) && ctxt->ud)
return EMULATION_FAILED;
if (mode == X86EMUL_MODE_PROT64 && (ctxt->d & Stack))
ctxt->op_bytes = 8;
else
ctxt->op_bytes = 4;
}
if (ctxt->d & Sse)
ctxt->op_bytes = 16;
else if (ctxt->d & Mmx)
ctxt->op_bytes = 8;
if (ctxt->d & Op3264) {
if (mode == X86EMUL_MODE_PROT64)
ctxt->op_bytes = 8;
else
ctxt->op_bytes = 4;
}
if (ctxt->d & Sse)
ctxt->op_bytes = 16;
else if (ctxt->d & Mmx)
ctxt->op_bytes = 8;
}
/* ModRM and SIB bytes. */
if (ctxt->d & ModRM) {
rc = decode_modrm(ctxt, &ctxt->memop);
if (!ctxt->has_seg_override)
set_seg_override(ctxt, ctxt->modrm_seg);
if (!has_seg_override) {
has_seg_override = true;
ctxt->seg_override = ctxt->modrm_seg;
}
} else if (ctxt->d & MemAbs)
rc = decode_abs(ctxt, &ctxt->memop);
if (rc != X86EMUL_CONTINUE)
goto done;
if (!ctxt->has_seg_override)
set_seg_override(ctxt, VCPU_SREG_DS);
if (!has_seg_override)
ctxt->seg_override = VCPU_SREG_DS;
ctxt->memop.addr.mem.seg = seg_override(ctxt);
if (ctxt->memop.type == OP_MEM && ctxt->ad_bytes != 8)
ctxt->memop.addr.mem.ea = (u32)ctxt->memop.addr.mem.ea;
ctxt->memop.addr.mem.seg = ctxt->seg_override;
/*
* Decode and fetch the source operand: register, memory
@ -4420,7 +4462,7 @@ int x86_decode_insn(struct x86_emulate_ctxt *ctxt, void *insn, int insn_len)
rc = decode_operand(ctxt, &ctxt->dst, (ctxt->d >> DstShift) & OpMask);
done:
if (ctxt->memopp && ctxt->memopp->type == OP_MEM && ctxt->rip_relative)
if (ctxt->rip_relative)
ctxt->memopp->addr.mem.ea += ctxt->_eip;
return (rc != X86EMUL_CONTINUE) ? EMULATION_FAILED : EMULATION_OK;
@ -4495,6 +4537,16 @@ static int fastop(struct x86_emulate_ctxt *ctxt, void (*fop)(struct fastop *))
return X86EMUL_CONTINUE;
}
void init_decode_cache(struct x86_emulate_ctxt *ctxt)
{
memset(&ctxt->rip_relative, 0,
(void *)&ctxt->modrm - (void *)&ctxt->rip_relative);
ctxt->io_read.pos = 0;
ctxt->io_read.end = 0;
ctxt->mem_read.end = 0;
}
int x86_emulate_insn(struct x86_emulate_ctxt *ctxt)
{
const struct x86_emulate_ops *ops = ctxt->ops;
@ -4503,12 +4555,6 @@ int x86_emulate_insn(struct x86_emulate_ctxt *ctxt)
ctxt->mem_read.pos = 0;
if ((ctxt->mode == X86EMUL_MODE_PROT64 && (ctxt->d & No64)) ||
(ctxt->d & Undefined)) {
rc = emulate_ud(ctxt);
goto done;
}
/* LOCK prefix is allowed only with some instructions */
if (ctxt->lock_prefix && (!(ctxt->d & Lock) || ctxt->dst.type != OP_MEM)) {
rc = emulate_ud(ctxt);
@ -4520,70 +4566,83 @@ int x86_emulate_insn(struct x86_emulate_ctxt *ctxt)
goto done;
}
if (((ctxt->d & (Sse|Mmx)) && ((ops->get_cr(ctxt, 0) & X86_CR0_EM)))
|| ((ctxt->d & Sse) && !(ops->get_cr(ctxt, 4) & X86_CR4_OSFXSR))) {
rc = emulate_ud(ctxt);
goto done;
}
if ((ctxt->d & (Sse|Mmx)) && (ops->get_cr(ctxt, 0) & X86_CR0_TS)) {
rc = emulate_nm(ctxt);
goto done;
}
if (ctxt->d & Mmx) {
rc = flush_pending_x87_faults(ctxt);
if (rc != X86EMUL_CONTINUE)
if (unlikely(ctxt->d &
(No64|Undefined|Sse|Mmx|Intercept|CheckPerm|Priv|Prot|String))) {
if ((ctxt->mode == X86EMUL_MODE_PROT64 && (ctxt->d & No64)) ||
(ctxt->d & Undefined)) {
rc = emulate_ud(ctxt);
goto done;
/*
* Now that we know the fpu is exception safe, we can fetch
* operands from it.
*/
fetch_possible_mmx_operand(ctxt, &ctxt->src);
fetch_possible_mmx_operand(ctxt, &ctxt->src2);
if (!(ctxt->d & Mov))
fetch_possible_mmx_operand(ctxt, &ctxt->dst);
}
}
if (unlikely(ctxt->guest_mode) && ctxt->intercept) {
rc = emulator_check_intercept(ctxt, ctxt->intercept,
X86_ICPT_PRE_EXCEPT);
if (rc != X86EMUL_CONTINUE)
if (((ctxt->d & (Sse|Mmx)) && ((ops->get_cr(ctxt, 0) & X86_CR0_EM)))
|| ((ctxt->d & Sse) && !(ops->get_cr(ctxt, 4) & X86_CR4_OSFXSR))) {
rc = emulate_ud(ctxt);
goto done;
}
}
/* Privileged instruction can be executed only in CPL=0 */
if ((ctxt->d & Priv) && ops->cpl(ctxt)) {
rc = emulate_gp(ctxt, 0);
goto done;
}
/* Instruction can only be executed in protected mode */
if ((ctxt->d & Prot) && ctxt->mode < X86EMUL_MODE_PROT16) {
rc = emulate_ud(ctxt);
goto done;
}
/* Do instruction specific permission checks */
if (ctxt->check_perm) {
rc = ctxt->check_perm(ctxt);
if (rc != X86EMUL_CONTINUE)
if ((ctxt->d & (Sse|Mmx)) && (ops->get_cr(ctxt, 0) & X86_CR0_TS)) {
rc = emulate_nm(ctxt);
goto done;
}
}
if (unlikely(ctxt->guest_mode) && ctxt->intercept) {
rc = emulator_check_intercept(ctxt, ctxt->intercept,
X86_ICPT_POST_EXCEPT);
if (rc != X86EMUL_CONTINUE)
goto done;
}
if (ctxt->d & Mmx) {
rc = flush_pending_x87_faults(ctxt);
if (rc != X86EMUL_CONTINUE)
goto done;
/*
* Now that we know the fpu is exception safe, we can fetch
* operands from it.
*/
fetch_possible_mmx_operand(ctxt, &ctxt->src);
fetch_possible_mmx_operand(ctxt, &ctxt->src2);
if (!(ctxt->d & Mov))
fetch_possible_mmx_operand(ctxt, &ctxt->dst);
}
if (ctxt->rep_prefix && (ctxt->d & String)) {
/* All REP prefixes have the same first termination condition */
if (address_mask(ctxt, reg_read(ctxt, VCPU_REGS_RCX)) == 0) {
ctxt->eip = ctxt->_eip;
if (unlikely(ctxt->guest_mode) && (ctxt->d & Intercept)) {
rc = emulator_check_intercept(ctxt, ctxt->intercept,
X86_ICPT_PRE_EXCEPT);
if (rc != X86EMUL_CONTINUE)
goto done;
}
/* Privileged instruction can be executed only in CPL=0 */
if ((ctxt->d & Priv) && ops->cpl(ctxt)) {
if (ctxt->d & PrivUD)
rc = emulate_ud(ctxt);
else
rc = emulate_gp(ctxt, 0);
goto done;
}
/* Instruction can only be executed in protected mode */
if ((ctxt->d & Prot) && ctxt->mode < X86EMUL_MODE_PROT16) {
rc = emulate_ud(ctxt);
goto done;
}
/* Do instruction specific permission checks */
if (ctxt->d & CheckPerm) {
rc = ctxt->check_perm(ctxt);
if (rc != X86EMUL_CONTINUE)
goto done;
}
if (unlikely(ctxt->guest_mode) && (ctxt->d & Intercept)) {
rc = emulator_check_intercept(ctxt, ctxt->intercept,
X86_ICPT_POST_EXCEPT);
if (rc != X86EMUL_CONTINUE)
goto done;
}
if (ctxt->rep_prefix && (ctxt->d & String)) {
/* All REP prefixes have the same first termination condition */
if (address_mask(ctxt, reg_read(ctxt, VCPU_REGS_RCX)) == 0) {
ctxt->eip = ctxt->_eip;
ctxt->eflags &= ~EFLG_RF;
goto done;
}
}
}
if ((ctxt->src.type == OP_MEM) && !(ctxt->d & NoAccess)) {
@ -4616,13 +4675,18 @@ int x86_emulate_insn(struct x86_emulate_ctxt *ctxt)
special_insn:
if (unlikely(ctxt->guest_mode) && ctxt->intercept) {
if (unlikely(ctxt->guest_mode) && (ctxt->d & Intercept)) {
rc = emulator_check_intercept(ctxt, ctxt->intercept,
X86_ICPT_POST_MEMACCESS);
if (rc != X86EMUL_CONTINUE)
goto done;
}
if (ctxt->rep_prefix && (ctxt->d & String))
ctxt->eflags |= EFLG_RF;
else
ctxt->eflags &= ~EFLG_RF;
if (ctxt->execute) {
if (ctxt->d & Fastop) {
void (*fop)(struct fastop *) = (void *)ctxt->execute;
@ -4657,8 +4721,9 @@ int x86_emulate_insn(struct x86_emulate_ctxt *ctxt)
break;
case 0x90 ... 0x97: /* nop / xchg reg, rax */
if (ctxt->dst.addr.reg == reg_rmw(ctxt, VCPU_REGS_RAX))
break;
rc = em_xchg(ctxt);
ctxt->dst.type = OP_NONE;
else
rc = em_xchg(ctxt);
break;
case 0x98: /* cbw/cwde/cdqe */
switch (ctxt->op_bytes) {
@ -4709,17 +4774,17 @@ int x86_emulate_insn(struct x86_emulate_ctxt *ctxt)
goto done;
writeback:
if (!(ctxt->d & NoWrite)) {
rc = writeback(ctxt, &ctxt->dst);
if (rc != X86EMUL_CONTINUE)
goto done;
}
if (ctxt->d & SrcWrite) {
BUG_ON(ctxt->src.type == OP_MEM || ctxt->src.type == OP_MEM_STR);
rc = writeback(ctxt, &ctxt->src);
if (rc != X86EMUL_CONTINUE)
goto done;
}
if (!(ctxt->d & NoWrite)) {
rc = writeback(ctxt, &ctxt->dst);
if (rc != X86EMUL_CONTINUE)
goto done;
}
/*
* restore dst type in case the decoding will be reused
@ -4761,6 +4826,7 @@ int x86_emulate_insn(struct x86_emulate_ctxt *ctxt)
}
goto done; /* skip rip writeback */
}
ctxt->eflags &= ~EFLG_RF;
}
ctxt->eip = ctxt->_eip;
@ -4793,8 +4859,10 @@ int x86_emulate_insn(struct x86_emulate_ctxt *ctxt)
ops->get_dr(ctxt, ctxt->modrm_reg, &ctxt->dst.val);
break;
case 0x40 ... 0x4f: /* cmov */
ctxt->dst.val = ctxt->dst.orig_val = ctxt->src.val;
if (!test_cc(ctxt->b, ctxt->eflags))
if (test_cc(ctxt->b, ctxt->eflags))
ctxt->dst.val = ctxt->src.val;
else if (ctxt->mode != X86EMUL_MODE_PROT64 ||
ctxt->op_bytes != 4)
ctxt->dst.type = OP_NONE; /* no writeback */
break;
case 0x80 ... 0x8f: /* jnz rel, etc*/
@ -4818,8 +4886,8 @@ int x86_emulate_insn(struct x86_emulate_ctxt *ctxt)
break;
case 0xc3: /* movnti */
ctxt->dst.bytes = ctxt->op_bytes;
ctxt->dst.val = (ctxt->op_bytes == 4) ? (u32) ctxt->src.val :
(u64) ctxt->src.val;
ctxt->dst.val = (ctxt->op_bytes == 8) ? (u64) ctxt->src.val :
(u32) ctxt->src.val;
break;
default:
goto cannot_emulate;

View File

@ -1451,7 +1451,7 @@ void kvm_lapic_reset(struct kvm_vcpu *vcpu)
vcpu->arch.apic_arb_prio = 0;
vcpu->arch.apic_attention = 0;
apic_debug(KERN_INFO "%s: vcpu=%p, id=%d, base_msr="
apic_debug("%s: vcpu=%p, id=%d, base_msr="
"0x%016" PRIx64 ", base_address=0x%0lx.\n", __func__,
vcpu, kvm_apic_id(apic),
vcpu->arch.apic_base, apic->base_address);
@ -1895,7 +1895,7 @@ void kvm_apic_accept_events(struct kvm_vcpu *vcpu)
/* evaluate pending_events before reading the vector */
smp_rmb();
sipi_vector = apic->sipi_vector;
pr_debug("vcpu %d received sipi with vector # %x\n",
apic_debug("vcpu %d received sipi with vector # %x\n",
vcpu->vcpu_id, sipi_vector);
kvm_vcpu_deliver_sipi_vector(vcpu, sipi_vector);
vcpu->arch.mp_state = KVM_MP_STATE_RUNNABLE;

View File

@ -22,7 +22,7 @@
__entry->unsync = sp->unsync;
#define KVM_MMU_PAGE_PRINTK() ({ \
const char *ret = trace_seq_buffer_ptr(p); \
const u32 saved_len = p->len; \
static const char *access_str[] = { \
"---", "--x", "w--", "w-x", "-u-", "-ux", "wu-", "wux" \
}; \
@ -41,7 +41,7 @@
role.nxe ? "" : "!", \
__entry->root_count, \
__entry->unsync ? "unsync" : "sync", 0); \
ret; \
p->buffer + saved_len; \
})
#define kvm_mmu_trace_pferr_flags \

View File

@ -428,6 +428,15 @@ int kvm_pmu_set_msr(struct kvm_vcpu *vcpu, struct msr_data *msr_info)
return 1;
}
int kvm_pmu_check_pmc(struct kvm_vcpu *vcpu, unsigned pmc)
{
struct kvm_pmu *pmu = &vcpu->arch.pmu;
bool fixed = pmc & (1u << 30);
pmc &= ~(3u << 30);
return (!fixed && pmc >= pmu->nr_arch_gp_counters) ||
(fixed && pmc >= pmu->nr_arch_fixed_counters);
}
int kvm_pmu_read_pmc(struct kvm_vcpu *vcpu, unsigned pmc, u64 *data)
{
struct kvm_pmu *pmu = &vcpu->arch.pmu;

View File

@ -486,14 +486,14 @@ static int is_external_interrupt(u32 info)
return info == (SVM_EVTINJ_VALID | SVM_EVTINJ_TYPE_INTR);
}
static u32 svm_get_interrupt_shadow(struct kvm_vcpu *vcpu, int mask)
static u32 svm_get_interrupt_shadow(struct kvm_vcpu *vcpu)
{
struct vcpu_svm *svm = to_svm(vcpu);
u32 ret = 0;
if (svm->vmcb->control.int_state & SVM_INTERRUPT_SHADOW_MASK)
ret |= KVM_X86_SHADOW_INT_STI | KVM_X86_SHADOW_INT_MOV_SS;
return ret & mask;
ret = KVM_X86_SHADOW_INT_STI | KVM_X86_SHADOW_INT_MOV_SS;
return ret;
}
static void svm_set_interrupt_shadow(struct kvm_vcpu *vcpu, int mask)
@ -1415,7 +1415,16 @@ static void svm_get_segment(struct kvm_vcpu *vcpu,
var->avl = (s->attrib >> SVM_SELECTOR_AVL_SHIFT) & 1;
var->l = (s->attrib >> SVM_SELECTOR_L_SHIFT) & 1;
var->db = (s->attrib >> SVM_SELECTOR_DB_SHIFT) & 1;
var->g = (s->attrib >> SVM_SELECTOR_G_SHIFT) & 1;
/*
* AMD CPUs circa 2014 track the G bit for all segments except CS.
* However, the SVM spec states that the G bit is not observed by the
* CPU, and some VMware virtual CPUs drop the G bit for all segments.
* So let's synthesize a legal G bit for all segments, this helps
* running KVM nested. It also helps cross-vendor migration, because
* Intel's vmentry has a check on the 'G' bit.
*/
var->g = s->limit > 0xfffff;
/*
* AMD's VMCB does not have an explicit unusable field, so emulate it
@ -1424,14 +1433,6 @@ static void svm_get_segment(struct kvm_vcpu *vcpu,
var->unusable = !var->present || (var->type == 0);
switch (seg) {
case VCPU_SREG_CS:
/*
* SVM always stores 0 for the 'G' bit in the CS selector in
* the VMCB on a VMEXIT. This hurts cross-vendor migration:
* Intel's VMENTRY has a check on the 'G' bit.
*/
var->g = s->limit > 0xfffff;
break;
case VCPU_SREG_TR:
/*
* Work around a bug where the busy flag in the tr selector
@ -2116,22 +2117,27 @@ static void nested_svm_unmap(struct page *page)
static int nested_svm_intercept_ioio(struct vcpu_svm *svm)
{
unsigned port;
u8 val, bit;
unsigned port, size, iopm_len;
u16 val, mask;
u8 start_bit;
u64 gpa;
if (!(svm->nested.intercept & (1ULL << INTERCEPT_IOIO_PROT)))
return NESTED_EXIT_HOST;
port = svm->vmcb->control.exit_info_1 >> 16;
size = (svm->vmcb->control.exit_info_1 & SVM_IOIO_SIZE_MASK) >>
SVM_IOIO_SIZE_SHIFT;
gpa = svm->nested.vmcb_iopm + (port / 8);
bit = port % 8;
val = 0;
start_bit = port % 8;
iopm_len = (start_bit + size > 8) ? 2 : 1;
mask = (0xf >> (4 - size)) << start_bit;
val = 0;
if (kvm_read_guest(svm->vcpu.kvm, gpa, &val, 1))
val &= (1 << bit);
if (kvm_read_guest(svm->vcpu.kvm, gpa, &val, iopm_len))
return NESTED_EXIT_DONE;
return val ? NESTED_EXIT_DONE : NESTED_EXIT_HOST;
return (val & mask) ? NESTED_EXIT_DONE : NESTED_EXIT_HOST;
}
static int nested_svm_exit_handled_msr(struct vcpu_svm *svm)
@ -4205,7 +4211,8 @@ static int svm_check_intercept(struct kvm_vcpu *vcpu,
if (info->intercept == x86_intercept_cr_write)
icpt_info.exit_code += info->modrm_reg;
if (icpt_info.exit_code != SVM_EXIT_WRITE_CR0)
if (icpt_info.exit_code != SVM_EXIT_WRITE_CR0 ||
info->intercept == x86_intercept_clts)
break;
intercept = svm->nested.intercept;
@ -4250,14 +4257,14 @@ static int svm_check_intercept(struct kvm_vcpu *vcpu,
u64 exit_info;
u32 bytes;
exit_info = (vcpu->arch.regs[VCPU_REGS_RDX] & 0xffff) << 16;
if (info->intercept == x86_intercept_in ||
info->intercept == x86_intercept_ins) {
exit_info |= SVM_IOIO_TYPE_MASK;
bytes = info->src_bytes;
} else {
exit_info = ((info->src_val & 0xffff) << 16) |
SVM_IOIO_TYPE_MASK;
bytes = info->dst_bytes;
} else {
exit_info = (info->dst_val & 0xffff) << 16;
bytes = info->src_bytes;
}
if (info->intercept == x86_intercept_outs ||

View File

@ -721,10 +721,10 @@ TRACE_EVENT(kvm_emulate_insn,
),
TP_fast_assign(
__entry->rip = vcpu->arch.emulate_ctxt.fetch.start;
__entry->csbase = kvm_x86_ops->get_segment_base(vcpu, VCPU_SREG_CS);
__entry->len = vcpu->arch.emulate_ctxt._eip
- vcpu->arch.emulate_ctxt.fetch.start;
__entry->len = vcpu->arch.emulate_ctxt.fetch.ptr
- vcpu->arch.emulate_ctxt.fetch.data;
__entry->rip = vcpu->arch.emulate_ctxt._eip - __entry->len;
memcpy(__entry->insn,
vcpu->arch.emulate_ctxt.fetch.data,
15);

View File

@ -383,6 +383,9 @@ struct nested_vmx {
struct hrtimer preemption_timer;
bool preemption_timer_expired;
/* to migrate it to L2 if VM_ENTRY_LOAD_DEBUG_CONTROLS is off */
u64 vmcs01_debugctl;
};
#define POSTED_INTR_ON 0
@ -740,7 +743,6 @@ static u32 vmx_segment_access_rights(struct kvm_segment *var);
static void vmx_sync_pir_to_irr_dummy(struct kvm_vcpu *vcpu);
static void copy_vmcs12_to_shadow(struct vcpu_vmx *vmx);
static void copy_shadow_to_vmcs12(struct vcpu_vmx *vmx);
static bool vmx_mpx_supported(void);
static DEFINE_PER_CPU(struct vmcs *, vmxarea);
static DEFINE_PER_CPU(struct vmcs *, current_vmcs);
@ -820,7 +822,6 @@ static const u32 vmx_msr_index[] = {
#endif
MSR_EFER, MSR_TSC_AUX, MSR_STAR,
};
#define NR_VMX_MSR ARRAY_SIZE(vmx_msr_index)
static inline bool is_page_fault(u32 intr_info)
{
@ -1940,7 +1941,7 @@ static void vmx_set_rflags(struct kvm_vcpu *vcpu, unsigned long rflags)
vmcs_writel(GUEST_RFLAGS, rflags);
}
static u32 vmx_get_interrupt_shadow(struct kvm_vcpu *vcpu, int mask)
static u32 vmx_get_interrupt_shadow(struct kvm_vcpu *vcpu)
{
u32 interruptibility = vmcs_read32(GUEST_INTERRUPTIBILITY_INFO);
int ret = 0;
@ -1950,7 +1951,7 @@ static u32 vmx_get_interrupt_shadow(struct kvm_vcpu *vcpu, int mask)
if (interruptibility & GUEST_INTR_STATE_MOV_SS)
ret |= KVM_X86_SHADOW_INT_MOV_SS;
return ret & mask;
return ret;
}
static void vmx_set_interrupt_shadow(struct kvm_vcpu *vcpu, int mask)
@ -2239,10 +2240,13 @@ static inline bool nested_vmx_allowed(struct kvm_vcpu *vcpu)
* or other means.
*/
static u32 nested_vmx_procbased_ctls_low, nested_vmx_procbased_ctls_high;
static u32 nested_vmx_true_procbased_ctls_low;
static u32 nested_vmx_secondary_ctls_low, nested_vmx_secondary_ctls_high;
static u32 nested_vmx_pinbased_ctls_low, nested_vmx_pinbased_ctls_high;
static u32 nested_vmx_exit_ctls_low, nested_vmx_exit_ctls_high;
static u32 nested_vmx_true_exit_ctls_low;
static u32 nested_vmx_entry_ctls_low, nested_vmx_entry_ctls_high;
static u32 nested_vmx_true_entry_ctls_low;
static u32 nested_vmx_misc_low, nested_vmx_misc_high;
static u32 nested_vmx_ept_caps;
static __init void nested_vmx_setup_ctls_msrs(void)
@ -2265,21 +2269,13 @@ static __init void nested_vmx_setup_ctls_msrs(void)
/* pin-based controls */
rdmsr(MSR_IA32_VMX_PINBASED_CTLS,
nested_vmx_pinbased_ctls_low, nested_vmx_pinbased_ctls_high);
/*
* According to the Intel spec, if bit 55 of VMX_BASIC is off (as it is
* in our case), bits 1, 2 and 4 (i.e., 0x16) must be 1 in this MSR.
*/
nested_vmx_pinbased_ctls_low |= PIN_BASED_ALWAYSON_WITHOUT_TRUE_MSR;
nested_vmx_pinbased_ctls_high &= PIN_BASED_EXT_INTR_MASK |
PIN_BASED_NMI_EXITING | PIN_BASED_VIRTUAL_NMIS;
nested_vmx_pinbased_ctls_high |= PIN_BASED_ALWAYSON_WITHOUT_TRUE_MSR |
PIN_BASED_VMX_PREEMPTION_TIMER;
/*
* Exit controls
* If bit 55 of VMX_BASIC is off, bits 0-8 and 10, 11, 13, 14, 16 and
* 17 must be 1.
*/
/* exit controls */
rdmsr(MSR_IA32_VMX_EXIT_CTLS,
nested_vmx_exit_ctls_low, nested_vmx_exit_ctls_high);
nested_vmx_exit_ctls_low = VM_EXIT_ALWAYSON_WITHOUT_TRUE_MSR;
@ -2296,10 +2292,13 @@ static __init void nested_vmx_setup_ctls_msrs(void)
if (vmx_mpx_supported())
nested_vmx_exit_ctls_high |= VM_EXIT_CLEAR_BNDCFGS;
/* We support free control of debug control saving. */
nested_vmx_true_exit_ctls_low = nested_vmx_exit_ctls_low &
~VM_EXIT_SAVE_DEBUG_CONTROLS;
/* entry controls */
rdmsr(MSR_IA32_VMX_ENTRY_CTLS,
nested_vmx_entry_ctls_low, nested_vmx_entry_ctls_high);
/* If bit 55 of VMX_BASIC is off, bits 0-8 and 12 must be 1. */
nested_vmx_entry_ctls_low = VM_ENTRY_ALWAYSON_WITHOUT_TRUE_MSR;
nested_vmx_entry_ctls_high &=
#ifdef CONFIG_X86_64
@ -2311,10 +2310,14 @@ static __init void nested_vmx_setup_ctls_msrs(void)
if (vmx_mpx_supported())
nested_vmx_entry_ctls_high |= VM_ENTRY_LOAD_BNDCFGS;
/* We support free control of debug control loading. */
nested_vmx_true_entry_ctls_low = nested_vmx_entry_ctls_low &
~VM_ENTRY_LOAD_DEBUG_CONTROLS;
/* cpu-based controls */
rdmsr(MSR_IA32_VMX_PROCBASED_CTLS,
nested_vmx_procbased_ctls_low, nested_vmx_procbased_ctls_high);
nested_vmx_procbased_ctls_low = 0;
nested_vmx_procbased_ctls_low = CPU_BASED_ALWAYSON_WITHOUT_TRUE_MSR;
nested_vmx_procbased_ctls_high &=
CPU_BASED_VIRTUAL_INTR_PENDING |
CPU_BASED_VIRTUAL_NMI_PENDING | CPU_BASED_USE_TSC_OFFSETING |
@ -2335,7 +2338,12 @@ static __init void nested_vmx_setup_ctls_msrs(void)
* can use it to avoid exits to L1 - even when L0 runs L2
* without MSR bitmaps.
*/
nested_vmx_procbased_ctls_high |= CPU_BASED_USE_MSR_BITMAPS;
nested_vmx_procbased_ctls_high |= CPU_BASED_ALWAYSON_WITHOUT_TRUE_MSR |
CPU_BASED_USE_MSR_BITMAPS;
/* We support free control of CR3 access interception. */
nested_vmx_true_procbased_ctls_low = nested_vmx_procbased_ctls_low &
~(CPU_BASED_CR3_LOAD_EXITING | CPU_BASED_CR3_STORE_EXITING);
/* secondary cpu-based controls */
rdmsr(MSR_IA32_VMX_PROCBASED_CTLS2,
@ -2394,7 +2402,7 @@ static int vmx_get_vmx_msr(struct kvm_vcpu *vcpu, u32 msr_index, u64 *pdata)
* guest, and the VMCS structure we give it - not about the
* VMX support of the underlying hardware.
*/
*pdata = VMCS12_REVISION |
*pdata = VMCS12_REVISION | VMX_BASIC_TRUE_CTLS |
((u64)VMCS12_SIZE << VMX_BASIC_VMCS_SIZE_SHIFT) |
(VMX_BASIC_MEM_TYPE_WB << VMX_BASIC_MEM_TYPE_SHIFT);
break;
@ -2404,16 +2412,25 @@ static int vmx_get_vmx_msr(struct kvm_vcpu *vcpu, u32 msr_index, u64 *pdata)
nested_vmx_pinbased_ctls_high);
break;
case MSR_IA32_VMX_TRUE_PROCBASED_CTLS:
*pdata = vmx_control_msr(nested_vmx_true_procbased_ctls_low,
nested_vmx_procbased_ctls_high);
break;
case MSR_IA32_VMX_PROCBASED_CTLS:
*pdata = vmx_control_msr(nested_vmx_procbased_ctls_low,
nested_vmx_procbased_ctls_high);
break;
case MSR_IA32_VMX_TRUE_EXIT_CTLS:
*pdata = vmx_control_msr(nested_vmx_true_exit_ctls_low,
nested_vmx_exit_ctls_high);
break;
case MSR_IA32_VMX_EXIT_CTLS:
*pdata = vmx_control_msr(nested_vmx_exit_ctls_low,
nested_vmx_exit_ctls_high);
break;
case MSR_IA32_VMX_TRUE_ENTRY_CTLS:
*pdata = vmx_control_msr(nested_vmx_true_entry_ctls_low,
nested_vmx_entry_ctls_high);
break;
case MSR_IA32_VMX_ENTRY_CTLS:
*pdata = vmx_control_msr(nested_vmx_entry_ctls_low,
nested_vmx_entry_ctls_high);
@ -2442,7 +2459,7 @@ static int vmx_get_vmx_msr(struct kvm_vcpu *vcpu, u32 msr_index, u64 *pdata)
*pdata = -1ULL;
break;
case MSR_IA32_VMX_VMCS_ENUM:
*pdata = 0x1f;
*pdata = 0x2e; /* highest index: VMX_PREEMPTION_TIMER_VALUE */
break;
case MSR_IA32_VMX_PROCBASED_CTLS2:
*pdata = vmx_control_msr(nested_vmx_secondary_ctls_low,
@ -3653,7 +3670,7 @@ static void vmx_set_segment(struct kvm_vcpu *vcpu,
vmcs_write32(sf->ar_bytes, vmx_segment_access_rights(var));
out:
vmx->emulation_required |= emulation_required(vcpu);
vmx->emulation_required = emulation_required(vcpu);
}
static void vmx_get_cs_db_l_bits(struct kvm_vcpu *vcpu, int *db, int *l)
@ -4422,7 +4439,7 @@ static int vmx_vcpu_setup(struct vcpu_vmx *vmx)
vmx->vcpu.arch.pat = host_pat;
}
for (i = 0; i < NR_VMX_MSR; ++i) {
for (i = 0; i < ARRAY_SIZE(vmx_msr_index); ++i) {
u32 index = vmx_msr_index[i];
u32 data_low, data_high;
int j = vmx->nmsrs;
@ -4873,7 +4890,7 @@ static int handle_exception(struct kvm_vcpu *vcpu)
if (!(vcpu->guest_debug &
(KVM_GUESTDBG_SINGLESTEP | KVM_GUESTDBG_USE_HW_BP))) {
vcpu->arch.dr6 &= ~15;
vcpu->arch.dr6 |= dr6;
vcpu->arch.dr6 |= dr6 | DR6_RTM;
if (!(dr6 & ~DR6_RESERVED)) /* icebp */
skip_emulated_instruction(vcpu);
@ -5039,7 +5056,7 @@ static int handle_cr(struct kvm_vcpu *vcpu)
reg = (exit_qualification >> 8) & 15;
switch ((exit_qualification >> 4) & 3) {
case 0: /* mov to cr */
val = kvm_register_read(vcpu, reg);
val = kvm_register_readl(vcpu, reg);
trace_kvm_cr_write(cr, val);
switch (cr) {
case 0:
@ -5056,7 +5073,7 @@ static int handle_cr(struct kvm_vcpu *vcpu)
return 1;
case 8: {
u8 cr8_prev = kvm_get_cr8(vcpu);
u8 cr8 = kvm_register_read(vcpu, reg);
u8 cr8 = (u8)val;
err = kvm_set_cr8(vcpu, cr8);
kvm_complete_insn_gp(vcpu, err);
if (irqchip_in_kernel(vcpu->kvm))
@ -5132,7 +5149,7 @@ static int handle_dr(struct kvm_vcpu *vcpu)
return 0;
} else {
vcpu->arch.dr7 &= ~DR7_GD;
vcpu->arch.dr6 |= DR6_BD;
vcpu->arch.dr6 |= DR6_BD | DR6_RTM;
vmcs_writel(GUEST_DR7, vcpu->arch.dr7);
kvm_queue_exception(vcpu, DB_VECTOR);
return 1;
@ -5165,7 +5182,7 @@ static int handle_dr(struct kvm_vcpu *vcpu)
return 1;
kvm_register_write(vcpu, reg, val);
} else
if (kvm_set_dr(vcpu, dr, kvm_register_read(vcpu, reg)))
if (kvm_set_dr(vcpu, dr, kvm_register_readl(vcpu, reg)))
return 1;
skip_emulated_instruction(vcpu);
@ -5621,7 +5638,7 @@ static int handle_invalid_guest_state(struct kvm_vcpu *vcpu)
cpu_exec_ctrl = vmcs_read32(CPU_BASED_VM_EXEC_CONTROL);
intr_window_requested = cpu_exec_ctrl & CPU_BASED_VIRTUAL_INTR_PENDING;
while (!guest_state_valid(vcpu) && count-- != 0) {
while (vmx->emulation_required && count-- != 0) {
if (intr_window_requested && vmx_interrupt_allowed(vcpu))
return handle_interrupt_window(&vmx->vcpu);
@ -5655,7 +5672,6 @@ static int handle_invalid_guest_state(struct kvm_vcpu *vcpu)
schedule();
}
vmx->emulation_required = emulation_required(vcpu);
out:
return ret;
}
@ -5754,22 +5770,27 @@ static void nested_free_vmcs02(struct vcpu_vmx *vmx, gpa_t vmptr)
/*
* Free all VMCSs saved for this vcpu, except the one pointed by
* vmx->loaded_vmcs. These include the VMCSs in vmcs02_pool (except the one
* currently used, if running L2), and vmcs01 when running L2.
* vmx->loaded_vmcs. We must be running L1, so vmx->loaded_vmcs
* must be &vmx->vmcs01.
*/
static void nested_free_all_saved_vmcss(struct vcpu_vmx *vmx)
{
struct vmcs02_list *item, *n;
WARN_ON(vmx->loaded_vmcs != &vmx->vmcs01);
list_for_each_entry_safe(item, n, &vmx->nested.vmcs02_pool, list) {
if (vmx->loaded_vmcs != &item->vmcs02)
free_loaded_vmcs(&item->vmcs02);
/*
* Something will leak if the above WARN triggers. Better than
* a use-after-free.
*/
if (vmx->loaded_vmcs == &item->vmcs02)
continue;
free_loaded_vmcs(&item->vmcs02);
list_del(&item->list);
kfree(item);
vmx->nested.vmcs02_num--;
}
vmx->nested.vmcs02_num = 0;
if (vmx->loaded_vmcs != &vmx->vmcs01)
free_loaded_vmcs(&vmx->vmcs01);
}
/*
@ -5918,7 +5939,7 @@ static int nested_vmx_check_vmptr(struct kvm_vcpu *vcpu, int exit_reason,
* which replaces physical address width with 32
*
*/
if (!IS_ALIGNED(vmptr, PAGE_SIZE) || (vmptr >> maxphyaddr)) {
if (!PAGE_ALIGNED(vmptr) || (vmptr >> maxphyaddr)) {
nested_vmx_failInvalid(vcpu);
skip_emulated_instruction(vcpu);
return 1;
@ -5936,7 +5957,7 @@ static int nested_vmx_check_vmptr(struct kvm_vcpu *vcpu, int exit_reason,
vmx->nested.vmxon_ptr = vmptr;
break;
case EXIT_REASON_VMCLEAR:
if (!IS_ALIGNED(vmptr, PAGE_SIZE) || (vmptr >> maxphyaddr)) {
if (!PAGE_ALIGNED(vmptr) || (vmptr >> maxphyaddr)) {
nested_vmx_failValid(vcpu,
VMXERR_VMCLEAR_INVALID_ADDRESS);
skip_emulated_instruction(vcpu);
@ -5951,7 +5972,7 @@ static int nested_vmx_check_vmptr(struct kvm_vcpu *vcpu, int exit_reason,
}
break;
case EXIT_REASON_VMPTRLD:
if (!IS_ALIGNED(vmptr, PAGE_SIZE) || (vmptr >> maxphyaddr)) {
if (!PAGE_ALIGNED(vmptr) || (vmptr >> maxphyaddr)) {
nested_vmx_failValid(vcpu,
VMXERR_VMPTRLD_INVALID_ADDRESS);
skip_emulated_instruction(vcpu);
@ -6086,20 +6107,27 @@ static int nested_vmx_check_permission(struct kvm_vcpu *vcpu)
static inline void nested_release_vmcs12(struct vcpu_vmx *vmx)
{
u32 exec_control;
if (vmx->nested.current_vmptr == -1ull)
return;
/* current_vmptr and current_vmcs12 are always set/reset together */
if (WARN_ON(vmx->nested.current_vmcs12 == NULL))
return;
if (enable_shadow_vmcs) {
if (vmx->nested.current_vmcs12 != NULL) {
/* copy to memory all shadowed fields in case
they were modified */
copy_shadow_to_vmcs12(vmx);
vmx->nested.sync_shadow_vmcs = false;
exec_control = vmcs_read32(SECONDARY_VM_EXEC_CONTROL);
exec_control &= ~SECONDARY_EXEC_SHADOW_VMCS;
vmcs_write32(SECONDARY_VM_EXEC_CONTROL, exec_control);
vmcs_write64(VMCS_LINK_POINTER, -1ull);
}
/* copy to memory all shadowed fields in case
they were modified */
copy_shadow_to_vmcs12(vmx);
vmx->nested.sync_shadow_vmcs = false;
exec_control = vmcs_read32(SECONDARY_VM_EXEC_CONTROL);
exec_control &= ~SECONDARY_EXEC_SHADOW_VMCS;
vmcs_write32(SECONDARY_VM_EXEC_CONTROL, exec_control);
vmcs_write64(VMCS_LINK_POINTER, -1ull);
}
kunmap(vmx->nested.current_vmcs12_page);
nested_release_page(vmx->nested.current_vmcs12_page);
vmx->nested.current_vmptr = -1ull;
vmx->nested.current_vmcs12 = NULL;
}
/*
@ -6110,12 +6138,9 @@ static void free_nested(struct vcpu_vmx *vmx)
{
if (!vmx->nested.vmxon)
return;
vmx->nested.vmxon = false;
if (vmx->nested.current_vmptr != -1ull) {
nested_release_vmcs12(vmx);
vmx->nested.current_vmptr = -1ull;
vmx->nested.current_vmcs12 = NULL;
}
nested_release_vmcs12(vmx);
if (enable_shadow_vmcs)
free_vmcs(vmx->nested.current_shadow_vmcs);
/* Unpin physical memory we referred to in current vmcs02 */
@ -6152,11 +6177,8 @@ static int handle_vmclear(struct kvm_vcpu *vcpu)
if (nested_vmx_check_vmptr(vcpu, EXIT_REASON_VMCLEAR, &vmptr))
return 1;
if (vmptr == vmx->nested.current_vmptr) {
if (vmptr == vmx->nested.current_vmptr)
nested_release_vmcs12(vmx);
vmx->nested.current_vmptr = -1ull;
vmx->nested.current_vmcs12 = NULL;
}
page = nested_get_page(vcpu, vmptr);
if (page == NULL) {
@ -6384,7 +6406,7 @@ static int handle_vmread(struct kvm_vcpu *vcpu)
return 1;
/* Decode instruction info and find the field to read */
field = kvm_register_read(vcpu, (((vmx_instruction_info) >> 28) & 0xf));
field = kvm_register_readl(vcpu, (((vmx_instruction_info) >> 28) & 0xf));
/* Read the field, zero-extended to a u64 field_value */
if (!vmcs12_read_any(vcpu, field, &field_value)) {
nested_vmx_failValid(vcpu, VMXERR_UNSUPPORTED_VMCS_COMPONENT);
@ -6397,7 +6419,7 @@ static int handle_vmread(struct kvm_vcpu *vcpu)
* on the guest's mode (32 or 64 bit), not on the given field's length.
*/
if (vmx_instruction_info & (1u << 10)) {
kvm_register_write(vcpu, (((vmx_instruction_info) >> 3) & 0xf),
kvm_register_writel(vcpu, (((vmx_instruction_info) >> 3) & 0xf),
field_value);
} else {
if (get_vmx_mem_address(vcpu, exit_qualification,
@ -6434,21 +6456,21 @@ static int handle_vmwrite(struct kvm_vcpu *vcpu)
return 1;
if (vmx_instruction_info & (1u << 10))
field_value = kvm_register_read(vcpu,
field_value = kvm_register_readl(vcpu,
(((vmx_instruction_info) >> 3) & 0xf));
else {
if (get_vmx_mem_address(vcpu, exit_qualification,
vmx_instruction_info, &gva))
return 1;
if (kvm_read_guest_virt(&vcpu->arch.emulate_ctxt, gva,
&field_value, (is_long_mode(vcpu) ? 8 : 4), &e)) {
&field_value, (is_64_bit_mode(vcpu) ? 8 : 4), &e)) {
kvm_inject_page_fault(vcpu, &e);
return 1;
}
}
field = kvm_register_read(vcpu, (((vmx_instruction_info) >> 28) & 0xf));
field = kvm_register_readl(vcpu, (((vmx_instruction_info) >> 28) & 0xf));
if (vmcs_field_readonly(field)) {
nested_vmx_failValid(vcpu,
VMXERR_VMWRITE_READ_ONLY_VMCS_COMPONENT);
@ -6498,9 +6520,8 @@ static int handle_vmptrld(struct kvm_vcpu *vcpu)
skip_emulated_instruction(vcpu);
return 1;
}
if (vmx->nested.current_vmptr != -1ull)
nested_release_vmcs12(vmx);
nested_release_vmcs12(vmx);
vmx->nested.current_vmptr = vmptr;
vmx->nested.current_vmcs12 = new_vmcs12;
vmx->nested.current_vmcs12_page = page;
@ -6571,7 +6592,7 @@ static int handle_invept(struct kvm_vcpu *vcpu)
}
vmx_instruction_info = vmcs_read32(VMX_INSTRUCTION_INFO);
type = kvm_register_read(vcpu, (vmx_instruction_info >> 28) & 0xf);
type = kvm_register_readl(vcpu, (vmx_instruction_info >> 28) & 0xf);
types = (nested_vmx_ept_caps >> VMX_EPT_EXTENT_SHIFT) & 6;
@ -6751,7 +6772,7 @@ static bool nested_vmx_exit_handled_cr(struct kvm_vcpu *vcpu,
unsigned long exit_qualification = vmcs_readl(EXIT_QUALIFICATION);
int cr = exit_qualification & 15;
int reg = (exit_qualification >> 8) & 15;
unsigned long val = kvm_register_read(vcpu, reg);
unsigned long val = kvm_register_readl(vcpu, reg);
switch ((exit_qualification >> 4) & 3) {
case 0: /* mov to cr */
@ -7112,7 +7133,26 @@ static void vmx_hwapic_irr_update(struct kvm_vcpu *vcpu, int max_irr)
if (max_irr == -1)
return;
vmx_set_rvi(max_irr);
/*
* If a vmexit is needed, vmx_check_nested_events handles it.
*/
if (is_guest_mode(vcpu) && nested_exit_on_intr(vcpu))
return;
if (!is_guest_mode(vcpu)) {
vmx_set_rvi(max_irr);
return;
}
/*
* Fall back to pre-APICv interrupt injection since L2
* is run without virtual interrupt delivery.
*/
if (!kvm_event_needs_reinjection(vcpu) &&
vmx_interrupt_allowed(vcpu)) {
kvm_queue_interrupt(vcpu, max_irr, false);
vmx_inject_irq(vcpu);
}
}
static void vmx_load_eoi_exitmap(struct kvm_vcpu *vcpu, u64 *eoi_exit_bitmap)
@ -7520,13 +7560,31 @@ static void __noclone vmx_vcpu_run(struct kvm_vcpu *vcpu)
vmx_complete_interrupts(vmx);
}
static void vmx_load_vmcs01(struct kvm_vcpu *vcpu)
{
struct vcpu_vmx *vmx = to_vmx(vcpu);
int cpu;
if (vmx->loaded_vmcs == &vmx->vmcs01)
return;
cpu = get_cpu();
vmx->loaded_vmcs = &vmx->vmcs01;
vmx_vcpu_put(vcpu);
vmx_vcpu_load(vcpu, cpu);
vcpu->cpu = cpu;
put_cpu();
}
static void vmx_free_vcpu(struct kvm_vcpu *vcpu)
{
struct vcpu_vmx *vmx = to_vmx(vcpu);
free_vpid(vmx);
free_loaded_vmcs(vmx->loaded_vmcs);
leave_guest_mode(vcpu);
vmx_load_vmcs01(vcpu);
free_nested(vmx);
free_loaded_vmcs(vmx->loaded_vmcs);
kfree(vmx->guest_msrs);
kvm_vcpu_uninit(vcpu);
kmem_cache_free(kvm_vcpu_cache, vmx);
@ -7548,6 +7606,9 @@ static struct kvm_vcpu *vmx_create_vcpu(struct kvm *kvm, unsigned int id)
goto free_vcpu;
vmx->guest_msrs = kmalloc(PAGE_SIZE, GFP_KERNEL);
BUILD_BUG_ON(ARRAY_SIZE(vmx_msr_index) * sizeof(vmx->guest_msrs[0])
> PAGE_SIZE);
err = -ENOMEM;
if (!vmx->guest_msrs) {
goto uninit_vcpu;
@ -7836,7 +7897,13 @@ static void prepare_vmcs02(struct kvm_vcpu *vcpu, struct vmcs12 *vmcs12)
vmcs_writel(GUEST_GDTR_BASE, vmcs12->guest_gdtr_base);
vmcs_writel(GUEST_IDTR_BASE, vmcs12->guest_idtr_base);
vmcs_write64(GUEST_IA32_DEBUGCTL, vmcs12->guest_ia32_debugctl);
if (vmcs12->vm_entry_controls & VM_ENTRY_LOAD_DEBUG_CONTROLS) {
kvm_set_dr(vcpu, 7, vmcs12->guest_dr7);
vmcs_write64(GUEST_IA32_DEBUGCTL, vmcs12->guest_ia32_debugctl);
} else {
kvm_set_dr(vcpu, 7, vcpu->arch.dr7);
vmcs_write64(GUEST_IA32_DEBUGCTL, vmx->nested.vmcs01_debugctl);
}
vmcs_write32(VM_ENTRY_INTR_INFO_FIELD,
vmcs12->vm_entry_intr_info_field);
vmcs_write32(VM_ENTRY_EXCEPTION_ERROR_CODE,
@ -7846,7 +7913,6 @@ static void prepare_vmcs02(struct kvm_vcpu *vcpu, struct vmcs12 *vmcs12)
vmcs_write32(GUEST_INTERRUPTIBILITY_INFO,
vmcs12->guest_interruptibility_info);
vmcs_write32(GUEST_SYSENTER_CS, vmcs12->guest_sysenter_cs);
kvm_set_dr(vcpu, 7, vmcs12->guest_dr7);
vmx_set_rflags(vcpu, vmcs12->guest_rflags);
vmcs_writel(GUEST_PENDING_DBG_EXCEPTIONS,
vmcs12->guest_pending_dbg_exceptions);
@ -8113,14 +8179,14 @@ static int nested_vmx_run(struct kvm_vcpu *vcpu, bool launch)
}
if ((vmcs12->cpu_based_vm_exec_control & CPU_BASED_USE_MSR_BITMAPS) &&
!IS_ALIGNED(vmcs12->msr_bitmap, PAGE_SIZE)) {
!PAGE_ALIGNED(vmcs12->msr_bitmap)) {
/*TODO: Also verify bits beyond physical address width are 0*/
nested_vmx_failValid(vcpu, VMXERR_ENTRY_INVALID_CONTROL_FIELD);
return 1;
}
if (nested_cpu_has2(vmcs12, SECONDARY_EXEC_VIRTUALIZE_APIC_ACCESSES) &&
!IS_ALIGNED(vmcs12->apic_access_addr, PAGE_SIZE)) {
!PAGE_ALIGNED(vmcs12->apic_access_addr)) {
/*TODO: Also verify bits beyond physical address width are 0*/
nested_vmx_failValid(vcpu, VMXERR_ENTRY_INVALID_CONTROL_FIELD);
return 1;
@ -8136,15 +8202,18 @@ static int nested_vmx_run(struct kvm_vcpu *vcpu, bool launch)
}
if (!vmx_control_verify(vmcs12->cpu_based_vm_exec_control,
nested_vmx_procbased_ctls_low, nested_vmx_procbased_ctls_high) ||
nested_vmx_true_procbased_ctls_low,
nested_vmx_procbased_ctls_high) ||
!vmx_control_verify(vmcs12->secondary_vm_exec_control,
nested_vmx_secondary_ctls_low, nested_vmx_secondary_ctls_high) ||
!vmx_control_verify(vmcs12->pin_based_vm_exec_control,
nested_vmx_pinbased_ctls_low, nested_vmx_pinbased_ctls_high) ||
!vmx_control_verify(vmcs12->vm_exit_controls,
nested_vmx_exit_ctls_low, nested_vmx_exit_ctls_high) ||
nested_vmx_true_exit_ctls_low,
nested_vmx_exit_ctls_high) ||
!vmx_control_verify(vmcs12->vm_entry_controls,
nested_vmx_entry_ctls_low, nested_vmx_entry_ctls_high))
nested_vmx_true_entry_ctls_low,
nested_vmx_entry_ctls_high))
{
nested_vmx_failValid(vcpu, VMXERR_ENTRY_INVALID_CONTROL_FIELD);
return 1;
@ -8221,6 +8290,9 @@ static int nested_vmx_run(struct kvm_vcpu *vcpu, bool launch)
vmx->nested.vmcs01_tsc_offset = vmcs_read64(TSC_OFFSET);
if (!(vmcs12->vm_entry_controls & VM_ENTRY_LOAD_DEBUG_CONTROLS))
vmx->nested.vmcs01_debugctl = vmcs_read64(GUEST_IA32_DEBUGCTL);
cpu = get_cpu();
vmx->loaded_vmcs = vmcs02;
vmx_vcpu_put(vcpu);
@ -8398,7 +8470,6 @@ static void prepare_vmcs12(struct kvm_vcpu *vcpu, struct vmcs12 *vmcs12,
vmcs12->guest_cr0 = vmcs12_guest_cr0(vcpu, vmcs12);
vmcs12->guest_cr4 = vmcs12_guest_cr4(vcpu, vmcs12);
kvm_get_dr(vcpu, 7, (unsigned long *)&vmcs12->guest_dr7);
vmcs12->guest_rsp = kvm_register_read(vcpu, VCPU_REGS_RSP);
vmcs12->guest_rip = kvm_register_read(vcpu, VCPU_REGS_RIP);
vmcs12->guest_rflags = vmcs_readl(GUEST_RFLAGS);
@ -8477,9 +8548,13 @@ static void prepare_vmcs12(struct kvm_vcpu *vcpu, struct vmcs12 *vmcs12,
(vmcs12->vm_entry_controls & ~VM_ENTRY_IA32E_MODE) |
(vm_entry_controls_get(to_vmx(vcpu)) & VM_ENTRY_IA32E_MODE);
if (vmcs12->vm_exit_controls & VM_EXIT_SAVE_DEBUG_CONTROLS) {
kvm_get_dr(vcpu, 7, (unsigned long *)&vmcs12->guest_dr7);
vmcs12->guest_ia32_debugctl = vmcs_read64(GUEST_IA32_DEBUGCTL);
}
/* TODO: These cannot have changed unless we have MSR bitmaps and
* the relevant bit asks not to trap the change */
vmcs12->guest_ia32_debugctl = vmcs_read64(GUEST_IA32_DEBUGCTL);
if (vmcs12->vm_exit_controls & VM_EXIT_SAVE_IA32_PAT)
vmcs12->guest_ia32_pat = vmcs_read64(GUEST_IA32_PAT);
if (vmcs12->vm_exit_controls & VM_EXIT_SAVE_IA32_EFER)
@ -8670,7 +8745,6 @@ static void nested_vmx_vmexit(struct kvm_vcpu *vcpu, u32 exit_reason,
unsigned long exit_qualification)
{
struct vcpu_vmx *vmx = to_vmx(vcpu);
int cpu;
struct vmcs12 *vmcs12 = get_vmcs12(vcpu);
/* trying to cancel vmlaunch/vmresume is a bug */
@ -8695,12 +8769,7 @@ static void nested_vmx_vmexit(struct kvm_vcpu *vcpu, u32 exit_reason,
vmcs12->vm_exit_intr_error_code,
KVM_ISA_VMX);
cpu = get_cpu();
vmx->loaded_vmcs = &vmx->vmcs01;
vmx_vcpu_put(vcpu);
vmx_vcpu_load(vcpu, cpu);
vcpu->cpu = cpu;
put_cpu();
vmx_load_vmcs01(vcpu);
vm_entry_controls_init(vmx, vmcs_read32(VM_ENTRY_CONTROLS));
vm_exit_controls_init(vmx, vmcs_read32(VM_EXIT_CONTROLS));
@ -8890,7 +8959,7 @@ static int __init vmx_init(void)
rdmsrl_safe(MSR_EFER, &host_efer);
for (i = 0; i < NR_VMX_MSR; ++i)
for (i = 0; i < ARRAY_SIZE(vmx_msr_index); ++i)
kvm_define_shared_msr(i, vmx_msr_index[i]);
vmx_io_bitmap_a = (unsigned long *)__get_free_page(GFP_KERNEL);

View File

@ -87,6 +87,7 @@ static u64 __read_mostly efer_reserved_bits = ~((u64)EFER_SCE);
static void update_cr8_intercept(struct kvm_vcpu *vcpu);
static void process_nmi(struct kvm_vcpu *vcpu);
static void __kvm_set_rflags(struct kvm_vcpu *vcpu, unsigned long rflags);
struct kvm_x86_ops *kvm_x86_ops;
EXPORT_SYMBOL_GPL(kvm_x86_ops);
@ -211,6 +212,7 @@ static void shared_msr_update(unsigned slot, u32 msr)
void kvm_define_shared_msr(unsigned slot, u32 msr)
{
BUG_ON(slot >= KVM_NR_SHARED_MSRS);
if (slot >= shared_msrs_global.nr)
shared_msrs_global.nr = slot + 1;
shared_msrs_global.msrs[slot] = msr;
@ -310,6 +312,31 @@ static int exception_class(int vector)
return EXCPT_BENIGN;
}
#define EXCPT_FAULT 0
#define EXCPT_TRAP 1
#define EXCPT_ABORT 2
#define EXCPT_INTERRUPT 3
static int exception_type(int vector)
{
unsigned int mask;
if (WARN_ON(vector > 31 || vector == NMI_VECTOR))
return EXCPT_INTERRUPT;
mask = 1 << vector;
/* #DB is trap, as instruction watchpoints are handled elsewhere */
if (mask & ((1 << DB_VECTOR) | (1 << BP_VECTOR) | (1 << OF_VECTOR)))
return EXCPT_TRAP;
if (mask & ((1 << DF_VECTOR) | (1 << MC_VECTOR)))
return EXCPT_ABORT;
/* Reserved exceptions will result in fault */
return EXCPT_FAULT;
}
static void kvm_multiple_exception(struct kvm_vcpu *vcpu,
unsigned nr, bool has_error, u32 error_code,
bool reinject)
@ -758,6 +785,15 @@ static void kvm_update_dr7(struct kvm_vcpu *vcpu)
vcpu->arch.switch_db_regs |= KVM_DEBUGREG_BP_ENABLED;
}
static u64 kvm_dr6_fixed(struct kvm_vcpu *vcpu)
{
u64 fixed = DR6_FIXED_1;
if (!guest_cpuid_has_rtm(vcpu))
fixed |= DR6_RTM;
return fixed;
}
static int __kvm_set_dr(struct kvm_vcpu *vcpu, int dr, unsigned long val)
{
switch (dr) {
@ -773,7 +809,7 @@ static int __kvm_set_dr(struct kvm_vcpu *vcpu, int dr, unsigned long val)
case 6:
if (val & 0xffffffff00000000ULL)
return -1; /* #GP */
vcpu->arch.dr6 = (val & DR6_VOLATILE) | DR6_FIXED_1;
vcpu->arch.dr6 = (val & DR6_VOLATILE) | kvm_dr6_fixed(vcpu);
kvm_update_dr6(vcpu);
break;
case 5:
@ -1215,6 +1251,7 @@ void kvm_write_tsc(struct kvm_vcpu *vcpu, struct msr_data *msr)
unsigned long flags;
s64 usdiff;
bool matched;
bool already_matched;
u64 data = msr->data;
raw_spin_lock_irqsave(&kvm->arch.tsc_write_lock, flags);
@ -1279,6 +1316,7 @@ void kvm_write_tsc(struct kvm_vcpu *vcpu, struct msr_data *msr)
pr_debug("kvm: adjusted tsc offset by %llu\n", delta);
}
matched = true;
already_matched = (vcpu->arch.this_tsc_generation == kvm->arch.cur_tsc_generation);
} else {
/*
* We split periods of matched TSC writes into generations.
@ -1294,7 +1332,7 @@ void kvm_write_tsc(struct kvm_vcpu *vcpu, struct msr_data *msr)
kvm->arch.cur_tsc_write = data;
kvm->arch.cur_tsc_offset = offset;
matched = false;
pr_debug("kvm: new tsc generation %u, clock %llu\n",
pr_debug("kvm: new tsc generation %llu, clock %llu\n",
kvm->arch.cur_tsc_generation, data);
}
@ -1319,10 +1357,11 @@ void kvm_write_tsc(struct kvm_vcpu *vcpu, struct msr_data *msr)
raw_spin_unlock_irqrestore(&kvm->arch.tsc_write_lock, flags);
spin_lock(&kvm->arch.pvclock_gtod_sync_lock);
if (matched)
kvm->arch.nr_vcpus_matched_tsc++;
else
if (!matched) {
kvm->arch.nr_vcpus_matched_tsc = 0;
} else if (!already_matched) {
kvm->arch.nr_vcpus_matched_tsc++;
}
kvm_track_tsc_matching(vcpu);
spin_unlock(&kvm->arch.pvclock_gtod_sync_lock);
@ -2032,6 +2071,7 @@ int kvm_set_msr_common(struct kvm_vcpu *vcpu, struct msr_data *msr_info)
data &= ~(u64)0x40; /* ignore flush filter disable */
data &= ~(u64)0x100; /* ignore ignne emulation enable */
data &= ~(u64)0x8; /* ignore TLB cache disable */
data &= ~(u64)0x40000; /* ignore Mc status write enable */
if (data != 0) {
vcpu_unimpl(vcpu, "unimplemented HWCR wrmsr: 0x%llx\n",
data);
@ -2974,9 +3014,7 @@ static void kvm_vcpu_ioctl_x86_get_vcpu_events(struct kvm_vcpu *vcpu,
vcpu->arch.interrupt.pending && !vcpu->arch.interrupt.soft;
events->interrupt.nr = vcpu->arch.interrupt.nr;
events->interrupt.soft = 0;
events->interrupt.shadow =
kvm_x86_ops->get_interrupt_shadow(vcpu,
KVM_X86_SHADOW_INT_MOV_SS | KVM_X86_SHADOW_INT_STI);
events->interrupt.shadow = kvm_x86_ops->get_interrupt_shadow(vcpu);
events->nmi.injected = vcpu->arch.nmi_injected;
events->nmi.pending = vcpu->arch.nmi_pending != 0;
@ -4082,7 +4120,8 @@ static int kvm_read_guest_virt_helper(gva_t addr, void *val, unsigned int bytes,
if (gpa == UNMAPPED_GVA)
return X86EMUL_PROPAGATE_FAULT;
ret = kvm_read_guest(vcpu->kvm, gpa, data, toread);
ret = kvm_read_guest_page(vcpu->kvm, gpa >> PAGE_SHIFT, data,
offset, toread);
if (ret < 0) {
r = X86EMUL_IO_NEEDED;
goto out;
@ -4103,10 +4142,24 @@ static int kvm_fetch_guest_virt(struct x86_emulate_ctxt *ctxt,
{
struct kvm_vcpu *vcpu = emul_to_vcpu(ctxt);
u32 access = (kvm_x86_ops->get_cpl(vcpu) == 3) ? PFERR_USER_MASK : 0;
unsigned offset;
int ret;
return kvm_read_guest_virt_helper(addr, val, bytes, vcpu,
access | PFERR_FETCH_MASK,
exception);
/* Inline kvm_read_guest_virt_helper for speed. */
gpa_t gpa = vcpu->arch.walk_mmu->gva_to_gpa(vcpu, addr, access|PFERR_FETCH_MASK,
exception);
if (unlikely(gpa == UNMAPPED_GVA))
return X86EMUL_PROPAGATE_FAULT;
offset = addr & (PAGE_SIZE-1);
if (WARN_ON(offset + bytes > PAGE_SIZE))
bytes = (unsigned)PAGE_SIZE - offset;
ret = kvm_read_guest_page(vcpu->kvm, gpa >> PAGE_SHIFT, val,
offset, bytes);
if (unlikely(ret < 0))
return X86EMUL_IO_NEEDED;
return X86EMUL_CONTINUE;
}
int kvm_read_guest_virt(struct x86_emulate_ctxt *ctxt,
@ -4730,7 +4783,6 @@ static void emulator_set_segment(struct x86_emulate_ctxt *ctxt, u16 selector,
if (desc->g)
var.limit = (var.limit << 12) | 0xfff;
var.type = desc->type;
var.present = desc->p;
var.dpl = desc->dpl;
var.db = desc->d;
var.s = desc->s;
@ -4762,6 +4814,12 @@ static int emulator_set_msr(struct x86_emulate_ctxt *ctxt,
return kvm_set_msr(emul_to_vcpu(ctxt), &msr);
}
static int emulator_check_pmc(struct x86_emulate_ctxt *ctxt,
u32 pmc)
{
return kvm_pmu_check_pmc(emul_to_vcpu(ctxt), pmc);
}
static int emulator_read_pmc(struct x86_emulate_ctxt *ctxt,
u32 pmc, u64 *pdata)
{
@ -4838,6 +4896,7 @@ static const struct x86_emulate_ops emulate_ops = {
.set_dr = emulator_set_dr,
.set_msr = emulator_set_msr,
.get_msr = emulator_get_msr,
.check_pmc = emulator_check_pmc,
.read_pmc = emulator_read_pmc,
.halt = emulator_halt,
.wbinvd = emulator_wbinvd,
@ -4850,7 +4909,7 @@ static const struct x86_emulate_ops emulate_ops = {
static void toggle_interruptibility(struct kvm_vcpu *vcpu, u32 mask)
{
u32 int_shadow = kvm_x86_ops->get_interrupt_shadow(vcpu, mask);
u32 int_shadow = kvm_x86_ops->get_interrupt_shadow(vcpu);
/*
* an sti; sti; sequence only disable interrupts for the first
* instruction. So, if the last instruction, be it emulated or
@ -4858,8 +4917,13 @@ static void toggle_interruptibility(struct kvm_vcpu *vcpu, u32 mask)
* means that the last instruction is an sti. We should not
* leave the flag on in this case. The same goes for mov ss
*/
if (!(int_shadow & mask))
if (int_shadow & mask)
mask = 0;
if (unlikely(int_shadow || mask)) {
kvm_x86_ops->set_interrupt_shadow(vcpu, mask);
if (!mask)
kvm_make_request(KVM_REQ_EVENT, vcpu);
}
}
static void inject_emulated_exception(struct kvm_vcpu *vcpu)
@ -4874,19 +4938,6 @@ static void inject_emulated_exception(struct kvm_vcpu *vcpu)
kvm_queue_exception(vcpu, ctxt->exception.vector);
}
static void init_decode_cache(struct x86_emulate_ctxt *ctxt)
{
memset(&ctxt->opcode_len, 0,
(void *)&ctxt->_regs - (void *)&ctxt->opcode_len);
ctxt->fetch.start = 0;
ctxt->fetch.end = 0;
ctxt->io_read.pos = 0;
ctxt->io_read.end = 0;
ctxt->mem_read.pos = 0;
ctxt->mem_read.end = 0;
}
static void init_emulate_ctxt(struct kvm_vcpu *vcpu)
{
struct x86_emulate_ctxt *ctxt = &vcpu->arch.emulate_ctxt;
@ -5085,23 +5136,22 @@ static int kvm_vcpu_check_hw_bp(unsigned long addr, u32 type, u32 dr7,
return dr6;
}
static void kvm_vcpu_check_singlestep(struct kvm_vcpu *vcpu, int *r)
static void kvm_vcpu_check_singlestep(struct kvm_vcpu *vcpu, unsigned long rflags, int *r)
{
struct kvm_run *kvm_run = vcpu->run;
/*
* Use the "raw" value to see if TF was passed to the processor.
* Note that the new value of the flags has not been saved yet.
* rflags is the old, "raw" value of the flags. The new value has
* not been saved yet.
*
* This is correct even for TF set by the guest, because "the
* processor will not generate this exception after the instruction
* that sets the TF flag".
*/
unsigned long rflags = kvm_x86_ops->get_rflags(vcpu);
if (unlikely(rflags & X86_EFLAGS_TF)) {
if (vcpu->guest_debug & KVM_GUESTDBG_SINGLESTEP) {
kvm_run->debug.arch.dr6 = DR6_BS | DR6_FIXED_1;
kvm_run->debug.arch.dr6 = DR6_BS | DR6_FIXED_1 |
DR6_RTM;
kvm_run->debug.arch.pc = vcpu->arch.singlestep_rip;
kvm_run->debug.arch.exception = DB_VECTOR;
kvm_run->exit_reason = KVM_EXIT_DEBUG;
@ -5114,7 +5164,7 @@ static void kvm_vcpu_check_singlestep(struct kvm_vcpu *vcpu, int *r)
* cleared by the processor".
*/
vcpu->arch.dr6 &= ~15;
vcpu->arch.dr6 |= DR6_BS;
vcpu->arch.dr6 |= DR6_BS | DR6_RTM;
kvm_queue_exception(vcpu, DB_VECTOR);
}
}
@ -5133,7 +5183,7 @@ static bool kvm_vcpu_check_breakpoint(struct kvm_vcpu *vcpu, int *r)
vcpu->arch.eff_db);
if (dr6 != 0) {
kvm_run->debug.arch.dr6 = dr6 | DR6_FIXED_1;
kvm_run->debug.arch.dr6 = dr6 | DR6_FIXED_1 | DR6_RTM;
kvm_run->debug.arch.pc = kvm_rip_read(vcpu) +
get_segment_base(vcpu, VCPU_SREG_CS);
@ -5144,14 +5194,15 @@ static bool kvm_vcpu_check_breakpoint(struct kvm_vcpu *vcpu, int *r)
}
}
if (unlikely(vcpu->arch.dr7 & DR7_BP_EN_MASK)) {
if (unlikely(vcpu->arch.dr7 & DR7_BP_EN_MASK) &&
!(kvm_get_rflags(vcpu) & X86_EFLAGS_RF)) {
dr6 = kvm_vcpu_check_hw_bp(eip, 0,
vcpu->arch.dr7,
vcpu->arch.db);
if (dr6 != 0) {
vcpu->arch.dr6 &= ~15;
vcpu->arch.dr6 |= dr6;
vcpu->arch.dr6 |= dr6 | DR6_RTM;
kvm_queue_exception(vcpu, DB_VECTOR);
*r = EMULATE_DONE;
return true;
@ -5215,6 +5266,8 @@ int x86_emulate_instruction(struct kvm_vcpu *vcpu,
if (emulation_type & EMULTYPE_SKIP) {
kvm_rip_write(vcpu, ctxt->_eip);
if (ctxt->eflags & X86_EFLAGS_RF)
kvm_set_rflags(vcpu, ctxt->eflags & ~X86_EFLAGS_RF);
return EMULATE_DONE;
}
@ -5265,13 +5318,22 @@ int x86_emulate_instruction(struct kvm_vcpu *vcpu,
r = EMULATE_DONE;
if (writeback) {
unsigned long rflags = kvm_x86_ops->get_rflags(vcpu);
toggle_interruptibility(vcpu, ctxt->interruptibility);
kvm_make_request(KVM_REQ_EVENT, vcpu);
vcpu->arch.emulate_regs_need_sync_to_vcpu = false;
kvm_rip_write(vcpu, ctxt->eip);
if (r == EMULATE_DONE)
kvm_vcpu_check_singlestep(vcpu, &r);
kvm_set_rflags(vcpu, ctxt->eflags);
kvm_vcpu_check_singlestep(vcpu, rflags, &r);
__kvm_set_rflags(vcpu, ctxt->eflags);
/*
* For STI, interrupts are shadowed; so KVM_REQ_EVENT will
* do nothing, and it will be requested again as soon as
* the shadow expires. But we still need to check here,
* because POPF has no interrupt shadow.
*/
if (unlikely((ctxt->eflags & ~rflags) & X86_EFLAGS_IF))
kvm_make_request(KVM_REQ_EVENT, vcpu);
} else
vcpu->arch.emulate_regs_need_sync_to_vcpu = true;
@ -5662,7 +5724,6 @@ int kvm_hv_hypercall(struct kvm_vcpu *vcpu)
u64 param, ingpa, outgpa, ret;
uint16_t code, rep_idx, rep_cnt, res = HV_STATUS_SUCCESS, rep_done = 0;
bool fast, longmode;
int cs_db, cs_l;
/*
* hypercall generates UD from non zero cpl and real mode
@ -5673,8 +5734,7 @@ int kvm_hv_hypercall(struct kvm_vcpu *vcpu)
return 0;
}
kvm_x86_ops->get_cs_db_l_bits(vcpu, &cs_db, &cs_l);
longmode = is_long_mode(vcpu) && cs_l == 1;
longmode = is_64_bit_mode(vcpu);
if (!longmode) {
param = ((u64)kvm_register_read(vcpu, VCPU_REGS_RDX) << 32) |
@ -5739,7 +5799,7 @@ static void kvm_pv_kick_cpu_op(struct kvm *kvm, unsigned long flags, int apicid)
int kvm_emulate_hypercall(struct kvm_vcpu *vcpu)
{
unsigned long nr, a0, a1, a2, a3, ret;
int r = 1;
int op_64_bit, r = 1;
if (kvm_hv_hypercall_enabled(vcpu->kvm))
return kvm_hv_hypercall(vcpu);
@ -5752,7 +5812,8 @@ int kvm_emulate_hypercall(struct kvm_vcpu *vcpu)
trace_kvm_hypercall(nr, a0, a1, a2, a3);
if (!is_long_mode(vcpu)) {
op_64_bit = is_64_bit_mode(vcpu);
if (!op_64_bit) {
nr &= 0xFFFFFFFF;
a0 &= 0xFFFFFFFF;
a1 &= 0xFFFFFFFF;
@ -5778,6 +5839,8 @@ int kvm_emulate_hypercall(struct kvm_vcpu *vcpu)
break;
}
out:
if (!op_64_bit)
ret = (u32)ret;
kvm_register_write(vcpu, VCPU_REGS_RAX, ret);
++vcpu->stat.hypercalls;
return r;
@ -5856,6 +5919,11 @@ static int inject_pending_event(struct kvm_vcpu *vcpu, bool req_int_win)
trace_kvm_inj_exception(vcpu->arch.exception.nr,
vcpu->arch.exception.has_error_code,
vcpu->arch.exception.error_code);
if (exception_type(vcpu->arch.exception.nr) == EXCPT_FAULT)
__kvm_set_rflags(vcpu, kvm_get_rflags(vcpu) |
X86_EFLAGS_RF);
kvm_x86_ops->queue_exception(vcpu, vcpu->arch.exception.nr,
vcpu->arch.exception.has_error_code,
vcpu->arch.exception.error_code,
@ -6847,9 +6915,11 @@ void kvm_vcpu_reset(struct kvm_vcpu *vcpu)
atomic_set(&vcpu->arch.nmi_queued, 0);
vcpu->arch.nmi_pending = 0;
vcpu->arch.nmi_injected = false;
kvm_clear_interrupt_queue(vcpu);
kvm_clear_exception_queue(vcpu);
memset(vcpu->arch.db, 0, sizeof(vcpu->arch.db));
vcpu->arch.dr6 = DR6_FIXED_1;
vcpu->arch.dr6 = DR6_INIT;
kvm_update_dr6(vcpu);
vcpu->arch.dr7 = DR7_FIXED_1;
kvm_update_dr7(vcpu);
@ -7405,12 +7475,17 @@ unsigned long kvm_get_rflags(struct kvm_vcpu *vcpu)
}
EXPORT_SYMBOL_GPL(kvm_get_rflags);
void kvm_set_rflags(struct kvm_vcpu *vcpu, unsigned long rflags)
static void __kvm_set_rflags(struct kvm_vcpu *vcpu, unsigned long rflags)
{
if (vcpu->guest_debug & KVM_GUESTDBG_SINGLESTEP &&
kvm_is_linear_rip(vcpu, vcpu->arch.singlestep_rip))
rflags |= X86_EFLAGS_TF;
kvm_x86_ops->set_rflags(vcpu, rflags);
}
void kvm_set_rflags(struct kvm_vcpu *vcpu, unsigned long rflags)
{
__kvm_set_rflags(vcpu, rflags);
kvm_make_request(KVM_REQ_EVENT, vcpu);
}
EXPORT_SYMBOL_GPL(kvm_set_rflags);

View File

@ -47,6 +47,16 @@ static inline int is_long_mode(struct kvm_vcpu *vcpu)
#endif
}
static inline bool is_64_bit_mode(struct kvm_vcpu *vcpu)
{
int cs_db, cs_l;
if (!is_long_mode(vcpu))
return false;
kvm_x86_ops->get_cs_db_l_bits(vcpu, &cs_db, &cs_l);
return cs_l;
}
static inline bool mmu_is_nested(struct kvm_vcpu *vcpu)
{
return vcpu->arch.walk_mmu == &vcpu->arch.nested_mmu;
@ -108,6 +118,23 @@ static inline bool vcpu_match_mmio_gpa(struct kvm_vcpu *vcpu, gpa_t gpa)
return false;
}
static inline unsigned long kvm_register_readl(struct kvm_vcpu *vcpu,
enum kvm_reg reg)
{
unsigned long val = kvm_register_read(vcpu, reg);
return is_64_bit_mode(vcpu) ? val : (u32)val;
}
static inline void kvm_register_writel(struct kvm_vcpu *vcpu,
enum kvm_reg reg,
unsigned long val)
{
if (!is_64_bit_mode(vcpu))
val = (u32)val;
return kvm_register_write(vcpu, reg, val);
}
void kvm_before_handle_nmi(struct kvm_vcpu *vcpu);
void kvm_after_handle_nmi(struct kvm_vcpu *vcpu);
int kvm_inject_realmode_interrupt(struct kvm_vcpu *vcpu, int irq, int inc_eip);

View File

@ -399,13 +399,18 @@ struct kvm_vapic_addr {
__u64 vapic_addr;
};
/* for KVM_SET_MPSTATE */
/* for KVM_SET_MP_STATE */
/* not all states are valid on all architectures */
#define KVM_MP_STATE_RUNNABLE 0
#define KVM_MP_STATE_UNINITIALIZED 1
#define KVM_MP_STATE_INIT_RECEIVED 2
#define KVM_MP_STATE_HALTED 3
#define KVM_MP_STATE_SIPI_RECEIVED 4
#define KVM_MP_STATE_STOPPED 5
#define KVM_MP_STATE_CHECK_STOP 6
#define KVM_MP_STATE_OPERATING 7
#define KVM_MP_STATE_LOAD 8
struct kvm_mp_state {
__u32 mp_state;

View File

@ -254,10 +254,9 @@ void kvm_ioapic_scan_entry(struct kvm_vcpu *vcpu, u64 *eoi_exit_bitmap,
spin_lock(&ioapic->lock);
for (index = 0; index < IOAPIC_NUM_PINS; index++) {
e = &ioapic->redirtbl[index];
if (!e->fields.mask &&
(e->fields.trig_mode == IOAPIC_LEVEL_TRIG ||
kvm_irq_has_notifier(ioapic->kvm, KVM_IRQCHIP_IOAPIC,
index) || index == RTC_GSI)) {
if (e->fields.trig_mode == IOAPIC_LEVEL_TRIG ||
kvm_irq_has_notifier(ioapic->kvm, KVM_IRQCHIP_IOAPIC, index) ||
index == RTC_GSI) {
if (kvm_apic_match_dest(vcpu, NULL, 0,
e->fields.dest_id, e->fields.dest_mode)) {
__set_bit(e->fields.vector,

View File

@ -323,13 +323,13 @@ int kvm_set_routing_entry(struct kvm_irq_routing_table *rt,
#define IOAPIC_ROUTING_ENTRY(irq) \
{ .gsi = irq, .type = KVM_IRQ_ROUTING_IRQCHIP, \
.u.irqchip.irqchip = KVM_IRQCHIP_IOAPIC, .u.irqchip.pin = (irq) }
.u.irqchip = { .irqchip = KVM_IRQCHIP_IOAPIC, .pin = (irq) } }
#define ROUTING_ENTRY1(irq) IOAPIC_ROUTING_ENTRY(irq)
#ifdef CONFIG_X86
# define PIC_ROUTING_ENTRY(irq) \
{ .gsi = irq, .type = KVM_IRQ_ROUTING_IRQCHIP, \
.u.irqchip.irqchip = SELECT_PIC(irq), .u.irqchip.pin = (irq) % 8 }
.u.irqchip = { .irqchip = SELECT_PIC(irq), .pin = (irq) % 8 } }
# define ROUTING_ENTRY2(irq) \
IOAPIC_ROUTING_ENTRY(irq), PIC_ROUTING_ENTRY(irq)
#else