Commit Graph

102301 Commits

Author SHA1 Message Date
Wanpeng Li
81dc01f749 kvm: vmx: add nested virtualization support for xsaves
Add nested virtualization support for xsaves.

Signed-off-by: Wanpeng Li <wanpeng.li@linux.intel.com>
Reviewed-by: Radim Krčmář <rkrcmar@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2014-12-05 13:57:44 +01:00
Wanpeng Li
203000993d kvm: vmx: add MSR logic for XSAVES
Add logic to get/set the XSS model-specific register.

Signed-off-by: Wanpeng Li <wanpeng.li@linux.intel.com>
Reviewed-by: Radim Krčmář <rkrcmar@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2014-12-05 13:57:39 +01:00
Wanpeng Li
f53cd63c2d kvm: x86: handle XSAVES vmcs and vmexit
Initialize the XSS exit bitmap.  It is zero so there should be no XSAVES
or XRSTORS exits.

Signed-off-by: Wanpeng Li <wanpeng.li@linux.intel.com>
Reviewed-by: Radim Krčmář <rkrcmar@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2014-12-05 13:57:33 +01:00
Paolo Bonzini
404e0a19e1 KVM: cpuid: mask more bits in leaf 0xd and subleaves
- EAX=0Dh, ECX=1: output registers EBX/ECX/EDX are reserved.

- EAX=0Dh, ECX>1: output register ECX bit 0 is clear for all the CPUID
leaves we support, because variable "supported" comes from XCR0 and not
XSS.  Bits above 0 are reserved, so ECX is overall zero.  Output register
EDX is reserved.

Source: Intel Architecture Instruction Set Extensions Programming
Reference, ref. number 319433-022

Reviewed-by: Radim Krčmář <rkrcmar@redhat.com>
Tested-by: Wanpeng Li <wanpeng.li@linux.intel.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2014-12-05 13:57:17 +01:00
Paolo Bonzini
412a3c411e KVM: cpuid: set CPUID(EAX=0xd,ECX=1).EBX correctly
This is the size of the XSAVES area.  This starts providing guest support
for XSAVES (with no support yet for supervisor states, i.e. XSS == 0
always in guests for now).

Wanpeng Li suggested testing XSAVEC as well as XSAVES, since in practice
no real processor exists that only has one of them, and there is no
other way for userspace programs to compute the area of the XSAVEC
save area.  CPUID(EAX=0xd,ECX=1).EBX provides an upper bound.

Suggested-by: Radim Krčmář <rkrcmar@redhat.com>
Reviewed-by: Radim Krčmář <rkrcmar@redhat.com>
Tested-by: Wanpeng Li <wanpeng.li@linux.intel.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2014-12-05 13:57:17 +01:00
Wanpeng Li
55412b2eda kvm: x86: Add kvm_x86_ops hook that enables XSAVES for guest
Expose the XSAVES feature to the guest if the kvm_x86_ops say it is
available.

Signed-off-by: Wanpeng Li <wanpeng.li@linux.intel.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2014-12-05 13:57:16 +01:00
Paolo Bonzini
5c404cabd1 KVM: x86: use F() macro throughout cpuid.c
For code that deals with cpuid, this makes things a bit more readable.

Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2014-12-05 13:57:15 +01:00
Paolo Bonzini
df1daba7d1 KVM: x86: support XSAVES usage in the host
Userspace is expecting non-compacted format for KVM_GET_XSAVE, but
struct xsave_struct might be using the compacted format.  Convert
in order to preserve userspace ABI.

Likewise, userspace is passing non-compacted format for KVM_SET_XSAVE
but the kernel will pass it to XRSTORS, and we need to convert back.

Fixes: f31a9f7c71
Cc: Fenghua Yu <fenghua.yu@intel.com>
Cc: stable@vger.kernel.org
Cc: H. Peter Anvin <hpa@linux.intel.com>
Tested-by: Nadav Amit <namit@cs.technion.ac.il>
Reviewed-by: Radim Krčmář <rkrcmar@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2014-12-05 13:57:05 +01:00
Paolo Bonzini
ba7b39203a x86: export get_xsave_addr
get_xsave_addr is the API to access XSAVE states, and KVM would
like to use it.  Export it.

Cc: stable@vger.kernel.org
Cc: x86@kernel.org
Cc: H. Peter Anvin <hpa@linux.intel.com>
Acked-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2014-12-05 13:55:44 +01:00
Paolo Bonzini
28145be0a7 KVM: s390: Fixups for kvm/next (3.19)
Here we have two fixups of the latest interrupt rework and
 one architectural fixup.
 -----BEGIN PGP SIGNATURE-----
 Version: GnuPG v2.0.14 (GNU/Linux)
 
 iQIcBAABAgAGBQJUgIO9AAoJEBF7vIC1phx8U5YQALKPvkoyVaGOexDCQQJrz04+
 9W5ZQuvg020jj1EmRPlv9yekhtYBTYA/OVT+FsDG3eq+xCS4Y04vlvUfk2Lq0YAB
 Bk+cU7plNP8f1Ml/kRTC/yfWcPe8iCVI1WA7zqA2wwEbaEZCz+9+i76wyriDWtJM
 FJEKXqVmLZH/H+GAw4GQG0YrO4uZ70Sk5F5YVtceGN5mDlqCATXa2xADm4ciAiIW
 L6w+G3KjIuv/Opds6DFMfJgYQeAF+0MFFWpD2rUfVLwIvdtBTtnxVwQZfpoXH7rF
 SeNU6n7CILS0csW/ZMcuTE8UrYgW1kkj1iVgc6fT7nkTWWZ5+iGRbAKQG5WPKibn
 84f9cn7kJ22ZobayaLskNfe041o1a+zMUCswHdFYTgF29WefeqkikoPx0B80g4p6
 O1jZqSZXKiTtlmIqiISikJhDVx0/kC+ftlu4MJLKq7O8RgcaYmAJp7gSGz50sMPB
 AybcfNkoYsg5J85sinT16TA/Vmcd3ZBWRqaokiRFD/uuqh5cKnyZOTpPwph8Welq
 QxBtqpG36cBLRJRMzlSJNXg1LKvb8WAxWYxGjiSL/8y/V3intnSFYdLjhNpx3yj8
 WCxaG3m2Gf68RQOVSdqlvgH//xBeSz0l9grX+BzJNvBY7aZMmTbXHwIt/lSCiBZd
 radGCMmVN4YEq4OoTgyr
 =ZZSq
 -----END PGP SIGNATURE-----

Merge tag 'kvm-s390-next-20141204' of git://git.kernel.org/pub/scm/linux/kernel/git/kvms390/linux into HEAD

KVM: s390: Fixups for kvm/next (3.19)

Here we have two fixups of the latest interrupt rework and
one architectural fixup.
2014-12-05 13:55:40 +01:00
Jens Freimann
99e20009ae KVM: s390: clean up return code handling in irq delivery code
Instead of returning a possibly random or'ed together value, let's
always return -EFAULT if rc is set.

Signed-off-by: Jens Freimann <jfrei@linux.vnet.ibm.com>
Reviewed-by: David Hildenbrand <dahi@linux.vnet.ibm.com>
Acked-by: Cornelia Huck <cornelia.huck@de.ibm.com>
Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>
2014-12-04 16:39:00 +01:00
Jens Freimann
9185124e87 KVM: s390: use atomic bitops to access pending_irqs bitmap
Currently we use a mixture of atomic/non-atomic bitops
and the local_int spin lock to protect the pending_irqs bitmap
and interrupt payload data.

We need to use atomic bitops for the pending_irqs bitmap everywhere
and in addition acquire the local_int lock where interrupt data needs
to be protected.

Signed-off-by: Jens Freimann <jfrei@linux.vnet.ibm.com>
Reviewed-by: David Hildenbrand <dahi@linux.vnet.ibm.com>
Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>
2014-12-04 16:38:57 +01:00
David Hildenbrand
467fc29892 KVM: s390: some ext irqs have to clear the ext cpu addr
The cpu address of a source cpu (responsible for an external irq) is only to
be stored if bit 6 of the ext irq code is set.

If bit 6 is not set, it is to be zeroed out.

The special external irq code used for virtio and pfault uses the cpu addr as a
parameter field. As bit 6 is set, this implementation is correct.

Reviewed-by: Thomas Huth <thuth@linux.vnet.ibm.com>
Signed-off-by: David Hildenbrand <dahi@linux.vnet.ibm.com>
Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>
2014-12-04 16:38:38 +01:00
Radim Krčmář
45c3094a64 KVM: x86: allow 256 logical x2APICs again
While fixing an x2apic bug,
 17d68b7 KVM: x86: fix guest-initiated crash with x2apic (CVE-2013-6376)
we've made only one cluster available.  This means that the amount of
logically addressible x2APICs was reduced to 16 and VCPUs kept
overwriting themselves in that region, so even the first cluster wasn't
set up correctly.

This patch extends x2APIC support back to the logical_map's limit, and
keeps the CVE fixed as messages for non-present APICs are dropped.

Signed-off-by: Radim Krčmář <rkrcmar@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2014-12-04 15:29:08 +01:00
Radim Krčmář
25995e5b4a KVM: x86: check bounds of APIC maps
They can't be violated now, but play it safe for the future.

Signed-off-by: Radim Krčmář <rkrcmar@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2014-12-04 15:29:08 +01:00
Radim Krčmář
fa834e9197 KVM: x86: fix APIC physical destination wrapping
x2apic allows destinations > 0xff and we don't want them delivered to
lower APICs.  They are correctly handled by doing nothing.

Signed-off-by: Radim Krčmář <rkrcmar@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2014-12-04 15:29:07 +01:00
Radim Krčmář
085563fb04 KVM: x86: deliver phys lowest-prio
Physical mode can't address more than one APIC, but lowest-prio is
allowed, so we just reuse our paths.

SDM 10.6.2.1 Physical Destination:
  Also, for any non-broadcast IPI or I/O subsystem initiated interrupt
  with lowest priority delivery mode, software must ensure that APICs
  defined in the interrupt address are present and enabled to receive
  interrupts.

We could warn on top of that.

Signed-off-by: Radim Krčmář <rkrcmar@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2014-12-04 15:29:06 +01:00
Radim Krčmář
698f9755d9 KVM: x86: don't retry hopeless APIC delivery
False from kvm_irq_delivery_to_apic_fast() means that we don't handle it
in the fast path, but we still return false in cases that were perfectly
handled, fix that.

Signed-off-by: Radim Krčmář <rkrcmar@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2014-12-04 15:29:06 +01:00
Radim Krčmář
decdc28382 KVM: x86: use MSR_ICR instead of a number
0x830 MSR is 0x300 xAPIC MMIO, which is MSR_ICR.

Signed-off-by: Radim Krčmář <rkrcmar@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2014-12-04 15:29:05 +01:00
Nadav Amit
c69d3d9bc1 KVM: x86: Fix reserved x2apic registers
x2APIC has no registers for DFR and ICR2 (see Intel SDM 10.12.1.2 "x2APIC
Register Address Space"). KVM needs to cause #GP on such accesses.

Fix it (DFR and ICR2 on read, ICR2 on write, DFR already handled on writes).

Signed-off-by: Nadav Amit <namit@cs.technion.ac.il>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2014-12-04 15:29:05 +01:00
Nadav Amit
39f062ff51 KVM: x86: Generate #UD when memory operand is required
Certain x86 instructions that use modrm operands only allow memory operand
(i.e., mod012), and cause a #UD exception otherwise. KVM ignores this fact.
Currently, the instructions that are such and are emulated by KVM are MOVBE,
MOVNTPS, MOVNTPD and MOVNTI.  MOVBE is the most blunt example, since it may be
emulated by the host regardless of MMIO.

The fix introduces a new group for handling such instructions, marking mod3 as
illegal instruction.

Signed-off-by: Nadav Amit <namit@cs.technion.ac.il>
Reviewed-by: Radim Krčmář <rkrcmar@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2014-12-04 15:29:04 +01:00
Paolo Bonzini
be06b6bece KVM: s390: Several fixes,cleanups and reworks
Here is a bunch of fixes that deal mostly with architectural compliance:
 - interrupt priorities
 - interrupt handling
 - intruction exit handling
 
 We also provide a helper function for getting the guest visible storage key.
 -----BEGIN PGP SIGNATURE-----
 Version: GnuPG v2.0.14 (GNU/Linux)
 
 iQIcBAABAgAGBQJUeHKOAAoJEBF7vIC1phx8fS8P/2i1zXvsB+Mp9FafU6FO3Ci6
 yM/ZBLFEsX2jmU0TGR2szZCP4xuHqomJm441+2CX9bXzjf8vnA2hGiDYX0bnBraK
 1/klx6Li1fQxsiSHXIgBXr0wh5ftNUdFVZiJoifY9dEdrhVI+YEiptIl7lADCFXi
 SdGtjEAzrVEe8H0g6OuBXeEfPzHvxAzNJ31yHuiCKl7vbFVnXNVPeZhq3dJZwmWC
 iIdlSqGIjbcHNMLXvrScLDAScBe6WruBLhPSy5aTIA2eBU6f4qkOedSFABJkIAq+
 V6v5pXWbrVBIqfLXE7Vp7jxhP7+viBGzu/gPkfT8HV1pQZDa94WojF8hGU0DLGLd
 vgZuFDV8cMOZMUS/onXOwnIXN5VPvP8V2v3y8gTiap3MykRiyTGEzq2auU9p0K5n
 /8W6Cn1P/WBS8MOFYg726DGmMAWkzpEVz9rxpCLaTpzz7QVyLuSLq/n3SXyQNQIl
 zhox/KzwUQD0t1062USoK3w4suYNvnX0BuFOwxXvS7f4bsb+6V/t0GyIBnVAL/OF
 DZzJSIyzP/Ur/9krxJQxML3kEELU1CjwSLOrzDUnZA3ytaKvsLrHkTb9nK6fREDK
 14AGRnp94B3EMPR6T+7T6gvwK2Y/QIo8Y/EFAa2BwXY4Q/BQPSVU3x3RK6L9+7jY
 VaG9sgn4OC2ZPzxjqMTc
 =MQUy
 -----END PGP SIGNATURE-----

Merge tag 'kvm-s390-next-20141128' of git://git.kernel.org/pub/scm/linux/kernel/git/kvms390/linux into HEAD

KVM: s390: Several fixes,cleanups and reworks

Here is a bunch of fixes that deal mostly with architectural compliance:
- interrupt priorities
- interrupt handling
- intruction exit handling

We also provide a helper function for getting the guest visible storage key.
2014-12-03 15:20:11 +01:00
Jens Freimann
fc2020cfe9 KVM: s390: allow injecting all kinds of machine checks
Allow to specify CR14, logout area, external damage code
and failed storage address.

Since more then one machine check can be indicated to the guest at
a time we need to combine all indication bits with already pending
requests.

Signed-off-by: Jens Freimann <jfrei@linux.vnet.ibm.com>
Reviewed-by: Cornelia Huck <cornelia.huck@de.ibm.com>
Reviewed-by: David Hildenbrand <dahi@linux.vnet.ibm.com>
Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>
2014-11-28 13:59:05 +01:00
Jens Freimann
383d0b0501 KVM: s390: handle pending local interrupts via bitmap
This patch adapts handling of local interrupts to be more compliant with
the z/Architecture Principles of Operation and introduces a data
structure
which allows more efficient handling of interrupts.

* get rid of li->active flag, use bitmap instead
* Keep interrupts in a bitmap instead of a list
* Deliver interrupts in the order of their priority as defined in the
  PoP
* Use a second bitmap for sigp emergency requests, as a CPU can have
  one request pending from every other CPU in the system.

Signed-off-by: Jens Freimann <jfrei@linux.vnet.ibm.com>
Reviewed-by: Cornelia Huck <cornelia.huck@de.ibm.com>
Reviewed-by: David Hildenbrand <dahi@linux.vnet.ibm.com>
Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>
2014-11-28 13:59:04 +01:00
Jens Freimann
c0e6159d51 KVM: s390: add bitmap for handling cpu-local interrupts
Adds a bitmap to the vcpu structure which is used to keep track
of local pending interrupts. Also add enum with all interrupt
types sorted in order of priority (highest to lowest)

Signed-off-by: Jens Freimann <jfrei@linux.vnet.ibm.com>
Reviewed-by: Thomas Huth <thuth@linux.vnet.ibm.com>
Reviewed-by: Cornelia Huck <cornelia.huck@de.ibm.com>
Reviewed-by: David Hildenbrand <dahi@linux.vnet.ibm.com>
Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>
2014-11-28 13:59:04 +01:00
Jens Freimann
0fb97abe05 KVM: s390: refactor interrupt delivery code
Move delivery code for cpu-local interrupt from the huge do_deliver_interrupt()
to smaller functions which handle one type of interrupt.

Signed-off-by: Jens Freimann <jfrei@linux.vnet.ibm.com>
Reviewed-by: David Hildenbrand <dahi@linux.vnet.ibm.com>
Reviewed-by: Cornelia Huck <cornelia.huck@de.ibm.com>
Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>
2014-11-28 13:59:03 +01:00
Jens Freimann
60f90a14dd KVM: s390: add defines for virtio and pfault interrupt code
Get rid of open coded value for virtio and pfault completion interrupts.

Signed-off-by: Jens Freimann <jfrei@linux.vnet.ibm.com>
Reviewed-by: David Hildenbrand <dahi@linux.vnet.ibm.com>
Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>
2014-11-28 13:59:03 +01:00
David Hildenbrand
af43eb2fd7 KVM: s390: external param not valid for cpu timer and ckc
The 32bit external interrupt parameter is only valid for timing-alert and
service-signal interrupts.

Signed-off-by: David Hildenbrand <dahi@linux.vnet.ibm.com>
Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>
2014-11-28 13:59:02 +01:00
Jens Freimann
0146a7b0b0 KVM: s390: refactor interrupt injection code
In preparation for the rework of the local interrupt injection code,
factor out injection routines from kvm_s390_inject_vcpu().

Signed-off-by: Jens Freimann <jfrei@linux.vnet.ibm.com>
Reviewed-by: David Hildenbrand <dahi@linux.vnet.ibm.com>
Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>
2014-11-28 13:59:01 +01:00
Jason J. Herne
9fcf93b5de KVM: S390: Create helper function get_guest_storage_key
Define get_guest_storage_key which can be used to get the value of a guest
storage key. This compliments the functionality provided by the helper function
set_guest_storage_key. Both functions are needed for live migration of s390
guests that use storage keys.

Signed-off-by: Jason J. Herne <jjherne@linux.vnet.ibm.com>
Reviewed-by: David Hildenbrand <dahi@linux.vnet.ibm.com>
Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>
2014-11-28 13:58:48 +01:00
Christian Borntraeger
da00fcbdac KVM: s390: trigger the right CPU exit for floating interrupts
When injecting a floating interrupt and no CPU is idle we
kick one CPU to do an external exit. In case of I/O we
should trigger an I/O exit instead. This does not matter
for Linux guests as external and I/O interrupts are
enabled/disabled at the same time, but play safe anyway.

The same holds true for machine checks. Since there is no
special exit, just reuse the generic stop exit. The injection
code inside the VCPU loop will recheck anyway and rearm the
proper exits (e.g. control registers) if necessary.

Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>
Reviewed-by: Thomas Huth <thuth@linux.vnet.ibm.com>
Reviewed-by: David Hildenbrand <dahi@linux.vnet.ibm.com>
2014-11-28 12:33:00 +01:00
Thomas Huth
04b41acd06 KVM: s390: Fix rewinding of the PSW pointing to an EXECUTE instruction
A couple of our interception handlers rewind the PSW to the beginning
of the instruction to run the intercepted instruction again during the
next SIE entry. This normally works fine, but there is also the
possibility that the instruction did not get run directly but via an
EXECUTE instruction.
In this case, the PSW does not point to the instruction that caused the
interception, but to the EXECUTE instruction! So we've got to rewind the
PSW to the beginning of the EXECUTE instruction instead.
This is now accomplished with a new helper function kvm_s390_rewind_psw().

Signed-off-by: Thomas Huth <thuth@linux.vnet.ibm.com>
Reviewed-by: David Hildenbrand <dahi@linux.vnet.ibm.com>
Reviewed-by: Cornelia Huck <cornelia.huck@de.ibm.com>
Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>
2014-11-28 12:32:56 +01:00
Thomas Huth
a02689fecd KVM: s390: Small fixes for the PFMF handler
This patch includes two small fixes for the PFMF handler: First, the
start address for PFMF has to be masked according to the current
addressing mode, which is now done with kvm_s390_logical_to_effective().
Second, the protection exceptions have a lower priority than the
specification exceptions, so the check for low-address protection
has to be moved after the last spot where we inject a specification
exception.

Signed-off-by: Thomas Huth <thuth@linux.vnet.ibm.com>
Reviewed-by: Cornelia Huck <cornelia.huck@de.ibm.com>
Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>
2014-11-28 12:32:38 +01:00
Paolo Bonzini
2b4a273b42 kvm: x86: avoid warning about potential shift wrapping bug
cs.base is declared as a __u64 variable and vector is a u32 so this
causes a static checker warning.  The user indeed can set "sipi_vector"
to any u32 value in kvm_vcpu_ioctl_x86_set_vcpu_events(), but the
value should really have 8-bit precision only.

Reported-by: Dan Carpenter <dan.carpenter@oracle.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2014-11-24 16:53:50 +01:00
Paolo Bonzini
c9eab58f64 KVM: x86: move device assignment out of kvm_host.h
Create a new header, and hide the device assignment functions there.
Move struct kvm_assigned_dev_kernel to assigned-dev.c by modifying
arch/x86/kvm/iommu.c to take a PCI device struct.

Based on a patch by Radim Krcmar <rkrcmark@redhat.com>.

Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2014-11-24 16:53:50 +01:00
Paolo Bonzini
b65d6e17fe kvm: x86: mask out XSAVES
This feature is not supported inside KVM guests yet, because we do not emulate
MSR_IA32_XSS.  Mask it out.

Cc: stable@vger.kernel.org
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2014-11-23 18:33:37 +01:00
Radim Krčmář
c274e03af7 kvm: x86: move assigned-dev.c and iommu.c to arch/x86/
Now that ia64 is gone, we can hide deprecated device assignment in x86.

Notable changes:
 - kvm_vm_ioctl_assigned_device() was moved to x86/kvm_arch_vm_ioctl()

The easy parts were removed from generic kvm code, remaining
 - kvm_iommu_(un)map_pages() would require new code to be moved
 - struct kvm_assigned_dev_kernel depends on struct kvm_irq_ack_notifier

Signed-off-by: Radim Krčmář <rkrcmar@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2014-11-23 18:33:36 +01:00
Radim Krcmar
3bf58e9ae8 kvm: remove CONFIG_X86 #ifdefs from files formerly shared with ia64
Signed-off-by: Radim Krcmar <rkrcmar@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2014-11-21 18:07:26 +01:00
Paolo Bonzini
6ef768fac9 kvm: x86: move ioapic.c and irq_comm.c back to arch/x86/
ia64 does not need them anymore.  Ack notifiers become x86-specific
too.

Suggested-by: Gleb Natapov <gleb@kernel.org>
Reviewed-by: Radim Krcmar <rkrcmar@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2014-11-21 18:02:37 +01:00
Paolo Bonzini
003f7de625 KVM: ia64: remove
KVM for ia64 has been marked as broken not just once, but twice even,
and the last patch from the maintainer is now roughly 5 years old.
Time for it to rest in peace.

Acked-by: Gleb Natapov <gleb@kernel.org>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2014-11-20 11:08:33 +01:00
Nicholas Krause
86619e7ba3 KVM: x86: Remove FIXMEs in emulate.c
Remove FIXME comments about needing fault addresses to be returned.  These
are propaagated from walk_addr_generic to gva_to_gpa and from there to
ops->read_std and ops->write_std.

Signed-off-by: Nicholas Krause <xerofoify@gmail.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2014-11-19 18:54:43 +01:00
Paolo Bonzini
997b04128d KVM: emulator: remove duplicated limit check
The check on the higher limit of the segment, and the check on the
maximum accessible size, is the same for both expand-up and
expand-down segments.  Only the computation of "lim" varies.

Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2014-11-19 18:40:24 +01:00
Paolo Bonzini
01485a2230 KVM: emulator: remove code duplication in register_address{,_increment}
register_address has been a duplicate of address_mask ever since the
ancestor of __linearize was born in 90de84f50b (KVM: x86 emulator:
preserve an operand's segment identity, 2010-11-17).

However, we can put it to a better use by including the call to reg_read
in register_address.  Similarly, the call to reg_rmw can be moved to
register_address_increment.

Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2014-11-19 18:27:27 +01:00
Nadav Amit
31ff64881b KVM: x86: Move __linearize masking of la into switch
In __linearize there is check of the condition whether to check if masking of
the linear address is needed.  It occurs immediately after switch that
evaluates the same condition.  Merge them.

Signed-off-by: Nadav Amit <namit@cs.technion.ac.il>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2014-11-19 18:20:15 +01:00
Nadav Amit
abc7d8a4c9 KVM: x86: Non-canonical access using SS should cause #SS
When SS is used using a non-canonical address, an #SS exception is generated on
real hardware.  KVM emulator causes a #GP instead. Fix it to behave as real x86
CPU.

Signed-off-by: Nadav Amit <namit@cs.technion.ac.il>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2014-11-19 18:19:57 +01:00
Nadav Amit
d50eaa1803 KVM: x86: Perform limit checks when assigning EIP
If branch (e.g., jmp, ret) causes limit violations, since the target IP >
limit, the #GP exception occurs before the branch.  In other words, the RIP
pushed on the stack should be that of the branch and not that of the target.

To do so, we can call __linearize, with new EIP, which also saves us the code
which performs the canonical address checks. On the case of assigning an EIP >=
2^32 (when switching cs.l), we also safe, as __linearize will check the new EIP
does not exceed the limit and would trigger #GP(0) otherwise.

Signed-off-by: Nadav Amit <namit@cs.technion.ac.il>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2014-11-19 18:19:22 +01:00
Nadav Amit
a7315d2f3c KVM: x86: Emulator performs privilege checks on __linearize
When segment is accessed, real hardware does not perform any privilege level
checks.  In contrast, KVM emulator does. This causes some discrepencies from
real hardware. For instance, reading from readable code segment may fail due to
incorrect segment checks. In addition, it introduces unnecassary overhead.

To reference Intel SDM 5.5 ("Privilege Levels"): "Privilege levels are checked
when the segment selector of a segment descriptor is loaded into a segment
register." The SDM never mentions privilege level checks during memory access,
except for loading far pointers in section 5.10 ("Pointer Validation"). Those
are actually segment selector loads and are emulated in the similarily (i.e.,
regardless to __linearize checks).

This behavior was also checked using sysexit. A data-segment whose DPL=0 was
loaded, and after sysexit (CPL=3) it is still accessible.

Therefore, all the privilege level checks in __linearize are removed.

Signed-off-by: Nadav Amit <namit@cs.technion.ac.il>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2014-11-19 18:17:58 +01:00
Nadav Amit
1c1c35ae4b KVM: x86: Stack size is overridden by __linearize
When performing segmented-read/write in the emulator for stack operations, it
ignores the stack size, and uses the ad_bytes as indication for the pointer
size. As a result, a wrong address may be accessed.

To fix this behavior, we can remove the masking of address in __linearize and
perform it beforehand.  It is already done for the operands (so currently it is
inefficiently done twice). It is missing in two cases:
1. When using rip_relative
2. On fetch_bit_operand that changes the address.

This patch masks the address on these two occassions, and removes the masking
from __linearize.

Note that it does not mask EIP during fetch. In protected/legacy mode code
fetch when RIP >= 2^32 should result in #GP and not wrap-around. Since we make
limit checks within __linearize, this is the expected behavior.

Partial revert of commit 518547b32a (KVM: x86: Emulator does not
calculate address correctly, 2014-09-30).

Signed-off-by: Nadav Amit <namit@cs.technion.ac.il>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2014-11-19 18:17:10 +01:00
Nadav Amit
7d882ffa81 KVM: x86: Revert NoBigReal patch in the emulator
Commit 10e38fc7cab6 ("KVM: x86: Emulator flag for instruction that only support
16-bit addresses in real mode") introduced NoBigReal for instructions such as
MONITOR. Apparetnly, the Intel SDM description that led to this patch is
misleading.  Since no instruction is using NoBigReal, it is safe to remove it,
we fully understand what the SDM means.

Signed-off-by: Nadav Amit <namit@cs.technion.ac.il>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2014-11-19 18:13:27 +01:00
Tiejun Chen
842bb26a40 kvm: x86: vmx: remove MMIO_MAX_GEN
MMIO_MAX_GEN is the same as MMIO_GEN_MASK.  Use only one.

Signed-off-by: Tiejun Chen <tiejun.chen@intel.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2014-11-18 11:12:18 +01:00