Commit Graph

336 Commits

Author SHA1 Message Date
Marc Zyngier
eede821dbf KVM: arm/arm64: vgic: move GICv2 registers to their own structure
In order to make way for the GICv3 registers, move the v2-specific
registers to their own structure.

Acked-by: Catalin Marinas <catalin.marinas@arm.com>
Reviewed-by: Christoffer Dall <christoffer.dall@linaro.org>
Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
2014-07-11 04:57:31 -07:00
Alex Bennée
1df08ba0aa arm64: KVM: allow export and import of generic timer regs
For correct guest suspend/resume behaviour we need to ensure we include
the generic timer registers for 64 bit guests. As CONFIG_KVM_ARM_TIMER is
always set for arm64 we don't need to worry about null implementations.
However I have re-jigged the kvm_arm_timer_set/get_reg declarations to
be in the common include/kvm/arm_arch_timer.h headers.

Acked-by: Marc Zyngier <marc.zyngier@arm.com>
Reviewed-by: Christoffer Dall <christoffer.dall@linaro.org>
Signed-off-by: Alex Bennée <alex.bennee@linaro.org>
Signed-off-by: Christoffer Dall <christoffer.dall@linaro.org>
2014-07-11 04:46:55 -07:00
Kim Phillips
b88657674d ARM: KVM: user_mem_abort: support stage 2 MMIO page mapping
A userspace process can map device MMIO memory via VFIO or /dev/mem,
e.g., for platform device passthrough support in QEMU.

During early development, we found the PAGE_S2 memory type being used
for MMIO mappings.  This patch corrects that by using the more strongly
ordered memory type for device MMIO mappings: PAGE_S2_DEVICE.

Signed-off-by: Kim Phillips <kim.phillips@linaro.org>
Acked-by: Christoffer Dall <christoffer.dall@linaro.org>
Acked-by: Will Deacon <will.deacon@arm.com>
Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
2014-07-11 04:46:53 -07:00
Eric Auger
df6ce24f2e ARM: KVM: Unmap IPA on memslot delete/move
Currently when a KVM region is deleted or moved after
KVM_SET_USER_MEMORY_REGION ioctl, the corresponding
intermediate physical memory is not unmapped.

This patch corrects this and unmaps the region's IPA range
in kvm_arch_commit_memory_region using unmap_stage2_range.

Signed-off-by: Eric Auger <eric.auger@linaro.org>
Signed-off-by: Christoffer Dall <christoffer.dall@linaro.org>
2014-07-11 04:46:52 -07:00
Christoffer Dall
4f853a714b arm/arm64: KVM: Fix and refactor unmap_range
unmap_range() was utterly broken, to quote Marc, and broke in all sorts
of situations.  It was also quite complicated to follow and didn't
follow the usual scheme of having a separate iterating function for each
level of page tables.

Address this by refactoring the code and introduce a pgd_clear()
function.

Reviewed-by: Jungseok Lee <jays.lee@samsung.com>
Reviewed-by: Mario Smarduch <m.smarduch@samsung.com>
Acked-by: Marc Zyngier <marc.zyngier@arm.com>
Signed-off-by: Christoffer Dall <christoffer.dall@linaro.org>
2014-07-11 04:46:51 -07:00
Linus Torvalds
b05d59dfce At over 200 commits, covering almost all supported architectures, this
was a pretty active cycle for KVM.  Changes include:
 
 - a lot of s390 changes: optimizations, support for migration,
   GDB support and more
 
 - ARM changes are pretty small: support for the PSCI 0.2 hypercall
   interface on both the guest and the host (the latter acked by Catalin)
 
 - initial POWER8 and little-endian host support
 
 - support for running u-boot on embedded POWER targets
 
 - pretty large changes to MIPS too, completing the userspace interface
   and improving the handling of virtualized timer hardware
 
 - for x86, a larger set of changes is scheduled for 3.17.  Still,
   we have a few emulator bugfixes and support for running nested
   fully-virtualized Xen guests (para-virtualized Xen guests have
   always worked).  And some optimizations too.
 
 The only missing architecture here is ia64.  It's not a coincidence
 that support for KVM on ia64 is scheduled for removal in 3.17.
 -----BEGIN PGP SIGNATURE-----
 Version: GnuPG v2.0.22 (GNU/Linux)
 
 iQIcBAABAgAGBQJTjtlBAAoJEBvWZb6bTYbyMOUP/2NAePghE3IjG99ikHFdn+BX
 BfrURsuR6GD0AhYQnBidBmpFbAmN/LwSJxv/M7sV7OBRWLu3qbt69DrPTU2e/FK1
 j9q25peu8jRyHzJ1q9rBroo74nD9lQYuVr3uXNxxcg0DRnw14JHGlM3y8LDEknO8
 W+gpWTeAQ+2AuOX98MpRbCRMuzziCSv5bP5FhBVnsWHiZfvMbcUrbeJt+zYSiDAZ
 0tHm/5dFKzfj/vVrrnjD4EZcRr688Bs5rztG96hY6aoVJryjZGLtLp92wCWkRRmH
 CCvZwd245NmNthuKHzcs27/duSWfU0uOlu7AMrD44QYhzeDGyB/2nbCxbGqLLoBA
 nnOviXH4cC65/CnisZ79zfo979HbZcX+Lzg747EjBgCSxJmLlwgiG8yXtDvk5otB
 TH6GUeGDiEEPj//JD3XtgSz0sF2NvjREWRyemjDMvhz6JC/bLytXKb3sn+NXSj8m
 ujzF9eQoa4qKDcBL4IQYGTJ4z5nY3Pd68dHFIPHB7n82OxFLSQUBKxXw8/1fb5og
 VVb8PL4GOcmakQlAKtTMlFPmuy4bbL2r/2iV5xJiOZKmXIu8Hs1JezBE3SFAltbl
 3cAGwSM9/dDkKxUbTFblyOE9bkKbg4WYmq0LkdzsPEomb3IZWntOT25rYnX+LrBz
 bAknaZpPiOrW11Et1htY
 =j5Od
 -----END PGP SIGNATURE-----

Merge tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm into next

Pull KVM updates from Paolo Bonzini:
 "At over 200 commits, covering almost all supported architectures, this
  was a pretty active cycle for KVM.  Changes include:

   - a lot of s390 changes: optimizations, support for migration, GDB
     support and more

   - ARM changes are pretty small: support for the PSCI 0.2 hypercall
     interface on both the guest and the host (the latter acked by
     Catalin)

   - initial POWER8 and little-endian host support

   - support for running u-boot on embedded POWER targets

   - pretty large changes to MIPS too, completing the userspace
     interface and improving the handling of virtualized timer hardware

   - for x86, a larger set of changes is scheduled for 3.17.  Still, we
     have a few emulator bugfixes and support for running nested
     fully-virtualized Xen guests (para-virtualized Xen guests have
     always worked).  And some optimizations too.

  The only missing architecture here is ia64.  It's not a coincidence
  that support for KVM on ia64 is scheduled for removal in 3.17"

* tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm: (203 commits)
  KVM: add missing cleanup_srcu_struct
  KVM: PPC: Book3S PR: Rework SLB switching code
  KVM: PPC: Book3S PR: Use SLB entry 0
  KVM: PPC: Book3S HV: Fix machine check delivery to guest
  KVM: PPC: Book3S HV: Work around POWER8 performance monitor bugs
  KVM: PPC: Book3S HV: Make sure we don't miss dirty pages
  KVM: PPC: Book3S HV: Fix dirty map for hugepages
  KVM: PPC: Book3S HV: Put huge-page HPTEs in rmap chain for base address
  KVM: PPC: Book3S HV: Fix check for running inside guest in global_invalidates()
  KVM: PPC: Book3S: Move KVM_REG_PPC_WORT to an unused register number
  KVM: PPC: Book3S: Add ONE_REG register names that were missed
  KVM: PPC: Add CAP to indicate hcall fixes
  KVM: PPC: MPIC: Reset IRQ source private members
  KVM: PPC: Graciously fail broken LE hypercalls
  PPC: ePAPR: Fix hypercall on LE guest
  KVM: PPC: BOOK3S: Remove open coded make_dsisr in alignment handler
  KVM: PPC: BOOK3S: Always use the saved DAR value
  PPC: KVM: Make NX bit available with magic page
  KVM: PPC: Disable NX for old magic page using guests
  KVM: PPC: BOOK3S: HV: Add mixed page-size support for guest
  ...
2014-06-04 08:47:12 -07:00
Anup Patel
4447a208f7 ARM/ARM64: KVM: Advertise KVM_CAP_ARM_PSCI_0_2 to user space
We have PSCI v0.2 emulation available in KVM ARM/ARM64
hence advertise this to user space (i.e. QEMU or KVMTOOL)
via KVM_CHECK_EXTENSION ioctl.

Signed-off-by: Anup Patel <anup.patel@linaro.org>
Signed-off-by: Pranavkumar Sawargaonkar <pranavkumar@linaro.org>
Acked-by: Christoffer Dall <christoffer.dall@linaro.org>
Acked-by: Marc Zyngier <marc.zyngier@arm.com>
Signed-off-by: Christoffer Dall <christoffer.dall@linaro.org>
2014-04-30 04:18:58 -07:00
Anup Patel
b376d02b53 ARM/ARM64: KVM: Emulate PSCI v0.2 CPU_SUSPEND
This patch adds emulation of PSCI v0.2 CPU_SUSPEND function call for
KVM ARM/ARM64. This is a CPU-level function call which can suspend
current CPU or current CPU cluster. We don't have VCPU clusters in
KVM so we only suspend the current VCPU.

The CPU_SUSPEND emulation is not tested much because currently there
is no CPUIDLE driver in Linux kernel that uses PSCI CPU_SUSPEND. The
PSCI CPU_SUSPEND implementation in ARM64 kernel was tested using a
Simple CPUIDLE driver which is not published due to unstable DT-bindings
for PSCI.
(For more info, http://lwn.net/Articles/574950/)

For simplicity, we implement CPU_SUSPEND emulation similar to WFI
(Wait-for-interrupt) emulation and we also treat power-down request
to be same as stand-by request. This is consistent with section
5.4.1 and section 5.4.2 of PSCI v0.2 specification.

Signed-off-by: Anup Patel <anup.patel@linaro.org>
Signed-off-by: Pranavkumar Sawargaonkar <pranavkumar@linaro.org>
Acked-by: Christoffer Dall <christoffer.dall@linaro.org>
Acked-by: Marc Zyngier <marc.zyngier@arm.com>
Signed-off-by: Christoffer Dall <christoffer.dall@linaro.org>
2014-04-30 04:18:58 -07:00
Anup Patel
aa8aeefe5e ARM/ARM64: KVM: Fix CPU_ON emulation for PSCI v0.2
As-per PSCI v0.2, the source CPU provides physical address of
"entry point" and "context id" for starting a target CPU. Also,
if target CPU is already running then we should return ALREADY_ON.

Current emulation of CPU_ON function does not consider physical
address of "context id" and returns INVALID_PARAMETERS if target
CPU is already running.

This patch updates kvm_psci_vcpu_on() such that it works for both
PSCI v0.1 and PSCI v0.2.

Signed-off-by: Anup Patel <anup.patel@linaro.org>
Signed-off-by: Pranavkumar Sawargaonkar <pranavkumar@linaro.org>
Reviewed-by: Christoffer Dall <christoffer.dall@linaro.org>
Acked-by: Marc Zyngier <marc.zyngier@arm.com>
Signed-off-by: Christoffer Dall <christoffer.dall@linaro.org>
2014-04-30 04:18:58 -07:00
Anup Patel
bab0b43012 ARM/ARM64: KVM: Emulate PSCI v0.2 MIGRATE_INFO_TYPE and related functions
This patch adds emulation of PSCI v0.2 MIGRATE, MIGRATE_INFO_TYPE, and
MIGRATE_INFO_UP_CPU function calls for KVM ARM/ARM64.

KVM ARM/ARM64 being a hypervisor (and not a Trusted OS), we cannot provide
this functions hence we emulate these functions in following way:
1. MIGRATE - Returns "Not Supported"
2. MIGRATE_INFO_TYPE - Return 2 i.e. Trusted OS is not present
3. MIGRATE_INFO_UP_CPU - Returns "Not Supported"

Signed-off-by: Anup Patel <anup.patel@linaro.org>
Signed-off-by: Pranavkumar Sawargaonkar <pranavkumar@linaro.org>
Reviewed-by: Christoffer Dall <christoffer.dall@linaro.org>
Acked-by: Marc Zyngier <marc.zyngier@arm.com>
Signed-off-by: Christoffer Dall <christoffer.dall@linaro.org>
2014-04-30 04:18:58 -07:00
Anup Patel
e6bc13c8a7 ARM/ARM64: KVM: Emulate PSCI v0.2 AFFINITY_INFO
This patch adds emulation of PSCI v0.2 AFFINITY_INFO function call
for KVM ARM/ARM64. This is a VCPU-level function call which will be
used to determine current state of given affinity level.

Signed-off-by: Anup Patel <anup.patel@linaro.org>
Signed-off-by: Pranavkumar Sawargaonkar <pranavkumar@linaro.org>
Reviewed-by: Christoffer Dall <christoffer.dall@linaro.org>
Acked-by: Marc Zyngier <marc.zyngier@arm.com>
Signed-off-by: Christoffer Dall <christoffer.dall@linaro.org>
2014-04-30 04:18:58 -07:00
Anup Patel
4b1238269e ARM/ARM64: KVM: Emulate PSCI v0.2 SYSTEM_OFF and SYSTEM_RESET
The PSCI v0.2 SYSTEM_OFF and SYSTEM_RESET functions are system-level
functions hence cannot be fully emulated by in-kernel PSCI emulation code.

To tackle this, we forward PSCI v0.2 SYSTEM_OFF and SYSTEM_RESET function
calls from vcpu to user space (i.e. QEMU or KVMTOOL) via kvm_run structure
using KVM_EXIT_SYSTEM_EVENT exit reasons.

Signed-off-by: Anup Patel <anup.patel@linaro.org>
Signed-off-by: Pranavkumar Sawargaonkar <pranavkumar@linaro.org>
Reviewed-by: Christoffer Dall <christoffer.dall@linaro.org>
Acked-by: Marc Zyngier <marc.zyngier@arm.com>
Signed-off-by: Christoffer Dall <christoffer.dall@linaro.org>
2014-04-30 04:18:58 -07:00
Anup Patel
e8e7fcc5e2 ARM/ARM64: KVM: Make kvm_psci_call() return convention more flexible
Currently, the kvm_psci_call() returns 'true' or 'false' based on whether
the PSCI function call was handled successfully or not. This does not help
us emulate system-level PSCI functions where the actual emulation work will
be done by user space (QEMU or KVMTOOL). Examples of such system-level PSCI
functions are: PSCI v0.2 SYSTEM_OFF and SYSTEM_RESET.

This patch updates kvm_psci_call() to return three types of values:
1) > 0 (success)
2) = 0 (success but exit to user space)
3) < 0 (errors)

Signed-off-by: Anup Patel <anup.patel@linaro.org>
Signed-off-by: Pranavkumar Sawargaonkar <pranavkumar@linaro.org>
Reviewed-by: Christoffer Dall <christoffer.dall@linaro.org>
Acked-by: Marc Zyngier <marc.zyngier@arm.com>
Signed-off-by: Christoffer Dall <christoffer.dall@linaro.org>
2014-04-30 04:18:57 -07:00
Anup Patel
7d0f84aae9 ARM/ARM64: KVM: Add base for PSCI v0.2 emulation
Currently, the in-kernel PSCI emulation provides PSCI v0.1 interface to
VCPUs. This patch extends current in-kernel PSCI emulation to provide
PSCI v0.2 interface to VCPUs.

By default, ARM/ARM64 KVM will always provide PSCI v0.1 interface for
keeping the ABI backward-compatible.

To select PSCI v0.2 interface for VCPUs, the user space (i.e. QEMU or
KVMTOOL) will have to set KVM_ARM_VCPU_PSCI_0_2 feature when doing VCPU
init using KVM_ARM_VCPU_INIT ioctl.

Signed-off-by: Anup Patel <anup.patel@linaro.org>
Signed-off-by: Pranavkumar Sawargaonkar <pranavkumar@linaro.org>
Acked-by: Christoffer Dall <christoffer.dall@linaro.org>
Acked-by: Marc Zyngier <marc.zyngier@arm.com>
Signed-off-by: Christoffer Dall <christoffer.dall@linaro.org>
2014-04-30 04:18:57 -07:00
Mark Salter
5d4e08c45a arm: KVM: fix possible misalignment of PGDs and bounce page
The kvm/mmu code shared by arm and arm64 uses kalloc() to allocate
a bounce page (if hypervisor init code crosses page boundary) and
hypervisor PGDs. The problem is that kalloc() does not guarantee
the proper alignment. In the case of the bounce page, the page sized
buffer allocated may also cross a page boundary negating the purpose
and leading to a hang during kvm initialization. Likewise the PGDs
allocated may not meet the minimum alignment requirements of the
underlying MMU. This patch uses __get_free_page() to guarantee the
worst case alignment needs of the bounce page and PGDs on both arm
and arm64.

Cc: <stable@vger.kernel.org> # 3.10+
Signed-off-by: Mark Salter <msalter@redhat.com>
Acked-by: Marc Zyngier <marc.zyngier@arm.com>
Signed-off-by: Christoffer Dall <christoffer.dall@linaro.org>
2014-04-28 03:21:48 -07:00
Will Deacon
4e4468fac4 ARM: KVM: disable KVM in Kconfig on big-endian systems
KVM currently crashes and burns on big-endian hosts, so don't allow it
to be selected until we've got that fixed.

Cc: Marc Zyngier <marc.zyngier@arm.com>
Signed-off-by: Will Deacon <will.deacon@arm.com>
Signed-off-by: Christoffer Dall <christoffer.dall@linaro.org>
2014-04-26 04:20:03 -07:00
Linus Torvalds
467a9e1633 CPU hotplug notifiers registration fixes for 3.15-rc1
The purpose of this single series of commits from Srivatsa S Bhat (with
 a small piece from Gautham R Shenoy) touching multiple subsystems that use
 CPU hotplug notifiers is to provide a way to register them that will not
 lead to deadlocks with CPU online/offline operations as described in the
 changelog of commit 93ae4f978c (CPU hotplug: Provide lockless versions
 of callback registration functions).
 
 The first three commits in the series introduce the API and document it
 and the rest simply goes through the users of CPU hotplug notifiers and
 converts them to using the new method.
 
 /
 -----BEGIN PGP SIGNATURE-----
 Version: GnuPG v2.0.22 (GNU/Linux)
 
 iQIcBAABCAAGBQJTQow2AAoJEILEb/54YlRxW4QQAJlYRDUzwFJzJzYhltQYuVR+
 4D74XMtvXgoJfg3cwdSWvMKKpJZnA9BVN0f7Hcx9wYmgdexYUuHeZJmMNyc3S2+g
 KjKBIsugvgmZhHbbLd6TJ6GBbhGT5JLt9VmSfL9zIkveInU1YHFUUqL/mxdHm4J0
 BSGKjk2rN3waRJgmY+xfliFLtQjDKFwJpMuvrgtoUyfas3f4sIV43UNbqdvA/weJ
 rzedxXOlKH/id4b56lj/4iIzcoL3mwvJJ7r6n0CEMsKv87z09kqR0O+69Tsq/cgs
 j17CsvoJOmZGk3QTeKVMQWBsvk6aPoDu3zK83gLbQMt+qjOpSTbJLz/3HZw4/TrW
 ss4nuZne1DLMGS+6hoxYbTP+6Ni//Kn+l/LrHc5jb7m1X3lMO4W2aV3IROtIE1rv
 lEP1IG01NU4u9YwkVj1dyhrkSp8tLPul4SrUK8W+oNweOC5crjJV7vJbIPJgmYiM
 IZN55wln0yVRtR4TX+rmvN0PixsInE8MeaVCmReApyF9pdzul/StxlBze5BKLSJD
 cqo1kNPpsmdxoDucqUpQ/gSvy+IOl2qnlisB5PpV93sk7De6TFDYrGHxjYIW7jMf
 StXwdCDDQhzd2Q8Kfpp895A1dbIl8rKtwA6bTU2eX+BfMVFzuMdT44cvosx1+UdQ
 sWl//rg76nb13dFjvF+q
 =SW7Q
 -----END PGP SIGNATURE-----

Merge tag 'cpu-hotplug-3.15-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm

Pull CPU hotplug notifiers registration fixes from Rafael Wysocki:
 "The purpose of this single series of commits from Srivatsa S Bhat
  (with a small piece from Gautham R Shenoy) touching multiple
  subsystems that use CPU hotplug notifiers is to provide a way to
  register them that will not lead to deadlocks with CPU online/offline
  operations as described in the changelog of commit 93ae4f978c ("CPU
  hotplug: Provide lockless versions of callback registration
  functions").

  The first three commits in the series introduce the API and document
  it and the rest simply goes through the users of CPU hotplug notifiers
  and converts them to using the new method"

* tag 'cpu-hotplug-3.15-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm: (52 commits)
  net/iucv/iucv.c: Fix CPU hotplug callback registration
  net/core/flow.c: Fix CPU hotplug callback registration
  mm, zswap: Fix CPU hotplug callback registration
  mm, vmstat: Fix CPU hotplug callback registration
  profile: Fix CPU hotplug callback registration
  trace, ring-buffer: Fix CPU hotplug callback registration
  xen, balloon: Fix CPU hotplug callback registration
  hwmon, via-cputemp: Fix CPU hotplug callback registration
  hwmon, coretemp: Fix CPU hotplug callback registration
  thermal, x86-pkg-temp: Fix CPU hotplug callback registration
  octeon, watchdog: Fix CPU hotplug callback registration
  oprofile, nmi-timer: Fix CPU hotplug callback registration
  intel-idle: Fix CPU hotplug callback registration
  clocksource, dummy-timer: Fix CPU hotplug callback registration
  drivers/base/topology.c: Fix CPU hotplug callback registration
  acpi-cpufreq: Fix CPU hotplug callback registration
  zsmalloc: Fix CPU hotplug callback registration
  scsi, fcoe: Fix CPU hotplug callback registration
  scsi, bnx2fc: Fix CPU hotplug callback registration
  scsi, bnx2i: Fix CPU hotplug callback registration
  ...
2014-04-07 14:55:46 -07:00
Srivatsa S. Bhat
8146875de7 arm, kvm: Fix CPU hotplug callback registration
On 03/15/2014 12:40 AM, Christoffer Dall wrote:
> On Fri, Mar 14, 2014 at 11:13:29AM +0530, Srivatsa S. Bhat wrote:
>> On 03/13/2014 04:51 AM, Christoffer Dall wrote:
>>> On Tue, Mar 11, 2014 at 02:05:38AM +0530, Srivatsa S. Bhat wrote:
>>>> Subsystems that want to register CPU hotplug callbacks, as well as perform
>>>> initialization for the CPUs that are already online, often do it as shown
>>>> below:
>>>>
[...]
>>> Just so we're clear, the existing code was simply racy as not prone to
>>> deadlocks, right?
>>>
>>> This makes it clear that the test above for compatible CPUs can be quite
>>> easily evaded by using CPU hotplug, but we don't really have a good
>>> solution for handling that yet...  Hmmm, grumble grumble, I guess if you
>>> hotplug unsupported CPUs on a KVM/ARM system for now, stuff will break.
>>>
>>
>> In this particular case, there was no deadlock possibility, rather the
>> existing code had insufficient synchronization against CPU hotplug.
>>
>> init_hyp_mode() would invoke cpu_init_hyp_mode() on currently online CPUs
>> using on_each_cpu(). If a CPU came online after this point and before calling
>> register_cpu_notifier(), that CPU would remain uninitialized because this
>> subsystem would miss the hot-online event. This patch fixes this bug and
>> also uses the new synchronization method (instead of get/put_online_cpus())
>> to ensure that we don't deadlock with CPU hotplug.
>>
>
> Yes, that was my conclusion as well.  Thanks for clarifying.  (It could
> be noted in the commit message as well if you should feel so inclined).
>

Please find the patch with updated changelog (and your Ack) below.
(No changes in code).

From: Srivatsa S. Bhat <srivatsa.bhat@linux.vnet.ibm.com>
Subject: [PATCH] arm, kvm: Fix CPU hotplug callback registration

Subsystems that want to register CPU hotplug callbacks, as well as perform
initialization for the CPUs that are already online, often do it as shown
below:

	get_online_cpus();

	for_each_online_cpu(cpu)
		init_cpu(cpu);

	register_cpu_notifier(&foobar_cpu_notifier);

	put_online_cpus();

This is wrong, since it is prone to ABBA deadlocks involving the
cpu_add_remove_lock and the cpu_hotplug.lock (when running concurrently
with CPU hotplug operations).

Instead, the correct and race-free way of performing the callback
registration is:

	cpu_notifier_register_begin();

	for_each_online_cpu(cpu)
		init_cpu(cpu);

	/* Note the use of the double underscored version of the API */
	__register_cpu_notifier(&foobar_cpu_notifier);

	cpu_notifier_register_done();

In the existing arm kvm code, there is no synchronization with CPU hotplug
to avoid missing the hotplug events that might occur after invoking
init_hyp_mode() and before calling register_cpu_notifier(). Fix this bug
and also use the new synchronization method (instead of get/put_online_cpus())
to ensure that we don't deadlock with CPU hotplug.

Cc: Gleb Natapov <gleb@kernel.org>
Cc: Russell King <linux@arm.linux.org.uk>
Cc: Ingo Molnar <mingo@kernel.org>
Acked-by: Paolo Bonzini <pbonzini@redhat.com>
Acked-by: Christoffer Dall <christoffer.dall@linaro.org>
Signed-off-by: Srivatsa S. Bhat <srivatsa.bhat@linux.vnet.ibm.com>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2014-03-20 13:43:41 +01:00
Marc Zyngier
56041bf920 ARM: KVM: fix warning in mmu.c
Compiling with THP enabled leads to the following warning:

arch/arm/kvm/mmu.c: In function ‘unmap_range’:
arch/arm/kvm/mmu.c:177:39: warning: ‘pte’ may be used uninitialized in this function [-Wmaybe-uninitialized]
   if (kvm_pmd_huge(*pmd) || page_empty(pte)) {
                                        ^
Code inspection reveals that these two cases are mutually exclusive,
so GCC is a bit overzealous here. Silence it anyway by initializing
pte to NULL and testing it later on.

Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
Acked-by: Christoffer Dall <christoffer.dall@linaro.org>
2014-03-03 01:15:25 +00:00
Marc Zyngier
8034699a42 ARM: KVM: trap VM system registers until MMU and caches are ON
In order to be able to detect the point where the guest enables
its MMU and caches, trap all the VM related system registers.

Once we see the guest enabling both the MMU and the caches, we
can go back to a saner mode of operation, which is to leave these
registers in complete control of the guest.

Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
Acked-by: Catalin Marinas <catalin.marinas@arm.com>
Reviewed-by: Christoffer Dall <christoffer.dall@linaro.org>
2014-03-03 01:15:24 +00:00
Marc Zyngier
af20814ee9 ARM: KVM: add world-switch for AMAIR{0,1}
HCR.TVM traps (among other things) accesses to AMAIR0 and AMAIR1.
In order to minimise the amount of surprise a guest could generate by
trying to access these registers with caches off, add them to the
list of registers we switch/handle.

Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
Reviewed-by: Christoffer Dall <christoffer.dall@linaro.org>
Acked-by: Catalin Marinas <catalin.marinas@arm.com>
2014-03-03 01:15:24 +00:00
Marc Zyngier
ac30a11e8e ARM: KVM: introduce per-vcpu HYP Configuration Register
So far, KVM/ARM used a fixed HCR configuration per guest, except for
the VI/VF/VA bits to control the interrupt in absence of VGIC.

With the upcoming need to dynamically reconfigure trapping, it becomes
necessary to allow the HCR to be changed on a per-vcpu basis.

The fix here is to mimic what KVM/arm64 already does: a per vcpu HCR
field, initialized at setup time.

Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
Reviewed-by: Christoffer Dall <christoffer.dall@linaro.org>
Acked-by: Catalin Marinas <catalin.marinas@arm.com>
2014-03-03 01:15:23 +00:00
Marc Zyngier
547f781378 ARM: KVM: fix ordering of 64bit coprocessor accesses
Commit 240e99cbd0 (ARM: KVM: Fix 64-bit coprocessor handling)
added an ordering dependency for the 64bit registers.

The order described is: CRn, CRm, Op1, Op2, 64bit-first.

Unfortunately, the implementation is: CRn, 64bit-first, CRm...

Move the 64bit test to be last in order to match the documentation.

Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
Reviewed-by: Christoffer Dall <christoffer.dall@linaro.org>
Acked-by: Catalin Marinas <catalin.marinas@arm.com>
2014-03-03 01:15:23 +00:00
Marc Zyngier
46c214dd59 ARM: KVM: fix handling of trapped 64bit coprocessor accesses
Commit 240e99cbd0 (ARM: KVM: Fix 64-bit coprocessor handling)
changed the way we match the 64bit coprocessor access from
user space, but didn't update the trap handler for the same
set of registers.

The effect is that a trapped 64bit access is never matched, leading
to a fault being injected into the guest. This went unnoticed as we
didn't really trap any 64bit register so far.

Placing the CRm field of the access into the CRn field of the matching
structure fixes the problem. Also update the debug feature to emit the
expected string in case of failing match.

Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
Reviewed-by: Christoffer Dall <christoffer.dall@linaro.org>
Acked-by: Catalin Marinas <catalin.marinas@arm.com>
2014-03-03 01:15:23 +00:00
Marc Zyngier
9d218a1fcf arm64: KVM: flush VM pages before letting the guest enable caches
When the guest runs with caches disabled (like in an early boot
sequence, for example), all the writes are diectly going to RAM,
bypassing the caches altogether.

Once the MMU and caches are enabled, whatever sits in the cache
becomes suddenly visible, which isn't what the guest expects.

A way to avoid this potential disaster is to invalidate the cache
when the MMU is being turned on. For this, we hook into the SCTLR_EL1
trapping code, and scan the stage-2 page tables, invalidating the
pages/sections that have already been mapped in.

Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
Reviewed-by: Catalin Marinas <catalin.marinas@arm.com>
Reviewed-by: Christoffer Dall <christoffer.dall@linaro.org>
2014-03-03 01:15:22 +00:00
Marc Zyngier
a3c8bd31af ARM: KVM: introduce kvm_p*d_addr_end
The use of p*d_addr_end with stage-2 translation is slightly dodgy,
as the IPA is 40bits, while all the p*d_addr_end helpers are
taking an unsigned long (arm64 is fine with that as unligned long
is 64bit).

The fix is to introduce 64bit clean versions of the same helpers,
and use them in the stage-2 page table code.

Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
Acked-by: Catalin Marinas <catalin.marinas@arm.com>
Reviewed-by: Christoffer Dall <christoffer.dall@linaro.org>
2014-03-03 01:15:22 +00:00
Marc Zyngier
2d58b733c8 arm64: KVM: force cache clean on page fault when caches are off
In order for the guest with caches off to observe data written
contained in a given page, we need to make sure that page is
committed to memory, and not just hanging in the cache (as
guest accesses are completely bypassing the cache until it
decides to enable it).

For this purpose, hook into the coherent_icache_guest_page
function and flush the region if the guest SCTLR_EL1
register doesn't show the MMU  and caches as being enabled.
The function also get renamed to coherent_cache_guest_page.

Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
Reviewed-by: Catalin Marinas <catalin.marinas@arm.com>
Reviewed-by: Christoffer Dall <christoffer.dall@linaro.org>
2014-03-03 01:15:20 +00:00
Marc Zyngier
b20c9f29c5 arm/arm64: KVM: detect CPU reset on CPU_PM_EXIT
Commit 1fcf7ce0c6 (arm: kvm: implement CPU PM notifier) added
support for CPU power-management, using a cpu_notifier to re-init
KVM on a CPU that entered CPU idle.

The code assumed that a CPU entering idle would actually be powered
off, loosing its state entierely, and would then need to be
reinitialized. It turns out that this is not always the case, and
some HW performs CPU PM without actually killing the core. In this
case, we try to reinitialize KVM while it is still live. It ends up
badly, as reported by Andre Przywara (using a Calxeda Midway):

[    3.663897] Kernel panic - not syncing: unexpected prefetch abort in Hyp mode at: 0x685760
[    3.663897] unexpected data abort in Hyp mode at: 0xc067d150
[    3.663897] unexpected HVC/SVC trap in Hyp mode at: 0xc0901dd0

The trick here is to detect if we've been through a full re-init or
not by looking at HVBAR (VBAR_EL2 on arm64). This involves
implementing the backend for __hyp_get_vectors in the main KVM HYP
code (rather small), and checking the return value against the
default one when the CPU notifier is called on CPU_PM_EXIT.

Reported-by: Andre Przywara <osp@andrep.de>
Tested-by: Andre Przywara <osp@andrep.de>
Cc: Lorenzo Pieralisi <lorenzo.pieralisi@arm.com>
Cc: Rob Herring <rob.herring@linaro.org>
Acked-by: Christoffer Dall <christoffer.dall@linaro.org>
Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2014-02-27 19:27:10 +01:00
Linus Torvalds
7ebd3faa9b First round of KVM updates for 3.14; PPC parts will come next week.
Nothing major here, just bugfixes all over the place.  The most
 interesting part is the ARM guys' virtualized interrupt controller
 overhaul, which lets userspace get/set the state and thus enables
 migration of ARM VMs.
 -----BEGIN PGP SIGNATURE-----
 Version: GnuPG v2.0.22 (GNU/Linux)
 
 iQIcBAABAgAGBQJS3TVKAAoJEBvWZb6bTYbyIFgP/2cmt4ifCuFMaZv4+G1S8jZU
 uC9ZB/+7vzht/p6zAy+4BxurKbHmSBFkC1OKcxYuy7yB4CQkHabzj4V2vRtqFdwH
 5lExP9qh3kqaVLuhnvxLTmkktR3EW4PFy6OI53l5kRNktOXSuZ0aN6K3V7tCg/X0
 iL7ASo4bJKlxeWcDpmuVrNgAajmZVfXrjKY7robgBQno+yIsgKhRZRBQHjozA6B8
 FpCo/k48RZd/EzIbV/PDDRI4hmmry/lgrO9SKjzq56wSqff2bd/k/KYze4dbAPfd
 Ps60enPTuHmeEjjb4MMMU4EKHVdTQFUMx/xZCmT4xzoh8s4of6RHphXbfE0SUznQ
 dTveyEQAR7E3JNS0k1+3WEX5fWlFesp0hO2NeE0wzUq4TAr9ztgVO9NQ6Si15e7Z
 2HysO0T5Ojtt0lY08/PvS6i48eCAuuBomrejJS8hLW4SUZ5adn+yW4Qo7Fp9JeBR
 l9a3LsVT8BZMtUWrUuFcVhlM4MbzElUPjDbgWhR8UYU/kpfVZOQu8qWgGKR4UWXy
 X7/t9l/tjR99CmfMJBAOzJid+ScSpAfg77BdaKiQrVfVIJmsjEjlO8vUMyj5b1HF
 hPX5wNyJjHAOfridLeHSs4Rdm4a8sk8Az5d4h76pLVz8M4jyTi2v0rO3N4/dU/pu
 x7N8KR5hAj+mLBoM9/Al
 =8sYU
 -----END PGP SIGNATURE-----

Merge tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm

Pull KVM updates from Paolo Bonzini:
 "First round of KVM updates for 3.14; PPC parts will come next week.

  Nothing major here, just bugfixes all over the place.  The most
  interesting part is the ARM guys' virtualized interrupt controller
  overhaul, which lets userspace get/set the state and thus enables
  migration of ARM VMs"

* tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm: (67 commits)
  kvm: make KVM_MMU_AUDIT help text more readable
  KVM: s390: Fix memory access error detection
  KVM: nVMX: Update guest activity state field on L2 exits
  KVM: nVMX: Fix nested_run_pending on activity state HLT
  KVM: nVMX: Clean up handling of VMX-related MSRs
  KVM: nVMX: Add tracepoints for nested_vmexit and nested_vmexit_inject
  KVM: nVMX: Pass vmexit parameters to nested_vmx_vmexit
  KVM: nVMX: Leave VMX mode on clearing of feature control MSR
  KVM: VMX: Fix DR6 update on #DB exception
  KVM: SVM: Fix reading of DR6
  KVM: x86: Sync DR7 on KVM_SET_DEBUGREGS
  add support for Hyper-V reference time counter
  KVM: remove useless write to vcpu->hv_clock.tsc_timestamp
  KVM: x86: fix tsc catchup issue with tsc scaling
  KVM: x86: limit PIT timer frequency
  KVM: x86: handle invalid root_hpa everywhere
  kvm: Provide kvm_vcpu_eligible_for_directed_yield() stub
  kvm: vfio: silence GCC warning
  KVM: ARM: Remove duplicate include
  arm/arm64: KVM: relax the requirements of VMA alignment for THP
  ...
2014-01-22 21:40:43 -08:00
Paolo Bonzini
ab53f22e2e Merge tag 'kvm-arm-for-3.14' of git://git.linaro.org/people/christoffer.dall/linux-kvm-arm into kvm-queue 2014-01-15 12:14:29 +01:00
Sachin Kamat
61466710de KVM: ARM: Remove duplicate include
trace.h was included twice. Remove duplicate inclusion.

Signed-off-by: Sachin Kamat <sachin.kamat@linaro.org>
Signed-off-by: Christoffer Dall <christoffer.dall@linaro.org>
2014-01-08 13:54:00 -08:00
Marc Zyngier
136d737fd2 arm/arm64: KVM: relax the requirements of VMA alignment for THP
The THP code in KVM/ARM is a bit restrictive in not allowing a THP
to be used if the VMA is not 2MB aligned. Actually, it is not so much
the VMA that matters, but the associated memslot:

A process can perfectly mmap a region with no particular alignment
restriction, and then pass a 2MB aligned address to KVM. In this
case, KVM will only use this 2MB aligned region, and will ignore
the range between vma->vm_start and memslot->userspace_addr.

It can also choose to place this memslot at whatever alignment it
wants in the IPA space. In the end, what matters is the relative
alignment of the user space and IPA mappings with respect to a
2M page. They absolutely must be the same if you want to use THP.

Cc: Christoffer Dall <christoffer.dall@linaro.org>
Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
Signed-off-by: Christoffer Dall <christoffer.dall@linaro.org>
2014-01-08 13:49:03 -08:00
Christoffer Dall
e9b152cb95 arm/arm64: kvm: Set vcpu->cpu to -1 on vcpu_put
The arch-generic KVM code expects the cpu field of a vcpu to be -1 if
the vcpu is no longer assigned to a cpu.  This is used for the optimized
make_all_cpus_request path and will be used by the vgic code to check
that no vcpus are running.

Acked-by: Marc Zyngier <marc.zyngier@arm.com>
Signed-off-by: Christoffer Dall <christoffer.dall@linaro.org>
2013-12-21 10:01:34 -08:00
Christoffer Dall
ce01e4e887 KVM: arm-vgic: Set base addr through device API
Support setting the distributor and cpu interface base addresses in the
VM physical address space through the KVM_{SET,GET}_DEVICE_ATTR API
in addition to the ARM specific API.

This has the added benefit of being able to share more code in user
space and do things in a uniform manner.

Also deprecate the older API at the same time, but backwards
compatibility will be maintained.

Acked-by: Marc Zyngier <marc.zyngier@arm.com>
Signed-off-by: Christoffer Dall <christoffer.dall@linaro.org>
2013-12-21 10:01:22 -08:00
Christoffer Dall
7330672bef KVM: arm-vgic: Support KVM_CREATE_DEVICE for VGIC
Support creating the ARM VGIC device through the KVM_CREATE_DEVICE
ioctl, which can then later be leveraged to use the
KVM_{GET/SET}_DEVICE_ATTR, which is useful both for setting addresses in
a more generic API than the ARM-specific one and is useful for
save/restore of VGIC state.

Adds KVM_CAP_DEVICE_CTRL to ARM capabilities.

Note that we change the check for creating a VGIC from bailing out if
any VCPUs were created, to bailing out if any VCPUs were ever run.  This
is an important distinction that shouldn't break anything, but allows
creating the VGIC after the VCPUs have been created.

Acked-by: Marc Zyngier <marc.zyngier@arm.com>
Signed-off-by: Christoffer Dall <christoffer.dall@linaro.org>
2013-12-21 10:01:16 -08:00
Christoffer Dall
e1ba0207a1 ARM: KVM: Allow creating the VGIC after VCPUs
Rework the VGIC initialization slightly to allow initialization of the
vgic cpu-specific state even if the irqchip (the VGIC) hasn't been
created by user space yet.  This is safe, because the vgic data
structures are already allocated when the CPU is allocated if VGIC
support is compiled into the kernel.  Further, the init process does not
depend on any other information and the sacrifice is a slight
performance degradation for creating VMs in the no-VGIC case.

The reason is that the new device control API doesn't mandate creating
the VGIC before creating the VCPU and it is unreasonable to require user
space to create the VGIC before creating the VCPUs.

At the same time move the irqchip_in_kernel check out of
kvm_vcpu_first_run_init and into the init function to make the per-vcpu
and global init functions symmetric and add comments on the exported
functions making it a bit easier to understand the init flow by only
looking at vgic.c.

Acked-by: Marc Zyngier <marc.zyngier@arm.com>
Signed-off-by: Christoffer Dall <christoffer.dall@linaro.org>
2013-12-21 10:01:06 -08:00
Andre Przywara
39735a3a39 ARM/KVM: save and restore generic timer registers
For migration to work we need to save (and later restore) the state of
each core's virtual generic timer.
Since this is per VCPU, we can use the [gs]et_one_reg ioctl and export
the three needed registers (control, counter, compare value).
Though they live in cp15 space, we don't use the existing list, since
they need special accessor functions and the arch timer is optional.

Acked-by: Marc Zynger <marc.zyngier@arm.com>
Signed-off-by: Andre Przywara <andre.przywara@linaro.org>
Signed-off-by: Christoffer Dall <christoffer.dall@linaro.org>
2013-12-21 10:00:15 -08:00
Christoffer Dall
a1a64387ad arm/arm64: KVM: arch_timer: Initialize cntvoff at kvm_init
Initialize the cntvoff at kvm_init_vm time, not before running the VCPUs
at the first time because that will overwrite any potentially restored
values from user space.

Cc: Andre Przywara <andre.przywara@linaro.org>
Acked-by: Marc Zynger <marc.zyngier@arm.com>
Signed-off-by: Christoffer Dall <christoffer.dall@linaro.org>
2013-12-21 09:58:57 -08:00
Christoffer Dall
478a8237f6 arm: KVM: Don't return PSCI_INVAL if waitqueue is inactive
The current KVM implementation of PSCI returns INVALID_PARAMETERS if the
waitqueue for the corresponding CPU is not active.  This does not seem
correct, since KVM should not care what the specific thread is doing,
for example, user space may not have called KVM_RUN on this VCPU yet or
the thread may be busy looping to user space because it received a
signal; this is really up to the user space implementation.  Instead we
should check specifically that the CPU is marked as being turned off,
regardless of the VCPU thread state, and if it is, we shall
simply clear the pause flag on the CPU and wake up the thread if it
happens to be blocked for us.

Further, the implementation seems to be racy when executing multiple
VCPU threads.  There really isn't a reasonable user space programming
scheme to ensure all secondary CPUs have reached kvm_vcpu_first_run_init
before turning on the boot CPU.

Therefore, set the pause flag on the vcpu at VCPU init time (which can
reasonably be expected to be completed for all CPUs by user space before
running any VCPUs) and clear both this flag and the feature (in case the
feature can somehow get set again in the future) and ping the waitqueue
on turning on a VCPU using PSCI.

Reported-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Christoffer Dall <christoffer.dall@linaro.org>
2013-12-21 09:55:17 -08:00
Lorenzo Pieralisi
1fcf7ce0c6 arm: kvm: implement CPU PM notifier
Upon CPU shutdown and consequent warm-reboot, the hypervisor CPU state
must be re-initialized. This patch implements a CPU PM notifier that
upon warm-boot calls a KVM hook to reinitialize properly the hypervisor
state so that the CPU can be safely resumed.

Acked-by: Marc Zyngier <marc.zyngier@arm.com>
Acked-by: Christoffer Dall <christoffer.dall@linaro.org>
Signed-off-by: Lorenzo Pieralisi <lorenzo.pieralisi@arm.com>
2013-12-16 17:17:33 +00:00
Santosh Shilimkar
4fda342cc7 arm/arm64: kvm: Use virt_to_idmap instead of virt_to_phys for idmap mappings
KVM initialisation fails on architectures implementing virt_to_idmap()
because virt_to_phys() on such architectures won't fetch you the correct
idmap page.

So update the KVM ARM code to use the virt_to_idmap() to fix the issue.
Since the KVM code is shared between arm and arm64, we create
kvm_virt_to_phys() and handle the redirection in respective headers.

Cc: Christoffer Dall <christoffer.dall@linaro.org>
Cc: Marc Zyngier <marc.zyngier@arm.com>
Cc: Catalin Marinas <catalin.marinas@arm.com>
Signed-off-by: Santosh Shilimkar <santosh.shilimkar@ti.com>
Signed-off-by: Christoffer Dall <christoffer.dall@linaro.org>
2013-12-11 09:49:31 -08:00
Gleb Natapov
2ecd1aba59 Fix percpu vmalloc allocations
-----BEGIN PGP SIGNATURE-----
 Version: GnuPG v1.4.11 (GNU/Linux)
 
 iQEcBAABAgAGBQJSiDFHAAoJEEtpOizt6ddyCWQH/jmTxPGBK3Bx1qTr25izgW8Z
 69SYTfkRoCV/3O3VWh9LiOSz9zrnCm0ODopQmMUDbEYF8KnzMiLASZ5mf3xmvoo1
 3KB/upM6DXEd2u2ZEQKnwp0cQZmcTzXM0PfGZ86xlmLFH6zjMt0iPa/FwflQMWsM
 QJyRJOMrB/WyLbY77FdsyK8RD4fJy7mlLS6LSXnnDb5gMeiPu8d9PWXdlIwCsEe3
 Zk0AAyJBXYn0yjzrZ/5HbUlmTBMLbEIomzCqCVAglpiXUb7E0w7YoyTc3dagyMyK
 zZ+6OqJAZpIi5tliGuhadLsioinM/L9xfnOFtd8jtvPqZbYcetkphTbBg9Bj4mw=
 =VmDY
 -----END PGP SIGNATURE-----

Merge tag 'kvm-arm-fixes-3.13-1' of git://git.linaro.org/people/cdall/linux-kvm-arm into next

Fix percpu vmalloc allocations
2013-11-19 10:43:05 +02:00
Christoffer Dall
40c2729bab arm/arm64: KVM: Fix hyp mappings of vmalloc regions
Using virt_to_phys on percpu mappings is horribly wrong as it may be
backed by vmalloc.  Introduce kvm_kaddr_to_phys which translates both
types of valid kernel addresses to the corresponding physical address.

At the same time resolves a typing issue where we were storing the
physical address as a 32 bit unsigned long (on arm), truncating the
physical address for addresses above the 4GB limit.  This caused
breakage on Keystone.

Cc: <stable@vger.kernel.org>	[3.10+]
Reported-by: Santosh Shilimkar <santosh.shilimkar@ti.com>
Tested-by: Santosh Shilimkar <santosh.shilimkar@ti.com>
Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
Signed-off-by: Christoffer Dall <christoffer.dall@linaro.org>
2013-11-16 18:54:45 -08:00
Linus Torvalds
f080480488 Here are the 3.13 KVM changes. There was a lot of work on the PPC
side: the HV and emulation flavors can now coexist in a single kernel
 is probably the most interesting change from a user point of view.
 On the x86 side there are nested virtualization improvements and a
 few bugfixes.  ARM got transparent huge page support, improved
 overcommit, and support for big endian guests.
 
 Finally, there is a new interface to connect KVM with VFIO.  This
 helps with devices that use NoSnoop PCI transactions, letting the
 driver in the guest execute WBINVD instructions.  This includes
 some nVidia cards on Windows, that fail to start without these
 patches and the corresponding userspace changes.
 -----BEGIN PGP SIGNATURE-----
 Version: GnuPG v2.0.22 (GNU/Linux)
 
 iQIcBAABAgAGBQJShPAhAAoJEBvWZb6bTYbyl48P/297GgmELHAGBgjvb6q7yyGu
 L8+eHjKbh4XBAkPwyzbvUjuww5z2hM0N3JQ0BDV9oeXlO+zwwCEns/sg2Q5/NJXq
 XxnTeShaKnp9lqVBnE6G9rAOUWKoyLJ2wItlvUL8JlaO9xJ0Vmk0ta4n2Nv5GqDp
 db6UD7vju6rHtIAhNpvvAO51kAOwc01xxRixCVb7KUYOnmO9nvpixzoI/S0Rp1gu
 w/OWMfCosDzBoT+cOe79Yx1OKcpaVW94X6CH1s+ShCw3wcbCL2f13Ka8/E3FIcuq
 vkZaLBxio7vjUAHRjPObw0XBW4InXEbhI1DjzIvm8dmc4VsgmtLQkTCG8fj+jINc
 dlHQUq6Do+1F4zy6WMBUj8tNeP1Z9DsABp98rQwR8+BwHoQpGQBpAxW0TE0ZMngC
 t1caqyvjZ5pPpFUxSrAV+8Kg4AvobXPYOim0vqV7Qea07KhFcBXLCfF7BWdwq/Jc
 0CAOlsLL4mHGIQWZJuVGw0YGP7oATDCyewlBuDObx+szYCoV4fQGZVBEL0KwJx/1
 7lrLN7JWzRyw6xTgJ5VVwgYE1tUY4IFQcHu7/5N+dw8/xg9KWA3f4PeMavIKSf+R
 qteewbtmQsxUnvuQIBHLs8NRWPnBPy+F3Sc2ckeOLIe4pmfTte6shtTXcLDL+LqH
 NTmT/cfmYp2BRkiCfCiS
 =rWNf
 -----END PGP SIGNATURE-----

Merge tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm

Pull KVM changes from Paolo Bonzini:
 "Here are the 3.13 KVM changes.  There was a lot of work on the PPC
  side: the HV and emulation flavors can now coexist in a single kernel
  is probably the most interesting change from a user point of view.

  On the x86 side there are nested virtualization improvements and a few
  bugfixes.

  ARM got transparent huge page support, improved overcommit, and
  support for big endian guests.

  Finally, there is a new interface to connect KVM with VFIO.  This
  helps with devices that use NoSnoop PCI transactions, letting the
  driver in the guest execute WBINVD instructions.  This includes some
  nVidia cards on Windows, that fail to start without these patches and
  the corresponding userspace changes"

* tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm: (146 commits)
  kvm, vmx: Fix lazy FPU on nested guest
  arm/arm64: KVM: PSCI: propagate caller endianness to the incoming vcpu
  arm/arm64: KVM: MMIO support for BE guest
  kvm, cpuid: Fix sparse warning
  kvm: Delete prototype for non-existent function kvm_check_iopl
  kvm: Delete prototype for non-existent function complete_pio
  hung_task: add method to reset detector
  pvclock: detect watchdog reset at pvclock read
  kvm: optimize out smp_mb after srcu_read_unlock
  srcu: API for barrier after srcu read unlock
  KVM: remove vm mmap method
  KVM: IOMMU: hva align mapping page size
  KVM: x86: trace cpuid emulation when called from emulator
  KVM: emulator: cleanup decode_register_operand() a bit
  KVM: emulator: check rex prefix inside decode_register()
  KVM: x86: fix emulation of "movzbl %bpl, %eax"
  kvm_host: typo fix
  KVM: x86: emulate SAHF instruction
  MAINTAINERS: add tree for kvm.git
  Documentation/kvm: add a 00-INDEX file
  ...
2013-11-15 13:51:36 +09:00
Paolo Bonzini
ede5822242 A handful of fixes for KVM/arm64:
- A couple a basic fixes for running BE guests on a LE host
 - A performance improvement for overcommitted VMs (same as the equivalent
   patch for ARM)
 -----BEGIN PGP SIGNATURE-----
 Version: GnuPG v1.4.10 (GNU/Linux)
 
 iQIcBAABAgAGBQJSfLSVAAoJECPQ0LrRPXpDMG0QAJrTocErN2BQMDoT9DpcQhh6
 yoD6KjbS3O4lWz60wJ0BgJ6gcQFg7JiFrPk6JcyT+ykXYf1UuLymUhAkU7Sw+0lP
 GVt7sr2SaaQd6ZjGphWyWPXuDbvN1CxyIi7TD4CNe0tTYwSI6Vaf19h2Bkjd+VfJ
 o2Sf2zHz4mutTCmPJuqnI255MLveTyQr/VZT1xNS79FiJM3/j3+UxCEi1fwgTkkb
 4l3AyW9RN2mmTsS4VE6an2iosCi9pqoAC3y88vnaeBpUHIf/O2O57sT0+8o7jM6z
 6uZVesMsKNmDtkMUFRQyj4Mps3yIVcccDpFJr4UcZH+ipbM+5nPY8AlYcKqk+4KY
 T7Zys1hITq7xSEK4HiQt+AJnXDXZF5YZnzqUqVQHZFBn5P1GfB4/Bo9E3+QG68oq
 AO1ry6SiRRPmTAZVeYqV82DX1YSjbvghvvPXhtPvolNhyzJJooBvhpWfGthVeZds
 tuazKvvwDv0pFEWSwiFvWyqGW4FHKz3vWSUfuR1MF2P86fIfT5buJ9/StMxiRRH8
 tSwoV3Ksut2kX9l5o8MZqmv1UkH88hwxxxd2J2aHGU+XPLlbQaYSWSK0d6hKTLMZ
 ErZmCK/BARdll9UiTdXZ8h7UjfVDfuhTVeW1PvKQJ7sLGCDeT0VfhcrXnTtr41Je
 iLWeDea0NJBzQxRZoJHG
 =s0ny
 -----END PGP SIGNATURE-----

Merge tag 'kvm-arm64/for-3.13-1' of git://git.kernel.org/pub/scm/linux/kernel/git/maz/arm-platforms into kvm-next

A handful of fixes for KVM/arm64:

- A couple a basic fixes for running BE guests on a LE host
- A performance improvement for overcommitted VMs (same as the equivalent
  patch for ARM)

Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>

Conflicts:
	arch/arm/include/asm/kvm_emulate.h
	arch/arm64/include/asm/kvm_emulate.h
2013-11-11 12:05:20 +01:00
Paolo Bonzini
6da8ae556c Updates for KVM/ARM, take 3 supporting more than 4 CPUs.
-----BEGIN PGP SIGNATURE-----
 Version: GnuPG v1.4.11 (GNU/Linux)
 
 iQEcBAABAgAGBQJSfD2NAAoJEEtpOizt6ddyK0IH+wQf6Hwe2nLAhj86knqDHsPt
 LgJ6ZSY2bhWTKhCVDkH4HQt6ZqWEV7P8HsLNLc9FxjxCIgGO6Lp6Obv6sYscZrvh
 OdzsZ/+j1t035qmeLwBJnB2x+j21ACd5LYVaWkfJPmGJ40KrcgP/t3fj3r1le91Z
 jUM5BZY8fbEpJ1JaLoYZIEm1nYQJ4cabkqb9dFieqbVB1OoYw5W7KCV0wazOpg+f
 T2WL8Dy+lP8DfGRrEjsIM299DlMAFfUCj7mgfO3yTDUK0Q6Q3ZN7f394c+04OfZv
 /V1fM4y4X1i/2wlL3oPaf1WtYhwd2VxJwo1n3SLg1b9gWghYnqUR/vBHrUmzX/A=
 =2ESp
 -----END PGP SIGNATURE-----

Merge tag 'kvm-arm-for-3.13-3' of git://git.linaro.org/people/cdall/linux-kvm-arm into kvm-next

Updates for KVM/ARM, take 3 supporting more than 4 CPUs.

Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>

Conflicts:
	arch/arm/kvm/reset.c [cpu_reset->reset_regs change; context only]
2013-11-11 12:02:27 +01:00
Marc Zyngier
ce94fe93d5 arm/arm64: KVM: PSCI: propagate caller endianness to the incoming vcpu
When booting a vcpu using PSCI, make sure we start it with the
endianness of the caller. Otherwise, secondaries can be pretty
unhappy to execute a BE kernel in LE mode...

This conforms to PSCI spec Rev B, 5.13.3.

Acked-by: Christoffer Dall <christoffer.dall@linaro.org>
Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
2013-11-07 19:09:08 +00:00
Marc Zyngier
6d89d2d9b5 arm/arm64: KVM: MMIO support for BE guest
Do the necessary byteswap when host and guest have different
views of the universe. Actually, the only case we need to take
care of is when the guest is BE. All the other cases are naturally
handled.

Also be careful about endianness when the data is being memcopy-ed
from/to the run buffer.

Acked-by: Christoffer Dall <christoffer.dall@linaro.org>
Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
2013-11-07 19:09:04 +00:00
Gleb Natapov
95f328d3ad Merge branch 'kvm-ppc-queue' of git://github.com/agraf/linux-2.6 into queue
Conflicts:
	arch/powerpc/include/asm/processor.h
2013-11-04 10:20:57 +02:00
Christoph Lameter
1436c1aa62 ARM: 7862/1: pcpu: replace __get_cpu_var_uses
This is the ARM part of Christoph's patchset cleaning up the various
uses of __get_cpu_var across the tree.

The idea is to convert __get_cpu_var into either an explicit address
calculation using this_cpu_ptr() or into a use of this_cpu operations
that use the offset. Thereby address calculations are avoided and fewer
registers are used when code is generated.

[will: fixed debug ref counting checks and pcpu array accesses]

Acked-by: Catalin Marinas <catalin.marinas@arm.com>
Signed-off-by: Christoph Lameter <cl@linux.com>
Signed-off-by: Will Deacon <will.deacon@arm.com>
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
2013-10-29 11:06:27 +00:00
Paolo Bonzini
5bb3398dd2 Updates for KVM/ARM, take 2 including:
- Transparent Huge Pages and hugetlbfs support for KVM/ARM
  - Yield CPU when guest executes WFE to speed up CPU overcommit
 -----BEGIN PGP SIGNATURE-----
 Version: GnuPG v1.4.11 (GNU/Linux)
 
 iQEcBAABAgAGBQJSZ5NVAAoJEEtpOizt6ddyEJgH+wWw6KWqHlParb+rf04cqCQV
 Fj3euz+SpYr2U2u0RimgkmeahUiGUhnlBSSH+tkLmt1if6nLawBJbUcIhaZMVdv+
 cvS6k+NtK7ibwPOyFeoZCS8taEbVDut2YgrtRKbne6QDLRYBEXFtpY8o6ptLoSu4
 ifQCF0FZyElCGLylSxFt9GsK+LjNjQWatVrzoHap9d58u2bma6GYwr4mEzVMHms7
 REtTvpwWgsDR5C/69aG8wE4cpJZALH3OeCgy6AccdzTLaQWWpK2YLWz8AFOvoYx6
 EsFmBFHZYcuwN+fv2jILgA3Is1oWwqI6k5lL+N3g/oTNNALDSWnfiJkXypJsfow=
 =2Ijm
 -----END PGP SIGNATURE-----

Merge tag 'kvm-arm-for-3.13-2' of git://git.linaro.org/people/cdall/linux-kvm-arm into kvm-queue

Updates for KVM/ARM, take 2 including:
 - Transparent Huge Pages and hugetlbfs support for KVM/ARM
 - Yield CPU when guest executes WFE to speed up CPU overcommit
2013-10-28 13:15:55 +01:00
Marc Zyngier
79c648806f arm/arm64: KVM: PSCI: use MPIDR to identify a target CPU
The KVM PSCI code blindly assumes that vcpu_id and MPIDR are
the same thing. This is true when vcpus are organized as a flat
topology, but is wrong when trying to emulate any other topology
(such as A15 clusters).

Change the KVM PSCI CPU_ON code to look at the MPIDR instead
of the vcpu_id to pick a target CPU.

Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
Signed-off-by: Christoffer Dall <christoffer.dall@linaro.org>
2013-10-22 08:00:06 -07:00
Marc Zyngier
7999b4d182 ARM: KVM: drop limitation to 4 CPU VMs
Now that the KVM/arm code knows about affinity, remove the hard
limit of 4 vcpus per VM.

Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
Signed-off-by: Christoffer Dall <christoffer.dall@linaro.org>
2013-10-22 08:00:06 -07:00
Marc Zyngier
9cbb6d969c ARM: KVM: fix L2CTLR to be per-cluster
The L2CTLR register contains the number of CPUs in this cluster.

Make sure the register content is actually relevant to the vcpu
that is being configured by computing the number of cores that are
part of its cluster.

Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
Signed-off-by: Christoffer Dall <christoffer.dall@linaro.org>
2013-10-22 08:00:06 -07:00
Marc Zyngier
2d1d841bd4 ARM: KVM: Fix MPIDR computing to support virtual clusters
In order to be able to support more than 4 A7 or A15 CPUs,
we need to fix the MPIDR computing to reflect the fact that
both A15 and A7 can only exist in clusters of at most 4 CPUs.

Fix the MPIDR computing to allow virtual clusters to be exposed
to the guest.

Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
Signed-off-by: Christoffer Dall <christoffer.dall@linaro.org>
2013-10-22 08:00:06 -07:00
Christoffer Dall
9b5fdb9781 KVM: ARM: Transparent huge page (THP) support
Support transparent huge pages in KVM/ARM and KVM/ARM64.  The
transparent_hugepage_adjust is not very pretty, but this is also how
it's solved on x86 and seems to be simply an artifact on how THPs
behave.  This should eventually be shared across architectures if
possible, but that can always be changed down the road.

Acked-by: Marc Zyngier <marc.zyngier@arm.com>
Signed-off-by: Christoffer Dall <christoffer.dall@linaro.org>
2013-10-17 17:06:30 -07:00
Christoffer Dall
ad361f093c KVM: ARM: Support hugetlbfs backed huge pages
Support huge pages in KVM/ARM and KVM/ARM64.  The pud_huge checking on
the unmap path may feel a bit silly as the pud_huge check is always
defined to false, but the compiler should be smart about this.

Note: This deals only with VMAs marked as huge which are allocated by
users through hugetlbfs only.  Transparent huge pages can only be
detected by looking at the underlying pages (or the page tables
themselves) and this patch so far simply maps these on a page-by-page
level in the Stage-2 page tables.

Cc: Catalin Marinas <catalin.marinas@arm.com>
Cc: Russell King <rmk+kernel@arm.linux.org.uk>
Acked-by: Catalin Marinas <catalin.marinas@arm.com>
Acked-by: Marc Zyngier <marc.zyngier@arm.com>
Signed-off-by: Christoffer Dall <christoffer.dall@linaro.org>
2013-10-17 17:06:20 -07:00
Christoffer Dall
86ed81aa2e KVM: ARM: Update comments for kvm_handle_wfi
Update comments to reflect what is really going on and add the TWE bit
to the comments in kvm_arm.h.

Also renames the function to kvm_handle_wfx like is done on arm64 for
consistency and uber-correctness.

Signed-off-by: Christoffer Dall <christoffer.dall@linaro.org>
2013-10-17 15:26:50 -07:00
Marc Zyngier
58d5ec8f8e ARM: KVM: Yield CPU when vcpu executes a WFE
On an (even slightly) oversubscribed system, spinlocks are quickly
becoming a bottleneck, as some vcpus are spinning, waiting for a
lock to be released, while the vcpu holding the lock may not be
running at all.

This creates contention, and the observed slowdown is 40x for
hackbench. No, this isn't a typo.

The solution is to trap blocking WFEs and tell KVM that we're
now spinning. This ensures that other vpus will get a scheduling
boost, allowing the lock to be released more quickly. Also, using
CONFIG_HAVE_KVM_CPU_RELAX_INTERCEPT slightly improves the performance
when the VM is severely overcommited.

Quick test to estimate the performance: hackbench 1 process 1000

2xA15 host (baseline):	1.843s

2xA15 guest w/o patch:	2.083s
4xA15 guest w/o patch:	80.212s
8xA15 guest w/o patch:	Could not be bothered to find out

2xA15 guest w/ patch:	2.102s
4xA15 guest w/ patch:	3.205s
8xA15 guest w/ patch:	6.887s

So we go from a 40x degradation to 1.5x in the 2x overcommit case,
which is vaguely more acceptable.

Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
Signed-off-by: Christoffer Dall <christoffer.dall@linaro.org>
2013-10-17 15:26:50 -07:00
Gleb Natapov
13acfd5715 Powerpc KVM work is based on a commit after rc4.
Merging master into next to satisfy the dependencies.

Conflicts:
	arch/arm/kvm/reset.c
2013-10-17 17:41:49 +03:00
Aneesh Kumar K.V
5587027ce9 kvm: Add struct kvm arg to memslot APIs
We will use that in the later patch to find the kvm ops handler

Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com>
Signed-off-by: Alexander Graf <agraf@suse.de>
2013-10-17 15:49:23 +02:00
Jonathan Austin
e8c2d99f82 KVM: ARM: Add support for Cortex-A7
This patch adds support for running Cortex-A7 guests on Cortex-A7 hosts.

As Cortex-A7 is architecturally compatible with A15, this patch is largely just
generalising existing code. Areas where 'implementation defined' behaviour
is identical for A7 and A15 is moved to allow it to be used by both cores.

The check to ensure that coprocessor register tables are sorted correctly is
also moved in to 'common' code to avoid each new cpu doing its own check
(and possibly forgetting to do so!)

Signed-off-by: Jonathan Austin <jonathan.austin@arm.com>
Acked-by: Marc Zyngier <marc.zyngier@arm.com>
Signed-off-by: Christoffer Dall <christoffer.dall@linaro.org>
2013-10-12 17:45:30 -07:00
Jonathan Austin
1158fca401 KVM: ARM: Fix calculation of virtual CPU ID
KVM does not have a notion of multiple clusters for CPUs, just a linear
array of CPUs. When using a system with cores in more than one cluster, the
current method for calculating the virtual MPIDR will leak the (physical)
cluster information into the virtual MPIDR. One effect of this is that
Linux under KVM fails to boot multiple CPUs that aren't in the 0th cluster.

This patch does away with exposing the real MPIDR fields in favour of simply
using the virtual CPU number (but preserving the U bit, as before).

Signed-off-by: Jonathan Austin <jonathan.austin@arm.com>
Acked-by: Marc Zyngier <marc.zyngier@arm.com>
Signed-off-by: Christoffer Dall <christoffer.dall@linaro.org>
2013-10-12 17:44:39 -07:00
Anup Patel
42c4e0c77a ARM/ARM64: KVM: Implement KVM_ARM_PREFERRED_TARGET ioctl
For implementing CPU=host, we need a mechanism for querying
preferred VCPU target type on underlying Host.

This patch implements KVM_ARM_PREFERRED_TARGET vm ioctl which
returns struct kvm_vcpu_init instance containing information
about preferred VCPU target type and target specific features
available for it.

Signed-off-by: Anup Patel <anup.patel@linaro.org>
Signed-off-by: Pranavkumar Sawargaonkar <pranavkumar@linaro.org>
Signed-off-by: Christoffer Dall <christoffer.dall@linaro.org>
2013-10-02 11:29:48 -07:00
Anup Patel
4a6fee805d ARM: KVM: Implement kvm_vcpu_preferred_target() function
This patch implements kvm_vcpu_preferred_target() function for
KVM ARM which will help us implement KVM_ARM_PREFERRED_TARGET ioctl
for user space.

Signed-off-by: Anup Patel <anup.patel@linaro.org>
Signed-off-by: Pranavkumar Sawargaonkar <pranavkumar@linaro.org>
Signed-off-by: Christoffer Dall <christoffer.dall@linaro.org>
2013-10-02 11:29:10 -07:00
Anup Patel
b373e492f3 KVM: ARM: Fix typo in comments of inject_abt()
Very minor typo in comments of inject_abt() when we update fault status
register for injecting prefetch abort.

Signed-off-by: Anup Patel <anup.patel@linaro.org>
Signed-off-by: Pranavkumar Sawargaonkar <pranavkumar@linaro.org>
Signed-off-by: Christoffer Dall <christoffer.dall@linaro.org>
2013-10-02 17:29:19 +01:00
Olof Johansson
ac570e0493 ARM: kvm: rename cpu_reset to avoid name clash
cpu_reset is already #defined in <asm/proc-fns.h> as processor.reset,
so it expands here and causes problems.

Cc: <stable@vger.kernel.org>
Signed-off-by: Olof Johansson <olof@lixom.net>
Signed-off-by: Christoffer Dall <christoffer.dall@linaro.org>
2013-09-24 11:15:05 -07:00
Linus Torvalds
2e03285224 Merge branch 'for-linus' of git://git.linaro.org/people/rmk/linux-arm
Pull ARM updates from Russell King:
 "This set includes adding support for Neon acceleration of RAID6 XOR
  code from Ard Biesheuvel, cache flushing and barrier updates from Will
  Deacon, and a cleanup to the ARM debug code which reduces the amount
  of code by about 500 lines.

  A few other cleanups, such as constifying the machine descriptors
  which already shouldn't be written to, cleaning up the printing of the
  L2 cache size"

* 'for-linus' of git://git.linaro.org/people/rmk/linux-arm: (55 commits)
  ARM: 7826/1: debug: support debug ll on hisilicon soc
  ARM: 7830/1: delay: don't bother reporting bogomips in /proc/cpuinfo
  ARM: 7829/1: Add ".text.unlikely" and ".text.hot" to arm unwind tables
  ARM: 7828/1: ARMv7-M: implement restart routine common to all v7-M machines
  ARM: 7827/1: highbank: fix debug uart virtual address for LPAE
  ARM: 7823/1: errata: workaround Cortex-A15 erratum 773022
  ARM: 7806/1: allow DEBUG_UNCOMPRESS for Tegra
  ARM: 7793/1: debug: use generic option for ep93xx PL10x debug port
  ARM: debug: move SPEAr debug to generic PL01x code
  ARM: debug: move davinci debug to generic 8250 code
  ARM: debug: move keystone debug to generic 8250 code
  ARM: debug: remove DEBUG_ROCKCHIP_UART
  ARM: debug: provide generic option choices for 8250 and PL01x ports
  ARM: debug: move PL01X debug include into arch/arm/include/debug/
  ARM: debug: provide PL01x debug uart phys/virt address configuration options
  ARM: debug: add support for word accesses to debug/8250.S
  ARM: debug: move 8250 debug include into arch/arm/include/debug/
  ARM: debug: provide 8250 debug uart phys/virt address configuration options
  ARM: debug: provide 8250 debug uart register shift configuration option
  ARM: debug: provide 8250 debug uart flow control configuration option
  ...
2013-09-05 18:07:32 -07:00
Russell King
141b97433d Merge branches 'debug-choice', 'devel-stable' and 'misc' into for-linus 2013-09-05 10:34:15 +01:00
Linus Torvalds
ae7a835cc5 Merge branch 'next' of git://git.kernel.org/pub/scm/virt/kvm/kvm
Pull KVM updates from Gleb Natapov:
 "The highlights of the release are nested EPT and pv-ticketlocks
  support (hypervisor part, guest part, which is most of the code, goes
  through tip tree).  Apart of that there are many fixes for all arches"

Fix up semantic conflicts as discussed in the pull request thread..

* 'next' of git://git.kernel.org/pub/scm/virt/kvm/kvm: (88 commits)
  ARM: KVM: Add newlines to panic strings
  ARM: KVM: Work around older compiler bug
  ARM: KVM: Simplify tracepoint text
  ARM: KVM: Fix kvm_set_pte assignment
  ARM: KVM: vgic: Bump VGIC_NR_IRQS to 256
  ARM: KVM: Bugfix: vgic_bytemap_get_reg per cpu regs
  ARM: KVM: vgic: fix GICD_ICFGRn access
  ARM: KVM: vgic: simplify vgic_get_target_reg
  KVM: MMU: remove unused parameter
  KVM: PPC: Book3S PR: Rework kvmppc_mmu_book3s_64_xlate()
  KVM: PPC: Book3S PR: Make instruction fetch fallback work for system calls
  KVM: PPC: Book3S PR: Don't corrupt guest state when kernel uses VMX
  KVM: x86: update masterclock when kvmclock_offset is calculated (v2)
  KVM: PPC: Book3S: Fix compile error in XICS emulation
  KVM: PPC: Book3S PR: return appropriate error when allocation fails
  arch: powerpc: kvm: add signed type cast for comparation
  KVM: x86: add comments where MMIO does not return to the emulator
  KVM: vmx: count exits to userspace during invalid guest emulation
  KVM: rename __kvm_io_bus_sort_cmp to kvm_io_bus_cmp
  kvm: optimize away THP checks in kvm_is_mmio_pfn()
  ...
2013-09-04 18:15:06 -07:00
Christoffer Dall
1fe40f6d39 ARM: KVM: Add newlines to panic strings
The panic strings are hard to read and on narrow terminals some
characters are simply truncated off the panic message.

Make is slightly prettier with a newline in the Hyp panic strings.

Acked-by: Marc Zyngier <marc.zyngier@arm.com>
Signed-off-by: Christoffer Dall <christoffer.dall@linaro.org>
2013-08-30 15:48:02 -07:00
Christoffer Dall
6833d83891 ARM: KVM: Work around older compiler bug
Compilers before 4.6 do not behave well with unnamed fields in structure
initializers and therefore produces build errors:
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=10676

By refering to the unnamed union using braces, both older and newer
compilers produce the same result.

Acked-by: Marc Zyngier <marc.zyngier@arm.com>
Reported-by: Russell King <linux@arm.linux.org.uk>
Tested-by: Russell King <linux@arm.linux.org.uk>
Signed-off-by: Christoffer Dall <christoffer.dall@linaro.org>
2013-08-30 15:47:58 -07:00
Christoffer Dall
6e72cc5700 ARM: KVM: Simplify tracepoint text
The tracepoint for kvm_guest_fault was extremely long, make it a
slightly bit shorter.

Cc: Sergei Shtylyov <sergei.shtylyov@cogentembedded.com>
Acked-by: Marc Zyngier <marc.zyngier@arm.com>
Signed-off-by: Christoffer Dall <christoffer.dall@linaro.org>
2013-08-30 15:47:53 -07:00
Christoffer Dall
8947c09d05 ARM: 7808/1: KVM: mm: Get rid of L_PTE_USER ref from PAGE_S2_DEVICE
THe L_PTE_USER actually has nothing to do with stage 2 mappings and the
L_PTE_S2_RDWR value sets the readable bit, which was what L_PTE_USER
was used for before proper handling of stage 2 memory defines.

Changelog:
  [v3]: Drop call to kvm_set_s2pte_writable in mmu.c
  [v2]: Change default mappings to be r/w instead of r/o, as per Marc
     Zyngier's suggestion.

Cc: Marc Zyngier <marc.zyngier@arm.com>
Signed-off-by: Christoffer Dall <christoffer.dall@linaro.org>
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
2013-08-13 20:25:06 +01:00
Will Deacon
e3ab547f57 ARM: kvm: use inner-shareable barriers after TLB flushing
When flushing the TLB at PL2 in response to remapping at stage-2 or VMID
rollover, we have a dsb instruction to ensure completion of the command
before continuing.

Since we only care about other processors for TLB invalidation, use the
inner-shareable variant of the dsb instruction instead.

Acked-by: Marc Zyngier <marc.zyngier@arm.com>
Reviewed-by: Catalin Marinas <catalin.marinas@arm.com>
Signed-off-by: Will Deacon <will.deacon@arm.com>
2013-08-12 12:25:45 +01:00
Christoffer Dall
2184a60de2 KVM: ARM: Squash len warning
The 'len' variable was declared an unsigned and then checked for less
than 0, which results in warnings on some compilers.  Since len is
assigned an int, make it an int.

Signed-off-by: Christoffer Dall <christoffer.dall@linaro.org>
2013-08-11 21:03:39 -07:00
Marc Zyngier
979acd5e18 arm64: KVM: fix 2-level page tables unmapping
When using 64kB pages, we only have two levels of page tables,
meaning that PGD, PUD and PMD are fused. In this case, trying
to refcount PUDs and PMDs independently is a a complete disaster,
as they are the same.

We manage to get it right for the allocation (stage2_set_pte uses
{pmd,pud}_none), but the unmapping path clears both pud and pmd
refcounts, which fails spectacularly with 2-level page tables.

The fix is to avoid calling clear_pud_entry when both the pmd and
pud pages are empty. For this, and instead of introducing another
pud_empty function, consolidate both pte_empty and pmd_empty into
page_empty (the code is actually identical) and use that to also
test the validity of the pud.

Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
Signed-off-by: Christoffer Dall <christoffer.dall@linaro.org>
2013-08-07 18:17:39 -07:00
Christoffer Dall
d3840b2661 ARM: KVM: Fix unaligned unmap_range leak
The unmap_range function did not properly cover the case when the start
address was not aligned to PMD_SIZE or PUD_SIZE and an entire pte table
or pmd table was cleared, causing us to leak memory when incrementing
the addr.

The fix is to always move onto the next page table entry boundary
instead of adding the full size of the VA range covered by the
corresponding table level entry.

Acked-by: Marc Zyngier <marc.zyngier@arm.com>
Signed-off-by: Christoffer Dall <christoffer.dall@linaro.org>
2013-08-07 18:17:28 -07:00
Christoffer Dall
240e99cbd0 ARM: KVM: Fix 64-bit coprocessor handling
The PAR was exported as CRn == 7 and CRm == 0, but in fact the primary
coprocessor register number was determined by CRm for 64-bit coprocessor
registers as the user space API was modeled after the coprocessor
access instructions (see the ARM ARM rev. C - B3-1445).

However, just changing the CRn to CRm breaks the sorting check when
booting the kernel, because the internal kernel logic always treats CRn
as the primary register number, and it makes the table sorting
impossible to understand for humans.

Alternatively we could change the logic to always have CRn == CRm, but
that becomes unclear in the number of ways we do look up of a coprocessor
register.  We could also have a separate 64-bit table but that feels
somewhat over-engineered.  Instead, keep CRn the primary representation
of the primary coproc. register number in-kernel and always export the
primary number as CRm as per the existing user space ABI.

Note: The TTBR registers just magically worked because they happened to
follow the CRn(0) regs and were considered CRn(0) in the in-kernel
representation.

Signed-off-by: Christoffer Dall <christoffer.dall@linaro.org>
2013-08-06 11:32:30 -07:00
Takuya Yoshikawa
e59dbe09f8 KVM: Introduce kvm_arch_memslots_updated()
This is called right after the memslots is updated, i.e. when the result
of update_memslots() gets installed in install_new_memslots().  Since
the memslots needs to be updated twice when we delete or move a memslot,
kvm_arch_commit_memory_region() does not correspond to this exactly.

In the following patch, x86 will use this new API to check if the mmio
generation has reached its maximum value, in which case mmio sptes need
to be flushed out.

Signed-off-by: Takuya Yoshikawa <yoshikawa_takuya_b1@lab.ntt.co.jp>
Acked-by: Alexander Graf <agraf@suse.de>
Reviewed-by: Xiao Guangrong <xiaoguangrong@linux.vnet.ibm.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2013-07-18 12:29:25 +02:00
Linus Torvalds
fe489bf450 KVM fixes for 3.11
On the x86 side, there are some optimizations and documentation updates.
 The big ARM/KVM change for 3.11, support for AArch64, will come through
 Catalin Marinas's tree.  s390 and PPC have misc cleanups and bugfixes.
 
 There is a conflict due to "s390/pgtable: fix ipte notify bit" having
 entered 3.10 through Martin Schwidefsky's s390 tree.  This pull request
 has additional changes on top, so this tree's version is the correct one.
 -----BEGIN PGP SIGNATURE-----
 Version: GnuPG v1.4.13 (GNU/Linux)
 
 iQIcBAABAgAGBQJR0oU6AAoJEBvWZb6bTYbynnsP/RSUrrHrA8Wu1tqVfAKu+1y5
 6OIihqZ9x11/YMaNofAfv86jqxFu0/j7CzMGphNdjzujqKI+Q1tGe7oiVCmKzoG+
 UvSctWsz0lpllgBtnnrm5tcfmG6rrddhLtpA7m320+xCVx8KV5P4VfyHZEU+Ho8h
 ziPmb2mAQ65gBNX6nLHEJ3ITTgad6gt4NNbrKIYpyXuWZQJypzaRqT/vpc4md+Ed
 dCebMXsL1xgyb98EcnOdrWH1wV30MfucR7IpObOhXnnMKeeltqAQPvaOlKzZh4dK
 +QfxJfdRZVS0cepcxzx1Q2X3dgjoKQsHq1nlIyz3qu1vhtfaqBlixLZk0SguZ/R9
 1S1YqucZiLRO57RD4q0Ak5oxwobu18ZoqJZ6nledNdWwDe8bz/W2wGAeVty19ky0
 qstBdM9jnwXrc0qrVgZp3+s5dsx3NAm/KKZBoq4sXiDLd/yBzdEdWIVkIrU3X9wU
 3X26wOmBxtsB7so/JR7ciTsQHelmLicnVeXohAEP9CjIJffB81xVXnXs0P0SYuiQ
 RzbSCwjPzET4JBOaHWT0Dhv0DTS/EaI97KzlN32US3Bn3WiLlS1oDCoPFoaLqd2K
 LxQMsXS8anAWxFvexfSuUpbJGPnKSidSQoQmJeMGBa9QhmZCht3IL16/Fb641ToN
 xBohzi49L9FDbpOnTYfz
 =1zpG
 -----END PGP SIGNATURE-----

Merge tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm

Pull KVM fixes from Paolo Bonzini:
 "On the x86 side, there are some optimizations and documentation
  updates.  The big ARM/KVM change for 3.11, support for AArch64, will
  come through Catalin Marinas's tree.  s390 and PPC have misc cleanups
  and bugfixes"

* tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm: (87 commits)
  KVM: PPC: Ignore PIR writes
  KVM: PPC: Book3S PR: Invalidate SLB entries properly
  KVM: PPC: Book3S PR: Allow guest to use 1TB segments
  KVM: PPC: Book3S PR: Don't keep scanning HPTEG after we find a match
  KVM: PPC: Book3S PR: Fix invalidation of SLB entry 0 on guest entry
  KVM: PPC: Book3S PR: Fix proto-VSID calculations
  KVM: PPC: Guard doorbell exception with CONFIG_PPC_DOORBELL
  KVM: Fix RTC interrupt coalescing tracking
  kvm: Add a tracepoint write_tsc_offset
  KVM: MMU: Inform users of mmio generation wraparound
  KVM: MMU: document fast invalidate all mmio sptes
  KVM: MMU: document fast invalidate all pages
  KVM: MMU: document fast page fault
  KVM: MMU: document mmio page fault
  KVM: MMU: document write_flooding_count
  KVM: MMU: document clear_spte_count
  KVM: MMU: drop kvm_mmu_zap_mmio_sptes
  KVM: MMU: init kvm generation close to mmio wrap-around value
  KVM: MMU: add tracepoint for check_mmio_spte
  KVM: MMU: fast invalidate all mmio sptes
  ...
2013-07-03 13:21:40 -07:00
Linus Torvalds
1873e50028 Main features:
- KVM and Xen ports to AArch64
 - Hugetlbfs and transparent huge pages support for arm64
 - Applied Micro X-Gene Kconfig entry and dts file
 - Cache flushing improvements
 -----BEGIN PGP SIGNATURE-----
 Version: GnuPG v1.4.9 (GNU/Linux)
 
 iQIcBAABAgAGBQJR0bZAAAoJEGvWsS0AyF7xTEEP/R/aRoqWwbVAMlwAhujq616O
 t4RzIyBXZXqxS9I+raokCX4mgYxdeisJlzN2hoq73VEX2BQlXZoYh8vmfY9WeNSM
 2pdfif2HF7oo9ymCRyqfuhbumPrTyJhpbguzOYrxPqpp2f1hv2D8hbUJEFj429yL
 UjqTFoONngfouZmAlwrPGZQKhBI95vvN53yvDMH0PWfvpm07DKGIQMYp20y0pj8j
 slhLH3bh2kfpS1cf23JtH6IICwWD2pXW0POo569CfZry6bI74xve+Trcsm7iPnsO
 PSI1P046ME1mu3SBbKwiPIdN/FQqWwTHW07fvMmH/xuXu3Zs/mxgzi7vDzDrVvTg
 PJSbKWD6N/IPPwKS/gCUmWWDASO0bXx3KlDuRZqAjbRojs0UPUOTUhzJM/BHUms1
 vY2QS9lAm02LmZZrk1LeKKP85gB+qKQvHuOVhIOldWeLGKtsNufz1kynz6YTqsLq
 uUB55KwbhQ7q8+aoY6lWujqiTXMoLkBgGdjHs2I407PAv7ZjlhRWk2fIry7xJifp
 rKu2cIlWsRe4CGvGI410NvIJFrGvJAV4wA43sgBDjPumyILgT/5jw9r3RpJEBZZs
 akw/Bl1CbL+gMjyoPUWgcWZdRkUCE0eLrgyMOmaYfst8cOTaWw4dWLvUG/bBZg+Y
 mGnuEQUQtAPadk8P/Sv3
 =PZ/e
 -----END PGP SIGNATURE-----

Merge tag 'arm64-upstream' of git://git.kernel.org/pub/scm/linux/kernel/git/cmarinas/linux-aarch64

Pull ARM64 updates from Catalin Marinas:
 "Main features:
   - KVM and Xen ports to AArch64
   - Hugetlbfs and transparent huge pages support for arm64
   - Applied Micro X-Gene Kconfig entry and dts file
   - Cache flushing improvements

  For arm64 huge pages support, there are x86 changes moving part of
  arch/x86/mm/hugetlbpage.c into mm/hugetlb.c to be re-used by arm64"

* tag 'arm64-upstream' of git://git.kernel.org/pub/scm/linux/kernel/git/cmarinas/linux-aarch64: (66 commits)
  arm64: Add initial DTS for APM X-Gene Storm SOC and APM Mustang board
  arm64: Add defines for APM ARMv8 implementation
  arm64: Enable APM X-Gene SOC family in the defconfig
  arm64: Add Kconfig option for APM X-Gene SOC family
  arm64/Makefile: provide vdso_install target
  ARM64: mm: THP support.
  ARM64: mm: Raise MAX_ORDER for 64KB pages and THP.
  ARM64: mm: HugeTLB support.
  ARM64: mm: Move PTE_PROT_NONE bit.
  ARM64: mm: Make PAGE_NONE pages read only and no-execute.
  ARM64: mm: Restore memblock limit when map_mem finished.
  mm: thp: Correct the HPAGE_PMD_ORDER check.
  x86: mm: Remove general hugetlb code from x86.
  mm: hugetlb: Copy general hugetlb code from x86 to mm.
  x86: mm: Remove x86 version of huge_pmd_share.
  mm: hugetlb: Copy huge_pmd_share from x86 to mm.
  arm64: KVM: document kernel object mappings in HYP
  arm64: KVM: MAINTAINERS update
  arm64: KVM: userspace API documentation
  arm64: KVM: enable initialization of a 32bit vcpu
  ...
2013-07-03 10:31:38 -07:00
Russell King
3c0c01ab74 Merge branch 'devel-stable' into for-next
Conflicts:
	arch/arm/Makefile
	arch/arm/include/asm/glue-proc.h
2013-06-29 11:44:43 +01:00
Arnd Bergmann
8bd4ffd6b3 ARM: kvm: don't include drivers/virtio/Kconfig
The virtio configuration has recently moved and is now visible everywhere.
Including the file again from KVM as we used to need earlier now causes
dependency problems:

warning: (CAIF_VIRTIO && VIRTIO_PCI && VIRTIO_MMIO && REMOTEPROC && RPMSG)
selects VIRTIO which has unmet direct dependencies (VIRTUALIZATION)

Cc: Christoffer Dall <cdall@cs.columbia.edu>
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Signed-off-by: Christoffer Dall <christoffer.dall@linaro.org>
2013-06-26 10:50:06 -07:00
Geoff Levand
f2dda9d829 arm/kvm: Cleanup KVM_ARM_MAX_VCPUS logic
Commit d21a1c83c7 (ARM: KVM: define KVM_ARM_MAX_VCPUS
unconditionally) changed the Kconfig logic for KVM_ARM_MAX_VCPUS to work around a
build error arising from the use of KVM_ARM_MAX_VCPUS when CONFIG_KVM=n.  The
resulting Kconfig logic is a bit awkward and leaves a KVM_ARM_MAX_VCPUS always
defined in the kernel config file.

This change reverts the Kconfig logic back and adds a simple preprocessor
conditional in kvm_host.h to handle when CONFIG_KVM_ARM_MAX_VCPUS is undefined.

Signed-off-by: Geoff Levand <geoff@infradead.org>
Signed-off-by: Christoffer Dall <christoffer.dall@linaro.org>
2013-06-26 10:50:05 -07:00
Marc Zyngier
22cfbb6d73 ARM: KVM: clear exclusive monitor on all exception returns
Make sure we clear the exclusive monitor on all exception returns,
which otherwise could lead to lock corruptions.

Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
Signed-off-by: Christoffer Dall <christoffer.dall@linaro.org>
2013-06-26 10:50:05 -07:00
Marc Zyngier
479c5ae2f8 ARM: KVM: add missing dsb before invalidating Stage-2 TLBs
When performing a Stage-2 TLB invalidation, it is necessary to
make sure the write to the page tables is observable by all CPUs.

For this purpose, add a dsb instruction to __kvm_tlb_flush_vmid_ipa
before doing the TLB invalidation itself.

Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
Signed-off-by: Christoffer Dall <christoffer.dall@linaro.org>
2013-06-26 10:50:04 -07:00
Marc Zyngier
6a077e4ab9 ARM: KVM: perform save/restore of PAR
Not saving PAR is an unfortunate oversight. If the guest performs
an AT* operation and gets scheduled out before reading the result
of the translation from PAR, it could become corrupted by another
guest or the host.

Saving this register is made slightly more complicated as KVM also
uses it on the permission fault handling path, leading to an ugly
"stash and restore" sequence. Fortunately, this is already a slow
path so we don't really care. Also, Linux doesn't do any AT*
operation, so Linux guests are not impacted by this bug.

  [ Slightly tweaked to use an even register as first operand to ldrd
    and strd operations in interrupts_head.S - Christoffer ]

Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
Signed-off-by: Christoffer Dall <christoffer.dall@linaro.org>
2013-06-26 10:50:04 -07:00
Marc Zyngier
4db845c3d8 ARM: KVM: get rid of S2_PGD_SIZE
S2_PGD_SIZE defines the number of pages used by a stage-2 PGD
and is unused, except for a VM_BUG_ON check that missuses the
define.

As the check is very unlikely to ever triggered except in
circumstances where KVM is the least of our worries, just kill
both the define and the VM_BUG_ON check.

Acked-by: Catalin Marinas <catalin.marinas@arm.com>
Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
Signed-off-by: Christoffer Dall <cdall@cs.columbia.edu>
2013-06-26 10:50:04 -07:00
Marc Zyngier
8734f16fb2 ARM: KVM: don't special case PC when doing an MMIO
Admitedly, reading a MMIO register to load PC is very weird.
Writing PC to a MMIO register is probably even worse. But
the architecture doesn't forbid any of these, and injecting
a Prefetch Abort is the wrong thing to do anyway.

Remove this check altogether, and let the adventurous guest
wander into LaLaLand if they feel compelled to do so.

Reported-by: Catalin Marinas <catalin.marinas@arm.com>
Acked-by: Catalin Marinas <catalin.marinas@arm.com>
Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
Signed-off-by: Christoffer Dall <cdall@cs.columbia.edu>
2013-06-26 10:50:03 -07:00
Marc Zyngier
dac288f7b3 ARM: KVM: use phys_addr_t instead of unsigned long long for HYP PGDs
HYP PGDs are passed around as phys_addr_t, except just before calling
into the hypervisor init code, where they are cast to a rather weird
unsigned long long.

Just keep them around as phys_addr_t, which is what makes the most
sense.

Reported-by: Catalin Marinas <catalin.marinas@arm.com>
Acked-by: Catalin Marinas <catalin.marinas@arm.com>
Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
Signed-off-by: Christoffer Dall <cdall@cs.columbia.edu>
2013-06-26 10:50:03 -07:00
Dave P Martin
24a7f67575 ARM: KVM: Don't handle PSCI calls via SMC
Currently, kvmtool unconditionally declares that HVC should be used
to call PSCI, so the function numbers in the DT tell the guest
nothing about the function ID namespace or calling convention for
SMC.

We already assume that the guest will examine and honour the DT,
since there is no way it could possibly guess the KVM-specific PSCI
function IDs otherwise.  So let's not encourage guests to violate
what's specified in the DT by using SMC to make the call.

[ Modified to apply to top of kvm/arm tree - Christoffer ]

Signed-off-by: Dave P Martin <Dave.Martin@arm.com>
Acked-by: Marc Zyngier <marc.zyngier@arm.com>
Signed-off-by: Christoffer Dall <cdall@cs.columbia.edu>
2013-06-26 10:50:02 -07:00
Anup Patel
5ae7f87a56 ARM: KVM: Allow host virt timer irq to be different from guest timer virt irq
The arch_timer irq numbers (or PPI numbers) are implementation dependent,
so the host virtual timer irq number can be different from guest virtual
timer irq number.

This patch ensures that host virtual timer irq number is read from DTB and
guest virtual timer irq is determined based on vcpu target type.

Signed-off-by: Anup Patel <anup.patel@linaro.org>
Signed-off-by: Pranavkumar Sawargaonkar <pranavkumar@linaro.org>
Signed-off-by: Christoffer Dall <cdall@cs.columbia.edu>
2013-06-26 10:50:02 -07:00
Marc Zyngier
f61701e0a2 ARM: KVM: timer: allow DT matching for ARMv8 cores
ARMv8 cores have the exact same timer as ARMv7 cores. Make sure the
KVM timer code can match it in the device tree.

Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
2013-06-12 16:40:31 +01:00
Mark Rutland
f793c23ebb ARM: KVM: arch_timers: zero CNTVOFF upon return to host
To use the virtual counters from the host, we need to ensure that
CNTVOFF doesn't change unexpectedly. When we change to a guest, we
replace the host's CNTVOFF, but we don't restore it when returning to
the host.

As the host sets CNTVOFF to zero, and never changes it, we can simply
zero CNTVOFF when returning to the host. This patch adds said zeroing to
the return to host path.

Signed-off-by: Mark Rutland <mark.rutland@arm.com>
Acked-by: Marc Zyngier <marc.zyngier@arm.com>
Acked-by: Santosh Shilimkar <santosh.shilimkar@ti.com>
Acked-by: Christoffer Dall <cdall@cs.columbia.edu>
2013-06-07 10:20:27 +01:00
Marc Zyngier
d4cb9df5d1 ARM: KVM: be more thorough when invalidating TLBs
The KVM/ARM MMU code doesn't take care of invalidating TLBs before
freeing a {pte,pmd} table. This could cause problems if the page
is reallocated and then speculated into by another CPU.

Reported-by: Catalin Marinas <catalin.marinas@arm.com>
Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
Acked-by: Catalin Marinas <catalin.marinas@arm.com>
Signed-off-by: Christoffer Dall <cdall@cs.columbia.edu>
2013-06-03 10:58:56 +03:00
Andre Przywara
e8180dcaa8 ARM: KVM: prevent NULL pointer dereferences with KVM VCPU ioctl
Some ARM KVM VCPU ioctls require the vCPU to be properly initialized
with the KVM_ARM_VCPU_INIT ioctl before being used with further
requests. KVM_RUN checks whether this initialization has been
done, but other ioctls do not.
Namely KVM_GET_REG_LIST will dereference an array with index -1
without initialization and thus leads to a kernel oops.
Fix this by adding checks before executing the ioctl handlers.

 [ Removed superflous comment from static function - Christoffer ]

Changes from v1:
 * moved check into a static function with a meaningful name

Signed-off-by: Andre Przywara <andre.przywara@linaro.org>
Signed-off-by: Christoffer Dall <cdall@cs.columbia.edu>
2013-06-03 10:58:56 +03:00
Marc Zyngier
535cf7b3b1 KVM: get rid of $(addprefix ../../../virt/kvm/, ...) in Makefiles
As requested by the KVM maintainers, remove the addprefix used to
refer to the main KVM code from the arch code, and replace it with
a KVM variable that does the same thing.

Tested-by: Christian Borntraeger <borntraeger@de.ibm.com>
Cc: Paolo Bonzini <pbonzini@redhat.com>
Cc: Gleb Natapov <gleb@redhat.com>
Cc: Christoffer Dall <cdall@cs.columbia.edu>
Acked-by: Xiantao Zhang <xiantao.zhang@intel.com>
Cc: Tony Luck <tony.luck@intel.com>
Cc: Fenghua Yu <fenghua.yu@intel.com>
Cc: Alexander Graf <agraf@suse.de>
Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Cc: Christian Borntraeger <borntraeger@de.ibm.com>
Cc: Cornelia Huck <cornelia.huck@de.ibm.com>
Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
Signed-off-by: Gleb Natapov <gleb@redhat.com>
2013-05-19 15:14:00 +03:00
Marc Zyngier
7275acdfe2 ARM: KVM: move GIC/timer code to a common location
As KVM/arm64 is looming on the horizon, it makes sense to move some
of the common code to a single location in order to reduce duplication.

The code could live anywhere. Actually, most of KVM is already built
with a bunch of ugly ../../.. hacks in the various Makefiles, so we're
not exactly talking about style here. But maybe it is time to start
moving into a less ugly direction.

The include files must be in a "public" location, as they are accessed
from non-KVM files (arch/arm/kernel/asm-offsets.c).

For this purpose, introduce two new locations:
- virt/kvm/arm/ : x86 and ia64 already share the ioapic code in
  virt/kvm, so this could be seen as a (very ugly) precedent.
- include/kvm/  : there is already an include/xen, and while the
  intent is slightly different, this seems as good a location as
  any

Eventually, we should probably have independant Makefiles at every
levels (just like everywhere else in the kernel), but this is just
the first step.

Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
Signed-off-by: Gleb Natapov <gleb@redhat.com>
2013-05-19 15:13:08 +03:00
Linus Torvalds
01227a889e Merge tag 'kvm-3.10-1' of git://git.kernel.org/pub/scm/virt/kvm/kvm
Pull kvm updates from Gleb Natapov:
 "Highlights of the updates are:

  general:
   - new emulated device API
   - legacy device assignment is now optional
   - irqfd interface is more generic and can be shared between arches

  x86:
   - VMCS shadow support and other nested VMX improvements
   - APIC virtualization and Posted Interrupt hardware support
   - Optimize mmio spte zapping

  ppc:
    - BookE: in-kernel MPIC emulation with irqfd support
    - Book3S: in-kernel XICS emulation (incomplete)
    - Book3S: HV: migration fixes
    - BookE: more debug support preparation
    - BookE: e6500 support

  ARM:
   - reworking of Hyp idmaps

  s390:
   - ioeventfd for virtio-ccw

  And many other bug fixes, cleanups and improvements"

* tag 'kvm-3.10-1' of git://git.kernel.org/pub/scm/virt/kvm/kvm: (204 commits)
  kvm: Add compat_ioctl for device control API
  KVM: x86: Account for failing enable_irq_window for NMI window request
  KVM: PPC: Book3S: Add API for in-kernel XICS emulation
  kvm/ppc/mpic: fix missing unlock in set_base_addr()
  kvm/ppc: Hold srcu lock when calling kvm_io_bus_read/write
  kvm/ppc/mpic: remove users
  kvm/ppc/mpic: fix mmio region lists when multiple guests used
  kvm/ppc/mpic: remove default routes from documentation
  kvm: KVM_CAP_IOMMU only available with device assignment
  ARM: KVM: iterate over all CPUs for CPU compatibility check
  KVM: ARM: Fix spelling in error message
  ARM: KVM: define KVM_ARM_MAX_VCPUS unconditionally
  KVM: ARM: Fix API documentation for ONE_REG encoding
  ARM: KVM: promote vfp_host pointer to generic host cpu context
  ARM: KVM: add architecture specific hook for capabilities
  ARM: KVM: perform HYP initilization for hotplugged CPUs
  ARM: KVM: switch to a dual-step HYP init code
  ARM: KVM: rework HYP page table freeing
  ARM: KVM: enforce maximum size for identity mapped code
  ARM: KVM: move to a KVM provided HYP idmap
  ...
2013-05-05 14:47:31 -07:00
Linus Torvalds
8546dc1d4b Merge branch 'for-linus' of git://git.linaro.org/people/rmk/linux-arm
Pull ARM updates from Russell King:
 "The major items included in here are:

   - MCPM, multi-cluster power management, part of the infrastructure
     required for ARMs big.LITTLE support.

   - A rework of the ARM KVM code to allow re-use by ARM64.

   - Error handling cleanups of the IS_ERR_OR_NULL() madness and fixes
     of that stuff for arch/arm

   - Preparatory patches for Cortex-M3 support from Uwe Kleine-König.

  There is also a set of three patches in here from Hugh/Catalin to
  address freeing of inappropriate page tables on LPAE.  You already
  have these from akpm, but they were already part of my tree at the
  time he sent them, so unfortunately they'll end up with duplicate
  commits"

* 'for-linus' of git://git.linaro.org/people/rmk/linux-arm: (77 commits)
  ARM: EXYNOS: remove unnecessary use of IS_ERR_VALUE()
  ARM: IMX: remove unnecessary use of IS_ERR_VALUE()
  ARM: OMAP: use consistent error checking
  ARM: cleanup: OMAP hwmod error checking
  ARM: 7709/1: mcpm: Add explicit AFLAGS to support v6/v7 multiplatform kernels
  ARM: 7700/2: Make cpu_init() notrace
  ARM: 7702/1: Set the page table freeing ceiling to TASK_SIZE
  ARM: 7701/1: mm: Allow arch code to control the user page table ceiling
  ARM: 7703/1: Disable preemption in broadcast_tlb*_a15_erratum()
  ARM: mcpm: provide an interface to set the SMP ops at run time
  ARM: mcpm: generic SMP secondary bringup and hotplug support
  ARM: mcpm_head.S: vlock-based first man election
  ARM: mcpm: Add baremetal voting mutexes
  ARM: mcpm: introduce helpers for platform coherency exit/setup
  ARM: mcpm: introduce the CPU/cluster power API
  ARM: multi-cluster PM: secondary kernel entry code
  ARM: cacheflush: add synchronization helpers for mixed cache state accesses
  ARM: cpu hotplug: remove majority of cache flushing from platforms
  ARM: smp: flush L1 cache in cpu_die()
  ARM: tegra: remove tegra specific cpu_disable()
  ...
2013-05-03 09:13:19 -07:00
Russell King
946342d03e Merge branches 'devel-stable', 'entry', 'fixes', 'mach-types', 'misc' and 'smp-hotplug' into for-linus 2013-05-02 21:30:36 +01:00
Linus Torvalds
5d434fcb25 Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jikos/trivial
Pull trivial tree updates from Jiri Kosina:
 "Usual stuff, mostly comment fixes, typo fixes, printk fixes and small
  code cleanups"

* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jikos/trivial: (45 commits)
  mm: Convert print_symbol to %pSR
  gfs2: Convert print_symbol to %pSR
  m32r: Convert print_symbol to %pSR
  iostats.txt: add easy-to-find description for field 6
  x86 cmpxchg.h: fix wrong comment
  treewide: Fix typo in printk and comments
  doc: devicetree: Fix various typos
  docbook: fix 8250 naming in device-drivers
  pata_pdc2027x: Fix compiler warning
  treewide: Fix typo in printks
  mei: Fix comments in drivers/misc/mei
  treewide: Fix typos in kernel messages
  pm44xx: Fix comment for "CONFIG_CPU_IDLE"
  doc: Fix typo "CONFIG_CGROUP_CGROUP_MEMCG_SWAP"
  mmzone: correct "pags" to "pages" in comment.
  kernel-parameters: remove outdated 'noresidual' parameter
  Remove spurious _H suffixes from ifdef comments
  sound: Remove stray pluses from Kconfig file
  radio-shark: Fix printk "CONFIG_LED_CLASS"
  doc: put proper reference to CONFIG_MODULE_SIG_ENFORCE
  ...
2013-04-30 09:36:50 -07:00
Andre Przywara
d4e071ce6a ARM: KVM: iterate over all CPUs for CPU compatibility check
kvm_target_cpus() checks the compatibility of the used CPU with
KVM, which is currently limited to ARM Cortex-A15 cores.
However by calling it only once on any random CPU it assumes that
all cores are the same, which is not necessarily the case (for example
in Big.Little).

[ I cut some of the commit message and changed the formatting of the
  code slightly to pass checkpatch and look more like the rest of the
  kvm/arm init code - Christoffer ]

Signed-off-by: Andre Przywara <andre.przywara@linaro.org>
Signed-off-by: Christoffer Dall <cdall@cs.columbia.edu>
2013-04-28 22:23:23 -07:00
Christoffer Dall
df75921738 KVM: ARM: Fix spelling in error message
s/unkown/unknown/

Signed-off-by: Christoffer Dall <cdall@cs.columbia.edu>
2013-04-28 22:23:22 -07:00
Arnd Bergmann
d21a1c83c7 ARM: KVM: define KVM_ARM_MAX_VCPUS unconditionally
The CONFIG_KVM_ARM_MAX_VCPUS symbol is needed in order to build the
kernel/context_tracking.c code, which includes the vgic data structures
implictly through the kvm headers. Definining the symbol to zero
on builds without KVM resolves this build error:

In file included from include/linux/kvm_host.h:33:0,
                 from kernel/context_tracking.c:18:
arch/arm/include/asm/kvm_host.h:28:23: warning: "CONFIG_KVM_ARM_MAX_VCPUS" is not defined [-Wundef]
 #define KVM_MAX_VCPUS CONFIG_KVM_ARM_MAX_VCPUS
                       ^
arch/arm/include/asm/kvm_vgic.h:34:24: note: in expansion of macro 'KVM_MAX_VCPUS'
 #define VGIC_MAX_CPUS  KVM_MAX_VCPUS
                        ^
arch/arm/include/asm/kvm_vgic.h:38:6: note: in expansion of macro 'VGIC_MAX_CPUS'
 #if (VGIC_MAX_CPUS > 8)
      ^
In file included from arch/arm/include/asm/kvm_host.h:41:0,
                 from include/linux/kvm_host.h:33,
                 from kernel/context_tracking.c:18:
arch/arm/include/asm/kvm_vgic.h:59:11: error: 'CONFIG_KVM_ARM_MAX_VCPUS' undeclared here (not in a function)
  } percpu[VGIC_MAX_CPUS];
           ^

Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Cc: Marc Zyngier <marc.zyngier@arm.com>
Cc: Christoffer Dall <cdall@cs.columbia.edu>
Cc: Will Deacon <will.deacon@arm.com>
Signed-off-by: Christoffer Dall <cdall@cs.columbia.edu>
2013-04-28 22:23:14 -07:00
Marc Zyngier
3de50da690 ARM: KVM: promote vfp_host pointer to generic host cpu context
We use the vfp_host pointer to store the host VFP context, should
the guest start using VFP itself.

Actually, we can use this pointer in a more generic way to store
CPU speficic data, and arm64 is using it to dump the whole host
state before switching to the guest.

Simply rename the vfp_host field to host_cpu_context, and the
corresponding type to kvm_cpu_context_t. No change in functionnality.

Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
Signed-off-by: Christoffer Dall <cdall@cs.columbia.edu>
2013-04-28 22:23:13 -07:00
Marc Zyngier
17b1e31f92 ARM: KVM: add architecture specific hook for capabilities
Most of the capabilities are common to both arm and arm64, but
we still need to handle the exceptions.

Introduce kvm_arch_dev_ioctl_check_extension, which both architectures
implement (in the 32bit case, it just returns 0).

Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
Signed-off-by: Christoffer Dall <cdall@cs.columbia.edu>
2013-04-28 22:23:12 -07:00
Marc Zyngier
d157f4a515 ARM: KVM: perform HYP initilization for hotplugged CPUs
Now that we have the necessary infrastructure to boot a hotplugged CPU
at any point in time, wire a CPU notifier that will perform the HYP
init for the incoming CPU.

Note that this depends on the platform code and/or firmware to boot the
incoming CPU with HYP mode enabled and return to the kernel by following
the normal boot path (HYP stub installed).

Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
Signed-off-by: Christoffer Dall <cdall@cs.columbia.edu>
2013-04-28 22:23:11 -07:00
Marc Zyngier
5a677ce044 ARM: KVM: switch to a dual-step HYP init code
Our HYP init code suffers from two major design issues:
- it cannot support CPU hotplug, as we tear down the idmap very early
- it cannot perform a TLB invalidation when switching from init to
  runtime mappings, as pages are manipulated from PL1 exclusively

The hotplug problem mandates that we keep two sets of page tables
(boot and runtime). The TLB problem mandates that we're able to
transition from one PGD to another while in HYP, invalidating the TLBs
in the process.

To be able to do this, we need to share a page between the two page
tables. A page that will have the same VA in both configurations. All we
need is a VA that has the following properties:
- This VA can't be used to represent a kernel mapping.
- This VA will not conflict with the physical address of the kernel text

The vectors page seems to satisfy this requirement:
- The kernel never maps anything else there
- The kernel text being copied at the beginning of the physical memory,
  it is unlikely to use the last 64kB (I doubt we'll ever support KVM
  on a system with something like 4MB of RAM, but patches are very
  welcome).

Let's call this VA the trampoline VA.

Now, we map our init page at 3 locations:
- idmap in the boot pgd
- trampoline VA in the boot pgd
- trampoline VA in the runtime pgd

The init scenario is now the following:
- We jump in HYP with four parameters: boot HYP pgd, runtime HYP pgd,
  runtime stack, runtime vectors
- Enable the MMU with the boot pgd
- Jump to a target into the trampoline page (remember, this is the same
  physical page!)
- Now switch to the runtime pgd (same VA, and still the same physical
  page!)
- Invalidate TLBs
- Set stack and vectors
- Profit! (or eret, if you only care about the code).

Note that we keep the boot mapping permanently (it is not strictly an
idmap anymore) to allow for CPU hotplug in later patches.

Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
Signed-off-by: Christoffer Dall <cdall@cs.columbia.edu>
2013-04-28 22:23:10 -07:00
Marc Zyngier
4f728276fb ARM: KVM: rework HYP page table freeing
There is no point in freeing HYP page tables differently from Stage-2.
They now have the same requirements, and should be dealt with the same way.

Promote unmap_stage2_range to be The One True Way, and get rid of a number
of nasty bugs in the process (good thing we never actually called free_hyp_pmds
before...).

Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
Signed-off-by: Christoffer Dall <cdall@cs.columbia.edu>
2013-04-28 22:23:10 -07:00
Marc Zyngier
2fb410596c ARM: KVM: move to a KVM provided HYP idmap
After the HYP page table rework, it is pretty easy to let the KVM
code provide its own idmap, rather than expecting the kernel to
provide it. It takes actually less code to do so.

Acked-by: Will Deacon <will.deacon@arm.com>
Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
Signed-off-by: Christoffer Dall <cdall@cs.columbia.edu>
2013-04-28 22:23:08 -07:00
Marc Zyngier
3562c76dcb ARM: KVM: fix HYP mapping limitations around zero
The current code for creating HYP mapping doesn't like to wrap
around zero, which prevents from mapping anything into the last
page of the virtual address space.

It doesn't take much effort to remove this limitation, making
the code more consistent with the rest of the kernel in the process.

Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
Signed-off-by: Christoffer Dall <cdall@cs.columbia.edu>
2013-04-28 22:23:08 -07:00
Marc Zyngier
6060df84cb ARM: KVM: simplify HYP mapping population
The way we populate HYP mappings is a bit convoluted, to say the least.
Passing a pointer around to keep track of the current PFN is quite
odd, and we end-up having two different PTE accessors for no good
reason.

Simplify the whole thing by unifying the two PTE accessors, passing
a pgprot_t around, and moving the various validity checks to the
upper layers.

Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
Signed-off-by: Christoffer Dall <cdall@cs.columbia.edu>
2013-04-28 22:23:07 -07:00
Mark Rutland
372b7c1bc8 ARM: KVM: arch_timer: use symbolic constants
In clocksource/arm_arch_timer.h we define useful symbolic constants.
Let's use them to make the KVM arch_timer code clearer.

Signed-off-by: Mark Rutland <mark.rutland@arm.com>
Acked-by: Marc Zyngier <marc.zyngier@arm.com>
Cc: Christoffer Dall <cdall@cs.columbia.edu>
Signed-off-by: Christoffer Dall <cdall@cs.columbia.edu>
2013-04-28 22:22:57 -07:00
Marc Zyngier
210552c1bf ARM: KVM: add support for minimal host vs guest profiling
In order to be able to correctly profile what is happening on the
host, we need to be able to identify when we're running on the guest,
and log these events differently.

Perf offers a simple way to register callbacks into KVM. Mimic what
x86 does and enjoy being able to profile your KVM host.

Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
Signed-off-by: Christoffer Dall <cdall@cs.columbia.edu>
2013-04-28 21:44:01 -07:00
Gleb Natapov
2dfee7b271 Merge branch 'kvm-arm-cleanup' from git://github.com/columbia/linux-kvm-arm.git 2013-04-25 18:23:48 +03:00
Masanari Iida
b23f7a09f9 treewide: Fix typo in printk and comments
Fix typo in printk and comments within various drivers.

Signed-off-by: Masanari Iida <standby24x7@gmail.com>
Signed-off-by: Jiri Kosina <jkosina@suse.cz>
2013-04-24 16:43:00 +02:00
Marc Zyngier
15bbc1b28f ARM: KVM: fix unbalanced get_cpu() in access_dcsw
In the very unlikely event where a guest would be foolish enough to
*read* from a write-only cache maintainance register, we end up
with preemption disabled, due to a misplaced get_cpu().

Just move the "is_write" test outside of the critical section.

Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
Signed-off-by: Christoffer Dall <cdall@cs.columbia.edu>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2013-04-17 12:51:32 -07:00
Marc Zyngier
ca46e10fb2 ARM: KVM: fix KVM_CAP_ARM_SET_DEVICE_ADDR reporting
Commit 3401d54696 (KVM: ARM: Introduce KVM_ARM_SET_DEVICE_ADDR
ioctl) added support for the KVM_CAP_ARM_SET_DEVICE_ADDR capability,
but failed to add a break in the relevant case statement, returning
the number of CPUs instead.

Luckilly enough, the CONFIG_NR_CPUS=0 patch hasn't been merged yet
(https://lkml.org/lkml/diff/2012/3/31/131/1), so the bug wasn't
noticed.

Just give it a break!

Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
Signed-off-by: Christoffer Dall <cdall@cs.columbia.edu>
2013-04-16 16:21:24 -07:00
Alexander Graf
79558f112f KVM: ARM: Fix kvm_vm_ioctl_irq_line
Commit aa2fbe6d broke the ARM KVM target by introducing a new parameter
to irq handling functions.

Fix the function prototype to get things compiling again and ignore the
parameter just like we did before

Signed-off-by: Alexander Graf <agraf@suse.de>
Acked-by: Christoffer Dall <cdall@cs.columbia.edu>
Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
2013-04-16 18:16:10 -03:00
Russell King
7fb476c231 Merge branch 'kvm-arm/vgic-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/maz/arm-platforms into fixes 2013-03-22 17:10:35 +00:00
Marc Zyngier
f42798c689 ARM: KVM: Fix length of mmio access
Instead of hardcoding the maximum MMIO access to be 4 bytes,
compare it to sizeof(unsigned long), which will do the
right thing on both 32 and 64bit systems.

Same thing for sign extention.

Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
Signed-off-by: Christoffer Dall <cdall@cs.columbia.edu>
2013-03-06 16:01:51 -08:00
Marc Zyngier
000d399625 ARM: KVM: sanitize freeing of HYP page tables
Instead of trying to free everything from PAGE_OFFSET to the
top of memory, use the virt_addr_valid macro to check the
upper limit.

Also do the same for the vmalloc region where the IO mappings
are allocated.

Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
Signed-off-by: Christoffer Dall <cdall@cs.columbia.edu>
2013-03-06 15:59:20 -08:00
Marc Zyngier
6190920a35 ARM: KVM: move kvm_handle_wfi to handle_exit.c
It has little to do in emulate.c these days...

Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
2013-03-06 15:48:45 -08:00
Marc Zyngier
48762767e1 ARM: KVM: change kvm_tlb_flush_vmid to kvm_tlb_flush_vmid_ipa
v8 is capable of invalidating Stage-2 by IPA, but v7 is not.
Change kvm_tlb_flush_vmid() to take an IPA parameter, which is
then ignored by the invalidation code (and nuke the whole TLB
as it always did).

This allows v8 to implement a more optimized strategy.

Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
2013-03-06 15:48:45 -08:00
Marc Zyngier
06fe0b73ff ARM: KVM: move include of asm/idmap.h to kvm_mmu.h
Since the arm64 code doesn't have a global asm/idmap.h file, move
the inclusion to asm/kvm_mmu.h.

Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
2013-03-06 15:48:45 -08:00
Marc Zyngier
d7bb077737 ARM: KVM: vgic: decouple alignment restriction from page size
The virtual GIC is supposed to be 4kB aligned. On a 64kB page
system, comparing the alignment to PAGE_SIZE is wrong.

Use SZ_4K instead.

Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
2013-03-06 15:48:44 -08:00
Marc Zyngier
cfe3950c2a ARM: KVM: fix fault_ipa computing
The ARM ARM says that HPFAR reports bits [39:12] of the faulting
IPA, and we need to complement it with the bottom 12 bits of the
faulting VA.

This is always 12 bits, irrespective of the page size. Makes it
clearer in the code.

Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
2013-03-06 15:48:44 -08:00
Marc Zyngier
728d577d35 ARM: KVM: move kvm_target_cpu to guest.c
guest.c already contains some target-specific checks. Let's move
kvm_target_cpu() over there so arm.c is mostly target agnostic.

Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
2013-03-06 15:48:44 -08:00
Marc Zyngier
b4034bde5f ARM: KVM: fix address validation for HYP mappings
__create_hyp_mappings() performs some kind of address validation before
creating the mapping, by verifying that the start address is above
PAGE_OFFSET.

This check is not completely correct for kernel memory (the upper
boundary has to be checked as well so we do not end up with highmem
pages), and wrong for IO mappings (the mapping must exist in the vmalloc
region).

Fix this by using the proper predicates (virt_addr_valid and
is_vmalloc_addr), which also work correctly on ARM64 (where the vmalloc
region is below PAGE_OFFSET).

Also change the BUG_ON() into a less agressive error return.

Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
2013-03-06 15:48:44 -08:00
Marc Zyngier
06e8c3b0f3 ARM: KVM: allow HYP mappings to be at an offset from kernel mappings
arm64 cannot represent the kernel VAs in HYP mode, because of the lack
of TTBR1 at EL2. A way to cope with this situation is to have HYP VAs
to be an offset from the kernel VAs.

Introduce macros to convert a kernel VA to a HYP VA, make the HYP
mapping functions use these conversion macros. Also change the
documentation to reflect the existence of the offset.

On ARM, where we can have an identity mapping between kernel and HYP,
the macros are without any effect.

Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
2013-03-06 15:48:44 -08:00
Marc Zyngier
9c7a6432fb ARM: KVM: use kvm_kernel_vfp_t as an abstract type for VFP containers
In order to keep the VFP allocation code common, use an abstract type
for the VFP containers. Maps onto struct vfp_hard_struct on ARM.

Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
2013-03-06 15:48:44 -08:00
Marc Zyngier
e7858c58d5 ARM: KVM: move hyp init to kvm_host.h
Make the split of the pgd_ptr an implementation specific thing
by moving the init call to an inline function.

Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
2013-03-06 15:48:44 -08:00
Marc Zyngier
c62ee2b227 ARM: KVM: abstract most MMU operations
Move low level MMU-related operations to kvm_mmu.h. This makes
the MMU code reusable by the arm64 port.

Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
2013-03-06 15:48:44 -08:00
Christoffer Dall
c088f8f008 KVM: ARM: Reintroduce trace_kvm_hvc
This one got lost in the move to handle_exit, so let's reintroduce it
using an accessor to the immediate value field like the other ones.

Signed-off-by: Christoffer Dall <cdall@cs.columbia.edu>
2013-03-06 15:48:43 -08:00
Marc Zyngier
3414bbfff9 ARM: KVM: move exit handler selection to a separate file
The exit handler selection code cannot be shared with arm64
(two different modes, more exception classes...).

Move it to a separate file (handle_exit.c).

Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
Signed-off-by: Christoffer Dall <cdall@cs.columbia.edu>
2013-03-06 15:48:43 -08:00
Marc Zyngier
c599756329 ARM: KVM: move kvm_condition_valid to emulate.c
This is really hardware emulation, and as such it better be with
its little friends.

Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
2013-03-06 15:48:43 -08:00
Marc Zyngier
52d1dba933 ARM: KVM: abstract HSR_EC_IABT away
Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
2013-03-06 15:48:43 -08:00
Marc Zyngier
1cc287dd08 ARM: KVM: abstract fault decoding away
Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
2013-03-06 15:48:43 -08:00
Marc Zyngier
4926d445eb ARM: KVM: abstract exception class decoding away
Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
2013-03-06 15:48:43 -08:00
Marc Zyngier
23b415d61a ARM: KVM: abstract IL decoding away
Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
Signed-off-by: Christoffer Dall <cdall@cs.columbia.edu>
2013-03-06 15:48:43 -08:00
Marc Zyngier
a7123377e7 ARM: KVM: abstract SAS decoding away
Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
Signed-off-by: Christoffer Dall <cdall@cs.columbia.edu>
2013-03-06 15:48:43 -08:00
Marc Zyngier
b37670b0f3 ARM: KVM: abstract S1TW abort detection away
Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
Signed-off-by: Christoffer Dall <cdall@cs.columbia.edu>
2013-03-06 15:48:42 -08:00
Marc Zyngier
78abfcde49 ARM: KVM: abstract (and fix) external abort detection away
Bit 8 is cache maintenance, bit 9 is external abort.

Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
Signed-off-by: Christoffer Dall <cdall@cs.columbia.edu>
2013-03-06 15:48:42 -08:00
Marc Zyngier
d0adf747c9 ARM: KVM: abstract HSR_SRT_{MASK,SHIFT} away
Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
2013-03-06 15:48:42 -08:00
Marc Zyngier
7c511b881f ARM: KVM: abstract HSR_SSE away
Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
Signed-off-by: Christoffer Dall <cdall@cs.columbia.edu>
2013-03-06 15:48:42 -08:00
Marc Zyngier
023cc96406 ARM: KVM: abstract HSR_WNR away
Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
Signed-off-by: Christoffer Dall <cdall@cs.columbia.edu>
2013-03-06 15:48:42 -08:00
Marc Zyngier
4a1df28ac0 ARM: KVM: abstract HSR_ISV away
Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
Signed-off-by: Christoffer Dall <cdall@cs.columbia.edu>
2013-03-06 15:48:42 -08:00
Marc Zyngier
7393b59917 ARM: KVM: abstract fault register accesses
Instead of directly accessing the fault registers, use proper accessors
so the core code can be shared.

Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
2013-03-06 15:48:42 -08:00
Marc Zyngier
db730d8d62 ARM: KVM: convert GP registers from u32 to unsigned long
On 32bit ARM, unsigned long is guaranteed to be a 32bit quantity.
On 64bit ARM, it is a 64bit quantity.

In order to be able to share code between the two architectures,
convert the registers to be unsigned long, so the core code can
be oblivious of the change.

Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
2013-03-06 15:48:42 -08:00
Jonghwan Choi
0b5e3bac30 KVM: ARM: Fix wrong address in comment
hyp_hvc vector offset is 0x14 and hyp_svc vector offset is 0x8.

Signed-off-by: Jonghwan Choi <jhbird.choi@samsung.com>
Signed-off-by: Christoffer Dall <cdall@cs.columbia.edu>
2013-03-06 15:48:42 -08:00
Takuya Yoshikawa
16014753b1 KVM: ARM: Remove kvm_arch_set_memory_region()
This was replaced with prepare/commit long before:

  commit f7784b8ec9
  KVM: split kvm_arch_set_memory_region into prepare and commit

Signed-off-by: Takuya Yoshikawa <yoshikawa_takuya_b1@lab.ntt.co.jp>
Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
2013-03-04 20:21:08 -03:00
Takuya Yoshikawa
8482644aea KVM: set_memory_region: Refactor commit_memory_region()
This patch makes the parameter old a const pointer to the old memory
slot and adds a new parameter named change to know the change being
requested: the former is for removing extra copying and the latter is
for cleaning up the code.

Signed-off-by: Takuya Yoshikawa <yoshikawa_takuya_b1@lab.ntt.co.jp>
Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
2013-03-04 20:21:08 -03:00
Takuya Yoshikawa
7b6195a91d KVM: set_memory_region: Refactor prepare_memory_region()
This patch drops the parameter old, a copy of the old memory slot, and
adds a new parameter named change to know the change being requested.

This not only cleans up the code but also removes extra copying of the
memory slot structure.

Signed-off-by: Takuya Yoshikawa <yoshikawa_takuya_b1@lab.ntt.co.jp>
Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
2013-03-04 20:21:08 -03:00
Takuya Yoshikawa
462fce4606 KVM: set_memory_region: Drop user_alloc from prepare/commit_memory_region()
X86 does not use this any more.  The remaining user, s390's !user_alloc
check, can be simply removed since KVM_SET_MEMORY_REGION ioctl is no
longer supported.

Note: fixed powerpc's indentations with spaces to suppress checkpatch
errors.

Signed-off-by: Takuya Yoshikawa <yoshikawa_takuya_b1@lab.ntt.co.jp>
Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
2013-03-04 20:21:08 -03:00
Marc Zyngier
3b8cd8a015 ARM: KVM: fix compilation after removal of user_alloc from struct kvm_memory_slot
Commit 7a905b1 (KVM: Remove user_alloc from struct kvm_memory_slot)
broke KVM/ARM by removing the user_alloc field from a public structure.

As we only used this field to alert the user that we didn't support
this operation mode, there is no harm in discarding this bit of code
without any remorse.

Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
Signed-off-by: Gleb Natapov <gleb@redhat.com>
2013-02-25 11:48:41 +02:00
Marc Zyngier
bef103aa7d ARM: KVM: fix kvm_arch_{prepare,commit}_memory_region
Commit f82a8cfe9 (KVM: struct kvm_memory_slot.user_alloc -> bool)
broke the ARM KVM port by changing the prototype of two global
functions.

Apply the same change to fix the compilation breakage.

Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
Signed-off-by: Gleb Natapov <gleb@redhat.com>
2013-02-25 11:47:28 +02:00
Marc Zyngier
33c83cb3c1 ARM: KVM: vgic: take distributor lock on sync_hwstate path
Now that the maintenance interrupt handling is actually out of the
handler itself, the code becomes quite racy as we can get preempted
while we process the state.

Wrapping this code around the distributor lock ensures that we're not
preempted and relatively race-free.

Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
2013-02-22 13:29:38 +00:00
Marc Zyngier
75da01e127 ARM: KVM: vgic: force EOIed LRs to the empty state
The VGIC doesn't guarantee that an EOIed LR that has been configured
to generate a maintenance interrupt will appear as empty.

While the code recovers from this situation, it is better to clean
the LR and flag it as empty so it can be quickly recycled.

Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
2013-02-22 13:29:37 +00:00
Marc Zyngier
967f84275b ARM: KVM: arch_timers: Wire the init code and config option
It is now possible to select CONFIG_KVM_ARM_TIMER to enable the
KVM architected timer support.

Reviewed-by: Will Deacon <will.deacon@arm.com>
Signed-off-by: Christoffer Dall <c.dall@virtualopensystems.com>
Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
2013-02-11 19:06:00 +00:00
Marc Zyngier
c7e3ba64ba ARM: KVM: arch_timers: Add timer world switch
Do the necessary save/restore dance for the timers in the world
switch code. In the process, allow the guest to read the physical
counter, which is useful for its own clock_event_device.

Reviewed-by: Will Deacon <will.deacon@arm.com>
Signed-off-by: Christoffer Dall <c.dall@virtualopensystems.com>
Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
2013-02-11 19:05:38 +00:00
Marc Zyngier
53e724067a ARM: KVM: arch_timers: Add guest timer core support
Add some the architected timer related infrastructure, and support timer
interrupt injection, which can happen as a resultof three possible
events:

- The virtual timer interrupt has fired while we were still
  executing the guest
- The timer interrupt hasn't fired, but it expired while we
  were doing the world switch
- A hrtimer we programmed earlier has fired

Reviewed-by: Will Deacon <will.deacon@arm.com>
Signed-off-by: Christoffer Dall <c.dall@virtualopensystems.com>
Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
2013-02-11 19:05:11 +00:00
Marc Zyngier
75431f9d73 ARM: KVM: Add VGIC configuration option
It is now possible to select the VGIC configuration option.

Reviewed-by: Will Deacon <will.deacon@arm.com>
Signed-off-by: Christoffer Dall <c.dall@virtualopensystems.com>
Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
2013-02-11 19:00:15 +00:00
Marc Zyngier
01ac5e342f ARM: KVM: VGIC initialisation code
Add the init code for the hypervisor, the virtual machine, and
the virtual CPUs.

An interrupt handler is also wired to allow the VGIC maintenance
interrupts, used to deal with level triggered interrupts and LR
underflows.

A CPU hotplug notifier is registered to disable/enable the interrupt
as requested.

Reviewed-by: Will Deacon <will.deacon@arm.com>
Signed-off-by: Christoffer Dall <c.dall@virtualopensystems.com>
Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
2013-02-11 19:00:10 +00:00
Marc Zyngier
348b2b0708 ARM: KVM: VGIC control interface world switch
Enable the VGIC control interface to be save-restored on world switch.

Reviewed-by: Will Deacon <will.deacon@arm.com>
Signed-off-by: Christoffer Dall <c.dall@virtualopensystems.com>
Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
2013-02-11 19:00:03 +00:00
Marc Zyngier
5863c2ce72 ARM: KVM: VGIC interrupt injection
Plug the interrupt injection code. Interrupts can now be generated
from user space.

Reviewed-by: Will Deacon <will.deacon@arm.com>
Signed-off-by: Christoffer Dall <c.dall@virtualopensystems.com>
Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
2013-02-11 18:59:55 +00:00
Marc Zyngier
a1fcb44e26 ARM: KVM: vgic: retire queued, disabled interrupts
An interrupt may have been disabled after being made pending on the
CPU interface (the classic case is a timer running while we're
rebooting the guest - the interrupt would kick as soon as the CPU
interface gets enabled, with deadly consequences).

The solution is to examine already active LRs, and check the
interrupt is still enabled. If not, just retire it.

Reviewed-by: Will Deacon <will.deacon@arm.com>
Signed-off-by: Christoffer Dall <c.dall@virtualopensystems.com>
Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
2013-02-11 18:59:49 +00:00
Marc Zyngier
9d949dce52 ARM: KVM: VGIC virtual CPU interface management
Add VGIC virtual CPU interface code, picking pending interrupts
from the distributor and stashing them in the VGIC control interface
list registers.

Reviewed-by: Will Deacon <will.deacon@arm.com>
Signed-off-by: Christoffer Dall <c.dall@virtualopensystems.com>
Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
2013-02-11 18:59:20 +00:00
Marc Zyngier
b47ef92af8 ARM: KVM: VGIC distributor handling
Add the GIC distributor emulation code. A number of the GIC features
are simply ignored as they are not required to boot a Linux guest.

Reviewed-by: Will Deacon <will.deacon@arm.com>
Signed-off-by: Christoffer Dall <c.dall@virtualopensystems.com>
Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
2013-02-11 18:59:15 +00:00
Christoffer Dall
330690cdce ARM: KVM: VGIC accept vcpu and dist base addresses from user space
User space defines the model to emulate to a guest and should therefore
decide which addresses are used for both the virtual CPU interface
directly mapped in the guest physical address space and for the emulated
distributor interface, which is mapped in software by the in-kernel VGIC
support.

Reviewed-by: Will Deacon <will.deacon@arm.com>
Signed-off-by: Christoffer Dall <c.dall@virtualopensystems.com>
Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
2013-02-11 18:59:01 +00:00
Marc Zyngier
1a89dd9113 ARM: KVM: Initial VGIC infrastructure code
Wire the basic framework code for VGIC support and the initial in-kernel
MMIO support code for the VGIC, used for the distributor emulation.

Reviewed-by: Will Deacon <will.deacon@arm.com>
Signed-off-by: Christoffer Dall <c.dall@virtualopensystems.com>
Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
2013-02-11 18:58:55 +00:00
Marc Zyngier
1638a12d4e ARM: KVM: Keep track of currently running vcpus
When an interrupt occurs for the guest, it is sometimes necessary
to find out which vcpu was running at that point.

Keep track of which vcpu is being run in kvm_arch_vcpu_ioctl_run(),
and allow the data to be retrieved using either:
- kvm_arm_get_running_vcpu(): returns the vcpu running at this point
  on the current CPU. Can only be used in a non-preemptible context.
- kvm_arm_get_running_vcpus(): returns the per-CPU variable holding
  the running vcpus, usable for per-CPU interrupts.

Reviewed-by: Will Deacon <will.deacon@arm.com>
Signed-off-by: Christoffer Dall <c.dall@virtualopensystems.com>
Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
2013-02-11 18:58:48 +00:00
Christoffer Dall
3401d54696 KVM: ARM: Introduce KVM_ARM_SET_DEVICE_ADDR ioctl
On ARM some bits are specific to the model being emulated for the guest and
user space needs a way to tell the kernel about those bits.  An example is mmio
device base addresses, where KVM must know the base address for a given device
to properly emulate mmio accesses within a certain address range or directly
map a device with virtualiation extensions into the guest address space.

We make this API ARM-specific as we haven't yet reached a consensus for a
generic API for all KVM architectures that will allow us to do something like
this.

Reviewed-by: Will Deacon <will.deacon@arm.com>
Signed-off-by: Christoffer Dall <c.dall@virtualopensystems.com>
Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
2013-02-11 18:58:39 +00:00
Marc Zyngier
aa024c2f35 KVM: ARM: Power State Coordination Interface implementation
Implement the PSCI specification (ARM DEN 0022A) to control
virtual CPUs being "powered" on or off.

PSCI/KVM is detected using the KVM_CAP_ARM_PSCI capability.

A virtual CPU can now be initialized in a "powered off" state,
using the KVM_ARM_VCPU_POWER_OFF feature flag.

The guest can use either SMC or HVC to execute a PSCI function.

Reviewed-by: Will Deacon <will.deacon@arm.com>
Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
Signed-off-by: Christoffer Dall <c.dall@virtualopensystems.com>
2013-01-23 13:29:18 -05:00
Christoffer Dall
45e96ea6b3 KVM: ARM: Handle I/O aborts
When the guest accesses I/O memory this will create data abort
exceptions and they are handled by decoding the HSR information
(physical address, read/write, length, register) and forwarding reads
and writes to QEMU which performs the device emulation.

Certain classes of load/store operations do not support the syndrome
information provided in the HSR.  We don't support decoding these (patches
are available elsewhere), so we report an error to user space in this case.

This requires changing the general flow somewhat since new calls to run
the VCPU must check if there's a pending MMIO load and perform the write
after userspace has made the data available.

Reviewed-by: Will Deacon <will.deacon@arm.com>
Reviewed-by: Marcelo Tosatti <mtosatti@redhat.com>
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
Signed-off-by: Christoffer Dall <c.dall@virtualopensystems.com>
2013-01-23 13:29:17 -05:00
Christoffer Dall
94f8e6418d KVM: ARM: Handle guest faults in KVM
Handles the guest faults in KVM by mapping in corresponding user pages
in the 2nd stage page tables.

We invalidate the instruction cache by MVA whenever we map a page to the
guest (no, we cannot only do it when we have an iabt because the guest
may happily read/write a page before hitting the icache) if the hardware
uses VIPT or PIPT.  In the latter case, we can invalidate only that
physical page.  In the first case, all bets are off and we simply must
invalidate the whole affair.  Not that VIVT icaches are tagged with
vmids, and we are out of the woods on that one.  Alexander Graf was nice
enough to remind us of this massive pain.

Reviewed-by: Will Deacon <will.deacon@arm.com>
Reviewed-by: Marcelo Tosatti <mtosatti@redhat.com>
Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
Signed-off-by: Christoffer Dall <c.dall@virtualopensystems.com>
2013-01-23 13:29:16 -05:00
Rusty Russell
4fe21e4c6d KVM: ARM: VFP userspace interface
We use space #18 for floating point regs.

Reviewed-by: Will Deacon <will.deacon@arm.com>
Reviewed-by: Marcelo Tosatti <mtosatti@redhat.com>
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
Signed-off-by: Christoffer Dall <c.dall@virtualopensystems.com>
2013-01-23 13:29:15 -05:00
Christoffer Dall
c27581ed32 KVM: ARM: Demux CCSIDR in the userspace API
The Cache Size Selection Register (CSSELR) selects the current Cache
Size ID Register (CCSIDR).  You write which cache you are interested
in to CSSELR, and read the information out of CCSIDR.

Which cache numbers are valid is known by reading the Cache Level ID
Register (CLIDR).

To export this state to userspace, we add a KVM_REG_ARM_DEMUX
numberspace (17), which uses 8 bits to represent which register is
being demultiplexed (0 for CCSIDR), and the lower 8 bits to represent
this demultiplexing (in our case, the CSSELR value, which is 4 bits).

Reviewed-by: Will Deacon <will.deacon@arm.com>
Reviewed-by: Marcelo Tosatti <mtosatti@redhat.com>
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
Signed-off-by: Christoffer Dall <c.dall@virtualopensystems.com>
2013-01-23 13:29:14 -05:00
Christoffer Dall
1138245ccf KVM: ARM: User space API for getting/setting co-proc registers
The following three ioctls are implemented:
 -  KVM_GET_REG_LIST
 -  KVM_GET_ONE_REG
 -  KVM_SET_ONE_REG

Now we have a table for all the cp15 registers, we can drive a generic
API.

The register IDs carry the following encoding:

ARM registers are mapped using the lower 32 bits.  The upper 16 of that
is the register group type, or coprocessor number:

ARM 32-bit CP15 registers have the following id bit patterns:
  0x4002 0000 000F <zero:1> <crn:4> <crm:4> <opc1:4> <opc2:3>

ARM 64-bit CP15 registers have the following id bit patterns:
  0x4003 0000 000F <zero:1> <zero:4> <crm:4> <opc1:4> <zero:3>

For futureproofing, we need to tell QEMU about the CP15 registers the
host lets the guest access.

It will need this information to restore a current guest on a future
CPU or perhaps a future KVM which allow some of these to be changed.

We use a separate table for these, as they're only for the userspace API.

Reviewed-by: Will Deacon <will.deacon@arm.com>
Reviewed-by: Marcelo Tosatti <mtosatti@redhat.com>
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
Signed-off-by: Christoffer Dall <c.dall@virtualopensystems.com>
2013-01-23 13:29:14 -05:00
Christoffer Dall
5b3e5e5bf2 KVM: ARM: Emulation framework and CP15 emulation
Adds a new important function in the main KVM/ARM code called
handle_exit() which is called from kvm_arch_vcpu_ioctl_run() on returns
from guest execution. This function examines the Hyp-Syndrome-Register
(HSR), which contains information telling KVM what caused the exit from
the guest.

Some of the reasons for an exit are CP15 accesses, which are
not allowed from the guest and this commit handles these exits by
emulating the intended operation in software and skipping the guest
instruction.

Minor notes about the coproc register reset:
1) We reserve a value of 0 as an invalid cp15 offset, to catch bugs in our
   table, at cost of 4 bytes per vcpu.

2) Added comments on the table indicating how we handle each register, for
   simplicity of understanding.

Reviewed-by: Will Deacon <will.deacon@arm.com>
Reviewed-by: Marcelo Tosatti <mtosatti@redhat.com>
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
Signed-off-by: Christoffer Dall <c.dall@virtualopensystems.com>
2013-01-23 13:29:13 -05:00
Christoffer Dall
f7ed45be3b KVM: ARM: World-switch implementation
Provides complete world-switch implementation to switch to other guests
running in non-secure modes. Includes Hyp exception handlers that
capture necessary exception information and stores the information on
the VCPU and KVM structures.

The following Hyp-ABI is also documented in the code:

Hyp-ABI: Calling HYP-mode functions from host (in SVC mode):
   Switching to Hyp mode is done through a simple HVC #0 instruction. The
   exception vector code will check that the HVC comes from VMID==0 and if
   so will push the necessary state (SPSR, lr_usr) on the Hyp stack.
   - r0 contains a pointer to a HYP function
   - r1, r2, and r3 contain arguments to the above function.
   - The HYP function will be called with its arguments in r0, r1 and r2.
   On HYP function return, we return directly to SVC.

A call to a function executing in Hyp mode is performed like the following:

        <svc code>
        ldr     r0, =BSYM(my_hyp_fn)
        ldr     r1, =my_param
        hvc #0  ; Call my_hyp_fn(my_param) from HYP mode
        <svc code>

Otherwise, the world-switch is pretty straight-forward. All state that
can be modified by the guest is first backed up on the Hyp stack and the
VCPU values is loaded onto the hardware. State, which is not loaded, but
theoretically modifiable by the guest is protected through the
virtualiation features to generate a trap and cause software emulation.
Upon guest returns, all state is restored from hardware onto the VCPU
struct and the original state is restored from the Hyp-stack onto the
hardware.

SMP support using the VMPIDR calculated on the basis of the host MPIDR
and overriding the low bits with KVM vcpu_id contributed by Marc Zyngier.

Reuse of VMIDs has been implemented by Antonios Motakis and adapated from
a separate patch into the appropriate patches introducing the
functionality. Note that the VMIDs are stored per VM as required by the ARM
architecture reference manual.

To support VFP/NEON we trap those instructions using the HPCTR. When
we trap, we switch the FPU.  After a guest exit, the VFP state is
returned to the host.  When disabling access to floating point
instructions, we also mask FPEXC_EN in order to avoid the guest
receiving Undefined instruction exceptions before we have a chance to
switch back the floating point state.  We are reusing vfp_hard_struct,
so we depend on VFPv3 being enabled in the host kernel, if not, we still
trap cp10 and cp11 in order to inject an undefined instruction exception
whenever the guest tries to use VFP/NEON. VFP/NEON developed by
Antionios Motakis and Rusty Russell.

Aborts that are permission faults, and not stage-1 page table walk, do
not report the faulting address in the HPFAR.  We have to resolve the
IPA, and store it just like the HPFAR register on the VCPU struct. If
the IPA cannot be resolved, it means another CPU is playing with the
page tables, and we simply restart the guest.  This quirk was fixed by
Marc Zyngier.

Reviewed-by: Will Deacon <will.deacon@arm.com>
Reviewed-by: Marcelo Tosatti <mtosatti@redhat.com>
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
Signed-off-by: Antonios Motakis <a.motakis@virtualopensystems.com>
Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
Signed-off-by: Christoffer Dall <c.dall@virtualopensystems.com>
2013-01-23 13:29:12 -05:00
Christoffer Dall
86ce85352f KVM: ARM: Inject IRQs and FIQs from userspace
All interrupt injection is now based on the VM ioctl KVM_IRQ_LINE.  This
works semantically well for the GIC as we in fact raise/lower a line on
a machine component (the gic).  The IOCTL uses the follwing struct.

struct kvm_irq_level {
	union {
		__u32 irq;     /* GSI */
		__s32 status;  /* not used for KVM_IRQ_LEVEL */
	};
	__u32 level;           /* 0 or 1 */
};

ARM can signal an interrupt either at the CPU level, or at the in-kernel irqchip
(GIC), and for in-kernel irqchip can tell the GIC to use PPIs designated for
specific cpus.  The irq field is interpreted like this:

  bits:  | 31 ... 24 | 23  ... 16 | 15    ...    0 |
  field: | irq_type  | vcpu_index |   irq_number   |

The irq_type field has the following values:
- irq_type[0]: out-of-kernel GIC: irq_number 0 is IRQ, irq_number 1 is FIQ
- irq_type[1]: in-kernel GIC: SPI, irq_number between 32 and 1019 (incl.)
               (the vcpu_index field is ignored)
- irq_type[2]: in-kernel GIC: PPI, irq_number between 16 and 31 (incl.)

The irq_number thus corresponds to the irq ID in as in the GICv2 specs.

This is documented in Documentation/kvm/api.txt.

Reviewed-by: Will Deacon <will.deacon@arm.com>
Reviewed-by: Marcelo Tosatti <mtosatti@redhat.com>
Signed-off-by: Christoffer Dall <c.dall@virtualopensystems.com>
2013-01-23 13:29:12 -05:00
Christoffer Dall
d5d8184d35 KVM: ARM: Memory virtualization setup
This commit introduces the framework for guest memory management
through the use of 2nd stage translation. Each VM has a pointer
to a level-1 table (the pgd field in struct kvm_arch) which is
used for the 2nd stage translations. Entries are added when handling
guest faults (later patch) and the table itself can be allocated and
freed through the following functions implemented in
arch/arm/kvm/arm_mmu.c:
 - kvm_alloc_stage2_pgd(struct kvm *kvm);
 - kvm_free_stage2_pgd(struct kvm *kvm);

Each entry in TLBs and caches are tagged with a VMID identifier in
addition to ASIDs. The VMIDs are assigned consecutively to VMs in the
order that VMs are executed, and caches and tlbs are invalidated when
the VMID space has been used to allow for more than 255 simultaenously
running guests.

The 2nd stage pgd is allocated in kvm_arch_init_vm(). The table is
freed in kvm_arch_destroy_vm(). Both functions are called from the main
KVM code.

We pre-allocate page table memory to be able to synchronize using a
spinlock and be called under rcu_read_lock from the MMU notifiers.  We
steal the mmu_memory_cache implementation from x86 and adapt for our
specific usage.

We support MMU notifiers (thanks to Marc Zyngier) through
kvm_unmap_hva and kvm_set_spte_hva.

Finally, define kvm_phys_addr_ioremap() to map a device at a guest IPA,
which is used by VGIC support to map the virtual CPU interface registers
to the guest. This support is added by Marc Zyngier.

Reviewed-by: Will Deacon <will.deacon@arm.com>
Reviewed-by: Marcelo Tosatti <mtosatti@redhat.com>
Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
Signed-off-by: Christoffer Dall <c.dall@virtualopensystems.com>
2013-01-23 13:29:11 -05:00
Christoffer Dall
342cd0ab0e KVM: ARM: Hypervisor initialization
Sets up KVM code to handle all exceptions taken to Hyp mode.

When the kernel is booted in Hyp mode, calling an hvc instruction with r0
pointing to the new vectors, the HVBAR is changed to the the vector pointers.
This allows subsystems (like KVM here) to execute code in Hyp-mode with the
MMU disabled.

We initialize other Hyp-mode registers and enables the MMU for Hyp-mode from
the id-mapped hyp initialization code. Afterwards, the HVBAR is changed to
point to KVM Hyp vectors used to catch guest faults and to switch to Hyp mode
to perform a world-switch into a KVM guest.

Also provides memory mapping code to map required code pages, data structures,
and I/O regions  accessed in Hyp mode at the same virtual address as the host
kernel virtual addresses, but which conforms to the architectural requirements
for translations in Hyp mode. This interface is added in arch/arm/kvm/arm_mmu.c
and comprises:
 - create_hyp_mappings(from, to);
 - create_hyp_io_mappings(from, to, phys_addr);
 - free_hyp_pmds();

Reviewed-by: Will Deacon <will.deacon@arm.com>
Reviewed-by: Marcelo Tosatti <mtosatti@redhat.com>
Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
Signed-off-by: Christoffer Dall <c.dall@virtualopensystems.com>
2013-01-23 13:29:10 -05:00
Christoffer Dall
749cf76c5a KVM: ARM: Initial skeleton to compile KVM support
Targets KVM support for Cortex A-15 processors.

Contains all the framework components, make files, header files, some
tracing functionality, and basic user space API.

Only supported core is Cortex-A15 for now.

Most functionality is in arch/arm/kvm/* or arch/arm/include/asm/kvm_*.h.

Reviewed-by: Will Deacon <will.deacon@arm.com>
Reviewed-by: Marcelo Tosatti <mtosatti@redhat.com>
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
Signed-off-by: Christoffer Dall <c.dall@virtualopensystems.com>
2013-01-23 13:29:10 -05:00