Commit Graph

110 Commits

Author SHA1 Message Date
Eric Auger
0b3289ebc2 KVM: arm: irqfd: fix value returned by kvm_irq_map_gsi
irqfd/arm curently does not support routing. kvm_irq_map_gsi is
supposed to return all the routing entries associated with the
provided gsi and return the number of those entries. We should
return 0 at this point.

Signed-off-by: Eric Auger <eric.auger@linaro.org>
Acked-by: Christoffer Dall <christoffer.dall@linaro.org>
Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
2015-04-22 15:37:54 +01:00
Paolo Bonzini
bf0fb67cf9 KVM/ARM changes for v4.1:
- fixes for live migration
 - irqfd support
 - kvm-io-bus & vgic rework to enable ioeventfd
 - page ageing for stage-2 translation
 - various cleanups
 -----BEGIN PGP SIGNATURE-----
 Version: GnuPG v1
 
 iQIcBAABAgAGBQJVHQ0kAAoJECPQ0LrRPXpDHKQQALjw6STaZd7n20OFopNgHd4P
 qVeWYEKBxnsiSvL4p3IOSlZlEul+7x08aZqtyxWQRQcDT4ggTI+3FKKfc+8yeRpH
 WV6YJP0bGqz7039PyMLuIgs48xkSZtntePw69hPJfHZh4C1RBlP5T2SfE8mU8VZX
 fWToiU3W12QfKnmN7JFgxZopnGhrYCrG0EexdTDziAZu0GEMlDrO4wnyTR60WCvT
 4TEF73R0kpAz4yplKuhcDHuxIG7VFhQ4z7b09M1JtR0gQ3wUvfbD3Wqqi49SwHkv
 NQOStcyLsIlDosSRcLXNCwb3IxjObXTBcAxnzgm2Aoc1xMMZX1ZPQNNs6zFZzycb
 2c6QMiQ35zm7ellbvrG+bT+BP86JYWcAtHjWcaUFgqSJjb8MtqcMtsCea/DURhqx
 /kictqbPYBBwKW6SKbkNkisz59hPkuQnv35fuf992MRCbT9LAXLPRLbcirucCzkE
 p1MOotsWoO3ldJMZaVn0KYk3sQf6mCIfbYPEdOcw3fhJlvyy3NdjVkLOFbA5UUg1
 rQ7Ru2rTemBc0ExVrymngNTMpMB4XcEeJzXfhcgMl3DWbDj60Ku/O26sDtZ6bsFv
 JuDYn8FVDHz9gpEQHgiUi1YMsBKXLhnILa1ppaa6AflykU3BRfYjAk1SXmX84nQK
 mJUJEdFuxi6pHN0UKxUI
 =avA4
 -----END PGP SIGNATURE-----

Merge tag 'kvm-arm-for-4.1' of git://git.kernel.org/pub/scm/linux/kernel/git/kvmarm/kvmarm into 'kvm-next'

KVM/ARM changes for v4.1:

- fixes for live migration
- irqfd support
- kvm-io-bus & vgic rework to enable ioeventfd
- page ageing for stage-2 translation
- various cleanups
2015-04-07 18:09:20 +02:00
Andre Przywara
950324ab81 KVM: arm/arm64: rework MMIO abort handling to use KVM MMIO bus
Currently we have struct kvm_exit_mmio for encapsulating MMIO abort
data to be passed on from syndrome decoding all the way down to the
VGIC register handlers. Now as we switch the MMIO handling to be
routed through the KVM MMIO bus, it does not make sense anymore to
use that structure already from the beginning. So we keep the data in
local variables until we put them into the kvm_io_bus framework.
Then we fill kvm_exit_mmio in the VGIC only, making it a VGIC private
structure. On that way we replace the data buffer in that structure
with a pointer pointing to a single location in a local variable, so
we get rid of some copying on the way.
With all of the virtual GIC emulation code now being registered with
the kvm_io_bus, we can remove all of the old MMIO handling code and
its dispatching functionality.

I didn't bother to rename kvm_exit_mmio (to vgic_mmio or something),
because that touches a lot of code lines without any good reason.

This is based on an original patch by Nikolay.

Signed-off-by: Andre Przywara <andre.przywara@arm.com>
Cc: Nikolay Nikolaev <n.nikolaev@virtualopensystems.com>
Reviewed-by: Marc Zyngier <marc.zyngier@arm.com>
Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
2015-03-30 17:07:19 +01:00
Andre Przywara
fb8f61abab KVM: arm/arm64: prepare GICv3 emulation to use kvm_io_bus MMIO handling
Using the framework provided by the recent vgic.c changes, we
register a kvm_io_bus device on mapping the virtual GICv3 resources.
The distributor mapping is pretty straight forward, but the
redistributors need some more love, since they need to be tagged with
the respective redistributor (read: VCPU) they are connected with.
We use the kvm_io_bus framework to register one devices per VCPU.

Signed-off-by: Andre Przywara <andre.przywara@arm.com>
Reviewed-by: Marc Zyngier <marc.zyngier@arm.com>
Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
2015-03-30 17:07:13 +01:00
Andre Przywara
0ba10d5392 KVM: arm/arm64: merge GICv3 RD_base and SGI_base register frames
Currently we handle the redistributor registers in two separate MMIO
regions, one for the overall behaviour and SPIs and one for the
SGIs/PPIs. That latter forces the creation of _two_ KVM I/O bus
devices for each redistributor.
Since the spec mandates those two pages to be contigious, we could as
well merge them and save the churn with the second KVM I/O bus device.

Signed-off-by: Andre Przywara <andre.przywara@arm.com>
Reviewed-by: Marc Zyngier <marc.zyngier@arm.com>
Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
2015-03-30 17:07:08 +01:00
Andre Przywara
a9cf86f62b KVM: arm/arm64: prepare GICv2 emulation to be handled by kvm_io_bus
Using the framework provided by the recent vgic.c changes we register
a kvm_io_bus device when initializing the virtual GICv2.

Signed-off-by: Andre Przywara <andre.przywara@arm.com>
Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
2015-03-26 21:43:15 +00:00
Andre Przywara
6777f77f0f KVM: arm/arm64: implement kvm_io_bus MMIO handling for the VGIC
Currently we use a lot of VGIC specific code to do the MMIO
dispatching.
Use the previous reworks to add kvm_io_bus style MMIO handlers.

Those are not yet called by the MMIO abort handler, also the actual
VGIC emulator function do not make use of it yet, but will be enabled
with the following patches.

Signed-off-by: Andre Przywara <andre.przywara@arm.com>
Reviewed-by: Marc Zyngier <marc.zyngier@arm.com>
Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
2015-03-26 21:43:14 +00:00
Andre Przywara
9f199d0a0e KVM: arm/arm64: simplify vgic_find_range() and callers
The vgic_find_range() function in vgic.c takes a struct kvm_exit_mmio
argument, but actually only used the length field in there. Since we
need to get rid of that structure in that part of the code anyway,
let's rework the function (and it's callers) to pass the length
argument to the function directly.

Signed-off-by: Andre Przywara <andre.przywara@arm.com>
Reviewed-by: Christoffer Dall <christoffer.dall@linaro.org>
Reviewed-by: Marc Zyngier <marc.zyngier@arm.com>
Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
2015-03-26 21:43:14 +00:00
Andre Przywara
cf50a1eb43 KVM: arm/arm64: rename struct kvm_mmio_range to vgic_io_range
The name "kvm_mmio_range" is a bit bold, given that it only covers
the VGIC's MMIO ranges. To avoid confusion with kvm_io_range, rename
it to vgic_io_range.

Signed-off-by: Andre Przywara <andre.przywara@arm.com>
Acked-by: Christoffer Dall <christoffer.dall@linaro.org>
Reviewed-by: Marc Zyngier <marc.zyngier@arm.com>
Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
2015-03-26 21:43:14 +00:00
Christoffer Dall
1a74847885 arm/arm64: KVM: Fix migration race in the arch timer
When a VCPU is no longer running, we currently check to see if it has a
timer scheduled in the future, and if it does, we schedule a host
hrtimer to notify is in case the timer expires while the VCPU is still
not running.  When the hrtimer fires, we mask the guest's timer and
inject the timer IRQ (still relying on the guest unmasking the time when
it receives the IRQ).

This is all good and fine, but when migration a VM (checkpoint/restore)
this introduces a race.  It is unlikely, but possible, for the following
sequence of events to happen:

 1. Userspace stops the VM
 2. Hrtimer for VCPU is scheduled
 3. Userspace checkpoints the VGIC state (no pending timer interrupts)
 4. The hrtimer fires, schedules work in a workqueue
 5. Workqueue function runs, masks the timer and injects timer interrupt
 6. Userspace checkpoints the timer state (timer masked)

At restore time, you end up with a masked timer without any timer
interrupts and your guest halts never receiving timer interrupts.

Fix this by only kicking the VCPU in the workqueue function, and sample
the expired state of the timer when entering the guest again and inject
the interrupt and mask the timer only then.

Signed-off-by: Christoffer Dall <christoffer.dall@linaro.org>
Signed-off-by: Alex Bennée <alex.bennee@linaro.org>
Signed-off-by: Christoffer Dall <christoffer.dall@linaro.org>
2015-03-14 13:48:00 +01:00
Christoffer Dall
47a98b15ba arm/arm64: KVM: support for un-queuing active IRQs
Migrating active interrupts causes the active state to be lost
completely. This implements some additional bitmaps to track the active
state on the distributor and export this to user space.

Signed-off-by: Christoffer Dall <christoffer.dall@linaro.org>
Signed-off-by: Alex Bennée <alex.bennee@linaro.org>
Signed-off-by: Christoffer Dall <christoffer.dall@linaro.org>
2015-03-14 13:46:44 +01:00
Alex Bennée
71760950bf arm/arm64: KVM: add a common vgic_queue_irq_to_lr fn
This helps re-factor away some of the repetitive code and makes the code
flow more nicely.

Signed-off-by: Alex Bennée <alex.bennee@linaro.org>
Signed-off-by: Christoffer Dall <christoffer.dall@linaro.org>
2015-03-14 13:44:52 +01:00
Christoffer Dall
ae705930fc arm/arm64: KVM: Keep elrsr/aisr in sync with software model
There is an interesting bug in the vgic code, which manifests itself
when the KVM run loop has a signal pending or needs a vmid generation
rollover after having disabled interrupts but before actually switching
to the guest.

In this case, we flush the vgic as usual, but we sync back the vgic
state and exit to userspace before entering the guest.  The consequence
is that we will be syncing the list registers back to the software model
using the GICH_ELRSR and GICH_EISR from the last execution of the guest,
potentially overwriting a list register containing an interrupt.

This showed up during migration testing where we would capture a state
where the VM has masked the arch timer but there were no interrupts,
resulting in a hung test.

Cc: Marc Zyngier <marc.zyngier@arm.com>
Reported-by: Alex Bennee <alex.bennee@linaro.org>
Signed-off-by: Christoffer Dall <christoffer.dall@linaro.org>
Signed-off-by: Alex Bennée <alex.bennee@linaro.org>
Acked-by: Marc Zyngier <marc.zyngier@arm.com>
Signed-off-by: Christoffer Dall <christoffer.dall@linaro.org>
2015-03-14 13:42:07 +01:00
Wei Yongjun
b52104e509 arm/arm64: KVM: fix missing unlock on error in kvm_vgic_create()
Add the missing unlock before return from function kvm_vgic_create()
in the error handling case.

Signed-off-by: Wei Yongjun <yongjun_wei@trendmicro.com.cn>
Signed-off-by: Christoffer Dall <christoffer.dall@linaro.org>
2015-03-13 11:40:57 +01:00
Eric Auger
174178fed3 KVM: arm/arm64: add irqfd support
This patch enables irqfd on arm/arm64.

Both irqfd and resamplefd are supported. Injection is implemented
in vgic.c without routing.

This patch enables CONFIG_HAVE_KVM_EVENTFD and CONFIG_HAVE_KVM_IRQFD.

KVM_CAP_IRQFD is now advertised. KVM_CAP_IRQFD_RESAMPLE capability
automatically is advertised as soon as CONFIG_HAVE_KVM_IRQFD is set.

Irqfd injection is restricted to SPI. The rationale behind not
supporting PPI irqfd injection is that any device using a PPI would
be a private-to-the-CPU device (timer for instance), so its state
would have to be context-switched along with the VCPU and would
require in-kernel wiring anyhow. It is not a relevant use case for
irqfds.

Signed-off-by: Eric Auger <eric.auger@linaro.org>
Reviewed-by: Christoffer Dall <christoffer.dall@linaro.org>
Acked-by: Marc Zyngier <marc.zyngier@arm.com>
Signed-off-by: Christoffer Dall <christoffer.dall@linaro.org>
2015-03-12 15:15:34 +01:00
Eric Auger
649cf73994 KVM: arm/arm64: remove coarse grain dist locking at kvm_vgic_sync_hwstate
To prepare for irqfd addition, coarse grain locking is removed at
kvm_vgic_sync_hwstate level and finer grain locking is introduced in
vgic_process_maintenance only.

Signed-off-by: Eric Auger <eric.auger@linaro.org>
Acked-by: Christoffer Dall <christoffer.dall@linaro.org>
Acked-by: Marc Zyngier <marc.zyngier@arm.com>
Signed-off-by: Christoffer Dall <christoffer.dall@linaro.org>
2015-03-12 15:15:33 +01:00
Mark Rutland
0f37247574 KVM: vgic: add virt-capable compatible strings
Several dts only list "arm,cortex-a7-gic" or "arm,gic-400" in their GIC
compatible list, and while this is correct (and supported by the GIC
driver), KVM will fail to detect that it can support these cases.

This patch adds the missing strings to the VGIC code. The of_device_id
entries are padded to keep the probe function data aligned.

Signed-off-by: Mark Rutland <mark.rutland@arm.com>
Cc: Andre Przywara <andre.przywara@arm.com>
Cc: Christoffer Dall <christoffer.dall@linaro.org>
Cc: Marc Zyngier <marc.zyngier@arm.com>
Cc: Michal Simek <monstr@monstr.eu>
Acked-by: Marc Zyngier <marc.zyngier@arm.com>
Signed-off-by: Christoffer Dall <christoffer.dall@linaro.org>
2015-03-11 15:27:48 +01:00
Linus Torvalds
b9085bcbf5 Fairly small update, but there are some interesting new features.
Common: Optional support for adding a small amount of polling on each HLT
 instruction executed in the guest (or equivalent for other architectures).
 This can improve latency up to 50% on some scenarios (e.g. O_DSYNC writes
 or TCP_RR netperf tests).  This also has to be enabled manually for now,
 but the plan is to auto-tune this in the future.
 
 ARM/ARM64: the highlights are support for GICv3 emulation and dirty page
 tracking
 
 s390: several optimizations and bugfixes.  Also a first: a feature
 exposed by KVM (UUID and long guest name in /proc/sysinfo) before
 it is available in IBM's hypervisor! :)
 
 MIPS: Bugfixes.
 
 x86: Support for PML (page modification logging, a new feature in
 Broadwell Xeons that speeds up dirty page tracking), nested virtualization
 improvements (nested APICv---a nice optimization), usual round of emulation
 fixes.  There is also a new option to reduce latency of the TSC deadline
 timer in the guest; this needs to be tuned manually.
 
 Some commits are common between this pull and Catalin's; I see you
 have already included his tree.
 
 ARM has other conflicts where functions are added in the same place
 by 3.19-rc and 3.20 patches.  These are not large though, and entirely
 within KVM.
 -----BEGIN PGP SIGNATURE-----
 Version: GnuPG v2.0.22 (GNU/Linux)
 
 iQEcBAABAgAGBQJU28rkAAoJEL/70l94x66DXqQH/1TDOfJIjW7P2kb0Sw7Fy1wi
 cEX1KO/VFxAqc8R0E/0Wb55CXyPjQJM6xBXuFr5cUDaIjQ8ULSktL4pEwXyyv/s5
 DBDkN65mriry2w5VuEaRLVcuX9Wy+tqLQXWNkEySfyb4uhZChWWHvKEcgw5SqCyg
 NlpeHurYESIoNyov3jWqvBjr4OmaQENyv7t2c6q5ErIgG02V+iCux5QGbphM2IC9
 LFtPKxoqhfeB2xFxTOIt8HJiXrZNwflsTejIlCl/NSEiDVLLxxHCxK2tWK/tUXMn
 JfLD9ytXBWtNMwInvtFm4fPmDouv2VDyR0xnK2db+/axsJZnbxqjGu1um4Dqbak=
 =7gdx
 -----END PGP SIGNATURE-----

Merge tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm

Pull KVM update from Paolo Bonzini:
 "Fairly small update, but there are some interesting new features.

  Common:
     Optional support for adding a small amount of polling on each HLT
     instruction executed in the guest (or equivalent for other
     architectures).  This can improve latency up to 50% on some
     scenarios (e.g. O_DSYNC writes or TCP_RR netperf tests).  This
     also has to be enabled manually for now, but the plan is to
     auto-tune this in the future.

  ARM/ARM64:
     The highlights are support for GICv3 emulation and dirty page
     tracking

  s390:
     Several optimizations and bugfixes.  Also a first: a feature
     exposed by KVM (UUID and long guest name in /proc/sysinfo) before
     it is available in IBM's hypervisor! :)

  MIPS:
     Bugfixes.

  x86:
     Support for PML (page modification logging, a new feature in
     Broadwell Xeons that speeds up dirty page tracking), nested
     virtualization improvements (nested APICv---a nice optimization),
     usual round of emulation fixes.

     There is also a new option to reduce latency of the TSC deadline
     timer in the guest; this needs to be tuned manually.

     Some commits are common between this pull and Catalin's; I see you
     have already included his tree.

  Powerpc:
     Nothing yet.

     The KVM/PPC changes will come in through the PPC maintainers,
     because I haven't received them yet and I might end up being
     offline for some part of next week"

* tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm: (130 commits)
  KVM: ia64: drop kvm.h from installed user headers
  KVM: x86: fix build with !CONFIG_SMP
  KVM: x86: emulate: correct page fault error code for NoWrite instructions
  KVM: Disable compat ioctl for s390
  KVM: s390: add cpu model support
  KVM: s390: use facilities and cpu_id per KVM
  KVM: s390/CPACF: Choose crypto control block format
  s390/kernel: Update /proc/sysinfo file with Extended Name and UUID
  KVM: s390: reenable LPP facility
  KVM: s390: floating irqs: fix user triggerable endless loop
  kvm: add halt_poll_ns module parameter
  kvm: remove KVM_MMIO_SIZE
  KVM: MIPS: Don't leak FPU/DSP to guest
  KVM: MIPS: Disable HTW while in guest
  KVM: nVMX: Enable nested posted interrupt processing
  KVM: nVMX: Enable nested virtual interrupt delivery
  KVM: nVMX: Enable nested apic register virtualization
  KVM: nVMX: Make nested control MSRs per-cpu
  KVM: nVMX: Enable nested virtualize x2apic mode
  KVM: nVMX: Prepare for using hardware MSR bitmap
  ...
2015-02-13 09:55:09 -08:00
Andre Przywara
4fa96afd94 arm/arm64: KVM: force alignment of VGIC dist/CPU/redist addresses
Although the GIC architecture requires us to map the MMIO regions
only at page aligned addresses, we currently do not enforce this from
the kernel side.
Restrict any vGICv2 regions to be 4K aligned and any GICv3 regions
to be 64K aligned. Document this requirement.

Signed-off-by: Andre Przywara <andre.przywara@arm.com>
Signed-off-by: Christoffer Dall <christoffer.dall@linaro.org>
2015-01-20 18:25:33 +01:00
Andre Przywara
ac3d373564 arm/arm64: KVM: allow userland to request a virtual GICv3
With all of the GICv3 code in place now we allow userland to ask the
kernel for using a virtual GICv3 in the guest.
Also we provide the necessary support for guests setting the memory
addresses for the virtual distributor and redistributors.
This requires some userland code to make use of that feature and
explicitly ask for a virtual GICv3.
Document that KVM_CREATE_IRQCHIP only works for GICv2, but is
considered legacy and using KVM_CREATE_DEVICE is preferred.

Signed-off-by: Andre Przywara <andre.przywara@arm.com>
Signed-off-by: Christoffer Dall <christoffer.dall@linaro.org>
2015-01-20 18:25:33 +01:00
Andre Przywara
b5d84ff600 arm/arm64: KVM: enable kernel side of GICv3 emulation
With all the necessary GICv3 emulation code in place, we can now
connect the code to the GICv3 backend in the kernel.
The LR register handling is different depending on the emulated GIC
model, so provide different implementations for each.
Also allow non-v2-compatible GICv3 implementations (which don't
provide MMIO regions for the virtual CPU interface in the DT), but
restrict those hosts to support GICv3 guests only.
If the device tree provides a GICv2 compatible GICV resource entry,
but that one is faulty, just disable the GICv2 emulation and let the
user use at least the GICv3 emulation for guests.
To provide proper support for the legacy KVM_CREATE_IRQCHIP ioctl,
note virtual GICv2 compatibility in struct vgic_params and use it
on creating a VGICv2.

Signed-off-by: Andre Przywara <andre.przywara@arm.com>
Signed-off-by: Christoffer Dall <christoffer.dall@linaro.org>
2015-01-20 18:25:32 +01:00
Andre Przywara
6d52f35af1 arm64: KVM: add SGI generation register emulation
While the generation of a (virtual) inter-processor interrupt (SGI)
on a GICv2 works by writing to a MMIO register, GICv3 uses the system
register ICC_SGI1R_EL1 to trigger them.
Add a trap handler function that calls the new SGI register handler
in the GICv3 code. As ICC_SRE_EL1.SRE at this point is still always 0,
this will not trap yet, but will only be used later when all the data
structures have been initialized properly.

Signed-off-by: Andre Przywara <andre.przywara@arm.com>
Reviewed-by: Christoffer Dall <christoffer.dall@linaro.org>
Signed-off-by: Christoffer Dall <christoffer.dall@linaro.org>
2015-01-20 18:25:32 +01:00
Andre Przywara
a0675c25d6 arm/arm64: KVM: add virtual GICv3 distributor emulation
With everything separated and prepared, we implement a model of a
GICv3 distributor and redistributors by using the existing framework
to provide handler functions for each register group.

Currently we limit the emulation to a model enforcing a single
security state, with SRE==1 (forcing system register access) and
ARE==1 (allowing more than 8 VCPUs).

We share some of the functions provided for GICv2 emulation, but take
the different ways of addressing (v)CPUs into account.
Save and restore is currently not implemented.

Similar to the split-off of the GICv2 specific code, the new emulation
code goes into a new file (vgic-v3-emul.c).

Signed-off-by: Andre Przywara <andre.przywara@arm.com>
Signed-off-by: Christoffer Dall <christoffer.dall@linaro.org>
2015-01-20 18:25:31 +01:00
Andre Przywara
9fedf14677 arm/arm64: KVM: add opaque private pointer to MMIO data
For a GICv2 there is always only one (v)CPU involved: the one that
does the access. On a GICv3 the access to a CPU redistributor is
memory-mapped, but not banked, so the (v)CPU affected is determined by
looking at the MMIO address region being accessed.
To allow passing the affected CPU into the accessors later, extend
struct kvm_exit_mmio to add an opaque private pointer parameter.
The current GICv2 emulation just does not use it.

Signed-off-by: Andre Przywara <andre.przywara@arm.com>
Acked-by: Christoffer Dall <christoffer.dall@linaro.org>
Signed-off-by: Christoffer Dall <christoffer.dall@linaro.org>
2015-01-20 18:25:30 +01:00
Andre Przywara
1d916229e3 arm/arm64: KVM: split GICv2 specific emulation code from vgic.c
vgic.c is currently a mixture of generic vGIC emulation code and
functions specific to emulating a GICv2. To ease the addition of
GICv3, split off strictly v2 specific parts into a new file
vgic-v2-emul.c.

Signed-off-by: Andre Przywara <andre.przywara@arm.com>
Acked-by: Christoffer Dall <christoffer.dall@linaro.org>

-------
As the diff isn't always obvious here (and to aid eventual rebases),
here is a list of high-level changes done to the code:
* added new file to respective arm/arm64 Makefiles
* moved GICv2 specific functions to vgic-v2-emul.c:
  - handle_mmio_misc()
  - handle_mmio_set_enable_reg()
  - handle_mmio_clear_enable_reg()
  - handle_mmio_set_pending_reg()
  - handle_mmio_clear_pending_reg()
  - handle_mmio_priority_reg()
  - vgic_get_target_reg()
  - vgic_set_target_reg()
  - handle_mmio_target_reg()
  - handle_mmio_cfg_reg()
  - handle_mmio_sgi_reg()
  - vgic_v2_unqueue_sgi()
  - read_set_clear_sgi_pend_reg()
  - write_set_clear_sgi_pend_reg()
  - handle_mmio_sgi_set()
  - handle_mmio_sgi_clear()
  - vgic_v2_handle_mmio()
  - vgic_get_sgi_sources()
  - vgic_dispatch_sgi()
  - vgic_v2_queue_sgi()
  - vgic_v2_map_resources()
  - vgic_v2_init()
  - vgic_v2_add_sgi_source()
  - vgic_v2_init_model()
  - vgic_v2_init_emulation()
  - handle_cpu_mmio_misc()
  - handle_mmio_abpr()
  - handle_cpu_mmio_ident()
  - vgic_attr_regs_access()
  - vgic_create() (renamed to vgic_v2_create())
  - vgic_destroy() (renamed to vgic_v2_destroy())
  - vgic_has_attr() (renamed to vgic_v2_has_attr())
  - vgic_set_attr() (renamed to vgic_v2_set_attr())
  - vgic_get_attr() (renamed to vgic_v2_get_attr())
  - struct kvm_mmio_range vgic_dist_ranges[]
  - struct kvm_mmio_range vgic_cpu_ranges[]
  - struct kvm_device_ops kvm_arm_vgic_v2_ops {}

Signed-off-by: Christoffer Dall <christoffer.dall@linaro.org>
2015-01-20 18:25:30 +01:00
Andre Przywara
832158125d arm/arm64: KVM: add vgic.h header file
vgic.c is currently a mixture of generic vGIC emulation code and
functions specific to emulating a GICv2. To ease the addition of
GICv3 later, we create new header file vgic.h, which holds constants
and prototypes of commonly used functions.
Rename some identifiers to avoid name space clutter.
I removed the long-standing comment about using the kvm_io_bus API
to tackle the GIC register ranges, as it wouldn't be a win for us
anymore.

Signed-off-by: Andre Przywara <andre.przywara@arm.com>
Acked-by: Christoffer Dall <christoffer.dall@linaro.org>

-------
As the diff isn't always obvious here (and to aid eventual rebases),
here is a list of high-level changes done to the code:
* moved definitions and prototypes from vgic.c to vgic.h:
  - VGIC_ADDR_UNDEF
  - ACCESS_{READ,WRITE}_*
  - vgic_init()
  - vgic_update_state()
  - vgic_kick_vcpus()
  - vgic_get_vmcr()
  - vgic_set_vmcr()
  - struct mmio_range {} (renamed to struct kvm_mmio_range)
* removed static keyword and exported prototype in vgic.h:
  - vgic_bitmap_get_reg()
  - vgic_bitmap_set_irq_val()
  - vgic_bitmap_get_shared_map()
  - vgic_bytemap_get_reg()
  - vgic_dist_irq_set_pending()
  - vgic_dist_irq_clear_pending()
  - vgic_cpu_irq_clear()
  - vgic_reg_access()
  - handle_mmio_raz_wi()
  - vgic_handle_enable_reg()
  - vgic_handle_set_pending_reg()
  - vgic_handle_clear_pending_reg()
  - vgic_handle_cfg_reg()
  - vgic_unqueue_irqs()
  - find_matching_range() (renamed to vgic_find_range)
  - vgic_handle_mmio_range()
  - vgic_update_state()
  - vgic_get_vmcr()
  - vgic_set_vmcr()
  - vgic_queue_irq()
  - vgic_kick_vcpus()
  - vgic_init()
  - vgic_v2_init_emulation()
  - vgic_has_attr_regs()
  - vgic_set_common_attr()
  - vgic_get_common_attr()
  - vgic_destroy()
  - vgic_create()
* moved functions to vgic.h (static inline):
  - mmio_data_read()
  - mmio_data_write()
  - is_in_range()

Signed-off-by: Christoffer Dall <christoffer.dall@linaro.org>
2015-01-20 18:25:30 +01:00
Andre Przywara
b60da146c1 arm/arm64: KVM: refactor/wrap vgic_set/get_attr()
vgic_set_attr() and vgic_get_attr() contain both code specific for
the emulated GIC as well as code for the userland facing, generic
part of the GIC.
Split the guest GIC facing code of from the generic part to allow
easier splitting later.

Signed-off-by: Andre Przywara <andre.przywara@arm.com>
Reviewed-by: Christoffer Dall <christoffer.dall@linaro.org>
Signed-off-by: Christoffer Dall <christoffer.dall@linaro.org>
2015-01-20 18:25:29 +01:00
Andre Przywara
d97f683d0f arm/arm64: KVM: refactor MMIO accessors
The MMIO accessors for GICD_I[CS]ENABLER, GICD_I[CS]PENDR and
GICD_ICFGR behave very similar for GICv2 and GICv3, although the way
the affected VCPU is determined differs.
Since we need them to access the registers from three different
places in the future, we factor out a generic, backend-facing
implementation and use small wrappers in the current GICv2 emulation.
This will ease adding GICv3 accessors later.

Signed-off-by: Andre Przywara <andre.przywara@arm.com>
Reviewed-by: Christoffer Dall <christoffer.dall@linaro.org>
Signed-off-by: Christoffer Dall <christoffer.dall@linaro.org>
2015-01-20 18:25:29 +01:00
Andre Przywara
2f5fa41a7a arm/arm64: KVM: make the value of ICC_SRE_EL1 a per-VM variable
ICC_SRE_EL1 is a system register allowing msr/mrs accesses to the
GIC CPU interface for EL1 (guests). Currently we force it to 0, but
for proper GICv3 support we have to allow guests to use it (depending
on their selected virtual GIC model).
So add ICC_SRE_EL1 to the list of saved/restored registers on a
world switch, but actually disallow a guest to change it by only
restoring a fixed, once-initialized value.
This value depends on the GIC model userland has chosen for a guest.

Signed-off-by: Andre Przywara <andre.przywara@arm.com>
Reviewed-by: Christoffer Dall <christoffer.dall@linaro.org>
Acked-by: Marc Zyngier <marc.zyngier@arm.com>
Signed-off-by: Christoffer Dall <christoffer.dall@linaro.org>
2015-01-20 18:25:28 +01:00
Andre Przywara
3caa2d8c3b arm/arm64: KVM: make the maximum number of vCPUs a per-VM value
Currently the maximum number of vCPUs supported is a global value
limited by the used GIC model. GICv3 will lift this limit, but we
still need to observe it for guests using GICv2.
So the maximum number of vCPUs is per-VM value, depending on the
GIC model the guest uses.
Store and check the value in struct kvm_arch, but keep it down to
8 for now.

Signed-off-by: Andre Przywara <andre.przywara@arm.com>
Signed-off-by: Christoffer Dall <christoffer.dall@linaro.org>
2015-01-20 18:25:28 +01:00
Andre Przywara
4ce7ebdfc6 arm/arm64: KVM: dont rely on a valid GICH base address
To check whether the vGIC was already initialized, we currently check
the GICH base address for not being NULL. Since with GICv3 we may
get along without this address, lets use the irqchip_in_kernel()
function to detect an already initialized vGIC.

Signed-off-by: Andre Przywara <andre.przywara@arm.com>
Acked-by: Christoffer Dall <christoffer.dall@linaro.org>
Acked-by: Marc Zyngier <marc.zyngier@arm.com>
Signed-off-by: Christoffer Dall <christoffer.dall@linaro.org>
2015-01-20 18:25:27 +01:00
Andre Przywara
ea2f83a7de arm/arm64: KVM: move kvm_register_device_ops() into vGIC probing
Currently we unconditionally register the GICv2 emulation device
during the host's KVM initialization. Since with GICv3 support we
may end up with only v2 or only v3 or both supported, we move the
registration into the GIC probing function, where we will later know
which combination is valid.

Signed-off-by: Andre Przywara <andre.przywara@arm.com>
Acked-by: Christoffer Dall <christoffer.dall@linaro.org>
Acked-by: Marc Zyngier <marc.zyngier@arm.com>
Signed-off-by: Christoffer Dall <christoffer.dall@linaro.org>
2015-01-20 18:25:27 +01:00
Andre Przywara
b26e5fdac4 arm/arm64: KVM: introduce per-VM ops
Currently we only have one virtual GIC model supported, so all guests
use the same emulation code. With the addition of another model we
end up with different guests using potentially different vGIC models,
so we have to split up some functions to be per VM.
Introduce a vgic_vm_ops struct to hold function pointers for those
functions that are different and provide the necessary code to
initialize them.
Also split up the vgic_init() function to separate out VGIC model
specific functionality into a separate function, which will later be
different for a GICv3 model.

Signed-off-by: Andre Przywara <andre.przywara@arm.com>
Reviewed-by: Christoffer Dall <christoffer.dall@linaro.org>
Signed-off-by: Christoffer Dall <christoffer.dall@linaro.org>
2015-01-20 18:25:26 +01:00
Andre Przywara
05bc8aafe6 arm/arm64: KVM: wrap 64 bit MMIO accesses with two 32 bit ones
Some GICv3 registers can and will be accessed as 64 bit registers.
Currently the register handling code can only deal with 32 bit
accesses, so we do two consecutive calls to cover this.

Signed-off-by: Andre Przywara <andre.przywara@arm.com>
Reviewed-by: Christoffer Dall <christoffer.dall@linaro.org>
Acked-by: Marc Zyngier <marc.zyngier@arm.com>
Signed-off-by: Christoffer Dall <christoffer.dall@linaro.org>
2015-01-20 18:25:26 +01:00
Andre Przywara
96415257a1 arm/arm64: KVM: refactor vgic_handle_mmio() function
Currently we only need to deal with one MMIO region for the GIC
emulation (the GICv2 distributor), but we soon need to extend this.
Refactor the existing code to allow easier addition of different
ranges without code duplication.

Signed-off-by: Andre Przywara <andre.przywara@arm.com>
Reviewed-by: Christoffer Dall <christoffer.dall@linaro.org>
Acked-by: Marc Zyngier <marc.zyngier@arm.com>
Signed-off-by: Christoffer Dall <christoffer.dall@linaro.org>
2015-01-20 18:25:25 +01:00
Andre Przywara
59892136c4 arm/arm64: KVM: pass down user space provided GIC type into vGIC code
With the introduction of a second emulated GIC model we need to let
userspace specify the GIC model to use for each VM. Pass the
userspace provided value down into the vGIC code and store it there
to differentiate later.

Signed-off-by: Andre Przywara <andre.przywara@arm.com>
Acked-by: Christoffer Dall <christoffer.dall@linaro.org>
Acked-by: Marc Zyngier <marc.zyngier@arm.com>
Signed-off-by: Christoffer Dall <christoffer.dall@linaro.org>
2015-01-20 18:25:25 +01:00
Eric Auger
065c003482 KVM: arm/arm64: vgic: add init entry to VGIC KVM device
Since the advent of VGIC dynamic initialization, this latter is
initialized quite late on the first vcpu run or "on-demand", when
injecting an IRQ or when the guest sets its registers.

This initialization could be initiated explicitly much earlier
by the users-space, as soon as it has provided the requested
dimensioning parameters.

This patch adds a new entry to the VGIC KVM device that allows
the user to manually request the VGIC init:
- a new KVM_DEV_ARM_VGIC_GRP_CTRL group is introduced.
- Its first attribute is KVM_DEV_ARM_VGIC_CTRL_INIT

The rationale behind introducing a group is to be able to add other
controls later on, if needed.

Signed-off-by: Eric Auger <eric.auger@linaro.org>
Signed-off-by: Christoffer Dall <christoffer.dall@linaro.org>
2015-01-11 14:12:15 +01:00
Eric Auger
66b030e48a KVM: arm/arm64: vgic: vgic_init returns -ENODEV when no online vcpu
To be more explicit on vgic initialization failure, -ENODEV is
returned by vgic_init when no online vcpus can be found at init.

Signed-off-by: Eric Auger <eric.auger@linaro.org>
Signed-off-by: Christoffer Dall <christoffer.dall@linaro.org>
2015-01-11 14:12:15 +01:00
Richard Cochran
2eebdde652 timecounter: keep track of accumulated fractional nanoseconds
The current timecounter implementation will drop a variable amount
of resolution, depending on the magnitude of the time delta. In
other words, reading the clock too often or too close to a time
stamp conversion will introduce errors into the time values. This
patch fixes the issue by introducing a fractional nanosecond field
that accumulates the low order bits.

Reported-by: Janusz Użycki <j.uzycki@elproma.com.pl>
Signed-off-by: Richard Cochran <richardcochran@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2014-12-30 18:29:27 -05:00
Linus Torvalds
66dcff86ba 3.19 changes for KVM:
- spring cleaning: removed support for IA64, and for hardware-assisted
 virtualization on the PPC970
 - ARM, PPC, s390 all had only small fixes
 
 For x86:
 - small performance improvements (though only on weird guests)
 - usual round of hardware-compliancy fixes from Nadav
 - APICv fixes
 - XSAVES support for hosts and guests.  XSAVES hosts were broken because
 the (non-KVM) XSAVES patches inadvertently changed the KVM userspace
 ABI whenever XSAVES was enabled; hence, this part is going to stable.
 Guest support is just a matter of exposing the feature and CPUID leaves
 support.
 
 Right now KVM is broken for PPC BookE in your tree (doesn't compile).
 I'll reply to the pull request with a patch, please apply it either
 before the pull request or in the merge commit, in order to preserve
 bisectability somewhat.
 -----BEGIN PGP SIGNATURE-----
 Version: GnuPG v2.0.22 (GNU/Linux)
 
 iQEcBAABAgAGBQJUkpg+AAoJEL/70l94x66DUmoH/jzXYkptSW9NGgm79KqxGJlD
 lzLnLBkitVvx++Mz5YBhdJEhKKLUlCtifFT1zPJQ/pthQhIRSaaAwZyNGgUs5w5x
 yMGKHiPQFyZRbmQtZhCInW0BftJoYHHciO3nUfHCZnp34My9MP2D55W7/z+fYFfQ
 DuqBSE9ThyZJtZ4zh8NRA9fCOeuqwVYRyoBs820Wbsh4cpIBoIK63Dg7k+CLE+ZV
 MZa/mRL6bAfsn9W5bnOUAgHJ3SPznnWbO3/g0aV+roL/5pffblprJx9lKNR08xUM
 6hDFLop2gDehDJesDkY/o8Ckp1hEouvfsVpSShry4vcgtn0hgh2O5/6Orbmj6vE=
 =Zwq1
 -----END PGP SIGNATURE-----

Merge tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm

Pull KVM update from Paolo Bonzini:
 "3.19 changes for KVM:

   - spring cleaning: removed support for IA64, and for hardware-
     assisted virtualization on the PPC970

   - ARM, PPC, s390 all had only small fixes

  For x86:
   - small performance improvements (though only on weird guests)
   - usual round of hardware-compliancy fixes from Nadav
   - APICv fixes
   - XSAVES support for hosts and guests.  XSAVES hosts were broken
     because the (non-KVM) XSAVES patches inadvertently changed the KVM
     userspace ABI whenever XSAVES was enabled; hence, this part is
     going to stable.  Guest support is just a matter of exposing the
     feature and CPUID leaves support"

* tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm: (179 commits)
  KVM: move APIC types to arch/x86/
  KVM: PPC: Book3S: Enable in-kernel XICS emulation by default
  KVM: PPC: Book3S HV: Improve H_CONFER implementation
  KVM: PPC: Book3S HV: Fix endianness of instruction obtained from HEIR register
  KVM: PPC: Book3S HV: Remove code for PPC970 processors
  KVM: PPC: Book3S HV: Tracepoints for KVM HV guest interactions
  KVM: PPC: Book3S HV: Simplify locking around stolen time calculations
  arch: powerpc: kvm: book3s_paired_singles.c: Remove unused function
  arch: powerpc: kvm: book3s_pr.c: Remove unused function
  arch: powerpc: kvm: book3s.c: Remove some unused functions
  arch: powerpc: kvm: book3s_32_mmu.c: Remove unused function
  KVM: PPC: Book3S HV: Check wait conditions before sleeping in kvmppc_vcore_blocked
  KVM: PPC: Book3S HV: ptes are big endian
  KVM: PPC: Book3S HV: Fix inaccuracies in ICP emulation for H_IPI
  KVM: PPC: Book3S HV: Fix KSM memory corruption
  KVM: PPC: Book3S HV: Fix an issue where guest is paused on receiving HMI
  KVM: PPC: Book3S HV: Fix computation of tlbie operand
  KVM: PPC: Book3S HV: Add missing HPTE unlock
  KVM: PPC: BookE: Improve irq inject tracepoint
  arm/arm64: KVM: Require in-kernel vgic for the arch timers
  ...
2014-12-18 16:05:28 -08:00
Christoffer Dall
05971120fc arm/arm64: KVM: Require in-kernel vgic for the arch timers
It is curently possible to run a VM with architected timers support
without creating an in-kernel VGIC, which will result in interrupts from
the virtual timer going nowhere.

To address this issue, move the architected timers initialization to the
time when we run a VCPU for the first time, and then only initialize
(and enable) the architected timers if we have a properly created and
initialized in-kernel VGIC.

When injecting interrupts from the virtual timer to the vgic, the
current setup should ensure that this never calls an on-demand init of
the VGIC, which is the only call path that could return an error from
kvm_vgic_inject_irq(), so capture the return value and raise a warning
if there's an error there.

We also change the kvm_timer_init() function from returning an int to be
a void function, since the function always succeeds.

Reviewed-by: Marc Zyngier <marc.zyngier@arm.com>
Signed-off-by: Christoffer Dall <christoffer.dall@linaro.org>
2014-12-15 11:50:42 +01:00
Christoffer Dall
ca7d9c829d arm/arm64: KVM: Initialize the vgic on-demand when injecting IRQs
Userspace assumes that it can wire up IRQ injections after having
created all VCPUs and after having created the VGIC, but potentially
before starting the first VCPU.  This can currently lead to lost IRQs
because the state of that IRQ injection is not stored anywhere and we
don't return an error to userspace.

We haven't seen this problem manifest itself yet, presumably because
guests reset the devices on boot, but this could cause issues with
migration and other non-standard startup configurations.

Reviewed-by: Marc Zyngier <marc.zyngier@arm.com>
Signed-off-by: Christoffer Dall <christoffer.dall@linaro.org>
2014-12-15 11:36:21 +01:00
Christoffer Dall
1f57be2895 arm/arm64: KVM: Add (new) vgic_initialized macro
Some code paths will need to check to see if the internal state of the
vgic has been initialized (such as when creating new VCPUs), so
introduce such a macro that checks the nr_cpus field which is set when
the vgic has been initialized.

Also set nr_cpus = 0 in kvm_vgic_destroy, because the error path in
vgic_init() will call this function, and code should never errornously
assume the vgic to be properly initialized after an error.

Acked-by: Marc Zyngier <marc.zyngier@arm.com>
Reviewed-by: Eric Auger <eric.auger@linaro.org>
Signed-off-by: Christoffer Dall <christoffer.dall@linaro.org>
2014-12-13 14:17:10 +01:00
Christoffer Dall
c52edf5f8c arm/arm64: KVM: Rename vgic_initialized to vgic_ready
The vgic_initialized() macro currently returns the state of the
vgic->ready flag, which indicates if the vgic is ready to be used when
running a VM, not specifically if its internal state has been
initialized.

Rename the macro accordingly in preparation for a more nuanced
initialization flow.

Acked-by: Marc Zyngier <marc.zyngier@arm.com>
Reviewed-by: Eric Auger <eric.auger@linaro.org>
Signed-off-by: Christoffer Dall <christoffer.dall@linaro.org>
2014-12-13 14:17:05 +01:00
Peter Maydell
6d3cfbe21b arm/arm64: KVM: vgic: move reset initialization into vgic_init_maps()
VGIC initialization currently happens in three phases:
 (1) kvm_vgic_create() (triggered by userspace GIC creation)
 (2) vgic_init_maps() (triggered by userspace GIC register read/write
     requests, or from kvm_vgic_init() if not already run)
 (3) kvm_vgic_init() (triggered by first VM run)

We were doing initialization of some state to correspond with the
state of a freshly-reset GIC in kvm_vgic_init(); this is too late,
since it will overwrite changes made by userspace using the
register access APIs before the VM is run. Move this initialization
earlier, into the vgic_init_maps() phase.

This fixes a bug where QEMU could successfully restore a saved
VM state snapshot into a VM that had already been run, but could
not restore it "from cold" using the -loadvm command line option
(the symptoms being that the restored VM would run but interrupts
were ignored).

Finally rename vgic_init_maps to vgic_init and renamed kvm_vgic_init to
kvm_vgic_map_resources.

  [ This patch is originally written by Peter Maydell, but I have
    modified it somewhat heavily, renaming various bits and moving code
    around.  If something is broken, I am to be blamed. - Christoffer ]

Acked-by: Marc Zyngier <marc.zyngier@arm.com>
Reviewed-by: Eric Auger <eric.auger@linaro.org>
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Christoffer Dall <christoffer.dall@linaro.org>
2014-12-13 14:15:52 +01:00
Christoffer Dall
6b50f54064 arm/arm64: KVM: vgic: Fix error code in kvm_vgic_create()
If we detect another vCPU is running we just exit and return 0 as if we
succesfully created the VGIC, but the VGIC wouldn't actual be created.

This shouldn't break in-kernel behavior because the kernel will not
observe the failed the attempt to create the VGIC, but userspace could
be rightfully confused.

Cc: Andre Przywara <andre.przywara@arm.com>
Signed-off-by: Christoffer Dall <christoffer.dall@linaro.org>
Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2014-11-26 14:40:44 +01:00
Shannon Zhao
016ed39c54 arm/arm64: KVM: vgic: kick the specific vcpu instead of iterating through all
When call kvm_vgic_inject_irq to inject interrupt, we can known which
vcpu the interrupt for by the irq_num and the cpuid. So we should just
kick this vcpu to avoid iterating through all.

Reviewed-by: Christoffer Dall <christoffer.dall@linaro.org>
Signed-off-by: Shannon Zhao <zhaoshenglong@huawei.com>
Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
2014-11-26 10:19:37 +00:00
Christoffer Dall
b1e952b4e4 arm/arm64: vgic: Remove unreachable irq_clear_pending
When 'injecting' an edge-triggered interrupt with a falling edge we
shouldn't clear the pending state on the distributor.  In fact, we
don't, because the check in vgic_validate_injection would prevent us
from ever reaching this bit of code.

Remove the unreachable snippet.

Signed-off-by: Christoffer Dall <christoffer.dall@linaro.org>
Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
2014-11-25 13:57:28 +00:00
wanghaibin
7d39f9e32c KVM: ARM: VGIC: Optimize the vGIC vgic_update_irq_pending function.
When vgic_update_irq_pending with level-sensitive false, it is need to
deactivates an interrupt, and, it can go to out directly.
Here return a false value, because it will be not need to kick.

Signed-off-by: wanghaibin <wanghaibin.wang@huawei.com>
Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
2014-11-25 13:57:26 +00:00
Christoffer Dall
2df36a5dd6 arm/arm64: KVM: Fix BE accesses to GICv2 EISR and ELRSR regs
The EIRSR and ELRSR registers are 32-bit registers on GICv2, and we
store these as an array of two such registers on the vgic vcpu struct.
However, we access them as a single 64-bit value or as a bitmap pointer
in the generic vgic code, which breaks BE support.

Instead, store them as u64 values on the vgic structure and do the
word-swapping in the assembly code, which already handles the byte order
for BE systems.

Tested-by: Victor Kamensky <victor.kamensky@linaro.org>
Acked-by: Marc Zyngier <marc.zyngier@arm.com>
Signed-off-by: Christoffer Dall <christoffer.dall@linaro.org>
2014-10-16 10:57:41 +02:00