With v5.2-rc1, The ftrace functions_graph tracer locks up whenever it is
enabled on arm64.
Since commit 0ea415390c ("clocksource/arm_arch_timer: Use
arch_timer_read_counter to access stable counters") a function pointer
is consistently used to read the counter instead of potentially
referencing an inlinable function.
The graph tracers relies on accessing the timer counters to compute the
time spent in functions which causes the lockup when attempting to trace
these code paths.
Annotate the arm arch timer counter accessors as notrace.
Fixes: 0ea415390c ("clocksource/arm_arch_timer: Use
arch_timer_read_counter to access stable counters")
Signed-off-by: Julien Thierry <julien.thierry@arm.com>
Cc: Marc Zyngier <marc.zyngier@arm.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Daniel Lezcano <daniel.lezcano@linaro.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Steven Rostedt <rostedt@goodmis.org>
Acked-by: Marc Zyngier <marc.zyngier@arm.com>
Signed-off-by: Daniel Lezcano <daniel.lezcano@linaro.org>
Mostly just incremental improvements here:
- Introduce AT_HWCAP2 for advertising CPU features to userspace
- Expose SVE2 availability to userspace
- Support for "data cache clean to point of deep persistence" (DC PODP)
- Honour "mitigations=off" on the cmdline and advertise status via sysfs
- CPU timer erratum workaround (Neoverse-N1 #1188873)
- Introduce perf PMU driver for the SMMUv3 performance counters
- Add config option to disable the kuser helpers page for AArch32 tasks
- Futex modifications to ensure liveness under contention
- Rework debug exception handling to seperate kernel and user handlers
- Non-critical fixes and cleanup
-----BEGIN PGP SIGNATURE-----
iQEzBAABCgAdFiEEPxTL6PPUbjXGY88ct6xw3ITBYzQFAlzMFGgACgkQt6xw3ITB
YzTicAf/TX1h1+ecbx4WJAa4qeiOCPoNpG9efldQumqJhKL44MR5bkhuShna5mwE
ptm5qUXkZCxLTjzssZKnbdbgwa3t+emW8Of3D91IfI9akiZbMoDx5FGgcNbqjazb
RLrhOFHwgontA38yppZN+DrL+sXbvif/CVELdHahkEx6KepSGaS2lmPXRmz/W56v
4yIRy/zxc3Dhjgfm3wKh72nBwoZdLiIc4mchd5pthNlR9E2idrYkQegG1C+gA00r
o8uZRVOWgoh7H+QJE+xLUc8PaNCg8xqRRXOuZYg9GOz6hh7zSWhm+f1nRz9S2tIR
gIgsCHNqoO2I3E1uJpAQXDGtt2kFhA==
=ulpJ
-----END PGP SIGNATURE-----
Merge tag 'arm64-upstream' of git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux
Pull arm64 updates from Will Deacon:
"Mostly just incremental improvements here:
- Introduce AT_HWCAP2 for advertising CPU features to userspace
- Expose SVE2 availability to userspace
- Support for "data cache clean to point of deep persistence" (DC PODP)
- Honour "mitigations=off" on the cmdline and advertise status via
sysfs
- CPU timer erratum workaround (Neoverse-N1 #1188873)
- Introduce perf PMU driver for the SMMUv3 performance counters
- Add config option to disable the kuser helpers page for AArch32 tasks
- Futex modifications to ensure liveness under contention
- Rework debug exception handling to seperate kernel and user
handlers
- Non-critical fixes and cleanup"
* tag 'arm64-upstream' of git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux: (92 commits)
Documentation: Add ARM64 to kernel-parameters.rst
arm64/speculation: Support 'mitigations=' cmdline option
arm64: ssbs: Don't treat CPUs with SSBS as unaffected by SSB
arm64: enable generic CPU vulnerabilites support
arm64: add sysfs vulnerability show for speculative store bypass
arm64: Fix size of __early_cpu_boot_status
clocksource/arm_arch_timer: Use arch_timer_read_counter to access stable counters
clocksource/arm_arch_timer: Remove use of workaround static key
clocksource/arm_arch_timer: Drop use of static key in arch_timer_reg_read_stable
clocksource/arm_arch_timer: Direcly assign set_next_event workaround
arm64: Use arch_timer_read_counter instead of arch_counter_get_cntvct
watchdog/sbsa: Use arch_timer_read_counter instead of arch_counter_get_cntvct
ARM: vdso: Remove dependency with the arch_timer driver internals
arm64: Apply ARM64_ERRATUM_1188873 to Neoverse-N1
arm64: Add part number for Neoverse N1
arm64: Make ARM64_ERRATUM_1188873 depend on COMPAT
arm64: Restrict ARM64_ERRATUM_1188873 mitigation to AArch32
arm64: mm: Remove pte_unmap_nested()
arm64: Fix compiler warning from pte_unmap() with -Wunused-but-set-variable
arm64: compat: Reduce address limit for 64K pages
...
Instead of always going via arch_counter_get_cntvct_stable to access the
counter workaround, let's have arch_timer_read_counter point to the
right method.
For that, we need to track whether any CPU in the system has a
workaround for the counter. This is done by having an atomic variable
tracking this.
Acked-by: Mark Rutland <mark.rutland@arm.com>
Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
Signed-off-by: Will Deacon <will.deacon@arm.com>
The use of a static key in a hotplug path has proved to be a real
nightmare, and makes it impossible to have scream-free lockdep
kernel.
Let's remove the static key altogether, and focus on something saner.
Acked-by: Mark Rutland <mark.rutland@arm.com>
Acked-by: Daniel Lezcano <daniel.lezcano@linaro.org>
Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
Signed-off-by: Will Deacon <will.deacon@arm.com>
When a given timer is affected by an erratum and requires an
alternative implementation of set_next_event, we do a rather
complicated dance to detect and call the workaround on each
set_next_event call.
This is clearly idiotic, as we can perfectly detect whether
this CPU requires a workaround while setting up the clock event
device.
This only requires the CPU-specific detection to be done a bit
earlier, and we can then safely override the set_next_event pointer
if we have a workaround associated to that CPU.
Acked-by: Mark Rutland <mark.rutland@arm.com>
Acked-by; Daniel Lezcano <daniel.lezcano@linaro.org>
Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
Signed-off-by: Will Deacon <will.deacon@arm.com>
We currently deal with ARM64_ERRATUM_1188873 by always trapping EL0
accesses for both instruction sets. Although nothing wrong comes out
of that, people trying to squeeze the last drop of performance from
buggy HW find this over the top. Oh well.
Let's change the mitigation by flipping the counter enable bit
on return to userspace. Non-broken HW gets an extra branch on
the fast path, which is hopefully not the end of the world.
The arch timer workaround is also removed.
Acked-by: Daniel Lezcano <daniel.lezcano@linaro.org>
Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
Signed-off-by: Will Deacon <will.deacon@arm.com>
As we will exhaust the first 32 bits of AT_HWCAP let's start
exposing AT_HWCAP2 to userspace to give us up to 64 caps.
Whilst it's possible to use the remaining 32 bits of AT_HWCAP, we
prefer to expand into AT_HWCAP2 in order to provide a consistent
view to userspace between ILP32 and LP64. However internal to the
kernel we prefer to continue to use the full space of elf_hwcap.
To reduce complexity and allow for future expansion, we now
represent hwcaps in the kernel as ordinals and use a
KERNEL_HWCAP_ prefix. This allows us to support automatic feature
based module loading for all our hwcaps.
We introduce cpu_set_feature to set hwcaps which complements the
existing cpu_have_feature helper. These helpers allow us to clean
up existing direct uses of elf_hwcap and reduce any future effort
required to move beyond 64 caps.
For convenience we also introduce cpu_{have,set}_named_feature which
makes use of the cpu_feature macro to allow providing a hwcap name
without a {KERNEL_}HWCAP_ prefix.
Signed-off-by: Andrew Murray <andrew.murray@arm.com>
[will: use const_ilog2() and tweak documentation]
Signed-off-by: Will Deacon <will.deacon@arm.com>
After this commit ded24019b6b6f(clocksource: arm_arch_timer: clean up
printk usage), the previous macro is redundant, so delete it.
And move the new macro to the previous position.
Signed-off-by: Yangtao Li <tiny.windzz@gmail.com>
Signed-off-by: Daniel Lezcano <daniel.lezcano@linaro.org>
for 32-bit guests
s390: interrupt cleanup, introduction of the Guest Information Block,
preparation for processor subfunctions in cpu models
PPC: bug fixes and improvements, especially related to machine checks
and protection keys
x86: many, many cleanups, including removing a bunch of MMU code for
unnecessary optimizations; plus AVIC fixes.
Generic: memcg accounting
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v2.0.22 (GNU/Linux)
iQEcBAABAgAGBQJci+7XAAoJEL/70l94x66DUMkIAKvEefhceySHYiTpfefjLjIC
16RewgHa+9CO4Oo5iXiWd90fKxtXLXmxDQOS4VGzN0rxvLGRw/fyXIxL1MDOkaAO
l8SLSNuewY4XBUgISL3PMz123r18DAGOuy9mEcYU/IMesYD2F+wy5lJ17HIGq6X2
RpoF1p3qO1jfkPTKOob6Ixd4H5beJNPKpdth7LY3PJaVhDxgouj32fxnLnATVSnN
gENQ10fnt8BCjshRYW6Z2/9bF15JCkUFR1xdBW2/xh1oj+kvPqqqk2bEN1eVQzUy
2hT/XkwtpthqjSbX8NNavWRSFnOnbMLTRKQyIXmFVsM5VoSrwtiGsCFzBgcT++I=
=XIzU
-----END PGP SIGNATURE-----
Merge tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm
Pull KVM updates from Paolo Bonzini:
"ARM:
- some cleanups
- direct physical timer assignment
- cache sanitization for 32-bit guests
s390:
- interrupt cleanup
- introduction of the Guest Information Block
- preparation for processor subfunctions in cpu models
PPC:
- bug fixes and improvements, especially related to machine checks
and protection keys
x86:
- many, many cleanups, including removing a bunch of MMU code for
unnecessary optimizations
- AVIC fixes
Generic:
- memcg accounting"
* tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm: (147 commits)
kvm: vmx: fix formatting of a comment
KVM: doc: Document the life cycle of a VM and its resources
MAINTAINERS: Add KVM selftests to existing KVM entry
Revert "KVM/MMU: Flush tlb directly in the kvm_zap_gfn_range()"
KVM: PPC: Book3S: Add count cache flush parameters to kvmppc_get_cpu_char()
KVM: PPC: Fix compilation when KVM is not enabled
KVM: Minor cleanups for kvm_main.c
KVM: s390: add debug logging for cpu model subfunctions
KVM: s390: implement subfunction processor calls
arm64: KVM: Fix architecturally invalid reset value for FPEXC32_EL2
KVM: arm/arm64: Remove unused timer variable
KVM: PPC: Book3S: Improve KVM reference counting
KVM: PPC: Book3S HV: Fix build failure without IOMMU support
Revert "KVM: Eliminate extra function calls in kvm_get_dirty_log_protect()"
x86: kvmguest: use TSC clocksource if invariant TSC is exposed
KVM: Never start grow vCPU halt_poll_ns from value below halt_poll_ns_grow_start
KVM: Expose the initial start value in grow_halt_poll_ns() as a module parameter
KVM: grow_halt_poll_ns() should never shrink vCPU halt_poll_ns
KVM: x86/mmu: Consolidate kvm_mmu_zap_all() and kvm_mmu_zap_mmio_sptes()
KVM: x86/mmu: WARN if zapping a MMIO spte results in zapping children
...
The Allwinner A64 SoC is known[1] to have an unstable architectural
timer, which manifests itself most obviously in the time jumping forward
a multiple of 95 years[2][3]. This coincides with 2^56 cycles at a
timer frequency of 24 MHz, implying that the time went slightly backward
(and this was interpreted by the kernel as it jumping forward and
wrapping around past the epoch).
Investigation revealed instability in the low bits of CNTVCT at the
point a high bit rolls over. This leads to power-of-two cycle forward
and backward jumps. (Testing shows that forward jumps are about twice as
likely as backward jumps.) Since the counter value returns to normal
after an indeterminate read, each "jump" really consists of both a
forward and backward jump from the software perspective.
Unless the kernel is trapping CNTVCT reads, a userspace program is able
to read the register in a loop faster than it changes. A test program
running on all 4 CPU cores that reported jumps larger than 100 ms was
run for 13.6 hours and reported the following:
Count | Event
-------+---------------------------
9940 | jumped backward 699ms
268 | jumped backward 1398ms
1 | jumped backward 2097ms
16020 | jumped forward 175ms
6443 | jumped forward 699ms
2976 | jumped forward 1398ms
9 | jumped forward 356516ms
9 | jumped forward 357215ms
4 | jumped forward 714430ms
1 | jumped forward 3578440ms
This works out to a jump larger than 100 ms about every 5.5 seconds on
each CPU core.
The largest jump (almost an hour!) was the following sequence of reads:
0x0000007fffffffff → 0x00000093feffffff → 0x0000008000000000
Note that the middle bits don't necessarily all read as all zeroes or
all ones during the anomalous behavior; however the low 10 bits checked
by the function in this patch have never been observed with any other
value.
Also note that smaller jumps are much more common, with backward jumps
of 2048 (2^11) cycles observed over 400 times per second on each core.
(Of course, this is partially explained by lower bits rolling over more
frequently.) Any one of these could have caused the 95 year time skip.
Similar anomalies were observed while reading CNTPCT (after patching the
kernel to allow reads from userspace). However, the CNTPCT jumps are
much less frequent, and only small jumps were observed. The same program
as before (except now reading CNTPCT) observed after 72 hours:
Count | Event
-------+---------------------------
17 | jumped backward 699ms
52 | jumped forward 175ms
2831 | jumped forward 699ms
5 | jumped forward 1398ms
Further investigation showed that the instability in CNTPCT/CNTVCT also
affected the respective timer's TVAL register. The following values were
observed immediately after writing CNVT_TVAL to 0x10000000:
CNTVCT | CNTV_TVAL | CNTV_CVAL | CNTV_TVAL Error
--------------------+------------+--------------------+-----------------
0x000000d4a2d8bfff | 0x10003fff | 0x000000d4b2d8bfff | +0x00004000
0x000000d4a2d94000 | 0x0fffffff | 0x000000d4b2d97fff | -0x00004000
0x000000d4a2d97fff | 0x10003fff | 0x000000d4b2d97fff | +0x00004000
0x000000d4a2d9c000 | 0x0fffffff | 0x000000d4b2d9ffff | -0x00004000
The pattern of errors in CNTV_TVAL seemed to depend on exactly which
value was written to it. For example, after writing 0x10101010:
CNTVCT | CNTV_TVAL | CNTV_CVAL | CNTV_TVAL Error
--------------------+------------+--------------------+-----------------
0x000001ac3effffff | 0x1110100f | 0x000001ac4f10100f | +0x1000000
0x000001ac40000000 | 0x1010100f | 0x000001ac5110100f | -0x1000000
0x000001ac58ffffff | 0x1110100f | 0x000001ac6910100f | +0x1000000
0x000001ac66000000 | 0x1010100f | 0x000001ac7710100f | -0x1000000
0x000001ac6affffff | 0x1110100f | 0x000001ac7b10100f | +0x1000000
0x000001ac6e000000 | 0x1010100f | 0x000001ac7f10100f | -0x1000000
I was also twice able to reproduce the issue covered by Allwinner's
workaround[4], that writing to TVAL sometimes fails, and both CVAL and
TVAL are left with entirely bogus values. One was the following values:
CNTVCT | CNTV_TVAL | CNTV_CVAL
--------------------+------------+--------------------------------------
0x000000d4a2d6014c | 0x8fbd5721 | 0x000000d132935fff (615s in the past)
Reviewed-by: Marc Zyngier <marc.zyngier@arm.com>
========================================================================
Because the CPU can read the CNTPCT/CNTVCT registers faster than they
change, performing two reads of the register and comparing the high bits
(like other workarounds) is not a workable solution. And because the
timer can jump both forward and backward, no pair of reads can
distinguish a good value from a bad one. The only way to guarantee a
good value from consecutive reads would be to read _three_ times, and
take the middle value only if the three values are 1) each unique and
2) increasing. This takes at minimum 3 counter cycles (125 ns), or more
if an anomaly is detected.
However, since there is a distinct pattern to the bad values, we can
optimize the common case (1022/1024 of the time) to a single read by
simply ignoring values that match the error pattern. This still takes no
more than 3 cycles in the worst case, and requires much less code. As an
additional safety check, we still limit the loop iteration to the number
of max-frequency (1.2 GHz) CPU cycles in three 24 MHz counter periods.
For the TVAL registers, the simple solution is to not use them. Instead,
read or write the CVAL and calculate the TVAL value in software.
Although the manufacturer is aware of at least part of the erratum[4],
there is no official name for it. For now, use the kernel-internal name
"UNKNOWN1".
[1]: https://github.com/armbian/build/commit/a08cd6fe7ae9
[2]: https://forum.armbian.com/topic/3458-a64-datetime-clock-issue/
[3]: https://irclog.whitequark.org/linux-sunxi/2018-01-26
[4]: https://github.com/Allwinner-Homlet/H6-BSP4.9-linux/blob/master/drivers/clocksource/arm_arch_timer.c#L272
Acked-by: Maxime Ripard <maxime.ripard@bootlin.com>
Tested-by: Andre Przywara <andre.przywara@arm.com>
Signed-off-by: Samuel Holland <samuel@sholland.org>
Cc: stable@vger.kernel.org
Signed-off-by: Daniel Lezcano <daniel.lezcano@linaro.org>
A host running in VHE mode gets the EL2 physical timer as its time
source (accessed using the EL1 sysreg accessors, which get re-directed
to the EL2 sysregs by VHE).
The EL1 physical timer remains unused by the host kernel, allowing us to
pass that on directly to a KVM guest and saves us from emulating this
timer for the guest on VHE systems.
Store the EL1 Physical Timer's IRQ number in
struct arch_timer_kvm_info on VHE systems to allow KVM to use it.
Acked-by: Daniel Lezcano <daniel.lezcano@linaro.org>
Signed-off-by: Andre Przywara <andre.przywara@arm.com>
Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
Signed-off-by: Christoffer Dall <christoffer.dall@arm.com>
When running on Cortex-A76, a timer access from an AArch32 EL0
task may end up with a corrupted value or register. The workaround for
this is to trap these accesses at EL1/EL2 and execute them there.
This only affects versions r0p0, r1p0 and r2p0 of the CPU.
Acked-by: Mark Rutland <mark.rutland@arm.com>
Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
Currently, arch_mem_timer cpumask is set to cpu_all_mask which should be
fine. However, cpu_possible_mask is more accurate and if there are other
clockevent source in the system which are set to cpu_possible_mask, then
having cpu_all_mask may result in issue.
E.g. on a platform with arm,sp804 timer with rating 300 and
cpu_possible_mask and this arch_mem_timer timer with rating 400 and
cpu_all_mask, tick_check_preferred may choose both preferred as the
cpumasks are not equal though they must be.
This issue was root caused incorrectly initially and a fix was merged as
commit 1332a90558 ("tick: Prefer a lower rating device only if it's CPU
local device").
Signed-off-by: Sudeep Holla <sudeep.holla@arm.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Tested-by: Kevin Hilman <khilman@baylibre.com>
Tested-by: Martin Blumenstingl <martin.blumenstingl@googlemail.com>
Cc: linux-arm-kernel@lists.infradead.org
Cc: Marc Zyngier <marc.zyngier@arm.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Link: https://lkml.kernel.org/r/1531151136-18297-2-git-send-email-sudeep.holla@arm.com
Common:
- Python 3 support in kvm_stat
- Accounting of slabs to kmemcg
ARM:
- Optimized arch timer handling for KVM/ARM
- Improvements to the VGIC ITS code and introduction of an ITS reset
ioctl
- Unification of the 32-bit fault injection logic
- More exact external abort matching logic
PPC:
- Support for running hashed page table (HPT) MMU mode on a host that
is using the radix MMU mode; single threaded mode on POWER 9 is
added as a pre-requisite
- Resolution of merge conflicts with the last second 4.14 HPT fixes
- Fixes and cleanups
s390:
- Some initial preparation patches for exitless interrupts and crypto
- New capability for AIS migration
- Fixes
x86:
- Improved emulation of LAPIC timer mode changes, MCi_STATUS MSRs, and
after-reset state
- Refined dependencies for VMX features
- Fixes for nested SMI injection
- A lot of cleanups
-----BEGIN PGP SIGNATURE-----
iQEcBAABCAAGBQJaDayXAAoJEED/6hsPKofo/3UH/3HvlcHt+ADTkCU1/iiKAs+i
0zngIOXIxgHDnV0ww6bV+Znww0BzTYgKCAXX76z603jdpDwG/pzQQcbLDF5ZoJnD
sQtF10gZinWaRsHlfbLqjrHGL2pGDHO1UKBKLJ0bAIyORPZBxs7i+VmrY/blnr9c
0wsybJ8RbvwAxjsDL5jeX/z4NehPupmKUc4Lf0eZdSHwVOf9sjn+MP6jJ0r2JcIb
D+zddPBiLStzN97t4gZpQsrlj3LKrDS+6hY+1TjSvlh+yHKFVFh58VhLm4DuDeb5
bYOAlWJ/gAWEzfvr5Ld+Nd7SqWWn/14logPkQ4gcU4BI/neAOzk4c6hJfCHl1nk=
=593n
-----END PGP SIGNATURE-----
Merge tag 'kvm-4.15-1' of git://git.kernel.org/pub/scm/virt/kvm/kvm
Pull KVM updates from Radim Krčmář:
"First batch of KVM changes for 4.15
Common:
- Python 3 support in kvm_stat
- Accounting of slabs to kmemcg
ARM:
- Optimized arch timer handling for KVM/ARM
- Improvements to the VGIC ITS code and introduction of an ITS reset
ioctl
- Unification of the 32-bit fault injection logic
- More exact external abort matching logic
PPC:
- Support for running hashed page table (HPT) MMU mode on a host that
is using the radix MMU mode; single threaded mode on POWER 9 is
added as a pre-requisite
- Resolution of merge conflicts with the last second 4.14 HPT fixes
- Fixes and cleanups
s390:
- Some initial preparation patches for exitless interrupts and crypto
- New capability for AIS migration
- Fixes
x86:
- Improved emulation of LAPIC timer mode changes, MCi_STATUS MSRs,
and after-reset state
- Refined dependencies for VMX features
- Fixes for nested SMI injection
- A lot of cleanups"
* tag 'kvm-4.15-1' of git://git.kernel.org/pub/scm/virt/kvm/kvm: (89 commits)
KVM: s390: provide a capability for AIS state migration
KVM: s390: clear_io_irq() requests are not expected for adapter interrupts
KVM: s390: abstract conversion between isc and enum irq_types
KVM: s390: vsie: use common code functions for pinning
KVM: s390: SIE considerations for AP Queue virtualization
KVM: s390: document memory ordering for kvm_s390_vcpu_wakeup
KVM: PPC: Book3S HV: Cosmetic post-merge cleanups
KVM: arm/arm64: fix the incompatible matching for external abort
KVM: arm/arm64: Unify 32bit fault injection
KVM: arm/arm64: vgic-its: Implement KVM_DEV_ARM_ITS_CTRL_RESET
KVM: arm/arm64: Document KVM_DEV_ARM_ITS_CTRL_RESET
KVM: arm/arm64: vgic-its: Free caches when GITS_BASER Valid bit is cleared
KVM: arm/arm64: vgic-its: New helper functions to free the caches
KVM: arm/arm64: vgic-its: Remove kvm_its_unmap_device
arm/arm64: KVM: Load the timer state when enabling the timer
KVM: arm/arm64: Rework kvm_timer_should_fire
KVM: arm/arm64: Get rid of kvm_timer_flush_hwstate
KVM: arm/arm64: Avoid phys timer emulation in vcpu entry/exit
KVM: arm/arm64: Move phys_timer_emulate function
KVM: arm/arm64: Use kvm_arm_timer_set/get_reg for guest register traps
...
Plenty of acronym soup here:
- Initial support for the Scalable Vector Extension (SVE)
- Improved handling for SError interrupts (required to handle RAS events)
- Enable GCC support for 128-bit integer types
- Remove kernel text addresses from backtraces and register dumps
- Use of WFE to implement long delay()s
- ACPI IORT updates from Lorenzo Pieralisi
- Perf PMU driver for the Statistical Profiling Extension (SPE)
- Perf PMU driver for Hisilicon's system PMUs
- Misc cleanups and non-critical fixes
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1
iQEcBAABCgAGBQJaCcLqAAoJELescNyEwWM0JREH/2FbmD/khGzEtP8LW+o9D8iV
TBM02uWQxS1bbO1pV2vb+512YQO+iWfeQwJH9Jv2FZcrMvFv7uGRnYgAnJuXNGrl
W+LL6OhN22A24LSawC437RU3Xe7GqrtONIY/yLeJBPablfcDGzPK1eHRA0pUzcyX
VlyDruSHWX44VGBPV6JRd3x0vxpV8syeKOjbRvopRfn3Nwkbd76V3YSfEgwoTG5W
ET1sOnXLmHHdeifn/l1Am5FX1FYstpcd7usUTJ4Oto8y7e09tw3bGJCD0aMJ3vow
v1pCUWohEw7fHqoPc9rTrc1QEnkdML4vjJvMPUzwyTfPrN+7uEuMIEeJierW+qE=
=0qrg
-----END PGP SIGNATURE-----
Merge tag 'arm64-upstream' of git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux
Pull arm64 updates from Will Deacon:
"The big highlight is support for the Scalable Vector Extension (SVE)
which required extensive ABI work to ensure we don't break existing
applications by blowing away their signal stack with the rather large
new vector context (<= 2 kbit per vector register). There's further
work to be done optimising things like exception return, but the ABI
is solid now.
Much of the line count comes from some new PMU drivers we have, but
they're pretty self-contained and I suspect we'll have more of them in
future.
Plenty of acronym soup here:
- initial support for the Scalable Vector Extension (SVE)
- improved handling for SError interrupts (required to handle RAS
events)
- enable GCC support for 128-bit integer types
- remove kernel text addresses from backtraces and register dumps
- use of WFE to implement long delay()s
- ACPI IORT updates from Lorenzo Pieralisi
- perf PMU driver for the Statistical Profiling Extension (SPE)
- perf PMU driver for Hisilicon's system PMUs
- misc cleanups and non-critical fixes"
* tag 'arm64-upstream' of git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux: (97 commits)
arm64: Make ARMV8_DEPRECATED depend on SYSCTL
arm64: Implement __lshrti3 library function
arm64: support __int128 on gcc 5+
arm64/sve: Add documentation
arm64/sve: Detect SVE and activate runtime support
arm64/sve: KVM: Hide SVE from CPU features exposed to guests
arm64/sve: KVM: Treat guest SVE use as undefined instruction execution
arm64/sve: KVM: Prevent guests from using SVE
arm64/sve: Add sysctl to set the default vector length for new processes
arm64/sve: Add prctl controls for userspace vector length management
arm64/sve: ptrace and ELF coredump support
arm64/sve: Preserve SVE registers around EFI runtime service calls
arm64/sve: Preserve SVE registers around kernel-mode NEON use
arm64/sve: Probe SVE capabilities and usable vector lengths
arm64: cpufeature: Move sys_caps_initialised declarations
arm64/sve: Backend logic for setting the vector length
arm64/sve: Signal handling support
arm64/sve: Support vector length resetting for new processes
arm64/sve: Core task context handling
arm64/sve: Low-level CPU setup
...
Using the physical counter allows KVM to retain the offset between the
virtual and physical counter as long as it is actively running a VCPU.
As soon as a VCPU is released, another thread is scheduled or we start
running userspace applications, we reset the offset to 0, so that
userspace accessing the virtual timer can still read the virtual counter
and get the same view of time as the kernel.
This opens up potential improvements for KVM performance, but we have to
make a few adjustments to preserve system consistency.
Currently get_cycles() is hardwired to arch_counter_get_cntvct() on
arm64, but as we move to using the physical timer for the in-kernel
time-keeping on systems that boot in EL2, we should use the same counter
for get_cycles() as for other in-kernel timekeeping operations.
Similarly, implementations of arch_timer_set_next_event_phys() is
modified to use the counter specific to the timer being programmed.
VHE kernels or kernels continuing to use the virtual timer are
unaffected.
Cc: Will Deacon <will.deacon@arm.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Acked-by: Catalin Marinas <catalin.marinas@arm.com>
Acked-by: Marc Zyngier <marc.zyngier@arm.com>
Signed-off-by: Christoffer Dall <cdall@linaro.org>
As we are about to use the physical counter on arm64 systems that have
KVM support, implement arch_counter_get_cntpct() and the associated
errata workaround functionality for stable timer reads.
Cc: Will Deacon <will.deacon@arm.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Acked-by: Catalin Marinas <catalin.marinas@arm.com>
Acked-by: Marc Zyngier <marc.zyngier@arm.com>
Signed-off-by: Christoffer Dall <christoffer.dall@linaro.org>
Our ctags mangling script can't handle newlines inside of a
DEFINE_PER_CPU(), leading to an annoying message whenever tags are
built:
ctags: Warning: drivers/clocksource/arm_arch_timer.c:302: null expansion of name pattern "\1"
This was dealt with elsewhere in commit:
25528213fe ("tags: Fix DEFINE_PER_CPU expansions")
... by ensuring each DEFINE_PER_CPU() was contained on a single line,
even where this would violate the usual code style (checkpatch warnings
and all).
Let's do the same for the arch timer driver, and get rid of the
distraction.
Signed-off-by: Mark Rutland <mark.rutland@arm.com>
Cc: Daniel Lezcano <daniel.lezcano@linaro.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: linux-arm-kernel@lists.infradead.org
Cc: linux-kernel@vger.kernel.org
Signed-off-by: Daniel Lezcano <daniel.lezcano@linaro.org>
The ACPI GTDT code validates the CNTFRQ field of each MMIO timer
frame against the CNTFRQ system register of the current CPU, to
ensure that they are equal, which is mandated by the architecture.
However, reading the CNTFRQ field of a frame is not possible until
the RFRQ bit in the frame's CNTACRn register is set, and doing so
before that willl produce the following error:
arch_timer: [Firmware Bug]: CNTFRQ mismatch: frame @ 0x00000000e0be0000: (0x00000000), CPU: (0x0ee6b280)
arch_timer: Disabling MMIO timers due to CNTFRQ mismatch
arch_timer: Failed to initialize memory-mapped timer.
The reason is that the CNTFRQ field is RES0 if access is not enabled.
So move the validation of CNTFRQ into the loop that iterates over the
timers to find the best frame, but defer it until after we have selected
the best frame, which should also have enabled the RFRQ bit.
Signed-off-by: Ard Biesheuvel <ard.biesheuvel@linaro.org>
Signed-off-by: Mark Rutland <mark.rutland@arm.com>
Signed-off-by: Daniel Lezcano <daniel.lezcano@linaro.org>
The arch timer configuration for a CPU might get reset after suspending
said CPU.
In order to reliably use the event stream in the kernel (e.g. for delays),
we keep track of the state where we can safely consider the event stream as
properly configured. After writing to cntkctl, we issue an ISB to ensure
that subsequent delay loops can rely on the event stream being enabled.
Signed-off-by: Julien Thierry <julien.thierry@arm.com>
Acked-by: Mark Rutland <mark.rutland@arm.com>
Cc: Marc Zyngier <marc.zyngier@arm.com>
Cc: Russell King <linux@armlinux.org.uk>
Cc: Catalin Marinas <catalin.marinas@arm.com>
Cc: Will Deacon <will.deacon@arm.com>
Signed-off-by: Will Deacon <will.deacon@arm.com>
The loop to find the best memory frame in arch_timer_mem_acpi_init()
initializes the loop counter with itself ('i = i'), which is suspicious
in the first place and pointed out by clang. The loop condition is
'i < timer_count' and a prior for loop exits when 'i' reaches
'timer_count', therefore the second loop is never executed.
Initialize the loop counter with 0 to iterate over all timers, which
supposedly was the intention before the typo monster attacked.
Fixes: c2743a3676 ("clocksource: arm_arch_timer: add GTDT support for memory-mapped timer")
Signed-off-by: Matthias Kaehlcke <mka@chromium.org>
Reported-by: Ard Biesheuvel <ard.biesheuvel@linaro.org>
Acked-by: Mark Rutland <mark.rutland@arm.com>
Signed-off-by: Daniel Lezcano <daniel.lezcano@linaro.org>
Use the new static_branch_enable_cpuslocked() function to switch
the workaround static key on the CPU hotplug path.
Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Cc: Leo Yan <leo.yan@linaro.org>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: linux-arm-kernel@lists.infradead.org
Link: http://lkml.kernel.org/r/20170801080257.5056-5-marc.zyngier@arm.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
The macro name is now renamed to 'TIMER_ACPI_DECLARE' for consistency
with the CLOCKSOURCE_OF_DECLARE => TIMER_OF_DECLARE change.
Signed-off-by: Daniel Lezcano <daniel.lezcano@linaro.org>
Reviewed-by: Linus Walleij <linus.walleij@linaro.org>
The CLOCKSOURCE_OF_DECLARE macro is used widely for the timers to declare the
clocksource at early stage. However, this macro is also used to initialize
the clockevent if any, or the clockevent only.
It was originally suggested to declare another macro to initialize a
clockevent, so in order to separate the two entities even they belong to the
same IP. This was not accepted because of the impact on the DT where splitting
a clocksource/clockevent definition does not make sense as it is a Linux
concept not a hardware description.
On the other side, the clocksource has not interrupt declared while the
clockevent has, so it is easy from the driver to know if the description is
for a clockevent or a clocksource, IOW it could be implemented at the driver
level.
So instead of dealing with a named clocksource macro, let's use a more generic
one: TIMER_OF_DECLARE.
The patch has not functional changes.
Signed-off-by: Daniel Lezcano <daniel.lezcano@linaro.org>
Acked-by: Heiko Stuebner <heiko@sntech.de>
Acked-by: Neil Armstrong <narmstrong@baylibre.com>
Acked-by: Arnd Bergmann <arnd@arndb.de>
Acked-by: Matthias Brugger <matthias.bgg@gmail.com>
Reviewed-by: Linus Walleij <linus.walleij@linaro.org>
Fix boot warning 'Trying to vfree() nonexistent vm area'
from arch_timer_mem_of_init().
Refactored code attempts to read and iounmap using address frame
instead of address ioremap(frame->cntbase).
Fixes: c389d701df ("clocksource: arm_arch_timer: split MMIO timer probing.")
Signed-off-by: Frank Rowand <frank.rowand@sony.com>
Reviewed-by: Fu Wei <fu.wei@linaro.org>
Acked-by: Marc Zyngier <marc.zyngier@arm.com>
Signed-off-by: Daniel Lezcano <daniel.lezcano@linaro.org>
arch_timer_mem_find_best_frame() looks through ARCH_TIMER_MEM_MAX_FRAMES
frames even after finding matches to ensure the best frame is chosen,
which means the variable frame will point to the last valid frame but
not necessarily the best frame.
On Juno, we get the following error as the wrong frame is returned as the
best frame from arch_timer_mem_find_best_frame():
arch_timer: Unable to map frame @ 0x0000000000000000
arch_timer: Frame missing phys irq.
Failed to initialize '/timer@2a810000': -22
Fix the issue by correctly returning the best frame from
arch_timer_mem_find_best_frame().
Fixes: c389d701df ("clocksource: arm_arch_timer: split MMIO timer probing.")
Signed-off-by: Sudeep Holla <sudeep.holla@arm.com>
Acked-by: Mark Rutland <mark.rutland@arm.com>
Cc: Marc Zyngier <marc.zyngier@arm.com>
Cc: Daniel Lezcano <daniel.lezcano@linaro.org>
Cc: linux-arm-kernel@lists.infradead.org
Link: http://lkml.kernel.org/r/1494246747-17267-1-git-send-email-sudeep.holla@arm.com
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
In some rare randconfig builds, we end up with two functions being entirely
unused:
drivers/clocksource/arm_arch_timer.c:342:12: error: 'erratum_set_next_event_tval_phys' defined but not used [-Werror=unused-function]
static int erratum_set_next_event_tval_phys(unsigned long evt,
drivers/clocksource/arm_arch_timer.c:335:12: error: 'erratum_set_next_event_tval_virt' defined but not used [-Werror=unused-function]
static int erratum_set_next_event_tval_virt(unsigned long evt,
We could add an #ifdef around them, but we would already have to check for
several symbols there and there is a chance this would get more complicated
over time, so marking them as __maybe_unused is the simplest way to avoid the
harmless warnings.
Fixes: 01d3e3ff26 ("arm64: arch_timer: Rework the set_next_event workarounds")
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Acked-by: Marc Zyngier <marc.zyngier@arm.com>
Acked-by: Daniel Lezcano <daniel.lezcano@linaro.org>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Will Deacon <will.deacon@arm.com>
Cc: linux-arm-kernel@lists.infradead.org
Cc: Christoffer Dall <christoffer.dall@linaro.org>
Link: http://lkml.kernel.org/r/20170419173737.3846098-1-arnd@arndb.de
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
The patch add memory-mapped timer register support by using the
information provided by the new GTDT driver of ACPI.
Signed-off-by: Fu Wei <fu.wei@linaro.org>
Reviewed-by: Hanjun Guo <hanjun.guo@linaro.org>
[Mark: verify CNTFRQ, only register the first frame]
Signed-off-by: Mark Rutland <mark.rutland@arm.com>
The patch update arm_arch_timer driver to use the function
provided by the new GTDT driver of ACPI.
By this way, arm_arch_timer.c can be simplified, and separate
all the ACPI GTDT knowledge from this timer driver.
Signed-off-by: Fu Wei <fu.wei@linaro.org>
Signed-off-by: Hanjun Guo <hanjun.guo@linaro.org>
Tested-by: Xiongfeng Wang <wangxiongfeng2@huawei.com>
Reviewed-by: Hanjun Guo <hanjun.guo@linaro.org>
Tested-by: Hanjun Guo <hanjun.guo@linaro.org>
Signed-off-by: Mark Rutland <mark.rutland@arm.com>
Currently the code to probe MMIO architected timers mixes DT parsing with
actual poking of hardware. This makes the code harder than necessary to
understand, and makes it difficult to add support for probing via ACPI.
This patch splits the DT parsing from HW probing. The DT parsing now
lives in arch_timer_mem_of_init(), which fills in an arch_timer_mem
structure that it hands to probing functions that can be reused for ACPI
support.
Since the rate detection logic will be slight different when using ACPI,
the probing is performed as a number of steps. This results in more code
for the moment, and some arguably redundant work, but simplifies matters
considerably when ACPI support is added.
Signed-off-by: Fu Wei <fu.wei@linaro.org>
[Mark: refactor the probing split]
Signed-off-by: Mark Rutland <mark.rutland@arm.com>
To cleanly split code paths specific to ACPI or DT at a higher level,
this patch removes arch_timer_init(), folding the relevant
parts of its logic into existing callers.
This pathes the way for further rework, and saves a few lines.
Signed-off-by: Fu Wei <fu.wei@linaro.org>
Reviewed-by: Hanjun Guo <hanjun.guo@linaro.org>
[Mark: reword commit message]
Signed-off-by: Mark Rutland <mark.rutland@arm.com>
When booting with DT, it's possible for timer nodes to be probed in any
order. Some common initialisation needs to occur after all nodes have
been probed, and arch_timer_common_init() has code to detect when this
has happened.
This logic is DT-specific, and it would be best to factor it out of the
common code that will be shared with ACPI.
This patch folds this into the existing arch_timer_needs_probing(),
which is renamed to arch_timer_needs_of_probing(), and no longer takes
any arguments. This is only called when using DT, and not when using
ACPI, which will have a deterministic probe order.
Signed-off-by: Fu Wei <fu.wei@linaro.org>
Reviewed-by: Hanjun Guo <hanjun.guo@linaro.org>
[Mark: reword commit message]
Signed-off-by: Mark Rutland <mark.rutland@arm.com>
For historical reasons, rate detection when probing via DT is somewhat
convoluted. We tried to package this up in arch_timer_detect_rate(), but
with the addition of ACPI worse, and gets in the way of stringent rate
checking when ACPI is used.
This patch makes arch_timer_detect_rate() specific to DT, ripping out
ACPI logic. In preparation for rework of the MMIO timer probing, the
reading of the relevant CNTFRQ register is factored out to callers. The
function is then renamed to arch_timer_of_configure_rate(), which better
represents its new place in the world.
Comments are added in the DT and ACPI probe paths to explain this.
Signed-off-by: Fu Wei <fu.wei@linaro.org>
[Mark: reword commit message]
Signed-off-by: Mark Rutland <mark.rutland@arm.com>
Currently, the arch timer driver uses ARCH_TIMER_PHYS_SECURE_PPI to mean
the driver will use the secure PPI *and* potentially also use the
non-secure PPI. This is somewhat confusing.
For arm64 it never makes sense to use the secure PPI, but we do anyway,
inheriting this behaviour from 32-bit arm. For ACPI, we may not even
have a valid secure PPI, so we need to be able to only request the
non-secure PPI.
To that end, this patch reworks the timer driver so that we can request
the non-secure PPI alone. The PPI selection is split out into a new
function, arch_timer_select_ppi(), and verification of the selected PPI
is shifted out to callers (as DT may select the PPI by other means and
must handle this anyway).
We now consistently use arch_timer_has_nonsecure_ppi() to determine
whether we must manage a non-secure PPI *in addition* to a secure PPI.
When we only have a non-secure PPI, this returns false.
Signed-off-by: Fu Wei <fu.wei@linaro.org>
Tested-by: Xiongfeng Wang <wangxiongfeng2@huawei.com>
[Mark: reword commit message]
Signed-off-by: Mark Rutland <mark.rutland@arm.com>
This patch add a new enum "arch_timer_spi_nr" and use it in the driver.
Just for code's readability, no functional change.
Signed-off-by: Fu Wei <fu.wei@linaro.org>
Acked-by: Mark Rutland <mark.rutland@arm.com>
Signed-off-by: Mark Rutland <mark.rutland@arm.com>
To support the arm_arch_timer via ACPI we need to share defines and enums
between the driver and the ACPI parser code.
So we split out the relevant defines and enums into arm_arch_timer.h.
No functional change.
Signed-off-by: Fu Wei <fu.wei@linaro.org>
Acked-by: Mark Rutland <mark.rutland@arm.com>
Tested-by: Xiongfeng Wang <wangxiongfeng2@huawei.com>
Signed-off-by: Mark Rutland <mark.rutland@arm.com>
In preparation for moving the PPI enum out into a header, rename the
enum and its constituent values these so they are namespaced w.r.t. the
arch timer. This will aid consistency and avoid potential name clashes
when this move occurs.
No functional change.
Signed-off-by: Fu Wei <fu.wei@linaro.org>
Tested-by: Xiongfeng Wang <wangxiongfeng2@huawei.com>
[Mark: reword commit message]
Signed-off-by: Mark Rutland <mark.rutland@arm.com>
In preparation for moving the type macros out into a header, rename
these so they are namespaced w.r.t. the arch timer. We'll apply the same
prefix to other definitions in subsequent patches. This will aid
consistency and avoid potential name clahses when this move occurs.
No functional change.
Signed-off-by: Fu Wei <fu.wei@linaro.org>
Tested-by: Xiongfeng Wang <wangxiongfeng2@huawei.com>
[Mark: reword commit message]
Signed-off-by: Mark Rutland <mark.rutland@arm.com>
Almost all string in the arm_arch_timer driver duplicate an common
prefix (though a few do not). For consistency, it would be better to use
pr_fmt(), and always use this prefix. At the same time, we may as well
clean up some whitespace issues in arch_timer_banner and
arch_timer_init.
No functional change.
Signed-off-by: Fu Wei <fu.wei@linaro.org>
Acked-by: Mark Rutland <mark.rutland@arm.com>
Tested-by: Xiongfeng Wang <wangxiongfeng2@huawei.com>
[Mark: reword commit message]
Signed-off-by: Mark Rutland <mark.rutland@arm.com>
In order to deal with ACPI enabled platforms suffering from the
HISILICON_ERRATUM_161010101, let's add the required OEM data that
allow the workaround to be enabled.
Acked-by: Thomas Gleixner <tglx@linutronix.de>
Tested-by: dann frazier <dann.frazier@canonical.com>
Tested-by: Hanjun Guo <hanjun.guo@linaro.org>
Reviewed-by: Hanjun Guo <hanjun.guo@linaro.org>
Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
Just as we're able to identify a broken platform using some DT
information, let's enable a way to spot the offenders with ACPI.
The difference is that we can only match on some OEM info instead
of implementation-specific properties. So in order to avoid the
insane multiplication of errata structures, we allow an array
of OEM descriptions to be attached to an erratum structure.
Acked-by: Thomas Gleixner <tglx@linutronix.de>
Tested-by: dann frazier <dann.frazier@canonical.com>
Tested-by: Hanjun Guo <hanjun.guo@linaro.org>
Reviewed-by: Hanjun Guo <hanjun.guo@linaro.org>
Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
Cortex-A73 (all versions) counter read can return a wrong value
when the counter crosses a 32bit boundary.
The workaround involves performing the read twice, and to return
one or the other depending on whether a transition has taken place.
Acked-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
Userspace being allowed to use read CNTVCT_EL0 anytime (and not
only in the VDSO), we need to enable trapping whenever a cntvct
workaround is enabled on a given CPU.
Acked-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
As we're about to allow per CPU cntkctl_el1 configuration, we cannot
rely on the register value to be common when performing power
management.
Let's turn saved_cntkctl into a per-cpu variable.
Acked-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
In order to access clocksource_counter from the errata handling code,
move it (together with the related structures and functions) towards
the top of the file.
Acked-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
Instead of applying a CPU-specific workaround to all CPUs in the system,
allow it to only affect a subset of them (typical big-little case).
This is done by turning the erratum pointer into a per-CPU variable.
Acked-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
The way we work around errata affecting set_next_event is not very
nice, at it imposes this workaround on errata that do not need it.
Add new workaround hooks and let the existing workarounds use them.
Acked-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>