Various architectures that use asm-generic/io.h still defined their
own default versions of ioremap_nocache, ioremap_wt and ioremap_wc
that point back to plain ioremap directly or indirectly. Remove these
definitions and rely on asm-generic/io.h instead. For this to work
the backup ioremap_* defintions needs to be changed to purely cpp
macros instea of inlines to cover for architectures like openrisc
that only define ioremap after including <asm-generic/io.h>.
Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Arnd Bergmann <arnd@arndb.de>
Reviewed-by: Palmer Dabbelt <palmer@dabbelt.com>
Pull timer fixes from Thomas Gleixner:
"A small set of fixes for timekeepoing and clocksource drivers:
- VDSO data was updated conditional on the availability of a VDSO
capable clocksource. This causes the VDSO functions which do not
depend on a VDSO capable clocksource to operate on stale data.
Always update unconditionally.
- Prevent a double free in the mediatek driver
- Use the proper helper in the sh_mtu2 driver so it won't attempt to
initialize non-existing interrupts"
* 'timers-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
timekeeping/vsyscall: Update VDSO data unconditionally
clocksource/drivers/sh_mtu2: Do not loop using platform_get_irq_by_name()
clocksource/drivers/mediatek: Fix error handling
* for-next/elf-hwcap-docs:
: Update the arm64 ELF HWCAP documentation
docs/arm64: cpu-feature-registers: Rewrite bitfields that don't follow [e, s]
docs/arm64: cpu-feature-registers: Documents missing visible fields
docs/arm64: elf_hwcaps: Document HWCAP_SB
docs/arm64: elf_hwcaps: sort the HWCAP{, 2} documentation by ascending value
* for-next/smccc-conduit-cleanup:
: SMC calling convention conduit clean-up
firmware: arm_sdei: use common SMCCC_CONDUIT_*
firmware/psci: use common SMCCC_CONDUIT_*
arm: spectre-v2: use arm_smccc_1_1_get_conduit()
arm64: errata: use arm_smccc_1_1_get_conduit()
arm/arm64: smccc/psci: add arm_smccc_1_1_get_conduit()
* for-next/zone-dma:
: Reintroduction of ZONE_DMA for Raspberry Pi 4 support
arm64: mm: reserve CMA and crashkernel in ZONE_DMA32
dma/direct: turn ARCH_ZONE_DMA_BITS into a variable
arm64: Make arm64_dma32_phys_limit static
arm64: mm: Fix unused variable warning in zone_sizes_init
mm: refresh ZONE_DMA and ZONE_DMA32 comments in 'enum zone_type'
arm64: use both ZONE_DMA and ZONE_DMA32
arm64: rename variables used to calculate ZONE_DMA32's size
arm64: mm: use arm64_dma_phys_limit instead of calling max_zone_dma_phys()
* for-next/relax-icc_pmr_el1-sync:
: Relax ICC_PMR_EL1 (GICv3) accesses when ICC_CTLR_EL1.PMHE is clear
arm64: Document ICC_CTLR_EL3.PMHE setting requirements
arm64: Relax ICC_PMR_EL1 accesses when ICC_CTLR_EL1.PMHE is clear
* for-next/double-page-fault:
: Avoid a double page fault in __copy_from_user_inatomic() if hw does not support auto Access Flag
mm: fix double page fault on arm64 if PTE_AF is cleared
x86/mm: implement arch_faults_on_old_pte() stub on x86
arm64: mm: implement arch_faults_on_old_pte() on arm64
arm64: cpufeature: introduce helper cpu_has_hw_af()
* for-next/misc:
: Various fixes and clean-ups
arm64: kpti: Add NVIDIA's Carmel core to the KPTI whitelist
arm64: mm: Remove MAX_USER_VA_BITS definition
arm64: mm: simplify the page end calculation in __create_pgd_mapping()
arm64: print additional fault message when executing non-exec memory
arm64: psci: Reduce the waiting time for cpu_psci_cpu_kill()
arm64: pgtable: Correct typo in comment
arm64: docs: cpu-feature-registers: Document ID_AA64PFR1_EL1
arm64: cpufeature: Fix typos in comment
arm64/mm: Poison initmem while freeing with free_reserved_area()
arm64: use generic free_initrd_mem()
arm64: simplify syscall wrapper ifdeffery
* for-next/kselftest-arm64-signal:
: arm64-specific kselftest support with signal-related test-cases
kselftest: arm64: fake_sigreturn_misaligned_sp
kselftest: arm64: fake_sigreturn_bad_size
kselftest: arm64: fake_sigreturn_duplicated_fpsimd
kselftest: arm64: fake_sigreturn_missing_fpsimd
kselftest: arm64: fake_sigreturn_bad_size_for_magic0
kselftest: arm64: fake_sigreturn_bad_magic
kselftest: arm64: add helper get_current_context
kselftest: arm64: extend test_init functionalities
kselftest: arm64: mangle_pstate_invalid_mode_el[123][ht]
kselftest: arm64: mangle_pstate_invalid_daif_bits
kselftest: arm64: mangle_pstate_invalid_compat_toggle and common utils
kselftest: arm64: extend toplevel skeleton Makefile
* for-next/kaslr-diagnostics:
: Provide diagnostics on boot for KASLR
arm64: kaslr: Check command line before looking for a seed
arm64: kaslr: Announce KASLR status on boot
Just like we do for WFE trapping, it can be useful to turn off
WFI trapping when the physical CPU is not oversubscribed (that
is, the vcpu is the only runnable process on this CPU) *and*
that we're using direct injection of interrupts.
The conditions are reevaluated on each vcpu_load(), ensuring that
we don't switch to this mode on a busy system.
On a GICv4 system, this has the effect of reducing the generation
of doorbell interrupts to zero when the right conditions are
met, which is a huge improvement over the current situation
(where the doorbells are screaming if the CPU ever hits a
blocking WFI).
Signed-off-by: Marc Zyngier <maz@kernel.org>
Reviewed-by: Zenghui Yu <yuzenghui@huawei.com>
Reviewed-by: Christoffer Dall <christoffer.dall@arm.com>
Link: https://lore.kernel.org/r/20191107160412.30301-3-maz@kernel.org
Following commit 73e86cb03c ("arm64: Move PTE_RDONLY bit handling out
of set_pte_at()"), the PTE_RDONLY bit is no longer managed by
set_pte_at() but built into the PAGE_* attribute definitions.
Consequently, pte_same() must include this bit when checking two PTEs
for equality.
Remove the arm64-specific pte_same() function, practically reverting
commit 747a70e60b ("arm64: Fix copy-on-write referencing in HugeTLB")
Fixes: 73e86cb03c ("arm64: Move PTE_RDONLY bit handling out of set_pte_at()")
Cc: <stable@vger.kernel.org> # 4.14.x-
Cc: Will Deacon <will@kernel.org>
Cc: Steve Capper <steve.capper@arm.com>
Reported-by: John Stultz <john.stultz@linaro.org>
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
Signed-off-by: Will Deacon <will@kernel.org>
This patch implements FTRACE_WITH_REGS for arm64, which allows a traced
function's arguments (and some other registers) to be captured into a
struct pt_regs, allowing these to be inspected and/or modified. This is
a building block for live-patching, where a function's arguments may be
forwarded to another function. This is also necessary to enable ftrace
and in-kernel pointer authentication at the same time, as it allows the
LR value to be captured and adjusted prior to signing.
Using GCC's -fpatchable-function-entry=N option, we can have the
compiler insert a configurable number of NOPs between the function entry
point and the usual prologue. This also ensures functions are AAPCS
compliant (e.g. disabling inter-procedural register allocation).
For example, with -fpatchable-function-entry=2, GCC 8.1.0 compiles the
following:
| unsigned long bar(void);
|
| unsigned long foo(void)
| {
| return bar() + 1;
| }
... to:
| <foo>:
| nop
| nop
| stp x29, x30, [sp, #-16]!
| mov x29, sp
| bl 0 <bar>
| add x0, x0, #0x1
| ldp x29, x30, [sp], #16
| ret
This patch builds the kernel with -fpatchable-function-entry=2,
prefixing each function with two NOPs. To trace a function, we replace
these NOPs with a sequence that saves the LR into a GPR, then calls an
ftrace entry assembly function which saves this and other relevant
registers:
| mov x9, x30
| bl <ftrace-entry>
Since patchable functions are AAPCS compliant (and the kernel does not
use x18 as a platform register), x9-x18 can be safely clobbered in the
patched sequence and the ftrace entry code.
There are now two ftrace entry functions, ftrace_regs_entry (which saves
all GPRs), and ftrace_entry (which saves the bare minimum). A PLT is
allocated for each within modules.
Signed-off-by: Torsten Duwe <duwe@suse.de>
[Mark: rework asm, comments, PLTs, initialization, commit message]
Signed-off-by: Mark Rutland <mark.rutland@arm.com>
Reviewed-by: Amit Daniel Kachhap <amit.kachhap@arm.com>
Reviewed-by: Ard Biesheuvel <ard.biesheuvel@linaro.org>
Reviewed-by: Torsten Duwe <duwe@suse.de>
Tested-by: Amit Daniel Kachhap <amit.kachhap@arm.com>
Tested-by: Torsten Duwe <duwe@suse.de>
Cc: AKASHI Takahiro <takahiro.akashi@linaro.org>
Cc: Catalin Marinas <catalin.marinas@arm.com>
Cc: Josh Poimboeuf <jpoimboe@redhat.com>
Cc: Julien Thierry <jthierry@redhat.com>
Cc: Will Deacon <will@kernel.org>
For FTRACE_WITH_REGS, we're going to want to generate a MOV (register)
instruction as part of the callsite intialization. As MOV (register) is
an alias for ORR (shifted register), we can generate this with
aarch64_insn_gen_logical_shifted_reg(), but it's somewhat verbose and
difficult to read in-context.
Add a aarch64_insn_gen_move_reg() wrapper for this case so that we can
write callers in a more straightforward way.
Signed-off-by: Mark Rutland <mark.rutland@arm.com>
Reviewed-by: Ard Biesheuvel <ard.biesheuvel@linaro.org>
Reviewed-by: Torsten Duwe <duwe@suse.de>
Tested-by: Amit Daniel Kachhap <amit.kachhap@arm.com>
Tested-by: Torsten Duwe <duwe@suse.de>
Cc: Catalin Marinas <catalin.marinas@arm.com>
Cc: Will Deacon <will@kernel.org>
commit 9b31cf493f ("arm64: mm: Introduce MAX_USER_VA_BITS definition")
introduced the MAX_USER_VA_BITS definition, which was used to support
the arm64 mm use-cases where the user-space could use 52-bit virtual
addresses whereas the kernel-space would still could a maximum of 48-bit
virtual addressing.
But, now with commit b6d00d47e8 ("arm64: mm: Introduce 52-bit Kernel
VAs"), we removed the 52-bit user/48-bit kernel kconfig option and hence
there is no longer any scenario where user VA != kernel VA size
(even with CONFIG_ARM64_FORCE_52BIT enabled, the same is true).
Hence we can do away with the MAX_USER_VA_BITS macro as it is equal to
VA_BITS (maximum VA space size) in all possible use-cases. Note that
even though the 'vabits_actual' value would be 48 for arm64 hardware
which don't support LVA-8.2 extension (even when CONFIG_ARM64_VA_BITS_52
is enabled), VA_BITS would still be set to a value 52. Hence this change
would be safe in all possible VA address space combinations.
Cc: James Morse <james.morse@arm.com>
Cc: Will Deacon <will@kernel.org>
Cc: Steve Capper <steve.capper@arm.com>
Cc: Ard Biesheuvel <ard.biesheuvel@linaro.org>
Cc: linux-kernel@vger.kernel.org
Cc: kexec@lists.infradead.org
Reviewed-by: Mark Rutland <mark.rutland@arm.com>
Signed-off-by: Bhupesh Sharma <bhsharma@redhat.com>
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
The update of the VDSO data is depending on __arch_use_vsyscall() returning
True. This is a leftover from the attempt to map the features of various
architectures 1:1 into generic code.
The usage of __arch_use_vsyscall() in the actual vsyscall implementations
got dropped and replaced by the requirement for the architecture code to
return U64_MAX if the global clocksource is not usable in the VDSO.
But the __arch_use_vsyscall() check in the update code stayed which causes
the VDSO data to be stale or invalid when an architecture actually
implements that function and returns False when the current clocksource is
not usable in the VDSO.
As a consequence the VDSO implementations of clock_getres(), time(),
clock_gettime(CLOCK_.*_COARSE) operate on invalid data and return bogus
information.
Remove the __arch_use_vsyscall() check from the VDSO update function and
update the VDSO data unconditionally.
[ tglx: Massaged changelog and removed the now useless implementations in
asm-generic/ARM64/MIPS ]
Fixes: 44f57d788e ("timekeeping: Provide a generic update_vsyscall() implementation")
Signed-off-by: Huacai Chen <chenhc@lemote.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Cc: Andy Lutomirski <luto@kernel.org>
Cc: Vincenzo Frascino <vincenzo.frascino@arm.com>
Cc: Arnd Bergmann <arnd@arndb.de>
Cc: Paul Burton <paul.burton@mips.com>
Cc: linux-mips@vger.kernel.org
Cc: linux-arm-kernel@lists.infradead.org
Cc: stable@vger.kernel.org
Link: https://lkml.kernel.org/r/1571887709-11447-1-git-send-email-chenhc@lemote.com
The Broadcom Brahma-B53 core is susceptible to the issue described by
ARM64_ERRATUM_845719 so this commit enables the workaround to be applied
when executing on that core.
Since there are now multiple entries to match, we must convert the
existing ARM64_ERRATUM_845719 into an erratum list.
Signed-off-by: Doug Berger <opendmb@gmail.com>
Signed-off-by: Florian Fainelli <f.fainelli@gmail.com>
Signed-off-by: Will Deacon <will@kernel.org>
Some architectures, notably ARM, are interested in tweaking this
depending on their runtime DMA addressing limitations.
Acked-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Nicolas Saenz Julienne <nsaenzjulienne@suse.de>
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
Shared and writable mappings (__S.1.) should be clean (!dirty) initially
and made dirty on a subsequent write either through the hardware DBM
(dirty bit management) mechanism or through a write page fault. A clean
pte for the arm64 kernel is one that has PTE_RDONLY set and PTE_DIRTY
clear.
The PAGE_SHARED{,_EXEC} attributes have PTE_WRITE set (PTE_DBM) and
PTE_DIRTY clear. Prior to commit 73e86cb03c ("arm64: Move PTE_RDONLY
bit handling out of set_pte_at()"), it was the responsibility of
set_pte_at() to set the PTE_RDONLY bit and mark the pte clean if the
software PTE_DIRTY bit was not set. However, the above commit removed
the pte_sw_dirty() check and the subsequent setting of PTE_RDONLY in
set_pte_at() while leaving the PAGE_SHARED{,_EXEC} definitions
unchanged. The result is that shared+writable mappings are now dirty by
default
Fix the above by explicitly setting PTE_RDONLY in PAGE_SHARED{,_EXEC}.
In addition, remove the superfluous PTE_DIRTY bit from the kernel PROT_*
attributes.
Fixes: 73e86cb03c ("arm64: Move PTE_RDONLY bit handling out of set_pte_at()")
Cc: <stable@vger.kernel.org> # 4.14.x-
Cc: Will Deacon <will@kernel.org>
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
Signed-off-by: Will Deacon <will@kernel.org>
Move the synchronous exception paths from entry.S into a C file to
improve the code readability.
* for-next/entry-s-to-c:
arm64: entry-common: don't touch daif before bp-hardening
arm64: Remove asmlinkage from updated functions
arm64: entry: convert el0_sync to C
arm64: entry: convert el1_sync to C
arm64: add local_daif_inherit()
arm64: Add prototypes for functions called by entry.S
arm64: remove __exception annotations
Similarly to erratum 1165522 that affects Cortex-A76, A57 and A72
respectively suffer from errata 1319537 and 1319367, potentially
resulting in TLB corruption if the CPU speculates an AT instruction
while switching guests.
The fix is slightly more involved since we don't have VHE to help us
here, but the idea is the same: when switching a guest in, we must
prevent any speculated AT from being able to parse the page tables
until S2 is up and running. Only at this stage can we allow AT to take
place.
For this, we always restore the guest sysregs first, except for its
SCTLR and TCR registers, which must be set with SCTLR.M=1 and
TCR.EPD{0,1} = {1, 1}, effectively disabling the PTW and TLB
allocation. Once S2 is setup, we restore the guest's SCTLR and
TCR. Similar things must be done on TLB invalidation...
* 'kvm-arm64/erratum-1319367' of git://git.kernel.org/pub/scm/linux/kernel/git/maz/arm-platforms:
arm64: Enable and document ARM errata 1319367 and 1319537
arm64: KVM: Prevent speculative S1 PTW when restoring vcpu context
arm64: KVM: Disable EL1 PTW when invalidating S2 TLBs
arm64: KVM: Reorder system register restoration and stage-2 activation
arm64: Add ARM64_WORKAROUND_1319367 for all A57 and A72 versions
On CPUs that support S2FWB (Armv8.4+), KVM configures the stage 2 page
tables to override the memory attributes of memory accesses, regardless
of the stage 1 page table configurations, and also when the stage 1 MMU
is turned off. This results in all memory accesses to RAM being
cacheable, including during early boot of the guest.
On CPUs without this feature, memory accesses were non-cacheable during
boot until the guest turned on the stage 1 MMU, and we had to detect
when the guest turned on the MMU, such that we could invalidate all cache
entries and ensure a consistent view of memory with the MMU turned on.
When the guest turned on the caches, we would call stage2_flush_vm()
from kvm_toggle_cache().
However, stage2_flush_vm() walks all the stage 2 tables, and calls
__kvm_flush-dcache_pte, which on a system with S2FWB does ... absolutely
nothing.
We can avoid that whole song and dance, and simply not set TVM when
creating a VM on a system that has S2FWB.
Signed-off-by: Christoffer Dall <christoffer.dall@arm.com>
Signed-off-by: Marc Zyngier <maz@kernel.org>
Reviewed-by: Mark Rutland <mark.rutland@arm.com>
Link: https://lore.kernel.org/r/20191028130541.30536-1-christoffer.dall@arm.com
Neoverse-N1 cores with the 'COHERENT_ICACHE' feature may fetch stale
instructions when software depends on prefetch-speculation-protection
instead of explicit synchronization. [0]
The workaround is to trap I-Cache maintenance and issue an
inner-shareable TLBI. The affected cores have a Coherent I-Cache, so the
I-Cache maintenance isn't necessary. The core tells user-space it can
skip it with CTR_EL0.DIC. We also have to trap this register to hide the
bit forcing DIC-aware user-space to perform the maintenance.
To avoid trapping all cache-maintenance, this workaround depends on
a firmware component that only traps I-cache maintenance from EL0 and
performs the workaround.
For user-space, the kernel's work is to trap CTR_EL0 to hide DIC, and
produce a fake IminLine. EL3 traps the now-necessary I-Cache maintenance
and performs the inner-shareable-TLBI that makes everything better.
[0] https://developer.arm.com/docs/sden885747/latest/arm-neoverse-n1-mp050-software-developer-errata-notice
* for-next/neoverse-n1-stale-instr:
arm64: Silence clang warning on mismatched value/register sizes
arm64: compat: Workaround Neoverse-N1 #1542419 for compat user-space
arm64: Fake the IminLine size on systems affected by Neoverse-N1 #1542419
arm64: errata: Hide CTR_EL0.DIC on systems affected by Neoverse-N1 #1542419
The previous patches mechanically transformed the assembly version of
entry.S to entry-common.c for synchronous exceptions.
The C version of local_daif_restore() doesn't quite do the same thing
as the assembly versions if pseudo-NMI is in use. In particular,
| local_daif_restore(DAIF_PROCCTX_NOIRQ)
will still allow pNMI to be delivered. This is not the behaviour
do_el0_ia_bp_hardening() and do_sp_pc_abort() want as it should not
be possible for the PMU handler to run as an NMI until the bp-hardening
sequence has run.
The bp-hardening calls were placed where they are because this was the
first C code to run after the relevant exceptions. As we've now moved
that point earlier, move the checks and calls earlier too.
This makes it clearer that this stuff runs before any kind of exception,
and saves modifying PSTATE twice.
Signed-off-by: James Morse <james.morse@arm.com>
Reviewed-by: Mark Rutland <mark.rutland@arm.com>
Cc: Julien Thierry <julien.thierry.kdev@gmail.com>
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
Now that the callers of these functions have moved into C, they no longer
need the asmlinkage annotation. Remove it.
Signed-off-by: James Morse <james.morse@arm.com>
Acked-by: Mark Rutland <mark.rutland@arm.com>
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
This is largely a 1-1 conversion of asm to C, with a couple of caveats.
The el0_sync{_compat} switches explicitly handle all the EL0 debug
cases, so el0_dbg doesn't have to try to bail out for unexpected EL1
debug ESR values. This also means that an unexpected vector catch from
AArch32 is routed to el0_inv.
We *could* merge the native and compat switches, which would make the
diffstat negative, but I've tried to stay as close to the existing
assembly as possible for the moment.
Signed-off-by: Mark Rutland <mark.rutland@arm.com>
[split out of a bigger series, added nokprobes. removed irq trace
calls as the C helpers do this. renamed el0_dbg's use of FAR]
Signed-off-by: James Morse <james.morse@arm.com>
Reviewed-by: Mark Rutland <mark.rutland@arm.com>
Cc: Julien Thierry <julien.thierry.kdev@gmail.com>
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
Some synchronous exceptions can be taken from a number of contexts,
e.g. where IRQs may or may not be masked. In the entry assembly for
these exceptions, we use the inherit_daif assembly macro to ensure
that we only mask those exceptions which were masked when the exception
was taken.
So that we can do the same from C code, this patch adds a new
local_daif_inherit() function, following the existing local_daif_*()
naming scheme.
Signed-off-by: Mark Rutland <mark.rutland@arm.com>
[moved away from local_daif_restore()]
Signed-off-by: James Morse <james.morse@arm.com>
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
Functions that are only called by assembly don't always have a
C header file prototype.
Add the prototypes before moving the assembly callers to C.
Signed-off-by: James Morse <james.morse@arm.com>
Acked-by: Mark Rutland <mark.rutland@arm.com>
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
Since commit 7326749801 ("arm64: unwind: reference pt_regs via embedded
stack frame") arm64 has not used the __exception annotation to dump
the pt_regs during stack tracing. in_exception_text() has no callers.
This annotation is only used to blacklist kprobes, it means the same as
__kprobes.
Section annotations like this require the functions to be grouped
together between the start/end markers, and placed according to
the linker script. For kprobes we also have NOKPROBE_SYMBOL() which
logs the symbol address in a section that kprobes parses and
blacklists at boot.
Using NOKPROBE_SYMBOL() instead lets kprobes publish the list of
blacklisted symbols, and saves us from having an arm64 specific
spelling of __kprobes.
do_debug_exception() already has a NOKPROBE_SYMBOL() annotation.
Signed-off-by: James Morse <james.morse@arm.com>
Acked-by: Mark Rutland <mark.rutland@arm.com>
Acked-by: Masami Hiramatsu <mhiramat@kernel.org>
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
Systems affected by Neoverse-N1 #1542419 support DIC so do not need to
perform icache maintenance once new instructions are cleaned to the PoU.
For the errata workaround, the kernel hides DIC from user-space, so that
the unnecessary cache maintenance can be trapped by firmware.
To reduce the number of traps, produce a fake IminLine value based on
PAGE_SIZE.
Signed-off-by: James Morse <james.morse@arm.com>
Reviewed-by: Suzuki K Poulose <suzuki.poulose@arm.com>
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
Cores affected by Neoverse-N1 #1542419 could execute a stale instruction
when a branch is updated to point to freshly generated instructions.
To workaround this issue we need user-space to issue unnecessary
icache maintenance that we can trap. Start by hiding CTR_EL0.DIC.
Reviewed-by: Suzuki K Poulose <suzuki.poulose@arm.com>
Signed-off-by: James Morse <james.morse@arm.com>
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
Enable paravirtualization features when running under a hypervisor
supporting the PV_TIME_ST hypercall.
For each (v)CPU, we ask the hypervisor for the location of a shared
page which the hypervisor will use to report stolen time to us. We set
pv_time_ops to the stolen time function which simply reads the stolen
value from the shared page for a VCPU. We guarantee single-copy
atomicity using READ_ONCE which means we can also read the stolen
time for another VCPU than the currently running one while it is
potentially being updated by the hypervisor.
Signed-off-by: Steven Price <steven.price@arm.com>
Signed-off-by: Marc Zyngier <maz@kernel.org>
Allow user space to inform the KVM host where in the physical memory
map the paravirtualized time structures should be located.
User space can set an attribute on the VCPU providing the IPA base
address of the stolen time structure for that VCPU. This must be
repeated for every VCPU in the VM.
The address is given in terms of the physical address visible to
the guest and must be 64 byte aligned. The guest will discover the
address via a hypercall.
Signed-off-by: Steven Price <steven.price@arm.com>
Signed-off-by: Marc Zyngier <maz@kernel.org>
Implement the service call for configuring a shared structure between a
VCPU and the hypervisor in which the hypervisor can write the time
stolen from the VCPU's execution time by other tasks on the host.
User space allocates memory which is placed at an IPA also chosen by user
space. The hypervisor then updates the shared structure using
kvm_put_guest() to ensure single copy atomicity of the 64-bit value
reporting the stolen time in nanoseconds.
Whenever stolen time is enabled by the guest, the stolen time counter is
reset.
The stolen time itself is retrieved from the sched_info structure
maintained by the Linux scheduler code. We enable SCHEDSTATS when
selecting KVM Kconfig to ensure this value is meaningful.
Signed-off-by: Steven Price <steven.price@arm.com>
Signed-off-by: Marc Zyngier <maz@kernel.org>
This provides a mechanism for querying which paravirtualized time
features are available in this hypervisor.
Also add the header file which defines the ABI for the paravirtualized
time features we're about to add.
Signed-off-by: Steven Price <steven.price@arm.com>
Signed-off-by: Marc Zyngier <maz@kernel.org>
In some scenarios, such as buggy guest or incorrect configuration of the
VMM and firmware description data, userspace will detect a memory access
to a portion of the IPA, which is not mapped to any MMIO region.
For this purpose, the appropriate action is to inject an external abort
to the guest. The kernel already has functionality to inject an
external abort, but we need to wire up a signal from user space that
lets user space tell the kernel to do this.
It turns out, we already have the set event functionality which we can
perfectly reuse for this.
Signed-off-by: Christoffer Dall <christoffer.dall@arm.com>
Signed-off-by: Marc Zyngier <maz@kernel.org>
For a long time, if a guest accessed memory outside of a memslot using
any of the load/store instructions in the architecture which doesn't
supply decoding information in the ESR_EL2 (the ISV bit is not set), the
kernel would print the following message and terminate the VM as a
result of returning -ENOSYS to userspace:
load/store instruction decoding not implemented
The reason behind this message is that KVM assumes that all accesses
outside a memslot is an MMIO access which should be handled by
userspace, and we originally expected to eventually implement some sort
of decoding of load/store instructions where the ISV bit was not set.
However, it turns out that many of the instructions which don't provide
decoding information on abort are not safe to use for MMIO accesses, and
the remaining few that would potentially make sense to use on MMIO
accesses, such as those with register writeback, are not used in
practice. It also turns out that fetching an instruction from guest
memory can be a pretty horrible affair, involving stopping all CPUs on
SMP systems, handling multiple corner cases of address translation in
software, and more. It doesn't appear likely that we'll ever implement
this in the kernel.
What is much more common is that a user has misconfigured his/her guest
and is actually not accessing an MMIO region, but just hitting some
random hole in the IPA space. In this scenario, the error message above
is almost misleading and has led to a great deal of confusion over the
years.
It is, nevertheless, ABI to userspace, and we therefore need to
introduce a new capability that userspace explicitly enables to change
behavior.
This patch introduces KVM_CAP_ARM_NISV_TO_USER (NISV meaning Non-ISV)
which does exactly that, and introduces a new exit reason to report the
event to userspace. User space can then emulate an exception to the
guest, restart the guest, suspend the guest, or take any other
appropriate action as per the policy of the running system.
Reported-by: Heinrich Schuchardt <xypron.glpk@gmx.de>
Signed-off-by: Christoffer Dall <christoffer.dall@arm.com>
Reviewed-by: Alexander Graf <graf@amazon.com>
Signed-off-by: Marc Zyngier <maz@kernel.org>
Rework the EL2 vector hardening that is only selected for A57 and A72
so that the table can also be used for ARM64_WORKAROUND_1319367.
Acked-by: Catalin Marinas <catalin.marinas@arm.com>
Reviewed-by: Suzuki K Poulose <suzuki.poulose@arm.com>
Signed-off-by: Marc Zyngier <maz@kernel.org>
On arm64 without hardware Access Flag, copying from user will fail because
the pte is old and cannot be marked young. So we always end up with zeroed
page after fork() + CoW for pfn mappings. We don't always have a
hardware-managed Access Flag on arm64.
Hence implement arch_faults_on_old_pte on arm64 to indicate that it might
cause page fault when accessing old pte.
Signed-off-by: Jia He <justin.he@arm.com>
Reviewed-by: Catalin Marinas <catalin.marinas@arm.com>
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
We unconditionally set the HW_AFDBM capability and only enable it on
CPUs which really have the feature. But sometimes we need to know
whether this cpu has the capability of HW AF. So decouple AF from
DBM by a new helper cpu_has_hw_af().
If later we noticed a potential performance issue on this path, we can
turn it into a static label as with other CPU features.
Signed-off-by: Jia He <justin.he@arm.com>
Suggested-by: Suzuki Poulose <Suzuki.Poulose@arm.com>
Reviewed-by: Catalin Marinas <catalin.marinas@arm.com>
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
Workaround for Cavium/Marvell ThunderX2 erratum #219.
* errata/tx2-219:
arm64: Allow CAVIUM_TX2_ERRATUM_219 to be selected
arm64: Avoid Cavium TX2 erratum 219 when switching TTBR
arm64: Enable workaround for Cavium TX2 erratum 219 when running SMT
arm64: KVM: Trap VM ops when ARM64_WORKAROUND_CAVIUM_TX2_219_TVM is set
Sign-extending TTBR1 addresses when converting to an untagged address
breaks the documented POSIX semantics for mlock() in some obscure error
cases where we end up returning -EINVAL instead of -ENOMEM as a direct
result of rewriting the upper address bits.
Rework the untagged_addr() macro to preserve the upper address bits for
TTBR1 addresses and only clear the tag bits for user addresses. This
matches the behaviour of the 'clear_address_tag' assembly macro, so
rename that and align the implementations at the same time so that they
use the same instruction sequences for the tag manipulation.
Link: https://lore.kernel.org/stable/20191014162651.GF19200@arrakis.emea.arm.com/
Reported-by: Jan Stancek <jstancek@redhat.com>
Tested-by: Jan Stancek <jstancek@redhat.com>
Reviewed-by: Catalin Marinas <catalin.marinas@arm.com>
Tested-by: Catalin Marinas <catalin.marinas@arm.com>
Reviewed-by: Vincenzo Frascino <vincenzo.frascino@arm.com>
Tested-by: Vincenzo Frascino <vincenzo.frascino@arm.com>
Reviewed-by: Andrey Konovalov <andreyknvl@google.com>
Signed-off-by: Will Deacon <will@kernel.org>
The 'F' field of the PAR_EL1 register lives in bit 0, not bit 1.
Fix the broken definition in 'sysreg.h'.
Fixes: e8620cff99 ("arm64: sysreg: Add some field definitions for PAR_EL1")
Reviewed-by: Mark Rutland <mark.rutland@arm.com>
Signed-off-by: Yang Yingliang <yangyingliang@huawei.com>
Signed-off-by: Will Deacon <will@kernel.org>
The GICv3 architecture specification is incredibly misleading when it
comes to PMR and the requirement for a DSB. It turns out that this DSB
is only required if the CPU interface sends an Upstream Control
message to the redistributor in order to update the RD's view of PMR.
This message is only sent when ICC_CTLR_EL1.PMHE is set, which isn't
the case in Linux. It can still be set from EL3, so some special care
is required. But the upshot is that in the (hopefuly large) majority
of the cases, we can drop the DSB altogether.
This relies on a new static key being set if the boot CPU has PMHE
set. The drawback is that this static key has to be exported to
modules.
Cc: Will Deacon <will@kernel.org>
Cc: James Morse <james.morse@arm.com>
Cc: Julien Thierry <julien.thierry.kdev@gmail.com>
Cc: Suzuki K Poulose <suzuki.poulose@arm.com>
Signed-off-by: Marc Zyngier <maz@kernel.org>
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
So far all arm64 devices have supported 32 bit DMA masks for their
peripherals. This is not true anymore for the Raspberry Pi 4 as most of
it's peripherals can only address the first GB of memory on a total of
up to 4 GB.
This goes against ZONE_DMA32's intent, as it's expected for ZONE_DMA32
to be addressable with a 32 bit mask. So it was decided to re-introduce
ZONE_DMA in arm64.
ZONE_DMA will contain the lower 1G of memory, which is currently the
memory area addressable by any peripheral on an arm64 device.
ZONE_DMA32 will contain the rest of the 32 bit addressable memory.
Signed-off-by: Nicolas Saenz Julienne <nsaenzjulienne@suse.de>
Reviewed-by: Catalin Marinas <catalin.marinas@arm.com>
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
Back in commit:
4378a7d4be ("arm64: implement syscall wrappers")
... I implemented the arm64 syscall wrapper glue following the approach
taken on x86. While doing so, I also copied across some ifdeffery that
isn't necessary on arm64.
On arm64 we don't share any of the native wrappers with compat tasks,
and unlike x86 we don't have alternative implementations of
SYSCALL_DEFINE0(), COND_SYSCALL(), or SYS_NI() defined when AArch32
compat support is enabled.
Thus we don't need to prevent multiple definitions of these macros, and
can remove the #ifndef ... #endif guards protecting them. If any of
these had been previously defined elsewhere, syscalls are unlikely to
work correctly, and we'd want the compiler to warn about the multiple
definitions.
Acked-by: Will Deacon <will@kernel.org>
Signed-off-by: Mark Rutland <mark.rutland@arm.com>
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
We export the entire kernel address space (i.e. the whole of the TTBR1
address range) via /proc/kcore. The kc_vaddr_to_offset() and
kc_offset_to_vaddr() macros are intended to convert between a kernel
virtual address and its offset relative to the start of the TTBR1
address space.
Prior to commit:
14c127c957 ("arm64: mm: Flip kernel VA space")
... the offset was calculated relative to VA_START, which at the time
was the start of the TTBR1 address space. At this time, PAGE_OFFSET
pointed to the high half of the TTBR1 address space where arm64's
linear map lived.
That commit swapped the position of VA_START and PAGE_OFFSET, but
failed to update kc_vaddr_to_offset() or kc_offset_to_vaddr(), so
since then the two macros behave incorrectly.
Note that VA_START was subsequently renamed to PAGE_END in commit:
77ad4ce693 ("arm64: memory: rename VA_START to PAGE_END")
As the generic implementations of the two macros calculate the offset
relative to PAGE_OFFSET (which is now the start of the TTBR1 address
space), we can delete the arm64 implementation and use those.
Fixes: 14c127c957 ("arm64: mm: Flip kernel VA space")
Reviewed-by: James Morse <james.morse@arm.com>
Reviewed-by: Mark Rutland <mark.rutland@arm.com>
Signed-off-by: Chris von Recklinghausen <crecklin@redhat.com>
Signed-off-by: Will Deacon <will@kernel.org>
- Numerous fixes to the compat vDSO build system, especially when
combining gcc and clang
- Fix parsing of PAR_EL1 in spurious kernel fault detection
- Partial workaround for Neoverse-N1 erratum #1542419
- Fix IRQ priority masking on entry from compat syscalls
- Fix advertisment of FRINT HWCAP to userspace
- Attempt to workaround inlining breakage with '__always_inline'
- Fix accidental freeing of parent SVE state on fork() error path
- Add some missing NULL pointer checks in instruction emulation init
- Some formatting and comment fixes
-----BEGIN PGP SIGNATURE-----
iQFEBAABCgAuFiEEPxTL6PPUbjXGY88ct6xw3ITBYzQFAl2dv4cQHHdpbGxAa2Vy
bmVsLm9yZwAKCRC3rHDchMFjNO6UB/4yY3lYR6C++7EdVwYxQRXf8VX9ukeO76gp
P/AS6Kt8+AiOuhFJJXDj3D7K/KqgZnJEhzeWHTZluYpIBuzFerW+RxzmExL+wFWf
ISZgdh7roFCQx3Nt+iBs/bAMPvk5Da1KHvSw/yZ6P8mj6fK8sVUh/O8+KK4kSzfT
muDoSO6WHSonAEOYm9ryn1q1pM5DsCjr+9fm7d9L+dJAUP2xX44ymlIY+v6yD3Or
IWJMYaWKb4TbdTJSy2VbUSM0fzByGBJCx1wOTd4gV6uDbB4GA6h+E/DMB1qnvv9W
nH5c4qwVgYhp7prpescMxYZoV/I9damvfnaIjqh9jc3H3milEqcn
=GwLJ
-----END PGP SIGNATURE-----
Merge tag 'arm64-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux
Pull arm64 fixes from Will Deacon:
"A larger-than-usual batch of arm64 fixes for -rc3.
The bulk of the fixes are dealing with a bunch of issues with the
build system from the compat vDSO, which unfortunately led to some
significant Makefile rework to manage the horrible combinations of
toolchains that we can end up needing to drive simultaneously.
We came close to disabling the thing entirely, but Vincenzo was quick
to spin up some patches and I ended up picking up most of the bits
that were left [*]. Future work will look at disentangling the header
files properly.
Other than that, we have some important fixes all over, including one
papering over the miscompilation fallout from forcing
CONFIG_OPTIMIZE_INLINING=y, which I'm still unhappy about. Harumph.
We've still got a couple of open issues, so I'm expecting to have some
more fixes later this cycle.
Summary:
- Numerous fixes to the compat vDSO build system, especially when
combining gcc and clang
- Fix parsing of PAR_EL1 in spurious kernel fault detection
- Partial workaround for Neoverse-N1 erratum #1542419
- Fix IRQ priority masking on entry from compat syscalls
- Fix advertisment of FRINT HWCAP to userspace
- Attempt to workaround inlining breakage with '__always_inline'
- Fix accidental freeing of parent SVE state on fork() error path
- Add some missing NULL pointer checks in instruction emulation init
- Some formatting and comment fixes"
[*] Will's final fixes were
Reviewed-by: Vincenzo Frascino <vincenzo.frascino@arm.com>
Tested-by: Vincenzo Frascino <vincenzo.frascino@arm.com>
but they were already in linux-next by then and he didn't rebase
just to add those.
* tag 'arm64-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux: (21 commits)
arm64: armv8_deprecated: Checking return value for memory allocation
arm64: Kconfig: Make CONFIG_COMPAT_VDSO a proper Kconfig option
arm64: vdso32: Rename COMPATCC to CC_COMPAT
arm64: vdso32: Pass '--target' option to clang via VDSO_CAFLAGS
arm64: vdso32: Don't use KBUILD_CPPFLAGS unconditionally
arm64: vdso32: Move definition of COMPATCC into vdso32/Makefile
arm64: Default to building compat vDSO with clang when CONFIG_CC_IS_CLANG
lib: vdso: Remove CROSS_COMPILE_COMPAT_VDSO
arm64: vdso32: Remove jump label config option in Makefile
arm64: vdso32: Detect binutils support for dmb ishld
arm64: vdso: Remove stale files from old assembly implementation
arm64: vdso32: Fix broken compat vDSO build warnings
arm64: mm: fix spurious fault detection
arm64: ftrace: Ensure synchronisation in PLT setup for Neoverse-N1 #1542419
arm64: Fix incorrect irqflag restore for priority masking for compat
arm64: mm: avoid virt_to_phys(init_mm.pgd)
arm64: cpufeature: Effectively expose FRINT capability to userspace
arm64: Mark functions using explicit register variables as '__always_inline'
docs: arm64: Fix indentation and doc formatting
arm64/sve: Fix wrong free for task->thread.sve_state
...
As a PRFM instruction racing against a TTBR update can have undesirable
effects on TX2, NOP-out such PRFM on cores that are affected by
the TX2-219 erratum.
Cc: <stable@vger.kernel.org>
Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
Signed-off-by: Will Deacon <will@kernel.org>
In order to workaround the TX2-219 erratum, it is necessary to trap
TTBRx_EL1 accesses to EL2. This is done by setting HCR_EL2.TVM on
guest entry, which has the side effect of trapping all the other
VM-related sysregs as well.
To minimize the overhead, a fast path is used so that we don't
have to go all the way back to the main sysreg handling code,
unless the rest of the hypervisor expects to see these accesses.
Cc: <stable@vger.kernel.org>
Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
Signed-off-by: Will Deacon <will@kernel.org>
Older versions of binutils (prior to 2.24) do not support the "ISHLD"
option for memory barrier instructions, which leads to a build failure
when assembling the vdso32 library.
Add a compilation time mechanism that detects if binutils supports those
instructions and configure the kernel accordingly.
Cc: Will Deacon <will@kernel.org>
Cc: Catalin Marinas <catalin.marinas@arm.com>
Reported-by: Will Deacon <will@kernel.org>
Signed-off-by: Vincenzo Frascino <vincenzo.frascino@arm.com>
Reviewed-by: Catalin Marinas <catalin.marinas@arm.com>
Tested-by: Catalin Marinas <catalin.marinas@arm.com>
Signed-off-by: Will Deacon <will@kernel.org>
Moving over to the generic C implementation of the vDSO inadvertently
left some stale files behind which are no longer used. Remove them.
Cc: Will Deacon <will@kernel.org>
Cc: Catalin Marinas <catalin.marinas@arm.com>
Signed-off-by: Vincenzo Frascino <vincenzo.frascino@arm.com>
Acked-by: Catalin Marinas <catalin.marinas@arm.com>
Tested-by: Catalin Marinas <catalin.marinas@arm.com>
Signed-off-by: Will Deacon <will@kernel.org>
a nested hypervisor has always been busted on Broadwell and newer processors,
and that has finally been fixed.
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v2.0.22 (GNU/Linux)
iQEcBAABAgAGBQJdlzTRAAoJEL/70l94x66DElcH/Rvhn5VQE/n2J+tKEXAICxQu
FqcTBJ5x2mp04aFe7xD3kWoKRJmz2lmHdw2ahFd4sqqLfGEFF/KW24ADI33vzLx/
UmT78O0Je3PX77TRnEXy+napbJny0iT6ikTAQKPbyQ151JlqlbPvatpDXXLPWQHv
jj6nKHCvMBrhV3kgaXO3cTFl8swX1hvR9lo9PcA2gRNt+HMN0heUmpfKughPoOes
JH+UNjsEr7MYlXYlIIc9o71EYH+kgPObwlLejy0ture+dvvZEJUJjZJE8H/XG5f2
ryXG9favaCOTAvaGf0R5Es+47A3crqUr6gHS0N28QKPn7x4hehIkKpA9dXQnWIw=
=1/LN
-----END PGP SIGNATURE-----
Merge tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm
Pull KVM fixes from Paolo Bonzini:
"ARM and x86 bugfixes of all kinds.
The most visible one is that migrating a nested hypervisor has always
been busted on Broadwell and newer processors, and that has finally
been fixed"
* tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm: (22 commits)
KVM: x86: omit "impossible" pmu MSRs from MSR list
KVM: nVMX: Fix consistency check on injected exception error code
KVM: x86: omit absent pmu MSRs from MSR list
selftests: kvm: Fix libkvm build error
kvm: vmx: Limit guest PMCs to those supported on the host
kvm: x86, powerpc: do not allow clearing largepages debugfs entry
KVM: selftests: x86: clarify what is reported on KVM_GET_MSRS failure
KVM: VMX: Set VMENTER_L1D_FLUSH_NOT_REQUIRED if !X86_BUG_L1TF
selftests: kvm: add test for dirty logging inside nested guests
KVM: x86: fix nested guest live migration with PML
KVM: x86: assign two bits to track SPTE kinds
KVM: x86: Expose XSAVEERPTR to the guest
kvm: x86: Enumerate support for CLZERO instruction
kvm: x86: Use AMD CPUID semantics for AMD vCPUs
kvm: x86: Improve emulation of CPUID leaves 0BH and 1FH
KVM: X86: Fix userspace set invalid CR4
kvm: x86: Fix a spurious -E2BIG in __do_cpuid_func
KVM: LAPIC: Loosen filter for adaptive tuning of lapic_timer_advance_ns
KVM: arm/arm64: vgic: Use the appropriate TRACE_INCLUDE_PATH
arm64: KVM: Kill hyp_alternate_select()
...
-----BEGIN PGP SIGNATURE-----
iHUEABYIAB0WIQRTLbB6QfY48x44uB6AXGG7T9hjvgUCXZbQhwAKCRCAXGG7T9hj
vh03AP9mOLNY8r16u6a+Iy0YVccTaeiQiquG6HgFVEGX2Ki38gD/Xf5u6bPRYBts
uSRL/eYDvtfU4YGGMjogn20Fdzhc5Ak=
=EkVp
-----END PGP SIGNATURE-----
Merge tag 'for-linus-5.4-rc2-tag' of git://git.kernel.org/pub/scm/linux/kernel/git/xen/tip
Pull xen fixes and cleanups from Juergen Gross:
- a fix in the Xen balloon driver avoiding hitting a BUG_ON() in some
cases, plus a follow-on cleanup series for that driver
- a patch for introducing non-blocking EFI callbacks in Xen's EFI
driver, plu a cleanup patch for Xen EFI handling merging the x86 and
ARM arch specific initialization into the Xen EFI driver
- a fix of the Xen xenbus driver avoiding a self-deadlock when cleaning
up after a user process has died
- a fix for Xen on ARM after removal of ZONE_DMA
- a cleanup patch for avoiding build warnings for Xen on ARM
* tag 'for-linus-5.4-rc2-tag' of git://git.kernel.org/pub/scm/linux/kernel/git/xen/tip:
xen/xenbus: fix self-deadlock after killing user process
xen/efi: have a common runtime setup function
arm: xen: mm: use __GPF_DMA32 for arm64
xen/balloon: Clear PG_offline in balloon_retrieve()
xen/balloon: Mark pages PG_offline in balloon_append()
xen/balloon: Drop __balloon_append()
xen/balloon: Set pages PageOffline() in balloon_add_region()
ARM: xen: unexport HYPERVISOR_platform_op function
xen/efi: Set nonblocking callbacks
As of ac7c3e4ff4 ("compiler: enable CONFIG_OPTIMIZE_INLINING forcibly"),
inline functions are no longer annotated with '__always_inline', which
allows the compiler to decide whether inlining is really a good idea or
not. Although this is a great idea on paper, the reality is that AArch64
GCC prior to 9.1 has been shown to get confused when creating an
out-of-line copy of a function passing explicit 'register' variables
into an inline assembly block:
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=91111
It's not clear whether this is specific to arm64 or not but, for now,
ensure that all of our functions using 'register' variables are marked
as '__always_inline' so that the old behaviour is effectively preserved.
Hopefully other architectures are luckier with their compilers.
Cc: Masahiro Yamada <yamada.masahiro@socionext.com>
Cc: Nicolas Saenz Julienne <nsaenzjulienne@suse.de>
Cc: Arnd Bergmann <arnd@arndb.de>
Cc: Russell King <linux@armlinux.org.uk>
Cc: Catalin Marinas <catalin.marinas@arm.com>
Cc: Nick Desaulniers <ndesaulniers@google.com>
Signed-off-by: Will Deacon <will@kernel.org>
- Remove the now obsolete hyp_alternate_select construct
- Fix the TRACE_INCLUDE_PATH macro in the vgic code
-----BEGIN PGP SIGNATURE-----
iQJDBAABCgAtFiEEn9UcU+C1Yxj9lZw9I9DQutE9ekMFAl2TFyoPHG1hekBrZXJu
ZWwub3JnAAoJECPQ0LrRPXpDROwP/inRUonz+KEG2B0Bx/NWtzdnDghxdcoNC9H6
lVDHJ2dtC5Kmf0iHEualUvxXHYx7QJ3Maov3UAtkeYl3s4wC6TAl++QkqAG9PYsc
lPQH4GBiQNewQyaebc/NKHDz3I3TClJDq57haHSFFiCwsUpJRgYL8WjktZD/Dide
CUSQGxdnaALzHvMv5a8yQWadPL/RrXCZqOSKbUjjc20meZxrO66HwUd1G6uZZVDn
VClMQwFkQzVjR7yX21/7gmTcwG99RqVaAsvOpCu9+MVlqSpDROspmSPMuG5X/usO
zDgC07UFNPYHQKrGu8DHqlvO9DrK3vR8VEuKu+asVZP7D/ntvKhAM2c5ai188Z12
w8rOnhJKnDtMGHXn4owcC9tgSfrPR+ZukaltzKRVVFm1Y1Io+qTkAuf3geFqZ1hj
L9LWZ0KlMsFvfIKWPcAEp5rA9EeZoP5IeVCelBWj9ERDrcCMhma8RxpAlBPz1YPy
J345jthE4xFZYQxV+amTKJ3CzbZPuU2iIKgDBYiG2PNCuKwCT46RQitOXWWTwSIb
FZ6pcsmhofj69dSAlrRFjEpiLNkJuNX1ArsAA91vXemTXA2YfVLMZo1HkrmFNfbR
j4HP1BhNVdCgk6HF2HzwdRt8eutvk889GG3q+uCoYCaSu3M8MUEgx64LurOPProO
11jhNb3J
=9luB
-----END PGP SIGNATURE-----
Merge tag 'kvmarm-fixes-5.4-1' of git://git.kernel.org/pub/scm/linux/kernel/git/kvmarm/kvmarm into HEAD
KVM/arm fixes for 5.4, take #1
- Remove the now obsolete hyp_alternate_select construct
- Fix the TRACE_INCLUDE_PATH macro in the vgic code
Today the EFI runtime functions are setup in architecture specific
code (x86 and arm), with the functions themselves living in drivers/xen
as they are not architecture dependent.
As the setup is exactly the same for arm and x86 move the setup to
drivers/xen, too. This at once removes the need to make the single
functions global visible.
Signed-off-by: Juergen Gross <jgross@suse.com>
Reviewed-by: Jan Beulich <jbeulich@suse.com>
[boris: "Dropped EXPORT_SYMBOL_GPL(xen_efi_runtime_setup)"]
Signed-off-by: Boris Ostrovsky <boris.ostrovsky@oracle.com>
The naming of pgtable_page_{ctor,dtor}() seems to have confused a few
people, and until recently arm64 used these erroneously/pointlessly for
other levels of page table.
To make it incredibly clear that these only apply to the PTE level, and to
align with the naming of pgtable_pmd_page_{ctor,dtor}(), let's rename them
to pgtable_pte_page_{ctor,dtor}().
These changes were generated with the following shell script:
----
git grep -lw 'pgtable_page_.tor' | while read FILE; do
sed -i '{s/pgtable_page_ctor/pgtable_pte_page_ctor/}' $FILE;
sed -i '{s/pgtable_page_dtor/pgtable_pte_page_dtor/}' $FILE;
done
----
... with the documentation re-flowed to remain under 80 columns, and
whitespace fixed up in macros to keep backslashes aligned.
There should be no functional change as a result of this patch.
Link: http://lkml.kernel.org/r/20190722141133.3116-1-mark.rutland@arm.com
Signed-off-by: Mark Rutland <mark.rutland@arm.com>
Reviewed-by: Mike Rapoport <rppt@linux.ibm.com>
Acked-by: Geert Uytterhoeven <geert@linux-m68k.org> [m68k]
Cc: Anshuman Khandual <anshuman.khandual@arm.com>
Cc: Matthew Wilcox <willy@infradead.org>
Cc: Michal Hocko <mhocko@suse.com>
Cc: Yu Zhao <yuzhao@google.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
arm64 handles top-down mmap layout in a way that can be easily reused by
other architectures, so make it available in mm. It then introduces a new
config ARCH_WANT_DEFAULT_TOPDOWN_MMAP_LAYOUT that can be set by other
architectures to benefit from those functions. Note that this new config
depends on MMU being enabled, if selected without MMU support, a warning
will be thrown.
Link: http://lkml.kernel.org/r/20190730055113.23635-5-alex@ghiti.fr
Signed-off-by: Alexandre Ghiti <alex@ghiti.fr>
Suggested-by: Christoph Hellwig <hch@infradead.org>
Acked-by: Catalin Marinas <catalin.marinas@arm.com>
Acked-by: Kees Cook <keescook@chromium.org>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Luis Chamberlain <mcgrof@kernel.org>
Cc: Albert Ou <aou@eecs.berkeley.edu>
Cc: Alexander Viro <viro@zeniv.linux.org.uk>
Cc: James Hogan <jhogan@kernel.org>
Cc: Palmer Dabbelt <palmer@sifive.com>
Cc: Paul Burton <paul.burton@mips.com>
Cc: Ralf Baechle <ralf@linux-mips.org>
Cc: Russell King <linux@armlinux.org.uk>
Cc: Will Deacon <will.deacon@arm.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Both pgtable_cache_init() and pgd_cache_init() are used to initialize kmem
cache for page table allocations on several architectures that do not use
PAGE_SIZE tables for one or more levels of the page table hierarchy.
Most architectures do not implement these functions and use __weak default
NOP implementation of pgd_cache_init(). Since there is no such default
for pgtable_cache_init(), its empty stub is duplicated among most
architectures.
Rename the definitions of pgd_cache_init() to pgtable_cache_init() and
drop empty stubs of pgtable_cache_init().
Link: http://lkml.kernel.org/r/1566457046-22637-1-git-send-email-rppt@linux.ibm.com
Signed-off-by: Mike Rapoport <rppt@linux.ibm.com>
Acked-by: Will Deacon <will@kernel.org> [arm64]
Acked-by: Thomas Gleixner <tglx@linutronix.de> [x86]
Cc: Catalin Marinas <catalin.marinas@arm.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Matthew Wilcox <willy@infradead.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Patch series "mm: remove quicklist page table caches".
A while ago Nicholas proposed to remove quicklist page table caches [1].
I've rebased his patch on the curren upstream and switched ia64 and sh to
use generic versions of PTE allocation.
[1] https://lore.kernel.org/linux-mm/20190711030339.20892-1-npiggin@gmail.com
This patch (of 3):
Remove page table allocator "quicklists". These have been around for a
long time, but have not got much traction in the last decade and are only
used on ia64 and sh architectures.
The numbers in the initial commit look interesting but probably don't
apply anymore. If anybody wants to resurrect this it's in the git
history, but it's unhelpful to have this code and divergent allocator
behaviour for minor archs.
Also it might be better to instead make more general improvements to page
allocator if this is still so slow.
Link: http://lkml.kernel.org/r/1565250728-21721-2-git-send-email-rppt@linux.ibm.com
Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
Signed-off-by: Mike Rapoport <rppt@linux.ibm.com>
Cc: Tony Luck <tony.luck@intel.com>
Cc: Yoshinori Sato <ysato@users.sourceforge.jp>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
- Fix clang build breakage with CONFIG_OPTIMIZE_INLINING=y
- Fix compilation of pointer tagging selftest
- Fix COND_SYSCALL definitions to work with CFI checks
- Fix stale documentation reference in our Kconfig
-----BEGIN PGP SIGNATURE-----
iQFEBAABCgAuFiEEPxTL6PPUbjXGY88ct6xw3ITBYzQFAl2DO5cQHHdpbGxAa2Vy
bmVsLm9yZwAKCRC3rHDchMFjNBHhCADEoSOk8WSijBvC4WNCPXBvukuk2KEwal2s
By+iEo3byr9uRskduddumEnTA3F5otfV7VR3HG34P8K3dzP541Zw8sPs5Q1MU/Zo
7SzU6J14q+Q3V/+F0RDpajP60bBbmjYQi23xmO8hnnHbZlIV7dteD+JvMWrNh31M
nTvPHpS+jm/SNFUSL1EMnbr2Au5UhN8j+9Qk1NyPgmTV9DkS4drDs4lBqp8VhvCD
v6kC3vdW5DiXvMHoa5RXO+3B57HYVmlhylEO0ZOmKxyZEhph1jiuR2ng4CDYtxo3
qngynXxa7u79a6TdDhnzresaku4xM22UO9bf1WkVq9yv6JQ7/ArV
=wZPH
-----END PGP SIGNATURE-----
Merge tag 'arm64-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux
Pull arm64 fixes from Will Deacon:
"We've had a few arm64 fixes trickle in this week. Nothing catastophic,
but all things that should be addressed:
- Fix clang build breakage with CONFIG_OPTIMIZE_INLINING=y
- Fix compilation of pointer tagging selftest
- Fix COND_SYSCALL definitions to work with CFI checks
- Fix stale documentation reference in our Kconfig"
* tag 'arm64-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux:
arm64: Fix reference to docs for ARM64_TAGGED_ADDR_ABI
arm64: fix function types in COND_SYSCALL
selftests, arm64: add kernel headers path for tags_test
arm64: fix unreachable code issue with cmpxchg
- Addition of multiprobes to kprobe and uprobe events
Allows for more than one probe attached to the same location
- Addition of adding immediates to probe parameters
- Clean up of the recordmcount.c code. This brings us closer
to merging recordmcount into objtool, and reuse code.
- Other small clean ups
-----BEGIN PGP SIGNATURE-----
iIoEABYIADIWIQRRSw7ePDh/lE+zeZMp5XQQmuv6qgUCXYQoqhQccm9zdGVkdEBn
b29kbWlzLm9yZwAKCRAp5XQQmuv6qlIxAP9VVABbpuvOYqxKuFgyP62ituSXPLkL
gZv4I5Zse4b6/gD/eksFXY/OHo7jp6aQiHvxotUkAiFFE9iHzi0JscdMJgo=
=WqrT
-----END PGP SIGNATURE-----
Merge tag 'trace-v5.4' of git://git.kernel.org/pub/scm/linux/kernel/git/rostedt/linux-trace
Pull tracing updates from Steven Rostedt:
- Addition of multiprobes to kprobe and uprobe events (allows for more
than one probe attached to the same location)
- Addition of adding immediates to probe parameters
- Clean up of the recordmcount.c code. This brings us closer to merging
recordmcount into objtool, and reuse code.
- Other small clean ups
* tag 'trace-v5.4' of git://git.kernel.org/pub/scm/linux/kernel/git/rostedt/linux-trace: (33 commits)
selftests/ftrace: Update kprobe event error testcase
tracing/probe: Reject exactly same probe event
tracing/probe: Fix to allow user to enable events on unloaded modules
selftests/ftrace: Select an existing function in kprobe_eventname test
tracing/kprobe: Fix NULL pointer access in trace_porbe_unlink()
tracing: Make sure variable reference alias has correct var_ref_idx
tracing: Be more clever when dumping hex in __print_hex()
ftrace: Simplify ftrace hash lookup code in clear_func_from_hash()
tracing: Add "gfp_t" support in synthetic_events
tracing: Rename tracing_reset() to tracing_reset_cpu()
tracing: Document the stack trace algorithm in the comments
tracing/arm64: Have max stack tracer handle the case of return address after data
recordmcount: Clarify what cleanup() does
recordmcount: Remove redundant cleanup() calls
recordmcount: Kernel style formatting
recordmcount: Kernel style function signature formatting
recordmcount: Rewrite error/success handling
selftests/ftrace: Add syntax error test for multiprobe
selftests/ftrace: Add syntax error test for immediates
selftests/ftrace: Add a testcase for kprobe multiprobe event
...
- add dma-mapping and block layer helpers to take care of IOMMU
merging for mmc plus subsequent fixups (Yoshihiro Shimoda)
- rework handling of the pgprot bits for remapping (me)
- take care of the dma direct infrastructure for swiotlb-xen (me)
- improve the dma noncoherent remapping infrastructure (me)
- better defaults for ->mmap, ->get_sgtable and ->get_required_mask (me)
- cleanup mmaping of coherent DMA allocations (me)
- various misc cleanups (Andy Shevchenko, me)
-----BEGIN PGP SIGNATURE-----
iQI/BAABCgApFiEEgdbnc3r/njty3Iq9D55TZVIEUYMFAl2CSucLHGhjaEBsc3Qu
ZGUACgkQD55TZVIEUYPfrhAAgXZA/EdFPvkkCoDrmgtf3XkudX9gajeCd9g4NZy6
ZBQElTVvm4S0sQj7IXgALnMumDMbbTibW5SQLX5GwQDe+XXBpZ8ajpAnJAXc8a5T
qaFQ4SInr4CgBZf9nZKDkbSBZ1Tu3AQm1c0QI8riRCkrVTuX4L06xpCef4Yh4mgO
rwWEjIioYpQiKZMmu98riXh3ZNfFG3mVJRhKt8B6XJbBgnUnjDOPYGgaUwp6CU20
tFBKL2GaaV0vdLJ5wYhIGXT4DJ8tp9T5n3IYGZv1Ux889RaZEHlCrMxzelYeDbCT
KhZbhcSECGnddsh73t/UX7/KhytuqnfKa9n+Xo6AWuA47xO4c36quOOcTk9M0vE5
TfGDmewgL6WIv4lzokpRn5EkfDhyL33j8eYJrJ8e0ldcOhSQIFk4ciXnf2stWi6O
JrlzzzSid+zXxu48iTfoPdnMr7psTpiMvvRvKfEeMp2FX9Fg6EdMzJYLTEl+COHB
0WwNacZmY3P01+b5EZXEgqKEZevIIdmPKbyM9rPtTjz8BjBwkABHTpN3fWbVBf7/
Ax6OPYyW40xp1fnJuzn89m3pdOxn88FpDdOaeLz892Zd+Qpnro1ayulnFspVtqGM
mGbzA9whILvXNRpWBSQrvr2IjqMRjbBxX3BVACl3MMpOChgkpp5iANNfSDjCftSF
Zu8=
=/wGv
-----END PGP SIGNATURE-----
Merge tag 'dma-mapping-5.4' of git://git.infradead.org/users/hch/dma-mapping
Pull dma-mapping updates from Christoph Hellwig:
- add dma-mapping and block layer helpers to take care of IOMMU merging
for mmc plus subsequent fixups (Yoshihiro Shimoda)
- rework handling of the pgprot bits for remapping (me)
- take care of the dma direct infrastructure for swiotlb-xen (me)
- improve the dma noncoherent remapping infrastructure (me)
- better defaults for ->mmap, ->get_sgtable and ->get_required_mask
(me)
- cleanup mmaping of coherent DMA allocations (me)
- various misc cleanups (Andy Shevchenko, me)
* tag 'dma-mapping-5.4' of git://git.infradead.org/users/hch/dma-mapping: (41 commits)
mmc: renesas_sdhi_internal_dmac: Add MMC_CAP2_MERGE_CAPABLE
mmc: queue: Fix bigger segments usage
arm64: use asm-generic/dma-mapping.h
swiotlb-xen: merge xen_unmap_single into xen_swiotlb_unmap_page
swiotlb-xen: simplify cache maintainance
swiotlb-xen: use the same foreign page check everywhere
swiotlb-xen: remove xen_swiotlb_dma_mmap and xen_swiotlb_dma_get_sgtable
xen: remove the exports for xen_{create,destroy}_contiguous_region
xen/arm: remove xen_dma_ops
xen/arm: simplify dma_cache_maint
xen/arm: use dev_is_dma_coherent
xen/arm: consolidate page-coherent.h
xen/arm: use dma-noncoherent.h calls for xen-swiotlb cache maintainance
arm: remove wrappers for the generic dma remap helpers
dma-mapping: introduce a dma_common_find_pages helper
dma-mapping: always use VM_DMA_COHERENT for generic DMA remap
vmalloc: lift the arm flag for coherent mappings to common code
dma-mapping: provide a better default ->get_required_mask
dma-mapping: remove the dma_declare_coherent_memory export
remoteproc: don't allow modular build
...
* ARM: ITS translation cache; support for 512 vCPUs, various cleanups
and bugfixes
* PPC: various minor fixes and preparation
* x86: bugfixes all over the place (posted interrupts, SVM, emulation
corner cases, blocked INIT), some IPI optimizations
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v2.0.22 (GNU/Linux)
iQEcBAABAgAGBQJdf7fdAAoJEL/70l94x66DJzkIAKDcuWXJB4Qtoto6yUvPiHZm
LYkY/Dn1zulb/DhzrBoXFey/jZXwl9kxMYkVTefnrAl0fRwFGX+G1UYnQrtAL6Gr
ifdTYdy3kZhXCnnp99QAantWDswJHo1THwbmHrlmkxS4MdisEaTHwgjaHrDRZ4/d
FAEwW2isSonP3YJfTtsKFFjL9k2D4iMnwZ/R2B7UOaWvgnerZ1GLmOkilvnzGGEV
IQ89IIkWlkKd4SKgq8RkDKlfW5JrLrSdTK2Uf0DvAxV+J0EFkEaR+WlLsqumra0z
Eg3KwNScfQj0DyT0TzurcOxObcQPoMNSFYXLRbUu1+i0CGgm90XpF1IosiuihgU=
=w6I3
-----END PGP SIGNATURE-----
Merge tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm
Pull KVM updates from Paolo Bonzini:
"s390:
- ioctl hardening
- selftests
ARM:
- ITS translation cache
- support for 512 vCPUs
- various cleanups and bugfixes
PPC:
- various minor fixes and preparation
x86:
- bugfixes all over the place (posted interrupts, SVM, emulation
corner cases, blocked INIT)
- some IPI optimizations"
* tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm: (75 commits)
KVM: X86: Use IPI shorthands in kvm guest when support
KVM: x86: Fix INIT signal handling in various CPU states
KVM: VMX: Introduce exit reason for receiving INIT signal on guest-mode
KVM: VMX: Stop the preemption timer during vCPU reset
KVM: LAPIC: Micro optimize IPI latency
kvm: Nested KVM MMUs need PAE root too
KVM: x86: set ctxt->have_exception in x86_decode_insn()
KVM: x86: always stop emulation on page fault
KVM: nVMX: trace nested VM-Enter failures detected by H/W
KVM: nVMX: add tracepoint for failed nested VM-Enter
x86: KVM: svm: Fix a check in nested_svm_vmrun()
KVM: x86: Return to userspace with internal error on unexpected exit reason
KVM: x86: Add kvm_emulate_{rd,wr}msr() to consolidate VXM/SVM code
KVM: x86: Refactor up kvm_{g,s}et_msr() to simplify callers
doc: kvm: Fix return description of KVM_SET_MSRS
KVM: X86: Tune PLE Window tracepoint
KVM: VMX: Change ple_window type to unsigned int
KVM: X86: Remove tailing newline for tracepoints
KVM: X86: Trace vcpu_id for vmexit
KVM: x86: Manually calculate reserved bits when loading PDPTRS
...
Define a weak function in COND_SYSCALL instead of a weak alias to
sys_ni_syscall, which has an incompatible type. This fixes indirect
call mismatches with Control-Flow Integrity (CFI) checking.
Acked-by: Mark Rutland <mark.rutland@arm.com>
Signed-off-by: Sami Tolvanen <samitolvanen@google.com>
Signed-off-by: Will Deacon <will@kernel.org>
On arm64 build with clang, sometimes the __cmpxchg_mb is not inlined
when CONFIG_OPTIMIZE_INLINING is set.
Clang then fails a compile-time assertion, because it cannot tell at
compile time what the size of the argument is:
mm/memcontrol.o: In function `__cmpxchg_mb':
memcontrol.c:(.text+0x1a4c): undefined reference to `__compiletime_assert_175'
memcontrol.c:(.text+0x1a4c): relocation truncated to fit: R_AARCH64_CALL26 against undefined symbol `__compiletime_assert_175'
Mark all of the cmpxchg() style functions as __always_inline to
ensure that the compiler can see the result.
Acked-by: Nick Desaulniers <ndesaulniers@google.com>
Reported-by: Nathan Chancellor <natechancellor@gmail.com>
Link: https://github.com/ClangBuiltLinux/linux/issues/648
Reviewed-by: Nathan Chancellor <natechancellor@gmail.com>
Tested-by: Nathan Chancellor <natechancellor@gmail.com>
Reviewed-by: Andrew Murray <andrew.murray@arm.com>
Tested-by: Andrew Murray <andrew.murray@arm.com>
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Signed-off-by: Will Deacon <will@kernel.org>
- 52-bit virtual addressing in the kernel
- New ABI to allow tagged user pointers to be dereferenced by syscalls
- Early RNG seeding by the bootloader
- Improve robustness of SMP boot
- Fix TLB invalidation in light of recent architectural clarifications
- Support for i.MX8 DDR PMU
- Remove direct LSE instruction patching in favour of static keys
- Function error injection using kprobes
- Support for the PPTT "thread" flag introduced by ACPI 6.3
- Move PSCI idle code into proper cpuidle driver
- Relaxation of implicit I/O memory barriers
- Build with RELR relocations when toolchain supports them
- Numerous cleanups and non-critical fixes
-----BEGIN PGP SIGNATURE-----
iQFEBAABCgAuFiEEPxTL6PPUbjXGY88ct6xw3ITBYzQFAl1yYREQHHdpbGxAa2Vy
bmVsLm9yZwAKCRC3rHDchMFjNAM3CAChqDFQkryXoHwdeEcaukMRVNxtxOi4pM4g
5xqkb7PoqRJssIblsuhaXjrSD97yWCgaqCmFe6rKoes++lP4bFcTe22KXPPyPBED
A+tK4nTuKKcZfVbEanUjI+ihXaHJmKZ/kwAxWsEBYZ4WCOe3voCiJVNO2fHxqg1M
8TskZ2BoayTbWMXih0eJg2MCy/xApBq4b3nZG4bKI7Z9UpXiKN1NYtDh98ZEBK4V
d/oNoHsJ2ZvIQsztoBJMsvr09DTCazCijWZiECadm6l41WEPFizngrACiSJLLtYo
0qu4qxgg9zgFlvBCRQmIYSggTuv35RgXSfcOwChmW5DUjHG+f9GK
=Ru4B
-----END PGP SIGNATURE-----
Merge tag 'arm64-upstream' of git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux
Pull arm64 updates from Will Deacon:
"Although there isn't tonnes of code in terms of line count, there are
a fair few headline features which I've noted both in the tag and also
in the merge commits when I pulled everything together.
The part I'm most pleased with is that we had 35 contributors this
time around, which feels like a big jump from the usual small group of
core arm64 arch developers. Hopefully they all enjoyed it so much that
they'll continue to contribute, but we'll see.
It's probably worth highlighting that we've pulled in a branch from
the risc-v folks which moves our CPU topology code out to where it can
be shared with others.
Summary:
- 52-bit virtual addressing in the kernel
- New ABI to allow tagged user pointers to be dereferenced by
syscalls
- Early RNG seeding by the bootloader
- Improve robustness of SMP boot
- Fix TLB invalidation in light of recent architectural
clarifications
- Support for i.MX8 DDR PMU
- Remove direct LSE instruction patching in favour of static keys
- Function error injection using kprobes
- Support for the PPTT "thread" flag introduced by ACPI 6.3
- Move PSCI idle code into proper cpuidle driver
- Relaxation of implicit I/O memory barriers
- Build with RELR relocations when toolchain supports them
- Numerous cleanups and non-critical fixes"
* tag 'arm64-upstream' of git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux: (114 commits)
arm64: remove __iounmap
arm64: atomics: Use K constraint when toolchain appears to support it
arm64: atomics: Undefine internal macros after use
arm64: lse: Make ARM64_LSE_ATOMICS depend on JUMP_LABEL
arm64: asm: Kill 'asm/atomic_arch.h'
arm64: lse: Remove unused 'alt_lse' assembly macro
arm64: atomics: Remove atomic_ll_sc compilation unit
arm64: avoid using hard-coded registers for LSE atomics
arm64: atomics: avoid out-of-line ll/sc atomics
arm64: Use correct ll/sc atomic constraints
jump_label: Don't warn on __exit jump entries
docs/perf: Add documentation for the i.MX8 DDR PMU
perf/imx_ddr: Add support for AXI ID filtering
arm64: kpti: ensure patched kernel text is fetched from PoU
arm64: fix fixmap copy for 16K pages and 48-bit VA
perf/smmuv3: Validate groups for global filtering
perf/smmuv3: Validate group size
arm64: Relax Documentation/arm64/tagged-pointers.rst
arm64: kvm: Replace hardcoded '1' with SYS_PAR_EL1_F
arm64: mm: Ignore spurious translation faults taken from the kernel
...
Now that the Xen special cases are gone nothing worth mentioning is
left in the arm64 <asm/dma-mapping.h> file, so switch to use the
asm-generic version instead.
Signed-off-by: Christoph Hellwig <hch@lst.de>
Acked-by: Will Deacon <will@kernel.org>
Reviewed-by: Stefano Stabellini <sstabellini@kernel.org>
Use the dma-noncoherent dev_is_dma_coherent helper instead of the home
grown variant. Note that both are always initialized to the same
value in arch_setup_dma_ops.
Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Julien Grall <julien.grall@arm.com>
Reviewed-by: Stefano Stabellini <sstabellini@kernel.org>
Shared the duplicate arm/arm64 code in include/xen/arm/page-coherent.h.
Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Stefano Stabellini <sstabellini@kernel.org>
hyp_alternate_select() is now completely unused. Goodbye.
Signed-off-by: Marc Zyngier <maz@kernel.org>
Reviewed-by: Christoffer Dall <christoffer.dall@arm.com>
Reviewed-by: Andrew Jones <drjones@redhat.com>
While parts of the VGIC support a large number of vcpus (we
bravely allow up to 512), other parts are more limited.
One of these limits is visible in the KVM_IRQ_LINE ioctl, which
only allows 256 vcpus to be signalled when using the CPU or PPI
types. Unfortunately, we've cornered ourselves badly by allocating
all the bits in the irq field.
Since the irq_type subfield (8 bit wide) is currently only taking
the values 0, 1 and 2 (and we have been careful not to allow anything
else), let's reduce this field to only 4 bits, and allocate the
remaining 4 bits to a vcpu2_index, which acts as a multiplier:
vcpu_id = 256 * vcpu2_index + vcpu_index
With that, and a new capability (KVM_CAP_ARM_IRQ_LINE_LAYOUT_2)
allowing this to be discovered, it becomes possible to inject
PPIs to up to 4096 vcpus. But please just don't.
Whilst we're there, add a clarification about the use of KVM_IRQ_LINE
on arm, which is not completely conditionned by KVM_CAP_IRQCHIP.
Reported-by: Zenghui Yu <yuzenghui@huawei.com>
Reviewed-by: Eric Auger <eric.auger@redhat.com>
Reviewed-by: Zenghui Yu <yuzenghui@huawei.com>
Signed-off-by: Marc Zyngier <maz@kernel.org>
Most archs (well at least x86) store the function call return address on the
stack before storing the local variables for the function. The max stack
tracer depends on this in its algorithm to display the stack size of each
function it finds in the back trace.
Some archs (arm64), may store the return address (from its link register)
just before calling a nested function. There's no reason to save the link
register on leaf functions, as it wont be updated. This breaks the algorithm
of the max stack tracer.
Add a new define ARCH_FTRACE_SHIFT_STACK_TRACER that an architecture may set
if it stores the return address (link register) after it stores the
function's local variables, and have the stack trace shift the values of the
mapped stack size to the appropriate functions.
Link: 20190802094103.163576-1-jiping.ma2@windriver.com
Reported-by: Jiping Ma <jiping.ma2@windriver.com>
Acked-by: Will Deacon <will@kernel.org>
Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
* for-next/52-bit-kva: (25 commits)
Support for 52-bit virtual addressing in kernel space
* for-next/cpu-topology: (9 commits)
Move CPU topology parsing into core code and add support for ACPI 6.3
* for-next/error-injection: (2 commits)
Support for function error injection via kprobes
* for-next/perf: (8 commits)
Support for i.MX8 DDR PMU and proper SMMUv3 group validation
* for-next/psci-cpuidle: (7 commits)
Move PSCI idle code into a new CPUidle driver
* for-next/rng: (4 commits)
Support for 'rng-seed' property being passed in the devicetree
* for-next/smpboot: (3 commits)
Reduce fragility of secondary CPU bringup in debug configurations
* for-next/tbi: (10 commits)
Introduce new syscall ABI with relaxed requirements for pointer tags
* for-next/tlbi: (6 commits)
Handle spurious page faults arising from kernel space
The 'K' constraint is a documented AArch64 machine constraint supported
by GCC for matching integer constants that can be used with a 32-bit
logical instruction. Unfortunately, some released compilers erroneously
accept the immediate '4294967295' for this constraint, which is later
refused by GAS at assembly time. This had led us to avoid the use of
the 'K' constraint altogether.
Instead, detect whether the compiler is up to the job when building the
kernel and pass the 'K' constraint to our 32-bit atomic macros when it
appears to be supported.
Signed-off-by: Will Deacon <will@kernel.org>
We use a bunch of internal macros when constructing our atomic and
cmpxchg routines in order to save on boilerplate. Avoid exposing these
directly to users of the header files.
Reviewed-by: Andrew Murray <andrew.murray@arm.com>
Signed-off-by: Will Deacon <will@kernel.org>
The contents of 'asm/atomic_arch.h' can be split across some of our
other 'asm/' headers. Remove it.
Reviewed-by: Andrew Murray <andrew.murray@arm.com>
Signed-off-by: Will Deacon <will@kernel.org>
The 'alt_lse' assembly macro has been unused since 7c8fc35dfc
("locking/atomics/arm64: Replace our atomic/lock bitop implementations
with asm-generic").
Remove it.
Reviewed-by: Andrew Murray <andrew.murray@arm.com>
Signed-off-by: Will Deacon <will@kernel.org>
Now that we have removed the out-of-line ll/sc atomics we can give
the compiler the freedom to choose its own register allocation.
Remove the hard-coded use of x30.
Signed-off-by: Andrew Murray <andrew.murray@arm.com>
Signed-off-by: Will Deacon <will@kernel.org>
When building for LSE atomics (CONFIG_ARM64_LSE_ATOMICS), if the hardware
or toolchain doesn't support it the existing code will fallback to ll/sc
atomics. It achieves this by branching from inline assembly to a function
that is built with special compile flags. Further this results in the
clobbering of registers even when the fallback isn't used increasing
register pressure.
Improve this by providing inline implementations of both LSE and
ll/sc and use a static key to select between them, which allows for the
compiler to generate better atomics code. Put the LL/SC fallback atomics
in their own subsection to improve icache performance.
Signed-off-by: Andrew Murray <andrew.murray@arm.com>
Signed-off-by: Will Deacon <will@kernel.org>
Based on an email from Will Deacon.
Signed-off-by: Christoph Hellwig <hch@lst.de>
Acked-by: Will Deacon <will@kernel.org>
Acked-by: Mark Rutland <mark.rutland@arm.com>
arch_dma_mmap_pgprot is used for two things:
1) to override the "normal" uncached page attributes for mapping
memory coherent to devices that can't snoop the CPU caches
2) to provide the special DMA_ATTR_WRITE_COMBINE semantics on older
arm systems and some mips platforms
Replace one with the pgprot_dmacoherent macro that is already provided
by arm and much simpler to use, and lift the DMA_ATTR_WRITE_COMBINE
handling to common code with an explicit arch opt-in.
Signed-off-by: Christoph Hellwig <hch@lst.de>
Acked-by: Geert Uytterhoeven <geert@linux-m68k.org> # m68k
Acked-by: Paul Burton <paul.burton@mips.com> # mips
The A64 ISA accepts distinct (but overlapping) ranges of immediates for:
* add arithmetic instructions ('I' machine constraint)
* sub arithmetic instructions ('J' machine constraint)
* 32-bit logical instructions ('K' machine constraint)
* 64-bit logical instructions ('L' machine constraint)
... but we currently use the 'I' constraint for many atomic operations
using sub or logical instructions, which is not always valid.
When CONFIG_ARM64_LSE_ATOMICS is not set, this allows invalid immediates
to be passed to instructions, potentially resulting in a build failure.
When CONFIG_ARM64_LSE_ATOMICS is selected the out-of-line ll/sc atomics
always use a register as they have no visibility of the value passed by
the caller.
This patch adds a constraint parameter to the ATOMIC_xx and
__CMPXCHG_CASE macros so that we can pass appropriate constraints for
each case, with uses updated accordingly.
Unfortunately prior to GCC 8.1.0 the 'K' constraint erroneously accepted
'4294967295', so we must instead force the use of a register.
Signed-off-by: Andrew Murray <andrew.murray@arm.com>
Signed-off-by: Will Deacon <will@kernel.org>
Since commit 2f6ea23f63 ("arm64: KVM: Avoid marking pages as XN in
Stage-2 if CTR_EL0.DIC is set"), KVM has stopped marking normal memory
as execute-never at stage2 when the system supports D->I Coherency at
the PoU. This avoids KVM taking a trap when the page is first executed,
in order to clean it to PoU.
The patch that added this change also wrapped PAGE_S2_DEVICE mappings
up in this too. The upshot is, if your CPU caches support DIC ...
you can execute devices.
Revert the PAGE_S2_DEVICE change so PTE_S2_XN is always used
directly.
Fixes: 2f6ea23f63 ("arm64: KVM: Avoid marking pages as XN in Stage-2 if CTR_EL0.DIC is set")
Signed-off-by: James Morse <james.morse@arm.com>
Signed-off-by: Marc Zyngier <maz@kernel.org>
PAR_EL1 is a mysterious creature, but sometimes it's necessary to read
it when translating addresses in situations where we cannot walk the
page table directly.
Add a couple of system register definitions for the fault indication
field ('F') and the fault status code ('FST').
Reviewed-by: Mark Rutland <mark.rutland@arm.com>
Signed-off-by: Will Deacon <will@kernel.org>
Commit 6a4cbd63c25a ("Revert "arm64: Remove unnecessary ISBs from
set_{pte,pmd,pud}"") reintroduced ISB instructions to some of our
page table setter functions in light of a recent clarification to the
Armv8 architecture. Although 'set_pgd()' isn't currently used to update
a live page table, add the ISB instruction there too for consistency
with the other macros and to provide some future-proofing if we use it
on live tables in the future.
Reported-by: Mark Rutland <mark.rutland@arm.com>
Reviewed-by: Mark Rutland <mark.rutland@arm.com>
Signed-off-by: Will Deacon <will@kernel.org>
05f2d2f83b ("arm64: tlbflush: Introduce __flush_tlb_kernel_pgtable")
added a new TLB invalidation helper which is used when freeing
intermediate levels of page table used for kernel mappings, but is
missing the required ISB instruction after completion of the TLBI
instruction.
Add the missing barrier.
Cc: <stable@vger.kernel.org>
Fixes: 05f2d2f83b ("arm64: tlbflush: Introduce __flush_tlb_kernel_pgtable")
Reviewed-by: Mark Rutland <mark.rutland@arm.com>
Signed-off-by: Will Deacon <will@kernel.org>
This reverts commit 24fe1b0efa.
Commit 24fe1b0efa ("arm64: Remove unnecessary ISBs from
set_{pte,pmd,pud}") removed ISB instructions immediately following updates
to the page table, on the grounds that they are not required by the
architecture and a DSB alone is sufficient to ensure that subsequent data
accesses use the new translation:
DDI0487E_a, B2-128:
| ... no instruction that appears in program order after the DSB
| instruction can alter any state of the system or perform any part of
| its functionality until the DSB completes other than:
|
| * Being fetched from memory and decoded
| * Reading the general-purpose, SIMD and floating-point,
| Special-purpose, or System registers that are directly or indirectly
| read without causing side-effects.
However, the same document also states the following:
DDI0487E_a, B2-125:
| DMB and DSB instructions affect reads and writes to the memory system
| generated by Load/Store instructions and data or unified cache
| maintenance instructions being executed by the PE. Instruction fetches
| or accesses caused by a hardware translation table access are not
| explicit accesses.
which appears to claim that the DSB alone is insufficient. Unfortunately,
some CPU designers have followed the second clause above, whereas in Linux
we've been relying on the first. This means that our mapping sequence:
MOV X0, <valid pte>
STR X0, [Xptep] // Store new PTE to page table
DSB ISHST
LDR X1, [X2] // Translates using the new PTE
can actually raise a translation fault on the load instruction because the
translation can be performed speculatively before the page table update and
then marked as "faulting" by the CPU. For user PTEs, this is ok because we
can handle the spurious fault, but for kernel PTEs and intermediate table
entries this results in a panic().
Revert the offending commit to reintroduce the missing barriers.
Cc: <stable@vger.kernel.org>
Fixes: 24fe1b0efa ("arm64: Remove unnecessary ISBs from set_{pte,pmd,pud}")
Reviewed-by: Mark Rutland <mark.rutland@arm.com>
Signed-off-by: Will Deacon <will@kernel.org>
Currently in arm64, FDT is mapped to RO before it's passed to
early_init_dt_scan(). However, there might be some codes
(eg. commit "fdt: add support for rng-seed") that need to modify FDT
during init. Map FDT to RO after early fixups are done.
Signed-off-by: Hsin-Yi Wang <hsinyi@chromium.org>
Reviewed-by: Stephen Boyd <swboyd@chromium.org>
Reviewed-by: Mike Rapoport <rppt@linux.ibm.com>
Signed-off-by: Will Deacon <will@kernel.org>
When taking an SError or Debug exception from EL0, we run the C
handler for these exceptions before updating the context tracking
code and unmasking lower priority interrupts.
When booting with nohz_full lockdep tells us we got this wrong:
| =============================
| WARNING: suspicious RCU usage
| 5.3.0-rc2-00010-gb4b5e9dcb11b-dirty #11271 Not tainted
| -----------------------------
| include/linux/rcupdate.h:643 rcu_read_unlock() used illegally wh!
|
| other info that might help us debug this:
|
|
| RCU used illegally from idle CPU!
| rcu_scheduler_active = 2, debug_locks = 1
| RCU used illegally from extended quiescent state!
| 1 lock held by a.out/432:
| #0: 00000000c7a79515 (rcu_read_lock){....}, at: brk_handler+0x00
|
| stack backtrace:
| CPU: 1 PID: 432 Comm: a.out Not tainted 5.3.0-rc2-00010-gb4b5e9d1
| Hardware name: ARM LTD ARM Juno Development Platform/ARM Juno De8
| Call trace:
| dump_backtrace+0x0/0x140
| show_stack+0x14/0x20
| dump_stack+0xbc/0x104
| lockdep_rcu_suspicious+0xf8/0x108
| brk_handler+0x164/0x1b0
| do_debug_exception+0x11c/0x278
| el0_dbg+0x14/0x20
Moving the ct_user_exit calls to be before do_debug_exception() means
they are also before trace_hardirqs_off() has been updated. Add a new
ct_user_exit_irqoff macro to avoid the context-tracking code using
irqsave/restore before we've updated trace_hardirqs_off(). To be
consistent, do this everywhere.
The C helper is called enter_from_user_mode() to match x86 in the hope
we can merge them into kernel/context_tracking.c later.
Cc: Masami Hiramatsu <mhiramat@kernel.org>
Fixes: 6c81fe7925 ("arm64: enable context tracking")
Signed-off-by: James Morse <james.morse@arm.com>
Signed-off-by: Will Deacon <will@kernel.org>
The trusted OS may reject CPU_OFF calls to its resident CPU, so we must
avoid issuing those. We never migrate a Trusted OS and we already take
care to prevent CPU_OFF PSCI call. However, this is not reflected
explicitly to the userspace. Any user can attempt to hotplug trusted OS
resident CPU. The entire motion of going through the various state
transitions in the CPU hotplug state machine gets executed and the
PSCI layer finally refuses to make CPU_OFF call.
This results is unnecessary unwinding of CPU hotplug state machine in
the kernel. Instead we can mark the trusted OS resident CPU as not
available for hotplug, so that the user attempt or request to do the
same will get immediately rejected.
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Catalin Marinas <catalin.marinas@arm.com>
Cc: Will Deacon <will.deacon@arm.com>
Signed-off-by: Sudeep Holla <sudeep.holla@arm.com>
Signed-off-by: Will Deacon <will@kernel.org>
Strengthen the wording in the documentation for cpu_enable() to make it
more obvious to readers not already familiar with the code when the core
will call this callback and that this is intentional.
Signed-off-by: Mark Brown <broonie@kernel.org>
Reviewed-by: Suzuki K Poulose <suzuki.poulose@arm.com>
[will: minor tweak to emphasis in the comment]
Signed-off-by: Will Deacon <will@kernel.org>
Prior to commit:
14c127c957 ("arm64: mm: Flip kernel VA space")
... VA_START described the start of the TTBR1 address space for a given
VA size described by VA_BITS, where all kernel mappings began.
Since that commit, VA_START described a portion midway through the
address space, where the linear map ends and other kernel mappings
begin.
To avoid confusion, let's rename VA_START to PAGE_END, making it clear
that it's not the start of the TTBR1 address space and implying that
it's related to PAGE_OFFSET. Comments and other mnemonics are updated
accordingly, along with a typo fix in the decription of VMEMMAP_SIZE.
There should be no functional change as a result of this patch.
Signed-off-by: Mark Rutland <mark.rutland@arm.com>
Cc: Catalin Marinas <catalin.marinas@arm.com>
Tested-by: Steve Capper <steve.capper@arm.com>
Reviewed-by: Steve Capper <steve.capper@arm.com>
Signed-off-by: Will Deacon <will@kernel.org>
Cleanup memory.h so that the indentation is consistent, remove pointless
line-wrapping and use consistent parameter names for different versions
of the same macro.
Reviewed-by: Steve Capper <steve.capper@arm.com>
Acked-by: Catalin Marinas <catalin.marinas@arm.com>
Reviewed-by: Mark Rutland <mark.rutland@arm.com>
Signed-off-by: Will Deacon <will@kernel.org>
Commenting the #endif of a multi-statement #ifdef block with the
condition which guards it is useful and can save having to scroll back
through the file to figure out which set of Kconfig options apply to
a particular piece of code.
Reviewed-by: Steve Capper <steve.capper@arm.com>
Acked-by: Catalin Marinas <catalin.marinas@arm.com>
Reviewed-by: Mark Rutland <mark.rutland@arm.com>
Signed-off-by: Will Deacon <will@kernel.org>
There's no need for __tag_set() to be a complicated macro when
CONFIG_KASAN_SW_TAGS=y and a simple static inline otherwise. Rewrite
the thing as a common static inline function.
Tested-by: Steve Capper <steve.capper@arm.com>
Reviewed-by: Steve Capper <steve.capper@arm.com>
Tested-by: Geert Uytterhoeven <geert+renesas@glider.be>
Reviewed-by: Catalin Marinas <catalin.marinas@arm.com>
Reviewed-by: Mark Rutland <mark.rutland@arm.com>
Signed-off-by: Will Deacon <will@kernel.org>
Rather than subtracting from -1 and then adding 1, we can simply
subtract from 0.
Tested-by: Steve Capper <steve.capper@arm.com>
Reviewed-by: Steve Capper <steve.capper@arm.com>
Tested-by: Geert Uytterhoeven <geert+renesas@glider.be>
Reviewed-by: Catalin Marinas <catalin.marinas@arm.com>
Signed-off-by: Will Deacon <will@kernel.org>
Build virt_to_page() on top of virt_to_pfn() so we can avoid the need
for explicit shifting.
Tested-by: Steve Capper <steve.capper@arm.com>
Reviewed-by: Steve Capper <steve.capper@arm.com>
Tested-by: Geert Uytterhoeven <geert+renesas@glider.be>
Reviewed-by: Catalin Marinas <catalin.marinas@arm.com>
Reviewed-by: Mark Rutland <mark.rutland@arm.com>
Signed-off-by: Will Deacon <will@kernel.org>
The default implementations of page_to_virt() and virt_to_page() are
fairly confusing to read and the former evaluates its 'page' parameter
twice in the macro
Rewrite them so that the computation is expressed as 'base + index' in
both cases and the parameter is always evaluated exactly once.
Tested-by: Steve Capper <steve.capper@arm.com>
Reviewed-by: Steve Capper <steve.capper@arm.com>
Tested-by: Geert Uytterhoeven <geert+renesas@glider.be>
Reviewed-by: Catalin Marinas <catalin.marinas@arm.com>
Signed-off-by: Will Deacon <will@kernel.org>
When converting a linear virtual address to a physical address, pfn or
struct page *, we must make sure that the tag bits are masked before the
calculation otherwise we end up with corrupt pointers when running with
CONFIG_KASAN_SW_TAGS=y:
| Unable to handle kernel paging request at virtual address 0037fe0007580d08
| [0037fe0007580d08] address between user and kernel address ranges
Mask out the tag in __virt_to_phys_nodebug() and virt_to_page().
Reported-by: Qian Cai <cai@lca.pw>
Reported-by: Geert Uytterhoeven <geert@linux-m68k.org>
Tested-by: Steve Capper <steve.capper@arm.com>
Reviewed-by: Steve Capper <steve.capper@arm.com>
Tested-by: Geert Uytterhoeven <geert+renesas@glider.be>
Reviewed-by: Catalin Marinas <catalin.marinas@arm.com>
Reviewed-by: Mark Rutland <mark.rutland@arm.com>
Fixes: 9cb1c5ddd2 ("arm64: mm: Remove bit-masking optimisations for PAGE_OFFSET and VMEMMAP_START")
Signed-off-by: Will Deacon <will@kernel.org>
virt_addr_valid() is intended to test whether or not the passed address
is a valid linear map address. Unfortunately, it relies on
_virt_addr_is_linear() which is broken because it assumes the linear
map is at the top of the address space, which it no longer is.
Reimplement virt_addr_valid() using __is_lm_address() and remove
_virt_addr_is_linear() entirely. At the same time, ensure we evaluate
the macro parameter only once and move it within the __ASSEMBLY__ block.
Reported-by: Qian Cai <cai@lca.pw>
Reported-by: Geert Uytterhoeven <geert@linux-m68k.org>
Tested-by: Steve Capper <steve.capper@arm.com>
Reviewed-by: Steve Capper <steve.capper@arm.com>
Tested-by: Geert Uytterhoeven <geert+renesas@glider.be>
Reviewed-by: Catalin Marinas <catalin.marinas@arm.com>
Reviewed-by: Mark Rutland <mark.rutland@arm.com>
Fixes: 14c127c957 ("arm64: mm: Flip kernel VA space")
Signed-off-by: Will Deacon <will@kernel.org>
Pull in generic CPU topology changes from Paul Walmsley (RISC-V).
* tag 'common/for-v5.4-rc1/cpu-topology' of git://git.kernel.org/pub/scm/linux/kernel/git/riscv/linux:
MAINTAINERS: Add an entry for generic architecture topology
base: arch_topology: update Kconfig help description
RISC-V: Parse cpu topology during boot.
arm: Use common cpu_topology structure and functions.
cpu-topology: Move cpu topology code to common code.
dt-binding: cpu-topology: Move cpu-map to a common binding.
Documentation: DT: arm: add support for sockets defining package boundaries
GCC unescapes escaped string section names while Clang does not. Because
__section uses the `#` stringification operator for the section name, it
doesn't need to be escaped.
This antipattern was found with:
$ grep -e __section\(\" -e __section__\(\" -r
Reported-by: Sedat Dilek <sedat.dilek@gmail.com>
Suggested-by: Josh Poimboeuf <jpoimboe@redhat.com>
Signed-off-by: Nick Desaulniers <ndesaulniers@google.com>
Signed-off-by: Will Deacon <will@kernel.org>
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v2.0.22 (GNU/Linux)
iQEcBAABAgAGBQJdTfRfAAoJEL/70l94x66DcN0IAIwyaU2+kwP0jd2miQuKxgwl
WU4u7dZCoQC6meWEVmrSJIVMBONRubmZ9iCqT7807YP8YZSQpOth51FMbULUWuy1
VW1eaRwqidX0EAihDhg2ZbBZ8H6RQ9Fn0aiEEh44dAZZAwGSVnO3PRKvQEJ15xjk
q+OQ4hrxtoorwLj+myejmq3YenTFTCMMJfYwwvlCl+J1FfrLZi5k3X5Gjk+j8Ixd
8CL8/6u5Lu6MCgfYVvxvo8/bUPiATBdF1sWJMMALwXTrDiSy4tQRD0NvZP1HM8G1
hy0XnhgtsS9rWNLtAFOj+r/XhP9V5lOOGX8yBcj0XQQr+DC9MG6MCL+pXXOaMcA=
=ZZh8
-----END PGP SIGNATURE-----
Merge tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm
Pull kvm fixes from Paolo Bonzini:
"Bugfixes (arm and x86) and cleanups"
* tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm:
selftests: kvm: Adding config fragments
KVM: selftests: Update gitignore file for latest changes
kvm: remove unnecessary PageReserved check
KVM: arm/arm64: vgic: Reevaluate level sensitive interrupts on enable
KVM: arm: Don't write junk to CP15 registers on reset
KVM: arm64: Don't write junk to sysregs on reset
KVM: arm/arm64: Sync ICH_VMCR_EL2 back when about to block
x86: kvm: remove useless calls to kvm_para_available
KVM: no need to check return value of debugfs_create functions
KVM: remove kvm_arch_has_vcpu_debugfs()
KVM: Fix leak vCPU's VMCS value into other pCPU
KVM: Check preempted_in_kernel for involuntary preemption
KVM: LAPIC: Don't need to wakeup vCPU twice afer timer fire
arm64: KVM: hyp: debug-sr: Mark expected switch fall-through
KVM: arm64: Update kvm_arm_exception_class and esr_class_str for new EC
KVM: arm: vgic-v3: Mark expected switch fall-through
arm64: KVM: regmap: Fix unexpected switch fall-through
KVM: arm/arm64: Introduce kvm_pmu_vcpu_init() to setup PMU counter index
untagged_addr() can be called with a '__user' pointer parameter and must
therefore use '__force' casts both when passing this parameter through
to sign_extend64() as a 'u64', but also when casting the 's64' return
value back to the '__user' pointer type.
Signed-off-by: Will Deacon <will@kernel.org>
_virt_addr_valid() is defined as the same value in two places and rolls
its own version of virt_to_pfn() in both cases.
Consolidate these definitions by inlining a simplified version directly
into virt_addr_valid().
Signed-off-by: Will Deacon <will@kernel.org>
Previous patches have enabled 52-bit kernel + user VAs and there is no
longer any scenario where user VA != kernel VA size.
This patch removes the, now redundant, vabits_user variable and replaces
usage with vabits_actual where appropriate.
Reviewed-by: Catalin Marinas <catalin.marinas@arm.com>
Signed-off-by: Steve Capper <steve.capper@arm.com>
Signed-off-by: Will Deacon <will@kernel.org>
Most of the machinery is now in place to enable 52-bit kernel VAs that
are detectable at boot time.
This patch adds a Kconfig option for 52-bit user and kernel addresses
and plumbs in the requisite CONFIG_ macros as well as sets TCR.T1SZ,
physvirt_offset and vmemmap at early boot.
To simplify things this patch also removes the 52-bit user/48-bit kernel
kconfig option.
Signed-off-by: Steve Capper <steve.capper@arm.com>
Reviewed-by: Catalin Marinas <catalin.marinas@arm.com>
Signed-off-by: Will Deacon <will@kernel.org>
In a later patch we will need to have a slightly larger VMEMMAP region
to accommodate boot time selection between 48/52-bit kernel VAs.
This patch modifies the formula for computing VMEMMAP_SIZE to depend
explicitly on the PAGE_OFFSET and start of kernel addressable memory.
(This allows for a slightly larger direct linear map in future).
Signed-off-by: Steve Capper <steve.capper@arm.com>
Reviewed-by: Catalin Marinas <catalin.marinas@arm.com>
Signed-off-by: Will Deacon <will@kernel.org>
vmemmap is a preprocessor definition that depends on a variable,
memstart_addr. In a later patch we will need to expand the size of
the VMEMMAP region and optionally modify vmemmap depending upon
whether or not hardware support is available for 52-bit virtual
addresses.
This patch changes vmemmap to be a variable. As the old definition
depended on a variable load, this should not affect performance
noticeably.
Signed-off-by: Steve Capper <steve.capper@arm.com>
Reviewed-by: Catalin Marinas <catalin.marinas@arm.com>
Signed-off-by: Will Deacon <will@kernel.org>
When running with a 52-bit userspace VA and a 48-bit kernel VA we offset
ttbr1_el1 to allow the kernel pagetables with a 52-bit PTRS_PER_PGD to
be used for both userspace and kernel.
Moving on to a 52-bit kernel VA we no longer require this offset to
ttbr1_el1 should we be running on a system with HW support for 52-bit
VAs.
This patch introduces conditional logic to offset_ttbr1 to query
SYS_ID_AA64MMFR2_EL1 whenever 52-bit VAs are selected. If there is HW
support for 52-bit VAs then the ttbr1 offset is skipped.
We choose to read a system register rather than vabits_actual because
offset_ttbr1 can be called in places where the kernel data is not
actually mapped.
Calls to offset_ttbr1 appear to be made from rarely called code paths so
this extra logic is not expected to adversely affect performance.
Signed-off-by: Steve Capper <steve.capper@arm.com>
Reviewed-by: Catalin Marinas <catalin.marinas@arm.com>
Signed-off-by: Will Deacon <will@kernel.org>
In order to support 52-bit kernel addresses detectable at boot time, one
needs to know the actual VA_BITS detected. A new variable vabits_actual
is introduced in this commit and employed for the KVM hypervisor layout,
KASAN, fault handling and phys-to/from-virt translation where there
would normally be compile time constants.
In order to maintain performance in phys_to_virt, another variable
physvirt_offset is introduced.
Reviewed-by: Catalin Marinas <catalin.marinas@arm.com>
Signed-off-by: Steve Capper <steve.capper@arm.com>
Signed-off-by: Will Deacon <will@kernel.org>
In order to support 52-bit kernel addresses detectable at boot time, the
kernel needs to know the most conservative VA_BITS possible should it
need to fall back to this quantity due to lack of hardware support.
A new compile time constant VA_BITS_MIN is introduced in this patch and
it is employed in the KASAN end address, KASLR, and EFI stub.
For Arm, if 52-bit VA support is unavailable the fallback is to 48-bits.
In other words: VA_BITS_MIN = min (48, VA_BITS)
Reviewed-by: Catalin Marinas <catalin.marinas@arm.com>
Signed-off-by: Steve Capper <steve.capper@arm.com>
Signed-off-by: Will Deacon <will@kernel.org>
KASAN_SHADOW_OFFSET is a constant that is supplied to gcc as a command
line argument and affects the codegen of the inline address sanetiser.
Essentially, for an example memory access:
*ptr1 = val;
The compiler will insert logic similar to the below:
shadowValue = *(ptr1 >> KASAN_SHADOW_SCALE_SHIFT + KASAN_SHADOW_OFFSET)
if (somethingWrong(shadowValue))
flagAnError();
This code sequence is inserted into many places, thus
KASAN_SHADOW_OFFSET is essentially baked into many places in the kernel
text.
If we want to run a single kernel binary with multiple address spaces,
then we need to do this with KASAN_SHADOW_OFFSET fixed.
Thankfully, due to the way the KASAN_SHADOW_OFFSET is used to provide
shadow addresses we know that the end of the shadow region is constant
w.r.t. VA space size:
KASAN_SHADOW_END = ~0 >> KASAN_SHADOW_SCALE_SHIFT + KASAN_SHADOW_OFFSET
This means that if we increase the size of the VA space, the start of
the KASAN region expands into lower addresses whilst the end of the
KASAN region is fixed.
Currently the arm64 code computes KASAN_SHADOW_OFFSET at build time via
build scripts with the VA size used as a parameter. (There are build
time checks in the C code too to ensure that expected values are being
derived). It is sufficient, and indeed is a simplification, to remove
the build scripts (and build time checks) entirely and instead provide
KASAN_SHADOW_OFFSET values.
This patch removes the logic to compute the KASAN_SHADOW_OFFSET in the
arm64 Makefile, and instead we adopt the approach used by x86 to supply
offset values in kConfig. To help debug/develop future VA space changes,
the Makefile logic has been preserved in a script file in the arm64
Documentation folder.
Reviewed-by: Catalin Marinas <catalin.marinas@arm.com>
Signed-off-by: Steve Capper <steve.capper@arm.com>
Signed-off-by: Will Deacon <will@kernel.org>
In order to allow for a KASAN shadow that changes size at boot time, one
must fix the KASAN_SHADOW_END for both 48 & 52-bit VAs and "grow" the
start address. Also, it is highly desirable to maintain the same
function addresses in the kernel .text between VA sizes. Both of these
requirements necessitate us to flip the kernel address space halves s.t.
the direct linear map occupies the lower addresses.
This patch puts the direct linear map in the lower addresses of the
kernel VA range and everything else in the higher ranges.
We need to adjust:
*) KASAN shadow region placement logic,
*) KASAN_SHADOW_OFFSET computation logic,
*) virt_to_phys, phys_to_virt checks,
*) page table dumper.
These are all small changes, that need to take place atomically, so they
are bundled into this commit.
As part of the re-arrangement, a guard region of 2MB (to preserve
alignment for fixed map) is added after the vmemmap. Otherwise the
vmemmap could intersect with IS_ERR pointers.
Reviewed-by: Catalin Marinas <catalin.marinas@arm.com>
Signed-off-by: Steve Capper <steve.capper@arm.com>
Signed-off-by: Will Deacon <will@kernel.org>
Currently there are assumptions about the alignment of VMEMMAP_START
and PAGE_OFFSET that won't be valid after this series is applied.
These assumptions are in the form of bitwise operators being used
instead of addition and subtraction when calculating addresses.
This patch replaces these bitwise operators with addition/subtraction.
Signed-off-by: Steve Capper <steve.capper@arm.com>
Reviewed-by: Catalin Marinas <catalin.marinas@arm.com>
Signed-off-by: Will Deacon <will@kernel.org>
The commit d5370f7548 ("arm64: prefetch: add alternative pattern for
CPUs without a prefetcher") introduced MIDR_IS_CPU_MODEL_RANGE() to be
used in has_no_hw_prefetch() with rv_min=0 which generates a compilation
warning from GCC,
In file included from ./arch/arm64/include/asm/cache.h:8,
from ./include/linux/cache.h:6,
from ./include/linux/printk.h:9,
from ./include/linux/kernel.h:15,
from ./include/linux/cpumask.h:10,
from arch/arm64/kernel/cpufeature.c:11:
arch/arm64/kernel/cpufeature.c: In function 'has_no_hw_prefetch':
./arch/arm64/include/asm/cputype.h:59:26: warning: comparison of
unsigned expression >= 0 is always true [-Wtype-limits]
_model == (model) && rv >= (rv_min) && rv <= (rv_max); \
^~
arch/arm64/kernel/cpufeature.c:889:9: note: in expansion of macro
'MIDR_IS_CPU_MODEL_RANGE'
return MIDR_IS_CPU_MODEL_RANGE(midr, MIDR_THUNDERX,
^~~~~~~~~~~~~~~~~~~~~~~
Fix it by converting MIDR_IS_CPU_MODEL_RANGE to a static inline
function.
Signed-off-by: Qian Cai <cai@lca.pw>
Signed-off-by: Will Deacon <will@kernel.org>
Inspired by the commit 7cd01b08d3 ("powerpc: Add support for function
error injection"), this patch supports function error injection for
Arm64.
This patch mainly support two functions: one is regs_set_return_value()
which is used to overwrite the return value; the another function is
override_function_with_return() which is to override the probed
function returning and jump to its caller.
Reviewed-by: Masami Hiramatsu <mhiramat@kernel.org>
Signed-off-by: Leo Yan <leo.yan@linaro.org>
Signed-off-by: Will Deacon <will@kernel.org>
It is not desirable to relax the ABI to allow tagged user addresses into
the kernel indiscriminately. This patch introduces a prctl() interface
for enabling or disabling the tagged ABI with a global sysctl control
for preventing applications from enabling the relaxed ABI (meant for
testing user-space prctl() return error checking without reconfiguring
the kernel). The ABI properties are inherited by threads of the same
application and fork()'ed children but cleared on execve(). A Kconfig
option allows the overall disabling of the relaxed ABI.
The PR_SET_TAGGED_ADDR_CTRL will be expanded in the future to handle
MTE-specific settings like imprecise vs precise exceptions.
Reviewed-by: Kees Cook <keescook@chromium.org>
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
Signed-off-by: Andrey Konovalov <andreyknvl@google.com>
Signed-off-by: Will Deacon <will@kernel.org>
This patch is a part of a series that extends kernel ABI to allow to pass
tagged user pointers (with the top byte set to something else other than
0x00) as syscall arguments.
copy_from_user (and a few other similar functions) are used to copy data
from user memory into the kernel memory or vice versa. Since a user can
provided a tagged pointer to one of the syscalls that use copy_from_user,
we need to correctly handle such pointers.
Do this by untagging user pointers in access_ok and in __uaccess_mask_ptr,
before performing access validity checks.
Note, that this patch only temporarily untags the pointers to perform the
checks, but then passes them as is into the kernel internals.
Reviewed-by: Vincenzo Frascino <vincenzo.frascino@arm.com>
Reviewed-by: Kees Cook <keescook@chromium.org>
Reviewed-by: Catalin Marinas <catalin.marinas@arm.com>
Signed-off-by: Andrey Konovalov <andreyknvl@google.com>
[will: Add __force to casting in untagged_addr() to kill sparse warning]
Signed-off-by: Will Deacon <will@kernel.org>
Some TIF_* flags are documented in the comment block at the top, some
next to their definitions, some in both places.
Move all documentation to the individual definitions for consistency,
and for easy lookup.
Acked-by: Mark Rutland <mark.rutland@arm.com>
Signed-off-by: Geert Uytterhoeven <geert+renesas@glider.be>
Signed-off-by: Will Deacon <will@kernel.org>
The arm64 implementation of the default I/O accessors requires barrier
instructions to satisfy the memory ordering requirements documented in
memory-barriers.txt [1], which are largely derived from the behaviour of
I/O accesses on x86.
Of particular interest are the requirements that a write to a device
must be ordered against prior writes to memory, and a read from a device
must be ordered against subsequent reads from memory. We satisfy these
requirements using various flavours of DSB: the most expensive barrier
we have, since it implies completion of prior accesses. This was deemed
necessary when we first implemented the accessors, since accesses to
different endpoints could propagate independently and therefore the only
way to enforce order is to rely on completion guarantees [2].
Since then, the Armv8 memory model has been retrospectively strengthened
to require "other-multi-copy atomicity", a property that requires memory
accesses from an observer to become visible to all other observers
simultaneously [3]. In other words, propagation of accesses is limited
to transitioning from locally observed to globally observed. It recently
became apparent that this change also has a subtle impact on our I/O
accessors for shared peripherals, allowing us to use the cheaper DMB
instruction instead.
As a concrete example, consider the following:
memcpy(dma_buffer, data, bufsz);
writel(DMA_START, dev->ctrl_reg);
A DMB ST instruction between the final write to the DMA buffer and the
write to the control register will ensure that the writes to the DMA
buffer are observed before the write to the control register by all
observers. Put another way, if an observer can see the write to the
control register, it can also see the writes to memory. This has always
been the case and is not sufficient to provide the ordering required by
Linux, since there is no guarantee that the master interface of the
DMA-capable device has observed either of the accesses. However, in an
other-multi-copy atomic world, we can infer two things:
1. A write arriving at an endpoint shared between multiple CPUs is
visible to all CPUs
2. A write that is visible to all CPUs is also visible to all other
observers in the shareability domain
Pieced together, this allows us to use DMB OSHST for our default I/O
write accessors and DMB OSHLD for our default I/O read accessors (the
outer-shareability is for handling non-cacheable mappings) for shared
devices. Memory-mapped, DMA-capable peripherals that are private to a
CPU (i.e. inaccessible to other CPUs) still require the DSB, however
these are few and far between and typically require special treatment
anyway which is outside of the scope of the portable driver API (e.g.
GIC, page-table walker, SPE profiler).
Note that our mandatory barriers remain as DSBs, since there are cases
where they are used to flush the store buffer of the CPU, e.g. when
publishing page table updates to the SMMU.
[1] https://git.kernel.org/linus/4614bbdee357
[2] https://www.youtube.com/watch?v=i6DayghhA8Q
[3] https://www.cl.cam.ac.uk/~pes20/armv8-mca/
Reviewed-by: Catalin Marinas <catalin.marinas@arm.com>
Signed-off-by: Will Deacon <will.deacon@arm.com>
The function cpucap_multi_entry_cap_cpu_enable() is unused, remove it to
avoid any confusion reading the code and potential for bit rot.
Signed-off-by: Mark Brown <broonie@kernel.org>
Signed-off-by: Will Deacon <will@kernel.org>
Our SCTLR_ELx field definitions are somewhat over-engineered in that
they carefully define masks describing the RES0/RES1 bits and then use
these to construct further masks representing bits to be set/cleared for
the _EL1 and _EL2 registers.
However, most of the resulting definitions aren't actually used by
anybody and have subsequently started to bit-rot when new fields have
been added by the architecture, resulting in fields being part of the
RES0 mask despite being defined and used elsewhere.
Rather than fix up these masks, simply remove the unused parts entirely
so that we can drop the maintenance burden. We can always add things
back if we need them in the future.
Acked-by: Mark Rutland <mark.rutland@arm.com>
Signed-off-by: Will Deacon <will@kernel.org>
The ESR.EC encoding of 0b011010 (0x1a) describes an exception generated
by an ERET, ERETAA or ERETAB instruction as a result of a nested
virtualisation trap to EL2.
Add an encoding for this EC and a string description so that we identify
it correctly if we take one unexpectedly.
Acked-by: Mark Rutland <mark.rutland@arm.com>
Acked-by: Marc Zyngier <maz@kernel.org>
Signed-off-by: Will Deacon <will@kernel.org>
stat.h is listed in include/uapi/asm-generic/Kbuild, so Kbuild will
automatically generate it.
Signed-off-by: Masahiro Yamada <yamada.masahiro@socionext.com>
Signed-off-by: Will Deacon <will@kernel.org>
For a number of years, UAPI headers have been split from kernel-internal
headers. The latter are never exposed to userspace, and always built
with __KERNEL__ defined.
Most headers under arch/arm64 don't have __KERNEL__ guards, but there
are a few stragglers lying around. To make things more consistent, and
to set a good example going forward, let's remove these redundant
__KERNEL__ guards.
In a couple of cases, a trailing #endif lacked a comment describing its
corresponding #if or #ifdef, so these are fixes up at the same time.
Guards in auto-generated crypto code are left as-is, as these guards are
generated by scripting imported from the upstream openssl project
scripts. Guards in UAPI headers are left as-is, as these can be included
by userspace or the kernel.
There should be no functional change as a result of this patch.
Signed-off-by: Mark Rutland <mark.rutland@arm.com>
Cc: Ard Biesheuvel <ard.biesheuvel@linaro.org>
Cc: Catalin Marinas <catalin.marinas@arm.com>
Cc: Will Deacon <will.deacon@arm.com>
Signed-off-by: Will Deacon <will@kernel.org>
As of commit 4141c857fd ("arm64: convert
raw syscall invocation to C"), moving syscall handling from assembly to
C, the macro mask_nospec64 is no longer referenced.
Acked-by: Mark Rutland <mark.rutland@arm.com>
Signed-off-by: Julien Thierry <julien.thierry@arm.com>
Signed-off-by: Will Deacon <will@kernel.org>
Pull vdso timer fixes from Thomas Gleixner:
"A series of commits to deal with the regression caused by the generic
VDSO implementation.
The usage of clock_gettime64() for 32bit compat fallback syscalls
caused seccomp filters to kill innocent processes because they only
allow clock_gettime().
Handle the compat syscalls with clock_gettime() as before, which is
not a functional problem for the VDSO as the legacy compat application
interface is not y2038 safe anyway. It's just extra fallback code
which needs to be implemented on every architecture.
It's opt in for now so that it does not break the compile of already
converted architectures in linux-next. Once these are fixed, the
#ifdeffery goes away.
So much for trying to be smart and reuse code..."
* 'timers-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
arm64: compat: vdso: Use legacy syscalls as fallback
x86/vdso/32: Use 32bit syscall fallback
lib/vdso/32: Provide legacy syscall fallbacks
lib/vdso: Move fallback invocation to the callers
lib/vdso/32: Remove inconsistent NULL pointer checks
kprobes manipulates the interrupted PSTATE for single step, and
doesn't restore it. Thus, if we put a kprobe where the pstate.D
(debug) masked, the mask will be cleared after the kprobe hits.
Moreover, in the most complicated case, this can lead a kernel
crash with below message when a nested kprobe hits.
[ 152.118921] Unexpected kernel single-step exception at EL1
When the 1st kprobe hits, do_debug_exception() will be called.
At this point, debug exception (= pstate.D) must be masked (=1).
But if another kprobes hits before single-step of the first kprobe
(e.g. inside user pre_handler), it unmask the debug exception
(pstate.D = 0) and return.
Then, when the 1st kprobe setting up single-step, it saves current
DAIF, mask DAIF, enable single-step, and restore DAIF.
However, since "D" flag in DAIF is cleared by the 2nd kprobe, the
single-step exception happens soon after restoring DAIF.
This has been introduced by commit 7419333fa1 ("arm64: kprobe:
Always clear pstate.D in breakpoint exception handler")
To solve this issue, this stores all DAIF bits and restore it
after single stepping.
Reported-by: Naresh Kamboju <naresh.kamboju@linaro.org>
Fixes: 7419333fa1 ("arm64: kprobe: Always clear pstate.D in breakpoint exception handler")
Reviewed-by: James Morse <james.morse@arm.com>
Tested-by: James Morse <james.morse@arm.com>
Signed-off-by: Masami Hiramatsu <mhiramat@kernel.org>
Signed-off-by: Will Deacon <will@kernel.org>
When CONFIG_KASAN_SW_TAGS=n, set_tag() is compiled away. GCC throws a
warning,
mm/kasan/common.c: In function '__kasan_kmalloc':
mm/kasan/common.c:464:5: warning: variable 'tag' set but not used
[-Wunused-but-set-variable]
u8 tag = 0xff;
^~~
Fix it by making __tag_set() a static inline function the same as
arch_kasan_set_tag() in mm/kasan/kasan.h for consistency because there
is a macro in arch/arm64/include/asm/kasan.h,
#define arch_kasan_set_tag(addr, tag) __tag_set(addr, tag)
However, when CONFIG_DEBUG_VIRTUAL=n and CONFIG_SPARSEMEM_VMEMMAP=y,
page_to_virt() will call __tag_set() with incorrect type of a
parameter, so fix that as well. Also, still let page_to_virt() return
"void *" instead of "const void *", so will not need to add a similar
cast in lowmem_page_address().
Signed-off-by: Qian Cai <cai@lca.pw>
Signed-off-by: Will Deacon <will@kernel.org>
GCC throws a warning,
arch/arm64/mm/mmu.c: In function 'pud_free_pmd_page':
arch/arm64/mm/mmu.c:1033:8: warning: variable 'pud' set but not used
[-Wunused-but-set-variable]
pud_t pud;
^~~
because pud_table() is a macro and compiled away. Fix it by making it a
static inline function and for pud_sect() as well.
Signed-off-by: Qian Cai <cai@lca.pw>
Signed-off-by: Will Deacon <will@kernel.org>
On a system with two security states, if SCR_EL3.FIQ is cleared,
non-secure IRQ priorities get shifted to fit the secure view but
priority masks aren't.
On such system, it turns out that GIC_PRIO_IRQON masks the priority of
normal interrupts, which obviously ends up in a hang.
Increase GIC_PRIO_IRQON value (i.e. lower priority) to make sure
interrupts are not blocked by it.
Cc: Oleg Nesterov <oleg@redhat.com>
Fixes: bd82d4bd21 ("arm64: Fix incorrect irqflag restore for priority masking")
Acked-by: Marc Zyngier <marc.zyngier@arm.com>
Signed-off-by: Julien Thierry <julien.thierry.kdev@gmail.com>
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
[will: fixed Fixes: tag]
Signed-off-by: Will Deacon <will@kernel.org>
GCC throws out this warning on arm64.
drivers/firmware/efi/libstub/arm-stub.c: In function 'efi_entry':
drivers/firmware/efi/libstub/arm-stub.c:132:22: warning: variable 'si'
set but not used [-Wunused-but-set-variable]
Fix it by making free_screen_info() a static inline function.
Acked-by: Will Deacon <will@kernel.org>
Signed-off-by: Qian Cai <cai@lca.pw>
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
If CTR_EL0.{CWG,ERG} are 0b0000 then they must be interpreted to have
their architecturally maximum values, which defeats the use of
FTR_HIGHER_SAFE when sanitising CPU ID registers on heterogeneous
machines.
Introduce FTR_HIGHER_OR_ZERO_SAFE so that these fields effectively
saturate at zero.
Fixes: 3c739b5710 ("arm64: Keep track of CPU feature registers")
Cc: <stable@vger.kernel.org> # 4.4.x-
Reviewed-by: Suzuki K Poulose <suzuki.poulose@arm.com>
Acked-by: Mark Rutland <mark.rutland@arm.com>
Signed-off-by: Will Deacon <will@kernel.org>
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
The generic VDSO implementation uses the Y2038 safe clock_gettime64() and
clock_getres_time64() syscalls as fallback for 32bit VDSO. This breaks
seccomp setups because these syscalls might be not (yet) allowed.
Implement the 32bit variants which use the legacy syscalls and select the
variant in the core library.
The 64bit time variants are not removed because they are required for the
time64 based vdso accessors.
Fixes: 00b26474c2 ("lib/vdso: Provide generic VDSO implementation")
Reported-by: Sean Christopherson <sean.j.christopherson@intel.com>
Reported-by: Paul Bolle <pebolle@tiscali.nl>
Suggested-by: Andy Lutomirski <luto@kernel.org>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Tested-by: Vincenzo Frascino <vincenzo.frascino@arm.com>
Reviewed-by: Vincenzo Frascino <vincenzo.frascino@arm.com>
Link: https://lkml.kernel.org/r/20190728131648.971361611@linutronix.de
Here are some small SPDX fixes for 5.3-rc2 for things that came in
during the 5.3-rc1 merge window that we previously missed.
Only 3 small patches here:
- 2 uapi patches to resolve some SPDX tags that were not correct
- fix an invalid SPDX tag in the iomap Makefile file
All have been properly reviewed on the public mailing lists.
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
-----BEGIN PGP SIGNATURE-----
iG0EABECAC0WIQT0tgzFv3jCIUoxPcsxR9QN2y37KQUCXT2N9w8cZ3JlZ0Brcm9h
aC5jb20ACgkQMUfUDdst+ylY9wCeJIYfs/eNf3tsjLQXxUBMYAJNqnsAn2IaMiTt
cv2mck7JZm5KyHpP3f5N
=RSZa
-----END PGP SIGNATURE-----
Merge tag 'spdx-5.3-rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/spdx
Pull SPDX fixes from Greg KH:
"Here are some small SPDX fixes for 5.3-rc2 for things that came in
during the 5.3-rc1 merge window that we previously missed.
Only three small patches here:
- two uapi patches to resolve some SPDX tags that were not correct
- fix an invalid SPDX tag in the iomap Makefile file
All have been properly reviewed on the public mailing lists"
* tag 'spdx-5.3-rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/spdx:
iomap: fix Invalid License ID
treewide: remove SPDX "WITH Linux-syscall-note" from kernel-space headers again
treewide: add "WITH Linux-syscall-note" to SPDX tag of uapi headers
We've added two ESR exception classes for new ARM hardware extensions:
ESR_ELx_EC_PAC and ESR_ELx_EC_SVE, but failed to update the strings
used in tracing and other debug.
Let's update "kvm_arm_exception_class" for these two EC, which the
new EC will be visible to user-space via kvm_exit trace events
Also update to "esr_class_str" for ESR_ELx_EC_PAC, by which we can
get more readable debug info.
Cc: Marc Zyngier <marc.zyngier@arm.com>
Cc: Catalin Marinas <catalin.marinas@arm.com>
Cc: Will Deacon <will@kernel.org>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Dave Martin <Dave.Martin@arm.com>
Reviewed-by: James Morse <james.morse@arm.com>
Signed-off-by: Zenghui Yu <yuzenghui@huawei.com>
Signed-off-by: Marc Zyngier <maz@kernel.org>
UAPI headers licensed under GPL are supposed to have exception
"WITH Linux-syscall-note" so that they can be included into non-GPL
user space application code.
The exception note is missing in some UAPI headers.
Some of them slipped in by the treewide conversion commit b24413180f
("License cleanup: add SPDX GPL-2.0 license identifier to files with
no license"). Just run:
$ git show --oneline b24413180f -- arch/x86/include/uapi/asm/
I believe they are not intentional, and should be fixed too.
This patch was generated by the following script:
git grep -l --not -e Linux-syscall-note --and -e SPDX-License-Identifier \
-- :arch/*/include/uapi/asm/*.h :include/uapi/ :^*/Kbuild |
while read file
do
sed -i -e '/[[:space:]]OR[[:space:]]/s/\(GPL-[^[:space:]]*\)/(\1 WITH Linux-syscall-note)/g' \
-e '/[[:space:]]or[[:space:]]/s/\(GPL-[^[:space:]]*\)/(\1 WITH Linux-syscall-note)/g' \
-e '/[[:space:]]OR[[:space:]]/!{/[[:space:]]or[[:space:]]/!s/\(GPL-[^[:space:]]*\)/\1 WITH Linux-syscall-note/g}' $file
done
After this patch is applied, there are 5 UAPI headers that do not contain
"WITH Linux-syscall-note". They are kept untouched since this exception
applies only to GPL variants.
$ git grep --not -e Linux-syscall-note --and -e SPDX-License-Identifier \
-- :arch/*/include/uapi/asm/*.h :include/uapi/ :^*/Kbuild
include/uapi/drm/panfrost_drm.h:/* SPDX-License-Identifier: MIT */
include/uapi/linux/batman_adv.h:/* SPDX-License-Identifier: MIT */
include/uapi/linux/qemu_fw_cfg.h:/* SPDX-License-Identifier: BSD-3-Clause */
include/uapi/linux/vbox_err.h:/* SPDX-License-Identifier: MIT */
include/uapi/linux/virtio_iommu.h:/* SPDX-License-Identifier: BSD-3-Clause */
Signed-off-by: Masahiro Yamada <yamada.masahiro@socionext.com>
Reviewed-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Both RISC-V & ARM64 are using cpu-map device tree to describe
their cpu topology. It's better to move the relevant code to
a common place instead of duplicate code.
To: Will Deacon <will.deacon@arm.com>
To: Catalin Marinas <catalin.marinas@arm.com>
Signed-off-by: Atish Patra <atish.patra@wdc.com>
[Tested on QDF2400]
Tested-by: Jeffrey Hugo <jhugo@codeaurora.org>
[Tested on Juno and other embedded platforms.]
Tested-by: Sudeep Holla <sudeep.holla@arm.com>
Reviewed-by: Sudeep Holla <sudeep.holla@arm.com>
Acked-by: Will Deacon <will.deacon@arm.com>
Acked-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Signed-off-by: Paul Walmsley <paul.walmsley@sifive.com>
On a CPU that doesn't support SSBS, PSTATE[12] is RES0. In a system
where only some of the CPUs implement SSBS, we end-up losing track of
the SSBS bit across task migration.
To address this issue, let's force the SSBS bit on context switch.
Fixes: 8f04e8e6e2 ("arm64: ssbd: Add support for PSTATE.SSBS rather than trapping to EL3")
Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
[will: inverted logic and added comments]
Signed-off-by: Will Deacon <will@kernel.org>
This helper is required from generic huge_pte_alloc() which is available
when arch subscribes ARCH_WANT_GENERAL_HUGETLB. arm64 implements it's own
huge_pte_alloc() and does not depend on the generic definition. Drop this
helper which is redundant on arm64.
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Steve Capper <Steve.Capper@arm.com>
Acked-by: Catalin Marinas <catalin.marinas@arm.com>
Signed-off-by: Anshuman Khandual <anshuman.khandual@arm.com>
Signed-off-by: Will Deacon <will@kernel.org>
The arm64 stacktrace code is careful to only dereference frame records
in valid stack ranges, ensuring that a corrupted frame record won't
result in a faulting access.
However, it's still possible for corrupt frame records to result in
infinite loops in the stacktrace code, which is also undesirable.
This patch ensures that we complete a stacktrace in finite time, by
keeping track of which stacks we have already completed unwinding, and
verifying that if the next frame record is on the same stack, it is at a
higher address.
As this has turned out to be particularly subtle, comments are added to
explain the procedure.
Signed-off-by: Mark Rutland <mark.rutland@arm.com>
Reviewed-by: James Morse <james.morse@arm.com>
Tested-by: James Morse <james.morse@arm.com>
Acked-by: Dave Martin <Dave.Martin@arm.com>
Acked-by: Catalin Marinas <catalin.marinas@arm.com>
Cc: Tengfei Fan <tengfeif@codeaurora.org>
Signed-off-by: Will Deacon <will@kernel.org>
Some common code is required by each stacktrace user to initialise
struct stackframe before the first call to unwind_frame().
In preparation for adding to the common code, this patch factors it
out into a separate function start_backtrace(), and modifies the
stacktrace callers appropriately.
No functional change.
Signed-off-by: Dave Martin <dave.martin@arm.com>
[Mark: drop tsk argument, update more callsites]
Signed-off-by: Mark Rutland <mark.rutland@arm.com>
Reviewed-by: James Morse <james.morse@arm.com>
Acked-by: Catalin Marinas <catalin.marinas@arm.com>
Signed-off-by: Will Deacon <will@kernel.org>
on_accessible_stack() and on_task_stack() shouldn't (and don't)
modify their task argument, so it can be const.
This patch adds the appropriate modifiers. Whitespace violations in the
parameter lists are fixed at the same time.
No functional change.
Acked-by: Catalin Marinas <catalin.marinas@arm.com>
Signed-off-by: Dave Martin <dave.martin@arm.com>
[Mark: fixup const location, whitespace]
Signed-off-by: Mark Rutland <mark.rutland@arm.com>
Signed-off-by: Will Deacon <will@kernel.org>
Prior to the introduction of Unified vDSO support and compat layer for
vDSO on arm64, AT_SYSINFO_EHDR was not defined for compat tasks.
In the current implementation, AT_SYSINFO_EHDR is defined even if the
compat vdso layer is not built, which has been shown to break Android
applications using bionic:
| 01-01 01:22:14.097 755 755 F libc : Fatal signal 11 (SIGSEGV),
| code 1 (SEGV_MAPERR), fault addr 0x3cf2c96c in tid 755 (cameraserver),
| pid 755 (cameraserver)
| 01-01 01:22:14.112 759 759 F libc : Fatal signal 11 (SIGSEGV),
| code 1 (SEGV_MAPERR), fault addr 0x3cf2c96c in tid 759
| (android.hardwar), pid 759 (android.hardwar)
| 01-01 01:22:14.120 756 756 F libc : Fatal signal 11 (SIGSEGV)
| code 1 (SEGV_MAPERR), fault addr 0x3cf2c96c in tid 756 (drmserver),
| pid 756 (drmserver)
Restore the old behaviour by making sure that AT_SYSINFO_EHDR for compat
tasks is defined only when CONFIG_COMPAT_VDSO is enabled.
Reported-by: John Stultz <john.stultz@linaro.org>
Tested-by: John Stultz <john.stultz@linaro.org>
Signed-off-by: Vincenzo Frascino <vincenzo.frascino@arm.com>
Signed-off-by: Will Deacon <will@kernel.org>
In order for things like get_user_pages() to work on ZONE_DEVICE memory,
we need a software PTE bit to identify device-backed PFNs. Hook this up
along with the relevant helpers to join in with ARCH_HAS_PTE_DEVMAP.
[robin.murphy@arm.com: build fixes]
Link: http://lkml.kernel.org/r/13026c4e64abc17133bbfa07d7731ec6691c0bcd.1559050949.git.robin.murphy@arm.com
Link: http://lkml.kernel.org/r/817d92886fc3b33bcbf6e105ee83a74babb3a5aa.1558547956.git.robin.murphy@arm.com
Signed-off-by: Robin Murphy <robin.murphy@arm.com>
Acked-by: Will Deacon <will.deacon@arm.com>
Cc: Anshuman Khandual <anshuman.khandual@arm.com>
Cc: Catalin Marinas <catalin.marinas@arm.com>
Cc: Dan Williams <dan.j.williams@intel.com>
Cc: David Hildenbrand <david@redhat.com>
Cc: Ira Weiny <ira.weiny@intel.com>
Cc: Jerome Glisse <jglisse@redhat.com>
Cc: Michael Ellerman <mpe@ellerman.id.au>
Cc: Michal Hocko <mhocko@suse.com>
Cc: Oliver O'Halloran <oohall@gmail.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Now that BIT() can be used from assembly code, we can safely replace
_BITUL() with equivalent BIT().
UAPI headers are still required to use _BITUL(), but there is no more
reason to use it in kernel headers. BIT() is shorter.
Link: http://lkml.kernel.org/r/20190609153941.17249-2-yamada.masahiro@socionext.com
Signed-off-by: Masahiro Yamada <yamada.masahiro@socionext.com>
Cc: Heiko Carstens <heiko.carstens@de.ibm.com>
Cc: Vasily Gorbik <gor@linux.ibm.com>
Cc: Christian Borntraeger <borntraeger@de.ibm.com>
Cc: Vineet Gupta <vgupta@synopsys.com>
Cc: Catalin Marinas <catalin.marinas@arm.com>
Cc: Will Deacon <will.deacon@arm.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
The asm-generic changes for 5.3 consist of a cleanup series from
Christoph Hellwig, who explains:
"asm-generic/ptrace.h is a little weird in that it doesn't actually
implement any functionality, but it provided multiple layers of macros
that just implement trivial inline functions. We implement those
directly in the few architectures and be off with a much simpler
design."
Link: https://lore.kernel.org/lkml/20190624054728.30966-1-hch@lst.de/
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v2
iQIcBAABCAAGBQJdKPURAAoJEJpsee/mABjZalgP/idFZKlL9jb32p9eVacW1ngm
CzwKk+49UBpLlimTh3ZtpiSJEHQyXP/QYlJ0/kV65YJriq5FsqlBrkPnWgsgMG9x
HBhJEEfVtXolXK3yNEsFIt/0j0Xh7+uCyBNZNuJrIRy/9x2z2nhBgDAenWPpZNuT
qjpArBAVEWQMsWgmgZUlCKOT7ziSx5+w1bfqiiUZDjwjqimPhLUBfoZmUWHtO49M
4/95RVOIMoLlIcaCUfqsvfkf7v6mfFAADhTrB/FZWVNX839fnpifqQL9BmOlgrEM
kxn5wM/dxRDwRT8+mVRyB8ax4/rIgMIFoaA7Hrv+hoUsiOVD7AkNXynZKQh1hhjl
449j68esoA6vlfdFIhagpKKTiQcWXJDbEgAoSJcM0WIl3JAjc+3nVWShTAAEW65r
Z+Bgy1OczoCsRXbYR/TwpThHj3197xMRQEluzaLnd5Zx5feUDUKuDcxhPpev/ceO
qmV5FeGqxRlZhJjVK8lmcHNZP0e4pkodwrNKC/2NIlIp6EKmMNI0nCjVqINigHGC
97Kc7N94WHdQ3tA7GB8YaUfd8w86W5ZOgRh+uuZ0brPziL1MR5lD/NvzjVSfyvVp
7UHNP7stNbavg20vDhlWGIsWiwoDlJf0YLUA6kXHryb9i/fh8sqWjz99QFu6QIfs
BTgeLtNP8hKhMkgew2XL
=jkfI
-----END PGP SIGNATURE-----
Merge tag 'asm-generic-5.3' of git://git.kernel.org/pub/scm/linux/kernel/git/arnd/asm-generic
Pull asm-generic updates from Arnd Bergmann:
"The asm-generic changes for 5.3 consist of a cleanup series to remove
ptrace.h from Christoph Hellwig, who explains:
'asm-generic/ptrace.h is a little weird in that it doesn't actually
implement any functionality, but it provided multiple layers of
macros that just implement trivial inline functions. We implement
those directly in the few architectures and be off with a much
simpler design.'
at https://lore.kernel.org/lkml/20190624054728.30966-1-hch@lst.de/"
* tag 'asm-generic-5.3' of git://git.kernel.org/pub/scm/linux/kernel/git/arnd/asm-generic:
asm-generic: remove ptrace.h
x86: don't use asm-generic/ptrace.h
sh: don't use asm-generic/ptrace.h
powerpc: don't use asm-generic/ptrace.h
arm64: don't use asm-generic/ptrace.h