On arm64 build with clang, sometimes the __cmpxchg_mb is not inlined
when CONFIG_OPTIMIZE_INLINING is set.
Clang then fails a compile-time assertion, because it cannot tell at
compile time what the size of the argument is:
mm/memcontrol.o: In function `__cmpxchg_mb':
memcontrol.c:(.text+0x1a4c): undefined reference to `__compiletime_assert_175'
memcontrol.c:(.text+0x1a4c): relocation truncated to fit: R_AARCH64_CALL26 against undefined symbol `__compiletime_assert_175'
Mark all of the cmpxchg() style functions as __always_inline to
ensure that the compiler can see the result.
Acked-by: Nick Desaulniers <ndesaulniers@google.com>
Reported-by: Nathan Chancellor <natechancellor@gmail.com>
Link: https://github.com/ClangBuiltLinux/linux/issues/648
Reviewed-by: Nathan Chancellor <natechancellor@gmail.com>
Tested-by: Nathan Chancellor <natechancellor@gmail.com>
Reviewed-by: Andrew Murray <andrew.murray@arm.com>
Tested-by: Andrew Murray <andrew.murray@arm.com>
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Signed-off-by: Will Deacon <will@kernel.org>
* for-next/52-bit-kva: (25 commits)
Support for 52-bit virtual addressing in kernel space
* for-next/cpu-topology: (9 commits)
Move CPU topology parsing into core code and add support for ACPI 6.3
* for-next/error-injection: (2 commits)
Support for function error injection via kprobes
* for-next/perf: (8 commits)
Support for i.MX8 DDR PMU and proper SMMUv3 group validation
* for-next/psci-cpuidle: (7 commits)
Move PSCI idle code into a new CPUidle driver
* for-next/rng: (4 commits)
Support for 'rng-seed' property being passed in the devicetree
* for-next/smpboot: (3 commits)
Reduce fragility of secondary CPU bringup in debug configurations
* for-next/tbi: (10 commits)
Introduce new syscall ABI with relaxed requirements for pointer tags
* for-next/tlbi: (6 commits)
Handle spurious page faults arising from kernel space
The 'K' constraint is a documented AArch64 machine constraint supported
by GCC for matching integer constants that can be used with a 32-bit
logical instruction. Unfortunately, some released compilers erroneously
accept the immediate '4294967295' for this constraint, which is later
refused by GAS at assembly time. This had led us to avoid the use of
the 'K' constraint altogether.
Instead, detect whether the compiler is up to the job when building the
kernel and pass the 'K' constraint to our 32-bit atomic macros when it
appears to be supported.
Signed-off-by: Will Deacon <will@kernel.org>
We use a bunch of internal macros when constructing our atomic and
cmpxchg routines in order to save on boilerplate. Avoid exposing these
directly to users of the header files.
Reviewed-by: Andrew Murray <andrew.murray@arm.com>
Signed-off-by: Will Deacon <will@kernel.org>
Support for LSE atomic instructions (CONFIG_ARM64_LSE_ATOMICS) relies on
a static key to select between the legacy LL/SC implementation which is
available on all arm64 CPUs and the super-duper LSE implementation which
is available on CPUs implementing v8.1 and later.
Unfortunately, when building a kernel with CONFIG_JUMP_LABEL disabled
(e.g. because the toolchain doesn't support 'asm goto'), the static key
inside the atomics code tries to use atomics itself. This results in a
mess of circular includes and a build failure:
In file included from ./arch/arm64/include/asm/lse.h:11,
from ./arch/arm64/include/asm/atomic.h:16,
from ./include/linux/atomic.h:7,
from ./include/asm-generic/bitops/atomic.h:5,
from ./arch/arm64/include/asm/bitops.h:26,
from ./include/linux/bitops.h:19,
from ./include/linux/kernel.h:12,
from ./include/asm-generic/bug.h:18,
from ./arch/arm64/include/asm/bug.h:26,
from ./include/linux/bug.h:5,
from ./include/linux/page-flags.h:10,
from kernel/bounds.c:10:
./include/linux/jump_label.h: In function ‘static_key_count’:
./include/linux/jump_label.h:254:9: error: implicit declaration of function ‘atomic_read’ [-Werror=implicit-function-declaration]
return atomic_read(&key->enabled);
^~~~~~~~~~~
[ ... more of the same ... ]
Since LSE atomic instructions are not critical to the operation of the
kernel, make them depend on JUMP_LABEL at compile time.
Reviewed-by: Andrew Murray <andrew.murray@arm.com>
Signed-off-by: Will Deacon <will@kernel.org>
The contents of 'asm/atomic_arch.h' can be split across some of our
other 'asm/' headers. Remove it.
Reviewed-by: Andrew Murray <andrew.murray@arm.com>
Signed-off-by: Will Deacon <will@kernel.org>
The 'alt_lse' assembly macro has been unused since 7c8fc35dfc
("locking/atomics/arm64: Replace our atomic/lock bitop implementations
with asm-generic").
Remove it.
Reviewed-by: Andrew Murray <andrew.murray@arm.com>
Signed-off-by: Will Deacon <will@kernel.org>
We no longer fall back to out-of-line atomics on systems with
CONFIG_ARM64_LSE_ATOMICS where ARM64_HAS_LSE_ATOMICS is not set.
Remove the unused compilation unit which provided these symbols.
Signed-off-by: Andrew Murray <andrew.murray@arm.com>
Signed-off-by: Will Deacon <will@kernel.org>
Now that we have removed the out-of-line ll/sc atomics we can give
the compiler the freedom to choose its own register allocation.
Remove the hard-coded use of x30.
Signed-off-by: Andrew Murray <andrew.murray@arm.com>
Signed-off-by: Will Deacon <will@kernel.org>
When building for LSE atomics (CONFIG_ARM64_LSE_ATOMICS), if the hardware
or toolchain doesn't support it the existing code will fallback to ll/sc
atomics. It achieves this by branching from inline assembly to a function
that is built with special compile flags. Further this results in the
clobbering of registers even when the fallback isn't used increasing
register pressure.
Improve this by providing inline implementations of both LSE and
ll/sc and use a static key to select between them, which allows for the
compiler to generate better atomics code. Put the LL/SC fallback atomics
in their own subsection to improve icache performance.
Signed-off-by: Andrew Murray <andrew.murray@arm.com>
Signed-off-by: Will Deacon <will@kernel.org>
The A64 ISA accepts distinct (but overlapping) ranges of immediates for:
* add arithmetic instructions ('I' machine constraint)
* sub arithmetic instructions ('J' machine constraint)
* 32-bit logical instructions ('K' machine constraint)
* 64-bit logical instructions ('L' machine constraint)
... but we currently use the 'I' constraint for many atomic operations
using sub or logical instructions, which is not always valid.
When CONFIG_ARM64_LSE_ATOMICS is not set, this allows invalid immediates
to be passed to instructions, potentially resulting in a build failure.
When CONFIG_ARM64_LSE_ATOMICS is selected the out-of-line ll/sc atomics
always use a register as they have no visibility of the value passed by
the caller.
This patch adds a constraint parameter to the ATOMIC_xx and
__CMPXCHG_CASE macros so that we can pass appropriate constraints for
each case, with uses updated accordingly.
Unfortunately prior to GCC 8.1.0 the 'K' constraint erroneously accepted
'4294967295', so we must instead force the use of a register.
Signed-off-by: Andrew Murray <andrew.murray@arm.com>
Signed-off-by: Will Deacon <will@kernel.org>
On architectures that discard .exit.* sections at runtime, a
warning is printed for each jump label that is used within an
in-kernel __exit annotated function:
can't patch jump_label at ehci_hcd_cleanup+0x8/0x3c
WARNING: CPU: 0 PID: 1 at kernel/jump_label.c:410 __jump_label_update+0x12c/0x138
As these functions will never get executed (they are free'd along
with the rest of initmem) - we do not need to patch them and should
not display any warnings.
The warning is displayed because the test required to satisfy
jump_entry_is_init is based on init_section_contains (__init_begin to
__init_end) whereas the test in __jump_label_update is based on
init_kernel_text (_sinittext to _einittext) via kernel_text_address).
Fixes: 1948367768 ("jump_label: Annotate entries that operate on __init code earlier")
Signed-off-by: Andrew Murray <andrew.murray@arm.com>
Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Acked-by: Mark Rutland <mark.rutland@arm.com>
Signed-off-by: Will Deacon <will@kernel.org>
Add some documentation describing the DDR PMU residing in the Freescale
i.MDX SoC and its perf driver implementation in Linux.
Signed-off-by: Joakim Zhang <qiangqing.zhang@nxp.com>
Signed-off-by: Will Deacon <will@kernel.org>
AXI filtering is used by events 0x41 and 0x42 to count reads or writes
with an ARID or AWID matching a specified filter. The filter is exposed
to userspace as an (ID, MASK) pair, where each set bit in the mask
causes the corresponding bit in the ID to be ignored when matching
against the ID of memory transactions for the purposes of incrementing
the counter.
For example:
# perf stat -a -e imx8_ddr0/axid-read,axi_mask=0xff,axi_id=0x800/ cmd
will count all read transactions from AXI IDs 0x800 - 0x8ff. If the
'axi_mask' is omitted, then it is treated as 0x0 which means that the
'axi_id' will be matched exactly.
Signed-off-by: Joakim Zhang <qiangqing.zhang@nxp.com>
Signed-off-by: Will Deacon <will@kernel.org>
While the MMUs is disabled, I-cache speculation can result in
instructions being fetched from the PoC. During boot we may patch
instructions (e.g. for alternatives and jump labels), and these may be
dirty at the PoU (and stale at the PoC).
Thus, while the MMU is disabled in the KPTI pagetable fixup code we may
load stale instructions into the I-cache, potentially leading to
subsequent crashes when executing regions of code which have been
modified at runtime.
Similarly to commit:
8ec4198743 ("arm64: mm: ensure patched kernel text is fetched from PoU")
... we can invalidate the I-cache after enabling the MMU to prevent such
issues.
The KPTI pagetable fixup code itself should be clean to the PoC per the
boot protocol, so no maintenance is required for this code.
Signed-off-by: Mark Rutland <mark.rutland@arm.com>
Cc: Catalin Marinas <catalin.marinas@arm.com>
Reviewed-by: James Morse <james.morse@arm.com>
Signed-off-by: Will Deacon <will@kernel.org>
With 16K pages and 48-bit VAs, the PGD level of table has two entries,
and so the fixmap shares a PGD with the kernel image. Since commit:
f9040773b7 ("arm64: move kernel image to base of vmalloc area")
... we copy the existing fixmap to the new fine-grained page tables at
the PUD level in this case. When walking to the new PUD, we forgot to
offset the PGD entry and always used the PGD entry at index 0, but this
worked as the kernel image and fixmap were in the low half of the TTBR1
address space.
As of commit:
14c127c957 ("arm64: mm: Flip kernel VA space")
... the kernel image and fixmap are in the high half of the TTBR1
address space, and hence use the PGD at index 1, but we didn't update
the fixmap copying code to account for this.
Thus, we'll erroneously try to copy the fixmap slots into a PUD under
the PGD entry at index 0. At the point we do so this PGD entry has not
been initialised, and thus we'll try to write a value to a small offset
from physical address 0, causing a number of potential problems.
Fix this be correctly offsetting the PGD. This is split over a few steps
for legibility.
Fixes: 14c127c957 ("arm64: mm: Flip kernel VA space")
Reported-by: Anshuman Khandual <anshuman.khandual@arm.com>
Cc: Ard Biesheuvel <ard.biesheuvel@linaro.org>
Cc: Catalin Marinas <catalin.marinas@arm.com>
Signed-off-by: Mark Rutland <mark.rutland@arm.com>
Acked-by: Marc Zyngier <maz@kernel.org>
Tested-by: Marc Zyngier <maz@kernel.org>
Acked-by: Steve Capper <Steve.Capper@arm.com>
Tested-by: Steve Capper <Steve.Capper@arm.com>
Tested-by: Anshuman Khandual <anshuman.khandual@arm.com>
Signed-off-by: Will Deacon <will@kernel.org>
With global filtering, it becomes possible for users to construct
self-contradictory groups with conflicting filters. Make sure we
cover that when initially validating events.
Signed-off-by: Robin Murphy <robin.murphy@arm.com>
Signed-off-by: Will Deacon <will@kernel.org>
Ensure that a group will actually fit into the available counters.
Signed-off-by: Robin Murphy <robin.murphy@arm.com>
Signed-off-by: Will Deacon <will@kernel.org>
On AArch64 the TCR_EL1.TBI0 bit is set by default, allowing userspace
(EL0) to perform memory accesses through 64-bit pointers with a non-zero
top byte. However, such pointers were not allowed at the user-kernel
syscall ABI boundary.
With the Tagged Address ABI patchset, it is now possible to pass tagged
pointers to the syscalls. Relax the requirements described in
tagged-pointers.rst to be compliant with the behaviours guaranteed by
the AArch64 Tagged Address ABI.
Cc: Will Deacon <will.deacon@arm.com>
Cc: Szabolcs Nagy <szabolcs.nagy@arm.com>
Cc: Kevin Brodsky <kevin.brodsky@arm.com>
Acked-by: Andrey Konovalov <andreyknvl@google.com>
Signed-off-by: Vincenzo Frascino <vincenzo.frascino@arm.com>
Co-developed-by: Catalin Marinas <catalin.marinas@arm.com>
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
Signed-off-by: Will Deacon <will@kernel.org>
Now that we have a definition for the 'F' field of PAR_EL1, use that
instead of coding the immediate directly.
Acked-by: Marc Zyngier <maz@kernel.org>
Reviewed-by: Mark Rutland <mark.rutland@arm.com>
Signed-off-by: Will Deacon <will@kernel.org>
Thanks to address translation being performed out of order with respect to
loads and stores, it is possible for a CPU to take a translation fault when
accessing a page that was mapped by a different CPU.
For example, in the case that one CPU maps a page and then sets a flag to
tell another CPU:
CPU 0
-----
MOV X0, <valid pte>
STR X0, [Xptep] // Store new PTE to page table
DSB ISHST
ISB
MOV X1, #1
STR X1, [Xflag] // Set the flag
CPU 1
-----
loop: LDAR X0, [Xflag] // Poll flag with Acquire semantics
CBZ X0, loop
LDR X1, [X2] // Translates using the new PTE
then the final load on CPU 1 can raise a translation fault because the
translation can be performed speculatively before the read of the flag and
marked as "faulting" by the CPU. This isn't quite as bad as it sounds
since, in reality, code such as:
CPU 0 CPU 1
----- -----
spin_lock(&lock); spin_lock(&lock);
*ptr = vmalloc(size); if (*ptr)
spin_unlock(&lock); foo = **ptr;
spin_unlock(&lock);
will not trigger the fault because there is an address dependency on CPU 1
which prevents the speculative translation. However, more exotic code where
the virtual address is known ahead of time, such as:
CPU 0 CPU 1
----- -----
spin_lock(&lock); spin_lock(&lock);
set_fixmap(0, paddr, prot); if (mapped)
mapped = true; foo = *fix_to_virt(0);
spin_unlock(&lock); spin_unlock(&lock);
could fault. This can be avoided by any of:
* Introducing broadcast TLB maintenance on the map path
* Adding a DSB;ISB sequence after checking a flag which indicates
that a virtual address is now mapped
* Handling the spurious fault
Given that we have never observed a problem due to this under Linux and
future revisions of the architecture are being tightened so that
translation table walks are effectively ordered in the same way as explicit
memory accesses, we no longer treat spurious kernel faults as fatal if an
AT instruction indicates that the access does not trigger a translation
fault.
Reviewed-by: Mark Rutland <mark.rutland@arm.com>
Signed-off-by: Will Deacon <will@kernel.org>
PAR_EL1 is a mysterious creature, but sometimes it's necessary to read
it when translating addresses in situations where we cannot walk the
page table directly.
Add a couple of system register definitions for the fault indication
field ('F') and the fault status code ('FST').
Reviewed-by: Mark Rutland <mark.rutland@arm.com>
Signed-off-by: Will Deacon <will@kernel.org>
Commit 6a4cbd63c25a ("Revert "arm64: Remove unnecessary ISBs from
set_{pte,pmd,pud}"") reintroduced ISB instructions to some of our
page table setter functions in light of a recent clarification to the
Armv8 architecture. Although 'set_pgd()' isn't currently used to update
a live page table, add the ISB instruction there too for consistency
with the other macros and to provide some future-proofing if we use it
on live tables in the future.
Reported-by: Mark Rutland <mark.rutland@arm.com>
Reviewed-by: Mark Rutland <mark.rutland@arm.com>
Signed-off-by: Will Deacon <will@kernel.org>
05f2d2f83b ("arm64: tlbflush: Introduce __flush_tlb_kernel_pgtable")
added a new TLB invalidation helper which is used when freeing
intermediate levels of page table used for kernel mappings, but is
missing the required ISB instruction after completion of the TLBI
instruction.
Add the missing barrier.
Cc: <stable@vger.kernel.org>
Fixes: 05f2d2f83b ("arm64: tlbflush: Introduce __flush_tlb_kernel_pgtable")
Reviewed-by: Mark Rutland <mark.rutland@arm.com>
Signed-off-by: Will Deacon <will@kernel.org>
This reverts commit 24fe1b0efa.
Commit 24fe1b0efa ("arm64: Remove unnecessary ISBs from
set_{pte,pmd,pud}") removed ISB instructions immediately following updates
to the page table, on the grounds that they are not required by the
architecture and a DSB alone is sufficient to ensure that subsequent data
accesses use the new translation:
DDI0487E_a, B2-128:
| ... no instruction that appears in program order after the DSB
| instruction can alter any state of the system or perform any part of
| its functionality until the DSB completes other than:
|
| * Being fetched from memory and decoded
| * Reading the general-purpose, SIMD and floating-point,
| Special-purpose, or System registers that are directly or indirectly
| read without causing side-effects.
However, the same document also states the following:
DDI0487E_a, B2-125:
| DMB and DSB instructions affect reads and writes to the memory system
| generated by Load/Store instructions and data or unified cache
| maintenance instructions being executed by the PE. Instruction fetches
| or accesses caused by a hardware translation table access are not
| explicit accesses.
which appears to claim that the DSB alone is insufficient. Unfortunately,
some CPU designers have followed the second clause above, whereas in Linux
we've been relying on the first. This means that our mapping sequence:
MOV X0, <valid pte>
STR X0, [Xptep] // Store new PTE to page table
DSB ISHST
LDR X1, [X2] // Translates using the new PTE
can actually raise a translation fault on the load instruction because the
translation can be performed speculatively before the page table update and
then marked as "faulting" by the CPU. For user PTEs, this is ok because we
can handle the spurious fault, but for kernel PTEs and intermediate table
entries this results in a panic().
Revert the offending commit to reintroduce the missing barriers.
Cc: <stable@vger.kernel.org>
Fixes: 24fe1b0efa ("arm64: Remove unnecessary ISBs from set_{pte,pmd,pud}")
Reviewed-by: Mark Rutland <mark.rutland@arm.com>
Signed-off-by: Will Deacon <will@kernel.org>
When we fail to bring a secondary CPU online and it fails in an unknown
state, we should assume the worst and increment 'cpus_stuck_in_kernel'
so that things like kexec() are disabled.
Reviewed-by: Mark Rutland <mark.rutland@arm.com>
Signed-off-by: Will Deacon <will@kernel.org>
Although SMP bringup is inherently racy, we can significantly reduce
the window during which secondary CPUs can unexpectedly enter the
kernel by sanity checking the 'stack' and 'task' fields of the
'secondary_data' structure. If the booting CPU gave up waiting for us,
then they will have been cleared to NULL and we should spin in a WFE; WFI
loop instead.
Reviewed-by: Mark Rutland <mark.rutland@arm.com>
Signed-off-by: Will Deacon <will@kernel.org>
When many debug options are enabled simultaneously (e.g. PROVE_LOCKING,
KMEMLEAK, DEBUG_PAGE_ALLOC, KASAN etc), it is possible for us to timeout
when attempting to boot a secondary CPU and give up. Unfortunately, the
CPU will /eventually/ appear, and sit in the background happily stuck
in a recursive exception due to a NULL stack pointer.
Increase the timeout to 5s, which will of course be enough for anybody.
Reviewed-by: Mark Rutland <mark.rutland@arm.com>
Signed-off-by: Will Deacon <will@kernel.org>
Commit 428826f535 ("fdt: add support for rng-seed") moves of_fdt_crc32
from early_init_dt_verify() to early_init_dt_scan() since
early_init_dt_scan_chosen() may modify fdt to erase rng-seed.
However, arm and some other arch won't call early_init_dt_scan(), they
call early_init_dt_verify() then early_init_dt_scan_nodes().
Restore of_fdt_crc32 to early_init_dt_verify() then update it in
early_init_dt_scan_chosen() if fdt if updated.
Fixes: 428826f535 ("fdt: add support for rng-seed")
Reported-by: Geert Uytterhoeven <geert+renesas@glider.be>
Signed-off-by: Hsin-Yi Wang <hsinyi@chromium.org>
Tested-by: Geert Uytterhoeven <geert+renesas@glider.be>
Signed-off-by: Will Deacon <will@kernel.org>
Adding "rng-seed" to dtb. It's fine to add this property if original
fdt doesn't contain it. Since original seed will be wiped after
read, so use a default size 128 bytes here.
Signed-off-by: Hsin-Yi Wang <hsinyi@chromium.org>
Reviewed-by: Stephen Boyd <swboyd@chromium.org>
Signed-off-by: Will Deacon <will@kernel.org>
Introducing a chosen node, rng-seed, which is an entropy that can be
passed to kernel called very early to increase initial device
randomness. Bootloader should provide this entropy and the value is
read from /chosen/rng-seed in DT.
Obtain of_fdt_crc32 for CRC check after early_init_dt_scan_nodes(),
since early_init_dt_scan_chosen() would modify fdt to erase rng-seed.
Add a new interface add_bootloader_randomness() for rng-seed use case.
Depends on whether the seed is trustworthy, rng seed would be passed to
add_hwgenerator_randomness(). Otherwise it would be passed to
add_device_randomness(). Decision is controlled by kernel config
RANDOM_TRUST_BOOTLOADER.
Signed-off-by: Hsin-Yi Wang <hsinyi@chromium.org>
Reviewed-by: Stephen Boyd <swboyd@chromium.org>
Reviewed-by: Rob Herring <robh@kernel.org>
Reviewed-by: Theodore Ts'o <tytso@mit.edu> # drivers/char/random.c
Signed-off-by: Will Deacon <will@kernel.org>
Currently in arm64, FDT is mapped to RO before it's passed to
early_init_dt_scan(). However, there might be some codes
(eg. commit "fdt: add support for rng-seed") that need to modify FDT
during init. Map FDT to RO after early fixups are done.
Signed-off-by: Hsin-Yi Wang <hsinyi@chromium.org>
Reviewed-by: Stephen Boyd <swboyd@chromium.org>
Reviewed-by: Mike Rapoport <rppt@linux.ibm.com>
Signed-off-by: Will Deacon <will@kernel.org>
Documentation/arm64/tagged-address-abi.rst introduces the
relaxation of the syscall ABI that allows userspace to pass
certain tagged pointers to kernel syscalls.
Add the document to index.rst for a correct generation of the
table of content.
Cc: Will Deacon <will@kernel.org>
Cc: Catalin Marinas <catalin.marinas@arm.com>
Signed-off-by: Vincenzo Frascino <vincenzo.frascino@arm.com>
Signed-off-by: Will Deacon <will@kernel.org>
On AArch64 the TCR_EL1.TBI0 bit is set by default, allowing userspace
(EL0) to perform memory accesses through 64-bit pointers with a non-zero
top byte. Introduce the document describing the relaxation of the
syscall ABI that allows userspace to pass certain tagged pointers to
kernel syscalls.
Cc: Will Deacon <will.deacon@arm.com>
Cc: Szabolcs Nagy <szabolcs.nagy@arm.com>
Acked-by: Kevin Brodsky <kevin.brodsky@arm.com>
Acked-by: Andrey Konovalov <andreyknvl@google.com>
Signed-off-by: Vincenzo Frascino <vincenzo.frascino@arm.com>
Co-developed-by: Catalin Marinas <catalin.marinas@arm.com>
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
Signed-off-by: Will Deacon <will@kernel.org>
Use the standard obj-y form to specify the sub-directories under
arch/arm64/. No functional change intended.
Signed-off-by: Masahiro Yamada <yamada.masahiro@socionext.com>
Signed-off-by: Will Deacon <will@kernel.org>
When taking an SError or Debug exception from EL0, we run the C
handler for these exceptions before updating the context tracking
code and unmasking lower priority interrupts.
When booting with nohz_full lockdep tells us we got this wrong:
| =============================
| WARNING: suspicious RCU usage
| 5.3.0-rc2-00010-gb4b5e9dcb11b-dirty #11271 Not tainted
| -----------------------------
| include/linux/rcupdate.h:643 rcu_read_unlock() used illegally wh!
|
| other info that might help us debug this:
|
|
| RCU used illegally from idle CPU!
| rcu_scheduler_active = 2, debug_locks = 1
| RCU used illegally from extended quiescent state!
| 1 lock held by a.out/432:
| #0: 00000000c7a79515 (rcu_read_lock){....}, at: brk_handler+0x00
|
| stack backtrace:
| CPU: 1 PID: 432 Comm: a.out Not tainted 5.3.0-rc2-00010-gb4b5e9d1
| Hardware name: ARM LTD ARM Juno Development Platform/ARM Juno De8
| Call trace:
| dump_backtrace+0x0/0x140
| show_stack+0x14/0x20
| dump_stack+0xbc/0x104
| lockdep_rcu_suspicious+0xf8/0x108
| brk_handler+0x164/0x1b0
| do_debug_exception+0x11c/0x278
| el0_dbg+0x14/0x20
Moving the ct_user_exit calls to be before do_debug_exception() means
they are also before trace_hardirqs_off() has been updated. Add a new
ct_user_exit_irqoff macro to avoid the context-tracking code using
irqsave/restore before we've updated trace_hardirqs_off(). To be
consistent, do this everywhere.
The C helper is called enter_from_user_mode() to match x86 in the hope
we can merge them into kernel/context_tracking.c later.
Cc: Masami Hiramatsu <mhiramat@kernel.org>
Fixes: 6c81fe7925 ("arm64: enable context tracking")
Signed-off-by: James Morse <james.morse@arm.com>
Signed-off-by: Will Deacon <will@kernel.org>
First rename the sysctl control to abi.tagged_addr_disabled and make it
default off (zero). When abi.tagged_addr_disabled == 1, only block the
enabling of the TBI ABI via prctl(PR_SET_TAGGED_ADDR_CTRL, PR_TAGGED_ADDR_ENABLE).
Getting the status of the ABI or disabling it is still allowed.
Acked-by: Andrey Konovalov <andreyknvl@google.com>
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
Signed-off-by: Will Deacon <will@kernel.org>
Require that arg{3,4,5} of the PR_{SET,GET}_TAGGED_ADDR_CTRL prctl and
arg2 of the PR_GET_TAGGED_ADDR_CTRL prctl() are zero rather than ignored
for future extensions.
Acked-by: Andrey Konovalov <andreyknvl@google.com>
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
Signed-off-by: Will Deacon <will@kernel.org>
In perf_event.c we use smp_processor_id(), but we haven't included
<linux/smp.h> where it is defined, and rely on this being pulled in
via a transitive include. Let's make this more robust by including
<linux.smp.h> explicitly.
Acked-by: Mark Rutland <mark.rutland@arm.com>
Signed-off-by: Raphael Gault <raphael.gault@arm.com>
Signed-off-by: Will Deacon <will@kernel.org>
This reverts commit 71c67a31f0.
Commit 117acf5c29 ("powerpc/Makefile: Always pass --synthetic to nm if
supported") removed the only conditional definition of $(NM), so we can
revert our temporary bodge to avoid Kconfig recursion and go back to
passing $(NM) through to the 'tools-support-relr.sh' when detecting
support for RELR relocations.
Signed-off-by: Will Deacon <will@kernel.org>
Back in 2004 we added logic to arch/ppc64/Makefile to pass
the --synthetic option to nm, if it was supported by nm.
Then in 2005 when arch/ppc64 and arch/ppc were merged, the logic to
add --synthetic was moved inside an #ifdef CONFIG_PPC64 block within
arch/powerpc/Makefile, and has remained there since.
That was fine, though crufty, until recently when a change to
init/Kconfig added a config time check that uses $(NM). On powerpc
that leads to an infinite loop because Kconfig uses $(NM) to calculate
some values, then the powerpc Makefile changes $(NM), which Kconfig
notices and restarts.
The original commit that added --synthetic simply said:
On new toolchains we need to use nm --synthetic or we miss code
symbols.
And the nm man page says that the --synthetic option causes nm to:
Include synthetic symbols in the output. These are special symbols
created by the linker for various purposes.
So it seems safe to always pass --synthetic if nm supports it, ie. on
32-bit and 64-bit, it just means 32-bit kernels might have more
symbols reported (and in practice I see no extra symbols). Making it
unconditional avoids the #ifdef CONFIG_PPC64, which in turn avoids the
infinite loop.
Debugged-by: Peter Collingbourne <pcc@google.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Signed-off-by: Will Deacon <will@kernel.org>
Fix tagged_ptr not being initialized when TBI is not enabled.
Link: https://www.spinics.net/lists/linux-kselftest/msg09446.html
Reported-by: Dan Carpenter <dan.carpenter@oracle.com>
Signed-off-by: Andrey Konovalov <andreyknvl@google.com>
Signed-off-by: Will Deacon <will@kernel.org>
The generic Makefile.kasan propagates CONFIG_KASAN_SHADOW_OFFSET into
KASAN_SHADOW_OFFSET, but only does so for CONFIG_KASAN_GENERIC.
Since commit:
6bd1d0be0e ("arm64: kasan: Switch to using KASAN_SHADOW_OFFSET")
... arm64 defines CONFIG_KASAN_SHADOW_OFFSET in Kconfig rather than
defining KASAN_SHADOW_OFFSET in a Makefile. Thus, if
CONFIG_KASAN_SW_TAGS && KASAN_INLINE are selected, we get build time
splats due to KASAN_SHADOW_OFFSET not being set:
| [mark@lakrids:~/src/linux]% usellvm 8.0.1 usekorg 8.1.0 make ARCH=arm64 CROSS_COMPILE=aarch64-linux- CC=clang
| scripts/kconfig/conf --syncconfig Kconfig
| CC scripts/mod/empty.o
| clang (LLVM option parsing): for the -hwasan-mapping-offset option: '' value invalid for uint argument!
| scripts/Makefile.build:273: recipe for target 'scripts/mod/empty.o' failed
| make[1]: *** [scripts/mod/empty.o] Error 1
| Makefile:1123: recipe for target 'prepare0' failed
| make: *** [prepare0] Error 2
Let's fix this by always propagating CONFIG_KASAN_SHADOW_OFFSET into
KASAN_SHADOW_OFFSET if CONFIG_KASAN is selected, moving the existing
common definition of +CFLAGS_KASAN_NOSANITIZE to the top of
Makefile.kasan.
Cc: Catalin Marinas <catalin.marinas@arm.com>
Signed-off-by: Mark Rutland <mark.rutland@arm.com>
Acked-by: Andrey Ryabinin <aryabinin@virtuozzo.com>
Tested-by Steve Capper <steve.capper@arm.com>
Signed-off-by: Will Deacon <will@kernel.org>
No module currently messed with clearing or setting the execute
permission of kernel memory, and none really should.
Signed-off-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Will Deacon <will@kernel.org>
The trusted OS may reject CPU_OFF calls to its resident CPU, so we must
avoid issuing those. We never migrate a Trusted OS and we already take
care to prevent CPU_OFF PSCI call. However, this is not reflected
explicitly to the userspace. Any user can attempt to hotplug trusted OS
resident CPU. The entire motion of going through the various state
transitions in the CPU hotplug state machine gets executed and the
PSCI layer finally refuses to make CPU_OFF call.
This results is unnecessary unwinding of CPU hotplug state machine in
the kernel. Instead we can mark the trusted OS resident CPU as not
available for hotplug, so that the user attempt or request to do the
same will get immediately rejected.
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Catalin Marinas <catalin.marinas@arm.com>
Cc: Will Deacon <will.deacon@arm.com>
Signed-off-by: Sudeep Holla <sudeep.holla@arm.com>
Signed-off-by: Will Deacon <will@kernel.org>
It seems that LLVM's linker does not correctly handle variable assignments
involving section positions that are updated during the SECTIONS
parsing. Commit aa69fb62be ("arm64/efi: Mark __efistub_stext_offset as
an absolute symbol explicitly") ran into this too, but found a different
workaround.
However, this was not enough, as other variables were also miscalculated
which manifested as boot failures under UEFI where __efistub__end was
not taking the correct _end value (they should be the same):
$ ld.lld -EL -maarch64elf --no-undefined -X -shared \
-Bsymbolic -z notext -z norelro --no-apply-dynamic-relocs \
-o vmlinux.lld -T poc.lds --whole-archive vmlinux.o && \
readelf -Ws vmlinux.lld | egrep '\b(__efistub_|)_end\b'
368272: ffff000002218000 0 NOTYPE LOCAL HIDDEN 38 __efistub__end
368322: ffff000012318000 0 NOTYPE GLOBAL DEFAULT 38 _end
$ aarch64-linux-gnu-ld.bfd -EL -maarch64elf --no-undefined -X -shared \
-Bsymbolic -z notext -z norelro --no-apply-dynamic-relocs \
-o vmlinux.bfd -T poc.lds --whole-archive vmlinux.o && \
readelf -Ws vmlinux.bfd | egrep '\b(__efistub_|)_end\b'
338124: ffff000012318000 0 NOTYPE LOCAL DEFAULT ABS __efistub__end
383812: ffff000012318000 0 NOTYPE GLOBAL DEFAULT 15325 _end
To work around this, all of the __efistub_-prefixed variable assignments
need to be moved after the linker script's SECTIONS entry. As it turns
out, this also solves the problem fixed in commit aa69fb62be, so those
changes are reverted here.
Link: https://github.com/ClangBuiltLinux/linux/issues/634
Link: https://bugs.llvm.org/show_bug.cgi?id=42990
Acked-by: Ard Biesheuvel <ard.biesheuvel@linaro.org>
Signed-off-by: Kees Cook <keescook@chromium.org>
Signed-off-by: Will Deacon <will@kernel.org>
Strengthen the wording in the documentation for cpu_enable() to make it
more obvious to readers not already familiar with the code when the core
will call this callback and that this is intentional.
Signed-off-by: Mark Brown <broonie@kernel.org>
Reviewed-by: Suzuki K Poulose <suzuki.poulose@arm.com>
[will: minor tweak to emphasis in the comment]
Signed-off-by: Will Deacon <will@kernel.org>
Prior to commit:
14c127c957 ("arm64: mm: Flip kernel VA space")
... VA_START described the start of the TTBR1 address space for a given
VA size described by VA_BITS, where all kernel mappings began.
Since that commit, VA_START described a portion midway through the
address space, where the linear map ends and other kernel mappings
begin.
To avoid confusion, let's rename VA_START to PAGE_END, making it clear
that it's not the start of the TTBR1 address space and implying that
it's related to PAGE_OFFSET. Comments and other mnemonics are updated
accordingly, along with a typo fix in the decription of VMEMMAP_SIZE.
There should be no functional change as a result of this patch.
Signed-off-by: Mark Rutland <mark.rutland@arm.com>
Cc: Catalin Marinas <catalin.marinas@arm.com>
Tested-by: Steve Capper <steve.capper@arm.com>
Reviewed-by: Steve Capper <steve.capper@arm.com>
Signed-off-by: Will Deacon <will@kernel.org>