We already re-enable interrupts where necessary in the entry code, so
there is no need to do it again in do_page fault. This patch removes
the redundant code.
Acked-by: Catalin Marinas <catalin.marinas@arm.com>
Signed-off-by: Huang Shijie <shijie.huang@arm.com>
Signed-off-by: Will Deacon <will.deacon@arm.com>
When hardware updates of the access and dirty states are enabled, the
default ptep_set_access_flags() implementation based on calling
set_pte_at() directly is potentially racy. This triggers the "racy dirty
state clearing" warning in set_pte_at() because an existing writable PTE
is overridden with a clean entry.
There are two main scenarios for this situation:
1. The CPU getting an access fault does not support hardware updates of
the access/dirty flags. However, a different agent in the system
(e.g. SMMU) can do this, therefore overriding a writable entry with a
clean one could potentially lose the automatically updated dirty
status
2. A more complex situation is possible when all CPUs support hardware
AF/DBM:
a) Initial state: shareable + writable vma and pte_none(pte)
b) Read fault taken by two threads of the same process on different
CPUs
c) CPU0 takes the mmap_sem and proceeds to handling the fault. It
eventually reaches do_set_pte() which sets a writable + clean pte.
CPU0 releases the mmap_sem
d) CPU1 acquires the mmap_sem and proceeds to handle_pte_fault(). The
pte entry it reads is present, writable and clean and it continues
to pte_mkyoung()
e) CPU1 calls ptep_set_access_flags()
If between (d) and (e) the hardware (another CPU) updates the dirty
state (clears PTE_RDONLY), CPU1 will override the PTR_RDONLY bit
marking the entry clean again.
This patch implements an arm64-specific ptep_set_access_flags() function
to perform an atomic update of the PTE flags.
Fixes: 2f4b829c62 ("arm64: Add support for hardware updates of the access and dirty pte bits")
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
Reported-by: Ming Lei <tom.leiming@gmail.com>
Tested-by: Julien Grall <julien.grall@arm.com>
Cc: Will Deacon <will.deacon@arm.com>
Cc: <stable@vger.kernel.org> # 4.3+
[will: reworded comment]
Signed-off-by: Will Deacon <will.deacon@arm.com>
Enable NUMA balancing for arm64 platforms.
Add pte, pmd protnone helpers for use by automatic NUMA balancing.
Reviewed-by: Steve Capper <steve.capper@arm.com>
Reviewed-by: Robert Richter <rrichter@cavium.com>
Signed-off-by: Ganapatrao Kulkarni <gkulkarni@caviumnetworks.com>
Signed-off-by: David Daney <david.daney@cavium.com>
Signed-off-by: Will Deacon <will.deacon@arm.com>
Attempt to get the memory and CPU NUMA node via of_numa. If that
fails, default the dummy NUMA node and map all memory and CPUs to node
0.
Tested-by: Shannon Zhao <shannon.zhao@linaro.org>
Reviewed-by: Robert Richter <rrichter@cavium.com>
Signed-off-by: Ganapatrao Kulkarni <gkulkarni@caviumnetworks.com>
Signed-off-by: David Daney <david.daney@cavium.com>
Signed-off-by: Will Deacon <will.deacon@arm.com>
In order to extract NUMA information from the device tree, we need to
have the tree in its unflattened form.
Move the call to bootmem_init() in the tail of paging_init() into
setup_arch, and adjust header files so that its declaration is
visible.
Move the unflatten_device_tree() call between the calls to
paging_init() and bootmem_init(). Follow on patches add NUMA handling
to bootmem_init().
Signed-off-by: David Daney <david.daney@cavium.com>
Signed-off-by: Will Deacon <will.deacon@arm.com>
With a VHE capable CPU, kernel can run at EL2 and is a decided at early
boot. If some of the CPUs didn't start it EL2 or doesn't have VHE, we
could have CPUs running at different exception levels, all in the same
kernel! This patch adds an early check for the secondary CPUs to detect
such situations.
For each non-boot CPU add a sanity check to make sure we don't have
different run levels w.r.t the boot CPU. We save the information on
whether the boot CPU is running in hyp mode or not and ensure the
remaining CPUs match it.
Cc: Marc Zyngier <marc.zyngier@arm.com>
Cc: Will Deacon <will.deacon@arm.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Catalin Marinas <catalin.marinas@arm.com>
Reviewed-by: Christoffer Dall <christoffer.dall@linaro.org>
Signed-off-by: Suzuki K Poulose <suzuki.poulose@arm.com>
[will: made boot_cpu_hyp_mode static]
Signed-off-by: Will Deacon <will.deacon@arm.com>
During the activation of a secondary CPU, we could report serious
configuration issues and hence request to crash the kernel. We do
this for CPU ASID bit check now. We will need it also for handling
mismatched exception levels for the CPUs with VHE. Hence, add a
helper to do the same for reusability.
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Will Deacon <will.deacon@arm.com>
Cc: Catalin Marinas <catalin.marinas@arm.com>
Cc: Marc Zyngier <marc.zyngier@arm.com>
Signed-off-by: Suzuki K Poulose <suzuki.poulose@arm.com>
Signed-off-by: Will Deacon <will.deacon@arm.com>
With CONFIG_PROVE_LOCKING, CONFIG_DEBUG_LOCKDEP and CONFIG_TRACE_IRQFLAGS
enabled, lockdep will compare current->hardirqs_enabled with the flags from
local_irq_save().
When a debug exception occurs, interrupts are disabled in entry.S, but
lockdep isn't told, resulting in:
DEBUG_LOCKS_WARN_ON(current->hardirqs_enabled)
------------[ cut here ]------------
WARNING: at ../kernel/locking/lockdep.c:3523
Modules linked in:
CPU: 3 PID: 1752 Comm: perf Not tainted 4.5.0-rc4+ #2204
Hardware name: ARM Juno development board (r1) (DT)
task: ffffffc974868000 ti: ffffffc975f40000 task.ti: ffffffc975f40000
PC is at check_flags.part.35+0x17c/0x184
LR is at check_flags.part.35+0x17c/0x184
pc : [<ffffff80080fc93c>] lr : [<ffffff80080fc93c>] pstate: 600003c5
[...]
---[ end trace 74631f9305ef5020 ]---
Call trace:
[<ffffff80080fc93c>] check_flags.part.35+0x17c/0x184
[<ffffff80080ffe30>] lock_acquire+0xa8/0xc4
[<ffffff8008093038>] breakpoint_handler+0x118/0x288
[<ffffff8008082434>] do_debug_exception+0x3c/0xa8
[<ffffff80080854b4>] el1_dbg+0x18/0x6c
[<ffffff80081e82f4>] do_filp_open+0x64/0xdc
[<ffffff80081d6e60>] do_sys_open+0x140/0x204
[<ffffff80081d6f58>] SyS_openat+0x10/0x18
[<ffffff8008085d30>] el0_svc_naked+0x24/0x28
possible reason: unannotated irqs-off.
irq event stamp: 65857
hardirqs last enabled at (65857): [<ffffff80081fb1c0>] lookup_mnt+0xf4/0x1b4
hardirqs last disabled at (65856): [<ffffff80081fb188>] lookup_mnt+0xbc/0x1b4
softirqs last enabled at (65790): [<ffffff80080bdca4>] __do_softirq+0x1f8/0x290
softirqs last disabled at (65757): [<ffffff80080be038>] irq_exit+0x9c/0xd0
This patch adds the annotations to do_debug_exception(), while trying not
to call trace_hardirqs_off() if el1_dbg() interrupted a task that already
had irqs disabled.
Signed-off-by: James Morse <james.morse@arm.com>
Signed-off-by: Will Deacon <will.deacon@arm.com>
Since commit 1cf4f629d9 ("cpu/hotplug: Move online calls to
hotplugged cpu") it is ensured that callbacks of CPU_ONLINE and
CPU_DOWN_PREPARE are processed on the hotplugged CPU. Due to this SMP
function calls are no longer required.
Replace smp_call_function_single() with a direct call of
hw_breakpoint_reset(). To keep the calling convention, interrupts are
explicitly disabled around the call.
Cc: Catalin Marinas <catalin.marinas@arm.com>
Cc: Will Deacon <will.deacon@arm.com>
Cc: linux-arm-kernel@lists.infradead.org
Signed-off-by: Anna-Maria Gleixner <anna-maria@linutronix.de>
Signed-off-by: Will Deacon <will.deacon@arm.com>
Since commit 1cf4f629d9 ("cpu/hotplug: Move online calls to
hotplugged cpu") it is ensured that callbacks of CPU_ONLINE and
CPU_DOWN_PREPARE are processed on the hotplugged CPU. Due to this SMP
function calls are no longer required.
Replace smp_call_function_single() with a direct call to
clear_os_lock(). The function writes the OSLAR register to clear OS
locking. This does not require to be called with interrupts disabled,
therefore the smp_call_function_single() calling convention is not
preserved.
Cc: Catalin Marinas <catalin.marinas@arm.com>
Cc: Will Deacon <will.deacon@arm.com>
Cc: linux-arm-kernel@lists.infradead.org
Signed-off-by: Anna-Maria Gleixner <anna-maria@linutronix.de>
Signed-off-by: Will Deacon <will.deacon@arm.com>
The mapping of the kernel consist of four segments, each of which is mapped
with different permission attributes and/or lifetimes. To optimize the TLB
and translation table footprint, we define various opaque constants in the
linker script that resolve to different aligment values depending on the
page size and whether CONFIG_DEBUG_ALIGN_RODATA is set.
Considering that
- a 4 KB granule kernel benefits from a 64 KB segment alignment (due to
the fact that it allows the use of the contiguous bit),
- the minimum alignment of the .data segment is THREAD_SIZE already, not
PAGE_SIZE (i.e., we already have padding between _data and the start of
the .data payload in many cases),
- 2 MB is a suitable alignment value on all granule sizes, either for
mapping directly (level 2 on 4 KB), or via the contiguous bit (level 3 on
16 KB and 64 KB),
- anything beyond 2 MB exceeds the minimum alignment mandated by the boot
protocol, and can only be mapped efficiently if the physical alignment
happens to be the same,
we can simplify this by standardizing on 64 KB (or 2 MB) explicitly, i.e.,
regardless of granule size, all segments are aligned either to 64 KB, or to
2 MB if CONFIG_DEBUG_ALIGN_RODATA=y. This also means we can drop the Kconfig
dependency of CONFIG_DEBUG_ALIGN_RODATA on CONFIG_ARM64_4K_PAGES.
Signed-off-by: Ard Biesheuvel <ard.biesheuvel@linaro.org>
Signed-off-by: Will Deacon <will.deacon@arm.com>
Keeping .head.text out of the .text mapping buys us very little: its actual
payload is only 4 KB, most of which is padding, but the page alignment may
add up to 2 MB (in case of CONFIG_DEBUG_ALIGN_RODATA=y) of additional
padding to the uncompressed kernel Image.
Also, on 4 KB granule kernels, the 4 KB misalignment of .text forces us to
map the adjacent 56 KB of code without the PTE_CONT attribute, and since
this region contains things like the vector table and the GIC interrupt
handling entry point, this region is likely to benefit from the reduced TLB
pressure that results from PTE_CONT mappings.
So remove the alignment between the .head.text and .text sections, and use
the [_text, _etext) rather than the [_stext, _etext) interval for mapping
the .text segment.
Signed-off-by: Ard Biesheuvel <ard.biesheuvel@linaro.org>
Signed-off-by: Will Deacon <will.deacon@arm.com>
Apart from the arm64/linux and EFI header data structures, there is nothing
in the .head.text section that must reside at the beginning of the Image.
So let's move it to the .init section where it belongs.
Note that this involves some minor tweaking of the EFI header, primarily
because the address of 'stext' no longer coincides with the start of the
.text section. It also requires a couple of relocated symbol references
to be slightly rewritten or their definition moved to the linker script.
Signed-off-by: Ard Biesheuvel <ard.biesheuvel@linaro.org>
Signed-off-by: Will Deacon <will.deacon@arm.com>
Replace the poorly defined term chunk with segment, which is a term that is
already used by the ELF spec to describe contiguous mappings with the same
permission attributes of statically allocated ranges of an executable.
Acked-by: Mark Rutland <mark.rutland@arm.com>
Signed-off-by: Ard Biesheuvel <ard.biesheuvel@linaro.org>
Signed-off-by: Will Deacon <will.deacon@arm.com>
Now that the vmemmap region has been redefined to cover the linear region
rather than the entire physical address space, we no longer need to
perform a virtual-to-physical translation in the implementaion of
virt_to_page(). This restricts virt_to_page() translations to the linear
region, so redefine virt_addr_valid() as well.
Signed-off-by: Ard Biesheuvel <ard.biesheuvel@linaro.org>
Signed-off-by: Will Deacon <will.deacon@arm.com>
This moves the vmemmap region right below PAGE_OFFSET, aka the start
of the linear region, and redefines its size to be a power of two.
Due to the placement of PAGE_OFFSET in the middle of the address space,
whose size is a power of two as well, this guarantees that virt to
page conversions and vice versa can be implemented efficiently, by
masking and shifting rather than ordinary arithmetic.
Signed-off-by: Ard Biesheuvel <ard.biesheuvel@linaro.org>
Signed-off-by: Will Deacon <will.deacon@arm.com>
Before restricting virt_to_page() to the linear mapping, ensure that
the text patching code does not use it to resolve references into the
core kernel text, which is mapped in the vmalloc area.
Signed-off-by: Ard Biesheuvel <ard.biesheuvel@linaro.org>
Signed-off-by: Will Deacon <will.deacon@arm.com>
The zero page is statically allocated, so grab its struct page pointer
without using virt_to_page(), which will be restricted to the linear
mapping later.
Signed-off-by: Ard Biesheuvel <ard.biesheuvel@linaro.org>
Signed-off-by: Will Deacon <will.deacon@arm.com>
The implementation of free_initmem_default() expects __init_begin
and __init_end to be covered by the linear mapping, which is no
longer the case. So open code it instead, using addresses that are
explicitly translated from kernel virtual to linear virtual.
Signed-off-by: Ard Biesheuvel <ard.biesheuvel@linaro.org>
Signed-off-by: Will Deacon <will.deacon@arm.com>
The translation performed by virt_to_page() is only valid for linear
addresses, and kernel symbols are no longer in the linear mapping.
So perform the __pa() translation explicitly, which does the right
thing in either case, and only then translate to a struct page offset.
Signed-off-by: Ard Biesheuvel <ard.biesheuvel@linaro.org>
Signed-off-by: Will Deacon <will.deacon@arm.com>
This removes the relocate_initrd() implementation and invocation, which are
no longer needed now that the placement of the initrd is guaranteed to be
covered by the linear mapping.
Signed-off-by: Ard Biesheuvel <ard.biesheuvel@linaro.org>
Signed-off-by: Will Deacon <will.deacon@arm.com>
Instead of going out of our way to relocate the initrd if it turns out
to occupy memory that is not covered by the linear mapping, just add the
initrd to the linear mapping. This puts the burden on the bootloader to
pass initrd= and mem= options that are mutually consistent.
Note that, since the placement of the linear region in the PA space is
also dependent on the placement of the kernel Image, which may reside
anywhere in memory, we may still end up with a situation where the initrd
and the kernel Image are simply too far apart to be covered by the linear
region.
Since we now leave it up to the bootloader to pass the initrd in memory
that is guaranteed to be accessible by the kernel, add a mention of this to
the arm64 boot protocol specification as well.
Signed-off-by: Ard Biesheuvel <ard.biesheuvel@linaro.org>
Signed-off-by: Will Deacon <will.deacon@arm.com>
This reverts commit 36e5cd6b89, since the
section alignment is now guaranteed by construction when choosing the
value of memstart_addr.
Signed-off-by: Ard Biesheuvel <ard.biesheuvel@linaro.org>
Signed-off-by: Will Deacon <will.deacon@arm.com>
This redefines ARM64_MEMSTART_ALIGN in terms of the minimal alignment
required by sparsemem vmemmap. This comes down to using 1 GB for all
translation granules if CONFIG_SPARSEMEM_VMEMMAP is enabled.
Signed-off-by: Ard Biesheuvel <ard.biesheuvel@linaro.org>
Signed-off-by: Will Deacon <will.deacon@arm.com>
After choosing memstart_addr to be the highest multiple of
ARM64_MEMSTART_ALIGN less than or equal to the first usable physical memory
address, we clip the memblocks to the maximum size of the linear region.
Since the kernel may be high up in memory, we take care not to clip the
kernel itself, which means we have to clip some memory from the bottom if
this occurs, to ensure that the distance between the first and the last
usable physical memory address can be covered by the linear region.
However, we fail to update memstart_addr if this clipping from the bottom
occurs, which means that we may still end up with virtual addresses that
wrap into the userland range. So increment memstart_addr as appropriate to
prevent this from happening.
Signed-off-by: Ard Biesheuvel <ard.biesheuvel@linaro.org>
Signed-off-by: Will Deacon <will.deacon@arm.com>
Currently, we check two pointers: cpu_ops and cpu_suspend on every idle
state entry. These pointers check can be avoided:
If cpu_ops has not been registered, arm_cpuidle_init() will return
-EOPNOTSUPP, so arm_cpuidle_suspend() will never have chance to
run. In other word, the cpu_ops check can be avoid.
Similarly, the cpu_suspend check could be avoided in this hot path by
moving it into arm_cpuidle_init().
I measured the 4096 * time from arm_cpuidle_suspend entry point to the
cpu_psci_cpu_suspend entry point. HW platform is Marvell BG4CT STB
board.
1. only one shell, no other process, hot-unplug secondary cpus, execute
the following cmd
while true
do
sleep 0.2
done
before the patch: 1581220ns
after the patch: 1579630ns
reduced by 0.1%
2. only one shell, no other process, hot-unplug secondary cpus, execute
the following cmd
while true
do
md5sum /tmp/testfile
sleep 0.2
done
NOTE: the testfile size should be larger than L1+L2 cache size
before the patch: 1961960ns
after the patch: 1912500ns
reduced by 2.5%
So the more complex the system load, the bigger the improvement.
Signed-off-by: Jisheng Zhang <jszhang@marvell.com>
Acked-by: Lorenzo Pieralisi <lorenzo.pieralisi@arm.com>
Signed-off-by: Will Deacon <will.deacon@arm.com>
There are some new cpu features which can be identified by id_aa64mmfr2,
this patch appends all fields of it.
Signed-off-by: Kefeng Wang <wangkefeng.wang@huawei.com>
Signed-off-by: Will Deacon <will.deacon@arm.com>
others are usual stable material.
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v2.0.22 (GNU/Linux)
iQEcBAABAgAGBQJXA8x6AAoJEL/70l94x66D0x8H/RcBnc75994RQ++WmHSvD9GF
yruGB8soLDdjX+Oceol0aEPHokrBu3JtcdoTBe0GwbCKV/F5NkQZ4EfLxDtR3tte
7ILkPULLy5GElFpJNQuT4pmXzTEspFvXpqHhFik7WVBga3W9wMFQcjbrgmGBUzLE
p2aJVhZyErpKxGFkUYWhDnlqWsguTTIzv/pqNhLY4VVc0UrXN9AA0fq9RkvgU3KS
Hxk4/A6SV/b7dyzvttzITww0f1iu8FmlLj2TXapIEoOz7AnInD6KIN0RYpxbDjxN
bEzEfpahUtuDeM87/t2kHEj0Gn09iHK7/BbCC1Hrwo1CQhbAQ/D0GIvqYAQixf4=
=NugZ
-----END PGP SIGNATURE-----
Merge tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm
Pull KVM fixes from Paolo Bonzini:
"Miscellaneous bugfixes.
The ARM and s390 fixes are for new regressions from the merge window,
others are usual stable material"
* tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm:
compiler-gcc: disable -ftracer for __noclone functions
kvm: x86: make lapic hrtimer pinned
s390/mm/kvm: fix mis-merge in gmap handling
kvm: set page dirty only if page has been writable
KVM: x86: reduce default value of halt_poll_ns parameter
KVM: Hyper-V: do not do hypercall userspace exits if SynIC is disabled
KVM: x86: Inject pending interrupt even if pending nmi exist
arm64: KVM: Register CPU notifiers when the kernel runs at HYP
arm64: kvm: 4.6-rc1: Fix VTCR_EL2 VS setting
When we detect support for 16bit VMID in ID_AA64MMFR1, we set the
VTCR_EL2_VS field to 1 to make use of 16bit vmids. But, with
commit 3a3604bc5e ("arm64: KVM: Switch to C-based stage2 init")
this is broken and we corrupt VTCR_EL2:T0SZ instead of updating the VS
field. VTCR_EL2_VS was actually defined to the field shift (19) and
not the real value for VS. This patch fixes the issue.
Fixes: commit 3a3604bc5e ("arm64: KVM: Switch to C-based stage2 init")
Cc: Christoffer Dall <christoffer.dall@linaro.org>
Cc: Mark Rutland <mark.rutland@arm.com>
Acked-by: Marc Zyngier <marc.zyngier@arm.com>
Signed-off-by: Suzuki K Poulose <suzuki.poulose@arm.com>
Signed-off-by: Christoffer Dall <christoffer.dall@linaro.org>
A few defconfig updates got dropped on the floor during the merge window,
so I've rounded up the remainder here:
* Fix duplicate definition of MMC_BLOCK_MINORS and bump to 32 for
msm8916
* CPUFreq support for the Juno platform, using the MHU/SCPI interface
* Removal of the default command line, which assumed a console called
ttyAMA0
* Bits and pieces for the Hi6220 (96Boards HiKey)
Signed-off-by: Will Deacon <will.deacon@arm.com>
To use the ARMv8 PMU related register defines from the KVM code, we move
the relevant definitions to asm/perf_event.h header file and rename them
with prefix ARMV8_PMU_. This allows us to get rid of kvm_perf_event.h.
Signed-off-by: Anup Patel <anup.patel@linaro.org>
Signed-off-by: Shannon Zhao <shannon.zhao@linaro.org>
Acked-by: Marc Zyngier <marc.zyngier@arm.com>
Reviewed-by: Andrew Jones <drjones@redhat.com>
Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
Signed-off-by: Will Deacon <will.deacon@arm.com>
arm and arm64 use different config options to specify big endian. This
needs taking into account when including code/headers between the two
architectures.
A case in point is PAN, which uses the __instr_arm() macro to output
instructions. The macro comes from opcodes.h, which lives under arch/arm.
On a big-endian build the mismatched config options mean the instruction
isn't byte swapped correctly, resulting in undefined instruction exceptions
during boot:
| alternatives: patching kernel code
| kdevtmpfs[87]: undefined instruction: pc=ffffffc0004505b4
| kdevtmpfs[87]: undefined instruction: pc=ffffffc00076231c
| kdevtmpfs[87]: undefined instruction: pc=ffffffc00076231c
| kdevtmpfs[87]: undefined instruction: pc=ffffffc00076231c
| kdevtmpfs[87]: undefined instruction: pc=ffffffc00076231c
| kdevtmpfs[87]: undefined instruction: pc=ffffffc00076231c
| kdevtmpfs[87]: undefined instruction: pc=ffffffc00076231c
| kdevtmpfs[87]: undefined instruction: pc=ffffffc00076231c
| kdevtmpfs[87]: undefined instruction: pc=ffffffc00076231c
| kdevtmpfs[87]: undefined instruction: pc=ffffffc00076231c
| Internal error: Oops - undefined instruction: 0 [#1] SMP
| Modules linked in:
| CPU: 0 PID: 87 Comm: kdevtmpfs Not tainted 4.1.16+ #5
| Hardware name: Hisilicon PhosphorHi1382 EVB (DT)
| task: ffffffc336591700 ti: ffffffc3365a4000 task.ti: ffffffc3365a4000
| PC is at dump_instr+0x68/0x100
| LR is at do_undefinstr+0x1d4/0x2a4
| pc : [<ffffffc00076231c>] lr : [<ffffffc0000811d4>] pstate: 604001c5
| sp : ffffffc3365a6450
Cc: <stable@vger.kernel.org> #4.3.x-
Reported-by: Hanjun Guo <guohanjun@huawei.com>
Tested-by: Xuefeng Wang <wxf.wang@hisilicon.com>
Signed-off-by: James Morse <james.morse@arm.com>
Signed-off-by: Will Deacon <will.deacon@arm.com>
KASAN needs to know whether the allocation happens in an IRQ handler.
This lets us strip everything below the IRQ entry point to reduce the
number of unique stack traces needed to be stored.
Move the definition of __irq_entry to <linux/interrupt.h> so that the
users don't need to pull in <linux/ftrace.h>. Also introduce the
__softirq_entry macro which is similar to __irq_entry, but puts the
corresponding functions to the .softirqentry.text section.
Signed-off-by: Alexander Potapenko <glider@google.com>
Acked-by: Steven Rostedt <rostedt@goodmis.org>
Cc: Christoph Lameter <cl@linux.com>
Cc: Pekka Enberg <penberg@kernel.org>
Cc: David Rientjes <rientjes@google.com>
Cc: Joonsoo Kim <iamjoonsoo.kim@lge.com>
Cc: Andrey Konovalov <adech.fo@gmail.com>
Cc: Dmitry Vyukov <dvyukov@google.com>
Cc: Andrey Ryabinin <ryabinin.a.a@gmail.com>
Cc: Konstantin Serebryany <kcc@google.com>
Cc: Dmitry Chernenkov <dmitryc@google.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Here are some final updates for ARM SoC specific dts files:
* The i.MX changes were sent relatively late, and had a dependency
on the clk tree, so I delayed that a bit. Support for the new
i.MX6qp SoC and a couple of new boards is added in this branch.
* Uniphier renames a few files to match the final product names
that were decided by the company, kudos to the kernel developer(s)
for getting support upstream before the product release.
Also two boards are added. The patches were posted early enough
and nice overall, but we forgot to apply them and decided to
give it some more time in linux-next
* at91 has two small bug fixes.
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1
iQIVAwUAVvQeGGCrR//JCVInAQLZSw//fuKWje/vigu4a8M8zQ9O3gN/8Elebq8E
OTY73jjpUfsJsRrqLPt9vmhrMsZ4raOIvG47B3ptwzvlV5ToRnlKEyyaYe200oxZ
5x+3i1jg20XRfTXwgusnENI+nqFO32VDaqSpYhUk6b1603RSzjFi+e0SydJaV78a
3GDDe9Y9lB7e5U/0BWBltPgKTQLrjyk7F32I33fcj3RInVi2NrmijXWic1eersyr
U0wHThKVGEHQSid+ZlSYZt0JnzotqCOpA+Pj3SrVCFdA9nOXT1lz1RTqSgNLhjfO
oXBKUWy0Ld1Ayjplg55Z7+QzOnn/JttHtumFYYu2OZcbSGr2AGWmKixR3j9Fvo8Y
X1xdo8eObMhxOrJIejy4NSC+xq/9Z7ur5mRcSMtfkmB1osirZrU9gevu6sBzV1Ha
ea3wFDaoXmmIjA0d5rQirR4XhDV7zs0rfbLPJd6Av6MuTw/hF6VYpEyKVUUzOqld
+jpDKweJhI64wKZ5oQ6AahCPV9LoQrTMk8ElJX07ndLSnYXAXz8B44O4It5b13fH
3UiWPX1tOV1HIed6z8zWUp77N5C5SeyNPtMcdvusf5/hS6gopgxGN3mqKmOUMLeO
iwX2krBpuXWj+3T/FjqSFdUrHAzbmjTIgqv70dreqj4DTkmeYr9BoInjWtzgZ6c0
bUPwJ93kFDc=
=p5al
-----END PGP SIGNATURE-----
Merge tag 'armsoc-dt2' of git://git.kernel.org/pub/scm/linux/kernel/git/arm/arm-soc
Pull more ARM DT changes from Arnd Bergmann:
"Here are some final updates for ARM SoC specific dts files:
- The i.MX changes were sent relatively late, and had a dependency on
the clk tree, so I delayed that a bit. Support for the new i.MX6qp
SoC and a couple of new boards is added in this branch.
- Uniphier renames a few files to match the final product names that
were decided by the company, kudos to the kernel developer(s) for
getting support upstream before the product release. Also two
boards are added. The patches were posted early enough and nice
overall, but we forgot to apply them and decided to give it some
more time in linux-next
- at91 has two small bug fixes"
* tag 'armsoc-dt2' of git://git.kernel.org/pub/scm/linux/kernel/git/arm/arm-soc: (83 commits)
ARM: dts: at91: sama5d4 Xplained: don't disable hsmci regulator
ARM: dts: at91: sama5d3 Xplained: don't disable hsmci regulator
ARM: dts: uniphier: add pinmux node for I2C ch4
ARM: dts: uniphier: add @{address} to EEPROM node
ARM: dts: uniphier: add PH1-Pro4 Sanji board support
ARM: dts: uniphier: add PH1-Pro4 Ace board support
ARM: dts: uniphier: enable I2C channel 2 of ProXstream2 Gentil board
ARM: dts: uniphier: add EEPROM node for ProXstream2 Gentil board
ARM: dts: uniphier: add reference clock nodes
ARM: dts: uniphier: rework UniPhier System Bus nodes
ARM: dts: uniphier: factor out ranges property of support card
arm64: dts: uniphier: rename PH1-LD10 to PH1-LD20
ARM: dts: imx53-qsb: Fix gpio button polarity
ARM: dts: vfxxx: Add DAC node for Vybrid SoC
ARM: dts: imx6q: add missing links between ipu2 and mipi dsi
ARM: dts: imx: Add support for Advantech/GE B850v3
ARM: dts: imx: Add support for Advantech/GE B650v3
ARM: dts: imx: Add support for Advantech/GE B450v3
ARM: dts: imx: Add support for Advantech/GE Bx50v3
ARM: dts: imx: Add Advantech BA-16 Qseven module
...
Pull irq fixes from Thomas Gleixner:
"A small set of fixes for the usual ARM/SOC irqchip drivers
- A set of fixes for mbigen to handle multiple devices in a hardware
module proper
- A cleanup for the mbigen config option which was pointlessly user
configurable.
- A cleanup for tegra replacing open coded functionality by the
proper core function
The config cleanup touches arch/arm64/Kconfig.platforms to select the
irq chip for the related platform"
* 'irq-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
irqchip/mbigen: Make CONFIG_HISILICON_IRQ_MBIGEN a hidden option
ARM64: Kconfig: Select mbigen interrupt controller on Hisilicon platform
irqchip/mbigen: Handle multiple device nodes in a mbigen module
irqchip/mbigen: Adjust DT bindings to handle multiple devices in a module
irqchip/tegra: Switch to use irq_domain_free_irqs_common
Currently we disable preemption in copy_to_user_page; a behaviour that
we inherited from the 32-bit arm code. This was necessary for older
cores without broadcast data cache maintenance, and ensured that cache
lines were dirtied and cleaned by the same CPU. On these systems dirty
cache line migration was not possible, so this was sufficient to
guarantee coherency.
On contemporary systems, cache coherence protocols permit (dirty) cache
lines to migrate between CPUs as a result of speculation, prefetching,
and other behaviours. To account for this, in ARMv8 data cache
maintenance operations are broadcast and affect all data caches in the
domain associated with the VA (i.e. ISH for kernel and user mappings).
In __switch_to we ensure that tasks can be safely migrated in the middle
of a maintenance sequence, using a dsb(ish) to ensure prior explicit
memory accesses are observed and cache maintenance operations are
completed before a task can be run on another CPU.
Given the above, it is not necessary to disable preemption in
copy_to_user_page. This patch removes the preempt_{disable,enable}
calls, permitting preemption.
Signed-off-by: Mark Rutland <mark.rutland@arm.com>
Cc: Will Deacon <will.deacon@arm.com>
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
Commit 324420bf91 ("arm64: add support for ioremap() block
mappings") added new p?d_set_huge functions which do the hard work to
generate and set a correct block entry.
These differ from open-coded huge page creation in the early page table
code by explicitly setting the P?D_TYPE_SECT bits (which are implicitly
retained by mk_sect_prot() for any valid prot), but are otherwise
identical (and cannot fail on arm64).
For simplicity and consistency, make use of these in the initial page
table creation code.
Signed-off-by: Mark Rutland <mark.rutland@arm.com>
Cc: Ard Biesheuvel <ard.biesheuvel@linaro.org>
Cc: Will Deacon <will.deacon@arm.com>
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
The KASLR code incorrectly expects the contents of x18 to be preserved
across a call into C code, and uses it to stash the contents of SCTLR_EL1
before enabling the MMU. If the MMU needs to be disabled again to create
the randomized kernel mapping, x18 is written back to SCTLR_EL1, which is
likely to crash the system if x18 has been clobbered by kasan_early_init()
or kaslr_early_init(). So use x22 instead, which is not in use so far in
head.S
Signed-off-by: Ard Biesheuvel <ard.biesheuvel@linaro.org>
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
* Miscellaneous bugfixes for ARM KVM
* Cleanup of memory barrier and removal of redundant barriers
* x86 fixes: page tracking oops, support for old buggy KVM nested on 4.5
* Support for protection keys in guests
* Lockdep fix
* Another conversion to simple wait queues and raw spinlocks,
backported from PREEMPT_RT
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v2.0.22 (GNU/Linux)
iQEcBAABAgAGBQJW8aaXAAoJEL/70l94x66D7voH/i2ytj6PbuWQQobSKDY38x8F
MHDFJ5UgTFZPPt8cB8YiCl6Tu0C5I2mNOk0rfb+bcpM5C1U9IAnBbbupyUblp6K9
1u+u+al8IlnOsoLzJSUXKDK5H4mEVrUnwVxTpZol5Ph5qQ8FvpbkxboMu3AGevO5
PIUXucK7fP5WXVV3Nh4YnUnBkeYzuuXcqYcV/TjscNQ4NcMofElcgpBxmE498TTk
rvhyuf2chEtY2DsDh3nzeYgxcGLpvE4/l5a+puEoOx4M5CH24wwne9LHAWJz6ofm
H3XNhsCz3jIGmrNqqkGUUya5qSkCsq2ha7n+VDw+fiP1TKy3FtkrBYQrDj+ISuc=
=UtvF
-----END PGP SIGNATURE-----
Merge tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm
Pull more KVM updates from Paolo Bonzini:
"Second round of KVM changes for 4.6:
- build fixes for PPC KVM
- miscellaneous bugfixes for ARM KVM
- cleanup of memory barrier and removal of redundant barriers
- x86 fixes: page tracking oops, support for old buggy KVM nested on 4.5
- support for protection keys in guests
- lockdep fix
- another conversion to simple wait queues and raw spinlocks,
backported from PREEMPT_RT"
* tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm: (27 commits)
KVM: page_track: fix access to NULL slot
KVM: PPC: do not compile in vfio.o unconditionally
kvm, rt: change async pagefault code locking for PREEMPT_RT
KVM/PPC: update the comment of memory barrier in the kvmppc_prepare_to_enter()
KVM/x86: update the comment of memory barrier in the vcpu_enter_guest()
KVM: Replace smp_mb() with smp_load_acquire() in the kvm_flush_remote_tlbs()
KVM/x86: Call smp_wmb() before increasing tlbs_dirty
KVM: Replace smp_mb() with smp_mb_after_atomic() in the kvm_make_all_cpus_request()
KVM/x86: Replace smp_mb() with smp_store_mb/release() in the walk_shadow_page_lockless_begin/end()
KVM: Remove redundant smp_mb() in the kvm_mmu_commit_zap_page()
KVM, pkeys: expose CPUID/CR4 to guest
KVM, pkeys: add pkeys support for permission_fault
KVM, pkeys: introduce pkru_mask to cache conditions
KVM, pkeys: save/restore PKRU when guest/host switches
x86: pkey: introduce write_pkru() for KVM
KVM, pkeys: add pkeys support for xsave state
KVM, pkeys: disable pkeys for guests in non-paging mode
KVM: x86: remove magic number with enum cpuid_leafs
KVM: MMU: return page fault error code from permission_fault
KVM: fix spin_lock_init order on x86
...
This time with:
* Updates for the Exynos IOMMU driver to make use of default
domains and to add support for the SYSMMU v5
* New Mediatek IOMMU driver
* Support for the ARMv7 short descriptor format in the
io-pgtable code
* Default domain support for the ARM SMMU
* Couple of other small fixes all over the place
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v2.0.22 (GNU/Linux)
iQIcBAABAgAGBQJW7/7rAAoJECvwRC2XARrjKvgP/2sgR6lzIGksKpZRNNNoyJEp
PbFt3zxBvIPYow6rQtfMqU82FAi6psq+EVKq+M0EOeJrjFGawwWpN9H/e0ZCs5Z/
/s6DIljRFKrbty59eFsHn57Pd+302Pt0GkwnSgdgBJD7FimozyyeMJnAOs5gPjYT
jF2ajV9FYa5rIRrMsSD2KjLKgBb3xVsgUlW72NU2WwldnOB6fSsfg4ll01kbzTon
IQENT5ywk9zZFouLyrX6EvcvowHslO/sZhGe3Py9qOOHpu9roW7EE7rEGYdabn47
PGpw8O5NOeSrQNzlmhXje5tuKxkh33DV55s7vVcaOy66kWbYExJGoz1/V7Vju4n1
pok82L3N8eauMs3xqNOiQMV8UsWIXOzdMMaGypM18pCVKMaAUiz9vO9rLSmR4Z20
IYFiX0yBXhc1AXMnrRlq/xR2WjBX2L2s0VguvYoSssdmJUZ9aKYxsurF8Ylqpm+1
wymOj+gjM056DqAXcYBVg4ZPOEezRjnUe2qD8lZ4et3xOVUL3LXRi8FmacztEB97
chUSB5mur/XRy6bOVI2l1uRQaqdfErgbCey0fa9N/SWKSHKWtAfR6CYYVpoR6m0L
H/xL7yCn6jUEoadKxZyTKnX8GIN6wNcZdI+58OOMz3sjlmWs69wgdPt8Xx2RNHpm
7caf/9sTdpUeh+V2fySD
=uiAk
-----END PGP SIGNATURE-----
Merge tag 'iommu-updates-v4.6' of git://git.kernel.org/pub/scm/linux/kernel/git/joro/iommu
Pull IOMMU updates from Joerg Roedel:
- updates for the Exynos IOMMU driver to make use of default domains
and to add support for the SYSMMU v5
- new Mediatek IOMMU driver
- support for the ARMv7 short descriptor format in the io-pgtable code
- default domain support for the ARM SMMU
- couple of other small fixes all over the place
* tag 'iommu-updates-v4.6' of git://git.kernel.org/pub/scm/linux/kernel/git/joro/iommu: (41 commits)
iommu/ipmmu-vmsa: Add r8a7795 DT binding
iommu/mediatek: Check for NULL instead of IS_ERR()
iommu/io-pgtable-armv7s: Fix kmem_cache_alloc() flags
iommu/mediatek: Fix handling of of_count_phandle_with_args result
iommu/dma: Fix NEED_SG_DMA_LENGTH dependency
iommu/mediatek: Mark PM functions as __maybe_unused
iommu/mediatek: Select ARM_DMA_USE_IOMMU
iommu/exynos: Use proper readl/writel register interface
iommu/exynos: Pointers are nto physical addresses
dts: mt8173: Add iommu/smi nodes for mt8173
iommu/mediatek: Add mt8173 IOMMU driver
memory: mediatek: Add SMI driver
dt-bindings: mediatek: Add smi dts binding
dt-bindings: iommu: Add binding for mediatek IOMMU
iommu/ipmmu-vmsa: Use ARCH_RENESAS
iommu/exynos: Support multiple attach_device calls
iommu/exynos: Add Maintainers entry for Exynos SYSMMU driver
iommu/exynos: Add support for v5 SYSMMU
iommu/exynos: Update device tree documentation
iommu/exynos: Add support for SYSMMU controller with bogus version reg
...
- Initial support for ARMv8.1 CPU PMUs
- Support for the CPU PMU in Cavium ThunderX
- CPU PMU support for systems running 32-bit Linux in secure mode
- Support for the system PMU in ARM CCI-550 (Cache Coherent Interconnect)
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1
iQEcBAABCgAGBQJW794rAAoJELescNyEwWM0O5IH/0ejoUjip3n4dFZnSzAbQQZe
VxCy3DXW5gS8YaswwX2dFw9K772/BpHlazq8AIJIhaR+b+Zzl5t0iOc12HluDilV
pMvi0JTCxwJhsEiKZnP0cVAU9HM6MAgtMOEegkd/YNESKQey30NeDtIcz/pQfTUV
28AF71+w5VPj/1EpHEEhHQsASRIx7eDbKzThzdlb8PnDS0o23QJhL9HjVTNIAlB8
BGxrUBKtBu0eH2Hx33vNjc7UYx1WZQlCk5cAaXevA8mbFXzYaMQI2Cel2nbNMO9i
eu5zPkDUCG7dq16PxK6IgM4AsDCtmmDuckLdN6UEQWYxkLbb2qHNRKtj0bKB8Sk=
=E4PE
-----END PGP SIGNATURE-----
Merge tag 'arm64-perf' of git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux
Pull arm[64] perf updates from Will Deacon:
"I have another mixed bag of ARM-related perf patches here.
It's about 25% CPU and 75% interconnect, but with drivers/bus/
languishing without an obvious maintainer or tree, Olof and I agreed
to keep all of these PMU patches together. I suspect a whole load of
code from drivers/bus/arm-* can be moved under drivers/perf/, so
that's on the radar for the future.
Summary:
- Initial support for ARMv8.1 CPU PMUs
- Support for the CPU PMU in Cavium ThunderX
- CPU PMU support for systems running 32-bit Linux in secure mode
- Support for the system PMU in ARM CCI-550 (Cache Coherent Interconnect)"
* tag 'arm64-perf' of git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux: (26 commits)
drivers/perf: arm_pmu: avoid NULL dereference when not using devicetree
arm64: perf: Extend ARMV8_EVTYPE_MASK to include PMCR.LC
arm-cci: remove unused variable
arm-cci: don't return value from void function
arm-cci: make private functions static
arm-cci: CoreLink CCI-550 PMU driver
arm-cci500: Rearrange PMU driver for code sharing with CCI-550 PMU
arm-cci: CCI-500: Work around PMU counter writes
arm-cci: Provide hook for writing to PMU counters
arm-cci: Add helper to enable PMU without synchornising counters
arm-cci: Add routines to save/restore all counters
arm-cci: Get the status of a counter
arm-cci: write_counter: Remove redundant check
arm-cci: Delay PMU counter writes to pmu::pmu_enable
arm-cci: Refactor CCI PMU enable/disable methods
arm-cci: Group writes to counter
arm-cci: fix handling cpumask_any_but return value
arm-cci: simplify sysfs attr handling
drivers/perf: arm_pmu: implement CPU_PM notifier
arm64: dts: Add Cavium ThunderX specific PMU
...
With the recent rewrite of the arm64 KVM hypervisor code in C, enabling
certain options like KASAN would allow the compiler to generate memory
accesses or function calls to addresses not mapped at EL2. This patch
disables the compiler instrumentation on the arm64 hypervisor code for
gcov-based profiling (GCOV_KERNEL), undefined behaviour sanity checker
(UBSAN) and kernel address sanitizer (KASAN).
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
Cc: Christoffer Dall <christoffer.dall@linaro.org>
Cc: Marc Zyngier <marc.zyngier@arm.com>
Cc: Paolo Bonzini <pbonzini@redhat.com>
Cc: <stable@vger.kernel.org> # 4.5+
Signed-off-by: Christoffer Dall <christoffer.dall@linaro.org>
The printk() implementation has a limit of LOG_LINE_MAX (== 1024 - 32)
buffer per call which the arm64 mem_init() breaches when printing the
virtual memory layout with CONFIG_KASAN enabled. The result is that the
last line is no longer printed. This patch splits the call into a
pr_notice() + additional pr_cont() calls.
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
Acked-by: Mark Rutland <mark.rutland@arm.com>
After commit 65da0a8e34 ("arm64: use non-global mappings for UEFI
runtime regions"), nobody use __local_flush_icache_all() anymore,
so drop it.
Signed-off-by: Kefeng Wang <wangkefeng.wang@huawei.com>
Acked-by: Mark Rutland <mark.rutland@arm.com>
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
Commit f80fb3a3d5 ("arm64: add support for kernel ASLR") missed a
DSB necessary to complete I-cache maintenance in the primary boot path,
and hence stale instructions may still be present in the I-cache and may
be executed until the I-cache maintenance naturally completes.
Since commit 8ec4198743 ("arm64: mm: ensure patched kernel text is
fetched from PoU"), all CPUs invalidate their I-caches after their MMU
is enabled. Prior a CPU's MMU having been enabled, arbitrary lines may
have been fetched from the PoC into I-caches. We never patch text
expected to be executed with the MMU off. Thus, it is unnecessary to
perform broadcast I-cache maintenance in the primary boot path.
This patch reduces the scope of the I-cache maintenance to the local
CPU, and adds the missing DSB with similar scope, matching prior
maintenance in the primary boot path.
Signed-off-by: Mark Rutland <mark.rutland@arm.com>
Acked-by: Ard Biesehvuel <ard.biesheuvel@linaro.org>
Cc: Will Deacon <will.deacon@arm.com>
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
The implementation of macro inv_entry refers to its 'el' argument without
the required leading backslash, which results in an undefined symbol
'el' to be passed into the kernel_entry macro rather than the index of
the exception level as intended.
This undefined symbol strangely enough does not result in build failures,
although it is visible in vmlinux:
$ nm -n vmlinux |head
U el
0000000000000000 A _kernel_flags_le_hi32
0000000000000000 A _kernel_offset_le_hi32
0000000000000000 A _kernel_size_le_hi32
000000000000000a A _kernel_flags_le_lo32
.....
However, it does result in incorrect code being generated for invalid
exceptions taken from EL0, since the argument check in kernel_entry
assumes EL1 if its argument does not equal '0'.
Signed-off-by: Ard Biesheuvel <ard.biesheuvel@linaro.org>
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
When running with VHE, there is no need to translate kernel pointers
to the EL2 memory space, since we're already there (and we have a much
saner memory map to start with).
Unfortunately, kvm_ksym_ref is getting in the way, and the first
call into the "hypervisor" section is going to end up in fireworks,
since we're now branching into nowhereland. Meh.
A potential solution is to test if VHE is engaged or not, and only
perform the translation in the negative case. With this in place,
VHE is able to run again.
Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
Signed-off-by: Christoffer Dall <christoffer.dall@linaro.org>