The code for switching to irq_stack stores three pieces of information on
the stack, fp+lr, as a fake stack frame (that lets us walk back onto the
interrupted tasks stack frame), and the address of the struct pt_regs that
contains the register values from kernel entry. (which dump_backtrace()
will print in any stack trace).
To reduce this, we store fp, and the pointer to the struct pt_regs.
unwind_frame() can recognise this as the irq_stack dummy frame, (as it only
appears at the top of the irq_stack), and use the struct pt_regs values
to find the missing interrupted link-register.
Suggested-by: Will Deacon <will.deacon@arm.com>
Signed-off-by: James Morse <james.morse@arm.com>
Signed-off-by: Will Deacon <will.deacon@arm.com>
Having the system register numbers as #defines has been a pain
since day one, as the ordering is pretty fragile, and moving
things around leads to renumbering and epic conflict resolutions.
Now that we're mostly acessing the sysreg file in C, an enum is
a much better type to use, and we can clean things up a bit.
Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
Acked-by: Christoffer Dall <christoffer.dall@linaro.org>
Rather than crafting custom macros for reading/writing each system
register provide generics accessors, read_sysreg and write_sysreg, for
this purpose.
Signed-off-by: Mark Rutland <mark.rutland@arm.com>
Acked-by: Catalin Marinas <catalin.marinas@arm.com>
Cc: Suzuki Poulose <suzuki.poulose@arm.com>
Cc: Will Deacon <will.deacon@arm.com>
Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
It would add guest exit statistics to debugfs, this can be helpful
while measuring KVM performance.
[ Renamed some of the field names - Christoffer ]
Signed-off-by: Amit Singh Tomar <amittomer25@gmail.com>
Signed-off-by: Christoffer Dall <christoffer.dall@linaro.org>
Pull timer fixlets from Thomas Gleixner:
"Two trivial fixes which add missing header fileas and forward
declarations so the code will compile even when the magic include
chains are different"
* 'irq-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
irqchip/gic-v3: Add missing include for barrier.h
irqchip/gic-v3: Add missing struct device_node declaration
Currently the BUG_ON() checks do not give enough information about the
PTEs being set. This patch changes BUG_ON to WARN_ONCE and dumps the
values of the old and new PTEs. In addition, the checks are only made if
the new PTE entry is valid.
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
Reported-by: Ming Lei <tom.leiming@gmail.com>
Cc: Will Deacon <will.deacon@arm.com>
Both the 32bit and 64bit versions of the GICv3 header file are using
barriers, but neglect to include barrier.h, leading to an interesting
splat in some circumstances.
Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
Cc: <linux-arm-kernel@lists.infradead.org>
Cc: Jason Cooper <jason@lakedaemon.net>
Link: http://lkml.kernel.org/r/1449483072-17694-3-git-send-email-marc.zyngier@arm.com
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
The arm64 asm/cmpxchg.h includes linux/mmdebug.h but doesn't so far as I
can tell actually use anything from it. Removing the inclusion reduces
spurious header dependency rebuilds and also avoids issues with
recursive inclusions of headers causing build breaks due to attempts to
use things before they are defined if linux/mmdebug.h starts pulling in
more low level headers.
Such errors have happened in -next recently, for example:
In file included from include/linux/completion.h:11:0,
from include/linux/rcupdate.h:43,
from include/linux/tracepoint.h:19,
from include/linux/mmdebug.h:6,
from ./arch/arm64/include/asm/cmpxchg.h:22,
from ./arch/arm64/include/asm/atomic.h:41,
from include/linux/atomic.h:4,
from include/linux/spinlock.h:406,
from include/linux/seqlock.h:35,
from include/linux/time.h:5,
from include/uapi/linux/timex.h:56,
from include/linux/timex.h:56,
from include/linux/sched.h:19,
from arch/arm64/kernel/asm-offsets.c:21:
include/linux/wait.h: In function 'wait_on_atomic_t':
include/linux/wait.h:1218:2: error: implicit declaration of function 'atomic_read' [-Werror=implicit-function-declaration]
if (atomic_read(val) == 0)
Signed-off-by: Mark Brown <broonie@kernel.org>
Signed-off-by: Will Deacon <will.deacon@arm.com>
Currently we treat the alternatives separately from other data that's
only used during initialisation, using separate .altinstructions and
.altinstr_replacement linker sections. These are freed for general
allocation separately from .init*. This is problematic as:
* We do not remove execute permissions, as we do for .init, leaving the
memory executable.
* We pad between them, making the kernel Image bianry up to PAGE_SIZE
bytes larger than necessary.
This patch moves the two sections into the contiguous region used for
.init*. This saves some memory, ensures that we remove execute
permissions, and allows us to remove some code made redundant by this
reorganisation.
Signed-off-by: Mark Rutland <mark.rutland@arm.com>
Cc: Andre Przywara <andre.przywara@arm.com>
Cc: Catalin Marinas <catalin.marinas@arm.com>
Cc: Jeremy Linton <jeremy.linton@arm.com>
Cc: Laura Abbott <labbott@fedoraproject.org>
Cc: Will Deacon <will.deacon@arm.com>
Signed-off-by: Will Deacon <will.deacon@arm.com>
irq_stack is a per_cpu variable, that needs to be access from entry.S.
Use an assembler macro instead of the unreadable details.
Signed-off-by: James Morse <james.morse@arm.com>
Signed-off-by: Will Deacon <will.deacon@arm.com>
This refactors the EFI init and runtime code that will be shared
between arm64 and ARM so that it can be built for both archs.
Reviewed-by: Matt Fleming <matt@codeblueprint.co.uk>
Signed-off-by: Ard Biesheuvel <ard.biesheuvel@linaro.org>
Signed-off-by: Will Deacon <will.deacon@arm.com>
Running with CONFIG_DEBUG_SPINLOCK=y can trigger a BUG with the new IRQ
stack code:
BUG: spinlock lockup suspected on CPU#1
This is due to the IRQ_STACK_TO_TASK_STACK macro incorrectly retrieving
the task stack pointer stashed at the top of the IRQ stack.
Sayeth James:
| Yup, this is what is happening. Its an off-by-one due to broken
| thinking about how the stack works. My broken thinking was:
|
| > top ------------
| > | dummy_lr | <- irq_stack_ptr
| > ------------
| > | x29 |
| > ------------
| > | x19 | <- irq_stack_ptr - 0x10
| > ------------
| > | xzr |
| > ------------
|
| But the stack-pointer is decreased before use. So it actually looks
| like this:
|
| > ------------
| > | | <- irq_stack_ptr
| > top ------------
| > | dummy_lr |
| > ------------
| > | x29 | <- irq_stack_ptr - 0x10
| > ------------
| > | x19 |
| > ------------
| > | xzr | <- irq_stack_ptr - 0x20
| > ------------
|
| The value being used as the original stack is x29, which in all the
| tests is sp but without the current frames data, hence there are no
| missing frames in the output.
|
| Jungseok Lee picked it up with a 32bit user space because aarch32
| can't use x29, so it remains 0 forever. The fix he posted is correct.
This patch fixes the macro and adds some of this wisdom to a comment,
so that the layout of the IRQ stack is well understood.
Cc: James Morse <james.morse@arm.com>
Reported-by: Jungseok Lee <jungseoklee85@gmail.com>
Signed-off-by: Will Deacon <will.deacon@arm.com>
entry.S is modified to switch to the per_cpu irq_stack during el{0,1}_irq.
irq_count is used to detect recursive interrupts on the irq_stack, it is
updated late by do_softirq_own_stack(), when called on the irq_stack, before
__do_softirq() re-enables interrupts to process softirqs.
do_softirq_own_stack() is added by this patch, but does not yet switch
stack.
This patch adds the dummy stack frame and data needed by the previous
stack tracing patches.
Reviewed-by: Catalin Marinas <catalin.marinas@arm.com>
Signed-off-by: James Morse <james.morse@arm.com>
Signed-off-by: Will Deacon <will.deacon@arm.com>
This patch allows unwind_frame() to traverse from interrupt stack to task
stack correctly. It requires data from a dummy stack frame, created
during irq_stack_entry(), added by a later patch.
A similar approach is taken to modify dump_backtrace(), which expects to
find struct pt_regs underneath any call to functions marked __exception.
When on an irq_stack, the struct pt_regs is stored on the old task stack,
the location of which is stored in the dummy stack frame.
Reviewed-by: Catalin Marinas <catalin.marinas@arm.com>
Signed-off-by: AKASHI Takahiro <takahiro.akashi@linaro.org>
[james.morse: merged two patches, reworked for per_cpu irq_stacks, and
no alignment guarantees, added irq_stack definitions]
Signed-off-by: James Morse <james.morse@arm.com>
Signed-off-by: Will Deacon <will.deacon@arm.com>
There is need for figuring out how to manage struct thread_info data when
IRQ stack is introduced. struct thread_info information should be copied
to IRQ stack under the current thread_info calculation logic whenever
context switching is invoked. This is too expensive to keep supporting
the approach.
Instead, this patch pays attention to sp_el0 which is an unused scratch
register in EL1 context. sp_el0 utilization not only simplifies the
management, but also prevents text section size from being increased
largely due to static allocated IRQ stack as removing masking operation
using THREAD_SIZE in many places.
Reviewed-by: Catalin Marinas <catalin.marinas@arm.com>
Signed-off-by: Jungseok Lee <jungseoklee85@gmail.com>
Signed-off-by: James Morse <james.morse@arm.com>
Signed-off-by: Will Deacon <will.deacon@arm.com>
- A series of fixes to deal with the aliasing between the sp and xzr register
- A fix for the cache flush fix that went in -rc3
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1
iQIcBAABAgAGBQJWYcEKAAoJECPQ0LrRPXpDMe0P/0t4kieg6O+3/8DwzGJGRYmZ
paf1UBg99Mq2+xkrHbHmyObYTs2z56m2x9Q5/Wcmg0kG2d7jWv9Hyg7he+CPjnfb
OtnizA/u/4so+bH28idhamZyinm6CMQIwMxhiU4yokUn2aiv3crYP89tWDUHlMvr
rYQt29u4wtDmDJWyddXWM7nev7zc3ZG5q19ZEiEhyjHqbp1LXksFAos3U8sTHd8f
jzWPQTaJDJX62wtg/FLo2prbDD+NGWyY68y6x/c3d9GhkW1NoBoUfgLtWRUuEACN
HaponhD79C24gZ53knbgvoB3J7Gc03RUCUzMYoOi1Aq+ggOofiluO/B8cYZ0P1Ni
tL0OcC4TPiEOxrQch6sEJroIPBulj/DKeO+wVqBWBfQaB3/aS3Y5QTBLIdL9yco+
u/woq5TjRseV5B4e2ZAlAxINB4mscx4mDkM318xFtHD6f8K7FsKA++XqMcIJon6J
a+sdvjiGSc3DP6L3+sTqFflgCUzIk9Vx+p7+jjZgOmqUZDSmi+M6ZgIslWa6f8sz
IhYKI4j4G266qjh/tBF8Um77d8q1aM9qmgnshoML/oYQMTocZucf5/f0ddGfiMp8
2rrMrSLpazvEBDIV7+BR8UVcV/yuwxdHeZM/Yu2YdsxvBEhoHYgYhyr/GfDx0CG1
RGpwSQaYxhbnF9SS91+R
=6wxm
-----END PGP SIGNATURE-----
Merge tag 'kvm-arm-for-v4.4-rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/kvmarm/kvmarm into kvm-master
KVM/ARM fixes for v4.4-rc4
- A series of fixes to deal with the aliasing between the sp and xzr register
- A fix for the cache flush fix that went in -rc3
Using oldstyle vcpu_reg() accessor is proven to be inappropriate and
unsafe on ARM64. This patch converts the rest of use cases to new
accessors and completely removes vcpu_reg() on ARM64.
Signed-off-by: Pavel Fedin <p.fedin@samsung.com>
Reviewed-by: Marc Zyngier <marc.zyngier@arm.com>
Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
On ARM64 register index of 31 corresponds to both zero register and SP.
However, all memory access instructions, use ZR as transfer register. SP
is used only as a base register in indirect memory addressing, or by
register-register arithmetics, which cannot be trapped here.
Correct emulation is achieved by introducing new register accessor
functions, which can do special handling for reg_num == 31. These new
accessors intentionally do not rely on old vcpu_reg() on ARM64, because
it is to be removed. Since the affected code is shared by both ARM
flavours, implementations of these accessors are also added to ARM32 code.
This patch fixes setting MMIO register to a random value (actually SP)
instead of zero by something like:
*((volatile int *)reg) = 0;
compilers tend to generate "str wzr, [xx]" here
[Marc: Fixed 32bit splat]
Signed-off-by: Pavel Fedin <p.fedin@samsung.com>
Reviewed-by: Marc Zyngier <marc.zyngier@arm.com>
Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
Boqun Feng reported a rather nasty ordering issue with spin_unlock_wait
on architectures implementing spin_lock with LL/SC sequences and acquire
semantics:
| CPU 1 CPU 2 CPU 3
| ================== ==================== ==============
| spin_unlock(&lock);
| spin_lock(&lock):
| r1 = *lock; // r1 == 0;
| o = READ_ONCE(object); // reordered here
| object = NULL;
| smp_mb();
| spin_unlock_wait(&lock);
| *lock = 1;
| smp_mb();
| o->dead = true;
| if (o) // true
| BUG_ON(o->dead); // true!!
The crux of the problem is that spin_unlock_wait(&lock) can return on
CPU 1 whilst CPU 2 is in the process of taking the lock. This can be
resolved by upgrading spin_unlock_wait to a LOCK operation, forcing it
to serialise against a concurrent locker and giving it acquire semantics
in the process (although it is not at all clear whether this is needed -
different callers seem to assume different things about the barrier
semantics and architectures are similarly disjoint in their
implementations of the macro).
This patch implements spin_unlock_wait using an LL/SC sequence with
acquire semantics on arm64. For v8.1 systems with the LSE atomics, the
exclusive writeback is omitted, since the spin_lock operation is
indivisible and no intermediate state can be observed.
Signed-off-by: Will Deacon <will.deacon@arm.com>
ARM glibc uses (4 * __getpagesize()) for SHMLBA, which is correct for
4KB pages and works fine for 64KB pages, but the kernel uses a hardcoded
16KB that is too small for 64KB page based kernels. This changes the
definition to what user space sees when using 64KB pages.
Acked-by: Arnd Bergmann <arnd@arndb.de>
Signed-off-by: Yury Norov <ynorov@caviumnetworks.com>
Signed-off-by: Will Deacon <will.deacon@arm.com>
This patch implements the pte_accessible() macro, which can be used to
test whether or not a given pte is a candidate for allocation in the
TLB.
Reviewed-by: Catalin Marinas <catalin.marinas@arm.com>
Signed-off-by: Will Deacon <will.deacon@arm.com>
- Build fix when !CONFIG_UID16 (the patch is touching generic files but
it only affects arm64 builds; submitted by Arnd Bergmann)
- EFI fixes to deal with early_memremap() returning NULL and correctly
mapping run-time regions
- Fix CPUID register extraction of unsigned fields (not to be
sign-extended)
- ASID allocator fix to deal with long-running tasks over multiple
generation roll-overs
- Revert support for marking page ranges as contiguous PTEs (it leads to
TLB conflicts and requires additional non-trivial kernel changes)
- Proper early_alloc() failure check
- Disable KASan for 48-bit VA and 16KB page configuration (the pgd is
larger than the KASan shadow memory)
- Update the fault_info table (original descriptions based on early
engineering spec)
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1
iQIcBAABAgAGBQJWWJiuAAoJEGvWsS0AyF7xlWIP/3ma+ZSyitIS0FWOld/uo3c1
KbH9i7DrEL9tOzz4AhkKHBA7LOs0NvNkjz2sPLbnVg57H6r2y6Bi1ls5ODUWFy6y
CKI0aaCYhWPyYWDq6H9NfD5Xh6jx0+45dMqKiCy1mvpChEwPfW4aZGceKptNbBrG
v0VG1H5s0U+SjNqKqZ3W/hbwyQ1ZvAXJ022q7/ihPt6s2U0ebjXqc+6S2TcJyWNn
C0bDn40+MK7p8jqRrq80bAjAvC5yDQ7/o7fBsNzsVYhuNTA3HR5CG1jGMJwGcVvA
NJt71vfBq8L4PT2ndt8BxC5G500GdkQk2Nb2i1G9EgakH8Yv5Y2deFTUFDYPTHBg
EfUgORet2iBiCcLY+lLTonjKICsHi4Bn//DsyyEZ7HXAovS0DIH3rQfKubYNlT3p
FR2eskr3cDoQei3L9u0YU1zn+OuWRS7yJdjisjcTAEFaRBKqRXYMoczhVvJPb5xQ
RPtHZNAS0JXH+0Cmdo+nHjSfpEo20nefBvd3Xvs0jvwWKxS6rwexxQWYTKNTbycq
5iTYOGXlequnyTztK5M0AcfAajE+EVT2mAXkD/C727tUdO7yiCh86CNLIREHK8sH
cLnc2iJ12IsJmqV7uRPI5YjNmYau7ZQpfcRfflt1LlL7mx1VmSiyb4JeomGEE/gu
IdJ1iBl2JGguat1DHIXU
=YgtU
-----END PGP SIGNATURE-----
Merge tag 'arm64-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux
Pull arm64 fixes from Catalin Marinas:
- Build fix when !CONFIG_UID16 (the patch is touching generic files but
it only affects arm64 builds; submitted by Arnd Bergmann)
- EFI fixes to deal with early_memremap() returning NULL and correctly
mapping run-time regions
- Fix CPUID register extraction of unsigned fields (not to be
sign-extended)
- ASID allocator fix to deal with long-running tasks over multiple
generation roll-overs
- Revert support for marking page ranges as contiguous PTEs (it leads
to TLB conflicts and requires additional non-trivial kernel changes)
- Proper early_alloc() failure check
- Disable KASan for 48-bit VA and 16KB page configuration (the pgd is
larger than the KASan shadow memory)
- Update the fault_info table (original descriptions based on early
engineering spec)
* tag 'arm64-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux:
arm64: efi: fix initcall return values
arm64: efi: deal with NULL return value of early_memremap()
arm64: debug: Treat the BRPs/WRPs as unsigned
arm64: cpufeature: Track unsigned fields
arm64: cpufeature: Add helpers for extracting unsigned values
Revert "arm64: Mark kernel page ranges contiguous"
arm64: mm: keep reserved ASIDs in sync with mm after multiple rollovers
arm64: KASAN depends on !(ARM64_16K_PAGES && ARM64_VA_BITS_48)
arm64: efi: correctly map runtime regions
arm64: mm: fix fault_info table xFSC decoding
arm64: fix building without CONFIG_UID16
arm64: early_alloc: Fix check for allocation failure
- Fix gntdev and numa balancing.
- Fix x86 boot crash due to unallocated legacy irq descs.
- Fix overflow in evtchn device when > 1024 event channels.
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1
iQEcBAABAgAGBQJWV1coAAoJEFxbo/MsZsTROo8H/1D69XtlQmLrAKWq4JafZrXM
rYQoiRxW/yDNoA3whtOcK4TLf/JpA+B1VAoekXqSEG5Mv9YbIH1Su/y4KwF7WaeX
xSL812ODeN8iYk8A52Zccw0gdl/emzLesPLuq5UrdDhehYp8vQGtk/CdvZIiQAAc
of5Ds9ozIuKTcwDkxOZdUrSG0DvCuvhHBz4xrmuKkbs8CAornfQGBUPKb+vkS05b
2IVzFhCtM2Bhsb8Ji4TfNjsH90T9tghb/QG73APniRMx+hn7CUHkifZ074tnGATp
LdXCJ8D5C8WZx0QCklzcBZUpXbwWv9AWyZR8gZqhGUCMh9XGgByC3lqsMGFgwiM=
=5872
-----END PGP SIGNATURE-----
Merge tag 'for-linus-4.4-rc2-tag' of git://git.kernel.org/pub/scm/linux/kernel/git/xen/tip
Pull xen bug fixes from David Vrabel:
- Fix gntdev and numa balancing.
- Fix x86 boot crash due to unallocated legacy irq descs.
- Fix overflow in evtchn device when > 1024 event channels.
* tag 'for-linus-4.4-rc2-tag' of git://git.kernel.org/pub/scm/linux/kernel/git/xen/tip:
xen/evtchn: dynamically grow pending event channel ring
xen/events: Always allocate legacy interrupts on PV guests
xen/gntdev: Grant maps should not be subject to NUMA balancing
IDAA64DFR0_EL1: BRPs and WRPs are unsigned values. Use
the appropriate helpers to extract those fields.
Cc: Ard Biesheuvel <ard.biesheuvel@linaro.org>
Reported-by: AKASHI Takahiro <takahiro.akashi@linaro.org>
Acked-by: Will Deacon <will.deacon@arm.com>
Signed-off-by: Suzuki K. Poulose <suzuki.poulose@arm.com>
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
Some of the feature bits have unsigned values and need
to be treated accordingly to avoid errors. Adds the property
to the feature bits and use the appropriate field extract helpers.
Reported-by: AKASHI Takahiro <takahiro.akashi@linaro.org>
Cc: Ard Biesheuvel <ard.biesheuvel@linaro.org>
Signed-off-by: Suzuki K. Poulose <suzuki.poulose@arm.com>
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
After commit 8c058b0b9c ("x86/irq: Probe for PIC presence before
allocating descs for legacy IRQs") early_irq_init() will no longer
preallocate descriptors for legacy interrupts if PIC does not
exist, which is the case for Xen PV guests.
Therefore we may need to allocate those descriptors ourselves.
Signed-off-by: Boris Ostrovsky <boris.ostrovsky@oracle.com>
Suggested-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: David Vrabel <david.vrabel@citrix.com>
The cpuid_feature_extract_field() extracts the feature value
as a signed integer. This could be problematic for features
whose values are unsigned. e.g, ID_AA64DFR0_EL1:BRPs. Add
an unsigned variant for the unsigned fields.
Cc: Ard Biesheuvel <ard.biesheuvel@linaro.org>
Reported-by: AKASHI Takahiro <takahiro.akashi@linaro.org>
Signed-off-by: Suzuki K. Poulose <suzuki.poulose@arm.com>
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
Cortex-A57 parts up to r1p2 can misreport Stage 2 translation faults
when a Stage 1 permission fault or device alignment fault should
have been reported.
This patch implements the workaround (which is to validate that the
Stage-1 translation actually succeeds) by using code patching.
Cc: stable@vger.kernel.org
Reviewed-by: Will Deacon <will.deacon@arm.com>
Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
Signed-off-by: Christoffer Dall <christoffer.dall@linaro.org>
When running a 32bit guest under a 64bit hypervisor, the ARMv8
architecture defines a mapping of the 32bit registers in the 64bit
space. This includes banked registers that are being demultiplexed
over the 64bit ones.
On exceptions caused by an operation involving a 32bit register, the
HW exposes the register number in the ESR_EL2 register. It was so
far understood that SW had to distinguish between AArch32 and AArch64
accesses (based on the current AArch32 mode and register number).
It turns out that I misinterpreted the ARM ARM, and the clue is in
D1.20.1: "For some exceptions, the exception syndrome given in the
ESR_ELx identifies one or more register numbers from the issued
instruction that generated the exception. Where the exception is
taken from an Exception level using AArch32 these register numbers
give the AArch64 view of the register."
Which means that the HW is already giving us the translated version,
and that we shouldn't try to interpret it at all (for example, doing
an MMIO operation from the IRQ mode using the LR register leads to
very unexpected behaviours).
The fix is thus not to perform a call to vcpu_reg32() at all from
vcpu_reg(), and use whatever register number is supplied directly.
The only case we need to find out about the mapping is when we
actively generate a register access, which only occurs when injecting
a fault in a guest.
Cc: stable@vger.kernel.org
Reviewed-by: Robin Murphy <robin.murphy@arm.com>
Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
Signed-off-by: Christoffer Dall <christoffer.dall@linaro.org>
A newly introduced function in include/net/sock.h passes a const
argument to smp_load_acquire:
static inline int sk_state_load(const struct sock *sk)
{
return smp_load_acquire(&sk->sk_state);
}
This cause an allmodconfig build failure, since our underlying
load-acquire implementation does not handle const types correctly:
include/net/sock.h: In function 'sk_state_load':
./arch/arm64/include/asm/barrier.h:71:3: error: read-only variable '___p1' used as 'asm' output
asm volatile ("ldarb %w0, %1" \
This patch fixes the problem by reusing the trick in READ_ONCE that
loads via a non-const member of an anonymous union. This has the
advantage of allowing us to use smp_load_acquire on packed structures
(e.g. arch_spinlock_t) as well as primitive types.
Cc: Arnd Bergmann <arnd@arndb.de>
Cc: David Daney <david.daney@cavium.com>
Cc: Eric Dumazet <eric.dumazet@gmail.com>
Reported-by: Arnd Bergmann <arnd@arndb.de>
Reported-by: David Daney <david.daney@cavium.com>
Signed-off-by: Will Deacon <will.deacon@arm.com>
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
The permissions in mark_rodata_ro trigger a build error
with STRICT_MM_TYPECHECKS. Fix this by introducing
PAGE_KERNEL_ROX for the same reasons as PAGE_KERNEL_RO.
From Ard:
"PAGE_KERNEL_EXEC has PTE_WRITE set as well, making the range
writeable under the ARMv8.1 DBM feature, that manages the
dirty bit in hardware (writing to a page with the PTE_RDONLY
and PTE_WRITE bits both set will clear the PTE_RDONLY bit in that case)"
Signed-off-by: Laura Abbott <labbott@fedoraproject.org>
Acked-by: Ard Biesheuvel <ard.biesheuvel@linaro.org>
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
As pointed out by Russell King in response to the proposed ARM version
of this code, the sequence to switch between the UEFI runtime mapping
and current's actual userland mapping (and vice versa) is potentially
unsafe, since it leaves a time window between the switch to the new
page tables and the TLB flush where speculative accesses may hit on
stale global TLB entries.
So instead, use non-global mappings, and perform the switch via the
ordinary ASID-aware context switch routines.
Signed-off-by: Ard Biesheuvel <ard.biesheuvel@linaro.org>
Acked-by: Will Deacon <will.deacon@arm.com>
Reviewed-by: Mark Rutland <mark.rutland@arm.com>
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
including ptrace.h brings a definition of BITS_PER_PAGE into device
drivers and cause a build warning in allmodconfig builds:
drivers/block/drbd/drbd_bitmap.c:482:0: warning: "BITS_PER_PAGE" redefined
#define BITS_PER_PAGE (1UL << (PAGE_SHIFT + 3))
This uses a slightly different way to express current_pt_regs()
that avoids the use of the header and gets away with the already
included asm/ptrace.h.
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
Including linux/acpi.h from asm/dma-mapping.h causes tons of compile-time
warnings, e.g.
drivers/isdn/mISDN/dsp_ecdis.h:43:0: warning: "FALSE" redefined
drivers/isdn/mISDN/dsp_ecdis.h:44:0: warning: "TRUE" redefined
drivers/net/fddi/skfp/h/targetos.h:62:0: warning: "TRUE" redefined
drivers/net/fddi/skfp/h/targetos.h:63:0: warning: "FALSE" redefined
However, it looks like the dependency should not even there as
I do not see why __generic_dma_ops() cares about whether we have
an ACPI based system or not.
The current behavior is to fall back to the global dma_ops when
a device has not set its own dma_ops, but only for DT based systems.
This seems dangerous, as a random device might have different
requirements regarding IOMMU or coherency, so we should really
never have that fallback and just forbid DMA when we have not
initialized DMA for a device.
This removes the global dma_ops variable and the special-casing
for ACPI, and just returns the dma ops that got set for the
device, or the dummy_dma_ops if none were present.
The original code has apparently been copied from arm32 where we
rely on it for ISA devices things like the floppy controller, but
we should have no such devices on ARM64.
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
[catalin.marinas@arm.com: removed acpi_disabled check in arch_setup_dma_ops()]
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
- __cmpxchg_double*() return type fix to avoid truncation of a long to
int and subsequent logical "not" in cmpxchg_double() misinterpreting
the operation success/failure
- BPF fixes for mod and div by zero
- Fix compilation with STRICT_MM_TYPECHECKS enabled
- VDSO build fix without libgcov
- Some static and __maybe_unused annotations
- Kconfig clean-up (FRAME_POINTER)
- defconfig update for CRYPTO_CRC32_ARM64
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1
iQIcBAABAgAGBQJWRNNbAAoJEGvWsS0AyF7xp+wQAIc0A+uSReEJ0Be3kSWZIy0O
9wGCtfp2e3X78ibgVoP/+KvA1JUrMJNwNH54CgGgG6H4rwjRthCvIV/HbKfYufM8
vfuTL2MV1ywkNO0uTzspsICqgKPcpG27SwAlgOcxNXpO0Kui2OlKSxS4kTA8+6Z5
Lm64qDmFG7Z6wcBHhr8JSngC+xvXOvlcUW8odnjXjyCimwnpCFXXnRWDU3RnXJZa
3Khgp8OiRtnCSLfj7YBQA9wfNNgPgKdJ5wevz2g7hiIbYx0IOHmDpzbb3sUNMMKV
XLKeeJgqZL4EXZBCzapHRHCE/q0kiiBhzYSHw6aOBwjD9v683aytT/ax2/AgjzvW
nB3ZPdrbRMjcmNRBT2bheoU8diilhtfxSxf+4T+pVUnVMXDNl/xY9hekGA0hFO1z
nH5P5vkFKsX3U02Ox/G50Od2rM6p7uGRGFYuomSIoJYBItuxGOAuYWlY2+ujcxY5
YvAQ+3FYCkjLipVutlqLxKoZSY8Ex+0LOjPYYsI/+rsE70IVjGuLj0bTm8B/aTcy
dOctNqvOGwo8O5n2jsKM3XkjfUCPRdzu1C7rQz2BqfE9cPAZxg2fQpPv4SGtPuFe
lEvokuYRJ3qYnMt5MG/9Mkqmczfbch88A41wgS9/ySQ57eo3wISLkOiKqzKdJjOa
0qldWaEvST2iVUQmiMl7
=ApkD
-----END PGP SIGNATURE-----
Merge tag 'arm64-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux
Pull arm64 fixes and clean-ups from Catalin Marinas:
"Here's a second pull request for this merging window with some
fixes/clean-ups:
- __cmpxchg_double*() return type fix to avoid truncation of a long
to int and subsequent logical "not" in cmpxchg_double()
misinterpreting the operation success/failure
- BPF fixes for mod and div by zero
- Fix compilation with STRICT_MM_TYPECHECKS enabled
- VDSO build fix without libgcov
- Some static and __maybe_unused annotations
- Kconfig clean-up (FRAME_POINTER)
- defconfig update for CRYPTO_CRC32_ARM64"
* tag 'arm64-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux:
arm64: suspend: make hw_breakpoint_restore static
arm64: mmu: make split_pud and fixup_executable static
arm64: smp: make of_parse_and_init_cpus static
arm64: use linux/types.h in kvm.h
arm64: build vdso without libgcov
arm64: mark cpus_have_hwcap as __maybe_unused
arm64: remove redundant FRAME_POINTER kconfig option and force to select it
arm64: fix R/O permissions of FDT mapping
arm64: fix STRICT_MM_TYPECHECKS issue in PTE_CONT manipulation
arm64: bpf: fix mod-by-zero case
arm64: bpf: fix div-by-zero case
arm64: Enable CRYPTO_CRC32_ARM64 in defconfig
arm64: cmpxchg_dbl: fix return value type
The mapping permissions of the FDT are set to 'PAGE_KERNEL | PTE_RDONLY'
in an attempt to map the FDT as read-only. However, not only does this
break at build time under STRICT_MM_TYPECHECKS (since the two terms are
of different types in that case), it also results in both the PTE_WRITE
and PTE_RDONLY attributes to be set, which means the region is still
writable under ARMv8.1 DBM (and an attempted write will simply clear the
PT_RDONLY bit).
So instead, define PAGE_KERNEL_RO (which already has an established
meaning across architectures) and use that instead.
Signed-off-by: Ard Biesheuvel <ard.biesheuvel@linaro.org>
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
handling.
PPC: Mostly bug fixes.
ARM: No big features, but many small fixes and prerequisites including:
- a number of fixes for the arch-timer
- introducing proper level-triggered semantics for the arch-timers
- a series of patches to synchronously halt a guest (prerequisite for
IRQ forwarding)
- some tracepoint improvements
- a tweak for the EL2 panic handlers
- some more VGIC cleanups getting rid of redundant state
x86: quite a few changes:
- support for VT-d posted interrupts (i.e. PCI devices can inject
interrupts directly into vCPUs). This introduces a new component (in
virt/lib/) that connects VFIO and KVM together. The same infrastructure
will be used for ARM interrupt forwarding as well.
- more Hyper-V features, though the main one Hyper-V synthetic interrupt
controller will have to wait for 4.5. These will let KVM expose Hyper-V
devices.
- nested virtualization now supports VPID (same as PCID but for vCPUs)
which makes it quite a bit faster
- for future hardware that supports NVDIMM, there is support for clflushopt,
clwb, pcommit
- support for "split irqchip", i.e. LAPIC in kernel + IOAPIC/PIC/PIT in
userspace, which reduces the attack surface of the hypervisor
- obligatory smattering of SMM fixes
- on the guest side, stable scheduler clock support was rewritten to not
require help from the hypervisor.
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v2.0.22 (GNU/Linux)
iQEcBAABAgAGBQJWO2IQAAoJEL/70l94x66D/K0H/3AovAgYmJQToZlimsktMk6a
f2xhdIqfU5lIQQh5uNBCfL3o9o8H9Py1ym7aEw3fmztPHHJYc91oTatt2UEKhmEw
VtZHp/dFHt3hwaIdXmjRPEXiYctraKCyrhaUYdWmUYkoKi7lW5OL5h+S7frG2U6u
p/hFKnHRZfXHr6NSgIqvYkKqtnc+C0FWY696IZMzgCksOO8jB1xrxoSN3tANW3oJ
PDV+4og0fN/Fr1capJUFEc/fejREHneANvlKrLaa8ht0qJQutoczNADUiSFLcMPG
iHljXeDsv5eyjMtUuIL8+MPzcrIt/y4rY41ZPiKggxULrXc6H+JJL/e/zThZpXc=
=iv2z
-----END PGP SIGNATURE-----
Merge tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm
Pull KVM updates from Paolo Bonzini:
"First batch of KVM changes for 4.4.
s390:
A bunch of fixes and optimizations for interrupt and time handling.
PPC:
Mostly bug fixes.
ARM:
No big features, but many small fixes and prerequisites including:
- a number of fixes for the arch-timer
- introducing proper level-triggered semantics for the arch-timers
- a series of patches to synchronously halt a guest (prerequisite
for IRQ forwarding)
- some tracepoint improvements
- a tweak for the EL2 panic handlers
- some more VGIC cleanups getting rid of redundant state
x86:
Quite a few changes:
- support for VT-d posted interrupts (i.e. PCI devices can inject
interrupts directly into vCPUs). This introduces a new
component (in virt/lib/) that connects VFIO and KVM together.
The same infrastructure will be used for ARM interrupt
forwarding as well.
- more Hyper-V features, though the main one Hyper-V synthetic
interrupt controller will have to wait for 4.5. These will let
KVM expose Hyper-V devices.
- nested virtualization now supports VPID (same as PCID but for
vCPUs) which makes it quite a bit faster
- for future hardware that supports NVDIMM, there is support for
clflushopt, clwb, pcommit
- support for "split irqchip", i.e. LAPIC in kernel +
IOAPIC/PIC/PIT in userspace, which reduces the attack surface of
the hypervisor
- obligatory smattering of SMM fixes
- on the guest side, stable scheduler clock support was rewritten
to not require help from the hypervisor"
* tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm: (123 commits)
KVM: VMX: Fix commit which broke PML
KVM: x86: obey KVM_X86_QUIRK_CD_NW_CLEARED in kvm_set_cr0()
KVM: x86: allow RSM from 64-bit mode
KVM: VMX: fix SMEP and SMAP without EPT
KVM: x86: move kvm_set_irq_inatomic to legacy device assignment
KVM: device assignment: remove pointless #ifdefs
KVM: x86: merge kvm_arch_set_irq with kvm_set_msi_inatomic
KVM: x86: zero apic_arb_prio on reset
drivers/hv: share Hyper-V SynIC constants with userspace
KVM: x86: handle SMBASE as physical address in RSM
KVM: x86: add read_phys to x86_emulate_ops
KVM: x86: removing unused variable
KVM: don't pointlessly leave KVM_COMPAT=y in non-KVM configs
KVM: arm/arm64: Merge vgic_set_lr() and vgic_sync_lr_elrsr()
KVM: arm/arm64: Clean up vgic_retire_lr() and surroundings
KVM: arm/arm64: Optimize away redundant LR tracking
KVM: s390: use simple switch statement as multiplexer
KVM: s390: drop useless newline in debugging data
KVM: s390: SCA must not cross page boundaries
KVM: arm: Do not indent the arguments of DECLARE_BITMAP
...
This time including:
* A new IOMMU driver for s390 pci devices
* Common dma-ops support based on iommu-api for ARM64. The plan is to
use this as a basis for ARM32 and hopefully other architectures as
well in the future.
* MSI support for ARM-SMMUv3
* Cleanups and dead code removal in the AMD IOMMU driver
* Better RMRR handling for the Intel VT-d driver
* Various other cleanups and small fixes
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v2.0.22 (GNU/Linux)
iQIcBAABAgAGBQJWOz7hAAoJECvwRC2XARrjbvYQALwtITTA5iTm0y/ApwNMxI7n
pZpjZVPoBPNsGBc4t/MT8pVhUSdmpBOljbV4Y4CayL1mSSB6Bl2gooZjd66m7Z81
qMJYEVWhFQqVsIKkCSNOgaO7W5y+xt3rTgqN6vCu86/CCDfKrTPP/+CRl1T/z9bo
1J8ioM3KnZG9KzG8JuXYFg5wwbKToaBh6swSmj+O4U9hru7zV/ILP7ikcc9pyMji
12WbzCqchRORsJZD65xMRYAqRaPNN/3IlDejs00TOFhY3qpWgEgFUucyeRJBJ/+q
K4U8T5vZsnr1a04l7/BeYbLmP7y/9Qv0N0xMGtTyoy/w/BieGqRWu4hHhqf/44NO
EhCSXcEThMNCGTjP2VWC4dnQ/s7Y8OmSW9nCreUcFVxHoE5LfDoh8RngA2fpeNuS
ixb3OwP+YXHN9Ck+1BQqQCeBznsPTLuDxlhRjCJsWntIfMSkXebOkz83YxyZ9b0Q
gFvptfuknU7cotUwWa3dg8RiUB8kNlKJyEEByaVpWEbEOabnONKEMkstvuBx6Ots
kA63wbe7QcPgbUYuq7g0nijDw6E2aEtf0nx2Xx4ZDL932qjg/xUkiBpmbDXHw4Gu
nimNXVQtbCzF74SyTvxEtupiijOTm5eHtoKtg0mYnqPZ+V9eOwEvW8IHaFFf8XHD
SecikoTtH1Q4RVtqOcAQ
=jLlB
-----END PGP SIGNATURE-----
Merge tag 'iommu-updates-v4.4' of git://git.kernel.org/pub/scm/linux/kernel/git/joro/iommu
Pull iommu updates from Joerg Roedel:
"This time including:
- A new IOMMU driver for s390 pci devices
- Common dma-ops support based on iommu-api for ARM64. The plan is
to use this as a basis for ARM32 and hopefully other architectures
as well in the future.
- MSI support for ARM-SMMUv3
- Cleanups and dead code removal in the AMD IOMMU driver
- Better RMRR handling for the Intel VT-d driver
- Various other cleanups and small fixes"
* tag 'iommu-updates-v4.4' of git://git.kernel.org/pub/scm/linux/kernel/git/joro/iommu: (41 commits)
iommu/vt-d: Fix return value check of parse_ioapics_under_ir()
iommu/vt-d: Propagate error-value from ir_parse_ioapic_hpet_scope()
iommu/vt-d: Adjust the return value of the parse_ioapics_under_ir
iommu: Move default domain allocation to iommu_group_get_for_dev()
iommu: Remove is_pci_dev() fall-back from iommu_group_get_for_dev
iommu/arm-smmu: Switch to device_group call-back
iommu/fsl: Convert to device_group call-back
iommu: Add device_group call-back to x86 iommu drivers
iommu: Add generic_device_group() function
iommu: Export and rename iommu_group_get_for_pci_dev()
iommu: Revive device_group iommu-ops call-back
iommu/amd: Remove find_last_devid_on_pci()
iommu/amd: Remove first/last_device handling
iommu/amd: Initialize amd_iommu_last_bdf for DEV_ALL
iommu/amd: Cleanup buffer allocation
iommu/amd: Remove cmd_buf_size and evt_buf_size from struct amd_iommu
iommu/amd: Align DTE flag definitions
iommu/amd: Remove old alias handling code
iommu/amd: Set alias DTE in do_attach/do_detach
iommu/amd: WARN when __[attach|detach]_device are called with irqs enabled
...
The current arm64 __cmpxchg_double{_mb} implementations carry out the
compare exchange by first comparing the old values passed in to the
values read from the pointer provided and by stashing the cumulative
bitwise difference in a 64-bit register.
By comparing the register content against 0, it is possible to detect if
the values read differ from the old values passed in, so that the compare
exchange detects whether it has to bail out or carry on completing the
operation with the exchange.
Given the current implementation, to detect the cmpxchg operation
status, the __cmpxchg_double{_mb} functions should return the 64-bit
stashed bitwise difference so that the caller can detect cmpxchg failure
by comparing the return value content against 0. The current implementation
declares the return value as an int, which means that the 64-bit
value stashing the bitwise difference is truncated before being
returned to the __cmpxchg_double{_mb} callers, which means that
any bitwise difference present in the top 32 bits goes undetected,
triggering false positives and subsequent kernel failures.
This patch fixes the issue by declaring the arm64 __cmpxchg_double{_mb}
return values as a long, so that the bitwise difference is
properly propagated on failure, restoring the expected behaviour.
Fixes: e9a4b79565 ("arm64: cmpxchg_dbl: patch in lse instructions when supported by the CPU")
Cc: <stable@vger.kernel.org> # 4.3+
Cc: Marc Zyngier <marc.zyngier@arm.com>
Acked-by: Will Deacon <will.deacon@arm.com>
Signed-off-by: Lorenzo Pieralisi <lorenzo.pieralisi@arm.com>
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
Here is the big tty and serial driver update for 4.4-rc1.
Lots of serial driver updates and a few small tty core changes. Full
details in the shortlog.
All of these have been in linux-next for a while.
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v2
iEYEABECAAYFAlY6f64ACgkQMUfUDdst+ykf8gCfYPjtHy5hD/TsharaeXROnVgi
W8cAn16xk1Nmnde220MNNpO6zDu65G/1
=kslf
-----END PGP SIGNATURE-----
Merge tag 'tty-4.4-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/tty
Pull tty/serial driver updates from Greg KH:
"Here is the big tty and serial driver update for 4.4-rc1.
Lots of serial driver updates and a few small tty core changes. Full
details in the shortlog.
All of these have been in linux-next for a while"
* tag 'tty-4.4-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/tty: (148 commits)
tty: Use unbound workqueue for all input workers
tty: Abstract tty buffer work
tty: Prevent tty teardown during tty_write_message()
tty: core: Use correct spinlock flavor in tiocspgrp()
tty: Combine SIGTTOU/SIGTTIN handling
serial: amba-pl011: fix incorrect integer size in pl011_fifo_to_tty()
ttyFDC: Fix build problems due to use of module_{init,exit}
tty: remove unneeded return statement
serial: 8250_mid: add support for DMA engine handling from UART MMIO
dmaengine: hsu: remove platform data
dmaengine: hsu: introduce stubs for the exported functions
dmaengine: hsu: make the UART driver in control of selecting this driver
serial: fix mctrl helper functions
serial: 8250_pci: Intel MID UART support to its own driver
serial: fsl_lpuart: add earlycon support
tty: disable unbind for old 74xx based serial/mpsc console port
serial: pl011: Spelling s/clocks-names/clock-names/
n_tty: Remove reader wakeups for TTY_BREAK/TTY_PARITY chars
tty: synclink, fix indentation
serial: at91, fix rs485 properties
...
- ACPICA update to upstream revision 20150930 (Bob Moore, Lv Zheng).
The most significant change is to allow the AML debugger to be
built into the kernel. On top of that there is an update related
to the NFIT table (the ACPI persistent memory interface)
and a few fixes and cleanups.
- ACPI CPPC2 (Collaborative Processor Performance Control v2)
support along with a cpufreq frontend (Ashwin Chaugule).
This can only be enabled on ARM64 at this point.
- New ACPI infrastructure for the early probing of IRQ chips and
clock sources (Marc Zyngier).
- Support for a new hierarchical properties extension of the ACPI
_DSD (Device Specific Data) device configuration object allowing
the kernel to handle hierarchical properties (provided by the
platform firmware this way) automatically and make them available
to device drivers via the generic device properties interface
(Rafael Wysocki).
- Generic device properties API extension to obtain an index of
certain string value in an array of strings, along the lines of
of_property_match_string(), but working for all of the supported
firmware node types, and support for the "dma-names" device
property based on it (Mika Westerberg).
- ACPI core fix to parse the MADT (Multiple APIC Description Table)
entries in the order expected by platform firmware (and mandated
by the specification) to avoid confusion on systems with more than
255 logical CPUs (Lukasz Anaczkowski).
- Consolidation of the ACPI-based handling of PCI host bridges
on x86 and ia64 (Jiang Liu).
- ACPI core fixes to ensure that the correct IRQ number is used to
represent the SCI (System Control Interrupt) in the cases when
it has been re-mapped (Chen Yu).
- New ACPI backlight quirk for Lenovo IdeaPad S405 (Hans de Goede).
- ACPI EC driver fixes (Lv Zheng).
- Assorted ACPI fixes and cleanups (Dan Carpenter, Insu Yun, Jiri
Kosina, Rami Rosen, Rasmus Villemoes).
- New mechanism in the PM core allowing drivers to check if the
platform firmware is going to be involved in the upcoming system
suspend or if it has been involved in the suspend the system is
resuming from at the moment (Rafael Wysocki).
This should allow drivers to optimize their suspend/resume
handling in some cases and the changes include a couple of users
of it (the i8042 input driver, PCI PM).
- PCI PM fix to prevent runtime-suspended devices with PME enabled
from being resumed during system suspend even if they aren't
configured to wake up the system from sleep (Rafael Wysocki).
- New mechanism to report the number of a wakeup IRQ that woke up
the system from sleep last time (Alexandra Yates).
- Removal of unused interfaces from the generic power domains
framework and fixes related to latency measurements in that
code (Ulf Hansson, Daniel Lezcano).
- cpufreq core sysfs interface rework to make it handle CPUs that
share performance scaling settings (represented by a common
cpufreq policy object) more symmetrically (Viresh Kumar).
This should help to simplify the CPU offline/online handling among
other things.
- cpufreq core fixes and cleanups (Viresh Kumar).
- intel_pstate fixes related to the Turbo Activation Ratio (TAR)
mechanism on client platforms which causes the turbo P-states
range to vary depending on platform firmware settings (Srinivas
Pandruvada).
- intel_pstate sysfs interface fix (Prarit Bhargava).
- Assorted cpufreq driver (imx, tegra20, powernv, integrator) fixes
and cleanups (Bai Ping, Bartlomiej Zolnierkiewicz, Shilpasri G
Bhat, Luis de Bethencourt).
- cpuidle mvebu driver cleanups (Russell King).
- OPP (Operating Performance Points) framework code reorganization
to make it more maintainable (Viresh Kumar).
- Intel Broxton support for the RAPL (Running Average Power Limits)
power capping driver (Amy Wiles).
- Assorted power management code fixes and cleanups (Dan Carpenter,
Geert Uytterhoeven, Geliang Tang, Luis de Bethencourt, Rasmus
Villemoes).
/
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v2.0.22 (GNU/Linux)
iQIcBAABCAAGBQJWOC9oAAoJEILEb/54YlRx/c8P/joflwoFsISwJccG62YTQMuc
bMQKM4Kw0vl5La8+pkLpe5t6+mW7l81UFtYF6Dzd8LOKlD9sszD34z1lHmCeT/oR
wn0uZpHagRyLMUfoyiEtlU/VRU6WQNNtS3EgjwUi7xgFz9Q0pjcCZ9OQ6vKov1j5
+6j40ODif5sgo+2vl+rztJiV0SIMkYdkgNqgfN1FE9bdLA2Zkk+PxxJbtGQORuDu
O/K+XhQT2xWquVWi/1p+VtQxs5glBS1oKm0kogV5bElCvNTRNIVABUNcjogITQwo
QSAKgoCKIoaIl5jtDT6u5dc0y67q/dMtqOY9fOCcOz1Z7jbWQzR8D7mpFWIsJUPK
K2LClI3t85ynpN6Jref246A6+C9nwB8JMAiAR11oBw7WbBlkd6tbRgcT5B+iz8UE
FuCCif7pha/Fs+Jt1YRazscIqteQ2bAhhxikuIPMfw2M6M67MNfVNeKA1bAoSM34
dH7JsilblitvV7shrwJHwXPXCOF2jEPoK8I4/q2+TR5qUxEpRJjelQxXGSaQScMZ
iNnjeTgv8H8q+rY5Yjzsl4pxP0Fvf7IuqkptWOJbgepg4cQc9pS87wOpY3uEeQzr
H7ruaQJFCnLO4aXbPNClsiJARhrBk+qMlsh4vBEyCJ2T0ucb+nIUcN4BTi8t85yl
X97BfHHUiDoUrnIsNids
=1gaH
-----END PGP SIGNATURE-----
Merge tag 'pm+acpi-4.4-rc1-1' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm
Pull power management and ACPI updates from Rafael Wysocki:
"Quite a new features are included this time.
First off, the Collaborative Processor Performance Control interface
(version 2) defined by ACPI will now be supported on ARM64 along with
a cpufreq frontend for CPU performance scaling.
Second, ACPI gets a new infrastructure for the early probing of IRQ
chips and clock sources (along the lines of the existing similar
mechanism for DT).
Next, the ACPI core and the generic device properties API will now
support a recently introduced hierarchical properties extension of the
_DSD (Device Specific Data) ACPI device configuration object. If the
ACPI platform firmware uses that extension to organize device
properties in a hierarchical way, the kernel will automatically handle
it and make those properties available to device drivers via the
generic device properties API.
It also will be possible to build the ACPICA's AML interpreter
debugger into the kernel now and use that to diagnose AML-related
problems more efficiently. In the future, this should make it
possible to single-step AML execution and do similar things.
Interesting stuff, although somewhat experimental at this point.
Finally, the PM core gets a new mechanism that can be used by device
drivers to distinguish between suspend-to-RAM (based on platform
firmware support) and suspend-to-idle (or other variants of system
suspend the platform firmware is not involved in) and possibly
optimize their device suspend/resume handling accordingly.
In addition to that, some existing features are re-organized quite
substantially.
First, the ACPI-based handling of PCI host bridges on x86 and ia64 is
unified and the common code goes into the ACPI core (so as to reduce
code duplication and eliminate non-essential differences between the
two architectures in that area).
Second, the Operating Performance Points (OPP) framework is
reorganized to make the code easier to find and follow.
Next, the cpufreq core's sysfs interface is reorganized to get rid of
the "primary CPU" concept for configurations in which the same
performance scaling settings are shared between multiple CPUs.
Finally, some interfaces that aren't necessary any more are dropped
from the generic power domains framework.
On top of the above we have some minor extensions, cleanups and bug
fixes in multiple places, as usual.
Specifics:
- ACPICA update to upstream revision 20150930 (Bob Moore, Lv Zheng).
The most significant change is to allow the AML debugger to be
built into the kernel. On top of that there is an update related
to the NFIT table (the ACPI persistent memory interface) and a few
fixes and cleanups.
- ACPI CPPC2 (Collaborative Processor Performance Control v2) support
along with a cpufreq frontend (Ashwin Chaugule).
This can only be enabled on ARM64 at this point.
- New ACPI infrastructure for the early probing of IRQ chips and
clock sources (Marc Zyngier).
- Support for a new hierarchical properties extension of the ACPI
_DSD (Device Specific Data) device configuration object allowing
the kernel to handle hierarchical properties (provided by the
platform firmware this way) automatically and make them available
to device drivers via the generic device properties interface
(Rafael Wysocki).
- Generic device properties API extension to obtain an index of
certain string value in an array of strings, along the lines of
of_property_match_string(), but working for all of the supported
firmware node types, and support for the "dma-names" device
property based on it (Mika Westerberg).
- ACPI core fix to parse the MADT (Multiple APIC Description Table)
entries in the order expected by platform firmware (and mandated by
the specification) to avoid confusion on systems with more than 255
logical CPUs (Lukasz Anaczkowski).
- Consolidation of the ACPI-based handling of PCI host bridges on x86
and ia64 (Jiang Liu).
- ACPI core fixes to ensure that the correct IRQ number is used to
represent the SCI (System Control Interrupt) in the cases when it
has been re-mapped (Chen Yu).
- New ACPI backlight quirk for Lenovo IdeaPad S405 (Hans de Goede).
- ACPI EC driver fixes (Lv Zheng).
- Assorted ACPI fixes and cleanups (Dan Carpenter, Insu Yun, Jiri
Kosina, Rami Rosen, Rasmus Villemoes).
- New mechanism in the PM core allowing drivers to check if the
platform firmware is going to be involved in the upcoming system
suspend or if it has been involved in the suspend the system is
resuming from at the moment (Rafael Wysocki).
This should allow drivers to optimize their suspend/resume handling
in some cases and the changes include a couple of users of it (the
i8042 input driver, PCI PM).
- PCI PM fix to prevent runtime-suspended devices with PME enabled
from being resumed during system suspend even if they aren't
configured to wake up the system from sleep (Rafael Wysocki).
- New mechanism to report the number of a wakeup IRQ that woke up the
system from sleep last time (Alexandra Yates).
- Removal of unused interfaces from the generic power domains
framework and fixes related to latency measurements in that code
(Ulf Hansson, Daniel Lezcano).
- cpufreq core sysfs interface rework to make it handle CPUs that
share performance scaling settings (represented by a common cpufreq
policy object) more symmetrically (Viresh Kumar).
This should help to simplify the CPU offline/online handling among
other things.
- cpufreq core fixes and cleanups (Viresh Kumar).
- intel_pstate fixes related to the Turbo Activation Ratio (TAR)
mechanism on client platforms which causes the turbo P-states range
to vary depending on platform firmware settings (Srinivas
Pandruvada).
- intel_pstate sysfs interface fix (Prarit Bhargava).
- Assorted cpufreq driver (imx, tegra20, powernv, integrator) fixes
and cleanups (Bai Ping, Bartlomiej Zolnierkiewicz, Shilpasri G
Bhat, Luis de Bethencourt).
- cpuidle mvebu driver cleanups (Russell King).
- OPP (Operating Performance Points) framework code reorganization to
make it more maintainable (Viresh Kumar).
- Intel Broxton support for the RAPL (Running Average Power Limits)
power capping driver (Amy Wiles).
- Assorted power management code fixes and cleanups (Dan Carpenter,
Geert Uytterhoeven, Geliang Tang, Luis de Bethencourt, Rasmus
Villemoes)"
* tag 'pm+acpi-4.4-rc1-1' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm: (108 commits)
cpufreq: postfix policy directory with the first CPU in related_cpus
cpufreq: create cpu/cpufreq/policyX directories
cpufreq: remove cpufreq_sysfs_{create|remove}_file()
cpufreq: create cpu/cpufreq at boot time
cpufreq: Use cpumask_copy instead of cpumask_or to copy a mask
cpufreq: ondemand: Drop unnecessary locks from update_sampling_rate()
PM / Domains: Merge measurements for PM QoS device latencies
PM / Domains: Don't measure ->start|stop() latency in system PM callbacks
PM / clk: Fix broken build due to non-matching code and header #ifdefs
ACPI / Documentation: add copy_dsdt to ACPI format options
ACPI / sysfs: correctly check failing memory allocation
ACPI / video: Add a quirk to force native backlight on Lenovo IdeaPad S405
ACPI / CPPC: Fix potential memory leak
ACPI / CPPC: signedness bug in register_pcc_channel()
ACPI / PAD: power_saving_thread() is not freezable
ACPI / PM: Fix incorrect wakeup IRQ setting during suspend-to-idle
ACPI: Using correct irq when waiting for events
ACPI: Use correct IRQ when uninstalling ACPI interrupt handler
cpuidle: mvebu: disable the bind/unbind attributes and use builtin_platform_driver
cpuidle: mvebu: clean up multiple platform drivers
...
- "genirq: Introduce generic irq migration for cpu hotunplugged" patch
merged from tip/irq/for-arm to allow the arm64-specific part to be
upstreamed via the arm64 tree
- CPU feature detection reworked to cope with heterogeneous systems
where CPUs may not have exactly the same features. The features
reported by the kernel via internal data structures or ELF_HWCAP are
delayed until all the CPUs are up (and before user space starts)
- Support for 16KB pages, with the additional bonus of a 36-bit VA
space, though the latter only depending on EXPERT
- Implement native {relaxed, acquire, release} atomics for arm64
- New ASID allocation algorithm which avoids IPI on roll-over, together
with TLB invalidation optimisations (using local vs global where
feasible)
- KASan support for arm64
- EFI_STUB clean-up and isolation for the kernel proper (required by
KASan)
- copy_{to,from,in}_user optimisations (sharing the memcpy template)
- perf: moving arm64 to the arm32/64 shared PMU framework
- L1_CACHE_BYTES increased to 128 to accommodate Cavium hardware
- Support for the contiguous PTE hint on kernel mapping (16 consecutive
entries may be able to use a single TLB entry)
- Generic CONFIG_HZ now used on arm64
- defconfig updates
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1
iQIcBAABAgAGBQJWOkmIAAoJEGvWsS0AyF7x4GgQAINU3NePjFFvWZNCkqobeH9+
jFKwtXamIudhTSdnXNXyYWmtRL9Krg3qI4zDQf68dvDFAZAze2kVuOi1yPpCbpFZ
/j/afNyQc7+PoyqRAzmT+EMPZlcuOA84Prrl1r3QWZ58QaFeVk/6ZxrHunTHxN0x
mR9PIXfWx73MTo+UnG8FChkmEY6LmV4XpemgTaMR9FqFhdT51OZSxDDAYXOTm4JW
a5HdN9OWjjJ2rhLlFEaC7tszG9B5doHdy2tr5ge/YERVJzIPDogHkMe8ZhfAJc+x
SQU5tKN6Pg4MOi+dLhxlk0/mKCvHLiEQ5KVREJnt8GxupAR54Bat+DQ+rP9cSnpq
dRQTcARIOyy9LGgy+ROAsSo+NiyM5WuJ0/WJUYKmgWTJOfczRYoZv6TMKlwNOUYb
tGLCZHhKPM3yBHJlWbQykl3xmSuudxCMmjlZzg7B+MVfTP6uo0CRSPmYl+v67q+J
bBw/Z2RYXWYGnvlc6OfbMeImI6prXeE36+5ytyJFga0m+IqcTzRGzjcLxKEvdbiU
pr8n9i+hV9iSsT/UwukXZ8ay6zH7PrTLzILWQlieutfXlvha7MYeGxnkbLmdYcfe
GCj374io5cdImHcVKmfhnOMlFOLuOHphl9cmsd/O2LmCIqBj9BIeNH2Om8mHVK2F
YHczMdpESlJApE7kUc1e
=3six
-----END PGP SIGNATURE-----
Merge tag 'arm64-upstream' of git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux
Pull arm64 updates from Catalin Marinas:
- "genirq: Introduce generic irq migration for cpu hotunplugged" patch
merged from tip/irq/for-arm to allow the arm64-specific part to be
upstreamed via the arm64 tree
- CPU feature detection reworked to cope with heterogeneous systems
where CPUs may not have exactly the same features. The features
reported by the kernel via internal data structures or ELF_HWCAP are
delayed until all the CPUs are up (and before user space starts)
- Support for 16KB pages, with the additional bonus of a 36-bit VA
space, though the latter only depending on EXPERT
- Implement native {relaxed, acquire, release} atomics for arm64
- New ASID allocation algorithm which avoids IPI on roll-over, together
with TLB invalidation optimisations (using local vs global where
feasible)
- KASan support for arm64
- EFI_STUB clean-up and isolation for the kernel proper (required by
KASan)
- copy_{to,from,in}_user optimisations (sharing the memcpy template)
- perf: moving arm64 to the arm32/64 shared PMU framework
- L1_CACHE_BYTES increased to 128 to accommodate Cavium hardware
- Support for the contiguous PTE hint on kernel mapping (16 consecutive
entries may be able to use a single TLB entry)
- Generic CONFIG_HZ now used on arm64
- defconfig updates
* tag 'arm64-upstream' of git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux: (91 commits)
arm64/efi: fix libstub build under CONFIG_MODVERSIONS
ARM64: Enable multi-core scheduler support by default
arm64/efi: move arm64 specific stub C code to libstub
arm64: page-align sections for DEBUG_RODATA
arm64: Fix build with CONFIG_ZONE_DMA=n
arm64: Fix compat register mappings
arm64: Increase the max granular size
arm64: remove bogus TASK_SIZE_64 check
arm64: make Timer Interrupt Frequency selectable
arm64/mm: use PAGE_ALIGNED instead of IS_ALIGNED
arm64: cachetype: fix definitions of ICACHEF_* flags
arm64: cpufeature: declare enable_cpu_capabilities as static
genirq: Make the cpuhotplug migration code less noisy
arm64: Constify hwcap name string arrays
arm64/kvm: Make use of the system wide safe values
arm64/debug: Make use of the system wide safe value
arm64: Move FP/ASIMD hwcap handling to common code
arm64/HWCAP: Use system wide safe values
arm64/capabilities: Make use of system wide safe value
arm64: Delay cpu feature capability checks
...
Includes a number of fixes for the arch-timer, introducing proper
level-triggered semantics for the arch-timers, a series of patches to
synchronously halt a guest (prerequisite for IRQ forwarding), some tracepoint
improvements, a tweak for the EL2 panic handlers, some more VGIC cleanups
getting rid of redundant state, and finally a stylistic change that gets rid of
some ctags warnings.
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1
iQEcBAABAgAGBQJWOhgAAAoJEEtpOizt6ddyKS8H/2ZHMTPjo6yChnrusNWy4Qbr
6laPDlzL+g45oMQRwNL7GnM1deRftaxvT2Wi+X84D/6Y/BD6MPds4HgtBfuWcSZ1
CyRJ0Ot/zrxenucSuJuOjq+a9gdizdAczkbB1MfYDULJH8fb6D+7RYLo3zgh4Xo4
pla3L9U6gSWe+YopBjZtZH43m3fwiwSM/v+uHOTIcXrsbR+fEgx/EFSKmA/DUCuo
P5cFO/ceUGu7nATCexu5V82TgR2hvurrsR7mqfwY8YcF6HRM+NEOoS29xWC77v5S
u/F08TKuKQLv0YTEFTyLETI/oEeuC0cHtrRQBNf4+9kXEOzKyXaae0wR/I6X2Ss=
=GMNk
-----END PGP SIGNATURE-----
Merge tag 'kvm-arm-for-4.4' of git://git.kernel.org/pub/scm/linux/kernel/git/kvmarm/kvmarm into HEAD
KVM/ARM Changes for v4.4-rc1
Includes a number of fixes for the arch-timer, introducing proper
level-triggered semantics for the arch-timers, a series of patches to
synchronously halt a guest (prerequisite for IRQ forwarding), some tracepoint
improvements, a tweak for the EL2 panic handlers, some more VGIC cleanups
getting rid of redundant state, and finally a stylistic change that gets rid of
some ctags warnings.
Conflicts:
arch/x86/include/asm/kvm_host.h
Pull locking changes from Ingo Molnar:
"The main changes in this cycle were:
- More gradual enhancements to atomic ops: new atomic*_read_ctrl()
ops, synchronize atomic_{read,set}() ordering requirements between
architectures, add atomic_long_t bitops. (Peter Zijlstra)
- Add _{relaxed|acquire|release}() variants for inc/dec atomics and
use them in various locking primitives: mutex, rtmutex, mcs, rwsem.
This enables weakly ordered architectures (such as arm64) to make
use of more locking related optimizations. (Davidlohr Bueso)
- Implement atomic[64]_{inc,dec}_relaxed() on ARM. (Will Deacon)
- Futex kernel data cache footprint micro-optimization. (Rasmus
Villemoes)
- pvqspinlock runtime overhead micro-optimization. (Waiman Long)
- misc smaller fixlets"
* 'locking-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
ARM, locking/atomics: Implement _relaxed variants of atomic[64]_{inc,dec}
locking/rwsem: Use acquire/release semantics
locking/mcs: Use acquire/release semantics
locking/rtmutex: Use acquire/release semantics
locking/mutex: Use acquire/release semantics
locking/asm-generic: Add _{relaxed|acquire|release}() variants for inc/dec atomics
atomic: Implement atomic_read_ctrl()
atomic, arch: Audit atomic_{read,set}()
atomic: Add atomic_long_t bitops
futex: Force hot variables into a single cache line
locking/pvqspinlock: Kick the PV CPU unconditionally when _Q_SLOW_VAL
locking/osq: Relax atomic semantics
locking/qrwlock: Rename ->lock to ->wait_lock
locking/Documentation/lockstat: Fix typo - lokcing -> locking
locking/atomics, cmpxchg: Privatize the inclusion of asm/cmpxchg.h
Pull EFI changes from Ingo Molnar:
"The main changes in this cycle were:
- further EFI code generalization to make it more workable for ARM64
- various extensions, such as 64-bit framebuffer address support,
UEFI v2.5 EFI_PROPERTIES_TABLE support
- code modularization simplifications and cleanups
- new debugging parameters
- various fixes and smaller additions"
* 'core-efi-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (23 commits)
efi: Fix warning of int-to-pointer-cast on x86 32-bit builds
efi: Use correct type for struct efi_memory_map::phys_map
x86/efi: Fix kernel panic when CONFIG_DEBUG_VIRTUAL is enabled
efi: Add "efi_fake_mem" boot option
x86/efi: Rename print_efi_memmap() to efi_print_memmap()
efi: Auto-load the efi-pstore module
efi: Introduce EFI_NX_PE_DATA bit and set it from properties table
efi: Add support for UEFIv2.5 Properties table
efi: Add EFI_MEMORY_MORE_RELIABLE support to efi_md_typeattr_format()
efifb: Add support for 64-bit frame buffer addresses
efi/arm64: Clean up efi_get_fdt_params() interface
arm64: Use core efi=debug instead of uefi_debug command line parameter
efi/x86: Move efi=debug option parsing to core
drivers/firmware: Make efi/esrt.c driver explicitly non-modular
efi: Use the generic efi.memmap instead of 'memmap'
acpi/apei: Use appropriate pgprot_t to map GHES memory
arm64, acpi/apei: Implement arch_apei_get_mem_attributes()
arm64/mm: Add PROT_DEVICE_nGnRnE and PROT_NORMAL_WT
acpi, x86: Implement arch_apei_get_mem_attributes()
efi, x86: Rearrange efi_mem_attributes()
...
Pull irq updates from Thomas Gleixner:
"The irq departement delivers:
- Rework the irqdomain core infrastructure to accomodate ACPI based
systems. This is required to support ARM64 without creating
artificial device tree nodes.
- Sanitize the ACPI based ARM GIC initialization by making use of the
new firmware independent irqdomain core
- Further improvements to the generic MSI management
- Generalize the irq migration on CPU hotplug
- Improvements to the threaded interrupt infrastructure
- Allow the migration of "chained" low level interrupt handlers
- Allow optional force masking of interrupts in disable_irq[_nosysnc]
- Support for two new interrupt chips - Sigh!
- A larger set of errata fixes for ARM gicv3
- The usual pile of fixes, updates, improvements and cleanups all
over the place"
* 'irq-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (71 commits)
Document that IRQ_NONE should be returned when IRQ not actually handled
PCI/MSI: Allow the MSI domain to be device-specific
PCI: Add per-device MSI domain hook
of/irq: Use the msi-map property to provide device-specific MSI domain
of/irq: Split of_msi_map_rid to reuse msi-map lookup
irqchip/gic-v3-its: Parse new version of msi-parent property
PCI/MSI: Use of_msi_get_domain instead of open-coded "msi-parent" parsing
of/irq: Use of_msi_get_domain instead of open-coded "msi-parent" parsing
of/irq: Add support code for multi-parent version of "msi-parent"
irqchip/gic-v3-its: Add handling of PCI requester id.
PCI/MSI: Add helper function pci_msi_domain_get_msi_rid().
of/irq: Add new function of_msi_map_rid()
Docs: dt: Add PCI MSI map bindings
irqchip/gic-v2m: Add support for multiple MSI frames
irqchip/gic-v3: Fix translation of LPIs after conversion to irq_fwspec
irqchip/mxs: Add Alphascale ASM9260 support
irqchip/mxs: Prepare driver for hardware with different offsets
irqchip/mxs: Panic if ioremap or domain creation fails
irqdomain: Documentation updates
irqdomain/msi: Use fwnode instead of of_node
...
For reasons not entirely apparent, but now enshrined in history, the
architectural mapping of AArch32 banked registers to AArch64 registers
actually orders SP_<mode> and LR_<mode> backwards compared to the
intuitive r13/r14 order, for all modes except FIQ.
Fix the compat_<reg>_<mode> macros accordingly, in the hope of avoiding
subtle bugs with KVM and AArch32 guests.
Signed-off-by: Robin Murphy <robin.murphy@arm.com>
Acked-by: Will Deacon <will.deacon@arm.com>
Cc: <stable@vger.kernel.org>
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
Increase the standard cacheline size to avoid having locks in the same
cacheline.
Cavium's ThunderX core implements cache lines of 128 byte size. With
current granulare size of 64 bytes (L1_CACHE_SHIFT=6) two locks could
share the same cache line leading a performance degradation.
Increasing the size fixes that.
Increasing the size has no negative impact to cache invalidation on
systems with a smaller cache line. There is an impact on memory usage,
but that's not too important for arm64 use cases.
Signed-off-by: Tirumalesh Chalamarla <tchalamarla@cavium.com>
Signed-off-by: Robert Richter <rrichter@cavium.com>
Acked-by: Timur Tabi <timur@codeaurora.org>
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
The comparison between TASK_SIZE_64 and MODULES_VADDR does not
make any sense on arm64, it is simply something that has been
carried over from the ARM port which arm64 is based on. So drop it.
Signed-off-by: Ard Biesheuvel <ard.biesheuvel@linaro.org>
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
test_bit and set_bit take the bit number to operate on, rather than a
mask. This patch fixes the ICACHEF_* definitions so that they represent
the bit index in __icache_flags as opposed to the mask returned by the
BIT macro.
Signed-off-by: Will Deacon <will.deacon@arm.com>
Reviewed-by: Mark Rutland <mark.rutland@arm.com>
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
The ARM architecture only saves the exit class to the HSR (ESR_EL2 for
arm64) on synchronous exceptions, not on asynchronous exceptions like an
IRQ. However, we only report the exception class on kvm_exit, which is
confusing because an IRQ looks like it exited at some PC with the same
reason as the previous exit. Add a lookup table for the exception index
and prepend the kvm_exit tracepoint text with the exception type to
clarify this situation.
Also resolve the exception class (EC) to a human-friendly text version
so the trace output becomes immediately usable for debugging this code.
Cc: Wei Huang <wei@redhat.com>
Signed-off-by: Christoffer Dall <christoffer.dall@linaro.org>
We introduce kvm_arm_halt_guest and resume functions. They
will be used for IRQ forward state change.
Halt is synchronous and prevents the guest from being re-entered.
We use the same mechanism put in place for PSCI former pause,
now renamed power_off. A new flag is introduced in arch vcpu state,
pause, only meant to be used by those functions.
Signed-off-by: Eric Auger <eric.auger@linaro.org>
Reviewed-by: Christoffer Dall <christoffer.dall@linaro.org>
Signed-off-by: Christoffer Dall <christoffer.dall@linaro.org>
The kvm_vcpu_arch pause field is renamed into power_off to prepare
for the introduction of a new pause field. Also vcpu_pause is renamed
into vcpu_sleep since we will sleep until both power_off and pause are
false.
Signed-off-by: Eric Auger <eric.auger@linaro.org>
Reviewed-by: Christoffer Dall <christoffer.dall@linaro.org>
Signed-off-by: Christoffer Dall <christoffer.dall@linaro.org>
We currently schedule a soft timer every time we exit the guest if the
timer did not expire while running the guest. This is really not
necessary, because the only work we do in the timer work function is to
kick the vcpu.
Kicking the vcpu does two things:
(1) If the vpcu thread is on a waitqueue, make it runnable and remove it
from the waitqueue.
(2) If the vcpu is running on a different physical CPU from the one
doing the kick, it sends a reschedule IPI.
The second case cannot happen, because the soft timer is only ever
scheduled when the vcpu is not running. The first case is only relevant
when the vcpu thread is on a waitqueue, which is only the case when the
vcpu thread has called kvm_vcpu_block().
Therefore, we only need to make sure a timer is scheduled for
kvm_vcpu_block(), which we do by encapsulating all calls to
kvm_vcpu_block() with kvm_timer_{un}schedule calls.
Additionally, we only schedule a soft timer if the timer is enabled and
unmasked, since it is useless otherwise.
Note that theoretically userspace can use the SET_ONE_REG interface to
change registers that should cause the timer to fire, even if the vcpu
is blocked without a scheduled timer, but this case was not supported
before this patch and we leave it for future work for now.
Signed-off-by: Christoffer Dall <christoffer.dall@linaro.org>
Some times it is useful for architecture implementations of KVM to know
when the VCPU thread is about to block or when it comes back from
blocking (arm/arm64 needs to know this to properly implement timers, for
example).
Therefore provide a generic architecture callback function in line with
what we do elsewhere for KVM generic-arch interactions.
Reviewed-by: Marc Zyngier <marc.zyngier@arm.com>
Signed-off-by: Christoffer Dall <christoffer.dall@linaro.org>
Use the system wide value of ID_AA64DFR0 to make safer decisions
Signed-off-by: Suzuki K. Poulose <suzuki.poulose@arm.com>
Tested-by: Dave Martin <Dave.Martin@arm.com>
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
Extend struct arm64_cpu_capabilities to handle the HWCAP detection
and make use of the system wide value of the feature registers for
a reliable set of HWCAPs.
Signed-off-by: Suzuki K. Poulose <suzuki.poulose@arm.com>
Tested-by: Dave Martin <Dave.Martin@arm.com>
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
Now that we can reliably read the system wide safe value for a
feature register, use that to compute the system capability.
This patch also replaces the 'feature-register-specific'
methods with a generic routine to check the capability.
Signed-off-by: Suzuki K. Poulose <suzuki.poulose@arm.com>
Tested-by: Dave Martin <Dave.Martin@arm.com>
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
At the moment we run through the arm64_features capability list for
each CPU and set the capability if one of the CPU supports it. This
could be problematic in a heterogeneous system with differing capabilities.
Delay the CPU feature checks until all the enabled CPUs are up(i.e,
smp_cpus_done(), so that we can make better decisions based on the
overall system capability. Once we decide and advertise the capabilities
the alternatives can be applied. From this state, we cannot roll back
a feature to disabled based on the values from a new hotplugged CPU,
due to the runtime patching and other reasons. So, for all new CPUs,
we need to make sure that they have the established system capabilities.
Failing which, we bring the CPU down, preventing it from turning online.
Once the capabilities are decided, any new CPU booting up goes through
verification to ensure that it has all the enabled capabilities and also
invokes the respective enable() method on the CPU.
The CPU errata checks are not delayed and is still executed per-CPU
to detect the respective capabilities. If we ever come across a non-errata
capability that needs to be checked on each-CPU, we could introduce them via
a new capability table(or introduce a flag), which can be processed per CPU.
The next patch will make the feature checks use the system wide
safe value of a feature register.
NOTE: The enable() methods associated with the capability is scheduled
on all the CPUs (which is the only use case at the moment). If we need
a different type of 'enable()' which only needs to be run once on any CPU,
we should be able to handle that when needed.
Signed-off-by: Suzuki K. Poulose <suzuki.poulose@arm.com>
Tested-by: Dave Martin <Dave.Martin@arm.com>
[catalin.marinas@arm.com: static variable and coding style fixes]
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
check_cpu_capabilities runs through a given list of caps and
checks if the system has the cap, updates the system capability
bitmap and also runs any enable() methods associated with them.
All of this is not quite obvious from the name 'check'. This
patch splits the check_cpu_capabilities into two parts :
1) update_cpu_capabilities
=> Runs through the given list and updates the system
wide capability map.
2) enable_cpu_capabilities
=> Runs through the given list and invokes enable() (if any)
for the caps enabled on the system.
Cc: Andre Przywara <andre.przywara@arm.com>
Cc: Will Deacon <will.deacon@arm.com>
Suggested-by: Catalin Marinas <catalin.marinsa@arm.com>
Signed-off-by: Suzuki K. Poulose <suzuki.poulose@arm.com>
Tested-by: Dave Martin <Dave.Martin@arm.com>
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
Make use of the system wide safe register to decide the support
for mixed endian.
Signed-off-by: Suzuki K. Poulose <suzuki.poulose@arm.com>
Tested-by: Dave Martin <Dave.Martin@arm.com>
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
Add an API for reading the safe CPUID value across the
system from the new infrastructure.
Signed-off-by: Suzuki K. Poulose <suzuki.poulose@arm.com>
Tested-by: Dave Martin <Dave.Martin@arm.com>
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
This patch consolidates the CPU Sanity check to the new infrastructure.
Cc: Mark Rutland <mark.rutland@arm.com>
Signed-off-by: Suzuki K. Poulose <suzuki.poulose@arm.com>
Tested-by: Dave Martin <Dave.Martin@arm.com>
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
This patch adds an infrastructure to keep track of the CPU feature
registers on the system. For each register, the infrastructure keeps
track of the system wide safe value of the feature bits. Also, tracks
the which fields of a register should be matched strictly across all
the CPUs on the system for the SANITY check infrastructure.
The feature bits are classified into following 3 types depending on
the implication of the possible values. This information is used to
decide the safe value for a feature.
LOWER_SAFE - The smaller value is safer
HIGHER_SAFE - The bigger value is safer
EXACT - We can't decide between the two, so a predefined safe_value is used.
This infrastructure will be later used to make better decisions for:
- Kernel features (e.g, KVM, Debug)
- SANITY Check
- CPU capability
- ELF HWCAP
- Exposing CPU Feature register to userspace.
Signed-off-by: Suzuki K. Poulose <suzuki.poulose@arm.com>
Tested-by: Dave Martin <Dave.Martin@arm.com>
[catalin.marinas@arm.com: whitespace fix]
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
Introduce a helper to extract cpuid feature for any given
width.
Signed-off-by: Suzuki K. Poulose <suzuki.poulose@arm.com>
Tested-by: Dave Martin <Dave.Martin@arm.com>
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
Move the mixed endian support detection code to cpufeature.c
from cpuinfo.c. This also moves the update_cpu_features()
used by mixed endian detection code, which will get more
functionality.
Also moves the ID register field shifts to asm/sysreg.h,
where all the useful definitions will end up in later patches.
Signed-off-by: Suzuki K. Poulose <suzuki.poulose@arm.com>
Tested-by: Dave Martin <Dave.Martin@arm.com>
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
Delay the ELF HWCAP initialisation until all the (enabled) CPUs are
up, i.e, smp_cpus_done(). This is in preparation for detecting the
common features across the CPUS and creating a consistent ELF HWCAP
for the system.
Signed-off-by: Suzuki K. Poulose <suzuki.poulose@arm.com>
Tested-by: Dave Martin <Dave.Martin@arm.com>
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
This patch turns on the 16K page support in the kernel. We
support 48bit VA (4 level page tables) and 47bit VA (3 level
page tables).
With 16K we can map 128 entries using contiguous bit hint
at level 3 to map 2M using single TLB entry.
TODO: 16K supports 32 contiguous entries at level 2 to get us
1G(which is not yet supported by the infrastructure). That should
be a separate patch altogether.
Cc: Will Deacon <will.deacon@arm.com>
Cc: Jeremy Linton <jeremy.linton@arm.com>
Cc: Marc Zyngier <marc.zyngier@arm.com>
Cc: Christoffer Dall <christoffer.dall@linaro.org>
Cc: Steve Capper <steve.capper@linaro.org>
Signed-off-by: Suzuki K. Poulose <suzuki.poulose@arm.com>
Reviewed-by: Ard Biesheuvel <ard.biesheuvel@linaro.org>
Tested-by: Ard Biesheuvel <ard.biesheuvel@linaro.org>
Acked-by: Mark Rutland <mark.rutland@arm.com>
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
Ensure that the selected page size is supported by the CPU(s). If it doesn't
park it.
Cc: Will Deacon <will.deacon@arm.com>
Signed-off-by: Suzuki K. Poulose <suzuki.poulose@arm.com>
Reviewed-by: Ard Biesheuvel <ard.biesheuvel@linaro.org>
Tested-by: Ard Biesheuvel <ard.biesheuvel@linaro.org>
Acked-by: Mark Rutland <mark.rutland@arm.com>
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
We choose NR_FIX_BTMAPS such that each slot (NR_FIX_BTMAPS * PAGE_SIZE)
can address 256K.
Use division to derive NR_FIX_BTMAPS rather than defining it for each
page size.
Signed-off-by: Mark Rutland <mark.rutland@arm.com>
Signed-off-by: Suzuki K. Poulose <suzuki.poulose@arm.com>
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
We use !CONFIG_ARM64_64K_PAGES for CONFIG_ARM64_4K_PAGES
(and vice versa) in code. It all worked well, so far since
we only had two options. Now, with the introduction of 16K,
these cases will break. This patch cleans up the code to
use the required CONFIG symbol expression without the assumption
that !64K => 4K (and vice versa)
Cc: Will Deacon <will.deacon@arm.com>
Acked-by: Mark Rutland <mark.rutland@arm.com>
Signed-off-by: Suzuki K. Poulose <suzuki.poulose@arm.com>
Reviewed-by: Ard Biesheuvel <ard.biesheuvel@linaro.org>
Tested-by: Ard Biesheuvel <ard.biesheuvel@linaro.org>
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
Now that we can calculate the number of levels required for
mapping a va width, reserve exact number of pages that would
be required to cover the idmap. The idmap should be able to handle
the maximum physical address size supported.
Cc: Will Deacon <will.deacon@arm.com>
Signed-off-by: Suzuki K. Poulose <suzuki.poulose@arm.com>
Reviewed-by: Ard Biesheuvel <ard.biesheuvel@linaro.org>
Tested-by: Ard Biesheuvel <ard.biesheuvel@linaro.org>
Acked-by: Mark Rutland <mark.rutland@arm.com>
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
Introduce helpers for finding the number of page table
levels required for a given VA width, shift for a particular
page table level.
Convert the existing users to the new helpers. More users
to follow.
Cc: Ard Biesheuvel <ard.biesheuvel@linaro.org>
Cc: Will Deacon <will.deacon@arm.com>
Cc: Marc Zyngier <marc.zyngier@arm.com>
Signed-off-by: Suzuki K. Poulose <suzuki.poulose@arm.com>
Acked-by: Christoffer Dall <christoffer.dall@linaro.org>
Acked-by: Mark Rutland <mark.rutland@arm.com>
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
We use section maps with 4K page size to create the swapper/idmaps.
So far we have used !64K or 4K checks to handle the case where we
use the section maps.
This patch adds a new symbol, ARM64_SWAPPER_USES_SECTION_MAPS, to
handle cases where we use section maps, instead of using the page size
symbols.
Cc: Will Deacon <will.deacon@arm.com>
Signed-off-by: Suzuki K. Poulose <suzuki.poulose@arm.com>
Reviewed-by: Ard Biesheuvel <ard.biesheuvel@linaro.org>
Acked-by: Mark Rutland <mark.rutland@arm.com>
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
Move the kernel pagetable (both swapper and idmap) definitions
from the generic asm/page.h to a new file, asm/kernel-pgtable.h.
This is mostly a cosmetic change, to clean up the asm/page.h to
get rid of the arch specific details which are not needed by the
generic code.
Also renames the symbols to prevent conflicts. e.g,
BLOCK_SHIFT => SWAPPER_BLOCK_SHIFT
Cc: Will Deacon <will.deacon@arm.com>
Signed-off-by: Suzuki K. Poulose <suzuki.poulose@arm.com>
Reviewed-by: Ard Biesheuvel <ard.biesheuvel@linaro.org>
Tested-by: Ard Biesheuvel <ard.biesheuvel@linaro.org>
Acked-by: Mark Rutland <mark.rutland@arm.com>
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
These were introduced by commit 03875ad52f (arm64: add
kc_offset_to_vaddr and kc_vaddr_to_offset macro).
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
With iommu_dma_ops in place, hook them up to the configuration code, so
IOMMU-fronted devices will get them automatically.
Acked-by: Catalin Marinas <catalin.marinas@arm.com>
Signed-off-by: Robin Murphy <robin.murphy@arm.com>
Signed-off-by: Joerg Roedel <jroedel@suse.de>
Commit 208473c1f3 ("ARM: wire up new syscalls") hooked up the new
userfaultfd and membarrier syscalls for ARM, so do the same for our
compat syscall table in arm64.
Signed-off-by: Will Deacon <will.deacon@arm.com>
This reverts commit 1b6d7f8742.
This patch would conflict with Dan Williams' "tree-wide convert to
memremap()" series (ioremap_cache replaced by arch_memremap)
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
This patch add kc_offset_to_vaddr() and kc_vaddr_to_offset(),
the default version doesn't work on arm64, because arm64 kernel address
is below the PAGE_OFFSET, like module address and vmemmap address are
all below PAGE_OFFSET address.
Signed-off-by: yalin wang <yalin.wang2010@gmail.com>
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
Add ioremap_cache macro, because some code will test if this macro
is defined or not, and will generate a generric version if not defined,
for example, memremap.c do like this.
Signed-off-by: yalin wang <yalin.wang2010@gmail.com>
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
Sparse reports some new issues introduced by the kasan patches:
arch/arm64/mm/kasan_init.c:91:13: warning: no previous prototype for
'kasan_early_init' [-Wmissing-prototypes] void __init kasan_early_init(void)
^
arch/arm64/mm/kasan_init.c:91:13: warning: symbol 'kasan_early_init'
was not declared. Should it be static? [sparse]
This patch resolves the problem by adding a prototype for
kasan_early_init and marking the function as asmlinkage, since it's only
called from head.S.
Signed-off-by: Will Deacon <will.deacon@arm.com>
Acked-by: Andrey Ryabinin <ryabinin.a.a@gmail.com>
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
We want the tty fixes and reverts in here as well so that people can
properly test and use it.
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
This patch adds arch specific code for kernel address sanitizer
(see Documentation/kasan.txt).
1/8 of kernel addresses reserved for shadow memory. There was no
big enough hole for this, so virtual addresses for shadow were
stolen from vmalloc area.
At early boot stage the whole shadow region populated with just
one physical page (kasan_zero_page). Later, this page reused
as readonly zero shadow for some memory that KASan currently
don't track (vmalloc).
After mapping the physical memory, pages for shadow memory are
allocated and mapped.
Functions like memset/memmove/memcpy do a lot of memory accesses.
If bad pointer passed to one of these function it is important
to catch this. Compiler's instrumentation cannot do this since
these functions are written in assembly.
KASan replaces memory functions with manually instrumented variants.
Original functions declared as weak symbols so strong definitions
in mm/kasan/kasan.c could replace them. Original functions have aliases
with '__' prefix in name, so we could call non-instrumented variant
if needed.
Some files built without kasan instrumentation (e.g. mm/slub.c).
Original mem* function replaced (via #define) with prefixed variants
to disable memory access checks for such files.
Signed-off-by: Andrey Ryabinin <ryabinin.a.a@gmail.com>
Tested-by: Linus Walleij <linus.walleij@linaro.org>
Reviewed-by: Catalin Marinas <catalin.marinas@arm.com>
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
This will be used by KASAN latter.
Signed-off-by: Andrey Ryabinin <ryabinin.a.a@gmail.com>
Acked-by: Catalin Marinas <catalin.marinas@arm.com>
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
Commit 654672d4ba ("locking/atomics: Add _{acquire|release|relaxed}()
variants of some atomic operation") introduced a relaxed atomic API to
Linux that maps nicely onto the arm64 memory model, including the new
ARMv8.1 atomic instructions.
This patch hooks up the API to our relaxed atomic instructions, rather
than have them all expand to the full-barrier variants as they do
currently.
Signed-off-by: Will Deacon <will.deacon@arm.com>
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
For more control over which functions are called with the MMU off or
with the UEFI 1:1 mapping active, annotate some assembler routines as
position independent. This is done by introducing ENDPIPROC(), which
replaces the ENDPROC() declaration of those routines.
Signed-off-by: Ard Biesheuvel <ard.biesheuvel@linaro.org>
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
On 32bit platforms, we cannot assure that an I/O ldrd or strd will be
done atomically. Besides, an hypervisor would be unable to emulate such
accesses.
In order to allow the AArch32 version of the driver to split them into
two 32bit accesses while keeping the requirement for atomic writes, this
patch specializes the IROUTER and TYPER accesses.
Since the latter is an ID register, it won't need to be read atomically,
but we still avoid future confusion by using gic_read_typer instead of a
generic gic_readq.
Reviewed-by: Marc Zyngier <marc.zyngier@arm.com>
Signed-off-by: Jean-Philippe Brucker <jean-philippe.brucker@arm.com>
Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
This patch does a few simple compatibility-related changes:
- change the system register access prototypes to their actual size,
- homogenise mpidr accesses with unsigned long,
- force the 64bit register values to unsigned long long.
Note: the list registers are 64bit on GICv3, but the AArch32 vGIC driver
will need to split their values into two 32bit registers: LRn and LRCn.
Reviewed-by: Marc Zyngier <marc.zyngier@arm.com>
Signed-off-by: Jean-Philippe Brucker <jean-philippe.brucker@arm.com>
Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
This patch moves the GICv3 system register access helpers to
arch/arm64/. Their 32bit counterparts will need to use mrc/mcr accesses
instead of mrs_s/msr_s.
[maz: fixed conflict with Cavium erratum handling]
Reviewed-by: Marc Zyngier <marc.zyngier@arm.com>
Signed-off-by: Jean-Philippe Brucker <jean-philippe.brucker@arm.com>
Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
When cpu is disabled, all irqs will be migratged to another cpu.
In some cases, a new affinity is different, the old affinity need
to be updated and if irq_set_affinity's return value is IRQ_SET_MASK_OK_DONE,
the old affinity can not be updated. Fix it by using irq_do_set_affinity.
And migrating interrupts is a core code matter, so use the generic
function irq_migrate_all_off_this_cpu() to migrate interrupts in
kernel/irq/migration.c.
Cc: Jiang Liu <jiang.liu@linux.intel.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Will Deacon <will.deacon@arm.com>
Cc: Russell King - ARM Linux <linux@arm.linux.org.uk>
Cc: Hanjun Guo <hanjun.guo@linaro.org>
Acked-by: Marc Zyngier <marc.zyngier@arm.com>
Signed-off-by: Yang Yingliang <yangyingliang@huawei.com>
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
The default page attributes for a PMD being broken should have the CONT bit
set. Create a new definition for an early boot range of PTE's that are
contiguous.
Signed-off-by: Jeremy Linton <jeremy.linton@arm.com>
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
Add the supporting macros to check if the contiguous bit
is set, set the bit, or clear it in a PTE entry.
Signed-off-by: Jeremy Linton <jeremy.linton@arm.com>
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
Define the bit positions in the PTE and PMD for the
contiguous bit.
Signed-off-by: Jeremy Linton <jeremy.linton@arm.com>
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
Add the number of pages required to form a contiguous range,
as well as some supporting constants.
Signed-off-by: Jeremy Linton <jeremy.linton@arm.com>
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
Now that the arm_pmu framework has been factored out to drivers/perf we
can make use of it for arm64, gaining support for heterogeneous PMUs
and unifying the two codebases before they diverge further.
The as yet unused PMU name for PMUv3 is changed to armv8_pmuv3, matching
the style previously applied to the 32-bit PMUs.
Signed-off-by: Mark Rutland <mark.rutland@arm.com>
Acked-by: Will Deacon <will.deacon@arm.com>
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
update_mmu_cache() consists of a dsb(ishst) instruction so that new user
mappings are guaranteed to be visible to the page table walker on
exception return.
In reality this can be a very expensive operation which is rarely needed.
Removing this barrier shows a modest improvement in hackbench scores and
, in the worst case, we re-take the user fault and establish that there
was nothing to do.
Reviewed-by: Catalin Marinas <catalin.marinas@arm.com>
Signed-off-by: Will Deacon <will.deacon@arm.com>
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
__flush_tlb_pgtable is used to invalidate intermediate page table
entries after they have been cleared and are about to be freed. Since
pXd_clear imply memory barriers, we don't need the extra one here.
Reviewed-by: Catalin Marinas <catalin.marinas@arm.com>
Signed-off-by: Will Deacon <will.deacon@arm.com>
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
switch_mm performs some checks to try and avoid entering the ASID
allocator:
(1) If we're switching to the init_mm (no user mappings), then simply
set a reserved TTBR0 value with no page table (the zero page)
(2) If prev == next *and* the mm_cpumask indicates that we've run on
this CPU before, then we can skip the allocator.
However, there is plenty of redundancy here. With the new ASID allocator,
if prev == next, then we know that our ASID is valid and do not need to
worry about re-allocation. Consequently, we can drop the mm_cpumask check
in (2) and move the prev == next check before the init_mm check, since
if prev == next == init_mm then there's nothing to do.
Reviewed-by: Catalin Marinas <catalin.marinas@arm.com>
Signed-off-by: Will Deacon <will.deacon@arm.com>
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
The TLB gather code sets fullmm=1 when tearing down the entire address
space for an mm_struct on exit or execve. Given that the ASID allocator
will never re-allocate a dirty ASID, this flushing is not needed and can
simply be avoided in the flushing code.
Reviewed-by: Catalin Marinas <catalin.marinas@arm.com>
Signed-off-by: Will Deacon <will.deacon@arm.com>
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
The ASID macro returns a 64-bit (long long) value, so there is no need
to cast to (unsigned long) before shifting prior to a TLBI operation.
Reviewed-by: Catalin Marinas <catalin.marinas@arm.com>
Signed-off-by: Will Deacon <will.deacon@arm.com>
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
Our current switch_mm implementation suffers from a number of problems:
(1) The ASID allocator relies on IPIs to synchronise the CPUs on a
rollover event
(2) Because of (1), we cannot allocate ASIDs with interrupts disabled
and therefore make use of a TIF_SWITCH_MM flag to postpone the
actual switch to finish_arch_post_lock_switch
(3) We run context switch with a reserved (invalid) TTBR0 value, even
though the ASID and pgd are updated atomically
(4) We take a global spinlock (cpu_asid_lock) during context-switch
(5) We use h/w broadcast TLB operations when they are not required
(e.g. in flush_context)
This patch addresses these problems by rewriting the ASID algorithm to
match the bitmap-based arch/arm/ implementation more closely. This in
turn allows us to remove much of the complications surrounding switch_mm,
including the ugly thread flag.
Reviewed-by: Catalin Marinas <catalin.marinas@arm.com>
Signed-off-by: Will Deacon <will.deacon@arm.com>
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
There are a number of places where a single CPU is running with a
private page-table and we need to perform maintenance on the TLB and
I-cache in order to ensure correctness, but do not require the operation
to be broadcast to other CPUs.
This patch adds local variants of tlb_flush_all and __flush_icache_all
to support these use-cases and updates the callers respectively.
__local_flush_icache_all also implies an isb, since it is intended to be
used synchronously.
Reviewed-by: Catalin Marinas <catalin.marinas@arm.com>
Acked-by: David Daney <david.daney@cavium.com>
Acked-by: Ard Biesheuvel <ard.biesheuvel@linaro.org>
Signed-off-by: Will Deacon <will.deacon@arm.com>
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
With commit b08d4640a3 ("arm64: remove dead code"),
cpu_set_idmap_tcr_t0sz is no longer called and can therefore be removed
from the kernel.
This patch removes the function and effectively inlines the helper
function __cpu_set_tcr_t0sz into cpu_set_default_tcr_t0sz.
Reviewed-by: Catalin Marinas <catalin.marinas@arm.com>
Acked-by: Ard Biesheuvel <ard.biesheuvel@linaro.org>
Signed-off-by: Will Deacon <will.deacon@arm.com>
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
In order to not use lengthy (UL(0xffffffffffffffff) << VA_BITS) everywhere,
replace it with VA_START.
Signed-off-by: Andrey Ryabinin <ryabinin.a.a@gmail.com>
Reviewed-by: Catalin Marinas <catalin.marinas@arm.com>
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
Add support for debug communications channel based
hvc console for arm64 cpus.
Signed-off-by: Abhimanyu Kapur <abhimany@codeaurora.org>
Signed-off-by: Timur Tabi <timur@codeaurora.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
6910fa1 ("arm64: enable PTE type bit in the mask for pte_modify") fixes
a problem whereby a large block of PROT_NONE mapped memory is
incorrectly mapped as block descriptors when mprotect is called.
Unfortunately, a subtle bug was introduced by this fix to the THP logic.
If one mmaps a large block of memory, then faults it such that it is
collapsed into THPs; resulting calls to mprotect on this area of memory
will lead to incorrect table descriptors being written instead of block
descriptors. This is because pmd_modify calls pte_modify which is now
allowed to modify the type of the page table entry.
This patch reverts commit 6910fa16db, and
fixes the problem it was trying to address by adjusting PAGE_NONE to
represent a table entry. Thus no change in pte type is required when
moving from PROT_NONE to a different protection.
Fixes: 6910fa16db ("arm64: enable PTE type bit in the mask for pte_modify")
Cc: <stable@vger.kernel.org> # 4.0+
Cc: Feng Kan <fkan@apm.com>
Reported-by: Ganapatrao Kulkarni <Ganapatrao.Kulkarni@caviumnetworks.com>
Tested-by: Ganapatrao Kulkarni <gkulkarni@caviumnetworks.com>
Reviewed-by: Catalin Marinas <catalin.marinas@arm.com>
Signed-off-by: Steve Capper <steve.capper@linaro.org>
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
Now that we have a basic infrastructure to register irqchips and
call them on discovery of a matching entry in MADT, convert the
GIC driver to this new probing method.
It ends up being a code deletion party, which is a rather good thing.
Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
Acked-by: Catalin Marinas <catalin.marinas@arm.com>
Reviewed-by: Hanjun Guo <hanjun.guo@linaro.org>
Acked-by: Thomas Gleixner <tglx@linutronix.de>
Tested-by: Hanjun Guo <hanjun.guo@linaro.org>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
DT enjoys a rather nice probing infrastructure for irqchips, while
ACPI is so far stuck into a very distant past.
This patch introduces a declarative API, allowing irqchips to be
self-contained and be called when a particular entry is matched
in the MADT table.
Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
Acked-by: Catalin Marinas <catalin.marinas@arm.com>
Reviewed-by: Hanjun Guo <hanjun.guo@linaro.org>
Acked-by: Thomas Gleixner <tglx@linutronix.de>
Tested-by: Hanjun Guo <hanjun.guo@linaro.org>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
This patch implements Cavium ThunderX erratum 23154.
The gicv3 of ThunderX requires a modified version for reading the IAR
status to ensure data synchronization. Since this is in the fast-path
and called with each interrupt, runtime patching is used using jump
label patching for smallest overhead (no-op). This is the same
technique as used for tracepoints.
Signed-off-by: Robert Richter <rrichter@cavium.com>
Reviewed-by: Marc Zygnier <marc.zyngier@arm.com>
Acked-by: Catalin Marinas <catalin.marinas@arm.com>
Cc: Tirumalesh Chalamarla <tchalamarla@cavium.com>
Cc: linux-arm-kernel@lists.infradead.org
Cc: Jason Cooper <jason@lakedaemon.net>
Cc: Will Deacon <will.deacon@arm.com>
Link: http://lkml.kernel.org/r/1442869119-1814-3-git-send-email-rric@kernel.org
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
and a few PPC bug fixes too.
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v2.0.22 (GNU/Linux)
iQEcBAABAgAGBQJWBSn7AAoJEL/70l94x66Dxd4H/RT6kWWj9x4grEYUkcJUDyK2
AXm7XcKQm04auwAic8Otr+ts/Qix/50kWmBe/TU0QLgqb8rj5Dj3yGFK6Z1y6mAz
KvaxqMJd4tZGTqN0DDvC2ItEdzjfAdeJZo/FHXqPHVspG0G14T7STLna02LTBBEJ
tNzY9qor8nFhg2fT2szqKaudUNgTqkCTpo57o2BrHE96SHG+m0WdpQCV1F5hPVpg
Te0Pb7qX9xng5n3sQ7IV/t3QYbrza1ACwNQS9XJa0Yu6iEz7JdmVmzHQASK9ynn6
hUHhsNYGx4IsPjPtfJk2GroNaRDZL+VMzw07tfcOvPx8xkS9hS63pwzmSBqfLrM=
=Ywqn
-----END PGP SIGNATURE-----
Merge tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm
Pull KVM fixes from Paolo Bonzini:
"AMD fixes for bugs introduced in the 4.2 merge window, and a few PPC
bug fixes too"
* tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm:
KVM: disable halt_poll_ns as default for s390x
KVM: x86: fix off-by-one in reserved bits check
KVM: x86: use correct page table format to check nested page table reserved bits
KVM: svm: do not call kvm_set_cr0 from init_vmcb
KVM: x86: trap AMD MSRs for the TSeg base and mask
KVM: PPC: Book3S: Take the kvm->srcu lock in kvmppc_h_logical_ci_load/store()
KVM: PPC: Book3S HV: Pass the correct trap argument to kvmhv_commence_exit
KVM: PPC: Book3S HV: Fix handling of interrupted VCPUs
kvm: svm: reset mmu on VCPU reset
We observed some performance degradation on s390x with dynamic
halt polling. Until we can provide a proper fix, let's enable
halt_poll_ns as default only for supported architectures.
Architectures are now free to set their own halt_poll_ns
default value.
Signed-off-by: David Hildenbrand <dahi@linux.vnet.ibm.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
This patch makes sure that atomic_{read,set}() are at least
{READ,WRITE}_ONCE().
We already had the 'requirement' that atomic_read() should use
ACCESS_ONCE(), and most archs had this, but a few were lacking.
All are now converted to use READ_ONCE().
And, by a symmetry and general paranoia argument, upgrade atomic_set()
to use WRITE_ONCE().
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Dmitry Vyukov <dvyukov@google.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: james.hogan@imgtec.com
Cc: linux-kernel@vger.kernel.org
Cc: oleg@redhat.com
Cc: will.deacon@arm.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v2.0.22 (GNU/Linux)
iQEcBAABAgAGBQJV+/ucAAoJEL/70l94x66DV8YH/1KDym/1GJ+/Br/YkHZnM53l
3Q0PwSLu9cNcIL9lUuDLwGTaVj+y8ud1Hjr/uzvKwivktmUYVZhkdtnZmnanvGOM
qKB9K3nFXCPx8uqy8Dn7fOwEKcg9FmDOTTkWy13HDnXO+V4crSVVt+rPw+6FUMld
NV5tYdw9Lu7y3XrveDebPWaPtyDL7OAagzmeK47eMffxG7X9Hf1H2aT7HueRi7x/
SkLIe3gmiOWmHVJDPE9TOmFYIj19gywDFysKes1gdVJLVUIXiELMT7SrvAYnToVB
zISIEj7Zx4SINPxpf2dUn8REm7NsmJY+PffLIl/Nv+ozGggFQGFH0SMZ08p0bxw=
=tfmn
-----END PGP SIGNATURE-----
Merge tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm
Pull KVM fixes from Paolo Bonzini:
"Mostly stable material, a lot of ARM fixes"
* tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm: (22 commits)
sched: access local runqueue directly in single_task_running
arm/arm64: KVM: Remove 'config KVM_ARM_MAX_VCPUS'
arm64: KVM: Remove all traces of the ThumbEE registers
arm: KVM: Disable virtual timer even if the guest is not using it
arm64: KVM: Disable virtual timer even if the guest is not using it
arm/arm64: KVM: vgic: Check for !irqchip_in_kernel() when mapping resources
KVM: s390: Replace incorrect atomic_or with atomic_andnot
arm: KVM: Fix incorrect device to IPA mapping
arm64: KVM: Fix user access for debug registers
KVM: vmx: fix VPID is 0000H in non-root operation
KVM: add halt_attempted_poll to VCPU stats
kvm: fix zero length mmio searching
kvm: fix double free for fast mmio eventfd
kvm: factor out core eventfd assign/deassign logic
kvm: don't try to register to KVM_FAST_MMIO_BUS for non mmio eventfd
KVM: make the declaration of functions within 80 characters
KVM: arm64: add workaround for Cortex-A57 erratum #852523
KVM: fix polling for guest halt continued even if disable it
arm/arm64: KVM: Fix PSCI affinity info return value for non valid cores
arm64: KVM: set {v,}TCR_EL2 RES1 bits
...
Pull irq updates from Thomas Gleixner:
"This is a rather large update post rc1 due to the final steps of
cleanups and API changes which had to wait for the preparatory patches
to hit your tree.
- Regression fixes for ARM GIC irqchips
- Regression fixes and lockdep anotations for renesas irq chips
- The leftovers of the cleanup and preparatory patches which have
been ignored by maintainers
- Final conversions of the newly merged users of obsolete APIs
- Final removal of obsolete APIs
- Final removal of ARM artifacts which had been introduced during the
conversion of ARM to the generic interrupt code.
- Final split of the irq_data into chip specific and common data to
reflect the needs of hierarchical irq domains.
- Treewide removal of the first argument of interrupt flow handlers,
i.e. the irq number, which is not used by the majority of handlers
and simple to retrieve from the other argument the irq descriptor.
- A few comment updates and build warning fixes"
* 'irq-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (40 commits)
arm64: Remove ununsed set_irq_flags
ARM: Remove ununsed set_irq_flags
sh: Kill off set_irq_flags usage
irqchip: Kill off set_irq_flags usage
gpu/drm: Kill off set_irq_flags usage
genirq: Remove irq argument from irq flow handlers
genirq: Move field 'msi_desc' from irq_data into irq_common_data
genirq: Move field 'affinity' from irq_data into irq_common_data
genirq: Move field 'handler_data' from irq_data into irq_common_data
genirq: Move field 'node' from irq_data into irq_common_data
irqchip/gic-v3: Use IRQD_FORWARDED_TO_VCPU flag
irqchip/gic: Use IRQD_FORWARDED_TO_VCPU flag
genirq: Provide IRQD_FORWARDED_TO_VCPU status flag
genirq: Simplify irq_data_to_desc()
genirq: Remove __irq_set_handler_locked()
pinctrl/pistachio: Use irq_set_handler_locked
gpio: vf610: Use irq_set_handler_locked
powerpc/mpc8xx: Use irq_set_handler_locked()
powerpc/ipic: Use irq_set_handler_locked()
powerpc/cpm2: Use irq_set_handler_locked()
...
- Workaround for a Cortex-A57 erratum
- Bug fix for the debugging infrastructure
- Fix for 32bit guests with more than 4GB of address space
on a 32bit host
- A number of fixes for the (unusual) case when we don't use
the in-kernel GIC emulation
- Removal of ThumbEE handling on arm64, since these have been
dropped from the architecture before anyone actually ever
built a CPU
- Remove the KVM_ARM_MAX_VCPUS limitation which has become
fairly pointless
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1
iQIcBAABAgAGBQJV+slQAAoJECPQ0LrRPXpD+WUQAMLC3ZUasJX1gsVixd++zAwB
FXu0TFlKCUsLWllXZtyhGI6ya7ljuCzfhRbA/eZFmFVbwDnULt1p5ahw7eHCIZ2a
yY93TS6XN3YHwVpY7f2lDsvLhBLyWeTdWhj5TtLy6mslQyEUqxdmsiC7gl40Fp2S
8tKIxoYYRpmbgKl/Lbi8GxdHH6c0aQ2Nt7Fq4nV9dJqy5tiGdg6OxqgU/rVmkdkv
Rv1jrdtncstNRi9NBbKRRDp5DTqWboF35HJQpdIRpR8jJTLuuzzCimP5Hz9crKuO
uXchIq2GtQB60NklZtPL15zMdmfdq+JHwdC14v05kB5Ai8NThGwKYQ3JF+krO3cG
RKsAlrIq0AwPN8hAboLcKGzjLFFryaHZsa+d7elxaaDQz1FGz4uP56fIUURoGZuX
vWTsKLRKcuPCYtnV6Frg2BCTB6nq1cRgjmMC9TABnraelZ3z0lDl4wFngg4aL2u6
QYOdP8L++/S1HAPOF7VhFYndXkbM3KoVLAepev8jvzRnwg4QVrqsvfgwFSdMNcMz
ga7bJ4pUEP+Qq1i0qc41P9O708bCGm7TIw3CzTdKIZhc/l0t137lw1rhv67JfXZh
cAni4osjhpdZUT0F9lIl/6OQB3Kgk6on3cs909Y/tT1srh9s+iVO1AwpGY1j5T4j
gFRy90o2LBuepoI/8yF3
=NNtz
-----END PGP SIGNATURE-----
Merge tag 'kvm-arm-for-4.3-rc2-2' of git://git.kernel.org/pub/scm/linux/kernel/git/kvmarm/kvmarm into kvm-master
Second set of KVM/ARM changes for 4.3-rc2
- Workaround for a Cortex-A57 erratum
- Bug fix for the debugging infrastructure
- Fix for 32bit guests with more than 4GB of address space
on a 32bit host
- A number of fixes for the (unusual) case when we don't use
the in-kernel GIC emulation
- Removal of ThumbEE handling on arm64, since these have been
dropped from the architecture before anyone actually ever
built a CPU
- Remove the KVM_ARM_MAX_VCPUS limitation which has become
fairly pointless
This patch removes config option of KVM_ARM_MAX_VCPUS,
and like other ARCHs, just choose the maximum allowed
value from hardware, and follows the reasons:
1) from distribution view, the option has to be
defined as the max allowed value because it need to
meet all kinds of virtulization applications and
need to support most of SoCs;
2) using a bigger value doesn't introduce extra memory
consumption, and the help text in Kconfig isn't accurate
because kvm_vpu structure isn't allocated until request
of creating VCPU is sent from QEMU;
3) the main effect is that the field of vcpus[] in 'struct kvm'
becomes a bit bigger(sizeof(void *) per vcpu) and need more cache
lines to hold the structure, but 'struct kvm' is one generic struct,
and it has worked well on other ARCHs already in this way. Also,
the world switch frequecy is often low, for example, it is ~2000
when running kernel building load in VM from APM xgene KVM host,
so the effect is very small, and the difference can't be observed
in my test at all.
Cc: Dann Frazier <dann.frazier@canonical.com>
Signed-off-by: Ming Lei <ming.lei@canonical.com>
Reviewed-by: Christoffer Dall <christoffer.dall@linaro.org>
Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
Although the ThumbEE registers and traps were present in earlier
versions of the v8 architecture, it was retrospectively removed and so
we can do the same.
Whilst this breaks migrating a guest started on a previous version of
the kernel, it is much better to kill these (non existent) registers
as soon as possible.
Reviewed-by: Marc Zyngier <marc.zyngier@arm.com>
Signed-off-by: Will Deacon <will.deacon@arm.com>
[maz: added commend about migration]
Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
Now that all users of set_irq_flags and custom flags are converted to
genirq functions, the ARM specific set_irq_flags can be removed.
Signed-off-by: Rob Herring <robh@kernel.org>
Acked-by: Catalin Marinas <catalin.marinas@arm.com>
Cc: linux-arm-kernel@lists.infradead.org
Cc: Russell King <linux@arm.linux.org.uk>
Cc: Will Deacon <will.deacon@arm.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
This new statistic can help diagnosing VCPUs that, for any reason,
trigger bad behavior of halt_poll_ns autotuning.
For example, say halt_poll_ns = 480000, and wakeups are spaced exactly
like 479us, 481us, 479us, 481us. Then KVM always fails polling and wastes
10+20+40+80+160+320+480 = 1110 microseconds out of every
479+481+479+481+479+481+479 = 3359 microseconds. The VCPU then
is consuming about 30% more CPU than it would use without
polling. This would show as an abnormally high number of
attempted polling compared to the successful polls.
Acked-by: Christian Borntraeger <borntraeger@de.ibm.com<
Reviewed-by: David Matlack <dmatlack@google.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
- Fix timer interrupt injection after the rework
that went in during the merge window
- Reset the timer to zero on reboot
- Make sure the TCR_EL2 RES1 bits are really set to 1
- Fix a PSCI affinity bug for non-existing vcpus
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1
iQIcBAABAgAGBQJV9VQYAAoJECPQ0LrRPXpDYFsQAIp+nIxv6LijQdq310EF7Z95
1j16vx9NfsAsNH0pwiRmx4gfNOHlPp6f30Kb1HJf2TN08Rc7TS8j4Dr3zYnZEdQj
9eNGTjCjlz93VQwKnTBvBvYkHsCEWC76pOxiYoFmnLsGc4m/OilHhohOzOZTAMFC
Le07VXJwrhGHQ5bjdIMmFj+5rMj4eWfT2bYg8uVl5EUNFNkDgAMtT/Nbml90friC
x+r2I9H+G8hHmemOv6hefW7l2JScLSYpqLeGdxZUtnGy9LNP3+jTD1QyqUftUPuS
HtcB4zCtRk62nMupgFNV514Kf26wcwpS53vhd8aM2kxSkpr5xkUFoL+uJNT8ssIH
Y+FwAOeTD0osWbmxw8Gf9d3qYzPBQSCRP3E4mCbNNNsTlMRU+Hu9au0VNBKggkfF
fZ4UphCVX2C+cWaQcB40950h1F76Vx3boAfcMgiXoO+8LJx9OogDtr8+j5cBBbKU
+tMngLhelsMtSCTam9c0jS6BsQfHDy3W1CQNl30cqd2RFjvygOSAqzdCMlBk0lLG
4Sq0eib1zJM+rE2OQs8SaAftyQ0kbElmM3XZB3ZiitaHgkUKV7kt9dMUG9Vfn2SE
/ycgR9eoBq5P6FKhyROmL+0g824nYJmn4fwpecw1rHoRr8jyZjvpbE3VSL0aI5XW
w4UYsAQSTcqM6bHpNYuu
=G02h
-----END PGP SIGNATURE-----
Merge tag 'kvm-arm-for-4.3-rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/kvmarm/kvmarm into kvm-master
KVM/ARM changes for 4.3-rc2
- Fix timer interrupt injection after the rework
that went in during the merge window
- Reset the timer to zero on reboot
- Make sure the TCR_EL2 RES1 bits are really set to 1
- Fix a PSCI affinity bug for non-existing vcpus
Depending on CONFIG_ARM64_HW_AFDBM, we use either bit 57 or 51 of the
pte to represent PTE_WRITE. Given that bit 51 is reserved prior to
ARMv8.1, we can just use that bit regardless of the config option. That
also matches what happens if a kernel configured with ARM64_HW_AFDBM=y
is run on a CPU without the DBM functionality.
Cc: Julien Grall <julien.grall@citrix.com>
Tested-by: Julien Grall <julien.grall@citrix.com>
Signed-off-by: Will Deacon <will.deacon@arm.com>
The pte_modify() function with hardware AF/DBM enabled must transfer the
hardware dirty information to the software PTE_DIRTY bit. However, it
was setting this bit in newprot and the mask does not cover such bit.
This patch sets PTE_DIRTY on the original pte which will be preserved in
the returned value.
Fixes: 2f4b829c62 ("arm64: Add support for hardware updates of the access and dirty pte bits")
Cc: Julien Grall <julien.grall@citrix.com>
Tested-by: Julien Grall <julien.grall@citrix.com>
Tested-by: Will Deacon <will.deacon@arm.com>
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
Signed-off-by: Will Deacon <will.deacon@arm.com>
Commit 2f4b829c62 ("arm64: Add support for hardware updates of the
access and dirty pte bits") introduced support for handling hardware
updates of the access flag and dirty status. The PTE is automatically
dirtied in hardware (if supported) by clearing the PTE_RDONLY bit when
the PTE_DBM/PTE_WRITE bit is set. The pte_hw_dirty() macro was added to
detect a hardware dirtied pte. The pte_dirty() macro checks for both
software PTE_DIRTY and pte_hw_dirty().
Functions like pte_modify() clear the PTE_RDONLY bit since it is meant
to be set in set_pte_at() when written to memory. In such cases,
pte_hw_dirty() would return true even though such pte is clean. This
patch changes pte_hw_dirty() to test the PTE_DBM/PTE_WRITE bit together
with PTE_RDONLY.
Fixes: 2f4b829c62 ("arm64: Add support for hardware updates of the access and dirty pte bits")
Reported-by: Julien Grall <julien.grall@citrix.com>
Tested-by: Julien Grall <julien.grall@citrix.com>
Tested-by: Will Deacon <will.deacon@arm.com>
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
Signed-off-by: Will Deacon <will.deacon@arm.com>
Table 8 of UEFI 2.5 section 2.3.6.1 defines mappings from EFI
memory types to MAIR attribute encodings for arm64.
If the physical address has memory attributes defined by EFI
memmap as EFI_MEMORY_[UC|WC|WT], return approprate page
protection type according to the UEFI spec. Otherwise, return
PAGE_KERNEL.
Signed-off-by: Jonathan (Zhixiong) Zhang <zjzhang@codeaurora.org>
Signed-off-by: Matt Fleming <matt.fleming@intel.com>
[ Small stylistic tweaks. ]
Reviewed-by: Matt Fleming <matt.fleming@intel.com>
Reviewed-by: Ard Biesheuvel <ard.biesheuvel@linaro.org>
Reviewed-by: Catalin Marinas <catalin.marinas@arm.com>
Acked-by: Hanjun Guo <hanjun.guo@linaro.org>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Will Deacon <will.deacon@arm.com>
Link: http://lkml.kernel.org/r/1441372302-23242-2-git-send-email-matt@codeblueprint.co.uk
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Merge third patch-bomb from Andrew Morton:
- even more of the rest of MM
- lib/ updates
- checkpatch updates
- small changes to a few scruffy filesystems
- kmod fixes/cleanups
- kexec updates
- a dma-mapping cleanup series from hch
* emailed patches from Andrew Morton <akpm@linux-foundation.org>: (81 commits)
dma-mapping: consolidate dma_set_mask
dma-mapping: consolidate dma_supported
dma-mapping: cosolidate dma_mapping_error
dma-mapping: consolidate dma_{alloc,free}_noncoherent
dma-mapping: consolidate dma_{alloc,free}_{attrs,coherent}
mm: use vma_is_anonymous() in create_huge_pmd() and wp_huge_pmd()
mm: make sure all file VMAs have ->vm_ops set
mm, mpx: add "vm_flags_t vm_flags" arg to do_mmap_pgoff()
mm: mark most vm_operations_struct const
namei: fix warning while make xmldocs caused by namei.c
ipc: convert invalid scenarios to use WARN_ON
zlib_deflate/deftree: remove bi_reverse()
lib/decompress_unlzma: Do a NULL check for pointer
lib/decompressors: use real out buf size for gunzip with kernel
fs/affs: make root lookup from blkdev logical size
sysctl: fix int -> unsigned long assignments in INT_MIN case
kexec: export KERNEL_IMAGE_SIZE to vmcoreinfo
kexec: align crash_notes allocation to make it be inside one physical page
kexec: remove unnecessary test in kimage_alloc_crash_control_pages()
kexec: split kexec_load syscall from kexec core code
...
Almost everyone implements dma_set_mask the same way, although some time
that's hidden in ->set_dma_mask methods.
This patch consolidates those into a common implementation that either
calls ->set_dma_mask if present or otherwise uses the default
implementation. Some architectures used to only call ->set_dma_mask
after the initial checks, and those instance have been fixed to do the
full work. h8300 implemented dma_set_mask bogusly as a no-ops and has
been fixed.
Unfortunately some architectures overload unrelated semantics like changing
the dma_ops into it so we still need to allow for an architecture override
for now.
[jcmvbkbc@gmail.com: fix xtensa]
Signed-off-by: Christoph Hellwig <hch@lst.de>
Cc: Arnd Bergmann <arnd@arndb.de>
Cc: Russell King <linux@arm.linux.org.uk>
Cc: Catalin Marinas <catalin.marinas@arm.com>
Cc: Will Deacon <will.deacon@arm.com>
Cc: Yoshinori Sato <ysato@users.sourceforge.jp>
Cc: Michal Simek <monstr@monstr.eu>
Cc: Jonas Bonn <jonas@southpole.se>
Cc: Chris Metcalf <cmetcalf@ezchip.com>
Cc: Guan Xuetao <gxt@mprc.pku.edu.cn>
Cc: Ralf Baechle <ralf@linux-mips.org>
Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Cc: Ingo Molnar <mingo@elte.hu>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: "H. Peter Anvin" <hpa@zytor.com>
Cc: Andy Shevchenko <andy.shevchenko@gmail.com>
Signed-off-by: Max Filippov <jcmvbkbc@gmail.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Most architectures just call into ->dma_supported, but some also return 1
if the method is not present, or 0 if no dma ops are present (although
that should never happeb). Consolidate this more broad version into
common code.
Also fix h8300 which inorrectly always returned 0, which would have been
a problem if it's dma_set_mask implementation wasn't a similarly buggy
noop.
As a few architectures have much more elaborate implementations, we
still allow for arch overrides.
[jcmvbkbc@gmail.com: fix xtensa]
Signed-off-by: Christoph Hellwig <hch@lst.de>
Cc: Arnd Bergmann <arnd@arndb.de>
Cc: Russell King <linux@arm.linux.org.uk>
Cc: Catalin Marinas <catalin.marinas@arm.com>
Cc: Will Deacon <will.deacon@arm.com>
Cc: Yoshinori Sato <ysato@users.sourceforge.jp>
Cc: Michal Simek <monstr@monstr.eu>
Cc: Jonas Bonn <jonas@southpole.se>
Cc: Chris Metcalf <cmetcalf@ezchip.com>
Cc: Guan Xuetao <gxt@mprc.pku.edu.cn>
Cc: Ralf Baechle <ralf@linux-mips.org>
Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Cc: Ingo Molnar <mingo@elte.hu>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: "H. Peter Anvin" <hpa@zytor.com>
Cc: Andy Shevchenko <andy.shevchenko@gmail.com>
Signed-off-by: Max Filippov <jcmvbkbc@gmail.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Currently there are three valid implementations of dma_mapping_error:
(1) call ->mapping_error
(2) check for a hardcoded error code
(3) always return 0
This patch provides a common implementation that calls ->mapping_error
if present, then checks for DMA_ERROR_CODE if defined or otherwise
returns 0.
[jcmvbkbc@gmail.com: fix xtensa]
Signed-off-by: Christoph Hellwig <hch@lst.de>
Cc: Arnd Bergmann <arnd@arndb.de>
Cc: Russell King <linux@arm.linux.org.uk>
Cc: Catalin Marinas <catalin.marinas@arm.com>
Cc: Will Deacon <will.deacon@arm.com>
Cc: Yoshinori Sato <ysato@users.sourceforge.jp>
Cc: Michal Simek <monstr@monstr.eu>
Cc: Jonas Bonn <jonas@southpole.se>
Cc: Chris Metcalf <cmetcalf@ezchip.com>
Cc: Guan Xuetao <gxt@mprc.pku.edu.cn>
Cc: Ralf Baechle <ralf@linux-mips.org>
Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Cc: Ingo Molnar <mingo@elte.hu>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: "H. Peter Anvin" <hpa@zytor.com>
Cc: Andy Shevchenko <andy.shevchenko@gmail.com>
Signed-off-by: Max Filippov <jcmvbkbc@gmail.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Most architectures do not support non-coherent allocations and either
define dma_{alloc,free}_noncoherent to their coherent versions or stub
them out.
Openrisc uses dma_{alloc,free}_attrs to implement them, and only Mips
implements them directly.
This patch moves the Openrisc version to common code, and handles the
DMA_ATTR_NON_CONSISTENT case in the mips dma_map_ops instance.
Note that actual non-coherent allocations require a dma_cache_sync
implementation, so if non-coherent allocations didn't work on
an architecture before this patch they still won't work after it.
[jcmvbkbc@gmail.com: fix xtensa]
Signed-off-by: Christoph Hellwig <hch@lst.de>
Cc: Arnd Bergmann <arnd@arndb.de>
Cc: Russell King <linux@arm.linux.org.uk>
Cc: Catalin Marinas <catalin.marinas@arm.com>
Cc: Will Deacon <will.deacon@arm.com>
Cc: Yoshinori Sato <ysato@users.sourceforge.jp>
Cc: Michal Simek <monstr@monstr.eu>
Cc: Jonas Bonn <jonas@southpole.se>
Cc: Chris Metcalf <cmetcalf@ezchip.com>
Cc: Guan Xuetao <gxt@mprc.pku.edu.cn>
Cc: Ralf Baechle <ralf@linux-mips.org>
Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Cc: Ingo Molnar <mingo@elte.hu>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: "H. Peter Anvin" <hpa@zytor.com>
Cc: Andy Shevchenko <andy.shevchenko@gmail.com>
Signed-off-by: Max Filippov <jcmvbkbc@gmail.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Since 2009 we have a nice asm-generic header implementing lots of DMA API
functions for architectures using struct dma_map_ops, but unfortunately
it's still missing a lot of APIs that all architectures still have to
duplicate.
This series consolidates the remaining functions, although we still need
arch opt outs for two of them as a few architectures have very
non-standard implementations.
This patch (of 5):
The coherent DMA allocator works the same over all architectures supporting
dma_map operations.
This patch consolidates them and converges the minor differences:
- the debug_dma helpers are now called from all architectures, including
those that were previously missing them
- dma_alloc_from_coherent and dma_release_from_coherent are now always
called from the generic alloc/free routines instead of the ops
dma-mapping-common.h always includes dma-coherent.h to get the defintions
for them, or the stubs if the architecture doesn't support this feature
- checks for ->alloc / ->free presence are removed. There is only one
magic instead of dma_map_ops without them (mic_dma_ops) and that one
is x86 only anyway.
Besides that only x86 needs special treatment to replace a default devices
if none is passed and tweak the gfp_flags. An optional arch hook is provided
for that.
[linux@roeck-us.net: fix build]
[jcmvbkbc@gmail.com: fix xtensa]
Signed-off-by: Christoph Hellwig <hch@lst.de>
Cc: Arnd Bergmann <arnd@arndb.de>
Cc: Russell King <linux@arm.linux.org.uk>
Cc: Catalin Marinas <catalin.marinas@arm.com>
Cc: Will Deacon <will.deacon@arm.com>
Cc: Yoshinori Sato <ysato@users.sourceforge.jp>
Cc: Michal Simek <monstr@monstr.eu>
Cc: Jonas Bonn <jonas@southpole.se>
Cc: Chris Metcalf <cmetcalf@ezchip.com>
Cc: Guan Xuetao <gxt@mprc.pku.edu.cn>
Cc: Ralf Baechle <ralf@linux-mips.org>
Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Cc: Ingo Molnar <mingo@elte.hu>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: "H. Peter Anvin" <hpa@zytor.com>
Cc: Andy Shevchenko <andy.shevchenko@gmail.com>
Signed-off-by: Guenter Roeck <linux@roeck-us.net>
Signed-off-by: Max Filippov <jcmvbkbc@gmail.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
1/ Introduce ZONE_DEVICE and devm_memremap_pages() as a generic
mechanism for adding device-driver-discovered memory regions to the
kernel's direct map. This facility is used by the pmem driver to
enable pfn_to_page() operations on the page frames returned by DAX
('direct_access' in 'struct block_device_operations'). For now, the
'memmap' allocation for these "device" pages comes from "System
RAM". Support for allocating the memmap from device memory will
arrive in a later kernel.
2/ Introduce memremap() to replace usages of ioremap_cache() and
ioremap_wt(). memremap() drops the __iomem annotation for these
mappings to memory that do not have i/o side effects. The
replacement of ioremap_cache() with memremap() is limited to the
pmem driver to ease merging the api change in v4.3. Completion of
the conversion is targeted for v4.4.
3/ Similar to the usage of memcpy_to_pmem() + wmb_pmem() in the pmem
driver, update the VFS DAX implementation and PMEM api to provide
persistence guarantees for kernel operations on a DAX mapping.
4/ Convert the ACPI NFIT 'BLK' driver to map the block apertures as
cacheable to improve performance.
5/ Miscellaneous updates and fixes to libnvdimm including support
for issuing "address range scrub" commands, clarifying the optimal
'sector size' of pmem devices, a clarification of the usage of the
ACPI '_STA' (status) property for DIMM devices, and other minor
fixes.
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1
iQIcBAABAgAGBQJV6Nx7AAoJEB7SkWpmfYgCWyYQAI5ju6Gvw27RNFtPovHcZUf5
JGnxXejI6/AqeTQ+IulgprxtEUCrXOHjCDA5dkjr1qvsoqK1qxug+vJHOZLgeW0R
OwDtmdW4Qrgeqm+CPoxETkorJ8wDOc8mol81kTiMgeV3UqbYeeHIiTAmwe7VzZ0C
nNdCRDm5g8dHCjTKcvK3rvozgyoNoWeBiHkPe76EbnxDICxCB5dak7XsVKNMIVFQ
NuYlnw6IYN7+rMHgpgpRux38NtIW8VlYPWTmHExejc2mlioWMNBG/bmtwLyJ6M3e
zliz4/cnonTMUaizZaVozyinTa65m7wcnpjK+vlyGV2deDZPJpDRvSOtB0lH30bR
1gy+qrKzuGKpaN6thOISxFLLjmEeYwzYd7SvC9n118r32qShz+opN9XX0WmWSFlA
sajE1ehm4M7s5pkMoa/dRnAyR8RUPu4RNINdQ/Z9jFfAOx+Q26rLdQXwf9+uqbEb
bIeSQwOteK5vYYCstvpAcHSMlJAglzIX5UfZBvtEIJN7rlb0VhmGWfxAnTu+ktG1
o9cqAt+J4146xHaFwj5duTsyKhWb8BL9+xqbKPNpXEp+PbLsrnE/+WkDLFD67jxz
dgIoK60mGnVXp+16I2uMqYYDgAyO5zUdmM4OygOMnZNa1mxesjbDJC6Wat1Wsndn
slsw6DkrWT60CRE42nbK
=o57/
-----END PGP SIGNATURE-----
Merge tag 'libnvdimm-for-4.3' of git://git.kernel.org/pub/scm/linux/kernel/git/nvdimm/nvdimm
Pull libnvdimm updates from Dan Williams:
"This update has successfully completed a 0day-kbuild run and has
appeared in a linux-next release. The changes outside of the typical
drivers/nvdimm/ and drivers/acpi/nfit.[ch] paths are related to the
removal of IORESOURCE_CACHEABLE, the introduction of memremap(), and
the introduction of ZONE_DEVICE + devm_memremap_pages().
Summary:
- Introduce ZONE_DEVICE and devm_memremap_pages() as a generic
mechanism for adding device-driver-discovered memory regions to the
kernel's direct map.
This facility is used by the pmem driver to enable pfn_to_page()
operations on the page frames returned by DAX ('direct_access' in
'struct block_device_operations').
For now, the 'memmap' allocation for these "device" pages comes
from "System RAM". Support for allocating the memmap from device
memory will arrive in a later kernel.
- Introduce memremap() to replace usages of ioremap_cache() and
ioremap_wt(). memremap() drops the __iomem annotation for these
mappings to memory that do not have i/o side effects. The
replacement of ioremap_cache() with memremap() is limited to the
pmem driver to ease merging the api change in v4.3.
Completion of the conversion is targeted for v4.4.
- Similar to the usage of memcpy_to_pmem() + wmb_pmem() in the pmem
driver, update the VFS DAX implementation and PMEM api to provide
persistence guarantees for kernel operations on a DAX mapping.
- Convert the ACPI NFIT 'BLK' driver to map the block apertures as
cacheable to improve performance.
- Miscellaneous updates and fixes to libnvdimm including support for
issuing "address range scrub" commands, clarifying the optimal
'sector size' of pmem devices, a clarification of the usage of the
ACPI '_STA' (status) property for DIMM devices, and other minor
fixes"
* tag 'libnvdimm-for-4.3' of git://git.kernel.org/pub/scm/linux/kernel/git/nvdimm/nvdimm: (34 commits)
libnvdimm, pmem: direct map legacy pmem by default
libnvdimm, pmem: 'struct page' for pmem
libnvdimm, pfn: 'struct page' provider infrastructure
x86, pmem: clarify that ARCH_HAS_PMEM_API implies PMEM mapped WB
add devm_memremap_pages
mm: ZONE_DEVICE for "device memory"
mm: move __phys_to_pfn and __pfn_to_phys to asm/generic/memory_model.h
dax: drop size parameter to ->direct_access()
nd_blk: change aperture mapping from WC to WB
nvdimm: change to use generic kvfree()
pmem, dax: have direct_access use __pmem annotation
dax: update I/O path to do proper PMEM flushing
pmem: add copy_from_iter_pmem() and clear_pmem()
pmem, x86: clean up conditional pmem includes
pmem: remove layer when calling arch_has_wmb_pmem()
pmem, x86: move x86 PMEM API to new pmem.h header
libnvdimm, e820: make CONFIG_X86_PMEM_LEGACY a tristate option
pmem: switch to devm_ allocations
devres: add devm_memremap
libnvdimm, btt: write and validate parent_uuid
...
- Convert xen-blkfront to the multiqueue API
- [arm] Support binding event channels to different VCPUs.
- [x86] Support > 512 GiB in a PV guests (off by default as such a
guest cannot be migrated with the current toolstack).
- [x86] PMU support for PV dom0 (limited support for using perf with
Xen and other guests).
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1
iQEcBAABAgAGBQJV7wIdAAoJEFxbo/MsZsTR0hEH/04HTKLKGnSJpZ5WbMPxqZxE
UqGlvhvVWNAmFocZmbPcEi9T1qtcFrX5pM55JQr6UmAp3ovYsT2q1Q1kKaOaawks
pSfc/YEH3oQW5VUQ9Lm9Ru5Z8Btox0WrzRREO92OF36UOgUOBOLkGsUfOwDinNIM
lSk2djbYwDYAsoeC3PHB32wwMI//Lz6B/9ZVXcyL6ULynt1ULdspETjGnptRPZa7
JTB5L4/soioKOn18HDwwOhKmvaFUPQv9Odnv7dc85XwZreajhM/KMu3qFbMDaF/d
WVB1NMeCBdQYgjOrUjrmpyr5uTMySiQEG54cplrEKinfeZgKlEyjKvjcAfJfiac=
=Ktjl
-----END PGP SIGNATURE-----
Merge tag 'for-linus-4.3-rc0-tag' of git://git.kernel.org/pub/scm/linux/kernel/git/xen/tip
Pull xen updates from David Vrabel:
"Xen features and fixes for 4.3:
- Convert xen-blkfront to the multiqueue API
- [arm] Support binding event channels to different VCPUs.
- [x86] Support > 512 GiB in a PV guests (off by default as such a
guest cannot be migrated with the current toolstack).
- [x86] PMU support for PV dom0 (limited support for using perf with
Xen and other guests)"
* tag 'for-linus-4.3-rc0-tag' of git://git.kernel.org/pub/scm/linux/kernel/git/xen/tip: (33 commits)
xen: switch extra memory accounting to use pfns
xen: limit memory to architectural maximum
xen: avoid another early crash of memory limited dom0
xen: avoid early crash of memory limited dom0
arm/xen: Remove helpers which are PV specific
xen/x86: Don't try to set PCE bit in CR4
xen/PMU: PMU emulation code
xen/PMU: Intercept PMU-related MSR and APIC accesses
xen/PMU: Describe vendor-specific PMU registers
xen/PMU: Initialization code for Xen PMU
xen/PMU: Sysfs interface for setting Xen PMU mode
xen: xensyms support
xen: remove no longer needed p2m.h
xen: allow more than 512 GB of RAM for 64 bit pv-domains
xen: move p2m list if conflicting with e820 map
xen: add explicit memblock_reserve() calls for special pages
mm: provide early_memremap_ro to establish read-only mapping
xen: check for initrd conflicting with e820 map
xen: check pre-allocated page tables for conflict with memory map
xen: check for kernel memory conflicting with memory layout
...
Currently we don't set the RES1 bits of TCR_EL2 and VTCR_EL2 when
configuring them, which could lead to unexpected behaviour when an
architectural meaning is defined for those bits.
Set the RES1 bits to avoid issues.
Cc: Catalin Marinas <catalin.marinas@arm.com>
Cc: Christoffer Dall <christoffer.dall@linaro.org>
Cc: Marc Zyngier <marc.zyngier@arm.com>
Cc: Suzuki Poulose <suzuki.poulose@arm.com>
Cc: Will Deacon <will.deacon@arm.com>
Reviewed-by: Christoffer Dall <christoffer.dall@linaro.org>
Signed-off-by: Mark Rutland <mark.rutland@arm.com>
Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
- Support for new architectural features introduced in ARMv8.1:
* Privileged Access Never (PAN) to catch user pointer dereferences in
the kernel
* Large System Extension (LSE) for building scalable atomics and locks
(depends on locking/arch-atomic from tip, which is included here)
* Hardware Dirty Bit Management (DBM) for updating clean PTEs
automatically
- Move our PSCI implementation out into drivers/firmware/, where it can
be shared with arch/arm/. RMK has also pulled this component branch
and has additional patches moving arch/arm/ over. MAINTAINERS is
updated accordingly.
- Better BUG implementation based on the BRK instruction for trapping
- Leaf TLB invalidation for unmapping user pages
- Support for PROBE_ONLY PCI configurations
- Various cleanups and non-critical fixes, including:
* Always flush FP/SIMD state over exec()
* Restrict memblock additions based on range of linear mapping
* Ensure *(LIST_POISON) generates a fatal fault
* Context-tracking syscall return no longer corrupts return value when
not forced on.
* Alternatives patching synchronisation/stability improvements
* Signed sub-word cmpxchg compare fix (tickled by HAVE_CMPXCHG_LOCAL)
* Force SMP=y
* Hide direct DCC access from userspace
* Fix EFI stub memory allocation when DRAM starts at 0x0
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1
iQEcBAABAgAGBQJV5XXWAAoJELescNyEwWM0p4UIAIQwgoUnj01LvtImjMyG0NiY
38GbAia7FsyIktSjuCaEhLsWjL8WSMscRsz6MLK01ir3iOoKdtXd/OptlsJTV5c5
5POPAU6hvdfKj6MtsaOAOx4dz7bhM/HB9JSZmcbHqytOxIi4Tp1JoBrmM1mpNwmp
VFy+GAOs5H6Lb/xUMm50pVUx+mjMXsH4Bo1c/0Y/gYsjhcvcRgE2iqnl7UExgDcW
5sbhpsdw8zleDx+kzTmt5QoFWk/4l3d/F+0dzLCYfxzCLNYacksbQqEbGFVAsiIl
aACK3Uqk7v7ZtFqqQLtNzE6Pfiw0CzajINPUyykoMCnDtMsyhYbxqezywCAPpSY=
=8qHf
-----END PGP SIGNATURE-----
Merge tag 'arm64-upstream' of git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux
Pull arm64 updates from Will Deacon:
- Support for new architectural features introduced in ARMv8.1:
* Privileged Access Never (PAN) to catch user pointer dereferences in
the kernel
* Large System Extension (LSE) for building scalable atomics and locks
(depends on locking/arch-atomic from tip, which is included here)
* Hardware Dirty Bit Management (DBM) for updating clean PTEs
automatically
- Move our PSCI implementation out into drivers/firmware/, where it can
be shared with arch/arm/. RMK has also pulled this component branch
and has additional patches moving arch/arm/ over. MAINTAINERS is
updated accordingly.
- Better BUG implementation based on the BRK instruction for trapping
- Leaf TLB invalidation for unmapping user pages
- Support for PROBE_ONLY PCI configurations
- Various cleanups and non-critical fixes, including:
* Always flush FP/SIMD state over exec()
* Restrict memblock additions based on range of linear mapping
* Ensure *(LIST_POISON) generates a fatal fault
* Context-tracking syscall return no longer corrupts return value when
not forced on.
* Alternatives patching synchronisation/stability improvements
* Signed sub-word cmpxchg compare fix (tickled by HAVE_CMPXCHG_LOCAL)
* Force SMP=y
* Hide direct DCC access from userspace
* Fix EFI stub memory allocation when DRAM starts at 0x0
* tag 'arm64-upstream' of git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux: (92 commits)
arm64: flush FP/SIMD state correctly after execve()
arm64: makefile: fix perf_callchain.o kconfig dependency
arm64: set MAX_MEMBLOCK_ADDR according to linear region size
of/fdt: make memblock maximum physical address arch configurable
arm64: Fix source code file path in comments
arm64: entry: always restore x0 from the stack on syscall return
arm64: mdscr_el1: avoid exposing DCC to userspace
arm64: kconfig: Move LIST_POISON to a safe value
arm64: Add __exception_irq_entry definition for function graph
arm64: mm: ensure patched kernel text is fetched from PoU
arm64: alternatives: ensure secondary CPUs execute ISB after patching
arm64: make ll/sc __cmpxchg_case_##name asm consistent
arm64: dma-mapping: Simplify pgprot handling
arm64: restore cpu suspend/resume functionality
ARM64: PCI: do not enable resources on PROBE_ONLY systems
arm64: cmpxchg: truncate sub-word signed types before comparison
arm64: alternative: put secondary CPUs into polling loop during patch
arm64/Documentation: clarify wording regarding memory below the Image
arm64: lse: fix lse cmpxchg code indentation
arm64: remove redundant object file list
...
Pull ARM development updates from Russell King:
"Included in this update:
- moving PSCI code from ARM64/ARM to drivers/
- removal of some architecture internals from global kernel view
- addition of software based "privileged no access" support using the
old domains register to turn off the ability for kernel
loads/stores to access userspace. Only the proper accessors will
be usable.
- addition of early fixup support for early console
- re-addition (and reimplementation) of OMAP special interconnect
barrier
- removal of finish_arch_switch()
- only expose cpuX/online in sysfs if hotpluggable
- a number of code cleanups"
* 'for-linus' of git://ftp.arm.linux.org.uk/~rmk/linux-arm: (41 commits)
ARM: software-based priviledged-no-access support
ARM: entry: provide uaccess assembly macro hooks
ARM: entry: get rid of multiple macro definitions
ARM: 8421/1: smp: Collapse arch_cpu_idle_dead() into cpu_die()
ARM: uaccess: provide uaccess_save_and_enable() and uaccess_restore()
ARM: mm: improve do_ldrd_abort macro
ARM: entry: ensure that IRQs are enabled when calling syscall_trace_exit()
ARM: entry: efficiency cleanups
ARM: entry: get rid of asm_trace_hardirqs_on_cond
ARM: uaccess: simplify user access assembly
ARM: domains: remove DOMAIN_TABLE
ARM: domains: keep vectors in separate domain
ARM: domains: get rid of manager mode for user domain
ARM: domains: move initial domain setting value to asm/domains.h
ARM: domains: provide domain_mask()
ARM: domains: switch to keeping domain value in register
ARM: 8419/1: dma-mapping: harmonize definition of DMA_ERROR_CODE
ARM: 8417/1: refactor bitops functions with BIT_MASK() and BIT_WORD()
ARM: 8416/1: Feroceon: use of_iomap() to map register base
ARM: 8415/1: early fixmap support for earlycon
...
Three architectures already define these, and we'll need them genericly
soon.
Signed-off-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
The linear region size of a 39-bit VA kernel is only 256 GB, which
may be insufficient to cover all of system RAM, even on platforms
that have much less than 256 GB of memory but which is laid out
very sparsely.
So make sure we clip the memory we will not be able to map before
installing it into the memblock memory table, by setting
MAX_MEMBLOCK_ADDR accordingly.
Reviewed-by: Catalin Marinas <catalin.marinas@arm.com>
Tested-by: Stuart Yoder <stuart.yoder@freescale.com>
Signed-off-by: Ard Biesheuvel <ard.biesheuvel@linaro.org>
Signed-off-by: Will Deacon <will.deacon@arm.com>
Architecture specific code for i386 and x86_64 was unified and merged to
the arch/x86. This patch fix old path of x86 architecture in a comment
from the arch/arm64/include/asm/fixmap.h.
Signed-off-by: Alexander Kuleshov <kuleshovmail@gmail.com>
Signed-off-by: Will Deacon <will.deacon@arm.com>
Highlights for KVM PPC this time around:
- Book3S: A few bug fixes
- Book3S: Allow micro-threading on POWER8
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v2.0.19 (GNU/Linux)
iQIcBAABAgAGBQJV2D6aAAoJECszeR4D/txggBMP/3nHD3UjEAFUhhA6VjfK2wNw
IW2aXQ5+2T51l1K8iSGMyKpW2w4zG5Bv9LdBP2badhaVpgM4//nVf7kcEBrdhjYq
ns7V3klzTuNY5RBbWZz3Zri0mgCkJVF1XlC3xBzGPSNKpZyrkORhlxfg5GXig8lj
pvUcku7XgkCFabAIIZmf0pg9hpDHpH3k1G9yZxuA8pys951IPRoo1CgsYmWSbmzh
jfA2CxBl10dHZOuk/ENyJveJgtthmBB4ezCbWXy+wcMzBKhMC5R93LUoiKXMLWpM
HkziNGjHA1gFSxDtfUVgkcXfan3a5JmlC+u50dLCTetXOVL7m2beIiXwv3smfjLn
AkpcChceEChxn0MxwKJjNvU+RVh3kmv8rklfPlBXHTtQ5ZSXxlcxYrmgL64stmrt
e27dzvJd9J7KX6wEpNyuZINsmFyn3lM3IoxqmSsVCRd43fzhZt9QGcYEXMIe1+lb
E7QncsYMuuWB/sfSieyPaXtmK5ym2+R220xlKezBZdzWdtisPrpCRyl7BdiqCj6O
1gROi6qEyj3m5Qw/eGbFKBF0d8oVXqo1wBJkbihMl55D+jMeZMk673aeGhno8au1
kH+Im+H5xU3oEzdqvC9y3c9kE2sRkzj43GjepIb86Y463fg6KQ5j2gbZUZolGsGH
AnRSGcbbVer/q+9kymPw
=t+9t
-----END PGP SIGNATURE-----
Merge tag 'signed-kvm-ppc-next' of git://github.com/agraf/linux-2.6 into kvm-queue
Patch queue for ppc - 2015-08-22
Highlights for KVM PPC this time around:
- Book3S: A few bug fixes
- Book3S: Allow micro-threading on POWER8
Currently, the event channel rebind code is gated with the presence of
the vector callback.
The virtual interrupt controller on ARM has the concept of per-CPU
interrupt (PPI) which allow us to support per-VCPU event channel.
Therefore there is no need of vector callback for ARM.
Xen is already using a free PPI to notify the guest VCPU of an event.
Furthermore, the xen code initialization in Linux (see
arch/arm/xen/enlighten.c) is requesting correctly a per-CPU IRQ.
Introduce new helper xen_support_evtchn_rebind to allow architecture
decide whether rebind an event is support or not. It will always return
true on ARM and keep the same behavior on x86.
This is also allow us to drop the usage of xen_have_vector_callback
entirely in the ARM code.
Signed-off-by: Julien Grall <julien.grall@citrix.com>
Signed-off-by: David Vrabel <david.vrabel@citrix.com>
This patch only saves and restores FP/SIMD registers on Guest access. To do
this cptr_el2 FP/SIMD trap is set on Guest entry and later checked on exit.
lmbench, hackbench show significant improvements, for 30-50% exits FP/SIMD
context is not saved/restored
[chazy/maz: fixed save/restore logic for 32bit guests]
Signed-off-by: Mario Smarduch <m.smarduch@samsung.com>
Signed-off-by: Christoffer Dall <christoffer.dall@linaro.org>
Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
The gic_handle_irq() is defined with __exception_irq_entry attribute.
A single remaining work is to add its definition as ARM did. Below
shows how function graph data is changed with these hunks.
A prologue of an interrupt handler is drawn as follows.
- current status
0) 0.208 us | cpuidle_not_available();
0) | default_idle_call() {
0) | arch_cpu_idle() {
0) | __handle_domain_irq() {
0) | irq_enter() {
0) 0.313 us | rcu_irq_enter();
0) 0.261 us | __local_bh_disable_ip();
- with this change
0) 0.625 us | cpuidle_not_available();
0) | default_idle_call() {
0) | arch_cpu_idle() {
0) ==========> |
0) | gic_handle_irq() {
0) | __handle_domain_irq() {
0) | irq_enter() {
0) 0.885 us | rcu_irq_enter();
0) 0.781 us | __local_bh_disable_ip();
An epilogue of an interrupt handler is recorded as follows.
- current status
0) 0.261 us | idle_cpu();
0) | rcu_irq_exit() {
0) 0.521 us | rcu_eqs_enter_common.isra.46();
0) 2.552 us | }
0) ! 322.448 us | }
0) ! 583.437 us | }
0) # 1656.041 us | }
0) # 1658.073 us | }
- with this change
0) 0.677 us | idle_cpu();
0) | rcu_irq_exit() {
0) 1.770 us | rcu_eqs_enter_common.isra.46();
0) 7.968 us | }
0) # 1803.541 us | }
0) # 2626.667 us | }
0) # 2632.969 us | }
0) <========== |
0) # 14425.00 us | }
0) # 14430.98 us | }
Cc: AKASHI Takahiro <takahiro.akashi@linaro.org>
Cc: Marc Zyngier <marc.zyngier@arm.com>
Cc: Rabin Vincent <rabin@rab.in>
Cc: Steven Rostedt <rostedt@goodmis.org>
Signed-off-by: Jungseok Lee <jungseoklee85@gmail.com>
Signed-off-by: Will Deacon <will.deacon@arm.com>
Since commit 8a14849 (arm64: KVM: Switch vgic save/restore to
alternative_insn) vgic_sr_vectors is not used anymore, so remove
remaining leftovers and kill the structure.
Signed-off-by: Vladimir Murzin <vladimir.murzin@arm.com>
Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
UEFI spec 2.5 section 2.3.6.1 defines that
EFI_MEMORY_[UC|WC|WT|WB] are possible EFI memory types for
AArch64.
Each of those EFI memory types is mapped to a corresponding
AArch64 memory type. So we need to define PROT_DEVICE_nGnRnE
and PROT_NORMWL_WT additionaly.
MT_NORMAL_WT is defined, and its encoding is added to MAIR_EL1
when initializing the CPU.
Signed-off-by: Jonathan (Zhixiong) Zhang <zjzhang@codeaurora.org>
Signed-off-by: Matt Fleming <matt.fleming@intel.com>
Reviewed-by: Catalin Marinas <catalin.marinas@arm.com>
Cc: H. Peter Anvin <hpa@zytor.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Link: http://lkml.kernel.org/r/1438936621-5215-6-git-send-email-matt@codeblueprint.co.uk
Signed-off-by: Ingo Molnar <mingo@kernel.org>
The ll/sc __cmpxchg_case_##name assembly mostly uses symbolic names for
operands, but in a single case uses %2 to refer to what is otherwise
known as %[v]. This makes the code more painful to read than is
necessary.
Use %[v] instead.
Signed-off-by: Mark Rutland <mark.rutland@arm.com>
Cc: Will Deacon <will.deacon@arm.com>
Signed-off-by: Will Deacon <will.deacon@arm.com>
To enable sharing with arm, move the core PSCI framework code to
drivers/firmware. This results in a minor gain in lines of code, but
this will quickly be amortised by the removal of code currently
duplicated in arch/arm.
Signed-off-by: Mark Rutland <mark.rutland@arm.com>
Acked-by: Catalin Marinas <catalin.marinas@arm.com>
Reviewed-by: Hanjun Guo <hanjun.guo@linaro.org>
Tested-by: Hanjun Guo <hanjun.guo@linaro.org>
Cc: Lorenzo Pieralisi <lorenzo.pieralisi@arm.com>
Cc: Will Deacon <will.deacon@arm.com>
Signed-off-by: Will Deacon <will.deacon@arm.com>
There are various problems and short-comings with the current
static_key interface:
- static_key_{true,false}() read like a branch depending on the key
value, instead of the actual likely/unlikely branch depending on
init value.
- static_key_{true,false}() are, as stated above, tied to the
static_key init values STATIC_KEY_INIT_{TRUE,FALSE}.
- we're limited to the 2 (out of 4) possible options that compile to
a default NOP because that's what our arch_static_branch() assembly
emits.
So provide a new static_key interface:
DEFINE_STATIC_KEY_TRUE(name);
DEFINE_STATIC_KEY_FALSE(name);
Which define a key of different types with an initial true/false
value.
Then allow:
static_branch_likely()
static_branch_unlikely()
to take a key of either type and emit the right instruction for the
case.
This means adding a second arch_static_branch_jump() assembly helper
which emits a JMP per default.
In order to determine the right instruction for the right state,
encode the branch type in the LSB of jump_entry::key.
This is the final step in removing the naming confusion that has led to
a stream of avoidable bugs such as:
a833581e37 ("x86, perf: Fix static_key bug in load_mm_cr4()")
... but it also allows new static key combinations that will give us
performance enhancements in the subsequent patches.
Tested-by: Rabin Vincent <rabin@rab.in> # arm
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Acked-by: Michael Ellerman <mpe@ellerman.id.au> # ppc
Acked-by: Heiko Carstens <heiko.carstens@de.ibm.com> # s390
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: linux-kernel@vger.kernel.org
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Replace ACCESS_ONCE() macro in smp_store_release() and smp_load_acquire()
with WRITE_ONCE() and READ_ONCE() on x86, arm, arm64, ia64, metag, mips,
powerpc, s390, sparc and asm-generic since ACCESS_ONCE() does not work
reliably on non-scalar types.
WRITE_ONCE() and READ_ONCE() were introduced in the following commits:
230fa253df ("kernel: Provide READ_ONCE and ASSIGN_ONCE")
43239cbe79 ("kernel: Change ASSIGN_ONCE(val, x) to WRITE_ONCE(x, val)")
Signed-off-by: Andrey Konovalov <andreyknvl@google.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Acked-by: Davidlohr Bueso <dbueso@suse.de>
Acked-by: Michael Ellerman <mpe@ellerman.id.au> (powerpc)
Acked-by: Ralf Baechle <ralf@linux-mips.org>
Cc: Alexander Duyck <alexander.h.duyck@redhat.com>
Cc: Andre Przywara <andre.przywara@arm.com>
Cc: Arnd Bergmann <arnd@arndb.de>
Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Cc: Borislav Petkov <bp@suse.de>
Cc: Catalin Marinas <catalin.marinas@arm.com>
Cc: Christian Borntraeger <borntraeger@de.ibm.com>
Cc: David S. Miller <davem@davemloft.net>
Cc: Davidlohr Bueso <dave@stgolabs.net>
Cc: Dmitry Vyukov <dvyukov@google.com>
Cc: Fenghua Yu <fenghua.yu@intel.com>
Cc: H. Peter Anvin <hpa@zytor.com>
Cc: Heiko Carstens <heiko.carstens@de.ibm.com>
Cc: James Hogan <james.hogan@imgtec.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Martin Schwidefsky <schwidefsky@de.ibm.com>
Cc: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Russell King <linux@arm.linux.org.uk>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Tony Luck <tony.luck@intel.com>
Cc: Will Deacon <will.deacon@arm.com>
Cc: linux-arch@vger.kernel.org
Link: http://lkml.kernel.org/r/1438528264-714-1-git-send-email-andreyknvl@google.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
When performing a cmpxchg operation on a signed sub-word type (e.g. s8),
we need to ensure that the upper register bits of the "old" value used
for comparison are zeroed, otherwise we may erroneously fail the cmpxchg
which may even be interpreted as success by the caller (if the compiler
performs the truncation as part of its check). This has been observed
in mod_state, where negative values where causing problems with
this_cpu_cmpxchg.
This patch fixes the issue by explicitly casting 8-bit and 16-bit "old"
values using unsigned types in our cmpxchg wrappers. 32-bit types can be
left alone, since the underlying asm makes use of W registers in this
case.
Reported-by: Mark Rutland <mark.rutland@arm.com>
Signed-off-by: Will Deacon <will.deacon@arm.com>
When patching the kernel text with alternatives, we may end up patching
parts of the stop_machine state machine (e.g. atomic_dec_and_test in
ack_state) and consequently corrupt the instruction stream of any
secondary CPUs.
This patch passes the cpu_online_mask to stop_machine, forcing all of
the CPUs into our own callback which can place the secondary cores into
a dumb (but safe!) polling loop whilst the patching is carried out.
Signed-off-by: Will Deacon <will.deacon@arm.com>
For some reason, the ll/sc cmpxchg asm is all off to the left and
awkward to read in conjunction with the following (correctly indented)
LSE version.
This patch shifts the ll/sc code back to where it should be.
Signed-off-by: Will Deacon <will.deacon@arm.com>
Commit 4b3dc9679c ("arm64: force CONFIG_SMP=y and remove redundant
and therfore can not be selected anymore.
Remove dead #ifdef-block depending on UP_LATE_INIT in
arch/arm64/kernel/setup.c
Signed-off-by: Jonas Rabenstein <jonas.rabenstein@studium.uni-erlangen.de>
[will: kill do_post_cpus_up_work altogether]
Signed-off-by: Will Deacon <will.deacon@arm.com>
pte_valid should check if the PTE_VALID bit (1 << 0) is set in the pte,
so fix the macro definition to use bitwise & instead of logical &&.
Signed-off-by: Will Deacon <will.deacon@arm.com>
When unlocking a spinlock, we perform a read-modify-write on the owner
ticket in order to increment it and store it back with release
semantics.
In the LL/SC case, we load the 16-bit ticket using a 32-bit load and
therefore store back the wrong halfword on a big-endian system,
corrupting the lock after the first unlock and killing the system dead.
This patch fixes the unlock code to use 16-bit accessors consistently.
Signed-off-by: Will Deacon <will.deacon@arm.com>
The flush_tlb_page() function is used on user address ranges when PTEs
(or PMDs/PUDs for huge pages) were changed (attributes or clearing). For
such cases, it is more efficient to invalidate only the last level of
the TLB with the "tlbi vale1is" instruction.
In the TLB shoot-down case, the TLB caching of the intermediate page
table levels (pmd, pud, pgd) is handled by __flush_tlb_pgtable() via the
__(pte|pmd|pud)_free_tlb() functions and it is not deferred to
tlb_finish_mmu() (as of commit 285994a62c - "arm64: Invalidate the TLB
corresponding to intermediate page table levels"). The tlb_flush()
function only needs to invalidate the TLB for the last level of page
tables; the __flush_tlb_range() function gains a fourth argument for
last level TLBI.
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
Cc: Will Deacon <will.deacon@arm.com>
Signed-off-by: Will Deacon <will.deacon@arm.com>
This patch moves the MAX_TLB_RANGE check into the
flush_tlb(_kernel)_range functions directly to avoid the
undescore-prefixed definitions (and for consistency with a subsequent
patch).
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
Cc: Will Deacon <will.deacon@arm.com>
Signed-off-by: Will Deacon <will.deacon@arm.com>
lib/list_sort.c defines a 'struct debug_el', where "el" is assumedly a
a contraction of "element". This conflicts with 'enum debug_el' in our
asm/debug-monitors.h header file, where "el" stands for Exception Level.
The result is build failure when targetting allmodconfig, so rename our
enum to 'dbg_active_el' to be slightly more explicit about what it is.
Signed-off-by: Will Deacon <will.deacon@arm.com>
If we attempt to atomic64_dec_if_positive on INT_MIN, we will underflow
and incorrectly decide that the original parameter was positive.
This patches fixes the broken condition code so that we handle this
corner case correctly.
Reviewed-by: Steve Capper <steve.capper@arm.com>
Acked-by: Catalin Marinas <catalin.marinas@arm.com>
Signed-off-by: Will Deacon <will.deacon@arm.com>
We don't need duplicate cmpxchg implementations, so use cmpxchg to
implement atomic{,64}_cmpxchg, like we do for xchg already.
Reviewed-by: Steve Capper <steve.capper@arm.com>
Reviewed-by: Catalin Marinas <catalin.marinas@arm.com>
Signed-off-by: Will Deacon <will.deacon@arm.com>
The cost of changing a cacheline from shared to exclusive state can be
significant, especially when this is triggered by an exclusive store,
since it may result in having to retry the transaction.
This patch makes use of prfm to prefetch cachelines for write prior to
ldxr/stxr loops when using the ll/sc atomic routines.
Reviewed-by: Catalin Marinas <catalin.marinas@arm.com>
Signed-off-by: Will Deacon <will.deacon@arm.com>
The common (i.e. identical for ll/sc and lse) atomic macros in atomic.h
are needlessley different for atomic_t and atomic64_t.
This patch tidies up the definitions to make them consistent across the
two atomic types and factors out common code such as the add_unless
implementation based on cmpxchg.
Reviewed-by: Steve Capper <steve.capper@arm.com>
Reviewed-by: Catalin Marinas <catalin.marinas@arm.com>
Signed-off-by: Will Deacon <will.deacon@arm.com>
cmpxchg doesn't require memory barrier semantics when the value
comparison fails, so make the barrier conditional on success.
Reviewed-by: Steve Capper <steve.capper@arm.com>
Reviewed-by: Catalin Marinas <catalin.marinas@arm.com>
Signed-off-by: Will Deacon <will.deacon@arm.com>
We can perform the cmpxchg comparison using eor and cbnz which avoids
the "cc" clobber for the ll/sc case and consequently for the LSE case
where we may have to fall-back on the ll/sc code at runtime.
Reviewed-by: Steve Capper <steve.capper@arm.com>
Reviewed-by: Catalin Marinas <catalin.marinas@arm.com>
Signed-off-by: Will Deacon <will.deacon@arm.com>
On CPUs which support the LSE atomic instructions introduced in ARMv8.1,
it makes sense to use them in preference to ll/sc sequences.
This patch introduces runtime patching of our cmpxchg_double primitives
so that the LSE casp instruction is used instead.
Reviewed-by: Catalin Marinas <catalin.marinas@arm.com>
Signed-off-by: Will Deacon <will.deacon@arm.com>
On CPUs which support the LSE atomic instructions introduced in ARMv8.1,
it makes sense to use them in preference to ll/sc sequences.
This patch introduces runtime patching of our cmpxchg primitives so that
the LSE cas instruction is used instead.
Reviewed-by: Catalin Marinas <catalin.marinas@arm.com>
Signed-off-by: Will Deacon <will.deacon@arm.com>
On CPUs which support the LSE atomic instructions introduced in ARMv8.1,
it makes sense to use them in preference to ll/sc sequences.
This patch introduces runtime patching of our xchg primitives so that
the LSE swp instruction (yes, you read right!) is used instead.
Reviewed-by: Steve Capper <steve.capper@arm.com>
Reviewed-by: Catalin Marinas <catalin.marinas@arm.com>
Signed-off-by: Will Deacon <will.deacon@arm.com>
On CPUs which support the LSE atomic instructions introduced in ARMv8.1,
it makes sense to use them in preference to ll/sc sequences.
This patch introduces runtime patching of our bitops functions so that
LSE atomic instructions are used instead.
Reviewed-by: Steve Capper <steve.capper@arm.com>
Reviewed-by: Catalin Marinas <catalin.marinas@arm.com>
Signed-off-by: Will Deacon <will.deacon@arm.com>
On CPUs which support the LSE atomic instructions introduced in ARMv8.1,
it makes sense to use them in preference to ll/sc sequences.
This patch introduces runtime patching of our locking functions so that
LSE atomic instructions are used for spinlocks and rwlocks.
Reviewed-by: Catalin Marinas <catalin.marinas@arm.com>
Signed-off-by: Will Deacon <will.deacon@arm.com>
On CPUs which support the LSE atomic instructions introduced in ARMv8.1,
it makes sense to use them in preference to ll/sc sequences.
This patch introduces runtime patching of atomic_t and atomic64_t
routines so that the call-site for the out-of-line ll/sc sequences is
patched with an LSE atomic instruction when we detect that
the CPU supports it.
If binutils is not recent enough to assemble the LSE instructions, then
the ll/sc sequences are inlined as though CONFIG_ARM64_LSE_ATOMICS=n.
Reviewed-by: Catalin Marinas <catalin.marinas@arm.com>
Signed-off-by: Will Deacon <will.deacon@arm.com>
In order to patch in the new atomic instructions at runtime, we need to
generate wrappers around the out-of-line exclusive load/store atomics.
This patch adds a new Kconfig option, CONFIG_ARM64_LSE_ATOMICS. which
causes our atomic functions to branch to the out-of-line ll/sc
implementations. To avoid the register spill overhead of the PCS, the
out-of-line functions are compiled with specific compiler flags to
force out-of-line save/restore of any registers that are usually
caller-saved.
Reviewed-by: Catalin Marinas <catalin.marinas@arm.com>
Signed-off-by: Will Deacon <will.deacon@arm.com>
Add a CPU feature for the LSE atomic instructions, so that they can be
patched in at runtime when we detect that they are supported.
Reviewed-by: Steve Capper <steve.capper@arm.com>
Reviewed-by: Catalin Marinas <catalin.marinas@arm.com>
Signed-off-by: Will Deacon <will.deacon@arm.com>
In preparation for the Large System Extension (LSE) atomic instructions
introduced by ARM v8.1, move the current exclusive load/store (LL/SC)
atomics into their own header file.
Reviewed-by: Steve Capper <steve.capper@arm.com>
Reviewed-by: Catalin Marinas <catalin.marinas@arm.com>
Signed-off-by: Will Deacon <will.deacon@arm.com>
cpufeature.h makes use of DECLARE_BITMAP, which in turn relies on the
BITS_TO_LONGS and DIV_ROUND_UP macros.
This patch includes kernel.h in cpufeature.h to prevent all users having
to do the same thing.
Acked-by: Catalin Marinas <catalin.marinas@arm.com>
Signed-off-by: Will Deacon <will.deacon@arm.com>
STXR can fail for a number of reasons, so don't fail an rwlock trylock
operation simply because the STXR reported failure.
I'm not aware of any issues with the current code, but this makes it
consistent with spin_trylock and also other architectures (e.g. arch/arm).
Reported-by: Catalin Marinas <catalin.marinas@arm.com>
Reviewed-by: Catalin Marinas <catalin.marinas@arm.com>
Signed-off-by: Will Deacon <will.deacon@arm.com>
Implement atomic logic ops -- atomic_{or,xor,and}.
These will replace the atomic_{set,clear}_mask functions that are
available on some archs.
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Implement atomic logic ops -- atomic_{or,xor,and}.
These will replace the atomic_{set,clear}_mask functions that are
available on some archs.
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Our ticket-based spinlock structures rely on a definition of u16, so
include linux/types.h explicitly to ensure the thing compiles.
Found by a module build failure in -next:
arch/arm64/include/asm/spinlock_types.h:27:2: error: unknown type name 'u16'
arch/arm64/include/asm/spinlock_types.h:28:2: error: unknown type name 'u16'
arch/arm64/include/asm/spinlock_types.h:33:13: error: expected declaration specifiers or '...' before numeric constant
include/linux/spinlock_types.h:21:2: error: unknown type name 'arch_spinlock_t'
arch/arm64/include/asm/spinlock.h:34:35: error: unknown type name 'arch_spinlock_t'
arch/arm64/include/asm/spinlock.h:65:37: error: unknown type name 'arch_spinlock_t'
Reported-by: Russell King <linux@arm.linux.org.uk>
Signed-off-by: Will Deacon <will.deacon@arm.com>
Currently, the minimal default BUG() implementation from asm-
generic is used for arm64.
This patch uses the BRK software breakpoint instruction to generate
a trap instead, similarly to most other arches, with the generic
BUG code generating the dmesg boilerplate.
This allows bug metadata to be moved to a separate table and
reduces the amount of inline code at BUG and WARN sites. This also
avoids clobbering any registers before they can be dumped.
To mitigate the size of the bug table further, this patch makes
use of the existing infrastructure for encoding addresses within
the bug table as 32-bit offsets instead of absolute pointers.
(Note that this limits the kernel size to 2GB.)
Traps are registered at arch_initcall time for aarch64, but BUG
has minimal real dependencies and it is desirable to be able to
generate bug splats as early as possible. This patch redirects
all debug exceptions caused by BRK directly to bug_handler() until
the full debug exception support has been initialised.
Signed-off-by: Dave Martin <Dave.Martin@arm.com>
Reviewed-by: Catalin Marinas <catalin.marinas@arm.com>
Signed-off-by: Will Deacon <will.deacon@arm.com>
<asm/debug-monitors.h> relies on <asm/ptrace.h>, but doesn't
declare this dependency. This becomes a problem once
debug-monitors.h starts getting included all over the place to get
the BRK immedates.
The missing include of <asm/memory.h> (for UL()) in <asm/esr.h> is
also added. The series no longer relies on this, but I spotted it
during development and it may as well get fixed.
No functional change.
Signed-off-by: Dave Martin <Dave.Martin@arm.com>
Signed-off-by: Will Deacon <will.deacon@arm.com>
The way the KGDB_DYN_BRK_INS_BYTEx macros are declared is more
complex than it needs to be. Also, the macros are only used in one
place, which is arch-specific anyway.
This patch refactors the macros to simplify them, and exposes an
argument so that we can have a single macro instead of 4.
As a side effect, this patch also fixes some anomalous spellings of
"KGDB".
These changes alter the compile types of some integer constants
that are harmless but trigger truncation warnings in gcc when
assigning to 32-bit variables. This patch adds an explicit cast
for the affected cases.
Signed-off-by: Dave Martin <Dave.Martin@arm.com>
Signed-off-by: Will Deacon <will.deacon@arm.com>
It makes sense to keep all the architectural exception syndrome
definitions in the same place.
Signed-off-by: Dave Martin <Dave.Martin@arm.com>
Acked-by: Mark Rutland <mark.rutland@arm.com>
Acked-by: Catalin Marinas <catalin.marinas@arm.com>
Signed-off-by: Will Deacon <will.deacon@arm.com>
The naming of DBG_ESR_VAL_BRK is inconsistent with the way other
similar macros are named.
This patch makes the naming more consistent, and appends "64"
as a reminder that this ESR pattern only matches from AArch64
state.
Signed-off-by: Dave Martin <Dave.Martin@arm.com>
Acked-by: Mark Rutland <mark.rutland@arm.com>
Acked-by: Catalin Marinas <catalin.marinas@arm.com>
Signed-off-by: Will Deacon <will.deacon@arm.com>
<asm/esr.h> has perfectly good constants for defining ESR values
already. Let's use them.
Signed-off-by: Dave Martin <Dave.Martin@arm.com>
Acked-by: Mark Rutland <mark.rutland@arm.com>
Acked-by: Catalin Marinas <catalin.marinas@arm.com>
Signed-off-by: Will Deacon <will.deacon@arm.com>
There are only 16 comment bits in a BRK instruction, which
correspond to ESR bits 15:0. Bits 24:16 of the ESR are RES0,
and might have weird meanings in the future.
This code inserts 16 bits of comment in the ESR value instead of
20 (almost certainly a typo in the original code).
Signed-off-by: Dave Martin <Dave.Martin@arm.com>
Acked-by: Mark Rutland <mark.rutland@arm.com>
Acked-by: Catalin Marinas <catalin.marinas@arm.com>
Signed-off-by: Will Deacon <will.deacon@arm.com>
The size of an A64 BRK instruction is the same as the size of all other
A64 instructions, because all A64 instructions are the same size.
BREAK_INSTR_SIZE is retained for readibility, but it should not be
an independent constant from AARCH64_INSN_SIZE.
Signed-off-by: Dave Martin <Dave.Martin@arm.com>
Acked-by: Mark Rutland <mark.rutland@arm.com>
Acked-by: Catalin Marinas <catalin.marinas@arm.com>
Signed-off-by: Will Deacon <will.deacon@arm.com>
There are cases where we want to compile out both versions of an
alternative code block, so add an enable parameter to the new conditional
alternative assembly macros in the same way as alternative_insn.
Signed-off-by: Will Deacon <will.deacon@arm.com>
'Privileged Access Never' is a new arm8.1 feature which prevents
privileged code from accessing any virtual address where read or write
access is also permitted at EL0.
This patch enables the PAN feature on all CPUs, and modifies {get,put}_user
helpers temporarily to permit access.
This will catch kernel bugs where user memory is accessed directly.
'Unprivileged loads and stores' using ldtrb et al are unaffected by PAN.
Reviewed-by: Catalin Marinas <catalin.marinas@arm.com>
Signed-off-by: James Morse <james.morse@arm.com>
[will: use ALTERNATIVE in asm and tidy up pan_enable check]
Signed-off-by: Will Deacon <will.deacon@arm.com>
The system register encoding generated by sys_reg() works only
for MRS/MSR(Register) operations, as we hardcode Bit20 to 1 in
mrs_s/msr_s mask. This makes it unusable for generating instructions
accessing registers with Op0 < 2(e.g, PSTATE.x with Op0=0).
As per ARMv8 ARM, (Ref: ARMv8 ARM, Section: "System instruction class
encoding overview", C5.2, version:ARM DDI 0487A.f), the instruction
encoding reserves bits [20-19] for Op0.
This patch generalises the sys_reg, mrs_s and msr_s macros, so that
we could use them to access any of the supported system register.
Cc: James Morse <james.morse@arm.com>
Cc: Catalin Marinas <catalin.marinas@arm.com>
Signed-off-by: Suzuki K. Poulose <suzuki.poulose@arm.com>
Reviewed-by: Catalin Marinas <catalin.marinas@arm.com>
Signed-off-by: Will Deacon <will.deacon@arm.com>
Some uses of ALTERNATIVE() may depend on a feature that is disabled at
compile time by a Kconfig option. In this case the unused alternative
instructions waste space, and if the original instruction is a nop, it
wastes time and space.
This patch adds an optional 'config' option to ALTERNATIVE() and
alternative_insn that allows the compiler to remove both the original
and alternative instructions if the config option is not defined.
Suggested-by: Catalin Marinas <catalin.marinas@arm.com>
Acked-by: Catalin Marinas <catalin.marinas@arm.com>
Signed-off-by: James Morse <james.morse@arm.com>
Signed-off-by: Will Deacon <will.deacon@arm.com>
When a new cpu feature is available, the cpu feature bits will have some
initial value, which is incremented when the feature is updated.
This patch changes 'register_value' to be 'min_field_value', and checks
the feature bits value (interpreted as a signed int) is greater than this
minimum.
Acked-by: Catalin Marinas <catalin.marinas@arm.com>
Signed-off-by: James Morse <james.morse@arm.com>
Signed-off-by: Will Deacon <will.deacon@arm.com>
This patch adds an 'enable()' callback to cpu capability/feature
detection, allowing features that require some setup or configuration
to get this opportunity once the feature has been detected.
Acked-by: Catalin Marinas <catalin.marinas@arm.com>
Signed-off-by: James Morse <james.morse@arm.com>
Signed-off-by: Will Deacon <will.deacon@arm.com>
Later patches need config_sctlr_el1 to set/clear bits in the sctlr_el1
register.
This patch moves this function into header a file.
Acked-by: Catalin Marinas <catalin.marinas@arm.com>
Signed-off-by: James Morse <james.morse@arm.com>
Signed-off-by: Will Deacon <will.deacon@arm.com>
The existing alternative_insn macro has some limitations that make it
hard to work with. In particular the fact it takes instructions from it
own macro arguments means it doesn't play very nicely with C pre-processor
macros because the macro arguments look like a string to the C
pre-processor. Workarounds are (probably) possible but things start to
look ugly.
Introduce an alternative set of macros that allows instructions to be
presented to the assembler as normal and switch everything over to the
new macros.
Signed-off-by: Daniel Thompson <daniel.thompson@linaro.org>
Signed-off-by: Will Deacon <will.deacon@arm.com>
Based on arch/arm/include/asm/cputype.h, this function does the
shifting and sign extension necessary when accessing cpu feature fields.
Signed-off-by: James Morse <james.morse@arm.com>
Suggested-by: Russell King <linux@arm.linux.org.uk>
Acked-by: Catalin Marinas <catalin.marinas@arm.com>
Signed-off-by: Will Deacon <will.deacon@arm.com>
Remove paragraph about writing to the Free Software Foundation's
mailing address from GPL notice.
Signed-off-by: Jisheng Zhang <jszhang@marvell.com>
Signed-off-by: Will Deacon <will.deacon@arm.com>
Nobody seems to be producing !SMP systems anymore, so this is just
becoming a source of kernel bugs, particularly if people want to use
coherent DMA with non-shared pages.
This patch forces CONFIG_SMP=y for arm64, removing a modest amount of
code in the process.
Signed-off-by: Will Deacon <will.deacon@arm.com>
We currently bundle the callchain handling code with the PMU code,
despite the fact the two are distinct, and the former can be useful even
in the absence of the latter.
Follow the example of arch/arm and factor the callchain handling into
its own file dependent on CONFIG_PERF_EVENTS rather than
CONFIG_HW_PERF_EVENTS.
Signed-off-by: Mark Rutland <mark.rutland@arm.com>
Cc: Will Deacon <will.deacon@arm.com>
Signed-off-by: Will Deacon <will.deacon@arm.com>
The ARMv8.1 architecture extensions introduce support for hardware
updates of the access and dirty information in page table entries. With
TCR_EL1.HA enabled, when the CPU accesses an address with the PTE_AF bit
cleared in the page table, instead of raising an access flag fault the
CPU sets the actual page table entry bit. To ensure that kernel
modifications to the page tables do not inadvertently revert a change
introduced by hardware updates, the exclusive monitor (ldxr/stxr) is
adopted in the pte accessors.
When TCR_EL1.HD is enabled, a write access to a memory location with the
DBM (Dirty Bit Management) bit set in the corresponding pte
automatically clears the read-only bit (AP[2]). Such DBM bit maps onto
the Linux PTE_WRITE bit and to check whether a writable (DBM set) page
is dirty, the kernel tests the PTE_RDONLY bit. In order to allow
read-only and dirty pages, the kernel needs to preserve the software
dirty bit. The hardware dirty status is transferred to the software
dirty bit in ptep_set_wrprotect() (using load/store exclusive loop) and
pte_modify().
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
Signed-off-by: Will Deacon <will.deacon@arm.com>
Commit 68234df4ea ("arm64: kill flush_cache_all()") removed
soft_reset() from the kernel. This was the only caller of
setup_mm_for_reboot(), so remove that also.
Signed-off-by: Mark Salter <msalter@redhat.com>
Signed-off-by: Will Deacon <will.deacon@arm.com>
Mark Brown reported an allnoconfig build failure in -next:
Today's linux-next fails to build an arm64 allnoconfig due to "mm:
make GUP handle pfn mapping unless FOLL_GET is requested" which
causes:
> arm64-allnoconfig
> ../mm/gup.c:51:4: error: implicit declaration of function
'update_mmu_cache' [-Werror=implicit-function-declaration]
Fix the error by moving the function to asm/pgtable.h, as is the case
for most other architectures.
Reported-by: Mark Brown <broonie@kernel.org>
Signed-off-by: Will Deacon <will.deacon@arm.com>
Commit 68234df4ea ("arm64: kill flush_cache_all()") removed the
only users of these macros.
Signed-off-by: Daniel Thompson <daniel.thompson@linaro.org>
Cc: Mark Rutland <mark.rutland@arm.com>
Signed-off-by: Will Deacon <will.deacon@arm.com>
Finally advertise the KVM capability for SET_GUEST_DEBUG. Once arm
support is added this check can be moved to the common
kvm_vm_ioctl_check_extension() code.
Signed-off-by: Alex Bennée <alex.bennee@linaro.org>
Acked-by: Christoffer Dall <christoffer.dall@linaro.org>
Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
This introduces a level of indirection for the debug registers. Instead
of using the sys_regs[] directly we store registers in a structure in
the vcpu. The new kvm_arm_reset_debug_ptr() sets the debug ptr to the
guest context.
Because we no longer give the sys_regs offset for the sys_reg_desc->reg
field, but instead the index into a debug-specific struct we need to
add a number of additional trap functions for each register. Also as the
generic generic user-space access code no longer works we have
introduced a new pair of function pointers to the sys_reg_desc structure
to override the generic code when needed.
Reviewed-by: Christoffer Dall <christoffer.dall@linaro.org>
Signed-off-by: Alex Bennée <alex.bennee@linaro.org>
Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
This adds support for single-stepping the guest. To do this we need to
manipulate the guests PSTATE.SS and MDSCR_EL1.SS bits to trigger
stepping. We take care to preserve MDSCR_EL1 and trap access to it to
ensure we don't affect the apparent state of the guest.
As we have to enable trapping of all software debug exceptions we
suppress the ability of the guest to single-step itself. If we didn't we
would have to deal with the exception arriving while the guest was in
kernelspace when the guest is expecting to single-step userspace. This
is something we don't want to unwind in the kernel. Once the host is no
longer debugging the guest its ability to single-step userspace is
restored.
Signed-off-by: Alex Bennée <alex.bennee@linaro.org>
Reviewed-by: Christoffer Dall <christoffer.dall@linaro.org>
Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
This is a precursor for later patches which will need to do more to
setup debug state before entering the hyp.S switch code. The existing
functionality for setting mdcr_el2 has been moved out of hyp.S and now
uses the value kept in vcpu->arch.mdcr_el2.
As the assembler used to previously mask and preserve MDCR_EL2.HPMN I've
had to add a mechanism to save the value of mdcr_el2 as a per-cpu
variable during the initialisation code. The kernel never sets this
number so we are assuming the bootcode has set up the correct value
here.
This also moves the conditional setting of the TDA bit from the hyp code
into the C code which is currently used for the lazy debug register
context switch code.
Signed-off-by: Alex Bennée <alex.bennee@linaro.org>
Reviewed-by: Christoffer Dall <christoffer.dall@linaro.org>
Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
Commit 2ae416b142 ("mm: new mm hook framework") introduced an empty
header file (mm-arch-hooks.h) for every architecture, even those which
doesn't need to define mm hooks.
As suggested by Geert Uytterhoeven, this could be cleaned through the use
of a generic header file included via each per architecture
asm/include/Kbuild file.
The PowerPC architecture is not impacted here since this architecture has
to defined the arch_remap MM hook.
Signed-off-by: Laurent Dufour <ldufour@linux.vnet.ibm.com>
Suggested-by: Geert Uytterhoeven <geert@linux-m68k.org>
Acked-by: Geert Uytterhoeven <geert@linux-m68k.org>
Acked-by: Vineet Gupta <vgupta@synopsys.com>
Cc: Oleg Nesterov <oleg@redhat.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
The BAD_MADT_ENTRY() macro is designed to work for all of the subtables
of the MADT. In the ACPI 5.1 version of the spec, the struct for the
GICC subtable (struct acpi_madt_generic_interrupt) is 76 bytes long; in
ACPI 6.0, the struct is 80 bytes long. But, there is only one definition
in ACPICA for this struct -- and that is the 6.0 version. Hence, when
BAD_MADT_ENTRY() compares the struct size to the length in the GICC
subtable, it fails if 5.1 structs are in use, and there are systems in
the wild that have them.
This patch adds the BAD_MADT_GICC_ENTRY() that checks the GICC subtable
only, accounting for the difference in specification versions that are
possible. The BAD_MADT_ENTRY() will continue to work as is for all other
MADT subtables.
This code is being added to an arm64 header file since that is currently
the only architecture using the GICC subtable of the MADT. As a GIC is
specific to ARM, it is also unlikely the subtable will be used elsewhere.
Fixes: aeb823bbac ("ACPICA: ACPI 6.0: Add changes for FADT table.")
Signed-off-by: Al Stone <al.stone@linaro.org>
Acked-by: Will Deacon <will.deacon@arm.com>
Acked-by: "Rafael J. Wysocki" <rjw@rjwysocki.net>
[catalin.marinas@arm.com: extra brackets around macro arguments]
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
Merge second patchbomb from Andrew Morton:
- most of the rest of MM
- lots of misc things
- procfs updates
- printk feature work
- updates to get_maintainer, MAINTAINERS, checkpatch
- lib/ updates
* emailed patches from Andrew Morton <akpm@linux-foundation.org>: (96 commits)
exit,stats: /* obey this comment */
coredump: add __printf attribute to cn_*printf functions
coredump: use from_kuid/kgid when formatting corename
fs/reiserfs: remove unneeded cast
NILFS2: support NFSv2 export
fs/befs/btree.c: remove unneeded initializations
fs/minix: remove unneeded cast
init/do_mounts.c: add create_dev() failure log
kasan: remove duplicate definition of the macro KASAN_FREE_PAGE
fs/efs: femove unneeded cast
checkpatch: emit "NOTE: <types>" message only once after multiple files
checkpatch: emit an error when there's a diff in a changelog
checkpatch: validate MODULE_LICENSE content
checkpatch: add multi-line handling for PREFER_ETHER_ADDR_COPY
checkpatch: suggest using eth_zero_addr() and eth_broadcast_addr()
checkpatch: fix processing of MEMSET issues
checkpatch: suggest using ether_addr_equal*()
checkpatch: avoid NOT_UNIFIED_DIFF errors on cover-letter.patch files
checkpatch: remove local from codespell path
checkpatch: add --showfile to allow input via pipe to show filenames
...
Nobody used these hooks so they were removed from common code, and can now
be removed from the architectures.
Signed-off-by: Dominik Dingel <dingel@linux.vnet.ibm.com>
Acked-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
Acked-by: Ralf Baechle <ralf@linux-mips.org>
Cc: Heiko Carstens <heiko.carstens@de.ibm.com>
Cc: Christian Borntraeger <borntraeger@de.ibm.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Pull asm/scatterlist.h removal from Jens Axboe:
"We don't have any specific arch scatterlist anymore, since parisc
finally switched over. Kill the include"
* 'for-4.2/sg' of git://git.kernel.dk/linux-block:
remove scatterlist.h generation from arch Kbuild files
remove <asm/scatterlist.h>
Currently we have many duplicates in definitions of
hugetlb_prefault_arch_hook. In all architectures this function is empty.
Signed-off-by: Zhang Zhen <zhenzhang.zhang@huawei.com>
Acked-by: David Rientjes <rientjes@google.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
CRIU is recreating the process memory layout by remapping the checkpointee
memory area on top of the current process (criu). This includes remapping
the vDSO to the place it has at checkpoint time.
However some architectures like powerpc are keeping a reference to the
vDSO base address to build the signal return stack frame by calling the
vDSO sigreturn service. So once the vDSO has been moved, this reference
is no more valid and the signal frame built later are not usable.
This patch serie is introducing a new mm hook framework, and a new
arch_remap hook which is called when mremap is done and the mm lock still
hold. The next patch is adding the vDSO remap and unmap tracking to the
powerpc architecture.
This patch (of 3):
This patch introduces a new set of header file to manage mm hooks:
- per architecture empty header file (arch/x/include/asm/mm-arch-hooks.h)
- a generic header (include/linux/mm-arch-hooks.h)
The architecture which need to overwrite a hook as to redefine it in its
header file, while architecture which doesn't need have nothing to do.
The default hooks are defined in the generic header and are used in the
case the architecture is not defining it.
In a next step, mm hooks defined in include/asm-generic/mm_hooks.h should
be moved here.
Signed-off-by: Laurent Dufour <ldufour@linux.vnet.ibm.com>
Suggested-by: Andrew Morton <akpm@linux-foundation.org>
Cc: "Kirill A. Shutemov" <kirill.shutemov@linux.intel.com>
Cc: Hugh Dickins <hughd@google.com>
Cc: Rik van Riel <riel@redhat.com>
Cc: Mel Gorman <mgorman@suse.de>
Cc: Pavel Emelyanov <xemul@parallels.com>
Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Michael Ellerman <mpe@ellerman.id.au>
Cc: Ingo Molnar <mingo@kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
- CPU ops and PSCI (Power State Coordination Interface) refactoring
following the merging of the arm64 ACPI support, together with
handling of Trusted (secure) OS instances
- Using fixmap for permanent FDT mapping, removing the initial dtb
placement requirements (within 512MB from the start of the kernel
image). This required moving the FDT self reservation out of the
memreserve processing
- Idmap (1:1 mapping used for MMU on/off) handling clean-up
- Removing flush_cache_all() - not safe on ARM unless the MMU is off.
Last stages of CPU power down/up are handled by firmware already
- "Alternatives" (run-time code patching) refactoring and support for
immediate branch patching, GICv3 CPU interface access
- User faults handling clean-up
And some fixes:
- Fix for VDSO building with broken ELF toolchains
- Fixing another case of init_mm.pgd usage for user mappings (during
ASID roll-over broadcasting)
- Fix for FPSIMD reloading after CPU hotplug
- Fix for missing syscall trace exit
- Workaround for .inst asm bug
- Compat fix for switching the user tls tpidr_el0 register
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1
iQIcBAABAgAGBQJVitZgAAoJEGvWsS0AyF7x+ToP/0Yci5bNsYVwVay8N1rK6WHh
SGzDMzyxcSBjQpz2IhhTJ28eTAEH+a+HWQms+0tBehjqxqkvjuzBN0okDkc/z8NB
7Z0BV2aLkQcMwTbjgIh5jm25ZpGmvmvbWPD5oBwgmgQ4v4i1OLRKgx7+YQ+z9rWZ
zC70d0UwyWjs2oxmjd2ZrAkps7v/qozEFhcRHxLzCn8Mbw+3FcTQsqMbfnoWGnH0
YuGfHQQqBY4/HC7uAslMCy7tXeJXqb+NsgrnAovjfEbHGDjsg0KNl06K++LHwE37
A5Noa3M0AQEPYqx/sg0Ec8RNUUEMB4RA2DCaibp7XlVGncXOwFfiyk/M5uVrYXIO
ku5QF0ytUfZKzrMq3yQKbEDuCPOFTqjjdVpkeXKFdW66zYTohKVc3vUBV5xHZ5uO
7Kr8H0ZnhAv3OxPyKdEwAuQ5sJdWwQSvZyGClxMUO4eC/UzD0Mwwf1Y8WYtiAXx+
NSTeBKw/m33W3/KhNuNH1+qGEOKhuXuKX7AcYA84Rab8ytxYWcurHCG2bmhzpEse
2DZtNMybrP/HMQPyJlYgGac8B3QbsAIAkkU1f+dJTAv9otuBDhscaDQyZ9Y6WVht
/k8zJiZeMEuGAmwgTkzLmWs/8pQq42nW4J4eQdXPZAwp4ghCIypPWfaZASAwee6/
p+es3v8P4k9wkv2TFZMh
=YeGl
-----END PGP SIGNATURE-----
Merge tag 'arm64-upstream' of git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux
Pull arm64 updates from Catalin Marinas:
"Mostly refactoring/clean-up:
- CPU ops and PSCI (Power State Coordination Interface) refactoring
following the merging of the arm64 ACPI support, together with
handling of Trusted (secure) OS instances
- Using fixmap for permanent FDT mapping, removing the initial dtb
placement requirements (within 512MB from the start of the kernel
image). This required moving the FDT self reservation out of the
memreserve processing
- Idmap (1:1 mapping used for MMU on/off) handling clean-up
- Removing flush_cache_all() - not safe on ARM unless the MMU is off.
Last stages of CPU power down/up are handled by firmware already
- "Alternatives" (run-time code patching) refactoring and support for
immediate branch patching, GICv3 CPU interface access
- User faults handling clean-up
And some fixes:
- Fix for VDSO building with broken ELF toolchains
- Fix another case of init_mm.pgd usage for user mappings (during
ASID roll-over broadcasting)
- Fix for FPSIMD reloading after CPU hotplug
- Fix for missing syscall trace exit
- Workaround for .inst asm bug
- Compat fix for switching the user tls tpidr_el0 register"
* tag 'arm64-upstream' of git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux: (42 commits)
arm64: use private ratelimit state along with show_unhandled_signals
arm64: show unhandled SP/PC alignment faults
arm64: vdso: work-around broken ELF toolchains in Makefile
arm64: kernel: rename __cpu_suspend to keep it aligned with arm
arm64: compat: print compat_sp instead of sp
arm64: mm: Fix freeing of the wrong memmap entries with !SPARSEMEM_VMEMMAP
arm64: entry: fix context tracking for el0_sp_pc
arm64: defconfig: enable memtest
arm64: mm: remove reference to tlb.S from comment block
arm64: Do not attempt to use init_mm in reset_context()
arm64: KVM: Switch vgic save/restore to alternative_insn
arm64: alternative: Introduce feature for GICv3 CPU interface
arm64: psci: fix !CONFIG_HOTPLUG_CPU build warning
arm64: fix bug for reloading FPSIMD state after CPU hotplug.
arm64: kernel thread don't need to save fpsimd context.
arm64: fix missing syscall trace exit
arm64: alternative: Work around .inst assembler bugs
arm64: alternative: Merge alternative-asm.h into alternative.h
arm64: alternative: Allow immediate branch as alternative instruction
arm64: Rework alternate sequence for ARM erratum 845719
...
- ACPICA update to upstream revision 20150515 including basic
support for ACPI 6 features: new ACPI tables introduced by
ACPI 6 (STAO, XENV, WPBT, NFIT, IORT), changes related to the
other tables (DTRM, FADT, LPIT, MADT), new predefined names
(_BTH, _CR3, _DSD, _LPI, _MTL, _PRR, _RDI, _RST, _TFP, _TSN),
fixes and cleanups (Bob Moore, Lv Zheng).
- ACPI device power management core code update to follow ACPI 6
which reflects the ACPI device power management implementation
in Windows (Rafael J Wysocki).
- Rework of the backlight interface selection logic to reduce the
number of kernel command line options and improve the handling
of DMI quirks that may be involved in that and to make the
code generally more straightforward (Hans de Goede).
- Fixes for the ACPI Embedded Controller (EC) driver related to
the handling of EC transactions (Lv Zheng).
- Fix for a regression related to the ACPI resources management
and resulting from a recent change of ACPI initialization code
ordering (Rafael J Wysocki).
- Fix for a system initialization regression related to ACPI
introduced during the 3.14 cycle and caused by running the
code that switches the platform over to the ACPI mode too
early in the initialization sequence (Rafael J Wysocki).
- Support for the ACPI _CCA device configuration object related
to DMA cache coherence (Suravee Suthikulpanit).
- ACPI/APEI fixes and cleanups (Jiri Kosina, Borislav Petkov).
- ACPI battery driver cleanups (Luis Henriques, Mathias Krause).
- ACPI processor driver cleanups (Hanjun Guo).
- Cleanups and documentation update related to the ACPI device
properties interface based on _DSD (Rafael J Wysocki).
- ACPI device power management fixes (Rafael J Wysocki).
- Assorted cleanups related to ACPI (Dominik Brodowski. Fabian
Frederick, Lorenzo Pieralisi, Mathias Krause, Rafael J Wysocki).
- Fix for a long-standing issue causing General Protection Faults
to be generated occasionally on return to user space after resume
from ACPI-based suspend-to-RAM on 32-bit x86 (Ingo Molnar).
- Fix to make the suspend core code return -EBUSY consistently in
all cases when system suspend is aborted due to wakeup detection
(Ruchi Kandoi).
- Support for automated device wakeup IRQ handling allowing drivers
to make their PM support more starightforward (Tony Lindgren).
- New tracepoints for suspend-to-idle tracing and rework of the
prepare/complete callbacks tracing in the PM core (Todd E Brandt,
Rafael J Wysocki).
- Wakeup sources framework enhancements (Jin Qian).
- New macro for noirq system PM callbacks (Grygorii Strashko).
- Assorted cleanups related to system suspend (Rafael J Wysocki).
- cpuidle core cleanups to make the code more efficient (Rafael J
Wysocki).
- powernv/pseries cpuidle driver update (Shilpasri G Bhat).
- cpufreq core fixes related to CPU online/offline that should
reduce the overhead of these operations quite a bit, unless the
CPU in question is physically going away (Viresh Kumar, Saravana
Kannan).
- Serialization of cpufreq governor callbacks to avoid race
conditions in some cases (Viresh Kumar).
- intel_pstate driver fixes and cleanups (Doug Smythies, Prarit
Bhargava, Joe Konno).
- cpufreq driver (arm_big_little, cpufreq-dt, qoriq) updates (Sudeep
Holla, Felipe Balbi, Tang Yuantian).
- Assorted cleanups in cpufreq drivers and core (Shailendra Verma,
Fabian Frederick, Wang Long).
- New Device Tree bindings for representing Operating Performance
Points (Viresh Kumar).
- Updates for the common clock operations support code in the PM
core (Rajendra Nayak, Geert Uytterhoeven).
- PM domains core code update (Geert Uytterhoeven).
- Intel Knights Landing support for the RAPL (Running Average Power
Limit) power capping driver (Dasaratharaman Chandramouli).
- Fixes related to the floor frequency setting on Atom SoCs in the
RAPL power capping driver (Ajay Thomas).
- Runtime PM framework documentation update (Ben Dooks).
- cpupower tool fix (Herton R Krzesinski).
/
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v2.0.22 (GNU/Linux)
iQIcBAABCAAGBQJViJdWAAoJEILEb/54YlRx/9gP/3gHoFevNRycvn0VpKqdufCI
Mxy2LBBLlfyW2uD3+NvqvA2WWSo0Cs/LgXa04eAVxPdU7k48s8w+54U23wSouzjW
gfwAmuHxzDR8v0h8X3h6BxNzmkIQHtmDcQlA/cZdHejY/UUw01yxRGNUUZDNbxlm
WXn2nmlBLmGqXTYq0fpBV+3jicUghJqHHsBCqa3VR2yQioHMJG01F4UZMqYTZunN
OIvDUghxByKz6alzdCqlLl1Y0exV6vwWUAzBsl1qHqmHu/bWFSZn3ujNNVrjqHhw
Kl7/8dC2pQkv3Zo3gEVvfQ0onotwWZxGHzPQRdvmxvRnBunQVCi/wynx90yABX/r
PPb/iBNV0mZskbF0zb0GZT3ZZWGA8Z0p3o5JQv2jV4m62qTzx8w50Y5kbn9N1WT+
5bre7AVbVAlGonWszcS9iE+6TOboRz9OD1CCwPFXHItFutlBkau+1hHfFoLM0o9n
LhpGuyszT/EUa1BHkLzuCckFqO2DpbF3N2CKmuTekw0CdgdsvRL2pRByuerk3j7R
WQhlcvBq5YH6j43AuoEZKp8r1iN8oG/iqlrMYQaYWrW9hJaoQOoU8dGJxp/e7gKN
r/qeYjETI+tIsjCbtH5WQzzxDI3gPISAYAtfqs7G34EEo+Lwp6kyRUAF4kDot2V3
ZIyuKMmTu4cdwDETr/O+
=7jTj
-----END PGP SIGNATURE-----
Merge tag 'pm+acpi-4.2-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm
Pull power management and ACPI updates from Rafael Wysocki:
"The rework of backlight interface selection API from Hans de Goede
stands out from the number of commits and the number of affected
places perspective. The cpufreq core fixes from Viresh Kumar are
quite significant too as far as the number of commits goes and because
they should reduce CPU online/offline overhead quite a bit in the
majority of cases.
From the new featues point of view, the ACPICA update (to upstream
revision 20150515) adding support for new ACPI 6 material to ACPICA is
the one that matters the most as some new significant features will be
based on it going forward. Also included is an update of the ACPI
device power management core to follow ACPI 6 (which in turn reflects
the Windows' device PM implementation), a PM core extension to support
wakeup interrupts in a more generic way and support for the ACPI _CCA
device configuration object.
The rest is mostly fixes and cleanups all over and some documentation
updates, including new DT bindings for Operating Performance Points.
There is one fix for a regression introduced in the 4.1 cycle, but it
adds quite a number of lines of code, it wasn't really ready before
Thursday and you were on vacation, so I refrained from pushing it on
the last minute for 4.1.
Specifics:
- ACPICA update to upstream revision 20150515 including basic support
for ACPI 6 features: new ACPI tables introduced by ACPI 6 (STAO,
XENV, WPBT, NFIT, IORT), changes related to the other tables (DTRM,
FADT, LPIT, MADT), new predefined names (_BTH, _CR3, _DSD, _LPI,
_MTL, _PRR, _RDI, _RST, _TFP, _TSN), fixes and cleanups (Bob Moore,
Lv Zheng).
- ACPI device power management core code update to follow ACPI 6
which reflects the ACPI device power management implementation in
Windows (Rafael J Wysocki).
- rework of the backlight interface selection logic to reduce the
number of kernel command line options and improve the handling of
DMI quirks that may be involved in that and to make the code
generally more straightforward (Hans de Goede).
- fixes for the ACPI Embedded Controller (EC) driver related to the
handling of EC transactions (Lv Zheng).
- fix for a regression related to the ACPI resources management and
resulting from a recent change of ACPI initialization code ordering
(Rafael J Wysocki).
- fix for a system initialization regression related to ACPI
introduced during the 3.14 cycle and caused by running the code
that switches the platform over to the ACPI mode too early in the
initialization sequence (Rafael J Wysocki).
- support for the ACPI _CCA device configuration object related to
DMA cache coherence (Suravee Suthikulpanit).
- ACPI/APEI fixes and cleanups (Jiri Kosina, Borislav Petkov).
- ACPI battery driver cleanups (Luis Henriques, Mathias Krause).
- ACPI processor driver cleanups (Hanjun Guo).
- cleanups and documentation update related to the ACPI device
properties interface based on _DSD (Rafael J Wysocki).
- ACPI device power management fixes (Rafael J Wysocki).
- assorted cleanups related to ACPI (Dominik Brodowski, Fabian
Frederick, Lorenzo Pieralisi, Mathias Krause, Rafael J Wysocki).
- fix for a long-standing issue causing General Protection Faults to
be generated occasionally on return to user space after resume from
ACPI-based suspend-to-RAM on 32-bit x86 (Ingo Molnar).
- fix to make the suspend core code return -EBUSY consistently in all
cases when system suspend is aborted due to wakeup detection (Ruchi
Kandoi).
- support for automated device wakeup IRQ handling allowing drivers
to make their PM support more starightforward (Tony Lindgren).
- new tracepoints for suspend-to-idle tracing and rework of the
prepare/complete callbacks tracing in the PM core (Todd E Brandt,
Rafael J Wysocki).
- wakeup sources framework enhancements (Jin Qian).
- new macro for noirq system PM callbacks (Grygorii Strashko).
- assorted cleanups related to system suspend (Rafael J Wysocki).
- cpuidle core cleanups to make the code more efficient (Rafael J
Wysocki).
- powernv/pseries cpuidle driver update (Shilpasri G Bhat).
- cpufreq core fixes related to CPU online/offline that should reduce
the overhead of these operations quite a bit, unless the CPU in
question is physically going away (Viresh Kumar, Saravana Kannan).
- serialization of cpufreq governor callbacks to avoid race
conditions in some cases (Viresh Kumar).
- intel_pstate driver fixes and cleanups (Doug Smythies, Prarit
Bhargava, Joe Konno).
- cpufreq driver (arm_big_little, cpufreq-dt, qoriq) updates (Sudeep
Holla, Felipe Balbi, Tang Yuantian).
- assorted cleanups in cpufreq drivers and core (Shailendra Verma,
Fabian Frederick, Wang Long).
- new Device Tree bindings for representing Operating Performance
Points (Viresh Kumar).
- updates for the common clock operations support code in the PM core
(Rajendra Nayak, Geert Uytterhoeven).
- PM domains core code update (Geert Uytterhoeven).
- Intel Knights Landing support for the RAPL (Running Average Power
Limit) power capping driver (Dasaratharaman Chandramouli).
- fixes related to the floor frequency setting on Atom SoCs in the
RAPL power capping driver (Ajay Thomas).
- runtime PM framework documentation update (Ben Dooks).
- cpupower tool fix (Herton R Krzesinski)"
* tag 'pm+acpi-4.2-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm: (194 commits)
cpuidle: powernv/pseries: Auto-promotion of snooze to deeper idle state
x86: Load __USER_DS into DS/ES after resume
PM / OPP: Add binding for 'opp-suspend'
PM / OPP: Allow multiple OPP tables to be passed via DT
PM / OPP: Add new bindings to address shortcomings of existing bindings
ACPI: Constify ACPI device IDs in documentation
ACPI / enumeration: Document the rules regarding the PRP0001 device ID
ACPI / video: Make acpi_video_unregister_backlight() private
acpi-video-detect: Remove old API
toshiba-acpi: Port to new backlight interface selection API
thinkpad-acpi: Port to new backlight interface selection API
sony-laptop: Port to new backlight interface selection API
samsung-laptop: Port to new backlight interface selection API
msi-wmi: Port to new backlight interface selection API
msi-laptop: Port to new backlight interface selection API
intel-oaktrail: Port to new backlight interface selection API
ideapad-laptop: Port to new backlight interface selection API
fujitsu-laptop: Port to new backlight interface selection API
eeepc-laptop: Port to new backlight interface selection API
dell-wmi: Port to new backlight interface selection API
...
Pull x86 core updates from Ingo Molnar:
"There were so many changes in the x86/asm, x86/apic and x86/mm topics
in this cycle that the topical separation of -tip broke down somewhat -
so the result is a more traditional architecture pull request,
collected into the 'x86/core' topic.
The topics were still maintained separately as far as possible, so
bisectability and conceptual separation should still be pretty good -
but there were a handful of merge points to avoid excessive
dependencies (and conflicts) that would have been poorly tested in the
end.
The next cycle will hopefully be much more quiet (or at least will
have fewer dependencies).
The main changes in this cycle were:
* x86/apic changes, with related IRQ core changes: (Jiang Liu, Thomas
Gleixner)
- This is the second and most intrusive part of changes to the x86
interrupt handling - full conversion to hierarchical interrupt
domains:
[IOAPIC domain] -----
|
[MSI domain] --------[Remapping domain] ----- [ Vector domain ]
| (optional) |
[HPET MSI domain] ----- |
|
[DMAR domain] -----------------------------
|
[Legacy domain] -----------------------------
This now reflects the actual hardware and allowed us to distangle
the domain specific code from the underlying parent domain, which
can be optional in the case of interrupt remapping. It's a clear
separation of functionality and removes quite some duct tape
constructs which plugged the remap code between ioapic/msi/hpet
and the vector management.
- Intel IOMMU IRQ remapping enhancements, to allow direct interrupt
injection into guests (Feng Wu)
* x86/asm changes:
- Tons of cleanups and small speedups, micro-optimizations. This
is in preparation to move a good chunk of the low level entry
code from assembly to C code (Denys Vlasenko, Andy Lutomirski,
Brian Gerst)
- Moved all system entry related code to a new home under
arch/x86/entry/ (Ingo Molnar)
- Removal of the fragile and ugly CFI dwarf debuginfo annotations.
Conversion to C will reintroduce many of them - but meanwhile
they are only getting in the way, and the upstream kernel does
not rely on them (Ingo Molnar)
- NOP handling refinements. (Borislav Petkov)
* x86/mm changes:
- Big PAT and MTRR rework: making the code more robust and
preparing to phase out exposing direct MTRR interfaces to drivers -
in favor of using PAT driven interfaces (Toshi Kani, Luis R
Rodriguez, Borislav Petkov)
- New ioremap_wt()/set_memory_wt() interfaces to support
Write-Through cached memory mappings. This is especially
important for good performance on NVDIMM hardware (Toshi Kani)
* x86/ras changes:
- Add support for deferred errors on AMD (Aravind Gopalakrishnan)
This is an important RAS feature which adds hardware support for
poisoned data. That means roughly that the hardware marks data
which it has detected as corrupted but wasn't able to correct, as
poisoned data and raises an APIC interrupt to signal that in the
form of a deferred error. It is the OS's responsibility then to
take proper recovery action and thus prolonge system lifetime as
far as possible.
- Add support for Intel "Local MCE"s: upcoming CPUs will support
CPU-local MCE interrupts, as opposed to the traditional system-
wide broadcasted MCE interrupts (Ashok Raj)
- Misc cleanups (Borislav Petkov)
* x86/platform changes:
- Intel Atom SoC updates
... and lots of other cleanups, fixlets and other changes - see the
shortlog and the Git log for details"
* 'x86-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (222 commits)
x86/hpet: Use proper hpet device number for MSI allocation
x86/hpet: Check for irq==0 when allocating hpet MSI interrupts
x86/mm/pat, drivers/infiniband/ipath: Use arch_phys_wc_add() and require PAT disabled
x86/mm/pat, drivers/media/ivtv: Use arch_phys_wc_add() and require PAT disabled
x86/platform/intel/baytrail: Add comments about why we disabled HPET on Baytrail
genirq: Prevent crash in irq_move_irq()
genirq: Enhance irq_data_to_desc() to support hierarchy irqdomain
iommu, x86: Properly handle posted interrupts for IOMMU hotplug
iommu, x86: Provide irq_remapping_cap() interface
iommu, x86: Setup Posted-Interrupts capability for Intel iommu
iommu, x86: Add cap_pi_support() to detect VT-d PI capability
iommu, x86: Avoid migrating VT-d posted interrupts
iommu, x86: Save the mode (posted or remapped) of an IRTE
iommu, x86: Implement irq_set_vcpu_affinity for intel_ir_chip
iommu: dmar: Provide helper to copy shared irte fields
iommu: dmar: Extend struct irte for VT-d Posted-Interrupts
iommu: Add new member capability to struct irq_remap_ops
x86/asm/entry/64: Disentangle error_entry/exit gsbase/ebx/usermode code
x86/asm/entry/32: Shorten __audit_syscall_entry() args preparation
x86/asm/entry/32: Explain reloading of registers after __audit_syscall_entry()
...
Pull scheduler updates from Ingo Molnar:
"The main changes are:
- lockless wakeup support for futexes and IPC message queues
(Davidlohr Bueso, Peter Zijlstra)
- Replace spinlocks with atomics in thread_group_cputimer(), to
improve scalability (Jason Low)
- NUMA balancing improvements (Rik van Riel)
- SCHED_DEADLINE improvements (Wanpeng Li)
- clean up and reorganize preemption helpers (Frederic Weisbecker)
- decouple page fault disabling machinery from the preemption
counter, to improve debuggability and robustness (David
Hildenbrand)
- SCHED_DEADLINE documentation updates (Luca Abeni)
- topology CPU masks cleanups (Bartosz Golaszewski)
- /proc/sched_debug improvements (Srikar Dronamraju)"
* 'sched-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (79 commits)
sched/deadline: Remove needless parameter in dl_runtime_exceeded()
sched: Remove superfluous resetting of the p->dl_throttled flag
sched/deadline: Drop duplicate init_sched_dl_class() declaration
sched/deadline: Reduce rq lock contention by eliminating locking of non-feasible target
sched/deadline: Make init_sched_dl_class() __init
sched/deadline: Optimize pull_dl_task()
sched/preempt: Add static_key() to preempt_notifiers
sched/preempt: Fix preempt notifiers documentation about hlist_del() within unsafe iteration
sched/stop_machine: Fix deadlock between multiple stop_two_cpus()
sched/debug: Add sum_sleep_runtime to /proc/<pid>/sched
sched/debug: Replace vruntime with wait_sum in /proc/sched_debug
sched/debug: Properly format runnable tasks in /proc/sched_debug
sched/numa: Only consider less busy nodes as numa balancing destinations
Revert 095bebf61a ("sched/numa: Do not move past the balance point if unbalanced")
sched/fair: Prevent throttling in early pick_next_task_fair()
preempt: Reorganize the notrace definitions a bit
preempt: Use preempt_schedule_context() as the official tracing preemption point
sched: Make preempt_schedule_context() function-tracing safe
x86: Remove cpu_sibling_mask() and cpu_core_mask()
x86: Replace cpu_**_mask() with topology_**_cpumask()
...
printk_ratelimit() shares the ratelimiting state with other callers what
may lead to scenarios where at the time we want to print out debug
information we already limited, so nothing appears in the dmesg - this
makes exception-trace quite poor helper in debugging.
Additionally, we have imbalance with some messages limited with global
ratelimit state and other messages limited with their private state
defined via pr_*_ratelimited().
To address this inconsistency show_unhandled_signals_ratelimited()
macro is introduced and caller sites are converted to use it.
Signed-off-by: Vladimir Murzin <vladimir.murzin@arm.com>
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
This patch renames __cpu_suspend to cpu_suspend so that it's aligned
with ARM32. It also removes the redundant wrapper created.
This is in preparation to implement generic PSCI system suspend using
the cpu_{suspend,resume} which now has the same interface on both ARM
and ARM64.
Cc: Mark Rutland <mark.rutland@arm.com>
Reviewed-by: Lorenzo Pieralisi <lorenzo.pieralisi@arm.com>
Reviewed-by: Ashwin Chaugule <ashwin.chaugule@linaro.org>
Signed-off-by: Sudeep Holla <sudeep.holla@arm.com>
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
section 6.2.17 _CCA states that ARM platforms require ACPI _CCA
object to be specified for DMA-cabpable devices. Therefore, this patch
specifies ACPI_CCA_REQUIRED in arm64 Kconfig.
In addition, to handle the case when _CCA is missing, arm64 would assign
dummy_dma_ops to disable DMA capability of the device.
Acked-by: Catalin Marinas <catalin.marinas@arm.com>
Signed-off-by: Mark Salter <msalter@redhat.com>
Signed-off-by: Suravee Suthikulpanit <Suravee.Suthikulpanit@amd.com>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
tlb.S has been removed since fa48e6f "arm64: mm: Optimise tlb flush logic
where we have >4K granule", so align comment with that.
Signed-off-by: Vladimir Murzin <vladimir.murzin@arm.com>
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
So far, we configured the world-switch by having a small array
of pointers to the save and restore functions, depending on the
GIC used on the platform.
Loading these values each time is a bit silly (they never change),
and it makes sense to rely on the instruction patching instead.
This leads to a nice cleanup of the code.
Acked-by: Will Deacon <will.deacon@arm.com>
Reviewed-by: Christoffer Dall <christoffer.dall@linaro.org>
Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
Add a new item to the feature set (ARM64_HAS_SYSREG_GIC_CPUIF)
to indicate that we have a system register GIC CPU interface
This will help KVM switching to alternative instruction patching.
Reviewed-by: Andre Przywara <andre.przywara@arm.com>
Reviewed-by: Christoffer Dall <christoffer.dall@linaro.org>
Acked-by: Will Deacon <will.deacon@arm.com>
Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
Add ioremap_wt() to all arch-specific asm/io.h headers which
define ioremap_wc() locally. These headers do not include
<asm-generic/iomap.h>. Some of them include <asm-generic/io.h>,
but ioremap_wt() is defined for consistency since they define
all ioremap_xxx locally.
In all architectures without Write-Through support, ioremap_wt()
is defined indentical to ioremap_nocache().
frv and m68k already have ioremap_writethrough(). On those we
add ioremap_wt() indetical to ioremap_writethrough() and defines
ARCH_HAS_IOREMAP_WT in both architectures.
The ioremap_wt() interface is exported to drivers.
Signed-off-by: Toshi Kani <toshi.kani@hp.com>
Signed-off-by: Borislav Petkov <bp@suse.de>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Andy Lutomirski <luto@amacapital.net>
Cc: Elliott@hp.com
Cc: H. Peter Anvin <hpa@zytor.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Luis R. Rodriguez <mcgrof@suse.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: arnd@arndb.de
Cc: hch@lst.de
Cc: hmh@hmh.eng.br
Cc: jgross@suse.com
Cc: konrad.wilk@oracle.com
Cc: linux-mm <linux-mm@kvack.org>
Cc: linux-nvdimm@lists.01.org
Cc: stefan.bader@canonical.com
Cc: yigal@plexistor.com
Link: http://lkml.kernel.org/r/1433436928-31903-9-git-send-email-bp@alien8.de
Signed-off-by: Ingo Molnar <mingo@kernel.org>
AArch64 toolchains suffer from the following bug:
$ cat blah.S
1:
.inst 0x01020304
.if ((. - 1b) != 4)
.error "blah"
.endif
$ aarch64-linux-gnu-gcc -c blah.S
blah.S: Assembler messages:
blah.S:3: Error: non-constant expression in ".if" statement
which precludes the use of msr_s and co as part of alternatives.
We workaround this issue by not directly testing the labels
themselves, but by moving the current output pointer by a value
that should always be zero. If this value is not null, then
we will trigger a backward move, which is expclicitely forbidden.
This triggers the error we're after:
AS arch/arm64/kvm/hyp.o
arch/arm64/kvm/hyp.S: Assembler messages:
arch/arm64/kvm/hyp.S:1377: Error: attempt to move .org backwards
scripts/Makefile.build:294: recipe for target 'arch/arm64/kvm/hyp.o' failed
make[1]: *** [arch/arm64/kvm/hyp.o] Error 1
Makefile:946: recipe for target 'arch/arm64/kvm' failed
Not pretty, but at least works on the current toolchains.
Acked-by: Will Deacon <will.deacon@arm.com>
Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
asm/alternative-asm.h and asm/alternative.h are extremely similar,
and really deserve to live in the same file (as this makes further
modufications a bit easier).
Fold the content of alternative-asm.h into alternative.h, and
update the few users.
Acked-by: Will Deacon <will.deacon@arm.com>
Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
In order to deal with branches located in alternate sequences,
but pointing to the main kernel text, it is required to extract
the relative displacement encoded in the instruction, and to be
able to update said instruction with a new offset (once it is
known).
For this, we introduce three new helpers:
- aarch64_insn_is_branch_imm is a predicate indicating if the
instruction is an immediate branch
- aarch64_get_branch_offset returns a signed value representing
the byte offset encoded in a branch instruction
- aarch64_set_branch_offset takes an instruction and an offset,
and returns the corresponding updated instruction.
Acked-by: Will Deacon <will.deacon@arm.com>
Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
Currently, the FDT blob needs to be in the same 512 MB region as
the kernel, so that it can be mapped into the kernel virtual memory
space very early on using a minimal set of statically allocated
translation tables.
Now that we have early fixmap support, we can relax this restriction,
by moving the permanent FDT mapping to the fixmap region instead.
This way, the FDT blob may be anywhere in memory.
This also moves the vetting of the FDT to mmu.c, since the early
init code in head.S does not handle mapping of the FDT anymore.
At the same time, fix up some comments in head.S that have gone stale.
Reviewed-by: Mark Rutland <mark.rutland@arm.com>
Tested-by: Mark Rutland <mark.rutland@arm.com>
Signed-off-by: Ard Biesheuvel <ard.biesheuvel@linaro.org>
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
Since commit a4780adeef ("ARM: 7735/2: Preserve the user r/w register
TPIDRURW on context switch and fork"), arch/arm/ has context switched
the user-writable TLS register, so do the same for compat tasks running
under the arm64 kernel.
Reported-by: André Hentschel <nerv@dawncrow.de>
Tested-by: André Hentschel <nerv@dawncrow.de>
Signed-off-by: Will Deacon <will.deacon@arm.com>
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
The 32-bit ARM port doesn't have ACPI headers, and conditionally
including them is going to look horrendous. In preparation for sharing
the PSCI invocation code with 32-bit, move the acpi_psci_* function
declarations and definitions such that the PSCI client code need not
include ACPI headers.
While it would seem like we could simply hide the ACPI includes in
psci.h, the ACPI headers have hilarious circular dependencies which make
this infeasible without reorganising most of ACPICA. So rather than
doing that, move the acpi_psci_* prototypes into psci.h.
The psci_acpi_init function is made dependent on CONFIG_ACPI (with a
stub implementation in asm/psci.h) such that it need not be built for
32-bit ARM or kernels without ACPI support. The currently missing __init
annotations are added to the prototypes in the header.
Signed-off-by: Mark Rutland <mark.rutland@arm.com>
Acked-by: Catalin Marinas <catalin.marinas@arm.com>
Acked-by: Hanjun Guo <hanjun.guo@linaro.org>
Reviewed-by: Al Stone <al.stone@linaro.org>
Reviewed-by: Ashwin Chaugule <ashwin.chaugule@linaro.org>
Tested-by: Hanjun Guo <hanjun.guo@linaro.org>
Cc: Lorenzo Pieralisi <lorenzo.pieralisi@arm.com>
Cc: Will Deacon <will.deacon@arm.com>
The PSCI MIGRATE_INFO_UP_CPU call returns a physical ID, which we will
need to map back to a Linux logical ID.
Implement a reusable get_logical_index to map from a physical ID to a
logical ID.
Signed-off-by: Mark Rutland <mark.rutland@arm.com>
Acked-by: Catalin Marinas <catalin.marinas@arm.com>
Acked-by: Lorenzo Pieralisi <lorenzo.pieralisi@arm.com>
Reviewed-by: Hanjun Guo <hanjun.guo@linaro.org>
Tested-by: Hanjun Guo <hanjun.guo@linaro.org>
Cc: Will Deacon <will.deacon@arm.com>
For ARM64, when tracing with tracepoint events, the IP and pstate are set
to 0, preventing the perf code parsing the callchain and resolving the
symbols correctly.
./perf record -e sched:sched_switch -g --call-graph dwarf ls
[ perf record: Captured and wrote 0.146 MB perf.data ]
./perf report -f
Samples: 194 of event 'sched:sched_switch', Event count (approx.): 194
Children Self Command Shared Object Symbol
100.00% 100.00% ls [unknown] [.] 0000000000000000
The fix is to implement perf_arch_fetch_caller_regs for ARM64, which fills
several necessary registers used for callchain unwinding, including pc,sp,
fp and spsr .
With this patch, callchain can be parsed correctly as follows:
......
+ 2.63% 0.00% ls [kernel.kallsyms] [k] vfs_symlink
+ 2.63% 0.00% ls [kernel.kallsyms] [k] follow_down
+ 2.63% 0.00% ls [kernel.kallsyms] [k] pfkey_get
+ 2.63% 0.00% ls [kernel.kallsyms] [k] do_execveat_common.isra.33
- 2.63% 0.00% ls [kernel.kallsyms] [k] pfkey_send_policy_notify
pfkey_send_policy_notify
pfkey_get
v9fs_vfs_rename
page_follow_link_light
link_path_walk
el0_svc_naked
.......
Signed-off-by: Hou Pengyang <houpengyang@huawei.com>
Acked-by: Will Deacon <will.deacon@arm.com>
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>