The TERES-I is an open hardware laptop built by Olimex using the
Allwinner A64 SoC.
Add the board specific .dts file, which includes the A64 .dtsi and
enables the peripherals that we support so far.
Signed-off-by: Harald Geyer <harald@ccbib.org>
Signed-off-by: Maxime Ripard <maxime.ripard@bootlin.com>
The A64 SoC features two display pipelines, one has a LCD output, the
other has a HDMI output.
Add support for simplefb for the LCD output. Tested on Teres I.
This patch was inspired by work of Icenowy Zheng.
Signed-off-by: Harald Geyer <harald@ccbib.org>
Signed-off-by: Maxime Ripard <maxime.ripard@bootlin.com>
Add a watchdog node for the A64, automatically enabled on all boards.
Since the device is compatible with an existing driver, we only reserve
a new compatible string to be used together with the fall back.
Tested on Olimex Teres-I.
Signed-off-by: Harald Geyer <harald@ccbib.org>
Signed-off-by: Maxime Ripard <maxime.ripard@bootlin.com>
Add the proper pin group node to reference in board files.
Reviewed-by: Andre Przywara <andre.przywara@arm.com>
Signed-off-by: Harald Geyer <harald@ccbib.org>
Signed-off-by: Maxime Ripard <maxime.ripard@bootlin.com>
Pine H64 is an Allwinner H6-based SBC from Pine64, with the following
features:
- 1GiB/2GiB/4GiB LPDDR3 DRAM (in 4GiB situation only 3GiB is
accessible)
- AXP805 PMIC
- Raspberry-Pi-compatible GPIO header, "Euler" GPIO header (not
compatible with the "Euler" on Pine A64) and "Expansion" pin header
- 2 USB 2.0 ports and 1 USB 3.0 ports
- Audio jack
- MicroSD slot and eMMC module slot
- on-board SPI NOR flash
- 1Gbps Ethernet port (via RTL8211E PHY)
- HDMI port
Adds initial support for it, including the UART on the Expansion pin
header.
Signed-off-by: Icenowy Zheng <icenowy@aosc.io>
Reviewed-by: Andre Przywara <andre.przywara@arm.com>
Tested-by: Andre Przywara <andre.przywara@arm.com>
Signed-off-by: Maxime Ripard <maxime.ripard@bootlin.com>
Allwinner H6 is a new SoC with Cortex-A53 cores from Allwinner, with its
memory map fully reworked and some high-speed peripherals (PCIe, USB
3.0) introduced.
This commit adds the basical DTSI file of it, including the clock
support and UART support.
Signed-off-by: Icenowy Zheng <icenowy@aosc.io>
Reviewed-by: Andre Przywara <andre.przywara@arm.com>
Signed-off-by: Maxime Ripard <maxime.ripard@bootlin.com>
The function SMCCC_ARCH_WORKAROUND_1 was introduced as part of SMC
V1.1 Calling Convention to mitigate CVE-2017-5715. This patch uses
the standard call SMCCC_ARCH_WORKAROUND_1 for Falkor chips instead
of Silicon provider service ID 0xC2001700.
Cc: <stable@vger.kernel.org> # 4.14+
Signed-off-by: Shanker Donthineni <shankerd@codeaurora.org>
Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
Expose the new features introduced by Arm v8.4 extensions to
Arm v8-A profile.
These include :
1) Data indpendent timing of instructions. (DIT, exposed as HWCAP_DIT)
2) Unaligned atomic instructions and Single-copy atomicity of loads
and stores. (AT, expose as HWCAP_USCAT)
3) LDAPR and STLR instructions with immediate offsets (extension to
LRCPC, exposed as HWCAP_ILRCPC)
4) Flag manipulation instructions (TS, exposed as HWCAP_FLAGM).
Cc: Catalin Marinas <catalin.marinas@arm.com>
Cc: Will Deacon <will.deacon@arm.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Reviewed-by: Dave Martin <dave.martin@arm.com>
Signed-off-by: Suzuki K Poulose <suzuki.poulose@arm.com>
Signed-off-by: Will Deacon <will.deacon@arm.com>
Now that we started keeping modules within 4 GB of the core kernel
in all cases, we no longer need to special case the adr_l/ldr_l/str_l
macros for modules to deal with them being loaded farther away.
Signed-off-by: Ard Biesheuvel <ard.biesheuvel@linaro.org>
Signed-off-by: Will Deacon <will.deacon@arm.com>
The printk symbol was intended as a generic address that is always
exported, however that turned out to be false with CONFIG_PRINTK=n:
ERROR: "printk" [arch/arm64/kernel/arm64-reloc-test.ko] undefined!
This changes the references to memstart_addr, which should be there
regardless of configuration.
Fixes: a257e02579 ("arm64/kernel: don't ban ADRP to work around Cortex-A53 erratum #843419")
Acked-by: Ard Biesheuvel <ard.biesheuvel@linaro.org>
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Signed-off-by: Will Deacon <will.deacon@arm.com>
The schematic of the espressobin is publicly available, add a comment
where to find it.
Signed-off-by: Uwe Kleine-König <u.kleine-koenig@pengutronix.de>
Reviewed-by: Andrew Lunn <andrew@lunn.ch>
Signed-off-by: Gregory CLEMENT <gregory.clement@bootlin.com>
This extra clock is needed to access the registers of the PCIe host
controller used on CP110 component of the Armada 7K/8K SoCs.
This follow the changes already made in the binding documentation (as
well as in the driver): "PCI: armada8k: Fix clock resource by adding
a register clock"
Signed-off-by: Gregory CLEMENT <gregory.clement@bootlin.com>
This extra clock is needed to access the registers of the NAND controller
used on CP110 component of the Armada 7K/8K SoCs.
This follow the changes already made in the binding documentation (as
well as in the driver): "mtd: nand: marvell: Fix clock resource by adding
a register clock"
Signed-off-by: Gregory CLEMENT <gregory.clement@bootlin.com>
This extra clock is needed to access the registers of the safexcel EIP97
used on CP110 component of the Armada 7K/8K SoCs.
This follow the changes already made in the binding documentation (as
well as in the driver): "crypto: inside-secure - fix clock resource by
adding a register clock"
Signed-off-by: Gregory CLEMENT <gregory.clement@bootlin.com>
This extra clock is needed to access the registers of the harware RNG
used on CP110 component of the Armada 7K/8K SoCs.
This follow the changes already made in the binding documentation (as
well as in the driver): "hwrng: omap - Fix clock resource by adding a
register clock"
Signed-off-by: Gregory CLEMENT <gregory.clement@bootlin.com>
This extra clock is needed to access the registers of the XOR engine
controller used on CP110 component of the Armada 7K/8K SoCs.
This follow the changes already made in the binding documentation (as
well as in the driver): "dmaengine: mv_xor_v2: Fix clock resource by
adding a register clock"
Signed-off-by: Gregory CLEMENT <gregory.clement@bootlin.com>
This extra clock is needed to access the registers of the USB host
controller used on Armada 7K/8K SoCs.
This follow the changes already made in the binding documentation (as
well as in the driver): "usb: host: xhci-plat: Fix clock resource by
adding a register clock"
Signed-off-by: Gregory CLEMENT <gregory.clement@bootlin.com>
Cortex-A57 and A72 are vulnerable to the so-called "variant 3a" of
Meltdown, where an attacker can speculatively obtain the value
of a privileged system register.
By enabling ARM64_HARDEN_EL2_VECTORS on these CPUs, obtaining
VBAR_EL2 is not disclosing the hypervisor mappings anymore.
Acked-by: Catalin Marinas <catalin.marinas@arm.com>
Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
We're now ready to map our vectors in weird and wonderful locations.
On enabling ARM64_HARDEN_EL2_VECTORS, a vector slot gets allocated
if this hasn't been already done via ARM64_HARDEN_BRANCH_PREDICTOR
and gets mapped outside of the normal RAM region, next to the
idmap.
That way, being able to obtain VBAR_EL2 doesn't reveal the mapping
of the rest of the hypervisor code.
Acked-by: Catalin Marinas <catalin.marinas@arm.com>
Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
We're about to need to allocate hardening slots from other parts
of the kernel (in order to support ARM64_HARDEN_EL2_VECTORS).
Turn the counter into an atomic_t and make it available to the
rest of the kernel. Also add BP_HARDEN_EL2_SLOTS as the number of
slots instead of the hardcoded 4...
Acked-by: Catalin Marinas <catalin.marinas@arm.com>
Reviewed-by: Andrew Jones <drjones@redhat.com>
Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
Until now, all EL2 executable mappings were derived from their
EL1 VA. Since we want to decouple the vectors mapping from
the rest of the hypervisor, we need to be able to map some
text somewhere else.
The "idmap" region (for lack of a better name) is ideally suited
for this, as we have a huge range that hardly has anything in it.
Let's extend the IO allocator to also deal with executable mappings,
thus providing the required feature.
Acked-by: Catalin Marinas <catalin.marinas@arm.com>
Reviewed-by: Andrew Jones <drjones@redhat.com>
Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
So far, the branch from the vector slots to the main vectors can at
most be 4GB from the main vectors (the reach of ADRP), and this
distance is known at compile time. If we were to remap the slots
to an unrelated VA, things would break badly.
A way to achieve VA independence would be to load the absolute
address of the vectors (__kvm_hyp_vector), either using a constant
pool or a series of movs, followed by an indirect branch.
This patches implements the latter solution, using another instance
of a patching callback. Note that since we have to save a register
pair on the stack, we branch to the *second* instruction in the
vectors in order to compensate for it. This also results in having
to adjust this balance in the invalid vector entry point.
Acked-by: Catalin Marinas <catalin.marinas@arm.com>
Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
So far, we only reserve a single instruction in the BPI template in
order to branch to the vectors. As we're going to stuff a few more
instructions there, let's reserve a total of 5 instructions, which
we're going to patch later on as required.
We also introduce a small refactor of the vectors themselves, so that
we stop carrying the target branch around.
Acked-by: Catalin Marinas <catalin.marinas@arm.com>
Reviewed-by: Andrew Jones <drjones@redhat.com>
Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
There is no reason why the BP hardening vectors shouldn't be part
of the HYP text at compile time, rather than being mapped at runtime.
Also introduce a new config symbol that controls the compilation
of bpi.S.
Acked-by: Catalin Marinas <catalin.marinas@arm.com>
Reviewed-by: Andrew Jones <drjones@redhat.com>
Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
All our useful entry points into the hypervisor are starting by
saving x0 and x1 on the stack. Let's move those into the vectors
by introducing macros that annotate whether a vector is valid or
not, thus indicating whether we want to stash registers or not.
The only drawback is that we now also stash registers for el2_error,
but this should never happen, and we pop them back right at the
start of the handling sequence.
Acked-by: Catalin Marinas <catalin.marinas@arm.com>
Reviewed-by: Andrew Jones <drjones@redhat.com>
Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
We currently provide the hyp-init code with a kernel VA, and expect
it to turn it into a HYP va by itself. As we're about to provide
the hypervisor with mappings that are not necessarily in the memory
range, let's move the kern_hyp_va macro to kvm_get_hyp_vector.
No functionnal change.
Acked-by: Catalin Marinas <catalin.marinas@arm.com>
Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
The main idea behind randomising the EL2 VA is that we usually have
a few spare bits between the most significant bit of the VA mask
and the most significant bit of the linear mapping.
Those bits could be a bunch of zeroes, and could be useful
to move things around a bit. Of course, the more memory you have,
the less randomisation you get...
Alternatively, these bits could be the result of KASLR, in which
case they are already random. But it would be nice to have a
*different* randomization, just to make the job of a potential
attacker a bit more difficult.
Inserting these random bits is a bit involved. We don't have a spare
register (short of rewriting all the kern_hyp_va call sites), and
the immediate we want to insert is too random to be used with the
ORR instruction. The best option I could come up with is the following
sequence:
and x0, x0, #va_mask
ror x0, x0, #first_random_bit
add x0, x0, #(random & 0xfff)
add x0, x0, #(random >> 12), lsl #12
ror x0, x0, #(63 - first_random_bit)
making it a fairly long sequence, but one that a decent CPU should
be able to execute without breaking a sweat. It is of course NOPed
out on VHE. The last 4 instructions can also be turned into NOPs
if it appears that there is no free bits to use.
Acked-by: Catalin Marinas <catalin.marinas@arm.com>
Reviewed-by: James Morse <james.morse@arm.com>
Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
As we're moving towards a much more dynamic way to compute our
HYP VA, let's express the mask in a slightly different way.
Instead of comparing the idmap position to the "low" VA mask,
we directly compute the mask by taking into account the idmap's
(VA_BIT-1) bit.
No functionnal change.
Acked-by: Catalin Marinas <catalin.marinas@arm.com>
Reviewed-by: Christoffer Dall <christoffer.dall@linaro.org>
Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
The encoder for ADD/SUB (immediate) can only cope with 12bit
immediates, while there is an encoding for a 12bit immediate shifted
by 12 bits to the left.
Let's fix this small oversight by allowing the LSL_12 bit to be set.
Reviewed-by: Christoffer Dall <christoffer.dall@linaro.org>
Acked-by: Catalin Marinas <catalin.marinas@arm.com>
Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
Add an encoder for the EXTR instruction, which also implements the ROR
variant (where Rn == Rm).
Acked-by: Catalin Marinas <catalin.marinas@arm.com>
Reviewed-by: Christoffer Dall <christoffer.dall@linaro.org>
Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
As we're about to change the way we map devices at HYP, we need
to move away from kern_hyp_va on an IO address.
One way of achieving this is to store the VAs in kvm_vgic_global_state,
and use that directly from the HYP code. This requires a small change
to create_hyp_io_mappings so that it can also return a HYP VA.
We take this opportunity to nuke the vctrl_base field in the emulated
distributor, as it is not used anymore.
Acked-by: Catalin Marinas <catalin.marinas@arm.com>
Reviewed-by: Christoffer Dall <christoffer.dall@linaro.org>
Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
Both HYP io mappings call ioremap, followed by create_hyp_io_mappings.
Let's move the ioremap call into create_hyp_io_mappings itself, which
simplifies the code a bit and allows for further refactoring.
Reviewed-by: Christoffer Dall <christoffer.dall@linaro.org>
Acked-by: Catalin Marinas <catalin.marinas@arm.com>
Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
kvm_vgic_global_state is part of the read-only section, and is
usually accessed using a PC-relative address generation (adrp + add).
It is thus useless to use kern_hyp_va() on it, and actively problematic
if kern_hyp_va() becomes non-idempotent. On the other hand, there is
no way that the compiler is going to guarantee that such access is
always PC relative.
So let's bite the bullet and provide our own accessor.
Acked-by: Catalin Marinas <catalin.marinas@arm.com>
Reviewed-by: James Morse <james.morse@arm.com>
Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
Now that we can dynamically compute the kernek/hyp VA mask, there
is no need for a feature flag to trigger the alternative patching.
Let's drop the flag and everything that depends on it.
Acked-by: Christoffer Dall <christoffer.dall@linaro.org>
Acked-by: Catalin Marinas <catalin.marinas@arm.com>
Reviewed-by: Suzuki K Poulose <suzuki.poulose@arm.com>
Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
So far, we're using a complicated sequence of alternatives to
patch the kernel/hyp VA mask on non-VHE, and NOP out the
masking altogether when on VHE.
The newly introduced dynamic patching gives us the opportunity
to simplify that code by patching a single instruction with
the correct mask (instead of the mind bending cumulative masking
we have at the moment) or even a single NOP on VHE. This also
adds some initial code that will allow the patching callback
to switch to a more complex patching.
Acked-by: Catalin Marinas <catalin.marinas@arm.com>
Reviewed-by: James Morse <james.morse@arm.com>
Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
We lack a way to encode operations such as AND, ORR, EOR that take
an immediate value. Doing so is quite involved, and is all about
reverse engineering the decoding algorithm described in the
pseudocode function DecodeBitMasks().
This has been tested by feeding it all the possible literal values
and comparing the output with that of GAS.
Acked-by: Christoffer Dall <christoffer.dall@linaro.org>
Acked-by: Catalin Marinas <catalin.marinas@arm.com>
Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
We're missing the a way to generate the encoding of the N immediate,
which is only a single bit used in a number of instruction that take
an immediate.
Acked-by: Christoffer Dall <christoffer.dall@linaro.org>
Reviewed-by: Catalin Marinas <catalin.marinas@arm.com>
Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
We've so far relied on a patching infrastructure that only gave us
a single alternative, without any way to provide a range of potential
replacement instructions. For a single feature, this is an all or
nothing thing.
It would be interesting to have a more flexible grained way of patching
the kernel though, where we could dynamically tune the code that gets
injected.
In order to achive this, let's introduce a new form of dynamic patching,
assiciating a callback to a patching site. This callback gets source and
target locations of the patching request, as well as the number of
instructions to be patched.
Dynamic patching is declared with the new ALTERNATIVE_CB and alternative_cb
directives:
asm volatile(ALTERNATIVE_CB("mov %0, #0\n", callback)
: "r" (v));
or
alternative_cb callback
mov x0, #0
alternative_cb_end
where callback is the C function computing the alternative.
Reviewed-by: Christoffer Dall <christoffer.dall@linaro.org>
Reviewed-by: Catalin Marinas <catalin.marinas@arm.com>
Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
We can finally get completely rid of any calls to the VGICv3
save/restore functions when the AP lists are empty on VHE systems. This
requires carefully factoring out trap configuration from saving and
restoring state, and carefully choosing what to do on the VHE and
non-VHE path.
One of the challenges is that we cannot save/restore the VMCR lazily
because we can only write the VMCR when ICC_SRE_EL1.SRE is cleared when
emulating a GICv2-on-GICv3, since otherwise all Group-0 interrupts end
up being delivered as FIQ.
To solve this problem, and still provide fast performance in the fast
path of exiting a VM when no interrupts are pending (which also
optimized the latency for actually delivering virtual interrupts coming
from physical interrupts), we orchestrate a dance of only doing the
activate/deactivate traps in vgic load/put for VHE systems (which can
have ICC_SRE_EL1.SRE cleared when running in the host), and doing the
configuration on every round-trip on non-VHE systems.
Signed-off-by: Christoffer Dall <christoffer.dall@linaro.org>
Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
The APRs can only have bits set when the guest acknowledges an interrupt
in the LR and can only have a bit cleared when the guest EOIs an
interrupt in the LR. Therefore, if we have no LRs with any
pending/active interrupts, the APR cannot change value and there is no
need to clear it on every exit from the VM (hint: it will have already
been cleared when we exited the guest the last time with the LRs all
EOIed).
The only case we need to take care of is when we migrate the VCPU away
from a CPU or migrate a new VCPU onto a CPU, or when we return to
userspace to capture the state of the VCPU for migration. To make sure
this works, factor out the APR save/restore functionality into separate
functions called from the VCPU (and by extension VGIC) put/load hooks.
Reviewed-by: Marc Zyngier <marc.zyngier@arm.com>
Signed-off-by: Christoffer Dall <christoffer.dall@linaro.org>
Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
Just like we can program the GICv2 hypervisor control interface directly
from the core vgic code, we can do the same for the GICv3 hypervisor
control interface on VHE systems.
We do this by simply calling the save/restore functions when we have VHE
and we can then get rid of the save/restore function calls from the VHE
world switch function.
One caveat is that we now write GICv3 system register state before the
potential early exit path in the run loop, and because we sync back
state in the early exit path, we have to ensure that we read a
consistent GIC state from the sync path, even though we have never
actually run the guest with the newly written GIC state. We solve this
by inserting an ISB in the early exit path.
Signed-off-by: Christoffer Dall <christoffer.dall@linaro.org>
Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
The vgic-v2-sr.c file now only contains the logic to replay unaligned
accesses to the virtual CPU interface on 16K and 64K page systems, which
is only relevant on 64-bit platforms. Therefore move this file to the
arm64 KVM tree, remove the compile directive from the 32-bit side
makefile, and remove the ifdef in the C file.
Since this file also no longer saves/restores anything, rename the file
to vgic-v2-cpuif-proxy.c to more accurately describe the logic in this
file.
Reviewed-by: Andre Przywara <andre.przywara@arm.com>
Signed-off-by: Christoffer Dall <christoffer.dall@linaro.org>
Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
We can program the GICv2 hypervisor control interface logic directly
from the core vgic code and can instead do the save/restore directly
from the flush/sync functions, which can lead to a number of future
optimizations.
Signed-off-by: Christoffer Dall <christoffer.dall@linaro.org>
Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
To make the code more readable and to avoid the overhead of a function
call, let's get rid of a pair of the alternative function selectors and
explicitly call the VHE and non-VHE functions using the has_vhe() static
key based selector instead, telling the compiler to try to inline the
static function if it can.
Reviewed-by: Andrew Jones <drjones@redhat.com>
Signed-off-by: Christoffer Dall <christoffer.dall@linaro.org>
Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
We do not have to change the c15 trap setting on each switch to/from the
guest on VHE systems, because this setting only affects guest EL1/EL0
(and therefore not the VHE host).
The PMU and debug trap configuration can also be done on vcpu load/put
instead, because they don't affect how the VHE host kernel can access the
debug registers while executing KVM kernel code.
Reviewed-by: Marc Zyngier <marc.zyngier@arm.com>
Reviewed-by: Andrew Jones <drjones@redhat.com>
Signed-off-by: Christoffer Dall <christoffer.dall@linaro.org>
Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
There is no longer a need for an alternative to choose the right
function to tell us whether or not FPSIMD was enabled for the VM,
because we can simply can the appropriate functions directly from within
the _vhe and _nvhe run functions.
Reviewed-by: Marc Zyngier <marc.zyngier@arm.com>
Reviewed-by: Andrew Jones <drjones@redhat.com>
Signed-off-by: Christoffer Dall <christoffer.dall@linaro.org>
Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
As we are about to be more lazy with some of the trap configuration
register read/writes for VHE systems, move the logic that is currently
shared between VHE and non-VHE into a separate function which can be
called from either the world-switch path or from vcpu_load/vcpu_put.
Reviewed-by: Marc Zyngier <marc.zyngier@arm.com>
Reviewed-by: Andrew Jones <drjones@redhat.com>
Signed-off-by: Christoffer Dall <christoffer.dall@linaro.org>
Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
When running a 32-bit VM (EL1 in AArch32), the AArch32 system registers
can be deferred to vcpu load/put on VHE systems because neither
the host kernel nor host userspace uses these registers.
Note that we can't save DBGVCR32_EL2 conditionally based on the state of
the debug dirty flag on VHE after this change, because during
vcpu_load() we haven't calculated a valid debug flag yet, and when we've
restored the register during vcpu_load() we also have to save it during
vcpu_put(). This means that we'll always restore/save the register for
VHE on load/put, but luckily vcpu load/put are called rarely, so saving
an extra register unconditionally shouldn't significantly hurt
performance.
We can also not defer saving FPEXC32_32 because this register only holds
a guest-valid value for 32-bit guests during the exit path when the
guest has used FPSIMD registers and restored the register in the early
assembly handler from taking the EL2 fault, and therefore we have to
check if fpsimd is enabled for the guest in the exit path and save the
register then, for both VHE and non-VHE guests.
Reviewed-by: Marc Zyngier <marc.zyngier@arm.com>
Reviewed-by: Andrew Jones <drjones@redhat.com>
Signed-off-by: Christoffer Dall <christoffer.dall@linaro.org>
Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
32-bit registers are not used by a 64-bit host kernel and can be
deferred, but we need to rework the accesses to these register to access
the latest values depending on whether or not guest system registers are
loaded on the CPU or only reside in memory.
Reviewed-by: Marc Zyngier <marc.zyngier@arm.com>
Reviewed-by: Andrew Jones <drjones@redhat.com>
Signed-off-by: Christoffer Dall <christoffer.dall@linaro.org>
Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
Some system registers do not affect the host kernel's execution and can
therefore be loaded when we are about to run a VCPU and we don't have to
restore the host state to the hardware before the time when we are
actually about to return to userspace or schedule out the VCPU thread.
The EL1 system registers and the userspace state registers only
affecting EL0 execution do not need to be saved and restored on every
switch between the VM and the host, because they don't affect the host
kernel's execution.
We mark all registers which are now deffered as such in the
vcpu_{read,write}_sys_reg accessors in sys-regs.c to ensure the most
up-to-date copy is always accessed.
Note MPIDR_EL1 (controlled via VMPIDR_EL2) is accessed from other vcpu
threads, for example via the GIC emulation, and therefore must be
declared as immediate, which is fine as the guest cannot modify this
value.
The 32-bit sysregs can also be deferred but we do this in a separate
patch as it requires a bit more infrastructure.
Reviewed-by: Andrew Jones <drjones@redhat.com>
Signed-off-by: Christoffer Dall <christoffer.dall@linaro.org>
Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
ELR_EL1 is not used by a VHE host kernel and can be deferred, but we
need to rework the accesses to this register to access the latest value
depending on whether or not guest system registers are loaded on the CPU
or only reside in memory.
Reviewed-by: Marc Zyngier <marc.zyngier@arm.com>
Reviewed-by: Andrew Jones <drjones@redhat.com>
Signed-off-by: Christoffer Dall <christoffer.dall@linaro.org>
Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
SPSR_EL1 is not used by a VHE host kernel and can be deferred, but we
need to rework the accesses to this register to access the latest value
depending on whether or not guest system registers are loaded on the CPU
or only reside in memory.
The handling of accessing the various banked SPSRs for 32-bit VMs is a
bit clunky, but this will be improved in following patches which will
first prepare and subsequently implement deferred save/restore of the
32-bit registers, including the 32-bit SPSRs.
Reviewed-by: Marc Zyngier <marc.zyngier@arm.com>
Reviewed-by: Andrew Jones <drjones@redhat.com>
Signed-off-by: Christoffer Dall <christoffer.dall@linaro.org>
Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
We are about to defer saving and restoring some groups of system
registers to vcpu_put and vcpu_load on supported systems. This means
that we need some infrastructure to access system registes which
supports either accessing the memory backing of the register or directly
accessing the system registers, depending on the state of the system
when we access the register.
We do this by defining read/write accessor functions, which can handle
both "immediate" and "deferrable" system registers. Immediate registers
are always saved/restored in the world-switch path, but deferrable
registers are only saved/restored in vcpu_put/vcpu_load when supported
and sysregs_loaded_on_cpu will be set in that case.
Note that we don't use the deferred mechanism yet in this patch, but only
introduce infrastructure. This is to improve convenience of review in
the subsequent patches where it is clear which registers become
deferred.
Reviewed-by: Marc Zyngier <marc.zyngier@arm.com>
Reviewed-by: Andrew Jones <drjones@redhat.com>
Signed-off-by: Christoffer Dall <christoffer.dall@linaro.org>
Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
Currently we access the system registers array via the vcpu_sys_reg()
macro. However, we are about to change the behavior to some times
modify the register file directly, so let's change this to two
primitives:
* Accessor macros vcpu_write_sys_reg() and vcpu_read_sys_reg()
* Direct array access macro __vcpu_sys_reg()
The accessor macros should be used in places where the code needs to
access the currently loaded VCPU's state as observed by the guest. For
example, when trapping on cache related registers, a write to a system
register should go directly to the VCPU version of the register.
The direct array access macro can be used in places where the VCPU is
known to never be running (for example userspace access) or for
registers which are never context switched (for example all the PMU
system registers).
This rewrites all users of vcpu_sys_regs to one of the macros described
above.
No functional change.
Acked-by: Marc Zyngier <marc.zyngier@arm.com>
Reviewed-by: Andrew Jones <drjones@redhat.com>
Signed-off-by: Christoffer Dall <cdall@cs.columbia.edu>
Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
We currently handle 32-bit accesses to trapped VM system registers using
the 32-bit index into the coproc array on the vcpu structure, which is a
union of the coproc array and the sysreg array.
Since all the 32-bit coproc indices are created to correspond to the
architectural mapping between 64-bit system registers and 32-bit
coprocessor registers, and because the AArch64 system registers are the
double in size of the AArch32 coprocessor registers, we can always find
the system register entry that we must update by dividing the 32-bit
coproc index by 2.
This is going to make our lives much easier when we have to start
accessing system registers that use deferred save/restore and might
have to be read directly from the physical CPU.
Reviewed-by: Andrew Jones <drjones@redhat.com>
Reviewed-by: Marc Zyngier <marc.zyngier@arm.com>
Signed-off-by: Christoffer Dall <christoffer.dall@linaro.org>
Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
On non-VHE systems we need to save the ELR_EL2 and SPSR_EL2 so that we can
return to the host in EL1 in the same state and location where we issued a
hypercall to EL2, but on VHE ELR_EL2 and SPSR_EL2 are not useful because we
never enter a guest as a result of an exception entry that would be directly
handled by KVM. The kernel entry code already saves ELR_EL1/SPSR_EL1 on
exception entry, which is enough. Therefore, factor out these registers into
separate save/restore functions, making it easy to exclude them from the VHE
world-switch path later on.
Reviewed-by: Marc Zyngier <marc.zyngier@arm.com>
Reviewed-by: Andrew Jones <drjones@redhat.com>
Signed-off-by: Christoffer Dall <christoffer.dall@linaro.org>
Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
There is no need to have multiple identical functions with different
names for saving host and guest state. When saving and restoring state
for the host and guest, the state is the same for both contexts, and
that's why we have the kvm_cpu_context structure. Delete one
version and rename the other into simply save/restore.
Reviewed-by: Andrew Jones <drjones@redhat.com>
Reviewed-by: Marc Zyngier <marc.zyngier@arm.com>
Signed-off-by: Christoffer Dall <christoffer.dall@linaro.org>
Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
The comment only applied to SPE on non-VHE systems, so we simply remove
it.
Suggested-by: Andrew Jones <drjones@redhat.com>
Acked-by: Marc Zyngier <marc.zyngier@arm.com>
Reviewed-by: Andrew Jones <drjones@redhat.com>
Signed-off-by: Christoffer Dall <christoffer.dall@linaro.org>
Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
As we are about to handle system registers quite differently between VHE
and non-VHE systems. In preparation for that, we need to split some of
the handling functions between VHE and non-VHE functionality.
For now, we simply copy the non-VHE functions, but we do change the use
of static keys for VHE and non-VHE functionality now that we have
separate functions.
Reviewed-by: Andrew Jones <drjones@redhat.com>
Reviewed-by: Marc Zyngier <marc.zyngier@arm.com>
Signed-off-by: Christoffer Dall <christoffer.dall@linaro.org>
Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
As we are about to move calls around in the sysreg save/restore logic,
let's first rewrite the alternative function callers, because it is
going to make the next patches much easier to read.
Acked-by: Marc Zyngier <marc.zyngier@arm.com>
Reviewed-by: Andrew Jones <drjones@redhat.com>
Signed-off-by: Christoffer Dall <christoffer.dall@linaro.org>
Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
There's a semantic difference between the EL1 registers that control
operation of a kernel running in EL1 and EL1 registers that only control
userspace execution in EL0. Since we can defer saving/restoring the
latter, move them into their own function.
The ARMv8 ARM (ARM DDI 0487C.a) Section D10.2.1 recommends that
ACTLR_EL1 has no effect on the processor when running the VHE host, and
we can therefore move this register into the EL1 state which is only
saved/restored on vcpu_put/load for a VHE host.
We also take this chance to rename the function saving/restoring the
remaining system register to make it clear this function deals with
the EL1 system registers.
Reviewed-by: Andrew Jones <drjones@redhat.com>
Reviewed-by: Marc Zyngier <marc.zyngier@arm.com>
Signed-off-by: Christoffer Dall <christoffer.dall@linaro.org>
Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
The VHE switch function calls __timer_enable_traps and
__timer_disable_traps which don't do anything on VHE systems.
Therefore, simply remove these calls from the VHE switch function and
make the functions non-conditional as they are now only called from the
non-VHE switch path.
Acked-by: Marc Zyngier <marc.zyngier@arm.com>
Reviewed-by: Andrew Jones <drjones@redhat.com>
Signed-off-by: Christoffer Dall <christoffer.dall@linaro.org>
Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
There is no need to reset the VTTBR to zero when exiting the guest on
VHE systems. VHE systems don't use stage 2 translations for the EL2&0
translation regime used by the host.
Reviewed-by: Andrew Jones <drjones@redhat.com>
Acked-by: Marc Zyngier <marc.zyngier@arm.com>
Signed-off-by: Christoffer Dall <christoffer.dall@linaro.org>
Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
VHE kernels run completely in EL2 and therefore don't have a notion of
kernel and hyp addresses, they are all just kernel addresses. Therefore
don't call kern_hyp_va() in the VHE switch function.
Reviewed-by: Andrew Jones <drjones@redhat.com>
Reviewed-by: Marc Zyngier <marc.zyngier@arm.com>
Signed-off-by: Christoffer Dall <christoffer.dall@linaro.org>
Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
So far this is mostly (see below) a copy of the legacy non-VHE switch
function, but we will start reworking these functions in separate
directions to work on VHE and non-VHE in the most optimal way in later
patches.
The only difference after this patch between the VHE and non-VHE run
functions is that we omit the branch-predictor variant-2 hardening for
QC Falkor CPUs, because this workaround is specific to a series of
non-VHE ARMv8.0 CPUs.
Reviewed-by: Marc Zyngier <marc.zyngier@arm.com>
Signed-off-by: Christoffer Dall <christoffer.dall@linaro.org>
Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
The current world-switch function has functionality to detect a number
of cases where we need to fixup some part of the exit condition and
possibly run the guest again, before having restored the host state.
This includes populating missing fault info, emulating GICv2 CPU
interface accesses when mapped at unaligned addresses, and emulating
the GICv3 CPU interface on systems that need it.
As we are about to have an alternative switch function for VHE systems,
but VHE systems still need the same early fixup logic, factor out this
logic into a separate function that can be shared by both switch
functions.
No functional change.
Reviewed-by: Marc Zyngier <marc.zyngier@arm.com>
Reviewed-by: Andrew Jones <drjones@redhat.com>
Signed-off-by: Christoffer Dall <christoffer.dall@linaro.org>
Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
Instead of having multiple calls from the world switch path to the debug
logic, each figuring out if the dirty bit is set and if we should
save/restore the debug registers, let's just provide two hooks to the
debug save/restore functionality, one for switching to the guest
context, and one for switching to the host context, and we get the
benefit of only having to evaluate the dirty flag once on each path,
plus we give the compiler some more room to inline some of this
functionality.
Reviewed-by: Marc Zyngier <marc.zyngier@arm.com>
Reviewed-by: Andrew Jones <drjones@redhat.com>
Signed-off-by: Christoffer Dall <christoffer.dall@linaro.org>
Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
The debug save/restore functions can be improved by using the has_vhe()
static key instead of the instruction alternative. Using the static key
uses the same paradigm as we're going to use elsewhere, it makes the
code more readable, and it generates slightly better code (no
stack setups and function calls unless necessary).
We also use a static key on the restore path, because it will be
marginally faster than loading a value from memory.
Finally, we don't have to conditionally clear the debug dirty flag if
it's set, we can just clear it.
Reviewed-by: Marc Zyngier <marc.zyngier@arm.com>
Reviewed-by: Andrew Jones <drjones@redhat.com>
Signed-off-by: Christoffer Dall <christoffer.dall@linaro.org>
Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
There is no need to figure out inside the world-switch if we should
save/restore the debug registers or not, we might as well do that in the
higher level debug setup code, making it easier to optimize down the
line.
Reviewed-by: Julien Thierry <julien.thierry@arm.com>
Reviewed-by: Marc Zyngier <marc.zyngier@arm.com>
Reviewed-by: Andrew Jones <drjones@redhat.com>
Signed-off-by: Christoffer Dall <christoffer.dall@linaro.org>
Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
We have numerous checks around that checks if the HCR_EL2 has the RW bit
set to figure out if we're running an AArch64 or AArch32 VM. In some
cases, directly checking the RW bit (given its unintuitive name), is a
bit confusing, and that's not going to improve as we move logic around
for the following patches that optimize KVM on AArch64 hosts with VHE.
Therefore, introduce a helper, vcpu_el1_is_32bit, and replace existing
direct checks of HCR_EL2.RW with the helper.
Reviewed-by: Julien Grall <julien.grall@arm.com>
Reviewed-by: Julien Thierry <julien.thierry@arm.com>
Acked-by: Marc Zyngier <marc.zyngier@arm.com>
Reviewed-by: Andrew Jones <drjones@redhat.com>
Signed-off-by: Christoffer Dall <christoffer.dall@linaro.org>
Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
As we are about to move a bunch of save/restore logic for VHE kernels to
the load and put functions, we need some infrastructure to do this.
Reviewed-by: Andrew Jones <drjones@redhat.com>
Acked-by: Marc Zyngier <marc.zyngier@arm.com>
Signed-off-by: Christoffer Dall <christoffer.dall@linaro.org>
Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
We currently have a separate read-modify-write of the HCR_EL2 on entry
to the guest for the sole purpose of setting the VF and VI bits, if set.
Since this is most rarely the case (only when using userspace IRQ chip
and interrupts are in flight), let's get rid of this operation and
instead modify the bits in the vcpu->arch.hcr[_el2] directly when
needed.
Acked-by: Marc Zyngier <marc.zyngier@arm.com>
Reviewed-by: Andrew Jones <drjones@redhat.com>
Reviewed-by: Julien Thierry <julien.thierry@arm.com>
Signed-off-by: Christoffer Dall <christoffer.dall@linaro.org>
Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
We always set the IMO and FMO bits in the HCR_EL2 when running the
guest, regardless if we use the vgic or not. By moving these flags to
HCR_GUEST_FLAGS we can avoid one of the extra save/restore operations of
HCR_EL2 in the world switch code, and we can also soon get rid of the
other one.
This is safe, because even though the IMO and FMO bits control both
taking the interrupts to EL2 and remapping ICC_*_EL1 to ICV_*_EL1 when
executed at EL1, as long as we ensure that these bits are clear when
running the EL1 host, we're OK, because we reset the HCR_EL2 to only
have the HCR_RW bit set when returning to EL1 on non-VHE systems.
Reviewed-by: Marc Zyngier <marc.zyngier@arm.com>
Signed-off-by: Shih-Wei Li <shihwei@cs.columbia.edu>
Signed-off-by: Christoffer Dall <christoffer.dall@linaro.org>
Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
VHE actually doesn't rely on clearing the VTTBR when returning to the
host kernel, and that is the current key mechanism of hyp_panic to
figure out how to attempt to return to a state good enough to print a
panic statement.
Therefore, we split the hyp_panic function into two functions, a VHE and
a non-VHE, keeping the non-VHE version intact, but changing the VHE
behavior.
The vttbr_el2 check on VHE doesn't really make that much sense, because
the only situation where we can get here on VHE is when the hypervisor
assembly code actually called into hyp_panic, which only happens when
VBAR_EL2 has been set to the KVM exception vectors. On VHE, we can
always safely disable the traps and restore the host registers at this
point, so we simply do that unconditionally and call into the panic
function directly.
Acked-by: Marc Zyngier <marc.zyngier@arm.com>
Reviewed-by: Andrew Jones <drjones@redhat.com>
Signed-off-by: Christoffer Dall <christoffer.dall@linaro.org>
Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
We already have the percpu area for the host cpu state, which points to
the VCPU, so there's no need to store the VCPU pointer on the stack on
every context switch. We can be a little more clever and just use
tpidr_el2 for the percpu offset and load the VCPU pointer from the host
context.
This has the benefit of being able to retrieve the host context even
when our stack is corrupted, and it has a potential performance benefit
because we trade a store plus a load for an mrs and a load on a round
trip to the guest.
This does require us to calculate the percpu offset without including
the offset from the kernel mapping of the percpu array to the linear
mapping of the array (which is what we store in tpidr_el1), because a
PC-relative generated address in EL2 is already giving us the hyp alias
of the linear mapping of a kernel address. We do this in
__cpu_init_hyp_mode() by using kvm_ksym_ref().
The code that accesses ESR_EL2 was previously using an alternative to
use the _EL1 accessor on VHE systems, but this was actually unnecessary
as the _EL1 accessor aliases the ESR_EL2 register on VHE, and the _EL2
accessor does the same thing on both systems.
Cc: Ard Biesheuvel <ard.biesheuvel@linaro.org>
Reviewed-by: Marc Zyngier <marc.zyngier@arm.com>
Reviewed-by: Andrew Jones <drjones@redhat.com>
Signed-off-by: Christoffer Dall <christoffer.dall@linaro.org>
Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
Calling vcpu_load() registers preempt notifiers for this vcpu and calls
kvm_arch_vcpu_load(). The latter will soon be doing a lot of heavy
lifting on arm/arm64 and will try to do things such as enabling the
virtual timer and setting us up to handle interrupts from the timer
hardware.
Loading state onto hardware registers and enabling hardware to signal
interrupts can be problematic when we're not actually about to run the
VCPU, because it makes it difficult to establish the right context when
handling interrupts from the timer, and it makes the register access
code difficult to reason about.
Luckily, now when we call vcpu_load in each ioctl implementation, we can
simply remove the call from the non-KVM_RUN vcpu ioctls, and our
kvm_arch_vcpu_load() is only used for loading vcpu content to the
physical CPU when we're actually going to run the vcpu.
Reviewed-by: Julien Grall <julien.grall@arm.com>
Reviewed-by: Marc Zyngier <marc.zyngier@arm.com>
Reviewed-by: Andrew Jones <drjones@redhat.com>
Signed-off-by: Christoffer Dall <christoffer.dall@linaro.org>
Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
Tweak the SHA256 update routines to invoke the SHA256 block transform
block by block, to avoid excessive scheduling delays caused by the
NEON algorithm running with preemption disabled.
Also, remove a stale comment which no longer applies now that kernel
mode NEON is actually disallowed in some contexts.
Signed-off-by: Ard Biesheuvel <ard.biesheuvel@linaro.org>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
CBC MAC is strictly sequential, and so the current AES code simply
processes the input one block at a time. However, we are about to add
yield support, which adds a bit of overhead, and which we prefer to
align with other modes in terms of granularity (i.e., it is better to
have all routines yield every 64 bytes and not have an exception for
CBC MAC which yields every 16 bytes)
So unroll the loop by 4. We still cannot perform the AES algorithm in
parallel, but we can at least merge the loads and stores.
Signed-off-by: Ard Biesheuvel <ard.biesheuvel@linaro.org>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
CBC encryption is strictly sequential, and so the current AES code
simply processes the input one block at a time. However, we are
about to add yield support, which adds a bit of overhead, and which
we prefer to align with other modes in terms of granularity (i.e.,
it is better to have all routines yield every 64 bytes and not have
an exception for CBC encrypt which yields every 16 bytes)
So unroll the loop by 4. We still cannot perform the AES algorithm in
parallel, but we can at least merge the loads and stores.
Signed-off-by: Ard Biesheuvel <ard.biesheuvel@linaro.org>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
The AES block mode implementation using Crypto Extensions or plain NEON
was written before real hardware existed, and so its interleave factor
was made build time configurable (as well as an option to instantiate
all interleaved sequences inline rather than as subroutines)
We ended up using INTERLEAVE=4 with inlining disabled for both flavors
of the core AES routines, so let's stick with that, and remove the option
to configure this at build time. This makes the code easier to modify,
which is nice now that we're adding yield support.
Signed-off-by: Ard Biesheuvel <ard.biesheuvel@linaro.org>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
When kernel mode NEON was first introduced on arm64, the preserve and
restore of the userland NEON state was completely unoptimized, and
involved saving all registers on each call to kernel_neon_begin(),
and restoring them on each call to kernel_neon_end(). For this reason,
the NEON crypto code that was introduced at the time keeps the NEON
enabled throughout the execution of the crypto API methods, which may
include calls back into the crypto API that could result in memory
allocation or other actions that we should avoid when running with
preemption disabled.
Since then, we have optimized the kernel mode NEON handling, which now
restores lazily (upon return to userland), and so the preserve action
is only costly the first time it is called after entering the kernel.
So let's put the kernel_neon_begin() and kernel_neon_end() calls around
the actual invocations of the NEON crypto code, and run the remainder of
the code with kernel mode NEON disabled (and preemption enabled)
Signed-off-by: Ard Biesheuvel <ard.biesheuvel@linaro.org>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
When kernel mode NEON was first introduced on arm64, the preserve and
restore of the userland NEON state was completely unoptimized, and
involved saving all registers on each call to kernel_neon_begin(),
and restoring them on each call to kernel_neon_end(). For this reason,
the NEON crypto code that was introduced at the time keeps the NEON
enabled throughout the execution of the crypto API methods, which may
include calls back into the crypto API that could result in memory
allocation or other actions that we should avoid when running with
preemption disabled.
Since then, we have optimized the kernel mode NEON handling, which now
restores lazily (upon return to userland), and so the preserve action
is only costly the first time it is called after entering the kernel.
So let's put the kernel_neon_begin() and kernel_neon_end() calls around
the actual invocations of the NEON crypto code, and run the remainder of
the code with kernel mode NEON disabled (and preemption enabled)
Signed-off-by: Ard Biesheuvel <ard.biesheuvel@linaro.org>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
When kernel mode NEON was first introduced on arm64, the preserve and
restore of the userland NEON state was completely unoptimized, and
involved saving all registers on each call to kernel_neon_begin(),
and restoring them on each call to kernel_neon_end(). For this reason,
the NEON crypto code that was introduced at the time keeps the NEON
enabled throughout the execution of the crypto API methods, which may
include calls back into the crypto API that could result in memory
allocation or other actions that we should avoid when running with
preemption disabled.
Since then, we have optimized the kernel mode NEON handling, which now
restores lazily (upon return to userland), and so the preserve action
is only costly the first time it is called after entering the kernel.
So let's put the kernel_neon_begin() and kernel_neon_end() calls around
the actual invocations of the NEON crypto code, and run the remainder of
the code with kernel mode NEON disabled (and preemption enabled)
Note that this requires some reshuffling of the registers in the asm
code, because the XTS routines can no longer rely on the registers to
retain their contents between invocations.
Signed-off-by: Ard Biesheuvel <ard.biesheuvel@linaro.org>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
When kernel mode NEON was first introduced on arm64, the preserve and
restore of the userland NEON state was completely unoptimized, and
involved saving all registers on each call to kernel_neon_begin(),
and restoring them on each call to kernel_neon_end(). For this reason,
the NEON crypto code that was introduced at the time keeps the NEON
enabled throughout the execution of the crypto API methods, which may
include calls back into the crypto API that could result in memory
allocation or other actions that we should avoid when running with
preemption disabled.
Since then, we have optimized the kernel mode NEON handling, which now
restores lazily (upon return to userland), and so the preserve action
is only costly the first time it is called after entering the kernel.
So let's put the kernel_neon_begin() and kernel_neon_end() calls around
the actual invocations of the NEON crypto code, and run the remainder of
the code with kernel mode NEON disabled (and preemption enabled)
Signed-off-by: Ard Biesheuvel <ard.biesheuvel@linaro.org>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Add a NEON-accelerated implementation of Speck128-XTS and Speck64-XTS
for ARM64. This is ported from the 32-bit version. It may be useful on
devices with 64-bit ARM CPUs that don't have the Cryptography
Extensions, so cannot do AES efficiently -- e.g. the Cortex-A53
processor on the Raspberry Pi 3.
It generally works the same way as the 32-bit version, but there are
some slight differences due to the different instructions, registers,
and syntax available in ARM64 vs. in ARM32. For example, in the 64-bit
version there are enough registers to hold the XTS tweaks for each
128-byte chunk, so they don't need to be saved on the stack.
Benchmarks on a Raspberry Pi 3 running a 64-bit kernel:
Algorithm Encryption Decryption
--------- ---------- ----------
Speck64/128-XTS (NEON) 92.2 MB/s 92.2 MB/s
Speck128/256-XTS (NEON) 75.0 MB/s 75.0 MB/s
Speck128/256-XTS (generic) 47.4 MB/s 35.6 MB/s
AES-128-XTS (NEON bit-sliced) 33.4 MB/s 29.6 MB/s
AES-256-XTS (NEON bit-sliced) 24.6 MB/s 21.7 MB/s
The code performs well on higher-end ARM64 processors as well, though
such processors tend to have the Crypto Extensions which make AES
preferred. For example, here are the same benchmarks run on a HiKey960
(with CPU affinity set for the A73 cores), with the Crypto Extensions
implementation of AES-256-XTS added:
Algorithm Encryption Decryption
--------- ----------- -----------
AES-256-XTS (Crypto Extensions) 1273.3 MB/s 1274.7 MB/s
Speck64/128-XTS (NEON) 359.8 MB/s 348.0 MB/s
Speck128/256-XTS (NEON) 292.5 MB/s 286.1 MB/s
Speck128/256-XTS (generic) 186.3 MB/s 181.8 MB/s
AES-128-XTS (NEON bit-sliced) 142.0 MB/s 124.3 MB/s
AES-256-XTS (NEON bit-sliced) 104.7 MB/s 91.1 MB/s
Signed-off-by: Eric Biggers <ebiggers@google.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Pull "Freescale arm64 device tree updates for 4.17" from Shawn Guo:
- Move cpu_thermal device out of bus node to fix DTC simple_bus_reg
warning seen with W=1 switch.
- Fix IFC child nodes' unit-address to eliminate DTC simple_bus_reg
warnings.
- Add a dummy size memory 'reg' property for LS1046A device tree to
avoid unit_address_vs_reg DTC warning, and the real size will be
filled by bootloader.
- Update ls208xa-qds board device tree to fix unit_address_vs_reg
warnings with DSPI device.
- Add idle-states for LS1012A and LS1043A, and correct
arm,psci-suspend-param setting for already added idle-states.
- DPAA QBMan portal and watchdog device addition.
* tag 'imx-dt64-4.17' of ssh://gitolite.kernel.org/pub/scm/linux/kernel/git/shawnguo/linux:
dt-bindings: ifc: Fix the unit address format in the examples
arm64: dts: ls1046a: add a dummy memory 'reg' property
arm64: dts: fsl: fix ifc simple-bus unit address format warnings
arm64: dts: fsl: update the cpu idle node
arm64: dts: ls1043a: add cpu idle support
arm64: dts: ls1012a: add cpu idle support
arm64: dts: ls208xa-qds: Fix the 'reg' property
arm64: dts: ls208xa-qds: Pass unit name to dspi child nodes
arm64: dts: ls208xa: Move cpu_thermal out of bus node
arm64: dts: ls1088a: Move cpu_thermal out of bus node
arm64: dts: ls1046a: Move cpu_thermal out of bus node
arm64: dts: ls1043a: Move cpu_thermal out of bus node
arm64: dts: ls1012a: Move cpu_thermal out of bus node
arm64: dts: Add DPAA QBMan portal 9
arm64: dts: ls1088a: add DT node of watchdog
- Peace of mind locking fix in vgic_mmio_read_pending
- Allow hw-mapped interrupts to be reset when the VM resets
- Fix GICv2 multi-source SGI injection
- Fix MMIO synchronization for GICv2 on v3 emulation
- Remove excess verbosity on the console
-----BEGIN PGP SIGNATURE-----
iQJJBAABCAAzFiEEn9UcU+C1Yxj9lZw9I9DQutE9ekMFAlqqp/cVHG1hcmMuenlu
Z2llckBhcm0uY29tAAoJECPQ0LrRPXpDAGkP/2LMhFN561PKlqgu5V4hFvowJiXb
Gbb/qi095vtDGccbKmJKAZp3jyOM2oJEMUkx5RBYglWjW0mxb3zPAAxhldXiqv/2
CrOGGlS/FwfyIjCt7870pltDOIgRmk8Fv/MyQjjGKF6VAghd6yVHIZiOUjiriUyz
6hNyc2znLm0tBqm4j3HTXKHpD23YseW387pQoeQ03/WiXiZ60O3e3k0yppXO81qE
b7TGT4Bz04mxlAISZVZeTmG7P7P4ej6+NhOH+1kxacseLzHdECPBA0JRcwRpfLkP
5JFodUOX7/KHpvpMLUxRNRnLBei9WUL4o2LAEV0qDaj7nlAud0kKUm22RLaVKDm+
8FSUQ12XKqnZsRrl6IizU1oAb1I1iV3j9HF5iNf3mk9AO27REGk0b8fDyRzDj300
xpySgvIgA+f+EyY+3ve0AmEUa5QKz/WLuik2ZCqpVOuufrO8XpS+zjn1L1tzTlkR
95EahDA7enutw47G0uWtxoPMeU4HTZS/CAiFwUbq8BEK7T3Rct7UySPLwgeYBoji
MUlCRhPyAANCJmtO6rpOS3htkQ3XkkO1DVIGLuWC5Zl00W1T5I5+VRrVL1YI4v3O
d2ui9r5X5Vmg4OUdhr2D9fXgPWWKEbqD90jv40rGLsMl0g/IwrC+o2VxgYxSeu5x
CLUYILwEA5NDZSof
=iyYE
-----END PGP SIGNATURE-----
Merge tag 'kvm-arm-fixes-for-v4.16-2' of git://git.kernel.org/pub/scm/linux/kernel/git/kvmarm/kvmarm into kvm-master
kvm/arm fixes for 4.16, take 2
- Peace of mind locking fix in vgic_mmio_read_pending
- Allow hw-mapped interrupts to be reset when the VM resets
- Fix GICv2 multi-source SGI injection
- Fix MMIO synchronization for GICv2 on v3 emulation
- Remove excess verbosity on the console
According to Documentation/process/license-rules.rst, move the SPDX
License Identifier to the very top of the file. I used C++ comment
style not only for the SPDX line but for the entire block because
this seems Linus' preference [1]. I also dropped the parentheses to
follow the examples in that document.
[1] https://lkml.org/lkml/2017/11/25/133
Signed-off-by: Masahiro Yamada <yamada.masahiro@socionext.com>
This patch adds regulators that have fixed voltage for audio codec
on UniPhier LD11/20 Global boards. This patch fixes warnings about
TAS57xx audio codec such as "tas571x 0-001b: 0-001b supply AVDD
not found, using dummy regulator".
Signed-off-by: Katsuhiro Suzuki <suzuki.katsuhiro@socionext.com>
Signed-off-by: Masahiro Yamada <yamada.masahiro@socionext.com>
Add nodes of the AVE ethernet controller for LD11 and LD20 SoCs
and the boards.
Signed-off-by: Kunihiko Hayashi <hayashi.kunihiko@socionext.com>
Signed-off-by: Jassi Brar <jaswinder.singh@linaro.org>
Signed-off-by: Masahiro Yamada <yamada.masahiro@socionext.com>
This patch adds compress audio node for S/PDIF on UniPhier LD11/20
global boards. And adds settings of AIO for it.
Signed-off-by: Katsuhiro Suzuki <suzuki.katsuhiro@socionext.com>
Signed-off-by: Masahiro Yamada <yamada.masahiro@socionext.com>
This patch adds codec node for TI TAS571x on UniPhier LD11/20
global boards. And adds settings of AIO for speaker out.
Signed-off-by: Katsuhiro Suzuki <suzuki.katsuhiro@socionext.com>
Signed-off-by: Masahiro Yamada <yamada.masahiro@socionext.com>
Since 'num-slots' had already deprecated, remove the property in
device-tree file.
Signed-off-by: Jaehoon Chung <jh80.chung@samsung.com>
Signed-off-by: Ulf Hansson <ulf.hansson@linaro.org>
Calling vcpu_load() registers preempt notifiers for this vcpu and calls
kvm_arch_vcpu_load(). The latter will soon be doing a lot of heavy
lifting on arm/arm64 and will try to do things such as enabling the
virtual timer and setting us up to handle interrupts from the timer
hardware.
Loading state onto hardware registers and enabling hardware to signal
interrupts can be problematic when we're not actually about to run the
VCPU, because it makes it difficult to establish the right context when
handling interrupts from the timer, and it makes the register access
code difficult to reason about.
Luckily, now when we call vcpu_load in each ioctl implementation, we can
simply remove the call from the non-KVM_RUN vcpu ioctls, and our
kvm_arch_vcpu_load() is only used for loading vcpu content to the
physical CPU when we're actually going to run the vcpu.
Cc: stable@vger.kernel.org
Fixes: 9b062471e5 ("KVM: Move vcpu_load to arch-specific kvm_arch_vcpu_ioctl")
Reviewed-by: Julien Grall <julien.grall@arm.com>
Reviewed-by: Marc Zyngier <marc.zyngier@arm.com>
Reviewed-by: Andrew Jones <drjones@redhat.com>
Signed-off-by: Christoffer Dall <christoffer.dall@linaro.org>
Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
Enable AHCI on Jetson TX1 and add sata phy node.
Signed-off-by: Preetham Chandru R <pchandru@nvidia.com>
Signed-off-by: Thierry Reding <treding@nvidia.com>
Add the (previously omitted) SCIF0 pin data to the V3M Starter Kit board's
device tree.
Signed-off-by: Sergei Shtylyov <sergei.shtylyov@cogentembedded.com>
Reviewed-by: Geert Uytterhoeven <geert+renesas@glider.be>
Signed-off-by: Simon Horman <horms+renesas@verge.net.au>
Display and graphics can't work together without an SMMU, so it is
effectively always getting enabled anyway.
Signed-off-by: Thierry Reding <treding@nvidia.com>
On R-Car H3, on-chip peripheral modules that can make use of DMA are
wired to either SYS-DMAC0 only, or to both SYS-DMAC1 and SYS-DMAC2.
Add the missing DMA properties pointing to SYS-DMAC2 for HSCIF[0-2],
SCIF[0125], and I2C[0-2]. These were initially left out because early
firmware versions prohibited using SYS-DMAC2. This restriction has been
lifted in IPL and Secure Monitor Rev1.0.6 (released on Feb 25, 2016).
Signed-off-by: Geert Uytterhoeven <geert+renesas@glider.be>
Reviewed-by: Niklas Söderlund <niklas.soderlund+renesas@ragnatech.se>
Signed-off-by: Simon Horman <horms+renesas@verge.net.au>
Add r8a7795 IPMMU-PV1 and keep it disabled by default.
This device is not present in r8a7795 ES1.x and
is removed from the DT of those SoCs.
This corrects an omission in
3b7e7848f0 ("arm64: dts: renesas: r8a7795: Add IPMMU device nodes")
This does not have any runtime effect.
Reported-by: Yoshihiro Shimoda <yoshihiro.shimoda.uh@renesas.com>
Signed-off-by: Simon Horman <horms+renesas@verge.net.au>
Reviewed-by: Yoshihiro Shimoda <yoshihiro.shimoda.uh@renesas.com>
Sort root sub-nodes alphabetically for allow for easier maintenance of
this file.
Signed-off-by: Geert Uytterhoeven <geert+renesas@glider.be>
Signed-off-by: Simon Horman <horms+renesas@verge.net.au>
Define the Eagle board dependent part of the I2C0 device node.
The I2C0 bus is populated by ON Semiconductor PCA9653 I/O expander and
Analog Devices ADV7511W HDMI transmitter (but we're only describing the
former chip now).
Based on the original (and large) patch by Vladimir Barinov.
Signed-off-by: Vladimir Barinov <vladimir.barinov@cogentembedded.com>
Signed-off-by: Sergei Shtylyov <sergei.shtylyov@cogentembedded.com>
Reviewed-by: Geert Uytterhoeven <geert+renesas@glider.be>
Signed-off-by: Simon Horman <horms+renesas@verge.net.au>
Define the generic R8A77970 parts of the I2C[0-4] device node.
Based on the original (and large) patch by Daisuke Matsushita
<daisuke.matsushita.ns@hitachi.com>.
Signed-off-by: Vladimir Barinov <vladimir.barinov@cogentembedded.com>
Signed-off-by: Sergei Shtylyov <sergei.shtylyov@cogentembedded.com>
Reviewed-by: Geert Uytterhoeven <geert+renesas@glider.be>
Signed-off-by: Simon Horman <horms+renesas@verge.net.au>
Document clearly which SoC this DTS applies to, to distinguish from
Salvator-XS boards equipped with other SoCs.
Signed-off-by: Geert Uytterhoeven <geert+renesas@glider.be>
Signed-off-by: Simon Horman <horms+renesas@verge.net.au>
Set the "phy-mode" property of EtherAVB device to "rgmii" and let board
files override it if the installed PHY layer provides delays for the
RX/TX channels.
Signed-off-by: Jacopo Mondi <jacopo+renesas@jmondi.org>
Signed-off-by: Simon Horman <horms+renesas@verge.net.au>
Set the "phy-mode" property of EtherAVB device to "rgmii" and let board
files override it if the installed PHY layer provides delays for the
RX/TX channels.
Signed-off-by: Jacopo Mondi <jacopo+renesas@jmondi.org>
Reviewed-by: Geert Uytterhoeven <geert+renesas@glider.be>
Signed-off-by: Simon Horman <horms+renesas@verge.net.au>
Set the "phy-mode" property of EtherAVB device to "rgmii" and let board
files override it if the installed PHY layer provides delays for the
RX/TX channels.
Signed-off-by: Jacopo Mondi <jacopo+renesas@jmondi.org>
Reviewed-by: Geert Uytterhoeven <geert+renesas@glider.be>
Signed-off-by: Simon Horman <horms+renesas@verge.net.au>
Set the "phy-mode" property of EtherAVB device to "rgmii" and let board
files override it if the installed PHY layer provides delays for the
RX/TX channels.
Signed-off-by: Jacopo Mondi <jacopo+renesas@jmondi.org>
Reviewed-by: Geert Uytterhoeven <geert+renesas@glider.be>
Signed-off-by: Simon Horman <horms+renesas@verge.net.au>
As the PHY interface installed on the V3MSK board provides TX and RX
channels delays, make the "phy-mode" property a board-specific one,
meant to override the one specified in the SoC DTSI.
Follow up patches will reset the r8a77970 SoC DTSI to use "rgmii"
mode and let the board file override that.
Signed-off-by: Jacopo Mondi <jacopo+renesas@jmondi.org>
Signed-off-by: Simon Horman <horms+renesas@verge.net.au>
As the PHY interface installed on the Eagle board provides TX and RX
channels delays, make the "phy-mode" property a board-specific one,
meant to override the one specified in the SoC DTSI.
Follow up patches will reset the r8a77970 SoC DTSI to use "rgmii" mode
and let the board file override that.
Signed-off-by: Jacopo Mondi <jacopo+renesas@jmondi.org>
Signed-off-by: Simon Horman <horms+renesas@verge.net.au>
As the PHY interface installed on the Draak board, provides TX
channel delay, make the "phy-mode" property a board-specific one, meant
to override the one specified in the SoC DTSI.
Follow up patches will reset the r8a77995 SoC DTSI to use "rgmii" mode
and let the board file override that.
Signed-off-by: Jacopo Mondi <jacopo+renesas@jmondi.org>
Signed-off-by: Simon Horman <horms+renesas@verge.net.au>
As the PHY interface installed on the ULCB board provides TX
channel delay, make the "phy-mode" property a board-specific one, meant
to override the one specified in the SoC DTSI.
Follow up patches will reset the r8a7795/96 SoC DTSI to use "rgmii" mode\
and let the board files override that.
Signed-off-by: Jacopo Mondi <jacopo+renesas@jmondi.org>
Signed-off-by: Simon Horman <horms+renesas@verge.net.au>
As the PHY interface installed on the Salvator-X[S] board, provides TX
channel delay, make the "phy-mode" property a board-specific one, meant
to override the one specified in the SoC DTSI.
Follow up patches will reset the r8a7795/96/965 SoC DTSI to use "rgmii"
mode and let the board files override that.
Signed-off-by: Jacopo Mondi <jacopo+renesas@jmondi.org>
Reviewed-by: Geert Uytterhoeven <geert+renesas@glider.be>
Signed-off-by: Simon Horman <horms+renesas@verge.net.au>
Populate the device node for the Interrupt Controller for External
Devices (INTC-EX) on R-Car M3-N, which serves external IRQ pins
IRQ[0-5].
Signed-off-by: Geert Uytterhoeven <geert+renesas@glider.be>
Signed-off-by: Simon Horman <horms+renesas@verge.net.au>
Populate the device node for the IIC Bus Interface for DVFS (IIC for
DVFS) on R-Car M3-N, and add an alias to fix its bus number.
Signed-off-by: Geert Uytterhoeven <geert+renesas@glider.be>
Signed-off-by: Simon Horman <horms+renesas@verge.net.au>
Add initial support for the Renesas Salvator-XS (Salvator-X 2nd version)
development board equipped with an R-Car M3-N SiP.
Most features are enabled through the shared salvator-xs.dtsi board
description. The memory configuration is specific to the M3-N SiP.
Signed-off-by: Takeshi Kihara <takeshi.kihara.df@renesas.com>
[geert: Switch to SPDX-License-Identifier, update patch description]
Signed-off-by: Geert Uytterhoeven <geert+renesas@glider.be>
Signed-off-by: Simon Horman <horms+renesas@verge.net.au>
Add "#interrupt-cells" property and "interrupt-controller" label to
"interrupt-controller@e61c0000" device node.
This silences the following DTC compiler warnings:
Warning (interrupts_property): Missing interrupt-controller or
interrupt-map property in /soc/interrupt-controller@e61c0000
Warning (interrupts_property): Missing #interrupt-cells in
interrupt-parent /soc/interrupt-controller@e61c000
Signed-off-by: Jacopo Mondi <jacopo+renesas@jmondi.org>
Reviewed-by: Geert Uytterhoeven <geert+renesas@glider.be>
Signed-off-by: Simon Horman <horms+renesas@verge.net.au>
Add "#pwm-cells" property to "pwm@e6e31000" device node.
This silences the following DTC compiler warning:
Warning (pwms_property): Missing property '#pwm-cells' in node
/soc/pwm@e6e31000 or bad phandle (referred from /backlight:pwms[0])
Signed-off-by: Jacopo Mondi <jacopo+renesas@jmondi.org>
Reviewed-by: Geert Uytterhoeven <geert+renesas@glider.be>
Signed-off-by: Simon Horman <horms+renesas@verge.net.au>
Add "#phy-cells" property to "usb-phy@e65ee000" device node.
This silences the following DTC compiler warning:
Warning (phys_property): Missing property '#phy-cells' in node
/soc/usb-phy@e65ee000 or bad phandle (referred from
/soc/usb@ee020000:phys[0])
Signed-off-by: Jacopo Mondi <jacopo+renesas@jmondi.org>
Reviewed-by: Geert Uytterhoeven <geert+renesas@glider.be>
Signed-off-by: Simon Horman <horms+renesas@verge.net.au>
Remove "reg" property from cache-controller-0 device node as it does not
have any unit address.
This silences the following DTC compiler warning:
Warning (unit_address_vs_reg): Node /cpus/cache-controller-0 has a reg
or ranges property, but no unit name
Signed-off-by: Jacopo Mondi <jacopo+renesas@jmondi.org>
Reviewed-by: Geert Uytterhoeven <geert+renesas@glider.be>
Signed-off-by: Simon Horman <horms+renesas@verge.net.au>
Add "#address-cells" and "#size-cells" properties to all place-holder nodes
that have children nodes defined by salvator-x[s].dtsi device tree.
This silences the following DTC compiler warnings:
Warning (reg_format): "reg" property in /soc/.. has invalid length (4 bytes)
(#address-cells == 2, #size-cells == 1)
Warning (avoid_default_addr_size): Relying on default #address-cells
value for /soc/...
Warning (avoid_default_addr_size): Relying on default #size-cells value
for /soc/...
Signed-off-by: Jacopo Mondi <jacopo+renesas@jmondi.org>
Reviewed-by: Geert Uytterhoeven <geert+renesas@glider.be>
Signed-off-by: Simon Horman <horms+renesas@verge.net.au>
Add "reg" properties to place-holder nodes with unit address defined for
R-Car M3-N SoC.
This silences the following DTC compiler warning:
Warning (unit_address_vs_reg): Node /soc/... has a unit name,
but no reg property
Signed-off-by: Jacopo Mondi <jacopo+renesas@jmondi.org>
Reviewed-by: Geert Uytterhoeven <geert+renesas@glider.be>
Signed-off-by: Simon Horman <horms+renesas@verge.net.au>
Add basic support for R-Car Salvator-X M3-N (R8A77965) board.
Based on original work from:
Takeshi Kihara <takeshi.kihara.df@renesas.com>
Magnus Damm <damm+renesas@opensource.se>
Signed-off-by: Jacopo Mondi <jacopo+renesas@jmondi.org>
Reviewed-by: Geert Uytterhoeven <geert+renesas@glider.be>
Signed-off-by: Simon Horman <horms+renesas@verge.net.au>
Basic support for the Gen 3 R-Car M3-N SoC.
Based on original work from:
Takeshi Kihara <takeshi.kihara.df@renesas.com>
Magnus Damm <damm+renesas@opensource.se>
Signed-off-by: Jacopo Mondi <jacopo+renesas@jmondi.org>
Reviewed-by: Geert Uytterhoeven <geert+renesas@glider.be>
Signed-off-by: Simon Horman <horms+renesas@verge.net.au>
Describe frequencies, other than the default for CA53 cores. This is a
pre-requisite for using providing alternative frequencies for use with
CPUFreq with these cores.
Signed-off-by: Dien Pham <dien.pham.ry@renesas.com>
Signed-off-by: Takeshi Kihara <takeshi.kihara.df@renesas.com>
Signed-off-by: Simon Horman <horms+renesas@verge.net.au>
Describe frequencies, other than the default for CA53 cores. This is a
pre-requisite for using providing alternative frequencies for use with
CPUFreq with these cores.
Signed-off-by: Dien Pham <dien.pham.ry@renesas.com>
Signed-off-by: Takeshi Kihara <takeshi.kihara.df@renesas.com>
Signed-off-by: Simon Horman <horms+renesas@verge.net.au>
sdhci for rk3399-sapphire works for eMMC but keep-power-in-suspend
is an optional property for SDIO.
Signed-off-by: Shawn Lin <shawn.lin@rock-chips.com>
Signed-off-by: Heiko Stuebner <heiko@sntech.de>
pinctrl hogs on rk3399-gru-kevin that broke suspend. This gets fixed
by moving the wifi pinctrl to the correct node.
Also revert the usb3 phy-port enablement, as a missing feature in the
type-c phy breaks usb on all non-gru rk3399 boards.
-----BEGIN PGP SIGNATURE-----
iQFEBAABCAAuFiEE7v+35S2Q1vLNA3Lx86Z5yZzRHYEFAlqj23sQHGhlaWtvQHNu
dGVjaC5kZQAKCRDzpnnJnNEdgf9kB/9/dpdz7OVY4u2WPhr7AHZjv+dBl9dt0uI6
1ttbWzH2sXQcfUivy89hAEQYBB3hR5GsZZIIWA9ProsiHpnH6hHQTwRmy/H+2QBI
hg1XyYAh7eiQoOjvMUT6at2SfCRofPbSoLBVo3JBlwlbSCBEy+WpQZjLXWdD+Bqy
Cj1H6PWlCV8p0W7T1YsMfltOM7Kh55tew+1+XTpkEJ2PHgz+Ht1mh4h59nEvG+s/
BdJ2hos9UW/O7OPdpa6pQUGf8/tZRSacmAXL8Z6W+kUzzsDTnYFN2S8cuJ7CeZCY
0+azrYptuSvs+1xAe221axu0GlVPltSC3eylI4pWECpFfh3f/ARK
=iow6
-----END PGP SIGNATURE-----
Merge tag 'v4.16-rockchip-dts64fixes-2' of ssh://gitolite.kernel.org/pub/scm/linux/kernel/git/mmind/linux-rockchip into fixes
Pull "Rockchip dts64 fixes for 4.16" from Heiko Stübner:
Pinctrl got a fix in 4.16-rc1, that exposed an issue with wifi-related
pinctrl hogs on rk3399-gru-kevin that broke suspend. This gets fixed
by moving the wifi pinctrl to the correct node.
Also revert the usb3 phy-port enablement, as a missing feature in the
type-c phy breaks usb on all non-gru rk3399 boards.
* tag 'v4.16-rockchip-dts64fixes-2' of ssh://gitolite.kernel.org/pub/scm/linux/kernel/git/mmind/linux-rockchip:
Revert "arm64: dts: rockchip: add usb3-phy otg-port support for rk3399"
arm64: dts: rockchip: Fix rk3399-gru-* s2r (pinctrl hogs, wifi reset)
- Enable deadline IO scheduler to improve the performnace for some
usecaese without changing the default IO scheduler
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1
iQIcBAABAgAGBQJaoqFIAAoJEAvIV27ZiWZcK1oP/3lBn9Yex3n9zrRfv/3KWZxX
Spri2eoAHAGXVnjgQOG6Hr3Xd8xss3UovRecJ5Vbla3/5S/wVGfA4dxdmBxl8x6g
mwUFaBhLS2ir2/apnLdppgfNpzOftHBqHDIjjxFMeWxLBD/00vvOAZNMdEzDZldB
g7fHotVsrZvVQWUe8luKRXrc7+I7XHS+JOZcqOk+4iAjz4YIwghbe1jlBHmTjN3q
J3yb8Pz4PfNMpnst4bW6zKS7Cw++yzyUZiZB+0A6kajzGR5e2pdcJI3tEHfb0Py9
8ThWH49uIOR/BHCGy5g6UAZ7amvZt0A4tA38QCb0gbXtV1ebr3Coe3Mo9Dm3nhPf
bBPX4OY114JRlsAH0dXWxtHI+pBh5xPbjMe7urD6bNP1ZxpG0LLRcb1XsGKqBPoB
zjdtMWb6T4jKj3n9W4SqT3a/1rwG6Ki0ACBRo9GVZEI3pWop0sCTyilLIpkbwqCl
9aU4QmdfsqKz8oxy63wqdjMk/ep8ybKSOSXXaGr/jUmwZ25kpkhb1K9AUiMZWU2k
cguV+zq4bb8L9L/qaLIQesMqwp76fvAIFP6qJ2lEtH+NuWUx6FtGkVV+0dSjGeCT
PZ/FkmvlJ5dBPZqM8dOf4zHh6k1tXn/Gyau9KnQYp5MJCHlDdHhLooS2AvfXJjzh
5hfhdS27JQc3IcURri6n
=BscW
-----END PGP SIGNATURE-----
Merge tag 'hisi-defconfig-for-4.17' of git://github.com/hisilicon/linux-hisi into next/soc
Pull "ARM64: hisilicon: defconfig updates for 4.17" from Wei Xu:
- Enable deadline IO scheduler to improve the performnace for some
usecaese without changing the default IO scheduler
* tag 'hisi-defconfig-for-4.17' of git://github.com/hisilicon/linux-hisi:
arm64: defconfig: enable IOSCHED_DEADLINE
- convert to the SPDX-License-Identifier
- add missing clocks (for the registers) on some of the peripherals
- use the new nand driver
- add more uart for Armada 7K/8K SoCs
-----BEGIN PGP SIGNATURE-----
iF0EABECAB0WIQQYqXDMF3cvSLY+g9cLBhiOFHI71QUCWqLBBgAKCRALBhiOFHI7
1dQ0AKCOUzXMHkC5/f0hwgPYpmzKuQGZjACfQA6eVistsl3QfTN0Nph57jRW9IE=
=KBPW
-----END PGP SIGNATURE-----
Merge tag 'mvebu-dt64-4.17-1' of git://git.infradead.org/linux-mvebu into next/dt
Pull "mvebu dt64 for 4.17 (part 1)" from Gregory CLEMENT:
- convert to the SPDX-License-Identifier
- add missing clocks (for the registers) on some of the peripherals
- use the new nand driver
- add more uart for Armada 7K/8K SoCs
* tag 'mvebu-dt64-4.17-1' of git://git.infradead.org/linux-mvebu:
ARM64: dts: marvell: armada-cp110: Add apb_pclk clock for the uart nodes
arm64: dts: marvell: use reworked NAND controller driver on Armada 8K
arm64: dts: marvell: use reworked NAND controller driver on Armada 7K
ARM64: dts: marvell: armada-cp110: Add registers clock for sata node
arm64: dts: marvell: armada-8080-db: use SPDX-License-Identifier
arm64: dts: marvell: armada-8040-mcbin: use SPDX-License-Identifier
arm64: dts: marvell: armada-8040-db: use SPDX-License-Identifier
arm64: dts: marvell: armada-7040-db: use SPDX-License-Identifier
arm64: dts: marvell: armada-3720-espressobin: use SPDX-License-Identifier
arm64: dts: marvell: armada-3720-db: use SPDX-License-Identifier
arm64: dts: marvell: use SPDX-License-Identifier for Armada SoCs
arm64: dts: marvell: mcbin: fix board name typo
arm64: dts: marvell: mcbin: enable uart headers
arm64: dts: marvell: add CP110 uart peripherals
ARM64: dts: marvell: armada-cp110: Add registers clock for I2C nodes
ARM64: dts: marvell: armada-cp110: Add registers clock for SPI nodes
- Add XGE CPLD control support for hip07 SoC
- Disable the SMMU on hip06 and hip07 SoCs becuase of
the hardware limitation
- Enable HS200 mode for the MMC controller on hi6220 hikey board
- Remove "cooling-{min|max}-level" this kind unused properties
for hi6220 SoC
- Add watchdog node for hi6220 SoC
- Remove "CPU_NAP" idle state on hikey960 board since it is
not stable and useless with the updated firmware
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1
iQIcBAABAgAGBQJaoqayAAoJEAvIV27ZiWZcCqkP/0vCtll6jMLSNvFFFjdgspXP
RqzjVCHs0mCdLiMenFFP+uNaeK8h3ctIf/JeO57/zE7lTBvcDklmoQAkVhY3Ctsr
pq/Nv/neDZjZu0OnWspVXMEo8BSBK/NVxzntKh+PHmfP26McEpOejGz7r4E4tbib
TNxZFfWqj7Luz7LZJ/uhKfT5HOPz/mVVIf9ean2chyFd7+mKakNCWAj+UYf8a50P
o/PGbL+UEqgUPy8lmFKAoB5fCI85uDMcL4Pl0DgeOmA7EC68QQ+4mDsdUt0PK3W2
DcaoGEfbXlsxOsByq0xkuCQxLDcfUFhBAYCKP4UoKyWtmMwTGA5o2H5QQ54iEpoH
SoBwbLq6LdPyDV5O7SfcPTS6Ii4daKF7R55EqOzqhu544PpEd3hyv4wy45Tcfiky
VEU0g5Wl/C5ow4scYlllRK9K6DSAvJ71pDb3OEadZYp9ILe2WxHePHLS8l1rZRqo
i1UdFxyJAUupKEXIMVtBEuirjkDkS55K6LgBBSQQxst6yyEI+NPzB32WQ0UGK+de
hA5b8qMojB2kzHshSBSXvoIquQAfPYlLU4BbupP0jTCZR6N7Ss+0WDsJ0x8Rt3KY
quXR8UmiavN9uU8bBA9UYrlPzHeu/EA2A19pU+NOAElQzZCU9tG7rrr5TaS6ZeC3
6Vh0sPpWM2ettFbbSwq1
=HIMw
-----END PGP SIGNATURE-----
Merge tag 'hisi-arm64-dt-for-4.17' of git://github.com/hisilicon/linux-hisi into next/dt
Pull "ARM64: DT: Hisilicon SoC DT updates for 4.17" from Wei Xu:
- Add XGE CPLD control support for hip07 SoC
- Disable the SMMU on hip06 and hip07 SoCs becuase of
the hardware limitation
- Enable HS200 mode for the MMC controller on hi6220 hikey board
- Remove "cooling-{min|max}-level" this kind unused properties
for hi6220 SoC
- Add watchdog node for hi6220 SoC
- Remove "CPU_NAP" idle state on hikey960 board since it is
not stable and useless with the updated firmware
* tag 'hisi-arm64-dt-for-4.17' of git://github.com/hisilicon/linux-hisi:
arm64: dts: Hi3660: Remove 'CPU_NAP' idle state
arm64: dts: hi6220: enable watchdog
ARM64: dts: hi6220: Remove "cooling-{min|max}-level" for CPU nodes
arm64: dts: hikey: Enable HS200 mode on eMMC
arm64: dts: hisi: Disable hisilicon smmu node on hip06/hip07
arm64: dts: hisi: add hns-dsaf cpld control for the hip07 SoC
The ACLK_VIO is a parent clock used by a several children,
its suggested clock rate is 400MHz. Right now it gets 400MHz
because it sources from CPLL(800M) and divides by 2 after reset.
It's good not to rely on default values like this, so let's
explicitly set it.
NOTE: it's expected that at least one board may override cru node and
set the CPLL to 1.6 GHz. On that board it will be very important to be
explicit about aclk-vio being 400 MHz.
Signed-off-by: Shunqian Zheng <zhengsq@rock-chips.com>
Reviewed-by: Douglas Anderson <dianders@chromium.org>
Signed-off-by: Heiko Stuebner <heiko@sntech.de>
add mmc device nodes and proper setup for used pins
Signed-off-by: Sean Wang <sean.wang@mediatek.com>
Signed-off-by: Jimin Wang <jimin.wang@mediatek.com>
Signed-off-by: Matthias Brugger <matthias.bgg@gmail.com>
This patch adds SATA support fot MT7622.
Signed-off-by: Ryder Lee <ryder.lee@mediatek.com>
Signed-off-by: Sean Wang <sean.wang@mediatek.com>
Signed-off-by: Matthias Brugger <matthias.bgg@gmail.com>
This patch adds PCIe support for MT7622.
Signed-off-by: Ryder Lee <ryder.lee@mediatek.com>
Signed-off-by: Sean Wang <sean.wang@mediatek.com>
[mb: fix type in commit message]
Signed-off-by: Matthias Brugger <matthias.bgg@gmail.com>
add ethernet device nodes which enable GMAC1 with SGMII interface
Signed-off-by: Sean Wang <sean.wang@mediatek.com>
Signed-off-by: Matthias Brugger <matthias.bgg@gmail.com>
add nodes for NOR flash, parallel Nand flash with error correction code
support.
Signed-off-by: Sean Wang <sean.wang@mediatek.com>
Cc: RogerCC Lin <rogercc.lin@mediatek.com>
Cc: Guochun Mao <guochun.mao@mediatek.com>
Signed-off-by: Matthias Brugger <matthias.bgg@gmail.com>
This patch also cleans up two oscillators that provide clocks for MT7623.
Switch the uart clocks to the real ones while at it.
Signed-off-by: Sean Wang <sean.wang@mediatek.com>
Cc: Matthias Brugger <matthias.bgg@gmail.com>
Signed-off-by: Matthias Brugger <matthias.bgg@gmail.com>
Add clocks, regulators and opp information into cpu nodes.
In addition, the power supply for cpu nodes is deployed on
mt7622-rfb1 board.
Signed-off-by: Sean Wang <sean.wang@mediatek.com>
Cc: Viresh Kumar <viresh.kumar@linaro.org>
Acked-by: Viresh Kumar <viresh.kumar@linaro.org>
Signed-off-by: Matthias Brugger <matthias.bgg@gmail.com>
Enable pwrap and MT6380 on mt7622-rfb1 board. Also add all mt6380
regulator nodes in an alone file to allow similar boards using MT6380
able to resue the configuration.
Signed-off-by: Sean Wang <sean.wang@mediatek.com>
Cc: Mark Brown <broonie@kernel.org>
Cc: Matthias Brugger <matthias.bgg@gmail.com>
Cc: Philippe Ombredanne <pombredanne@nexb.com>
Acked-by: Philippe Ombredanne <pombredanne@nexb.com>
[mb: add missing space]
Signed-off-by: Matthias Brugger <matthias.bgg@gmail.com>
add pinctrl device nodes and rfb1 board, additionally include all pin
groups possible being used on rfb1 board and available gpio keys.
Signed-off-by: Sean Wang <sean.wang@mediatek.com>
Cc: Matthias Brugger <matthias.bgg@gmail.com>
Signed-off-by: Matthias Brugger <matthias.bgg@gmail.com>
Add clock controller nodes for MT7622 and include header for topckgen,
infracfg, pericfg, apmixedsys, ethsys, sgmiisys, pciesys and ssusbsys
for those devices nodes to be added afterwards.
In addition, provides an oscillator node for the source of PLLs and dummy
clock for PWARP to complement missing support of clock gate for the
wrapper circuit in the driver.
Signed-off-by: Sean Wang <sean.wang@mediatek.com>
Cc: Stephen Boyd <sboyd@codeaurora.org>
Signed-off-by: Matthias Brugger <matthias.bgg@gmail.com>
- The SMCCC firmware interface for the spectre variant 2 mitigation has
been updated to allow the discovery of whether the CPU needs the
workaround. This pull request relaxes the kernel check on the return
value from firmware.
- Fix the commit allowing changing from global to non-global page table
entries which inadvertently disallowed other safe attribute changes.
- Fix sleeping in atomic during the arm_perf_teardown_cpu() code.
-----BEGIN PGP SIGNATURE-----
iQIzBAABCAAdFiEE5RElWfyWxS+3PLO2a9axLQDIXvEFAlqi040ACgkQa9axLQDI
XvFnJQ//YTCYifVu7pBY50czqDjBZ8BONQJFtMCsz/id4fBeELrciN5jNklWXA/y
yYg+9Rb4UAEomqCRJWRU6MdIx52UagWlJ2Cn0G5q48uMdY9YFCJ4V8M6IFikvSUp
o0p6Ldhee4r2yv6iBs125c7vIW/4c3nrTb03nsEJrjesKjcW1JSrzuJ0Py+x6ZIP
AMuZocGlUOZ3NlKTPTQqY//fFCBp/hjvYzgUmPpcSZE/3E5pLHoxAIkkLMsaXaLH
eWAbT9/E3NfQoBX2xisp7fyfd5nXZZ5IfEFJC90Dtl+yMb4I3DPgmBXclGFC8Rxd
YOyabVAx9vpyBPGa9h4EtwMSRmiNwLwKxfCcXii8gAV7lPDqOyzduQTeepNCv6iY
ioPHnx3mEEpfEF8TCV0lXzcsPdQnkfQcciJGxoz31KQe3TIp1keGASfwbn/Q575S
i8/pHg9PS1r18tQIrrm/0lnBvkiyBFiKxPgOaWk4GXFYNh34GS9+xnTOsTuGOgGg
vjQ0gRIkseqOeVuZSwD6kkj0f70NsjreTOaXF8eCA4cpGIia+cGUAOPR1SKTF3o6
XkDjCRpde0KZoon95qye0+mVVJHOPgLs5VXFEngF7HCbI6spXxMSKuKoRYUbXZQj
ddXQeaPY0wisMWmerDM9jkbhaprNsKp7b9CGmZKWAYXaa6+Y93w=
=jVvu
-----END PGP SIGNATURE-----
Merge tag 'arm64-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux
Pull arm64 fixes from Catalin Marinas:
- The SMCCC firmware interface for the spectre variant 2 mitigation has
been updated to allow the discovery of whether the CPU needs the
workaround. This pull request relaxes the kernel check on the return
value from firmware.
- Fix the commit allowing changing from global to non-global page table
entries which inadvertently disallowed other safe attribute changes.
- Fix sleeping in atomic during the arm_perf_teardown_cpu() code.
* tag 'arm64-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux:
arm64: Relax ARM_SMCCC_ARCH_WORKAROUND_1 discovery
arm_pmu: Use disable_irq_nosync when disabling SPI in CPU teardown hook
arm64: mm: fix thinko in non-global page table attribute check
A recent update to the ARM SMCCC ARCH_WORKAROUND_1 specification
allows firmware to return a non zero, positive value to describe
that although the mitigation is implemented at the higher exception
level, the CPU on which the call is made is not affected.
Let's relax the check on the return value from ARCH_WORKAROUND_1
so that we only error out if the returned value is negative.
Fixes: b092201e00 ("arm64: Add ARM_SMCCC_ARCH_WORKAROUND_1 BP hardening support")
Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
This extra clock is needed to access the registers of the UARTs used on
CP110 component of the Armada 7K/8K SoCs.
Signed-off-by: Gregory CLEMENT <gregory.clement@bootlin.com>
Currently, as reported by Eric, an invalid si_code value 0 is
passed in many signals delivered to userspace in response to faults
and other kernel errors. Typically 0 is passed when the fault is
insufficiently diagnosable or when there does not appear to be any
sensible alternative value to choose.
This appears to violate POSIX, and is intuitively wrong for at
least two reasons arising from the fact that 0 == SI_USER:
1) si_code is a union selector, and SI_USER (and si_code <= 0 in
general) implies the existence of a different set of fields
(siginfo._kill) from that which exists for a fault signal
(siginfo._sigfault). However, the code raising the signal
typically writes only the _sigfault fields, and the _kill
fields make no sense in this case.
Thus when userspace sees si_code == 0 (SI_USER) it may
legitimately inspect fields in the inactive union member _kill
and obtain garbage as a result.
There appears to be software in the wild relying on this,
albeit generally only for printing diagnostic messages.
2) Software that wants to be robust against spurious signals may
discard signals where si_code == SI_USER (or <= 0), or may
filter such signals based on the si_uid and si_pid fields of
siginfo._sigkill. In the case of fault signals, this means
that important (and usually fatal) error conditions may be
silently ignored.
In practice, many of the faults for which arm64 passes si_code == 0
are undiagnosable conditions such as exceptions with syndrome
values in ESR_ELx to which the architecture does not yet assign any
meaning, or conditions indicative of a bug or error in the kernel
or system and thus that are unrecoverable and should never occur in
normal operation.
The approach taken in this patch is to translate all such
undiagnosable or "impossible" synchronous fault conditions to
SIGKILL, since these are at least probably localisable to a single
process. Some of these conditions should really result in a kernel
panic, but due to the lack of diagnostic information it is
difficult to be certain: this patch does not add any calls to
panic(), but this could change later if justified.
Although si_code will not reach userspace in the case of SIGKILL,
it is still desirable to pass a nonzero value so that the common
siginfo handling code can detect incorrect use of si_code == 0
without false positives. In this case the si_code dependent
siginfo fields will not be correctly initialised, but since they
are not passed to userspace I deem this not to matter.
A few faults can reasonably occur in realistic userspace scenarios,
and _should_ raise a regular, handleable (but perhaps not
ignorable/blockable) signal: for these, this patch attempts to
choose a suitable standard si_code value for the raised signal in
each case instead of 0.
arm64 was the only arch to define a BUS_FIXME code, so after this
patch nobody defines it. This patch therefore also removes the
relevant code from siginfo_layout().
Cc: James Morse <james.morse@arm.com>
Reported-by: Eric W. Biederman <ebiederm@xmission.com>
Signed-off-by: Dave Martin <Dave.Martin@arm.com>
Signed-off-by: Will Deacon <will.deacon@arm.com>
The DCache clean & ICache invalidation requirements for instructions
to be data coherence are discoverable through new fields in CTR_EL0.
The following two control bits DIC and IDC were defined for this
purpose. No need to perform point of unification cache maintenance
operations from software on systems where CPU caches are transparent.
This patch optimize the three functions __flush_cache_user_range(),
clean_dcache_area_pou() and invalidate_icache_range() if the hardware
reports CTR_EL0.IDC and/or CTR_EL0.IDC. Basically it skips the two
instructions 'DC CVAU' and 'IC IVAU', and the associated loop logic
in order to avoid the unnecessary overhead.
CTR_EL0.DIC: Instruction cache invalidation requirements for
instruction to data coherence. The meaning of this bit[29].
0: Instruction cache invalidation to the point of unification
is required for instruction to data coherence.
1: Instruction cache cleaning to the point of unification is
not required for instruction to data coherence.
CTR_EL0.IDC: Data cache clean requirements for instruction to data
coherence. The meaning of this bit[28].
0: Data cache clean to the point of unification is required for
instruction to data coherence, unless CLIDR_EL1.LoC == 0b000
or (CLIDR_EL1.LoUIS == 0b000 && CLIDR_EL1.LoUU == 0b000).
1: Data cache clean to the point of unification is not required
for instruction to data coherence.
Co-authored-by: Philip Elcan <pelcan@codeaurora.org>
Reviewed-by: Mark Rutland <mark.rutland@arm.com>
Signed-off-by: Shanker Donthineni <shankerd@codeaurora.org>
Signed-off-by: Will Deacon <will.deacon@arm.com>
Omit patching of ADRP instruction at module load time if the current
CPUs are not susceptible to the erratum.
Signed-off-by: Ard Biesheuvel <ard.biesheuvel@linaro.org>
[will: Drop duplicate initialisation of .def_scope field]
Signed-off-by: Will Deacon <will.deacon@arm.com>
In some cases, core variants that are affected by a certain erratum
also exist in versions that have the erratum fixed, and this fact is
recorded in a dedicated bit in system register REVIDR_EL1.
Since the architecture does not require that a certain bit retains
its meaning across different variants of the same model, each such
REVIDR bit is tightly coupled to a certain revision/variant value,
and so we need a list of revidr_mask/midr pairs to carry this
information.
So add the struct member and the associated macros and handling to
allow REVIDR fixes to be taken into account.
Signed-off-by: Ard Biesheuvel <ard.biesheuvel@linaro.org>
Signed-off-by: Will Deacon <will.deacon@arm.com>
Working around Cortex-A53 erratum #843419 involves special handling of
ADRP instructions that end up in the last two instruction slots of a
4k page, or whose output register gets overwritten without having been
read. (Note that the latter instruction sequence is never emitted by
a properly functioning compiler, which is why it is disregarded by the
handling of the same erratum in the bfd.ld linker which we rely on for
the core kernel)
Normally, this gets taken care of by the linker, which can spot such
sequences at final link time, and insert a veneer if the ADRP ends up
at a vulnerable offset. However, linux kernel modules are partially
linked ELF objects, and so there is no 'final link time' other than the
runtime loading of the module, at which time all the static relocations
are resolved.
For this reason, we have implemented the #843419 workaround for modules
by avoiding ADRP instructions altogether, by using the large C model,
and by passing -mpc-relative-literal-loads to recent versions of GCC
that may emit adrp/ldr pairs to perform literal loads. However, this
workaround forces us to keep literal data mixed with the instructions
in the executable .text segment, and literal data may inadvertently
turn into an exploitable speculative gadget depending on the relative
offsets of arbitrary symbols.
So let's reimplement this workaround in a way that allows us to switch
back to the small C model, and to drop the -mpc-relative-literal-loads
GCC switch, by patching affected ADRP instructions at runtime:
- ADRP instructions that do not appear at 4k relative offset 0xff8 or
0xffc are ignored
- ADRP instructions that are within 1 MB of their target symbol are
converted into ADR instructions
- remaining ADRP instructions are redirected via a veneer that performs
the load using an unaffected movn/movk sequence.
Signed-off-by: Ard Biesheuvel <ard.biesheuvel@linaro.org>
[will: tidied up ADRP -> ADR instruction patching.]
[will: use ULL suffix for 64-bit immediate]
Signed-off-by: Will Deacon <will.deacon@arm.com>
Enable Tegra BPMP thermal sensor support by default, built as a module.
Signed-off-by: Mikko Perttunen <mperttunen@nvidia.com>
Signed-off-by: Thierry Reding <treding@nvidia.com>
Enable Tegra186 CPU frequency scaling support by default.
Signed-off-by: Mikko Perttunen <mperttunen@nvidia.com>
Signed-off-by: Thierry Reding <treding@nvidia.com>
Whether or not we will ever decide to start using x18 as a platform
register in Linux is uncertain, but by that time, we will need to
ensure that UEFI runtime services calls don't corrupt it.
So let's start issuing warnings now for this, and increase the
likelihood that these firmware images have all been replaced by that time.
This has been fixed on the EDK2 side in commit:
6d73863b5464 ("BaseTools/tools_def AARCH64: mark register x18 as reserved")
dated July 13, 2017.
Signed-off-by: Ard Biesheuvel <ard.biesheuvel@linaro.org>
Acked-by: Will Deacon <will.deacon@arm.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Matt Fleming <matt@codeblueprint.co.uk>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: linux-efi@vger.kernel.org
Link: http://lkml.kernel.org/r/20180308080020.22828-6-ard.biesheuvel@linaro.org
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Enable the various CPUFREQ governors and statistics to ease development.
Don't change the default governor - performance governor.
Signed-off-by: Amit Kucheria <amit.kucheria@linaro.org>
Signed-off-by: Andy Gross <andy.gross@linaro.org>
Enable the driver for the TSENS IP that is present across several QCOM
SoCs.
Signed-off-by: Amit Kucheria <amit.kucheria@linaro.org>
Signed-off-by: Andy Gross <andy.gross@linaro.org>
The APCS block is present on several Qualcomm SoCs e.g. 8916, 8996. On the
8916 it is needed to enable the clock controller that in turn enables
cpufreq on the platform while on the 8996 it is needed for communication
with RPM.
Signed-off-by: Amit Kucheria <amit.kucheria@linaro.org>
Acked-by: Bjorn Andersson <bjorn.andersson@linaro.org>
Signed-off-by: Andy Gross <andy.gross@linaro.org>
Set correct clocks and interrupt values.
Fixes the incorrect SPI master configuration. This is
mandatory to make the SPI5 interface functional.
Signed-off-by: Ilia Lin <ilialin@codeaurora.org>
Signed-off-by: Andy Gross <andy.gross@linaro.org>
Add cpu cooling maps for cpu passive trip points. The cpu cooling
device states are mapped to cpufreq based scaling frequencies.
Signed-off-by: Rajendra Nayak <rnayak@codeaurora.org>
Signed-off-by: Amit Kucheria <amit.kucheria@linaro.org>
Reviewed-by: Viresh Kumar <viresh.kumar@linaro.org>
Signed-off-by: Andy Gross <andy.gross@linaro.org>
A 2MB shared memory region is used on MSM8996 for exchanging sector data
in rmtfs. Add this chunk of reserved memory now that we have the
rmtfs-mem compatible to describe it and its memory protection
properties.
Signed-off-by: Bjorn Andersson <bjorn.andersson@linaro.org>
Signed-off-by: Andy Gross <andy.gross@linaro.org>
Add a CPU OPP table to allow CPU frequency scaling on msm8916 platforms.
Signed-off-by: Georgi Djakov <georgi.djakov@linaro.org>
Reviewed-by: Amit Kucheria <amit.kucheria@linaro.org>
Tested-by: Amit Kucheria <amit.kucheria@linaro.org>
Signed-off-by: Andy Gross <andy.gross@linaro.org>
There are clock controller registers in the APCS block, which purpose
is to control the main CPU mux and divider. Add the clock properties as
part of the APCS device-tree node.
Signed-off-by: Georgi Djakov <georgi.djakov@linaro.org>
Signed-off-by: Andy Gross <andy.gross@linaro.org>
The APCS block was exposed until now as a syscon, but now we have a
proper driver for this block. Add the compatible string of the new
driver to probe and register the mailbox functionality.
Signed-off-by: Georgi Djakov <georgi.djakov@linaro.org>
Signed-off-by: Andy Gross <andy.gross@linaro.org>
Add a device tree node for the A53 PLL, which exists on msm8916
platforms.
Signed-off-by: Georgi Djakov <georgi.djakov@linaro.org>
Signed-off-by: Andy Gross <andy.gross@linaro.org>
GICv3 does not have affinity bitmap in the binding for PPI
interrupts. It can be specified using a 4th cell if needed
as documented in the bindings. Clean up the wrong use of the
affinity bitmap using the GIC_CPU_MASK_SIMPLE() macro
Reported-by: Stephen Boyd <sboyd@codeaurora.org>
Signed-off-by: Rajendra Nayak <rnayak@codeaurora.org>
Signed-off-by: Andy Gross <andy.gross@linaro.org>
Enable NVIDIA Tegra194 support in the default 64-bit ARM configuration.
Signed-off-by: Mikko Perttunen <mperttunen@nvidia.com>
Signed-off-by: Thierry Reding <treding@nvidia.com>
We currently have to rely on the GCC large code model for KASLR for
two distinct but related reasons:
- if we enable full randomization, modules will be loaded very far away
from the core kernel, where they are out of range for ADRP instructions,
- even without full randomization, the fact that the 128 MB module region
is now no longer fully reserved for kernel modules means that there is
a very low likelihood that the normal bottom-up allocation of other
vmalloc regions may collide, and use up the range for other things.
Large model code is suboptimal, given that each symbol reference involves
a literal load that goes through the D-cache, reducing cache utilization.
But more importantly, literals are not instructions but part of .text
nonetheless, and hence mapped with executable permissions.
So let's get rid of our dependency on the large model for KASLR, by:
- reducing the full randomization range to 4 GB, thereby ensuring that
ADRP references between modules and the kernel are always in range,
- reduce the spillover range to 4 GB as well, so that we fallback to a
region that is still guaranteed to be in range
- move the randomization window of the core kernel to the middle of the
VMALLOC space
Note that KASAN always uses the module region outside of the vmalloc space,
so keep the kernel close to that if KASAN is enabled.
Signed-off-by: Ard Biesheuvel <ard.biesheuvel@linaro.org>
Signed-off-by: Will Deacon <will.deacon@arm.com>
When PLTs are emitted at relocation time, we really should not exceed
the number that we counted when parsing the relocation tables, and so
currently, we BUG() on this condition. However, even though this is a
clear bug in this particular piece of code, we can easily recover by
failing to load the module.
So instead, return 0 from module_emit_plt_entry() if this condition
occurs, which is not a valid kernel address, and can hence serve as
a flag value that makes the relocation routine bail out.
Signed-off-by: Ard Biesheuvel <ard.biesheuvel@linaro.org>
Signed-off-by: Will Deacon <will.deacon@arm.com>
Add device tree files for the Tegra194 P2972-0000 development board.
The board consists of the P2888 compute module and the P2822 baseboard.
Signed-off-by: Mikko Perttunen <mperttunen@nvidia.com>
Signed-off-by: Thierry Reding <treding@nvidia.com>
Add the chip-level device tree, including binding headers, for the
NVIDIA Tegra194 "Xavier" system-on-chip. Only a small subset of devices
are initially available, enough to boot to UART console.
Signed-off-by: Mikko Perttunen <mperttunen@nvidia.com>
Reviewed-by: Rob Herring <robh@kernel.org>
Signed-off-by: Thierry Reding <treding@nvidia.com>
Xilinx zc1751 boards is used for silicon validation. Board can be
extended with 5 FMCs/DCs cards to connect various IPs. Describe all
these combinations.
Signed-off-by: Michal Simek <michal.simek@xilinx.com>
Reviewed-by: Rob Herring <robh@kernel.org>
These 3 boards requires minimal support to get Linux up and running.
Signed-off-by: Michal Simek <michal.simek@xilinx.com>
Reviewed-by: Rob Herring <robh@kernel.org>
Xilinx zcu111 is a customer board. It is reusing some parts from zcu102.
Signed-off-by: Michal Simek <michal.simek@xilinx.com>
Reviewed-by: Rob Herring <robh@kernel.org>
Xilinx zcu106 is a customer board. It is reusing some parts from zcu102.
Signed-off-by: Michal Simek <michal.simek@xilinx.com>
Reviewed-by: Rob Herring <robh@kernel.org>
Xilinx zcu104 is another customer board. It is sort of zcu102 clone
with some differences.
Signed-off-by: Michal Simek <michal.simek@xilinx.com>
Reviewed-by: Rob Herring <robh@kernel.org>
This patch is adding revA, revB and rev1.0. There are also other
revisions between which should be backward compatible with previous
versions. Unfortunately all revs are still in use.
Signed-off-by: Michal Simek <michal.simek@xilinx.com>
Reviewed-by: Rob Herring <robh@kernel.org>
This board has 2GB of memory, i2c, sd, wifi sdio, spis, uarts, display
port and usbs.
Board is using fixed clocks because clock driver hasn't been merged yet.
Signed-off-by: Michal Simek <michal.simek@xilinx.com>
Reviewed-by: Rob Herring <robh@kernel.org>
This patch add 8-bit bus width property to eMMC node.
Signed-off-by: P L Sai Krishna <lakshmis@xilinx.com>
Signed-off-by: Michal Simek <michal.simek@xilinx.com>
This patch adds the sata port phy OOB timing values in the sata
device-tree node.
Signed-off-by: Anurag Kumar Vulisha <anuragku@xilinx.com>
Signed-off-by: Michal Simek <michal.simek@xilinx.com>
This patch adds a specific wetek dtsi to handle the specific Hub and Play2
boards by no more depending on the p20x dtsi.
This simplifies the hub and play2 dts and will avoid breaking these
boards when adding p200 and p201 specific changes.
Signed-off-by: Neil Armstrong <narmstrong@baylibre.com>
Signed-off-by: Kevin Hilman <khilman@baylibre.com>
Different modules maybe installed by the user on the eMMC connector
of the odroid-c2. While the red modules are working without an issue,
it seems some black modules (apparently Samsung based) are having
issue at 200MHz
While the tuning algorithm introduced in v4.14 enables high speed modes
on every other tested designs, it seems a problem remains for this
particular combination of board and eMMC module.
Lowering the maximum frequency of the eMMC on this board until we can
figure out a better solution.
Fixes: d341ca88ee ("mmc: meson-gx: rework tuning function")
Suggested-by: Ellie Reeves <ellierevves@gmail.com>
Signed-off-by: Jerome Brunet <jbrunet@baylibre.com>
Cc: stable@vger.kernel.org
Signed-off-by: Kevin Hilman <khilman@baylibre.com>
Move the SPDX-License-Identifier lines to the top and drop the
license splat.
Signed-off-by: Neil Armstrong <narmstrong@baylibre.com>
Signed-off-by: Kevin Hilman <khilman@baylibre.com>
The compatible in pwm_AO_cd is wrong and does not match anything.
Correct this with the correct compatible string
Fixes: 4a81e5ddfb ("ARM64: dts: meson-axg: add PWM DT info for Meson-Axg SoC")
Signed-off-by: Jerome Brunet <jbrunet@baylibre.com>
Reviewed-by: Neil Armstrong <narmstrong@baylibre.com>
Signed-off-by: Kevin Hilman <khilman@baylibre.com>
add the secure AO system controller with chipid enabled
Signed-off-by: Jerome Brunet <jbrunet@baylibre.com>
Reviewed-by: Neil Armstrong <narmstrong@baylibre.com>
Signed-off-by: Kevin Hilman <khilman@baylibre.com>
1. Add support for HDMI audio on Exynos 5433 TM2/TM2E boards.
2. Add support for USB-MHL connector on Exynos 5433 TM2/TM2E boards.
-----BEGIN PGP SIGNATURE-----
iQItBAABCAAXBQJantDTEBxrcnprQGtlcm5lbC5vcmcACgkQwTdm5oaLg9fa+BAA
kYIg2JwUTzzwB5ObHHuTY8vBoeeON6Yq9G/jh8ik3kjCJtZPnvNjxSi9BiT9bKOk
OGWZtoOoYbRciUj8VTaixvf6XsNq9N+0wsA33K6bKIBwhx6P8Tn76pqJxacnw81M
P1l74ImgFoYFGRI/+D7vD7k/Bhup5zYUL97WX+Agbhys7WCaQln91/+3cMIM3s/E
NRiB1ODBi81n60LFOsUKoCi7Us7Bgsi6+EJsbVO/yvJm5jPqwF1DA8hlIAuUxmhO
qaMc4sHsdzvZ3GGb39tYYTkbVcsf9QFwy0muY+GkPLa/P2Qy+EFT5gok0Ys7I9iM
GEylcfnRPboKAUWYJ3AqmX05KpVoRodg5tOwjZx7N5DvVtafdSyNVr5n7sKB4/Oq
eB6Hx9AgzywtkrSfpQD6BXvSTsdFiXG1qYb3SV5goI945PmSOpo1Ke2PrPMfVblg
qO0MvRaLLzHU8YoI0cJXysVTGjLakUHqVFJvQj+iH7xi9MqRP91vOS3FNx3oDUIu
Yp/lznGBA6m7M+ANuSxf2u0ZBQ1CfJeUM6KliRI5JTYqM78Aw4O9ng1AXr6BWFLe
GXKG5FbE7wVMWjAaTLwK0tuCJY+WVG4kVf29C/J9epUt4TosuyfW1Oob+KxyQcSk
7cdzyfsPvPwxa61Qgk2C+sAEtvcGUmyDotQHjqHA2j0=
=p0sr
-----END PGP SIGNATURE-----
Merge tag 'samsung-dt64-4.17' of ssh://gitolite.kernel.org/pub/scm/linux/kernel/git/krzk/linux into next/dt
Pull "Samsung DTS ARM64 changes for v4.17" from Krzysztof Kozłowski:
1. Add support for HDMI audio on Exynos 5433 TM2/TM2E boards.
2. Add support for USB-MHL connector on Exynos 5433 TM2/TM2E boards.
* tag 'samsung-dt64-4.17' of ssh://gitolite.kernel.org/pub/scm/linux/kernel/git/krzk/linux:
arm64: dts: exynos: add OF graph between MHL and USB connector
arm64: dts: exynos: add micro-USB connector node to TM2 platforms
ARM: dts: exynos: Add support for HDMI audio on Exynos 5433 TM2 board
ARM: dts: exynos: Update I2S0 device node in exynos5433
ARM: dts: exynos: Add I2S1 device node to exynos5433
'linux,stdout-path' has been deprecated for some time in favor of
'stdout-path'. Now dtc will warn on occurrences of 'linux,stdout-path'.
Search and replace the one occurrence with 'stdout-path'.
Signed-off-by: Rob Herring <robh@kernel.org>
Signed-off-by: Arnd Bergmann <arnd@arndb.de>