The ARM kernel supports writethrough data cache via the
CONFIG_CPU_DCACHE_WRITETHROUGH option. However, that
functionality wasn't implemented in the arch/arm/boot/compressed
code. It is now necessary due to a new ARM926EJS processor
that has an issue with writeback data cache.
Signed-off-by: Mark A. Greer <mgreer@mvista.com>
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
To be able to relocate the .bss section at run time independently from
the rest of the code, we must make sure that no GOTOFF relocations are
used with .bss symbols. This usually means that no global variables can
be marked static unless they're also const.
To enforce this, suffice to fail the build whenever a private symbol
is allocated to .bss and list those symbols for convenience.
The user_stack and user_stack_end labels in head.S were converted into
non exported symbols to remove false positives.
Signed-off-by: Nicolas Pitre <nicolas.pitre@linaro.org>
Tested-by: Tony Lindgren <tony@atomide.com>
In commit d239b1dc09 the hardcoded 4x estimate for the decompressed
kernel size was replaced by the exact Image file size and passed to
the linker as a symbol value. Turns out that this is unneeded as the
size is already included at the end of the compressed piggy data.
For those compressed formats that don't include this data, the build
system already takes care of appending it using size_append in
scripts/Makefile.lib. So let's use that instead.
Signed-off-by: Nicolas Pitre <nicolas.pitre@linaro.org>
Tested-by: Shawn Guo <shawn.guo@linaro.org>
Tested-by: Tony Lindgren <tony@atomide.com>
For correctness, the initial page table located right before the
decompressed kernel should be considered when determining if relocation
is required.
Signed-off-by: Nicolas Pitre <nicolas.pitre@linaro.org>
Tested-by: Shawn Guo <shawn.guo@linaro.org>
Acked-by: Tony Lindgren <tony@atomide.com>
If the zImage load address is slightly below the relocation address,
there is a risk for the copied data to overwrite the copy loop or
cache flush code that the relocation process requires. Always
bump the relocation address by the size of that code to avoid this
issue.
Noticed by Tony Lindgren <tony@atomide.com>.
While at it, let's start the copy from the restart symbol which makes
the above code size computation possible by the assembler directly
(same sections), given that we don't need to preserve the code before
that point anyway. And therefore we don't need to carry the _start
pointer in r5 anymore.
Signed-off-by: Nicolas Pitre <nicolas.pitre@linaro.org>
Tested-by: Tony Lindgren <tony@atomide.com>
Otherwise cache_clean_flush can overwrite some of the relocated
area depending on where the kernel image gets loaded. This fixes
booting on n900 after commit 6d7d0ae515
(ARM: 6750/1: improvements to compressed/head.S).
Thanks to Aaro Koskinen <aaro.koskinen@nokia.com> for debugging
the address of the relocated area that gets corrupted, and to
Nicolas Pitre <nicolas.pitre@linaro.org> for the other uncompress
related fixes.
Signed-off-by: Tony Lindgren <tony@atomide.com>
Signed-off-by: Nicolas Pitre <nicolas.pitre@linaro.org>
The Marvell PJ4 is ARMv7 capable, so we don't support it in
ARMv6 mode anymore.
Signed-off-by: Nicolas Pitre <nico@fluxnic.net>
Acked-by: Saeed Bishara <saeed.bishara@gmail.com>
Acked-by: Haojian Zhuang <haojian.zhuang@gmail.com>
The inline assembly differences for v6 vs. v7 are purely
optimizations. On a v7 processor, an mrc with the pc sets the
condition codes to the 28-31 bits of the register being read. It
just so happens that the TX/RX full bits the DCC support code is
testing for are high enough in the register to be put into the
condition codes. On a v6 processor, this "feature" isn't
implemented and thus we have to do the usual read, mask, test
operations to check for TX/RX full. Thus, we can drop the v7
implementation and just use the v6 implementation for both.
Cc: Tony Lindgren <tony@atomide.com>
Signed-off-by: Stephen Boyd <sboyd@codeaurora.org>
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
In the case of a conflict between the memory used by the compressed
kernel with its decompressor code and the memory used for the
decompressed kernel, we currently store the later after the former and
relocate it afterwards.
This would be more efficient to do this the other way around i.e.
relocate the compressed data up front instead, resulting in a smaller
copy. That also has the advantage of making the code smaller and more
straight forward.
Signed-off-by: Nicolas Pitre <nicolas.pitre@linaro.org>
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
Some installers would binary patch the kernel zImage to replace the
first few nops with custom instructions. This breaks the Thumb2 kernel
as the mode switch is right at the beginning. Let's move it towards the
end of the nop sequence instead.
Signed-off-by: Nicolas Pitre <nicolas.pitre@linaro.org>
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
Introduce a CPU_V6K configuration option for platforms to select if they
have a V6K CPU core. This allows us to identify whether we need to
support ARMv6 CPUs without the V6K SMP extensions at build time.
Currently CPU_V6K is just an alias for CPU_V6, and all places which
reference CPU_V6 are replaced by (CPU_V6 || CPU_V6K).
Select CPU_V6K from platforms which are known to be V6K-only.
Acked-by: Tony Lindgren <tony@atomide.com>
Tested-by: Sourav Poddar <sourav.poddar@ti.com>
Tested-by: Will Deacon <will.deacon@arm.com>
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
The code which makes up the zImage header intends to leave a
32-byte gap followed by a branch to the real entry point, a magic
number, and a word containing the absolute entry point address.
This gets messed up with with CONFIG_THUMB2_KERNEL, because the
size of the initial padding NOPs changes.
Instead, the header can be made fully compatible by restoring it to
ARM.
In the Thumb-2 case, we can replace the initial NOPs with a
sequence which switches to Thumb and jumps to the real entry point.
As a consequence, the zImage entry point is now always ARM, so no
special magic is needed any more for the uImage rules in the
Thumb-2 case.
Signed-off-by: Dave Martin <dave.martin@linaro.org>
Acked-by: Catalin Marinas <catalin.marinas@arm.com>
Acked-by: Nicolas Pitre <nicolas.pitre@linaro.org>
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
Some instruction operand combinations are used here which are nor
permitted in Thumb-2.
In particular, most uses of pc as an operand are disallowed in
Thumb-2, and deprecated in ARM from ARMv7 onwards.
The modified code introduced by this patch should be compatible
with all architecture versions >= v3, with or without
CONFIG_THUMB2_KERNEL.
Reviewed-by: Will Deacon <will.deacon@arm.com>
Signed-off-by: Dave Martin <dave.martin@linaro.org>
Acked-by: Catalin Marinas <catalin.marinas@arm.com>
Acked-by: Nicolas Pitre <nicolas.pitre@linaro.org>
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
The .stack section doesn't contain any contents, and doesn't require
initialization either. Rather than marking the output section with
'NOLOAD' but still having it exist in the object files, mark it with
%nobits which avoids the assembler marking the section with 'CONTENTS'.
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
Partially revert e69edc7, which introduced automatic zreladdr
support. The change in the way the manual definition is defined
seems to be error and conflict prone. Go back to the original way
we were handling this for the time being, while keeping the automatic
zreladdr facility.
Acked-by: Uwe Kleine-König <u.kleine-koenig@pengutronix.de
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
"ARM: Auto calculate ZRELADDR and provide option for exceptions" broke
the Thumb-2 decompressor because it removed an entry in the LC0 table
but didn't adjust the offset the Thumb-2 code uses to load the SP from
that table.
Fix it, and also change the ARM code to use the separate SP-load since
ARM instructions that include the SP in the LDM register list are
deprecated.
Acked-by: Uwe Kleine-König <u.kleine-koenig@pengutronix.de>
Signed-off-by: Rabin Vincent <rabin@rab.in>
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
As long as the zImage is placed within the 128MB range from the start of
memory, ZRELADDR (Address where the decompressed kernel will be placed,
usually == PHYS_OFFSET + TEXT_OFFSET) can be determined at run-time by
masking PC with 0xf80000000.
Running through all the Makefile.boot, all those zreladdr-y
addresses == 0x[0-f][08]00_0000 + TEXT_OFFSET can be determined at
run-time.
Option CONFIG_AUTO_ZRELADDR and CONFIG_ZRELADDR are introduced,
CONFIG_ZRELADDR _must_ be explicitly specified if:
- ((zreladdr-y - TEXT_OFFSET) & ~0xf8000000) != 0, which means
masking PC with 0xf8000000 will result in an incorrect address.
Currently this is only a problem on u300.
- or the assumption of the zImage being loaded by the bootloader within
the first 128MB of RAM is incorrect
- or when ZBOOT_ROM is used, where the above assumption is usually wrong.
[ukleinek: changed mask from 0xf0000000 to 0xf8000000 for mx1 and shark
+ some review fixes from the mailing list]
Original-Idea-and-Signed-off-by: Nicolas Pitre <nicolas.pitre@linaro.org>
Signed-off-by: Eric Miao <eric.miao@canonical.com>
Signed-off-by: Uwe Kleine-König <u.kleine-koenig@pengutronix.de>
The only reference in arch/arm/boot/compressed to PARAMS_PHYS is
params() in head.S, which can be directly converted to the exact
address as specified by arch/arm/mach-rpc/Makefile.boot.
Signed-off-by: Eric Miao <eric.miao@canonical.com>
Signed-off-by: Uwe Kleine-König <u.kleine-koenig@pengutronix.de>
This adds missing registers to the list of corrupted registers and
removes a wrong comment about r9 on entry
While at it the formatting of the comment to cache_off is changed to
resemble the other two.
Acked-by: Eric Miao <eric.miao@canonical.com>
Signed-off-by: Uwe Kleine-König <u.kleine-koenig@pengutronix.de>
Probably the register content for cache operations is "don't care" in
practice, but as r1 is explicitly zeroed, use that one.
Acked-by: Eric Miao <eric.miao@canonical.com>
Signed-off-by: Uwe Kleine-König <u.kleine-koenig@pengutronix.de>
__armv3_mpu_cache_on seems broken. As there is noone around who knows
about these machines just keep the code as is but point out the strange
things.
Signed-off-by: Uwe Kleine-König <u.kleine-koenig@pengutronix.de>
Update CPUID pattern of PXA9xx in head.S and fix the duplicate
entries for pxa935.
Signed-off-by: Haojian Zhuang <haojian.zhuang@marvell.com>
Signed-off-by: Eric Miao <eric.y.miao@gmail.com>
98e12b5a6e ("ARM: Fix decompressor's kernel size estimation for
ROM=y") broke the Thumb-2 decompressor because it added an entry in the
LC0 table but didn't adjust the offset the Thumb-2 code uses to load the
SP from that table. Fix it.
Cc: stable <stable@kernel.org>
Signed-off-by: Rabin Vincent <rabin@rab.in>
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
Commit 2552fc2 changed the way the decompressor decides if it is safe
to decompress the kernel directly to its final location. Unfortunately,
it took the top of the compressed data as being the stack pointer,
which it is for ROM=n cases. However, for ROM=y, the stack pointer
is not relevant, and results in the wrong answer.
Fix this by explicitly storing the end of the biggybacked data in the
decompressor, and use that to calculate the compressed image size.
CC: <stable@kernel.org>
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
Otherwise more complicated uart configuration won't be possible.
We can use r1 for tmp register for both head.S and debug.S.
NOTE: This patch depends on another patch to add the the tmp register
into all debug-macro.S files. That can be done with:
$ sed -i -e "s/addruart,rx|addruart, rx/addruart, rx, tmp/"
arch/arm/*/include/*/debug-macro.S
Signed-off-by: Tony Lindgren <tony@atomide.com>
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
Without this patch arch/arm/compressed/head.S defaults to generic
DCC code that does not work for v7.
For more information on the v7 DCC, see Cortex-A8 TRM
"12.11.1 Debug communications channel".
To use it with post 2.6.33-rc1 or later, you need to have:
CONFIG_DEBUG_LL=y
ONFIG_DEBUG_ICEDCC=y
CONFIG_EARLY_PRINTK=y
Earlier kernels need commit 93fd03a8c6
backported.
Tested on omap3430.
Signed-off-by: Tony Lindgren <tony@atomide.com>
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
The Marvell Dove (88AP510) is a high-performance, highly integrated,
low power SoC with high-end ARM-compatible processor (known as PJ4),
graphics processing unit, high-definition video decoding acceleration
hardware, and a broad range of peripherals.
Signed-off-by: Lennert Buytenhek <buytenh@marvell.com>
Signed-off-by: Saeed Bishara <saeed@marvell.com>
Signed-off-by: Nicolas Pitre <nico@marvell.com>
Since the Thumb-2 instructions can be 16-bit wide, data in the .text
sections may not be aligned to a 32-bit word and this leads to unaligned
exceptions. This patch does not affect the ARM code generation.
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
This patch supports the cache handling for some old Feroceon cores for
which the CPU ID is like 0x41159260. This is a complement to
commit ab6d15d506.
Signed-off-by: Joonyoung Shim <jy0922.shim@samsung.com>
Signed-off-by: Nicolas Pitre <nico@marvell.com>
Starting with ARMv6, the CPUs support the BE-8 variant of big-endian
(byte-invariant). This patch adds the core support:
- setting of the BE-8 mode via the CPSR.E register for both kernel and
user threads
- big-endian page table walking
- REV used to rotate instructions read from memory during fault
processing as they are still little-endian format
- Kconfig and Makefile support for BE-8. The --be8 option must be passed
to the final linking stage to convert the instructions to
little-endian
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
Adds support for Faraday FA526 core. This core is used at least by:
Cortina Systems Gemini and Centroid family
Cavium Networks ECONA family
Grain Media GM8120
Pixelplus ImageARM
Prolific PL-1029
Faraday IP evaluation boards
v2:
- move TLB_BTB to separate patch
- update copyrights
Signed-off-by: Paulius Zaleckas <paulius.zaleckas@teltonika.lt>
"""The Marvell® PXA168 processor is the first in a family of application
processors targeted at mass market opportunities in computing and consumer
devices. It balances high computing and multimedia performance with low
power consumption to support extended battery life, and includes a wealth
of integrated peripherals to reduce overall BOM cost .... """
See http://www.marvell.com/featured/pxa168.jsp for more information.
1. Marvell Mohawk core is a hybrid of xscale3 and its own ARM core,
there are many enhancements like instructions for flushing the
whole D-cache, and so on
2. Clock reuses Russell's common clkdev, and added the basic support
for UART1/2.
3. Devices are a bit different from the 'mach-pxa' way, the platform
devices are now dynamically allocated only when necessary (i.e.
when pxa_register_device() is called). Description for each device
are stored in an array of 'struct pxa_device_desc'. Now that:
a. this array of device description is marked with __initdata and
can be freed up system is fully up
b. which means board code has to add all needed devices early in
his initializing function
c. platform specific data can now be marked as __initdata since
they are allocated and copied by platform_device_add_data()
4. only the basic UART1/2/3 are added, more devices will come later.
Signed-off-by: Jason Chagas <chagas@marvell.com>
Signed-off-by: Eric Miao <eric.miao@marvell.com>
SCALE: add ice dcc support
Tested on the ixp425 with the ice PEEDI
Ack-by: Eric Miao <eric.miao@marvell.com>
Signed-off-by: Jean-Christophe PLAGNIOL-VILLARD <plagnioj@jcrosoft.com>
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
PXA935 has changed its implementor ID from Intel to Marvell, this
patch modifies arch/arm/boot/compressed/head.S and proc-xsc3.S to
support a smooth bootup.
Signed-off-by: Eric Miao <eric.miao@marvell.com>
The flush_cache_all function on ARMv7 is implemented as a series of
cache operations by set/way. These are not guaranteed to be ordered with
previous memory accesses, requiring a DMB. This patch also adds barriers
for the TLB operations in compressed/head.S
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
These instructions were placed in the code directly as opcodes because
early compilers didn't support them. Toolchains supporting ARMv7
understand these instructions and the patch puts the mnemonics back.
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
This declaration specifies the "function" type and size for various
assembly functions, mainly needed for generating the correct branch
instructions in Thumb-2.
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>