The newly added DMA_ATTR_PRIVILEGED is useful for creating mappings that
are only accessible to privileged DMA engines. Adding it to the
arm dma-mapping.c so that the ARM32 DMA IOMMU mapper can make use of it.
Signed-off-by: Sricharan R <sricharan@codeaurora.org>
Reviewed-by: Will Deacon <will.deacon@arm.com>
Signed-off-by: Will Deacon <will.deacon@arm.com>
commit ab6494f0c9 ("nommu: Add noMMU support to the DMA API") have
add CONFIG_MMU compilation flag but that prohibit to use dma_mmap_wc()
when the platform doesn't have MMU.
This patch call vm_iomap_memory() in noMMU case to test if addresses
are correct and set vma->vm_flags rather than all return an error.
Signed-off-by: Benjamin Gaignard <benjamin.gaignard@linaro.org>
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
When the data cache is PIPT or VIPT non-aliasing, and cache operations
are broadcast by the hardware, we can always postpone the flush in
flush_dcache_page(). A similar change was done for ARM64 in commit
b5b6c9e914 ("arm64: Avoid cache flushing in flush_dcache_page()").
Reviewed-by: Catalin Marinas <catalin.marinas@arm.com>
Signed-off-by: Rabin Vincent <rabinv@axis.com>
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
When the state names got added a script was used to add the extra argument
to the calls. The script basically converted the state constant to a
string, but the cleanup to convert these strings into meaningful ones did
not happen.
Replace all the useless strings with 'subsys/xxx/yyy:state' strings which
are used in all the other places already.
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Sebastian Siewior <bigeasy@linutronix.de>
Link: http://lkml.kernel.org/r/20161221192112.085444152@linutronix.de
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
We can allow modules to be loaded into the vmalloc region, where they
should also benefit from the same protections as those loaded into
the more efficient module region. Allow these functions to operate
there as well.
Signed-off-by: Russell King <rmk+kernel@armlinux.org.uk>
The set_memory_*() bounds checks are buggy on several fronts:
1. They fail to round the region size up if the passed address is not
page aligned.
2. The region check was incomplete, and didn't correspond with what
was being asked of apply_to_page_range()
So, rework change_memory_common() to fix these problems, adding an
"in_region()" helper to determine whether the start & size fit within
the provided region start and stop addresses.
Signed-off-by: Russell King <rmk+kernel@armlinux.org.uk>
The UniPhier outer cache (arch/arm/mm/cache-uniphier.c) has 128 byte
line length and its tags are also managed per 128 byte line. This
is very unfortunate, but the current 64 byte alignment for kmalloc()
causes sharing problems on DMA if used with this outer cache.
This commit adds ARM_L1_CACHE_SHIFT_7 to increase the DMA minimum
alignment to 128 byte if CACHE_UNIPHIER is enabled. There are
several drivers that assume aligning to L1_CACHE_BYTES will be DMA
safe, so this commit also changes the L1_CACHE_BYTES for safety.
Having said that, I hesitate to align all the other SoCs in Multi
platform to the UniPhier's requirement. So, I am disabling the
CONFIG_CACHE_UNIPHIER by default, so that multi_v7_defconfig will
still stay with CONFIG_ARM_L1_CACHE_SHIFT=6. With this commit,
UniPhier SoCs will become slower, but it is much better than system
crash. If desired, the outer-cache can be enabled by merge_config
or something.
Note:
The UniPhier PH1-Pro5 SoC is equipped also with L3 cache with 256
byte line size but its tags are managed per 128 byte sub-line.
So, ARM_L1_CACHE_SHIFT_7 should be fine for all the UniPhier SoCs.
Signed-off-by: Masahiro Yamada <yamada.masahiro@socionext.com>
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
fs_initcall is definitely too late to initialize DMA-debug hash tables,
because some drivers might get probed and use DMA mapping framework
already in core_initcall. Late initialization of DMA-debug results in
false warning about accessing memory, that was not allocated, like this
one:
------------[ cut here ]------------
WARNING: CPU: 5 PID: 1 at lib/dma-debug.c:1104 check_unmap+0xa1c/0xe50
exynos-sysmmu 10a60000.sysmmu: DMA-API: device driver tries to free DMA memory it has not allocated [device
address=0x000000006ebd0000] [size=16384 bytes]
Modules linked in:
CPU: 5 PID: 1 Comm: swapper/0 Not tainted 4.9.0-rc5-00028-g39dde3d-dirty #44
Hardware name: SAMSUNG EXYNOS (Flattened Device Tree)
[<c0119dd4>] (unwind_backtrace) from [<c01122bc>] (show_stack+0x20/0x24)
[<c01122bc>] (show_stack) from [<c062714c>] (dump_stack+0x84/0xa0)
[<c062714c>] (dump_stack) from [<c0132560>] (__warn+0x14c/0x180)
[<c0132560>] (__warn) from [<c01325dc>] (warn_slowpath_fmt+0x48/0x50)
[<c01325dc>] (warn_slowpath_fmt) from [<c06814f8>] (check_unmap+0xa1c/0xe50)
[<c06814f8>] (check_unmap) from [<c06819c4>] (debug_dma_unmap_page+0x98/0xc8)
[<c06819c4>] (debug_dma_unmap_page) from [<c076c3e8>] (exynos_iommu_domain_free+0x158/0x380)
[<c076c3e8>] (exynos_iommu_domain_free) from [<c0764a30>] (iommu_domain_free+0x34/0x60)
[<c0764a30>] (iommu_domain_free) from [<c011f168>] (release_iommu_mapping+0x30/0xb8)
[<c011f168>] (release_iommu_mapping) from [<c011f23c>] (arm_iommu_release_mapping+0x4c/0x50)
[<c011f23c>] (arm_iommu_release_mapping) from [<c0b061ac>] (s5p_mfc_probe+0x640/0x80c)
[<c0b061ac>] (s5p_mfc_probe) from [<c07e6750>] (platform_drv_probe+0x70/0x148)
[<c07e6750>] (platform_drv_probe) from [<c07e25c0>] (driver_probe_device+0x12c/0x6b0)
[<c07e25c0>] (driver_probe_device) from [<c07e2c6c>] (__driver_attach+0x128/0x17c)
[<c07e2c6c>] (__driver_attach) from [<c07df74c>] (bus_for_each_dev+0x88/0xc8)
[<c07df74c>] (bus_for_each_dev) from [<c07e1b6c>] (driver_attach+0x34/0x58)
[<c07e1b6c>] (driver_attach) from [<c07e1350>] (bus_add_driver+0x18c/0x32c)
[<c07e1350>] (bus_add_driver) from [<c07e4198>] (driver_register+0x98/0x148)
[<c07e4198>] (driver_register) from [<c07e5cb0>] (__platform_driver_register+0x58/0x74)
[<c07e5cb0>] (__platform_driver_register) from [<c174cb30>] (s5p_mfc_driver_init+0x1c/0x20)
[<c174cb30>] (s5p_mfc_driver_init) from [<c0102690>] (do_one_initcall+0x64/0x258)
[<c0102690>] (do_one_initcall) from [<c17014c0>] (kernel_init_freeable+0x3d0/0x4d0)
[<c17014c0>] (kernel_init_freeable) from [<c116eeb4>] (kernel_init+0x18/0x134)
[<c116eeb4>] (kernel_init) from [<c010bbd8>] (ret_from_fork+0x14/0x3c)
---[ end trace dc54c54bd3581296 ]---
This patch moves initialization of DMA-debug to core_initcall. This is
safe from the initialization perspective. dma_debug_do_init() internally calls
debugfs functions and debugfs also gets initialised at core_initcall(), and
that is earlier than arch code in the link order, so it will get initialized
just before the DMA-debug.
Reported-by: Seung-Woo Kim <sw0312.kim@samsung.com>
Signed-off-by: Marek Szyprowski <m.szyprowski@samsung.com>
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
Pull uaccess.h prepwork from Al Viro:
"Preparations to tree-wide switch to use of linux/uaccess.h (which,
obviously, will allow to start unifying stuff for real). The last step
there, ie
PATT='^[[:blank:]]*#[[:blank:]]*include[[:blank:]]*<asm/uaccess.h>'
sed -i -e "s!$PATT!#include <linux/uaccess.h>!" \
`git grep -l "$PATT"|grep -v ^include/linux/uaccess.h`
is not taken here - I would prefer to do it once just before or just
after -rc1. However, everything should be ready for it"
* 'work.uaccess2' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs:
remove a stray reference to asm/uaccess.h in docs
sparc64: separate extable_64.h, switch elf_64.h to it
score: separate extable.h, switch module.h to it
mips: separate extable.h, switch module.h to it
x86: separate extable.h, switch sections.h to it
remove stray include of asm/uaccess.h from cacheflush.h
mn10300: remove a bogus processor.h->uaccess.h include
xtensa: split uaccess.h into C and asm sides
bonding: quit messing with IOCTL
kill __kernel_ds_p off
mn10300: finish verify_area() off
frv: move HAVE_ARCH_UNMAPPED_AREA to pgtable.h
exceptions: detritus removal
These are updates for platform specific code on 32-bit ARM machines,
essentially anything that can not (yet) be expressed using DT files.
Noteworthy changes include:
- We get support for running in big-endian mode on two platforms:
sunxi (Allwinner) and s3c24xx (old Samsung).
- The recently added Uniphier platform now uses standard PSCI
methods for SMP booting and we remove support for old bootloader
versions that did not support it yet.
- In sunxi, we gain support for the "Nextthing GR8" SoC, which
is a close relative of the Allwinner A13 and R8 chips.
- PXA completes its move over to the generic dmaengine framework
and removes its old private API
- mach-bcm gains support for BCM47189/BCM53573, their first ARM
SoC with integrated 802.11ac wireless networking.
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1
iQIVAwUAV/gUVGCrR//JCVInAQKoyA/8DeV4M2IL/csGvnDwroH8rjDCvdeVDG0b
QhwtoIoy5Ur85fjZKsIJnu+8lZyB9Kun5p9mrjGsl1fo2PMjNKkOIN2BvV2r35Bo
kcpogVbOzFBJ3decX2QlQ41hhZaFphGWt21oBtslDabRBMyDxsRrv10Qy1gazw6F
aDwvSYUarUajtYq4tDli1mFbj6Tu5YZgL/mRWjEwM7fy4AE9MBd/R7/dAYGF6n7v
LF4l46k4ZIWl1txFcTJ84fV1ugf0O1f0j3umpaRo3QFWonFXmEkFqkyVPZmfoqPf
Q0MvLOZEOImA3UH1njpPV4PsZiDuA/aPuKYrV3aCfAcpKTvkWL5AJrc8YCBv3x/m
rOeLC3EKKj7u0IBoIW4YjzFngMkLthrYQ4Mz2URa0CJNwnW3GK1HswmU8wvpF73p
AMXxfpIjcf7tkauxMX3HOIltWa6DAa5C19lqKhiRzdwm884ZSJ3BRIswh1SHA4bz
f9h80FhI9GisfUL8k+axTtI5nsaLc6fzT4rCbQlp/WyeWFODEycx9T0mhvzd9Adc
7vEvAssh21x4AyZmfcKwb/7xsX15zN+dkB9AuX21ssmOvZ2Tb1zYYHItp0xtEi3R
5hL/8TRAHyUgyDq6MBQyg3EOSW6A+IqrVPRi10ND5q8+dK9Xh1bx08Wp3fZRQMHw
cBAWWa7pQLM=
=y4XF
-----END PGP SIGNATURE-----
Merge tag 'armsoc-soc' of git://git.kernel.org/pub/scm/linux/kernel/git/arm/arm-soc
Pull ARM SoC platform updates from Arnd Bergmann:
"These are updates for platform specific code on 32-bit ARM machines,
essentially anything that can not (yet) be expressed using DT files.
Noteworthy changes include:
- We get support for running in big-endian mode on two platforms:
sunxi (Allwinner) and s3c24xx (old Samsung).
- The recently added Uniphier platform now uses standard PSCI methods
for SMP booting and we remove support for old bootloader versions
that did not support it yet.
- In sunxi, we gain support for the "Nextthing GR8" SoC, which is a
close relative of the Allwinner A13 and R8 chips.
- PXA completes its move over to the generic dmaengine framework and
removes its old private API
- mach-bcm gains support for BCM47189/BCM53573, their first ARM SoC
with integrated 802.11ac wireless networking"
* tag 'armsoc-soc' of git://git.kernel.org/pub/scm/linux/kernel/git/arm/arm-soc: (54 commits)
ARM: imx legacy: pca100: move peripheral initialization to .init_late
ARM: imx legacy: mx27ads: move peripheral initialization to .init_late
ARM: imx legacy: mx21ads: move peripheral initialization to .init_late
ARM: imx legacy: pcm043: move peripheral initialization to .init_late
ARM: imx legacy: mx35-3ds: move peripheral initialization to .init_late
ARM: imx legacy: mx27-3ds: move peripheral initialization to .init_late
ARM: imx legacy: imx27-visstrim-m10: move peripheral initialization to .init_late
ARM: imx legacy: vpr200: move peripheral initialization to .init_late
ARM: imx legacy: mx31moboard: move peripheral initialization to .init_late
ARM: imx legacy: armadillo5x0: move peripheral initialization to .init_late
ARM: imx legacy: qong: move peripheral initialization to .init_late
ARM: imx legacy: mx31-3ds: move peripheral initialization to .init_late
ARM: imx legacy: pcm037: move peripheral initialization to .init_late
ARM: imx legacy: mx31lilly: move peripheral initialization to .init_late
ARM: imx legacy: mx31ads: move peripheral initialization to .init_late
ARM: imx legacy: mx31lite: move peripheral initialization to .init_late
ARM: imx legacy: kzm: move peripheral initialization to .init_late
MAINTAINERS: update list of Oxnas maintainers
ARM: orion5x: remove extraneous NO_IRQ
ARM: orion: simplify orion_ge00_switch_init
...
This is bit large pile of code which bring in some nice additions:
- Error reporting: we have added a new mechanism for users of dmaenegine to
register a callback_result which tells them the result of the dma
transaction. Right now only one user ntb is using it.
- As we discussed on KS mailing list and pointed out NO_IRQ has no place in
kernel, this also remove NO_IRQ from dmaengine subsystem (both arm and
ppc users)
- Support for IOMMU slave transfers and it implementation for arm.
- To get better build coverage, enable COMPILE_TEST for bunch of driver,
and fix the warning and sparse complaints on these.
- Apart from above, usual updates spread across drivers.
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1
iQIcBAABAgAGBQJX9cGYAAoJEHwUBw8lI4NHXh0P/3OsctPYnwcangOz268hHDap
7ZHwau96K7DRi8cFCc0XmG083Ivqih/fWMFJBUOEsuwS3zPHkfgfhsvm7MqrK3vv
psJIwnubwTVVQ3lePYJlnna6mijcRNXVAooRLiqylA3QPIYRxECDFVDRNwf39D+I
bYp5tmlFcobugOUUoMqq1D/gH8EHUWxrnrsS6UBBpYm+cusc6u9/JXlOb4pcJGSL
V340zQ0S9FNuEM3b+1kMAeq3DG2wLXv9oJzz/6EN59sx5AdjlYUPHd/PvTYOeG0T
crdtDfL+7xcqP0Ms4SGTOD4kXSe6nErr3bIBHQXI6ZmJn0j//+3yU21kTMl95kM+
RM7nE4vItuQR0jPxVlhuLCcf3q7zMi+noOPZ1DVRTE1Yf9AizAgbPXyOE+jzGUUi
6E+0Mj6CLpFH/Mffxphs7L6GKwfWqaLjAupbjR6EWZud37KAwvpcB1CkJEgT9C4s
OiZ4INTPxXmw9dX/T9CPOyh8oZ8mB9LTUzHoJDvDGuwYm7HE0U9pzHG4bP0mjIIt
y3RboP78t1HC9oZUrxCoGhvekJtok0k3RLGJTSx9ujklY9MJGG/F1KEC6APp5tXu
0UToMXpgXSUkKEZesmsJFj/lbh1+h/yo5zTG5Hek8lh1K0sczaoWu3xTTSY9SSZQ
ihlqyvdzSBweKo8ktU8A
=9iA3
-----END PGP SIGNATURE-----
Merge tag 'dmaengine-4.9-rc1' of git://git.infradead.org/users/vkoul/slave-dma
Pull dmaengine updates from Vinod Koul:
"This is bit large pile of code which bring in some nice additions:
- Error reporting: we have added a new mechanism for users of
dmaenegine to register a callback_result which tells them the
result of the dma transaction. Right now only one user (ntb) is
using it.
- As we discussed on KS mailing list and pointed out NO_IRQ has no
place in kernel, this also remove NO_IRQ from dmaengine subsystem
(both arm and ppc users)
- Support for IOMMU slave transfers and its implementation for arm.
- To get better build coverage, enable COMPILE_TEST for bunch of
driver, and fix the warning and sparse complaints on these.
- Apart from above, usual updates spread across drivers"
* tag 'dmaengine-4.9-rc1' of git://git.infradead.org/users/vkoul/slave-dma: (169 commits)
async_pq_val: fix DMA memory leak
dmaengine: virt-dma: move function declarations
dmaengine: omap-dma: Enable burst and data pack for SG
DT: dmaengine: rcar-dmac: document R8A7743/5 support
dmaengine: fsldma: Unmap region obtained by of_iomap
dmaengine: jz4780: fix resource leaks on error exit return
dma-debug: fix ia64 build, use PHYS_PFN
dmaengine: coh901318: fix integer overflow when shifting more than 32 places
dmaengine: edma: avoid uninitialized variable use
dma-mapping: fix m32r build warning
dma-mapping: fix ia64 build, use PHYS_PFN
dmaengine: ti-dma-crossbar: enable COMPILE_TEST
dmaengine: omap-dma: enable COMPILE_TEST
dmaengine: edma: enable COMPILE_TEST
dmaengine: ti-dma-crossbar: Fix of_device_id data parameter usage
dmaengine: ti-dma-crossbar: Correct type for of_find_property() third parameter
dmaengine/ARM: omap-dma: Fix the DMAengine compile test on non OMAP configs
dmaengine: edma: Rename set_bits and remove unused clear_bits helper
dmaengine: edma: Use correct type for of_find_property() third parameter
dmaengine: edma: Fix of_device_id data parameter usage (legacy vs TPCC)
...
Add methods to map/unmap device resources addresses for dma_map_ops that
are IOMMU aware. This is needed to map a device MMIO register from a
physical address.
Signed-off-by: Niklas Söderlund <niklas.soderlund+renesas@ragnatech.se>
Reviewed-by: Laurent Pinchart <laurent.pinchart@ideasonboard.com>
Acked-by: Arnd Bergmann <arnd@arndb.de>
Signed-off-by: Vinod Koul <vinod.koul@intel.com>
The cachepolicy variable gets initialized using a masked pmd
value. So far, the pmd has been masked with flags valid for the
2-page table format, but the 3-page table format requires a
different mask. On LPAE, this lead to a wrong assumption of what
initial cache policy has been used. Later a check forces the
cache policy to writealloc and prints the following warning:
Forcing write-allocate cache policy for SMP
This patch introduces a new definition PMD_SECT_CACHE_MASK for
both page table formats which masks in all cache flags in both
cases.
Signed-off-by: Stefan Agner <stefan@agner.ch>
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
The L2C-220 (AKA L220) and L2C-310 (AKA PL310) cache controllers feature
a Performance Monitoring Unit (PMU), which can be useful for tuning
and/or debugging. This hardware is always present and the relevant
registers are accessible to non-secure accesses. Thus, no special
firmware interface is necessary.
This patch adds support for the PMU, plugging into the usual perf
infrastructure. The overflow interrupt is not always available (e.g. on
RealView PBX A9 it is not wired up at all), and the hardware counters
saturate, so the driver does not make use of this. Instead, the driver
periodically polls and reset counters as required to avoid losing
events due to saturation.
Signed-off-by: Mark Rutland <mark.rutland@arm.com>
Acked-by: Pawel Moll <pawel.moll@arm.com>
Tested-by: Kim Phillips <kim.phillips@arm.com>
Cc: Russell King <linux@armlinux.org.uk>
Cc: Will Deacon <will.deacon@arm.com>
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
According to ARM AN321 (section 4.12):
"If the vector table is in writable memory such as SRAM, either relocated
by VTOR or a device dependent memory remapping mechanism, then
architecturally a memory barrier instruction is required after the vector
table entry is updated, and if the exception is to be activated
immediately"
Reviewed-by: Vladimir Murzin <vladimir.murzin@arm.com>
Signed-off-by: Maxime Coquelin <mcoquelin.stm32@gmail.com>
Signed-off-by: Alexandre TORGUE <alexandre.torgue@st.com>
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
Cortex-M7 is a new member of the V7M processor family that adds, among
other things, caches over the features available in Cortex-M4.
This patch adds support for recognising the processor at boot time, and
make use of recently introduced cache functions.
Signed-off-by: Jonathan Austin <jonathan.austin@arm.com>
Signed-off-by: Vladimir Murzin <vladimir.murzin@arm.com>
Tested-by: Andras Szemzo <sza@esh.hu>
Tested-by: Joachim Eastwood <manabian@gmail.com>
Tested-by: Alexandre TORGUE <alexandre.torgue@st.com>
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
This patch copies the method used for V7A/R CPUs to specify differing
processor info for different cores.
This patch differentiates Cortex-M3 and Cortex-M4 and leaves a fallback case
for any other V7M processors.
Signed-off-by: Jonathan Austin <jonathan.austin@arm.com>
Signed-off-by: Vladimir Murzin <vladimir.murzin@arm.com>
Tested-by: Andras Szemzo <sza@esh.hu>
Tested-by: Joachim Eastwood <manabian@gmail.com>
Tested-by: Alexandre TORGUE <alexandre.torgue@st.com>
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
This patch does the plumbing required to invoke the V7M cache code added
in earlier patches in this series, although there is no users for that
yet.
In order to honour the I/D cache disable config options, this patch changes
the mechanism by which the CCR is set on boot, to be more like V7A/R.
Signed-off-by: Jonathan Austin <jonathan.austin@arm.com>
Signed-off-by: Vladimir Murzin <vladimir.murzin@arm.com>
Tested-by: Andras Szemzo <sza@esh.hu>
Tested-by: Joachim Eastwood <manabian@gmail.com>
Tested-by: Alexandre TORGUE <alexandre.torgue@st.com>
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
This commit implements the cache operation for V7M.
It is based on V7 counterpart and differs as follows:
- cache operations are memory mapped
- only Thumb instruction set is supported
- we don't handle user access faults
Signed-off-by: Vladimir Murzin <vladimir.murzin@arm.com>
Tested-by: Andras Szemzo <sza@esh.hu>
Tested-by: Joachim Eastwood <manabian@gmail.com>
Tested-by: Alexandre TORGUE <alexandre.torgue@st.com>
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
The UniPhier architecture (32bit) switched over to PSCI. Remove
the SoC-specific SMP operations.
Signed-off-by: Masahiro Yamada <yamada.masahiro@socionext.com>
Commit d781145549 (""ARM: 8512/1: proc-v7.S: Adjust stack address when
XIP_KERNEL"") introduced a macro which lives under asm/memory.h.
Unfortunately, for MMU-less systems (like R-class) it leads to build failure:
arch/arm/mm/proc-v7.S: Assembler messages:
arch/arm/mm/proc-v7.S:538: Error: unrecognised relocation suffix
make[1]: *** [arch/arm/mm/proc-v7.o] Error 1
make: *** [arch/arm/mm] Error 2
since it is implicitly pulled via asm/pgtable.h for MMU capable systems only.
To fix it include asm/memory.h explicitly.
Signed-off-by: Vladimir Murzin <vladimir.murzin@arm.com>
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
Guided by grsecurity's analogous __read_only markings in arch/arm,
this applies several uses of __ro_after_init to structures that are
only updated during __init.
Signed-off-by: Kees Cook <keescook@chromium.org>
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
As per L2C-310 TRM[1]:
"... You can control this feature using bits 30,27 and 23 of the
Prefetch Control Register. Bit 23 and 27 are only used if you set bit 30
HIGH..."
which means there is no need to clear bit 23 if bit 30 is being cleared.
[1] http://infocenter.arm.com/help/index.jsp?topic=/com.arm.doc.ddi0246e/CJAJACBJ.html
Acked-by: Arnd Bergmann <arnd@arndb.de>
Signed-off-by: Andrey Smirnov <andrew.smirnov@gmail.com>
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
Replace magic numbers used for L310 Prefetch Control Register
Acked-by: Arnd Bergmann <arnd@arndb.de>
Signed-off-by: Andrey Smirnov <andrew.smirnov@gmail.com>
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
According to Documentation/printk-formats.txt when printing
a size_t variable we should use %zu or %zx format specifiers.
As we are printing a memory size value, we should better use %zu
in this case.
Reported-by: Frank Mori Hess <fmh6jj@gmail.com>
Signed-off-by: Fabio Estevam <fabio.estevam@nxp.com>
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
The late_alloc() PTE allocation function used by create_mapping_late()
does not call pgtable_page_ctor() on PTE pages it allocates, leaving
the per-page spinlock uninitialized.
Since generic page table manipulation code may assume that translation
table pages that are not owned by init_mm are covered by fully
constructed struct pages, the following crash may occur with the new
UEFI memory attributes table code.
efi: memattr: Processing EFI Memory Attributes table:
efi: memattr: 0x0000ffa16000-0x0000ffa82fff [Runtime Code |RUN| | |XP| | | | | | | | ]
Unable to handle kernel NULL pointer dereference at virtual address 00000010
pgd = c0204000
[00000010] *pgd=00000000
Internal error: Oops: 5 [#1] SMP ARM
Modules linked in:
CPU: 0 PID: 1 Comm: swapper/0 Not tainted 4.7.0-rc4-00063-g3882aa7b340b #361
Hardware name: Generic DT based system
task: ed858000 ti: ed842000 task.ti: ed842000
PC is at __lock_acquire+0xa0/0x19a8
...
[<c038c830>] (__lock_acquire) from [<c038e4f8>] (lock_acquire+0x6c/0x88)
[<c038e4f8>] (lock_acquire) from [<c0c06134>] (_raw_spin_lock+0x2c/0x3c)
[<c0c06134>] (_raw_spin_lock) from [<c0410384>] (apply_to_page_range+0xe8/0x238)
[<c0410384>] (apply_to_page_range) from [<c1205f34>] (efi_set_mapping_permissions+0x54/0x5c)
[<c1205f34>] (efi_set_mapping_permissions) from [<c1247474>] (efi_memattr_apply_permissions+0x2b8/0x378)
[<c1247474>] (efi_memattr_apply_permissions) from [<c1248258>] (arm_enable_runtime_services+0x1f0/0x22c)
[<c1248258>] (arm_enable_runtime_services) from [<c0301f0c>] (do_one_initcall+0x44/0x174)
[<c0301f0c>] (do_one_initcall) from [<c1200d10>] (kernel_init_freeable+0x90/0x1e8)
[<c1200d10>] (kernel_init_freeable) from [<c0bff690>] (kernel_init+0x8/0x114)
[<c0bff690>] (kernel_init) from [<c0307ed0>] (ret_from_fork+0x14/0x24)
The crash is due to the fact that the UEFI page tables are not owned by
init_mm, but are not covered by fully constructed struct pages.
Given that the UEFI subsystem is currently the only user of
create_mapping_late(), add an unconditional call to pgtable_page_ctor() to
late_alloc().
Fixes: 9fc68b717c ("ARM/efi: Apply strict permissions for UEFI Runtime Services regions")
Signed-off-by: Ard Biesheuvel <ard.biesheuvel@linaro.org>
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
To limit the amount of mapped low memory, we determine a physical address
boundary based on the start of the vmalloc area using __pa().
Strictly speaking, the vmalloc area location is arbitrary and does not
necessarily corresponds to a valid physical address. For example, if
PAGE_OFFSET = 0x80000000
PHYS_OFFSET = 0x90000000
vmalloc_min = 0xf0000000
then __pa(vmalloc_min) overflows and returns a wrapped 0 when phys_addr_t
is a 32-bit type. Then the code that follows determines that the entire
physical memory is above that boundary and no low memory gets mapped at
all:
|[...]
|Machine model: Freescale i.MX51 NA04 Board
|Ignoring RAM at 0x90000000-0xb0000000 (!CONFIG_HIGHMEM)
|Consider using a HIGHMEM enabled kernel.
To avoid this problem let's make vmalloc_limit a 64-bit value all the
time and determine that boundary explicitly without using __pa().
Reported-by: Emil Renner Berthing <kernel@esmil.dk>
Signed-off-by: Nicolas Pitre <nico@linaro.org>
Tested-by: Emil Renner Berthing <kernel@esmil.dk>
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
The dma-mapping core and the implementations do not change the DMA
attributes passed by pointer. Thus the pointer can point to const data.
However the attributes do not have to be a bitfield. Instead unsigned
long will do fine:
1. This is just simpler. Both in terms of reading the code and setting
attributes. Instead of initializing local attributes on the stack
and passing pointer to it to dma_set_attr(), just set the bits.
2. It brings safeness and checking for const correctness because the
attributes are passed by value.
Semantic patches for this change (at least most of them):
virtual patch
virtual context
@r@
identifier f, attrs;
@@
f(...,
- struct dma_attrs *attrs
+ unsigned long attrs
, ...)
{
...
}
@@
identifier r.f;
@@
f(...,
- NULL
+ 0
)
and
// Options: --all-includes
virtual patch
virtual context
@r@
identifier f, attrs;
type t;
@@
t f(..., struct dma_attrs *attrs);
@@
identifier r.f;
@@
f(...,
- NULL
+ 0
)
Link: http://lkml.kernel.org/r/1468399300-5399-2-git-send-email-k.kozlowski@samsung.com
Signed-off-by: Krzysztof Kozlowski <k.kozlowski@samsung.com>
Acked-by: Vineet Gupta <vgupta@synopsys.com>
Acked-by: Robin Murphy <robin.murphy@arm.com>
Acked-by: Hans-Christian Noren Egtvedt <egtvedt@samfundet.no>
Acked-by: Mark Salter <msalter@redhat.com> [c6x]
Acked-by: Jesper Nilsson <jesper.nilsson@axis.com> [cris]
Acked-by: Daniel Vetter <daniel.vetter@ffwll.ch> [drm]
Reviewed-by: Bart Van Assche <bart.vanassche@sandisk.com>
Acked-by: Joerg Roedel <jroedel@suse.de> [iommu]
Acked-by: Fabien Dessenne <fabien.dessenne@st.com> [bdisp]
Reviewed-by: Marek Szyprowski <m.szyprowski@samsung.com> [vb2-core]
Acked-by: David Vrabel <david.vrabel@citrix.com> [xen]
Acked-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com> [xen swiotlb]
Acked-by: Joerg Roedel <jroedel@suse.de> [iommu]
Acked-by: Richard Kuo <rkuo@codeaurora.org> [hexagon]
Acked-by: Geert Uytterhoeven <geert@linux-m68k.org> [m68k]
Acked-by: Gerald Schaefer <gerald.schaefer@de.ibm.com> [s390]
Acked-by: Bjorn Andersson <bjorn.andersson@linaro.org>
Acked-by: Hans-Christian Noren Egtvedt <egtvedt@samfundet.no> [avr32]
Acked-by: Vineet Gupta <vgupta@synopsys.com> [arc]
Acked-by: Robin Murphy <robin.murphy@arm.com> [arm64 and dma-iommu]
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Pull smp hotplug updates from Thomas Gleixner:
"This is the next part of the hotplug rework.
- Convert all notifiers with a priority assigned
- Convert all CPU_STARTING/DYING notifiers
The final removal of the STARTING/DYING infrastructure will happen
when the merge window closes.
Another 700 hundred line of unpenetrable maze gone :)"
* 'smp-hotplug-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (70 commits)
timers/core: Correct callback order during CPU hot plug
leds/trigger/cpu: Move from CPU_STARTING to ONLINE level
powerpc/numa: Convert to hotplug state machine
arm/perf: Fix hotplug state machine conversion
irqchip/armada: Avoid unused function warnings
ARC/time: Convert to hotplug state machine
clocksource/atlas7: Convert to hotplug state machine
clocksource/armada-370-xp: Convert to hotplug state machine
clocksource/exynos_mct: Convert to hotplug state machine
clocksource/arm_global_timer: Convert to hotplug state machine
rcu: Convert rcutree to hotplug state machine
KVM/arm/arm64/vgic-new: Convert to hotplug state machine
smp/cfd: Convert core to hotplug state machine
x86/x2apic: Convert to CPU hotplug state machine
profile: Convert to hotplug state machine
timers/core: Convert to hotplug state machine
hrtimer: Convert to hotplug state machine
x86/tboot: Convert to hotplug state machine
arm64/armv8 deprecated: Convert to hotplug state machine
hwtracing/coresight-etm4x: Convert to hotplug state machine
...
Pull ARM updates from Russell King:
"Included in this update are:
- Patches from Gregory Clement to fix the coherent DMA cases in our
dma-mapping code.
- A number of CPU errata updates and fixes.
- ARM cpuidle improvements from Jisheng Zhang.
- Fix from Kees for the location of _etext.
- Cleanups from Masahiro Yamada to avoid duplicated messages during
the kernel build, and remove CONFIG_ARCH_HAS_BARRIERS.
- Remove a udelay loop limitation, allowing for faster CPUs to
calibrate the delay correctly.
- Cleanup some left-overs from the SW PAN implementation.
- Ensure that a modified address limit is not visible to exception
handlers"
* 'for-linus' of git://git.armlinux.org.uk/~rmk/linux-arm: (21 commits)
ARM: 8586/1: cpuidle: make arm_cpuidle_suspend() a bit more efficient
ARM: 8585/1: cpuidle: fix !cpuidle_ops[cpu].init case during init
ARM: 8561/4: dma-mapping: Fix the coherent case when iommu is used
ARM: 8561/3: dma-mapping: Don't use outer_flush_range when the L2C is coherent
ARM: 8560/1: errata: Workaround errata A12 825619 / A17 852421
ARM: 8559/1: errata: Workaround erratum A12 821420
ARM: 8558/1: errata: Workaround errata A12 818325/852422 A17 852423
ARM: save and reset the address limit when entering an exception
ARM: 8577/1: Fix Cortex-A15 798181 errata initialization
ARM: 8584/1: floppy: avoid gcc-6 warning
ARM: 8583/1: mm: fix location of _etext
ARM: 8582/1: remove unused CONFIG_ARCH_HAS_BARRIERS
ARM: 8306/1: loop_udelay: remove bogomips value limitation
ARM: 8581/1: add missing <asm/prom.h> to arch/arm/kernel/devtree.c
ARM: 8576/1: avoid duplicating "Kernel: arch/arm/boot/*Image is ready"
ARM: 8556/1: on a generic DT system: do not touch l2x0
ARM: uaccess: remove put_user() code duplication
ARM: 8580/1: Remove orphaned __addr_ok() definition
ARM: get rid of horrible *(unsigned int *)(regs + 1)
ARM: introduce svc_pt_regs structure
...
__GFP_REPEAT has a rather weak semantic but since it has been introduced
around 2.6.12 it has been ignored for low order allocations.
PGALLOC_GFP uses __GFP_REPEAT but none of the allocation which uses this
flag is for more than order-2. This means that this flag has never been
actually useful here because it has always been used only for
PAGE_ALLOC_COSTLY requests.
Link: http://lkml.kernel.org/r/1464599699-30131-5-git-send-email-mhocko@kernel.org
Signed-off-by: Michal Hocko <mhocko@suse.com>
Cc: Russell King <linux@arm.linux.org.uk>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Install the callbacks via the state machine and let the core invoke
the callbacks on the already online CPUs.
Signed-off-by: Richard Cochran <rcochran@linutronix.de>
Signed-off-by: Anna-Maria Gleixner <anna-maria@linutronix.de>
Reviewed-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de>
Cc: Brad Mouring <brad.mouring@ni.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Linus Walleij <linus.walleij@linaro.org>
Cc: Marek Szyprowski <m.szyprowski@samsung.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Rob Herring <robh@kernel.org>
Cc: Russell King <linux@arm.linux.org.uk>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: linux-arm-kernel@lists.infradead.org
Cc: rt@linutronix.de
Link: http://lkml.kernel.org/r/20160713153336.801270887@linutronix.de
Signed-off-by: Ingo Molnar <mingo@kernel.org>
When doing dma allocation with IOMMU the __iommu_alloc_atomic() was
used even when the system was coherent. However, this function
allocates from a non-cacheable pool, which is fine when the device is
not cache coherent but won't work as expected in the device is cache
coherent. Indeed, the CPU and device must access the memory using the
same cacheability attributes.
Moreover when the devices are coherent, the mmap call must not change
the pg_prot flags in the vma struct. The arm_coherent_iommu_mmap_attrs
has been updated in the same way that it was done for the arm_dma_mmap
in commit 55af8a9164 ("ARM: 8387/1: arm/mm/dma-mapping.c: Add
arm_coherent_dma_mmap").
Suggested-by: Catalin Marinas <catalin.marinas@arm.com>
Signed-off-by: Gregory CLEMENT <gregory.clement@free-electrons.com>
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
When a L2 cache controller is used in a system that provides hardware
coherency, the entire outer cache operations are useless, and can be
skipped. Moreover, on some systems, it is harmful as it causes
deadlocks between the Marvell coherency mechanism, the Marvell PCIe
controller and the Cortex-A9.
In the current kernel implementation, the outer cache flush range
operation is triggered by the dma_alloc function.
This operation can be take place during runtime and in some
circumstances may lead to the PCIe/PL310 deadlock on Armada 375/38x
SoCs.
This patch extends the __dma_clear_buffer() function to receive a
boolean argument related to the coherency of the system. The same
things is done for the calling functions.
Reported-by: Nadav Haklai <nadavh@marvell.com>
Signed-off-by: Gregory CLEMENT <gregory.clement@free-electrons.com>
Cc: <stable@vger.kernel.org> # v3.16+
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
The workaround for both errata is to set bit 24 in the diagnostic
register. There are no known end-user bugs solved by fixing this
errata, but the fix is trivial and it seems sane to apply it.
The arguments for why this needs to be in the kernel are similar to the
arugments made in the patch "Workaround errata A12 818325/852422 A17
852423".
Signed-off-by: Douglas Anderson <dianders@chromium.org>
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
This erratum has a very simple workaround (set a bit in a register), so
let's apply it. Apparently the workaround's downside is a very slight
power impact.
Note that applying this errata fixes deadlocks that are easy to
reproduce with real world applications.
The arguments for why this needs to be in the kernel are similar to the
arugments made in the patch "Workaround errata A12 818325/852422 A17
852423".
Signed-off-by: Douglas Anderson <dianders@chromium.org>
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
There are several similar errata on Cortex A12 and A17 that all have the same workaround: setting bit[12] of the Feature Register.
Technically the list of errata are:
- A12 818325: Execution of an UNPREDICTABLE STR or STM instruction
might deadlock. Fixed in r0p1.
- A12 852422: Execution of a sequence of instructions might lead to
either a data corruption or a CPU deadlock. Not fixed in any A12s
yet.
- A17 852423: Execution of a sequence of instructions might lead to
either a data corruption or a CPU deadlock. Not fixed in any A17s
yet.
Since A12 got renamed to A17 it seems likely that there won't be any
future Cortex-A12 cores, so we'll enable for all Cortex-A12.
For Cortex-A17 I believe that all known revisions are affected and that all knows revisions means <= r1p2. Presumably if a new A17 was
released it would have this problem fixed.
Note that in <https://patchwork.kernel.org/patch/4735341/> folks
previously expressed opposition to this change because:
A) It was thought to only apply to r0p0 and there were no known r0p0
boards supported in mainline.
B) It was argued that such a workaround beloned in firmware.
Now that this same fix solves other errata on real boards (like
rk3288) point A) is addressed.
Point B) is impossible to address on boards like rk3288. On rk3288
the firmware doesn't stay resident in RAM and isn't involved at all in
the suspend/resume process nor in the SMP bringup process. That means
that the most the firmware could do would be to set the bit on "core
0" and this bit would be lost at suspend/resume time. It is true that
we could write a "generic" solution that saved the boot-time "core 0"
value of this register and applied it at SMP bringup / resume time.
However, since this register (described as the "Feature Register" in
errata) appears to be undocumented (as far as I can tell) and is only
modified for these errata, that "generic" solution seems questionably
cleaner. The generic solution also won't fix existing users that
haven't happened to do a FW update.
Note that in ARM64 presumably PSCI will be universal and fixes like
this will end up in ATF. Hopefully we are nearing the end of this
style of errata workaround.
Signed-off-by: Douglas Anderson <dianders@chromium.org>
Signed-off-by: Huang Tao <huangtao@rock-chips.com>
Signed-off-by: Kever Yang <kever.yang@rock-chips.com>
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
Since commit 2b749cb3a5 ("ARM: realview: remove private barrier
implementation"), this config is not used by any platform.
Signed-off-by: Masahiro Yamada <yamada.masahiro@socionext.com>
Acked-by: Arnd Bergmann <arnd@arndb.de>
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
The binary GCD algorithm is based on the following facts:
1. If a and b are all evens, then gcd(a,b) = 2 * gcd(a/2, b/2)
2. If a is even and b is odd, then gcd(a,b) = gcd(a/2, b)
3. If a and b are all odds, then gcd(a,b) = gcd((a-b)/2, b) = gcd((a+b)/2, b)
Even on x86 machines with reasonable division hardware, the binary
algorithm runs about 25% faster (80% the execution time) than the
division-based Euclidian algorithm.
On platforms like Alpha and ARMv6 where division is a function call to
emulation code, it's even more significant.
There are two variants of the code here, depending on whether a fast
__ffs (find least significant set bit) instruction is available. This
allows the unpredictable branches in the bit-at-a-time shifting loop to
be eliminated.
If fast __ffs is not available, the "even/odd" GCD variant is used.
I use the following code to benchmark:
#include <stdio.h>
#include <stdlib.h>
#include <stdint.h>
#include <string.h>
#include <time.h>
#include <unistd.h>
#define swap(a, b) \
do { \
a ^= b; \
b ^= a; \
a ^= b; \
} while (0)
unsigned long gcd0(unsigned long a, unsigned long b)
{
unsigned long r;
if (a < b) {
swap(a, b);
}
if (b == 0)
return a;
while ((r = a % b) != 0) {
a = b;
b = r;
}
return b;
}
unsigned long gcd1(unsigned long a, unsigned long b)
{
unsigned long r = a | b;
if (!a || !b)
return r;
b >>= __builtin_ctzl(b);
for (;;) {
a >>= __builtin_ctzl(a);
if (a == b)
return a << __builtin_ctzl(r);
if (a < b)
swap(a, b);
a -= b;
}
}
unsigned long gcd2(unsigned long a, unsigned long b)
{
unsigned long r = a | b;
if (!a || !b)
return r;
r &= -r;
while (!(b & r))
b >>= 1;
for (;;) {
while (!(a & r))
a >>= 1;
if (a == b)
return a;
if (a < b)
swap(a, b);
a -= b;
a >>= 1;
if (a & r)
a += b;
a >>= 1;
}
}
unsigned long gcd3(unsigned long a, unsigned long b)
{
unsigned long r = a | b;
if (!a || !b)
return r;
b >>= __builtin_ctzl(b);
if (b == 1)
return r & -r;
for (;;) {
a >>= __builtin_ctzl(a);
if (a == 1)
return r & -r;
if (a == b)
return a << __builtin_ctzl(r);
if (a < b)
swap(a, b);
a -= b;
}
}
unsigned long gcd4(unsigned long a, unsigned long b)
{
unsigned long r = a | b;
if (!a || !b)
return r;
r &= -r;
while (!(b & r))
b >>= 1;
if (b == r)
return r;
for (;;) {
while (!(a & r))
a >>= 1;
if (a == r)
return r;
if (a == b)
return a;
if (a < b)
swap(a, b);
a -= b;
a >>= 1;
if (a & r)
a += b;
a >>= 1;
}
}
static unsigned long (*gcd_func[])(unsigned long a, unsigned long b) = {
gcd0, gcd1, gcd2, gcd3, gcd4,
};
#define TEST_ENTRIES (sizeof(gcd_func) / sizeof(gcd_func[0]))
#if defined(__x86_64__)
#define rdtscll(val) do { \
unsigned long __a,__d; \
__asm__ __volatile__("rdtsc" : "=a" (__a), "=d" (__d)); \
(val) = ((unsigned long long)__a) | (((unsigned long long)__d)<<32); \
} while(0)
static unsigned long long benchmark_gcd_func(unsigned long (*gcd)(unsigned long, unsigned long),
unsigned long a, unsigned long b, unsigned long *res)
{
unsigned long long start, end;
unsigned long long ret;
unsigned long gcd_res;
rdtscll(start);
gcd_res = gcd(a, b);
rdtscll(end);
if (end >= start)
ret = end - start;
else
ret = ~0ULL - start + 1 + end;
*res = gcd_res;
return ret;
}
#else
static inline struct timespec read_time(void)
{
struct timespec time;
clock_gettime(CLOCK_PROCESS_CPUTIME_ID, &time);
return time;
}
static inline unsigned long long diff_time(struct timespec start, struct timespec end)
{
struct timespec temp;
if ((end.tv_nsec - start.tv_nsec) < 0) {
temp.tv_sec = end.tv_sec - start.tv_sec - 1;
temp.tv_nsec = 1000000000ULL + end.tv_nsec - start.tv_nsec;
} else {
temp.tv_sec = end.tv_sec - start.tv_sec;
temp.tv_nsec = end.tv_nsec - start.tv_nsec;
}
return temp.tv_sec * 1000000000ULL + temp.tv_nsec;
}
static unsigned long long benchmark_gcd_func(unsigned long (*gcd)(unsigned long, unsigned long),
unsigned long a, unsigned long b, unsigned long *res)
{
struct timespec start, end;
unsigned long gcd_res;
start = read_time();
gcd_res = gcd(a, b);
end = read_time();
*res = gcd_res;
return diff_time(start, end);
}
#endif
static inline unsigned long get_rand()
{
if (sizeof(long) == 8)
return (unsigned long)rand() << 32 | rand();
else
return rand();
}
int main(int argc, char **argv)
{
unsigned int seed = time(0);
int loops = 100;
int repeats = 1000;
unsigned long (*res)[TEST_ENTRIES];
unsigned long long elapsed[TEST_ENTRIES];
int i, j, k;
for (;;) {
int opt = getopt(argc, argv, "n:r:s:");
/* End condition always first */
if (opt == -1)
break;
switch (opt) {
case 'n':
loops = atoi(optarg);
break;
case 'r':
repeats = atoi(optarg);
break;
case 's':
seed = strtoul(optarg, NULL, 10);
break;
default:
/* You won't actually get here. */
break;
}
}
res = malloc(sizeof(unsigned long) * TEST_ENTRIES * loops);
memset(elapsed, 0, sizeof(elapsed));
srand(seed);
for (j = 0; j < loops; j++) {
unsigned long a = get_rand();
/* Do we have args? */
unsigned long b = argc > optind ? strtoul(argv[optind], NULL, 10) : get_rand();
unsigned long long min_elapsed[TEST_ENTRIES];
for (k = 0; k < repeats; k++) {
for (i = 0; i < TEST_ENTRIES; i++) {
unsigned long long tmp = benchmark_gcd_func(gcd_func[i], a, b, &res[j][i]);
if (k == 0 || min_elapsed[i] > tmp)
min_elapsed[i] = tmp;
}
}
for (i = 0; i < TEST_ENTRIES; i++)
elapsed[i] += min_elapsed[i];
}
for (i = 0; i < TEST_ENTRIES; i++)
printf("gcd%d: elapsed %llu\n", i, elapsed[i]);
k = 0;
srand(seed);
for (j = 0; j < loops; j++) {
unsigned long a = get_rand();
unsigned long b = argc > optind ? strtoul(argv[optind], NULL, 10) : get_rand();
for (i = 1; i < TEST_ENTRIES; i++) {
if (res[j][i] != res[j][0])
break;
}
if (i < TEST_ENTRIES) {
if (k == 0) {
k = 1;
fprintf(stderr, "Error:\n");
}
fprintf(stderr, "gcd(%lu, %lu): ", a, b);
for (i = 0; i < TEST_ENTRIES; i++)
fprintf(stderr, "%ld%s", res[j][i], i < TEST_ENTRIES - 1 ? ", " : "\n");
}
}
if (k == 0)
fprintf(stderr, "PASS\n");
free(res);
return 0;
}
Compiled with "-O2", on "VirtualBox 4.4.0-22-generic #38-Ubuntu x86_64" got:
zhaoxiuzeng@zhaoxiuzeng-VirtualBox:~/develop$ ./gcd -r 500000 -n 10
gcd0: elapsed 10174
gcd1: elapsed 2120
gcd2: elapsed 2902
gcd3: elapsed 2039
gcd4: elapsed 2812
PASS
zhaoxiuzeng@zhaoxiuzeng-VirtualBox:~/develop$ ./gcd -r 500000 -n 10
gcd0: elapsed 9309
gcd1: elapsed 2280
gcd2: elapsed 2822
gcd3: elapsed 2217
gcd4: elapsed 2710
PASS
zhaoxiuzeng@zhaoxiuzeng-VirtualBox:~/develop$ ./gcd -r 500000 -n 10
gcd0: elapsed 9589
gcd1: elapsed 2098
gcd2: elapsed 2815
gcd3: elapsed 2030
gcd4: elapsed 2718
PASS
zhaoxiuzeng@zhaoxiuzeng-VirtualBox:~/develop$ ./gcd -r 500000 -n 10
gcd0: elapsed 9914
gcd1: elapsed 2309
gcd2: elapsed 2779
gcd3: elapsed 2228
gcd4: elapsed 2709
PASS
[akpm@linux-foundation.org: avoid #defining a CONFIG_ variable]
Signed-off-by: Zhaoxiu Zeng <zhaoxiu.zeng@gmail.com>
Signed-off-by: George Spelvin <linux@horizon.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Pull ARM updates from Russell King:
"Changes included in this pull request:
- revert pxa2xx-flash back to using ioremap_cached() and switch
memremap() to use arch_memremap_wb()
- remove pci=firmware command line argument handling
- remove unnecessary arm_dma_set_mask() implementation, the generic
implementation will do for ARM
- removal of the ARM kallsyms "hack" to work around mode switching
veneers and vectors located below PAGE_OFFSET
- tidy up build system output a little
- add L2 cache power management DT bindings
- remove duplicated local_irq_disable() in reboot paths
- handle AMBA primecell devices better at registration time with PM
domains (needed for Samsung SoCs)
- ARM specific preparation to support Keystone II kexec"
* 'for-linus' of git://git.armlinux.org.uk/~rmk/linux-arm:
ARM: 8567/1: cache-uniphier: activate ways for secondary CPUs
ARM: 8570/2: Documentation: devicetree: Add PL310 PM bindings
ARM: 8569/1: pl2x0: Add OF control of cache power management
ARM: 8568/1: reboot: remove duplicated local_irq_disable()
ARM: 8566/1: drivers: amba: properly handle devices with power domains
ARM: provide arm_has_idmap_alias() helper
ARM: kexec: remove 512MB restriction on kexec crashdump
ARM: provide improved virt_to_idmap() functionality
ARM: kexec: fix crashkernel= handling
ARM: 8557/1: specify install, zinstall, and uinstall as PHONY targets
ARM: 8562/1: suppress "include/generated/mach-types.h is up to date."
ARM: 8553/1: kallsyms: remove --page-offset command line option
ARM: 8552/1: kallsyms: remove special lower address limit for CONFIG_ARM
ARM: 8555/1: kallsyms: ignore ARM mode switching veneers
ARM: 8548/1: dma-mapping: remove arm_dma_set_mask()
ARM: 8554/1: kernel: pci: remove pci=firmware command line parameter handling
ARM: memremap: implement arch_memremap_wb()
memremap: add arch specific hook for MEMREMAP_WB mappings
mtd: pxa2xx-flash: switch back from memremap to ioremap_cached
ARM: reintroduce ioremap_cached() for creating cached I/O mappings
As a set of driver-provided callbacks and static data, there is no
compelling reason for struct iommu_ops to be mutable in core code, so
enforce const-ness throughout.
Acked-by: Thierry Reding <treding@nvidia.com>
Signed-off-by: Robin Murphy <robin.murphy@arm.com>
Acked-by: Will Deacon <will.deacon@arm.com>
Signed-off-by: Joerg Roedel <jroedel@suse.de>
Pull ARM fixes from Russell King:
"These are a number of updates to fix a few problems found in the ARM
nommu code over the last couple of years, caused mostly by changes on
the mmu side"
* 'fixes' of git://git.armlinux.org.uk/~rmk/linux-arm:
ARM: 8573/1: domain: move {set,get}_domain under config guard
ARM: 8572/1: nommu: change memory reserve for the vectors
ARM: 8571/1: nommu: fix PMSAv7 setup
This outer cache allows to control active ways independently for
each CPU, but currently nothing is done for secondary CPUs. In
other words, all the ways are locked for secondary CPUs by default.
This commit fixes it to fully bring out the performance of this
outer cache.
There would be two possible ways to achieve this:
[1] Each CPU initializes active ways for itself. This can be done
via the SSCLPDAWCR register. This is a banked register, so each
CPU sees a different instance of the register for its own.
[2] The master CPU initializes active ways for all the CPUs. This
is available via SSCDAWCARMR(N) registers, where all instances
of SSCLPDAWCR are mirrored. They are mapped at the address
SSCDAWCARMR + 4 * N, where N is the CPU number.
The outer cache frame work does not support a per-CPU init callback.
So this commit adopts [2]; the master CPU iterates over possible CPUs
setting up SSCDAWCARMR(N) registers.
Signed-off-by: Masahiro Yamada <yamada.masahiro@socionext.com>
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
Commit 19accfd3 (ARM: move vector stubs) moved the vector stubs in an
additional page above the base vector one. This change wasn't taken into
account by the nommu memreserve.
This patch ensures that the kernel won't overwrite any vector stub on
nommu.
[changed the MPU side too]
Signed-off-by: Jean-Philippe Brucker <jean-philippe.brucker@arm.com>
Signed-off-by: Vladimir Murzin <vladimir.murzin@arm.com>
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
Commit 1c2f87c (ARM: 8025/1: Get rid of meminfo) broke the support for
MPU on ARMv7-R. This patch adapts the code inside CONFIG_ARM_MPU to use
memblocks appropriately.
MPU initialisation only uses the first memory region, and removes all
subsequent ones. Because looping over all regions that need removal is
inefficient, and memblock_remove already handles memory ranges, we can
flatten the 'for_each_memblock' part.
Signed-off-by: Jean-Philippe Brucker <jean-philippe.brucker@arm.com>
Signed-off-by: Vladimir Murzin <vladimir.murzin@arm.com>
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
Add ability to override power management bits of 310 controllers
(dynamic clock gating and standby mode) through OF entries. As the
saved register is only applied when working on a supported controller,
it is safe to save the settings.
In order to maintain existing behavior, if the settings are not found
in the DT, the corresponding feature will be enabled.
Signed-off-by: Brad Mouring <brad.mouring@ni.com>
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
For kexec, we need more functionality from the IDMAP system. We need to
be able to convert physical addresses to their identity mappped versions
as well as virtual addresses.
Convert the existing arch_virt_to_idmap() to deal with physical
addresses instead.
Acked-by: Santosh Shilimkar <ssantosh@kernel.org>
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
Pull ARM fixes from Russell King:
"Three further fixes for ARM.
Alexandre Courbot was having problems with DMA allocations with the
GFP flags affecting where the tracking data was being allocated from.
Vladimir Murzin noticed that the CPU feature code was not entirely
correct, which can cause some features to be misreported"
* 'fixes' of git://ftp.arm.linux.org.uk/~rmk/linux-arm:
ARM: 8564/1: fix cpu feature extracting helper
ARM: 8563/1: fix demoting HWCAP_SWP
ARM: 8551/2: DMA: Fix kzalloc flags in __dma_alloc
This series wires up the generic memremap() function for ARM in a way
that allows it to be used as intended, i.e., without regard for whether
the region being mapped is covered by a struct page and/or the linear
mapping (lowmem)
Commit 19e6e5e539 ("ARM: 8547/1: dma-mapping: store buffer
information") allocates a structure meant for internal buffer management
with the GFP flags of the buffer itself. This can trigger the following
safeguard in the slab/slub allocator:
if (unlikely(flags & GFP_SLAB_BUG_MASK)) {
pr_emerg("gfp: %un", flags & GFP_SLAB_BUG_MASK);
BUG();
}
Fix this by filtering the flags that make the slab allocator unhappy.
Signed-off-by: Alexandre Courbot <acourbot@nvidia.com>
Acked-by: Rabin Vincent <rabin@rab.in>
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
Pull ARM fixes from Russell King:
"A couple of small fixes, and wiring up the new syscalls which appeared
during the merge window"
* 'fixes' of git://ftp.arm.linux.org.uk/~rmk/linux-arm:
ARM: 8550/1: protect idiv patching against undefined gcc behavior
ARM: wire up preadv2 and pwritev2 syscalls
ARM: SMP enable of cache maintanence broadcast
arm_dma_set_mask() implements exactly the same behavior as the fallback
that dma_set_mask() takes if the set_dma_mask op is not set. Remove it
and use that fallback instead like what is already done for
dma_get_mask().
Signed-off-by: Alexandre Courbot <acourbot@nvidia.com>
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
PAGE_CACHE_{SIZE,SHIFT,MASK,ALIGN} macros were introduced *long* time
ago with promise that one day it will be possible to implement page
cache with bigger chunks than PAGE_SIZE.
This promise never materialized. And unlikely will.
We have many places where PAGE_CACHE_SIZE assumed to be equal to
PAGE_SIZE. And it's constant source of confusion on whether
PAGE_CACHE_* or PAGE_* constant should be used in a particular case,
especially on the border between fs and mm.
Global switching to PAGE_CACHE_SIZE != PAGE_SIZE would cause to much
breakage to be doable.
Let's stop pretending that pages in page cache are special. They are
not.
The changes are pretty straight-forward:
- <foo> << (PAGE_CACHE_SHIFT - PAGE_SHIFT) -> <foo>;
- <foo> >> (PAGE_CACHE_SHIFT - PAGE_SHIFT) -> <foo>;
- PAGE_CACHE_{SIZE,SHIFT,MASK,ALIGN} -> PAGE_{SIZE,SHIFT,MASK,ALIGN};
- page_cache_get() -> get_page();
- page_cache_release() -> put_page();
This patch contains automated changes generated with coccinelle using
script below. For some reason, coccinelle doesn't patch header files.
I've called spatch for them manually.
The only adjustment after coccinelle is revert of changes to
PAGE_CAHCE_ALIGN definition: we are going to drop it later.
There are few places in the code where coccinelle didn't reach. I'll
fix them manually in a separate patch. Comments and documentation also
will be addressed with the separate patch.
virtual patch
@@
expression E;
@@
- E << (PAGE_CACHE_SHIFT - PAGE_SHIFT)
+ E
@@
expression E;
@@
- E >> (PAGE_CACHE_SHIFT - PAGE_SHIFT)
+ E
@@
@@
- PAGE_CACHE_SHIFT
+ PAGE_SHIFT
@@
@@
- PAGE_CACHE_SIZE
+ PAGE_SIZE
@@
@@
- PAGE_CACHE_MASK
+ PAGE_MASK
@@
expression E;
@@
- PAGE_CACHE_ALIGN(E)
+ PAGE_ALIGN(E)
@@
expression E;
@@
- page_cache_get(E)
+ get_page(E)
@@
expression E;
@@
- page_cache_release(E)
+ put_page(E)
Signed-off-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
Acked-by: Michal Hocko <mhocko@suse.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
The generic memremap() falls back to using ioremap_cache() to create
MEMREMAP_WB mappings if the requested region is not already covered
by the linear mapping, unless the architecture provides an implementation
of arch_memremap_wb().
Since ioremap_cache() is not appropriate on ARM to map memory with the
same attributes used for the linear mapping, implement arch_memremap_wb()
which does exactly that. Also, relax the WARN() check to allow MT_MEMORY_RW
mappings of pfn_valid() pages.
Cc: Russell King <rmk+kernel@arm.linux.org.uk>
Acked-by: Dan Williams <dan.j.williams@intel.com>
Signed-off-by: Ard Biesheuvel <ard.biesheuvel@linaro.org>
The original ARM-only ioremap flavor 'ioremap_cached' has been renamed
to 'ioremap_cache' to align with other architectures, and subsequently
abused in generic code to map things like firmware tables in memory.
For that reason, there is currently an effort underway to deprecate
ioremap_cache, whose semantics are poorly defined, and which is typed
with an __iomem annotation that is inappropriate for mappings of ordinary
memory.
However, original users of ioremap_cached() used it in a context where
the I/O connotation is appropriate, and replacing those instances with
memremap() does not make sense. So let's revive ioremap_cached(), so
that we can change back those original users before we drop ioremap_cache
entirely in favor of memremap.
Cc: Russell King <rmk+kernel@arm.linux.org.uk>
Acked-by: Dan Williams <dan.j.williams@intel.com>
Signed-off-by: Ard Biesheuvel <ard.biesheuvel@linaro.org>
Masahiro Yamada reports that we can fail to set the FW bit in the
auxiliary control register, which enables broadcasting the cache
maintanence operations. This occurs because we only check that the
SMP/nAMP bit is set, rather than checking whether all the bits we
want to be set are set.
Rearrange the code to ensure that all desired bits are set, and only
update the register if we discover some required bits are not set.
Tested-by: Masahiro Yamada <yamada.masahiro@socionext.com>
Pull ARM updates from Russell King:
"Another mixture of changes this time around:
- Split XIP linker file from main linker file to make it more
maintainable, and various XIP fixes, and clean up a resulting
macro.
- Decompressor cleanups from Masahiro Yamada
- Avoid printing an error for a missing L2 cache
- Remove some duplicated symbols in System.map, and move
vectors/stubs back into kernel VMA
- Various low priority fixes from Arnd
- Updates to allow bus match functions to return negative errno
values, touching some drivers and the driver core. Greg has acked
these changes.
- Virtualisation platform udpates form Jean-Philippe Brucker.
- Security enhancements from Kees Cook
- Rework some Kconfig dependencies and move PSCI idle management code
out of arch/arm into drivers/firmware/psci.c
- ARM DMA mapping updates, touching media, acked by Mauro.
- Fix places in ARM code which should be using virt_to_idmap() so
that Keystone2 can work.
- Fix Marvell Tauros2 to work again with non-DT boots.
- Provide a delay timer for ARM Orion platforms"
* 'for-linus' of git://ftp.arm.linux.org.uk/~rmk/linux-arm: (45 commits)
ARM: 8546/1: dma-mapping: refactor to fix coherent+cma+gfp=0
ARM: 8547/1: dma-mapping: store buffer information
ARM: 8543/1: decompressor: rename suffix_y to compress-y
ARM: 8542/1: decompressor: merge piggy.*.S and simplify Makefile
ARM: 8541/1: decompressor: drop redundant FORCE in Makefile
ARM: 8540/1: decompressor: use clean-files instead of extra-y to clean files
ARM: 8539/1: decompressor: drop more unneeded assignments to "targets"
ARM: 8538/1: decompressor: drop unneeded assignments to "targets"
ARM: 8532/1: uncompress: mark putc as inline
ARM: 8531/1: turn init_new_context into an inline function
ARM: 8530/1: remove VIRT_TO_BUS
ARM: 8537/1: drop unused DEBUG_RODATA from XIP_KERNEL
ARM: 8536/1: mm: hide __start_rodata_section_aligned for non-debug builds
ARM: 8535/1: mm: DEBUG_RODATA makes no sense with XIP_KERNEL
ARM: 8534/1: virt: fix hyp-stub build for pre-ARMv7 CPUs
ARM: make the physical-relative calculation more obvious
ARM: 8512/1: proc-v7.S: Adjust stack address when XIP_KERNEL
ARM: 8411/1: Add default SPARSEMEM settings
ARM: 8503/1: clk_register_clkdev: remove format string interface
ARM: 8529/1: remove 'i' and 'zi' targets
...
The define has a comment from Nick Piggin from 2007:
/* For backwards compat. Remove me quickly. */
I guess 9 years should not be too hurried sense of 'quickly' even for
kernel measures.
Signed-off-by: Jan Kara <jack@suse.cz>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
There are few things about *pte_alloc*() helpers worth cleaning up:
- 'vma' argument is unused, let's drop it;
- most __pte_alloc() callers do speculative check for pmd_none(),
before taking ptl: let's introduce pte_alloc() macro which does
the check.
The only direct user of __pte_alloc left is userfaultfd, which has
different expectation about atomicity wrt pmd.
- pte_alloc_map() and pte_alloc_map_lock() are redefined using
pte_alloc().
[sudeep.holla@arm.com: fix build for arm64 hugetlbpage]
[sfr@canb.auug.org.au: fix arch/arm/mm/mmu.c some more]
Signed-off-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
Cc: Dave Hansen <dave.hansen@intel.com>
Signed-off-by: Sudeep Holla <sudeep.holla@arm.com>
Acked-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
Signed-off-by: Stephen Rothwell <sfr@canb.auug.org.au>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Pull ARM fixes from Russell King:
"Just two ARM fixes this time: one to fix the hyp-stub for older ARM
CPUs, and another to fix the set_memory_xx() permission functions to
deal with zero sizes correctly"
* 'fixes' of git://ftp.arm.linux.org.uk/~rmk/linux-arm:
ARM: 8544/1: set_memory_xx fixes
ARM: 8534/1: virt: fix hyp-stub build for pre-ARMv7 CPUs
Given a device which uses arm_coherent_dma_ops and on which
dev_get_cma_area(dev) returns non-NULL, the following usage of the DMA
API with gfp=0 results in memory corruption and a memory leak.
p = dma_alloc_coherent(dev, sz, &dma, 0);
if (p)
dma_free_coherent(dev, sz, p, dma);
The memory leak is because the alloc allocates using
__alloc_simple_buffer() but the free attempts
dma_release_from_contiguous() which does not do free anything since the
page is not in the CMA area.
The memory corruption is because the free calls __dma_remap() on a page
which is backed by only first level page tables. The
apply_to_page_range() + __dma_update_pte() loop ends up interpreting the
section mapping as an addresses to a second level page table and writing
the new PTE to memory which is not used by page tables.
We don't have access to the GFP flags used for allocation in the free
function. Fix this by adding allocator backends and using this
information in the free function so that we always use the correct
release routine.
Fixes: 21caf3a7 ("ARM: 8398/1: arm DMA: Fix allocation from CMA for coherent DMA")
Signed-off-by: Rabin Vincent <rabin.vincent@axis.com>
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
Keep a list of allocated DMA buffers so that we can store metadata in
alloc() which we later need in free().
Signed-off-by: Rabin Vincent <rabin.vincent@axis.com>
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
Allow zero size updates. This makes set_memory_xx() consistent with x86, s390 and arm64 and makes apply_to_page_range() not to BUG() when loading modules.
Signed-off-by: Mika Penttilä mika.penttila@nextfour.com
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
Replace calls to get_random_int() followed by a cast to (unsigned long)
with calls to get_random_long(). Also address shifting bug which, in
case of x86 removed entropy mask for mmap_rnd_bits values > 31 bits.
Signed-off-by: Daniel Cashman <dcashman@android.com>
Acked-by: Kees Cook <keescook@chromium.org>
Cc: "Theodore Ts'o" <tytso@mit.edu>
Cc: Arnd Bergmann <arnd@arndb.de>
Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Cc: Catalin Marinas <catalin.marinas@arm.com>
Cc: Will Deacon <will.deacon@arm.com>
Cc: Ralf Baechle <ralf@linux-mips.org>
Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Michael Ellerman <mpe@ellerman.id.au>
Cc: David S. Miller <davem@davemloft.net>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: H. Peter Anvin <hpa@zytor.com>
Cc: Al Viro <viro@zeniv.linux.org.uk>
Cc: Nick Kralevich <nnk@google.com>
Cc: Jeff Vander Stoep <jeffv@google.com>
Cc: Mark Salyzyn <salyzyn@android.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
When CONFIG_DEBUG_ALIGN_RODATA is set, we get a link error:
arch/arm/mm/built-in.o:(.data+0x4bc): undefined reference to `__start_rodata_section_aligned'
However, this combination is useless, as XIP_KERNEL implies that all the
RODATA is already marked readonly, so both CONFIG_DEBUG_RODATA and
CONFIG_DEBUG_ALIGN_RODATA (which depends on the other) are not
needed with XIP_KERNEL, and this patches enforces that using a Kconfig
dependency.
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Fixes: 25362dc496 ("ARM: 8501/1: mm: flip priority of CONFIG_DEBUG_RODATA")
Acked-by: Ard Biesheuvel <ard.biesheuvel@linaro.org>
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
The physical-relative calculation between the XIP text and data sections
introduced by the previous patch was far from obvious. Let's simplify it
by turning it into a macro which takes the two (virtual) addresses.
This allows us to arrange the calculation in a more obvious manner - we
can make it two sub-expressions which calculate the physical address for
each symbol, and then takes the difference of those physical addresses.
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
When XIP_KERNEL is enabled, the virt to phys address translation for RAM
is not the same as the virt to phys address translation for .text.
The only way to know where physical RAM is located is to use
PLAT_PHYS_OFFSET.
The MACRO will be useful for other places where there is a similar problem.
Signed-off-by: Nicolas Pitre <nico@linaro.org>
Signed-off-by: Chris Brandt <chris.brandt@renesas.com>
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
When rodata is large enough that it crosses a section boundary after the
kernel text, mark the rest NX. This is as close to full NX of rodata as
we can get without splitting page tables or doing section alignment via
CONFIG_DEBUG_ALIGN_RODATA.
When the config is:
CONFIG_DEBUG_RODATA=y
# CONFIG_DEBUG_ALIGN_RODATA is not set
Before:
---[ Kernel Mapping ]---
0x80000000-0x80100000 1M RW NX SHD
0x80100000-0x80a00000 9M ro x SHD
0x80a00000-0xa0000000 502M RW NX SHD
After:
---[ Kernel Mapping ]---
0x80000000-0x80100000 1M RW NX SHD
0x80100000-0x80700000 6M ro x SHD
0x80700000-0x80a00000 3M ro NX SHD
0x80a00000-0xa0000000 502M RW NX SHD
Signed-off-by: Kees Cook <keescook@chromium.org>
Reviewed-by: Ard Biesheuvel <ard.biesheuvel@linaro.org>
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
For an XIP build, _etext does not represent the end of the
binary image that needs to stay mapped into the MODULES_VADDR area.
Years ago, data came before text in the memory map. However,
now that the order is text/init/data, an XIP_KERNEL needs to map
up to the data location in order to keep from cutting off
parts of the kernel that are needed.
We only map up to the beginning of data because data has already been
copied, so there's no reason to keep it around anymore.
A new symbol is created to make it clear what it is we are referring
to.
This fixes the bug where you might lose the end of your kernel area
after page table setup is complete.
Signed-off-by: Chris Brandt <chris.brandt@renesas.com>
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
If we know that TLB efficiency will not be an issue when memory is
accessed then it's not terribly important to allocate big chunks of
memory. The whole point of allocating the big chunks was that it would
make TLB usage efficient.
As Marek Szyprowski indicated:
Please note that mapping memory with larger pages significantly
improves performance, especially when IOMMU has a little TLB
cache. This can be easily observed when multimedia devices do
processing of RGB data with 90/270 degree rotation
Image rotation is distinctly an operation that needs to bounce around
through memory, so it makes sense that TLB efficiency is important
there.
Video decoding, on the other hand, is a fairly sequential operation.
During video decoding it's not expected that we'll be jumping all over
memory. Decoding video is also pretty heavy and the TLB misses aren't a
huge deal. Presumably most HW video acceleration users of dma-mapping
will not care about huge pages and will set DMA_ATTR_ALLOC_SINGLE_PAGES.
Allocating big chunks of memory is quite expensive, especially if we're
doing it repeadly and memory is full. In one (out of tree) usage model
it is common that arm_iommu_alloc_attrs() is called 16 times in a row,
each one trying to allocate 4 MB of memory. This is called whenever the
system encounters a new video, which could easily happen while the
memory system is stressed out. In fact, on certain social media
websites that auto-play video and have infinite scrolling, it's quite
common to see not just one of these 16x4MB allocations but 2 or 3 right
after another. Asking the system even to do a small amount of extra
work to give us big chunks in this case is just not a good use of time.
Allocating big chunks of memory is also expensive indirectly. Even if
we ask the system not to do ANY extra work to allocate _our_ memory,
we're still potentially eating up all big chunks in the system.
Presumably there are other users in the system that aren't quite as
flexible and that actually need these big chunks. By eating all the big
chunks we're causing extra work for the rest of the system. We also may
start making other memory allocations fail. While the system may be
robust to such failures (as is the case with dwc2 USB trying to allocate
buffers for Ethernet data and with WiFi trying to allocate buffers for
WiFi data), it is yet another big performance hit.
Signed-off-by: Douglas Anderson <dianders@chromium.org>
Acked-by: Marek Szyprowski <m.szyprowski@samsung.com>
Tested-by: Javier Martinez Canillas <javier@osg.samsung.com>
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
The __iommu_alloc_buffer() is expected to be called to allocate pretty
sizeable buffers. Upon simple tests of video I saw it trying to
allocate 4,194,304 bytes. The function tries to allocate large chunks
in order to optimize IOMMU TLB usage.
The current function is very, very slow.
One problem is the way it keeps trying and trying to allocate big
chunks. Imagine a very fragmented memory that has 4M free but no
contiguous pages at all. Further imagine allocating 4M (1024 pages).
We'll do the following memory allocations:
- For page 1:
- Try to allocate order 10 (no retry)
- Try to allocate order 9 (no retry)
- ...
- Try to allocate order 0 (with retry, but not needed)
- For page 2:
- Try to allocate order 9 (no retry)
- Try to allocate order 8 (no retry)
- ...
- Try to allocate order 0 (with retry, but not needed)
- ...
- ...
Total number of calls to alloc() calls for this case is:
sum(int(math.log(i, 2)) + 1 for i in range(1, 1025))
=> 9228
The above is obviously worse case, but given how slow alloc can be we
really want to try to avoid even somewhat bad cases. I timed the old
code with a device under memory pressure and it wasn't hard to see it
take more than 120 seconds to allocate 4 megs of memory! (NOTE: testing
was done on kernel 3.14, so possibly mainline would behave
differently).
A second problem is that allocating big chunks under memory pressure
when we don't need them is just not a great idea anyway unless we really
need them. We can make due pretty well with smaller chunks so it's
probably wise to leave bigger chunks for other users once memory
pressure is on.
Let's adjust the allocation like this:
1. If a big chunk fails, stop trying to hard and bump down to lower
order allocations.
2. Don't try useless orders. The whole point of big chunks is to
optimize the TLB and it can really only make use of 2M, 1M, 64K and
4K sizes.
We'll still tend to eat up a bunch of big chunks, but that might be the
right answer for some users. A future patch could possibly add a new
DMA_ATTR that would let the caller decide that TLB optimization isn't
important and that we should use smaller chunks. Presumably this would
be a sane strategy for some callers.
Signed-off-by: Douglas Anderson <dianders@chromium.org>
Acked-by: Marek Szyprowski <m.szyprowski@samsung.com>
Reviewed-by: Robin Murphy <robin.murphy@arm.com>
Reviewed-by: Tomasz Figa <tfiga@chromium.org>
Tested-by: Javier Martinez Canillas <javier@osg.samsung.com>
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
The use of CONFIG_DEBUG_RODATA is generally seen as an essential part of
kernel self-protection:
http://www.openwall.com/lists/kernel-hardening/2015/11/30/13
Additionally, its name has grown to mean things beyond just rodata. To
get ARM closer to this, we ought to rearrange the names of the configs
that control how the kernel protects its memory. What was called
CONFIG_ARM_KERNMEM_PERMS is realy doing the work that other architectures
call CONFIG_DEBUG_RODATA.
This redefines CONFIG_DEBUG_RODATA to actually do the bulk of the
ROing (and NXing). In the place of the old CONFIG_DEBUG_RODATA, use
CONFIG_DEBUG_ALIGN_RODATA, since that's what the option does: adds
section alignment for making rodata explicitly NX, as arm does not split
the page tables like arm64 does without _ALIGN_RODATA.
Also adds human readable names to the sections so I could more easily
debug my typos, and makes CONFIG_DEBUG_RODATA default "y" for CPU_V7.
Results in /sys/kernel/debug/kernel_page_tables for each config state:
# CONFIG_DEBUG_RODATA is not set
# CONFIG_DEBUG_ALIGN_RODATA is not set
---[ Kernel Mapping ]---
0x80000000-0x80900000 9M RW x SHD
0x80900000-0xa0000000 503M RW NX SHD
CONFIG_DEBUG_RODATA=y
CONFIG_DEBUG_ALIGN_RODATA=y
---[ Kernel Mapping ]---
0x80000000-0x80100000 1M RW NX SHD
0x80100000-0x80700000 6M ro x SHD
0x80700000-0x80a00000 3M ro NX SHD
0x80a00000-0xa0000000 502M RW NX SHD
CONFIG_DEBUG_RODATA=y
# CONFIG_DEBUG_ALIGN_RODATA is not set
---[ Kernel Mapping ]---
0x80000000-0x80100000 1M RW NX SHD
0x80100000-0x80a00000 9M ro x SHD
0x80a00000-0xa0000000 502M RW NX SHD
Signed-off-by: Kees Cook <keescook@chromium.org>
Reviewed-by: Laura Abbott <labbott@fedoraproject.org>
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
Make virt_to_idmap() return an unsigned long rather than phys_addr_t.
Returning phys_addr_t here makes no sense, because the definition of
virt_to_idmap() is that it shall return a physical address which maps
identically with the virtual address. Since virtual addresses are
limited to 32-bit, identity mapped physical addresses are as well.
Almost all users already had an implicit narrowing cast to unsigned long
so let's make this official and part of this interface.
Tested-by: Grygorii Strashko <grygorii.strashko@ti.com>
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
There are many locations that do
if (memory_was_allocated_by_vmalloc)
vfree(ptr);
else
kfree(ptr);
but kvfree() can handle both kmalloc()ed memory and vmalloc()ed memory
using is_vmalloc_addr(). Unless callers have special reasons, we can
replace this branch with kvfree(). Please check and reply if you found
problems.
Signed-off-by: Tetsuo Handa <penguin-kernel@I-love.SAKURA.ne.jp>
Acked-by: Michal Hocko <mhocko@suse.com>
Acked-by: Jan Kara <jack@suse.com>
Acked-by: Russell King <rmk+kernel@arm.linux.org.uk>
Reviewed-by: Andreas Dilger <andreas.dilger@intel.com>
Acked-by: "Rafael J. Wysocki" <rjw@rjwysocki.net>
Acked-by: David Rientjes <rientjes@google.com>
Cc: "Luck, Tony" <tony.luck@intel.com>
Cc: Oleg Drokin <oleg.drokin@intel.com>
Cc: Boris Petkov <bp@suse.de>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
This branch is the culmination of 5 years of effort to bring the ARMv6
and ARMv7 platforms together such that they can all be enabled and
boot the same kernel. It has been a tremendous amount of cleanup and
refactoring by a huge number of people, and creation of several new
(and major) subsystems to better abstract out all the platform details
in an appropriate manner.
The bulk of this branch is a large patchset from Arnd that brings several
of the more minor and older platforms we have closer to multiplatform
support. Among these are MMP, S3C64xx, Orion5x, mv78xx0 and realview
Much of this is moving around header files from old mach directories,
but there are also some cleanup patches of debug_ll (lowlevel debug
per-platform options) and other parts.
Linus Walleij also has some patchs to clean up the older ARM Realview
platforms by finally introducing DT support, and Rob Herring has some
for ARM Versatile which is now DT-only. Both of these platforms are
now multiplatform.
Finally, a couple of patches from Russell for Dove PMU, and a fix from
Valentin Rothberg for Exynos ADC, which were rebased on top of the
series to avoid conflicts.
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1
iQIUAwUAVqAGcmCrR//JCVInAQLDog/4x9F0PHGmZhexGfFOpi2Od63Jjx55izRU
zRXqRjjFjambOrZuOx8lEGDy/qzqKbsDU8D1P4IUugkDr2bLSXv+NTLZL1kNBIdm
YOlJhw/BmzLYqauOHmBzGhtv1FDUk3rqbgTsP5tTWj5LpSkwjmqui3HBZpi+f3Rr
YOn+NeQSARiw+51D0b106a9RFshQXRGgn5m3xFjLWhJqshb2z2Ew5cogX/zdwrrM
ss1BFomxsvgk6S+snN6v7cEX2iXe3r89qNR5jEW5BgNpQGFsAUeXPr9zzH07L/Qq
O7XLw9jt5MX/X5372zVHPb57WoflLbF9cFaaDUZV3eTqt3lC67BTxOtYIdC2i90k
E5GYlsy88CRwT2EO+ok/6UTryph+hVv7JqHfbKfnISrbraMCK36DtDTpBIpZ9uYF
rRB7ncJZUWBcyoe+qvitSl+2KV54iB1ez2RXsketxM98dDZsfB2M2ImFou1F/Pgg
ALvpifPubi/uDe7xNUsSuaT6/3jAomBuNsxnkYJ3NeiH/+duZbOYGkzK/LlcjZyc
UrA0IpLfwIFsBNzwfpZPZ1lkEu8Y1YZZ+Hv9k65q1wMuBDgrFI5zUeYrPZi4pN9T
Yo1xP9FstVLDouJrpGZo12VIIxR1UBeGqfRI/BZ58LEF3PRq/g2OVFsdQia5gZKr
ddiJKSL1Vw==
=z1AW
-----END PGP SIGNATURE-----
Merge tag 'armsoc-multiplatform' of git://git.kernel.org/pub/scm/linux/kernel/git/arm/arm-soc
Pull ARM SoC multiplatform code updates from Arnd Bergmann:
"This branch is the culmination of 5 years of effort to bring the ARMv6
and ARMv7 platforms together such that they can all be enabled and
boot the same kernel. It has been a tremendous amount of cleanup and
refactoring by a huge number of people, and creation of several new
(and major) subsystems to better abstract out all the platform details
in an appropriate manner.
The bulk of this branch is a large patchset from Arnd that brings
several of the more minor and older platforms we have closer to
multiplatform support. Among these are MMP, S3C64xx, Orion5x, mv78xx0
and realview Much of this is moving around header files from old mach
directories, but there are also some cleanup patches of debug_ll
(lowlevel debug per-platform options) and other parts.
Linus Walleij also has some patchs to clean up the older ARM Realview
platforms by finally introducing DT support, and Rob Herring has some
for ARM Versatile which is now DT-only. Both of these platforms are
now multiplatform.
Finally, a couple of patches from Russell for Dove PMU, and a fix from
Valentin Rothberg for Exynos ADC, which were rebased on top of the
series to avoid conflicts"
* tag 'armsoc-multiplatform' of git://git.kernel.org/pub/scm/linux/kernel/git/arm/arm-soc: (75 commits)
ARM: realview: don't select SMP_ON_UP for UP builds
ARM: s3c: simplify s3c_irqwake_{e,}intallow definition
ARM: s3c64xx: fix pm-debug compilation
iio: exynos-adc: fix irqf_oneshot.cocci warnings
ARM: realview: build realview-dt SMP support only when used
ARM: realview: select apropriate targets
ARM: realview: clean up header files
ARM: realview: make all header files local
ARM: no longer make CPU targets visible separately
ARM: integrator: use explicit core module options
ARM: realview: enable multiplatform
ARM: make default platform work for NOMMU
ARM: debug-ll: move DEBUG_LL_UART_EFM32 to correct Kconfig location
ARM: defconfig: use correct debug_ll settings
ARM: versatile: convert to multi-platform
ARM: versatile: merge mach code into a single file
ARM: versatile: switch to DT only booting and remove legacy code
ARM: versatile: add DT based PCI detection
ARM: pxa: mark ezx structures as __maybe_unused
ARM: pxa: mark raumfeld init functions as __maybe_unused
...
Let's define page_mapped() to be true for compound pages if any
sub-pages of the compound page is mapped (with PMD or PTE).
On other hand page_mapcount() return mapcount for this particular small
page.
This will make cases like page_get_anon_vma() behave correctly once we
allow huge pages to be mapped with PTE.
Most users outside core-mm should use page_mapcount() instead of
page_mapped().
Signed-off-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
Tested-by: Sasha Levin <sasha.levin@oracle.com>
Tested-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com>
Acked-by: Jerome Marchand <jmarchan@redhat.com>
Cc: Vlastimil Babka <vbabka@suse.cz>
Cc: Andrea Arcangeli <aarcange@redhat.com>
Cc: Hugh Dickins <hughd@google.com>
Cc: Dave Hansen <dave.hansen@intel.com>
Cc: Mel Gorman <mgorman@suse.de>
Cc: Rik van Riel <riel@redhat.com>
Cc: Naoya Horiguchi <n-horiguchi@ah.jp.nec.com>
Cc: Steve Capper <steve.capper@linaro.org>
Cc: Johannes Weiner <hannes@cmpxchg.org>
Cc: Michal Hocko <mhocko@suse.cz>
Cc: Christoph Lameter <cl@linux.com>
Cc: David Rientjes <rientjes@google.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
With new refcounting we don't need to mark PMDs splitting. Let's drop
code to handle this.
pmdp_splitting_flush() is not needed too: on splitting PMD we will do
pmdp_clear_flush() + set_pte_at(). pmdp_clear_flush() will do IPI as
needed for fast_gup.
[arnd@arndb.de: fix unterminated ifdef in header file]
Signed-off-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
Cc: Sasha Levin <sasha.levin@oracle.com>
Cc: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com>
Cc: Jerome Marchand <jmarchan@redhat.com>
Cc: Vlastimil Babka <vbabka@suse.cz>
Cc: Andrea Arcangeli <aarcange@redhat.com>
Cc: Hugh Dickins <hughd@google.com>
Cc: Dave Hansen <dave.hansen@intel.com>
Cc: Mel Gorman <mgorman@suse.de>
Cc: Rik van Riel <riel@redhat.com>
Cc: Vlastimil Babka <vbabka@suse.cz>
Cc: Steve Capper <steve.capper@linaro.org>
Cc: Johannes Weiner <hannes@cmpxchg.org>
Cc: Michal Hocko <mhocko@suse.cz>
Cc: Christoph Lameter <cl@linux.com>
Cc: David Rientjes <rientjes@google.com>
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
arm: arch_mmap_rnd() uses a hard-code value of 8 to generate the random
offset for the mmap base address. This value represents a compromise
between increased ASLR effectiveness and avoiding address-space
fragmentation. Replace it with a Kconfig option, which is sensibly
bounded, so that platform developers may choose where to place this
compromise. Keep 8 as the minimum acceptable value.
[arnd@arndb.de: ARM: avoid ARCH_MMAP_RND_BITS for NOMMU]
Signed-off-by: Daniel Cashman <dcashman@google.com>
Cc: Russell King <linux@arm.linux.org.uk>
Acked-by: Kees Cook <keescook@chromium.org>
Cc: Ingo Molnar <mingo@kernel.org>
Cc: Jonathan Corbet <corbet@lwn.net>
Cc: Don Zickus <dzickus@redhat.com>
Cc: Eric W. Biederman <ebiederm@xmission.com>
Cc: Heinrich Schuchardt <xypron.glpk@gmx.de>
Cc: Josh Poimboeuf <jpoimboe@redhat.com>
Cc: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
Cc: Naoya Horiguchi <n-horiguchi@ah.jp.nec.com>
Cc: Andrea Arcangeli <aarcange@redhat.com>
Cc: Mel Gorman <mgorman@suse.de>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: David Rientjes <rientjes@google.com>
Cc: Mark Salyzyn <salyzyn@android.com>
Cc: Jeff Vander Stoep <jeffv@google.com>
Cc: Nick Kralevich <nnk@google.com>
Cc: Catalin Marinas <catalin.marinas@arm.com>
Cc: Will Deacon <will.deacon@arm.com>
Cc: "H. Peter Anvin" <hpa@zytor.com>
Cc: Hector Marco-Gisbert <hecmargi@upv.es>
Cc: Borislav Petkov <bp@suse.de>
Cc: Ralf Baechle <ralf@linux-mips.org>
Cc: Heiko Carstens <heiko.carstens@de.ibm.com>
Cc: Martin Schwidefsky <schwidefsky@de.ibm.com>
Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
The VMSA field of MMFR0 (bottom 4 bits) is incremented for each
added feature. PXN is supported if the value is >= 4 and LPAE
is supported if it is >= 5.
In case a kernel with CONFIG_ARM_LPAE disabled is used on a
processor that supports LPAE, we can still use PXN in short
descriptors. So check for >= 4 not == 4.
Signed-off-by: Jungseung Lee <js07.lee@samsung.com>
Acked-by: Catalin Marinas <catalin.marinas@arm.com>
Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
According to commit 2503a5ecd8
"ARM: 6201/1: RealView: Do not use outer_sync() on ARM11MPCore
boards with L220" Some PB11MPCore RealView core tiles have broken
outer_sync.
We got rid of the custom barriers from the machine by disabling
outer sync, but that was just for the boardfile case. We have
to be able to do the same in the device tree case.
Since __l2c_init() is cloning and copying the L2C vtable,
we pass an argument to this function to optionally numb
the outer sync operation if desired, before initializing
the cache.
After this we can set up the cache correctly on the RealView
PB11MPCore. This was tested on a PB11MPCore known to have the
issue. Before this, spurious crashes would occur if we try to
set up the cache properly, after this it boots rock solid.
Cc: Arnd Bergmann <arnd@arndb.de>
Cc: devicetree@vger.kernel.org
Acked-by: Rob Herring <robh@kernel.org>
Signed-off-by: Linus Walleij <linus.walleij@linaro.org>
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
- Tested on the ARM PB11MPCore
- Tested with boardfile boot
- Tested with DeviceTree boot
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1
iQIcBAABAgAGBQJWdAsTAAoJEEEQszewGV1zMtsP/1So1gjBSf8NfLbRCndf4l3/
jrviXeGEyGAzvyXvH+KB/24R10NpJMC5zEB8LiUP1Ig/cmNA+XVIJ9IkE8QsjsDx
dA3aqYHTsXr8w8l6hzVX95GvAsxvg7hQ73rMsSg1x6r6pNeg859M4ZJis5gfB7Pf
w76Qazl4jJUE79t768mxOKf46Af6D3y/Oa8dpsNW7w3ehzeEODI4N5BvUyZ50vKT
j8ZdBs++S2ucW1s1zUTeo0wDjaOBK8gNmYKzDtwScCNs5K6l4GcggO0pkrX7MmN/
quNHcvULU0qnzETMjss6wb4xCAenZ98hqHTnBL2NjHd9A1tvTEGBzRfFx4jb5EI6
MspsOfxCLpj59QV7qwQJoYbFdzmQo6YgPn+lZoHA5wzxNqt2g0xiBscPYCVO30WT
mYEF27dUDYzq8EH/BkQ/0dsPcWSNZVWpTYj/14+L766maFsPHJYAVnRk1iRNK9Gq
uxdu9ZH8VusqFLFkikbNkj9xeFc9M0GN2F5AY0+Li24uzqjClessdXdfiOqQAWZ5
px0ET819Zz+HG0F8+YNKiLL3Ybnhz8EB4NrncXwSJPrvOGt+xATqSUycqrxwwMsu
a/kNRaZMRZ7jKdZPPHPESPXdJZKs0vGDThrEFMqKVTFtXfoGNQKdh/Ltixy+rPCH
CX8QrAZPi7CqRUEMXmFp
=tfk1
-----END PGP SIGNATURE-----
Merge tag 'realview-multiplatform-tag' of git://git.kernel.org/pub/scm/linux/kernel/git/linusw/linux-integrator into next/multiplatform
Pull "Multiplatform support for the RealView" from Linus Walleij:
Here is the result of my application of the second part of Arnds
patchset, actually enabling multiplatform and getting the RealView
off the ground as a multiplatform target.
It is dependent on an outstanding patch to the irqchips tree bumping
the number of GICs to 2 for the RealView platform. I cannot say I will
be sleepless if these go in side by side: each branch will compile but
will not boot until both trees have been pulled hurting bisectability a
bit.
- Tested on the ARM PB11MPCore
- Tested with boardfile boot
- Tested with DeviceTree boot
* tag 'realview-multiplatform-tag' of git://git.kernel.org/pub/scm/linux/kernel/git/linusw/linux-integrator:
ARM: realview: select apropriate targets
ARM: realview: clean up header files
ARM: realview: make all header files local
ARM: no longer make CPU targets visible separately
ARM: integrator: use explicit core module options
ARM: realview: enable multiplatform
Now that realview and integrator always select the correct CPU
type themselves based on the core tiles, there is no need to
still have them user-visible in arch/arm/mm/Kconfig. The
ARM925T symbol has been selected by the only user for many
years, so that can be removed along with the realview and
integrator specific ones.
This also solves randconfig build problems on realview.
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Cc: Russell King <linux@arm.linux.org.uk>
Signed-off-by: Linus Walleij <linus.walleij@linaro.org>
Pull libnvdimm fixes from Dan Williams:
- Two bug fixes for misuse of PAGE_MASK in scatterlist and dma-debug.
These are tagged for -stable. The scatterlist impact is potentially
corrupted dma addresses on HIGHMEM enabled platforms.
- A minor locking fix for the NFIT hot-add implementation that is new
in 4.4-rc. This would only trigger in the case a hot-add raced
driver removal.
* 'libnvdimm-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/nvdimm/nvdimm:
dma-debug: Fix dma_debug_entry offset calculation
Revert "scatterlist: use sg_phys()"
nfit: acpi_nfit_notify(): Do not leave device locked
The proc-v7.S code uses a small temporary stack to preserve register
content in its setup code. This stack is located in the .text section
which is normally meant to be read-only.
Move that temporary stack to the .bss section and get its address in
a position independent way, similarly to what we do in other parts
of the kernel.
While at it, one comments was updated to reflect reality, and the list
of saved registers in the proc-v7.S case is updated to match the comment
next to it for coherency.
Signed-off-by: Nicolas Pitre <nico@linaro.org>
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
multiplatform and extended DT support.
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1
iQIcBAABAgAGBQJWb9LqAAoJEEEQszewGV1z1I4QAIgylTG1hftD5xCtaKsgJv5X
pcp+eVVCeYjeO3AbknrXzBlty0u3/rDOR6n9aUn7ci63qErGi2GeoZ5glo6y3aOU
qKo2/M0LOoP3y6SGxMjaPTTpStjKsaj2XLjHLNrSHAKXsvoFB69vnAlQh2jU+ohX
JEl9wvkugMWicGaiooWQfG7OAf2Gb6AFAKQUfYNVNNXBTD13oQcFRgLwBKDkNlc9
7N6yzPQDNQiauytr7Ji/49fbkiOLFSB5yllhecb37F/b56XprGvsXJTRwsQhPwbj
ig28qu/g7LhfnkZUOTwhWH6WdyFarMlpA8oHHKrZBeGySgvdjXBVYH4IQAfhT2N9
WSh/he+w8T2+oMVj97gLSxKiP4ugUSsBR6daq5X8NESobxFVYzmFIYQxKxOwFif4
JDWvOKaQsRhx42iAq3CIMkh7yQsBC4tJjtyvVwHuGeC2zRlyODbBCnN4OGKDdYpJ
+VY4kr48Yld5IxYm6J6fXOEWTcFw2n/hvEVsik5mmgw1rYivJsCcvcVxv04xZYCl
6d13eCaSh2TfeKYs/Qf2yy3hmps4dWFcd18/xWRyFdnkCs5qGRbtgLiAQtUnQAGm
bpaI/rYeRnnF80af2dhpZT2i0zBL9uuur7FYZDB9xsQDOkqnCYxKvSW1CfNg5Gtc
senI3dczeTC3NtwRp4eC
=Zhx3
-----END PGP SIGNATURE-----
Merge tag 'realview-base-armsoc-1-tag' of git://git.kernel.org/pub/scm/linux/kernel/git/linusw/linux-integrator into next/multiplatform
Merge "Realview multiplatform support" from Linus Walleij:
The board and infrastructure changes for RealView
multiplatform and extended DT support.
* tag 'realview-base-armsoc-1-tag' of git://git.kernel.org/pub/scm/linux/kernel/git/linusw/linux-integrator:
ARM: realview: add an DT SMP boot method
ARM: realview: select SP810 and ICST for the DT variant
soc: versatile: add support for the PB11MPCore
clk: versatile-icst: add device tree support
clk: versatile-icst: refactor to allocate regmap separately
clk: versatile-icst: convert to use regmap
ARM: realview: remove private barrier implementation
ARM: no longer force unbuffered DMA for realview
clk/realview: stop using machine headers
ARM: realview: don't map undefined PCI registers
ARM: realview: remove sparsemem hack
Conflicts:
drivers/clk/versatile/Kconfig
commit db0fa0cb01 "scatterlist: use sg_phys()" did replacements of
the form:
phys_addr_t phys = page_to_phys(sg_page(s));
phys_addr_t phys = sg_phys(s) & PAGE_MASK;
However, this breaks platforms where sizeof(phys_addr_t) >
sizeof(unsigned long). Revert for 4.3 and 4.4 to make room for a
combined helper in 4.5.
Cc: <stable@vger.kernel.org>
Cc: Jens Axboe <axboe@fb.com>
Cc: Christoph Hellwig <hch@lst.de>
Cc: Russell King <linux@arm.linux.org.uk>
Cc: David Woodhouse <dwmw2@infradead.org>
Cc: Andrew Morton <akpm@linux-foundation.org>
Fixes: db0fa0cb01 ("scatterlist: use sg_phys()")
Suggested-by: Joerg Roedel <joro@8bytes.org>
Reported-by: Vitaly Lavrov <vel21ripn@gmail.com>
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
In cpu_v7_do_suspend routine, r11 is used while it is NOT
saved/restored, different compiler may have different usage
of ARM general registers, so it may cause issues during
calling cpu_v7_do_suspend.
We meet kernel fault occurs when using GCC 4.8.3, r11 contains
valid value before calling into cpu_v7_do_suspend, but when returned
from this routine, r11 is corrupted and lead to kernel fault.
Doing save/restore for those corrupted registers is a must in
assemble code.
Signed-off-by: Anson Huang <Anson.Huang@freescale.com>
Reviewed-by: Nicolas Pitre <nico@linaro.org>
Cc: <stable@vger.kernel.org> # v3.3+
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
Commit 42c4dafe80 ("ARM: 6202/1: Do not ARM_DMA_MEM_BUFFERABLE
on RealView boards with L210/L220") changed the generic setting for
ARM_DMA_MEM_BUFFERABLE to be disabled on any Realview kernel that includes
support for any of the ARM11 variations. Doing this was required to
allow doing DMA without a lockup in the l2x0 cache controller on the
Realview platform.
Unfortunately, in a kernel that also contains support for any ARMv7
based machine, the same change makes it impossible to do DMA on ARMv7,
which gets in the way of enabling multiplatform support on Realview.
As confirmed by Catalin Marinas and Linus Walleij, the current
code for Realview that we have in the kernel does not actually
perform any DMA, and this is unlikely to change in the future.
Therefore we can revert 42c4dafe80 without introducing regressions,
but we must never start using DMA on this platform in the future.
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Cc: Russell King <linux@arm.linux.org.uk>
Signed-off-by: Linus Walleij <linus.walleij@linaro.org>