linux_dsm_epyc7002

mirror of https://github.com/AuxXxilium/linux_dsm_epyc7002.git synced 2024-12-27 12:37:59 +07:00

Author	SHA1	Message	Date
Rabin Vincent	19e6e5e539	ARM: 8547/1: dma-mapping: store buffer information Keep a list of allocated DMA buffers so that we can store metadata in alloc() which we later need in free(). Signed-off-by: Rabin Vincent <rabin.vincent@axis.com> Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>	2016-03-04 23:35:17 +00:00
Doug Anderson	14d3ae2efe	ARM: 8507/1: dma-mapping: Use DMA_ATTR_ALLOC_SINGLE_PAGES hint to optimize alloc If we know that TLB efficiency will not be an issue when memory is accessed then it's not terribly important to allocate big chunks of memory. The whole point of allocating the big chunks was that it would make TLB usage efficient. As Marek Szyprowski indicated: Please note that mapping memory with larger pages significantly improves performance, especially when IOMMU has a little TLB cache. This can be easily observed when multimedia devices do processing of RGB data with 90/270 degree rotation Image rotation is distinctly an operation that needs to bounce around through memory, so it makes sense that TLB efficiency is important there. Video decoding, on the other hand, is a fairly sequential operation. During video decoding it's not expected that we'll be jumping all over memory. Decoding video is also pretty heavy and the TLB misses aren't a huge deal. Presumably most HW video acceleration users of dma-mapping will not care about huge pages and will set DMA_ATTR_ALLOC_SINGLE_PAGES. Allocating big chunks of memory is quite expensive, especially if we're doing it repeadly and memory is full. In one (out of tree) usage model it is common that arm_iommu_alloc_attrs() is called 16 times in a row, each one trying to allocate 4 MB of memory. This is called whenever the system encounters a new video, which could easily happen while the memory system is stressed out. In fact, on certain social media websites that auto-play video and have infinite scrolling, it's quite common to see not just one of these 16x4MB allocations but 2 or 3 right after another. Asking the system even to do a small amount of extra work to give us big chunks in this case is just not a good use of time. Allocating big chunks of memory is also expensive indirectly. Even if we ask the system not to do ANY extra work to allocate _our_ memory, we're still potentially eating up all big chunks in the system. Presumably there are other users in the system that aren't quite as flexible and that actually need these big chunks. By eating all the big chunks we're causing extra work for the rest of the system. We also may start making other memory allocations fail. While the system may be robust to such failures (as is the case with dwc2 USB trying to allocate buffers for Ethernet data and with WiFi trying to allocate buffers for WiFi data), it is yet another big performance hit. Signed-off-by: Douglas Anderson <dianders@chromium.org> Acked-by: Marek Szyprowski <m.szyprowski@samsung.com> Tested-by: Javier Martinez Canillas <javier@osg.samsung.com> Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>	2016-02-11 15:33:38 +00:00
Doug Anderson	33298ef6d8	ARM: 8505/1: dma-mapping: Optimize allocation The __iommu_alloc_buffer() is expected to be called to allocate pretty sizeable buffers. Upon simple tests of video I saw it trying to allocate 4,194,304 bytes. The function tries to allocate large chunks in order to optimize IOMMU TLB usage. The current function is very, very slow. One problem is the way it keeps trying and trying to allocate big chunks. Imagine a very fragmented memory that has 4M free but no contiguous pages at all. Further imagine allocating 4M (1024 pages). We'll do the following memory allocations: - For page 1: - Try to allocate order 10 (no retry) - Try to allocate order 9 (no retry) - ... - Try to allocate order 0 (with retry, but not needed) - For page 2: - Try to allocate order 9 (no retry) - Try to allocate order 8 (no retry) - ... - Try to allocate order 0 (with retry, but not needed) - ... - ... Total number of calls to alloc() calls for this case is: sum(int(math.log(i, 2)) + 1 for i in range(1, 1025)) => 9228 The above is obviously worse case, but given how slow alloc can be we really want to try to avoid even somewhat bad cases. I timed the old code with a device under memory pressure and it wasn't hard to see it take more than 120 seconds to allocate 4 megs of memory! (NOTE: testing was done on kernel 3.14, so possibly mainline would behave differently). A second problem is that allocating big chunks under memory pressure when we don't need them is just not a great idea anyway unless we really need them. We can make due pretty well with smaller chunks so it's probably wise to leave bigger chunks for other users once memory pressure is on. Let's adjust the allocation like this: 1. If a big chunk fails, stop trying to hard and bump down to lower order allocations. 2. Don't try useless orders. The whole point of big chunks is to optimize the TLB and it can really only make use of 2M, 1M, 64K and 4K sizes. We'll still tend to eat up a bunch of big chunks, but that might be the right answer for some users. A future patch could possibly add a new DMA_ATTR that would let the caller decide that TLB optimization isn't important and that we should use smaller chunks. Presumably this would be a sane strategy for some callers. Signed-off-by: Douglas Anderson <dianders@chromium.org> Acked-by: Marek Szyprowski <m.szyprowski@samsung.com> Reviewed-by: Robin Murphy <robin.murphy@arm.com> Reviewed-by: Tomasz Figa <tfiga@chromium.org> Tested-by: Javier Martinez Canillas <javier@osg.samsung.com> Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>	2016-02-11 15:33:37 +00:00
Tetsuo Handa	1d5cfdb076	tree wide: use kvfree() than conditional kfree()/vfree() There are many locations that do if (memory_was_allocated_by_vmalloc) vfree(ptr); else kfree(ptr); but kvfree() can handle both kmalloc()ed memory and vmalloc()ed memory using is_vmalloc_addr(). Unless callers have special reasons, we can replace this branch with kvfree(). Please check and reply if you found problems. Signed-off-by: Tetsuo Handa <penguin-kernel@I-love.SAKURA.ne.jp> Acked-by: Michal Hocko <mhocko@suse.com> Acked-by: Jan Kara <jack@suse.com> Acked-by: Russell King <rmk+kernel@arm.linux.org.uk> Reviewed-by: Andreas Dilger <andreas.dilger@intel.com> Acked-by: "Rafael J. Wysocki" <rjw@rjwysocki.net> Acked-by: David Rientjes <rientjes@google.com> Cc: "Luck, Tony" <tony.luck@intel.com> Cc: Oleg Drokin <oleg.drokin@intel.com> Cc: Boris Petkov <bp@suse.de> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2016-01-22 17:02:18 -08:00
Dan Williams	3e6110fd54	Revert "scatterlist: use sg_phys()" commit `db0fa0cb01` "scatterlist: use sg_phys()" did replacements of the form: phys_addr_t phys = page_to_phys(sg_page(s)); phys_addr_t phys = sg_phys(s) & PAGE_MASK; However, this breaks platforms where sizeof(phys_addr_t) > sizeof(unsigned long). Revert for 4.3 and 4.4 to make room for a combined helper in 4.5. Cc: <stable@vger.kernel.org> Cc: Jens Axboe <axboe@fb.com> Cc: Christoph Hellwig <hch@lst.de> Cc: Russell King <linux@arm.linux.org.uk> Cc: David Woodhouse <dwmw2@infradead.org> Cc: Andrew Morton <akpm@linux-foundation.org> Fixes: `db0fa0cb01` ("scatterlist: use sg_phys()") Suggested-by: Joerg Roedel <joro@8bytes.org> Reported-by: Vitaly Lavrov <vel21ripn@gmail.com> Signed-off-by: Dan Williams <dan.j.williams@intel.com>	2015-12-15 12:54:06 -08:00
Mel Gorman	d0164adc89	mm, page_alloc: distinguish between being unable to sleep, unwilling to sleep and avoiding waking kswapd __GFP_WAIT has been used to identify atomic context in callers that hold spinlocks or are in interrupts. They are expected to be high priority and have access one of two watermarks lower than "min" which can be referred to as the "atomic reserve". __GFP_HIGH users get access to the first lower watermark and can be called the "high priority reserve". Over time, callers had a requirement to not block when fallback options were available. Some have abused __GFP_WAIT leading to a situation where an optimisitic allocation with a fallback option can access atomic reserves. This patch uses __GFP_ATOMIC to identify callers that are truely atomic, cannot sleep and have no alternative. High priority users continue to use __GFP_HIGH. __GFP_DIRECT_RECLAIM identifies callers that can sleep and are willing to enter direct reclaim. __GFP_KSWAPD_RECLAIM to identify callers that want to wake kswapd for background reclaim. __GFP_WAIT is redefined as a caller that is willing to enter direct reclaim and wake kswapd for background reclaim. This patch then converts a number of sites o __GFP_ATOMIC is used by callers that are high priority and have memory pools for those requests. GFP_ATOMIC uses this flag. o Callers that have a limited mempool to guarantee forward progress clear __GFP_DIRECT_RECLAIM but keep __GFP_KSWAPD_RECLAIM. bio allocations fall into this category where kswapd will still be woken but atomic reserves are not used as there is a one-entry mempool to guarantee progress. o Callers that are checking if they are non-blocking should use the helper gfpflags_allow_blocking() where possible. This is because checking for __GFP_WAIT as was done historically now can trigger false positives. Some exceptions like dm-crypt.c exist where the code intent is clearer if __GFP_DIRECT_RECLAIM is used instead of the helper due to flag manipulations. o Callers that built their own GFP flags instead of starting with GFP_KERNEL and friends now also need to specify __GFP_KSWAPD_RECLAIM. The first key hazard to watch out for is callers that removed __GFP_WAIT and was depending on access to atomic reserves for inconspicuous reasons. In some cases it may be appropriate for them to use __GFP_HIGH. The second key hazard is callers that assembled their own combination of GFP flags instead of starting with something like GFP_KERNEL. They may now wish to specify __GFP_KSWAPD_RECLAIM. It's almost certainly harmless if it's missed in most cases as other activity will wake kswapd. Signed-off-by: Mel Gorman <mgorman@techsingularity.net> Acked-by: Vlastimil Babka <vbabka@suse.cz> Acked-by: Michal Hocko <mhocko@suse.com> Acked-by: Johannes Weiner <hannes@cmpxchg.org> Cc: Christoph Lameter <cl@linux.com> Cc: David Rientjes <rientjes@google.com> Cc: Vitaly Wool <vitalywool@gmail.com> Cc: Rik van Riel <riel@redhat.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2015-11-06 17:50:42 -08:00
Marek Szyprowski	7e31210349	ARM: 8427/1: dma-mapping: add support for offset parameter in dma_mmap() IOMMU-based dma_mmap() implementation lacked proper support for offset parameter used in mmap call (it always assumed that mapping starts from offset zero). This patch adds support for offset parameter to IOMMU-based implementation. Signed-off-by: Marek Szyprowski <m.szyprowski@samsung.com> Cc: stable@vger.kernel.org # v3.6+ Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>	2015-10-03 16:36:45 +01:00
Marek Szyprowski	371f0f085f	ARM: 8426/1: dma-mapping: add missing range check in dma_mmap() dma_mmap() function in IOMMU-based dma-mapping implementation lacked a check for valid range of mmap parameters (offset and buffer size), what might have caused access beyond the allocated buffer. This patch fixes this issue. Signed-off-by: Marek Szyprowski <m.szyprowski@samsung.com> Cc: stable@vger.kernel.org # v3.6+ Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>	2015-10-03 16:36:45 +01:00
Linus Torvalds	99bc7215bc	Merge branch 'fixes' of git://ftp.arm.linux.org.uk/~rmk/linux-arm Pull ARM fixes from Russell King: "Three fixes and a resulting cleanup for -rc2: - Andre Przywara reported that he was seeing a warning with the new cast inside DMA_ERROR_CODE's definition, and fixed the incorrect use. - Doug Anderson noticed that kgdb causes a "scheduling while atomic" bug. - OMAP5 folk noticed that their Thumb-2 compiled X servers crashed when enabling support to cover ARMv6 CPUs due to a kernel bug leaking some conditional context into the signal handler" * 'fixes' of git://ftp.arm.linux.org.uk/~rmk/linux-arm: ARM: 8425/1: kgdb: Don't try to stop the machine when setting breakpoints ARM: 8437/1: dma-mapping: fix build warning with new DMA_ERROR_CODE definition ARM: get rid of needless #if in signal handling code ARM: fix Thumb2 signal handling when ARMv6 is enabled	2015-09-19 21:05:02 -07:00
Andre Przywara	90cde5584a	ARM: 8437/1: dma-mapping: fix build warning with new DMA_ERROR_CODE definition Commit `96231b2686`: ("ARM: 8419/1: dma-mapping: harmonize definition of DMA_ERROR_CODE") changed the definition of DMA_ERROR_CODE to use dma_addr_t, which makes the compiler barf on assigning this to an "int" variable on ARM with LPAE enabled: *********** In file included from /src/linux/include/linux/dma-mapping.h:86:0, from /src/linux/arch/arm/mm/dma-mapping.c:21: /src/linux/arch/arm/mm/dma-mapping.c: In function '__iommu_create_mapping': /src/linux/arch/arm/include/asm/dma-mapping.h:16:24: warning: overflow in implicit constant conversion [-Woverflow] #define DMA_ERROR_CODE (~(dma_addr_t)0x0) ^ /src/linux/arch/arm/mm/dma-mapping.c:1252:15: note: in expansion of macro DMA_ERROR_CODE' int i, ret = DMA_ERROR_CODE; ^ *********** Remove the actually unneeded initialization of "ret" in __iommu_create_mapping() and move the variable declaration inside the for-loop to make the scope of this variable more clear. Signed-off-by: Andre Przywara <andre.przywara@arm.com> Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>	2015-09-16 23:58:46 +01:00
Christoph Hellwig	6894258eda	dma-mapping: consolidate dma_{alloc,free}_{attrs,coherent} Since 2009 we have a nice asm-generic header implementing lots of DMA API functions for architectures using struct dma_map_ops, but unfortunately it's still missing a lot of APIs that all architectures still have to duplicate. This series consolidates the remaining functions, although we still need arch opt outs for two of them as a few architectures have very non-standard implementations. This patch (of 5): The coherent DMA allocator works the same over all architectures supporting dma_map operations. This patch consolidates them and converges the minor differences: - the debug_dma helpers are now called from all architectures, including those that were previously missing them - dma_alloc_from_coherent and dma_release_from_coherent are now always called from the generic alloc/free routines instead of the ops dma-mapping-common.h always includes dma-coherent.h to get the defintions for them, or the stubs if the architecture doesn't support this feature - checks for ->alloc / ->free presence are removed. There is only one magic instead of dma_map_ops without them (mic_dma_ops) and that one is x86 only anyway. Besides that only x86 needs special treatment to replace a default devices if none is passed and tweak the gfp_flags. An optional arch hook is provided for that. [linux@roeck-us.net: fix build] [jcmvbkbc@gmail.com: fix xtensa] Signed-off-by: Christoph Hellwig <hch@lst.de> Cc: Arnd Bergmann <arnd@arndb.de> Cc: Russell King <linux@arm.linux.org.uk> Cc: Catalin Marinas <catalin.marinas@arm.com> Cc: Will Deacon <will.deacon@arm.com> Cc: Yoshinori Sato <ysato@users.sourceforge.jp> Cc: Michal Simek <monstr@monstr.eu> Cc: Jonas Bonn <jonas@southpole.se> Cc: Chris Metcalf <cmetcalf@ezchip.com> Cc: Guan Xuetao <gxt@mprc.pku.edu.cn> Cc: Ralf Baechle <ralf@linux-mips.org> Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org> Cc: Ingo Molnar <mingo@elte.hu> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: "H. Peter Anvin" <hpa@zytor.com> Cc: Andy Shevchenko <andy.shevchenko@gmail.com> Signed-off-by: Guenter Roeck <linux@roeck-us.net> Signed-off-by: Max Filippov <jcmvbkbc@gmail.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2015-09-10 13:29:01 -07:00
Linus Torvalds	c706c7eb0d	Merge branch 'for-linus' of git://ftp.arm.linux.org.uk/~rmk/linux-arm Pull ARM development updates from Russell King: "Included in this update: - moving PSCI code from ARM64/ARM to drivers/ - removal of some architecture internals from global kernel view - addition of software based "privileged no access" support using the old domains register to turn off the ability for kernel loads/stores to access userspace. Only the proper accessors will be usable. - addition of early fixup support for early console - re-addition (and reimplementation) of OMAP special interconnect barrier - removal of finish_arch_switch() - only expose cpuX/online in sysfs if hotpluggable - a number of code cleanups" * 'for-linus' of git://ftp.arm.linux.org.uk/~rmk/linux-arm: (41 commits) ARM: software-based priviledged-no-access support ARM: entry: provide uaccess assembly macro hooks ARM: entry: get rid of multiple macro definitions ARM: 8421/1: smp: Collapse arch_cpu_idle_dead() into cpu_die() ARM: uaccess: provide uaccess_save_and_enable() and uaccess_restore() ARM: mm: improve do_ldrd_abort macro ARM: entry: ensure that IRQs are enabled when calling syscall_trace_exit() ARM: entry: efficiency cleanups ARM: entry: get rid of asm_trace_hardirqs_on_cond ARM: uaccess: simplify user access assembly ARM: domains: remove DOMAIN_TABLE ARM: domains: keep vectors in separate domain ARM: domains: get rid of manager mode for user domain ARM: domains: move initial domain setting value to asm/domains.h ARM: domains: provide domain_mask() ARM: domains: switch to keeping domain value in register ARM: 8419/1: dma-mapping: harmonize definition of DMA_ERROR_CODE ARM: 8417/1: refactor bitops functions with BIT_MASK() and BIT_WORD() ARM: 8416/1: Feroceon: use of_iomap() to map register base ARM: 8415/1: early fixmap support for earlycon ...	2015-09-03 16:27:01 -07:00
Russell King	40d3f02851	Merge branches 'cleanup', 'fixes', 'misc', 'omap-barrier' and 'uaccess' into for-linus	2015-09-03 15:28:37 +01:00
Linus Torvalds	d975f309a8	Merge branch 'for-4.3/sg' of git://git.kernel.dk/linux-block Pull SG updates from Jens Axboe: "This contains a set of scatter-gather related changes/fixes for 4.3: - Add support for limited chaining of sg tables even for architectures that do not set ARCH_HAS_SG_CHAIN. From Christoph. - Add sg chain support to target_rd. From Christoph. - Fixup open coded sg->page_link in crypto/omap-sham. From Christoph. - Fixup open coded crypto ->page_link manipulation. From Dan. - Also from Dan, automated fixup of manual sg_unmark_end() manipulations. - Also from Dan, automated fixup of open coded sg_phys() implementations. - From Robert Jarzmik, addition of an sg table splitting helper that drivers can use" * 'for-4.3/sg' of git://git.kernel.dk/linux-block: lib: scatterlist: add sg splitting function scatterlist: use sg_phys() crypto/omap-sham: remove an open coded access to ->page_link scatterlist: remove open coded sg_unmark_end instances crypto: replace scatterwalk_sg_chain with sg_chain target/rd: always chain S/G list scatterlist: allow limited chaining without ARCH_HAS_SG_CHAIN	2015-09-02 13:22:38 -07:00
Dan Williams	db0fa0cb01	scatterlist: use sg_phys() Coccinelle cleanup to replace open coded sg to physical address translations. This is in preparation for introducing scatterlists that reference __pfn_t. // sg_phys.cocci: convert usage page_to_phys(sg_page(sg)) to sg_phys(sg) // usage: make coccicheck COCCI=sg_phys.cocci MODE=patch virtual patch @@ struct scatterlist sg; @@ - page_to_phys(sg_page(sg)) + sg->offset + sg_phys(sg) @@ struct scatterlist sg; @@ - page_to_phys(sg_page(sg)) + sg_phys(sg) & PAGE_MASK Signed-off-by: Dan Williams <dan.j.williams@intel.com> Signed-off-by: Jens Axboe <axboe@fb.com>	2015-08-17 08:13:26 -06:00
Lorenzo Nava	21caf3a765	ARM: 8398/1: arm DMA: Fix allocation from CMA for coherent DMA This patch allows the use of CMA for DMA coherent memory allocation. At the moment if the input parameter "is_coherent" is set to true the allocation is not made using the CMA, which I think is not the desired behaviour. The patch covers the allocation and free of memory for coherent DMA. Signed-off-by: Lorenzo Nava <lorenx4@gmail.com> Reviewed-by: Catalin Marinas <catalin.marinas@arm.com> Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>	2015-08-04 16:16:21 +01:00
Russell King	1234e3fda9	ARM: reduce visibility of dmac_* functions The dmac_* functions are private to the ARM DMA API implementation, and should not be used by drivers. In order to discourage their use, remove their prototypes and macros from asm/*.h. We have to leave dmac_flush_range() behind as Exynos and MSM IOMMU code use these; once these sites are fixed, this can be moved also. Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>	2015-08-01 22:25:04 +01:00
Marek Szyprowski	462859aa7b	ARM: 8404/1: dma-mapping: fix off-by-one error in bitmap size check nr_bitmaps member of mapping structure stores the number of already allocated bitmaps and it is interpreted as loop iterator (it starts from 0 not from 1), so a comparison against number of possible bitmap extensions should include this fact. This patch fixes this by changing the extension failure condition. This issue has been introduced by commit `4d852ef8c2` ("arm: dma-mapping: Add support to extend DMA IOMMU mappings"). Reported-by: Hyungwon Hwang <human.hwang@samsung.com> Signed-off-by: Marek Szyprowski <m.szyprowski@samsung.com> Reviewed-by: Hyungwon Hwang <human.hwang@samsung.com> Cc: stable@vger.kernel.org # v3.15+ Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>	2015-07-17 15:08:40 +01:00
Russell King	9de44aa4dc	Merge branches 'arnd-fixes', 'clk', 'misc', 'v7' and 'fixes' into for-next	2015-06-12 21:18:08 +01:00
Mike Looijmans	55af8a9164	ARM: 8387/1: arm/mm/dma-mapping.c: Add arm_coherent_dma_mmap When dma-coherent transfers are enabled, the mmap call must not change the pg_prot flags in the vma struct. Split the arm_dma_mmap into a common and specific parts, and add a "arm_coherent_dma_mmap" implementation that does not alter the page protection flags. Tested on a topic-miami board (Zynq) using the ACP port to transfer data between FPGA and CPU using the Dyplo framework. Without this patch, byte-wise access to mmapped coherent DMA memory was about 20x slower because of the memory being marked as non-cacheable, and transfer speeds would not exceed 240MB/s. After this patch, the mapped memory is cacheable and the transfer speed is again 600MB/s (limited by the FPGA) when the data is in the L2 cache, while data integrity is being maintained. The patch has no effect on non-coherent DMA. Signed-off-by: Mike Looijmans <mike.looijmans@topic.nl> Acked-by: Arnd Bergmann <arnd@arndb.de> Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>	2015-06-06 10:44:04 +01:00
Marek Szyprowski	1424532b21	ARM: 8347/1: dma-mapping: fix off-by-one check in arm_setup_iommu_dma_ops Patch `22b3c181c6` ("arm: dma-mapping: limit IOMMU mapping size") added a check for IO address space size. However this patch broke IOMMU initialization for typical platforms initialized from device tree, which get the default IO address space size of 4GiB. This value doesn't fit into size_t and fails a check introduced by that commit resulting in failed dma-mapping/iommu initialization. This patch fixes this issue by adding proper support for full 4GiB address space size. Signed-off-by: Marek Szyprowski <m.szyprowski@samsung.com> Acked-by: Will Deacon <will.deacon@arm.com> Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>	2015-05-03 23:21:55 +01:00
Linus Torvalds	bb0fd7ab09	Merge branch 'for-linus' of git://ftp.arm.linux.org.uk/~rmk/linux-arm Pull ARM updates from Russell King: "Included in this update are both some long term fixes and some new features. Fixes: - An integer overflow in the calculation of ELF_ET_DYN_BASE. - Avoiding OOMs for high-order IOMMU allocations - SMP requires the data cache to be enabled for synchronisation primitives to work, so prevent the CPU_DCACHE_DISABLE option being visible on SMP builds. - A bug going back 10+ years in the noMMU ARM94* CPU support code, where it corrupts registers. Found by folk getting Linux running on their cameras. - Versatile Express needs an errata workaround enabled for CPU hot-unplug to work. Features: - Clean up module linker by handling out of range relocations separately from relocation cases we don't handle. - Fix a long term bug in the pci_mmap_page_range() code, which we hope won't impact userspace (we hope there's no users of the existing broken interface.) - Don't map DMA coherent allocations when we don't have a MMU. - Drop experimental status for SMP_ON_UP. - Warn when DT doesn't specify ePAPR mandatory cache properties. - Add documentation concerning how we find the start of physical memory for AUTO_ZRELADDR kernels, detailing why we have chosen the mask and the implications of changing it. - Updates from Ard Biesheuvel to address some issues with large kernels (such as allyesconfig) failing to link. - Allow hibernation to work on modern (ARMv7) CPUs - this appears to have never worked in the past on these CPUs. - Enable IRQ_SHOW_LEVEL, which changes the /proc/interrupts output format (hopefully without userspace breaking... let's hope that if it causes someone a problem, they tell us.) - Fix tegra-ahb DT offsets. - Rework ARM errata 643719 code (and ARMv7 flush_cache_louis()/ flush_dcache_all()) code to be more efficient, and enable this errata workaround by default for ARMv7+SMP CPUs. This complements the Versatile Express fix above. - Rework ARMv7 context code for errata 430973, so that only Cortex A8 CPUs are impacted by the branch target buffer flush when this errata is enabled. Also update the help text to indicate that all r1p* A8 CPUs are impacted. - Switch ARM to the generic show_mem() implementation, it conveys all the information which we were already reporting. - Prevent slow timer sources being used for udelay() - timers running at less than 1MHz are not useful for this, and can cause udelay() to return immediately, without any wait. Using such a slow timer is silly. - VDSO support for 32-bit ARM, mainly for gettimeofday() using the ARM architected timer. - Perf support for Scorpion performance monitoring units" vdso semantic conflict fixed up as per linux-next. * 'for-linus' of git://ftp.arm.linux.org.uk/~rmk/linux-arm: (52 commits) ARM: update errata 430973 documentation to cover Cortex A8 r1p* ARM: ensure delay timer has sufficient accuracy for delays ARM: switch to use the generic show_mem() implementation ARM: proc-v7: avoid errata 430973 workaround for non-Cortex A8 CPUs ARM: enable ARM errata 643719 workaround by default ARM: cache-v7: optimise test for Cortex A9 r0pX devices ARM: cache-v7: optimise branches in v7_flush_cache_louis ARM: cache-v7: consolidate initialisation of cache level index ARM: cache-v7: shift CLIDR to extract appropriate field before masking ARM: cache-v7: use movw/movt instructions ARM: allow 16-bit instructions in ALT_UP() ARM: proc-arm94*.S: fix setup function ARM: vexpress: fix CPU hotplug with CT9x4 tile. ARM: 8276/1: Make CPU_DCACHE_DISABLE depend on !SMP ARM: 8335/1: Documentation: DT bindings: Tegra AHB: document the legacy base address ARM: 8334/1: amba: tegra-ahb: detect and correct bogus base address ARM: 8333/1: amba: tegra-ahb: fix register offsets in the macros ARM: 8339/1: Enable CONFIG_GENERIC_IRQ_SHOW_LEVEL ARM: 8338/1: kexec: Relax SMP validation to improve DT compatibility ARM: 8337/1: mm: Do not invoke OOM for higher order IOMMU DMA allocations ...	2015-04-14 21:03:26 -07:00
Russell King	c848791f03	Merge branches 'misc', 'vdso' and 'fixes' into for-next Conflicts: arch/arm/mm/proc-macros.S	2015-04-14 22:28:25 +01:00
Linus Torvalds	3be1b98e07	PCI changes for the v4.1 merge window: Enumeration - Read capability list as dwords, not bytes (Sean O. Stalley) Resource management - Don't check for PNP overlaps with unassigned PCI BARs (Bjorn Helgaas) - Mark invalid BARs as unassigned (Bjorn Helgaas) - Show driver, BAR#, and resource on pci_ioremap_bar() failure (Bjorn Helgaas) - Fail pci_ioremap_bar() on unassigned resources (Bjorn Helgaas) - Assign resources before drivers claim devices (Yijing Wang) - Claim bus resources before pci_bus_add_devices() (Yijing Wang) Power management - Optimize device state transition delays (Aaron Lu) - Don't clear ASPM bits when the FADT declares it's unsupported (Matthew Garrett) Virtualization - Add ACS quirks for Intel 1G NICs (Alex Williamson) IOMMU - Add ptr to OF node arg to of_iommu_configure() (Murali Karicheri) - Move of_dma_configure() to device.c to help re-use (Murali Karicheri) - Fix size when dma-range is not used (Murali Karicheri) - Add helper functions pci_get[put]_host_bridge_device() (Murali Karicheri) - Add of_pci_dma_configure() to update DMA configuration (Murali Karicheri) - Update DMA configuration from DT (Murali Karicheri) - dma-mapping: limit IOMMU mapping size (Murali Karicheri) - Calculate device DMA masks based on DT dma-range size (Murali Karicheri) ARM Versatile host bridge driver - Check for devm_ioremap_resource() failures (Jisheng Zhang) Broadcom iProc host bridge driver - Add Broadcom iProc PCIe driver (Ray Jui) Marvell MVEBU host bridge driver - Add suspend/resume support (Thomas Petazzoni) Renesas R-Car host bridge driver - Fix position of MSI enable bit (Nobuhiro Iwamatsu) - Write zeroes to reserved PCIEPARL bits (Nobuhiro Iwamatsu) - Change PCIEPARL and PCIEPARH to PCIEPALR and PCIEPAUR (Nobuhiro Iwamatsu) - Verify that mem_res is 64K-aligned (Nobuhiro Iwamatsu) Samsung Exynos host bridge driver - Fix INTx enablement statement termination error (Jaehoon Chung) Miscellaneous - Make a shareable UUID for PCI firmware ACPI _DSM (Aaron Lu) - Clarify policy for vendor IDs in pci.txt (Michael S. Tsirkin) -----BEGIN PGP SIGNATURE----- Version: GnuPG v1 iQIcBAABAgAGBQJVK/X+AAoJEFmIoMA60/r8hlkP/0e1GhAWA3DGR/+O2OPIkJ2w dQVgv5IN5KXGExT9RHiDL/Ib2PhDvdVI26sinjtkw/FcjyzoWVsPDUzCYudQaPSr zwzZto7dBzfv+exDN2LOqoSscCORAehApmTgYVg29cofJmKlO2ctDtpem1OT0MQ9 CMRMoBHhRe4FF3VJPOBPDXYpS89TObrY600aMDGk4S2uBboZI3aeYiTNLXJyh6fX vRg3TWnTfQHoZINW/YOqao/WbrRixZbO6q4n2IqhI6i/uaAc1IEALk9im8/2ri/s mgb/K5Elq+j4yUGnbFRz62pj/YxwnQKwVO4Nc7P66zENgoOXtv+OGRhlS4+d00/n ux0+BkoxJdaL8HQ/b7+uPydiD85lbERM+B2+LQQ7JN+HI+UEcQ0PsK2hSQKb3njD uEkktlKZViiqALijpL+vKRFe8U4GRE4KUfVsKHhhPPvY5sQTAZ3DrR36e1zKz2pA YJjtaHYW0S/tfoEzi3EnPistbJw5sT0/Waj31QTKb/P0Fr7pHnJfcwV7+unXbKla Osz8m6ELIqxhnuzhjlbayh4MKn49n1ZlwkwCnBdjgLQy0KZtxsWZoBg8LeGU077c TJXukRfl3H8LvpqGMYaxOyw7yUeKobEWy+Ylo5asFnfFw9h6zvW+Sc97jtBCrm4/ OZa7rKdPQGGMbQFMvDc2 =vEVs -----END PGP SIGNATURE----- Merge tag 'pci-v4.1-changes' of git://git.kernel.org/pub/scm/linux/kernel/git/helgaas/pci Pull PCI changes from Bjorn Helgaas: "Enumeration - Read capability list as dwords, not bytes (Sean O. Stalley) Resource management - Don't check for PNP overlaps with unassigned PCI BARs (Bjorn Helgaas) - Mark invalid BARs as unassigned (Bjorn Helgaas) - Show driver, BAR#, and resource on pci_ioremap_bar() failure (Bjorn Helgaas) - Fail pci_ioremap_bar() on unassigned resources (Bjorn Helgaas) - Assign resources before drivers claim devices (Yijing Wang) - Claim bus resources before pci_bus_add_devices() (Yijing Wang) Power management - Optimize device state transition delays (Aaron Lu) - Don't clear ASPM bits when the FADT declares it's unsupported (Matthew Garrett) Virtualization - Add ACS quirks for Intel 1G NICs (Alex Williamson) IOMMU - Add ptr to OF node arg to of_iommu_configure() (Murali Karicheri) - Move of_dma_configure() to device.c to help re-use (Murali Karicheri) - Fix size when dma-range is not used (Murali Karicheri) - Add helper functions pci_get[put]_host_bridge_device() (Murali Karicheri) - Add of_pci_dma_configure() to update DMA configuration (Murali Karicheri) - Update DMA configuration from DT (Murali Karicheri) - dma-mapping: limit IOMMU mapping size (Murali Karicheri) - Calculate device DMA masks based on DT dma-range size (Murali Karicheri) ARM Versatile host bridge driver - Check for devm_ioremap_resource() failures (Jisheng Zhang) Broadcom iProc host bridge driver - Add Broadcom iProc PCIe driver (Ray Jui) Marvell MVEBU host bridge driver - Add suspend/resume support (Thomas Petazzoni) Renesas R-Car host bridge driver - Fix position of MSI enable bit (Nobuhiro Iwamatsu) - Write zeroes to reserved PCIEPARL bits (Nobuhiro Iwamatsu) - Change PCIEPARL and PCIEPARH to PCIEPALR and PCIEPAUR (Nobuhiro Iwamatsu) - Verify that mem_res is 64K-aligned (Nobuhiro Iwamatsu) Samsung Exynos host bridge driver - Fix INTx enablement statement termination error (Jaehoon Chung) Miscellaneous - Make a shareable UUID for PCI firmware ACPI _DSM (Aaron Lu) - Clarify policy for vendor IDs in pci.txt (Michael S. Tsirkin)" * tag 'pci-v4.1-changes' of git://git.kernel.org/pub/scm/linux/kernel/git/helgaas/pci: (36 commits) PCI: Read capability list as dwords, not bytes PCI: layerscape: Simplify platform_get_resource_byname() failure checking PCI: keystone: Don't dereference possible NULL pointer PCI: versatile: Check for devm_ioremap_resource() failures PCI: Don't clear ASPM bits when the FADT declares it's unsupported PCI: Clarify policy for vendor IDs in pci.txt PCI/ACPI: Optimize device state transition delays PCI: Export pci_find_host_bridge() for use inside PCI core PCI: Make a shareable UUID for PCI firmware ACPI _DSM PCI: Fix typo in Thunderbolt kernel message PCI: exynos: Fix INTx enablement statement termination error PCI: iproc: Add Broadcom iProc PCIe support PCI: iproc: Add DT docs for Broadcom iProc PCIe driver PCI: Export symbols required for loadable host driver modules PCI: Add ACS quirks for Intel 1G NICs PCI: mvebu: Add suspend/resume support PCI: Cleanup control flow sparc/PCI: Claim bus resources before pci_bus_add_devices() PCI: Assign resources before drivers claim devices (pci_scan_root_bus()) PCI: Fail pci_ioremap_bar() on unassigned resources ...	2015-04-13 15:45:47 -07:00
Tomasz Figa	49f28aa6b0	ARM: 8337/1: mm: Do not invoke OOM for higher order IOMMU DMA allocations IOMMU should be able to use single pages as well as bigger blocks, so if higher order allocations fail, we should not affect state of the system, with events such as OOM killer, but rather fall back to order 0 allocations. This patch changes the behavior of ARM IOMMU DMA allocator to use __GFP_NORETRY, which bypasses OOM invocation, for orders higher than zero and, only if that fails, fall back to normal order 0 allocation which might invoke OOM killer. Signed-off-by: Tomasz Figa <tfiga@chromium.org> Reviewed-by: Doug Anderson <dianders@chromium.org> Acked-by: David Rientjes <rientjes@google.com> Acked-by: Marek Szyprowski <m.szyprowski@samsung.com> Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>	2015-04-02 09:58:25 +01:00
Will Deacon	89cfdb19a8	ARM: 8289/1: dma-mapping: use to_dma_iommu_mapping instead of accessing archdata When using the IOMMU-backed DMA ops for a device, we store a pointer to the dma_iommu_mapping structure (used to keep track of the address space) in the archdata.mapping field of the struct device. Rather than access this field directly, use the to_dma_iommu_mapping helper in dma-mapping, so that we don't really care where the mapping information is held. Cc: Laurent Pinchart <laurent.pinchart@ideasonboard.com> Signed-off-by: Will Deacon <will.deacon@arm.com> Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>	2015-03-18 10:15:53 +00:00
Murali Karicheri	22b3c181c6	arm: dma-mapping: limit IOMMU mapping size arm_iommu_create_mapping() has size parameter of size_t and arm_setup_iommu_dma_ops() can take a value higher than that when this is called from the OF code. So limit the size to SIZE_MAX. Tested-by: Suravee Suthikulpanit <Suravee.Suthikulpanit@amd.com> (AMD Seattle) Signed-off-by: Murali Karicheri <m-karicheri2@ti.com> Signed-off-by: Bjorn Helgaas <bhelgaas@google.com> Reviewed-by: Catalin Marinas <catalin.marinas@arm.com> Acked-by: Will Deacon <will.deacon@arm.com> CC: Joerg Roedel <joro@8bytes.org> CC: Grant Likely <grant.likely@linaro.org> CC: Rob Herring <robh+dt@kernel.org> CC: Russell King <linux@arm.linux.org.uk> CC: Arnd Bergmann <arnd@arndb.de>	2015-03-12 11:43:09 -05:00
Russell King	8bf1268f48	ARM: dma-api: fix off-by-one error in __dma_supported() When validating the mask against the amount of memory we have available (so that we can trap 32-bit DMA addresses with >32-bits memory), we had not taken account of the fact that max_pfn is the maximum PFN number plus one that would be in the system. There are several references in the code which bear this out: mm/page_owner.c: for (; pfn < max_pfn; pfn++) { } arch/x86/kernel/setup.c: high_memory = (void )__va(max_pfn PAGE_SIZE - 1) Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>	2015-03-10 19:48:35 +00:00
Carlo Caione	6e8266e333	ARM: 8304/1: Respect NO_KERNEL_MAPPING when we don't have an IOMMU Even without an iommu, NO_KERNEL_MAPPING is still convenient to save on kernel address space in places where we don't need a kernel mapping. Implement support for it in the two places where we're creating an expensive mapping. __alloc_from_pool uses an internal pool from which we already have virtual addresses, so it's not relevant, and __alloc_simple_buffer uses alloc_pages, which will always return a lowmem page, which is already mapped into kernel space, so we can't prevent a mapping for it in that case. Signed-off-by: Jasper St. Pierre <jstpierre@mecheye.net> Signed-off-by: Carlo Caione <carlo@caione.org> Reviewed-by: Rob Clark <robdclark@gmail.com> Reviewed-by: Daniel Drake <dsd@endlessm.com> Acked-by: Marek Szyprowski <m.szyprowski@samsung.com> Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>	2015-02-23 14:43:59 +00:00
Linus Torvalds	90c453ca22	Merge branch 'fixes' of git://ftp.arm.linux.org.uk/~rmk/linux-arm Pull ARM fix from Russell King: "Just one fix this time around. __iommu_alloc_buffer() can cause a BUG() if dma_alloc_coherent() is called with either __GFP_DMA32 or __GFP_HIGHMEM set. The patch from Alexandre addresses this" * 'fixes' of git://ftp.arm.linux.org.uk/~rmk/linux-arm: ARM: 8305/1: DMA: Fix kzalloc flags in __iommu_alloc_buffer()	2015-02-22 09:57:16 -08:00
Alexandre Courbot	23be7fdafa	ARM: 8305/1: DMA: Fix kzalloc flags in __iommu_alloc_buffer() There doesn't seem to be any valid reason to allocate the pages array with the same flags as the buffer itself. Doing so can eventually lead to the following safeguard in mm/slab.c's cache_grow() to be hit: if (unlikely(flags & GFP_SLAB_BUG_MASK)) { pr_emerg("gfp: %un", flags & GFP_SLAB_BUG_MASK); BUG(); } This happens when buffers are allocated with __GFP_DMA32 or __GFP_HIGHMEM. Fix this by allocating the pages array with GFP_KERNEL to follow what is done elsewhere in this file. Using GFP_KERNEL in __iommu_alloc_buffer() is safe because atomic allocations are handled by __iommu_alloc_atomic(). Signed-off-by: Alexandre Courbot <acourbot@nvidia.com> Cc: Arnd Bergmann <arnd@arndb.de> Cc: Marek Szyprowski <m.szyprowski@samsung.com> Acked-by: Marek Szyprowski <m.szyprowski@samsung.com> Acked-by: Will Deacon <will.deacon@arm.com> Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>	2015-02-20 11:14:42 +00:00
Linus Torvalds	5659c0e470	Merge branch 'fixes' of git://ftp.arm.linux.org.uk/~rmk/linux-arm Pull ARM fixes from Russell King: "A number of ARM fixes, the biggest is fixing a regression caused by appended DT blobs exceeding 64K, causing the decompressor fixup code to fail to patch the DT blob. Another important fix is for the ASID allocator from Will Deacon which prevents some rare crashes seen on some systems. Lastly, there's a build fix for v7M systems when printk support is disabled. The last two remaining fixes are more cosmetic - the IOMMU one prevents an annoying harmless warning message, and we disable the kernel strict memory permissions on non-MMU which can't support it anyway" * 'fixes' of git://ftp.arm.linux.org.uk/~rmk/linux-arm: ARM: 8299/1: mm: ensure local active ASID is marked as allocated on rollover ARM: 8298/1: ARM_KERNMEM_PERMS only works with MMU enabled ARM: 8295/1: fix v7M build for !CONFIG_PRINTK ARM: 8294/1: ATAG_DTB_COMPAT: remove the DT workspace's hardcoded 64KB size ARM: 8288/1: dma-mapping: don't detach devices without an IOMMU during teardown	2015-02-04 09:42:55 -08:00
Laurent Pinchart	eab8d6530c	arm: dma-mapping: Set DMA IOMMU ops in arm_iommu_attach_device() Commit `4bb25789ed` ("arm: dma-mapping: plumb our iommu mapping ops into arch_setup_dma_ops") moved the setting of the DMA operations from arm_iommu_attach_device() to arch_setup_dma_ops() where the DMA operations to be used are selected based on whether the device is connected to an IOMMU. However, the IOMMU detection scheme requires the IOMMU driver to be ported to the new IOMMU of_xlate API. As no driver has been ported yet, this effectively breaks all IOMMU ARM users that depend on the IOMMU being handled transparently by the DMA mapping API. Fix this by restoring the setting of DMA IOMMU ops in arm_iommu_attach_device() and splitting the rest of the function into a new internal __arm_iommu_attach_device() function, called by arch_setup_dma_ops(). Signed-off-by: Laurent Pinchart <laurent.pinchart+renesas@ideasonboard.com> Acked-by: Will Deacon <will.deacon@arm.com> Tested-by: Heiko Stuebner <heiko@sntech.de> Signed-off-by: Olof Johansson <olof@lixom.net>	2015-01-29 10:56:27 -08:00
Will Deacon	c2273a1853	ARM: 8288/1: dma-mapping: don't detach devices without an IOMMU during teardown When tearing down the DMA ops for a device via of_dma_deconfigure, we unconditionally detach the device from its IOMMU domain. For devices that aren't actually behind an IOMMU, this produces a "Not attached" warning message on the console. This patch changes the teardown code so that we don't detach from the IOMMU domain when there isn't an IOMMU dma mapping to start with. Reported-by: Laurent Pinchart <laurent.pinchart@ideasonboard.com> Signed-off-by: Will Deacon <will.deacon@arm.com> Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>	2015-01-29 15:22:44 +00:00
Linus Torvalds	6f51ee709e	ARM: SoC/iommu configuration for 3.19 The iomm-config branch contains work from Will Deacon, quoting his description: This series adds automatic IOMMU and DMA-mapping configuration for OF-based DMA masters described using the generic IOMMU devicetree bindings. Although there is plenty of future work around splitting up iommu_ops, adding default IOMMU domains and sorting out automatic IOMMU group creation for the platform_bus, this is already useful enough for people to port over their IOMMU drivers and start using the new probing infrastructure (indeed, Marek has patches queued for the Exynos IOMMU). The branch touches core ARM and IOMMU driver files, and the respective maintainers (Russell King and Joerg Roedel) agreed to have the contents merged through the arm-soc tree. The final version was ready just before the merge window, so we ended up delaying it a bit longer than the rest, but we don't expect to see regressions because this is just additional infrastructure that will get used in drivers starting in 3.20 but is unused so far. -----BEGIN PGP SIGNATURE----- Version: GnuPG v1.4.12 (GNU/Linux) iQIVAwUAVJCfoGCrR//JCVInAQIfvxAAhVeEKyhroIGiuCmylWK/TdXja+xO46g+ hkrijO0cPB5C7K45AW2a2aCUM0jSjr81dUprQ/uojr3xXxnJ59t7tDAXpKpFy8xi 5gb/wd/Cea90RtR1mUnNr/+P1sJKemcvmhCuib7111E5wd/s617bLd1+zgCuHguj g733GjDE7SUSTEStviDg963pn+l2IartjhRPhAKmGWiLZA7RiWe35pzDTZGCApnd yfZafXxn4IeUcxQUT6lAsW7xShzCUI2CZ8nZ4tG6YcyR2UNB5BVrPb1BAm6Eb28C 1WmyjnAAyXxc6pqPTalO+JctpS7ujjbtwlOOwgthKyKMfpFnqyavablDl6GvtHn8 NIa3HdnKQTXl9/nRXCvIjeWDyaZEZ5ueacfhMm4PWRSIkqKFVgwY18nNkOul9fuz 0UD9EuN0PPHV2hCIp9Kl3Jju5pi2EEzCt/Vn0YGsZTZuVOfREZ3izDtyKFg1tjif AJ5kFRc1X+6hXNDUWUOmLOnjBvupbq2axFbLeAzQxla/O/0pwHWhiuqXu3uB4six 1Hlgt7yI7pob86VcQKTCg1v8kOvQTEuL2BtUWkCpbyrVSafYRVKwlUNnQlmu5F3c sL14hhK9QSHyCmJ7yKchY104QVKmN8v3ks8PyUNoPxq57ChH4E6FVAZpMz08uF5V mIWREpeIPNw= =ELLq -----END PGP SIGNATURE----- Merge tag 'iommu-config-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/arm/arm-soc Pull ARM SoC/iommu configuration update from Arnd Bergmann: "The iomm-config branch contains work from Will Deacon, quoting his description: This series adds automatic IOMMU and DMA-mapping configuration for OF-based DMA masters described using the generic IOMMU devicetree bindings. Although there is plenty of future work around splitting up iommu_ops, adding default IOMMU domains and sorting out automatic IOMMU group creation for the platform_bus, this is already useful enough for people to port over their IOMMU drivers and start using the new probing infrastructure (indeed, Marek has patches queued for the Exynos IOMMU). The branch touches core ARM and IOMMU driver files, and the respective maintainers (Russell King and Joerg Roedel) agreed to have the contents merged through the arm-soc tree. The final version was ready just before the merge window, so we ended up delaying it a bit longer than the rest, but we don't expect to see regressions because this is just additional infrastructure that will get used in drivers starting in 3.20 but is unused so far" * tag 'iommu-config-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/arm/arm-soc: iommu: store DT-probed IOMMU data privately arm: dma-mapping: plumb our iommu mapping ops into arch_setup_dma_ops arm: call iommu_init before of_platform_populate dma-mapping: detect and configure IOMMU in of_dma_configure iommu: fix initialization without 'add_device' callback iommu: provide helper function to configure an IOMMU for an of master iommu: add new iommu_ops callback for adding an OF device dma-mapping: replace set_arch_dma_coherent_ops with arch_setup_dma_ops iommu: provide early initialisation hook for IOMMU drivers	2014-12-16 14:53:01 -08:00
Will Deacon	4bb25789ed	arm: dma-mapping: plumb our iommu mapping ops into arch_setup_dma_ops This patch plumbs the existing ARM IOMMU DMA infrastructure (which isn't actually called outside of a few drivers) into arch_setup_dma_ops, so that we can use IOMMUs for DMA transfers in a more generic fashion. Since this significantly complicates the arch_setup_dma_ops function, it is moved out of line into dma-mapping.c. If CONFIG_ARM_DMA_USE_IOMMU is not set, the iommu parameter is ignored and the normal ops are used instead. Acked-by: Russell King <rmk+kernel@arm.linux.org.uk> Acked-by: Arnd Bergmann <arnd@arndb.de> Acked-by: Marek Szyprowski <m.szyprowski@samsung.com> Signed-off-by: Will Deacon <will.deacon@arm.com>	2014-12-01 16:51:35 +00:00
Laura Abbott	005757298f	ARM: 8181/1: Drop extra return statement Commit `513510ddba` (common: dma-mapping: introduce common remapping functions) managed to end up with an extra return statement from the original patch. Drop it. Signed-off-by: Laura Abbott <lauraa@codeaurora.org> Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>	2014-10-29 17:20:51 +00:00
Laura Abbott	36d0fd2198	arm: use genalloc for the atomic pool ARM currently uses a bitmap for tracking atomic allocations. genalloc already handles this type of memory pool allocation so switch to using that instead. Signed-off-by: Laura Abbott <lauraa@codeaurora.org> Reviewed-by: Catalin Marinas <catalin.marinas@arm.com> Cc: Arnd Bergmann <arnd@arndb.de> Cc: David Riley <davidriley@chromium.org> Cc: Olof Johansson <olof@lixom.net> Cc: Ritesh Harjain <ritesh.harjani@gmail.com> Cc: Russell King <linux@arm.linux.org.uk> Cc: Thierry Reding <thierry.reding@gmail.com> Cc: Will Deacon <will.deacon@arm.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2014-10-09 22:25:52 -04:00
Laura Abbott	513510ddba	common: dma-mapping: introduce common remapping functions For architectures without coherent DMA, memory for DMA may need to be remapped with coherent attributes. Factor out the the remapping code from arm and put it in a common location to reduce code duplication. As part of this, the arm APIs are now migrated away from ioremap_page_range to the common APIs which use map_vm_area for remapping. This should be an equivalent change and using map_vm_area is more correct as ioremap_page_range is intended to bring in io addresses into the cpu space and not regular kernel managed memory. Signed-off-by: Laura Abbott <lauraa@codeaurora.org> Reviewed-by: Catalin Marinas <catalin.marinas@arm.com> Cc: Arnd Bergmann <arnd@arndb.de> Cc: David Riley <davidriley@chromium.org> Cc: Olof Johansson <olof@lixom.net> Cc: Ritesh Harjain <ritesh.harjani@gmail.com> Cc: Russell King <linux@arm.linux.org.uk> Cc: Thierry Reding <thierry.reding@gmail.com> Cc: Will Deacon <will.deacon@arm.com> Cc: James Hogan <james.hogan@imgtec.com> Cc: Laura Abbott <lauraa@codeaurora.org> Cc: Mitchel Humpherys <mitchelh@codeaurora.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2014-10-09 22:25:52 -04:00
Joonsoo Kim	a254129e86	CMA: generalize CMA reserved area management functionality Currently, there are two users on CMA functionality, one is the DMA subsystem and the other is the KVM on powerpc. They have their own code to manage CMA reserved area even if they looks really similar. From my guess, it is caused by some needs on bitmap management. KVM side wants to maintain bitmap not for 1 page, but for more size. Eventually it use bitmap where one bit represents 64 pages. When I implement CMA related patches, I should change those two places to apply my change and it seem to be painful to me. I want to change this situation and reduce future code management overhead through this patch. This change could also help developer who want to use CMA in their new feature development, since they can use CMA easily without copying & pasting this reserved area management code. In previous patches, we have prepared some features to generalize CMA reserved area management and now it's time to do it. This patch moves core functions to mm/cma.c and change DMA APIs to use these functions. There is no functional change in DMA APIs. Signed-off-by: Joonsoo Kim <iamjoonsoo.kim@lge.com> Acked-by: Michal Nazarewicz <mina86@mina86.com> Acked-by: Zhang Yanfei <zhangyanfei@cn.fujitsu.com> Acked-by: Minchan Kim <minchan@kernel.org> Reviewed-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com> Cc: Alexander Graf <agraf@suse.de> Cc: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com> Cc: Gleb Natapov <gleb@kernel.org> Acked-by: Marek Szyprowski <m.szyprowski@samsung.com> Tested-by: Marek Szyprowski <m.szyprowski@samsung.com> Cc: Paolo Bonzini <pbonzini@redhat.com> Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org> Cc: Paul Mackerras <paulus@samba.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2014-08-06 18:01:16 -07:00
Russell King	6b076991dc	ARM: DMA: ensure that old section mappings are flushed from the TLB When setting up the CMA region, we must ensure that the old section mappings are flushed from the TLB before replacing them with page tables, otherwise we can suffer from mismatched aliases if the CPU speculatively prefetches from these mappings at an inopportune time. A mismatched alias can occur when the TLB contains a section mapping, but a subsequent prefetch causes it to load a page table mapping, resulting in the possibility of the TLB containing two matching mappings for the same virtual address region. Acked-by: Will Deacon <will.deacon@arm.com> Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>	2014-07-17 19:26:08 +01:00
Linus Torvalds	eb3d3ec567	Merge branch 'for-linus' of git://ftp.arm.linux.org.uk/~rmk/linux-arm into next Pull ARM updates from Russell King: - Major clean-up of the L2 cache support code. The existing mess was becoming rather unmaintainable through all the additions that others have done over time. This turns it into a much nicer structure, and implements a few performance improvements as well. - Clean up some of the CP15 control register tweaks for alignment support, moving some code and data into alignment.c - DMA properties for ARM, from Santosh and reviewed by DT people. This adds DT properties to specify bus translations we can't discover automatically, and to indicate whether devices are coherent. - Hibernation support for ARM - Make ftrace work with read-only text in modules - add suspend support for PJ4B CPUs - rework interrupt masking for undefined instruction handling, which allows us to enable interrupts earlier in the handling of these exceptions. - support for big endian page tables - fix stacktrace support to exclude stacktrace functions from the trace, and add save_stack_trace_regs() implementation so that kprobes can record stack traces. - Add support for the Cortex-A17 CPU. - Remove last vestiges of ARM710 support. - Removal of ARM "meminfo" structure, finally converting us solely to memblock to handle the early memory initialisation. * 'for-linus' of git://ftp.arm.linux.org.uk/~rmk/linux-arm: (142 commits) ARM: ensure C page table setup code follows assembly code (part II) ARM: ensure C page table setup code follows assembly code ARM: consolidate last remaining open-coded alignment trap enable ARM: remove global cr_no_alignment ARM: remove CPU_CP15 conditional from alignment.c ARM: remove unused adjust_cr() function ARM: move "noalign" command line option to alignment.c ARM: provide common method to clear bits in CPU control register ARM: 8025/1: Get rid of meminfo ARM: 8060/1: mm: allow sub-architectures to override PCI I/O memory type ARM: 8066/1: correction for ARM patch 8031/2 ARM: 8049/1: ftrace/add save_stack_trace_regs() implementation ARM: 8065/1: remove last use of CONFIG_CPU_ARM710 ARM: 8062/1: Modify ldrt fixup handler to re-execute the userspace instruction ARM: 8047/1: rwsem: use asm-generic rwsem implementation ARM: l2c: trial at enabling some Cortex-A9 optimisations ARM: l2c: add warnings for stuff modifying aux_ctrl register values ARM: l2c: print a warning with L2C-310 caches if the cache size is modified ARM: l2c: remove old .set_debug method ARM: l2c: kill L2X0_AUX_CTRL_MASK before anyone else makes use of this ...	2014-06-05 15:57:04 -07:00
Russell King	bd63ce27d9	Merge branch 'devel-stable' into for-next	2014-06-05 12:36:22 +01:00
Russell King	1fb333489f	Merge branches 'alignment', 'fixes', 'l2c' (early part) and 'misc' into for-next	2014-06-05 12:35:52 +01:00
Russell King	6b74f61a47	DT support for 'dma-ranges'and 'dma-coherent' properties with ARM updates - The 'dma-ranges' helps to take care of few DMAable system memory restrictions by use of dma_pfn_offset which is maintained per device. Arch code then uses it for dma address translations for such cases. We update the dma_pfn_offset accordingly during DT the device creation process. - The 'dma-coherent' property is used to setup arch's coherent dma_ops. -----BEGIN PGP SIGNATURE----- Version: GnuPG v1.4.11 (GNU/Linux) iQIcBAABAgAGBQJTajacAAoJEHJsHOdBp5c/780QAJN50zmxyZ7sqA9xGum8MSJl Vjpp1mw3eu7dZ1HoWcpn35l0tOEVpU/wo4ymtt6YYUhD3Po2LZCl3e43h91B/9/B Ih++WZaN+UmpUpp9YJyeS9pkl0wwEqSmJyTBXZrhFhl4o3KNQlHWPGOMJ5CBPaA0 Z03TT1MeOMiCo10xz6JCA/DjPnQz9m5ClxNXLwdP1KOiTDDsv4gtkTZ0UenttIoU DTerJ+GIt1Gzb+P92aGvuc9wgLKacYmH599m6fQcmd9cIG2oMN2Xdxzfqo56v7Sb TGwFcKWYlhPDbDPmcPlidS6j4O+r8cMRwgHLO3r6LHJezCGQOYU8GzN7m6DKt4ww lCIR/k9u4YY/ZiLFeQ+G0Au8T1J6DHdbCI5sciFI53XYT4HMsV1aNpogOim7adC8 4bPRmGCIN03aW+2ynLkFkdnXSBnaAyjt6qlr5zP8owsKDkV7+0WadQqyD2ovQ0FE sBt1HtOUGUsiR/97J4JFBGFxb84zMa6hXhFVUeFbyScCJNm2gkKeRQfiiB4mZi9L NAX/KVGyS6dktJaoLUiKi/p7aqOat3ezD1PrCziq4ceyWbDLag8Bq9H7rtb7vvqC ulHDUPfRy3Z9kmV8+QAznqPJVY1IHXJ18A+YFXF5ktr+5CJ51C8HjVZP3GZKncPC LpA1rRUEwEqsAwnjzcXW =Q7n3 -----END PGP SIGNATURE----- Merge tag 'dt-dma-properties-for-arm' of git://git.kernel.org/pub/scm/linux/kernel/git/ssantosh/linux-keystone into devel-stable DT support for 'dma-ranges'and 'dma-coherent' properties with ARM updates - The 'dma-ranges' helps to take care of few DMAable system memory restrictions by use of dma_pfn_offset which is maintained per device. Arch code then uses it for dma address translations for such cases. We update the dma_pfn_offset accordingly during DT the device creation process. - The 'dma-coherent' property is used to setup arch's coherent dma_ops.	2014-05-23 12:30:52 +01:00
Russell King	deace4a6b4	ARM: dma-mapping: avoid calling dma_cache_maint_page() on dev=>cpu Avoid calling dma_cache_maint_page() when unmapping a DMA_TO_DEVICE buffer. The L1 cache ops never do anything in this circumstance, nor do they ever need to - all that matters for this case is that the data written is visible to the device before DMA starts. What happens during the transfer (provided the buffer is not written to) is of no real consequence. We already do this optimisation for the L2 cache. Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>	2014-05-22 16:33:14 +01:00
Gioh Kim	e464ef16c4	arm: dma-mapping: add checking cma area initialized If CMA is turned on and CMA size is set to zero, kernel should behave as if CMA was not enabled at compile time. Every dma allocation should check existence of cma area before requesting memory. Signed-off-by: Gioh Kim <gioh.kim@lge.com> Signed-off-by: Joonsoo Kim <iamjoonsoo.kim@lge.com> Acked-by: Michal Nazarewicz <mina86@mina86.com> [mszyprow: removed redundant empty line from the patch] Signed-off-by: <m.szyprowski@samsung.com>	2014-05-22 08:09:31 +02:00
Ritesh Harjani	006f841db1	arm: dma-iommu: Clean up redundant variable mapping->size can be derived from mapping->bits << PAGE_SHIFT which makes mapping->size as redundant. Clean this up. Signed-off-by: Ritesh Harjani <ritesh.harjani@gmail.com> Reported-by: Will Deacon <will.deacon@arm.com> Signed-off-by: Marek Szyprowski <m.szyprowski@samsung.com>	2014-05-20 13:43:26 +02:00
Santosh Shilimkar	2161c2485d	ARM: dma: use phys_addr_t in __dma_page_[cpu_to_dev/dev_to_cpu] On a 32 bit ARM architecture with LPAE extension physical addresses cannot fit into unsigned long variable. So fix it by using phys_addr_t instead of unsigned long. Cc: Nicolas Pitre <nicolas.pitre@linaro.org> Cc: Russell King - ARM Linux <linux@arm.linux.org.uk> Cc: Catalin Marinas <catalin.marinas@arm.com> Acked-by: Will Deacon <will.deacon@arm.com> Signed-off-by: Santosh Shilimkar <santosh.shilimkar@ti.com>	2014-05-07 09:21:45 -04:00
Ritesh Harjani	59f0f119e8	arm: dma-mapping: Fix mapping size value 68efd7d2fb("arm: dma-mapping: remove order parameter from arm_iommu_create_mapping()") is causing kernel panic because it wrongly sets the value of mapping->size: Unable to handle kernel NULL pointer dereference at virtual address 000000a0 pgd = e7a84000 [000000a0] *pgd=00000000 ... PC is at bitmap_clear+0x48/0xd0 LR is at __iommu_remove_mapping+0x130/0x164 Fix it by correcting mapping->size value. Signed-off-by: Ritesh Harjani <ritesh.harjani@gmail.com> Acked-by: Laurent Pinchart <laurent.pinchart@ideasonboard.com> Signed-off-by: Marek Szyprowski <m.szyprowski@samsung.com>	2014-04-23 15:07:00 +02:00

1 2 3 4

192 Commits