Commit Graph

139 Commits

Author SHA1 Message Date
Marek Szyprowski
68efd7d2fb arm: dma-mapping: remove order parameter from arm_iommu_create_mapping()
The 'order' parameter for IOMMU-aware dma-mapping implementation was
introduced mainly as a hack to reduce size of the bitmap used for
tracking IO virtual address space. Since now it is possible to dynamically
resize the bitmap, this hack is not needed and can be removed without any
impact on the client devices. This way the parameters for
arm_iommu_create_mapping() become much easier to understand. The 'size'
parameter now means the maximum supported IO address space size.

The code will allocate (resize) bitmap in chunks, ensuring that a single
chunk is not larger than a single memory page to avoid unreliable
allocations of size larger than PAGE_SIZE in atomic context.
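
As a rough illustration of the chunking described above, the sketch below
computes how many page-sized bitmap chunks are needed to track a given IO
address space, with one bit per page. It is a userspace approximation, not
the kernel code: the toy_* names, the 4 KiB page size and the round-up
rule are assumptions made for the example.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define TOY_PAGE_SHIFT 12u                          /* assumed 4 KiB pages */
#define TOY_PAGE_SIZE  ((size_t)1 << TOY_PAGE_SHIFT)

struct toy_bitmap_layout {
	size_t chunk_bytes;     /* size of one bitmap chunk, never above one page */
	unsigned int nr_chunks; /* chunks needed to cover the whole IO space */
};

/* Hypothetical helper: split the tracking bitmap into page-sized chunks. */
static struct toy_bitmap_layout toy_layout(uint64_t io_space_bytes)
{
	struct toy_bitmap_layout l;
	uint64_t bits = io_space_bytes >> TOY_PAGE_SHIFT; /* one bit per page */
	uint64_t bytes = (bits + 7) / 8;

	assert(bits > 0);
	if (bytes > TOY_PAGE_SIZE) {
		/* too big for one page: use several page-sized chunks */
		l.nr_chunks = (unsigned int)((bytes + TOY_PAGE_SIZE - 1) / TOY_PAGE_SIZE);
		l.chunk_bytes = TOY_PAGE_SIZE;
	} else {
		l.nr_chunks = 1;
		l.chunk_bytes = (size_t)bytes;
	}
	return l;
}
```

With these assumptions a single 4096-byte chunk tracks 32768 pages, i.e.
128 MiB of IO address space, so a 1 GiB space needs 8 chunks.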

Signed-off-by: Marek Szyprowski <m.szyprowski@samsung.com>
2014-02-28 11:55:18 +01:00
Andreas Herrmann
4d852ef8c2 arm: dma-mapping: Add support to extend DMA IOMMU mappings
Instead of using just one bitmap to keep track of IO virtual addresses
(handed out for IOMMU use) introduce an array of bitmaps. This allows
us to extend existing mappings when running out of iova space in the
initial mapping etc.

If there is not enough space in the mapping to service an IO virtual
address allocation request, __alloc_iova() tries to extend the mapping
-- by allocating another bitmap -- and makes another allocation
attempt using the freshly allocated bitmap.
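
The retry-after-extend behaviour described above could look roughly like
the following userspace sketch. It is not the kernel's __alloc_iova(): the
toy_* names are hypothetical, each chunk uses one byte per page instead of
a real bitmap, and the chunk size and chunk limit are deliberately tiny.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

#define TOY_PAGES_PER_CHUNK 8u  /* tiny on purpose, for illustration */
#define TOY_MAX_CHUNKS      4u

struct toy_iova_space {
	unsigned int nr_chunks;                 /* chunks allocated so far */
	unsigned char *chunk[TOY_MAX_CHUNKS];   /* 1 = page in use, 0 = free */
};

/* Find 'count' consecutive free pages in one chunk; -1 if there are none. */
static int toy_find_run(const unsigned char *map, unsigned int count)
{
	unsigned int i, run = 0;

	for (i = 0; i < TOY_PAGES_PER_CHUNK; i++) {
		run = map[i] ? 0 : run + 1;
		if (run == count)
			return (int)(i + 1 - count);
	}
	return -1;
}

/* Add one more empty chunk; fails once the maximum size is reached. */
static int toy_extend(struct toy_iova_space *s)
{
	unsigned char *m;

	if (s->nr_chunks == TOY_MAX_CHUNKS)
		return -1;
	m = calloc(TOY_PAGES_PER_CHUNK, 1);
	if (!m)
		return -1;
	s->chunk[s->nr_chunks++] = m;
	return 0;
}

/* Returns the first page index of the allocation, or -1 on failure. */
static long toy_iova_alloc(struct toy_iova_space *s, unsigned int count)
{
	unsigned int c;

	assert(count > 0 && count <= TOY_PAGES_PER_CHUNK);
	for (;;) {
		for (c = 0; c < s->nr_chunks; c++) {
			int start = toy_find_run(s->chunk[c], count);

			if (start >= 0) {
				memset(s->chunk[c] + start, 1, count);
				return (long)c * TOY_PAGES_PER_CHUNK + start;
			}
		}
		/* no room in the existing chunks: extend and try again */
		if (toy_extend(s) < 0)
			return -1;
	}
}
```

Starting from an empty space, two 6-page requests land at page 0 and
page 8: the second one does not fit into the 2 pages left in the first
chunk, so a new chunk is allocated and the request is retried there.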

This allows arm iommu drivers to start with a decent initial size when
a dma_iommu_mapping is created and still avoid running out of IO
virtual addresses for the mapping.

Signed-off-by: Andreas Herrmann <andreas.herrmann@calxeda.com>
[mszyprow: removed extensions parameter to arm_iommu_create_mapping()
 function, which will be modified in the next patch anyway, also some
 debug messages about extending bitmap]
Signed-off-by: Marek Szyprowski <m.szyprowski@samsung.com>
2014-02-28 11:55:18 +01:00
Marek Szyprowski
10c8562f93 ARM: dma-mapping: fix GFP_ATOMIC macro usage
GFP_ATOMIC is not a single gfp flag, but a macro which expands to the other
flags and LACK of __GFP_WAIT flag. To check if caller wanted to perform an
atomic allocation, the code must test __GFP_WAIT flag presence. This patch
fixes the issue introduced in v3.6-rc5.

Signed-off-by: Marek Szyprowski <m.szyprowski@samsung.com>
CC: stable@vger.kernel.org
2014-02-11 09:40:05 +01:00
Russell King
6f14d778c1 Merge branches 'amba', 'fixes', 'kees', 'misc' and 'unstable/sa11x0' into for-next 2014-01-21 21:26:33 +00:00
Russell King
71b55663c5 ARM: fix executability of CMA mappings
The CMA region was being marked executable:

0xdc04e000-0xdc050000           8K     RW x      MEM/CACHED/WBRA
0xdc060000-0xdc100000         640K     RW x      MEM/CACHED/WBRA
0xdc4f5000-0xdc500000          44K     RW x      MEM/CACHED/WBRA
0xdcce9000-0xe0000000       52316K     RW x      MEM/CACHED/WBRA

This is mainly due to the badly worded MT_MEMORY_DMA_READY symbol, but
there are also a few other places in dma-mapping which should be
corrected to use the right constant.  Fix all these places:

0xdc04e000-0xdc050000           8K     RW NX     MEM/CACHED/WBRA
0xdc060000-0xdc100000         640K     RW NX     MEM/CACHED/WBRA
0xdc280000-0xdc300000         512K     RW NX     MEM/CACHED/WBRA
0xdc6fc000-0xe0000000       58384K     RW NX     MEM/CACHED/WBRA

Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
2013-12-11 09:53:22 +00:00
Russell King
9f28cde0bc ARM: another fix for the DMA mapping checks
Peter reports that OMAP audio broke with the recent fix for these
checks, caused by OMAP audio using a 64-bit DMA mask.  We should
allow 64-bit DMA masks even with 32-bit dma_addr_t if we can be sure
the amount of RAM we have won't allow the 32-bit dma_addr_t to
overflow.  Unfortunately, the checks to detect overflow were not
correct.
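
One way to picture the rule stated above is the simplified check below. It
is only a sketch under stated assumptions: toy_dma_supported() and its
parameters are hypothetical, DMA addresses are assumed to be 32 bits wide,
and the real check in the kernel works on PFNs and zone limits instead.

```c
#include <assert.h>
#include <stdint.h>

/*
 * max_ram_dma_addr: highest DMA address that any RAM page maps to,
 * computed in 64 bits so that it cannot wrap.
 */
static int toy_dma_supported(uint64_t mask, uint64_t max_ram_dma_addr)
{
	assert(mask != 0);

	/*
	 * A mask wider than the 32-bit DMA address type is only safe if no
	 * RAM sits above what that type can express.
	 */
	if (mask > UINT32_MAX && max_ram_dma_addr > UINT32_MAX)
		return 0;

	/* The mask must still reach all the RAM the device may be given. */
	if (mask < max_ram_dma_addr)
		return 0;

	return 1;
}
```

Under these assumptions a 64-bit mask is accepted on a system whose RAM
ends below 4 GiB, which is the OMAP audio case from the report, and
rejected once RAM extends past the 32-bit limit.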

Tested-by: Peter Ujfalusi <peter.ujfalusi@ti.com>
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
2013-12-09 23:24:26 +00:00
Russell King
11a5aa3256 ARM: dma-mapping: check DMA mask against available memory
Some buses have negative offsets, which causes the DMA mask checks to
falsely fail.  Fix this by using the actual amount of memory fitted in
the system.

Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
2013-11-30 14:45:29 +00:00
Linus Torvalds
f47671e2d8 Merge branch 'for-linus' of git://git.linaro.org/people/rmk/linux-arm
Pull ARM updates from Russell King:
 "Included in this series are:

   1. BE8 (modern big endian) changes for ARM from Ben Dooks
   2. big.LITTLE support from Nicolas Pitre and Dave Martin
   3. support for LPAE systems with all system memory above 4GB
   4. Perf updates from Will Deacon
   5. Additional prefetching and other performance improvements from Will.
   6. Neon-optimised AES implementation from Ard.
   7. A number of smaller fixes scattered around the place.

  There is a rather horrid merge conflict in tools/perf - I was never
  notified of the conflict because it originally occurred between Will's
  tree and other stuff.  Consequently I have a resolution which Will
  forwarded me, which I'll forward on immediately after sending this
  mail.

  The other notable thing is I'm expecting some build breakage in the
  crypto stuff on ARM only with Ard's AES patches.  These were merged
  into a stable git branch which others had already pulled, so there's
  little I can do about this.  The problem is caused because these
  patches have a dependency on some code in the crypto git tree - I
  tried requesting a branch I can pull to resolve these, and all I got
  each time from the crypto people was "we'll revert our patches then"
  which would only make things worse since I still don't have the
  dependent patches.  I've no idea what's going on there or how to
  resolve that, and since I can't split these patches from the rest of
  this pull request, I'm rather stuck with pushing this as-is or
  reverting Ard's patches.

  Since it should "come out in the wash" I've left them in - the only
  build problems they seem to cause at the moment are with randconfigs,
  and since it's a new feature anyway.  However, if by -rc1 the
  dependencies aren't in, I think it'd be best to revert Ard's patches"

I resolved the perf conflict roughly as per the patch sent by Russell,
but there may be some differences.  Any errors are likely mine.  Let's
see how the crypto issues work out..

* 'for-linus' of git://git.linaro.org/people/rmk/linux-arm: (110 commits)
  ARM: 7868/1: arm/arm64: remove atomic_clear_mask() in "include/asm/atomic.h"
  ARM: 7867/1: include: asm: use 'int' instead of 'unsigned long' for 'oldval' in atomic_cmpxchg().
  ARM: 7866/1: include: asm: use 'long long' instead of 'u64' within atomic.h
  ARM: 7871/1: amba: Extend number of IRQS
  ARM: 7887/1: Don't smp_cross_call() on UP devices in arch_irq_work_raise()
  ARM: 7872/1: Support arch_irq_work_raise() via self IPIs
  ARM: 7880/1: Clear the IT state independent of the Thumb-2 mode
  ARM: 7878/1: nommu: Implement dummy early_paging_init()
  ARM: 7876/1: clear Thumb-2 IT state on exception handling
  ARM: 7874/2: bL_switcher: Remove cpu_hotplug_driver_{lock,unlock}()
  ARM: footbridge: fix build warnings for netwinder
  ARM: 7873/1: vfp: clear vfp_current_hw_state for dying cpu
  ARM: fix misplaced arch_virt_to_idmap()
  ARM: 7848/1: mcpm: Implement cpu_kill() to synchronise on powerdown
  ARM: 7847/1: mcpm: Factor out logical-to-physical CPU translation
  ARM: 7869/1: remove unused XSCALE_PMU Kconfig param
  ARM: 7864/1: Handle 64-bit memory in case of 32-bit phys_addr_t
  ARM: 7863/1: Let arm_add_memory() always use 64-bit arguments
  ARM: 7862/1: pcpu: replace __get_cpu_var_uses
  ARM: 7861/1: cacheflush: consolidate single-CPU ARMv7 cache disabling code
  ...
2013-11-14 08:51:29 +09:00
Linus Torvalds
8ceafbfa91 Merge branch 'for-linus-dma-masks' of git://git.linaro.org/people/rmk/linux-arm
Pull DMA mask updates from Russell King:
 "This series cleans up the handling of DMA masks in a lot of drivers,
  fixing some bugs as we go.

  Some of the more serious errors include:
   - drivers which only set their coherent DMA mask if the attempt to
     set the streaming mask fails.
   - drivers which test for a NULL dma mask pointer, and then set the
     dma mask pointer to a location in their module .data section -
     which will cause problems if the module is reloaded.

  To counter these, I have introduced two helper functions:
   - dma_set_mask_and_coherent() takes care of setting both the
     streaming and coherent masks at the same time, with the correct
     error handling as specified by the API.
   - dma_coerce_mask_and_coherent() which resolves the problem of
     drivers forcefully setting DMA masks.  This is more a marker for
     future work to further clean these locations up - the code which
     creates the devices really should be initialising these, but to fix
     that in one go along with this change could potentially be very
     disruptive.

  The last thing this series does is prise away some of Linux's addiction
  to "DMA addresses are physical addresses and RAM always starts at
  zero".  We have ARM LPAE systems where all system memory is above 4GB
  physical, hence having DMA masks interpreted by (eg) the block layers
  as describing physical addresses in the range 0..DMAMASK fails on
  these platforms.  Santosh Shilimkar addresses this in this series; the
  patches were copied to the appropriate people multiple times but were
  ignored.

  Fixing this also gets rid of some ARM weirdness in the setup of the
  max*pfn variables, and brings ARM into line with every other Linux
  architecture as far as those go"

* 'for-linus-dma-masks' of git://git.linaro.org/people/rmk/linux-arm: (52 commits)
  ARM: 7805/1: mm: change max*pfn to include the physical offset of memory
  ARM: 7797/1: mmc: Use dma_max_pfn(dev) helper for bounce_limit calculations
  ARM: 7796/1: scsi: Use dma_max_pfn(dev) helper for bounce_limit calculations
  ARM: 7795/1: mm: dma-mapping: Add dma_max_pfn(dev) helper function
  ARM: 7794/1: block: Rename parameter dma_mask to max_addr for blk_queue_bounce_limit()
  ARM: DMA-API: better handing of DMA masks for coherent allocations
  ARM: 7857/1: dma: imx-sdma: setup dma mask
  DMA-API: firmware/google/gsmi.c: avoid direct access to DMA masks
  DMA-API: dcdbas: update DMA mask handing
  DMA-API: dma: edma.c: no need to explicitly initialize DMA masks
  DMA-API: usb: musb: use platform_device_register_full() to avoid directly messing with dma masks
  DMA-API: crypto: remove last references to 'static struct device *dev'
  DMA-API: crypto: fix ixp4xx crypto platform device support
  DMA-API: others: use dma_set_coherent_mask()
  DMA-API: staging: use dma_set_coherent_mask()
  DMA-API: usb: use new dma_coerce_mask_and_coherent()
  DMA-API: usb: use dma_set_coherent_mask()
  DMA-API: parport: parport_pc.c: use dma_coerce_mask_and_coherent()
  DMA-API: net: octeon: use dma_coerce_mask_and_coherent()
  DMA-API: net: nxp/lpc_eth: use dma_coerce_mask_and_coherent()
  ...
2013-11-14 07:55:21 +09:00
Russell King
42cbe8271c Merge branches 'fixes', 'mmci' and 'sa11x0' into for-next 2013-11-12 10:59:08 +00:00
Russell King
4dcfa60071 ARM: DMA-API: better handing of DMA masks for coherent allocations
We need to start treating DMA masks as something which is specific to
the bus that the device resides on, otherwise we're going to hit all
sorts of nasty issues with LPAE and 32-bit DMA controllers in >32-bit
systems, where memory is offset from PFN 0.

In order to start doing this, we convert the DMA mask to a PFN using
the device specific dma_to_pfn() macro.  This is the reverse of the
pfn_to_dma() macro which is used to get the DMA address for the device.

This gives us a PFN mask, which we can then check against the PFN
limit of the DMA zone.
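
A minimal sketch of this mask-to-PFN conversion, assuming a bus that sees
RAM at a fixed page offset. The toy_* names, the struct layout and the
4 KiB page size are hypothetical; the kernel's dma_to_pfn() and
pfn_to_dma() are per-platform macros and may differ in detail.

```c
#include <assert.h>
#include <stdint.h>

#define TOY_PAGE_SHIFT 12u  /* assumed 4 KiB pages */

struct toy_bus {
	uint64_t pfn_offset;    /* PFN that bus address 0 corresponds to */
};

/* Bus (DMA) address to page frame number. */
static uint64_t toy_dma_to_pfn(const struct toy_bus *bus, uint64_t dma_addr)
{
	return bus->pfn_offset + (dma_addr >> TOY_PAGE_SHIFT);
}

/* Page frame number back to bus (DMA) address: the reverse mapping. */
static uint64_t toy_pfn_to_dma(const struct toy_bus *bus, uint64_t pfn)
{
	assert(pfn >= bus->pfn_offset);
	return (pfn - bus->pfn_offset) << TOY_PAGE_SHIFT;
}

/* Does a device with this DMA mask reach every page up to limit_pfn? */
static int toy_mask_covers(const struct toy_bus *bus, uint64_t mask,
			   uint64_t limit_pfn)
{
	return toy_dma_to_pfn(bus, mask) >= limit_pfn;
}
```

For example, with RAM starting at 8 GiB (PFN 0x200000) and 1 GiB fitted,
a 32-bit mask converts to PFN 0x2fffff and covers the last RAM page at PFN
0x23ffff, whereas reading the mask as a plain physical address would give
PFN 0xfffff and wrongly fail the check.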

Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
2013-10-31 14:49:21 +00:00
Russell King
0ea1ec713f ARM: dma-mapping: don't allow DMA mappings to be marked executable
DMA mapping permissions were being derived from pgprot_kernel directly
without using PAGE_KERNEL.  This causes them to be marked with executable
permission, which is not what we want.  Fix this.

Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
2013-10-24 11:17:27 +01:00
Andreas Herrmann
c9b24996d5 ARM: dma-mapping: Always pass proper prot flags to iommu_map()
... otherwise it is impossible for the low level iommu driver to
figure out which pte flags should be used.

In __map_sg_chunk we can derive the flags from dma_data_direction.

In __iommu_create_mapping we should treat the memory like
DMA_BIDIRECTIONAL and pass both IOMMU_READ and IOMMU_WRITE to
iommu_map.
__iommu_create_mapping is used during dma_alloc_coherent (via
arm_iommu_alloc_attrs).  AFAIK dma_alloc_coherent is responsible for
allocation _and_ mapping.  I think this implies that access to the
mapped pages should be allowed.

Cc: Marek Szyprowski <m.szyprowski@samsung.com>
Signed-off-by: Andreas Herrmann <andreas.herrmann@calxeda.com>
Acked-by: Will Deacon <will.deacon@arm.com>
Signed-off-by: Marek Szyprowski <m.szyprowski@samsung.com>
2013-10-02 13:23:11 +02:00
Linus Torvalds
2e03285224 Merge branch 'for-linus' of git://git.linaro.org/people/rmk/linux-arm
Pull ARM updates from Russell King:
 "This set includes adding support for Neon acceleration of RAID6 XOR
  code from Ard Biesheuvel, cache flushing and barrier updates from Will
  Deacon, and a cleanup to the ARM debug code which reduces the amount
  of code by about 500 lines.

  A few other cleanups, such as constifying the machine descriptors
  which already shouldn't be written to, cleaning up the printing of the
  L2 cache size"

* 'for-linus' of git://git.linaro.org/people/rmk/linux-arm: (55 commits)
  ARM: 7826/1: debug: support debug ll on hisilicon soc
  ARM: 7830/1: delay: don't bother reporting bogomips in /proc/cpuinfo
  ARM: 7829/1: Add ".text.unlikely" and ".text.hot" to arm unwind tables
  ARM: 7828/1: ARMv7-M: implement restart routine common to all v7-M machines
  ARM: 7827/1: highbank: fix debug uart virtual address for LPAE
  ARM: 7823/1: errata: workaround Cortex-A15 erratum 773022
  ARM: 7806/1: allow DEBUG_UNCOMPRESS for Tegra
  ARM: 7793/1: debug: use generic option for ep93xx PL10x debug port
  ARM: debug: move SPEAr debug to generic PL01x code
  ARM: debug: move davinci debug to generic 8250 code
  ARM: debug: move keystone debug to generic 8250 code
  ARM: debug: remove DEBUG_ROCKCHIP_UART
  ARM: debug: provide generic option choices for 8250 and PL01x ports
  ARM: debug: move PL01X debug include into arch/arm/include/debug/
  ARM: debug: provide PL01x debug uart phys/virt address configuration options
  ARM: debug: add support for word accesses to debug/8250.S
  ARM: debug: move 8250 debug include into arch/arm/include/debug/
  ARM: debug: provide 8250 debug uart phys/virt address configuration options
  ARM: debug: provide 8250 debug uart register shift configuration option
  ARM: debug: provide 8250 debug uart flow control configuration option
  ...
2013-09-05 18:07:32 -07:00
Alexander Graf
bf550fc93d Merge remote-tracking branch 'origin/next' into kvm-ppc-next
Conflicts:
	mm/Kconfig

CMA DMA split and ZSWAP introduction were conflicting, fix up manually.
2013-08-29 00:41:59 +02:00
Will Deacon
792a843a9f ARM: mm: remove redundant dsb() prior to range TLB invalidation
The kernel TLB range invalidation functions already contain dsb
instructions before and after the maintenance, so there is no need to
introduce additional barriers.

Reviewed-by: Catalin Marinas <catalin.marinas@arm.com>
Signed-off-by: Will Deacon <will.deacon@arm.com>
2013-08-12 12:25:44 +01:00
Alexander Graf
20f7462aac Merge remote-tracking branch 'cmadma/for-v3.12-cma-dma' into kvm-ppc-next
Add prerequisite patch for CMA RMA allocation patches
2013-07-08 16:16:56 +02:00
Linus Torvalds
8b70a90cab Merge branch 'for-v3.11' of git://git.linaro.org/people/mszyprowski/linux-dma-mapping
Pull ARM DMA mapping updates from Marek Szyprowski:
 "This contains important bugfixes and an update for IOMMU integration
  support for ARM architecture"

* 'for-v3.11' of git://git.linaro.org/people/mszyprowski/linux-dma-mapping:
  ARM: dma: Drop __GFP_COMP for iommu dma memory allocations
  ARM: DMA-mapping: mark all !DMA_TO_DEVICE pages in unmapping as clean
  ARM: dma-mapping: NULLify dev->archdata.mapping pointer on detach
  ARM: dma-mapping: convert DMA direction into IOMMU protection attributes
  ARM: dma-mapping: Get pages if the cpu_addr is out of atomic_pool
2013-07-06 12:41:54 -07:00
Aneesh Kumar K.V
f825c736e7 mm/cma: Move dma contiguous changes into a seperate config
We want to use CMA for allocating hash page table and real mode area for
PPC64. Hence move DMA contiguous related changes into a separate config
so that ppc64 can enable CMA without requiring DMA contiguous.

Acked-by: Michal Nazarewicz <mina86@mina86.com>
Acked-by: Paul Mackerras <paulus@samba.org>
Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com>
[removed defconfig changes]
Signed-off-by: Marek Szyprowski <m.szyprowski@samsung.com>
2013-07-02 10:08:22 +02:00
Russell King
3c0c01ab74 Merge branch 'devel-stable' into for-next
Conflicts:
	arch/arm/Makefile
	arch/arm/include/asm/glue-proc.h
2013-06-29 11:44:43 +01:00
Richard Zhao
5b91a98c61 ARM: dma: Drop __GFP_COMP for iommu dma memory allocations
__iommu_alloc_buffer wants to split pages after allocation in order to
reduce the memory footprint. This does not work well with __GFP_COMP
pages, so drop this flag before allocation

One failure example is snd_malloc_dev_pages call dma_alloc_coherent with
__GFP_COMP.

Signed-off-by: Richard Zhao <rizhao@nvidia.com>
Signed-off-by: Marek Szyprowski <m.szyprowski@samsung.com>
2013-06-28 15:14:29 +02:00
Ming Lei
63c181922f ARM: DMA-mapping: mark all !DMA_TO_DEVICE pages in unmapping as clean
It is common for one sg to include many pages, so mark all these
pages as clean to avoid unnecessary flushing on them in
set_pte_at() or update_mmu_cache().

The patch might improve loading performance of application code a bit.

With the test code below, which reads a file (~1GByte in size) from a USB
mass storage disk into a buffer created with mmap(PROT_READ | PROT_EXEC)
on a Pandaboard, an average improvement of ~1% was observed with the patch
over 10 test runs.

#include <assert.h>
#include <fcntl.h>
#include <stdio.h>
#include <sys/mman.h>
#include <sys/stat.h>
#include <sys/time.h>
#include <unistd.h>

unsigned int sum = 0;

/* elapsed time between two timestamps, in microseconds */
static unsigned long tv_diff(struct timeval *tv1, struct timeval *tv2)
{
	return (tv2->tv_sec - tv1->tv_sec) * 1000000 +
		(tv2->tv_usec - tv1->tv_usec);
}

int main(int argc, char *argv[])
{
	char *mbuffer;
	int fd;
	unsigned long i;
	unsigned long page_size, size;
	struct stat stat;
	struct timeval t1, t2;

	assert(argc > 1);
	page_size = getpagesize();
	fd = open(argv[1], O_RDONLY);
	assert(fd >= 0);

	fstat(fd, &stat);
	size = stat.st_size;
	printf("%s: file %s, file size %lu, page size %lu\n", argv[0],
		argv[1], size, page_size);

	gettimeofday(&t1, NULL);
	mbuffer = mmap(NULL, size, PROT_READ | PROT_EXEC, MAP_SHARED, fd, 0);
	assert(mbuffer != MAP_FAILED);
	/* touch one byte per page so that every page is faulted in */
	for (i = 0 ; i < size ; i += page_size)
		sum += mbuffer[i];
	munmap(mbuffer, size);
	gettimeofday(&t2, NULL);
	printf("\tread mmaped time: %luus\n", tv_diff(&t1, &t2));

	close(fd);
	return 0;
}

Acked-by: Nicolas Pitre <nicolas.pitre@linaro.org>
Cc: Catalin Marinas <catalin.marinas@arm.com>
Cc: Marek Szyprowski <m.szyprowski@samsung.com>
Cc: Russell King <linux@arm.linux.org.uk>
Signed-off-by: Ming Lei <ming.lei@canonical.com>
Signed-off-by: Marek Szyprowski <m.szyprowski@samsung.com>
2013-06-28 15:14:28 +02:00
Will Deacon
9e4b259d4f ARM: dma-mapping: NULLify dev->archdata.mapping pointer on detach
The current code only clobbers a local variable, so the device is left
with a stale mapping pointer.

Cc: Hiroshi Doyu <hdoyu@nvidia.com>
Signed-off-by: Will Deacon <will.deacon@arm.com>
Acked-by: Hiroshi Doyu <hdoyu@nvidia.com>
Signed-off-by: Marek Szyprowski <m.szyprowski@samsung.com>
2013-06-28 15:14:27 +02:00
Will Deacon
13987d68bc ARM: dma-mapping: convert DMA direction into IOMMU protection attributes
IOMMU mappings take a prot parameter, identifying the protection bits
to enforce on the newly created mapping (READ or WRITE). The ARM
dma-mapping framework currently just passes 0 as the prot argument,
resulting in faulting mappings.

This patch infers the protection attributes based on the direction of
the DMA transfer.
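
The direction-to-protection mapping can be sketched as below. The enum and
flag names are stand-ins for the kernel's enum dma_data_direction and
IOMMU_READ/IOMMU_WRITE, and their numeric values are chosen only for the
example.

```c
#include <assert.h>

/* Stand-ins for the kernel's DMA directions and IOMMU protection flags. */
enum toy_dma_dir {
	TOY_BIDIRECTIONAL,
	TOY_TO_DEVICE,
	TOY_FROM_DEVICE,
	TOY_NONE
};

#define TOY_IOMMU_READ  (1 << 0)
#define TOY_IOMMU_WRITE (1 << 1)

static int toy_dir_to_prot(enum toy_dma_dir dir)
{
	switch (dir) {
	case TOY_BIDIRECTIONAL:
		return TOY_IOMMU_READ | TOY_IOMMU_WRITE;
	case TOY_TO_DEVICE:
		return TOY_IOMMU_READ;   /* the device only reads memory */
	case TOY_FROM_DEVICE:
		return TOY_IOMMU_WRITE;  /* the device only writes memory */
	default:
		return 0;                /* no access at all */
	}
}

/* Usage: passing 0, as the old code did, grants no access and faults. */
void toy_prot_demo(void)
{
	assert(toy_dir_to_prot(TOY_TO_DEVICE) == TOY_IOMMU_READ);
	assert(toy_dir_to_prot(TOY_NONE) == 0);
}
```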

Cc: Marek Szyprowski <m.szyprowski@samsung.com>
Signed-off-by: Will Deacon <will.deacon@arm.com>
Signed-off-by: Marek Szyprowski <m.szyprowski@samsung.com>
2013-06-28 15:14:27 +02:00
YoungJun Cho
836bfa0d29 ARM: dma-mapping: Get pages if the cpu_addr is out of atomic_pool
In __iommu_get_pages(), the cpu_addr is checked to see whether it is in
the atomic_pool range or not. So if the cpu_addr is in the atomic_pool
range, it does not need to be checked twice.

Signed-off-by: YoungJun Cho <yj44.cho@samsung.com>
Signed-off-by: Kyungmin Park <kyungmin.park@samsung.com>
Signed-off-by: Marek Szyprowski <m.szyprowski@samsung.com>
2013-06-28 15:14:27 +02:00
Catalin Marinas
1355e2a6eb ARM: mm: HugeTLB support for LPAE systems.
This patch adds support for hugetlbfs based on the x86 implementation.
It allows mapping of 2MB sections (see Documentation/vm/hugetlbpage.txt
for usage). The 64K pages configuration is not supported (section size
is 512MB in this case).

Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
[steve.capper@linaro.org: symbolic constants replace numbers in places.
Split up into multiple files, to simplify future non-LPAE support,
removed huge_pmd_share code, as this is very rarely executed,
Added PROT_NONE support].
Signed-off-by: Steve Capper <steve.capper@linaro.org>
Reviewed-by: Will Deacon <will.deacon@arm.com>
2013-06-04 16:52:37 +01:00
Ming Lei
b2a234ed64 ARM: 7730/1: DMA-mapping: mark all !DMA_TO_DEVICE pages in unmapping as clean
It is common for one sg to include many pages, so mark all these
pages as clean to avoid unnecessary flushing on them in
set_pte_at() or update_mmu_cache().

The patch might improve loading performance of application code a bit.

With the test code below, which reads a file (~1GByte in size) from a USB
mass storage disk into a buffer created with mmap(PROT_READ | PROT_EXEC)
on a Pandaboard, an average improvement of ~1% was observed with the patch
over 10 test runs.

#include <assert.h>
#include <fcntl.h>
#include <stdio.h>
#include <sys/mman.h>
#include <sys/stat.h>
#include <sys/time.h>
#include <unistd.h>

unsigned int sum = 0;
/* elapsed time between two timestamps, in microseconds */
static unsigned long tv_diff(struct timeval *tv1, struct timeval *tv2)
{
	return (tv2->tv_sec - tv1->tv_sec) * 1000000 + (tv2->tv_usec - tv1->tv_usec);
}
int main(int argc, char *argv[])
{
	char *mbuffer;
	int fd;
	unsigned long i;
	unsigned long page_size, size;
	struct stat stat;
	struct timeval t1, t2;

	assert(argc > 1);
	page_size = getpagesize();
	fd = open(argv[1], O_RDONLY);
	assert(fd >= 0);

	fstat(fd, &stat);
	size = stat.st_size;
	printf("%s: file %s, file size %lu, page size %lu\n", argv[0],
	        argv[1], size, page_size);

	gettimeofday(&t1, NULL);
	mbuffer = mmap(NULL, size, PROT_READ | PROT_EXEC, MAP_SHARED, fd, 0);
	assert(mbuffer != MAP_FAILED);
	/* touch one byte per page so that every page is faulted in */
	for (i = 0 ; i < size ; i += page_size)
	        sum += mbuffer[i];
	munmap(mbuffer, size);
	gettimeofday(&t2, NULL);
	printf("\tread mmaped time: %luus\n", tv_diff(&t1, &t2));

	close(fd);
	return 0;
}

Acked-by: Nicolas Pitre <nicolas.pitre@linaro.org>
Cc: Catalin Marinas <catalin.marinas@arm.com>
Cc: Marek Szyprowski <m.szyprowski@samsung.com>
Signed-off-by: Ming Lei <ming.lei@canonical.com>
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
2013-05-23 00:09:45 +01:00
Russell King
946342d03e Merge branches 'devel-stable', 'entry', 'fixes', 'mach-types', 'misc' and 'smp-hotplug' into for-linus 2013-05-02 21:30:36 +01:00
Joonsoo Kim
dd0f67f474 ARM: 7693/1: mm: clean-up in order to reduce to call kmap_high_get()
In kmap_atomic(), kmap_high_get() is invoked for checking already
mapped area. In __flush_dcache_page() and dma_cache_maint_page(),
we explicitly call kmap_high_get() before kmap_atomic()
when cache_is_vipt(), so kmap_high_get() can be invoked twice.
This is a useless operation, so remove one.

v2: change cache_is_vipt() to cache_is_vipt_nonaliasing() in order to
be self-documented

Acked-by: Nicolas Pitre <nico@linaro.org>
Signed-off-by: Joonsoo Kim <iamjoonsoo.kim@lge.com>
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
2013-04-17 16:55:01 +01:00
Marek Szyprowski
9d1400cf79 ARM: DMA-mapping: add missing GFP_DMA flag for atomic buffer allocation
Atomic pool should always be allocated from DMA zone if such zone is
available in the system to avoid issues caused by limited dma mask of
any of the devices used for making an atomic allocation.

Reported-by: Krzysztof Halasa <khc@pm.waw.pl>
Signed-off-by: Marek Szyprowski <m.szyprowski@samsung.com>
Cc: Stable <stable@vger.kernel.org>	[v3.6+]
2013-03-14 09:25:19 +01:00
Marek Szyprowski
d589829107 ARM: DMA-mapping: fix memory leak in IOMMU dma-mapping implementation
This patch removes page_address() usage in the IOMMU-aware dma-mapping
implementation and replaces it with direct use of the cpu virtual address
provided by the caller. page_address() returned an incorrect address for
pages remapped in the atomic pool, which caused a memory leak.

Reported-by: Hiroshi Doyu <hdoyu@nvidia.com>
Signed-off-by: Marek Szyprowski <m.szyprowski@samsung.com>
Tested-by: Hiroshi Doyu <hdoyu@nvidia.com>
2013-02-25 15:30:44 +01:00
Seung-Woo Kim
60460abffc ARM: dma-mapping: Add maximum alignment order for dma iommu buffers
Alignment order for a dma iommu buffer is set by buffer size. For a
large buffer, this is a waste of iommu address space, so a configurable
parameter to limit the maximum alignment order can reduce the waste.
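
A small sketch of such a cap, assuming 4 KiB pages. The names
TOY_MAX_ALIGN_ORDER, toy_get_order() and toy_iova_align_order() are made up
for the example, and the cap value of 8 is arbitrary.

```c
#include <assert.h>
#include <stddef.h>

#define TOY_PAGE_SHIFT      12u /* assumed 4 KiB pages */
#define TOY_PAGE_SIZE       ((size_t)1 << TOY_PAGE_SHIFT)
#define TOY_MAX_ALIGN_ORDER 8u  /* hypothetical cap: 2^8 pages = 1 MiB */

/* Smallest order such that (1 << order) pages hold 'size' bytes. */
static unsigned int toy_get_order(size_t size)
{
	size_t pages = (size + TOY_PAGE_SIZE - 1) >> TOY_PAGE_SHIFT;
	unsigned int order = 0;

	assert(size > 0);
	while (((size_t)1 << order) < pages)
		order++;
	return order;
}

/* Alignment order for an IO virtual address range, limited by the cap. */
static unsigned int toy_iova_align_order(size_t size)
{
	unsigned int order = toy_get_order(size);

	if (order > TOY_MAX_ALIGN_ORDER)
		order = TOY_MAX_ALIGN_ORDER;
	return order;
}
```

Without the cap a 16 MiB buffer would be aligned to 16 MiB (order 12);
with the cap above it is aligned to 1 MiB instead, which leaves fewer
unusable holes in the IO address space.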

Signed-off-by: Seung-Woo Kim <sw0312.kim@samsung.com>
Signed-off-by: Kyungmin.park <kyungmin.park@samsung.com>
Signed-off-by: Marek Szyprowski <m.szyprowski@samsung.com>
2013-02-25 15:30:43 +01:00
Marek Szyprowski
f8669bef11 ARM: dma-mapping: use himem for DMA buffers for IOMMU-mapped devices
IOMMU can provide access to any memory page, so there is no point in
limiting the allocated pages only to lowmem, now that other parts of
the dma-mapping subsystem correctly support himem pages.

Signed-off-by: Marek Szyprowski <m.szyprowski@samsung.com>
2013-02-25 15:30:43 +01:00
Marek Szyprowski
9848e48f4c ARM: dma-mapping: add support for CMA regions placed in highmem zone
This patch adds missing pieces to correctly support memory pages served
from CMA regions placed in high memory zones. Please note that the default
global CMA area is still put into lowmem and is limited by optional
architecture specific DMA zone. One can however put device specific CMA
regions in high memory zone to reduce lowmem usage.

Signed-off-by: Marek Szyprowski <m.szyprowski@samsung.com>
Signed-off-by: Kyungmin Park <kyungmin.park@samsung.com>
Acked-by: Michal Nazarewicz <mina86@mina86.com>
2013-02-25 15:30:42 +01:00
Prathyush K
18177d12c0 arm: dma mapping: export arm iommu functions
This patch adds EXPORT_SYMBOL_GPL calls to the three arm iommu
functions - arm_iommu_create_mapping, arm_iommu_free_mapping
and arm_iommu_attach_device. These three functions are arm specific
wrapper functions for creating/freeing/using an iommu mapping and
they are called by various drivers. If any of these drivers need
to be built as dynamic modules, these functions need to be exported.

Changelog v2: using EXPORT_SYMBOL_GPL as suggested by Marek.

Signed-off-by: Prathyush K <prathyush.k@samsung.com>
[m.szyprowski: extended with recently introduced
 EXPORT_SYMBOL_GPL(arm_iommu_detach_device)]
Signed-off-by: Marek Szyprowski <m.szyprowski@samsung.com>
2013-02-25 15:30:42 +01:00
Hiroshi Doyu
6fe3675803 ARM: dma-mapping: Add arm_iommu_detach_device()
A counterpart of arm_iommu_attach_device().

Signed-off-by: Hiroshi Doyu <hdoyu@nvidia.com>
Signed-off-by: Marek Szyprowski <m.szyprowski@samsung.com>
2013-02-25 15:30:41 +01:00
Hiroshi Doyu
d09e1333ec ARM: dma-mapping: Set arm_dma_set_mask() for iommu->set_dma_mask()
struct dma_map_ops iommu_ops doesn't have ->set_dma_mask, which causes
a crash when dma_set_mask() is called from a driver.

Signed-off-by: Hiroshi Doyu <hdoyu@nvidia.com>
Signed-off-by: Marek Szyprowski <m.szyprowski@samsung.com>
2013-02-25 15:30:41 +01:00
Russell King
633dc92a28 ARM: DMA mapping: fix bad atomic test
Realview fails to boot with this warning:
BUG: spinlock lockup suspected on CPU#0, init/1
 lock: 0xcf8bde10, .magic: dead4ead, .owner: init/1, .owner_cpu: 0
Backtrace:
[<c00185d8>] (dump_backtrace+0x0/0x10c) from [<c03294e8>] (dump_stack+0x18/0x1c)
 r6:cf8bde10 r5:cf83d1c0 r4:cf8bde10 r3:cf83d1c0
[<c03294d0>] (dump_stack+0x0/0x1c) from [<c018926c>] (spin_dump+0x84/0x98)
[<c01891e8>] (spin_dump+0x0/0x98) from [<c0189460>] (do_raw_spin_lock+0x100/0x198)
[<c0189360>] (do_raw_spin_lock+0x0/0x198) from [<c032cbac>] (_raw_spin_lock+0x3c/0x44)
[<c032cb70>] (_raw_spin_lock+0x0/0x44) from [<c01c9224>] (pl011_console_write+0xe8/0x11c)
[<c01c913c>] (pl011_console_write+0x0/0x11c) from [<c002aea8>] (call_console_drivers.clone.7+0xdc/0x104)
[<c002adcc>] (call_console_drivers.clone.7+0x0/0x104) from [<c002b320>] (console_unlock+0x2e8/0x454)
[<c002b038>] (console_unlock+0x0/0x454) from [<c002b8b4>] (vprintk_emit+0x2d8/0x594)
[<c002b5dc>] (vprintk_emit+0x0/0x594) from [<c0329718>] (printk+0x3c/0x44)
[<c03296dc>] (printk+0x0/0x44) from [<c002929c>] (warn_slowpath_common+0x28/0x6c)
[<c0029274>] (warn_slowpath_common+0x0/0x6c) from [<c0029304>] (warn_slowpath_null+0x24/0x2c)
[<c00292e0>] (warn_slowpath_null+0x0/0x2c) from [<c0070ab0>] (lockdep_trace_alloc+0xd8/0xf0)
[<c00709d8>] (lockdep_trace_alloc+0x0/0xf0) from [<c00c0850>] (kmem_cache_alloc+0x24/0x11c)
[<c00c082c>] (kmem_cache_alloc+0x0/0x11c) from [<c00bb044>] (__get_vm_area_node.clone.24+0x7c/0x16c)
[<c00bafc8>] (__get_vm_area_node.clone.24+0x0/0x16c) from [<c00bb7b8>] (get_vm_area_caller+0x48/0x54)
[<c00bb770>] (get_vm_area_caller+0x0/0x54) from [<c0020064>] (__alloc_remap_buffer.clone.15+0x38/0xb8)
[<c002002c>] (__alloc_remap_buffer.clone.15+0x0/0xb8) from [<c0020244>] (__dma_alloc+0x160/0x2c8)
[<c00200e4>] (__dma_alloc+0x0/0x2c8) from [<c00204d8>] (arm_dma_alloc+0x88/0xa0)
[<c0020450>] (arm_dma_alloc+0x0/0xa0) from [<c00beb00>] (dma_pool_alloc+0xcc/0x1a8)
[<c00bea34>] (dma_pool_alloc+0x0/0x1a8) from [<c01a9d14>] (pl08x_fill_llis_for_desc+0x28/0x568)
[<c01a9cec>] (pl08x_fill_llis_for_desc+0x0/0x568) from [<c01aab8c>] (pl08x_prep_slave_sg+0x258/0x3b0)
[<c01aa934>] (pl08x_prep_slave_sg+0x0/0x3b0) from [<c01c9f74>] (pl011_dma_tx_refill+0x140/0x288)
[<c01c9e34>] (pl011_dma_tx_refill+0x0/0x288) from [<c01ca748>] (pl011_start_tx+0xe4/0x120)
[<c01ca664>] (pl011_start_tx+0x0/0x120) from [<c01c54a4>] (__uart_start+0x48/0x4c)
[<c01c545c>] (__uart_start+0x0/0x4c) from [<c01c632c>] (uart_start+0x2c/0x3c)
[<c01c6300>] (uart_start+0x0/0x3c) from [<c01c795c>] (uart_write+0xcc/0xf4)
[<c01c7890>] (uart_write+0x0/0xf4) from [<c01b0384>] (n_tty_write+0x1c0/0x3e4)
[<c01b01c4>] (n_tty_write+0x0/0x3e4) from [<c01acfe8>] (tty_write+0x144/0x240)
[<c01acea4>] (tty_write+0x0/0x240) from [<c01ad17c>] (redirected_tty_write+0x98/0xac)
[<c01ad0e4>] (redirected_tty_write+0x0/0xac) from [<c00c371c>] (vfs_write+0xbc/0x150)
[<c00c3660>] (vfs_write+0x0/0x150) from [<c00c39c0>] (sys_write+0x4c/0x78)
[<c00c3974>] (sys_write+0x0/0x78) from [<c0014460>] (ret_fast_syscall+0x0/0x3c)

This happens because the DMA allocation code is not respecting atomic
allocations correctly.

GFP flags should not be tested for GFP_ATOMIC to determine if an
atomic allocation is being requested.  GFP_ATOMIC is not a flag but
a value.  The GFP bitmask flags are all prefixed with __GFP_.

The rest of the kernel tests for __GFP_WAIT not being set to indicate
an atomic allocation.  We need to do the same.
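
The difference between the two tests can be shown with a few stand-in
flag definitions. The TOY_* values below are illustrative only and are not
taken from the kernel headers; what matters is that the atomic "value"
does not contain the wait bit.

```c
#include <assert.h>

/* Illustrative bit values; the real definitions live in the kernel. */
#define TOY_GFP_WAIT 0x10u  /* caller is allowed to sleep */
#define TOY_GFP_HIGH 0x20u  /* may use emergency reserves */
#define TOY_GFP_IO   0x40u
#define TOY_GFP_FS   0x80u

#define TOY_GFP_ATOMIC (TOY_GFP_HIGH)
#define TOY_GFP_KERNEL (TOY_GFP_WAIT | TOY_GFP_IO | TOY_GFP_FS)

/* Wrong: treats the composite value as if it were a single flag. */
static int toy_is_atomic_buggy(unsigned int gfp)
{
	return (gfp & TOY_GFP_ATOMIC) != 0;
}

/* Right: an allocation is atomic when the caller may not wait. */
static int toy_is_atomic(unsigned int gfp)
{
	return !(gfp & TOY_GFP_WAIT);
}

/* Usage: the two tests disagree for these flag combinations. */
void toy_gfp_demo(void)
{
	/* no flags at all: cannot wait, but the buggy test misses it */
	assert(toy_is_atomic(0) && !toy_is_atomic_buggy(0));
	/* sleeping allocation with HIGH set: buggy test calls it atomic */
	assert(!toy_is_atomic(TOY_GFP_KERNEL | TOY_GFP_HIGH));
	assert(toy_is_atomic_buggy(TOY_GFP_KERNEL | TOY_GFP_HIGH));
}
```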

Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
2013-02-08 10:25:23 +00:00
Russell King
15653371c6 ARM: DMA: Fix struct page iterator in dma_cache_maint() to work with sparsemem
Subhash Jadavani reported this partial backtrace:
  Now consider this call stack from MMC block driver (this is on the ARMv7
  based board):

  [<c001b50c>] (v7_dma_inv_range+0x30/0x48) from [<c0017b8c>] (dma_cache_maint_page+0x1c4/0x24c)
  [<c0017b8c>] (dma_cache_maint_page+0x1c4/0x24c) from [<c0017c28>] (___dma_page_cpu_to_dev+0x14/0x1c)
  [<c0017c28>] (___dma_page_cpu_to_dev+0x14/0x1c) from [<c0017ff8>] (dma_map_sg+0x3c/0x114)

This is caused by incrementing the struct page pointer, and running off
the end of the sparsemem page array.  Fix this by incrementing by pfn
instead, and convert the pfn to a struct page.
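
A minimal userspace sketch of the idea, assuming a toy memory model with two
sections whose page arrays are separate objects. All names here (model_init,
model_pfn_to_page, model_page_to_pfn, walk_by_pfn) and the section sizes are
hypothetical; this is not the code from the patch.

```c
#include <assert.h>
#include <stddef.h>

#define PAGES_PER_SECTION 4u
#define NR_SECTIONS       2u

struct page {
	unsigned long pfn; /* the model stores the pfn so it can be recovered */
};

/* Each section owns its own array; nothing says they are adjacent. */
static struct page section0[PAGES_PER_SECTION];
static struct page section1[PAGES_PER_SECTION];
static struct page *const sections[NR_SECTIONS] = { section0, section1 };

static inline void model_init(void)
{
	for (unsigned long pfn = 0; pfn < NR_SECTIONS * PAGES_PER_SECTION; pfn++)
		sections[pfn / PAGES_PER_SECTION][pfn % PAGES_PER_SECTION].pfn = pfn;
}

/* Look the page up through its section instead of doing pointer arithmetic. */
static inline struct page *model_pfn_to_page(unsigned long pfn)
{
	return &sections[pfn / PAGES_PER_SECTION][pfn % PAGES_PER_SECTION];
}

static inline unsigned long model_page_to_pfn(const struct page *page)
{
	return page->pfn;
}

/*
 * Visit nr pages starting at 'first'. The loop steps the pfn and converts
 * it back to a struct page each time, so crossing a section boundary never
 * runs off the end of one section's array. Returns the sum of visited pfns
 * as a stand-in for the per-page cache maintenance work.
 */
static inline unsigned long walk_by_pfn(struct page *first, unsigned long nr)
{
	unsigned long pfn = model_page_to_pfn(first);
	unsigned long visited = 0;

	for (unsigned long i = 0; i < nr; i++, pfn++)
		visited += model_pfn_to_page(pfn)->pfn;

	return visited;
}
```

Writing `first + i` instead would step past the last element of section0 as
soon as the walk crosses into the second section, which is the failure mode
described above.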

Cc: <stable@vger.kernel.org>
Suggested-by: James Bottomley <JBottomley@Parallels.com>
Tested-by: Subhash Jadavani <subhashj@codeaurora.org>
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
2013-01-19 11:05:57 +00:00
Linus Torvalds
3c2e81ef34 Merge branch 'drm-next' of git://people.freedesktop.org/~airlied/linux
Pull DRM updates from Dave Airlie:
 "This is the one and only next pull for 3.8, we had a regression we
  found last week, so I was waiting for that to resolve itself, and I
  ended up with some Intel fixes on top as well.

  Highlights:
   - new driver: nvidia tegra 20/30/hdmi support
   - radeon: add support for previously unused DMA engines, more HDMI
     regs, eviction speeds ups and fixes
   - i915: HSW support enable, agp removal on GEN6, seqno wrapping
   - exynos: IPP subsystem support (image post proc), HDMI
   - nouveau: display class reworking, nv20->40 z compression
   - ttm: start of locking fixes, rcu usage for lookups,
   - core: documentation updates, docbook integration, monotonic clock
     usage, move from connector to object properties"

* 'drm-next' of git://people.freedesktop.org/~airlied/linux: (590 commits)
  drm/exynos: add gsc ipp driver
  drm/exynos: add rotator ipp driver
  drm/exynos: add fimc ipp driver
  drm/exynos: add iommu support for ipp
  drm/exynos: add ipp subsystem
  drm/exynos: support device tree for fimd
  radeon: fix regression with eviction since evict caching changes
  drm/radeon: add more pedantic checks in the CP DMA checker
  drm/radeon: bump version for CS ioctl support for async DMA
  drm/radeon: enable the async DMA rings in the CS ioctl
  drm/radeon: add VM CS parser support for async DMA on cayman/TN/SI
  drm/radeon/kms: add evergreen/cayman CS parser for async DMA (v2)
  drm/radeon/kms: add 6xx/7xx CS parser for async DMA (v2)
  drm/radeon: fix htile buffer size computation for command stream checker
  drm/radeon: fix fence locking in the pageflip callback
  drm/radeon: make indirect register access concurrency-safe
  drm/radeon: add W|RREG32_IDX for MM_INDEX|DATA based mmio accesss
  drm/exynos: support extended screen coordinate of fimd
  drm/exynos: fix x, y coordinates for right bottom pixel
  drm/exynos: fix fb offset calculation for plane
  ...
2012-12-17 08:26:17 -08:00
Marek Szyprowski
549a17e447 ARM: dma-mapping: add support for DMA_ATTR_FORCE_CONTIGUOUS attribute
This patch adds support for the DMA_ATTR_FORCE_CONTIGUOUS attribute for
dma_alloc_attrs() in the IOMMU-aware implementation. Physically
contiguous buffers are allocated with the Contiguous Memory Allocator.

Signed-off-by: Marek Szyprowski <m.szyprowski@samsung.com>
2012-11-29 03:30:34 -08:00
Gregory CLEMENT
87b54e786a arm: dma mapping: Export a dma ops function arm_dma_set_mask
Expose another DMA operations function: arm_dma_set_mask. This
function will be added to the custom DMA ops for Armada 370/XP.
Depending on its configuration, Armada 370/XP can be set up as a "nearly"
coherent architecture. In this case the DMA ops are made of:
- specific functions for this architecture
- already exposed ARM DMA related functions
- arm_dma_set_mask, which was not exposed yet.

Signed-off-by: Gregory CLEMENT <gregory.clement@free-electrons.com>
Acked-by: Marek Szyprowski <m.szyprowski@samsung.com>
2012-11-21 17:07:49 +01:00
Jingoo Han
3dd7ea9220 ARM: dma-mapping: fix build warning in __dma_alloc()
Fix build warning in __dma_alloc() as below:

arch/arm/mm/dma-mapping.c: In function '__dma_alloc':
arch/arm/mm/dma-mapping.c:653:29: warning: 'page' may be used uninitialized in this function [-Wuninitialized]

Signed-off-by: Jingoo Han <jg1.han@samsung.com>
Signed-off-by: Marek Szyprowski <m.szyprowski@samsung.com>
2012-10-24 07:38:15 +02:00
Marek Szyprowski
461b6f0d3d Merge branch 'next-cleanup' into for-v3.7 2012-10-02 09:24:24 +02:00
Hiroshi Doyu
abebfb18ea ARM: dma-mapping: Remove unsed var at arm_coherent_iommu_unmap_page
Signed-off-by: Hiroshi Doyu <hdoyu@nvidia.com>
Signed-off-by: Marek Szyprowski <m.szyprowski@samsung.com>
2012-10-02 08:58:08 +02:00
Rob Herring
0fa478df44 ARM: add coherent iommu dma ops
Remove arch_is_coherent() from iommu dma ops and implement separate
coherent ops functions.

Signed-off-by: Rob Herring <rob.herring@calxeda.com>
Cc: Russell King <linux@arm.linux.org.uk>
Cc: Marek Szyprowski <m.szyprowski@samsung.com>
Signed-off-by: Marek Szyprowski <m.szyprowski@samsung.com>
2012-10-02 08:58:06 +02:00
Rob Herring
dd37e9405a ARM: add coherent dma ops
arch_is_coherent is problematic as it is a global symbol. This
doesn't work for multi-platform kernels or platforms which can support
per device coherent DMA.

This adds arm_coherent_dma_ops to be used for devices which are connected
coherently (i.e. to the ACP port on Cortex-A9 or A15). The arm_dma_ops
are modified at boot when arch_is_coherent is true.
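
The move from a global coherency switch to per-device ops can be sketched as
follows. This is a toy model under stated assumptions: struct model_dma_ops,
struct model_device, model_get_ops() and model_map() are hypothetical names,
and the return value of the sync hook only records whether cache maintenance
would have been done.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical, heavily reduced ops table with a single hook. */
struct model_dma_ops {
	int (*sync_for_device)(void *buf, size_t len);
};

/* Non-coherent path: cache maintenance is needed (reported as 1). */
static int sync_noncoherent(void *buf, size_t len)
{
	(void)buf;
	(void)len;
	return 1;
}

/* Coherent path: nothing to do (reported as 0). */
static int sync_coherent(void *buf, size_t len)
{
	(void)buf;
	(void)len;
	return 0;
}

static const struct model_dma_ops noncoherent_ops = { sync_noncoherent };
static const struct model_dma_ops coherent_ops = { sync_coherent };

/* Each device carries its own ops pointer instead of consulting a global. */
struct model_device {
	const struct model_dma_ops *ops;
};

/* Devices without explicit ops fall back to the non-coherent default. */
static inline const struct model_dma_ops *
model_get_ops(const struct model_device *dev)
{
	return dev->ops ? dev->ops : &noncoherent_ops;
}

static inline int model_map(const struct model_device *dev, void *buf,
			    size_t len)
{
	return model_get_ops(dev)->sync_for_device(buf, len);
}
```

Because the decision lives in the device, one kernel image can serve a
coherently connected device and a non-coherent one side by side, which a
single global symbol cannot express.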

Signed-off-by: Rob Herring <rob.herring@calxeda.com>
Cc: Russell King <linux@arm.linux.org.uk>
Cc: Marek Szyprowski <m.szyprowski@samsung.com>
Signed-off-by: Marek Szyprowski <m.szyprowski@samsung.com>
2012-10-02 08:58:06 +02:00
Hiroshi Doyu
75c5971614 ARM: dma-mapping: Refrain noisy console message
With many IOMMU'able devices, the console gets noisy.

Tegra30 has a few dozen IOMMU'able devices.

Signed-off-by: Hiroshi Doyu <hdoyu@nvidia.com>
Signed-off-by: Marek Szyprowski <m.szyprowski@samsung.com>
2012-10-02 08:57:45 +02:00
Hiroshi Doyu
5a796eeb7b ARM: dma-mapping: Small logical clean up
Skip unnecessary operations if order == 0. A little bit easier to
read.

Signed-off-by: Hiroshi Doyu <hdoyu@nvidia.com>
Signed-off-by: Marek Szyprowski <m.szyprowski@samsung.com>
2012-10-02 08:57:45 +02:00
Sachin Kamat
ec10665cbf ARM: dma-mapping: Fix potential memory leak in atomic_pool_init()
When either of __alloc_from_contiguous or __alloc_remap_buffer fails
to provide a valid pointer, allocated memory is freed up and an error
is returned. 'pages' was however not freed before returning error.
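
The shape of such a fix can be sketched with a goto-based unwind in plain C.
The sketch assumes a simplified two-step initialisation and uses malloc() in
place of the kernel allocators; model_pool_init(), tracked_alloc(),
tracked_free() and the live_allocs counter are hypothetical helpers, not the
code from this patch.

```c
#include <assert.h>
#include <stdlib.h>

/* Counts outstanding allocations so a leak shows up as a non-zero value. */
static int live_allocs;

static void *tracked_alloc(size_t n)
{
	void *p = malloc(n);

	if (p)
		live_allocs++;
	return p;
}

static void tracked_free(void *p)
{
	if (p) {
		live_allocs--;
		free(p);
	}
}

/*
 * Allocate a bookkeeping array ('pages') and then the buffer itself.
 * fail_buffer forces the second step to fail so the error path can be
 * exercised. Returns 0 on success and -1 on failure.
 */
static inline int model_pool_init(int fail_buffer, void **pages_out,
				  void **buf_out)
{
	void *pages, *buf;

	pages = tracked_alloc(64);
	if (!pages)
		goto no_pages;

	buf = fail_buffer ? NULL : tracked_alloc(256);
	if (!buf)
		goto no_buffer;

	*pages_out = pages;
	*buf_out = buf;
	return 0;

no_buffer:
	tracked_free(pages); /* the step that is easy to forget */
no_pages:
	return -1;
}
```

Ordering the labels in reverse allocation order means each failure point
releases exactly what was acquired before it.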

Cc: Arnd Bergmann <arnd@arndb.de>
Cc: Marek Szyprowski <m.szyprowski@samsung.com>
Signed-off-by: Sachin Kamat <sachin.kamat@linaro.org>
Signed-off-by: Marek Szyprowski <m.szyprowski@samsung.com>
2012-09-24 08:35:03 +02:00