Commit Graph

225 Commits

Author SHA1 Message Date
Alexander Graf
bf550fc93d Merge remote-tracking branch 'origin/next' into kvm-ppc-next
Conflicts:
	mm/Kconfig

CMA DMA split and ZSWAP introduction were conflicting, fix up manually.
2013-08-29 00:41:59 +02:00
Will Deacon
792a843a9f ARM: mm: remove redundant dsb() prior to range TLB invalidation
The kernel TLB range invalidation functions already contain dsb
instructions before and after the maintenance, so there is no need to
introduce additional barriers.

Reviewed-by: Catalin Marinas <catalin.marinas@arm.com>
Signed-off-by: Will Deacon <will.deacon@arm.com>
2013-08-12 12:25:44 +01:00
Alexander Graf
20f7462aac Merge remote-tracking branch 'cmadma/for-v3.12-cma-dma' into kvm-ppc-next
Add prerequisite patch for CMA RMA allocation patches
2013-07-08 16:16:56 +02:00
Linus Torvalds
8b70a90cab Merge branch 'for-v3.11' of git://git.linaro.org/people/mszyprowski/linux-dma-mapping
Pull ARM DMA mapping updates from Marek Szyprowski:
 "This contains important bugfixes and an update for IOMMU integration
  support for ARM architecture"

* 'for-v3.11' of git://git.linaro.org/people/mszyprowski/linux-dma-mapping:
  ARM: dma: Drop __GFP_COMP for iommu dma memory allocations
  ARM: DMA-mapping: mark all !DMA_TO_DEVICE pages in unmapping as clean
  ARM: dma-mapping: NULLify dev->archdata.mapping pointer on detach
  ARM: dma-mapping: convert DMA direction into IOMMU protection attributes
  ARM: dma-mapping: Get pages if the cpu_addr is out of atomic_pool
2013-07-06 12:41:54 -07:00
Aneesh Kumar K.V
f825c736e7 mm/cma: Move dma contiguous changes into a seperate config
We want to use CMA for allocating hash page table and real mode area for
PPC64. Hence move DMA contiguous related changes into a seperate config
so that ppc64 can enable CMA without requiring DMA contiguous.

Acked-by: Michal Nazarewicz <mina86@mina86.com>
Acked-by: Paul Mackerras <paulus@samba.org>
Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com>
[removed defconfig changes]
Signed-off-by: Marek Szyprowski <m.szyprowski@samsung.com>
2013-07-02 10:08:22 +02:00
Russell King
3c0c01ab74 Merge branch 'devel-stable' into for-next
Conflicts:
	arch/arm/Makefile
	arch/arm/include/asm/glue-proc.h
2013-06-29 11:44:43 +01:00
Richard Zhao
5b91a98c61 ARM: dma: Drop __GFP_COMP for iommu dma memory allocations
__iommu_alloc_buffer wants to split pages after allocation in order to
reduce the memory footprint. This does not work well with __GFP_COMP
pages, so drop this flag before allocation

One failure example is snd_malloc_dev_pages call dma_alloc_coherent with
__GFP_COMP.

Signed-off-by: Richard Zhao <rizhao@nvidia.com>
Signed-off-by: Marek Szyprowski <m.szyprowski@samsung.com>
2013-06-28 15:14:29 +02:00
Ming Lei
63c181922f ARM: DMA-mapping: mark all !DMA_TO_DEVICE pages in unmapping as clean
It is common for one sg to include many pages, so mark all these
pages as clean to avoid unnecessary flushing on them in
set_pte_at() or update_mmu_cache().

The patch might improve loading performance of applciation code a bit.

On the below test code to read file(~1GByte size) from usb mass storage
disk to buffer created with mmap(PROT_READ | PROT_EXEC) on
Pandaboard, average ~1% improvement can be observed with the patch on
10 times test.

unsigned int sum = 0;

static unsigned long tv_diff(struct timeval *tv1, struct timeval *tv2)
{
	return (tv2->tv_sec - tv1->tv_sec) * 1000000 +
		(tv2->tv_usec - tv1->tv_usec);
}

int main(int argc, char *argv[])
{
	char *mbuffer;
	int fd;
	int i;
	unsigned long page_size, size;
	struct stat stat;
	struct timeval t1, t2;

	page_size = getpagesize();
	fd = open(argv[1], O_RDONLY);
	assert(fd >= 0);

	fstat(fd, &stat);
	size = stat.st_size;
	printf("%s: file %s, file size %lu, page size %lu\n", argv[0],
		read_filename, size, page_size);

	gettimeofday(&t1, NULL);
	mbuffer = mmap(NULL, size, PROT_READ | PROT_EXEC, MAP_SHARED, fd, 0);
	for (i = 0 ; i < size ; i += page_size)
		sum += mbuffer[i];
	munmap(mbuffer, page_size);
	gettimeofday(&t2, NULL);
	printf("\tread mmaped time: %luus\n", tv_diff(&t1, &t2));

	close(fd);
}

Acked-by: Nicolas Pitre <nicolas.pitre@linaro.org>
Cc: Catalin Marinas <catalin.marinas@arm.com>
Cc: Marek Szyprowski <m.szyprowski@samsung.com>
Cc: Russell King <linux@arm.linux.org.uk>
Signed-off-by: Ming Lei <ming.lei@canonical.com>
Signed-off-by: Marek Szyprowski <m.szyprowski@samsung.com>
2013-06-28 15:14:28 +02:00
Will Deacon
9e4b259d4f ARM: dma-mapping: NULLify dev->archdata.mapping pointer on detach
The current code only clobbers a local variable, so the device is left
with a stale mapping pointer.

Cc: Hiroshi Doyu <hdoyu@nvidia.com>
Signed-off-by: Will Deacon <will.deacon@arm.com>
Acked-by: Hiroshi Doyu <hdoyu@nvidia.com>
Signed-off-by: Marek Szyprowski <m.szyprowski@samsung.com>
2013-06-28 15:14:27 +02:00
Will Deacon
13987d68bc ARM: dma-mapping: convert DMA direction into IOMMU protection attributes
IOMMU mappings take a prot parameter, identifying the protection bits
to enforce on the newly created mapping (READ or WRITE). The ARM
dma-mapping framework currently just passes 0 as the prot argument,
resulting in faulting mappings.

This patch infers the protection attributes based on the direction of
the DMA transfer.

Cc: Marek Szyprowski <m.szyprowski@samsung.com>
Signed-off-by: Will Deacon <will.deacon@arm.com>
Signed-off-by: Marek Szyprowski <m.szyprowski@samsung.com>
2013-06-28 15:14:27 +02:00
YoungJun Cho
836bfa0d29 ARM: dma-mapping: Get pages if the cpu_addr is out of atomic_pool
In __iommu_get_pages(), the cpu_addr is checked wheather in
atomic_pool range or not. So if the cpu_addr is in atomic_pool
range, it does not need to check twice.

Signed-off-by: YoungJun Cho <yj44.cho@samsung.com>
Signed-off-by: Kyungmin Park <kyungmin.park@samsung.com>
Signed-off-by: Marek Szyprowski <m.szyprowski@samsung.com>
2013-06-28 15:14:27 +02:00
Catalin Marinas
1355e2a6eb ARM: mm: HugeTLB support for LPAE systems.
This patch adds support for hugetlbfs based on the x86 implementation.
It allows mapping of 2MB sections (see Documentation/vm/hugetlbpage.txt
for usage). The 64K pages configuration is not supported (section size
is 512MB in this case).

Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
[steve.capper@linaro.org: symbolic constants replace numbers in places.
Split up into multiple files, to simplify future non-LPAE support,
removed huge_pmd_share code, as this is very rarely executed,
Added PROT_NONE support].
Signed-off-by: Steve Capper <steve.capper@linaro.org>
Reviewed-by: Will Deacon <will.deacon@arm.com>
2013-06-04 16:52:37 +01:00
Ming Lei
b2a234ed64 ARM: 7730/1: DMA-mapping: mark all !DMA_TO_DEVICE pages in unmapping as clean
It is common for one sg to include many pages, so mark all these
pages as clean to avoid unnecessary flushing on them in
set_pte_at() or update_mmu_cache().

The patch might improve loading performance of applciation code a bit.

On the below test code to read file(~1GByte size) from usb mass storage
disk to buffer created with mmap(PROT_READ | PROT_EXEC) on
Pandaboard, average ~1% improvement can be observed with the patch on
10 times test.

unsigned int sum = 0;
static unsigned long tv_diff(struct timeval *tv1, struct timeval *tv2)
{
	return (tv2->tv_sec - tv1->tv_sec) * 1000000 + (tv2->tv_usec - tv1->tv_usec);
}
int main(int argc, char *argv[])
{
	char *mbuffer;
	int fd;
	int i;
	unsigned long page_size, size;
	struct stat stat;
	struct timeval t1, t2;

	page_size = getpagesize();
	fd = open(argv[1], O_RDONLY);
	assert(fd >= 0);

	fstat(fd, &stat);
	size = stat.st_size;
	printf("%s: file %s, file size %lu, page size %lun", argv[0],
	        read_filename, size, page_size);

	gettimeofday(&t1, NULL);
	mbuffer = mmap(NULL, size, PROT_READ | PROT_EXEC, MAP_SHARED, fd, 0);
	for (i = 0 ; i < size ; i += page_size)
	        sum += mbuffer[i];
	munmap(mbuffer, page_size);
	gettimeofday(&t2, NULL);
	printf("tread mmaped time: %luusn", tv_diff(&t1, &t2));

	close(fd);
}

Acked-by: Nicolas Pitre <nicolas.pitre@linaro.org>
Cc: Catalin Marinas <catalin.marinas@arm.com>
Cc: Marek Szyprowski <m.szyprowski@samsung.com>
Signed-off-by: Ming Lei <ming.lei@canonical.com>
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
2013-05-23 00:09:45 +01:00
Russell King
946342d03e Merge branches 'devel-stable', 'entry', 'fixes', 'mach-types', 'misc' and 'smp-hotplug' into for-linus 2013-05-02 21:30:36 +01:00
Joonsoo Kim
dd0f67f474 ARM: 7693/1: mm: clean-up in order to reduce to call kmap_high_get()
In kmap_atomic(), kmap_high_get() is invoked for checking already
mapped area. In __flush_dcache_page() and dma_cache_maint_page(),
we explicitly call kmap_high_get() before kmap_atomic()
when cache_is_vipt(), so kmap_high_get() can be invoked twice.
This is useless operation, so remove one.

v2: change cache_is_vipt() to cache_is_vipt_nonaliasing() in order to
be self-documented

Acked-by: Nicolas Pitre <nico@linaro.org>
Signed-off-by: Joonsoo Kim <iamjoonsoo.kim@lge.com>
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
2013-04-17 16:55:01 +01:00
Marek Szyprowski
9d1400cf79 ARM: DMA-mapping: add missing GFP_DMA flag for atomic buffer allocation
Atomic pool should always be allocated from DMA zone if such zone is
available in the system to avoid issues caused by limited dma mask of
any of the devices used for making an atomic allocation.

Reported-by: Krzysztof Halasa <khc@pm.waw.pl>
Signed-off-by: Marek Szyprowski <m.szyprowski@samsung.com>
Cc: Stable <stable@vger.kernel.org>	[v3.6+]
2013-03-14 09:25:19 +01:00
Marek Szyprowski
d589829107 ARM: DMA-mapping: fix memory leak in IOMMU dma-mapping implementation
This patch removes page_address() usage in IOMMU-aware dma-mapping
implementation and replaced it with direct use of the cpu virtual address
provided by the caller. page_address() returned incorrect address for
pages remapped in atomic pool, what caused memory leak.

Reported-by: Hiroshi Doyu <hdoyu@nvidia.com>
Signed-off-by: Marek Szyprowski <m.szyprowski@samsung.com>
Tested-by: Hiroshi Doyu <hdoyu@nvidia.com>
2013-02-25 15:30:44 +01:00
Seung-Woo Kim
60460abffc ARM: dma-mapping: Add maximum alignment order for dma iommu buffers
Alignment order for a dma iommu buffer is set by buffer size. For
large buffer, it is a waste of iommu address space. So configurable
parameter to limit maximum alignment order can reduce the waste.

Signed-off-by: Seung-Woo Kim <sw0312.kim@samsung.com>
Signed-off-by: Kyungmin.park <kyungmin.park@samsung.com>
Signed-off-by: Marek Szyprowski <m.szyprowski@samsung.com>
2013-02-25 15:30:43 +01:00
Marek Szyprowski
f8669bef11 ARM: dma-mapping: use himem for DMA buffers for IOMMU-mapped devices
IOMMU can provide access to any memory page, so there is no point in
limiting the allocated pages only to lowmem, once other parts of
dma-mapping subsystem correctly supports himem pages.

Signed-off-by: Marek Szyprowski <m.szyprowski@samsung.com>
2013-02-25 15:30:43 +01:00
Marek Szyprowski
9848e48f4c ARM: dma-mapping: add support for CMA regions placed in highmem zone
This patch adds missing pieces to correctly support memory pages served
from CMA regions placed in high memory zones. Please note that the default
global CMA area is still put into lowmem and is limited by optional
architecture specific DMA zone. One can however put device specific CMA
regions in high memory zone to reduce lowmem usage.

Signed-off-by: Marek Szyprowski <m.szyprowski@samsung.com>
Signed-off-by: Kyungmin Park <kyungmin.park@samsung.com>
Acked-by: Michal Nazarewicz <mina86@mina86.com>
2013-02-25 15:30:42 +01:00
Prathyush K
18177d12c0 arm: dma mapping: export arm iommu functions
This patch adds EXPORT_SYMBOL_GPL calls to the three arm iommu
functions - arm_iommu_create_mapping, arm_iommu_free_mapping
and arm_iommu_attach_device. These three functions are arm specific
wrapper functions for creating/freeing/using an iommu mapping and
they are called by various drivers. If any of these drivers need
to be built as dynamic modules, these functions need to be exported.

Changelog v2: using EXPORT_SYMBOL_GPL as suggested by Marek.

Signed-off-by: Prathyush K <prathyush.k@samsung.com>
[m.szyprowski: extended with recently introduced
 EXPORT_SYMBOL_GPL(arm_iommu_detach_device)]
Signed-off-by: Marek Szyprowski <m.szyprowski@samsung.com>
2013-02-25 15:30:42 +01:00
Hiroshi Doyu
6fe3675803 ARM: dma-mapping: Add arm_iommu_detach_device()
A counter part of arm_iommu_attach_device().

Signed-off-by: Hiroshi Doyu <hdoyu@nvidia.com>
Signed-off-by: Marek Szyprowski <m.szyprowski@samsung.com>
2013-02-25 15:30:41 +01:00
Hiroshi Doyu
d09e1333ec ARM: dma-mapping: Set arm_dma_set_mask() for iommu->set_dma_mask()
struct dma_map_ops iommu_ops doesn't have ->set_dma_mask, which causes
crash when dma_set_mask() is called from some driver.

Signed-off-by: Hiroshi Doyu <hdoyu@nvidia.com>
Signed-off-by: Marek Szyprowski <m.szyprowski@samsung.com>
2013-02-25 15:30:41 +01:00
Russell King
633dc92a28 ARM: DMA mapping: fix bad atomic test
Realview fails to boot with this warning:
BUG: spinlock lockup suspected on CPU#0, init/1
 lock: 0xcf8bde10, .magic: dead4ead, .owner: init/1, .owner_cpu: 0
Backtrace:
[<c00185d8>] (dump_backtrace+0x0/0x10c) from [<c03294e8>] (dump_stack+0x18/0x1c) r6:cf8bde10 r5:cf83d1c0 r4:cf8bde10 r3:cf83d1c0
[<c03294d0>] (dump_stack+0x0/0x1c) from [<c018926c>] (spin_dump+0x84/0x98)
[<c01891e8>] (spin_dump+0x0/0x98) from [<c0189460>] (do_raw_spin_lock+0x100/0x198)
[<c0189360>] (do_raw_spin_lock+0x0/0x198) from [<c032cbac>] (_raw_spin_lock+0x3c/0x44)
[<c032cb70>] (_raw_spin_lock+0x0/0x44) from [<c01c9224>] (pl011_console_write+0xe8/0x11c)
[<c01c913c>] (pl011_console_write+0x0/0x11c) from [<c002aea8>] (call_console_drivers.clone.7+0xdc/0x104)
[<c002adcc>] (call_console_drivers.clone.7+0x0/0x104) from [<c002b320>] (console_unlock+0x2e8/0x454)
[<c002b038>] (console_unlock+0x0/0x454) from [<c002b8b4>] (vprintk_emit+0x2d8/0x594)
[<c002b5dc>] (vprintk_emit+0x0/0x594) from [<c0329718>] (printk+0x3c/0x44)
[<c03296dc>] (printk+0x0/0x44) from [<c002929c>] (warn_slowpath_common+0x28/0x6c)
[<c0029274>] (warn_slowpath_common+0x0/0x6c) from [<c0029304>] (warn_slowpath_null+0x24/0x2c)
[<c00292e0>] (warn_slowpath_null+0x0/0x2c) from [<c0070ab0>] (lockdep_trace_alloc+0xd8/0xf0)
[<c00709d8>] (lockdep_trace_alloc+0x0/0xf0) from [<c00c0850>] (kmem_cache_alloc+0x24/0x11c)
[<c00c082c>] (kmem_cache_alloc+0x0/0x11c) from [<c00bb044>] (__get_vm_area_node.clone.24+0x7c/0x16c)
[<c00bafc8>] (__get_vm_area_node.clone.24+0x0/0x16c) from [<c00bb7b8>] (get_vm_area_caller+0x48/0x54)
[<c00bb770>] (get_vm_area_caller+0x0/0x54) from [<c0020064>] (__alloc_remap_buffer.clone.15+0x38/0xb8)
[<c002002c>] (__alloc_remap_buffer.clone.15+0x0/0xb8) from [<c0020244>] (__dma_alloc+0x160/0x2c8)
[<c00200e4>] (__dma_alloc+0x0/0x2c8) from [<c00204d8>] (arm_dma_alloc+0x88/0xa0)[<c0020450>] (arm_dma_alloc+0x0/0xa0) from [<c00beb00>] (dma_pool_alloc+0xcc/0x1a8)
[<c00bea34>] (dma_pool_alloc+0x0/0x1a8) from [<c01a9d14>] (pl08x_fill_llis_for_desc+0x28/0x568)
[<c01a9cec>] (pl08x_fill_llis_for_desc+0x0/0x568) from [<c01aab8c>] (pl08x_prep_slave_sg+0x258/0x3b0)
[<c01aa934>] (pl08x_prep_slave_sg+0x0/0x3b0) from [<c01c9f74>] (pl011_dma_tx_refill+0x140/0x288)
[<c01c9e34>] (pl011_dma_tx_refill+0x0/0x288) from [<c01ca748>] (pl011_start_tx+0xe4/0x120)
[<c01ca664>] (pl011_start_tx+0x0/0x120) from [<c01c54a4>] (__uart_start+0x48/0x4c)
[<c01c545c>] (__uart_start+0x0/0x4c) from [<c01c632c>] (uart_start+0x2c/0x3c)
[<c01c6300>] (uart_start+0x0/0x3c) from [<c01c795c>] (uart_write+0xcc/0xf4)
[<c01c7890>] (uart_write+0x0/0xf4) from [<c01b0384>] (n_tty_write+0x1c0/0x3e4)
[<c01b01c4>] (n_tty_write+0x0/0x3e4) from [<c01acfe8>] (tty_write+0x144/0x240)
[<c01acea4>] (tty_write+0x0/0x240) from [<c01ad17c>] (redirected_tty_write+0x98/0xac)
[<c01ad0e4>] (redirected_tty_write+0x0/0xac) from [<c00c371c>] (vfs_write+0xbc/0x150)
[<c00c3660>] (vfs_write+0x0/0x150) from [<c00c39c0>] (sys_write+0x4c/0x78)
[<c00c3974>] (sys_write+0x0/0x78) from [<c0014460>] (ret_fast_syscall+0x0/0x3c)

This happens because the DMA allocation code is not respecting atomic
allocations correctly.

GFP flags should not be tested for GFP_ATOMIC to determine if an
atomic allocation is being requested.  GFP_ATOMIC is not a flag but
a value.  The GFP bitmask flags are all prefixed with __GFP_.

The rest of the kernel tests for __GFP_WAIT not being set to indicate
an atomic allocation.  We need to do the same.

Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
2013-02-08 10:25:23 +00:00
Russell King
15653371c6 ARM: DMA: Fix struct page iterator in dma_cache_maint() to work with sparsemem
Subhash Jadavani reported this partial backtrace:
  Now consider this call stack from MMC block driver (this is on the ARMv7
  based board):

  [<c001b50c>] (v7_dma_inv_range+0x30/0x48) from [<c0017b8c>] (dma_cache_maint_page+0x1c4/0x24c)
  [<c0017b8c>] (dma_cache_maint_page+0x1c4/0x24c) from [<c0017c28>] (___dma_page_cpu_to_dev+0x14/0x1c)
  [<c0017c28>] (___dma_page_cpu_to_dev+0x14/0x1c) from [<c0017ff8>] (dma_map_sg+0x3c/0x114)

This is caused by incrementing the struct page pointer, and running off
the end of the sparsemem page array.  Fix this by incrementing by pfn
instead, and convert the pfn to a struct page.

Cc: <stable@vger.kernel.org>
Suggested-by: James Bottomley <JBottomley@Parallels.com>
Tested-by: Subhash Jadavani <subhashj@codeaurora.org>
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
2013-01-19 11:05:57 +00:00
Linus Torvalds
3c2e81ef34 Merge branch 'drm-next' of git://people.freedesktop.org/~airlied/linux
Pull DRM updates from Dave Airlie:
 "This is the one and only next pull for 3.8, we had a regression we
  found last week, so I was waiting for that to resolve itself, and I
  ended up with some Intel fixes on top as well.

  Highlights:
   - new driver: nvidia tegra 20/30/hdmi support
   - radeon: add support for previously unused DMA engines, more HDMI
     regs, eviction speeds ups and fixes
   - i915: HSW support enable, agp removal on GEN6, seqno wrapping
   - exynos: IPP subsystem support (image post proc), HDMI
   - nouveau: display class reworking, nv20->40 z compression
   - ttm: start of locking fixes, rcu usage for lookups,
   - core: documentation updates, docbook integration, monotonic clock
     usage, move from connector to object properties"

* 'drm-next' of git://people.freedesktop.org/~airlied/linux: (590 commits)
  drm/exynos: add gsc ipp driver
  drm/exynos: add rotator ipp driver
  drm/exynos: add fimc ipp driver
  drm/exynos: add iommu support for ipp
  drm/exynos: add ipp subsystem
  drm/exynos: support device tree for fimd
  radeon: fix regression with eviction since evict caching changes
  drm/radeon: add more pedantic checks in the CP DMA checker
  drm/radeon: bump version for CS ioctl support for async DMA
  drm/radeon: enable the async DMA rings in the CS ioctl
  drm/radeon: add VM CS parser support for async DMA on cayman/TN/SI
  drm/radeon/kms: add evergreen/cayman CS parser for async DMA (v2)
  drm/radeon/kms: add 6xx/7xx CS parser for async DMA (v2)
  drm/radeon: fix htile buffer size computation for command stream checker
  drm/radeon: fix fence locking in the pageflip callback
  drm/radeon: make indirect register access concurrency-safe
  drm/radeon: add W|RREG32_IDX for MM_INDEX|DATA based mmio accesss
  drm/exynos: support extended screen coordinate of fimd
  drm/exynos: fix x, y coordinates for right bottom pixel
  drm/exynos: fix fb offset calculation for plane
  ...
2012-12-17 08:26:17 -08:00
Marek Szyprowski
549a17e447 ARM: dma-mapping: add support for DMA_ATTR_FORCE_CONTIGUOUS attribute
This patch adds support for DMA_ATTR_FORCE_CONTIGUOUS attribute for
dma_alloc_attrs() in IOMMU-aware implementation. For allocating physically
contiguous buffers Contiguous Memory Allocator is used.

Signed-off-by: Marek Szyprowski <m.szyprowski@samsung.com>
2012-11-29 03:30:34 -08:00
Gregory CLEMENT
87b54e786a arm: dma mapping: Export a dma ops function arm_dma_set_mask
Expose another DMA operations function: arm_dma_set_mask. This
function will be added to a custom DMA ops for Armada 370/XP.
Depending of its configuration Armada 370/XP can be set as a "nearly"
coherent architecture. In this case the DMA ops is made of:
- specific functions for this architecture
- already exposed arm DMA related functions
- the arm_dma_set_mask which was not exposed yet.

Signed-off-by: Gregory CLEMENT <gregory.clement@free-electrons.com>
Acked-by: Marek Szyprowski <m.szyprowski@samsung.com>
2012-11-21 17:07:49 +01:00
Jingoo Han
3dd7ea9220 ARM: dma-mapping: fix build warning in __dma_alloc()
Fix build warning in __dma_alloc() as below:

arch/arm/mm/dma-mapping.c: In function '__dma_alloc':
arch/arm/mm/dma-mapping.c:653:29: warning: 'page' may be used uninitialized in this function [-Wuninitialized]

Signed-off-by: Jingoo Han <jg1.han@samsung.com>
Signed-off-by: Marek Szyprowski <m.szyprowski@samsung.com>
2012-10-24 07:38:15 +02:00
Marek Szyprowski
461b6f0d3d Merge branch 'next-cleanup' into for-v3.7 2012-10-02 09:24:24 +02:00
Hiroshi Doyu
abebfb18ea ARM: dma-mapping: Remove unsed var at arm_coherent_iommu_unmap_page
Signed-off-by: Hiroshi Doyu <hdoyu@nvidia.com>
Signed-off-by: Marek Szyprowski <m.szyprowski@samsung.com>
2012-10-02 08:58:08 +02:00
Rob Herring
0fa478df44 ARM: add coherent iommu dma ops
Remove arch_is_coherent() from iommu dma ops and implement separate
coherent ops functions.

Signed-off-by: Rob Herring <rob.herring@calxeda.com>
Cc: Russell King <linux@arm.linux.org.uk>
Cc: Marek Szyprowski <m.szyprowski@samsung.com>
Signed-off-by: Marek Szyprowski <m.szyprowski@samsung.com>
2012-10-02 08:58:06 +02:00
Rob Herring
dd37e9405a ARM: add coherent dma ops
arch_is_coherent is problematic as it is a global symbol. This
doesn't work for multi-platform kernels or platforms which can support
per device coherent DMA.

This adds arm_coherent_dma_ops to be used for devices which connected
coherently (i.e. to the ACP port on Cortex-A9 or A15). The arm_dma_ops
are modified at boot when arch_is_coherent is true.

Signed-off-by: Rob Herring <rob.herring@calxeda.com>
Cc: Russell King <linux@arm.linux.org.uk>
Cc: Marek Szyprowski <m.szyprowski@samsung.com>
Signed-off-by: Marek Szyprowski <m.szyprowski@samsung.com>
2012-10-02 08:58:06 +02:00
Hiroshi Doyu
75c5971614 ARM: dma-mapping: Refrain noisy console message
With many IOMMU'able devices, console gets noisy.

Tegra30 has a few dozen of IOMMU'able devices.

Signed-off-by: Hiroshi Doyu <hdoyu@nvidia.com>
Signed-off-by: Marek Szyprowski <m.szyprowski@samsung.com>
2012-10-02 08:57:45 +02:00
Hiroshi Doyu
5a796eeb7b ARM: dma-mapping: Small logical clean up
Skip unnecessary operations if order == 0. A little bit easier to
read.

Signed-off-by: Hiroshi Doyu <hdoyu@nvidia.com>
Signed-off-by: Marek Szyprowski <m.szyprowski@samsung.com>
2012-10-02 08:57:45 +02:00
Sachin Kamat
ec10665cbf ARM: dma-mapping: Fix potential memory leak in atomic_pool_init()
When either of __alloc_from_contiguous or __alloc_remap_buffer fails
to provide a valid pointer, allocated memory is freed up and an error
is returned. 'pages' was however not freed before returning error.

Cc: Arnd Bergmann <arnd@arndb.de>
Cc: Marek Szyprowski <m.szyprowski@samsung.com>
Signed-off-by: Sachin Kamat <sachin.kamat@linaro.org>
Signed-off-by: Marek Szyprowski <m.szyprowski@samsung.com>
2012-09-24 08:35:03 +02:00
Thomas Petazzoni
f3d8752497 arm: mm: fix DMA pool affiliation check
The __free_from_pool() function was changed in
e9da6e9905. Unfortunately, the test that
checks whether the provided (start,size) is within the DMA pool has
been improperly modified. It used to be:

  if (start < coherent_head.vm_start || end > coherent_head.vm_end)

Where coherent_head.vm_end was non-inclusive (i.e, it did not include
the first byte after the pool). The test has been changed to:

  if (start < pool->vaddr || start > pool->vaddr + pool->size)

So now pool->vaddr + pool->size is inclusive (i.e, it includes the
first byte after the pool), so the test should be >= instead of >.

This bug causes the following message when freeing the *first* DMA
coherent buffer that has been allocated, because its virtual address
is exactly equal to pool->vaddr + pool->size :

WARNING: at /home/thomas/projets/linux-2.6/arch/arm/mm/dma-mapping.c:463 __free_from_pool+0xa4/0xc0()
freeing wrong coherent size from pool

Signed-off-by: Thomas Petazzoni <thomas.petazzoni@free-electrons.com>
Cc: Marek Szyprowski <m.szyprowski@samsung.com>
Cc: Russell King <linux@arm.linux.org.uk>
Cc: Lior Amsalem <alior@marvell.com>
Cc: Maen Suleiman <maen@marvell.com>
Cc: Tawfik Bayouk <tawfik@marvell.com>
Cc: Shadi Ammouri <shadi@marvell.com>
Cc: Eran Ben-Avi <benavi@marvell.com>
Cc: Yehuda Yitschak <yehuday@marvell.com>
Cc: Nadav Haklai <nadavh@marvell.com>
[m.szyprowski: rebased onto v3.6-rc5 and resolved conflict]
Signed-off-by: Marek Szyprowski <m.szyprowski@samsung.com>
2012-09-10 16:15:48 +02:00
Hiroshi Doyu
479ed93a4b ARM: dma-mapping: IOMMU allocates pages from atomic_pool with GFP_ATOMIC
Make use of the same atomic pool as DMA does, and skip a kernel page
mapping which can involve sleep'able operations at allocating a kernel
page table.

Signed-off-by: Hiroshi Doyu <hdoyu@nvidia.com>
Signed-off-by: Marek Szyprowski <m.szyprowski@samsung.com>
2012-08-28 21:01:07 +02:00
Hiroshi Doyu
665bad7bb9 ARM: dma-mapping: Introduce __atomic_get_pages() for __iommu_get_pages()
Support atomic allocation in __iommu_get_pages().

Signed-off-by: Hiroshi Doyu <hdoyu@nvidia.com>
[moved __atomic_get_pages() under #ifdef CONFIG_ARM_DMA_USE_IOMMU
 to avoid unused fuction warning for no-IOMMU case]
Signed-off-by: Marek Szyprowski <m.szyprowski@samsung.com>
2012-08-28 21:01:06 +02:00
Hiroshi Doyu
21d0a75951 ARM: dma-mapping: Refactor out to introduce __in_atomic_pool
Check the given range("start", "size") is included in "atomic_pool" or not.

Signed-off-by: Hiroshi Doyu <hdoyu@nvidia.com>
Signed-off-by: Marek Szyprowski <m.szyprowski@samsung.com>
2012-08-28 21:01:06 +02:00
Hiroshi Doyu
6b3fe47264 ARM: dma-mapping: atomic_pool with struct page **pages
struct page **pages is necessary to align with non atomic path in
__iommu_get_pages(). atomic_pool() has the intialized **pages instead
of just *page.

Signed-off-by: Hiroshi Doyu <hdoyu@nvidia.com>
Signed-off-by: Marek Szyprowski <m.szyprowski@samsung.com>
2012-08-28 21:01:05 +02:00
Marek Szyprowski
fb71285f0c ARM: DMA-Mapping: print warning when atomic coherent allocation fails
Print a loud warning when system runs out of memory from atomic DMA
coherent pool to let users notice the potential problem.

Reported-by: Aaro Koskinen <aaro.koskinen@iki.fi>
Signed-off-by: Marek Szyprowski <m.szyprowski@samsung.com>
2012-08-28 21:01:03 +02:00
Marek Szyprowski
6e5267aa54 ARM: DMA-Mapping: add function for setting coherent pool size from platform code
Some platforms might require to increase atomic coherent pool to make
sure that their device will be able to allocate all their buffers from
atomic context. This function can be also used to decrease atomic
coherent pool size if coherent allocations are not used for the given
sub-platform.

Suggested-by: Josh Coombs <josh.coombs@gmail.com>
Signed-off-by: Marek Szyprowski <m.szyprowski@samsung.com>
2012-08-28 21:01:02 +02:00
Aaro Koskinen
d9e0d149b5 ARM: dma-mapping: fix incorrect freeing of atomic allocations
Commit e9da6e9905 (ARM: dma-mapping:
remove custom consistent dma region) changed the way atomic allocations
are handled. However, arm_dma_free() was not modified accordingly, and
as a result freeing of atomic allocations does not work correctly when
CMA is disabled. Memory is leaked and following WARNINGs are seen:

[   57.698911] ------------[ cut here ]------------
[   57.753518] WARNING: at arch/arm/mm/dma-mapping.c:263 arm_dma_free+0x88/0xe4()
[   57.811473] trying to free invalid coherent area: e0848000
[   57.867398] Modules linked in: sata_mv(-)
[   57.921373] [<c000d270>] (unwind_backtrace+0x0/0xf0) from [<c0015430>] (warn_slowpath_common+0x50/0x68)
[   58.033924] [<c0015430>] (warn_slowpath_common+0x50/0x68) from [<c00154dc>] (warn_slowpath_fmt+0x30/0x40)
[   58.152024] [<c00154dc>] (warn_slowpath_fmt+0x30/0x40) from [<c000dc18>] (arm_dma_free+0x88/0xe4)
[   58.219592] [<c000dc18>] (arm_dma_free+0x88/0xe4) from [<c008fa30>] (dma_pool_destroy+0x100/0x148)
[   58.345526] [<c008fa30>] (dma_pool_destroy+0x100/0x148) from [<c019a64c>] (release_nodes+0x144/0x218)
[   58.475782] [<c019a64c>] (release_nodes+0x144/0x218) from [<c0197e10>] (__device_release_driver+0x60/0xb8)
[   58.614260] [<c0197e10>] (__device_release_driver+0x60/0xb8) from [<c0198608>] (driver_detach+0xd8/0xec)
[   58.756527] [<c0198608>] (driver_detach+0xd8/0xec) from [<c0197c54>] (bus_remove_driver+0x7c/0xc4)
[   58.901648] [<c0197c54>] (bus_remove_driver+0x7c/0xc4) from [<c004bfac>] (sys_delete_module+0x19c/0x220)
[   59.051447] [<c004bfac>] (sys_delete_module+0x19c/0x220) from [<c0009140>] (ret_fast_syscall+0x0/0x2c)
[   59.207996] ---[ end trace 0745420412c0325a ]---
[   59.287110] ------------[ cut here ]------------
[   59.366324] WARNING: at arch/arm/mm/dma-mapping.c:263 arm_dma_free+0x88/0xe4()
[   59.450511] trying to free invalid coherent area: e0847000
[   59.534357] Modules linked in: sata_mv(-)
[   59.616785] [<c000d270>] (unwind_backtrace+0x0/0xf0) from [<c0015430>] (warn_slowpath_common+0x50/0x68)
[   59.790030] [<c0015430>] (warn_slowpath_common+0x50/0x68) from [<c00154dc>] (warn_slowpath_fmt+0x30/0x40)
[   59.972322] [<c00154dc>] (warn_slowpath_fmt+0x30/0x40) from [<c000dc18>] (arm_dma_free+0x88/0xe4)
[   60.070701] [<c000dc18>] (arm_dma_free+0x88/0xe4) from [<c008fa30>] (dma_pool_destroy+0x100/0x148)
[   60.256817] [<c008fa30>] (dma_pool_destroy+0x100/0x148) from [<c019a64c>] (release_nodes+0x144/0x218)
[   60.445201] [<c019a64c>] (release_nodes+0x144/0x218) from [<c0197e10>] (__device_release_driver+0x60/0xb8)
[   60.634148] [<c0197e10>] (__device_release_driver+0x60/0xb8) from [<c0198608>] (driver_detach+0xd8/0xec)
[   60.823623] [<c0198608>] (driver_detach+0xd8/0xec) from [<c0197c54>] (bus_remove_driver+0x7c/0xc4)
[   61.013268] [<c0197c54>] (bus_remove_driver+0x7c/0xc4) from [<c004bfac>] (sys_delete_module+0x19c/0x220)
[   61.203472] [<c004bfac>] (sys_delete_module+0x19c/0x220) from [<c0009140>] (ret_fast_syscall+0x0/0x2c)
[   61.393390] ---[ end trace 0745420412c0325b ]---

The patch fixes this.

Signed-off-by: Aaro Koskinen <aaro.koskinen@iki.fi>
Signed-off-by: Marek Szyprowski <m.szyprowski@samsung.com>
2012-08-09 07:46:07 +02:00
Aaro Koskinen
e4ea6918c9 ARM: dma-mapping: fix atomic allocation alignment
The alignment mask is calculated incorrectly. Fixing the calculation
makes strange hangs/lockups disappear during the boot with Amstrad E3
and 3.6-rc1 kernel.

Signed-off-by: Aaro Koskinen <aaro.koskinen@iki.fi>
Signed-off-by: Marek Szyprowski <m.szyprowski@samsung.com>
2012-08-09 07:46:07 +02:00
Chris Brand
39f78e7056 ARM: mm: fix MMU mapping of CMA regions
Fix dma_contiguous_remap() so that it continues through all the
regions, even after encountering one that is outside lowmem.
Without this change, if you have two CMA regions, the first outside
lowmem and the seocnd inside lowmem, only the second one will get
set up in the MMU. Data written to that region then doesn't get
automatically flushed from the cache into memory.

Signed-off-by: Chris Brand <cbrand@broadcom.com>
[extended patch subject with 'fix' word]
Signed-off-by: Marek Szyprowski <m.szyprowski@samsung.com>
2012-08-09 07:46:07 +02:00
Linus Torvalds
6f51f51582 Merge branch 'for-linus-for-3.6-rc1' of git://git.linaro.org/people/mszyprowski/linux-dma-mapping
Pull DMA-mapping updates from Marek Szyprowski:
 "Those patches are continuation of my earlier work.

  They contains extensions to DMA-mapping framework to remove limitation
  of the current ARM implementation (like limited total size of DMA
  coherent/write combine buffers), improve performance of buffer sharing
  between devices (attributes to skip cpu cache operations or creation
  of additional kernel mapping for some specific use cases) as well as
  some unification of the common code for dma_mmap_attrs() and
  dma_mmap_coherent() functions.  All extensions have been implemented
  and tested for ARM architecture."

* 'for-linus-for-3.6-rc1' of git://git.linaro.org/people/mszyprowski/linux-dma-mapping:
  ARM: dma-mapping: add support for DMA_ATTR_SKIP_CPU_SYNC attribute
  common: DMA-mapping: add DMA_ATTR_SKIP_CPU_SYNC attribute
  ARM: dma-mapping: add support for dma_get_sgtable()
  common: dma-mapping: introduce dma_get_sgtable() function
  ARM: dma-mapping: add support for DMA_ATTR_NO_KERNEL_MAPPING attribute
  common: DMA-mapping: add DMA_ATTR_NO_KERNEL_MAPPING attribute
  common: dma-mapping: add support for generic dma_mmap_* calls
  ARM: dma-mapping: fix error path for memory allocation failure
  ARM: dma-mapping: add more sanity checks in arm_dma_mmap()
  ARM: dma-mapping: remove custom consistent dma region
  mm: vmalloc: use const void * for caller argument
  scatterlist: add sg_alloc_table_from_pages function
2012-07-30 10:11:31 -07:00
Marek Szyprowski
97ef952a20 ARM: dma-mapping: add support for DMA_ATTR_SKIP_CPU_SYNC attribute
This patch adds support for DMA_ATTR_SKIP_CPU_SYNC attribute for
dma_(un)map_(single,page,sg) functions family. It lets dma mapping clients
to create a mapping for the buffer for the given device without performing
a CPU cache synchronization. CPU cache synchronization can be skipped for
the buffers which it is known that they are already in 'device' domain (CPU
caches have been already synchronized or there are only coherent mappings
for the buffer). For advanced users only, please use it with care.

Signed-off-by: Marek Szyprowski <m.szyprowski@samsung.com>
Reviewed-by: Kyungmin Park <kyungmin.park@samsung.com>
2012-07-30 12:25:47 +02:00
Marek Szyprowski
dc2832e1e7 ARM: dma-mapping: add support for dma_get_sgtable()
This patch adds support for dma_get_sgtable() function which is required
to let drivers to share the buffers allocated by DMA-mapping subsystem.

Generic implementation based on virt_to_page() is not suitable for ARM
dma-mapping subsystem.

Signed-off-by: Marek Szyprowski <m.szyprowski@samsung.com>
Reviewed-by: Kyungmin Park <kyungmin.park@samsung.com>
2012-07-30 12:25:47 +02:00
Marek Szyprowski
955c757e09 ARM: dma-mapping: add support for DMA_ATTR_NO_KERNEL_MAPPING attribute
This patch adds support for DMA_ATTR_NO_KERNEL_MAPPING attribute for
IOMMU allocations, what let drivers to save precious kernel virtual
address space for large buffers that are intended to be accessed only
from userspace.

This patch is heavily based on initial work kindly provided by Abhinav
Kochhar <abhinav.k@samsung.com>.

Signed-off-by: Marek Szyprowski <m.szyprowski@samsung.com>
Reviewed-by: Kyungmin Park <kyungmin.park@samsung.com>
2012-07-30 12:25:46 +02:00
Marek Szyprowski
9fa8af91f0 ARM: dma-mapping: fix error path for memory allocation failure
This patch fixes incorrect check in error path. When the allocation of
first page fails, the kernel ops appears due to accessing -1 element of
the pages array.

Reported-by: Sylwester Nawrocki <s.nawrocki@samsung.com>
Signed-off-by: Marek Szyprowski <m.szyprowski@samsung.com>
2012-07-30 12:25:45 +02:00
Marek Szyprowski
50262a4bf3 ARM: dma-mapping: add more sanity checks in arm_dma_mmap()
Add some sanity checks and forbid mmaping of buffers into vma areas larger
than allocated dma buffer.

Signed-off-by: Marek Szyprowski <m.szyprowski@samsung.com>
2012-07-30 12:25:45 +02:00
Marek Szyprowski
e9da6e9905 ARM: dma-mapping: remove custom consistent dma region
This patch changes dma-mapping subsystem to use generic vmalloc areas
for all consistent dma allocations. This increases the total size limit
of the consistent allocations and removes platform hacks and a lot of
duplicated code.

Atomic allocations are served from special pool preallocated on boot,
because vmalloc areas cannot be reliably created in atomic context.

Signed-off-by: Marek Szyprowski <m.szyprowski@samsung.com>
Reviewed-by: Kyungmin Park <kyungmin.park@samsung.com>
Reviewed-by: Minchan Kim <minchan@kernel.org>
2012-07-30 12:25:45 +02:00
Russell King
91b006def3 Merge branches 'audit', 'delay', 'fixes', 'misc' and 'sta2x11' into for-linus 2012-07-27 23:06:32 +01:00
Prathyush K
46c87852e9 ARM: dma-mapping: modify condition check while freeing pages
WARNING: at mm/vmalloc.c:1471 __iommu_free_buffer+0xcc/0xd0()
Trying to vfree() nonexistent vm area (ef095000)
Modules linked in:
[<c0015a18>] (unwind_backtrace+0x0/0xfc) from [<c0025a94>] (warn_slowpath_common+0x54/0x64)
[<c0025a94>] (warn_slowpath_common+0x54/0x64) from [<c0025b38>] (warn_slowpath_fmt+0x30/0x40)
[<c0025b38>] (warn_slowpath_fmt+0x30/0x40) from [<c0016de0>] (__iommu_free_buffer+0xcc/0xd0)
[<c0016de0>] (__iommu_free_buffer+0xcc/0xd0) from [<c0229a5c>] (exynos_drm_free_buf+0xe4/0x138)
[<c0229a5c>] (exynos_drm_free_buf+0xe4/0x138) from [<c022b358>] (exynos_drm_gem_destroy+0x80/0xfc)
[<c022b358>] (exynos_drm_gem_destroy+0x80/0xfc) from [<c0211230>] (drm_gem_object_free+0x28/0x34)
[<c0211230>] (drm_gem_object_free+0x28/0x34) from [<c0211bd0>] (drm_gem_object_release_handle+0xcc/0xd8)
[<c0211bd0>] (drm_gem_object_release_handle+0xcc/0xd8) from [<c01abe10>] (idr_for_each+0x74/0xb8)
[<c01abe10>] (idr_for_each+0x74/0xb8) from [<c02114e4>] (drm_gem_release+0x1c/0x30)
[<c02114e4>] (drm_gem_release+0x1c/0x30) from [<c0210ae8>] (drm_release+0x608/0x694)
[<c0210ae8>] (drm_release+0x608/0x694) from [<c00b75a0>] (fput+0xb8/0x228)
[<c00b75a0>] (fput+0xb8/0x228) from [<c00b40c4>] (filp_close+0x64/0x84)
[<c00b40c4>] (filp_close+0x64/0x84) from [<c0029d54>] (put_files_struct+0xe8/0x104)
[<c0029d54>] (put_files_struct+0xe8/0x104) from [<c002b930>] (do_exit+0x608/0x774)
[<c002b930>] (do_exit+0x608/0x774) from [<c002bae4>] (do_group_exit+0x48/0xb4)
[<c002bae4>] (do_group_exit+0x48/0xb4) from [<c002bb60>] (sys_exit_group+0x10/0x18)
[<c002bb60>] (sys_exit_group+0x10/0x18) from [<c000ee80>] (ret_fast_syscall+0x0/0x30)

This patch modifies the condition while freeing to match the condition
used while allocation. This fixes the above warning which arises when
array size is equal to PAGE_SIZE where allocation is done using kzalloc
but free is done using vfree.

Signed-off-by: Prathyush K <prathyush.k@samsung.com>
Signed-off-by: Marek Szyprowski <m.szyprowski@samsung.com>
2012-07-16 08:59:55 +02:00
Alessandro Rubini
158e8bfe80 ARM: 7432/1: use the new linux/sizes.h
Signed-off-by: Alessandro Rubini <rubini@gnudd.com>
Acked-by: Giancarlo Asnaghi <giancarlo.asnaghi@st.com>
Acked-by: Linus Walleij <linus.walleij@linaro.org>
Cc: Alan Cox <alan@linux.intel.com>
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
2012-06-28 17:14:35 +01:00
Marek Szyprowski
593f473554 ARM: dma-mapping: fix buffer chunk allocation order
IOMMU-aware dma_alloc_attrs() implementation allocates buffers in
power-of-two chunks to improve performance and take advantage of large
page mappings provided by some IOMMU hardware. However current code, due
to a subtle bug, allocated those chunks in the smallest-to-largest
order, what completely killed all the advantages of using larger than
page chunks. If a 4KiB chunk has been mapped as a first chunk, the
consecutive chunks are not aligned correctly to the power-of-two which
match their size and IOMMU drivers were not able to use internal
mappings of size other than the 4KiB (largest common denominator of
alignment and chunk size).

This patch fixes this issue by changing to the correct largest-to-smallest
chunk size allocation sequence.

Signed-off-by: Marek Szyprowski <m.szyprowski@samsung.com>
2012-06-25 10:18:52 +02:00
Sachin Kamat
e53f517ff2 ARM: dma-mapping: Add missing static storage class specifier
Fixes the following sparse warnings:
arch/arm/mm/dma-mapping.c:231:15: warning: symbol 'consistent_base' was not
declared. Should it be static?
arch/arm/mm/dma-mapping.c:326:8: warning: symbol 'coherent_pool_size' was not
declared. Should it be static?

Signed-off-by: Sachin Kamat <sachin.kamat@linaro.org>
Signed-off-by: Marek Szyprowski <m.szyprowski@samsung.com>
2012-06-11 14:30:46 +02:00
Marek Szyprowski
f1ae98da85 ARM: dma-mapping: remove unconditional dependency on CMA
CMA has been enabled unconditionally on all ARMv6+ systems to solve the
long standing issue of double kernel mappings for all dma coherent
buffers. This however created a dependency on CONFIG_EXPERIMENTAL for
the whole ARM architecture what should be really avoided. This patch
removes this dependency and lets one use old, well-tested dma-mapping
implementation also on ARMv6+ systems without the need to use
EXPERIMENTAL stuff.

Reported-by: Russell King <linux@arm.linux.org.uk>
Signed-off-by: Marek Szyprowski <m.szyprowski@samsung.com>
2012-06-04 08:01:24 +02:00
Marek Szyprowski
0f51596bd3 Merge branch 'for-next-arm-dma' into for-linus
Conflicts:
	arch/arm/Kconfig
	arch/arm/mm/dma-mapping.c

Signed-off-by: Marek Szyprowski <m.szyprowski@samsung.com>
2012-05-22 08:55:43 +02:00
Vitaly Andrianov
61f6c7a47a ARM: dma-mapping: use PMD size for section unmap
The dma_contiguous_remap() function clears existing section maps using
the wrong size (PGDIR_SIZE instead of PMD_SIZE).  This is a bug which
does not affect non-LPAE systems, where PGDIR_SIZE and PMD_SIZE are the same.
On LPAE systems, however, this bug causes the kernel to hang at this point.

This fix has been tested on both LPAE and non-LPAE kernel builds.

Signed-off-by: Vitaly Andrianov <vitalya@ti.com>
Signed-off-by: Marek Szyprowski <m.szyprowski@samsung.com>
2012-05-21 15:09:40 +02:00
Marek Szyprowski
c790950928 ARM: integrate CMA with DMA-mapping subsystem
This patch adds support for CMA to dma-mapping subsystem for ARM
architecture. By default a global CMA area is used, but specific devices
are allowed to have their private memory areas if required (they can be
created with dma_declare_contiguous() function during board
initialisation).

Contiguous memory areas reserved for DMA are remapped with 2-level page
tables on boot. Once a buffer is requested, a low memory kernel mapping
is updated to to match requested memory access type.

GFP_ATOMIC allocations are performed from special pool which is created
early during boot. This way remapping page attributes is not needed on
allocation time.

CMA has been enabled unconditionally for ARMv6+ systems.

Signed-off-by: Marek Szyprowski <m.szyprowski@samsung.com>
Signed-off-by: Kyungmin Park <kyungmin.park@samsung.com>
CC: Michal Nazarewicz <mina86@mina86.com>
Acked-by: Arnd Bergmann <arnd@arndb.de>
Tested-by: Rob Clark <rob.clark@linaro.org>
Tested-by: Ohad Ben-Cohen <ohad@wizery.com>
Tested-by: Benjamin Gaignard <benjamin.gaignard@linaro.org>
Tested-by: Robert Nelson <robertcnelson@gmail.com>
Tested-by: Barry Song <Baohua.Song@csr.com>
2012-05-21 15:09:38 +02:00
Marek Szyprowski
4ce63fcd91 ARM: dma-mapping: add support for IOMMU mapper
This patch add a complete implementation of DMA-mapping API for
devices which have IOMMU support.

This implementation tries to optimize dma address space usage by remapping
all possible physical memory chunks into a single dma address space chunk.

DMA address space is managed on top of the bitmap stored in the
dma_iommu_mapping structure stored in device->archdata. Platform setup
code has to initialize parameters of the dma address space (base address,
size, allocation precision order) with arm_iommu_create_mapping() function.
To reduce the size of the bitmap, all allocations are aligned to the
specified order of base 4 KiB pages.

dma_alloc_* functions allocate physical memory in chunks, each with
alloc_pages() function to avoid failing if the physical memory gets
fragmented. In worst case the allocated buffer is composed of 4 KiB page
chunks.

dma_map_sg() function minimizes the total number of dma address space
chunks by merging of physical memory chunks into one larger dma address
space chunk. If requested chunk (scatter list entry) boundaries
match physical page boundaries, most calls to dma_map_sg() requests will
result in creating only one chunk in dma address space.

dma_map_page() simply creates a mapping for the given page(s) in the dma
address space.

All dma functions also perform required cache operation like their
counterparts from the arm linear physical memory mapping version.

This patch contains code and fixes kindly provided by:
- Krishna Reddy <vdumpa@nvidia.com>,
- Andrzej Pietrasiewicz <andrzej.p@samsung.com>,
- Hiroshi DOYU <hdoyu@nvidia.com>

Signed-off-by: Marek Szyprowski <m.szyprowski@samsung.com>
Acked-by: Kyungmin Park <kyungmin.park@samsung.com>
Reviewed-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
Tested-By: Subash Patel <subash.ramaswamy@linaro.org>
2012-05-21 15:06:23 +02:00
Marek Szyprowski
f99d603412 ARM: dma-mapping: use alloc, mmap, free from dma_ops
This patch converts dma_alloc/free/mmap_{coherent,writecombine}
functions to use generic alloc/free/mmap methods from dma_map_ops
structure. A new DMA_ATTR_WRITE_COMBINE DMA attribute have been
introduced to implement writecombine methods.

Signed-off-by: Marek Szyprowski <m.szyprowski@samsung.com>
Acked-by: Kyungmin Park <kyungmin.park@samsung.com>
Acked-by: Arnd Bergmann <arnd@arndb.de>
Tested-By: Subash Patel <subash.ramaswamy@linaro.org>
2012-05-21 15:06:22 +02:00
Marek Szyprowski
51fde3499b ARM: dma-mapping: remove redundant code and do the cleanup
This patch just performs a global cleanup in DMA mapping implementation
for ARM architecture. Some of the tiny helper functions have been moved
to the caller code, some have been merged together.

Signed-off-by: Marek Szyprowski <m.szyprowski@samsung.com>
Acked-by: Kyungmin Park <kyungmin.park@samsung.com>
Acked-by: Arnd Bergmann <arnd@arndb.de>
Tested-By: Subash Patel <subash.ramaswamy@linaro.org>
2012-05-21 15:06:19 +02:00
Marek Szyprowski
15237e1f50 ARM: dma-mapping: move all dma bounce code to separate dma ops structure
This patch removes dma bounce hooks from the common dma mapping
implementation on ARM architecture and creates a separate set of
dma_map_ops for dma bounce devices.

Signed-off-by: Marek Szyprowski <m.szyprowski@samsung.com>
Acked-by: Kyungmin Park <kyungmin.park@samsung.com>
Tested-By: Subash Patel <subash.ramaswamy@linaro.org>
2012-05-21 15:06:18 +02:00
Marek Szyprowski
2a550e73d3 ARM: dma-mapping: implement dma sg methods on top of any generic dma ops
This patch converts all dma_sg methods to be generic (independent of the
current DMA mapping implementation for ARM architecture). All dma sg
operations are now implemented on top of respective
dma_map_page/dma_sync_single_for* operations from dma_map_ops structure.

Before this patch there were custom methods for all scatter/gather
related operations. They iterated over the whole scatter list and called
cache related operations directly (which in turn checked if we use dma
bounce code or not and called respective version). This patch changes
them not to use such shortcut. Instead it provides similar loop over
scatter list and calls methods from the device's dma_map_ops structure.
This enables us to use device dependent implementations of cache related
operations (direct linear or dma bounce) depending on the provided
dma_map_ops structure.

Signed-off-by: Marek Szyprowski <m.szyprowski@samsung.com>
Acked-by: Kyungmin Park <kyungmin.park@samsung.com>
Tested-By: Subash Patel <subash.ramaswamy@linaro.org>
2012-05-21 15:06:17 +02:00
Marek Szyprowski
2dc6a016bb ARM: dma-mapping: use asm-generic/dma-mapping-common.h
This patch modifies dma-mapping implementation on ARM architecture to
use common dma_map_ops structure and asm-generic/dma-mapping-common.h
helpers.

Signed-off-by: Marek Szyprowski <m.szyprowski@samsung.com>
Acked-by: Kyungmin Park <kyungmin.park@samsung.com>
Tested-By: Subash Patel <subash.ramaswamy@linaro.org>
2012-05-21 15:06:14 +02:00
Marek Szyprowski
a227fb92a0 ARM: dma-mapping: remove offset parameter to prepare for generic dma_ops
This patch removes the need for the offset parameter in dma bounce
functions. This is required to let dma-mapping framework on ARM
architecture to use common, generic dma_map_ops based dma-mapping
helpers.

Background and more detailed explaination:

dma_*_range_* functions are available from the early days of the dma
mapping api. They are the correct way of doing a partial syncs on the
buffer (usually used by the network device drivers). This patch changes
only the internal implementation of the dma bounce functions to let
them tunnel through dma_map_ops structure. The driver api stays
unchanged, so driver are obliged to call dma_*_range_* functions to
keep code clean and easy to understand.

The only drawback from this patch is reduced detection of the dma api
abuse. Let us consider the following code:

dma_addr = dma_map_single(dev, ptr, 64, DMA_TO_DEVICE);
dma_sync_single_range_for_cpu(dev, dma_addr+16, 0, 32, DMA_TO_DEVICE);

Without the patch such code fails, because dma bounce code is unable
to find the bounce buffer for the given dma_address. After the patch
the above sync call will be equivalent to:

dma_sync_single_range_for_cpu(dev, dma_addr, 16, 32, DMA_TO_DEVICE);

which succeeds.

I don't consider this as a real problem, because DMA API abuse should be
caught by debug_dma_* function family. This patch lets us to simplify
the internal low-level implementation without chaning the driver visible
API.

Signed-off-by: Marek Szyprowski <m.szyprowski@samsung.com>
Acked-by: Kyungmin Park <kyungmin.park@samsung.com>
Tested-By: Subash Patel <subash.ramaswamy@linaro.org>
2012-05-21 15:06:13 +02:00
Marek Szyprowski
553ac78877 ARM: dma-mapping: introduce DMA_ERROR_CODE constant
Replace all uses of ~0 with DMA_ERROR_CODE, what should make the code
easier to read.

Signed-off-by: Marek Szyprowski <m.szyprowski@samsung.com>
Acked-by: Kyungmin Park <kyungmin.park@samsung.com>
Tested-By: Subash Patel <subash.ramaswamy@linaro.org>
2012-05-21 15:06:12 +02:00
Marek Szyprowski
6b6f770b57 ARM: dma-mapping: use pr_* instread of printk
Replace all calls to printk with pr_* functions family.

Signed-off-by: Marek Szyprowski <m.szyprowski@samsung.com>
Acked-by: Kyungmin Park <kyungmin.park@samsung.com>
Acked-by: Arnd Bergmann <arnd@arndb.de>
Tested-By: Subash Patel <subash.ramaswamy@linaro.org>
2012-05-21 15:06:11 +02:00
Marek Szyprowski
47142f07ee ARM: dma-mapping: use dma_mmap_from_coherent()
Signed-off-by: Marek Szyprowski <m.szyprowski@samsung.com>
2012-05-21 15:06:10 +02:00
Russell King
45cd5290bf ARM: add dma coherent region reporting via procfs
Add a new seqfile for reporting coherent DMA allocations.  This contains
the address range, size and the function which was used to allocate
each region, allowing these allocations to be viewed in much the same
way as /proc/vmallocinfo.

The DMA coherent region has limited space, so this allows allocation
failures to be viewed, as well as finding out how much space is being
used.

Make sure this file is only readable by root - same as vmallocinfo - to
prevent information leakage.

Acked-by: Nicolas Pitre <nico@linaro.org>
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
2012-01-23 10:23:57 +00:00
Sumit Bhattacharya
ea2e7057c0 ARM: 7172/1: dma: Drop GFP_COMP for DMA memory allocations
dma_alloc_coherent wants to split pages after allocation in order to
reduce the memory footprint. This does not work well with GFP_COMP
pages, so drop this flag before allocation.

This patch is ported from arch/avr32
(commit 3611553ef9).

[swarren: s/HUGETLB_PAGE/HUGETLBFS/ in comment, minor comment cleanup]

Signed-off-by: Sumit Bhattacharya <sumitb@nvidia.com>
Tested-by: Varun Colbert <vcolbert@nvidia.com>
Signed-off-by: Stephen Warren <swarren@nvidia.com>
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
2011-11-26 21:58:53 +00:00
Catalin Marinas
53cbcbcf43 ARM: 7166/1: Use PMD_SHIFT instead of PGDIR_SHIFT in dma-consistent.c
Commit 99d1717d (ARM: Add init_consistent_dma_size()) introduces dynamic
allocation of the consistent_pte array. The number of PTEs should be
calculated based on the number of PMD entries rather than PGD, hence the
PMD_SHIFT.

Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
Cc: Jon Medhurst <tixy@yxit.co.uk>
Acked-by: Nicolas Pitre <nicolas.pitre@linaro.org>
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
2011-11-21 13:12:19 +00:00
Linus Torvalds
1fdb24e969 Merge branch 'devel-stable' of http://ftp.arm.linux.org.uk/pub/linux/arm/kernel/git-cur/linux-2.6-arm
* 'devel-stable' of http://ftp.arm.linux.org.uk/pub/linux/arm/kernel/git-cur/linux-2.6-arm: (178 commits)
  ARM: 7139/1: fix compilation with CONFIG_ARM_ATAG_DTB_COMPAT and large TEXT_OFFSET
  ARM: gic, local timers: use the request_percpu_irq() interface
  ARM: gic: consolidate PPI handling
  ARM: switch from NO_MACH_MEMORY_H to NEED_MACH_MEMORY_H
  ARM: mach-s5p64x0: remove mach/memory.h
  ARM: mach-s3c64xx: remove mach/memory.h
  ARM: plat-mxc: remove mach/memory.h
  ARM: mach-prima2: remove mach/memory.h
  ARM: mach-zynq: remove mach/memory.h
  ARM: mach-bcmring: remove mach/memory.h
  ARM: mach-davinci: remove mach/memory.h
  ARM: mach-pxa: remove mach/memory.h
  ARM: mach-ixp4xx: remove mach/memory.h
  ARM: mach-h720x: remove mach/memory.h
  ARM: mach-vt8500: remove mach/memory.h
  ARM: mach-s5pc100: remove mach/memory.h
  ARM: mach-tegra: remove mach/memory.h
  ARM: plat-tcc: remove mach/memory.h
  ARM: mach-mmp: remove mach/memory.h
  ARM: mach-cns3xxx: remove mach/memory.h
  ...

Fix up mostly pretty trivial conflicts in:
 - arch/arm/Kconfig
 - arch/arm/include/asm/localtimer.h
 - arch/arm/kernel/Makefile
 - arch/arm/mach-shmobile/board-ap4evb.c
 - arch/arm/mach-u300/core.c
 - arch/arm/mm/dma-mapping.c
 - arch/arm/mm/proc-v7.S
 - arch/arm/plat-omap/Kconfig
largely due to some CONFIG option renaming (ie CONFIG_PM_SLEEP ->
CONFIG_ARM_CPU_SUSPEND for the arm-specific suspend code etc) and
addition of NEED_MACH_MEMORY_H next to HAVE_IDE.
2011-10-28 12:02:27 -07:00
Russell King
06afb1a087 Merge branches 'arnd-randcfg-fixes', 'debug', 'io' (early part), 'l2x0', 'p2v', 'pgt' (early part) and 'smp' into for-linus 2011-10-25 08:19:29 +01:00
Russell King
d8e89b47e0 ARM: dma-mapping: free allocated page if unable to map
If the attempt to map a page for DMA fails (eg, because we're out of
mapping space) then we must not hold on to the page we allocated for
DMA - doing so will result in a memory leak.

Cc: <stable@kernel.org>
Reported-by: Bryan Phillippe <bp@darkforest.org>
Tested-by: Bryan Phillippe <bp@darkforest.org>
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
2011-09-26 09:36:50 +01:00
Catalin Marinas
e73fc88e19 ARM: 7059/1: LPAE: Use PMD_(SHIFT|SIZE|MASK) instead of PGDIR_*
PGDIR_SHIFT and PMD_SHIFT for the classic 2-level page table format have
the same value (21). This patch converts the PGDIR_* uses in the kernel
to the PMD_* equivalent so that LPAE builds can reuse the same code.

Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
2011-08-23 15:30:33 +01:00
Jon Medhurst
97fef8bda9 ARM: Remove support for macro CONSISTENT_DMA_SIZE
There are now no platforms which set this macro.

Signed-off-by: Jon Medhurst <tixy@yxit.co.uk>
2011-08-22 12:00:12 +00:00
Jon Medhurst
99d1717dd7 ARM: Add init_consistent_dma_size()
This function can be called during boot to increase the size of the consistent
DMA region above it's default value of 2MB. It must be called before the memory
allocator is initialised, i.e. before any core_initcall.

Signed-off-by: Jon Medhurst <tixy@yxit.co.uk>
Acked-by: Nicolas Pitre <nicolas.pitre@linaro.org>
2011-08-22 12:00:10 +00:00
Russell King
022ae537b2 ARM: dma: replace ISA_DMA_THRESHOLD with a variable
ISA_DMA_THRESHOLD has been unused by non-arch code, so lets now get
rid of it from ARM by replacing it with arm_dma_zone_mask.  Move
dma_supported() and dma_set_mask() out of line, and have
dma_supported() check this new variable instead.

Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
2011-07-12 11:08:12 +01:00
Russell King
196f020fbb Merge branches 'fixes', 'pgt-next' and 'versatile' into devel 2011-03-20 09:32:12 +00:00
Russell King
516295e5ab ARM: pgtable: add pud-level code
Add pud_offset() et.al. between the pgd and pmd code in preparation of
using pgtable-nopud.h rather than 4level-fixup.h.

This incorporates a fix from Jamie Iles <jamie@jamieiles.com> for
uaccess_with_memcpy.c.

Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
2011-02-21 19:24:14 +00:00
Linus Walleij
0adfca6ff2 ARM: 6622/1: fix dma_unmap_sg() documentation
The kerneldoc for this function is at odds with the DMA-API
document, which holds, so fix it.

Signed-off-by: Linus Walleij <linus.walleij@stericsson.com>
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
2011-01-12 19:42:13 +00:00
Russell King
4073723acb Merge branch 'misc' into devel
Conflicts:
	arch/arm/Kconfig
	arch/arm/common/Makefile
	arch/arm/kernel/Makefile
	arch/arm/kernel/smp.c
2011-01-06 22:32:52 +00:00
Russell King
4ec3eb1363 Merge branch 'smp' into misc
Conflicts:
	arch/arm/kernel/entry-armv.S
	arch/arm/mm/ioremap.c
2011-01-06 22:32:03 +00:00
Russell King
24056f5250 ARM: DMA: add support for DMA debugging
Add ARM support for the DMA debug infrastructure, which allows the
DMA API usage to be debugged.

Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
2011-01-06 22:31:11 +00:00
Russell King
9eedd96301 ARM: DMA: Replace page_to_dma()/dma_to_page() with pfn_to_dma()/dma_to_pfn()
Replace the page_to_dma() and dma_to_page() macros with their PFN
equivalents.  This allows us to map parts of memory which do not have
a struct page allocated to them to bus addresses.  This will be used
internally by dma_alloc_coherent()/dma_alloc_writecombine().

Build tested on Versatile, OMAP1, IOP13xx and KS8695.

Tested-by: Janusz Krzysztofik <jkrzyszt@tis.icnet.pl>
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
2011-01-03 11:27:43 +00:00
Nicolas Pitre
39af22a792 ARM: get rid of kmap_high_l1_vipt()
Since commit 3e4d3af501 "mm: stack based kmap_atomic()", it is no longer
necessary to carry an ad hoc version of kmap_atomic() added in commit
7e5a69e83b "ARM: 6007/1: fix highmem with VIPT cache and DMA" to cope
with reentrancy.

In fact, it is now actively wrong to rely on fixed kmap type indices
(namely KM_L1_CACHE) as kmap_atomic() totally ignores them now and a
concurrent instance of it may reuse any slot for any purpose.

Signed-off-by: Nicolas Pitre <nicolas.pitre@linaro.org>
2010-12-19 12:56:46 -05:00
Russell King
c947f69fff ARM: Fix DMA coherent allocator alignment
An out by one bug meant that the DMA coherent allocator was aligning
to one more bit than it should, causing it to run out of available
memory quicker.  Fix this.

Reported-by: Petr Štetiar <ynezz@true.cz>
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
2010-11-07 16:10:15 +00:00
Catalin Marinas
c01778001a ARM: 6379/1: Assume new page cache pages have dirty D-cache
There are places in Linux where writes to newly allocated page cache
pages happen without a subsequent call to flush_dcache_page() (several
PIO drivers including USB HCD). This patch changes the meaning of
PG_arch_1 to be PG_dcache_clean and always flush the D-cache for a newly
mapped page in update_mmu_cache().

The patch also sets the PG_arch_1 bit in the DMA cache maintenance
function to avoid additional cache flushing in update_mmu_cache().

Tested-by: Rabin Vincent <rabin.vincent@stericsson.com>
Cc: Nicolas Pitre <nicolas.pitre@linaro.org>
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
2010-09-19 12:17:43 +01:00
Russell King
2be23c475a ARM: Ensure PTE modifications via dma_alloc_coherent are visible
Dave Hylands reports:
| We've observed a problem with dma_alloc_writecombine when the system
| is under heavy load (heavy bus traffic).  We've managed to reduce the
| problem to the following snippet, which is run from a kthread in a
| continuous loop:
|
|   void *virtAddr;
|   dma_addr_t physAddr;
|   unsigned int numBytes = 256;
|
|   for (;;) {
|       virtAddr = dma_alloc_writecombine(NULL,
|             numBytes, &physAddr, GFP_KERNEL);
|       if (virtAddr == NULL) {
|          printk(KERN_ERR "Running out of memory\n");
|          break;
|       }
|
|       /* access DMA memory allocated */
|       tmp = virtAddr;
|       *tmp = 0x77;
|
|       /* free DMA memory */
|       dma_free_writecombine(NULL,
|             numBytes, virtAddr, physAddr);
|
|         ...sleep here...
|     }
|
| By itself, the code will run forever with no issues. However, as we
| increase our bus traffic (typically using DMA) then the *tmp = 0x77
| line will eventually cause a page fault. If we add a small delay (a
| few microseconds) before the *tmp = 0x77, then we don't see a page
| fault, even under heavy load.

A dsb() is required after modifying the PTE entries to ensure that they
will always be visible.  Add this dsb().

Reported-by: Dave Hylands <dhylands@gmail.com>
Tested-by: Dave Hylands <dhylands@gmail.com>
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
2010-09-08 16:27:56 +01:00
Russell King
5bc23d32d8 ARM: DMA coherent allocator: align remapped addresses
The DMA coherent remap area is used to provide an uncached mapping
of memory for coherency with DMA engines.  Currently, we look for
any free hole which our allocation will fit in with page alignment.

However, this can lead to fragmentation of the area, and allows small
allocations to cross L1 entry boundaries.  This is undesirable as we
want to move towards allocating sections of memory.

Align allocations according to the size, limiting the alignment between
the page and section sizes.

Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
2010-07-27 10:43:48 +01:00
Catalin Marinas
a5e9d38b22 ARM: 6186/1: Avoid the CONSISTENT_DMA_SIZE warning on noMMU builds
This macro is not defined when !CONFIG_MMU so this patch moves the
CONSISTENT_* definitions to the CONFIG_MMU section.

Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
2010-07-01 10:12:07 +01:00
Nicolas Pitre
7e5a69e83b ARM: 6007/1: fix highmem with VIPT cache and DMA
The VIVT cache of a highmem page is always flushed before the page
is unmapped.  This cache flush is explicit through flush_cache_kmaps()
in flush_all_zero_pkmaps(), or through __cpuc_flush_dcache_area() in
kunmap_atomic().  There is also an implicit flush of those highmem pages
that were part of a process that just terminated making those pages free
as the whole VIVT cache has to be flushed on every task switch. Hence
unmapped highmem pages need no cache maintenance in that case.

However unmapped pages may still be cached with a VIPT cache because the
cache is tagged with physical addresses.  There is no need for a whole
cache flush during task switching for that reason, and despite the
explicit cache flushes in flush_all_zero_pkmaps() and kunmap_atomic(),
some highmem pages that were mapped in user space end up still cached
even when they become unmapped.

So, we do have to perform cache maintenance on those unmapped highmem
pages in the context of DMA when using a VIPT cache.  Unfortunately,
it is not possible to perform that cache maintenance using physical
addresses as all the L1 cache maintenance coprocessor functions accept
virtual addresses only.  Therefore we have no choice but to set up a
temporary virtual mapping for that purpose.

And of course the explicit cache flushing when unmapping a highmem page
on a system with a VIPT cache now can go, which should increase
performance.

While at it, because the code in __flush_dcache_page() has to be modified
anyway, let's also make sure the mapped highmem pages are pinned with
kmap_high_get() for the duration of the cache maintenance operation.
Because kunmap() does unmap highmem pages lazily, it was reported by
Gary King <GKing@nvidia.com> that those pages ended up being unmapped
during cache maintenance on SMP causing segmentation faults.

Signed-off-by: Nicolas Pitre <nico@marvell.com>
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
2010-04-14 11:11:27 +01:00
Tejun Heo
5a0e3ad6af include cleanup: Update gfp.h and slab.h includes to prepare for breaking implicit slab.h inclusion from percpu.h
percpu.h is included by sched.h and module.h and thus ends up being
included when building most .c files.  percpu.h includes slab.h which
in turn includes gfp.h making everything defined by the two files
universally available and complicating inclusion dependencies.

percpu.h -> slab.h dependency is about to be removed.  Prepare for
this change by updating users of gfp and slab facilities include those
headers directly instead of assuming availability.  As this conversion
needs to touch large number of source files, the following script is
used as the basis of conversion.

  http://userweb.kernel.org/~tj/misc/slabh-sweep.py

The script does the followings.

* Scan files for gfp and slab usages and update includes such that
  only the necessary includes are there.  ie. if only gfp is used,
  gfp.h, if slab is used, slab.h.

* When the script inserts a new include, it looks at the include
  blocks and try to put the new include such that its order conforms
  to its surrounding.  It's put in the include block which contains
  core kernel includes, in the same order that the rest are ordered -
  alphabetical, Christmas tree, rev-Xmas-tree or at the end if there
  doesn't seem to be any matching order.

* If the script can't find a place to put a new include (mostly
  because the file doesn't have fitting include block), it prints out
  an error message indicating which .h file needs to be added to the
  file.

The conversion was done in the following steps.

1. The initial automatic conversion of all .c files updated slightly
   over 4000 files, deleting around 700 includes and adding ~480 gfp.h
   and ~3000 slab.h inclusions.  The script emitted errors for ~400
   files.

2. Each error was manually checked.  Some didn't need the inclusion,
   some needed manual addition while adding it to implementation .h or
   embedding .c file was more appropriate for others.  This step added
   inclusions to around 150 files.

3. The script was run again and the output was compared to the edits
   from #2 to make sure no file was left behind.

4. Several build tests were done and a couple of problems were fixed.
   e.g. lib/decompress_*.c used malloc/free() wrappers around slab
   APIs requiring slab.h to be added manually.

5. The script was run on all .h files but without automatically
   editing them as sprinkling gfp.h and slab.h inclusions around .h
   files could easily lead to inclusion dependency hell.  Most gfp.h
   inclusion directives were ignored as stuff from gfp.h was usually
   wildly available and often used in preprocessor macros.  Each
   slab.h inclusion directive was examined and added manually as
   necessary.

6. percpu.h was updated not to include slab.h.

7. Build test were done on the following configurations and failures
   were fixed.  CONFIG_GCOV_KERNEL was turned off for all tests (as my
   distributed build env didn't work with gcov compiles) and a few
   more options had to be turned off depending on archs to make things
   build (like ipr on powerpc/64 which failed due to missing writeq).

   * x86 and x86_64 UP and SMP allmodconfig and a custom test config.
   * powerpc and powerpc64 SMP allmodconfig
   * sparc and sparc64 SMP allmodconfig
   * ia64 SMP allmodconfig
   * s390 SMP allmodconfig
   * alpha SMP allmodconfig
   * um on x86_64 SMP allmodconfig

8. percpu.h modifications were reverted so that it could be applied as
   a separate patch and serve as bisection point.

Given the fact that I had only a couple of failures from tests on step
6, I'm fairly confident about the coverage of this conversion patch.
If there is a breakage, it's likely to be something in one of the arch
headers which should be easily discoverable easily on most builds of
the specific arch.

Signed-off-by: Tejun Heo <tj@kernel.org>
Guess-its-ok-by: Christoph Lameter <cl@linux-foundation.org>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Lee Schermerhorn <Lee.Schermerhorn@hp.com>
2010-03-30 22:02:32 +09:00
Russell King
2741ecb4ce Merge branch 'misc2' into devel 2010-02-25 22:09:41 +00:00
Fenkart/Bostandzhyan
a7bd08c82e ARM: 5927/1: Make delimiters of DMA area globally visibly.
Adds DMA area to 'virtual memory map' startup message

Tested-by: H Hartley Sweeten <hsweeten@visionengravers.com>
Signed-off-by: Andreas Fenkart <andreas.fenkart@streamunlimited.com>
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
2010-02-15 21:40:32 +00:00
Russell King
2ffe2da3e7 ARM: dma-mapping: fix for speculative prefetching
ARMv6 and ARMv7 CPUs can perform speculative prefetching, which makes
DMA cache coherency handling slightly more interesting.  Rather than
being able to rely upon the CPU not accessing the DMA buffer until DMA
has completed, we now must expect that the cache could be loaded with
possibly stale data from the DMA buffer.

Where DMA involves data being transferred to the device, we clean the
cache before handing it over for DMA, otherwise we invalidate the buffer
to get rid of potential writebacks.  On DMA Completion, if data was
transferred from the device, we invalidate the buffer to get rid of
any stale speculative prefetches.

Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
Tested-By: Santosh Shilimkar <santosh.shilimkar@ti.com>
2010-02-15 15:22:25 +00:00
Russell King
a9c9147eb9 ARM: dma-mapping: provide per-cpu type map/unmap functions
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
Tested-By: Santosh Shilimkar <santosh.shilimkar@ti.com>
2010-02-15 15:22:20 +00:00
Russell King
93f1d629e2 ARM: dma-mapping: simplify dma_cache_maint_page
dma_cache_maint_contiguous is now simple enough to live inside
dma_cache_maint_page, so move it there.

Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
Tested-By: Santosh Shilimkar <santosh.shilimkar@ti.com>
2010-02-15 15:22:16 +00:00
Russell King
65af191a04 ARM: dma-mapping: move selection of page ops out of dma_cache_maint_contiguous
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
Tested-By: Santosh Shilimkar <santosh.shilimkar@ti.com>
2010-02-15 15:22:14 +00:00
Russell King
4ea0d7371e ARM: dma-mapping: push buffer ownership down into dma-mapping.c
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
Tested-By: Santosh Shilimkar <santosh.shilimkar@ti.com>
2010-02-15 15:22:11 +00:00
Russell King
18eabe2347 ARM: dma-mapping: introduce the idea of buffer ownership
The DMA API has the notion of buffer ownership; make it explicit in the
ARM implementation of this API.  This gives us a set of hooks to allow
us to deal with CPU cache issues arising from non-cache coherent DMA.

Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
Tested-By: Santosh Shilimkar <santosh.shilimkar@ti.com>
Tested-By: Jamie Iles <jamie@jamieiles.com>
2010-02-15 15:21:43 +00:00
Russell King
26a26d3296 ARM: dma-mapping: switch ARMv7 DMA mappings to retain 'memory' attribute
On ARMv7, it is invalid to map the same physical address multiple times
with different memory types.  Since system RAM is already mapped as
'memory', subsequent remapping of it must retain this attribute.

However, DMA memory maps it as "strongly ordered".  Fix this by introducing
'pgprot_dmacoherent()' which provides the necessary page table bits for
DMA mappings.

Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
Acked-by: Greg Ungerer <gerg@uclinux.org>
Reviewed-by: Catalin Marinas <catalin.marinas@arm.com>
2009-11-24 17:41:36 +00:00
Russell King
acaac256b3 ARM: dma-mapping: get rid of setting/clearing the reserved page bit
It's unnecessary; x86 doesn't do it, and ALSA doesn't require it
anymore.

Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
Acked-by: Greg Ungerer <gerg@uclinux.org>
2009-11-24 17:41:36 +00:00
Russell King
31ebf94435 ARM: dma-mapping: Factor out noMMU dma buffer allocation code
This entirely separates the DMA coherent buffer remapping code from
the allocation code, and gets rid of the duplicate copy in the !MMU
section.

Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
Acked-by: Greg Ungerer <gerg@uclinux.org>
2009-11-24 17:41:35 +00:00
Russell King
ebd7a845fa ARM: dma-mapping: clean up coherent arch dma allocation
IXP23xx added support for dma_alloc_coherent() for DMA arches with an
exception in dma_alloc_coherent().  This is a subset of what goes on
in __dma_alloc(), and there is no reason why dma_alloc_writecombine()
should not be given the same treatment (except, maybe, that IXP23xx
doesn't use it.)

We can better deal with this by moving the arch_is_coherent() test
inside __dma_alloc() and killing the code duplication.

Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
Acked-by: Greg Ungerer <gerg@uclinux.org>
2009-11-24 17:41:35 +00:00
Russell King
88c58f3b92 ARM: dma-mapping: move consistent_init into CONFIG_MMU section
No point wrapping the contents of this function with #ifdef CONFIG_MMU
when we can place it and the core_initcall() entirely within the
existing conditional block.

Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
Acked-by: Greg Ungerer <gerg@uclinux.org>
2009-11-24 17:41:35 +00:00
Russell King
695ae0af5a ARM: dma-mapping: factor dma_free_coherent() common code
We effectively have three implementations of dma_free_coherent() mixed up
in the code; the incoherent MMU, coherent MMU and noMMU versions.

The coherent MMU and noMMU versions are actually functionally identical.
The incoherent MMU version is almost the same, but with the additional
step of unmapping the secondary mapping.

Separate out this additional step into __dma_free_remap() and simplify
the resulting dma_free_coherent() code.

Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
Acked-by: Greg Ungerer <gerg@uclinux.org>
2009-11-24 17:41:35 +00:00
Russell King
04da56943b ARM: dma-mapping: fix nommu dma_alloc_coherent()
The nommu version of dma_alloc_coherent was using kmalloc/kfree to manage
the memory.  dma_alloc_coherent() is expected to work with a granularity
of a page, so this is wrong.  Fix it by using the helper functions now
provided.

Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
Acked-by: Greg Ungerer <gerg@uclinux.org>
2009-11-24 17:41:34 +00:00
Russell King
3e82d012e9 ARM: dma-mapping: fix coherent arch dma_alloc_coherent()
The coherent architecture dma_alloc_coherent was using kmalloc/kfree to
manage the memory.  dma_alloc_coherent() is expected to work with a
granularity of a page, so this is wrong.  Fix it by using the helper
functions now provided.

Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
Acked-by: Greg Ungerer <gerg@uclinux.org>
2009-11-24 17:41:34 +00:00
Russell King
7a9a32a953 ARM: dma-mapping: functions to allocate/free a coherent buffer
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
Acked-by: Greg Ungerer <gerg@uclinux.org>
2009-11-24 17:41:34 +00:00
Russell King
13ccf3ad99 ARM: dma-mapping: split out vmregion code from dma coherent mapping code
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
Acked-by: Greg Ungerer <gerg@uclinux.org>
2009-11-24 17:41:34 +00:00
Russell King
c06e004c72 ARM: Use GFP_DMA only for masks _less_ than 32-bit
We were using GFP_DMA for masks other than 0xffffffff, which is
wrong when some masks are initialized to 0xffffffffffffffff.
This caused such masks to obtain memory from the precious DMA
pool.

Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
2009-10-25 22:44:30 +00:00
Catalin Marinas
ab6494f0c9 nommu: Add noMMU support to the DMA API
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
2009-07-24 12:35:02 +01:00
Nicolas Pitre
43377453af [ARM] introduce dma_cache_maint_page()
This is a helper to be used by the DMA mapping API to handle cache
maintenance for memory identified by a page structure instead of a
virtual address.  Those pages may or may not be highmem pages, and
when they're highmem pages, they may or may not be virtually mapped.
When they're not mapped then there is no L1 cache to worry about. But
even in that case the L2 cache must be processed since unmapped highmem
pages can still be L2 cached.

Signed-off-by: Nicolas Pitre <nico@marvell.com>
2009-03-15 21:01:21 -04:00
Russell King
1522ac3ec9 [ARM] Fix virtual to physical translation macro corner cases
The current use of these macros works well when the conversion is
entirely linear.  In this case, we can be assured that the following
holds true:

	__va(p + s) - s = __va(p)

However, this is not always the case, especially when there is a
non-linear conversion (eg, when there is a 3.5GB hole in memory.)
In this case, if 's' is the size of the region (eg, PAGE_SIZE) and
'p' is the final page, the above is most definitely not true.

So, we must ensure that __va() and __pa() are only used with valid
kernel direct mapped RAM addresses.  This patch tweaks the code
to achieve this.

Tested-by: Charles Moschel <fred99@carolina.rr.com>
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
2009-03-12 23:09:09 +00:00
David Howells
9c93af1ede NOMMU: Rename ARM's struct vm_region
Rename ARM's struct vm_region so that I can introduce my own global version
for NOMMU.  It's feasible that the ARM version may wish to use my global one
instead.

The NOMMU vm_region struct defines areas of the physical memory map that are
under mmap.  This may include chunks of RAM or regions of memory mapped
devices, such as flash.  It is also used to retain copies of file content so
that shareable private memory mappings of files can be made.  As such, it may
be compatible with what is described in the banner comment for ARM's vm_region
struct.

Signed-off-by: David Howells <dhowells@redhat.com>
2009-01-08 12:04:47 +00:00
Russell King
309dbbabee [ARM] dma: don't touch cache on dma_*_for_cpu()
As per the dma_unmap_* calls, we don't touch the cache when a DMA
buffer transitions from device to CPU ownership.  Presently, no
problems have been identified with speculative cache prefetching
which in itself is a new feature in later architectures.  We may
have to revisit the DMA API later for these architectures anyway.

Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
2008-09-30 11:01:36 +01:00
Russell King
2638b4dbe7 [ARM] dma: Reduce to one dma_sync_sg_* implementation
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
2008-09-29 10:40:16 +01:00
Russell King
01135d92c1 [ARM] dma: Reduce to one dma_map_sg()/dma_unmap_sg() implementation
No point having two of these; dma_map_page() can do all the work
for us.

Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
2008-09-25 23:39:24 +01:00
Russell King
afd1a321c4 [ARM] Update dma_map_sg()/dma_unmap_sg() API
Update the ARM DMA scatter gather APIs for the scatterlist changes.

Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
2008-09-25 20:48:45 +01:00
Russell King
0ddbccd118 [ARM] dma: rename consistent.c to dma-mapping.c
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
2008-09-25 15:59:19 +01:00