To allow IOMMU drivers to batch up TLB flushing operations and postpone
them until ->iotlb_sync() is called, extend the prototypes for the
->unmap() and ->iotlb_sync() IOMMU ops callbacks to take a pointer to
the current iommu_iotlb_gather structure.
All affected IOMMU drivers are updated, but there should be no
functional change since the extra parameter is ignored for now.
Signed-off-by: Will Deacon <will@kernel.org>
Based on 2 normalized pattern(s):
this program is free software you can redistribute it and or modify
it under the terms of the gnu general public license version 2 as
published by the free software foundation
this program is free software you can redistribute it and or modify
it under the terms of the gnu general public license version 2 as
published by the free software foundation #
extracted by the scancode license scanner the SPDX license identifier
GPL-2.0-only
has been chosen to replace the boilerplate/reference in 4122 file(s).
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Reviewed-by: Enrico Weigelt <info@metux.net>
Reviewed-by: Kate Stewart <kstewart@linuxfoundation.org>
Reviewed-by: Allison Randal <allison@lohutok.net>
Cc: linux-spdx@vger.kernel.org
Link: https://lkml.kernel.org/r/20190604081206.933168790@linutronix.de
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Set PTE read/write attributes accordingly to the the protections requested
by IOMMU API.
Signed-off-by: Dmitry Osipenko <digetx@gmail.com>
Acked-by: Thierry Reding <treding@nvidia.com>
Signed-off-by: Joerg Roedel <jroedel@suse.de>
Release all memory allocations associated with a released domain and emit
warning if domain is in-use at the time of destruction.
Signed-off-by: Dmitry Osipenko <digetx@gmail.com>
Acked-by: Thierry Reding <treding@nvidia.com>
Signed-off-by: Joerg Roedel <jroedel@suse.de>
Both Tegra30 and Tegra114 have 4 ASID's and the corresponding bitfield of
the TLB_FLUSH register differs from later Tegra generations that have 128
ASID's.
In a result the PTE's are now flushed correctly from TLB and this fixes
problems with graphics (randomly failing tests) on Tegra30.
Cc: stable <stable@vger.kernel.org>
Signed-off-by: Dmitry Osipenko <digetx@gmail.com>
Acked-by: Thierry Reding <treding@nvidia.com>
Signed-off-by: Joerg Roedel <jroedel@suse.de>
Tegra20 doesn't have SMMU. Move out checking of the SMMU presence from
the SMMU driver into the Memory Controller driver. This change makes code
consistent in regards to how GART/SMMU presence checking is performed.
Signed-off-by: Dmitry Osipenko <digetx@gmail.com>
Signed-off-by: Joerg Roedel <jroedel@suse.de>
Use the new helpers dev_iommu_fwspec_get()/set() to access
the dev->iommu_fwspec pointer. This makes it easier to move
that pointer later into another struct.
Cc: Thierry Reding <thierry.reding@gmail.com>
Acked-by: Thierry Reding <treding@nvidia.com>
Signed-off-by: Joerg Roedel <jroedel@suse.de>
Use DEFINE_SHOW_ATTRIBUTE macro to simplify the code.
Signed-off-by: Yangtao Li <tiny.windzz@gmail.com>
Acked-by: Thierry Reding <treding@nvidia.com>
Signed-off-by: Joerg Roedel <jroedel@suse.de>
All iommu drivers use the default_iommu_map_sg implementation, and there
is no good reason to ever override it. Just expose it as iommu_map_sg
directly and remove the indirection, specially in our post-spectre world
where indirect calls are horribly expensive.
Signed-off-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Joerg Roedel <jroedel@suse.de>
In case of error, the function iommu_group_alloc() returns ERR_PTR() and
never returns NULL. The NULL test in the return value check should be
replaced with IS_ERR().
Fixes: 7f4c9176f7 ("iommu/tegra: Allow devices to be grouped")
Signed-off-by: Wei Yongjun <weiyongjun1@huawei.com>
Acked-by: Alex Williamson <alex.williamson@redhat.com>
Signed-off-by: Thierry Reding <treding@nvidia.com>
Implement the ->device_group() and ->of_xlate() callbacks which are used
in order to group devices. Each group can then share a single domain.
This is implemented primarily in order to achieve the same semantics on
Tegra210 and earlier as on Tegra186 where the Tegra SMMU was replaced by
an ARM SMMU. Users of the IOMMU API can now use the same code to share
domains between devices, whereas previously they used to attach each
device individually.
Acked-by: Alex Williamson <alex.williamson@redhat.com>
Signed-off-by: Thierry Reding <treding@nvidia.com>
The bus_set_iommu() function will call the add_device()
call-back which needs the iommu to be registered.
Reported-by: Jon Hunter <jonathanh@nvidia.com>
Fixes: 0b480e4470 ('iommu/tegra: Add support for struct iommu_device')
Signed-off-by: Joerg Roedel <jroedel@suse.de>
Add a struct iommu_device to each tegra-smmu and register it
with the iommu-core. Also link devices added to the driver
to their respective hardware iommus.
Acked-by: Thierry Reding <treding@nvidia.com>
Signed-off-by: Joerg Roedel <jroedel@suse.de>
As the last step to making groups mandatory, clean up the remaining
drivers by adding basic support. Whilst it may not perfectly reflect
the isolation capabilities of the hardware (tegra_smmu_swgroup sounds
suspiciously like something that might warrant representing at the
iommu_group level), using generic_device_group() should at least
maintain existing behaviour with respect to the API.
Signed-off-by: Robin Murphy <robin.murphy@arm.com>
Tested-by: Mikko Perttunen <mperttunen@nvidia.com>
Signed-off-by: Joerg Roedel <jroedel@suse.de>
The include file does not need any PCI specifics, so remove
that include. Also fix the places that relied on it.
Signed-off-by: Joerg Roedel <jroedel@suse.de>
The number of TLB lines was increased from 16 on Tegra30 to 32 on
Tegra114 and later. Parameterize the value so that the initial default
can be set accordingly.
On Tegra30, initializing the value to 32 would effectively disable the
TLB and hence cause massive latencies for memory accesses translated
through the SMMU. This is especially noticeable for isochronuous clients
such as display, whose FIFOs would continuously underrun.
Fixes: 8918465163 ("memory: Add NVIDIA Tegra memory controller support")
Signed-off-by: Thierry Reding <treding@nvidia.com>
This code is used both when creating a new page directory entry and when
tearing it down, with only the PDE value changing between both cases.
Factor the code out so that it can be reused.
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
[treding@nvidia.com: make commit message more accurate]
Signed-off-by: Thierry Reding <treding@nvidia.com>
Extract the use count reference accounting into a separate function and
separate it from allocating the PTE.
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
[treding@nvidia.com: extract and write commit message]
Signed-off-by: Thierry Reding <treding@nvidia.com>
Rather than explicitly zeroing pages allocated via alloc_page(), add
__GFP_ZERO to the gfp mask to ask the allocator for zeroed pages.
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
Signed-off-by: Thierry Reding <treding@nvidia.com>
Remove the unnecessary manipulation of the PageReserved flags in the
Tegra SMMU driver. None of this is required as the page(s) remain
private to the SMMU driver.
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
Signed-off-by: Thierry Reding <treding@nvidia.com>
Use the DMA API instead of calling architecture internal functions in
the Tegra SMMU driver.
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
Signed-off-by: Thierry Reding <treding@nvidia.com>
Pass smmu_flush_ptc() the device address rather than struct page
pointer.
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
Signed-off-by: Thierry Reding <treding@nvidia.com>
smmu_flush_ptc() is used in two modes: one is to flush an individual
entry, the other is to flush all entries. We know at the call site
which we require. Split the function into smmu_flush_ptc_all() and
smmu_flush_ptc().
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
Signed-off-by: Thierry Reding <treding@nvidia.com>
Drivers should not be using __cpuc_* functions nor outer_cache_flush()
directly. This change partly cleans up tegra-smmu.c.
The only difference between cache handling of the tegra variants is
Denver, which omits the call to outer_cache_flush(). This is due to
Denver being an ARM64 CPU, and the ARM64 architecture does not provide
this function. (This, in itself, is a good reason why these should not
be used.)
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
[treding@nvidia.com: fix build failure on 64-bit ARM]
Signed-off-by: Thierry Reding <treding@nvidia.com>
Use kcalloc() to allocate the use-counter array for the page directory
entries/page tables. Using kcalloc() allows us to be provided with
zero-initialised memory from the allocators, rather than initialising
it ourselves.
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
Signed-off-by: Thierry Reding <treding@nvidia.com>
Store the struct page pointer for the second level page tables, rather
than working back from the page directory entry. This is necessary as
we want to eliminate the use of physical addresses used with
arch-private functions, switching instead to use the streaming DMA API.
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
Signed-off-by: Thierry Reding <treding@nvidia.com>
Fix the page table lookup in the unmap and iova_to_phys methods.
Neither of these methods should allocate a page table; a missing page
table should be treated the same as no mapping present.
More importantly, using as_get_pte() for an IOVA corresponding with a
non-present page table entry increments the use-count for the page
table, on the assumption that the caller of as_get_pte() is going to
setup a mapping. This is an incorrect assumption.
Fix both of these bugs by providing a separate helper which only looks
up the page table, but never allocates it. This is akin to pte_offset()
for CPU page tables.
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
Signed-off-by: Thierry Reding <treding@nvidia.com>
Add a pair of helpers to get the page directory and page table indexes.
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
Signed-off-by: Thierry Reding <treding@nvidia.com>
Factor out the common PTE setting code into a separate function.
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
Signed-off-by: Thierry Reding <treding@nvidia.com>
The Tegra SMMU unmap path has several problems:
1. as_pte_put() can perform a write-after-free
2. tegra_smmu_unmap() can perform cache maintanence on a page we have
just freed.
3. when a page table is unmapped, there is no CPU cache maintanence of
the write clearing the page directory entry, nor is there any
maintanence of the IOMMU to ensure that it sees the page table has
gone.
Fix this by getting rid of as_pte_put(), and instead coding the PTE
unmap separately from the PDE unmap, placing the PDE unmap after the
PTE unmap has been completed.
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
Signed-off-by: Thierry Reding <treding@nvidia.com>
iova_to_phys() has several problems:
(a) iova_to_phys() is supposed to return 0 if there is no entry present
for the iova.
(b) if as_get_pte() fails, we oops the kernel by dereferencing a NULL
pointer. Really, we should not even be trying to allocate a page
table at all, but should only be returning the presence of the 2nd
level page table. This will be fixed in a subsequent patch.
Treat both of these conditions as "no mapping" conditions.
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
Signed-off-by: Thierry Reding <treding@nvidia.com>
Provide clients and swgroups files in debugfs. These files show for
which clients IOMMU translation is enabled and which ASID is associated
with each SWGROUP.
Cc: Hiroshi Doyu <hdoyu@nvidia.com>
Acked-by: Joerg Roedel <jroedel@suse.de>
Signed-off-by: Thierry Reding <treding@nvidia.com>
The SMMU on Tegra30 and Tegra114 supports addressing up to 4 GiB of
physical memory. On Tegra124 the addressable physical memory was
extended to 16 GiB. The page frame number stored in PTEs therefore
requires 20 or 22 bits, depending on SoC generation.
In order to cope with this, compute the proper value at runtime.
Reported-by: Joseph Lo <josephl@nvidia.com>
Cc: Hiroshi Doyu <hdoyu@nvidia.com>
Signed-off-by: Thierry Reding <treding@nvidia.com>
Signed-off-by: Joerg Roedel <jroedel@suse.de>
Each address space in the Tegra SMMU provides 4 GiB worth of addresses.
Cc: Hiroshi Doyu <hdoyu@nvidia.com>
Signed-off-by: Thierry Reding <treding@nvidia.com>
Signed-off-by: Joerg Roedel <jroedel@suse.de>
The memory controller on NVIDIA Tegra exposes various knobs that can be
used to tune the behaviour of the clients attached to it.
Currently this driver sets up the latency allowance registers to the HW
defaults. Eventually an API should be exported by this driver (via a
custom API or a generic subsystem) to allow clients to register latency
requirements.
This driver also registers an IOMMU (SMMU) that's implemented by the
memory controller. It is supported on Tegra30, Tegra114 and Tegra124
currently. Tegra20 has a GART instead.
The Tegra SMMU operates on memory clients and SWGROUPs. A memory client
is a unidirectional, special-purpose DMA master. A SWGROUP represents a
set of memory clients that form a logical functional unit corresponding
to a single device. Typically a device has two clients: one client for
read transactions and one client for write transactions, but there are
also devices that have only read clients, but many of them (such as the
display controllers).
Because there is no 1:1 relationship between memory clients and devices
the driver keeps a table of memory clients and the SWGROUPs that they
belong to per SoC. Note that this is an exception and due to the fact
that the SMMU is tightly integrated with the rest of the Tegra SoC. The
use of these tables is discouraged in drivers for generic IOMMU devices
such as the ARM SMMU because the same IOMMU could be used in any number
of SoCs and keeping such tables for each SoC would not scale.
Acked-by: Joerg Roedel <jroedel@suse.de>
Signed-off-by: Thierry Reding <treding@nvidia.com>
Mapping and unmapping are more often than not in the critical path.
map_sg allows IOMMU driver implementations to optimize the process
of mapping buffers into the IOMMU page tables.
Instead of mapping a buffer one page at a time and requiring potentially
expensive TLB operations for each page, this function allows the driver
to map all pages in one go and defer TLB maintenance until after all
pages have been mapped.
Additionally, the mapping operation would be faster in general since
clients does not have to keep calling map API over and over again for
each physically contiguous chunk of memory that needs to be mapped to a
virtually contiguous region.
Signed-off-by: Olav Haugan <ohaugan@codeaurora.org>
Signed-off-by: Joerg Roedel <jroedel@suse.de>
Make of_device_id array const, because all OF functions handle it as const.
Signed-off-by: Kiran Padwal <kiran.padwal@smartplayin.com>
Signed-off-by: Joerg Roedel <jroedel@suse.de>
This merge window brings a good size of cleanups on various
platforms. Among the bigger ones:
* Removal of Samsung s5pc100 and s5p64xx platforms. Both of these have
lacked active support for quite a while, and after asking around nobody
showed interest in keeping them around. If needed, they could be
resurrected in the future but it's more likely that we would prefer
reintroduction of them as DT and multiplatform-enabled platforms
instead.
* OMAP4 controller code register define diet. They defined a lot of registers
that were never actually used, etc.
* Move of some of the Tegra platform code (PMC, APBIO, fuse, powergate)
to drivers/soc so it can be shared with 64-bit code. This also converts them
over to traditional driver models where possible.
* Removal of legacy gpio-samsung driver, since the last users have been
removed (moved to pinctrl)
Plus a bunch of smaller changes for various platforms that sort of
dissapear in the diffstat for the above. clps711x cleanups, shmobile
header file refactoring/moves for multiplatform friendliness, some misc
cleanups, etc.
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.4.14 (GNU/Linux)
iQIcBAABAgAGBQJT5DYPAAoJEIwa5zzehBx37egQAIiatNiLLqZnfo3rwGADRz/a
POfPovktj68aPcobyzoyhFtToMqGvi9PpysyFTIQD2HJFG+5BtiIAuqtg0875zDe
EpBWgsfugrm0YktJWAtUerj60oAmNPbKfaEm1cOOWuM2lb2mV+QkRrwSTAgsqkT7
927BzMXKKBRPOVLL0RYhoF8EXa0Eg8kCqAHP8fJrzVYkRp+UrZJDnGiUP1XmWJN+
VXQMu5SEjcPMtqT7+tfX455RfREHJfBcJ1ZN/dPF8HMWDwClQG0lyc6hifh1MxwO
8DjIZNkfZeKqgDqVyC17re7pc7p8md5HL8WXbrKpK0A9vQ5bRexbPHxcwJ1T/C2Y
465H+st5XXbuzV1gbMwjK1/ycsH0tCyffckk8Yl/2e1Fs7GgPNbAELtTdl+5vV1Y
xmDXkyo/9WlRM3LQ23IGKwW7VzN86EfWVuShssfro0fO7xDdb4OOYLdQI+4bCG+h
ytQYun1vU32OEyNik5RVNQuZaMrv2c93a3bID4owwuPHPmYOPVUQaqnRX/0E51eA
aHZYbk2GlUOV3Kq5aSS4iyLg1Yj+I9/NeH9U+A4nc+PQ5FlgGToaVSCuYuw4DqbP
AAG+sqQHbkBMvDPobQz/yd1qZbAb4eLhGy11XK1t5S65rApWI55GwNXnvbyxqt8x
wpmxJTASGxcfuZZgKXm7
=gbcE
-----END PGP SIGNATURE-----
Merge tag 'cleanup-for-3.17' of git://git.kernel.org/pub/scm/linux/kernel/git/arm/arm-soc
Pull ARM SoC cleanups from Olof Johansson:
"This merge window brings a good size of cleanups on various platforms.
Among the bigger ones:
- Removal of Samsung s5pc100 and s5p64xx platforms. Both of these
have lacked active support for quite a while, and after asking
around nobody showed interest in keeping them around. If needed,
they could be resurrected in the future but it's more likely that
we would prefer reintroduction of them as DT and
multiplatform-enabled platforms instead.
- OMAP4 controller code register define diet. They defined a lot of
registers that were never actually used, etc.
- Move of some of the Tegra platform code (PMC, APBIO, fuse,
powergate) to drivers/soc so it can be shared with 64-bit code.
This also converts them over to traditional driver models where
possible.
- Removal of legacy gpio-samsung driver, since the last users have
been removed (moved to pinctrl)
Plus a bunch of smaller changes for various platforms that sort of
dissapear in the diffstat for the above. clps711x cleanups, shmobile
header file refactoring/moves for multiplatform friendliness, some
misc cleanups, etc"
* tag 'cleanup-for-3.17' of git://git.kernel.org/pub/scm/linux/kernel/git/arm/arm-soc: (117 commits)
drivers: CCI: Correct use of ! and &
video: clcd-versatile: Depend on ARM
video: fix up versatile CLCD helper move
MAINTAINERS: Add sdhci-st file to ARCH/STI architecture
ARM: EXYNOS: Fix build breakge with PM_SLEEP=n
MAINTAINERS: Remove Kirkwood
ARM: tegra: Convert PMC to a driver
soc/tegra: fuse: Set up in early initcall
ARM: tegra: Always lock the CPU reset vector
ARM: tegra: Setup CPU hotplug in a pure initcall
soc/tegra: Implement runtime check for Tegra SoCs
soc/tegra: fuse: fix dummy functions
soc/tegra: fuse: move APB DMA into Tegra20 fuse driver
soc/tegra: Add efuse and apbmisc bindings
soc/tegra: Add efuse driver for Tegra
ARM: tegra: move fuse exports to soc/tegra/fuse.h
ARM: tegra: export apb dma readl/writel
ARM: tegra: Use a function to get the chip ID
ARM: tegra: Sort includes alphabetically
ARM: tegra: Move includes to include/soc/tegra
...
In order to not clutter the include/linux directory with SoC specific
headers, move the Tegra-specific headers out into a separate directory.
Signed-off-by: Thierry Reding <treding@nvidia.com>
This structure is read-only data and should never be modified.
Signed-off-by: Thierry Reding <treding@nvidia.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Signed-off-by: Joerg Roedel <jroedel@suse.de>
'tegra_smmu_pm_ops' is used only in this file. Make it static.
Signed-off-by: Sachin Kamat <sachin.kamat@linaro.org>
Acked-by: Hiroshi Doyu <hdoyu@nvidia.com>
Signed-off-by: Joerg Roedel <joro@8bytes.org>
When enabling LPAE on ARM, phys_addr_t becomes 64 bits wide and printing
a variable of that type using a simple %x format specifier causes the
compiler to complain. Change the format specifier to %pa, which is used
specifically for variables of type phys_addr_t.
Signed-off-by: Thierry Reding <treding@nvidia.com>
Acked-by: Olof Johansson <olof@lixom.net>
Signed-off-by: Joerg Roedel <joro@8bytes.org>
Remove unneeded error handling on the result of a call to
platform_get_resource when the value is passed to devm_ioremap_resource.
A simplified version of the semantic patch that makes this change is as
follows: (http://coccinelle.lip6.fr/)
// <smpl>
@@
expression pdev,res,n,e,e1;
expression ret != 0;
identifier l;
@@
- res = platform_get_resource(pdev, IORESOURCE_MEM, n);
... when != res
- if (res == NULL) { ... \(goto l;\|return ret;\) }
... when != res
+ res = platform_get_resource(pdev, IORESOURCE_MEM, n);
e = devm_ioremap_resource(e1, res);
// </smpl>
Signed-off-by: Julia Lawall <Julia.Lawall@lip6.fr>
Signed-off-by: Stephen Warren <swarren@nvidia.com>
Fix printk formats for dma_addr_t:
drivers/iommu/tegra-smmu.c: In function 'smmu_iommu_iova_to_phys':
>> drivers/iommu/tegra-smmu.c:774:2: warning: format '%lx' expects argument of type 'long unsigned int', but argument 4 has type 'dma_addr_t' [-Wformat]
--
drivers/iommu/tegra-gart.c: In function 'gart_iommu_iova_to_phys':
>> drivers/iommu/tegra-gart.c:298:3: warning: format '%lx' expects argument of type 'long unsigned int', but argument 3 has type 'dma_addr_t' [-Wformat]
Signed-off-by: Varun Sethi <Varun.Sethi@freescale.com>
Signed-off-by: Joerg Roedel <joro@8bytes.org>
This is required in case of PAMU, as it can support a window size of up
to 64G (even on 32bit).
Signed-off-by: Varun Sethi <Varun.Sethi@freescale.com>
Signed-off-by: Joerg Roedel <joro@8bytes.org>