linux_dsm_epyc7002

mirror of https://github.com/AuxXxilium/linux_dsm_epyc7002.git synced 2024-12-19 23:48:36 +07:00

Author	SHA1	Message	Date
Gary R Hook	b92b4fb5c1	iommu/amd: Limit the IOVA page range to the specified addresses The extent of pages specified when applying a reserved region should include up to the last page of the range, but not the page following the range. Signed-off-by: Gary R Hook <gary.hook@amd.com> Fixes: `8d54d6c8b8` ('iommu/amd: Implement apply_dm_region call-back') Signed-off-by: Alex Williamson <alex.williamson@redhat.com>	2017-11-03 10:50:34 -06:00
Rob Clark	049541e178	iommu: qcom: wire up fault handler This is quite useful for debugging. Currently, always TERMINATE the translation when the fault handler returns (since this is all we need for debugging drivers). But I expect the SVM work should eventually let us do something more clever. Signed-off-by: Rob Clark <robdclark@gmail.com> Signed-off-by: Alex Williamson <alex.williamson@redhat.com>	2017-11-03 10:50:33 -06:00
Colin Ian King	2c40367cbf	iommu/amd: remove unused variable flush_addr Variable flush_addr is being assigned but is never read; it is redundant and can be removed. Cleans up the clang warning: drivers/iommu/amd_iommu.c:2388:2: warning: Value stored to 'flush_addr' is never read Signed-off-by: Colin Ian King <colin.king@canonical.com> Signed-off-by: Alex Williamson <alex.williamson@redhat.com>	2017-11-03 10:50:32 -06:00
Alex Williamson	07d1c91b6c	iommu/amd: Fix alloc_irq_index() increment On an is_allocated() interrupt index, we ALIGN() the current index and then increment it via the for loop, guaranteeing that it is no longer aligned for alignments >1. We instead need to align the next index, to guarantee forward progress, moving the increment-only to the case where the index was found to be unallocated. Fixes: `37946d95fc` ('iommu/amd: Add align parameter to alloc_irq_index()') Signed-off-by: Alex Williamson <alex.williamson@redhat.com>	2017-11-03 10:50:31 -06:00
Greg Kroah-Hartman	b24413180f	License cleanup: add SPDX GPL-2.0 license identifier to files with no license Many source files in the tree are missing licensing information, which makes it harder for compliance tools to determine the correct license. By default all files without license information are under the default license of the kernel, which is GPL version 2. Update the files which contain no license information with the 'GPL-2.0' SPDX license identifier. The SPDX identifier is a legally binding shorthand, which can be used instead of the full boiler plate text. This patch is based on work done by Thomas Gleixner and Kate Stewart and Philippe Ombredanne. How this work was done: Patches were generated and checked against linux-4.14-rc6 for a subset of the use cases: - file had no licensing information it it. - file was a /uapi/ one with no licensing information in it, - file was a /uapi/ one with existing licensing information, Further patches will be generated in subsequent months to fix up cases where non-standard license headers were used, and references to license had to be inferred by heuristics based on keywords. The analysis to determine which SPDX License Identifier to be applied to a file was done in a spreadsheet of side by side results from of the output of two independent scanners (ScanCode & Windriver) producing SPDX tag:value files created by Philippe Ombredanne. Philippe prepared the base worksheet, and did an initial spot review of a few 1000 files. The 4.13 kernel was the starting point of the analysis with 60,537 files assessed. Kate Stewart did a file by file comparison of the scanner results in the spreadsheet to determine which SPDX license identifier(s) to be applied to the file. She confirmed any determination that was not immediately clear with lawyers working with the Linux Foundation. Criteria used to select files for SPDX license identifier tagging was: - Files considered eligible had to be source code files. - Make and config files were included as candidates if they contained >5 lines of source - File already had some variant of a license header in it (even if <5 lines). All documentation files were explicitly excluded. The following heuristics were used to determine which SPDX license identifiers to apply. - when both scanners couldn't find any license traces, file was considered to have no license information in it, and the top level COPYING file license applied. For non /uapi/ files that summary was: SPDX license identifier # files ---------------------------------------------------\|------- GPL-2.0 11139 and resulted in the first patch in this series. If that file was a /uapi/ path one, it was "GPL-2.0 WITH Linux-syscall-note" otherwise it was "GPL-2.0". Results of that was: SPDX license identifier # files ---------------------------------------------------\|------- GPL-2.0 WITH Linux-syscall-note 930 and resulted in the second patch in this series. - if a file had some form of licensing information in it, and was one of the /uapi/ ones, it was denoted with the Linux-syscall-note if any GPL family license was found in the file or had no licensing in it (per prior point). Results summary: SPDX license identifier # files ---------------------------------------------------\|------ GPL-2.0 WITH Linux-syscall-note 270 GPL-2.0+ WITH Linux-syscall-note 169 ((GPL-2.0 WITH Linux-syscall-note) OR BSD-2-Clause) 21 ((GPL-2.0 WITH Linux-syscall-note) OR BSD-3-Clause) 17 LGPL-2.1+ WITH Linux-syscall-note 15 GPL-1.0+ WITH Linux-syscall-note 14 ((GPL-2.0+ WITH Linux-syscall-note) OR BSD-3-Clause) 5 LGPL-2.0+ WITH Linux-syscall-note 4 LGPL-2.1 WITH Linux-syscall-note 3 ((GPL-2.0 WITH Linux-syscall-note) OR MIT) 3 ((GPL-2.0 WITH Linux-syscall-note) AND MIT) 1 and that resulted in the third patch in this series. - when the two scanners agreed on the detected license(s), that became the concluded license(s). - when there was disagreement between the two scanners (one detected a license but the other didn't, or they both detected different licenses) a manual inspection of the file occurred. - In most cases a manual inspection of the information in the file resulted in a clear resolution of the license that should apply (and which scanner probably needed to revisit its heuristics). - When it was not immediately clear, the license identifier was confirmed with lawyers working with the Linux Foundation. - If there was any question as to the appropriate license identifier, the file was flagged for further research and to be revisited later in time. In total, over 70 hours of logged manual review was done on the spreadsheet to determine the SPDX license identifiers to apply to the source files by Kate, Philippe, Thomas and, in some cases, confirmation by lawyers working with the Linux Foundation. Kate also obtained a third independent scan of the 4.13 code base from FOSSology, and compared selected files where the other two scanners disagreed against that SPDX file, to see if there was new insights. The Windriver scanner is based on an older version of FOSSology in part, so they are related. Thomas did random spot checks in about 500 files from the spreadsheets for the uapi headers and agreed with SPDX license identifier in the files he inspected. For the non-uapi files Thomas did random spot checks in about 15000 files. In initial set of patches against 4.14-rc6, 3 files were found to have copy/paste license identifier errors, and have been fixed to reflect the correct identifier. Additionally Philippe spent 10 hours this week doing a detailed manual inspection and review of the 12,461 patched files from the initial patch version early this week with: - a full scancode scan run, collecting the matched texts, detected license ids and scores - reviewing anything where there was a license detected (about 500+ files) to ensure that the applied SPDX license was correct - reviewing anything where there was no detection but the patch license was not GPL-2.0 WITH Linux-syscall-note to ensure that the applied SPDX license was correct This produced a worksheet with 20 files needing minor correction. This worksheet was then exported into 3 different .csv files for the different types of files to be modified. These .csv files were then reviewed by Greg. Thomas wrote a script to parse the csv files and add the proper SPDX tag to the file, in the format that the file expected. This script was further refined by Greg based on the output to detect more types of files automatically and to distinguish between header and source .c files (which need different comment types.) Finally Greg ran the script using the .csv files to generate the patches. Reviewed-by: Kate Stewart <kstewart@linuxfoundation.org> Reviewed-by: Philippe Ombredanne <pombredanne@nexb.com> Reviewed-by: Thomas Gleixner <tglx@linutronix.de> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2017-11-02 11:10:55 +01:00
Robin Murphy	8ff0f72371	iommu/arm-smmu-v3: Use burst-polling for sync completion While CMD_SYNC is unlikely to complete immediately such that we never go round the polling loop, with a lightly-loaded queue it may still do so long before the delay period is up. If we have no better completion notifier, use similar logic as we have for SMMUv2 to spin a number of times before each backoff, so that we have more chance of catching syncs which complete relatively quickly and avoid delaying unnecessarily. Signed-off-by: Robin Murphy <robin.murphy@arm.com> Signed-off-by: Will Deacon <will.deacon@arm.com>	2017-10-20 16:55:10 +01:00
Will Deacon	a529ea19aa	iommu/arm-smmu-v3: Consolidate identical timeouts We have separate (identical) timeout values for polling for a queue to drain and waiting for an MSI to signal CMD_SYNC completion. In reality, we only wait for the command queue to drain if we're waiting on a sync, so just merged these two timeouts into a single constant. Signed-off-by: Will Deacon <will.deacon@arm.com>	2017-10-20 16:55:10 +01:00
Will Deacon	49806599c3	iommu/arm-smmu-v3: Split arm_smmu_cmdq_issue_sync in half arm_smmu_cmdq_issue_sync is a little unwieldy now that it supports both MSI and event-based polling, so split it into two functions to make things easier to follow. Signed-off-by: Will Deacon <will.deacon@arm.com>	2017-10-20 16:55:09 +01:00
Robin Murphy	37de98f8f1	iommu/arm-smmu-v3: Use CMD_SYNC completion MSI As an IRQ, the CMD_SYNC interrupt is not particularly useful, not least because we often need to wait for sync completion within someone else's IRQ handler anyway. However, when the SMMU is both coherent and supports MSIs, we can have a lot more fun by not using it as an interrupt at all. Following the example suggested in the architecture and using a write targeting normal memory, we can let callers wait on a status variable outside the lock instead of having to stall the entire queue or even touch MMIO registers. Since multiple sync commands are guaranteed to complete in order, a simple incrementing sequence count is all we need to unambiguously support any realistic number of overlapping waiters. Signed-off-by: Robin Murphy <robin.murphy@arm.com> Signed-off-by: Will Deacon <will.deacon@arm.com>	2017-10-20 16:55:08 +01:00
Robin Murphy	dce032a15c	iommu/arm-smmu-v3: Forget about cmdq-sync interrupt The cmdq-sync interrupt is never going to be particularly useful, since for stage 1 DMA at least we'll often need to wait for sync completion within someone else's IRQ handler, thus have to implement polling anyway. Beyond that, the overhead of taking an interrupt, then still having to grovel around in the queue to figure out which sync command completed, doesn't seem much more attractive than simple polling either. Furthermore, if an implementation both has wired interrupts and supports MSIs, then we don't want to be taking the IRQ unnecessarily if we're using the MSI write to update memory. Let's just make life simpler by not even bothering to claim it in the first place. Signed-off-by: Robin Murphy <robin.murphy@arm.com> Signed-off-by: Will Deacon <will.deacon@arm.com>	2017-10-20 16:55:07 +01:00
Robin Murphy	2f657add07	iommu/arm-smmu-v3: Specialise CMD_SYNC handling CMD_SYNC already has a bit of special treatment here and there, but as we're about to extend it with more functionality for completing outside the CMDQ lock, things are going to get rather messy if we keep trying to cram everything into a single generic command interface. Instead, let's break out the issuing of CMD_SYNC into its own specific helper where upcoming changes will have room to breathe. Signed-off-by: Robin Murphy <robin.murphy@arm.com> Signed-off-by: Will Deacon <will.deacon@arm.com>	2017-10-20 16:55:06 +01:00
Robin Murphy	2a22baa2d1	iommu/arm-smmu-v3: Correct COHACC override message Slightly confusingly, when reporting a mismatch of the ID register value, we still refer to the IORT COHACC override flag as the "dma-coherent property" if we booted with ACPI. Update the message to be firmware-agnostic in line with SMMUv2. Acked-by: Lorenzo Pieralisi <lorenzo.pieralisi@arm.com> Reported-by: Will Deacon <will.deacon@arm.com> Signed-off-by: Robin Murphy <robin.murphy@arm.com> Signed-off-by: Will Deacon <will.deacon@arm.com>	2017-10-20 16:55:05 +01:00
Yisheng Xie	9cff86fd2b	iommu/arm-smmu-v3: Avoid ILLEGAL setting of STE.S1STALLD and CD.S According to Spec, it is ILLEGAL to set STE.S1STALLD if STALL_MODEL is not 0b00, which means we should not disable stall mode if stall or terminate mode is not configuable. Meanwhile, it is also ILLEGAL when STALL_MODEL==0b10 && CD.S==0 which means if stall mode is force we should always set CD.S. As Jean-Philippe's suggestion, this patch introduce a feature bit ARM_SMMU_FEAT_STALL_FORCE, which means smmu only supports stall force. Therefore, we can avoid the ILLEGAL setting of STE.S1STALLD.by checking ARM_SMMU_FEAT_STALL_FORCE. This patch keeps the ARM_SMMU_FEAT_STALLS as the meaning of stall supported (force or configuable) to easy to expand the future function, i.e. we can only use ARM_SMMU_FEAT_STALLS to check whether we should register fault handle or enable master can_stall, etc to supporte platform SVM. The feature bit, STE.S1STALLD and CD.S setting will be like: STALL_MODEL FEATURE S1STALLD CD.S 0b00 ARM_SMMU_FEAT_STALLS 0b1 0b0 0b01 !ARM_SMMU_FEAT_STALLS && !ARM_SMMU_FEAT_STALL_FORCE 0b0 0b0 0b10 ARM_SMMU_FEAT_STALLS && ARM_SMMU_FEAT_STALL_FORCE 0b0 0b1 after apply this patch. Signed-off-by: Yisheng Xie <xieyisheng1@huawei.com> Signed-off-by: Will Deacon <will.deacon@arm.com>	2017-10-20 16:55:04 +01:00
Feng Kan	74f55d3441	iommu/arm-smmu: Enable bypass transaction caching for ARM SMMU 500 The ARM SMMU identity mapping performance was poor compared with the DMA mode. It was found that enable caching would restore the performance back to normal. The S2CRB_TLBEN bit in the ACR register would allow for caching of the stream to context register bypass transaction information. Reviewed-by: Robin Murphy <robin.murphy@arm.com> Signed-off-by: Feng Kan <fkan@apm.com> Signed-off-by: Will Deacon <will.deacon@arm.com>	2017-10-20 16:54:54 +01:00
Will Deacon	704c038255	iommu/arm-smmu-v3: Ensure we sync STE when only changing config field The SMMUv3 architecture permits caching of data structures deemed to be "reachable" by the SMU, which includes STEs marked as invalid. When transitioning an STE to a bypass/fault configuration at init or detach time, we mistakenly elide the CMDQ_OP_CFGI_STE operation in some cases, therefore potentially leaving the old STE state cached in the SMMU. This patch fixes the problem by ensuring that we perform the CMDQ_OP_CFGI_STE operation irrespective of the validity of the previous STE. Reviewed-by: Robin Murphy <robin.murphy@arm.com> Reported-by: Eric Auger <eric.auger@redhat.com> Reviewed-by: Eric Auger <eric.auger@redhat.com> Signed-off-by: Will Deacon <will.deacon@arm.com>	2017-10-20 16:54:54 +01:00
Robin Murphy	6948d4a7e1	iommu/arm-smmu: Remove ACPICA workarounds Now that the kernel headers have synced with the relevant upstream ACPICA updates, it's time to clean up the temporary local definitions. Signed-off-by: Robin Murphy <robin.murphy@arm.com> Signed-off-by: Will Deacon <will.deacon@arm.com>	2017-10-20 16:54:53 +01:00
Joerg Roedel	a593472591	Merge branches 'iommu/fixes', 'arm/omap', 'arm/exynos', 'x86/amd', 'x86/vt-d' and 'core' into next	2017-10-13 17:32:24 +02:00
Joerg Roedel	ce76353f16	iommu/amd: Finish TLB flush in amd_iommu_unmap() The function only sends the flush command to the IOMMU(s), but does not wait for its completion when it returns. Fix that. Fixes: `601367d76b` ('x86/amd-iommu: Remove iommu_flush_domain function') Cc: stable@vger.kernel.org # >= 2.6.33 Signed-off-by: Joerg Roedel <jroedel@suse.de>	2017-10-13 17:32:19 +02:00
Tomasz Nowicki	538d5b3332	iommu/iova: Make rcache flush optional on IOVA allocation failure Since IOVA allocation failure is not unusual case we need to flush CPUs' rcache in hope we will succeed in next round. However, it is useful to decide whether we need rcache flush step because of two reasons: - Not scalability. On large system with ~100 CPUs iterating and flushing rcache for each CPU becomes serious bottleneck so we may want to defer it. - free_cpu_cached_iovas() does not care about max PFN we are interested in. Thus we may flush our rcaches and still get no new IOVA like in the commonly used scenario: if (dma_limit > DMA_BIT_MASK(32) && dev_is_pci(dev)) iova = alloc_iova_fast(iovad, iova_len, DMA_BIT_MASK(32) >> shift); if (!iova) iova = alloc_iova_fast(iovad, iova_len, dma_limit >> shift); 1. First alloc_iova_fast() call is limited to DMA_BIT_MASK(32) to get PCI devices a SAC address 2. alloc_iova() fails due to full 32-bit space 3. rcaches contain PFNs out of 32-bit space so free_cpu_cached_iovas() throws entries away for nothing and alloc_iova() fails again 4. Next alloc_iova_fast() call cannot take advantage of rcache since we have just defeated caches. In this case we pick the slowest option to proceed. This patch reworks flushed_rcache local flag to be additional function argument instead and control rcache flush step. Also, it updates all users to do the flush as the last chance. Signed-off-by: Tomasz Nowicki <Tomasz.Nowicki@caviumnetworks.com> Reviewed-by: Robin Murphy <robin.murphy@arm.com> Tested-by: Nate Watterson <nwatters@codeaurora.org> Signed-off-by: Joerg Roedel <jroedel@suse.de>	2017-10-12 14:18:02 +02:00
Thomas Gleixner	331b57d148	Merge branch 'irq/urgent' into x86/apic Pick up core changes which affect the vector rework.	2017-10-12 11:02:50 +02:00
Marek Szyprowski	9d25e3cc83	iommu/exynos: Remove initconst attribute to avoid potential kernel oops Exynos SYSMMU registers standard platform device with sysmmu_of_match table, what means that this table is accessed every time a new platform device is registered in a system. This might happen also after the boot, so the table must not be attributed as initconst to avoid potential kernel oops caused by access to freed memory. Fixes: `6b21a5db36` ("iommu/exynos: Support for device tree") Signed-off-by: Marek Szyprowski <m.szyprowski@samsung.com> Reviewed-by: Krzysztof Kozlowski <krzk@kernel.org> Signed-off-by: Joerg Roedel <jroedel@suse.de>	2017-10-12 10:25:11 +02:00
Tom Lendacky	aba2d9a638	iommu/amd: Do not disable SWIOTLB if SME is active When SME memory encryption is active it will rely on SWIOTLB to handle DMA for devices that cannot support the addressing requirements of having the encryption mask set in the physical address. The IOMMU currently disables SWIOTLB if it is not running in passthrough mode. This is not desired as non-PCI devices attempting DMA may fail. Update the code to check if SME is active and not disable SWIOTLB. Fixes: `2543a786aa` ("iommu/amd: Allow the AMD IOMMU to work with memory encryption") Signed-off-by: Tom Lendacky <thomas.lendacky@amd.com> Signed-off-by: Joerg Roedel <jroedel@suse.de>	2017-10-10 19:49:30 +02:00
Joerg Roedel	53b9ec3fbb	iommu/amd: Enforce alignment for MSI IRQs Make use of the new alignment capability of alloc_irq_index() to enforce IRQ index alignment for MSI. Reported-by: Thomas Gleixner <tglx@linutronix.de> Fixes: `2b32450634` ('iommu/amd: Add routines to manage irq remapping tables') Reviewed-by: Thomas Gleixner <tglx@linutronix.de> Signed-off-by: Joerg Roedel <jroedel@suse.de>	2017-10-10 16:04:20 +02:00
Joerg Roedel	37946d95fc	iommu/amd: Add align parameter to alloc_irq_index() For multi-MSI IRQ ranges the IRQ index needs to be aligned to the power-of-two of the requested IRQ count. Extend the alloc_irq_index() function to allow such an allocation. Reported-by: Thomas Gleixner <tglx@linutronix.de> Fixes: `2b32450634` ('iommu/amd: Add routines to manage irq remapping tables') Reviewed-by: Thomas Gleixner <tglx@linutronix.de> Signed-off-by: Joerg Roedel <jroedel@suse.de>	2017-10-10 16:03:43 +02:00
Christos Gkekas	b117e03805	iommu/vt-d: Delete unnecessary check in domain_context_mapping_one() Variable did_old is unsigned so checking whether it is greater or equal to zero is not necessary. Signed-off-by: Christos Gkekas <chris.gekas@gmail.com> Signed-off-by: Joerg Roedel <jroedel@suse.de>	2017-10-10 14:09:50 +02:00
Joerg Roedel	ec154bf56b	iommu/vt-d: Don't register bus-notifier under dmar_global_lock The notifier function will take the dmar_global_lock too, so lockdep complains about inverse locking order when the notifier is registered under the dmar_global_lock. Reported-by: Jan Kiszka <jan.kiszka@siemens.com> Fixes: `59ce0515cd` ('iommu/vt-d: Update DRHD/RMRR/ATSR device scope caches when PCI hotplug happens') Signed-off-by: Joerg Roedel <jroedel@suse.de>	2017-10-06 15:09:30 +02:00
Robin Murphy	4d689b6194	iommu/io-pgtable-arm-v7s: Convert to IOMMU API TLB sync Now that the core API issues its own post-unmap TLB sync call, push that operation out from the io-pgtable-arm-v7s internals into the users. For now, we leave the invalidation implicit in the unmap operation, since none of the current users would benefit much from any change to that. Note that the conversion of msm_iommu is implicit, since that apparently has no specific TLB sync operation anyway. CC: Yong Wu <yong.wu@mediatek.com> CC: Rob Clark <robdclark@gmail.com> Signed-off-by: Robin Murphy <robin.murphy@arm.com> Signed-off-by: Joerg Roedel <jroedel@suse.de>	2017-10-02 15:45:25 +02:00
Robin Murphy	32b124492b	iommu/io-pgtable-arm: Convert to IOMMU API TLB sync Now that the core API issues its own post-unmap TLB sync call, push that operation out from the io-pgtable-arm internals into the users. For now, we leave the invalidation implicit in the unmap operation, since none of the current users would benefit much from any change to that. CC: Magnus Damm <damm+renesas@opensource.se> CC: Laurent Pinchart <laurent.pinchart@ideasonboard.com> Signed-off-by: Robin Murphy <robin.murphy@arm.com> Signed-off-by: Joerg Roedel <jroedel@suse.de>	2017-10-02 15:45:25 +02:00
Robin Murphy	abbb8a0938	iommu/iova: Don't try to copy anchor nodes Anchor nodes are not reserved IOVAs in the way that copy_reserved_iova() cares about - while the failure from reserve_iova() is benign since the target domain will already have its own anchor, we still don't want to be triggering spurious warnings. Reported-by: kernel test robot <fengguang.wu@intel.com> Signed-off-by: Robin Murphy <robin.murphy@arm.com> Fixes: `bb68b2fbfb` ('iommu/iova: Add rbtree anchor node') Signed-off-by: Joerg Roedel <jroedel@suse.de>	2017-10-02 15:45:24 +02:00
Robin Murphy	e8b1984027	iommu/iova: Try harder to allocate from rcache magazine When devices with different DMA masks are using the same domain, or for PCI devices where we usually try a speculative 32-bit allocation first, there is a fair possibility that the top PFN of the rcache stack at any given time may be unsuitable for the lower limit, prompting a fallback to allocating anew from the rbtree. Consequently, we may end up artifically increasing pressure on the 32-bit IOVA space as unused IOVAs accumulate lower down in the rcache stacks, while callers with 32-bit masks also impose unnecessary rbtree overhead. In such cases, let's try a bit harder to satisfy the allocation locally first - scanning the whole stack should still be relatively inexpensive. Signed-off-by: Robin Murphy <robin.murphy@arm.com> Signed-off-by: Joerg Roedel <jroedel@suse.de>	2017-09-28 14:57:16 +02:00
Robin Murphy	b826ee9a4f	iommu/iova: Make rcache limit_pfn handling more robust When popping a pfn from an rcache, we are currently checking it directly against limit_pfn for viability. Since this represents iova->pfn_lo, it is technically possible for the corresponding iova->pfn_hi to be greater than limit_pfn. Although we generally get away with it in practice since limit_pfn is typically a power-of-two boundary and the IOVAs are size-aligned, it's pretty trivial to make the iova_rcache_get() path take the allocation size into account for complete safety. Signed-off-by: Robin Murphy <robin.murphy@arm.com> Signed-off-by: Joerg Roedel <jroedel@suse.de>	2017-09-28 14:57:15 +02:00
Robin Murphy	7595dc588a	iommu/iova: Simplify domain destruction All put_iova_domain() should have to worry about is freeing memory - by that point the domain must no longer be live, so the act of cleaning up doesn't need to be concurrency-safe or maintain the rbtree in a self-consistent state. There's no need to waste time with locking or emptying the rcache magazines, and we can just use the postorder traversal helper to clear out the remaining rbtree entries in-place. Signed-off-by: Robin Murphy <robin.murphy@arm.com> Signed-off-by: Joerg Roedel <jroedel@suse.de>	2017-09-28 14:57:15 +02:00
Robin Murphy	973f5fbedb	iommu/iova: Simplify cached node logic The logic of __get_cached_rbnode() is a little obtuse, but then __get_prev_node_of_cached_rbnode_or_last_node_and_update_limit_pfn() wouldn't exactly roll off the tongue... Now that we have the invariant that there is always a valid node to start searching downwards from, everything gets a bit easier to follow if we simplify that function to do what it says on the tin and return the cached node (or anchor node as appropriate) directly. In turn, we can then deduplicate the rb_prev() and limit_pfn logic into the main loop itself, further reduce the amount of code under the lock, and generally make the inner workings a bit less subtle. Signed-off-by: Robin Murphy <robin.murphy@arm.com> Signed-off-by: Joerg Roedel <jroedel@suse.de>	2017-09-27 17:09:58 +02:00
Robin Murphy	bb68b2fbfb	iommu/iova: Add rbtree anchor node Add a permanent dummy IOVA reservation to the rbtree, such that we can always access the top of the address space instantly. The immediate benefit is that we remove the overhead of the rb_last() traversal when not using the cached node, but it also paves the way for further simplifications. Signed-off-by: Robin Murphy <robin.murphy@arm.com> Signed-off-by: Joerg Roedel <jroedel@suse.de>	2017-09-27 17:09:57 +02:00
Zhen Lei	aa3ac9469c	iommu/iova: Make dma_32bit_pfn implicit Now that the cached node optimisation can apply to all allocations, the couple of users which were playing tricks with dma_32bit_pfn in order to benefit from it can stop doing so. Conversely, there is also no need for all the other users to explicitly calculate a 'real' 32-bit PFN, when init_iova_domain() can happily do that itself from the page granularity. CC: Thierry Reding <thierry.reding@gmail.com> CC: Jonathan Hunter <jonathanh@nvidia.com> CC: David Airlie <airlied@linux.ie> CC: Sudeep Dutt <sudeep.dutt@intel.com> CC: Ashutosh Dixit <ashutosh.dixit@intel.com> Signed-off-by: Zhen Lei <thunder.leizhen@huawei.com> Tested-by: Ard Biesheuvel <ard.biesheuvel@linaro.org> Tested-by: Zhen Lei <thunder.leizhen@huawei.com> Tested-by: Nate Watterson <nwatters@codeaurora.org> [rm: use iova_shift(), rewrote commit message] Signed-off-by: Robin Murphy <robin.murphy@arm.com> Signed-off-by: Joerg Roedel <jroedel@suse.de>	2017-09-27 17:09:57 +02:00
Robin Murphy	e60aa7b538	iommu/iova: Extend rbtree node caching The cached node mechanism provides a significant performance benefit for allocations using a 32-bit DMA mask, but in the case of non-PCI devices or where the 32-bit space is full, the loss of this benefit can be significant - on large systems there can be many thousands of entries in the tree, such that walking all the way down to find free space every time becomes increasingly awful. Maintain a similar cached node for the whole IOVA space as a superset of the 32-bit space so that performance can remain much more consistent. Inspired by work by Zhen Lei <thunder.leizhen@huawei.com>. Tested-by: Ard Biesheuvel <ard.biesheuvel@linaro.org> Tested-by: Zhen Lei <thunder.leizhen@huawei.com> Tested-by: Nate Watterson <nwatters@codeaurora.org> Signed-off-by: Robin Murphy <robin.murphy@arm.com> Signed-off-by: Joerg Roedel <jroedel@suse.de>	2017-09-27 17:09:57 +02:00
Zhen Lei	086c83acb7	iommu/iova: Optimise the padding calculation The mask for calculating the padding size doesn't change, so there's no need to recalculate it every loop iteration. Furthermore, Once we've done that, it becomes clear that we don't actually need to calculate a padding size at all - by flipping the arithmetic around, we can just combine the upper limit, size, and mask directly to check against the lower limit. For an arm64 build, this alone knocks 20% off the object code size of the entire alloc_iova() function! Signed-off-by: Zhen Lei <thunder.leizhen@huawei.com> Tested-by: Ard Biesheuvel <ard.biesheuvel@linaro.org> Tested-by: Zhen Lei <thunder.leizhen@huawei.com> Tested-by: Nate Watterson <nwatters@codeaurora.org> [rm: simplified more of the arithmetic, rewrote commit message] Signed-off-by: Robin Murphy <robin.murphy@arm.com> Signed-off-by: Joerg Roedel <jroedel@suse.de>	2017-09-27 17:09:56 +02:00
Zhen Lei	2070f940a6	iommu/iova: Optimise rbtree searching Checking the IOVA bounds separately before deciding which direction to continue the search (if necessary) results in redundantly comparing both pfns twice each. GCC can already determine that the final comparison op is redundant and optimise it down to 3 in total, but we can go one further with a little tweak of the ordering (which makes the intent of the code that much cleaner as a bonus). Signed-off-by: Zhen Lei <thunder.leizhen@huawei.com> Tested-by: Ard Biesheuvel <ard.biesheuvel@linaro.org> Tested-by: Zhen Lei <thunder.leizhen@huawei.com> Tested-by: Nate Watterson <nwatters@codeaurora.org> [rm: rewrote commit message to clarify] Signed-off-by: Robin Murphy <robin.murphy@arm.com> Signed-off-by: Joerg Roedel <jroedel@suse.de>	2017-09-27 17:09:56 +02:00
Arvind Yadav	3c6bae6213	iommu/amd: pr_err() strings should end with newlines pr_err() messages should end with a new-line to avoid other messages being concatenated. So replace '/n' with '\n'. Signed-off-by: Arvind Yadav <arvind.yadav.cs@gmail.com> Fixes: `45a01c4293` ('iommu/amd: Add function copy_dev_tables()') Signed-off-by: Joerg Roedel <jroedel@suse.de>	2017-09-27 17:01:35 +02:00
Yong Wu	1ff9b17ced	iommu/mediatek: Limit the physical address in 32bit for v7s The ARM short descriptor has already limited the physical address to 32bit after the commit <76557391433c> ("iommu/io-pgtable: Sanitise map/unmap addresses"). But in MediaTek 4GB mode, the physical address is from 0x1_0000_0000 to 0x1_ffff_ffff. this will cause: WARNING: CPU: 4 PID: 3900 at xxx/drivers/iommu/io-pgtable-arm-v7s.c:482 arm_v7s_map+0x40/0xf8 Modules linked in: CPU: 4 PID: 3900 Comm: weston Tainted: G S W 4.9.44 #1 Hardware name: MediaTek MT2712m1v1 board (DT) task: ffffffc0eaa5b280 task.stack: ffffffc0e9858000 PC is at arm_v7s_map+0x40/0xf8 LR is at mtk_iommu_map+0x64/0x90 pc : [<ffffff80085b09e8>] lr : [<ffffff80085b29fc>] pstate: 000001c5 sp : ffffffc0e985b920 x29: ffffffc0e985b920 x28: 0000000127d00000 x27: 0000000000100000 x26: ffffff8008f9e000 x25: 0000000000000003 x24: 0000000000100000 x23: 0000000127d00000 x22: 00000000ff800000 x21: ffffffc0f7ec8ce0 x20: 0000000000000003 x19: 0000000000000003 x18: 0000000000000002 x17: 0000007f7e5d72c0 x16: ffffff80082b0f08 x15: 0000000000000001 x14: 000000000000003f x13: 0000000000000000 x12: 0000000000000028 x11: 0088000000000000 x10: 0000000000000000 x9 : ffffff80092fa000 x8 : ffffffc0e9858000 x7 : ffffff80085b29d8 x6 : 0000000000000000 x5 : ffffff80085b09a8 x4 : 0000000000000003 x3 : 0000000000100000 x2 : 0000000127d00000 x1 : 00000000ff800000 x0 : 0000000000000001 ... Call trace: [<ffffff80085b09e8>] arm_v7s_map+0x40/0xf8 [<ffffff80085b29fc>] mtk_iommu_map+0x64/0x90 [<ffffff80085ab5f8>] iommu_map+0x100/0x3a0 [<ffffff80085ab99c>] default_iommu_map_sg+0x104/0x168 [<ffffff80085aead8>] iommu_dma_alloc+0x238/0x3f8 [<ffffff8008098b30>] __iommu_alloc_attrs+0xa8/0x260 [<ffffff80085f364c>] mtk_drm_gem_create+0xac/0x180 [<ffffff80085f3894>] mtk_drm_gem_dumb_create+0x54/0xc8 [<ffffff80085d576c>] drm_mode_create_dumb_ioctl+0xa4/0xd8 [<ffffff80085cb2a0>] drm_ioctl+0x1c0/0x490 In order to satify this, Limit the physical address to 32bit. Signed-off-by: Yong Wu <yong.wu@mediatek.com> Acked-by: Will Deacon <will.deacon@arm.com> Signed-off-by: Joerg Roedel <jroedel@suse.de>	2017-09-27 16:58:28 +02:00
Yong Wu	5c62c1c679	iommu/io-pgtable-arm-v7s: Need dma-sync while there is no QUIRK_NO_DMA Fix the commit `81b3c25218` ("iommu/io-pgtable: Introduce explicit coherency"). If there is no IO_PGTABLE_QUIRK_NO_DMA, we should call dma_sync_single_for_device for cache synchronization. Signed-off-by: Yong Wu <yong.wu@mediatek.com> Fixes: `81b3c25218` ('iommu/io-pgtable: Introduce explicit coherency') Reviewed-by: Robin Murphy <robin.murphy@arm.com> Signed-off-by: Joerg Roedel <jroedel@suse.de>	2017-09-27 16:56:17 +02:00
Thomas Gleixner	5ba204a181	iommu/amd: Reevaluate vector configuration on activate() With the upcoming reservation/management scheme, early activation will assign a special vector. The final activation at request_irq() assigns a real vector, which needs to be updated in the tables. Split out the reconfiguration code in set_affinity and use it for reactivation. Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Tested-by: Juergen Gross <jgross@suse.com> Tested-by: Yu Chen <yu.c.chen@intel.com> Acked-by: Juergen Gross <jgross@suse.com> Cc: Boris Ostrovsky <boris.ostrovsky@oracle.com> Cc: Tony Luck <tony.luck@intel.com> Cc: Marc Zyngier <marc.zyngier@arm.com> Cc: Alok Kataria <akataria@vmware.com> Cc: Joerg Roedel <joro@8bytes.org> Cc: "Rafael J. Wysocki" <rjw@rjwysocki.net> Cc: Steven Rostedt <rostedt@goodmis.org> Cc: Christoph Hellwig <hch@lst.de> Cc: Peter Zijlstra <peterz@infradead.org> Cc: iommu@lists.linux-foundation.org Cc: Borislav Petkov <bp@alien8.de> Cc: Paolo Bonzini <pbonzini@redhat.com> Cc: Rui Zhang <rui.zhang@intel.com> Cc: "K. Y. Srinivasan" <kys@microsoft.com> Cc: Arjan van de Ven <arjan@linux.intel.com> Cc: Dan Williams <dan.j.williams@intel.com> Cc: Len Brown <lenb@kernel.org> Link: https://lkml.kernel.org/r/20170913213155.944883733@linutronix.de	2017-09-25 20:52:01 +02:00
Thomas Gleixner	d491bdff88	iommu/vt-d: Reevaluate vector configuration on activate() With the upcoming reservation/management scheme, early activation will assign a special vector. The final activation at request_irq() assigns a real vector, which needs to be updated in the tables. Split out the reconfiguration code in set_affinity and use it for reactivation. Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Tested-by: Juergen Gross <jgross@suse.com> Tested-by: Yu Chen <yu.c.chen@intel.com> Acked-by: Juergen Gross <jgross@suse.com> Cc: Boris Ostrovsky <boris.ostrovsky@oracle.com> Cc: Tony Luck <tony.luck@intel.com> Cc: Marc Zyngier <marc.zyngier@arm.com> Cc: Alok Kataria <akataria@vmware.com> Cc: Joerg Roedel <joro@8bytes.org> Cc: "Rafael J. Wysocki" <rjw@rjwysocki.net> Cc: Steven Rostedt <rostedt@goodmis.org> Cc: Christoph Hellwig <hch@lst.de> Cc: Peter Zijlstra <peterz@infradead.org> Cc: iommu@lists.linux-foundation.org Cc: Borislav Petkov <bp@alien8.de> Cc: Paolo Bonzini <pbonzini@redhat.com> Cc: Rui Zhang <rui.zhang@intel.com> Cc: "K. Y. Srinivasan" <kys@microsoft.com> Cc: Arjan van de Ven <arjan@linux.intel.com> Cc: Dan Williams <dan.j.williams@intel.com> Cc: Len Brown <lenb@kernel.org> Link: https://lkml.kernel.org/r/20170913213155.853028808@linutronix.de	2017-09-25 20:52:00 +02:00
Thomas Gleixner	7249164346	genirq/irqdomain: Update irq_domain_ops.activate() signature The irq_domain_ops.activate() callback has no return value and no way to tell the function that the activation is early. The upcoming changes to support a reservation scheme which allows to assign interrupt vectors on x86 only when the interrupt is actually requested requires: - A return value, so activation can fail at request_irq() time - Information that the activate invocation is early, i.e. before request_irq(). Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Tested-by: Juergen Gross <jgross@suse.com> Tested-by: Yu Chen <yu.c.chen@intel.com> Acked-by: Juergen Gross <jgross@suse.com> Cc: Boris Ostrovsky <boris.ostrovsky@oracle.com> Cc: Tony Luck <tony.luck@intel.com> Cc: Marc Zyngier <marc.zyngier@arm.com> Cc: Alok Kataria <akataria@vmware.com> Cc: Joerg Roedel <joro@8bytes.org> Cc: "Rafael J. Wysocki" <rjw@rjwysocki.net> Cc: Steven Rostedt <rostedt@goodmis.org> Cc: Christoph Hellwig <hch@lst.de> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Borislav Petkov <bp@alien8.de> Cc: Paolo Bonzini <pbonzini@redhat.com> Cc: Rui Zhang <rui.zhang@intel.com> Cc: "K. Y. Srinivasan" <kys@microsoft.com> Cc: Arjan van de Ven <arjan@linux.intel.com> Cc: Dan Williams <dan.j.williams@intel.com> Cc: Len Brown <lenb@kernel.org> Link: https://lkml.kernel.org/r/20170913213152.848490816@linutronix.de	2017-09-25 20:38:24 +02:00
Robin Murphy	c0d05cde2a	iommu/of: Remove PCI host bridge node check of_pci_iommu_init() tries to be clever and stop its alias walk at the device represented by master_np, in case of weird PCI topologies where the bridge to the IOMMU and the rest of the system is not at the root. It turns out this is a bit short-sighted, since there are plenty of other callers of pci_for_each_dma_alias() which would also need the same behaviour in that situation, and the only platform so far with such a topology (Cavium ThunderX2) already solves it more generally via a PCI quirk. As this check is effectively redundant, and returning a boolean value as an int is a bit broken anyway, let's just get rid of it. Reported-by: Jean-Philippe Brucker <jean-philippe.brucker@arm.com> Fixes: `d87beb7492` ("iommu/of: Handle PCI aliases properly") Signed-off-by: Robin Murphy <robin.murphy@arm.com> Tested-by: Jean-Philippe Brucker <jean-philippe.brucker@arm.com> Signed-off-by: Joerg Roedel <jroedel@suse.de>	2017-09-22 12:05:43 +02:00
Geert Uytterhoeven	986a5f7017	iommu/qcom: Depend on HAS_DMA to fix compile error If NO_DMA=y: warning: (IPMMU_VMSA && ARM_SMMU && ARM_SMMU_V3 && QCOM_IOMMU) selects IOMMU_IO_PGTABLE_LPAE which has unmet direct dependencies (IOMMU_SUPPORT && HAS_DMA && (ARM \|\| ARM64 \|\| COMPILE_TEST && !GENERIC_ATOMIC64)) and drivers/iommu/io-pgtable-arm.o: In function `__arm_lpae_sync_pte': io-pgtable-arm.c:(.text+0x206): undefined reference to `bad_dma_ops' drivers/iommu/io-pgtable-arm.o: In function `__arm_lpae_free_pages': io-pgtable-arm.c:(.text+0x6a6): undefined reference to `bad_dma_ops' drivers/iommu/io-pgtable-arm.o: In function `__arm_lpae_alloc_pages': io-pgtable-arm.c:(.text+0x812): undefined reference to `bad_dma_ops' io-pgtable-arm.c:(.text+0x81c): undefined reference to `bad_dma_ops' io-pgtable-arm.c:(.text+0x862): undefined reference to `bad_dma_ops' drivers/iommu/io-pgtable-arm.o: In function `arm_lpae_run_tests': io-pgtable-arm.c:(.init.text+0x86): undefined reference to `alloc_io_pgtable_ops' io-pgtable-arm.c:(.init.text+0x47c): undefined reference to `free_io_pgtable_ops' drivers/iommu/qcom_iommu.o: In function `qcom_iommu_init_domain': qcom_iommu.c:(.text+0x1ce): undefined reference to `alloc_io_pgtable_ops' drivers/iommu/qcom_iommu.o: In function `qcom_iommu_domain_free': qcom_iommu.c:(.text+0x754): undefined reference to `free_io_pgtable_ops' QCOM_IOMMU selects IOMMU_IO_PGTABLE_LPAE, which bypasses its dependency on HAS_DMA. Make QCOM_IOMMU depend on HAS_DMA to fix this. Fixes: `0ae349a0f3` ("iommu/qcom: Add qcom_iommu") Signed-off-by: Geert Uytterhoeven <geert@linux-m68k.org> Signed-off-by: Joerg Roedel <jroedel@suse.de>	2017-09-19 15:30:41 +02:00
Marek Szyprowski	7a974b29fe	iommu/exynos: Rework runtime PM links management add_device is a bit more suitable for establishing runtime PM links than the xlate callback. This change also makes it possible to implement proper cleanup - in remove_device callback. Signed-off-by: Marek Szyprowski <m.szyprowski@samsung.com> Signed-off-by: Joerg Roedel <jroedel@suse.de>	2017-09-19 14:48:01 +02:00
Arnd Bergmann	3bd71e18c5	iommu/vt-d: Fix harmless section mismatch warning Building with gcc-4.6 results in this warning due to dmar_table_print_dmar_entry being inlined as in newer compiler versions: WARNING: vmlinux.o(.text+0x5c8bee): Section mismatch in reference from the function dmar_walk_remapping_entries() to the function .init.text:dmar_table_print_dmar_entry() The function dmar_walk_remapping_entries() references the function __init dmar_table_print_dmar_entry(). This is often because dmar_walk_remapping_entries lacks a __init annotation or the annotation of dmar_table_print_dmar_entry is wrong. This removes the __init annotation to avoid the warning. On compilers that don't show the warning today, this should have no impact since the function gets inlined anyway. Signed-off-by: Arnd Bergmann <arnd@arndb.de> Signed-off-by: Joerg Roedel <jroedel@suse.de>	2017-09-19 11:44:46 +02:00
Guenter Roeck	a4aaeccc7a	iommu: Add missing dependencies parisc:allmodconfig, xtensa:allmodconfig, and possibly others generate the following Kconfig warning. warning: (IPMMU_VMSA && ARM_SMMU && ARM_SMMU_V3 && QCOM_IOMMU) selects IOMMU_IO_PGTABLE_LPAE which has unmet direct dependencies (IOMMU_SUPPORT && HAS_DMA && (ARM \|\| ARM64 \|\| COMPILE_TEST && !GENERIC_ATOMIC64)) IOMMU_IO_PGTABLE_LPAE depends on (COMPILE_TEST && !GENERIC_ATOMIC64), so any configuration option selecting it needs to have the same dependencies. Signed-off-by: Guenter Roeck <linux@roeck-us.net> Signed-off-by: Joerg Roedel <jroedel@suse.de>	2017-09-19 11:42:19 +02:00
Suman Anna	9d5018deec	iommu/omap: Add support to program multiple iommus A client user instantiates and attaches to an iommu_domain to program the OMAP IOMMU associated with the domain. The iommus programmed by a client user are bound with the iommu_domain through the user's device archdata. The OMAP IOMMU driver currently supports only one IOMMU per IOMMU domain per user. The OMAP IOMMU driver has been enhanced to support allowing multiple IOMMUs to be programmed by a single client user. This support is being added mainly to handle the DSP subsystems on the DRA7xx SoCs, which have two MMUs within the same subsystem. These MMUs provide translations for a processor core port and an internal EDMA port. This support allows both the MMUs to be programmed together, but with each one retaining it's own internal state objects. The internal EDMA block is managed by the software running on the DSPs, and this design provides on-par functionality with previous generation OMAP DSPs where the EDMA and the DSP core shared the same MMU. The multiple iommus are expected to be provided through a sentinel terminated array of omap_iommu_arch_data objects through the client user's device archdata. The OMAP driver core is enhanced to loop through the array of attached iommus and program them for all common operations. The sentinel-terminated logic is used so as to not change the omap_iommu_arch_data structure. NOTE: 1. The IOMMU group and IOMMU core registration is done only for the DSP processor core MMU even though both MMUs are represented by their own platform device and are probed individually. The IOMMU device linking uses this registered MMU device. The struct iommu_device for the second MMU is not used even though memory for it is allocated. 2. The OMAP IOMMU debugfs code still continues to operate on individual IOMMU objects. Signed-off-by: Suman Anna <s-anna@ti.com> [t-kristo@ti.com: ported support to 4.13 based kernel] Signed-off-by: Tero Kristo <t-kristo@ti.com> Signed-off-by: Joerg Roedel <jroedel@suse.de>	2017-09-19 11:32:05 +02:00
Suman Anna	0d3642883b	iommu/omap: Change the attach detection logic The OMAP IOMMU driver allows only a single device (eg: a rproc device) to be attached per domain. The current attach detection logic relies on a check for an attached iommu for the respective client device. Change this logic to use the client device pointer instead in preparation for supporting multiple iommu devices to be bound to a single iommu domain, and thereby to a client device. Signed-off-by: Suman Anna <s-anna@ti.com> Signed-off-by: Joerg Roedel <jroedel@suse.de>	2017-09-19 11:32:05 +02:00
Linus Torvalds	4dfc278803	IOMMU Updates for Linux v4.14 Slightly more changes than usual this time: - KDump Kernel IOMMU take-over code for AMD IOMMU. The code now tries to preserve the mappings of the kernel so that master aborts for devices are avoided. Master aborts cause some devices to fail in the kdump kernel, so this code makes the dump more likely to succeed when AMD IOMMU is enabled. - Common flush queue implementation for IOVA code users. The code is still optional, but AMD and Intel IOMMU drivers had their own implementation which is now unified. - Finish support for iommu-groups. All drivers implement this feature now so that IOMMU core code can rely on it. - Finish support for 'struct iommu_device' in iommu drivers. All drivers now use the interface. - New functions in the IOMMU-API for explicit IO/TLB flushing. This will help to reduce the number of IO/TLB flushes when IOMMU drivers support this interface. - Support for mt2712 in the Mediatek IOMMU driver - New IOMMU driver for QCOM hardware - System PM support for ARM-SMMU - Shutdown method for ARM-SMMU-v3 - Some constification patches - Various other small improvements and fixes -----BEGIN PGP SIGNATURE----- Version: GnuPG v2 iQIcBAABAgAGBQJZtCFNAAoJECvwRC2XARrjZnQP/AxC/ezQpq82HbegF4sM/cVE Ep7TeTqodEl75FS/6txe2wU0pwodqk/LB9ajfQZUbE1w8vKsNEqi5qf4ZYHGoxYI 5bWyjJBzKIlwENH5lsBpQNt6XLevrYmRsFy7F0tRYy+qPQq8k+js2i7/XkCL3q7L 3xklF847RRoITaTOhhaROx1pF23dSMEsS2XGuWHcZfjORtep4wcFKzd/2SvlCWCo P2bRU7jBzfJuuGSA80gaiUbDmrULTUfYuZNp7njASzCgsDmagERtvDEpdoXPNNSp u6s4LjU1Dp3fgr6g6cFRO7B6JUbWd619nwo9so/c/wZN54yEngBF9EyeeF3mv2O5 ZbM2mOW3RlZcjxFT/AC8G4cZwwP6MpCEQOdqknoqc6ZQwcDqwN0o9I4+po0wsiAU 89ijZZe9Mx0p9lNpihaBEB1erAUWPo5Obh62zo80W3h6x9WzkGQWM+PyFK2DYoaC 8biEZzcc21sLEHvXQkcEGJSKrihHr9sluOqvxmCw5QAkYIFAeZRoeH7JtZWjVCnr T3XvaG1G1Aw6tS7ErxufdKawREAGki0Rm9i1baiH9sqNj5rllM01Y+PgU6E21Nbg iZp9gJLjfwM4vhYLlovvQK5PRoOBsCkyCpEI4GJqjLeam5p/WN06CFFf0ifQofYr qDPCVDkWHWV8nugFFKE7 =EVh9 -----END PGP SIGNATURE----- Merge tag 'iommu-updates-v4.14' of git://git.kernel.org/pub/scm/linux/kernel/git/joro/iommu Pull IOMMU updates from Joerg Roedel: "Slightly more changes than usual this time: - KDump Kernel IOMMU take-over code for AMD IOMMU. The code now tries to preserve the mappings of the kernel so that master aborts for devices are avoided. Master aborts cause some devices to fail in the kdump kernel, so this code makes the dump more likely to succeed when AMD IOMMU is enabled. - common flush queue implementation for IOVA code users. The code is still optional, but AMD and Intel IOMMU drivers had their own implementation which is now unified. - finish support for iommu-groups. All drivers implement this feature now so that IOMMU core code can rely on it. - finish support for 'struct iommu_device' in iommu drivers. All drivers now use the interface. - new functions in the IOMMU-API for explicit IO/TLB flushing. This will help to reduce the number of IO/TLB flushes when IOMMU drivers support this interface. - support for mt2712 in the Mediatek IOMMU driver - new IOMMU driver for QCOM hardware - system PM support for ARM-SMMU - shutdown method for ARM-SMMU-v3 - some constification patches - various other small improvements and fixes" * tag 'iommu-updates-v4.14' of git://git.kernel.org/pub/scm/linux/kernel/git/joro/iommu: (87 commits) iommu/vt-d: Don't be too aggressive when clearing one context entry iommu: Introduce Interface for IOMMU TLB Flushing iommu/s390: Constify iommu_ops iommu/vt-d: Avoid calling virt_to_phys() on null pointer iommu/vt-d: IOMMU Page Request needs to check if address is canonical. arm/tegra: Call bus_set_iommu() after iommu_device_register() iommu/exynos: Constify iommu_ops iommu/ipmmu-vmsa: Make ipmmu_gather_ops const iommu/ipmmu-vmsa: Rereserving a free context before setting up a pagetable iommu/amd: Rename a few flush functions iommu/amd: Check if domain is NULL in get_domain() and return -EBUSY iommu/mediatek: Fix a build warning of BIT(32) in ARM iommu/mediatek: Fix a build fail of m4u_type iommu: qcom: annotate PM functions as __maybe_unused iommu/pamu: Fix PAMU boot crash memory: mtk-smi: Degrade SMI init to module_init iommu/mediatek: Enlarge the validate PA range for 4GB mode iommu/mediatek: Disable iommu clock when system suspend iommu/mediatek: Move pgtable allocation into domain_alloc iommu/mediatek: Merge 2 M4U HWs into one iommu domain ...	2017-09-09 15:03:24 -07:00
Linus Torvalds	0d519f2d1e	pci-v4.14-changes -----BEGIN PGP SIGNATURE----- Version: GnuPG v1 iQIcBAABAgAGBQJZsr8cAAoJEFmIoMA60/r8lXYQAKViYIRMJDD4n3NhjMeLOsnJ vwaBmWlLRjSFIEpag5kMjS1RJE17qAvmkBZnDvSNZ6cT28INkkZnVM2IW96WECVq 64MIvDijVPcvqGuWePCfWdDiSXApiDWwJuw55BOhmvV996wGy0gYgzpPY+1g0Knh XzH9IOzDL79hZleLfsxX0MLV6FGBVtOsr0jvQ04k4IgEMIxEDTlbw85rnrvzQUtc 0Vj2koaxWIESZsq7G/wiZb2n6ekaFdXO/VlVvvhmTSDLCBaJ63Hb/gfOhwMuVkS6 B3cVprNrCT0dSzWmU4ZXf+wpOyDpBexlemW/OR/6CQUkC6AUS6kQ5si1X44dbGmJ nBPh414tdlm/6V4h/A3UFPOajSGa/ZWZ/uQZPfvKs1R6WfjUerWVBfUpAzPbgjam c/mhJ19HYT1J7vFBfhekBMeY2Px3JgSJ9rNsrFl48ynAALaX5GEwdpo4aqBfscKz 4/f9fU4ysumopvCEuKD2SsJvsPKd5gMQGGtvAhXM1TxvAoQ5V4cc99qEetAPXXPf h2EqWm4ph7YP4a+n/OZBjzluHCmZJn1CntH5+//6wpUk6HnmzsftGELuO9n12cLE GGkreI3T9ctV1eOkzVVa0l0QTE1X/VLyEyKCtb9obXsDaG4Ud7uKQoZgB19DwyTJ EG76ridTolUFVV+wzJD9 =9cLP -----END PGP SIGNATURE----- Merge tag 'pci-v4.14-changes' of git://git.kernel.org/pub/scm/linux/kernel/git/helgaas/pci Pull PCI updates from Bjorn Helgaas: - add enhanced Downstream Port Containment support, which prints more details about Root Port Programmed I/O errors (Dongdong Liu) - add Layerscape ls1088a and ls2088a support (Hou Zhiqiang) - add MediaTek MT2712 and MT7622 support (Ryder Lee) - add MediaTek MT2712 and MT7622 MSI support (Honghui Zhang) - add Qualcom IPQ8074 support (Varadarajan Narayanan) - add R-Car r8a7743/5 device tree support (Biju Das) - add Rockchip per-lane PHY support for better power management (Shawn Lin) - fix IRQ mapping for hot-added devices by replacing the pci_fixup_irqs() boot-time design with a host bridge hook called at probe-time (Lorenzo Pieralisi, Matthew Minter) - fix race when enabling two devices that results in upstream bridge not being enabled correctly (Srinath Mannam) - fix pciehp power fault infinite loop (Keith Busch) - fix SHPC bridge MSI hotplug events by enabling bus mastering (Aleksandr Bezzubikov) - fix a VFIO issue by correcting PCIe capability sizes (Alex Williamson) - fix an INTD issue on Xilinx and possibly other drivers by unifying INTx IRQ domain support (Paul Burton) - avoid IOMMU stalls by marking AMD Stoney GPU ATS as broken (Joerg Roedel) - allow APM X-Gene device assignment to guests by adding an ACS quirk (Feng Kan) - fix driver crashes by disabling Extended Tags on Broadcom HT2100 (Extended Tags support is required for PCIe Receivers but not Requesters, and we now enable them by default when Requesters support them) (Sinan Kaya) - fix MSIs for devices that use phantom RIDs for DMA by assuming MSIs use the real Requester ID (not a phantom RID) (Robin Murphy) - prevent assignment of Intel VMD children to guests (which may be supported eventually, but isn't yet) by not associating an IOMMU with them (Jon Derrick) - fix Intel VMD suspend/resume by releasing IRQs on suspend (Scott Bauer) - fix a Function-Level Reset issue with Intel 750 NVMe by waiting longer (up to 60sec instead of 1sec) for device to become ready (Sinan Kaya) - fix a Function-Level Reset issue on iProc Stingray by working around hardware defects in the CRS implementation (Oza Pawandeep) - fix an issue with Intel NVMe P3700 after an iProc reset by adding a delay during shutdown (Oza Pawandeep) - fix a Microsoft Hyper-V lockdep issue by polling instead of blocking in compose_msi_msg() (Stephen Hemminger) - fix a wireless LAN driver timeout by clearing DesignWare MSI interrupt status after it is handled, not before (Faiz Abbas) - fix DesignWare ATU enable checking (Jisheng Zhang) - reduce Layerscape dependencies on the bootloader by doing more initialization in the driver (Hou Zhiqiang) - improve Intel VMD performance allowing allocation of more IRQ vectors than present CPUs (Keith Busch) - improve endpoint framework support for initial DMA mask, different BAR sizes, configurable page sizes, MSI, test driver, etc (Kishon Vijay Abraham I, Stan Drozd) - rework CRS support to add periodic messages while we poll during enumeration and after Function-Level Reset and prepare for possible other uses of CRS (Sinan Kaya) - clean up Root Port AER handling by removing unnecessary code and moving error handler methods to struct pcie_port_service_driver (Christoph Hellwig) - clean up error handling paths in various drivers (Bjorn Andersson, Fabio Estevam, Gustavo A. R. Silva, Harunobu Kurokawa, Jeffy Chen, Lorenzo Pieralisi, Sergei Shtylyov) - clean up SR-IOV resource handling by disabling VF decoding before updating the corresponding resource structs (Gavin Shan) - clean up DesignWare-based drivers by unifying quirks to update Class Code and Interrupt Pin and related handling of write-protected registers (Hou Zhiqiang) - clean up by adding empty generic pcibios_align_resource() and pcibios_fixup_bus() and removing empty arch-specific implementations (Palmer Dabbelt) - request exclusive reset control for several drivers to allow cleanup elsewhere (Philipp Zabel) - constify various structures (Arvind Yadav, Bhumika Goyal) - convert from full_name() to %pOF (Rob Herring) - remove unused variables from iProc, HiSi, Altera, Keystone (Shawn Lin) * tag 'pci-v4.14-changes' of git://git.kernel.org/pub/scm/linux/kernel/git/helgaas/pci: (170 commits) PCI: xgene: Clean up whitespace PCI: xgene: Define XGENE_PCI_EXP_CAP and use generic PCI_EXP_RTCTL offset PCI: xgene: Fix platform_get_irq() error handling PCI: xilinx-nwl: Fix platform_get_irq() error handling PCI: rockchip: Fix platform_get_irq() error handling PCI: altera: Fix platform_get_irq() error handling PCI: spear13xx: Fix platform_get_irq() error handling PCI: artpec6: Fix platform_get_irq() error handling PCI: armada8k: Fix platform_get_irq() error handling PCI: dra7xx: Fix platform_get_irq() error handling PCI: exynos: Fix platform_get_irq() error handling PCI: iproc: Clean up whitespace PCI: iproc: Rename PCI_EXP_CAP to IPROC_PCI_EXP_CAP PCI: iproc: Add 500ms delay during device shutdown PCI: Fix typos and whitespace errors PCI: Remove unused "res" variable from pci_resource_io() PCI: Correct kernel-doc of pci_vpd_srdt_size(), pci_vpd_srdt_tag() PCI/AER: Reformat AER register definitions iommu/vt-d: Prevent VMD child devices from being remapping targets x86/PCI: Use is_vmd() rather than relying on the domain number ...	2017-09-08 15:47:43 -07:00
Linus Torvalds	b1b6f83ac9	Merge branch 'x86-mm-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull x86 mm changes from Ingo Molnar: "PCID support, 5-level paging support, Secure Memory Encryption support The main changes in this cycle are support for three new, complex hardware features of x86 CPUs: - Add 5-level paging support, which is a new hardware feature on upcoming Intel CPUs allowing up to 128 PB of virtual address space and 4 PB of physical RAM space - a 512-fold increase over the old limits. (Supercomputers of the future forecasting hurricanes on an ever warming planet can certainly make good use of more RAM.) Many of the necessary changes went upstream in previous cycles, v4.14 is the first kernel that can enable 5-level paging. This feature is activated via CONFIG_X86_5LEVEL=y - disabled by default. (By Kirill A. Shutemov) - Add 'encrypted memory' support, which is a new hardware feature on upcoming AMD CPUs ('Secure Memory Encryption', SME) allowing system RAM to be encrypted and decrypted (mostly) transparently by the CPU, with a little help from the kernel to transition to/from encrypted RAM. Such RAM should be more secure against various attacks like RAM access via the memory bus and should make the radio signature of memory bus traffic harder to intercept (and decrypt) as well. This feature is activated via CONFIG_AMD_MEM_ENCRYPT=y - disabled by default. (By Tom Lendacky) - Enable PCID optimized TLB flushing on newer Intel CPUs: PCID is a hardware feature that attaches an address space tag to TLB entries and thus allows to skip TLB flushing in many cases, even if we switch mm's. (By Andy Lutomirski) All three of these features were in the works for a long time, and it's coincidence of the three independent development paths that they are all enabled in v4.14 at once" * 'x86-mm-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (65 commits) x86/mm: Enable RCU based page table freeing (CONFIG_HAVE_RCU_TABLE_FREE=y) x86/mm: Use pr_cont() in dump_pagetable() x86/mm: Fix SME encryption stack ptr handling kvm/x86: Avoid clearing the C-bit in rsvd_bits() x86/CPU: Align CR3 defines x86/mm, mm/hwpoison: Clear PRESENT bit for kernel 1:1 mappings of poison pages acpi, x86/mm: Remove encryption mask from ACPI page protection type x86/mm, kexec: Fix memory corruption with SME on successive kexecs x86/mm/pkeys: Fix typo in Documentation/x86/protection-keys.txt x86/mm/dump_pagetables: Speed up page tables dump for CONFIG_KASAN=y x86/mm: Implement PCID based optimization: try to preserve old TLB entries using PCID x86: Enable 5-level paging support via CONFIG_X86_5LEVEL=y x86/mm: Allow userspace have mappings above 47-bit x86/mm: Prepare to expose larger address space to userspace x86/mpx: Do not allow MPX if we have mappings above 47-bit x86/mm: Rename tasksize_32bit/64bit to task_size_32bit/64bit() x86/xen: Redefine XEN_ELFNOTE_INIT_P2M using PUD_SIZE * PTRS_PER_PUD x86/mm/dump_pagetables: Fix printout of p4d level x86/mm/dump_pagetables: Generalize address normalization x86/boot: Fix memremap() related build failure ...	2017-09-04 12:21:28 -07:00
Joerg Roedel	47b59d8e40	Merge branches 'arm/exynos', 'arm/renesas', 'arm/rockchip', 'arm/omap', 'arm/mediatek', 'arm/tegra', 'arm/qcom', 'arm/smmu', 'ppc/pamu', 'x86/vt-d', 'x86/amd', 's390' and 'core' into next	2017-09-01 11:31:42 +02:00
Filippo Sironi	5082219b6a	iommu/vt-d: Don't be too aggressive when clearing one context entry Previously, we were invalidating context cache and IOTLB globally when clearing one context entry. This is a tad too aggressive. Invalidate the context cache and IOTLB for the interested device only. Signed-off-by: Filippo Sironi <sironi@amazon.de> Cc: David Woodhouse <dwmw@amazon.co.uk> Cc: David Woodhouse <dwmw2@infradead.org> Cc: Joerg Roedel <joro@8bytes.org> Cc: Jacob Pan <jacob.jun.pan@linux.intel.com> Cc: iommu@lists.linux-foundation.org Cc: linux-kernel@vger.kernel.org Acked-by: David Woodhouse <dwmw@amazon.co.uk> Signed-off-by: Joerg Roedel <jroedel@suse.de>	2017-09-01 11:30:41 +02:00
Jérôme Glisse	30ef7d2c05	iommu/intel: update to new mmu_notifier semantic Calls to mmu_notifier_invalidate_page() were replaced by calls to mmu_notifier_invalidate_range() and are now bracketed by calls to mmu_notifier_invalidate_range_start()/end() Remove now useless invalidate_page callback. Signed-off-by: Jérôme Glisse <jglisse@redhat.com> Cc: David Woodhouse <dwmw2@infradead.org> Cc: iommu@lists.linux-foundation.org Cc: Joerg Roedel <jroedel@suse.de> Cc: Kirill A. Shutemov <kirill.shutemov@linux.intel.com> Cc: Andrew Morton <akpm@linux-foundation.org> Cc: Andrea Arcangeli <aarcange@redhat.com> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2017-08-31 16:12:59 -07:00
Jérôme Glisse	f0d1c713d6	iommu/amd: update to new mmu_notifier semantic Calls to mmu_notifier_invalidate_page() were replaced by calls to mmu_notifier_invalidate_range() and are now bracketed by calls to mmu_notifier_invalidate_range_start()/end() Remove now useless invalidate_page callback. Signed-off-by: Jérôme Glisse <jglisse@redhat.com> Cc: Suravee Suthikulpanit <suravee.suthikulpanit@amd.com> Cc: iommu@lists.linux-foundation.org Cc: Joerg Roedel <jroedel@suse.de> Cc: Kirill A. Shutemov <kirill.shutemov@linux.intel.com> Cc: Andrew Morton <akpm@linux-foundation.org> Cc: Andrea Arcangeli <aarcange@redhat.com> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2017-08-31 16:12:59 -07:00
Jon Derrick	5823e330b5	iommu/vt-d: Prevent VMD child devices from being remapping targets VMD child devices must use the VMD endpoint's ID as the requester. Because of this, there needs to be a way to link the parent VMD endpoint's IOMMU group and associated mappings to the VMD child devices such that attaching and detaching child devices modify the endpoint's mappings, while preventing early detaching on a singular device removal or unbinding. The reassignment of individual VMD child devices devices to VMs is outside the scope of VMD, but may be implemented in the future. For now it is best to prevent any such attempts. Prevent VMD child devices from returning an IOMMU, which prevents it from exposing an iommu_group sysfs directory and allowing subsequent binding by userspace-access drivers such as VFIO. Signed-off-by: Jon Derrick <jonathan.derrick@intel.com> Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>	2017-08-30 16:43:58 -05:00
Joerg Roedel	add02cfdc9	iommu: Introduce Interface for IOMMU TLB Flushing With the current IOMMU-API the hardware TLBs have to be flushed in every iommu_ops->unmap() call-back. For unmapping large amounts of address space, like it happens when a KVM domain with assigned devices is destroyed, this causes thousands of unnecessary TLB flushes in the IOMMU hardware because the unmap call-back runs for every unmapped physical page. With the TLB Flush Interface and the new iommu_unmap_fast() function introduced here the need to clean the hardware TLBs is removed from the unmapping code-path. Users of iommu_unmap_fast() have to explicitly call the TLB-Flush functions to sync the page-table changes to the hardware. Three functions for TLB-Flushes are introduced: * iommu_flush_tlb_all() - Flushes all TLB entries associated with that domain. TLBs entries are flushed when this function returns. * iommu_tlb_range_add() - This will add a given range to the flush queue for this domain. * iommu_tlb_sync() - Flushes all queued ranges from the hardware TLBs. Returns when the flush is finished. The semantic of this interface is intentionally similar to the iommu_gather_ops from the io-pgtable code. Cc: Alex Williamson <alex.williamson@redhat.com> Cc: Will Deacon <will.deacon@arm.com> Cc: Robin Murphy <robin.murphy@arm.com> Signed-off-by: Joerg Roedel <jroedel@suse.de>	2017-08-30 18:07:13 +02:00
Arvind Yadav	cceb845195	iommu/s390: Constify iommu_ops iommu_ops are not supposed to change at runtime. Functions 'bus_set_iommu' working with const iommu_ops provided by <linux/iommu.h>. So mark the non-const structs as const. Signed-off-by: Arvind Yadav <arvind.yadav.cs@gmail.com> Acked-by: Gerald Schaefer <gerald.schaefer@de.ibm.com> Signed-off-by: Joerg Roedel <jroedel@suse.de>	2017-08-30 17:53:29 +02:00
Ashok Raj	11b93ebfa0	iommu/vt-d: Avoid calling virt_to_phys() on null pointer New kernels with debug show panic() from __phys_addr() checks. Avoid calling virt_to_phys() when pasid_state_tbl pointer is null To: Joerg Roedel <joro@8bytes.org> To: linux-kernel@vger.kernel.org> Cc: iommu@lists.linux-foundation.org Cc: David Woodhouse <dwmw2@infradead.org> Cc: Jacob Pan <jacob.jun.pan@intel.com> Cc: Ashok Raj <ashok.raj@intel.com> Fixes: `2f26e0a9c9` ('iommu/vt-d: Add basic SVM PASID support') Signed-off-by: Ashok Raj <ashok.raj@intel.com> Acked-by: David Woodhouse <dwmw@amazon.co.uk> Signed-off-by: Joerg Roedel <jroedel@suse.de>	2017-08-30 17:48:18 +02:00
Ashok Raj	9d8c3af316	iommu/vt-d: IOMMU Page Request needs to check if address is canonical. Page Request from devices that support device-tlb would request translation to pre-cache them in device to avoid overhead of IOMMU lookups. IOMMU needs to check for canonicallity of the address before performing page-fault processing. To: Joerg Roedel <joro@8bytes.org> To: linux-kernel@vger.kernel.org> Cc: iommu@lists.linux-foundation.org Cc: David Woodhouse <dwmw2@infradead.org> Cc: Jacob Pan <jacob.jun.pan@intel.com> Cc: Ashok Raj <ashok.raj@intel.com> Signed-off-by: Ashok Raj <ashok.raj@intel.com> Reported-by: Sudeep Dutt <sudeep.dutt@intel.com> Acked-by: David Woodhouse <dwmw@amazon.co.uk> Signed-off-by: Joerg Roedel <jroedel@suse.de>	2017-08-30 17:48:18 +02:00
Joerg Roedel	96302d89a0	arm/tegra: Call bus_set_iommu() after iommu_device_register() The bus_set_iommu() function will call the add_device() call-back which needs the iommu to be registered. Reported-by: Jon Hunter <jonathanh@nvidia.com> Fixes: `0b480e4470` ('iommu/tegra: Add support for struct iommu_device') Signed-off-by: Joerg Roedel <jroedel@suse.de>	2017-08-30 17:28:32 +02:00
Arvind Yadav	0b9a36947c	iommu/exynos: Constify iommu_ops iommu_ops are not supposed to change at runtime. Functions 'iommu_device_set_ops' and 'bus_set_iommu' working with const iommu_ops provided by <linux/iommu.h>. So mark the non-const structs as const. Signed-off-by: Arvind Yadav <arvind.yadav.cs@gmail.com> Acked-by: Marek Szyprowski <m.szyprowski@samsung.com> Signed-off-by: Joerg Roedel <jroedel@suse.de>	2017-08-30 15:13:54 +02:00
Bhumika Goyal	8da4af9586	iommu/ipmmu-vmsa: Make ipmmu_gather_ops const Make these const as they are not modified anywhere. Signed-off-by: Bhumika Goyal <bhumirks@gmail.com> Signed-off-by: Joerg Roedel <jroedel@suse.de>	2017-08-30 15:10:53 +02:00
Oleksandr Tyshchenko	a175a67d30	iommu/ipmmu-vmsa: Rereserving a free context before setting up a pagetable Reserving a free context is both quicker and more likely to fail (due to limited hardware resources) than setting up a pagetable. What is more the pagetable init/cleanup code could require the context to be set up. Signed-off-by: Oleksandr Tyshchenko <oleksandr_tyshchenko@epam.com> CC: Robin Murphy <robin.murphy@arm.com> CC: Laurent Pinchart <laurent.pinchart@ideasonboard.com> Reviewed-by: Robin Murphy <robin.murphy@arm.com> Reviewed-by: Laurent Pinchart <laurent.pinchart@ideasonboard.com> CC: Joerg Roedel <joro@8bytes.org> Signed-off-by: Joerg Roedel <jroedel@suse.de>	2017-08-30 15:10:46 +02:00
Joerg Roedel	0688a09990	iommu/amd: Rename a few flush functions Rename a few iommu cache-flush functions that start with iommu_ to start with amd_iommu now. This is to prevent name collisions with generic iommu code later on. Signed-off-by: Joerg Roedel <jroedel@suse.de>	2017-08-28 11:31:31 +02:00
Baoquan He	ec62b1ab0f	iommu/amd: Check if domain is NULL in get_domain() and return -EBUSY In get_domain(), 'domain' could be NULL before it's passed to dma_ops_domain() to dereference. And the current code calling get_domain() can't deal with the returned 'domain' well if its value is NULL. So before dma_ops_domain() calling, check if 'domain' is NULL, If yes just return ERR_PTR(-EBUSY) directly. Reported-by: Dan Carpenter <dan.carpenter@oracle.com> Fixes: `df3f7a6e8e` ('iommu/amd: Use is_attach_deferred call-back') Signed-off-by: Baoquan He <bhe@redhat.com> Signed-off-by: Joerg Roedel <jroedel@suse.de>	2017-08-28 11:28:37 +02:00
Yong Wu	4193998043	iommu/mediatek: Fix a build warning of BIT(32) in ARM The commit ("iommu/mediatek: Enlarge the validate PA range for 4GB mode") introduce the following build warning while ARCH=arm: drivers/iommu/mtk_iommu.c: In function 'mtk_iommu_iova_to_phys': include/linux/bitops.h:6:24: warning: left shift count >= width of type [-Wshift-count-overflow] #define BIT(nr) (1UL << (nr)) ^ >> drivers/iommu/mtk_iommu.c:407:9: note: in expansion of macro 'BIT' pa \|= BIT(32); drivers/iommu/mtk_iommu.c: In function 'mtk_iommu_probe': include/linux/bitops.h:6:24: warning: left shift count >= width of type [-Wshift-count-overflow] #define BIT(nr) (1UL << (nr)) ^ drivers/iommu/mtk_iommu.c:589:35: note: in expansion of macro 'BIT' data->enable_4GB = !!(max_pfn > (BIT(32) >> PAGE_SHIFT)); Use BIT_ULL instead of BIT. Reported-by: kernel test robot <lkp@intel.com> Signed-off-by: Yong Wu <yong.wu@mediatek.com> Signed-off-by: Joerg Roedel <jroedel@suse.de>	2017-08-28 11:26:21 +02:00
Yong Wu	4f1c8ea16b	iommu/mediatek: Fix a build fail of m4u_type The commit ("iommu/mediatek: Enlarge the validate PA range for 4GB mode") introduce the following build error: drivers/iommu/mtk_iommu.c: In function 'mtk_iommu_hw_init': >> drivers/iommu/mtk_iommu.c:536:30: error: 'const struct mtk_iommu_data' has no member named 'm4u_type'; did you mean 'm4u_dom'? if (data->enable_4GB && data->m4u_type != M4U_MT8173) { This patch fix it, use "m4u_plat" instead of "m4u_type". Reported-by: kernel test robot <lkp@intel.com> Signed-off-by: Yong Wu <yong.wu@mediatek.com> Signed-off-by: Joerg Roedel <jroedel@suse.de>	2017-08-28 11:26:21 +02:00
Arnd Bergmann	6ce5b0f22d	iommu: qcom: annotate PM functions as __maybe_unused The qcom_iommu_disable_clocks() function is only called from PM code that is hidden in an #ifdef, causing a harmless warning without CONFIG_PM: drivers/iommu/qcom_iommu.c:601:13: error: 'qcom_iommu_disable_clocks' defined but not used [-Werror=unused-function] static void qcom_iommu_disable_clocks(struct qcom_iommu_dev qcom_iommu) drivers/iommu/qcom_iommu.c:581:12: error: 'qcom_iommu_enable_clocks' defined but not used [-Werror=unused-function] static int qcom_iommu_enable_clocks(struct qcom_iommu_dev qcom_iommu) Replacing that #ifdef with __maybe_unused annotations lets the compiler drop the functions silently instead. Fixes: `0ae349a0f3` ("iommu/qcom: Add qcom_iommu") Acked-by: Rob Clark <robdclark@gmail.com> Signed-off-by: Arnd Bergmann <arnd@arndb.de> Signed-off-by: Joerg Roedel <jroedel@suse.de>	2017-08-28 11:24:52 +02:00
Ingo Molnar	413d63d71b	Merge branch 'linus' into x86/mm to pick up fixes and to fix conflicts Conflicts: arch/x86/kernel/head64.c arch/x86/mm/mmap.c Signed-off-by: Ingo Molnar <mingo@kernel.org>	2017-08-26 09:19:13 +02:00
Joerg Roedel	3ff2dcc058	iommu/pamu: Fix PAMU boot crash Commit `68a17f0be6` introduced an initialization order problem, where devices are linked against an iommu which is not yet initialized. Fix it by initializing the iommu-device before the iommu-ops are registered against the bus. Reported-by: Michael Ellerman <mpe@ellerman.id.au> Fixes: `68a17f0be6` ('iommu/pamu: Add support for generic iommu-device') Signed-off-by: Joerg Roedel <jroedel@suse.de>	2017-08-23 16:28:09 +02:00
Yong Wu	30e2fccf95	iommu/mediatek: Enlarge the validate PA range for 4GB mode This patch is for 4GB mode, mainly for 4 issues: 1) Fix a 4GB bug: if the dram base is 0x4000_0000, the dram size is 0xc000_0000. then the code just meet a corner case because max_pfn is 0x10_0000. data->enable_4GB = !!(max_pfn > (0xffffffffUL >> PAGE_SHIFT)); It's true at the case above. That is unexpected. 2) In mt2712, there is a new register for the 4GB PA range(0x118) we should enlarge the max PA range, or the HW will report error. The dram range is from 0x1_0000_0000 to 0x1_ffff_ffff in the 4GB mode, we cut out the bit[32:30] of the SA(Start address) and EA(End address) into this REG_MMU_VLD_PA_RNG(0x118). 3) In mt2712, the register(0x13c) is extended for 4GB mode. bit[7:6] indicate the valid PA[32:33]. Thus, we don't mask the value and print it directly for debug. 4) if 4GB is enabled, the dram PA range is from 0x1_0000_0000 to 0x1_ffff_ffff. Thus, the PA from iova_to_pa should also '\|' BIT(32) Signed-off-by: Honghui Zhang <honghui.zhang@mediatek.com> Signed-off-by: Yong Wu <yong.wu@mediatek.com> Signed-off-by: Joerg Roedel <jroedel@suse.de>	2017-08-22 16:38:00 +02:00
Yong Wu	6254b64f57	iommu/mediatek: Disable iommu clock when system suspend When system suspend, infra power domain may be off, and the iommu's clock must be disabled when system off, or the iommu's bclk clock maybe disabled after system resume. Signed-off-by: Honghui Zhang <honghui.zhang@mediatek.com> Signed-off-by: Yong Wu <yong.wu@mediatek.com> Signed-off-by: Joerg Roedel <jroedel@suse.de>	2017-08-22 16:37:59 +02:00
Yong Wu	4b00f5ac12	iommu/mediatek: Move pgtable allocation into domain_alloc After adding the global list for M4U HW, We get a chance to move the pagetable allocation into the mtk_iommu_domain_alloc. Let the domain_alloc do the right thing. This patch is for fixing this problem[1]. [1]: https://patchwork.codeaurora.org/patch/53987/ Signed-off-by: Yong Wu <yong.wu@mediatek.com> Signed-off-by: Joerg Roedel <jroedel@suse.de>	2017-08-22 16:37:59 +02:00
Yong Wu	7c3a2ec028	iommu/mediatek: Merge 2 M4U HWs into one iommu domain In theory, If there are 2 M4U HWs, there should be 2 IOMMU domains. But one IOMMU domain(4GB iova range) is enough for us currently, It's unnecessary to maintain 2 pagetables. Besides, This patch can simplify our consumer code largely. They don't need map a iova range from one domain into another, They can share the iova address easily. Signed-off-by: Yong Wu <yong.wu@mediatek.com> Signed-off-by: Joerg Roedel <jroedel@suse.de>	2017-08-22 16:37:59 +02:00
Yong Wu	e6dec92308	iommu/mediatek: Add mt2712 IOMMU support The M4U IP blocks in mt2712 is MTK's generation2 M4U which use the ARM Short-descriptor like mt8173, and most of the HW registers are the same. The difference is that there are 2 M4U HWs in mt2712 while there's only one in mt8173. The purpose of 2 M4U HWs is for balance the bandwidth. Normally if there are 2 M4U HWs, there should be 2 iommu domains, each M4U has a iommu domain. Signed-off-by: Yong Wu <yong.wu@mediatek.com> Signed-off-by: Joerg Roedel <jroedel@suse.de>	2017-08-22 16:37:58 +02:00
Yong Wu	a9467d9542	iommu/mediatek: Move MTK_M4U_TO_LARB/PORT into mtk_iommu.c The definition of MTK_M4U_TO_LARB and MTK_M4U_TO_PORT are shared by all the gen2 M4U HWs. Thus, Move them out from mt8173-larb-port.h, and put them into the c file. Suggested-by: Honghui Zhang <honghui.zhang@mediatek.com> Signed-off-by: Yong Wu <yong.wu@mediatek.com> Signed-off-by: Joerg Roedel <jroedel@suse.de>	2017-08-22 16:37:58 +02:00
Magnus Damm	7af9a5fdb9	iommu/ipmmu-vmsa: Use iommu_device_sysfs_add()/remove() Extend the driver to make use of iommu_device_sysfs_add()/remove() functions to hook up initial sysfs support. Suggested-by: Joerg Roedel <jroedel@suse.de> Signed-off-by: Magnus Damm <damm+renesas@opensource.se> Signed-off-by: Joerg Roedel <jroedel@suse.de>	2017-08-22 16:19:00 +02:00
Joerg Roedel	2479c631d1	iommu/amd: Fix section mismatch warning The variable amd_iommu_pre_enabled is used in non-init code-paths, so remove the __initdata annotation. Reported-by: kbuild test robot <fengguang.wu@intel.com> Fixes: `3ac3e5ee5e` ('iommu/amd: Copy old trans table from old kernel') Acked-by: Baoquan He <bhe@redhat.com> Signed-off-by: Joerg Roedel <jroedel@suse.de>	2017-08-19 10:34:31 +02:00
Joerg Roedel	ae162efbf2	iommu/amd: Fix compiler warning in copy_device_table() This was reported by the kbuild bot. The condition in which entry would be used uninitialized can not happen, because when there is no iommu this function would never be called. But its no fast-path, so fix the warning anyway. Reported-by: kbuild test robot <fengguang.wu@intel.com> Acked-by: Baoquan He <bhe@redhat.com> Signed-off-by: Joerg Roedel <jroedel@suse.de>	2017-08-19 10:33:42 +02:00
Joerg Roedel	af6ee6c1c4	Merge branch 'for-joerg/arm-smmu/updates' of git://git.kernel.org/pub/scm/linux/kernel/git/will/linux into arm/smmu	2017-08-19 00:42:01 +02:00
Robin Murphy	1464d0b1de	iommu: Avoid NULL group dereference The recently-removed FIXME in iommu_get_domain_for_dev() turns out to have been a little misleading, since that check is still worthwhile even when groups are universal. We have a few IOMMU-aware drivers which only care whether their device is already attached to an existing domain or not, for which the previous behaviour of iommu_get_domain_for_dev() was ideal, and who now crash if their device does not have an IOMMU. With IOMMU groups now serving as a reliable indicator of whether a device has an IOMMU or not (barring false-positives from VFIO no-IOMMU mode), drivers could arguably do this: group = iommu_group_get(dev); if (group) { domain = iommu_get_domain_for_dev(dev); iommu_group_put(group); } However, rather than duplicate that code across multiple callsites, particularly when it's still only the domain they care about, let's skip straight to the next step and factor out the check into the common place it applies - in iommu_get_domain_for_dev() itself. Sure, it ends up looking rather familiar, but now it's backed by the reasoning of having a robust API able to do the expected thing for all devices regardless. Fixes: `05f80300dc` ("iommu: Finish making iommu_group support mandatory") Reported-by: Shawn Lin <shawn.lin@rock-chips.com> Signed-off-by: Robin Murphy <robin.murphy@arm.com> Signed-off-by: Joerg Roedel <jroedel@suse.de>	2017-08-18 11:41:17 +02:00
Joerg Roedel	c184ae83c8	iommu/tegra-gart: Add support for struct iommu_device Add a struct iommu_device to each tegra-gart and register it with the iommu-core. Also link devices added to the driver to their respective hardware iommus. Reviewed-by: Dmitry Osipenko <digetx@gmail.com> Tested-by: Dmitry Osipenko <digetx@gmail.com> Acked-by: Thierry Reding <treding@nvidia.com> Signed-off-by: Joerg Roedel <jroedel@suse.de>	2017-08-17 16:31:34 +02:00
Joerg Roedel	0b480e4470	iommu/tegra: Add support for struct iommu_device Add a struct iommu_device to each tegra-smmu and register it with the iommu-core. Also link devices added to the driver to their respective hardware iommus. Acked-by: Thierry Reding <treding@nvidia.com> Signed-off-by: Joerg Roedel <jroedel@suse.de>	2017-08-17 16:31:34 +02:00
Joerg Roedel	fa94cf84af	Merge branch 'core' into arm/tegra	2017-08-17 16:31:27 +02:00
Robin Murphy	a2d866f7d6	iommu/arm-smmu: Add system PM support With all our hardware state tracked in such a way that we can naturally restore it as part of the necessary reset, resuming is trivial, and there's nothing to do on suspend at all. Signed-off-by: Robin Murphy <robin.murphy@arm.com> Signed-off-by: Will Deacon <will.deacon@arm.com>	2017-08-16 17:27:29 +01:00
Robin Murphy	90df373cc6	iommu/arm-smmu: Track context bank state Echoing what we do for Stream Map Entries, maintain a software shadow state for context bank configuration. With this in place, we are mere moments away from blissfully easy suspend/resume support. Reviewed-by: Sricharan R <sricharan@codeaurora.org> Signed-off-by: Robin Murphy <robin.murphy@arm.com> [will: fix sparse warning by only clearing .cfg during domain destruction] Signed-off-by: Will Deacon <will.deacon@arm.com>	2017-08-16 17:27:28 +01:00
Nate Watterson	7aa8619a66	iommu/arm-smmu-v3: Implement shutdown method The shutdown method disables the SMMU to avoid corrupting a new kernel started with kexec. Signed-off-by: Nate Watterson <nwatters@codeaurora.org> Signed-off-by: Will Deacon <will.deacon@arm.com>	2017-08-16 17:27:28 +01:00
Joerg Roedel	13cf017446	iommu/vt-d: Make use of iova deferred flushing Remove the deferred flushing implementation in the Intel VT-d driver and use the one from the common iova code instead. Signed-off-by: Joerg Roedel <jroedel@suse.de>	2017-08-15 18:23:53 +02:00
Joerg Roedel	c8acb28b33	iommu/vt-d: Allow to flush more than 4GB of device TLBs The shift qi_flush_dev_iotlb() is done on an int, which limits the mask to 32 bits. Make the mask 64 bits wide so that more than 4GB of address range can be flushed at once. Signed-off-by: Joerg Roedel <jroedel@suse.de>	2017-08-15 18:23:53 +02:00
Joerg Roedel	9003d61863	iommu/amd: Make use of iova queue flushing Rip out the implementation in the AMD IOMMU driver and use the one in the common iova code instead. Signed-off-by: Joerg Roedel <jroedel@suse.de>	2017-08-15 18:23:52 +02:00
Joerg Roedel	9a005a800a	iommu/iova: Add flush timer Add a timer to flush entries from the Flush-Queues every 10ms. This makes sure that no stale TLB entries remain for too long after an IOVA has been unmapped. Signed-off-by: Joerg Roedel <jroedel@suse.de>	2017-08-15 18:23:52 +02:00
Joerg Roedel	8109c2a2f8	iommu/iova: Add locking to Flush-Queues The lock is taken from the same CPU most of the time. But having it allows to flush the queue also from another CPU if necessary. This will be used by a timer to regularily flush any pending IOVAs from the Flush-Queues. Signed-off-by: Joerg Roedel <jroedel@suse.de>	2017-08-15 18:23:52 +02:00
Joerg Roedel	fb418dab8a	iommu/iova: Add flush counters to Flush-Queue implementation There are two counters: * fq_flush_start_cnt - Increased when a TLB flush is started. * fq_flush_finish_cnt - Increased when a TLB flush is finished. The fq_flush_start_cnt is assigned to every Flush-Queue entry on its creation. When freeing entries from the Flush-Queue, the value in the entry is compared to the fq_flush_finish_cnt. The entry can only be freed when its value is less than the value of fq_flush_finish_cnt. The reason for these counters it to take advantage of IOMMU TLB flushes that happened on other CPUs. These already flushed the TLB for Flush-Queue entries on other CPUs so that they can already be freed without flushing the TLB again. This makes it less likely that the Flush-Queue is full and saves IOMMU TLB flushes. Signed-off-by: Joerg Roedel <jroedel@suse.de>	2017-08-15 18:23:51 +02:00
Joerg Roedel	1928210107	iommu/iova: Implement Flush-Queue ring buffer Add a function to add entries to the Flush-Queue ring buffer. If the buffer is full, call the flush-callback and free the entries. Signed-off-by: Joerg Roedel <jroedel@suse.de>	2017-08-15 18:23:51 +02:00
Joerg Roedel	42f87e71c3	iommu/iova: Add flush-queue data structures This patch adds the basic data-structures to implement flush-queues in the generic IOVA code. It also adds the initialization and destroy routines for these data structures. The initialization routine is designed so that the use of this feature is optional for the users of IOVA code. Signed-off-by: Joerg Roedel <jroedel@suse.de>	2017-08-15 18:23:50 +02:00
Robin Murphy	da4b02750a	iommu/of: Fix of_iommu_configure() for disabled IOMMUs Sudeep reports that the logic got slightly broken when a PCI iommu-map entry targets an IOMMU marked as disabled in DT, since of_pci_map_rid() succeeds in following a phandle, and of_iommu_xlate() doesn't return an error value, but we miss checking whether ops was actually non-NULL. Whilst this could be solved with a point fix in of_pci_iommu_init(), it suggests that all the juggling of ERR_PTR values through the ops pointer is proving rather too complicated for its own good, so let's instead simplify the whole flow (with a side-effect of eliminating the cause of the bug). The fact that we now rely on iommu_fwspec means that we no longer need to pass around an iommu_ops pointer at all - we can simply propagate a regular int return value until we know whether we have a viable IOMMU, then retrieve the ops from the fwspec if and when we actually need them. This makes everything a bit more uniform and certainly easier to follow. Fixes: `d87beb7492` ("iommu/of: Handle PCI aliases properly") Reported-by: Sudeep Holla <sudeep.holla@arm.com> Tested-by: Sudeep Holla <sudeep.holla@arm.com> Signed-off-by: Robin Murphy <robin.murphy@arm.com> Signed-off-by: Joerg Roedel <jroedel@suse.de>	2017-08-15 18:23:50 +02:00
Joerg Roedel	f42c223514	iommu/s390: Add support for iommu_device handling Add support for the iommu_device_register interface to make the s390 hardware iommus visible to the iommu core and in sysfs. Acked-by: Sebastian Ott <sebott@linux.vnet.ibm.com> Signed-off-by: Joerg Roedel <jroedel@suse.de>	2017-08-15 18:22:45 +02:00
Baoquan He	20b46dff13	iommu/amd: Disable iommu only if amd_iommu=off is specified It's ok to disable iommu early in normal kernel or in kdump kernel when amd_iommu=off is specified. While we should not disable it in kdump kernel when on-flight dma is still on-going. Signed-off-by: Baoquan He <bhe@redhat.com> Signed-off-by: Joerg Roedel <jroedel@suse.de>	2017-08-15 18:14:41 +02:00
Baoquan He	daae2d25a4	iommu/amd: Don't copy GCR3 table root pointer When iommu is pre_enabled in kdump kernel, if a device is set up with guest translations (DTE.GV=1), then don't copy GCR3 table root pointer but move the device over to an empty guest-cr3 table and handle the faults in the PPR log (which answer them with INVALID). After all these PPR faults are recoverable for the device and we should not allow the device to change old-kernels data when we don't have to. Signed-off-by: Baoquan He <bhe@redhat.com> Signed-off-by: Joerg Roedel <jroedel@suse.de>	2017-08-15 18:14:41 +02:00
Baoquan He	b336781b82	iommu/amd: Allocate memory below 4G for dev table if translation pre-enabled AMD pointed out it's unsafe to update the device-table while iommu is enabled. It turns out that device-table pointer update is split up into two 32bit writes in the IOMMU hardware. So updating it while the IOMMU is enabled could have some nasty side effects. The safe way to work around this is to always allocate the device-table below 4G, including the old device-table in normal kernel and the device-table used for copying the content of the old device-table in kdump kernel. Meanwhile we need check if the address of old device-table is above 4G because it might has been touched accidentally in corrupted 1st kernel. Signed-off-by: Baoquan He <bhe@redhat.com> Signed-off-by: Joerg Roedel <jroedel@suse.de>	2017-08-15 18:14:40 +02:00
Baoquan He	df3f7a6e8e	iommu/amd: Use is_attach_deferred call-back Implement call-back is_attach_deferred and use it to defer the domain attach from iommu driver init to device driver init when iommu is pre-enabled in kdump kernel. Signed-off-by: Baoquan He <bhe@redhat.com> Signed-off-by: Joerg Roedel <jroedel@suse.de>	2017-08-15 18:14:39 +02:00
Baoquan He	e01d1913b0	iommu: Add is_attach_deferred call-back to iommu-ops This new call-back will be used to check if the domain attach need be deferred for now. If yes, the domain attach/detach will return directly. Signed-off-by: Baoquan He <bhe@redhat.com> Signed-off-by: Joerg Roedel <jroedel@suse.de>	2017-08-15 18:14:39 +02:00
Baoquan He	53019a9e88	iommu/amd: Do sanity check for address translation and irq remap of old dev table entry Firstly split the dev table entry copy into address translation part and irq remapping part. Because these two parts could be enabled independently. Secondly do sanity check for address translation and irq remap of old dev table entry separately. Signed-off-by: Baoquan He <bhe@redhat.com> Signed-off-by: Joerg Roedel <jroedel@suse.de>	2017-08-15 18:14:39 +02:00
Baoquan He	3ac3e5ee5e	iommu/amd: Copy old trans table from old kernel Here several things need be done: - If iommu is pre-enabled in a normal kernel, just disable it and print warning. - If any one of IOMMUs is not pre-enabled in kdump kernel, just continue as it does in normal kernel. - If failed to copy dev table of old kernel, continue to proceed as it does in normal kernel. - Only if all IOMMUs are pre-enabled and copy dev table is done well, free the dev table allocated in early_amd_iommu_init() and make amd_iommu_dev_table point to the copied one. - Disable and Re-enable event/cmd buffer, install the copied DTE table to reg, and detect and enable guest vapic. - Flush all caches Signed-off-by: Baoquan He <bhe@redhat.com> Signed-off-by: Joerg Roedel <jroedel@suse.de>	2017-08-15 18:14:39 +02:00
Baoquan He	45a01c4293	iommu/amd: Add function copy_dev_tables() Add function copy_dev_tables to copy the old DEV table entries of the panicked kernel to the new allocated device table. Since all iommus share the same device table the copy only need be done one time. Here add a new global old_dev_tbl_cpy to point to the newly allocated device table which the content of old device table will be copied to. Besides, we also need to: - Check whether all IOMMUs actually use the same device table with the same size - Verify that the size of the old device table is the expected size. - Reserve the old domain id occupied in 1st kernel to avoid touching the old io-page tables. Then on-flight DMA can continue looking it up. And also define MACRO DEV_DOMID_MASK to replace magic number 0xffffULL, it can be reused in copy_dev_tables(). Signed-off-by: Baoquan He <bhe@redhat.com> Signed-off-by: Joerg Roedel <jroedel@suse.de>	2017-08-15 18:14:39 +02:00
Baoquan He	07a80a6b59	iommu/amd: Define bit fields for DTE particularly In AMD-Vi spec several bits of IO PTE fields and DTE fields are similar so that both of them can share the same MACRO definition. However defining them respectively can make code more read-able. Do it now. Signed-off-by: Baoquan He <bhe@redhat.com> Signed-off-by: Joerg Roedel <jroedel@suse.de>	2017-08-15 18:14:38 +02:00
Baoquan He	9494ea90a5	Revert "iommu/amd: Suppress IO_PAGE_FAULTs in kdump kernel" This reverts commit `54bd635704`. We still need the IO_PAGE_FAULT message to warn error after the issue of on-flight dma in kdump kernel is fixed. Signed-off-by: Baoquan He <bhe@redhat.com> Signed-off-by: Joerg Roedel <jroedel@suse.de>	2017-08-15 18:14:38 +02:00
Baoquan He	78d313c611	iommu/amd: Add several helper functions Move single iommu enabling codes into a wrapper function early_enable_iommu(). This can make later kdump change easier. And also add iommu_disable_command_buffer and iommu_disable_event_buffer for later usage. Signed-off-by: Baoquan He <bhe@redhat.com> Signed-off-by: Joerg Roedel <jroedel@suse.de>	2017-08-15 18:14:38 +02:00
Baoquan He	4c232a708b	iommu/amd: Detect pre enabled translation Add functions to check whether translation is already enabled in IOMMU. Signed-off-by: Baoquan He <bhe@redhat.com> Signed-off-by: Joerg Roedel <jroedel@suse.de>	2017-08-15 18:14:38 +02:00
Stanimir Varbanov	d051f28c88	iommu/qcom: Initialize secure page table This basically gets the secure page table size, allocates memory for secure pagetables and passes the physical address to the trusted zone. Signed-off-by: Stanimir Varbanov <stanimir.varbanov@linaro.org> Signed-off-by: Rob Clark <robdclark@gmail.com> Signed-off-by: Joerg Roedel <jroedel@suse.de>	2017-08-15 17:34:49 +02:00
Rob Clark	0ae349a0f3	iommu/qcom: Add qcom_iommu An iommu driver for Qualcomm "B" family devices which do implement the ARM SMMU spec, but not in a way that is compatible with how the arm-smmu driver is designed. It seems SMMU_SCR1.GASRAE=1 so the global register space is not accessible. This means it needs to get configuration from devicetree instead of setting it up dynamically. In the end, other than register definitions, there is not much code to share with arm-smmu (other than what has already been refactored out into the pgtable helpers). Signed-off-by: Rob Clark <robdclark@gmail.com> Tested-by: Riku Voipio <riku.voipio@linaro.org> Signed-off-by: Joerg Roedel <jroedel@suse.de>	2017-08-15 17:34:49 +02:00
Rob Clark	2b03774bae	iommu/arm-smmu: Split out register defines I want to re-use some of these for qcom_iommu, which has (roughly) the same context-bank registers. Signed-off-by: Rob Clark <robdclark@gmail.com> Acked-by: Will Deacon <will.deacon@arm.com> Signed-off-by: Joerg Roedel <jroedel@suse.de>	2017-08-15 17:34:48 +02:00
Joerg Roedel	68a17f0be6	iommu/pamu: Add support for generic iommu-device This patch adds a global iommu-handle to the pamu driver and initializes it at probe time. Also link devices added to the iommu to this handle. Signed-off-by: Joerg Roedel <jroedel@suse.de>	2017-08-15 13:59:34 +02:00
Joerg Roedel	07eb6fdf49	iommu/pamu: WARN when fsl_pamu_probe() is called more than once The function probes the PAMU hardware from device-tree specifications. It initializes global variables and can thus be only safely called once. Add a check that that prints a warning when its called more than once. Signed-off-by: Joerg Roedel <jroedel@suse.de>	2017-08-15 13:59:34 +02:00
Joerg Roedel	af29d9fa41	iommu/pamu: Make driver depend on CONFIG_PHYS_64BIT Certain address calculations in the driver make the assumption that phys_addr_t and dma_addr_t are 64 bit wide. Force this by depending on CONFIG_PHYS_64BIT to be set. Signed-off-by: Joerg Roedel <jroedel@suse.de>	2017-08-15 13:59:34 +02:00
Joerg Roedel	a4d98fb306	iommu/pamu: Let PAMU depend on PCI The driver does not compile when PCI is not selected, so make it depend on it. Signed-off-by: Joerg Roedel <jroedel@suse.de>	2017-08-15 13:59:34 +02:00
Joerg Roedel	2926a2aa5c	iommu: Fix wrong freeing of iommu_device->dev The struct iommu_device has a 'struct device' embedded into it, not as a pointer, but the whole struct. In the conversion of the iommu drivers to use struct iommu_device it was forgotten that the relase function for that struct device simply calls kfree() on the pointer. This frees memory that was never allocated and causes memory corruption. To fix this issue, use a pointer to struct device instead of embedding the whole struct. This needs some updates in the iommu sysfs code as well as the Intel VT-d and AMD IOMMU driver. Reported-by: Sebastian Ott <sebott@linux.vnet.ibm.com> Fixes: `39ab9555c2` ('iommu: Add sysfs bindings for struct iommu_device') Cc: stable@vger.kernel.org # >= v4.11 Signed-off-by: Joerg Roedel <jroedel@suse.de>	2017-08-15 13:58:48 +02:00
Artem Savkov	a7990c647b	iommu/arm-smmu: fix null-pointer dereference in arm_smmu_add_device Commit `c54451a` "iommu/arm-smmu: Fix the error path in arm_smmu_add_device" removed fwspec assignment in legacy_binding path as redundant which is wrong. It needs to be updated after fwspec initialisation in arm_smmu_register_legacy_master() as it is dereferenced later. Without this there is a NULL-pointer dereference panic during boot on some hosts. Signed-off-by: Artem Savkov <asavkov@redhat.com> Reviewed-by: Robin Murphy <robin.murphy@arm.com> Acked-by: Will Deacon <will.deacon@arm.com> Signed-off-by: Joerg Roedel <jroedel@suse.de>	2017-08-11 16:56:51 +02:00
Robin Murphy	05f80300dc	iommu: Finish making iommu_group support mandatory Now that all the drivers properly implementing the IOMMU API support groups (I'm ignoring the etnaviv GPU MMUs which seemingly only do just enough to convince the ARM DMA mapping ops), we can remove the FIXME workarounds from the core code. In the process, it also seems logical to make the .device_group callback non-optional for drivers calling iommu_group_get_for_dev() - the current callers all implement it anyway, and it doesn't make sense for any future callers not to either. Signed-off-by: Robin Murphy <robin.murphy@arm.com> Signed-off-by: Joerg Roedel <jroedel@suse.de>	2017-08-10 00:03:51 +02:00
Robin Murphy	15f9a3104b	iommu/tegra-gart: Add iommu_group support As the last step to making groups mandatory, clean up the remaining drivers by adding basic support. Whilst it may not perfectly reflect the isolation capabilities of the hardware, using generic_device_group() should at least maintain existing behaviour with respect to the API. Signed-off-by: Robin Murphy <robin.murphy@arm.com> Tested-by: Dmitry Osipenko <digetx@gmail.com> Signed-off-by: Joerg Roedel <jroedel@suse.de>	2017-08-10 00:03:50 +02:00
Robin Murphy	d92e1f8498	iommu/tegra-smmu: Add iommu_group support As the last step to making groups mandatory, clean up the remaining drivers by adding basic support. Whilst it may not perfectly reflect the isolation capabilities of the hardware (tegra_smmu_swgroup sounds suspiciously like something that might warrant representing at the iommu_group level), using generic_device_group() should at least maintain existing behaviour with respect to the API. Signed-off-by: Robin Murphy <robin.murphy@arm.com> Tested-by: Mikko Perttunen <mperttunen@nvidia.com> Signed-off-by: Joerg Roedel <jroedel@suse.de>	2017-08-10 00:03:50 +02:00
Robin Murphy	ce2eb8f44e	iommu/msm: Add iommu_group support As the last step to making groups mandatory, clean up the remaining drivers by adding basic support. Whilst it may not perfectly reflect the isolation capabilities of the hardware, using generic_device_group() should at least maintain existing behaviour with respect to the API. Signed-off-by: Robin Murphy <robin.murphy@arm.com> Signed-off-by: Joerg Roedel <jroedel@suse.de>	2017-08-10 00:03:50 +02:00
Marek Szyprowski	928055a01b	iommu/exynos: Remove custom platform device registration code Commit `09515ef5dd` ("of/acpi: Configure dma operations at probe time for platform/amba/pci bus devices") postponed the moment of attaching IOMMU controller to its device, so there is no need to register IOMMU controllers very early, before all other devices in the system. This change gives us an opportunity to use standard platform device registration method also for Exynos SYSMMU controllers. Signed-off-by: Marek Szyprowski <m.szyprowski@samsung.com> Signed-off-by: Joerg Roedel <jroedel@suse.de>	2017-08-04 13:01:33 +02:00
Josue Albarran	bfee0cf0ee	iommu/omap: Use DMA-API for performing cache flushes The OMAP IOMMU driver was using ARM assembly code directly for flushing the MMU page table entries from the caches. This caused MMU faults on OMAP4 (Cortex-A9 based SoCs) as L2 caches were not handled due to the presence of a PL310 L2 Cache Controller. These faults were however not seen on OMAP5/DRA7 SoCs (Cortex-A15 based SoCs). The OMAP IOMMU driver is adapted to use the DMA Streaming API instead now to flush the page table/directory table entries from the CPU caches. This ensures that the devices always see the updated page table entries. The outer caches are now addressed automatically with the usage of the DMA API. Signed-off-by: Josue Albarran <j-albarran@ti.com> Acked-by: Suman Anna <s-anna@ti.com> Signed-off-by: Joerg Roedel <jroedel@suse.de>	2017-08-04 11:59:29 +02:00
Fernando Guzman Lugo	159d3e35da	iommu/omap: Fix disabling of MMU upon a fault The IOMMU framework lets its client users be notified on a MMU fault and allows them to either handle the interrupt by dynamic reloading of an appropriate TLB/PTE for the offending fault address or to completely restart/recovery the device and its IOMMU. The OMAP remoteproc driver performs the latter option, and does so after unwinding the previous mappings. The OMAP IOMMU fault handler however disables the MMU and cuts off the clock upon a MMU fault at present, resulting in an interconnect abort during any subsequent operation that touches the MMU registers. So, disable the IP-level fault interrupts instead of disabling the MMU, to allow continued MMU register operations as well as to avoid getting interrupted again. Signed-off-by: Fernando Guzman Lugo <fernando.lugo@ti.com> [s-anna@ti.com: add commit description] Signed-off-by: Suman Anna <s-anna@ti.com> Signed-off-by: Josue Albarran <j-albarran@ti.com> Acked-by: Suman Anna <s-anna@ti.com> Signed-off-by: Joerg Roedel <jroedel@suse.de>	2017-08-04 11:59:29 +02:00
Arnd Bergmann	db3a7fd7a9	iommu/exynos: prevent building on big-endian kernels Since we print the correct warning, an allmodconfig build is no longer clean but always prints it, which defeats compile-testing: drivers/iommu/exynos-iommu.c:58:2: error: #warning "revisit driver if we can enable big-endian ptes" [-Werror=cpp] This replaces the #warning with a dependency, moving warning text into a comment. Fixes: `1f59adb176` ("iommu/exynos: Replace non-existing big-endian Kconfig option") Signed-off-by: Arnd Bergmann <arnd@arndb.de> Reviewed-by: Krzysztof Kozlowski <krzk@kernel.org> Signed-off-by: Joerg Roedel <jroedel@suse.de>	2017-08-04 11:55:40 +02:00
Simon Xue	c3aa474249	iommu/rockchip: ignore isp mmu reset operation ISP mmu can't support reset operation, it won't get the expected result when reset, but rest functions work normally. Add this patch as a WA for this issue. Signed-off-by: Simon Xue <xxm@rock-chips.com> Signed-off-by: Joerg Roedel <jroedel@suse.de>	2017-07-27 14:15:21 +02:00
Simon Xue	03f732f890	iommu/rockchip: add multi irqs support RK3368 vpu mmu have two irqs, this patch support multi irqs Signed-off-by: Simon Xue <xxm@rock-chips.com> Signed-off-by: Joerg Roedel <jroedel@suse.de>	2017-07-27 14:15:20 +02:00
Joerg Roedel	74ddda71f4	iommu/amd: Fix schedule-while-atomic BUG in initialization code The register_syscore_ops() function takes a mutex and might sleep. In the IOMMU initialization code it is invoked during irq-remapping setup already, where irqs are disabled. This causes a schedule-while-atomic bug: BUG: sleeping function called from invalid context at kernel/locking/mutex.c:747 in_atomic(): 0, irqs_disabled(): 1, pid: 1, name: swapper/0 no locks held by swapper/0/1. irq event stamp: 304 hardirqs last enabled at (303): [<ffffffff818a87b6>] _raw_spin_unlock_irqrestore+0x36/0x60 hardirqs last disabled at (304): [<ffffffff8235d440>] enable_IR_x2apic+0x79/0x196 softirqs last enabled at (36): [<ffffffff818ae75f>] __do_softirq+0x35f/0x4ec softirqs last disabled at (31): [<ffffffff810c1955>] irq_exit+0x105/0x120 CPU: 0 PID: 1 Comm: swapper/0 Not tainted 4.13.0-rc2.1.el7a.test.x86_64.debug #1 Hardware name: PowerEdge C6145 /040N24, BIOS 3.5.0 10/28/2014 Call Trace: dump_stack+0x85/0xca ___might_sleep+0x22a/0x260 __might_sleep+0x4a/0x80 __mutex_lock+0x58/0x960 ? iommu_completion_wait.part.17+0xb5/0x160 ? register_syscore_ops+0x1d/0x70 ? iommu_flush_all_caches+0x120/0x150 mutex_lock_nested+0x1b/0x20 register_syscore_ops+0x1d/0x70 state_next+0x119/0x910 iommu_go_to_state+0x29/0x30 amd_iommu_enable+0x13/0x23 Fix it by moving the register_syscore_ops() call to the next initialization step, which runs with irqs enabled. Reported-by: Artem Savkov <asavkov@redhat.com> Tested-by: Artem Savkov <asavkov@redhat.com> Acked-by: Thomas Gleixner <tglx@linutronix.de> Fixes: `2c0ae1720c` ('iommu/amd: Convert iommu initialization to state machine') Signed-off-by: Joerg Roedel <jroedel@suse.de>	2017-07-26 15:39:14 +02:00
Rob Herring	6bd4f1c754	iommu: Convert to using %pOF instead of full_name Now that we have a custom printf format specifier, convert users of full_name to use %pOF instead. This is preparation to remove storing of the full path string for each node. Signed-off-by: Rob Herring <robh@kernel.org> Cc: Joerg Roedel <joro@8bytes.org> Cc: Heiko Stuebner <heiko@sntech.de> Cc: iommu@lists.linux-foundation.org Cc: linux-arm-kernel@lists.infradead.org Cc: linux-rockchip@lists.infradead.org Reviewed-by: Heiko Stuebner <heiko@sntech.de> Signed-off-by: Joerg Roedel <jroedel@suse.de>	2017-07-26 13:01:37 +02:00
Robin Murphy	02dd44caec	iommu/ipmmu-vmsa: Clean up device tracking Get rid of now unused device tracking code. Future code should instead be able to use driver_for_each_device() for this purpose. This is a simplified version of the following patch from Robin [PATCH] iommu/ipmmu-vmsa: Clean up group allocation Signed-off-by: Robin Murphy <robin.murphy@arm.com> Signed-off-by: Magnus Damm <damm+renesas@opensource.se> Signed-off-by: Joerg Roedel <jroedel@suse.de>	2017-07-26 12:45:57 +02:00
Magnus Damm	7b2d59611f	iommu/ipmmu-vmsa: Replace local utlb code with fwspec ids Now when both 32-bit and 64-bit code inside the driver is using fwspec it is possible to replace the utlb handling with fwspec ids that get populated from ->of_xlate(). Suggested-by: Robin Murphy <robin.murphy@arm.com> Signed-off-by: Magnus Damm <damm+renesas@opensource.se> Signed-off-by: Joerg Roedel <jroedel@suse.de>	2017-07-26 12:45:56 +02:00
Robin Murphy	3c49ed322b	iommu/ipmmu-vmsa: Use fwspec on both 32 and 64-bit ARM Consolidate the 32-bit and 64-bit code to make use of fwspec instead of archdata for the 32-bit ARM case. This is a simplified version of the fwspec handling code from Robin posted as [PATCH] iommu/ipmmu-vmsa: Convert to iommu_fwspec Signed-off-by: Robin Murphy <robin.murphy@arm.com> Signed-off-by: Magnus Damm <damm+renesas@opensource.se> Signed-off-by: Joerg Roedel <jroedel@suse.de>	2017-07-26 12:45:56 +02:00
Magnus Damm	49558da030	iommu/ipmmu-vmsa: Consistent ->of_xlate() handling The 32-bit ARM code gets updated to make use of ->of_xlate() and the code is shared between 64-bit and 32-bit ARM. The of_device_is_available() check gets dropped since it is included in of_iommu_xlate(). Suggested-by: Robin Murphy <robin.murphy@arm.com> Signed-off-by: Magnus Damm <damm+renesas@opensource.se> Signed-off-by: Joerg Roedel <jroedel@suse.de>	2017-07-26 12:45:55 +02:00
Magnus Damm	01da21e562	iommu/ipmmu-vmsa: Use iommu_device_register()/unregister() Extend the driver to make use of iommu_device_register()/unregister() functions together with iommu_device_set_ops() and iommu_set_fwnode(). These used to be part of the earlier posted 64-bit ARM (r8a7795) series but it turns out that these days they are required on 32-bit ARM as well. Signed-off-by: Magnus Damm <damm+renesas@opensource.se> Signed-off-by: Joerg Roedel <jroedel@suse.de>	2017-07-26 12:45:55 +02:00
Krzysztof Kozlowski	1f59adb176	iommu/exynos: Replace non-existing big-endian Kconfig option Wrong Kconfig option was used when adding warning for untested big-endian capabilities. There is no CONFIG_BIG_ENDIAN option. Signed-off-by: Krzysztof Kozlowski <krzk@kernel.org> Signed-off-by: Joerg Roedel <jroedel@suse.de>	2017-07-26 11:54:31 +02:00
David Dillow	bc24c57159	iommu/vt-d: Don't free parent pagetable of the PTE we're adding When adding a large scatterlist entry that covers more than the L3 superpage size (1GB) but has an alignment such that we must use L2 superpages (2MB) , we give dma_pte_free_level() a range that causes it to free the L3 pagetable we're about to populate. We fix this by telling dma_pte_free_pagetable() about the pagetable level we're about to populate to prevent freeing it. For example, mapping a scatterlist with entry lengths 854MB and 1194MB at IOVA 0xffff80000000 would, when processing the 2MB-aligned second entry, cause pfn_to_dma_pte() to create a L3 directory to hold L2 superpages for the mapping at IOVA 0xffffc0000000. We would previously call dma_pte_free_pagetable(domain, 0xffffc0000, 0xfffffffff), which would free the L3 directory pfn_to_dma_pte() just created for IO PFN 0xffffc0000. Telling dma_pte_free_pagetable() to retain the L3 directories while using L2 superpages avoids the erroneous free. Signed-off-by: David Dillow <dillow@google.com> Signed-off-by: Joerg Roedel <jroedel@suse.de>	2017-07-26 11:12:58 +02:00
Robin Murphy	d87beb7492	iommu/of: Handle PCI aliases properly When a PCI device has DMA quirks, we need to ensure that an upstream IOMMU knows about all possible aliases, since the presence of a DMA quirk does not preclude the device still also emitting transactions (e.g. MSIs) on its 'real' RID. Similarly, the rules for bridge aliasing are relatively complex, and some bridges may only take ownership of transactions under particular transient circumstances, leading again to multiple RIDs potentially being seen at the IOMMU for the given device. Take all this into account in the OF code by translating every RID produced by the alias walk, not just whichever one comes out last. Happily, this also makes things tidy enough that we can reduce the number of both total lines of code, and confusing levels of indirection, by pulling the "iommus"/"iommu-map" parsing helpers back in-line again. Signed-off-by: Robin Murphy <robin.murphy@arm.com> Signed-off-by: Joerg Roedel <jroedel@suse.de>	2017-07-26 11:11:57 +02:00
Suravee Suthikulpanit	efe6f24160	iommu/amd: Enable ga_log_intr when enabling guest_mode IRTE[GALogIntr] bit should set when enabling guest_mode, which enables IOMMU to generate entry in GALog when IRTE[IsRun] is not set, and send an interrupt to notify IOMMU driver. Signed-off-by: Suravee Suthikulpanit <suravee.suthikulpanit@amd.com> Cc: Joerg Roedel <jroedel@suse.de> Cc: stable@vger.kernel.org # v4.9+ Fixes: `d98de49a53` ('iommu/amd: Enable vAPIC interrupt remapping mode by default') Signed-off-by: Joerg Roedel <jroedel@suse.de>	2017-07-25 15:03:55 +02:00
Robin Murphy	7655739143	iommu/io-pgtable: Sanitise map/unmap addresses It may be an egregious error to attempt to use addresses outside the range of the pagetable format, but that still doesn't mean we should merrily wreak havoc by silently mapping/unmapping whatever truncated portions of them might happen to correspond to real addresses. Add some up-front checks to sanitise our inputs so that buggy callers don't invite potential memory corruption. Signed-off-by: Robin Murphy <robin.murphy@arm.com> Signed-off-by: Will Deacon <will.deacon@arm.com>	2017-07-20 10:30:28 +01:00
Vivek Gautam	c54451a5e2	iommu/arm-smmu: Fix the error path in arm_smmu_add_device fwspec->iommu_priv is available only after arm_smmu_master_cfg instance has been allocated. We shouldn't free it before that. Also it's logical to free the master cfg itself without checking for fwspec. Signed-off-by: Vivek Gautam <vivek.gautam@codeaurora.org> [will: remove redundant assignment to fwspec] Signed-off-by: Will Deacon <will.deacon@arm.com>	2017-07-20 10:30:28 +01:00
Robin Murphy	2984f7f3bb	Revert "iommu/io-pgtable: Avoid redundant TLB syncs" The tlb_sync_pending flag was necessary for correctness in the Mediatek M4U driver, but since it offered a small theoretical optimisation for all io-pgtable users it was implemented as a high-level thing. However, now that some users may not be using a synchronising lock, there are several ways this flag can go wrong for them, and at worst it could result in incorrect behaviour. Since we've addressed the correctness issue within the Mediatek driver itself, and fixing the optimisation aspect to be concurrency-safe would be quite a headache (and impose extra overhead on every operation for the sake of slightly helping one case which will virtually never happen in typical usage), let's just retire it. This reverts commit `88492a4700`. Signed-off-by: Robin Murphy <robin.murphy@arm.com> Signed-off-by: Will Deacon <will.deacon@arm.com>	2017-07-20 10:30:27 +01:00
Robin Murphy	98a8f63e56	iommu/mtk: Avoid redundant TLB syncs locally Under certain circumstances, the io-pgtable code may end up issuing two TLB sync operations without any intervening invalidations. This goes badly for the M4U hardware, since it means the second sync ends up polling for a non-existent operation to finish, and as a result times out and warns. The io_pgtable_tlb_* helpers implement a high-level optimisation to avoid issuing the second sync at all in such cases, but in order to work correctly that requires all pagetable operations to be serialised under a lock, thus is no longer applicable to all io-pgtable users. Since we're the only user actually relying on this flag for correctness, let's reimplement it locally to avoid the headache of trying to make the high-level version concurrency-safe for other users. CC: Yong Wu <yong.wu@mediatek.com> CC: Matthias Brugger <matthias.bgg@gmail.com> Tested-by: Yong Wu <yong.wu@mediatek.com> Signed-off-by: Robin Murphy <robin.murphy@arm.com> Signed-off-by: Will Deacon <will.deacon@arm.com>	2017-07-20 10:30:27 +01:00
Will Deacon	8e517e762a	iommu/arm-smmu: Reintroduce locking around TLB sync operations Commit `523d7423e2` ("iommu/arm-smmu: Remove io-pgtable spinlock") removed the locking used to serialise map/unmap calls into the io-pgtable code from the ARM SMMU driver. This is good for performance, but opens us up to a nasty race with TLB syncs because the TLB sync register is shared within a context bank (or even globally for stage-2 on SMMUv1). There are two cases to consider: 1. A CPU can be spinning on the completion of a TLB sync, take an interrupt which issues a subsequent TLB sync, and then report a timeout on return from the interrupt. 2. A CPU can be spinning on the completion of a TLB sync, but other CPUs can continuously issue additional TLB syncs in such a way that the backoff logic reports a timeout. Rather than fix this by spinning for completion of prior TLB syncs before issuing a new one (which may suffer from fairness issues on large systems), instead reintroduce locking around TLB sync operations in the ARM SMMU driver. Fixes: `523d7423e2` ("iommu/arm-smmu: Remove io-pgtable spinlock") Cc: Robin Murphy <robin.murphy@arm.com> Reported-by: Ray Jui <ray.jui@broadcom.com> Tested-by: Ray Jui <ray.jui@broadcom.com> Signed-off-by: Will Deacon <will.deacon@arm.com>	2017-07-20 10:30:26 +01:00
Tom Lendacky	2543a786aa	iommu/amd: Allow the AMD IOMMU to work with memory encryption The IOMMU is programmed with physical addresses for the various tables and buffers that are used to communicate between the device and the driver. When the driver allocates this memory it is encrypted. In order for the IOMMU to access the memory as encrypted the encryption mask needs to be included in these physical addresses during configuration. The PTE entries created by the IOMMU should also include the encryption mask so that when the device behind the IOMMU performs a DMA, the DMA will be performed to encrypted memory. Signed-off-by: Tom Lendacky <thomas.lendacky@amd.com> Reviewed-by: Thomas Gleixner <tglx@linutronix.de> Reviewed-by: Borislav Petkov <bp@suse.de> Acked-by: Joerg Roedel <jroedel@suse.de> Cc: <iommu@lists.linux-foundation.org> Cc: Alexander Potapenko <glider@google.com> Cc: Andrey Ryabinin <aryabinin@virtuozzo.com> Cc: Andy Lutomirski <luto@kernel.org> Cc: Arnd Bergmann <arnd@arndb.de> Cc: Borislav Petkov <bp@alien8.de> Cc: Brijesh Singh <brijesh.singh@amd.com> Cc: Dave Young <dyoung@redhat.com> Cc: Dmitry Vyukov <dvyukov@google.com> Cc: Jonathan Corbet <corbet@lwn.net> Cc: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com> Cc: Larry Woodman <lwoodman@redhat.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Matt Fleming <matt@codeblueprint.co.uk> Cc: Michael S. Tsirkin <mst@redhat.com> Cc: Paolo Bonzini <pbonzini@redhat.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Radim Krčmář <rkrcmar@redhat.com> Cc: Rik van Riel <riel@redhat.com> Cc: Toshimitsu Kani <toshi.kani@hpe.com> Cc: kasan-dev@googlegroups.com Cc: kvm@vger.kernel.org Cc: linux-arch@vger.kernel.org Cc: linux-doc@vger.kernel.org Cc: linux-efi@vger.kernel.org Cc: linux-mm@kvack.org Link: http://lkml.kernel.org/r/3053631ea25ba8b1601c351cb7c541c496f6d9bc.1500319216.git.thomas.lendacky@amd.com Signed-off-by: Ingo Molnar <mingo@kernel.org>	2017-07-18 11:38:03 +02:00
Linus Torvalds	fb4e3beeff	IOMMU Updates for Linux v4.13 This update comes with: * Support for lockless operation in the ARM io-pgtable code. This is an important step to solve the scalability problems in the common dma-iommu code for ARM * Some Errata workarounds for ARM SMMU implemenations * Rewrite of the deferred IO/TLB flush code in the AMD IOMMU driver. The code suffered from very high flush rates, with the new implementation the flush rate is down to ~1% of what it was before * Support for amd_iommu=off when booting with kexec. Problem here was that the IOMMU driver bailed out early without disabling the iommu hardware, if it was enabled in the old kernel * The Rockchip IOMMU driver is now available on ARM64 * Align the return value of the iommu_ops->device_group call-backs to not miss error values * Preempt-disable optimizations in the Intel VT-d and common IOVA code to help Linux-RT * Various other small cleanups and fixes -----BEGIN PGP SIGNATURE----- Version: GnuPG v2 iQIcBAABAgAGBQJZZgddAAoJECvwRC2XARrjurgQANO338GIBr2ZkA0oectidDpZ Y4yu7W9RH6NyhupJG/Xooya7daBWFjbaA1AVJ3ZZNlMERh69AmehVfRUfVMzF2w+ buma58HQgiJWN1zFD8xdeMzYKms9P77whA88C/9QvrK/klB3LipWP2SC0yvvvyxJ mMCDpgt+D+CGnIDqbRuyLDQoRu3yjAkAvYb6OzL8DPJVP1Y5oLffGwGnHzJbJnOf eWJwYHM5ai0uF/Qqy6RNNekacObjVaOLihjugGvokH6ipXfOrSSNriXW9pZiWR5m S91898YTP3KuWWsJM+N93UAjvc6pL9PqL/OvbB9zdYpzu+5PtUpFXHYcOebKyEEO 4j9CaRzubsWFTFjbYItJnR4WgXQRf4NKOGfTfHMHA+dY8aODYnlXNVdQDAA2aFgn TUBvHq5xb0zZ3nbPwtTDyW06oDMVfBBarLx2yFI1aQSSh+eg/GtIi5KP28gyFZNz 4gWj0q3g/e3y7WEwNbYV7L3TS0d/p8VUYFtUp7PUCddnWoY+4cJzgidub5xIViZD Ql0nZzga9pXXIE/kE5Pf74WqrG7JJzZsvK2ABy4+XGrMq6RclJf+0pXbSqiXDpXL quw8t0oXw0ZEeavQ31Za8mjXBvo5ocM5iintl1wrl2BujHEO3oKqbGsIOaRcLnlN Ukehbl4OEKzZpD3oLPPk =pmBf -----END PGP SIGNATURE----- Merge tag 'iommu-updates-v4.13' of git://git.kernel.org/pub/scm/linux/kernel/git/joro/iommu Pull IOMMU updates from Joerg Roedel: "This update comes with: - Support for lockless operation in the ARM io-pgtable code. This is an important step to solve the scalability problems in the common dma-iommu code for ARM - Some Errata workarounds for ARM SMMU implemenations - Rewrite of the deferred IO/TLB flush code in the AMD IOMMU driver. The code suffered from very high flush rates, with the new implementation the flush rate is down to ~1% of what it was before - Support for amd_iommu=off when booting with kexec. The problem here was that the IOMMU driver bailed out early without disabling the iommu hardware, if it was enabled in the old kernel - The Rockchip IOMMU driver is now available on ARM64 - Align the return value of the iommu_ops->device_group call-backs to not miss error values - Preempt-disable optimizations in the Intel VT-d and common IOVA code to help Linux-RT - Various other small cleanups and fixes" * tag 'iommu-updates-v4.13' of git://git.kernel.org/pub/scm/linux/kernel/git/joro/iommu: (60 commits) iommu/vt-d: Constify intel_dma_ops iommu: Warn once when device_group callback returns NULL iommu/omap: Return ERR_PTR in device_group call-back iommu: Return ERR_PTR() values from device_group call-backs iommu/s390: Use iommu_group_get_for_dev() in s390_iommu_add_device() iommu/vt-d: Don't disable preemption while accessing deferred_flush() iommu/iova: Don't disable preempt around this_cpu_ptr() iommu/arm-smmu-v3: Add workaround for Cavium ThunderX2 erratum #126 iommu/arm-smmu-v3: Enable ACPI based HiSilicon CMD_PREFETCH quirk(erratum 161010701) iommu/arm-smmu-v3: Add workaround for Cavium ThunderX2 erratum #74 ACPI/IORT: Fixup SMMUv3 resource size for Cavium ThunderX2 SMMUv3 model iommu/arm-smmu-v3, acpi: Add temporary Cavium SMMU-V3 IORT model number definitions iommu/io-pgtable-arm: Use dma_wmb() instead of wmb() when publishing table iommu/io-pgtable: depend on !GENERIC_ATOMIC64 when using COMPILE_TEST with LPAE iommu/arm-smmu-v3: Remove io-pgtable spinlock iommu/arm-smmu: Remove io-pgtable spinlock iommu/io-pgtable-arm-v7s: Support lockless operation iommu/io-pgtable-arm: Support lockless operation iommu/io-pgtable: Introduce explicit coherency iommu/io-pgtable-arm-v7s: Refactor split_blk_unmap ...	2017-07-12 10:00:04 -07:00
Linus Torvalds	f72e24a124	This is the first pull request for the new dma-mapping subsystem In this new subsystem we'll try to properly maintain all the generic code related to dma-mapping, and will further consolidate arch code into common helpers. This pull request contains: - removal of the DMA_ERROR_CODE macro, replacing it with calls to ->mapping_error so that the dma_map_ops instances are more self contained and can be shared across architectures (me) - removal of the ->set_dma_mask method, which duplicates the ->dma_capable one in terms of functionality, but requires more duplicate code. - various updates for the coherent dma pool and related arm code (Vladimir) - various smaller cleanups (me) -----BEGIN PGP SIGNATURE----- iQI/BAABCAApFiEEgdbnc3r/njty3Iq9D55TZVIEUYMFAlldmw0LHGhjaEBsc3Qu ZGUACgkQD55TZVIEUYOiKA/+Ln1mFLSf3nfTzIHa24Bbk8ZTGr0B8TD4Vmyyt8iG oO3AeaTLn3d6ugbH/uih/tPz8PuyXsdiTC1rI/ejDMiwMTSjW6phSiIHGcStSR9X VFNhmMFacp7QpUpvxceV0XZYKDViAoQgHeGdp3l+K5h/v4AYePV/v/5RjQPaEyOh YLbCzETO+24mRWdJxdAqtTW4ovYhzj6XsiJ+pAjlV0+SWU6m5L5E+VAPNi1vqv1H 1O2KeCFvVYEpcnfL3qnkw2timcjmfCfeFAd9mCUAc8mSRBfs3QgDTKw3XdHdtRml LU2WuA5cpMrOdBO4mVra2plo8E2szvpB1OZZXoKKdCpK3VGwVpVHcTvClK2Ks/3B GDLieroEQNu2ZIUIdWXf/g2x6le3BcC9MmpkAhnGPqCZ7skaIBO5Cjpxm0zTJAPl PPY3CMBBEktAvys6DcudOYGixNjKUuAm5lnfpcfTEklFdG0AjhdK/jZOplAFA6w4 LCiy0rGHM8ZbVAaFxbYoFCqgcjnv6EjSiqkJxVI4fu/Q7v9YXfdPnEmE0PJwCVo5 +i7aCLgrYshTdHr/F3e5EuofHN3TDHwXNJKGh/x97t+6tt326QMvDKX059Kxst7R rFukGbrYvG8Y7yXwrSDbusl443ta0Ht7T1oL4YUoJTZp0nScAyEluDTmrH1JVCsT R4o= =0Fso -----END PGP SIGNATURE----- Merge tag 'dma-mapping-4.13' of git://git.infradead.org/users/hch/dma-mapping Pull dma-mapping infrastructure from Christoph Hellwig: "This is the first pull request for the new dma-mapping subsystem In this new subsystem we'll try to properly maintain all the generic code related to dma-mapping, and will further consolidate arch code into common helpers. This pull request contains: - removal of the DMA_ERROR_CODE macro, replacing it with calls to ->mapping_error so that the dma_map_ops instances are more self contained and can be shared across architectures (me) - removal of the ->set_dma_mask method, which duplicates the ->dma_capable one in terms of functionality, but requires more duplicate code. - various updates for the coherent dma pool and related arm code (Vladimir) - various smaller cleanups (me)" * tag 'dma-mapping-4.13' of git://git.infradead.org/users/hch/dma-mapping: (56 commits) ARM: dma-mapping: Remove traces of NOMMU code ARM: NOMMU: Set ARM_DMA_MEM_BUFFERABLE for M-class cpus ARM: NOMMU: Introduce dma operations for noMMU drivers: dma-mapping: allow dma_common_mmap() for NOMMU drivers: dma-coherent: Introduce default DMA pool drivers: dma-coherent: Account dma_pfn_offset when used with device tree dma: Take into account dma_pfn_offset dma-mapping: replace dmam_alloc_noncoherent with dmam_alloc_attrs dma-mapping: remove dmam_free_noncoherent crypto: qat - avoid an uninitialized variable warning au1100fb: remove a bogus dma_free_nonconsistent call MAINTAINERS: add entry for dma mapping helpers powerpc: merge __dma_set_mask into dma_set_mask dma-mapping: remove the set_dma_mask method powerpc/cell: use the dma_supported method for ops switching powerpc/cell: clean up fixed mapping dma_ops initialization tile: remove dma_supported and mapping_error methods xen-swiotlb: remove xen_swiotlb_set_dma_mask arm: implement ->dma_supported instead of ->set_dma_mask mips/loongson64: implement ->dma_supported instead of ->set_dma_mask ...	2017-07-06 19:20:54 -07:00
Linus Torvalds	03ffbcdd78	Merge branch 'irq-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull irq updates from Thomas Gleixner: "The irq department delivers: - Expand the generic infrastructure handling the irq migration on CPU hotplug and convert X86 over to it. (Thomas Gleixner) Aside of consolidating code this is a preparatory change for: - Finalizing the affinity management for multi-queue devices. The main change here is to shut down interrupts which are affine to a outgoing CPU and reenabling them when the CPU comes online again. That avoids moving interrupts pointlessly around and breaking and reestablishing affinities for no value. (Christoph Hellwig) Note: This contains also the BLOCK-MQ and NVME changes which depend on the rework of the irq core infrastructure. Jens acked them and agreed that they should go with the irq changes. - Consolidation of irq domain code (Marc Zyngier) - State tracking consolidation in the core code (Jeffy Chen) - Add debug infrastructure for hierarchical irq domains (Thomas Gleixner) - Infrastructure enhancement for managing generic interrupt chips via devmem (Bartosz Golaszewski) - Constification work all over the place (Tobias Klauser) - Two new interrupt controller drivers for MVEBU (Thomas Petazzoni) - The usual set of fixes, updates and enhancements all over the place" * 'irq-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (112 commits) irqchip/or1k-pic: Fix interrupt acknowledgement irqchip/irq-mvebu-gicp: Allocate enough memory for spi_bitmap irqchip/gic-v3: Fix out-of-bound access in gic_set_affinity nvme: Allocate queues for all possible CPUs blk-mq: Create hctx for each present CPU blk-mq: Include all present CPUs in the default queue mapping genirq: Avoid unnecessary low level irq function calls genirq: Set irq masked state when initializing irq_desc genirq/timings: Add infrastructure for estimating the next interrupt arrival time genirq/timings: Add infrastructure to track the interrupt timings genirq/debugfs: Remove pointless NULL pointer check irqchip/gic-v3-its: Don't assume GICv3 hardware supports 16bit INTID irqchip/gic-v3-its: Add ACPI NUMA node mapping irqchip/gic-v3-its-platform-msi: Make of_device_ids const irqchip/gic-v3-its: Make of_device_ids const irqchip/irq-mvebu-icu: Add new driver for Marvell ICU irqchip/irq-mvebu-gicp: Add new driver for Marvell GICP dt-bindings/interrupt-controller: Add DT binding for the Marvell ICU genirq/irqdomain: Remove auto-recursive hierarchy support irqchip/MSI: Use irq_domain_update_bus_token instead of an open coded access ...	2017-07-03 16:50:31 -07:00
Linus Torvalds	9bd42183b9	Merge branch 'sched-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull scheduler updates from Ingo Molnar: "The main changes in this cycle were: - Add the SYSTEM_SCHEDULING bootup state to move various scheduler debug checks earlier into the bootup. This turns silent and sporadically deadly bugs into nice, deterministic splats. Fix some of the splats that triggered. (Thomas Gleixner) - A round of restructuring and refactoring of the load-balancing and topology code (Peter Zijlstra) - Another round of consolidating ~20 of incremental scheduler code history: this time in terms of wait-queue nomenclature. (I didn't get much feedback on these renaming patches, and we can still easily change any names I might have misplaced, so if anyone hates a new name, please holler and I'll fix it.) (Ingo Molnar) - sched/numa improvements, fixes and updates (Rik van Riel) - Another round of x86/tsc scheduler clock code improvements, in hope of making it more robust (Peter Zijlstra) - Improve NOHZ behavior (Frederic Weisbecker) - Deadline scheduler improvements and fixes (Luca Abeni, Daniel Bristot de Oliveira) - Simplify and optimize the topology setup code (Lauro Ramos Venancio) - Debloat and decouple scheduler code some more (Nicolas Pitre) - Simplify code by making better use of llist primitives (Byungchul Park) - ... plus other fixes and improvements" * 'sched-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (103 commits) sched/cputime: Refactor the cputime_adjust() code sched/debug: Expose the number of RT/DL tasks that can migrate sched/numa: Hide numa_wake_affine() from UP build sched/fair: Remove effective_load() sched/numa: Implement NUMA node level wake_affine() sched/fair: Simplify wake_affine() for the single socket case sched/numa: Override part of migrate_degrades_locality() when idle balancing sched/rt: Move RT related code from sched/core.c to sched/rt.c sched/deadline: Move DL related code from sched/core.c to sched/deadline.c sched/cpuset: Only offer CONFIG_CPUSETS if SMP is enabled sched/fair: Spare idle load balancing on nohz_full CPUs nohz: Move idle balancer registration to the idle path sched/loadavg: Generalize "_idle" naming to "_nohz" sched/core: Drop the unused try_get_task_struct() helper function sched/fair: WARN() and refuse to set buddy when !se->on_rq sched/debug: Fix SCHED_WARN_ON() to return a value on !CONFIG_SCHED_DEBUG as well sched/wait: Disambiguate wq_entry->task_list and wq_head->task_list naming sched/wait: Move bit_wait_table[] and related functionality from sched/core.c to sched/wait_bit.c sched/wait: Split out the wait_bit*() APIs from <linux/wait.h> into <linux/wait_bit.h> sched/wait: Re-adjust macro line continuation backslashes in <linux/wait.h> ...	2017-07-03 13:08:04 -07:00
Linus Torvalds	81e3e04489	UUID/GUID updates: - introduce the new uuid_t/guid_t types that are going to replace the somewhat confusing uuid_be/uuid_le types and make the terminology fit the various specs, as well as the userspace libuuid library. (me, based on a previous version from Amir) - consolidated generic uuid/guid helper functions lifted from XFS and libnvdimm (Amir and me) - conversions to the new types and helpers (Amir, Andy and me) -----BEGIN PGP SIGNATURE----- iQI/BAABCAApFiEEgdbnc3r/njty3Iq9D55TZVIEUYMFAllZfmILHGhjaEBsc3Qu ZGUACgkQD55TZVIEUYMvyg/9EvWHOOsSdeDykCK3KdH2uIqnxwpl+m7ljccaGJIc MmaH0KnsP9p/Cuw5hESh2tYlmCYN7pmYziNXpf/LRS65/HpEYbs4oMqo8UQsN0UM 2IXHfXY0HnCoG5OixH8RNbFTkxuGphsTY8meaiDr6aAmqChDQI2yGgQLo3WM2/Qe R9N1KoBWH/bqY6dHv+urlFwtsREm2fBH+8ovVma3TO73uZCzJGLJBWy3anmZN+08 uYfdbLSyRN0T8rqemVdzsZ2SrpHYkIsYGUZV43F581vp8e/3OKMoMxpWRRd9fEsa MXmoaHcLJoBsyVSFR9lcx3axKrhAgBPZljASbbA0h49JneWXrzghnKBQZG2SnEdA ktHQ2sE4Yb5TZSvvWEKMQa3kXhEfIbTwgvbHpcDr5BUZX8WvEw2Zq8e7+Mi4+KJw QkvFC1S96tRYO2bxdJX638uSesGUhSidb+hJ/edaOCB/GK+sLhUdDTJgwDpUGmyA xVXTF51ramRS2vhlbzN79x9g33igIoNnG4/PV0FPvpCTSqxkHmPc5mK6Vals1lqt cW6XfUjSQECq5nmTBtYDTbA/T+8HhBgSQnrrvmferjJzZUFGr/7MXl+Evz2x4CjX OBQoAMu241w6Vp3zoXqxzv+muZ/NLar52M/zbi9TUjE0GvvRNkHvgCC4NmpIlWYJ Sxg= =J/4P -----END PGP SIGNATURE----- Merge tag 'uuid-for-4.13' of git://git.infradead.org/users/hch/uuid Pull uuid subsystem from Christoph Hellwig: "This is the new uuid subsystem, in which Amir, Andy and I have started consolidating our uuid/guid helpers and improving the types used for them. Note that various other subsystems have pulled in this tree, so I'd like it to go in early. UUID/GUID summary: - introduce the new uuid_t/guid_t types that are going to replace the somewhat confusing uuid_be/uuid_le types and make the terminology fit the various specs, as well as the userspace libuuid library. (me, based on a previous version from Amir) - consolidated generic uuid/guid helper functions lifted from XFS and libnvdimm (Amir and me) - conversions to the new types and helpers (Amir, Andy and me)" * tag 'uuid-for-4.13' of git://git.infradead.org/users/hch/uuid: (34 commits) ACPI: hns_dsaf_acpi_dsm_guid can be static mmc: sdhci-pci: make guid intel_dsm_guid static uuid: Take const on input of uuid_is_null() and guid_is_null() thermal: int340x_thermal: fix compile after the UUID API switch thermal: int340x_thermal: Switch to use new generic UUID API acpi: always include uuid.h ACPI: Switch to use generic guid_t in acpi_evaluate_dsm() ACPI / extlog: Switch to use new generic UUID API ACPI / bus: Switch to use new generic UUID API ACPI / APEI: Switch to use new generic UUID API acpi, nfit: Switch to use new generic UUID API MAINTAINERS: add uuid entry tmpfs: generate random sb->s_uuid scsi_debug: switch to uuid_t nvme: switch to uuid_t sysctl: switch to use uuid_t partitions/ldm: switch to use uuid_t overlayfs: use uuid_t instead of uuid_be fs: switch ->s_uuid to uuid_t ima/policy: switch to use uuid_t ...	2017-07-03 09:55:26 -07:00
Christoph Hellwig	5860acc1a9	x86: remove arch specific dma_supported implementation And instead wire it up as method for all the dma_map_ops instances. Note that this also means the arch specific check will be fully instead of partially applied in the AMD iommu driver. Signed-off-by: Christoph Hellwig <hch@lst.de>	2017-06-28 06:54:46 -07:00
Christoph Hellwig	a869572c31	iommu/amd: implement ->mapping_error DMA_ERROR_CODE is going to go away, so don't rely on it. Signed-off-by: Christoph Hellwig <hch@lst.de>	2017-06-28 06:54:30 -07:00
Joerg Roedel	6a7086431f	Merge branches 'iommu/fixes', 'arm/rockchip', 'arm/renesas', 'arm/smmu', 'arm/core', 'x86/vt-d', 'x86/amd', 's390' and 'core' into next	2017-06-28 14:45:02 +02:00
Suravee Suthikulpanit	84a21dbdef	iommu/amd: Fix interrupt remapping when disable guest_mode Pass-through devices to VM guest can get updated IRQ affinity information via irq_set_affinity() when not running in guest mode. Currently, AMD IOMMU driver in GA mode ignores the updated information if the pass-through device is setup to use vAPIC regardless of guest_mode. This could cause invalid interrupt remapping. Also, the guest_mode bit should be set and cleared only when SVM updates posted-interrupt interrupt remapping information. Signed-off-by: Suravee Suthikulpanit <suravee.suthikulpanit@amd.com> Cc: Joerg Roedel <jroedel@suse.de> Fixes: `d98de49a53` ('iommu/amd: Enable vAPIC interrupt remapping mode by default') Signed-off-by: Joerg Roedel <jroedel@suse.de>	2017-06-28 14:44:56 +02:00
Arvind Yadav	01e1932a17	iommu/vt-d: Constify intel_dma_ops Most dma_map_ops structures are never modified. Constify these structures such that these can be write-protected. Signed-off-by: Arvind Yadav <arvind.yadav.cs@gmail.com> Signed-off-by: Joerg Roedel <jroedel@suse.de>	2017-06-28 14:17:12 +02:00
Joerg Roedel	72dcac6334	iommu: Warn once when device_group callback returns NULL This callback should never return NULL. Print a warning if that happens so that we notice and can fix it. Signed-off-by: Joerg Roedel <jroedel@suse.de>	2017-06-28 13:29:46 +02:00
Joerg Roedel	8faf5e5a12	iommu/omap: Return ERR_PTR in device_group call-back Make sure that the device_group callback returns an ERR_PTR instead of NULL. Signed-off-by: Joerg Roedel <jroedel@suse.de>	2017-06-28 13:29:45 +02:00
Joerg Roedel	7f7a2304aa	iommu: Return ERR_PTR() values from device_group call-backs The generic device_group call-backs in iommu.c return NULL in case of error. Since they are getting ERR_PTR values from iommu_group_alloc(), just pass them up instead. Reported-by: Gerald Schaefer <gerald.schaefer@de.ibm.com> Signed-off-by: Joerg Roedel <jroedel@suse.de>	2017-06-28 13:29:45 +02:00
Joerg Roedel	0929deca40	iommu/s390: Use iommu_group_get_for_dev() in s390_iommu_add_device() The iommu_group_get_for_dev() function also attaches the device to its group, so this code doesn't need to be in the iommu driver. Further by using this function the driver can make use of default domains in the future. Reviewed-by: Gerald Schaefer <gerald.schaefer@de.ibm.com> Signed-off-by: Joerg Roedel <jroedel@suse.de>	2017-06-28 12:29:00 +02:00
Sebastian Andrzej Siewior	58c4a95f90	iommu/vt-d: Don't disable preemption while accessing deferred_flush() get_cpu() disables preemption and returns the current CPU number. The CPU number is only used once while retrieving the address of the local's CPU deferred_flush pointer. We can instead use raw_cpu_ptr() while we remain preemptible. The worst thing that can happen is that flush_unmaps_timeout() is invoked multiple times: once by taskA after seeing HIGH_WATER_MARK and then preempted to another CPU and then by taskB which saw HIGH_WATER_MARK on the same CPU as taskA. It is also likely that ->size got from HIGH_WATER_MARK to 0 right after its read because another CPU invoked flush_unmaps_timeout() for this CPU. The access to flush_data is protected by a spinlock so even if we get migrated to another CPU or preempted - the data structure is protected. While at it, I marked deferred_flush static since I can't find a reference to it outside of this file. Cc: David Woodhouse <dwmw2@infradead.org> Cc: Joerg Roedel <joro@8bytes.org> Cc: iommu@lists.linux-foundation.org Cc: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de> Signed-off-by: Joerg Roedel <jroedel@suse.de>	2017-06-28 12:24:40 +02:00
Sebastian Andrzej Siewior	aaffaa8a3b	iommu/iova: Don't disable preempt around this_cpu_ptr() Commit `583248e662` ("iommu/iova: Disable preemption around use of this_cpu_ptr()") disables preemption while accessing a per-CPU variable. This does keep lockdep quiet. However I don't see the point why it is bad if we get migrated after its access to another CPU. __iova_rcache_insert() and __iova_rcache_get() immediately locks the variable after obtaining it - before accessing its members. _If_ we get migrated away after retrieving the address of cpu_rcache before taking the lock then the other task on the same CPU will retrieve the same address of cpu_rcache and will spin on the lock. alloc_iova_fast() disables preemption while invoking free_cpu_cached_iovas() on each CPU. The function itself uses per_cpu_ptr() which does not trigger a warning (like this_cpu_ptr() does). It _could_ make sense to use get_online_cpus() instead but the we have a hotplug notifier for CPU down (and none for up) so we are good. Cc: Joerg Roedel <joro@8bytes.org> Cc: iommu@lists.linux-foundation.org Cc: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de> Signed-off-by: Joerg Roedel <jroedel@suse.de>	2017-06-28 12:23:01 +02:00
Joerg Roedel	0b25635bd4	Merge branch 'for-joerg/arm-smmu/updates' of git://git.kernel.org/pub/scm/linux/kernel/git/will/linux into arm/smmu	2017-06-28 10:42:54 +02:00
Geetha Sowjanya	f935448acf	iommu/arm-smmu-v3: Add workaround for Cavium ThunderX2 erratum #126 Cavium ThunderX2 SMMU doesn't support MSI and also doesn't have unique irq lines for gerror, eventq and cmdq-sync. New named irq "combined" is set as a errata workaround, which allows to share the irq line by register single irq handler for all the interrupts. Acked-by: Lorenzo Pieralisi <lorenzo.pieralisi@arm.com> Signed-off-by: Geetha sowjanya <gakula@caviumnetworks.com> [will: reworked irq equality checking and added SPI check] Signed-off-by: Will Deacon <will.deacon@arm.com>	2017-06-23 17:58:04 +01:00
shameer	99caf177f6	iommu/arm-smmu-v3: Enable ACPI based HiSilicon CMD_PREFETCH quirk(erratum 161010701) HiSilicon SMMUv3 on Hip06/Hip07 platforms doesn't support CMD_PREFETCH command. The dt based support for this quirk is already present in the driver(hisilicon,broken-prefetch-cmd). This adds ACPI support for the quirk using the IORT smmu model number. Signed-off-by: shameer <shameerali.kolothum.thodi@huawei.com> Signed-off-by: hanjun <guohanjun@huawei.com> [will: rewrote patch] Signed-off-by: Will Deacon <will.deacon@arm.com>	2017-06-23 17:58:04 +01:00
Linu Cherian	e5b829de05	iommu/arm-smmu-v3: Add workaround for Cavium ThunderX2 erratum #74 Cavium ThunderX2 SMMU implementation doesn't support page 1 register space and PAGE0_REGS_ONLY option is enabled as an errata workaround. This option when turned on, replaces all page 1 offsets used for EVTQ_PROD/CONS, PRIQ_PROD/CONS register access with page 0 offsets. SMMU resource size checks are now based on SMMU option PAGE0_REGS_ONLY, since resource size can be either 64k/128k. For this, arm_smmu_device_dt_probe/acpi_probe has been moved before platform_get_resource call, so that SMMU options are set beforehand. Signed-off-by: Linu Cherian <linu.cherian@cavium.com> Signed-off-by: Geetha Sowjanya <geethasowjanya.akula@cavium.com> Signed-off-by: Will Deacon <will.deacon@arm.com>	2017-06-23 17:58:03 +01:00
Robert Richter	12275bf0a4	iommu/arm-smmu-v3, acpi: Add temporary Cavium SMMU-V3 IORT model number definitions The model number is already defined in acpica and we are actually waiting for the acpi maintainers to include it: https://github.com/acpica/acpica/commit/d00a4eb86e64 Adding those temporary definitions until the change makes it into include/acpi/actbl2.h. Once that is done this patch can be reverted. Acked-by: Lorenzo Pieralisi <lorenzo.pieralisi@arm.com> Signed-off-by: Robert Richter <rrichter@cavium.com> Signed-off-by: Will Deacon <will.deacon@arm.com>	2017-06-23 17:58:03 +01:00
Will Deacon	77f3445866	iommu/io-pgtable-arm: Use dma_wmb() instead of wmb() when publishing table When writing a new table entry, we must ensure that the contents of the table is made visible to the SMMU page table walker before the updated table entry itself. This is currently achieved using wmb(), which expands to an expensive and unnecessary DSB instruction. Ideally, we'd just use cmpxchg64_release when writing the table entry, but this doesn't have memory ordering semantics on !SMP systems. Instead, use dma_wmb(), which emits DMB OSHST. Strictly speaking, this does more than we require (since it targets the outer-shareable domain), but it's likely to be significantly faster than the DSB approach. Reported-by: Linu Cherian <linu.cherian@cavium.com> Suggested-by: Robin Murphy <robin.murphy@arm.com> Signed-off-by: Will Deacon <will.deacon@arm.com>	2017-06-23 17:58:02 +01:00
Will Deacon	c1004803b4	iommu/io-pgtable: depend on !GENERIC_ATOMIC64 when using COMPILE_TEST with LPAE The LPAE/ARMv8 page table format relies on the ability to read and write 64-bit page table entries in an atomic fashion. With the move to a lockless implementation, we also need support for cmpxchg64 to resolve races when installing table entries concurrently. Unfortunately, not all architectures support cmpxchg64, so the code can fail to compiler when building for these architectures using COMPILE_TEST. Rather than disable COMPILE_TEST altogether, instead check that GENERIC_ATOMIC64 is not selected, which is a reasonable indication that the architecture has support for 64-bit cmpxchg. Reported-by: kbuild test robot <fengguang.wu@intel.com> Signed-off-by: Will Deacon <will.deacon@arm.com>	2017-06-23 17:58:02 +01:00
Robin Murphy	58188afeb7	iommu/arm-smmu-v3: Remove io-pgtable spinlock As for SMMUv2, take advantage of io-pgtable's newfound tolerance for concurrency. Unfortunately in this case the command queue lock remains a point of serialisation for the unmap path, but there may be a little more we can do to ameliorate that in future. Signed-off-by: Robin Murphy <robin.murphy@arm.com> Signed-off-by: Will Deacon <will.deacon@arm.com>	2017-06-23 17:58:01 +01:00
Robin Murphy	523d7423e2	iommu/arm-smmu: Remove io-pgtable spinlock With the io-pgtable code now robust against (valid) races, we no longer need to serialise all operations with a lock. This might make broken callers who issue concurrent operations on overlapping addresses go even more wrong than before, but hey, they already had little hope of useful or deterministic results. We do however still have to keep a lock around to serialise the ATS1* translation ops, as parallel iova_to_phys() calls could lead to unpredictable hardware behaviour otherwise. Signed-off-by: Robin Murphy <robin.murphy@arm.com> Signed-off-by: Will Deacon <will.deacon@arm.com>	2017-06-23 17:58:01 +01:00
Robin Murphy	119ff305b0	iommu/io-pgtable-arm-v7s: Support lockless operation Mirroring the LPAE implementation, rework the v7s code to be robust against concurrent operations. The same two potential races exist, and are solved in the same manner, with the fixed 2-level structure making life ever so slightly simpler. What complicates matters compared to LPAE, however, is large page entries, since we can't update a block of 16 PTEs atomically, nor assume available software bits to do clever things with. As most users are never likely to do partial unmaps anyway (due to DMA API rules), it doesn't seem unreasonable for this case to remain behind a serialising lock; we just pull said lock down into the bowels of the implementation so it's well out of the way of the normal call paths. Signed-off-by: Robin Murphy <robin.murphy@arm.com> Signed-off-by: Will Deacon <will.deacon@arm.com>	2017-06-23 17:58:00 +01:00
Robin Murphy	2c3d273eab	iommu/io-pgtable-arm: Support lockless operation For parallel I/O with multiple concurrent threads servicing the same device (or devices, if several share a domain), serialising page table updates becomes a massive bottleneck. On reflection, though, we don't strictly need to do that - for valid IOMMU API usage, there are in fact only two races that we need to guard against: multiple map requests for different blocks within the same region, when the intermediate-level table for that region does not yet exist; and multiple unmaps of different parts of the same block entry. Both of those are fairly easily solved by using a cmpxchg to install the new table, such that if we then find that someone else's table got there first, we can simply free ours and continue. Make the requisite changes such that we can withstand being called without the caller maintaining a lock. In theory, this opens up a few corners in which wildly misbehaving callers making nonsensical overlapping requests might lead to crashes instead of just unpredictable results, but correct code really does not deserve to pay a significant performance cost for the sake of masking bugs in theoretical broken code. Signed-off-by: Robin Murphy <robin.murphy@arm.com> Signed-off-by: Will Deacon <will.deacon@arm.com>	2017-06-23 17:58:00 +01:00
Robin Murphy	81b3c25218	iommu/io-pgtable: Introduce explicit coherency Once we remove the serialising spinlock, a potential race opens up for non-coherent IOMMUs whereby a caller of .map() can be sure that cache maintenance has been performed on their new PTE, but will have no guarantee that such maintenance for table entries above it has actually completed (e.g. if another CPU took an interrupt immediately after writing the table entry, but before initiating the DMA sync). Handling this race safely will add some potentially non-trivial overhead to installing a table entry, which we would much rather avoid on coherent systems where it will be unnecessary, and where we are stirivng to minimise latency by removing the locking in the first place. To that end, let's introduce an explicit notion of cache-coherency to io-pgtable, such that we will be able to avoid penalising IOMMUs which know enough to know when they are coherent. Signed-off-by: Robin Murphy <robin.murphy@arm.com> Signed-off-by: Will Deacon <will.deacon@arm.com>	2017-06-23 17:58:00 +01:00
Robin Murphy	b9f1ef30ac	iommu/io-pgtable-arm-v7s: Refactor split_blk_unmap Whilst the short-descriptor format's split_blk_unmap implementation has no need to be recursive, it followed the pattern of the LPAE version anyway for the sake of consistency. With the latter now reworked for both efficiency and future scalability improvements, tweak the former similarly, not least to make it less obtuse. Signed-off-by: Robin Murphy <robin.murphy@arm.com> Signed-off-by: Will Deacon <will.deacon@arm.com>	2017-06-23 17:57:59 +01:00
Robin Murphy	fb3a95795d	iommu/io-pgtable-arm: Improve split_blk_unmap The current split_blk_unmap implementation suffers from some inscrutable pointer trickery for creating the tables to replace the block entry, but more than that it also suffers from hideous inefficiency. For example, the most pathological case of unmapping a level 3 page from a level 1 block will allocate 513 lower-level tables to remap the entire block at page granularity, when only 2 are actually needed (the rest can be covered by level 2 block entries). Also, we would like to be able to relax the spinlock requirement in future, for which the roll-back-and-try-again logic for race resolution would be pretty hideous under the current paradigm. Both issues can be resolved most neatly by turning things sideways: instead of repeatedly recursing into __arm_lpae_map() map to build up an entire new sub-table depth-first, we can directly replace the block entry with a next-level table of block/page entries, then repeat by unmapping at the next level if necessary. With a little refactoring of some helper functions, the code ends up not much bigger than before, but considerably easier to follow and to adapt in future. Signed-off-by: Robin Murphy <robin.murphy@arm.com> Signed-off-by: Will Deacon <will.deacon@arm.com>	2017-06-23 17:57:59 +01:00
Robin Murphy	9db829d281	iommu/io-pgtable-arm-v7s: Check table PTEs more precisely Whilst we don't support the PXN bit at all, so should never encounter a level 1 section or supersection PTE with it set, it would still be wise to check both table type bits to resolve any theoretical ambiguity. Signed-off-by: Robin Murphy <robin.murphy@arm.com> Signed-off-by: Will Deacon <will.deacon@arm.com>	2017-06-23 17:57:58 +01:00
Arvind Yadav	5c2d021829	iommu: arm-smmu: Handle return of iommu_device_register. iommu_device_register returns an error code and, although it currently never fails, we should check its return value anyway. Signed-off-by: Arvind Yadav <arvind.yadav.cs@gmail.com> [will: adjusted to follow arm-smmu.c] Signed-off-by: Will Deacon <will.deacon@arm.com>	2017-06-23 17:57:58 +01:00
Arvind Yadav	ebdd13c93f	iommu: arm-smmu-v3: make of_device_ids const of_device_ids are not supposed to change at runtime. All functions working with of_device_ids provided by <linux/of.h> work with const of_device_ids. So mark the non-const structs as const. Signed-off-by: Arvind Yadav <arvind.yadav.cs@gmail.com> Signed-off-by: Will Deacon <will.deacon@arm.com>	2017-06-23 17:57:57 +01:00
Robin Murphy	84c24379a7	iommu/arm-smmu: Plumb in new ACPI identifiers Revision C of IORT now allows us to identify ARM MMU-401 and the Cavium ThunderX implementation. Wire them up so that we can probe these models once firmware starts using the new codes in place of generic ones, and so that the appropriate features and quirks get enabled when we do. For the sake of backports and mitigating sychronisation problems with the ACPICA headers, we'll carry a backup copy of the new definitions locally for the short term to make life simpler. CC: stable@vger.kernel.org # 4.10 Acked-by: Robert Richter <rrichter@cavium.com> Tested-by: Robert Richter <rrichter@cavium.com> Signed-off-by: Robin Murphy <robin.murphy@arm.com> Signed-off-by: Will Deacon <will.deacon@arm.com>	2017-06-23 17:57:57 +01:00
Arvind Yadav	60ab7a75c8	iommu/io-pgtable-arm-v7s: constify dummy_tlb_ops. File size before: text data bss dec hex filename 6146 56 9 6211 1843 drivers/iommu/io-pgtable-arm-v7s.o File size After adding 'const': text data bss dec hex filename 6170 24 9 6203 183b drivers/iommu/io-pgtable-arm-v7s.o Signed-off-by: Arvind Yadav <arvind.yadav.cs@gmail.com> Signed-off-by: Will Deacon <will.deacon@arm.com>	2017-06-23 17:57:57 +01:00
Sunil Goutham	b847de4e50	iommu/arm-smmu-v3: Increase CMDQ drain timeout value Waiting for a CMD_SYNC to be processed involves waiting for the command queue to drain, which can take an awful lot longer than waiting for a single entry to become available. Consequently, the common timeout value of 100us has been observed to be too short on some platforms when a CMD_SYNC is issued into a queued full of TLBI commands. This patch resolves the issue by using a different (1s) timeout when waiting for the CMDQ to drain and using a simple back-off mechanism when polling the cons pointer in the absence of WFE support. Signed-off-by: Sunil Goutham <sgoutham@cavium.com> [will: rewrote commit message and cosmetic changes] Signed-off-by: Will Deacon <will.deacon@arm.com>	2017-06-23 17:57:56 +01:00
Thomas Gleixner	3e49a81822	iommu/amd: Use named irq domain interface Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Acked-by: Joerg Roedel <joro@8bytes.org> Cc: Jens Axboe <axboe@kernel.dk> Cc: Marc Zyngier <marc.zyngier@arm.com> Cc: Michael Ellerman <mpe@ellerman.id.au> Cc: Keith Busch <keith.busch@intel.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: iommu@lists.linux-foundation.org Cc: Christoph Hellwig <hch@lst.de> Link: http://lkml.kernel.org/r/20170619235444.142270582@linutronix.de	2017-06-22 18:21:11 +02:00
Thomas Gleixner	cea29b656a	iommu/vt-d: Use named irq domain interface Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Acked-by: Joerg Roedel <joro@8bytes.org> Cc: Jens Axboe <axboe@kernel.dk> Cc: Marc Zyngier <marc.zyngier@arm.com> Cc: Michael Ellerman <mpe@ellerman.id.au> Cc: Keith Busch <keith.busch@intel.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: iommu@lists.linux-foundation.org Cc: Christoph Hellwig <hch@lst.de> Link: http://lkml.kernel.org/r/20170619235444.063083997@linutronix.de	2017-06-22 18:21:10 +02:00
Thomas Gleixner	1bb3a5a763	iommu/vt-d: Add name to irq chip Add the missing name, so debugging will work proper. Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Acked-by: Joerg Roedel <joro@8bytes.org> Cc: Jens Axboe <axboe@kernel.dk> Cc: Marc Zyngier <marc.zyngier@arm.com> Cc: Michael Ellerman <mpe@ellerman.id.au> Cc: Keith Busch <keith.busch@intel.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: iommu@lists.linux-foundation.org Cc: Christoph Hellwig <hch@lst.de> Link: http://lkml.kernel.org/r/20170619235443.431939968@linutronix.de	2017-06-22 18:21:07 +02:00
Thomas Gleixner	290be194ba	iommu/amd: Add name to irq chip Add the missing name, so debugging will work proper. Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Acked-by: Joerg Roedel <joro@8bytes.org> Cc: Jens Axboe <axboe@kernel.dk> Cc: Marc Zyngier <marc.zyngier@arm.com> Cc: Michael Ellerman <mpe@ellerman.id.au> Cc: Keith Busch <keith.busch@intel.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: iommu@lists.linux-foundation.org Cc: Christoph Hellwig <hch@lst.de> Link: http://lkml.kernel.org/r/20170619235443.343236995@linutronix.de	2017-06-22 18:21:07 +02:00
Joerg Roedel	9ce3a72cd7	iommu/amd: Free already flushed ring-buffer entries before full-check To benefit from IOTLB flushes on other CPUs we have to free the already flushed IOVAs from the ring-buffer before we do the queue_ring_full() check. Signed-off-by: Joerg Roedel <jroedel@suse.de>	2017-06-22 12:54:22 +02:00
Joerg Roedel	ffa080ebb5	iommu/amd: Remove amd_iommu_disabled check from amd_iommu_detect() This check needs to happens later now, when all previously enabled IOMMUs have been disabled. Signed-off-by: Joerg Roedel <jroedel@suse.de>	2017-06-22 12:54:21 +02:00
Joerg Roedel	7ad820e433	iommu/amd: Free IOMMU resources when disabled on command line After we made sure that all IOMMUs have been disabled we need to make sure that all resources we allocated are released again. Signed-off-by: Joerg Roedel <jroedel@suse.de>	2017-06-22 12:54:21 +02:00
Joerg Roedel	f601927136	iommu/amd: Set global pointers to NULL after freeing them Avoid any tries to double-free these pointers. Signed-off-by: Joerg Roedel <jroedel@suse.de>	2017-06-22 12:54:20 +02:00
Joerg Roedel	151b09031a	iommu/amd: Check for error states first in iommu_go_to_state() Check if we are in an error state already before calling into state_next(). Signed-off-by: Joerg Roedel <jroedel@suse.de>	2017-06-22 12:54:20 +02:00
Joerg Roedel	1b1e942e34	iommu/amd: Add new init-state IOMMU_CMDLINE_DISABLED This will be used when during initialization we detect that the iommu should be disabled. Signed-off-by: Joerg Roedel <jroedel@suse.de>	2017-06-22 12:54:20 +02:00
Joerg Roedel	90b3eb03e1	iommu/amd: Rename free_on_init_error() The function will also be used to free iommu resources when amd_iommu=off was specified on the kernel command line. So rename the function to reflect that. Signed-off-by: Joerg Roedel <jroedel@suse.de>	2017-06-22 12:54:20 +02:00
Joerg Roedel	1112374153	iommu/amd: Disable IOMMUs at boot if they are enabled When booting, make sure the IOMMUs are disabled. They could be previously enabled if we boot into a kexec or kdump kernel. So make sure they are off. Signed-off-by: Joerg Roedel <jroedel@suse.de>	2017-06-22 12:54:19 +02:00
Ingo Molnar	902b319413	Merge branch 'WIP.sched/core' into sched/core Conflicts: kernel/sched/Makefile Pick up the waitqueue related renames - it didn't get much feedback, so it appears to be uncontroversial. Famous last words? ;-) Signed-off-by: Ingo Molnar <mingo@kernel.org>	2017-06-20 12:28:21 +02:00
Christoph Hellwig	81a5a31675	iommu/dma: don't rely on DMA_ERROR_CODE DMA_ERROR_CODE is not a public API and will go away soon. dma dma-iommu driver already implements a proper ->mapping_error method, so it's only using the value internally. Add a new local define using the value that arm64 which is the only current user of dma-iommu. Signed-off-by: Christoph Hellwig <hch@lst.de>	2017-06-20 11:12:59 +02:00
Joerg Roedel	54bd635704	iommu/amd: Suppress IO_PAGE_FAULTs in kdump kernel When booting into a kdump kernel, suppress IO_PAGE_FAULTs by default for all devices. But allow the faults again when a domain is assigned to a device. Signed-off-by: Joerg Roedel <jroedel@suse.de>	2017-06-16 10:21:05 +02:00

... 2 3 4 5 6 ...

2153 Commits