[ Upstream commit 571f316074a203e979ea90211d9acf423dfe5f46 ]
Fix IOVA reserve failure in the case when address of first memory region
listed in dma-ranges is equal to 0x0.
Fixes: aadad097cd ("iommu/dma: Reserve IOVA for PCIe inaccessible DMA address")
Signed-off-by: Srinath Mannam <srinath.mannam@broadcom.com>
Reviewed-by: Robin Murphy <robin.murphy@arm.com>
Tested-by: Sven Peter <sven@svenpeter.dev>
Link: https://lore.kernel.org/r/20200914072319.6091-1-srinath.mannam@broadcom.com
Signed-off-by: Joerg Roedel <jroedel@suse.de>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit 4b21a503adf597773e4b37db05db0e9b16a81d53 ]
print_iommu_info prints the EFR register and then the decoded list of
features on a separate line:
pci 0000:00:00.2: AMD-Vi: Extended features (0x206d73ef22254ade):
PPR X2APIC NX GT IA GA PC GA_vAPIC
The second line is emitted via 'pr_cont', which causes it to have a
different ('warn') loglevel compared to the previous line ('info').
Commit 9a295ff0ff attempted to rectify this by removing the newline
from the pci_info format string, but this doesn't work, as pci_info
calls implicitly append a newline anyway.
Printing the decoded features on the same line would make it quite long.
Instead, change pci_info() to pr_info() to omit PCI bus location info,
which is also shown in the preceding message. This results in:
pci 0000:00:00.2: AMD-Vi: Found IOMMU cap 0x40
AMD-Vi: Extended features (0x206d73ef22254ade): PPR X2APIC NX GT IA GA PC GA_vAPIC
AMD-Vi: Interrupt remapping enabled
Fixes: 9a295ff0ff ("iommu/amd: Print extended features in one line to fix divergent log levels")
Link: https://lore.kernel.org/lkml/alpine.LNX.2.20.13.2104112326460.11104@monopod.intra.ispras.ru
Signed-off-by: Alexander Monakov <amonakov@ispras.ru>
Cc: Paul Menzel <pmenzel@molgen.mpg.de>
Cc: Joerg Roedel <jroedel@suse.de>
Cc: Suravee Suthikulpanit <suravee.suthikulpanit@amd.com>
Cc: iommu@lists.linux-foundation.org
Reviewed-by: Paul Menzel <pmenzel@molgen.mpg.de>
Link: https://lore.kernel.org/r/20210504102220.1793-1-amonakov@ispras.ru
Signed-off-by: Joerg Roedel <jroedel@suse.de>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit 54c80d907400189b09548039be8f3b6e297e8ae3 ]
When first-level page tables are used for IOVA translation, we use user
privilege by setting U/S bit in the page table entry. This is to make it
consistent with the second level translation, where the U/S enforcement
is not available. Clear the SRE (Supervisor Request Enable) field in the
pasid table entry of RID2PASID so that requests requesting the supervisor
privilege are blocked and treated as DMA remapping faults.
Fixes: b802d070a5 ("iommu/vt-d: Use iova over first level")
Suggested-by: Jacob Pan <jacob.jun.pan@linux.intel.com>
Signed-off-by: Lu Baolu <baolu.lu@linux.intel.com>
Link: https://lore.kernel.org/r/20210512064426.3440915-1-baolu.lu@linux.intel.com
Link: https://lore.kernel.org/r/20210519015027.108468-3-baolu.lu@linux.intel.com
Signed-off-by: Joerg Roedel <jroedel@suse.de>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit 382d91fc0f4f1b13f8a0dcbf7145f4f175b71a18 ]
This patch adds missing MODULE_DEVICE_TABLE definition which generates
correct modalias for automatic loading of this driver when it is built
as an external module.
Reported-by: Hulk Robot <hulkci@huawei.com>
Signed-off-by: Bixuan Cui <cuibixuan@huawei.com>
Fixes: fa4afd78ea ("iommu/virtio: Build virtio-iommu as module")
Reviewed-by: Jean-Philippe Brucker <jean-philippe@linaro.org>
Link: https://lore.kernel.org/r/20210508031451.53493-1-cuibixuan@huawei.com
Signed-off-by: Joerg Roedel <jroedel@suse.de>
Signed-off-by: Sasha Levin <sashal@kernel.org>
commit eea53c5816889ee8b64544fa2e9311a81184ff9c upstream.
When the first level page table is used for IOVA translation, it only
supports Read-Only and Read-Write permissions. The Write-Only permission
is not supported as the PRESENT bit (implying Read permission) should
always set. When using second level, we still give separate permissions
that allows WriteOnly which seems inconsistent and awkward. We want to
have consistent behavior. After moving to 1st level, we don't want things
to work sometimes, and break if we use 2nd level for the same mappings.
Hence remove this configuration.
Suggested-by: Ashok Raj <ashok.raj@intel.com>
Fixes: b802d070a5 ("iommu/vt-d: Use iova over first level")
Signed-off-by: Lu Baolu <baolu.lu@linux.intel.com>
Link: https://lore.kernel.org/r/20210320025415.641201-3-baolu.lu@linux.intel.com
Signed-off-by: Joerg Roedel <jroedel@suse.de>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit a8ce9ebbecdfda3322bbcece6b3b25888217f8e3 upstream.
The Access/Dirty bits in the first level page table entry will be set
whenever a page table entry was used for address translation or write
permission was successfully translated. This is always true when using
the first-level page table for kernel IOVA. Instead of wasting hardware
cycles to update the certain bits, it's better to set them up at the
beginning.
Suggested-by: Ashok Raj <ashok.raj@intel.com>
Signed-off-by: Lu Baolu <baolu.lu@linux.intel.com>
Link: https://lore.kernel.org/r/20210115004202.953965-1-baolu.lu@linux.intel.com
Signed-off-by: Joerg Roedel <jroedel@suse.de>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
This reverts commit 416fa531c8 which is
commit a8ce9ebbecdfda3322bbcece6b3b25888217f8e3 upstream as it was
backported incorrectly and is causing problems for some systems.
Reported-by: Wolfgang Müller <wolf@oriole.systems>
Reported-by: Charles Wright <charles@charleswright.co>
Reported-by: Christoph Biedl <linux-kernel.bfrz@manchmal.in-ulm.de>
Cc: Ashok Raj <ashok.raj@intel.com>
Cc: Lu Baolu <baolu.lu@linux.intel.com>
Cc: Joerg Roedel <jroedel@suse.de>
Cc: Sasha Levin <sashal@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
This reverts commit c848416cc0 which is
eea53c5816889ee8b64544fa2e9311a81184ff9c upstream.
Another iommu patch was backported incorrectly, causing problems, so
drop this as well for the moment.
Reported-by: Wolfgang Müller <wolf@oriole.systems>
Cc: Ashok Raj <ashok.raj@intel.com>
Cc: Lu Baolu <baolu.lu@linux.intel.com>
Cc: Joerg Roedel <jroedel@suse.de>
Cc: Sasha Levin <sashal@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
[ Upstream commit 994d6608efe4a4c8834bdc5014c86f4bc6aceea6 ]
In early AMD desktop/mobile platforms (during 2013), when the IOMMU
Performance Counter (PMC) support was first introduced in
commit 30861ddc9c ("perf/x86/amd: Add IOMMU Performance Counter
resource management"), there was a HW bug where the counters could not
be accessed. The result was reading of the counter always return zero.
At the time, the suggested workaround was to add a test logic prior
to initializing the PMC feature to check if the counters can be programmed
and read back the same value. This has been working fine until the more
recent desktop/mobile platforms start enabling power gating for the PMC,
which prevents access to the counters. This results in the PMC support
being disabled unnecesarily.
Unfortunatly, there is no documentation of since which generation
of hardware the original PMC HW bug was fixed. Although, it was fixed
soon after the first introduction of the PMC. Base on this, we assume
that the buggy platforms are less likely to be in used, and it should
be relatively safe to remove this legacy logic.
Link: https://lore.kernel.org/linux-iommu/alpine.LNX.3.20.13.2006030935570.3181@monopod.intra.ispras.ru/
Bugzilla: https://bugzilla.kernel.org/show_bug.cgi?id=201753
Cc: Tj (Elloe Linux) <ml.linux@elloe.vision>
Cc: Shuah Khan <skhan@linuxfoundation.org>
Cc: Alexander Monakov <amonakov@ispras.ru>
Cc: David Coe <david.coe@live.co.uk>
Cc: Paul Menzel <pmenzel@molgen.mpg.de>
Signed-off-by: Suravee Suthikulpanit <suravee.suthikulpanit@amd.com>
Tested-by: Shuah Khan <skhan@linuxfoundation.org>
Link: https://lore.kernel.org/r/20210409085848.3908-3-suravee.suthikulpanit@amd.com
Signed-off-by: Joerg Roedel <jroedel@suse.de>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit 715601e4e36903a653cd4294dfd3ed0019101991 ]
This reverts commit 6778ff5b21bd8e78c8bd547fd66437cf2657fd9b.
The original commit tries to address an issue, where PMC power-gating
causing the IOMMU PMC pre-init test to fail on certain desktop/mobile
platforms where the power-gating is normally enabled.
There have been several reports that the workaround still does not
guarantee to work, and can add up to 100 ms (on the worst case)
to the boot process on certain platforms such as the MSI B350M MORTAR
with AMD Ryzen 3 2200G.
Therefore, revert this commit as a prelude to removing the pre-init
test.
Link: https://lore.kernel.org/linux-iommu/alpine.LNX.3.20.13.2006030935570.3181@monopod.intra.ispras.ru/
Bugzilla: https://bugzilla.kernel.org/show_bug.cgi?id=201753
Cc: Tj (Elloe Linux) <ml.linux@elloe.vision>
Cc: Shuah Khan <skhan@linuxfoundation.org>
Cc: Alexander Monakov <amonakov@ispras.ru>
Cc: David Coe <david.coe@live.co.uk>
Signed-off-by: Paul Menzel <pmenzel@molgen.mpg.de>
Signed-off-by: Suravee Suthikulpanit <suravee.suthikulpanit@amd.com>
Link: https://lore.kernel.org/r/20210409085848.3908-2-suravee.suthikulpanit@amd.com
Signed-off-by: Joerg Roedel <jroedel@suse.de>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit c0474a606ecb9326227b4d68059942f9db88a897 ]
When the Intel IOMMU is operating in the scalable mode, some information
from the root and context table may be used to tag entries in the PASID
cache. Software should invalidate the PASID-cache when changing root or
context table entries.
Suggested-by: Ashok Raj <ashok.raj@intel.com>
Fixes: 7373a8cc38 ("iommu/vt-d: Setup context and enable RID2PASID support")
Signed-off-by: Lu Baolu <baolu.lu@linux.intel.com>
Link: https://lore.kernel.org/r/20210320025415.641201-4-baolu.lu@linux.intel.com
Signed-off-by: Joerg Roedel <jroedel@suse.de>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit eea53c5816889ee8b64544fa2e9311a81184ff9c ]
When the first level page table is used for IOVA translation, it only
supports Read-Only and Read-Write permissions. The Write-Only permission
is not supported as the PRESENT bit (implying Read permission) should
always set. When using second level, we still give separate permissions
that allows WriteOnly which seems inconsistent and awkward. We want to
have consistent behavior. After moving to 1st level, we don't want things
to work sometimes, and break if we use 2nd level for the same mappings.
Hence remove this configuration.
Suggested-by: Ashok Raj <ashok.raj@intel.com>
Fixes: b802d070a5 ("iommu/vt-d: Use iova over first level")
Signed-off-by: Lu Baolu <baolu.lu@linux.intel.com>
Link: https://lore.kernel.org/r/20210320025415.641201-3-baolu.lu@linux.intel.com
Signed-off-by: Joerg Roedel <jroedel@suse.de>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit a8ce9ebbecdfda3322bbcece6b3b25888217f8e3 ]
The Access/Dirty bits in the first level page table entry will be set
whenever a page table entry was used for address translation or write
permission was successfully translated. This is always true when using
the first-level page table for kernel IOVA. Instead of wasting hardware
cycles to update the certain bits, it's better to set them up at the
beginning.
Suggested-by: Ashok Raj <ashok.raj@intel.com>
Signed-off-by: Lu Baolu <baolu.lu@linux.intel.com>
Link: https://lore.kernel.org/r/20210115004202.953965-1-baolu.lu@linux.intel.com
Signed-off-by: Joerg Roedel <jroedel@suse.de>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit 03d205094af45bca4f8e0498c461a893aa3ec6d9 ]
The Address field of the Page Request Descriptor only keeps bit [63:12]
of the offending address. Convert it to a full address before reporting
it to device drivers.
Fixes: eb8d93ea3c ("iommu/vt-d: Report page request faults for guest SVA")
Signed-off-by: Lu Baolu <baolu.lu@linux.intel.com>
Link: https://lore.kernel.org/r/20210320025415.641201-2-baolu.lu@linux.intel.com
Signed-off-by: Joerg Roedel <jroedel@suse.de>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit 6c00612d0cba10f7d0917cf1f73c945003ed4cd7 ]
The Intel VT-d driver checks wrong register to report snoop capablility
when using first level page table for GPA to HPA translation. This might
lead the IOMMU driver to say that it supports snooping control, but in
reality, it does not. Fix this by always setting PASID-table-entry.PGSNP
whenever a pasid entry is setting up for GPA to HPA translation so that
the IOMMU driver could report snoop capability as long as it runs in the
scalable mode.
Fixes: b802d070a5 ("iommu/vt-d: Use iova over first level")
Suggested-by: Rajesh Sankaran <rajesh.sankaran@intel.com>
Suggested-by: Kevin Tian <kevin.tian@intel.com>
Suggested-by: Ashok Raj <ashok.raj@intel.com>
Signed-off-by: Lu Baolu <baolu.lu@linux.intel.com>
Link: https://lore.kernel.org/r/20210330021145.13824-1-baolu.lu@linux.intel.com
Signed-off-by: Joerg Roedel <jroedel@suse.de>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit 1d421058c815d54113d9afdf6db3f995c788cf0d ]
The VT-d specification (section 7.6) requires that the value in the
Private Data field of a Page Group Response Descriptor must match
the value in the Private Data field of the respective Page Request
Descriptor.
The private data field of a page group response descriptor is set then
immediately cleared in prq_event_thread(). This breaks the rule defined
by the VT-d specification. Fix it by moving clearing code up.
Fixes: 5b438f4ba3 ("iommu/vt-d: Support page request in scalable mode")
Cc: Jacob Pan <jacob.jun.pan@linux.intel.com>
Reviewed-by: Liu Yi L <yi.l.liu@intel.com>
Signed-off-by: Lu Baolu <baolu.lu@linux.intel.com>
Link: https://lore.kernel.org/r/20210320024156.640798-1-baolu.lu@linux.intel.com
Signed-off-by: Joerg Roedel <jroedel@suse.de>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit 655c447c97d7fe462e6cd9e15809037be028bc70 ]
In arm_smmu_gerror_handler(), the value of the SMMU_GERROR register is
filtered by GERROR_ERR_MASK. However, the GERROR_ERR_MASK does not contain
the SFM bit. As a result, the subsequent error processing is not performed
when only the SFM error occurs.
Fixes: 48ec83bcbc ("iommu/arm-smmu: Add initial driver support for ARM SMMUv3 devices")
Reported-by: Rui Zhu <zhurui3@huawei.com>
Signed-off-by: Zhen Lei <thunder.leizhen@huawei.com>
Link: https://lore.kernel.org/r/20210324081603.1074-1-thunder.leizhen@huawei.com
Signed-off-by: Will Deacon <will@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit 78a523fe73b81b4447beb2d6c78c9fafae24eebb ]
When supervisor/privilige mode SVM is used, we bind init_mm.pgd with
a supervisor PASID. There should not be any page fault for init_mm.
Execution request with DMA read is also not supported.
This patch checks PRQ descriptor for both unsupported configurations,
reject them both with invalid responses.
Fixes: 1c4f88b7f1 ("iommu/vt-d: Shared virtual address in scalable mode")
Acked-by: Lu Baolu <baolu.lu@linux.intel.com>
Signed-off-by: Jacob Pan <jacob.jun.pan@linux.intel.com>
Link: https://lore.kernel.org/r/1614680040-1989-4-git-send-email-jacob.jun.pan@linux.intel.com
Signed-off-by: Joerg Roedel <jroedel@suse.de>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit b9abb19fa5fd2d8a4be61c6cd4b2a48aa1a17f9c ]
The device iommu probe/attach might have failed leaving dev->iommu
to NULL and device drivers may still invoke these functions resulting
in a crash in iommu vendor driver code.
Hence make sure we check that.
Fixes: a3a195929d ("iommu: Add APIs for multiple domains per device")
Signed-off-by: Shameer Kolothum <shameerali.kolothum.thodi@huawei.com>
Reviewed-by: Robin Murphy <robin.murphy@arm.com>
Link: https://lore.kernel.org/r/20210303173611.520-1-shameerali.kolothum.thodi@huawei.com
Signed-off-by: Joerg Roedel <jroedel@suse.de>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit 6778ff5b21bd8e78c8bd547fd66437cf2657fd9b ]
Certain AMD platforms enable power gating feature for IOMMU PMC,
which prevents the IOMMU driver from updating the counter while
trying to validate the PMC functionality in the init_iommu_perf_ctr().
This results in disabling PMC support and the following error message:
"AMD-Vi: Unable to read/write to IOMMU perf counter"
To workaround this issue, disable power gating temporarily by programming
the counter source to non-zero value while validating the counter,
and restore the prior state afterward.
Signed-off-by: Suravee Suthikulpanit <suravee.suthikulpanit@amd.com>
Tested-by: Tj (Elloe Linux) <ml.linux@elloe.vision>
Link: https://lore.kernel.org/r/20210208122712.5048-1-suravee.suthikulpanit@amd.com
Bugzilla: https://bugzilla.kernel.org/show_bug.cgi?id=201753
Signed-off-by: Joerg Roedel <jroedel@suse.de>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit 28a77185f1cd0650b664f54614143aaaa3a7a615 ]
It is incorrect to always clear PRO when it's set w/o first checking
whether the overflow condition has been cleared. Current code assumes
that if an overflow condition occurs it must have been cleared by earlier
loop. However since the code runs in a threaded context, the overflow
condition could occur even after setting the head to the tail under some
extreme condition. To be sane, we should read both head/tail again when
seeing a pending PRO and only clear PRO after all pending PRs have been
handled.
Suggested-by: Kevin Tian <kevin.tian@intel.com>
Signed-off-by: Lu Baolu <baolu.lu@linux.intel.com>
Link: https://lore.kernel.org/linux-iommu/MWHPR11MB18862D2EA5BD432BF22D99A48CA09@MWHPR11MB1886.namprd11.prod.outlook.com/
Link: https://lore.kernel.org/r/20210126080730.2232859-2-baolu.lu@linux.intel.com
Signed-off-by: Joerg Roedel <jroedel@suse.de>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit 444d66a23c1f1e4c4d12aed4812681d0ad835d60 ]
As per Intel vt-d spec, Rev 3.0 (section 10.4.45 "Virtual Command Response
Register"), the status code of "No PASID available" error in response to
the Allocate PASID command is 2, not 1. The same for "Invalid PASID" error
in response to the Free PASID command.
We will otherwise see confusing kernel log under the command failure from
guest side. Fix it.
Fixes: 24f27d32ab ("iommu/vt-d: Enlightened PASID allocation")
Signed-off-by: Zenghui Yu <yuzenghui@huawei.com>
Acked-by: Lu Baolu <baolu.lu@linux.intel.com>
Link: https://lore.kernel.org/r/20210227073909.432-1-yuzenghui@huawei.com
Signed-off-by: Joerg Roedel <jroedel@suse.de>
Signed-off-by: Sasha Levin <sashal@kernel.org>
commit dead723e6f049e9fb6b05e5b93456982798ea961 upstream.
When extracting the mask for a SMR that was programmed by the
bootloader, the SMR's valid bit is also extracted and is treated
as part of the mask, which is not correct. Consider the scenario
where an SMMU master whose context is determined by a bootloader
programmed SMR is removed (omitting parts of device/driver core):
->iommu_release_device()
-> arm_smmu_release_device()
-> arm_smmu_master_free_smes()
-> arm_smmu_free_sme() /* Assume that the SME is now free */
-> arm_smmu_write_sme()
-> arm_smmu_write_smr() /* Construct SMR value using mask and SID */
Since the valid bit was considered as part of the mask, the SMR will
be programmed as valid.
Fix the SMR mask extraction step for bootloader programmed SMRs
by masking out the valid bit when we know that we're already
working with a valid SMR.
Fixes: 07a7f2caaa5a ("iommu/arm-smmu-qcom: Read back stream mappings")
Signed-off-by: Isaac J. Manjarres <isaacm@codeaurora.org>
Cc: stable@vger.kernel.org
Reviewed-by: Robin Murphy <robin.murphy@arm.com>
Link: https://lore.kernel.org/r/1611611545-19055-1-git-send-email-isaacm@codeaurora.org
Signed-off-by: Will Deacon <will@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
[ Upstream commit b8437a3ef8c485903d05d1f261328aaf0c0a6cc2 ]
Sleeping while atomic = bad. Let's fix an obvious typo to try to avoid it.
The warning that was seen (on a downstream kernel with the problematic
patch backported):
BUG: sleeping function called from invalid context at mm/page_alloc.c:4726
in_atomic(): 1, irqs_disabled(): 0, non_block: 0, pid: 9, name: ksoftirqd/0
CPU: 0 PID: 9 Comm: ksoftirqd/0 Not tainted 5.4.93-12508-gc10c93e28e39 #1
Call trace:
dump_backtrace+0x0/0x154
show_stack+0x20/0x2c
dump_stack+0xa0/0xfc
___might_sleep+0x11c/0x12c
__might_sleep+0x50/0x84
__alloc_pages_nodemask+0xf8/0x2bc
__arm_lpae_alloc_pages+0x48/0x1b4
__arm_lpae_map+0x124/0x274
__arm_lpae_map+0x1cc/0x274
arm_lpae_map+0x140/0x170
arm_smmu_map+0x78/0xbc
__iommu_map+0xd4/0x210
_iommu_map+0x4c/0x84
iommu_map_atomic+0x44/0x58
__iommu_dma_map+0x8c/0xc4
iommu_dma_map_page+0xac/0xf0
Fixes: d8c1df02ac7f ("iommu: Move iotlb_sync_map out from __iommu_map")
Signed-off-by: Douglas Anderson <dianders@chromium.org>
Reviewed-by: Yong Wu <yong.wu@mediatek.com>
Acked-by: Will Deacon <will@kernel.org>
Link: https://lore.kernel.org/r/20210201170611.1.I64a7b62579287d668d7c89e105dcedf45d641063@changeid
Signed-off-by: Joerg Roedel <jroedel@suse.de>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit d8c1df02ac7f2c802a9b2afc0f5c888c4217f1d5 ]
In the end of __iommu_map, It alway call iotlb_sync_map.
This patch moves iotlb_sync_map out from __iommu_map since it is
unnecessary to call this for each sg segment especially iotlb_sync_map
is flush tlb all currently. Add a little helper _iommu_map for this.
Signed-off-by: Yong Wu <yong.wu@mediatek.com>
Reviewed-by: Robin Murphy <robin.murphy@arm.com>
Acked-by: Will Deacon <will@kernel.org>
Link: https://lore.kernel.org/r/20210107122909.16317-2-yong.wu@mediatek.com
Signed-off-by: Will Deacon <will@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit 862c3715de8f3e5350489240c951d697f04bd8c9 ]
Currently gather->end is "unsigned long" which may be overflow in
arch32 in the corner case: 0xfff00000 + 0x100000(iova + size).
Although it doesn't affect the size(end - start), it affects the checking
"gather->end < end"
This patch changes this "end" to the real end address
(end = start + size - 1). Correspondingly, update the length to
"end - start + 1".
Fixes: a7d20dc19d ("iommu: Introduce struct iommu_iotlb_gather for batching TLB flushes")
Signed-off-by: Yong Wu <yong.wu@mediatek.com>
Reviewed-by: Robin Murphy <robin.murphy@arm.com>
Acked-by: Will Deacon <will@kernel.org>
Link: https://lore.kernel.org/r/20210107122909.16317-5-yong.wu@mediatek.com
Signed-off-by: Will Deacon <will@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
commit 29b32839725f8c89a41cb6ee054c85f3116ea8b5 upstream.
When an Intel IOMMU is virtualized, and a physical device is
passed-through to the VM, changes of the virtual IOMMU need to be
propagated to the physical IOMMU. The hypervisor therefore needs to
monitor PTE mappings in the IOMMU page-tables. Intel specifications
provide "caching-mode" capability that a virtual IOMMU uses to report
that the IOMMU is virtualized and a TLB flush is needed after mapping to
allow the hypervisor to propagate virtual IOMMU mappings to the physical
IOMMU. To the best of my knowledge no real physical IOMMU reports
"caching-mode" as turned on.
Synchronizing the virtual and the physical IOMMU tables is expensive if
the hypervisor is unaware which PTEs have changed, as the hypervisor is
required to walk all the virtualized tables and look for changes.
Consequently, domain flushes are much more expensive than page-specific
flushes on virtualized IOMMUs with passthrough devices. The kernel
therefore exploited the "caching-mode" indication to avoid domain
flushing and use page-specific flushing in virtualized environments. See
commit 78d5f0f500 ("intel-iommu: Avoid global flushes with caching
mode.")
This behavior changed after commit 13cf017446 ("iommu/vt-d: Make use
of iova deferred flushing"). Now, when batched TLB flushing is used (the
default), full TLB domain flushes are performed frequently, requiring
the hypervisor to perform expensive synchronization between the virtual
TLB and the physical one.
Getting batched TLB flushes to use page-specific invalidations again in
such circumstances is not easy, since the TLB invalidation scheme
assumes that "full" domain TLB flushes are performed for scalability.
Disable batched TLB flushes when caching-mode is on, as the performance
benefit from using batched TLB invalidations is likely to be much
smaller than the overhead of the virtual-to-physical IOMMU page-tables
synchronization.
Fixes: 13cf017446 ("iommu/vt-d: Make use of iova deferred flushing")
Signed-off-by: Nadav Amit <namit@vmware.com>
Cc: David Woodhouse <dwmw2@infradead.org>
Cc: Lu Baolu <baolu.lu@linux.intel.com>
Cc: Joerg Roedel <joro@8bytes.org>
Cc: Will Deacon <will@kernel.org>
Cc: stable@vger.kernel.org
Acked-by: Lu Baolu <baolu.lu@linux.intel.com>
Link: https://lore.kernel.org/r/20210127175317.1600473-1-namit@vmware.com
Signed-off-by: Joerg Roedel <jroedel@suse.de>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 728da60da7c1ec1e21ae64648e376666de3c279c upstream.
Midgard GPUs have ACE-Lite master interfaces which allows systems to
integrate them in an I/O-coherent manner. It seems that from the GPU's
viewpoint, the rest of the system is its outer shareable domain, and so
even when snoop signals are wired up, they are only emitted for outer
shareable accesses. As such, setting the TTBR_SHARE_OUTER bit does
indeed get coherent pagetable walks working nicely for the coherent
T620 in the Arm Juno SoC.
Signed-off-by: Robin Murphy <robin.murphy@arm.com>
Tested-by: Neil Armstrong <narmstrong@baylibre.com>
Reviewed-by: Steven Price <steven.price@arm.com>
Acked-by: Will Deacon <will@kernel.org>
Signed-off-by: Neil Armstrong <narmstrong@baylibre.com>
Link: https://patchwork.freedesktop.org/patch/msgid/8df778355378127ea7eccc9521d6427e3e48d4f2.1600780574.git.robin.murphy@arm.com
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
[ Upstream commit 494b3688bb11a21af12e92a344a1313486693d47 ]
An incorrect address mask is being used in the qi_flush_dev_iotlb_pasid()
to check the address alignment. This leads to a lot of spurious kernel
warnings:
[ 485.837093] DMAR: Invalidate non-aligned address 7f76f47f9000, order 0
[ 485.837098] DMAR: Invalidate non-aligned address 7f76f47f9000, order 0
[ 492.494145] qi_flush_dev_iotlb_pasid: 5734 callbacks suppressed
[ 492.494147] DMAR: Invalidate non-aligned address 7f7728800000, order 11
[ 492.508965] DMAR: Invalidate non-aligned address 7f7728800000, order 11
Fix it by checking the alignment in right way.
Fixes: 288d08e780 ("iommu/vt-d: Handle non-page aligned address")
Reported-and-tested-by: Guo Kaijie <Kaijie.Guo@intel.com>
Signed-off-by: Lu Baolu <baolu.lu@linux.intel.com>
Cc: Liu Yi L <yi.l.liu@intel.com>
Link: https://lore.kernel.org/r/20210119043500.1539596-1-baolu.lu@linux.intel.com
Signed-off-by: Joerg Roedel <jroedel@suse.de>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit a44092e326d403c7878018ba532369f84d31dbfa ]
IOMMU Extended Feature Register (EFR) is used to communicate
the supported features for each IOMMU to the IOMMU driver.
This is normally read from the PCI MMIO register offset 0x30,
and used by the iommu_feature() helper function.
However, there are certain scenarios where the information is needed
prior to PCI initialization, and the iommu_feature() function is used
prematurely w/o warning. This has caused incorrect initialization of IOMMU.
This is the case for the commit 6d39bdee23 ("iommu/amd: Enforce 4k
mapping for certain IOMMU data structures")
Since, the EFR is also available in the IVHD header, and is available to
the driver prior to PCI initialization. Therefore, default to using
the IVHD EFR instead.
Fixes: 6d39bdee23 ("iommu/amd: Enforce 4k mapping for certain IOMMU data structures")
Signed-off-by: Suravee Suthikulpanit <suravee.suthikulpanit@amd.com>
Tested-by: Brijesh Singh <brijesh.singh@amd.com>
Reviewed-by: Robert Richter <rrichter@amd.com>
Link: https://lore.kernel.org/r/20210120135002.2682-1-suravee.suthikulpanit@amd.com
Signed-off-by: Joerg Roedel <jroedel@suse.de>
Signed-off-by: Sasha Levin <sashal@kernel.org>
commit 2d6ffc63f12417b979955a5b22ad9a76d2af5de9 upstream.
The VT-d hardware will ignore those Addr bits which have been masked by
the AM field in the PASID-based-IOTLB invalidation descriptor. As the
result, if the starting address in the descriptor is not aligned with
the address mask, some IOTLB caches might not invalidate. Hence people
will see below errors.
[ 1093.704661] dmar_fault: 29 callbacks suppressed
[ 1093.704664] DMAR: DRHD: handling fault status reg 3
[ 1093.712738] DMAR: [DMA Read] Request device [7a:02.0] PASID 2
fault addr 7f81c968d000 [fault reason 113]
SM: Present bit in first-level paging entry is clear
Fix this by using aligned address for PASID-based-IOTLB invalidation.
Fixes: 1c4f88b7f1 ("iommu/vt-d: Shared virtual address in scalable mode")
Reported-and-tested-by: Guo Kaijie <Kaijie.Guo@intel.com>
Signed-off-by: Lu Baolu <baolu.lu@linux.intel.com>
Link: https://lore.kernel.org/r/20201231005323.2178523-2-baolu.lu@linux.intel.com
Signed-off-by: Will Deacon <will@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
[ Upstream commit c062db039f40e868c371c36afe8d0fac64305b5d ]
The iommu-dma constrains IOVA allocation based on the domain geometry
that the driver reports. Update domain geometry everytime a domain is
attached to or detached from a device.
Signed-off-by: Lu Baolu <baolu.lu@linux.intel.com>
Tested-by: Logan Gunthorpe <logang@deltatee.com>
Link: https://lore.kernel.org/r/20201124082057.2614359-6-baolu.lu@linux.intel.com
Signed-off-by: Will Deacon <will@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit 420d42f6f9db27d88bc4f83e3e668fcdacbf7e29 ]
Lock(&iommu->lock) without disabling irq causes lockdep warnings.
========================================================
WARNING: possible irq lock inversion dependency detected
5.11.0-rc1+ #828 Not tainted
--------------------------------------------------------
kworker/0:1H/120 just changed the state of lock:
ffffffffad9ea1b8 (device_domain_lock){..-.}-{2:2}, at:
iommu_flush_dev_iotlb.part.0+0x32/0x120
but this lock took another, SOFTIRQ-unsafe lock in the past:
(&iommu->lock){+.+.}-{2:2}
and interrupts could create inverse lock ordering between them.
other info that might help us debug this:
Possible interrupt unsafe locking scenario:
CPU0 CPU1
---- ----
lock(&iommu->lock);
local_irq_disable();
lock(device_domain_lock);
lock(&iommu->lock);
<Interrupt>
lock(device_domain_lock);
*** DEADLOCK ***
Signed-off-by: Lu Baolu <baolu.lu@linux.intel.com>
Link: https://lore.kernel.org/r/20201231005323.2178523-5-baolu.lu@linux.intel.com
Signed-off-by: Will Deacon <will@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
commit ff2b46d7cff80d27d82f7f3252711f4ca1666129 upstream.
When irq_domain_get_irq_data() or irqd_cfg() fails
at i == 0, data allocated by kzalloc() has not been
freed before returning, which leads to memleak.
Fixes: b106ee63ab ("irq_remapping/vt-d: Enhance Intel IR driver to support hierarchical irqdomains")
Signed-off-by: Dinghao Liu <dinghao.liu@zju.edu.cn>
Acked-by: Lu Baolu <baolu.lu@linux.intel.com>
Link: https://lore.kernel.org/r/20210105051837.32118-1-dinghao.liu@zju.edu.cn
Signed-off-by: Will Deacon <will@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 1efd17e7acb6692bffc6c58718f41f27fdfd62f5 upstream.
Use IS_ALIGNED() instead. Otherwise, an unaligned address will be ignored.
Fixes: 33cd6e642d ("iommu/vt-d: Flush PASID-based iotlb for iova over first level")
Signed-off-by: Lu Baolu <baolu.lu@linux.intel.com>
Link: https://lore.kernel.org/r/20201231005323.2178523-1-baolu.lu@linux.intel.com
Signed-off-by: Will Deacon <will@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
[ Upstream commit aded8c7c2b72f846a07a2c736b8e75bb8cf50a87 ]
On SM8150 it's occasionally observed that the boot hangs in between the
writing of SMEs and context banks in arm_smmu_device_reset().
The problem seems to coincide with a display refresh happening after
updating the stream mapping, but before clearing - and there by
disabling translation - the context bank picked to emulate translation
bypass.
Resolve this by explicitly disabling the bypass context already in
cfg_probe.
Fixes: f9081b8ff593 ("iommu/arm-smmu-qcom: Implement S2CR quirk")
Signed-off-by: Bjorn Andersson <bjorn.andersson@linaro.org>
Link: https://lore.kernel.org/r/20210106005038.4152731-1-bjorn.andersson@linaro.org
Signed-off-by: Will Deacon <will@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
commit 9ad9f45b3b91162b33abfe175ae75ab65718dbf5 upstream.
'struct intel_svm' is shared by all devices bound to a give process,
but records only a single pointer to a 'struct intel_iommu'. Consequently,
cache invalidations may only be applied to a single DMAR unit, and are
erroneously skipped for the other devices.
In preparation for fixing this, rework the structures so that the iommu
pointer resides in 'struct intel_svm_dev', allowing 'struct intel_svm'
to track them in its device list.
Fixes: 1c4f88b7f1 ("iommu/vt-d: Shared virtual address in scalable mode")
Cc: Lu Baolu <baolu.lu@linux.intel.com>
Cc: Jacob Pan <jacob.jun.pan@linux.intel.com>
Cc: Raj Ashok <ashok.raj@intel.com>
Cc: David Woodhouse <dwmw2@infradead.org>
Reported-by: Guo Kaijie <Kaijie.Guo@intel.com>
Reported-by: Xin Zeng <xin.zeng@intel.com>
Signed-off-by: Guo Kaijie <Kaijie.Guo@intel.com>
Signed-off-by: Xin Zeng <xin.zeng@intel.com>
Signed-off-by: Liu Yi L <yi.l.liu@intel.com>
Tested-by: Guo Kaijie <Kaijie.Guo@intel.com>
Cc: stable@vger.kernel.org # v5.0+
Acked-by: Lu Baolu <baolu.lu@linux.intel.com>
Link: https://lore.kernel.org/r/1609949037-25291-2-git-send-email-yi.l.liu@intel.com
Signed-off-by: Will Deacon <will@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit f9081b8ff5934b8d69c748d0200e844cadd2c667 upstream.
The firmware found in some Qualcomm platforms intercepts writes to S2CR
in order to replace bypass type streams with fault; and ignore S2CR
updates of type fault.
Detect this behavior and implement a custom write_s2cr function in order
to trick the firmware into supporting bypass streams by the means of
configuring the stream for translation using a reserved and disabled
context bank.
Also circumvent the problem of configuring faulting streams by
configuring the stream as bypass.
Cc: <stable@vger.kernel.org>
Signed-off-by: Bjorn Andersson <bjorn.andersson@linaro.org>
Tested-by: Steev Klimaszewski <steev@kali.org>
Acked-by: Robin Murphy <robin.murphy@arm.com>
Link: https://lore.kernel.org/r/20201019182323.3162386-4-bjorn.andersson@linaro.org
Signed-off-by: Will Deacon <will@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 07a7f2caaa5a2619934491bab3c47b261c554fb0 upstream.
The Qualcomm boot loader configures stream mapping for the peripherals
that it accesses and in particular it sets up the stream mapping for the
display controller to be allowed to scan out a splash screen or EFI
framebuffer.
Read back the stream mappings during initialization and make the
arm-smmu driver maintain the streams in bypass mode.
Cc: <stable@vger.kernel.org>
Signed-off-by: Bjorn Andersson <bjorn.andersson@linaro.org>
Tested-by: Steev Klimaszewski <steev@kali.org>
Acked-by: Robin Murphy <robin.murphy@arm.com>
Link: https://lore.kernel.org/r/20201019182323.3162386-3-bjorn.andersson@linaro.org
Signed-off-by: Will Deacon <will@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 56b75b51ed6d5e7bffda59440404409bca2dff00 upstream.
The firmware found in some Qualcomm platforms intercepts writes to the
S2CR register in order to replace the BYPASS type with FAULT. Further
more it treats faults at this level as catastrophic and restarts the
device.
Add support for providing implementation specific versions of the S2CR
write function, to allow the Qualcomm driver to work around this
behavior.
Cc: <stable@vger.kernel.org>
Signed-off-by: Bjorn Andersson <bjorn.andersson@linaro.org>
Tested-by: Steev Klimaszewski <steev@kali.org>
Reviewed-by: Robin Murphy <robin.murphy@arm.com>
Link: https://lore.kernel.org/r/20201019182323.3162386-2-bjorn.andersson@linaro.org
Signed-off-by: Will Deacon <will@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
[ Upstream commit 68dd9d89eaf56dfab8d46bf25610aa4650247617 ]
Commit 6ee1b77ba3 ("iommu/vt-d: Add svm/sva invalidate function")
introduced intel_iommu_sva_invalidate() when CONFIG_INTEL_IOMMU_SVM.
This function uses the dedicated static variable inv_type_granu_table
and functions to_vtd_granularity() and to_vtd_size().
These parts are unused when !CONFIG_INTEL_IOMMU_SVM, and hence,
make CC=clang W=1 warns with an -Wunused-function warning.
Include these parts conditionally on CONFIG_INTEL_IOMMU_SVM.
Fixes: 6ee1b77ba3 ("iommu/vt-d: Add svm/sva invalidate function")
Signed-off-by: Lukas Bulwahn <lukas.bulwahn@gmail.com>
Acked-by: Lu Baolu <baolu.lu@linux.intel.com>
Link: https://lore.kernel.org/r/20201115205951.20698-1-lukas.bulwahn@gmail.com
Signed-off-by: Will Deacon <will@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
- Fix interrupt table length definition for AMD IOMMU
-----BEGIN PGP SIGNATURE-----
iQFEBAABCgAuFiEEPxTL6PPUbjXGY88ct6xw3ITBYzQFAl/QzsMQHHdpbGxAa2Vy
bmVsLm9yZwAKCRC3rHDchMFjNHjZB/9PTSzOquLp/Tinu733rLJl96lpTpOtfR9Z
HIBljKmZt86sOrnfXCMEY+u2Yc1PiZiXZSt+oMF6UqilAfT6oisK9cK5xa3t7tFC
VGPsKSNzzYc8aruBLTfBEfwGxW8xd91U1iYHyWgq2IooBFHleEwGJXWohh0esJEd
cTWPAdWdxcrFk87VyqCeaUbo5516FBoM4449P2zlfWlW71lcCO7lu9LY34LBMTs0
KtQv/0nhV3Z1Ts5MBNWgm+Euof/TzhUiRUbmhakf5DCUmGVnFmV3mAuCHSRkgEf7
tAxz2khf236bz7y++pdilCxgRW3rqH3zCxjUu9d0cLQQz7HA+pFz
=EIqB
-----END PGP SIGNATURE-----
Merge tag 'iommu-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux
Pull iommu fix from Will Deacon:
"Fix interrupt table length definition for AMD IOMMU.
It's actually a fix for a fix, where the size of the interrupt
remapping table was increased but a related constant for the
size of the interrupt table was forgotten"
* tag 'iommu-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux:
iommu/amd: Set DTE[IntTabLen] to represent 512 IRTEs
According to the AMD IOMMU spec, the commit 73db2fc595
("iommu/amd: Increase interrupt remapping table limit to 512 entries")
also requires the interrupt table length (IntTabLen) to be set to 9
(power of 2) in the device table mapping entry (DTE).
Fixes: 73db2fc595 ("iommu/amd: Increase interrupt remapping table limit to 512 entries")
Reported-by: Jerry Snitselaar <jsnitsel@redhat.com>
Signed-off-by: Suravee Suthikulpanit <suravee.suthikulpanit@amd.com>
Reviewed-by: Jerry Snitselaar <jsnitsel@redhat.com>
Link: https://lore.kernel.org/r/20201207091920.3052-1-suravee.suthikulpanit@amd.com
Signed-off-by: Will Deacon <will@kernel.org>
- Fix intel iommu driver when running on devices without VCCAP_REG
- Fix swiotlb and "iommu=pt" interaction under TXT (tboot)
- Fix missing return value check during device probe()
- Fix probe ordering for Qualcomm SMMU implementation
- Ensure page-sized mappings are used for AMD IOMMU buffers with SNP RMP
-----BEGIN PGP SIGNATURE-----
iQFEBAABCgAuFiEEPxTL6PPUbjXGY88ct6xw3ITBYzQFAl/A3msQHHdpbGxAa2Vy
bmVsLm9yZwAKCRC3rHDchMFjNAI6B/9/MLjurPmQrSusq9Y7dhnGR7aahtICLAwR
UDZHebGTeVxNqtTklQVl/1qgcY7DxU40LAO1777MM7eHOW1FnlCAWrkTo6BBQsGx
U1FKegOJ/0eVHFPtFDfM7IA2skeLwlZW+hywNLAksme5mtd6iZG9yQLlDFqAjxL7
v8uXgfHFn6Z2MvMc2O+IeKTtflIwPek/6rYuaEf7UknA6ZYPAD3hnu9i1RTEuUAA
h2PoVrrJ/KefEsCMUIq2jwMTsSvxohDH8ClGK9b6h74J2CLKKuhALSgABRAmsEL9
7w5TVdtMQ85n3ccnXyT4RBQ+O/eVtmsKSdfeAbFI4nG9e9j7YE1L
=BI1u
-----END PGP SIGNATURE-----
Merge tag 'iommu-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux
Pull iommu fixes from Will Deacon:
"Here's another round of IOMMU fixes for -rc6 consisting mainly of a
bunch of independent driver fixes. Thomas agreed for me to take the
x86 'tboot' fix here, as it fixes a regression introduced by a vt-d
change.
- Fix intel iommu driver when running on devices without VCCAP_REG
- Fix swiotlb and "iommu=pt" interaction under TXT (tboot)
- Fix missing return value check during device probe()
- Fix probe ordering for Qualcomm SMMU implementation
- Ensure page-sized mappings are used for AMD IOMMU buffers with SNP
RMP"
* tag 'iommu-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux:
iommu/vt-d: Don't read VCCAP register unless it exists
x86/tboot: Don't disable swiotlb when iommu is forced on
iommu: Check return of __iommu_attach_device()
arm-smmu-qcom: Ensure the qcom_scm driver has finished probing
iommu/amd: Enforce 4k mapping for certain IOMMU data structures
My virtual IOMMU implementation is whining that the guest is reading a
register that doesn't exist. Only read the VCCAP_REG if the corresponding
capability is set in ECAP_REG to indicate that it actually exists.
Fixes: 3375303e82 ("iommu/vt-d: Add custom allocator for IOASID")
Signed-off-by: David Woodhouse <dwmw@amazon.co.uk>
Reviewed-by: Liu Yi L <yi.l.liu@intel.com>
Cc: stable@vger.kernel.org # v5.7+
Acked-by: Lu Baolu <baolu.lu@linux.intel.com>
Link: https://lore.kernel.org/r/de32b150ffaa752e0cff8571b17dfb1213fbe71c.camel@infradead.org
Signed-off-by: Will Deacon <will@kernel.org>
Currently iommu_create_device_direct_mappings() is called
without checking the return of __iommu_attach_device(). This
may result in failures in iommu driver if dev attach returns
error.
Fixes: ce574c27ae ("iommu: Move iommu_group_create_direct_mappings() out of iommu_group_add_device()")
Signed-off-by: Shameer Kolothum <shameerali.kolothum.thodi@huawei.com>
Link: https://lore.kernel.org/r/20201119165846.34180-1-shameerali.kolothum.thodi@huawei.com
Signed-off-by: Will Deacon <will@kernel.org>
Robin Murphy pointed out that if the arm-smmu driver probes before
the qcom_scm driver, we may call qcom_scm_qsmmu500_wait_safe_toggle()
before the __scm is initialized.
Now, getting this to happen is a bit contrived, as in my efforts it
required enabling asynchronous probing for both drivers, moving the
firmware dts node to the end of the dtsi file, as well as forcing a
long delay in the qcom_scm_probe function.
With those tweaks we ran into the following crash:
[ 2.631040] arm-smmu 15000000.iommu: Stage-1: 48-bit VA -> 48-bit IPA
[ 2.633372] Unable to handle kernel NULL pointer dereference at virtual address 0000000000000000
...
[ 2.633402] [0000000000000000] user address but active_mm is swapper
[ 2.633409] Internal error: Oops: 96000005 [#1] PREEMPT SMP
[ 2.633415] Modules linked in:
[ 2.633427] CPU: 5 PID: 117 Comm: kworker/u16:2 Tainted: G W 5.10.0-rc1-mainline-00025-g272a618fc36-dirty #3971
[ 2.633430] Hardware name: Thundercomm Dragonboard 845c (DT)
[ 2.633448] Workqueue: events_unbound async_run_entry_fn
[ 2.633456] pstate: 80c00005 (Nzcv daif +PAN +UAO -TCO BTYPE=--)
[ 2.633465] pc : qcom_scm_qsmmu500_wait_safe_toggle+0x78/0xb0
[ 2.633473] lr : qcom_smmu500_reset+0x58/0x78
[ 2.633476] sp : ffffffc0105a3b60
...
[ 2.633567] Call trace:
[ 2.633572] qcom_scm_qsmmu500_wait_safe_toggle+0x78/0xb0
[ 2.633576] qcom_smmu500_reset+0x58/0x78
[ 2.633581] arm_smmu_device_reset+0x194/0x270
[ 2.633585] arm_smmu_device_probe+0xc94/0xeb8
[ 2.633592] platform_drv_probe+0x58/0xa8
[ 2.633597] really_probe+0xec/0x398
[ 2.633601] driver_probe_device+0x5c/0xb8
[ 2.633606] __driver_attach_async_helper+0x64/0x88
[ 2.633610] async_run_entry_fn+0x4c/0x118
[ 2.633617] process_one_work+0x20c/0x4b0
[ 2.633621] worker_thread+0x48/0x460
[ 2.633628] kthread+0x14c/0x158
[ 2.633634] ret_from_fork+0x10/0x18
[ 2.633642] Code: a9034fa0 d0007f73 29107fa0 91342273 (f9400020)
To avoid this, this patch adds a check on qcom_scm_is_available() in
the qcom_smmu_impl_init() function, returning -EPROBE_DEFER if its
not ready.
This allows the driver to try to probe again later after qcom_scm has
finished probing.
Reported-by: Robin Murphy <robin.murphy@arm.com>
Signed-off-by: John Stultz <john.stultz@linaro.org>
Reviewed-by: Robin Murphy <robin.murphy@arm.com>
Cc: Robin Murphy <robin.murphy@arm.com>
Cc: Will Deacon <will@kernel.org>
Cc: Andy Gross <agross@kernel.org>
Cc: Maulik Shah <mkshah@codeaurora.org>
Cc: Bjorn Andersson <bjorn.andersson@linaro.org>
Cc: Saravana Kannan <saravanak@google.com>
Cc: Marc Zyngier <maz@kernel.org>
Cc: Lina Iyer <ilina@codeaurora.org>
Cc: iommu@lists.linux-foundation.org
Cc: linux-arm-msm <linux-arm-msm@vger.kernel.org>
Link: https://lore.kernel.org/r/20201112220520.48159-1-john.stultz@linaro.org
Signed-off-by: Will Deacon <will@kernel.org>