Including (in no particular order):
- Page table code for AMD IOMMU now supports large pages where
smaller page-sizes were mapped before. VFIO had to work around
that in the past and I included a patch to remove it (acked by
Alex Williamson)
- Patches to unmodularize a couple of IOMMU drivers that would
never work as modules anyway.
- Work to unify the the iommu-related pointers in
'struct device' into one pointer. This work is not finished
yet, but will probably be in the next cycle.
- NUMA aware allocation in iommu-dma code
- Support for r8a774a1 and r8a774c0 in the Renesas IOMMU driver
- Scalable mode support for the Intel VT-d driver
- PM runtime improvements for the ARM-SMMU driver
- Support for the QCOM-SMMUv2 IOMMU hardware from Qualcom
- Various smaller fixes and improvements
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v2
iQIcBAABAgAGBQJcKkEoAAoJECvwRC2XARrjCCoQAJxsgaAF5Z0s7z8j2A9SkaGp
SIMnUAI5mDOdyhTOAI+eehpRzg5UVYt/JjFYnHz8HWqbSc8YOvDvHafmhMFIwYvO
hq5knbs6ns2jJNFO+M4dioDq+3THdqkGIF5xoHdGTP7cn9+XyQ8lAoHo0RuL122U
PJGqX7Cp4XnFP4HMb3uQYhVeBV7mU+XqAdB+4aDnQkzI5LkQCRr74GcqOm+Rlnyc
cmQWc2arUMjgc1TJIrex8dx9dT6lq8kOmhyEg/IjHeGaZyJ3HqA+30XDDLEExN0G
MeVawuxJz40HgXlkXr+iZTQtIFYkXdKvJH6rptMbOfbDeDz+YZ01TbtAMMH9o4jX
yxjjMjdcWTsWYQ/MHHdsoMP34cajCi/EYPMNksbycw+E3Y+X/bSReCoWC0HUK8/+
Z4TpZ9mZVygtJR+QNZ+pE9oiJpb4sroM10zTnbMoVHNnvfsO01FYk7FMPkolSKLw
zB4MDswQYgchoFR9Z4ZB4PycYTzeafLKYgDPDoD1vIJgDavuidwvDWDRTDc+aMWM
siIIewq19To9jDJkVjX4dsT/p99KVKgAR/Ps6jjWkAroha7g6GcmlYZHIJnyop04
jiaSXUsk8aRucP/CRz5xdMmaGoN7BsNmpUjcrquc6Povk/6gvXvpY04oCs1+gNMX
ipL9E3GTFCVBubRFrksv
=DT9A
-----END PGP SIGNATURE-----
Merge tag 'iommu-updates-v4.21' of git://git.kernel.org/pub/scm/linux/kernel/git/joro/iommu
Pull IOMMU updates from Joerg Roedel:
- Page table code for AMD IOMMU now supports large pages where smaller
page-sizes were mapped before. VFIO had to work around that in the
past and I included a patch to remove it (acked by Alex Williamson)
- Patches to unmodularize a couple of IOMMU drivers that would never
work as modules anyway.
- Work to unify the the iommu-related pointers in 'struct device' into
one pointer. This work is not finished yet, but will probably be in
the next cycle.
- NUMA aware allocation in iommu-dma code
- Support for r8a774a1 and r8a774c0 in the Renesas IOMMU driver
- Scalable mode support for the Intel VT-d driver
- PM runtime improvements for the ARM-SMMU driver
- Support for the QCOM-SMMUv2 IOMMU hardware from Qualcom
- Various smaller fixes and improvements
* tag 'iommu-updates-v4.21' of git://git.kernel.org/pub/scm/linux/kernel/git/joro/iommu: (78 commits)
iommu: Check for iommu_ops == NULL in iommu_probe_device()
ACPI/IORT: Don't call iommu_ops->add_device directly
iommu/of: Don't call iommu_ops->add_device directly
iommu: Consolitate ->add/remove_device() calls
iommu/sysfs: Rename iommu_release_device()
dmaengine: sh: rcar-dmac: Use device_iommu_mapped()
xhci: Use device_iommu_mapped()
powerpc/iommu: Use device_iommu_mapped()
ACPI/IORT: Use device_iommu_mapped()
iommu/of: Use device_iommu_mapped()
driver core: Introduce device_iommu_mapped() function
iommu/tegra: Use helper functions to access dev->iommu_fwspec
iommu/qcom: Use helper functions to access dev->iommu_fwspec
iommu/of: Use helper functions to access dev->iommu_fwspec
iommu/mediatek: Use helper functions to access dev->iommu_fwspec
iommu/ipmmu-vmsa: Use helper functions to access dev->iommu_fwspec
iommu/dma: Use helper functions to access dev->iommu_fwspec
iommu/arm-smmu: Use helper functions to access dev->iommu_fwspec
ACPI/IORT: Use helper functions to access dev->iommu_fwspec
iommu: Introduce wrappers around dev->iommu_fwspec
...
A huge update this time, but a lot of that is just consolidating or
removing code:
- provide a common DMA_MAPPING_ERROR definition and avoid indirect
calls for dma_map_* error checking
- use direct calls for the DMA direct mapping case, avoiding huge
retpoline overhead for high performance workloads
- merge the swiotlb dma_map_ops into dma-direct
- provide a generic remapping DMA consistent allocator for architectures
that have devices that perform DMA that is not cache coherent. Based
on the existing arm64 implementation and also used for csky now.
- improve the dma-debug infrastructure, including dynamic allocation
of entries (Robin Murphy)
- default to providing chaining scatterlist everywhere, with opt-outs
for the few architectures (alpha, parisc, most arm32 variants) that
can't cope with it
- misc sparc32 dma-related cleanups
- remove the dma_mark_clean arch hook used by swiotlb on ia64 and
replace it with the generic noncoherent infrastructure
- fix the return type of dma_set_max_seg_size (Niklas Söderlund)
- move the dummy dma ops for not DMA capable devices from arm64 to
common code (Robin Murphy)
- ensure dma_alloc_coherent returns zeroed memory to avoid kernel data
leaks through userspace. We already did this for most common
architectures, but this ensures we do it everywhere.
dma_zalloc_coherent has been deprecated and can hopefully be
removed after -rc1 with a coccinelle script.
-----BEGIN PGP SIGNATURE-----
iQI/BAABCgApFiEEgdbnc3r/njty3Iq9D55TZVIEUYMFAlwctQgLHGhjaEBsc3Qu
ZGUACgkQD55TZVIEUYMxgQ//dBpAfS4/J76CdAbYry2zqgcOUU9hIrD6NHiEMWov
ltJxyvEl3LsUmIdEj3aCrYL9jZN0qsnCzn5BVj2c3jDIVgD64fAr7HDf/PbEEfKb
j6/GgEnVLPZV+sQMvhNA5jOzHrkseaqPa4/pNLFZ/l8jnuZ2d+btusDWJpMoVDer
TXVwtIfgeIu0gTygYOShLYXd5qptWKWsZEpbTZOO2sE6+x+ZJX7yQYUxYDTlcOIj
JWVO2l5QNHPc5T9o2at+6L5aNUvnZOxT79sWgyZLn0Kc+FagKAVwfLqUEl0v7foG
8k/xca5/8p3afB1DfrIrtplJqis7cVgdyGxriwuuoO8X4F0nPyWwpGmxsBhrWwwl
xTqC4UorEJ7QwoP6Azopk/vYI2QXIUBLjuCJCuFXZj9+2BGf4IfvBY1S2cLM9qLs
HMcxQonuXJii044KEFS96ePEuiT+igVINweIFBKWcgNCEG0UQtyL6RQ1U5297ipF
JiWZAqD+p9X52UdKS+oKfAiZEekMXn6Xyo97+YCiNpfOo0GP5eEcwhL+JpY4AiRq
apPXtsRy2o1s8yfjdraUIM2Mc2n62vFKb35oUbGCd/QO9piPrFQHl6T0HHcHk4YR
XrUXcHieFZBCYqh7ZVa4RL8Msq1wvGuTL4Dxl43mXdsMoUFRR6eSNWLoAV4IpOLZ
WgA=
=in72
-----END PGP SIGNATURE-----
Merge tag 'dma-mapping-4.21' of git://git.infradead.org/users/hch/dma-mapping
Pull DMA mapping updates from Christoph Hellwig:
"A huge update this time, but a lot of that is just consolidating or
removing code:
- provide a common DMA_MAPPING_ERROR definition and avoid indirect
calls for dma_map_* error checking
- use direct calls for the DMA direct mapping case, avoiding huge
retpoline overhead for high performance workloads
- merge the swiotlb dma_map_ops into dma-direct
- provide a generic remapping DMA consistent allocator for
architectures that have devices that perform DMA that is not cache
coherent. Based on the existing arm64 implementation and also used
for csky now.
- improve the dma-debug infrastructure, including dynamic allocation
of entries (Robin Murphy)
- default to providing chaining scatterlist everywhere, with opt-outs
for the few architectures (alpha, parisc, most arm32 variants) that
can't cope with it
- misc sparc32 dma-related cleanups
- remove the dma_mark_clean arch hook used by swiotlb on ia64 and
replace it with the generic noncoherent infrastructure
- fix the return type of dma_set_max_seg_size (Niklas Söderlund)
- move the dummy dma ops for not DMA capable devices from arm64 to
common code (Robin Murphy)
- ensure dma_alloc_coherent returns zeroed memory to avoid kernel
data leaks through userspace. We already did this for most common
architectures, but this ensures we do it everywhere.
dma_zalloc_coherent has been deprecated and can hopefully be
removed after -rc1 with a coccinelle script"
* tag 'dma-mapping-4.21' of git://git.infradead.org/users/hch/dma-mapping: (73 commits)
dma-mapping: fix inverted logic in dma_supported
dma-mapping: deprecate dma_zalloc_coherent
dma-mapping: zero memory returned from dma_alloc_*
sparc/iommu: fix ->map_sg return value
sparc/io-unit: fix ->map_sg return value
arm64: default to the direct mapping in get_arch_dma_ops
PCI: Remove unused attr variable in pci_dma_configure
ia64: only select ARCH_HAS_DMA_COHERENT_TO_PFN if swiotlb is enabled
dma-mapping: bypass indirect calls for dma-direct
vmd: use the proper dma_* APIs instead of direct methods calls
dma-direct: merge swiotlb_dma_ops into the dma_direct code
dma-direct: use dma_direct_map_page to implement dma_direct_map_sg
dma-direct: improve addressability error reporting
swiotlb: remove dma_mark_clean
swiotlb: remove SWIOTLB_MAP_ERROR
ACPI / scan: Refactor _CCA enforcement
dma-mapping: factor out dummy DMA ops
dma-mapping: always build the direct mapping code
dma-mapping: move dma_cache_sync out of line
dma-mapping: move various slow path functions out of line
...
Avoid expensive indirect calls in the fast path DMA mapping
operations by directly calling the dma_direct_* ops if we are using
the directly mapped DMA operations.
Signed-off-by: Christoph Hellwig <hch@lst.de>
Acked-by: Jesper Dangaard Brouer <brouer@redhat.com>
Tested-by: Jesper Dangaard Brouer <brouer@redhat.com>
Tested-by: Tony Luck <tony.luck@intel.com>
With the new validation code, a malicious user-space app could
potentially submit command streams with enough buffer-object and resource
references in them to have the resulting allocated validion nodes and
relocations make the kernel run out of GFP_KERNEL memory.
Protect from this by having the validation code reserve TTM graphics
memory when allocating.
Signed-off-by: Thomas Hellstrom <thellstrom@vmware.com>
Reviewed-by: Deepak Rawat <drawat@vmware.com>
---
v2: Removed leftover debug printouts
[airlied: make etnaviv build again]
amdgpu:
- DC trace support
- More DC documentation
- XGMI hive reset support
- Rework IH interaction with KFD
- Misc fixes and cleanups
- Powerplay updates for newer polaris variants
- Add cursor plane update fast path
- Enable gpu reset by default on CI parts
- Fix config with KFD/HSA not enabled
amdkfd:
- Limit vram overcommit
- dmabuf support
- Support for doorbell BOs
ttm:
- Support for simultaneous submissions to multiple engines
scheduler:
- Add helpers for hw with preemption support
Signed-off-by: Dave Airlie <airlied@redhat.com>
From: Alex Deucher <alexdeucher@gmail.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20181207233119.16861-1-alexander.deucher@amd.com
The return statement is redundant as there is a return statement
immediately before it so we have dead code that can be removed.
Also remove the unused declaration of ret.
Detected by CoverityScan, CID#1473793 ("Structurally dead code")
Signed-off-by: Colin Ian King <colin.king@canonical.com>
Reviewed-by: Thomas Hellstrom <thellstrom@vmware.com>
Signed-off-by: Thomas Hellstrom <thellstrom@vmware.com>
Let's support simultaneous submissions to multiple engines.
v2: rename the field to num_shared and fix up all users
v3: rebased
Signed-off-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>
Reviewed-by: Junwei Zhang <Jerry.Zhang@amd.com>
Reviewed-by: Huang Rui <ray.huang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Our wrappers don't do anything useful anymore except calling the
atomic helpers.
Signed-off-by: Thomas Hellstrom <thellstrom@vmware.com>
Reviewed-by: Brian Paul <brianp@vmware.com>
Reviewed-by: Sinclair Yeh <syeh@vmware.com>
Fixes gcc '-Wunused-but-set-variable' warning:
drivers/gpu/drm/vmwgfx/vmwgfx_fence.c: In function 'vmw_event_fence_action_seq_passed':
drivers/gpu/drm/vmwgfx/vmwgfx_fence.c:909:19: warning:
variable 'file_priv' set but not used [-Wunused-but-set-variable]
struct drm_file *file_priv;
It not used any more since
commit fb740cf249 ("drm: Create drm_send_event helpers")
Signed-off-by: YueHaibing <yuehaibing@huawei.com>
Reviewed-by: Sinclair Yeh <syeh@vmware.com>
Signed-off-by: Thomas Hellstrom <thellstrom@vmware.com>
The return statement is redundant as there is a return statement
immediately before it so we have dead code that can be removed.
Also remove the unused declaration of ret.
Detected by CoverityScan, CID#1473793 ("Structurally dead code")
Signed-off-by: Colin Ian King <colin.king@canonical.com>
Reviewed-by: Sinclair Yeh <syeh@vmware.com>
Signed-off-by: Thomas Hellstrom <thellstrom@vmware.com>
This fixes a layout update race condition. We make sure
the crtc mutex is locked before we dereference crtc->state. Otherwise the
state might change under us.
Since now we're already holding the crtc mutexes when reading the gui
coordinates, protect them with the crtc mutexes rather than with the
requested_layout mutex.
Signed-off-by: Thomas Hellstrom <thellstrom@vmware.com>
Reviewed-by: Deepak Rawat <drawat@vmware.com>
Reviewed-by: Sinclair Yeh <syeh@vmware.com>
Make the connector is_implicit property immutable.
As far as we know, no user-space application is writing to it.
Also move the verification that all implicit display units scan out
from the same framebuffer to atomic_check().
Signed-off-by: Thomas Hellstrom <thellstrom@vmware.com>
Reviewed-by: Sinclair Yeh <syeh@vmware.com>
Reviewed-by: Deepak Rawat <drawat@vmware.com>
With kernel commit "drm/modes: Kill off the oddball DRM_MODE_TYPE_CRTC_C
vs. DRM_MODE_TYPE_BUILTIN handling", no need to clear mode::type for
user-space bug.
Signed-off-by: Deepak Rawat <drawat@vmware.com>
Reviewed-by: Thomas Hellstrom <thellstrom@vmware.com>
Signed-off-by: Thomas Hellstrom <thellstrom@vmware.com>
USe new atomic helper for dirty fb IOCTL which make use of damage
interface. Note that this is only done for STDU and SOU, for legacy
display unit still using old interface.
Signed-off-by: Deepak Rawat <drawat@vmware.com>
Reviewed-by: Thomas Hellstrom <thellstrom@vmware.com>
Signed-off-by: Thomas Hellstrom <thellstrom@vmware.com>
SOU primary plane now support damage clips, enable it for user-space.
Signed-off-by: Deepak Rawat <drawat@vmware.com>
Reviewed-by: Thomas Hellstrom <thellstrom@vmware.com>
Signed-off-by: Thomas Hellstrom <thellstrom@vmware.com>
Update comments to sync with code.
Signed-off-by: Deepak Rawat <drawat@vmware.com>
Reviewed-by: Thomas Hellstrom <thellstrom@vmware.com>
Signed-off-by: Thomas Hellstrom <thellstrom@vmware.com>
With new interface to do plane update on SOU available, use that instead
of old kms_dirty.
Signed-off-by: Deepak Rawat <drawat@vmware.com>
Reviewed-by: Thomas Hellstrom <thellstrom@vmware.com>
Signed-off-by: Thomas Hellstrom <thellstrom@vmware.com>
Using the new interface implement SOU plane update for BO backed fb.
v2: Rebase to new resource validation.
Signed-off-by: Deepak Rawat <drawat@vmware.com>
Reviewed-by: Thomas Hellstrom <thellstrom@vmware.com>
Signed-off-by: Thomas Hellstrom <thellstrom@vmware.com>
Using the new interface implement SOU plane update for surface backed
fb.
v2: Rebase to new resource validation.
Signed-off-by: Deepak Rawat <drawat@vmware.com>
Reviewed-by: Thomas Hellstrom <thellstrom@vmware.com>
Signed-off-by: Thomas Hellstrom <thellstrom@vmware.com>
STDU primary plane now support damage clips, enable it for user-space.
Signed-off-by: Deepak Rawat <drawat@vmware.com>
Reviewed-by: Thomas Hellstrom <thellstrom@vmware.com>
Signed-off-by: Thomas Hellstrom <thellstrom@vmware.com>
Update the comments to sync with code.
Signed-off-by: Deepak Rawat <drawat@vmware.com>
Reviewed-by: Thomas Hellstrom <thellstrom@vmware.com>
Signed-off-by: Thomas Hellstrom <thellstrom@vmware.com>
With new interface to do plane update on STDU available, use that
instead of old kms_dirty.
v2: Use fence from new resource validation.
Signed-off-by: Deepak Rawat <drawat@vmware.com>
Reviewed-by: Thomas Hellstrom <thellstrom@vmware.com>
Signed-off-by: Thomas Hellstrom <thellstrom@vmware.com>
Using the new interface implement STDU plane update for BO backed fb.
v2: Rebase to new resource validation.
Signed-off-by: Deepak Rawat <drawat@vmware.com>
Reviewed-by: Thomas Hellstrom <thellstrom@vmware.com>
Signed-off-by: Thomas Hellstrom <thellstrom@vmware.com>
Using the new interface implement STDU plane update for surface backed
fb.
v2: Rebase to new resource validation.
Signed-off-by: Deepak Rawat <drawat@vmware.com>
Reviewed-by: Thomas Hellstrom <thellstrom@vmware.com>
Signed-off-by: Thomas Hellstrom <thellstrom@vmware.com>
Add a new struct vmw_du_update_plane similar to vmw_kms_dirty which
represent the flow of operations needed to update a display unit from
surface or bo (blit a new framebuffer).
v2:
- Kernel doc correction.
- Rebase.
v3: Rebase to new resource validation.
Signed-off-by: Deepak Rawat <drawat@vmware.com>
Reviewed-by: Thomas Hellstrom <thellstrom@vmware.com>
Signed-off-by: Thomas Hellstrom <thellstrom@vmware.com>
New features for 4.21:
amdgpu:
- Support for SDMA paging queue on vega
- Put compute EOP buffers into vram for better performance
- Share more code with amdkfd
- Support for scanout with DCC on gfx9
- Initial kerneldoc for DC
- Updated SMU firmware support for gfx8 chips
- Rework CSA handling for eventual support for preemption
- XGMI PSP support
- Clean up RLC handling
- Enable GPU reset by default on VI, SOC15 dGPUs
- Ring and IB test cleanups
amdkfd:
- Share more code with amdgpu
ttm:
- Move global init out of the drivers
scheduler:
- Track if schedulers are ready for work
- Timeout/fault handling changes to facilitate GPU recovery
Signed-off-by: Dave Airlie <airlied@redhat.com>
From: Alex Deucher <alexdeucher@gmail.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20181114165113.3751-1-alexander.deucher@amd.com
Commit e61d98d8da ("x64, x2apic/intr-remap: Intel vt-d, IOMMU
code reorganization") moved dma_remapping.h from drivers/pci/ to
current place. It is entirely VT-d specific, but uses a generic
name. This merges dma_remapping.h with include/linux/intel-iommu.h
and removes dma_remapping.h as the result.
Cc: Ashok Raj <ashok.raj@intel.com>
Cc: Jacob Pan <jacob.jun.pan@linux.intel.com>
Cc: Sohil Mehta <sohil.mehta@intel.com>
Suggested-by: Christoph Hellwig <hch@infradead.org>
Signed-off-by: Lu Baolu <baolu.lu@linux.intel.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Liu, Yi L <yi.l.liu@intel.com>
Signed-off-by: Joerg Roedel <jroedel@suse.de>
Make sure that the global BO state is always correctly initialized.
This allows removing all the device code to initialize it.
v2: fix up vbox (Alex)
Signed-off-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Junwei Zhang <Jerry.Zhang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
As the name says we only need one global instance of ttm_mem_global.
Drop all the driver initialization and just use a single exported
instance which is initialized during BO global initialization.
Signed-off-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Junwei Zhang <Jerry.Zhang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
The functions ttm_bo_global_init() and ttm_bo_global_release() do not
receive an argument of type struct ttm_bo_global. Both take a struct
drm_global_reference that contains points to a struct ttm_bo_global_ref.
Renaming them reflects this.
Signed-off-by: Thomas Zimmermann <tzimmermann@suse.de>
Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
4.19 is out, Lyude asked for a backmerge, and it's been a while. All
very good reasons on their own :-)
Signed-off-by: Sean Paul <seanpaul@chromium.org>
Use the correct helper and also return early on helper
success rather than on helper failure.
Also explicitly return 0 in the case of no fb.
v2: Check for !fb after updating state->visible (Ville).
Reviewed-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Signed-off-by: Thomas Hellstrom <thellstrom@vmware.com> (v1)
Reported-by: Dan Carpenter <dan.carpenter@oracle.com>
Reported-by: Daniel Vetter <daniel@ffwll.ch>
Cc: Ville Syrjälä <ville.syrjala@linux.intel.com>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Cc: VMware Graphics <linux-graphics-maintainer@vmware.com>
Cc: Sinclair Yeh <syeh@vmware.com>
Cc: Thomas Hellstrom <thellstrom@vmware.com>
Acked-by: Thomas Hellstrom <thellstrom@vmware.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20181004202446.22905-17-daniel.vetter@ffwll.ch
It's the default. The exported version was kinda a transition state,
before we made this the default.
To stop new atomic drivers from using it (instead of just relying on
the default) let's unexport it.
v2: rename the default implementation to a more fitting name and add a
comment (Laurent)
Signed-off-by: Daniel Vetter <daniel.vetter@intel.com>
Cc: Gustavo Padovan <gustavo@padovan.org>
Cc: Maarten Lankhorst <maarten.lankhorst@linux.intel.com>
Cc: Sean Paul <seanpaul@chromium.org>
Cc: David Airlie <airlied@linux.ie>
Cc: VMware Graphics <linux-graphics-maintainer@vmware.com>
Cc: Sinclair Yeh <syeh@vmware.com>
Cc: Thomas Hellstrom <thellstrom@vmware.com>
Cc: Archit Taneja <architt@codeaurora.org>
Cc: Neil Armstrong <narmstrong@baylibre.com>
Cc: Laurent Pinchart <laurent.pinchart@ideasonboard.com>
Cc: Hans Verkuil <hverkuil@xs4all.nl>
Cc: Daniel Vetter <daniel.vetter@ffwll.ch>
Cc: Russell King <rmk+kernel@armlinux.org.uk>
Cc: Jernej Skrabec <jernej.skrabec@siol.net>
Cc: Jani Nikula <jani.nikula@intel.com>
Cc: Pierre-Hugues Husson <phh@phh.me>
Cc: Fabio Estevam <fabio.estevam@nxp.com>
Reviewed-by: Laurent Pinchart <laurent.pinchart@ideasonboard.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20181004202446.22905-3-daniel.vetter@ffwll.ch
Mostly code reorganizations and optimizations for vmwgfx.
- Move TTM code that's only used by vmwgfx to vmwgfx
- Break out the vmwgfx buffer- and resource validation code to a separate source file
- Get rid of a number of atomic operations during command buffer validation.
From: Thomas Hellstrom <thellstrom@vmware.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20180928131157.2810-1-thellstrom@vmware.com
Make the process of looking up a user resource and adding it to the
validation list reference-free unless when it's actually added to the
validation list where a single reference is taken.
This saves two locked atomic operations per command stream buffer object
handle lookup, unless there is a lookup cache hit.
Signed-off-by: Thomas Hellstrom <thellstrom@vmware.com>
Reviewed-by: Sinclair Yeh <syeh@vmware.com>
The typical pattern of these lookups are
-Lookup
-Put on validate list if not already there.
-Unreference
And since we are the exclusive user of the context during lookup time,
we can be sure that the resource will stay alive during the sequence.
So avoid taking a reference during lookup, and also avoid unreferencing
when done. There are two users outside of command buffer validation and
those are refcounted explicitly.
Signed-off-by: Thomas Hellstrom <thellstrom@vmware.com>
Reviewed-by: Sinclair Yeh <syeh@vmware.com>
The typical pattern of these lookups are
-Lookup
-Put on validate list if not already there.
-Unreference
And since we are the exclusive user of the context during lookup time,
we can be sure that the resource will stay alive during the sequence.
So avoid taking a reference during lookup, and also avoid unreferencing
when done.
Signed-off-by: Thomas Hellstrom <thellstrom@vmware.com>
Reviewed-by: Sinclair Yeh <syeh@vmware.com>
Make the process of looking up a buffer object and adding it to the
validation list reference-free unless when it's actually added to the
validation list where a single reference is taken.
This saves two locked atomic operations per command stream buffer object
handle lookup.
Signed-off-by: Thomas Hellstrom <thellstrom@vmware.com>
Reviewed-by: Sinclair Yeh <syeh@vmware.com>
Identically to how we look up ttm base objects witout reference, provide
the same functionality to vmw user buffer objects which derive from them.
Signed-off-by: Thomas Hellstrom <thellstrom@vmware.com>
Reviewed-by: Sinclair Yeh <syeh@vmware.com>
Adapt the validation code so that vmw_validation_add[res|bo] can be called
under an rcu read lock (non-sleeping) and with rcu-only protected resource-
or buffer object pointers.
Signed-off-by: Thomas Hellstrom <thellstrom@vmware.com>
Reviewed-by: Sinclair Yeh <syeh@vmware.com>
Typically when we look up objects under the rcu lock, we take a reference
to make sure the returned object pointer is valid.
Now provide a function to look up an object and instead of taking a
reference to it, keep the rcu lock held when returning the object pointer.
This means that the object pointer is valid as long as the rcu lock is
held, but the object may be doomed (its refcount may be zero). Any
persistent usage of the object pointer outside of the rcu lock requires
a reference to be taken using kref_get_unless_zero().
Signed-off-by: Thomas Hellstrom <thellstrom@vmware.com>
Reviewed-by: Sinclair Yeh <syeh@vmware.com>
Instead of generating user-space object handles based on a, possibly
processed, hash of the kernel address of the object, use idr to generate
and lookup those handles. This might improve somewhat on security since
we loose all connections to the object's kernel address. Also idr is
designed to do just this.
As a todo-item, since user-space handles are now generated in sequence,
we can probably use a much simpler hash function to hash them.
Signed-off-by: Thomas Hellstrom <thellstrom@vmware.com>
Reviewed-by: Sinclair Yeh <syeh@vmware.com>
Reviewed-by: Deepak Rawat <drawat@vmware.com>
We were checking that the resource destructor matched that of the
intended object type, to make sure the looked up resource was of the
right type.
But we already have an object type check in place which makes sure the
resource is of the right type.
Signed-off-by: Thomas Hellstrom <thellstrom@vmware.com>
Reviewed-by: Sinclair Yeh <syeh@vmware.com>
Reviewed-by: Deepak Rawat <drawat@vmware.com>
This field was previously used to prevent a lookup of a resource before its
constructor had run to its end. This was mainly intended for an interface
that is now removed that allowed looking up a resource by its device id.
Currently all affected resources are added to the lookup mechanism (its
TTM prime object is initialized) late in the constructor where it's OK to
look up the resource.
This means we can change the device resource_lock to an ordinary spinlock
instead of an rwlock and remove a locking sequence during lookup.
Signed-off-by: Thomas Hellstrom <thellstrom@vmware.com>
Reviewed-by: Sinclair Yeh <syeh@vmware.com>
Reviewed-by: Deepak Rawat <drawat@vmware.com>
Replace instances of WARN_ON[_ONCE](!mutex_is_held()) with
lockdep_assert_held(). This makes sure the checking process actually
holds the mutex and also removes the checks from release builds
Signed-off-by: Thomas Hellstrom <thellstrom@vmware.com>
Reviewed-by: Sinclair Yeh <syeh@vmware.com>
Reviewed-by: Deepak Rawat <drawat@vmware.com>
With the new allocator this leads to less consumed memory for each
user-space command submission
Signed-off-by: Thomas Hellstrom <thellstrom@vmware.com>
Reviewed-by: Deepak Rawat <drawat@vmware.com>
A common trait of these objects are that they are allocated during the
command validation phase and freed after command submission. Furthermore
they are accessed by a single thread only. So provide a simple unprotected
stack-like allocator from which these objects can be allocated. Their
memory is freed with the validation context when the command submission
is done.
Note that the mm subsystem maintains a per-cpu cache of single pages to
make single page allocation and freeing efficient.
Signed-off-by: Thomas Hellstrom <thellstrom@vmware.com>
Reviewed-by: Sinclair Yeh <syeh@vmware.com>
Strip the old KMS helpers and use the new validation interface also in
the modesetting code.
Signed-off-by: Thomas Hellstrom <thellstrom@vmware.com>
Reviewed-by: Deepak Rawat <drawat@vmware.com> #v1
Reviewed-by: Sinclair Yeh <syeh@vmware.com>