At one point, the GPU command verifier and user-space handle manager
couldn't properly protect GPU clients from accessing each other's data.
Instead there was an elaborate mechanism to make sure only the active
master's primary clients could render. The other clients were either
put to sleep or even killed (if the master had exited). VRAM was
evicted on master switch. With the advent of render-node functionality,
we relaxed the VRAM eviction, but the other mechanisms stayed in place.
Now that the GPU command verifier and ttm object manager properly
isolate primary clients from different master realms we can remove the
master switch related code and drop those legacy features.
Signed-off-by: Thomas Hellstrom <thellstrom@vmware.com>
Reviewed-by: Brian Paul <brianp@vmware.com>
Reviewed-by: Deepak Rawat <drawat@vmware.com>
Acked-by: Emil Velikov <emil.velikov@collabora.com>
Add the callbacks necessary to implement emulated coherent memory for
surfaces. Add a flag to the gb_surface_create ioctl to indicate that
surface memory should be coherent.
Also bump the drm minor version to signal the availability of coherent
surfaces.
Signed-off-by: Thomas Hellstrom <thellstrom@vmware.com>
Reviewed-by: Deepak Rawat <drawat@vmware.com>
Similar to write-coherent resources, make sure that from the user-space
point of view, GPU rendered contents is automatically available for
reading by the CPU.
Signed-off-by: Thomas Hellstrom <thellstrom@vmware.com>
Reviewed-by: Deepak Rawat <drawat@vmware.com>
With emulated coherent memory we need to be able to quickly look up
a resource from the MOB offset. Instead of traversing a linked list with
O(n) worst case, use an RBtree with O(log n) worst case complexity.
Signed-off-by: Thomas Hellstrom <thellstrom@vmware.com>
Reviewed-by: Deepak Rawat <drawat@vmware.com>
This infrastructure will, for coherent resources, make sure that
from the user-space point of view, data written by the CPU is immediately
automatically available to the GPU at resource validation time.
Signed-off-by: Thomas Hellstrom <thellstrom@vmware.com>
Reviewed-by: Deepak Rawat <drawat@vmware.com>
TTM provides a means to assign eviction priorities to buffer object. This
means that all buffer objects with a lower priority will be evicted first
on memory pressure.
Use this to make sure surfaces and in particular non-dirty surfaces are
evicted first. Evicting in particular shaders, cotables and contexts imply
a significant performance hit on vmwgfx, so make sure these resources are
evicted last.
Some buffer objects are sub-allocated in user-space which means we can have
many resources attached to a single buffer object or resource. In that case
the buffer object is given the highest priority of the attached resources.
Signed-off-by: Thomas Hellstrom <thellstrom@vmware.com>
Reviewed-by: Deepak Rawat <drawat@vmware.com>
Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
Use struct sg_dma_page_iter in favour struct of sg_page_iter, which fairly
recently was declared useless for obtaining dma addresses.
With a struct sg_dma_page_iter we can't call sg_page_iter_page() so
when the page is needed, use the same page lookup mechanism as for the
non-sg dma modes instead of calling sg_dma_page_iter.
Note, the fixes tag doesn't really point to a commit introducing a
failure / regression, but rather to a commit that implemented a simple
workaround for this problem.
Cc: Jason Gunthorpe <jgg@mellanox.com>
Fixes: d901b2760d ("lib/scatterlist: Provide a DMA page iterator")
Signed-off-by: Thomas Hellstrom <thellstrom@vmware.com>
Reviewed-by: Jason Gunthorpe <jgg@mellanox.com>
Whenever FIFO allocation fails an error message is printed to dmesg.
Since this is common operation a lot of similar messages are scattered
everywhere. Use preprocessor macro to remove this cluttering.
Signed-off-by: Deepak Rawat <drawat@vmware.com>
Reviewed-by: Thomas Hellstrom <thellstrom@vmware.com>
Error messages or debugging message reported during user-space command
submission should not be printed to dmesg by default. So add a new
preprocessor define called VMW_DEBUG_USER which translates to
DRM_DEBUG_DRIVER.
v2: Use VMW_DEBUG_USER instead of using DRM_DEBUG_DRIVER directly.
Signed-off-by: Deepak Rawat <drawat@vmware.com>
Reviewed-by: Thomas Hellstrom <thellstrom@vmware.com>
Currently we flag resources as dirty (GPU contents not yet read back to
the backing MOB) whenever they have been part of a command stream.
Obviously many resources can't be dirty and others can only be dirty when
written to by the GPU. That is when they are either bound to the context as
render-targets, depth-stencil, copy / clear destinations and
stream-output targets, or similarly when there are corresponding views into
them.
So mark resources dirty only in these special cases. Context- and cotable
resources are always marked dirty when referenced.
This is important for upcoming emulated coherent memory, since we can avoid
issuing automatic readbacks to non-dirty resources when the CPU tries to
access part of the backing MOB.
Testing: Unigine Heaven with max GPU memory set to 256MB resulting in
heavy resource thrashing.
---
v2: Addressed review comments by Deepak Rawat.
v3: Added some documentation
Signed-off-by: Thomas Hellstrom <thellstrom@vmware.com>
Reviewed-by: Deepak Rawat <drawat@vmware.com>
Most TTM drivers define the constant DRM_FILE_PAGE_OFFSET of the same
value. The only exception is vboxvideo, which is being converted to the
new offset by this patch. Unifying the constants in a single place
simplifies the driver code.
Signed-off-by: Thomas Zimmermann <tzimmermann@suse.de>
Reviewed-by: Christian König <christian.koenig@amd.com>
Acked-by: Hans de Goede <hdegoede@redhat.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
The function ttm_bo_put releases a reference to a TTM buffer object. The
function's name is more aligned to the Linux kernel convention of naming
ref-counting function _get and _put.
A call to ttm_bo_unref takes the address of the TTM BO object's pointer and
clears the pointer's value to NULL. This is not necessary in most cases and
sometimes even worked around by the calling code. A call to ttm_bo_put only
releases the reference without clearing the pointer.
In places where is might be necessary, the current behaviour of cleaning the
pointer is kept.
Signed-off-by: Thomas Zimmermann <tzimmermann@suse.de>
Reviewed-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Huang Rui <ray.huang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
The function ttm_bo_get acquires a reference on a TTM buffer object. The
function's name is more aligned to the Linux kernel convention of naming
ref-counting function _get and _put.
Signed-off-by: Thomas Zimmermann <tzimmermann@suse.de>
Reviewed-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Huang Rui <ray.huang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
With the new validation code, a malicious user-space app could
potentially submit command streams with enough buffer-object and resource
references in them to have the resulting allocated validion nodes and
relocations make the kernel run out of GFP_KERNEL memory.
Protect from this by having the validation code reserve TTM graphics
memory when allocating.
Signed-off-by: Thomas Hellstrom <thellstrom@vmware.com>
Reviewed-by: Deepak Rawat <drawat@vmware.com>
---
v2: Removed leftover debug printouts
This fixes a layout update race condition. We make sure
the crtc mutex is locked before we dereference crtc->state. Otherwise the
state might change under us.
Since now we're already holding the crtc mutexes when reading the gui
coordinates, protect them with the crtc mutexes rather than with the
requested_layout mutex.
Signed-off-by: Thomas Hellstrom <thellstrom@vmware.com>
Reviewed-by: Deepak Rawat <drawat@vmware.com>
Reviewed-by: Sinclair Yeh <syeh@vmware.com>
Make the connector is_implicit property immutable.
As far as we know, no user-space application is writing to it.
Also move the verification that all implicit display units scan out
from the same framebuffer to atomic_check().
Signed-off-by: Thomas Hellstrom <thellstrom@vmware.com>
Reviewed-by: Sinclair Yeh <syeh@vmware.com>
Reviewed-by: Deepak Rawat <drawat@vmware.com>
Make sure that the global BO state is always correctly initialized.
This allows removing all the device code to initialize it.
v2: fix up vbox (Alex)
Signed-off-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Junwei Zhang <Jerry.Zhang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
As the name says we only need one global instance of ttm_mem_global.
Drop all the driver initialization and just use a single exported
instance which is initialized during BO global initialization.
Signed-off-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Junwei Zhang <Jerry.Zhang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Make the process of looking up a user resource and adding it to the
validation list reference-free unless when it's actually added to the
validation list where a single reference is taken.
This saves two locked atomic operations per command stream buffer object
handle lookup, unless there is a lookup cache hit.
Signed-off-by: Thomas Hellstrom <thellstrom@vmware.com>
Reviewed-by: Sinclair Yeh <syeh@vmware.com>
Identically to how we look up ttm base objects witout reference, provide
the same functionality to vmw user buffer objects which derive from them.
Signed-off-by: Thomas Hellstrom <thellstrom@vmware.com>
Reviewed-by: Sinclair Yeh <syeh@vmware.com>
This field was previously used to prevent a lookup of a resource before its
constructor had run to its end. This was mainly intended for an interface
that is now removed that allowed looking up a resource by its device id.
Currently all affected resources are added to the lookup mechanism (its
TTM prime object is initialized) late in the constructor where it's OK to
look up the resource.
This means we can change the device resource_lock to an ordinary spinlock
instead of an rwlock and remove a locking sequence during lookup.
Signed-off-by: Thomas Hellstrom <thellstrom@vmware.com>
Reviewed-by: Sinclair Yeh <syeh@vmware.com>
Reviewed-by: Deepak Rawat <drawat@vmware.com>
A common trait of these objects are that they are allocated during the
command validation phase and freed after command submission. Furthermore
they are accessed by a single thread only. So provide a simple unprotected
stack-like allocator from which these objects can be allocated. Their
memory is freed with the validation context when the command submission
is done.
Note that the mm subsystem maintains a per-cpu cache of single pages to
make single page allocation and freeing efficient.
Signed-off-by: Thomas Hellstrom <thellstrom@vmware.com>
Reviewed-by: Sinclair Yeh <syeh@vmware.com>
Strip the old execbuf validation functionality and use the new API instead.
Also use the new API for a now removed execbuf function that was called
from the kms code.
Signed-off-by: Thomas Hellstrom <thellstrom@vmware.com>
Reviewed-by: Sinclair Yeh <syeh@vmware.com>
Allow selecting interruptible or uninterruptible waits to match
expectations of callers.
Signed-off-by: Thomas Hellstrom <thellstrom@vmware.com>
Reviewed-by: Deepak Rawat <drawat@vmware.com>
Reviewed-by: Sinclair Yeh <syeh@vmware.com>
No other driver is using this functionality so move it out of TTM and
into the vmwgfx driver. Update includes and remove exports.
Also annotate to remove false static analyzer lock balance warnings.
Signed-off-by: Thomas Hellstrom <thellstrom@vmware.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
A new param DRM_VMW_PARAM_SM4_1, is added for user space to determine
availability of SM4.1.
Minor version bump for SM4.1.
Signed-off-by: Deepak Rawat <drawat@vmware.com>
Reviewed-by: Sinclair Yeh <syeh@vmware.com>
Reviewed-by: Brian Paul <brianp@vmware.com>
Reviewed-by: Thomas Hellstrom <thellstrom@vmware.com>
Signed-off-by: Thomas Hellstrom <thellstrom@vmware.com>
Support for SVGA3D_SURFACE_MULTISAMPLE and surface mob size according
to sample count.
Signed-off-by: Deepak Rawat <drawat@vmware.com>
Reviewed-by: Sinclair Yeh <syeh@vmware.com>
Reviewed-by: Brian Paul <brianp@vmware.com>
Reviewed-by: Thomas Hellstrom <thellstrom@vmware.com>
Reviewed-by: Charmaine Lee <charmainel@vmware.com>
Signed-off-by: Thomas Hellstrom <thellstrom@vmware.com>
New ioctls DRM_VMW_GB_SURFACE_CREATE_EXT and DRM_VMW_GB_SURFACE_REF_EXT
are added which support 64-bit wide svga device surface flags, quality
level and multisample pattern.
Signed-off-by: Deepak Rawat <drawat@vmware.com>
Reviewed-by: Sinclair Yeh <syeh@vmware.com>
Reviewed-by: Brian Paul <brianp@vmware.com>
Reviewed-by: Thomas Hellstrom <thellstrom@vmware.com>
Reviewed-by: Charmaine Lee <charmainel@vmware.com>
Signed-off-by: Thomas Hellstrom <thellstrom@vmware.com>
Since svga device introduced new 64bit SVGA3dSurfaceAllFlags, vmwgfx
now stores the surface flags internally as SVGA3dSurfaceAllFlags.
For legacy surface define commands, only lower 32-bit is used.
Signed-off-by: Deepak Rawat <drawat@vmware.com>
Reviewed-by: Sinclair Yeh <syeh@vmware.com>
Reviewed-by: Brian Paul <brianp@vmware.com>
Reviewed-by: Thomas Hellstrom <thellstrom@vmware.com>
Signed-off-by: Thomas Hellstrom <thellstrom@vmware.com>
SVGA device added new command SVGA3dCmdDefineGBSurface_v3 which allows
64-bit SVGA3dSurfaceAllFlags. This commit adds support for
SVGA3dCmdDefineGBSurface_v3 command in vmwgfx.
Signed-off-by: Deepak Rawat <drawat@vmware.com>
Reviewed-by: Sinclair Yeh <syeh@vmware.com>
Reviewed-by: Brian Paul <brianp@vmware.com>
Reviewed-by: Thomas Hellstrom <thellstrom@vmware.com>
Reviewed-by: Charmaine Lee <charmainel@vmware.com>
Signed-off-by: Thomas Hellstrom <thellstrom@vmware.com>
A boolean flag in device private structure to specify if the device
support SM4_1.
Signed-off-by: Deepak Rawat <drawat@vmware.com>
Reviewed-by: Sinclair Yeh <syeh@vmware.com>
Reviewed-by: Brian Paul <brianp@vmware.com>
Reviewed-by: Thomas Hellstrom <thellstrom@vmware.com>
Signed-off-by: Thomas Hellstrom <thellstrom@vmware.com>
The device exposes a new capability register. Add support for it.
Signed-off-by: Neha Bhende <bhenden@vmware.com>
Reviewed-by: Thomas Hellstrom <thellstrom@vmware.com>
Signed-off-by: Thomas Hellstrom <thellstrom@vmware.com>
Make the host message module function declarations similar to the other
declarations in vmwgfx_drv.h and include the header in vmwgfx_msg.c
Signed-off-by: Thomas Hellstrom <thellstrom@vmware.com>
Reviewed-by: Deepak Rawat <drawat@vmware.com>
To avoid race condition between update_layout ioctl and modeset ioctl
for access to gui_x/y positioning added a new mutex
requested_layout_mutex.
Also used drm_for_each_connector_iter to iterate over connector list.
Signed-off-by: Deepak Rawat <drawat@vmware.com>
Reviewed-by: Thomas Hellstrom <thellstrom@vmware.com>
Signed-off-by: Thomas Hellstrom <thellstrom@vmware.com>
It makes more sense to have all the buffer object related code in
a single file rather than splitting it up between the resource code
and buffer object pinning utilities.
Place all buffer object related code in vmwgfx_bo.c. Fix up headers
and export resource functionality when needed in the buffer object
code.
Signed-off-by: Thomas Hellstrom <thellstrom@vmware.com>
Reviewed-by: Brian Paul <brianp@vmware.com>
Reviewed-by: Sinclair Yeh <syeh@vmware.com>
Reviewed-by: Deepak Rawat <drawat@vmware.com>
Initially vmware buffer objects were only used as DMA buffers, so the name
DMA buffer was a natural one. However, currently they are used also as
dumb buffers and MOBs backing guest backed objects so renaming them to
buffer objects is logical. Particularly since there is a dmabuf subsystem
in the kernel where a dma buffer means something completely different.
This also renames user-space api structures and IOCTL names
correspondingly, but the old names remain defined for now and the ABI
hasn't changed.
There are a couple of minor style changes to make checkpatch happy.
Signed-off-by: Thomas Hellstrom <thellstrom@vmware.com>
Reviewed-by: Sinclair Yeh <syeh@vmware.com>
Reviewed-by: Deepak Rawat <drawat@vmware.com>
This is dual licensed under GPL-2.0 or MIT.
vmwgfx_msg.h is the odd one out that is GPL-2.0+ or MIT.
Acked-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Dirk Hohndel (VMware) <dirk@hohndel.org>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20180506231626.115996-9-dirk@hohndel.org
We have had problems displaying fbdev after a resume and as a
workaround we have had to call vmw_fb_refresh(). This has had
a number of unwanted side-effects. The root of the problem was,
however that the coalesced fbdev dirty region was not empty on
the first dirty_mark() after a resume, so a flush was never
scheduled.
Fix this by force scheduling an fbdev flush after resume, and
remove the workaround.
Signed-off-by: Thomas Hellstrom <thellstrom@vmware.com>
Reviewed-by: Brian Paul <brianp@vmware.com>
Reviewed-by: Deepak Rawat <drawat@vmware.com>
-----BEGIN PGP SIGNATURE-----
iQEcBAABAgAGBQJauCZfAAoJEHm+PkMAQRiGWGUH/2rhdQDkoJpYWnjaQkolECG8
MxpGE7nmIIHxQcbSDdHTGJ8IhVm6Z5wZ7ym/PwCDTT043Y1y341sJrIwL2/nTG6d
HVidk8hFvgN6QzlzVAHT3ZZMII/V9Zt+VV5SUYLGnPAVuJNHo/6uzWlTU5g+NTFo
IquFDdQUaGBlkKqby+NoAFnkV1UAIkW0g22cfvPnlO5GMer0gusGyVNvVp7TNj3C
sqj4Hvt3RMDLMNe9RZ2pFTiOD096n8FWpYftZneUTxFImhRV3Jg5MaaYZm9SI3HW
tXrv/LChT/F1mi5Pkx6tkT5Hr8WvcrwDMJ4It1kom10RqWAgjxIR3CMm448ileY=
=YKUG
-----END PGP SIGNATURE-----
Backmerge tag 'v4.16-rc7' into drm-next
Linux 4.16-rc7
This was requested by Daniel, and things were getting
a bit hard to reconcile, most of the conflicts were
trivial though.
It was used to early block fbdev dirty processing. Replace it with an
unprotected check of the par->dirty.active field. While this might
race with the vmw_fb_off() function, we do a protected check later so
the race will at worst lead to grabbing and releasing a couple of locks.
Signed-off-by: Thomas Hellstrom <thellstrom@vmware.com>
Reviewed-by: Brian Paul <brianp@vmware.com>
Reviewed-by: Sinclair Yeh <syeh@vmware.com>
Make it possible to hibernate also with masters that don't switch VT at
hibernation time. We save and restore modesetting state unless fbdev is
active and enabled at hibernation time.
Signed-off-by: Thomas Hellstrom <thellstrom@vmware.com>
Reviewed-by: Sinclair Yeh <syeh@vmware.com>
fbdev framebuffers were previously pinned to be able to keep them mapped
across updates.
This commit introduces a mechanism that instead revalidates the map on
each update, keeping the map cached across updates. The cached map is torn
down if the underlying pages change. Typically on buffer object moves and
swapouts.
This should be nicer to the system when we have resource contention.
Testing done: Basic fbdev functionality under Fedora 27.
Signed-off-by: Thomas Hellstrom <thellstrom@vmware.com>
Signed-off-by: Sinclair Yeh <syeh@vmware.com>
Reviewed-by: Brian Paul <brianp@vmware.com>
Reviewed-by: Deepak Rawat <drawat@vmware.com>
This blit was previously performed using two large vmaps, one of which
was teared down and remapped on each blit. Use the more resource-
conserving TTM cpu blit instead.
The blit is used in boundary-box computing mode which makes it possible
to minimize the bounding box used in host operations.
Signed-off-by: Thomas Hellstrom <thellstrom@vmware.com>
Reviewed-by: Brian Paul <brianp@vmware.com>
The utility uses kmap_atomic() instead of vmapping the whole buffer
object. As a result there will be more book-keeping but on some
architectures this will help avoid exhausting vmalloc space and also
avoid expensive TLB flushes.
The blit utility also adds a provision to compute a bounding box of
changed content, which is very useful to optimize presentation speed
of ill-behaved applications that don't supply proper damage regions, and
for page-flips. The cost of computing the bounding box is not that
expensive when done in a cpu-blit utility like this.
Signed-off-by: Thomas Hellstrom <thellstrom@vmware.com>
Reviewed-by: Brian Paul <brianp@vmware.com>
When we are running without fbdev, transitioning from the login screen to
X or gnome-shell/wayland will cause a vt switch and the driver will disable
svga mode, losing all modesetting resources. However, the kms atomic state
does not reflect that and may think that a crtc is still turned on, which
will cause device errors when we try to bind an fb to the crtc, and the
screen will remain black.
Fix this by turning off all kms resources before disabling svga mode.
Cc: <stable@vger.kernel.org>
Signed-off-by: Thomas Hellstrom <thellstrom@vmware.com>
Reviewed-by: Sinclair Yeh <syeh@vmware.com>