Defer the transfer from the client's timeline onto the execution
timeline from the point of readiness to the point of actual submission.
For example, in execlists, a request is finally submitted to hardware
when the hardware is ready, and only put onto the hardware queue when
the request is ready. By deferring the transfer, we ensure that the
timeline is maintained in retirement order if we decide to queue the
requests onto the hardware in a different order than fifo.
v2: Rebased onto distinct global/user timeline lock classes.
v3: Play with the position of the spin_lock().
v4: Nesting finally resolved with distinct sw_fence lock classes.
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/20161114204105.29171-4-chris@chris-wilson.co.uk
In order to support deferred scheduling, we need to differentiate
between when the request is ready to run (i.e. the submit fence is
signaled) and when the request is actually run (a new execute fence).
This is typically split between the request itself wanting to wait upon
others (for which we use the submit fence) and the CPU wanting to wait
upon the request, for which we use the execute fence to be sure the
hardware is ready to signal completion.
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Reviewed-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/20161114204105.29171-3-chris@chris-wilson.co.uk
In order to simplify the lockdep annotation, as they become more complex
in the future with deferred execution and multiple paths through the
same functions, create a separate lockclass for the user timeline and
the hardware execution timeline.
We should only ever be locking the user timeline and the execution
timeline in parallel so we only need to create two lock classes, rather
than a separate class for every timeline.
v2: Rename the lock classes to be more consistent with other lockdep.
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/20161114204105.29171-2-chris@chris-wilson.co.uk
Localise the static struct lock_class_key to the caller of
i915_sw_fence_init() so that we create a lock_class instance for each
unique sw_fence rather than all sw_fences sharing the same
lock_class. This eliminate some lockdep false positive when using fences
from within fence callbacks.
For the relatively small number of fences currently in use [2], this adds
160 bytes of unused text/code when lockdep is disabled. This seems
quite high, but fully reducing it via ifdeffery is also quite ugly.
Removing the #fence strings saves 72 bytes with just a single #ifdef.
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Reviewed-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/20161114204105.29171-1-chris@chris-wilson.co.uk
On pre-gen4 we connect plane A to pipe B and vice versa to get an FBC
capable plane feeding the LVDS port by default. We have the logic for
the plane swapping duplicated in many places. Let's remove a bit of the
duplication by having the crtc look up the thing from the primary plane.
Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/1478616439-10150-2-git-send-email-ville.syrjala@linux.intel.com
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
The primary and sprite planes on CHV pipe B support horizontal
mirroring. Expose it to the world.
Sadly the hardware ignores the mirror bit when the rotate bit is
set, so we'll have to reject the 180+X case.
v2: Drop the BIT()
v3: Pass dev_priv instead of dev to IS_CHERRYVIEW()
Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Reviewed-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/1479142440-25283-4-git-send-email-ville.syrjala@linux.intel.com
Move the plane control register rotation setup away from the
coordinate munging code. This will result in neater looking
code once we add reflection support for CHV.
v2: Drop the BIT(), drop some usless parens,
Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/1479142440-25283-3-git-send-email-ville.syrjala@linux.intel.com
Using == to check for 180 degree rotation only works as long as the
reflection bits aren't set. That will change soon enough for CHV, so
let's stop doing things the wrong way.
v2: Drop the BIT()
Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/1479142440-25283-2-git-send-email-ville.syrjala@linux.intel.com
It only has two checks now, so it makes sense to just move the code to
the caller.
Also take this opportunity to make no_fbc_reason make more sense: now
we'll only list "no suitable CRTC for FBC" instead of maybe giving a
reason why the last CRTC we checked was not selected, and we'll more
consistently set the reason (e.g., if no primary planes are visible).
Signed-off-by: Paulo Zanoni <paulo.r.zanoni@intel.com>
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
Link: http://patchwork.freedesktop.org/patch/msgid/1478883461-20201-5-git-send-email-paulo.r.zanoni@intel.com
When supplying a view to vma_compare() it is required that the supplied
i915_address_space is the global GTT. I tested the VMA instead (which is
the current position in the rbtree and maybe from any address space).
(This reapplies commit a44342acde ("drm/i915: Fix test on inputs for
vma_compare()") as it was lost in the vma split)
Reported-by: Matthew Auld <matthew.auld@intel.com>
Tested-by: Matthew Auld <matthew.auld@intel.com>
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=98579
Fixes: db6c2b4151 ("drm/i915: Store the vma in an rbtree...")
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Cc: Matthew Auld <matthew.auld@intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/20161103200852.23431-1-chris@chris-wilson.co.uk
Reviewed-by: Matthew Auld <matthew.auld@intel.com>
Reported-by: Paulo Zanoni <paulo.r.zanoni@intel.com>
Fixes: b42fe9ca0a ("drm/i915: Split out i915_vma.c")
Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
The previous spec version said "double Ytile planes minimum lines",
and I interpreted this as referring to what the spec calls "Y tile
minimum", but in fact it was referring to what the spec calls "Minimum
Scanlines for Y tile". I noticed that Mahesh Kumar had a different
interpretation, so I sent and email to the spec authors and got
clarification on the correct meaning. Also, BSpec was updated and
should be clear now.
Fixes: ee3d532fcb ("drm/i915/gen9: unconditionally apply the memory bandwidth WA")
Cc: stable@vger.kernel.org
Cc: Mahesh Kumar <mahesh1.kumar@intel.com>
Signed-off-by: Paulo Zanoni <paulo.r.zanoni@intel.com>
Reviewed-by: Matt Roper <matthew.d.roper@intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/1478636531-6081-1-git-send-email-paulo.r.zanoni@intel.com
My heuristic for detecting type 1 DVI DP++ adaptors based on the VBT
port information apparently didn't survive the reality of buggy VBTs.
In this particular case we have a machine with a natice HDMI port, but
the VBT tells us it's a DP++ port based on its capabilities.
The dvo_port information in VBT does claim that we're dealing with a
HDMI port though, but we have other machines which do the same even
when they actually have DP++ ports. So that piece of information alone
isn't sufficient to tell the two apart.
After staring at a bunch of VBTs from various machines, I have to
conclude that the only other semi-reliable clue we can use is the
presence of the AUX channel in the VBT. On this particular machine
AUX channel is specified as zero, whereas on some of the other machines
which listed the DP++ port as HDMI have a non-zero AUX channel.
I've also seen VBTs which have dvo_port a DP but have a zero AUX
channel. I believe those we need to treat as DP ports, so we'll limit
the AUX channel check to just the cases where dvo_port is HDMI.
If we encounter any more serious failures with this heuristic I think
we'll have to have to throw it out entirely. But that could mean that
there is a risk of type 1 DVI dongle users getting greeted by a
black screen, so I'd rather not go there unless absolutely necessary.
v2: Remove the duplicate PORT_A check (Daniel)
Fix some typos in the commit message
Cc: Daniel Otero <daniel.otero@outlook.com>
Cc: stable@vger.kernel.org
Tested-by: Daniel Otero <daniel.otero@outlook.com>
Fixes: d61992565b ("drm/i915: Determine DP++ type 1 DVI adaptor presence based on VBT")
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=97994
Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/1478884464-14251-1-git-send-email-ville.syrjala@linux.intel.com
Reviewed-by: Daniel Vetter <daniel.vetter@ffwll.ch>
The term "preliminary hardware support" has always caused confusion both
among users and developers. It has always been about preliminary driver
support for new hardware, and not so much about preliminary hardware. Of
course, initially both the software and hardware are in early stages,
but the distinction becomes more clear when the user picks up production
hardware and an older kernel to go with it, with just the early support
we had for the hardware at the time the kernel was released. The user
has to specifically enable the alpha quality *driver* support for the
hardware in that specific kernel version.
Rename preliminary_hw_support to alpha_support to emphasize that the
module parameter, config option, and flag are about software, not about
hardware. Improve the language in help texts and debug logging as well.
This appears to be a good time to do the change, as there are currently
no platforms with preliminary^W alpha support.
Cc: Rob Clark <robdclark@gmail.com>
Cc: Dave Airlie <airlied@gmail.com>
Cc: Daniel Vetter <daniel.vetter@ffwll.ch>
Cc: Rodrigo Vivi <rodrigo.vivi@intel.com>
Reviewed-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Signed-off-by: Jani Nikula <jani.nikula@intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/1477909108-18696-1-git-send-email-jani.nikula@intel.com
When we release the shmem backing storage, we make sure that the pages
are coherent with the cpu cache. However, our clflush routine was
skipping the flush as the object had no pages at release time. Fix this by
explicitly flushing the sg_table we are decoupling.
Fixes: 03ac84f183 ("drm/i915: Pass around sg_table to get_pages/put_pages backend")
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Reviewed-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/20161111145809.9701-2-chris@chris-wilson.co.uk
After this patch only conversion of INTEL_INFO(p)->gen to
INTEL_GEN(dev_priv) remains before the __I915__ macro can
be removed.
v2: Tidy vlv_compute_wm. (David Weinehall)
Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Cc: David Weinehall <david.weinehall@linux.intel.com>
Reviewed-by: David Weinehall <david.weinehall@linux.intel.com>
A small selection of macros which can only accept dev_priv from
now on and a resulting trickle of fixups.
Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Reviewed-by: David Weinehall <david.weinehall@linux.intel.com>
A small selection of macros which can only accept dev_priv from
now on and a resulting trickle of fixups.
v2: Keep original order. (Ville Syrjala)
Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Cc: Ville Syrjälä <ville.syrjala@linux.intel.com>
Reviewed-by: David Weinehall <david.weinehall@linux.intel.com>
A small selection of macros which can only accept dev_priv from
now on and a resulting trickle of fixups.
Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Reviewed-by: David Weinehall <david.weinehall@linux.intel.com>
As a side product, had to split two other files;
- i915_gem_fence_reg.h
- i915_gem_object.h (only parts that needed immediate untanglement)
I tried to move code in as big chunks as possible, to make review
easier. i915_vma_compare was moved to a header temporarily.
v2:
- Use i915_gem_fence_reg.{c,h}
v3:
- Rebased
v4:
- Fix building when DEBUG_GEM is enabled by reordering a bit.
Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Cc: Chris Wilson <chris@chris-wilson.co.uk>
Acked-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Signed-off-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/1478861034-30643-1-git-send-email-joonas.lahtinen@linux.intel.com
Zhenyu Wang writes:
gvt-next-kvmgt-framework
This adds initial KVMGT framework based on GVT-g MPT(Mediated Passthrough) interface.
Signed-off-by: Daniel Vetter <daniel.vetter@intel.com>
At the moment we allocate enough sg table entries assuming we
will not be able to do any coalescing. But since in practice
we most often can, and more so very effectively, this ends up
wasting a lot of memory.
A simple and effective way of trimming the over-allocated
entries is to copy the table over to a new one allocated to the
exact size.
Experiments on my freshly logged and idle desktop (KDE) showed
that by doing this we can save approximately 1 MiB of RAM, or
when running a typical benchmark like gl_manhattan I have
even seen a 6 MiB saving.
More complicated techniques such as only copying the last used
page and freeing the rest are left to the reader.
v2:
* Update commit message.
* Use temporary sg_table on stack. (Chris Wilson)
v3:
* Commit message update.
* Comment added.
* Replace memcpy with copy assignment.
(Chris Wilson)
Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Cc: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
Link: http://patchwork.freedesktop.org/patch/msgid/1478704423-7447-1-git-send-email-tvrtko.ursulin@linux.intel.com
KVMGT is the MPT implementation based on VFIO/KVM. It provides
a kvmgt_mpt ops to gvt for vGPU access mediation, e.g. to
mediate and emulate the MMIO accesses, to inject interrupts
to vGPU user, to intercept the GTT writing and replace it with
DMA-able address, to write-protect guest PPGTT table for
shadowing synchronization, etc. This patch provides the MPT
implementation for GVT, not yet functional due to theabsence
of mdev.
It's built as kvmgt.ko, depends on vfio.ko, kvm.ko and mdev.ko,
and being required by i915.ko. To not introduce hard dependency
in i915.ko, we used indirect symbol reference. But that means
users have to include kvmgt.ko into init ramdisk if their
i915.ko is included.
Signed-off-by: Kevin Tian <kevin.tian@intel.com>
Signed-off-by: Xiaoguang Chen <xiaoguang.chen@intel.com>
Signed-off-by: Jike Song <jike.song@intel.com>
Signed-off-by: Zhenyu Wang <zhenyuw@linux.intel.com>
There are currently 4 methods in intel_gvt_io_emulation_ops
to emulate CFG/MMIO reading/writing for intel vGPU. A possibly
better scope is: add 3 more methods for vgpu create/destroy/reset
respectively, and rename the ops to 'intel_gvt_ops', then pass
it to the MPT module (say the future kvmgt) to use: they are
all methods for external usage.
Signed-off-by: Jike Song <jike.song@intel.com>
Signed-off-by: Zhenyu Wang <zhenyuw@linux.intel.com>
Hypervisors are different, the MPT ops is a only superset of
all possibly supported hypervisors. There might be other way
out of the MPT to achieve same target. e.g. vfio-based kvmgt
won't provide map_gfn_to_mfn method to establish guest EPT
mapping for aperture, since it will be done in QEMU/KVM, MMIO
is also trapped elsewhere, etc.
Signed-off-by: Jike Song <jike.song@intel.com>
Signed-off-by: Zhenyu Wang <zhenyuw@linux.intel.com>
GVT host needs init/exit hooks to do some initialization/cleanup
work, e.g.: vfio mdev host device register/unregister.
Signed-off-by: Jike Song <jike.song@intel.com>
Signed-off-by: Zhenyu Wang <zhenyuw@linux.intel.com>
Current GVT contains some obsolete logic originally cooked to
support the old, non-vfio kvmgt, which is actually workarounds.
We don't support that anymore, so it's safe to remove it and
make a better framework.
Signed-off-by: Jike Song <jike.song@intel.com>
Signed-off-by: Zhenyu Wang <zhenyuw@linux.intel.com>
By providing predefined vGPU types, users can choose which type a vgpu
to create and use, without specifying detailed parameters.
Signed-off-by: Zhenyu Wang <zhenyuw@linux.intel.com>
Signed-off-by: Jike Song <jike.song@intel.com>
kmap_atomic doesn't allow sleep until unmapped. However,
it's necessary to allow sleep during reading/writing guest
memory, so use kmap instead.
Signed-off-by: Bing Niu <bing.niu@intel.com>
Signed-off-by: Xiaoguang Chen <xiaoguang.chen@intel.com>
Signed-off-by: Jike Song <jike.song@intel.com>
Signed-off-by: Zhenyu Wang <zhenyuw@linux.intel.com>
On guest writing a PPGTT entry, if it contains value and the old
entry is valid, gvt will read it and find & free the corresponding
old data for it. However, with the KVM write protection provided
by page_track, the guest entry will be written with new value
before gvt handling. To avoid that, we should use the shadow
entry instead.
Signed-off-by: Bing Niu <bing.niu@intel.com>
Signed-off-by: Jike Song <jike.song@intel.com>
Signed-off-by: Zhenyu Wang <zhenyuw@linux.intel.com>
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v2.0.22 (GNU/Linux)
iQEcBAABAgAGBQJYIy78AAoJEL/70l94x66DzbQH/34Mgm52OdLzTe+Yf8Pcae5X
A/z2Aicto+6M+dtYoWPn2DfoevPaSpVmymryRrRzAqk5QDPHiVHZ5iCW0RaIEU7M
iiPudqSGVGa0oFYBkxsJxCBysAQsHX5sEXEszs4egbO1TtQ8LCAxYUVuLBGlx+Fa
qj26Fi/p2ByHVp/RX55A2kF5T8J671KT4LWUvFjzTgGFWo8Kr1bk0q4hmRYB9OBc
/pqczV3Mc6KcmzfIg3Rd6xt8UDlEGJ4YQhpNgY6nxrQ1py3AP7vNqBPAH4RXbHJB
/OqdAjpqa8+rrwSpQ1f58U+7v/ZO1ZTg0IW9bf60qKjG/aV4fGN6y0/iXKGgKyQ=
=cpog
-----END PGP SIGNATURE-----
Merge tag 'for-kvmgt' of git://git.kernel.org/pub/scm/virt/kvm/kvm into drm-intel-next-queued
Paulo Bonzini writes:
The three KVM patches that KVMGT needs.
Signed-off-by: Daniel Vetter <daniel.vetter@intel.com>
When we need to reset the global seqno on wraparound, we have to wait
until the current rbtrees are drained (or otherwise the next waiter will
be out of sequence). The current mechanism to kick and spin until
complete, may exit too early as it would break if the target thread was
currently running. Instead, we must wake up the threads, but keep
spinning until the trees have been deleted.
In order to appease Tvrtko, busy spin rather than yield().
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/20161108143719.32215-1-chris@chris-wilson.co.uk
Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
All of this state should be updated as soon as possible. It shouldn't be
done later because then future updates may not depend on it.
Changes since v1:
- Move the modeset update to before drm_atomic_state_get. (Ville)
Signed-off-by: Maarten Lankhorst <maarten.lankhorst@linux.intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/1478609742-13603-10-git-send-email-maarten.lankhorst@linux.intel.com
Reviewed-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
drm_select_eld requires mode_config.mutex and connection_mutex
because it looks at the connector list and at the legacy encoders.
This is not required, because when we call audio_codec_enable we know
which connector it was called for, so pass the state.
This also removes having to look at crtc->config.
Changes since v1:
- Use intel_crtc->pipe instead of drm_crtc_index. (Ville)
Signed-off-by: Maarten Lankhorst <maarten.lankhorst@linux.intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/1478609742-13603-8-git-send-email-maarten.lankhorst@linux.intel.com
Reviewed-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
The newly added assert_kernel_context_is_current introduces a warning
when built with W=1:
drivers/gpu/drm/i915/i915_gem.c: In function ‘assert_kernel_context_is_current’:
drivers/gpu/drm/i915/i915_gem.c:4417:63: error: suggest braces around empty body in an ‘else’ statement [-Werror=empty-body]
Changing the GEM_BUG_ON() macro from an empty definition to "do { } while (0)"
makes the macro more robust to use and avoids the warning.
Fixes: 3033acab07 ("drm/i915: Queue the idling context switch after all other timelines")
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Link: http://patchwork.freedesktop.org/patch/msgid/20161108135834.2166677-1-arnd@arndb.de
gvt-next-2016-11-07
- Fix regression from e95433c73a
- Some MMIO handler fixes
- Add better handling for guest reset control
- stratch page table tree for shadow ppgtt
Signed-off-by: Daniel Vetter <daniel.vetter@intel.com>