The GVT GDRST handler handle_device_reset() invokes
setup_vgpu_mmio() to reset the MMIO state. At that point
the virtual MMIO memory has already been allocated, so the
new allocation just leaks the old MMIO memory.
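Roughly, the fix keeps the existing backing and only restores the
initial snapshot on reset; a minimal sketch, assuming the vreg/sreg
and firmware-snapshot fields used below:

    /* reuse the buffer allocated at vGPU creation instead of
     * allocating a fresh one on every device reset */
    static void reset_vgpu_mmio(struct intel_vgpu *vgpu)
    {
        struct intel_gvt *gvt = vgpu->gvt;
        const struct intel_gvt_device_info *info = &gvt->device_info;

        /* vreg/sreg were set up in setup_vgpu_mmio(); a reset only
         * needs to restore the initial register snapshot */
        memcpy(vgpu->mmio.vreg, gvt->firmware.mmio, info->mmio_size);
        memcpy(vgpu->mmio.sreg, gvt->firmware.mmio, info->mmio_size);
    }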
Signed-off-by: Du, Changbin <changbin.du@intel.com>
Signed-off-by: Zhenyu Wang <zhenyuw@linux.intel.com>
We initialize vgpu->workload_q_head via the for_each_engine
macro, which may skip unavailable engines, so the same rule
must be followed everywhere. The function
intel_vgpu_reset_execlist is not aware of this, and the kernel
crashes when it touches an uninitialized vgpu->workload_q_head[x].
Fix it by using for_each_engine_masked and skipping unavailable
engine IDs. Also rename ring_bitmap to the more general name
engine_mask.
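A sketch of the intended iteration, assuming the four-argument
for_each_engine_masked() helper and a per-engine init_vgpu_execlist():

    void intel_vgpu_reset_execlist(struct intel_vgpu *vgpu,
                                   unsigned long engine_mask)
    {
        struct drm_i915_private *dev_priv = vgpu->gvt->dev_priv;
        struct intel_engine_cs *engine;
        unsigned int tmp;

        /* only touch queues that were initialized for available engines */
        for_each_engine_masked(engine, dev_priv, engine_mask, tmp)
            init_vgpu_execlist(vgpu, engine->id);
    }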
v2: remove unnecessary engine activation check (zhenyu)
Signed-off-by: Du, Changbin <changbin.du@intel.com>
Signed-off-by: Zhenyu Wang <zhenyuw@linux.intel.com>
Emulate the right behavior for tlb_control: set it to zero upon write.
Signed-off-by: Ping Gao <ping.a.gao@intel.com>
Signed-off-by: Min He <min.he@intel.com>
Signed-off-by: Zhenyu Wang <zhenyuw@linux.intel.com>
This is a classical ABBA deadlock between two mutexes,
gvt.lock (a) and drm.struct_mutex (b). The deadlock happens between two threads:
1. intel_gvt_create/destroy_vgpu: P(a)->P(b)
2. workload_thread: P(b)->P(a)
The fix is to align the lock acquisition order in both threads; this
patch adjusts the order in the workload_thread function.
This fixes the lockup seen in the guest-reboot stress test.
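A rough sketch of the aligned ordering in workload_thread(); the lock
and call names here are assumed, not quoted from the patch:

    /* take gvt->lock (a) before struct_mutex (b), matching
     * intel_gvt_create/destroy_vgpu, so no ABBA cycle can form */
    mutex_lock(&gvt->lock);
    mutex_lock(&gvt->dev_priv->drm.struct_mutex);

    ret = dispatch_workload(workload);    /* still needs struct_mutex (v3) */

    mutex_unlock(&gvt->dev_priv->drm.struct_mutex);
    mutex_unlock(&gvt->lock);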
v2: adjust sequence in workload_thread based on zhenyu's suggestion.
adjust sequence in create/destroy_vgpu function.
v3: fix to still require struct_mutex for dispatch_workload()
Signed-off-by: Pei Zhang <pei.zhang@intel.com>
[zhenyuw: fix unused variables warnings.]
Signed-off-by: Zhenyu Wang <zhenyuw@linux.intel.com>
KVMGT is the MPT implementation based on VFIO/KVM. It provides
a kvmgt_mpt ops structure to GVT for vGPU access mediation, e.g. to
mediate and emulate MMIO accesses, to inject interrupts into the
vGPU user, to intercept GTT writes and replace them with
DMA-able addresses, to write-protect guest PPGTT tables for
shadowing synchronization, etc. This patch provides the MPT
implementation for GVT; it is not yet functional due to the absence
of mdev.
It is built as kvmgt.ko, depends on vfio.ko, kvm.ko and mdev.ko,
and is required by i915.ko. To avoid introducing a hard dependency
in i915.ko, we use an indirect symbol reference, which means
users have to include kvmgt.ko in their init ramdisk if
i915.ko is included.
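A sketch of what such an indirect reference can look like; the symbol
name and error handling are assumptions:

    /* resolve the MPT ops at runtime instead of linking i915.ko against
     * kvmgt.ko; request the module if the symbol is not yet loaded */
    intel_gvt_host.mpt = try_then_request_module(
            symbol_get(kvmgt_mpt), "kvmgt");
    if (!intel_gvt_host.mpt)
        return -EINVAL;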
Signed-off-by: Kevin Tian <kevin.tian@intel.com>
Signed-off-by: Xiaoguang Chen <xiaoguang.chen@intel.com>
Signed-off-by: Jike Song <jike.song@intel.com>
Signed-off-by: Zhenyu Wang <zhenyuw@linux.intel.com>
There are currently 4 methods in intel_gvt_io_emulation_ops
to emulate CFG/MMIO reads and writes for an Intel vGPU. A
better scope is to add 3 more methods for vGPU create/destroy/reset
respectively, rename the ops to 'intel_gvt_ops', and pass
it to the MPT module (say the future kvmgt) to use: they are
all methods intended for external use.
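A sketch of what the consolidated table could look like; the member
and function names are assumed from the description above:

    static const struct intel_gvt_ops intel_gvt_ops = {
        .emulate_cfg_read   = intel_vgpu_emulate_cfg_read,
        .emulate_cfg_write  = intel_vgpu_emulate_cfg_write,
        .emulate_mmio_read  = intel_vgpu_emulate_mmio_read,
        .emulate_mmio_write = intel_vgpu_emulate_mmio_write,
        .vgpu_create        = intel_gvt_create_vgpu,
        .vgpu_destroy       = intel_gvt_destroy_vgpu,
        .vgpu_reset         = intel_gvt_reset_vgpu,
    };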
Signed-off-by: Jike Song <jike.song@intel.com>
Signed-off-by: Zhenyu Wang <zhenyuw@linux.intel.com>
Hypervisors differ: the MPT ops is only a superset of
all possibly supported hypervisors, and there may be ways
outside the MPT to achieve the same goal. E.g. the vfio-based kvmgt
won't provide a map_gfn_to_mfn method to establish the guest EPT
mapping for the aperture, since that is done in QEMU/KVM; MMIO
is also trapped elsewhere, etc.
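A sketch of how an optional hook can be tolerated (wrapper and field
names assumed):

    static inline int intel_gvt_hypervisor_map_gfn_to_mfn(
            struct intel_vgpu *vgpu, unsigned long gfn,
            unsigned long mfn, unsigned int nr, bool map)
    {
        /* a hypervisor that maps the aperture itself leaves the hook NULL */
        if (!intel_gvt_host.mpt->map_gfn_to_mfn)
            return 0;

        return intel_gvt_host.mpt->map_gfn_to_mfn(vgpu->handle, gfn, mfn,
                                                  nr, map);
    }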
Signed-off-by: Jike Song <jike.song@intel.com>
Signed-off-by: Zhenyu Wang <zhenyuw@linux.intel.com>
The GVT host needs init/exit hooks to do some initialization/cleanup
work, e.g. vfio mdev host device register/unregister.
Signed-off-by: Jike Song <jike.song@intel.com>
Signed-off-by: Zhenyu Wang <zhenyuw@linux.intel.com>
Current GVT contains some obsolete logic, originally cooked up to
support the old, non-vfio kvmgt, which amounts to workarounds.
We don't support that anymore, so it's safe to remove it and
build a better framework.
Signed-off-by: Jike Song <jike.song@intel.com>
Signed-off-by: Zhenyu Wang <zhenyuw@linux.intel.com>
By providing predefined vGPU types, users can choose which type of
vGPU to create and use without specifying detailed parameters.
Signed-off-by: Zhenyu Wang <zhenyuw@linux.intel.com>
Signed-off-by: Jike Song <jike.song@intel.com>
kmap_atomic doesn't allow sleeping until the page is unmapped.
However, sleeping must be allowed while reading/writing guest
memory, so use kmap instead.
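A minimal sketch of the pattern (the copy itself is illustrative):

    /* kmap() keeps the mapping valid across sleeps, which kmap_atomic()
     * forbids */
    void *vaddr = kmap(page);

    memcpy(vaddr + offset, buf, len);    /* the path around this may sleep */

    kunmap(page);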
Signed-off-by: Bing Niu <bing.niu@intel.com>
Signed-off-by: Xiaoguang Chen <xiaoguang.chen@intel.com>
Signed-off-by: Jike Song <jike.song@intel.com>
Signed-off-by: Zhenyu Wang <zhenyuw@linux.intel.com>
When the guest writes a PPGTT entry, if the new entry contains a value
and the old entry is valid, GVT reads the old entry to find and free
the corresponding old data. However, with the KVM write protection
provided by page_track, the guest entry is already written with the
new value before GVT handles it. To avoid that, we should use the
shadow entry instead.
Signed-off-by: Bing Niu <bing.niu@intel.com>
Signed-off-by: Jike Song <jike.song@intel.com>
Signed-off-by: Zhenyu Wang <zhenyuw@linux.intel.com>
All the unused entries in the page table tree (PML4E->PDPE->PDE->PTE)
should point to a scratch page table/scratch page to avoid page walk errors
caused by page prefetching.
When an entry in the shadow PPGTT is removed, it also needs to map to a
scratch page. The older implementation assigned a single scratch page to
entries at every level, which does not match the page walk behavior when
the removed entry is in the PML4, PDP or PD. To avoid potential page walk
errors, this patch implements a scratch page tree to replace the single
scratch page.
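A rough sketch of the idea, with all helper names invented for
illustration:

    /* build the scratch tree bottom-up: every entry of a level's scratch
     * table points at the scratch object of the level below, so a
     * prefetching walk through a removed PML4E/PDPE/PDE/PTE always lands
     * on valid pages */
    ret = alloc_scratch_data_page(gtt);                   /* scratch page */
    if (!ret)
        ret = alloc_scratch_table(gtt, SCRATCH_PT, SCRATCH_PAGE);
    if (!ret)
        ret = alloc_scratch_table(gtt, SCRATCH_PD, SCRATCH_PT);
    if (!ret)
        ret = alloc_scratch_table(gtt, SCRATCH_PDP, SCRATCH_PD);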
v2: more details in commit message address Kevin's comments.
Signed-off-by: Ping Gao <ping.a.gao@intel.com>
Signed-off-by: Zhenyu Wang <zhenyuw@linux.intel.com>
When SW wishes to reset the render engine, it programs the
engine's reset control register and waits for a response from HW.
We need to emulate the behavior of this register so the guest i915
driver can walk through the engine reset flow. These registers
are not emulated in GVT yet; this patch adds the emulation
logic.
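A sketch of the handler shape this implies, assuming the usual
write_vreg()/vgpu_vreg() helpers and the RESET_CTL bit names:

    static int ring_reset_ctl_write(struct intel_vgpu *vgpu, unsigned int offset,
                                    void *p_data, unsigned int bytes)
    {
        u32 data;

        write_vreg(vgpu, offset, p_data, bytes);
        data = vgpu_vreg(vgpu, offset);

        /* the guest sets the masked "request reset" bit and polls for
         * "ready for reset"; answer immediately, there is no real engine
         * to quiesce on the vGPU side */
        if (data & _MASKED_BIT_ENABLE(RESET_CTL_REQUEST_RESET))
            data |= RESET_CTL_READY_TO_RESET;
        else if (data & _MASKED_BIT_DISABLE(RESET_CTL_REQUEST_RESET))
            data &= ~RESET_CTL_READY_TO_RESET;

        vgpu_vreg(vgpu, offset) = data;
        return 0;
    }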
v2: add more desc info in commit message.
Signed-off-by: Zhi Wang <zhi.a.wang@intel.com>
Signed-off-by: Du, Changbin <changbin.du@intel.com>
Signed-off-by: Zhenyu Wang <zhenyuw@linux.intel.com>
Since commit e95433c73a, workload status setting
is only captured on the error path, but we need to set it properly on the
normal path too, otherwise we fail to complete the workload, which can lead
to a guest VM vGPU reset.
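A minimal sketch of the intended behavior (field and call names
assumed):

    long ret;

    /* record success as well as failure, so the completion path sees a
     * finished workload instead of a stale error value */
    ret = i915_wait_request(workload->req, 0, MAX_SCHEDULE_TIMEOUT);
    if (ret < 0) {
        workload->status = ret;
        gvt_err("fail to wait workload, skip\n");
    } else {
        workload->status = 0;
    }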
v2: use braces and add Fixes tag.
Fixes: e95433c73a ("drm/i915: Rearrange i915_wait_request() accounting with callers")
Cc: Chris Wilson <chris@chris-wilson.co.uk>
Signed-off-by: Zhenyu Wang <zhenyuw@linux.intel.com>
The misc ctl related registers are for WA purposes; the stepping
info should be checked before updating the HW value.
Signed-off-by: Ping Gao <ping.a.gao@intel.com>
Signed-off-by: Zhenyu Wang <zhenyuw@linux.intel.com>
An explicit write_vreg is needed in the TLB MMIO write handler;
besides that, the TLB vreg should be updated to follow the HW status
for correct emulation.
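A sketch of the handler shape being described; the completion model
(clear on write) is a simplifying assumption:

    static int ring_tlb_ctrl_write(struct intel_vgpu *vgpu, unsigned int offset,
                                   void *p_data, unsigned int bytes)
    {
        /* keep the vreg in sync with what the guest wrote */
        write_vreg(vgpu, offset, p_data, bytes);

        /* real HW clears the invalidate bit once the flush is done; mirror
         * that so the guest's poll loop terminates */
        vgpu_vreg(vgpu, offset) = 0;
        return 0;
    }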
Signed-off-by: Ping Gao <ping.a.gao@intel.com>
Signed-off-by: Zhenyu Wang <zhenyuw@linux.intel.com>
The missing write_vreg in the DMA_CTRL write handler caused an
obsolete value to be returned when the vreg was read.
v2: get data from vreg after updating it.
Signed-off-by: Ping Gao <ping.a.gao@intel.com>
Signed-off-by: Zhenyu Wang <zhenyuw@linux.intel.com>
Remove the variable 'execlist' as it's unused in function
vgpu_has_pending_workload.
Signed-off-by: Ping Gao <ping.a.gao@intel.com>
Reviewed-by: Zhi Wang <zhi.a.wang@intel.com>
Signed-off-by: Zhenyu Wang <zhenyuw@linux.intel.com>
This is to fix smatch warning on
drivers/gpu/drm/i915/gvt/cmd_parser.c:1421 cmd_handler_mi_op_2f()
warn: shift has higher precedence than mask
We need a mask of bits 20:19 for the data size.
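The intended extraction, sketched with the generic GENMASK() helper
(the surrounding code is assumed):

    /* bits 20:19 of the first command dword select the data size */
    data_size = (cmd_val(s, 0) & GENMASK(20, 19)) >> 19;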
Cc: Dan Carpenter <dan.carpenter@oracle.com>
Signed-off-by: Zhenyu Wang <zhenyuw@linux.intel.com>
Our low-level wait routine has evolved from our generic wait interface
that handled unlocked, RPS boosting, waits with time tracking. If we
push our GEM fence tracking to use reservation_objects (required for
handling multiple timelines), we lose the ability to pass the required
information down to i915_wait_request(). However, if we push the extra
functionality from i915_wait_request() to the individual callsites
(i915_gem_object_wait_rendering and i915_gem_wait_ioctl) that make use
of those extras, we can both simplify our low level wait and prepare for
extending the GEM interface for use of reservation_objects.
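Assuming the reworked signature (flags plus an explicit timeout,
returning the remaining time or an error), a caller-side sketch looks
roughly like:

    /* the caller now decides about locking, interruptibility and timeout */
    long timeout = i915_wait_request(req,
                                     I915_WAIT_INTERRUPTIBLE | I915_WAIT_LOCKED,
                                     MAX_SCHEDULE_TIMEOUT);
    if (timeout < 0)
        return timeout;                 /* e.g. -ERESTARTSYS */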
v2: Rewrite i915_wait_request() kerneldocs
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Matthew Auld <matthew.william.auld@gmail.com>
Reviewed-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/20161028125858.23563-4-chris@chris-wilson.co.uk
We cannot use the blocking mutex_lock inside a wait loop. Here
we invoke pick_next_workload(), which needs to acquire a
mutex, in our "condition" expression, and then go into another
going-to-sleep sequence, changing the task state. This is
dangerous. Let's rewrite the wait sequence to avoid nested
sleeping.
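A sketch of the reworked loop, assuming a wait_woken()-based pattern:

    DEFINE_WAIT_FUNC(wait, woken_wake_function);

    add_wait_queue(&scheduler->waitq[ring_id], &wait);
    do {
        /* pick_next_workload() may take a mutex and sleep: call it while
         * the task is still running, never from inside a wait condition */
        workload = pick_next_workload(gvt, ring_id);
        if (workload)
            break;
        wait_woken(&wait, TASK_INTERRUPTIBLE, MAX_SCHEDULE_TIMEOUT);
    } while (!kthread_should_stop());
    remove_wait_queue(&scheduler->waitq[ring_id], &wait);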
v2: fix do...while loop exit condition (zhenyu)
v3: rebase to gvt-staging branch
Signed-off-by: Du, Changbin <changbin.du@intel.com>
Signed-off-by: Zhenyu Wang <zhenyuw@linux.intel.com>
Throw an error message in the ELSP emulation handler based on the
execlist submission result. The guest will trigger the TDR process for
recovery; GVT just follows the guest's wishes.
v2: propagate the error to the top of the MMIO emulation logic, per
comments from zhenyu
Signed-off-by: Bing Niu <bing.niu@intel.com>
Signed-off-by: Zhenyu Wang <zhenyuw@linux.intel.com>
A full vGPU reset needs to release all the shadow PPGTT pages to avoid
unnecessary write-protection, and should also re-initialize pvinfo after
resetting the vregs to keep pvinfo correct.
Signed-off-by: Ping Gao <ping.a.gao@intel.com>
Signed-off-by: Zhenyu Wang <zhenyuw@linux.intel.com>
current_vgpu will be set to NULL after stopping the scheduler when
the reset is triggered by the current vgpu, so the condition used to
detect the current vgpu needs to be changed.
Signed-off-by: Ping Gao <ping.a.gao@intel.com>
Signed-off-by: Zhenyu Wang <zhenyuw@linux.intel.com>
The emulation handler for the GDRST MMIO is missing the vreg write,
and as a result the vreg is not updated correspondingly.
Signed-off-by: Ping Gao <ping.a.gao@intel.com>
Signed-off-by: Zhenyu Wang <zhenyuw@linux.intel.com>
Like other routines, intel_gvt_hypervisor_detect_host returns 0
for success.
Signed-off-by: Xiaoguang Chen <xiaoguang.chen@intel.com>
Signed-off-by: Jike Song <jike.song@intel.com>
Signed-off-by: Zhenyu Wang <zhenyuw@linux.intel.com>
The Linux PCI driver saves the MSI and MSI-X capability offsets
in pci_dev->msi_cap and pci_dev->msix_cap. We can use msi_cap
from pci_dev directly; there is no need to hardcode it.
Signed-off-by: Du, Changbin <changbin.du@intel.com>
Signed-off-by: Zhenyu Wang <zhenyuw@linux.intel.com>
The macro set_mask_bits() is ready for us; just invoke it and remove
our own write_bits().
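A minimal usage sketch (the variable and values are made up):

    unsigned long flags = 0;

    /* atomically do: flags = (flags & ~mask) | bits, via cmpxchg */
    set_mask_bits(&flags, GENMASK(3, 0), 0x5);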
Signed-off-by: Du, Changbin <changbin.du@intel.com>
Signed-off-by: Zhenyu Wang <zhenyuw@linux.intel.com>
It is better to use the %p format for void pointers instead of casting
them, because a void * is not necessarily a 64-bit value.
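For example (the variable here is illustrative):

    /* %p prints the pointer at its native width on 32- and 64-bit kernels */
    pr_debug("ring context at %p\n", ring_context);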
Signed-off-by: Jérémy Lefaure <jeremy.lefaure@lse.epita.fr>
Signed-off-by: Zhenyu Wang <zhenyuw@linux.intel.com>
Since ioread32 returns a 32-bit value, it is impossible to left-shift
this value by 32 bits (it produces a compilation error). Casting the
return value of ioread32 fixes this issue.
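A sketch of the pattern (addr is an illustrative __iomem pointer):

    /* widen the 32-bit read before shifting so the high dword survives */
    u64 val = ((u64)ioread32(addr + 4) << 32) | ioread32(addr);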
Signed-off-by: Jérémy Lefaure <jeremy.lefaure@lse.epita.fr>
Signed-off-by: Zhenyu Wang <zhenyuw@linux.intel.com>
When invalidating the RCS TLB, the device can enter the RC6 state and
interrupt the process, hence the need to hold render forcewake for the
whole procedure.
This WA is needed for all production SKL SKUs.
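A sketch of the resulting shape, assuming the render forcewake domain,
the _FW accessors and wait_for_atomic():

    intel_uncore_forcewake_get(dev_priv, FORCEWAKE_RENDER);

    I915_WRITE_FW(reg, 0x1);              /* kick off the TLB invalidation */
    if (wait_for_atomic(I915_READ_FW(reg) == 0, 50))
        gvt_err("timeout invalidating RCS TLB\n");

    intel_uncore_forcewake_put(dev_priv, FORCEWAKE_RENDER);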
v2: reworked putting and getting forcewake with help of Mika Kuoppala
v3: use I915_READ_FW and I915_WRITE_FW as we are handling forcewake
on our own in the code path
References: HSD#2136899, HSD#1404391274
Cc: Mika Kuoppala <mika.kuoppala@intel.com>
Cc: Zhenyu Wang <zhenyuw@linux.intel.com>
Signed-off-by: Arkadiusz Hiler <arkadiusz.hiler@intel.com>
Signed-off-by: Zhenyu Wang <zhenyuw@linux.intel.com>
Function create_scratch_page() may fail in some cases.
Signed-off-by: Du, Changbin <changbin.du@intel.com>
Signed-off-by: Zhenyu Wang <zhenyuw@linux.intel.com>
A function's return value should have type int if it returns
an integer value.
Signed-off-by: Du, Changbin <changbin.du@intel.com>
Signed-off-by: Zhenyu Wang <zhenyuw@linux.intel.com>
Mark all local functions & variables as static.
Signed-off-by: Du, Changbin <changbin.du@intel.com>
Signed-off-by: Zhenyu Wang <zhenyuw@linux.intel.com>
Add proper __iomem annotation for pointers obtained via ioremap().
Signed-off-by: Du, Changbin <changbin.du@intel.com>
Signed-off-by: Zhenyu Wang <zhenyuw@linux.intel.com>
Switch to the new for_each_engine() helper to properly access
enabled intel_engine_cs, as the i915 core has changed them to be
dynamically managed. GVT-g init still depends on the ring
mask to determine the engine list, since it runs earlier.
Signed-off-by: Zhenyu Wang <zhenyuw@linux.intel.com>
This code was removed from i915_cmd_parser.c but still an obsolete
version wound up being duplicated into gvt/cmd_parser.c. Good riddance.
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Signed-off-by: Zhenyu Wang <zhenyuw@linux.intel.com>
We have the ability to map an object, so use it rather than opencode it
badly. Note that the object remains permanently pinned, this is poor
practise.
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Signed-off-by: Zhenyu Wang <zhenyuw@linux.intel.com>
We have the ability to map an object, so use it rather than opencode it
badly.
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Signed-off-by: Zhenyu Wang <zhenyuw@linux.intel.com>
For whatever reason, the gvt scheduler runs synchronously. At the very
least, let's run synchronously without holding the struct_mutex.
v2: cut'n'paste mutex_lock instead of unlock.
Replace long hold of struct_mutex with a mutex to serialise the worker
threads.
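A sketch of the shape this gives the worker (the lock name is
assumed):

    static DEFINE_MUTEX(scheduler_mutex);    /* one worker runs at a time */

    mutex_lock(&scheduler_mutex);

    /* shadow, dispatch and wait for the workload here, taking
     * dev_priv->drm.struct_mutex only for the short sections that
     * actually need it */
    ret = dispatch_workload(workload);

    mutex_unlock(&scheduler_mutex);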
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Signed-off-by: Zhenyu Wang <zhenyuw@linux.intel.com>
The kthread will not be interrupted, don't even bother checking.
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Zhenyu Wang <zhenyuw@linux.intel.com>
Signed-off-by: Zhenyu Wang <zhenyuw@linux.intel.com>
The workload took a pointer to the request, and even waited upon it,
without holding a reference on the request. Take that reference
explicitly and fix up the error path following request allocation that
missed flushing the request.
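A sketch of the reference discipline (field names assumed):

    /* keep the request alive for as long as the workload points at it */
    workload->req = i915_gem_request_get(rq);

    /* ... dispatch and wait ... */

    i915_gem_request_put(workload->req);
    workload->req = NULL;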
v2: [zhenyuw]
- drop request put in error path for dispatch, as main thread
caller will handle it identically to a real request.
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Signed-off-by: Zhenyu Wang <zhenyuw@linux.intel.com>
Unpinning the pages prior to the object being release from the GPU may
allow the GPU to read and write into system pages (i.e. use after free
by the hw).
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Signed-off-by: Zhenyu Wang <zhenyuw@linux.intel.com>
The purpose of returning the just-pinned VMA is so that we can use the
information within, like its address. Also it should be tracked and used
as the cookie to unpin...
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Zhenyu Wang <zhenyuw@linux.intel.com>
Signed-off-by: Zhenyu Wang <zhenyuw@linux.intel.com>
On failure from i915_gem_object_create(), we need to check for an error
pointer, not NULL.
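The check this implies, sketched (the argument form is assumed):

    obj = i915_gem_object_create(dev_priv, size);
    if (IS_ERR(obj))                /* ERR_PTR(-ENOMEM) etc., never NULL */
        return PTR_ERR(obj);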
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Zhenyu Wang <zhenyuw@linux.intel.com>
Signed-off-by: Zhenyu Wang <zhenyuw@linux.intel.com>
Manipulating the fence_list requires the runtime wakelock, as does
writing to the fence registers. Acquire a wakelock for the former, and
assert that the device is awake for the latter.
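A sketch of the two halves (call sites assumed):

    /* adding to the fence_list needs the device awake */
    intel_runtime_pm_get(dev_priv);
    /* ... manipulate dev_priv->mm.fence_list ... */
    intel_runtime_pm_put(dev_priv);

    /* the register write path only asserts that someone holds a wakeref */
    assert_rpm_wakelock_held(dev_priv);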
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Signed-off-by: Zhenyu Wang <zhenyuw@linux.intel.com>