Commit Graph

313 Commits

Author SHA1 Message Date
Dave Airlie
735f463af7 Merge tag 'drm-intel-next-2017-08-18' of git://anongit.freedesktop.org/git/drm-intel into drm-next
Final pile of features for 4.14

- New ioctl to change NOA configurations, plus prep (Lionel)
- CCS (color compression) scanout support, based on the fancy new
  modifier additions (Ville&Ben)
- Document i915 register macro style (Jani)
- Many more gen10/cnl patches (Rodrigo, Pualo, ...)
- More gpu reset vs. modeset duct-tape to restore the old way.
- prep work for cnl: hpd_pin reorg (Rodrigo), support for more power
  wells (Imre), i2c pin reorg (Anusha)
- drm_syncobj support (Jason Ekstrand)
- forcewake vs gpu reset fix (Chris)
- execbuf speedup for the no-relocs fastpath, anv/vk low-overhead ftw (Chris)
- switch to idr/radixtree instead of the resizing ht for execbuf id->vma
  lookups (Chris)

gvt:
- MMIO save/restore optimization (Changbin)
- Split workload scan vs. dispatch for more parallel exec (Ping)
- vGPU full 48bit ppgtt support (Joonas, Tina)
- vGPU hw id expose for perf (Zhenyu)

Bunch of work all over to make the igt CI runs more complete/stable.
Watch https://intel-gfx-ci.01.org/tree/drm-tip/shards-all.html for
progress in getting this ready. Next week we're going into production
mode (i.e. will send results to intel-gfx) on hsw, more platforms to
come.

Also, a new maintainer tram, I'm stepping out. Huge thanks to Jani for
being an awesome co-maintainer the past few years, and all the best
for Jani, Joonas&Rodrigo as the new maintainers!

* tag 'drm-intel-next-2017-08-18' of git://anongit.freedesktop.org/git/drm-intel: (179 commits)
  drm/i915: Update DRIVER_DATE to 20170818
  drm/i915/bxt: use NULL for GPIO connection ID
  drm/i915: Mark the GT as busy before idling the previous request
  drm/i915: Trivial grammar fix s/opt of/opt out of/ in comment
  drm/i915: Replace execbuf vma ht with an idr
  drm/i915: Simplify eb_lookup_vmas()
  drm/i915: Convert execbuf to use struct-of-array packing for critical fields
  drm/i915: Check context status before looking up our obj/vma
  drm/i915: Don't use MI_STORE_DWORD_IMM on Sandybridge/vcs
  drm/i915: Stop touching forcewake following a gen6+ engine reset
  MAINTAINERS: drm/i915 has a new maintainer team
  drm/i915: Split pin mapping into per platform functions
  drm/i915/opregion: let user specify override VBT via firmware load
  drm/i915/cnl: Reuse skl_wm_get_hw_state on Cannonlake.
  drm/i915/gen10: implement gen 10 watermarks calculations
  drm/i915/cnl: Fix LSPCON support.
  drm/i915/vbt: ignore extraneous child devices for a port
  drm/i915/cnl: Setup PAT Index.
  drm/i915/edp: Allow alternate fixed mode for eDP if available.
  drm/i915: Add support for drm syncobjs
  ...
2017-08-22 10:03:07 +10:00
Imre Deak
9c3a16c887 drm/i915/hsw+: Add support for multiple power well regs
Future platforms increase the number of power wells which require
additional control registers. A convenient way to select the correct
register is to use the high bits of the power well ID as index. This
patch only prepares for this, while upcoming platform enabling patches
will add the actual new power well IDs and corresponding power well
control registers.

Cc: Paulo Zanoni <paulo.r.zanoni@intel.com>
Cc: Animesh Manna <animesh.manna@intel.com>
Cc: Rakshmi Bhatia <rakshmi.bhatia@intel.com>
Signed-off-by: Imre Deak <imre.deak@intel.com>
Reviewed-by: Animesh Manna <animesh.manna@intel.com>
Reviewed-by: Rakshmi Bhatia <rakshmi.bhatia@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20170814151530.24154-2-imre.deak@intel.com
2017-08-15 15:28:10 +03:00
Dave Airlie
0c697fafc6 Linux 4.13-rc5
-----BEGIN PGP SIGNATURE-----
 
 iQEcBAABAgAGBQJZkNpUAAoJEHm+PkMAQRiGr68H/2nr8kxpoUhZ7eA5C71waCjh
 gnJSevkzJAp+fCb0KfQFAp1qvpmLLle4e6tAxYgTQZg4Z3W5cJJNfxu9TzY5sGuL
 o9QUr43XzABepW4e4jhRtZv6dj3K6XruNeDQKXDZTDcc/S8zoiS/Pltq7VgPcAuM
 kX+3qsNdUyknngD6b0z9NtJkb0mHKY6J8MpraWRO34egDwsaN/tuhRj0DRQpCoyQ
 x/k+hMbc9MB9Dn8cfACo6Omb+r5Rfd7dTBUAju/TnIIgs//9voHba307N7XvLJZg
 kWc8MqMQQZXfRZHB0atpDMHyZS/XQRlNPXj76j0+Ud/byODKTFkkazmgTpALvj8=
 =CxeU
 -----END PGP SIGNATURE-----

Backmerge tag 'v4.13-rc5' into drm-next

Linux 4.13-rc5

There's a really nasty nouveau collision, hopefully someone can take a look
once I pushed this out.
2017-08-15 16:16:58 +10:00
Tina Zhang
6b3816d696 drm/i915/gvt: Fix guest i915 full ppgtt blocking issue
Guest i915 full ppgtt functionality was blocking by an issue, which would
lead to gpu hardware hang. Guest i915 driver may update the ppgtt table
just before this workload is going to be submitted to the hardware by
device model. This case wasn't handled well by device model before, due
to the small time window between removing old ppgtt entry and adding the
new one. Errors occur when the workload is executed by hardware during
that small time window. This patch is to remove this time window by adding
the new ppgtt entry first and then remove the old one.

Changes in v2:
- Move VGT_CAPS_FULL_PPGTT introduction to patch 2/4. (Joonas)

Changes since v2:
- Divide the whole patch set into two separate patch series, with one
  patch in i915 side to check guest i915 full ppgtt capability and enable
  it when this capability is supported by the device model, and the other
  one in gvt side which fixs the blocking issue and enables the device
  model to provide the capability to guest. And this patch focuses on gvt
  side. (Joonas)
- Change the title from "reorder the shadow ppgtt update process by adding
  entry first" to "Fix guest i915 full ppgtt blocking issue". (Tina)

Changes since v3:
- Rebase to the latest branch.

Changes since v4:
- Tested by Tina Zhang.

Changes since v5:
- Rebase to the latest branch.

v6:
- Update full 48bit ppgtt definition

Cc: Tina Zhang <tina.zhang@intel.com>
Signed-off-by: Tina Zhang <tina.zhang@intel.com>
Signed-off-by: Zhenyu Wang <zhenyuw@linux.intel.com>
2017-08-15 10:13:09 +08:00
Kechen Lu
9dfb8e5b91 drm/i915/gvt: Add shadow context descriptor updating
The current context logic only updates the descriptor of context when
it's being pinned to graphics memory space. But this cannot satisfy the
requirement of shadow context. The addressing mode of the pinned shadow
context descriptor may be changed according to the guest addressing mode.
And this won't be updated, as the already pinned shadow context has no
chance to update its descriptor. And this will lead to GPU hang issue,
as shadow context is used with wrong descriptor. This patch fixes this
issue by letting the pinned shadow context descriptor update its
addressing mode on demand.

This patch fixes GPU HANG issue which happends after changing the
grub parameter i915.enable_ppgtt form 0x01 to 0x03 or vice versa and
then rebooting the guest.

Signed-off-by: Tina Zhang <tina.zhang@intel.com>
Signed-off-by: Kechen Lu <kechen.lu@intel.com>
Signed-off-by: Zhenyu Wang <zhenyuw@linux.intel.com>
2017-08-10 10:26:15 +08:00
Zhenyu Wang
a45050d718 drm/i915/gvt: expose vGPU context hw id
This exposes vGPU context hw id in mdev sysfs which is used to
do vGPU based profiling. Retrieved vGPU context hw id can be set
through i915 perf ioctl to set profiling for target vGPU.

Cc: Jiao Pengyuan <pengyuan.jiao@intel.com>
Cc: Niu Bing <bing.niu@intel.com>
Signed-off-by: Zhenyu Wang <zhenyuw@linux.intel.com>
2017-08-10 10:26:11 +08:00
Chuanxiao Dong
4d3e67bb6f drm/i915/gvt: Refine the intel_vgpu_reset_gtt reset function
When doing the VGPU reset, we don't need to do the gtt/ppgtt reset.
This will make the GVT to do the ppgtt shadow every time for
a workload and caused really bad performance after a VGPU reset.
This patch will make sure ppgtt clean only happen at device module
level reset to fix this.

Signed-off-by: Chuanxiao Dong <chuanxiao.dong@intel.com>
Signed-off-by: Zhenyu Wang <zhenyuw@linux.intel.com>
2017-08-10 10:26:10 +08:00
Changbin Du
4b2dbbc225 drm/i915/gvt: Add carefully checking in GTT walker paths
When debugging the gtt code, found the intel_vgpu_gma_to_gpa() can
translate any given GMA though the GMA is not valid. This because
the GTT ops suppress the possible errors, which may result in an
invalid PT entry is retrieved by upper caller.

This patch changed the prototype of pte ops to propagate status to
callers. Then we make sure the GTT walker stop as early as when
a error is detected to prevent undefined behavior.

Signed-off-by: Changbin Du <changbin.du@intel.com>
Signed-off-by: Zhenyu Wang <zhenyuw@linux.intel.com>
2017-08-10 10:26:09 +08:00
Jian Jun Chen
36ed7e97e2 drm/i915/gvt: Remove duplicated MMIO entries
Remove duplicated MMIO entries in the tracked MMIO list. -EEXIST
is returned if duplicated MMIO entries are found when new MMIO
entry is added.

v2:
- Use WARN(1, ...) for more verbose message. (Zhenyu)

Signed-off-by: Jian Jun Chen <jian.jun.chen@intel.com>
Cc: Zhi Wang <zhi.a.wang@intel.com>
Cc: Changbin Du <changbin.du@intel.com>
Cc: Zhenyu Wang <zhenyuw@linux.intel.com>
Reviewed-by: Yulei Zhang <yulei.zhang@intel.com>
Signed-off-by: Zhenyu Wang <zhenyuw@linux.intel.com>
2017-08-10 10:26:09 +08:00
Zhenyu Wang
73821a53d0 drm/i915/gvt: take runtime pm when do early scan and shadow
Need to take runtime pm when do early scan/shadow of workload
for request operations.

Fixes: 7fa56bd159bc ("drm/i915/gvt: Audit and shadow workload during ELSP writing")
Cc: Ping Gao <ping.a.gao@intel.com>
Signed-off-by: Zhenyu Wang <zhenyuw@linux.intel.com>
2017-08-10 10:26:08 +08:00
Ping Gao
64d8bb83b6 drm/i915/gvt: Replace duplicated code with exist function
Use the exist function intel_gvt_ggtt_validate_range to replace
these duplicated code that do the same thing.

Signed-off-by: Ping Gao <ping.a.gao@intel.com>
Signed-off-by: Zhenyu Wang <zhenyuw@linux.intel.com>
2017-08-10 10:26:08 +08:00
Ping Gao
87e919d741 drm/i915/gvt: To check whether workload scan and shadow has mutex hold
The function workload scan and shadow have to hold the drm.struct_mutex
before called. To avoid misusing of this function, add a lockdep assert
in it.

Signed-off-by: Ping Gao <ping.a.gao@intel.com>
Signed-off-by: Zhenyu Wang <zhenyuw@linux.intel.com>
2017-08-10 10:26:07 +08:00
Ping Gao
d0302e7400 drm/i915/gvt: Audit and shadow workload during ELSP writing
Let the workload audit and shadow ahead of vGPU scheduling, that
will eliminate GPU idle time and improve performance for multi-VM.

The performance of Heaven running simultaneously in 3VMs has
improved 20% after this patch.

v2:Remove condition current->vgpu==vgpu when shadow during ELSP
writing.

Signed-off-by: Ping Gao <ping.a.gao@intel.com>
Reviewed-by: Zhi Wang <zhi.a.wang@intel.com>
Signed-off-by: Zhenyu Wang <zhenyuw@linux.intel.com>
2017-08-10 10:26:07 +08:00
Ping Gao
89ea20b930 drm/i915/gvt: Factor out scan and shadow from workload dispatch
To perform the workload scan and shadow in ELSP writing stage for
performance consideration, the workload scan and shadow stuffs
should be factored out from dispatch_workload().

v2:Put context pin before i915_add_request;
   Refine the comments;
   Rename some APIs;

v3:workload->status should set only when error happens.
v4:i915_add_request is must to have after i915_gem_request_alloc.

Signed-off-by: Ping Gao <ping.a.gao@intel.com>
Reviewed-by: Zhi Wang <zhi.a.wang@intel.com>
Signed-off-by: Zhenyu Wang <zhenyuw@linux.intel.com>
2017-08-10 10:26:06 +08:00
Changbin Du
4671ea2041 drm/i915/gvt: Optimize ring siwtch 2x faster again by light weight mmio access wrapper
The I915_READ/WRITE is not only a mmio read/write, it also contains
debug checking and Forcewake domain lookup. This is too heavy for
GVT ring switch case which access batch of mmio registers on ring
switch. We can handle Forcewake manually and use the raw
i915_read/write instead. The benefit from this is 2x faster mmio
switch performance.
         Before       After
cycles  ~550000      ~250000

v2: Use existing I915_READ_FW/I915_WRITE_FW macro. (zhenyu)

Signed-off-by: Changbin Du <changbin.du@intel.com>
Reviewed-by: Zhenyu Wang <zhenyuw@linux.intel.com>
Signed-off-by: Zhenyu Wang <zhenyuw@linux.intel.com>
2017-08-10 10:26:06 +08:00
Changbin Du
f846c8de64 drm/i915/gvt: Optimize ring siwtch 2x faster by removing unnecessary POSTING_READ
There are lots of POSTING_READ alongside each mmio write Op. While
actually this is not necessary. It just bring too much latency since
PCIe read Op is very slow which is of non-posted transaction.

For PCIe device, the mem transaction for strong ordering rules are:
  o PCIe mmio write sequence is FIFO. Posted request cannot
    pass previous posted request.
  o PCIe mmio read will not go ahead of previous write.

Intel graphics doesn't support RO, so we can apply above rules. In
our case, we only need one POSTING_READ at last. This can remove
half of mmio read Op and then the average ring switch performance
is nearly doubled.
         Before       After
cycles  ~970000      ~550000

Signed-off-by: Changbin Du <changbin.du@intel.com>
Signed-off-by: Zhenyu Wang <zhenyuw@linux.intel.com>
2017-08-10 10:26:05 +08:00
Chuanxiao Dong
4cf196eb1e drm/i915/gvt: Use gvt_err to print the resource not enough error
It is better to use gvt_err when the gvt resource is not enough so
the user can be notified from the kernel dmesg. And this kind of
error message is gvt related.

Suggested-by: Bing Niu <bing.niu@intel.com>
Signed-off-by: Chuanxiao Dong <chuanxiao.dong@intel.com>
Cc: Bing Niu <bing.niu@intel.com>
Signed-off-by: Zhenyu Wang <zhenyuw@linux.intel.com>
2017-08-10 10:26:05 +08:00
Xiong Zhang
d6086598d3 drm/i915/gvt: Change the max length of mmio_reg_rw from 4 to 8
When linux guest access mmio with __raw_i915_read64 or __raw_i915_write64,
its length is 8 bytes.

This fix the linux guest in xengt couldn't boot up as it fail in
reading pv_info->magic.

Fixes: 65f9f6febf ("drm/i915/gvt: Optimize MMIO register handling for some large MMIO blocks")
Signed-off-by: Xiong Zhang <xiong.y.zhang@intel.com>
Signed-off-by: Zhenyu Wang <zhenyuw@linux.intel.com>
2017-08-07 15:50:39 +08:00
Tina Zhang
02b6ed4430 drm/i915/gvt: Initialize MMIO Block with HW state
MMIO block with tracked mmio, is introduced for the sake of performance
of searching tracked mmio. All the tracked mmio needs to get the initial
value from the HW state during vGPU being created. This patch is to
initialize the tracked registers in MMIO block with the HW state.

v2: Add "Fixes:" line for this patch (Zhenyu)

Fixes: 65f9f6febf ("drm/i915/gvt: Optimize MMIO register handling for some large MMIO blocks")
Signed-off-by: Tina Zhang <tina.zhang@intel.com>
Signed-off-by: Zhenyu Wang <zhenyuw@linux.intel.com>
2017-08-04 17:39:41 +08:00
Chuanxiao Dong
f2e2c00adc drm/i915/gvt: clean workload queue if error happened
If a workload caused a HW GPU hang or it is in the middle of
vGPU reset, the workload queue should be cleaned up to emulate
the hang state of the GPU.

v2:
- use ENGINE_MASK(ring_id) instead of (1 << ring_id). (Zhenyu)

Signed-off-by: Chuanxiao Dong <chuanxiao.dong@intel.com>
Cc: Zhenyu Wang <zhenyuw@linux.intel.com>
Signed-off-by: Zhenyu Wang <zhenyuw@linux.intel.com>
2017-08-02 10:07:46 +08:00
Chuanxiao Dong
6184cc8ddb drm/i915/gvt: change resetting to resetting_eng
Use resetting_eng to identify which engine is resetting
so the rest ones' workload won't be impacted

v2:
- use ENGINE_MASK(ring_id) instead of (1 << ring_id). (Zhenyu)

Signed-off-by: Chuanxiao Dong <chuanxiao.dong@intel.com>
Cc: Zhenyu Wang <zhenyuw@linux.intel.com>
Signed-off-by: Zhenyu Wang <zhenyuw@linux.intel.com>
2017-08-02 10:07:40 +08:00
Imre Deak
b2891eb253 drm/i915/hsw+: Add has_fuses power well attribute
The pattern of a power well backing a set of fuses whose initialization
we need to wait for during power well enabling is common to all GEN9+
platforms. Adding support for this to the HSW power well enable helper
allows us to use the HSW/BDW power well code for GEN9+ as well in a
follow-up patch.

v2:
- Use an enum for power gates instead of raw numbers. (Ville)

Signed-off-by: Imre Deak <imre.deak@intel.com>
Reviewed-by: Arkadiusz Hiler <arkadiusz.hiler@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20170711204236.5618-6-imre.deak@intel.com
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
2017-07-27 09:38:53 +02:00
Imre Deak
1af474fef2 drm/i915/hsw+: Unify the hsw/bdw and gen9+ power well req/state macros
Although on HSW/BDW there is only a single display global power well,
it's programmed the same way as other GEN9+ power wells. This also
means we can get at the HSW/BDW request and status flags the same way
it's done on GEN9+ by assigning the corresponding HSW/BDW power well ID.
This ID was assigned in a recent patch, so we can now switch to using
the same macros everywhere on HSW+.

Updating the HSW power well control register with RMW is not strictly
necessary, but this will allow us to use the same code for GEN9+.

Signed-off-by: Imre Deak <imre.deak@intel.com>
Reviewed-by: Arkadiusz Hiler <arkadiusz.hiler@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/1499352040-8819-13-git-send-email-imre.deak@intel.com
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
2017-07-27 09:38:52 +02:00
Dave Airlie
2d62c799f8 Merge tag 'drm-intel-next-2017-07-17' of git://anongit.freedesktop.org/git/drm-intel into drm-next
2nd round of 4.14 features:

- prep for deferred fbdev setup
- refactor fixed 16.16 computations and skl+ wm code (Mahesh Kumar)
- more cnl paches (Rodrigo, Imre et al)
- tighten context cleanup and handling (Chris Wilson)
- fix interlaced handling on skl+ (Mahesh Kumar)
- small bits as usual

* tag 'drm-intel-next-2017-07-17' of git://anongit.freedesktop.org/git/drm-intel: (84 commits)
  drm/i915: Update DRIVER_DATE to 20170717
  drm/i915: Protect against deferred fbdev setup
  drm/i915/fbdev: Always forward hotplug events
  drm/i915/skl+: unify cpp value in WM calculation
  drm/i915/skl+: WM calculation don't require height
  drm/i915: Addition wrapper for fixed16.16 operation
  drm/i915: cleanup fixed-point wrappers naming
  drm/i915: Always perform internal fixed16 division in 64 bits
  drm/i915: take-out common clamping code of fixed16 wrappers
  drm/i915/cnl: Add missing type case.
  drm/i915/cnl: Add max allowed Cannonlake DC.
  drm/i915: Make DP-MST connector info work
  drm/i915/cnl: Get DDI clock based on PLLs.
  drm/i915/cnl: Inherit RPS stuff from previous platforms.
  drm/i915/cnl: Gen10 render context size.
  drm/i915/cnl: Don't trust VBT's alternate pin for port D for now.
  drm/i915: Fix the kernel panic when using aliasing ppgtt
  drm/i915/cnl: Cannonlake color init.
  drm/i915/cnl: Add force wake for gen10+.
  x86/gpu: CNL uses the same GMS values as SKL
  ...
2017-07-20 11:31:43 +10:00
fred gao
f43aa31fe7 drm/i915/gvt: Fix the vblank timer close issue after shutdown VMs in reverse
Once the Windows guest is shutdown, the display pipe will be disabled
and intel_gvt_check_vblank_emulation will be called to check if the
vblank timer is turned off. Given the scenario of creating VM1 ,VM2,
destoying VM2 in current code, VM1 has pipe enabled and continues to
check VM2, the flag have_enabled_pipe is always false since all the VM2
pipes are disabled, so the vblank timer will be canceled and TDR happens
in Windows VM1 guest due to the vsync timeout.

In this patch the vblank timer will be never canceled once one pipe is
enabled.

v2:
- remove have_enabled_pipe flag and check pipe enabled directly. (Zhenyu)

Cc: Wang Hongbo <hongbo.wang@intel.com>
Signed-off-by: fred gao <fred.gao@intel.com>
Signed-off-by: Zhenyu Wang <zhenyuw@linux.intel.com>
2017-07-17 17:09:04 +08:00
Chuanxiao Dong
0cf5ec4183 drm/i915/gvt: Use fence error from GVT request for workload status
The req->fence.error will be set if this request caused GPU hang so
we can use this value to workload->status to indicate whether this
GVT request caused any problem. If it caused GPU hang, we shouldn't
trigger any context switch back to the guest.

v2:
- only take -EIO from fence->error. (Zhenyu)

Fixes: 8f1117abb4 (drm/i915/gvt: handle workload lifecycle properly)
Signed-off-by: Chuanxiao Dong <chuanxiao.dong@intel.com>
Cc: Zhenyu Wang <zhenyuw@linux.intel.com>
Signed-off-by: Zhenyu Wang <zhenyuw@linux.intel.com>
2017-07-11 13:47:09 +08:00
Weinan Li
4cc74389a5 drm/i915/gvt: remove scheduler_mutex in per-engine workload_thread
For the vGPU workloads, now GVT-g use per vGPU scheduler, the per-ring
work_thread only pick workload belongs to the current vGPU. And with time
slice based scheduler, it waits all the engines become idle before do vGPU
switch. So we can run free dispatch in per-ring work_thread, different ring
running in different 'vGPU' won't happen.

For the workloads between vGPU and Host, this scheduler_mutex can't block
host to dispatch workload into other ring engines.

Here remove this mutex since it impacts the performance when applications
use more than 1 ring engines in 1 vgpu.

ring0 running in vGPU1, ring1 running in Host. Will happen.
ring0 running in vGPU1, ring1 running in vGPU2. Won't happen.

Signed-off-by: Weinan Li <weinan.z.li@intel.com>
Signed-off-by: Ping Gao <ping.a.gao@intel.com>
Signed-off-by: Zhenyu Wang <zhenyuw@linux.intel.com>
2017-07-11 13:46:58 +08:00
Chuanxiao Dong
08673c3e27 drm/i915/gvt: Revert "drm/i915/gvt: Fix possible recursive locking issue"
This reverts commit 62d02fd1f8.

The rwsem recursive trace should not be fixed from kvmgt side by using
a workqueue and it is an issue should be fixed in VFIO. So this one
should be reverted.

Signed-off-by: Chuanxiao Dong <chuanxiao.dong@intel.com>
Cc: Zhenyu Wang <zhenyuw@linux.intel.com>
Cc: stable@vger.kernel.org # v4.10+
Signed-off-by: Zhenyu Wang <zhenyuw@linux.intel.com>
2017-07-11 13:46:58 +08:00
Ping Gao
3364bf5fd0 drm/i915/gvt: Audit the command buffer address
The command buffer address in context like ring buffer base address
and wa_ctx address need to be audit to make sure they are in the
valid GGTT range.

Signed-off-by: Ping Gao <ping.a.gao@intel.com>
Signed-off-by: Zhenyu Wang <zhenyuw@linux.intel.com>
2017-07-11 13:46:58 +08:00
Zhou, Wenjia
0de9870989 drm/i915/gvt: Fix a memory leak in intel_gvt_init_gtt()
It will causes memory leak, if the function setup_spt_oos() fail,
in the function intel_gvt_init_gtt(),
which allocated by get_zeroed_page() and mapped by dma_map_page().

Unmap and free the page,  after STP oos initialize fail,
it will fix this issue.

Signed-off-by: Zhou, Wenjia <zhiyuan_zhu@htc.com>
Signed-off-by: Zhenyu Wang <zhenyuw@linux.intel.com>
2017-07-11 13:46:58 +08:00
Daniel Vetter
953152253e main drm pull for v4.13
-----BEGIN PGP SIGNATURE-----
 
 iQIcBAABAgAGBQJZYseIAAoJEAx081l5xIa+85kP/0zKzKKVzZXSXG2TAGb5jNfk
 Ex+TELG8tWk9KBxA7lEE5c0WEsnP79cNoXZLQu8wlUzO8+kwQK5Bz0zgNUkpSuo1
 RthwdsxBQX1++UxB+HoSG+dOa7hkKVqlgQR3z9qyhsBXzetkJV0DoYcpMV0A1EWd
 6Jzt+AvCShVkcW+21LqHPlc5EIVewrDMoA3oU6aYCLhyAOUTVvvQB2ML8YApH7TM
 JrSrzCFHTrQEBbGUrZQhzR0sZzZzk9byntb/I/mdVbHeCyIHiL8sC4PfWSOyyazm
 GkPnA8G3aFAY9haBRz9jG/VBr1yVb0mCBjkWQ1lGfIAOCDDSc+d7PDXdG+i4AewK
 jZheXlrDIdGgmJLy4W3rdEqJvdf7UQHZOs8594OL19l4+FxCTrol1JSHSMeavCvr
 8bUNil9Jb/ONU/wmp+q55U0k4TCTyerUA7gKnuaJAwBvd4n78/PKmQnbrWinDyJc
 GQXp6zESk9bKt5DXSnVZuVf4POTzpuAsQkkfX1V2y145EHTQYfS3jLENWqEjyZUy
 QtKCHZvRkJfGaFU4Pr+vBo9Iu1GlA5OiOv08QadldTT4OxUI0T6yaLDobHCQfKPE
 sc3wCuCM+/dAnqoKDcGC4hAmF8zDdO0kw65P2m7uC6T9Jm1G35CioKbzo+fzUhuL
 fg5TBpbp2Wwe2oPA5iBm
 =2S5N
 -----END PGP SIGNATURE-----

Merge tag 'drm-for-v4.13' into drm-intel-next-queued

Resync with the main drm-next pull request for 4.13. What we really
need is to fully resync with pending drm-misc, but that's not yet
possible due to the still ongoing merge window.

Signed-off-by: Daniel Vetter <daniel.vetter@intel.com>
2017-07-10 21:56:39 +02:00
Jani Nikula
507ad75736 Merge tag 'gvt-fixes-2017-06-29' of https://github.com/01org/gvt-linux into drm-intel-next-fixes
gvt-fixes-2017-06-29

- two race fixes for VFIO locks from Chuanxiao
- virtual display fix for BDW from Xiong

Signed-off-by: Jani Nikula <jani.nikula@intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/20170629065424.kxopjbvntuakbyz2@zhen-hp.sh.intel.com
2017-06-30 12:49:45 +03:00
Changbin Du
5cd82b7577 drm/i915/gvt: Make function dpy_reg_mmio_readx safe
The dpy_reg_mmio_read_x functions directly copy 4 bytes data to the
target address with considering the length. If may cause the target
memory corrupted if the requested length less than 4 bytes. Fix it
for safety even we already have some checking to avoid this happen.
And for convince, the 3 functions are merged.

Signed-off-by: Changbin Du <changbin.du@intel.com>
Signed-off-by: Zhenyu Wang <zhenyuw@linux.intel.com>
2017-06-29 11:15:11 +08:00
Xiong Zhang
75e64ff2c2 drm/i915/gvt: Don't read ADPA_CRT_HOTPLUG_MONITOR from host
When host connects a crt screen, linux guest will detect two
screens: crt and dp. This is wrong as linux guest has only
one dp.

In order to avoid guest get host crt screen, we should set
ADPA_CRT_HOTPLUG_MONITOR to none. But MMIO_RO(PCH_ADPA) prevent
from that. So MMIO_DH should be used instead of MMIO_RO.

v2: Clear its staus to none at initialize, so guest don't
    get host crt.(Zhangyu)
v3: SKL doesn't have this register, limit it to pre_skl.(xiong)

Signed-off-by: Xiong Zhang <xiong.y.zhang@intel.com>
Signed-off-by: Zhenyu Wang <zhenyuw@linux.intel.com>
2017-06-27 17:29:25 +08:00
Xiong Zhang
295a0d0b55 drm/i915/gvt: Set initial PORT_CLK_SEL vreg for BDW
On BDW, when host physical screen and guest virtual screen aren't on
the same DDI port, guest i915 driver prints the following error and
stop running.
[    6.775873] BUG: unable to handle kernel NULL pointer dereference
at 0000000000000068
[    6.775928] IP: intel_ddi_clock_get+0x81/0x430 [i915]
[    6.776206] Call Trace:
[    6.776233]  ? vgpu_read32+0x4f/0x100 [i915]
[    6.776264]  intel_ddi_get_config+0x11c/0x230 [i915]
[    6.776298]  intel_modeset_setup_hw_state+0x313/0xd40 [i915]
[    6.776334]  intel_modeset_init+0xe49/0x18d0 [i915]
[    6.776368]  ? vgpu_write32+0x53/0x100 [i915]
[    6.776731]  ? intel_i2c_reset+0x42/0x50 [i915]
[    6.777085]  ? intel_setup_gmbus+0x32a/0x350 [i915]
[    6.777427]  i915_driver_load+0xabc/0x14d0 [i915]
[    6.777768]  i915_pci_probe+0x4f/0x70 [i915]

The null pointer is guest intel_crtc_state->shared_dpll which is
setted in haswell_get_ddi_pll(). When guest and host screen are
on different DDI port, host driver won't set PORT_CLK_SET(guest_port),
so haswell_get_ddi_pll() will return null and don't set
pipe_config->shared_dpll, once the following program refernce this
structure, it will print the above error.

This patch set the initial val of guest PORT_CLK_SEL(guest_port) to
LCPLL_810. And guest i915 driver will reset this value according to
guest screen mode.

Signed-off-by: Xiong Zhang <xiong.y.zhang@intel.com>
Signed-off-by: Zhenyu Wang <zhenyuw@linux.intel.com>
2017-06-27 17:29:19 +08:00
Chuanxiao Dong
f16bd3dda2 drm/i915/gvt: Fix inconsistent locks holding sequence
There are two kinds of locking sequence.

One is in the thread which is started by vfio ioctl to do
the iommu unmapping. The locking sequence is:
	down_read(&group_lock) ----> mutex_lock(&cached_lock)

The other is in the vfio release thread which will unpin all
the cached pages. The lock sequence is:
	mutex_lock(&cached_lock) ---> down_read(&group_lock)

And, the cache_lock is used to protect the rb tree of the cache
node and doing vfio unpin doesn't require this lock. Move the
vfio unpin out of the cache_lock protected region.

v2:
- use for style instead of do{}while(1). (Zhenyu)

Fixes: f30437c5e7 ("drm/i915/gvt: add KVMGT support")
Signed-off-by: Chuanxiao Dong <chuanxiao.dong@intel.com>
Cc: Zhenyu Wang <zhenyuw@linux.intel.com>
Cc: stable@vger.kernel.org # v4.10+
Signed-off-by: Zhenyu Wang <zhenyuw@linux.intel.com>
2017-06-26 16:32:20 +08:00
Chuanxiao Dong
62d02fd1f8 drm/i915/gvt: Fix possible recursive locking issue
vfio_unpin_pages will hold a read semaphore however it is already hold
in the same thread by vfio ioctl. It will cause below warning:

[ 5102.127454] ============================================
[ 5102.133379] WARNING: possible recursive locking detected
[ 5102.139304] 4.12.0-rc4+ #3 Not tainted
[ 5102.143483] --------------------------------------------
[ 5102.149407] qemu-system-x86/1620 is trying to acquire lock:
[ 5102.155624]  (&container->group_lock){++++++}, at: [<ffffffff817768c6>] vfio_unpin_pages+0x96/0xf0
[ 5102.165626]
but task is already holding lock:
[ 5102.172134]  (&container->group_lock){++++++}, at: [<ffffffff8177728f>] vfio_fops_unl_ioctl+0x5f/0x280
[ 5102.182522]
other info that might help us debug this:
[ 5102.189806]  Possible unsafe locking scenario:

[ 5102.196411]        CPU0
[ 5102.199136]        ----
[ 5102.201861]   lock(&container->group_lock);
[ 5102.206527]   lock(&container->group_lock);
[ 5102.211191]
*** DEADLOCK ***

[ 5102.217796]  May be due to missing lock nesting notation

[ 5102.225370] 3 locks held by qemu-system-x86/1620:
[ 5102.230618]  #0:  (&container->group_lock){++++++}, at: [<ffffffff8177728f>] vfio_fops_unl_ioctl+0x5f/0x280
[ 5102.241482]  #1:  (&(&iommu->notifier)->rwsem){++++..}, at: [<ffffffff810de775>] __blocking_notifier_call_chain+0x35/0x70
[ 5102.253713]  #2:  (&vgpu->vdev.cache_lock){+.+...}, at: [<ffffffff8157b007>] intel_vgpu_iommu_notifier+0x77/0x120
[ 5102.265163]
stack backtrace:
[ 5102.270022] CPU: 5 PID: 1620 Comm: qemu-system-x86 Not tainted 4.12.0-rc4+ #3
[ 5102.277991] Hardware name: Intel Corporation S1200RP/S1200RP, BIOS S1200RP.86B.03.01.APER.061220151418 06/12/2015
[ 5102.289445] Call Trace:
[ 5102.292175]  dump_stack+0x85/0xc7
[ 5102.295871]  validate_chain.isra.21+0x9da/0xaf0
[ 5102.300925]  __lock_acquire+0x405/0x820
[ 5102.305202]  lock_acquire+0xc7/0x220
[ 5102.309191]  ? vfio_unpin_pages+0x96/0xf0
[ 5102.313666]  down_read+0x2b/0x50
[ 5102.317259]  ? vfio_unpin_pages+0x96/0xf0
[ 5102.321732]  vfio_unpin_pages+0x96/0xf0
[ 5102.326024]  intel_vgpu_iommu_notifier+0xe5/0x120
[ 5102.331283]  notifier_call_chain+0x4a/0x70
[ 5102.335851]  __blocking_notifier_call_chain+0x4d/0x70
[ 5102.341490]  blocking_notifier_call_chain+0x16/0x20
[ 5102.346935]  vfio_iommu_type1_ioctl+0x87b/0x920
[ 5102.351994]  vfio_fops_unl_ioctl+0x81/0x280
[ 5102.356660]  ? __fget+0xf0/0x210
[ 5102.360261]  do_vfs_ioctl+0x93/0x6a0
[ 5102.364247]  ? __fget+0x111/0x210
[ 5102.367942]  SyS_ioctl+0x41/0x70
[ 5102.371542]  entry_SYSCALL_64_fastpath+0x1f/0xbe

put the vfio_unpin_pages in a workqueue can fix this.

v2:
- use for style instead of do{}while(1). (Zhenyu)
v3:
- rename gvt_cache_mark to gvt_cache_mark_remove. (Zhenyu)

Fixes: 659643f7d8 ("drm/i915/gvt/kvmgt: add vfio/mdev support to KVMGT")
Signed-off-by: Chuanxiao Dong <chuanxiao.dong@intel.com>
Cc: Zhenyu Wang <zhenyuw@linux.intel.com>
Cc: stable@vger.kernel.org # v4.10+
Signed-off-by: Zhenyu Wang <zhenyuw@linux.intel.com>
2017-06-26 16:31:49 +08:00
Dave Airlie
305b9eddee Merge tag 'drm-intel-next-2017-06-19' of git://anongit.freedesktop.org/git/drm-intel into drm-next
Final pile of features for 4.13

New uabi:
- batch bo in first slot, for faster execbuf assembly in userspace
  (Chris Wilson)
- (sub)slice getparam, needed for mesa perf support (Robert Bragg)

First pile of patches for cnl/cfl support, maintained by Rodrigo but
with lots of contributions from others. Still incomplete since public
review still ongoing.

Features/refactoring:
- Make execbuf faster (Chris Wilson), a pile of series to make execbuf
  buffer handling have fewer passes, use less list walking, postpone
  more work to async workers and shuffle buffers less, all to make the
  common case much faster (in some cases at least).
- cold boot support for glk dsi (Madhav Chauhan)
- Clean up pipe A quirk and related old platform hacks (Ville)
- perf sampling support for kbl/glk (Lionel)
- perf cleanups (Robert Bragg)
- wire atomic state to backlight code, to avoid pipe lookup hacks
  (Maarten)
- reduce request waiting latency/overhead to remove the spinning and
  associated cpu cycle wasting (Chris)
- fix 90/270 rotation wm computation (Ville)
- new ddb allocation algo for skl (Kumar Mahesh)
- fix regression due to system suspend optimiazatino (Imre)
- the usual pile of small cleanups and refactors all over

GVT updates contained in this tag:
- optimization for per-VM mmio save/restore (Changbin)
- optimization for mmio hash table (Changbin)
- scheduler optimization with event (Ping)
- vGPU reset refinement (Fred)
- other misc refactor and cleanups, etc.

* tag 'drm-intel-next-2017-06-19' of git://anongit.freedesktop.org/git/drm-intel: (170 commits)
  drm/i915: Update DRIVER_DATE to 20170619
  drm/i915/cfl: Introduce Coffee Lake workarounds.
  drm/i915: Store 9 bits of PCI Device ID for platforms with a LP PCH
  drm/i915: Stash a pointer to the obj's resv in the vma
  drm/i915: Async GPU relocation processing
  drm/i915: Allow execbuffer to use the first object as the batch
  drm/i915: Wait upon userptr get-user-pages within execbuffer
  drm/i915: First try the previous execbuffer location
  drm/i915: Store a persistent reference for an object in the execbuffer cache
  drm/i915: Eliminate lots of iterations over the execobjects array
  drm/i915: Disable EXEC_OBJECT_ASYNC when doing relocations
  drm/i915: Pass vma to relocate entry
  drm/i915: Store a direct lookup from object handle to vma
  drm/i915: Fix retrieval of hangcheck stats
  drm/i915: Store i915_gem_object_is_coherent() as a bit next to cache-dirty
  drm/i915: Mark CPU cache as dirty on every transition for CPU writes
  drm/i915: Make i915_vma_destroy() static
  drm/i915: Actually attach the tv_format property to the SDVO connector
  Revert "drm/i915/skl: New ddb allocation algorithm"
  drm/i915/glk: Add cold boot sequence for GLK DSI
  ...
2017-06-21 08:55:22 +10:00
Chris Wilson
5f09a9c8ab drm/i915: Allow contexts to be unreferenced locklessly
If we move the actual cleanup of the context to a worker, we can allow
the final free to be called from any context and avoid undue latency in
the caller.

v2: Negotiate handling the delayed contexts free by flushing the
workqueue before calling i915_gem_context_fini() and performing the final
free of the kernel context directly
v3: Flush deferred frees before new context allocations

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/20170620110547.15947-2-chris@chris-wilson.co.uk
2017-06-20 17:13:47 +01:00
Jani Nikula
7dfb9ba33f Merge tag 'gvt-next-2017-06-08' of https://github.com/01org/gvt-linux into drm-intel-next-queued
gvt-next-2017-06-08

First gvt-next pull for 4.13:
- optimization for per-VM mmio save/restore (Changbin)
- optimization for mmio hash table (Changbin)
- scheduler optimization with event (Ping)
- vGPU reset refinement (Fred)
- other misc refactor and cleanups, etc.

Signed-off-by: Jani Nikula <jani.nikula@intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/20170608093547.bjgs436e3iokrzdm@zhen-hp.sh.intel.com
2017-06-16 10:03:01 +03:00
Dave Airlie
925344ccc9 Linux 4.12-rc5
-----BEGIN PGP SIGNATURE-----
 
 iQEcBAABAgAGBQJZPdbLAAoJEHm+PkMAQRiGx4wH/1nCjfnl6fE8oJ24/1gEAOUh
 biFdqJkYZmlLYHVtYfLm4Ueg4adJdg0wx6qM/4RaAzmQVvLfDV34bc1qBf1+P95G
 kVF+osWyXrZo5cTwkwapHW/KNu4VJwAx2D1wrlxKDVG5AOrULH1pYOYGOpApEkZU
 4N+q5+M0ce0GJpqtUZX+UnI33ygjdDbBxXoFKsr24B7eA0ouGbAJ7dC88WcaETL+
 2/7tT01SvDMo0jBSV0WIqlgXwZ5gp3yPGnklC3F4159Yze6VFrzHMKS/UpPF8o8E
 W9EbuzwxsKyXUifX2GY348L1f+47glen/1sedbuKnFhP6E9aqUQQJXvEO7ueQl4=
 =m2Gx
 -----END PGP SIGNATURE-----

BackMerge tag 'v4.12-rc5' into drm-next

Linux 4.12-rc5 for nouveau fixes
2017-06-16 13:58:27 +10:00
fred gao
615c16a9d8 drm/i915/gvt: Refine virtual reset function
during the emulation of virtual reset:
1. only reset the engine related mmio ending with MMIO
   offset Master_IRQ, not include display stuff.

2. fences are not required to set default
   value as well to prevent screen flicking.

this will fix the issue of Guest screen hang while running
Force tdr in Linux guest.

v2:
- only reset the engine related mmio. (Zhenyu & Zhiyuan)
v3:
- IMR/Ring mode registers are not save/restored. (Changbin)
v4:
- redefine the MMIO reset offset for easy understanding. (Zhenyu)
- pvinfo can be reset. (Zhenyu)
v5:
- add more comments for mmio reset. (Zhenyu)

Cc: Changbin Du <changbin.du@intel.com>
Cc: Zhenyu Wang <zhenyuw@linux.intel.com>
Cc: Lv zhiyuan <zhiyuan.lv@intel.com>
Cc: Zhang Yulei <yulei.zhang@intel.com>
Signed-off-by: fred gao <fred.gao@intel.com>
Signed-off-by: Zhenyu Wang <zhenyuw@linux.intel.com>
2017-06-08 13:59:21 +08:00
fred gao
0811fa6630 drm/i915/gvt: Fix GDRST vreg state after reset
Emulating the GDRST read behavior correctly to ack the
guest reset request.

v2:
- split the original patch into two:
  GDRST read handler and virtual gpu reset. (Zhenyu)
v3:
- emulate the GDRST read right after write. (Zhenyu)

Cc: Zhenyu Wang <zhenyuw@linux.intel.com>
Cc: Zhang Yulei <yulei.zhang@intel.com>
Signed-off-by: fred gao <fred.gao@intel.com>
Signed-off-by: Zhenyu Wang <zhenyuw@linux.intel.com>
2017-06-08 13:59:21 +08:00
Changbin Du
178cd160c6 drm/i915/gvt: Tuning the size of MMIO hash lookup table to 2048
On Skylake platform, The traced virtual mmio registers are up to 2039.
So tuning the hash table size to improve lookup performance.

Signed-off-by: Changbin Du <changbin.du@intel.com>
Signed-off-by: Zhenyu Wang <zhenyuw@linux.intel.com>
2017-06-08 13:59:21 +08:00
Changbin Du
fbfd76c374 drm/i915/gvt: Add helper for tuning MMIO hash table
We count all the tracked virtual MMIO registers, which can help us to
tune the MMIO hash table.

v2: Move num_tracked_mmio into gvt structure.

Signed-off-by: Changbin Du <changbin.du@intel.com>
Signed-off-by: Zhenyu Wang <zhenyuw@linux.intel.com>
2017-06-08 13:59:20 +08:00
Changbin Du
5c6d4c676d drm/i915/gvt: Make the MMIO attribute wrappers be inline
Function calls are expensive. I have see obvious overhead call to
these wrappers in perf data, especially from the cmd parser side.
So make these simple wrappers be inline to kill them all.

Signed-off-by: Changbin Du <changbin.du@intel.com>
Signed-off-by: Zhenyu Wang <zhenyuw@linux.intel.com>
2017-06-08 13:59:20 +08:00
Changbin Du
56a78de549 drm/i915/gvt: Make mmio_attribute as type u8 to save 1.5MB memory
Type u8 is big enough to contain all MMIO attribute flags. As the
total MMIO size is 2MB so we saved 1.5MB memory.

Signed-off-by: Changbin Du <changbin.du@intel.com>
Signed-off-by: Zhenyu Wang <zhenyuw@linux.intel.com>
2017-06-08 13:59:20 +08:00
Changbin Du
d8d94ba3fc drm/i915/gvt: Cleanup struct intel_gvt_mmio_info
The size, length, addr_mask fields actually are not necessary. Every
tracked mmio has DWORD size, and addr_mask is a legacy field.

Signed-off-by: Changbin Du <changbin.du@intel.com>
Signed-off-by: Zhenyu Wang <zhenyuw@linux.intel.com>
2017-06-08 13:59:19 +08:00
Changbin Du
65f9f6febf drm/i915/gvt: Optimize MMIO register handling for some large MMIO blocks
Some of traced MMIO registers are a large continuous section. These
stuffed the MMIO lookup hash table and so waste lots of memory and
get much lower lookup performance.

Here we picked out these sections by special handling. These sections
include:
  o Display pipe registers, total 768.
  o The PVINFO page, total 1024.
  o MCHBAR_MIRROR, total 65536.
  o CSR_MMIO, total 3072.

So we removed 70,400 items from the hash table, and speed up guest
boot time by ~500ms.

v2:
  o add a local function find_mmio_block().
  o fix comments.

Signed-off-by: Changbin Du <changbin.du@intel.com>
Signed-off-by: Zhenyu Wang <zhenyuw@linux.intel.com>
2017-06-08 13:59:19 +08:00
Chuanxiao Dong
af2c6399aa drm/i915/gvt: add gtt_invalidate API to flush the GTT TLB
add gtt_invalidate API to handle the GTT TLB flush instead of
hiding in write_pte64 function. This can avoid overkill when using
write_pte64

Suggested-by: Zhenyu Wang <zhenyuw@linux.intel.com>
Signed-off-by: Chuanxiao Dong <chuanxiao.dong@intel.com>
Cc: Zhenyu Wang <zhenyuw@linux.intel.com>
Signed-off-by: Zhenyu Wang <zhenyuw@linux.intel.com>
2017-06-08 13:59:18 +08:00