Commit Graph

807 Commits

Author SHA1 Message Date
Daniele Ceraolo Spurio
c447ff7db3 drm/i915: update with_intel_runtime_pm to use the rpm structure
Matching the underlying get/put functions.

Signed-off-by: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com>
Cc: Imre Deak <imre.deak@intel.com>
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
Acked-by: Imre Deak <imre.deak@intel.com>
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Link: https://patchwork.freedesktop.org/patch/msgid/20190613232156.34940-8-daniele.ceraolospurio@intel.com
2019-06-14 15:58:33 +01:00
Daniele Ceraolo Spurio
d858d5695f drm/i915: update rpm_get/put to use the rpm structure
The functions where internally already only using the structure, so we
need to just flip the interface.

v2: rebase

Signed-off-by: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com>
Cc: Imre Deak <imre.deak@intel.com>
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
Acked-by: Imre Deak <imre.deak@intel.com>
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Link: https://patchwork.freedesktop.org/patch/msgid/20190613232156.34940-7-daniele.ceraolospurio@intel.com
2019-06-14 15:58:33 +01:00
Chris Wilson
0cf289bd5d drm/i915: Move fence register tracking from i915->mm to ggtt
As the fence registers only apply to regions inside the GGTT is makes
more sense that we track these as part of the i915_ggtt and not the
general mm. In the next patch, we will then pull the register locking
underneath the i915_ggtt.mutex.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Mika Kuoppala <mika.kuoppala@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20190613073254.24048-1-chris@chris-wilson.co.uk
2019-06-13 09:37:39 +01:00
Tvrtko Ursulin
09a32cb7b4 drm/i915: Make GuC GGTT reservation work on ggtt
These functions operate on ggtt so make them take that directly as
parameter.

At the same time move the USES_GUC conditional down to
intel_guc_reserve_ggtt_top for symmetry with
intel_guc_reserved_gtt_size.

v2:
 * Rename and move functions to be static in i915_gem_gtt.c (Michal)

v3:
 * Add comment explaining reason for reservation, add assert and fix
   error message. (Michal)

v4:
 * Fix checkpatch error.

Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Cc: Michal Wajdeczko <michal.wajdeczko@intel.com>
Reviewed-by: Michal Wajdeczko <michal.wajdeczko@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20190611122350.15060-1-tvrtko.ursulin@linux.intel.com
2019-06-11 14:40:15 +01:00
Tvrtko Ursulin
9937e16b28 drm/i915/guc: Move intel_guc_reserved_gtt_size to intel_wopcm_guc_size
Reduces pointer chasing and gets more to the point.

v2:
 * Tidy whitespace.
 * Tidy comment. (Michal)

Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Suggested-by: Michal Wajdeczko <michal.wajdeczko@intel.com>
Cc: Michal Wajdeczko <michal.wajdeczko@intel.com>
Reviewed-by: Michal Wajdeczko <michal.wajdeczko@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20190611110044.7742-1-tvrtko.ursulin@linux.intel.com
2019-06-11 14:40:14 +01:00
Chris Wilson
ab53497b57 drm/i915: Rename i915_hw_ppgtt to i915_ppgtt
Keeping the _hw_ in there does not help to distinguish it from its
only brethren i915_ggtt, so drop it.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Matthew Auld <matthew.auld@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20190611091238.15808-2-chris@chris-wilson.co.uk
2019-06-11 11:44:32 +01:00
Chris Wilson
e568ac3874 drm/i915: Pull kref into i915_address_space
Make the kref common to both derived structs (i915_ggtt and i915_ppgtt)
so that we can safely reference count an abstract ctx->vm address space.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Matthew Auld <matthew.auld@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20190611091238.15808-1-chris@chris-wilson.co.uk
2019-06-11 11:44:24 +01:00
Tvrtko Ursulin
6a8cc66ffe drm/i915: Move i915_check_and_clear_faults to intel_reset.c
The code is logically about reset so it makes sense.

It also enables making i915_clear_error_registers static.

Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Suggested-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
Link: https://patchwork.freedesktop.org/patch/msgid/20190607115932.20271-1-tvrtko.ursulin@linux.intel.com
2019-06-10 09:09:26 +01:00
Tvrtko Ursulin
dbc6518363 drm/i915: Convert some more bits to use engine mmio accessors
Remove a couple dev_priv locals as a consequence.

Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
Link: https://patchwork.freedesktop.org/patch/msgid/20190607084521.16845-1-tvrtko.ursulin@linux.intel.com
2019-06-07 12:47:49 +01:00
Tvrtko Ursulin
bcc726bea2 drm/i915: Unexport i915_gem_init/fini_aliasing_ppgtt
These two are only used from within i915_gem_gtt.c and can trivially be
made static.

Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Reviewed-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20190607082557.31670-5-tvrtko.ursulin@linux.intel.com
2019-06-07 12:47:41 +01:00
Tvrtko Ursulin
77a302e043 drm/i915: Make Gen6/7 RING_FAULT_REG access engine centric
Similar to earlier conversions, eliminate the implicit dev_priv by
introducing some helpers which take the engine parameter (since the
register itself is per engine).

v2:
 * Always use parentheses in macro arguments.

Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
Link: https://patchwork.freedesktop.org/patch/msgid/20190607101535.767-1-tvrtko.ursulin@linux.intel.com
2019-06-07 12:47:39 +01:00
Tvrtko Ursulin
b61ea001b2 drm/i915: Reset only affected engines when handling error capture
Pass down the engine mask to i915_clear_error_registers so only affected
engines can be reset on the Gen6/7 path.

Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
Link: https://patchwork.freedesktop.org/patch/msgid/20190607082557.31670-1-tvrtko.ursulin@linux.intel.com
2019-06-07 12:47:37 +01:00
Chris Wilson
155ab8836c drm/i915: Move object close under its own lock
Use i915_gem_object_lock() to guard the LUT and active reference to
allow us to break free of struct_mutex for handling GEM_CLOSE.

Testcase: igt/gem_close_race
Testcase: igt/gem_exec_parallel
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Matthew Auld <matthew.auld@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20190606112320.9704-1-chris@chris-wilson.co.uk
2019-06-06 12:51:13 +01:00
Chris Wilson
1d1b5490b9 drm/i915/gtt: Replace struct_mutex serialisation for allocation
Instead of relying on the caller holding struct_mutex across the
allocation, push the allocation under a tree of spinlocks stored inside
the page tables. Not only should this allow us to avoid struct_mutex
here, but it will allow multiple users to lock independent ranges for
concurrent allocations, and operate independently. This is vital for
pushing the GTT manipulation into a background thread where dependency
on struct_mutex is verboten, and for allowing other callers to avoid
struct_mutex altogether.

v2: Restore lost GEM_BUG_ON for removing too many PTE from
gen6_ppgtt_clear_range.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Matthew Auld <matthew.auld@intel.com>
Cc: Mika Kuoppala <mika.kuoppala@intel.com>
Reviewed-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Acked-by: Mika Kuoppala <mika.kuoppala@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20190604153830.19096-1-chris@chris-wilson.co.uk
2019-06-04 20:51:46 +01:00
Chris Wilson
59ec84eca5 drm/i915: Use unchecked uncore writes to flush the GTT
As the GTT is outside of the powerwell, we can simplify flushing the
GGTT writes by using an unchecked mmio write and post.

v2: s/unc/uncore/

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20190604120022.20472-3-chris@chris-wilson.co.uk
2019-06-04 14:51:25 +01:00
Matthew Auld
0a4a6e74e7 drm/i915/gtt: grab wakeref in gen6_alloc_va_range
Some steps in gen6_alloc_va_range require the HW to be awake, so ideally
we should be grabbing the wakeref ourselves and not relying on the
caller already holding it for us.

Suggested-by: Chris Wilson <chris@chris-wilson.co.uk>
Signed-off-by: Matthew Auld <matthew.auld@intel.com>
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Link: https://patchwork.freedesktop.org/patch/msgid/20190529123108.24422-1-matthew.auld@intel.com
2019-05-30 12:00:41 +01:00
Chris Wilson
7f5f228008 drm/i915/gtt: Avoid overflowing the WC stash
An interesting issue cropped with making the pagetables be allocated and
freed concurrently (i.e. removing their grandeous struct_mutex guard)
was that we would overflow the page stash. This happens when we have
multiple allocators grabbing WC pages such that we fill the vm's local
page stash and then when we free another page, the page stash is already
full and we overflow.

The fix is quite simple: to check for a full page stash before adding
another. This results in us keeping a vm local page stash around for
much longer, which is both a blessing and a curse.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Matthew Auld <matthew.auld@intel.com>
Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Reviewed-by: Matthew Auld <matthew.auld@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20190529093407.31697-1-chris@chris-wilson.co.uk
2019-05-29 16:42:38 +01:00
Chris Wilson
6951e5893b drm/i915: Move GEM object domain management from struct_mutex to local
Use the per-object local lock to control the cache domain of the
individual GEM objects, not struct_mutex. This is a huge leap forward
for us in terms of object-level synchronisation; execbuffers are
coordinated using the ww_mutex and pread/pwrite is finally fully
serialised again.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Matthew Auld <matthew.auld@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20190528092956.14910-10-chris@chris-wilson.co.uk
2019-05-28 12:45:29 +01:00
Chris Wilson
37d63f8fdb drm/i915: Pull scatterlist utils out of i915_gem.h
Out scatterlist utility routines can be pulled out of i915_gem.h for a
bit more decluttering.

v2: Push I915_GTT_PAGE_SIZE out of i915_scatterlist itself and into the
caller.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Matthew Auld <matthew.auld@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20190528092956.14910-9-chris@chris-wilson.co.uk
2019-05-28 12:45:29 +01:00
Colin Ian King
c2df2201b6 drm/i915/gtt: set err to -ENOMEM on memory allocation failure
Currently when the allocation of ppgtt->work fails the error return
path via err_free returns an uninitialized value in err. Fix this
by setting err to the appropriate error return of -ENOMEM.

Addresses-Coverity: ("Uninitialized scalar variable")
Fixes: d3622099c7 ("drm/i915/gtt: Always acquire struct_mutex for gen6_ppgtt_cleanup")
Signed-off-by: Colin Ian King <colin.king@canonical.com>
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Link: https://patchwork.freedesktop.org/patch/msgid/20190524212627.24256-1-colin.king@canonical.com
2019-05-27 10:27:34 +01:00
Chris Wilson
63e8dcdb4f drm/i915/gtt: Neuter the deferred unbind callback from gen6_ppgtt_cleanup
Having deferred the vma destruction to a worker where we can acquire the
struct_mutex, we have to avoid chasing back into the now destroyed
ppgtt. The pd_vma is special in having a custom unbind function to scan
for unused pages despite the VMA itself being notionally part of the
GGTT. As such, we need to disable that callback to avoid a
use-after-free.

This unfortunately blew up so early during boot that CI declared the
machine unreachable as opposed to being the major failure it was. Oops.

Fixes: d3622099c7 ("drm/i915/gtt: Always acquire struct_mutex for gen6_ppgtt_cleanup")
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Cc: Tomi Sarvela <tomi.p.sarvela@intel.com>
Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20190524064529.20514-1-chris@chris-wilson.co.uk
2019-05-24 10:02:01 +01:00
Chris Wilson
d3622099c7 drm/i915/gtt: Always acquire struct_mutex for gen6_ppgtt_cleanup
We rearranged the vm_destroy_ioctl to avoid taking struct_mutex, little
realising that buried underneath the gen6 ppgtt release path was a
struct_mutex requirement (to remove its GGTT vma). Until that
struct_mutex is vanquished, take a detour in gen6_ppgtt_cleanup to do
the i915_vma_destroy from inside a worker under the struct_mutex.

<4> [257.740160] WARN_ON(debug_locks && !lock_is_held(&(&vma->vm->i915->drm.struct_mutex)->dep_map))
<4> [257.740213] WARNING: CPU: 3 PID: 1507 at drivers/gpu/drm/i915/i915_vma.c:841 i915_vma_destroy+0x1ae/0x3a0 [i915]
<4> [257.740214] Modules linked in: snd_hda_codec_hdmi i915 x86_pkg_temp_thermal mei_hdcp coretemp crct10dif_pclmul crc32_pclmul ghash_clmulni_intel snd_hda_codec_realtek snd_hda_codec_generic snd_hda_intel snd_hda_codec snd_hwdep snd_hda_core r8169 realtek snd_pcm mei_me mei prime_numbers lpc_ich
<4> [257.740224] CPU: 3 PID: 1507 Comm: gem_vm_create Tainted: G     U            5.2.0-rc1-CI-CI_DRM_6118+ #1
<4> [257.740225] Hardware name: MSI MS-7924/Z97M-G43(MS-7924), BIOS V1.12 02/15/2016
<4> [257.740249] RIP: 0010:i915_vma_destroy+0x1ae/0x3a0 [i915]
<4> [257.740250] Code: 00 00 00 48 81 c7 c8 00 00 00 e8 ed 08 f0 e0 85 c0 0f 85 78 fe ff ff 48 c7 c6 e8 ec 30 a0 48 c7 c7 da 55 33 a0 e8 42 8c e9 e0 <0f> 0b 8b 83 40 01 00 00 85 c0 0f 84 63 fe ff ff 48 c7 c1 c1 58 33
<4> [257.740251] RSP: 0018:ffffc90000aafc68 EFLAGS: 00010282
<4> [257.740252] RAX: 0000000000000000 RBX: ffff8883f7957840 RCX: 0000000000000003
<4> [257.740253] RDX: 0000000000000046 RSI: 0000000000000006 RDI: ffffffff8212d1b9
<4> [257.740254] RBP: ffffc90000aafcc8 R08: 0000000000000000 R09: 0000000000000000
<4> [257.740255] R10: 0000000000000000 R11: 0000000000000000 R12: ffff8883f4d5c2a8
<4> [257.740256] R13: ffff8883f4d5d680 R14: ffff8883f4d5c668 R15: ffff8883f4d5c2f0
<4> [257.740257] FS:  00007f777fa8fe40(0000) GS:ffff88840f780000(0000) knlGS:0000000000000000
<4> [257.740258] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
<4> [257.740259] CR2: 00007f777f6522b0 CR3: 00000003c612a006 CR4: 00000000001606e0
<4> [257.740260] Call Trace:
<4> [257.740283]  gen6_ppgtt_cleanup+0x25/0x60 [i915]
<4> [257.740306]  i915_ppgtt_release+0x102/0x290 [i915]
<4> [257.740330]  i915_gem_vm_destroy_ioctl+0x7c/0xa0 [i915]
<4> [257.740376]  ? i915_gem_vm_create_ioctl+0x160/0x160 [i915]
<4> [257.740379]  drm_ioctl_kernel+0x83/0xf0
<4> [257.740382]  drm_ioctl+0x2f3/0x3b0
<4> [257.740422]  ? i915_gem_vm_create_ioctl+0x160/0x160 [i915]
<4> [257.740426]  ? _raw_spin_unlock_irqrestore+0x39/0x60
<4> [257.740430]  do_vfs_ioctl+0xa0/0x6e0
<4> [257.740433]  ? lock_acquire+0xa6/0x1c0
<4> [257.740436]  ? __task_pid_nr_ns+0xb9/0x1f0
<4> [257.740439]  ksys_ioctl+0x35/0x60
<4> [257.740441]  __x64_sys_ioctl+0x11/0x20
<4> [257.740443]  do_syscall_64+0x55/0x1c0
<4> [257.740445]  entry_SYSCALL_64_after_hwframe+0x49/0xbe

References: e0695db729 ("drm/i915: Create/destroy VM (ppGTT) for use with contexts")
Fixes: 7f3f317a66 ("drm/i915: Restore control over ppgtt for context creation ABI")
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20190523064933.23604-1-chris@chris-wilson.co.uk
2019-05-23 21:12:12 +01:00
Ville Syrjälä
1a74fc0b3f drm/i915: Add a new "remapped" gtt_view
To overcome display engine stride limits we'll want to remap the
pages in the GTT. To that end we need a new gtt_view type which
is just like the "rotated" type except not rotated.

v2: Use intel_remapped_plane_info base type
    s/unused/unused_mbz/ (Chris)
    Separate BUILD_BUG_ON()s (Chris)
    Use I915_GTT_PAGE_SIZE (Chris)
v3: Use i915_gem_object_get_dma_address() (Chris)
    Trim the sg (Tvrtko)
v4: Actually trim this time. Limit the max length
    to one row of pages to keep things simple

Cc: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Tvrtko Ursulin <tvrtko.ursulin@linux.intel.com>
Reviewed-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20190509122159.24376-2-ville.syrjala@linux.intel.com
Reviewed-by: Maarten Lankhorst <maarten.lankhorst@linux.intel.com>
2019-05-20 18:04:47 +03:00
Chris Wilson
112ed2d31a drm/i915: Move GraphicsTechnology files under gt/
Start partitioning off the code that talks to the hardware (GT) from the
uapi layers and move the device facing code under gt/

One casualty is s/intel_ringbuffer.h/intel_engine.h/ with the plan to
subdivide that header and body further (and split out the submission
code from the ringbuffer and logical context handling). This patch aims
to be simple motion so git can fixup inflight patches with little mess.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Acked-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Acked-by: Jani Nikula <jani.nikula@intel.com>
Acked-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20190424174839.7141-1-chris@chris-wilson.co.uk
2019-04-24 21:01:46 +01:00
Fernando Pacheco
911800765e drm/i915/uc: Reserve upper range of GGTT
GuC and HuC depend on struct_mutex for device reinitialization. Moving
away from this dependency requires perma-pinning the firmware images in
GGTT.  The upper portion of the GuC address space has a sizeable hole
(several MB) that is inaccessible by GuC. Reserve this range within GGTT
as it can comfortably hold GuC/HuC firmware images.

v2: Reserve node rather than insert (Chris)
    Simpler determination of node start/size (Daniele)
    Move reserve/release out to intel_guc.* files

v3: Reserve starting at GUC_GGTT_TOP only and bail if this
    fails (Chris)

Signed-off-by: Fernando Pacheco <fernando.pacheco@intel.com>
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Link: https://patchwork.freedesktop.org/patch/msgid/20190419230015.18121-3-fernando.pacheco@intel.com
2019-04-20 08:19:12 +01:00
Chris Wilson
267e80ee6a drm/i915/gtt: Skip clearing the GGTT under gen6+ full-ppgtt
If we know that the user cannot access the GGTT, by virtue of having a
segregated memory area, we can skip clearing the unused entries as they
cannot be accessed.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Cc: Mika Kuoppala <mika.kuoppala@linux.intel.com>
Cc: Matthew Auld <matthew.william.auld@gmail.com>
Reviewed-by: Matthew Auld <matthew.william.auld@gmail.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20190419201207.5477-1-chris@chris-wilson.co.uk
2019-04-19 21:29:29 +01:00
Mika Kuoppala
3936867dbc drm/i915: Disable read only ppgtt support for gen11
On gen11 writing to read only ppgtt page causes a gpu hang.
This behaviour is different than with previous gen where
read only ppgtt access is supported. On those, the write
is just dropped without visible side effects.

Disable ro ppgtt support on gen11 until a solution can
be found to bring it into line with its predecessors.

References: HSDES#1807136187
References: https://bugzilla.freedesktop.org/show_bug.cgi?id=108569
Cc: Chris Wilson <chris@chris-wilson.co.uk>
Signed-off-by: Mika Kuoppala <mika.kuoppala@linux.intel.com>
Acked-by: Chris Wilson <chris@chris-wilson.co.uk>
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Link: https://patchwork.freedesktop.org/patch/msgid/20190411083034.28311-1-mika.kuoppala@linux.intel.com
2019-04-11 20:48:51 +01:00
Chris Wilson
e0695db729 drm/i915: Create/destroy VM (ppGTT) for use with contexts
In preparation to making the ppGTT binding for a context explicit (to
facilitate reusing the same ppGTT between different contexts), allow the
user to create and destroy named ppGTT.

v2: Replace global barrier for swapping over the ppgtt and tlbs with a
local context barrier (Tvrtko)
v3: serialise with struct_mutex; it's lazy but required dammit
v4: Rewrite igt_ctx_shared_exec to be more different (aimed to be more
similarly, turned out different!)

v5: Fix up test unwind for aliasing-ppgtt (snb)
v6: Tighten language for uapi struct drm_i915_gem_vm_control.
v7: Patch the context image for runtime ppgtt switching!

Testcase: igt/gem_vm_create
Testcase: igt/gem_ctx_param/vm
Testcase: igt/gem_ctx_clone/vm
Testcase: igt/gem_ctx_shared
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20190322092325.5883-2-chris@chris-wilson.co.uk
2019-03-22 13:12:32 +00:00
Chris Wilson
3aa9945a52 drm/i915: Separate GEM context construction and registration to userspace
In later patches, it became apparent that userspace can see a partially
constructed GEM context and begin using it before it was ready, to much
hilarity. Close this window of opportunity by lifting the registration of
the context with userspace (the insertion of the context into the filp's
idr) to the very end of the CONTEXT_CREATE ioctl.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20190321140711.11190-1-chris@chris-wilson.co.uk
2019-03-21 15:59:25 +00:00
Chris Wilson
2ebd000abc drm/i915/gtt: Refactor common ppgtt initialisation
The basic setup of the i915_hw_ppgtt is the same between gen6 and gen8,
so refactor that into a common routine.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Bob Paauwe <bob.j.paauwe@intel.com>
Cc: Matthew Auld <matthew.william.auld@gmail.com>
Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Reviewed-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20190314223839.28258-5-chris@chris-wilson.co.uk
2019-03-15 09:04:54 +00:00
Chris Wilson
a9fe9ca44c drm/i915/gtt: Rename i915_vm_is_48b to i915_vm_is_4lvl
Large ppGTT are differentiated by the requirement to go to four levels
to address more than 32b. Given the introduction of more 4 level ppGTT
with different sizes of addressable bits, rename i915_vm_is_48b() to
better reflect the commonality of using 4 levels.

Based on a patch by Bob Paauwe.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Bob Paauwe <bob.j.paauwe@intel.com>
Cc: Matthew Auld <matthew.william.auld@gmail.com>
Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Reviewed-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20190314223839.28258-4-chris@chris-wilson.co.uk
2019-03-15 09:04:54 +00:00
Chris Wilson
cbecbccaa1 drm/i915: Record platform specific ppGTT size in intel_device_info
As the maximum addressable bits is determined by platform, record that
information in our static chipset tables. This has the advantage of
being clearly recorded in our capability dumps for dmesg, debugfs and
error states.

Based on a patch by Bob Paauwe.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Bob Paauwe <bob.j.paauwe@intel.com>
Cc: Matthew Auld <matthew.william.auld@gmail.com>
Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Reviewed-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20190314223839.28258-2-chris@chris-wilson.co.uk
2019-03-15 09:04:54 +00:00
Chris Wilson
fb251a72d6 drm/i915/gtt: Mark ALL_ENGINES as dirty on ppGTT modification
Small simplification to set all bits in the dirty mask rather than
lookup the exact mask of populated engines. The bits for the engines
that do not exist are unused and so can safely set and then ignored.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Matthew Auld <matthew.auld@intel.com>
Cc: Mika Kuoppala <mika.kuoppala@intel.com>
Reviewed-by: Mika Kuoppala <mika.kuoppala@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20190305180332.30900-2-chris@chris-wilson.co.uk
2019-03-05 18:20:05 +00:00
Chris Wilson
8a68d46436 drm/i915: Store the BIT(engine->id) as the engine's mask
In the next patch, we are introducing a broad virtual engine to encompass
multiple physical engines, losing the 1:1 nature of BIT(engine->id). To
reflect the broader set of engines implied by the virtual instance, lets
store the full bitmask.

v2: Use intel_engine_mask_t (s/ring_mask/engine_mask/)
v3: Tvrtko voted for moah churn so teach everyone to not mention ring
and use $class$instance throughout.
v4: Comment upon the disparity in bspec for using VCS1,VCS2 in gen8 and
VCS[0-4] in later gen. We opt to keep the code consistent and use
0-index naming throughout.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20190305180332.30900-1-chris@chris-wilson.co.uk
2019-03-05 18:19:50 +00:00
Chris Wilson
a2ac437bc0 drm/i915/gtt: Store scratch page size alongside not in the common struct
As the scratch page is the only one to be allocated with variable size,
rather than keep an unused slot in all i915_page_table structs, store it
alongside the vm->scratch_page.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Matthew Auld <matthew.auld@intel.com>
Cc: Mika Kuoppala <mika.kuoppala@intel.com>
Reviewed-by: Matthew Auld <matthew.auld@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20190305135430.4948-1-chris@chris-wilson.co.uk
2019-03-05 14:43:16 +00:00
Chris Wilson
4f1836453e drm/i915/gtt: Use optimised memset32/64 for clearing PTE
Replace the open-coded memset loops with the memset32/64 routines that
reduce to a single instruction or two:

add/remove: 0/0 grow/shrink: 0/2 up/down: 0/-83 (-83)
Function                                     old     new   delta
gen6_ppgtt_clear_range                       371     344     -27
gen8_ppgtt_clear_pd                          575     519     -56

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Matthew Auld <matthew.auld@intel.com>
Cc: Mika Kuoppala <mika.kuoppala@intel.com>
Reviewed-by: Matthew Auld <matthew.auld@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20190304230646.23714-1-chris@chris-wilson.co.uk
2019-03-05 08:51:30 +00:00
Chris Wilson
13f1bfd3b3 drm/i915: Make object/vma allocation caches global
As our allocations are not device specific, we can move our slab caches
to a global scope.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20190228102035.5857-2-chris@chris-wilson.co.uk
2019-02-28 11:08:02 +00:00
Chengguang Xu
772b5408e3 drm/i915: remove redundant likely/unlikely annotation
unlikely has already included in IS_ERR(), so just
remove redundant likely/unlikely annotation.

Signed-off-by: Chengguang Xu <cgxu519@gmx.com>
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Link: https://patchwork.freedesktop.org/patch/msgid/20190221020819.21832-1-cgxu519@gmx.com
2019-02-21 16:01:00 +00:00
Chris Wilson
21950ee7cc drm/i915: Pull i915_gem_active into the i915_active family
Looking forward, we need to break the struct_mutex dependency on
i915_gem_active. In the meantime, external use of i915_gem_active is
quite beguiling, little do new users suspect that it implies a barrier
as each request it tracks must be ordered wrt the previous one. As one
of many, it can be used to track activity across multiple timelines, a
shared fence, which fits our unordered request submission much better. We
need to steer external users away from the singular, exclusive fence
imposed by i915_gem_active to i915_active instead. As part of that
process, we move i915_gem_active out of i915_request.c into
i915_active.c to start separating the two concepts, and rename it to
i915_active_request (both to tie it to the concept of tracking just one
request, and to give it a longer, less appealing name).

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20190205130005.2807-5-chris@chris-wilson.co.uk
2019-02-05 17:20:11 +00:00
Chris Wilson
64d6c500a3 drm/i915: Generalise GPU activity tracking
We currently track GPU memory usage inside VMA, such that we never
release memory used by the GPU until after it has finished accessing it.
However, we may want to track other resources aside from VMA, or we may
want to split a VMA into multiple independent regions and track each
separately. For this purpose, generalise our request tracking (akin to
struct reservation_object) so that we can embed it into other objects.

v2: Tweak error handling during selftest setup.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20190205130005.2807-2-chris@chris-wilson.co.uk
2019-02-05 17:12:00 +00:00
Chris Wilson
09d7e46b97 drm/i915: Pull VM lists under the VM mutex.
A starting point to counter the pervasive struct_mutex. For the goal of
avoiding (or at least blocking under them!) global locks during user
request submission, a simple but important step is being able to manage
each clients GTT separately. For which, we want to replace using the
struct_mutex as the guard for all things GTT/VM and switch instead to a
specific mutex inside i915_address_space.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20190128102356.15037-2-chris@chris-wilson.co.uk
2019-01-28 16:24:13 +00:00
Chris Wilson
499197dc16 drm/i915: Stop tracking MRU activity on VMA
Our goal is to remove struct_mutex and replace it with fine grained
locking. One of the thorny issues is our eviction logic for reclaiming
space for an execbuffer (or GTT mmaping, among a few other examples).
While eviction itself is easy to move under a per-VM mutex, performing
the activity tracking is less agreeable. One solution is not to do any
MRU tracking and do a simple coarse evaluation during eviction of
active/inactive, with a loose temporal ordering of last
insertion/evaluation. That keeps all the locking constrained to when we
are manipulating the VM itself, neatly avoiding the tricky handling of
possible recursive locking during execbuf and elsewhere.

Note that discarding the MRU (currently implemented as a pair of lists,
to avoid scanning the active list for a NONBLOCKING search) is unlikely
to impact upon our efficiency to reclaim VM space (where we think a LRU
model is best) as our current strategy is to use random idle replacement
first before doing a search, and over time the use of softpinned 48b
per-ppGTT is growing (thereby eliminating any need to perform any eviction
searches, in theory at least) with the remaining users being found on
much older devices (gen2-gen6).

v2: Changelog and commentary rewritten to elaborate on the duality of a
single list being both an inactive and active list.
v3: Consolidate bool parameters into a single set of flags; don't
comment on the duality of a single variable being a multiplicity of
bits.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20190128102356.15037-1-chris@chris-wilson.co.uk
2019-01-28 16:24:09 +00:00
Chris Wilson
9f58892ea9 drm/i915: Pull all the reset functionality together into i915_reset.c
Currently the code to reset the GPU and our state is spread widely
across a few files. Pull the logic together into a common file.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Acked-by: Mika Kuoppala <mika.kuoppala@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20190116153304.787-1-chris@chris-wilson.co.uk
2019-01-16 22:45:31 +00:00
Chris Wilson
8cd999181f drm/i915: Prevent concurrent GGTT update and use on Braswell (again)
On Braswell, under heavy stress, if we update the GGTT while
simultaneously accessing another region inside the GTT, we are returned
the wrong values. To prevent this we stop the machine to update the GGTT
entries so that no memory traffic can occur at the same time.

This was first spotted in

commit 5bab6f60cb
Author: Chris Wilson <chris@chris-wilson.co.uk>
Date:   Fri Oct 23 18:43:32 2015 +0100

    drm/i915: Serialise updates to GGTT with access through GGTT on Braswell

but removed again in forlorn hope with

commit 4509276ee8
Author: Chris Wilson <chris@chris-wilson.co.uk>
Date:   Mon Feb 20 12:47:18 2017 +0000

    drm/i915: Remove Braswell GGTT update w/a

However, gem_concurrent_blit is once again only stable with the patch
applied and CI is detecting the odd failure in forked gem_mmap_gtt tests
(which smell like the same issue). Fwiw, a wide variety of CPU memory
barriers (around GGTT flushing, fence updates, PTE updates) and GPU
flushes/invalidates (between requests, after PTE updates) were tried as
part of the investigation to find an alternate cause, nothing comes
close to serialised GGTT updates.

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=105591
Testcase: igt/gem_concurrent_blit
Testcase: igt/gem_mmap_gtt/*forked*
References: 5bab6f60cb ("drm/i915: Serialise updates to GGTT with access through GGTT on Braswell")
References: 4509276ee8 ("drm/i915: Remove Braswell GGTT update w/a")
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20190114211729.30352-1-chris@chris-wilson.co.uk
2019-01-15 09:21:11 +00:00
Chris Wilson
305dc3f983 drm/i915: Differentiate between ggtt->mutex and ppgtt->mutex
We have two classes of VM, global GTT and per-process GTT. In order to
allow ourselves the freedom to mix both along call chains, distinguish
the two classes with regards to their mutex and lockdep maps.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20190114215956.32266-1-chris@chris-wilson.co.uk
2019-01-14 22:57:28 +00:00
Chris Wilson
d4225a535b drm/i915: Syntatic sugar for using intel_runtime_pm
Frequently, we use intel_runtime_pm_get/_put around a small block.
Formalise that usage by providing a macro to define such a block with an
automatic closure to scope the intel_runtime_pm wakeref to that block,
i.e. macro abuse smelling of python.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Jani Nikula <jani.nikula@intel.com>
Reviewed-by: Mika Kuoppala <mika.kuoppala@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20190114142129.24398-15-chris@chris-wilson.co.uk
2019-01-14 16:18:25 +00:00
Chris Wilson
538ef96b9d drm/i915/gem: Track the rpm wakerefs
Keep track of the temporary rpm wakerefs used for user access to the
device, so that we can cancel them upon release and clearly identify any
leaks.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Jani Nikula <jani.nikula@intel.com>
Reviewed-by: Mika Kuoppala <mika.kuoppala@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20190114142129.24398-10-chris@chris-wilson.co.uk
2019-01-14 16:18:13 +00:00
Chris Wilson
16e4dd0342 drm/i915: Markup paired operations on wakerefs
The majority of runtime-pm operations are bounded and scoped within a
function; these are easy to verify that the wakeref are handled
correctly. We can employ the compiler to help us, and reduce the number
of wakerefs tracked when debugging, by passing around cookies provided
by the various rpm_get functions to their rpm_put counterpart. This
makes the pairing explicit, and given the required wakeref cookie the
compiler can verify that we pass an initialised value to the rpm_put
(quite handy for double checking error paths).

For regular builds, the compiler should be able to eliminate the unused
local variables and the program growth should be minimal. Fwiw, it came
out as a net improvement as gcc was able to refactor rpm_get and
rpm_get_if_in_use together,

v2: Just s/rpm_put/rpm_put_unchecked/ everywhere, leaving the manual
mark up for smaller more targeted patches.
v3: Mention the cookie in Returns

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Jani Nikula <jani.nikula@intel.com>
Cc: Mika Kuoppala <mika.kuoppala@linux.intel.com>
Reviewed-by: Mika Kuoppala <mika.kuoppala@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20190114142129.24398-2-chris@chris-wilson.co.uk
2019-01-14 16:17:53 +00:00
Jani Nikula
2f80d7bd8d drm/i915: drop all drmP.h includes
Needs just a few additional includes here and there.

Cc: Sam Ravnborg <sam@ravnborg.org>
Cc: Daniel Vetter <daniel@ffwll.ch>
Cc: Laurent Pinchart <laurent.pinchart@ideasonboard.com>
Acked-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Acked-by: Sam Ravnborg <sam@ravnborg.org>
Signed-off-by: Jani Nikula <jani.nikula@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20190108082709.3748-1-jani.nikula@intel.com
2019-01-09 10:26:36 +02:00
Chris Wilson
d25f71a162 drm/i915: Return immediately if trylock fails for direct-reclaim
Ignore trying to shrink from i915 if we fail to acquire the struct_mutex
in the shrinker while performing direct-reclaim. The trade-off being
(much) lower latency for non-i915 clients at an increased risk of being
unable to obtain a page from direct-reclaim without hitting the
oom-notifier. The proviso being that we still keep trying to hard
obtain the lock for kswapd so that we can reap under heavy memory
pressure.

v2: Taint all mutexes taken within the shrinker with the struct_mutex
subclass as an early warning system, and drop I915_SHRINK_ACTIVE from
vmap to reduce the number of dangerous paths. We also have to drop
I915_SHRINK_ACTIVE from oom-notifier to be able to make the same claim
that ACTIVE is only used from outside context, which fits in with a
longer strategy of avoiding stalls due to scanning active during
shrinking.

The danger in using the subclass struct_mutex is that we declare
ourselves more knowledgable than lockdep and deprive ourselves of
automatic coverage. Instead, we require ourselves to mark up any mutex
taken inside the shrinker in order to detect lock-inversion, and if we
miss any we are doomed to a deadlock at the worst possible moment.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20190107115509.12523-1-chris@chris-wilson.co.uk
2019-01-08 09:28:18 +00:00