Commit Graph

829 Commits

Author SHA1 Message Date
Chris Wilson
334636c67e drm/i915: Trim i915_guc_info() stack usage
i915_guc_info() (part of debugfs output) tries to avoid holding
struct_mutex for a long period by copying onto the stack. This causes a
warning that the stack frame is massive, so stop doing that. We can even
forgo holding the struct_mutex here as that doesn't serialise the values
being read (and the lists used exist for the device lifetime).

v2: Skip printing anything if guc->execbuf_client is disabled (avoids
potential NULL dereference).

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Link: http://patchwork.freedesktop.org/patch/msgid/20161129121024.22650-2-chris@chris-wilson.co.uk
Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
2016-11-29 15:52:29 +00:00
Libin Yang
9a148a96fc drm/i915/debugfs: add dp mst info
Add the DP MST info dump in debugfs.

Signed-off-by: Dhinakaran Pandiyan <dhinakaran.pandiyan@intel.com>
Signed-off-by: Libin Yang <libin.yang@intel.com>
Reviewed-by: Lyude <cpaul@redhat.com>
Reviewed-by: Jani Nikula <jani.nikula@intel.com>
Signed-off-by: Jani Nikula <jani.nikula@intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/1480334827-112273-1-git-send-email-libin.yang@intel.com
2016-11-29 17:33:35 +02:00
Chris Wilson
312c3c47ef drm/i915/debugfs: Update pageflip information
Show the last submitted seqno to the engine, not the overall next seqno,
as this is more pertinent information when inspecting the pageflip and
whether the CS or display engine stalled.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Link: http://patchwork.freedesktop.org/patch/msgid/20161124144750.2610-2-chris@chris-wilson.co.uk
Reviewed-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
2016-11-25 07:01:26 +00:00
Joonas Lahtinen
4c266edb4c drm/i915: Rename i915_gem_timeline.next_seqno to .seqno
Rename i915_gem_timeline member 'next_seqno' into 'seqno' as
the variable is pre-increment. We've already had two bugs due
to the confusing name, second is fixed as follow-up patch.

Cc: Chris Wilson <chris@chris-wilson.co.uk>
Signed-off-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Link: http://patchwork.freedesktop.org/patch/msgid/20161124144750.2610-1-chris@chris-wilson.co.uk
2016-11-25 07:01:11 +00:00
Chris Wilson
9607ae7971 drm/i915/debugfs: Increment return value of gt.next_seqno
The i915_next_seqno read value is to be the next seqno used by the
kernel. However, in the conversion to atomics ops for gt.next_seqno, in
commit 28176ef4cf ("drm/i915: Reserve space in the global seqno during
request allocation"), this was changed from a post-increment to a
pre-increment. This increment was missed from the value reported by
debugfs, so in effect it was reporting the current seqno (last
assigned), not the next seqno.

Fixes: 28176ef4cf ("drm/i915: Reserve space in the global seqno during request allocation")
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=81209
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/20161124094752.19129-1-chris@chris-wilson.co.uk
Reviewed-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
2016-11-24 12:27:29 +00:00
Chris Wilson
30576a2c46 drm/i915/debugfs: Drop i915_hws_info
i915_hws_info() has not been kept upto date (missing new engines) and so
I consider it to be unused. HWS is included in the error state, which
would be an avenue to retrieving it if required in future (possibly via
i915_engine_info). As it is currently oopsing with an rpm testcase, just
remove it.

Fixes: 3b3f1650b1 ("drm/i915: Allocate intel_engine_cs structure only for the enabled engines")
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=98838
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/20161124093401.18852-1-chris@chris-wilson.co.uk
Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
2016-11-24 11:00:44 +00:00
Mika Kuoppala
3fe3b030bd drm/i915: Decouple hang detection from hangcheck period
Hangcheck state accumulation has gained more steps
along the years, like head movement and more recently the
subunit inactivity check. As the subunit sampling is only
done if the previous state check showed inactivity, we
have added more stages (and time) to reach a hang verdict.

Asymmetric engine states led to different actual weight of
'one hangcheck unit' and it was demonstrated in some
hangs that due to difference in stages, simpler engines
were accused falsely of a hang as their scoring was much
more quicker to accumulate above the hang treshold.

To completely decouple the hangcheck guilty score
from the hangcheck period, convert hangcheck score to a
rough period of inactivity measurement. As these are
tracked as jiffies, they are meaningful also across
reset boundaries. This makes finding a guilty engine
more accurate across multi engine activity scenarios,
especially across asymmetric engines.

We lose the ability to detect cross batch malicious attempts
to hinder the progress. Plan is to move this functionality
to be part of context banning which is more natural fit,
later in the series.

v2: use time_before macros (Chris)
    reinstate the pardoning of moving engine after hc (Chris)
v3: avoid global state for per engine stall detection (Chris)
v4: take timeline last retirement into account (Chris)
v5: do debug print on pardoning, split out retirement timestamp (Chris)

Cc: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
Signed-off-by: Mika Kuoppala <mika.kuoppala@intel.com>
2016-11-21 14:36:40 +02:00
A.Sunil Kamath
58415918d4 drm/i915: Use num_scalers instead of SKL_NUM_SCALERS in debugfs
Better to use num_scaler itself while printing scaler_info.
This fixes a bug of printing information for the missing
second scaler on pipe C for SKL platform.

Signed-off-by: A.Sunil Kamath <sunil.kamath@intel.com>
Reviewed-by: Ander Conselvan de Oliveira <conselvan2@gmail.com>
Signed-off-by: Ander Conselvan de Oliveira <ander.conselvan.de.oliveira@intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/1479664226-22307-1-git-send-email-sunil.kamath@intel.com
2016-11-21 13:49:53 +02:00
Tvrtko Ursulin
08c4d7fcb3 drm/i915: Introduce enableddisabled helper
Similar to existing yesno and onoff and use it throughout the code.

Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/1479385814-2358-2-git-send-email-tvrtko.ursulin@linux.intel.com
2016-11-17 15:00:36 +00:00
Daniel Vetter
3975797f3e Merge remote-tracking branch 'airlied/drm-next' into drm-intel-next-queued
Tvrtko needs

commit b3c11ac267
Author: Eric Engestrom <eric@engestrom.ch>
Date:   Sat Nov 12 01:12:56 2016 +0000

    drm: move allocation out of drm_get_format_name()

to be able to apply his patches without conflicts.

Signed-off-by: Daniel Vetter <daniel.vetter@intel.com>
2016-11-17 14:32:57 +01:00
Chris Wilson
20311bd350 drm/i915/scheduler: Execute requests in order of priorities
Track the priority of each request and use it to determine the order in
which we submit requests to the hardware via execlists.

The priority of the request is determined by the user (eventually via
the context) but may be overridden at any time by the driver. When we set
the priority of the request, we bump the priority of all of its
dependencies to match - so that a high priority drawing operation is not
stuck behind a background task.

When the request is ready to execute (i.e. we have signaled the submit
fence following completion of all its dependencies, including third
party fences), we put the request into a priority sorted rbtree to be
submitted to the hardware. If the request is higher priority than all
pending requests, it will be submitted on the next context-switch
interrupt as soon as the hardware has completed the current request. We
do not currently preempt any current execution to immediately run a very
high priority request, at least not yet.

One more limitation, is that this is first implementation is for
execlists only so currently limited to gen8/gen9.

v2: Replace recursive priority inheritance bumping with an iterative
depth-first search list.
v3: list_next_entry() for walking lists
v4: Explain how the dfs solves the recursion problem with PI.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/20161114204105.29171-8-chris@chris-wilson.co.uk
2016-11-14 21:01:21 +00:00
Chris Wilson
663f71e73f drm/i915: Remove engine->execlist_lock
The execlist_lock is now completely subsumed by the engine->timeline->lock,
and so we can remove the redundant layer of locking.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/20161114204105.29171-5-chris@chris-wilson.co.uk
2016-11-14 21:00:25 +00:00
Eric Engestrom
b3c11ac267 drm: move allocation out of drm_get_format_name()
The function's behaviour was changed in 90844f0004, without changing
its signature, causing people to keep using it the old way without
realising they were now leaking memory.
Rob Clark also noticed it was also allocating GFP_KERNEL memory in
atomic contexts, breaking them.

Instead of having to allocate GFP_ATOMIC memory and fixing the callers
to make them cleanup the memory afterwards, let's change the function's
signature by having the caller take care of the memory and passing it to
the function.
The new parameter is a single-field struct in order to enforce the size
of its buffer and help callers to correctly manage their memory.

Fixes: 90844f0004 ("drm: make drm_get_format_name thread-safe")
Cc: Rob Clark <robdclark@gmail.com>
Cc: Christian König <christian.koenig@amd.com>
Acked-by: Christian König <christian.koenig@amd.com>
Acked-by: Rob Clark <robdclark@gmail.com>
Acked-by: Sinclair Yeh <syeh@vmware.com> (vmwgfx)
Reviewed-by: Jani Nikula <jani.nikula@intel.com>
Suggested-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Signed-off-by: Eric Engestrom <eric@engestrom.ch>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Link: http://patchwork.freedesktop.org/patch/msgid/20161112011309.9799-1-eric@engestrom.ch
2016-11-12 14:19:38 +01:00
Joonas Lahtinen
24327f837f drm/i915: Remove two sloppy inline functions from .h
Get rid of sloppy inline functions now that we don't have more users:

i915_gem_request_get_seqno
i915_gem_request_get_engine

v2:
- request->engine is always non-NULL (Chris)

Signed-off-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Cc: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
Link: http://patchwork.freedesktop.org/patch/msgid/1478589108-3702-1-git-send-email-joonas.lahtinen@linux.intel.com
2016-11-08 12:36:35 +02:00
Chris Wilson
c8247c0671 drm/i915: Show the execlist queue in debugfs/i915_engine_info
When looking at freezes whilst working on execlists, knowing the order
of the pending requests in the driver is useful.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Link: http://patchwork.freedesktop.org/patch/msgid/20161027000348.4641-2-chris@chris-wilson.co.uk
Reviewed-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
2016-11-02 15:02:00 +00:00
Joonas Lahtinen
56cea32382 drm/i915: Unify global_list into global_link
$ sed -i -r 's/\bglobal_list\b/global_link/g' *.c *.h

Cc: Chris Wilson <chris@chris-wilson.co.uk>
Signed-off-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
Link: http://patchwork.freedesktop.org/patch/msgid/1478081764-8058-1-git-send-email-joonas.lahtinen@linux.intel.com
2016-11-02 15:17:13 +02:00
Ville Syrjälä
98187836fc drm/i915: Always use intel_get_crtc_for_pipe()
Replace the open coded dev_priv->pipe_to_crtc_mapping[] usage with
intel_get_crtc_for_pipe().

Mostly done with coccinelle, with a few manual tweaks

@@
expression E1, E2;
@@
(
- E1->pipe_to_crtc_mapping[E2]
+ intel_get_crtc_for_pipe(E1, E2)
|
- E1->plane_to_crtc_mapping[E2]
+ intel_get_crtc_for_plane(E1, E2)
)

Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/1477946245-14134-12-git-send-email-ville.syrjala@linux.intel.com
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
2016-11-01 16:40:38 +02:00
Ville Syrjälä
b91eb5cce6 drm/i915: Pass dev_priv to intel_get_crtc_for_pipe()
Unify our approach to things by passing around dev_priv instead of dev.

Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/1477946245-14134-11-git-send-email-ville.syrjala@linux.intel.com
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
2016-11-01 16:40:38 +02:00
Ville Syrjälä
0f0f74bc83 drm/i915: Pass dev_priv to intel_wait_for_vblank()
Unify our approach to things by passing around dev_priv instead of dev.

Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/1477946245-14134-8-git-send-email-ville.syrjala@linux.intel.com
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
2016-11-01 16:40:38 +02:00
Ville Syrjälä
e2af48c66b drm/i915: Store struct intel_crtc * in {pipe,plane}_to_crtc_mapping[]
A lot of users of the {pipe,plane}_to_crtc_mapping[] will end up
casting the result to intel_crtc, so let's just store the intel_crtc
pointer in the first place.

Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/1477946245-14134-7-git-send-email-ville.syrjala@linux.intel.com
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
2016-11-01 16:40:38 +02:00
Chris Wilson
cb399eabc4 drm/i915: Avoid accessing request->timeline outside of its lifetime
Whilst waiting on a request, we may do so without holding any locks or
any guards beyond a reference to the request. In order to avoid taking
locks within request deallocation, we drop references to its timeline
(via the context and ppgtt) upon retirement. We should avoid chasing
such pointers outside of their control, in particular we inspect the
request->timeline to see if we may restore the RPS waitboost for a
client. If we instead look at the engine->timeline, we will have similar
behaviour on both full-ppgtt and !full-ppgtt systems and reduce the
amount of reward we give towards stalling clients (i.e. only if the
client stalls and the GPU is uncontended does it reclaim its boost).
This restores behaviour back to pre-timelines, whilst fixing:

[  645.078485] BUG: KASAN: use-after-free in i915_gem_object_wait_fence+0x1ee/0x2e0 at addr ffff8802335643a0
[  645.078577] Read of size 4 by task gem_exec_schedu/28408
[  645.078638] CPU: 1 PID: 28408 Comm: gem_exec_schedu Not tainted 4.9.0-rc2+ #64
[  645.078724] Hardware name:                  /        , BIOS PYBSWCEL.86A.0027.2015.0507.1758 05/07/2015
[  645.078816]  ffff88022daef9a0 ffffffff8143d059 ffff880235402a80 ffff880233564200
[  645.078998]  ffff88022daef9c8 ffffffff81229c5c ffff88022daefa48 ffff880233564200
[  645.079172]  ffff880235402a80 ffff88022daefa38 ffffffff81229ef0 000000008110a796
[  645.079345] Call Trace:
[  645.079404]  [<ffffffff8143d059>] dump_stack+0x68/0x9f
[  645.079467]  [<ffffffff81229c5c>] kasan_object_err+0x1c/0x70
[  645.079534]  [<ffffffff81229ef0>] kasan_report_error+0x1f0/0x4b0
[  645.079601]  [<ffffffff8122a244>] kasan_report+0x34/0x40
[  645.079676]  [<ffffffff81634f5e>] ? i915_gem_object_wait_fence+0x1ee/0x2e0
[  645.079741]  [<ffffffff81229951>] __asan_load4+0x61/0x80
[  645.079807]  [<ffffffff81634f5e>] i915_gem_object_wait_fence+0x1ee/0x2e0
[  645.079876]  [<ffffffff816364bf>] i915_gem_object_wait+0x19f/0x590
[  645.079944]  [<ffffffff81636320>] ? i915_gem_object_wait_priority+0x500/0x500
[  645.080016]  [<ffffffff8110fb30>] ? debug_show_all_locks+0x1e0/0x1e0
[  645.080084]  [<ffffffff8110abdc>] ? check_chain_key+0x14c/0x210
[  645.080157]  [<ffffffff8110a796>] ? __lock_is_held+0x46/0xc0
[  645.080226]  [<ffffffff8163bc61>] ? i915_gem_set_domain_ioctl+0x141/0x690
[  645.080296]  [<ffffffff8163bcc2>] i915_gem_set_domain_ioctl+0x1a2/0x690
[  645.080366]  [<ffffffff811f8f85>] ? __might_fault+0x75/0xe0
[  645.080433]  [<ffffffff815a55f7>] drm_ioctl+0x327/0x640
[  645.080508]  [<ffffffff8163bb20>] ? i915_gem_obj_prepare_shmem_write+0x3a0/0x3a0
[  645.080603]  [<ffffffff815a52d0>] ? drm_ioctl_permit+0x120/0x120
[  645.080670]  [<ffffffff8110abdc>] ? check_chain_key+0x14c/0x210
[  645.080738]  [<ffffffff81275717>] do_vfs_ioctl+0x127/0xa20
[  645.080804]  [<ffffffff8120268c>] ? do_mmap+0x47c/0x580
[  645.080871]  [<ffffffff811da567>] ? vm_mmap_pgoff+0x117/0x140
[  645.080938]  [<ffffffff812755f0>] ? ioctl_preallocate+0x150/0x150
[  645.081011]  [<ffffffff81108c53>] ? up_write+0x23/0x50
[  645.081078]  [<ffffffff811da567>] ? vm_mmap_pgoff+0x117/0x140
[  645.081145]  [<ffffffff811da450>] ? vma_is_stack_for_current+0x90/0x90
[  645.081214]  [<ffffffff8110d853>] ? mark_held_locks+0x23/0xc0
[  645.082030]  [<ffffffff81288408>] ? __fget+0x168/0x250
[  645.082106]  [<ffffffff819ad517>] ? entry_SYSCALL_64_fastpath+0x5/0xb1
[  645.082176]  [<ffffffff81288592>] ? __fget_light+0xa2/0xc0
[  645.082242]  [<ffffffff8127604c>] SyS_ioctl+0x3c/0x70
[  645.082309]  [<ffffffff819ad52e>] entry_SYSCALL_64_fastpath+0x1c/0xb1
[  645.082374] Object at ffff880233564200, in cache kmalloc-8192 size: 8192
[  645.082431] Allocated:
[  645.082480] PID = 28408
[  645.082535]  [  645.082566] [<ffffffff8103ae66>] save_stack_trace+0x16/0x20
[  645.082623]  [  645.082656] [<ffffffff81228b06>] save_stack+0x46/0xd0
[  645.082716]  [  645.082756] [<ffffffff812292fd>] kasan_kmalloc+0xad/0xe0
[  645.082817]  [  645.082848] [<ffffffff81631752>] i915_ppgtt_create+0x52/0x220
[  645.082908]  [  645.082941] [<ffffffff8161db96>] i915_gem_create_context+0x396/0x560
[  645.083027]  [  645.083059] [<ffffffff8161f857>] i915_gem_context_create_ioctl+0x97/0xf0
[  645.083152]  [  645.083183] [<ffffffff815a55f7>] drm_ioctl+0x327/0x640
[  645.083243]  [  645.083274] [<ffffffff81275717>] do_vfs_ioctl+0x127/0xa20
[  645.083334]  [  645.083372] [<ffffffff8127604c>] SyS_ioctl+0x3c/0x70
[  645.083432]  [  645.083464] [<ffffffff819ad52e>] entry_SYSCALL_64_fastpath+0x1c/0xb1
[  645.083551] Freed:
[  645.083599] PID = 27629
[  645.083648]  [  645.083676] [<ffffffff8103ae66>] save_stack_trace+0x16/0x20
[  645.083738]  [  645.083770] [<ffffffff81228b06>] save_stack+0x46/0xd0
[  645.083830]  [  645.083862] [<ffffffff81229203>] kasan_slab_free+0x73/0xc0
[  645.083922]  [  645.083961] [<ffffffff812279c9>] kfree+0xa9/0x170
[  645.084021]  [  645.084053] [<ffffffff81629f60>] i915_ppgtt_release+0x100/0x180
[  645.084139]  [  645.084171] [<ffffffff8161d414>] i915_gem_context_free+0x1b4/0x230
[  645.084257]  [  645.084288] [<ffffffff816537b2>] intel_lr_context_unpin+0x192/0x230
[  645.084380]  [  645.084413] [<ffffffff81645250>] i915_gem_request_retire+0x620/0x630
[  645.084500]  [  645.085226] [<ffffffff816473d1>] i915_gem_retire_requests+0x181/0x280
[  645.085313]  [  645.085352] [<ffffffff816352ba>] i915_gem_retire_work_handler+0xca/0xe0
[  645.085440]  [  645.085471] [<ffffffff810c725b>] process_one_work+0x4fb/0x920
[  645.085532]  [  645.085562] [<ffffffff810c770d>] worker_thread+0x8d/0x840
[  645.085622]  [  645.085653] [<ffffffff810d21e5>] kthread+0x185/0x1b0
[  645.085718]  [  645.085750] [<ffffffff819ad7a7>] ret_from_fork+0x27/0x40
[  645.085811] Memory state around the buggy address:
[  645.085869]  ffff880233564280: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb
[  645.085956]  ffff880233564300: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb
[  645.086053] >ffff880233564380: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb
[  645.086138]                                ^
[  645.086193]  ffff880233564400: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb
[  645.086283]  ffff880233564480: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb

v2: Add a comment to document the hint like nature of
 intel_engine_last_submit()

Fixes: 73cb97010d ("drm/i915: Combine seqno + tracking into a global timeline struct")
Fixes: 80b204bce8 ("drm/i915: Enable multiple timelines")
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/20161101100317.11129-1-chris@chris-wilson.co.uk
2016-11-01 10:48:40 +00:00
Chris Wilson
28176ef4cf drm/i915: Reserve space in the global seqno during request allocation
A restriction on our global seqno is that they cannot wrap, and that we
cannot use the value 0. This allows us to detect when a request has not
yet been submitted, its global seqno is still 0, and ensures that
hardware semaphores are monotonic as required by older hardware. To
meet these restrictions when we defer the assignment of the global
seqno, we must check that we have an available slot in the global seqno
space during request construction. If that test fails, we wait for all
requests to be completed and reset the hardware back to 0.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/20161028125858.23563-33-chris@chris-wilson.co.uk
2016-10-28 20:53:56 +01:00
Chris Wilson
f6168e3304 drm/i915: Convert breadcrumbs spinlock to be irqsafe
The breadcrumbs are about to be used from within IRQ context sections
(e.g. nouveau signals a fence from an interrupt handler causing us to
submit a new request) and/or from bottom-half tasklets (i.e.
intel_lrc_irq_handler), therefore we need to employ the irqsafe spinlock
variants.

For example, deferring the request submission to the
intel_lrc_irq_handler generates this trace:

[   66.388639] =================================
[   66.388650] [ INFO: inconsistent lock state ]
[   66.388663] 4.9.0-rc2+ #56 Not tainted
[   66.388672] ---------------------------------
[   66.388682] inconsistent {SOFTIRQ-ON-W} -> {IN-SOFTIRQ-W} usage.
[   66.388695] swapper/1/0 [HC0[0]:SC1[1]:HE0:SE0] takes:
[   66.388706]  (&(&b->lock)->rlock){+.?...} , at: [<ffffffff81401c88>] intel_engine_enable_signaling+0x78/0x150
[   66.388761] {SOFTIRQ-ON-W} state was registered at:
[   66.388772]   [   66.388783] [<ffffffff810bd842>] __lock_acquire+0x682/0x1870
[   66.388795]   [   66.388803] [<ffffffff810bedbc>] lock_acquire+0x6c/0xb0
[   66.388814]   [   66.388824] [<ffffffff8161753a>] _raw_spin_lock+0x2a/0x40
[   66.388835]   [   66.388845] [<ffffffff81401e41>] intel_engine_reset_breadcrumbs+0x21/0xb0
[   66.388857]   [   66.388866] [<ffffffff81403ae7>] gen8_init_common_ring+0x67/0x100
[   66.388878]   [   66.388887] [<ffffffff81403b92>] gen8_init_render_ring+0x12/0x60
[   66.388903]   [   66.388912] [<ffffffff813f8707>] i915_gem_init_hw+0xf7/0x2a0
[   66.388927]   [   66.388936] [<ffffffff813f899b>] i915_gem_init+0xbb/0xf0
[   66.388950]   [   66.388959] [<ffffffff813b4980>] i915_driver_load+0x7e0/0x1330
[   66.388978]   [   66.388988] [<ffffffff813c09d8>] i915_pci_probe+0x28/0x40
[   66.389003]   [   66.389013] [<ffffffff812fa0db>] pci_device_probe+0x8b/0xf0
[   66.389028]   [   66.389037] [<ffffffff8147737e>] driver_probe_device+0x21e/0x430
[   66.389056]   [   66.389065] [<ffffffff8147766e>] __driver_attach+0xde/0xe0
[   66.389080]   [   66.389090] [<ffffffff814751ad>] bus_for_each_dev+0x5d/0x90
[   66.389105]   [   66.389113] [<ffffffff81477799>] driver_attach+0x19/0x20
[   66.389134]   [   66.389144] [<ffffffff81475ced>] bus_add_driver+0x15d/0x260
[   66.389159]   [   66.389168] [<ffffffff81477e3b>] driver_register+0x5b/0xd0
[   66.389183]   [   66.389281] [<ffffffff812fa19b>] __pci_register_driver+0x5b/0x60
[   66.389301]   [   66.389312] [<ffffffff81aed333>] i915_init+0x3e/0x45
[   66.389326]   [   66.389336] [<ffffffff81ac2ffa>] do_one_initcall+0x8b/0x118
[   66.389350]   [   66.389359] [<ffffffff81ac323a>] kernel_init_freeable+0x1b3/0x23b
[   66.389378]   [   66.389387] [<ffffffff8160fc39>] kernel_init+0x9/0x100
[   66.389402]   [   66.389411] [<ffffffff816180e7>] ret_from_fork+0x27/0x40
[   66.389426] irq event stamp: 315865
[   66.389438] hardirqs last  enabled at (315864): [<ffffffff816178f1>] _raw_spin_unlock_irqrestore+0x31/0x50
[   66.389469] hardirqs last disabled at (315865): [<ffffffff816176b3>] _raw_spin_lock_irqsave+0x13/0x50
[   66.389499] softirqs last  enabled at (315818): [<ffffffff8107a04c>] _local_bh_enable+0x1c/0x50
[   66.389530] softirqs last disabled at (315819): [<ffffffff8107a50e>] irq_exit+0xbe/0xd0
[   66.389559]
[   66.389559] other info that might help us debug this:
[   66.389580]  Possible unsafe locking scenario:
[   66.389580]
[   66.389598]        CPU0
[   66.389609]        ----
[   66.389620]   lock(&(&b->lock)->rlock);
[   66.389650]   <Interrupt>
[   66.389661]     lock(&(&b->lock)->rlock);
[   66.389690]
[   66.389690]  *** DEADLOCK ***
[   66.389690]
[   66.389715] 2 locks held by swapper/1/0:
[   66.389728]  #0: (&(&tl->lock)->rlock){..-...}, at: [<ffffffff81403e01>] intel_lrc_irq_handler+0x201/0x3c0
[   66.389785]  #1: (&(&req->lock)->rlock/1){..-...}, at: [<ffffffff813fc0af>] __i915_gem_request_submit+0x8f/0x170
[   66.389854]
[   66.389854] stack backtrace:
[   66.389959] CPU: 1 PID: 0 Comm: swapper/1 Not tainted 4.9.0-rc2+ #56
[   66.389976] Hardware name:                  /        , BIOS PYBSWCEL.86A.0027.2015.0507.1758 05/07/2015
[   66.389999]  ffff88027fd03c58 ffffffff812beae5 ffff88027696e680 ffffffff822afe20
[   66.390036]  ffff88027fd03ca8 ffffffff810bb420 0000000000000001 0000000000000000
[   66.390070]  0000000000000000 0000000000000006 0000000000000004 ffff88027696ee10
[   66.390104] Call Trace:
[   66.390117]  <IRQ>
[   66.390128]  [<ffffffff812beae5>] dump_stack+0x68/0x93
[   66.390147]  [<ffffffff810bb420>] print_usage_bug+0x1d0/0x1e0
[   66.390164]  [<ffffffff810bb8a0>] mark_lock+0x470/0x4f0
[   66.390181]  [<ffffffff810ba9d0>] ? print_shortest_lock_dependencies+0x1b0/0x1b0
[   66.390203]  [<ffffffff810bd75d>] __lock_acquire+0x59d/0x1870
[   66.390221]  [<ffffffff810bedbc>] lock_acquire+0x6c/0xb0
[   66.390237]  [<ffffffff810bedbc>] ? lock_acquire+0x6c/0xb0
[   66.390255]  [<ffffffff81401c88>] ? intel_engine_enable_signaling+0x78/0x150
[   66.390273]  [<ffffffff8161753a>] _raw_spin_lock+0x2a/0x40
[   66.390291]  [<ffffffff81401c88>] ? intel_engine_enable_signaling+0x78/0x150
[   66.390309]  [<ffffffff81401c88>] intel_engine_enable_signaling+0x78/0x150
[   66.390327]  [<ffffffff813fc170>] __i915_gem_request_submit+0x150/0x170
[   66.390345]  [<ffffffff81403e8b>] intel_lrc_irq_handler+0x28b/0x3c0
[   66.390363]  [<ffffffff81079d97>] tasklet_action+0x57/0xc0
[   66.390380]  [<ffffffff8107a249>] __do_softirq+0x119/0x240
[   66.390396]  [<ffffffff8107a50e>] irq_exit+0xbe/0xd0
[   66.390414]  [<ffffffff8101afd5>] do_IRQ+0x65/0x110
[   66.390431]  [<ffffffff81618806>] common_interrupt+0x86/0x86
[   66.390446]  <EOI>
[   66.390457]  [<ffffffff814ec6d1>] ? cpuidle_enter_state+0x151/0x200
[   66.390480]  [<ffffffff814ec7a2>] cpuidle_enter+0x12/0x20
[   66.390498]  [<ffffffff810b639e>] call_cpuidle+0x1e/0x40
[   66.390516]  [<ffffffff810b65ae>] cpu_startup_entry+0x10e/0x1f0
[   66.390534]  [<ffffffff81036133>] start_secondary+0x103/0x130

(This is split out of the defer global seqno allocation patch due to
realisation that we need a more complete conversion if we want to defer
request submission even further.)

v2: lockdep was warning about mixed SOFTIRQ contexts not HARDIRQ
contexts so we only need to use spin_lock_bh and not disable interrupts.

v3: We need full irq protection as we may be called from a third party
interrupt handler (via fences).

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Reviewed-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/20161028125858.23563-32-chris@chris-wilson.co.uk
2016-10-28 20:53:55 +01:00
Chris Wilson
562f5d4550 drm/i915: Create a unique name for the context
This will be used for communicating issues with this context to
userspace, so we want to identify the parent process and the individual
context. Note that the name isn't quite unique, it makes the presumption
of there only being a single device fd per process.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/20161028125858.23563-31-chris@chris-wilson.co.uk
2016-10-28 20:53:55 +01:00
Chris Wilson
85e17f5974 drm/i915: Move the global sync optimisation to the timeline
Currently we try to reduce the number of synchronisations (now the
number of requests we need to wait upon) by noting that if we have
earlier waited upon a request, all subsequent requests in the timeline
will be after the wait. This only applies to requests in this timeline,
as other timelines will not be ordered by that waiter.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/20161028125858.23563-30-chris@chris-wilson.co.uk
2016-10-28 20:53:54 +01:00
Chris Wilson
65e4760e39 drm/i915: Introduce a global_seqno for each request
Though we will have multiple timelines, we still have a single timeline
of execution. This we can use to provide an execution and retirement order
of requests. This keeps tracking execution of requests simple, and vital
for preserving a single waiter (i.e. so that we can order the waiters so
that only the earliest to wakeup need be woken). To accomplish this we
distinguish the seqno used to order requests per-context (external) and
that used internally for execution.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/20161028125858.23563-26-chris@chris-wilson.co.uk
2016-10-28 20:53:53 +01:00
Chris Wilson
73cb97010d drm/i915: Combine seqno + tracking into a global timeline struct
Our timelines are more than just a seqno. They also provide an ordered
list of requests to be executed. Due to the restriction of handling
individual address spaces, we are limited to a timeline per address
space but we use a fence context per engine within.

Our first step to introducing independent timelines per context (i.e. to
allow each context to have a queue of requests to execute that have a
defined set of dependencies on other requests) is to provide a timeline
abstraction for the global execution queue.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/20161028125858.23563-23-chris@chris-wilson.co.uk
2016-10-28 20:53:51 +01:00
Chris Wilson
d07f0e59b2 drm/i915: Move GEM activity tracking into a common struct reservation_object
In preparation to support many distinct timelines, we need to expand the
activity tracking on the GEM object to handle more than just a request
per engine. We already use the struct reservation_object on the dma-buf
to handle many fence contexts, so integrating that into the GEM object
itself is the preferred solution. (For example, we can now share the same
reservation_object between every consumer/producer using this buffer and
skip the manual import/export via dma-buf.)

v2: Reimplement busy-ioctl (by walking the reservation object), postpone
the ABI change for another day. Similarly use the reservation object to
find the last_write request (if active and from i915) for choosing
display CS flips.

Caveats:

 * busy-ioctl: busy-ioctl only reports on the native fences, it will not
warn of stalls (in set-domain-ioctl, pread/pwrite etc) if the object is
being rendered to by external fences. It also will not report the same
busy state as wait-ioctl (or polling on the dma-buf) in the same
circumstances. On the plus side, it does retain reporting of which
*i915* engines are engaged with this object.

 * non-blocking atomic modesets take a step backwards as the wait for
render completion blocks the ioctl. This is fixed in a subsequent
patch to use a fence instead for awaiting on the rendering, see
"drm/i915: Restore nonblocking awaits for modesetting"

 * dynamic array manipulation for shared-fences in reservation is slower
than the previous lockless static assignment (e.g. gem_exec_lut_handle
runtime on ivb goes from 42s to 66s), mainly due to atomic operations
(maintaining the fence refcounts).

 * loss of object-level retirement callbacks, emulated by VMA retirement
tracking.

 * minor loss of object-level last activity information from debugfs,
could be replaced with per-vma information if desired

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/20161028125858.23563-21-chris@chris-wilson.co.uk
2016-10-28 20:53:50 +01:00
Chris Wilson
fbbd37b36f drm/i915: Move object release to a freelist + worker
We want to hide the latency of releasing objects and their backing
storage from the submission, so we move the actual free to a worker.
This allows us to switch to struct_mutex freeing of the object in the
next patch.

Furthermore, if we know that the object we are dereferencing remains valid
for the duration of our access, we can forgo the usual synchronisation
barriers and atomic reference counting. To ensure this we defer freeing
an object til after an RCU grace period, such that any lookup of the
object within an RCU read critical section will remain valid until
after we exit that critical section. We also employ this delay for
rate-limiting the serialisation on reallocation - we have to slow down
object creation in order to prevent resource starvation (in particular,
files).

v2: Return early in i915_gem_tiling() ioctl to skip over superfluous
work on error.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/20161028125858.23563-19-chris@chris-wilson.co.uk
2016-10-28 20:53:49 +01:00
Chris Wilson
a4f5ea64f0 drm/i915: Refactor object page API
The plan is to make obtaining the backing storage for the object avoid
struct_mutex (i.e. use its own locking). The first step is to update the
API so that normal users only call pin/unpin whilst working on the
backing storage.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Reviewed-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/20161028125858.23563-12-chris@chris-wilson.co.uk
2016-10-28 20:53:46 +01:00
Matt Roper
8b364b41ce drm/i915: Rename for_each_plane -> for_each_universal_plane
This macro's name is a bit misleading; it doesn't actually iterate over
all planes since it omits the cursor plane.  Its only uses are in gen9
code which is using it to iterate over the universal planes (which we
treat as primary+sprites); in these cases the legacy cursor registers
are programmed independently if necessary.  The macro's iterator value
(0 for primary plane, spritenum+1 for each secondary plane) also isn't
meaningful outside the gen9 context where the hardware considers them to
all be "universal" planes that follow this numbering.

This is just a renaming/clarification patch with no functional change.
However it will make the subsequent patches more clear.

Signed-off-by: Matt Roper <matthew.d.roper@intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/1477522291-10874-2-git-send-email-matthew.d.roper@intel.com
Reviewed-by: Paulo Zanoni <paulo.r.zanoni@intel.com>
2016-10-28 11:26:36 -07:00
Sagar Arun Kamble
685534ef4c drm/i915: Debugfs support for GuC logging control
This patch provides debugfs interface i915_guc_output_control for
on the fly enabling/disabling of logging in GuC firmware and controlling
the verbosity level of logs.
The value written to the file, should have bit 0 set to enable logging and
bits 4-7 should contain the verbosity info.

v2: Add a forceful flush, to collect left over logs, on disabling logging.
    Useful for Validation.

v3: Besides minor cleanup, implement read method for the debugfs file and
    set the guc_log_level to -1 when logging is disabled. (Tvrtko)

v4: Minor cleanup & rebase. (Tvrtko)

v5:
- Lock struct_mutex after the NULL check for guc log buffer vma. (Chris)
- Rebase.

Signed-off-by: Sagar Arun Kamble <sagar.a.kamble@intel.com>
Signed-off-by: Akash Goel <akash.goel@intel.com>
Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
2016-10-25 09:34:23 +01:00
Akash Goel
5aa1ee4b12 drm/i915: Add stats for GuC log buffer flush interrupts
GuC firmware sends an interrupt to flush the log buffer when it
becomes half full. GuC firmware also tracks how many times the
buffer overflowed.
It would be useful to maintain a statistics of how many flush
interrupts were received and for which type of log buffer,
along with the overflow count of each buffer type.
Augmented i915_log_info debugfs to report back these statistics.

v2:
- Update the logic to detect multiple overflows between the 2
  flush interrupts and also log a message for overflow (Tvrtko)
- Track the number of times there was no free sub buffer to capture
  the GuC log buffer. (Tvrtko)

v3:
- Fix the printf field width for overflow counter, set it to 10 as per the
  max value of u32, which takes 10 digits in decimal form. (Tvrtko)

v4:
- Move the log buffer overflow handling to a new function for better
  readability. (Tvrtko)

Signed-off-by: Akash Goel <akash.goel@intel.com>
Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
2016-10-25 09:34:23 +01:00
Akash Goel
d6b40b4b17 drm/i915: New structure to contain GuC logging related fields
So far there were 2 fields related to GuC logs in 'intel_guc' structure.
For the support of capturing GuC logs & storing them in a local buffer,
multiple new fields would have to be added. This warrants a separate
structure to contain the fields related to GuC logging state.
Added a new structure 'intel_guc_log' and instance of it inside
'intel_guc' structure.

v2: Rebase.

Signed-off-by: Akash Goel <akash.goel@intel.com>
Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
2016-10-25 09:34:06 +01:00
Daniel Vetter
f9bf1d97e8 Merge remote-tracking branch 'airlied/drm-next' into drm-intel-next-queued
Backmerge because Chris Wilson needs the very latest&greates of
Gustavo Padovan's sync_file work, specifically the refcounting changes
from:

commit 30cd85dd6e
Author: Gustavo Padovan <gustavo.padovan@collabora.co.uk>
Date:   Wed Oct 19 15:48:32 2016 -0200

    dma-buf/sync_file: hold reference to fence when creating sync_file

Also good to sync in general since git tends to get confused with the
cherry-picking going on.

Signed-off-by: Daniel Vetter <daniel.vetter@intel.com>
2016-10-25 08:57:53 +02:00
Dave Airlie
5481e27f6f Merge tag 'drm-intel-next-2016-10-24' of git://anongit.freedesktop.org/drm-intel into drm-next
- first slice of the gvt device model (Zhenyu et al)
- compression support for gpu error states (Chris)
- sunset clause on gpu errors resulting in dmesg noise telling users
  how to report them
- .rodata diet from Tvrtko
- switch over lots of macros to only take dev_priv (Tvrtko)
- underrun suppression for dp link training (Ville)
- lspcon (hmdi 2.0 on skl/bxt) support from Shashank Sharma, polish
  from Jani
- gen9 wm fixes from Paulo&Lyude
- updated ddi programming for kbl (Rodrigo)
- respect alternate aux/ddc pins (from vbt) for all ddi ports (Ville)

* tag 'drm-intel-next-2016-10-24' of git://anongit.freedesktop.org/drm-intel: (227 commits)
  drm/i915: Update DRIVER_DATE to 20161024
  drm/i915: Stop setting SNB min-freq-table 0 on powersave setup
  drm/i915/dp: add lane_count check in intel_dp_check_link_status
  drm/i915: Fix whitespace issues
  drm/i915: Clean up DDI DDC/AUX CH sanitation
  drm/i915: Respect alternate_ddc_pin for all DDI ports
  drm/i915: Respect alternate_aux_channel for all DDI ports
  drm/i915/gen9: Remove WaEnableYV12BugFixInHalfSliceChicken7
  drm/i915: KBL - Recommended buffer translation programming for DisplayPort
  drm/i915: Move down skl/kbl ddi iboost and n_edp_entires fixup
  drm/i915: Add a sunset clause to GPU hang logging
  drm/i915: Stop reporting error details in dmesg as well as the error-state
  drm/i915/gvt: do not ignore return value of create_scratch_page
  drm/i915/gvt: fix spare warnings on odd constant _Bool cast
  drm/i915/gvt: mark symbols static where possible
  drm/i915/gvt: fix sparse warnings on different address spaces
  drm/i915/gvt: properly access enabled intel_engine_cs
  drm/i915/gvt: Remove defunct vmap_batch()
  drm/i915/gvt: Use common mapping routines for shadow_bb object
  drm/i915/gvt: Use common mapping routines for indirect_ctx object
  ...
2016-10-25 16:39:43 +10:00
Paulo Zanoni
0fc6a9dc94 drm/i915/fbc: fix FBC_COMPRESSION_MASK on BDW+
Its size is 11:0 instead of 10:0. Found by inspecting the spec. I'm
not aware of any real-world IGT failures caused by this.

Signed-off-by: Paulo Zanoni <paulo.r.zanoni@intel.com>
Reviewed-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/1477065346-13736-2-git-send-email-paulo.r.zanoni@intel.com
2016-10-24 17:39:03 -02:00
Chris Wilson
9c870d0367 drm/i915: Use RPM as the barrier for controlling user mmap access
We can remove the false coupling between RPM and struct mutex by the
observation that we can use the RPM wakeref as the barrier around user
mmap access. That is as we tear down the user's PTE atomically from
within rpm suspend and then to fault in new PTE requires the rpm
wakeref, means that no user access is possible through those PTE without
RPM being awake. Having made that observation, we can then remove the
presumption of having to take rpm outside of struct_mutex and so allow
fine grained acquisition of a wakeref around hw access rather than
having to remember to acquire the wakeref early on.

v2: Rejig placement of the new intel_runtime_pm_get() to be as tight
as possible around the GTT pread/pwrite.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Imre Deak <imre.deak@intel.com>
Cc: Daniel Vetter <daniel@ffwll.ch>
Cc: Ville Syrjälä <ville.syrjala@linux.intel.com>
Reviewed-by: Daniel Vetter <daniel@ffwll.ch>
Link: http://patchwork.freedesktop.org/patch/msgid/20161024124218.18252-2-chris@chris-wilson.co.uk
2016-10-24 13:45:35 +01:00
Chris Wilson
275f039db5 drm/i915: Move user fault tracking to a separate list
We want to decouple RPM and struct_mutex, but currently RPM has to walk
the list of bound objects and remove userspace mmapping before we
suspend (otherwise userspace may continue to access the GTT whilst it is
powered down). This currently requires the struct_mutex to walk the
bound_list, but if we move that to a separate list and lock we can take
the first step towards removing the struct_mutex.

v2: Split runtime suspend unmapping vs regular unmapping, to make the
locking (and barriers) clearer. Add the object to the userfault_list
prior to inserting the first PTE, the race between add/revoke depends
upon struct_mutex for regular unmappings and rpm for runtime-suspend.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Daniel Vetter <daniel.vetter@ffwll.ch> #v1
Link: http://patchwork.freedesktop.org/patch/msgid/20161024124218.18252-1-chris@chris-wilson.co.uk
2016-10-24 13:45:35 +01:00
Chris Wilson
3ef7f22893 drm/i915: Bump object bookkeeping to u64 from size_t
Internally we allow for using more objects than a single process can
allocate, i.e. we allow for a 64bit GPU address space even on a 32bit
system. Using size_t may oveerflow.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/20161018120251.25043-1-chris@chris-wilson.co.uk
2016-10-18 14:22:26 +01:00
Chris Wilson
0853695c3b drm: Add reference counting to drm_atomic_state
drm_atomic_state has a complicated single owner model that tracks the
single reference from allocation through to destruction on another
thread - or perhaps on a local error path. We can simplify this tracking
by using reference counting (at a cost of a few more atomics). This is
even more beneficial when the lifetime of the state becomes more
convoluted than being passed to a single worker thread for the commit.

v2: Double check !intel atomic_commit functions for missing gets
v3: Update kerneldocs

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Daniel Vetter <daniel.vetter@ffwll.ch>
Cc: dri-devel@lists.freedesktop.org
Reviewed-by: Eric Engestrom <eric.engestrom@imgtec.com>
Reviewed-by: Sean Paul <seanpaul@chromium.org>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Link: http://patchwork.freedesktop.org/patch/msgid/20161014121833.439-27-chris@chris-wilson.co.uk
2016-10-17 08:19:57 +02:00
Tvrtko Ursulin
5db9401983 drm/i915: Make IS_GEN macros only take dev_priv
Saves 1416 bytes of .rodata strings.

v2: Add parantheses around dev_priv. (Ville Syrjala)

Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Reviewed-by: David Weinehall <david.weinehall@linux.intel.com>
Acked-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Acked-by: Jani Nikula <jani.nikula@linux.intel.com>
Acked-by: Chris Wilson <chris@chris-wilson.co.uk>
Acked-by: Maarten Lankhorst <maarten.lankhorst@linux.intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/1476352990-2504-1-git-send-email-tvrtko.ursulin@linux.intel.com
2016-10-14 12:23:22 +01:00
Akash Goel
3b3f1650b1 drm/i915: Allocate intel_engine_cs structure only for the enabled engines
With the possibility of addition of many more number of rings in future,
the drm_i915_private structure could bloat as an array, of type
intel_engine_cs, is embedded inside it.
	struct intel_engine_cs engine[I915_NUM_ENGINES];
Though this is still fine as generally there is only a single instance of
drm_i915_private structure used, but not all of the possible rings would be
enabled or active on most of the platforms. Some memory can be saved by
allocating intel_engine_cs structure only for the enabled/active engines.
Currently the engine/ring ID is kept static and dev_priv->engine[] is simply
indexed using the enums defined in intel_engine_id.
To save memory and continue using the static engine/ring IDs, 'engine' is
defined as an array of pointers.
	struct intel_engine_cs *engine[I915_NUM_ENGINES];
dev_priv->engine[engine_ID] will be NULL for disabled engine instances.

There is a text size reduction of 928 bytes, from 1028200 to 1027272, for
i915.o file (but for i915.ko file text size remain same as 1193131 bytes).

v2:
- Remove the engine iterator field added in drm_i915_private structure,
  instead pass a local iterator variable to the for_each_engine**
  macros. (Chris)
- Do away with intel_engine_initialized() and instead directly use the
  NULL pointer check on engine pointer. (Chris)

v3:
- Remove for_each_engine_id() macro, as the updated macro for_each_engine()
  can be used in place of it. (Chris)
- Protect the access to Render engine Fault register with a NULL check, as
  engine specific init is done later in Driver load sequence.

v4:
- Use !!dev_priv->engine[VCS] style for the engine check in getparam. (Chris)
- Kill the superfluous init_engine_lists().

v5:
- Cleanup the intel_engines_init() & intel_engines_setup(), with respect to
  allocation of intel_engine_cs structure. (Chris)

v6:
- Rebase.

v7:
- Optimize the for_each_engine_masked() macro. (Chris)
- Change the type of 'iter' local variable to enum intel_engine_id. (Chris)
- Rebase.

v8: Rebase.

v9: Rebase.

v10:
- For index calculation use engine ID instead of pointer based arithmetic in
  intel_engine_sync_index() as engine pointers are not contiguous now (Chris)
- For appropriateness, rename local enum variable 'iter' to 'id'. (Joonas)
- Use for_each_engine macro for cleanup in intel_engines_init() and remove
  check for NULL engine pointer in cleanup() routines. (Joonas)

v11: Rebase.

Cc: Chris Wilson <chris@chris-wilson.co.uk>
Signed-off-by: Akash Goel <akash.goel@intel.com>
Reviewed-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/1476378888-7372-1-git-send-email-akash.goel@intel.com
2016-10-14 09:58:43 +01:00
Chris Wilson
8baa1f04b9 drm/i915: Update debugfs describe_obj() to show fault-mappable
The current meaning of whether an object has a GGTT vma is very
ill-defined (and note we don't check for any partials either), it just
means that at some point it was in the GGTT but it may not be now. The
information we really care about here is whether it is taking up
precious mappable aperture space. This is the obj->fault_mappable flag.
We have a redundant long form reprinting of this information, so remove
that in favour of the compact flag.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/20161012114827.17031-2-chris@chris-wilson.co.uk
2016-10-12 16:27:14 +01:00
Chris Wilson
98a2f41167 drm/i915: Allow disabling error capture
We currently capture the GPU state after we detect a hang. This is vital
for us to both triage and debug hangs in the wild (post-mortem
debugging). However, it comes at the cost of running some potentially
dangerous code (since it has to make very few assumption about the state
of the driver) that is quite resource intensive.

This patch introduces both a method to disable error capture at runtime
(for users who hit bugs at runtime and need a workaround) and to disable
error capture at compiletime (for realtime users who want to minimise
any possible latency, and never require error capture, saving ~30k of
code). The cost is that we now have to be wary of (and test!) a kconfig
flag and a module parameter. The effect of the module parameter is easy
to verify through code inspection and runtime testing, but a kconfig flag
needs regular compile checking.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Acked-by: Jani Nikula <jani.nikula@linux.intel.com>
Acked-by: Daniel Vetter <daniel.vetter@ffwll.ch
Link: http://patchwork.freedesktop.org/patch/msgid/20161012090522.367-2-chris@chris-wilson.co.uk
2016-10-12 12:00:32 +01:00
Chris Wilson
0e70447605 drm/i915: Move common code out of i915_gpu_error.c
In the next patch, I want to conditionally compile i915_gpu_error.c and
that requires moving the functions used by debug out of
i915_gpu_error.c!

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/20161012090522.367-1-chris@chris-wilson.co.uk
2016-10-12 12:00:32 +01:00
Daniel Vetter
c0c8b9ed1b Merge tag 'drm-for-v4.9' into drm-intel-next-queued
It's been over two months, git definitely lost it's marbles. Conflicts
resolved by picking our version, plus manually checking the diff with
the parent in drm-intel-next-queued to make sure git didn't do
anything stupid. It did, so I removed 2 occasions where it
double-inserted a bit of code. The diff is now just
- kernel-doc changes
- drm format/name changes
- display-info changes
so looks all reasonable.

Signed-off-by: Daniel Vetter <daniel.vetter@intel.com>
2016-10-12 08:22:25 +02:00
Joonas Lahtinen
604db6509d drm/i915: Reduce trickery in DEV_INFO_FOR_EACH_FLAG
Get rid of SEP_SEMICOLON and SEP_BLANK in DEV_INFO_FOR_EACH_FLAG.
Consolidate the debug output so that instead of one huge line with
"cap1,cap2,capN" each capability is split to own line and displayed
as "capN: [yes|no]" to make the dumps more historically informative.

v2:
- Do not break auto-indent by keeping semicolon after macro (Jani)
- Consolidate and use yesno() in all locations (Chris)

Cc: Jani Nikula <jani.nikula@linux.intel.com>
Cc: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
Signed-off-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
2016-10-05 15:24:54 +03:00
Chris Wilson
33f53719d1 drm/i915: Show waiters in i915_hangcheck_info
It is convenient to know what processes are waiting when looking at
hangcheck status in debugfs.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Mika Kuoppala <mika.kuoppala@intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/20161004201132.21801-8-chris@chris-wilson.co.uk
2016-10-05 08:40:06 +01:00
Chris Wilson
1b36595ffb drm/i915: Show RING registers through debugfs
Knowing where the RINGs are pointing is extremely useful in diagnosing
if the engines are executing the ringbuffers you expect - and igt may be
suppressing the usual method of looking in the GPU error state.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Mika Kuoppala <mika.kuoppala@intel.com>
Reviewed-by: Mika Kuoppala <mika.kuoppala@intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/20161004201132.21801-7-chris@chris-wilson.co.uk
2016-10-05 08:40:06 +01:00