Commit Graph

1205 Commits

Author SHA1 Message Date
Chris Wilson
16e8745967 drm/i915/gt: Move the batch buffer pool from the engine to the gt
Since the introduction of 'soft-rc6', we aim to park the device quickly
and that results in frequent idling of the whole device. Currently upon
idling we free the batch buffer pool, and so this renders the cache
ineffective for many workloads. If we want to have an effective cache of
recently allocated buffers available for reuse, we need to decouple that
cache from the engine powermanagement and make it timer based. As there
is no reason then to keep it within the engine (where it once made
retirement order easier to track), we can move it up the hierarchy to the
owner of the memory allocations.

v2: Hook up to debugfs/drop_caches to clear the cache on demand.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Maarten Lankhorst <maarten.lankhorst@linux.intel.com>
Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20200430111819.10262-2-chris@chris-wilson.co.uk
2020-04-30 19:12:02 +01:00
Chris Wilson
de3b4d9361 drm/i915/gt: Restore aggressive post-boost downclocking
We reduced the clocks slowly after a boost event based on the
observation that the smoothness of animations suffered. However, since
reducing the evalution intervals, we should be able to respond to the
rapidly fluctuating workload of a simple desktop animation and so
restore the more aggressive downclocking.

References: 2a8862d2f3 ("drm/i915: Reduce the RPS shock")
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Andi Shyti <andi.shyti@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20200429205446.3259-6-chris@chris-wilson.co.uk
2020-04-30 00:57:38 +01:00
Chris Wilson
3f88dde6ee drm/i915/gt: Apply the aggressive downclocking to parking
We treat parking as a manual RPS timeout event, and downclock the GPU
for the next unpark and batch execution. However, having restored the
aggressive downclocking and observed that we have very light workloads
whose only interaction is through the manual parking events, carry over
the aggressive downclocking to the fake RPS events.

References: 21abf0bf16 ("drm/i915/gt: Treat idling as a RPS downclock event")
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Andi Shyti <andi.shyti@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20200429205446.3259-5-chris@chris-wilson.co.uk
2020-04-30 00:57:37 +01:00
Chris Wilson
36d516be86 drm/i915/gt: Switch to manual evaluation of RPS
As with the realisation for soft-rc6, we respond to idling the engines
within microseconds, far faster than the response times for HW RC6 and
RPS. Furthermore, our fast parking upon idle, prevents HW RPS from
running for many desktop workloads, as the RPS evaluation intervals are
on the order of tens of milliseconds, but the typical workload is just a
couple of milliseconds, but yet we still need to determine the best
frequency for user latency versus power.

Recognising that the HW evaluation intervals are a poor fit, and that
they were deprecated [in bspec at least] from gen10, start to wean
ourselves off them and replace the EI with a timer and our accurate
busy-stats. The principle benefit of manually evaluating RPS intervals
is that we can be more responsive for better performance and powersaving
for both spiky workloads and steady-state.

Closes: https://gitlab.freedesktop.org/drm/intel/-/issues/1698
Fixes: 98479ada42 ("drm/i915/gt: Treat idling as a RPS downclock event")
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Mika Kuoppala <mika.kuoppala@linux.intel.com>
Cc: Andi Shyti <andi.shyti@intel.com>
Reviewed-by: Andi Shyti <andi.shyti@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20200429205446.3259-4-chris@chris-wilson.co.uk
2020-04-30 00:57:37 +01:00
Chris Wilson
8e99299a04 drm/i915/gt: Track use of RPS interrupts in flags
Use the new intel_rps.flags field to store whether or not interrupts are
being used with RPS.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Andi Shyti <andi@etezian.org>
Link: https://patchwork.freedesktop.org/patch/msgid/20200429205446.3259-3-chris@chris-wilson.co.uk
2020-04-30 00:57:36 +01:00
Chris Wilson
9bad2adbdd drm/i915/gt: Move rps.enabled/active to flags
Pull the boolean intel_rps.enabled and intel_rps.active into a single
flags field, in preparation for more.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Andi Shyti <andi.shyti@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20200429205446.3259-2-chris@chris-wilson.co.uk
2020-04-30 00:57:35 +01:00
Chris Wilson
426d0073fb drm/i915/gt: Always enable busy-stats for execlists
In the near future, we will utilize the busy-stats on each engine to
approximate the C0 cycles of each, and use that as an input to a manual
RPS mechanism. That entails having busy-stats always enabled and so we
can remove the enable/disable routines and simplify the pmu setup. As a
consequence of always having the stats enabled, we can also show the
current active time via sysfs/engine/xcs/active_time_ns.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20200429205446.3259-1-chris@chris-wilson.co.uk
2020-04-30 00:57:34 +01:00
Chris Wilson
be1cb55a07 drm/i915/gt: Keep a no-frills swappable copy of the default context state
We need to keep the default context state around to instantiate new
contexts (aka golden rendercontext), and we also keep it pinned while
the engine is active so that we can quickly reset a hanging context.
However, the default contexts are large enough to merit keeping in
swappable memory as opposed to kernel memory, so we store them inside
shmemfs. Currently, we use the normal GEM objects to create the default
context image, but we can throw away all but the shmemfs file.

This greatly simplifies the tricky power management code which wants to
run underneath the normal GT locking, and we definitely do not want to
use any high level objects that may appear to recurse back into the GT.
Though perhaps the primary advantage of the complex GEM object is that
we aggressively cache the mapping, but here we are recreating the
vm_area everytime time we unpark. At the worst, we add a lightweight
cache, but first find a microbenchmark that is impacted.

Having started to create some utility functions to make working with
shmemfs objects easier, we can start putting them to wider use, where
GEM objects are overkill, such as storing persistent error state.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Matthew Auld <matthew.auld@intel.com>
Cc: Ramalingam C <ramalingam.c@intel.com>
Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Reviewed-by: Matthew Auld <matthew.auld@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20200429172429.6054-1-chris@chris-wilson.co.uk
2020-04-29 19:02:37 +01:00
Dan Carpenter
8c35a19576 drm/i915/selftests: fix error handling in __live_lrc_indirect_ctx_bb()
If intel_context_create() fails then it leads to an error pointer
dereference.  I shuffled things around to make error handling easier.

Fixes: 1dd47b54ba ("drm/i915: Add live selftests for indirect ctx batchbuffers")
Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com>
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Andi Shyti <andi.shyti@intel.com>
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Link: https://patchwork.freedesktop.org/patch/msgid/20200429132425.GE815283@mwanda
2020-04-29 15:16:35 +01:00
Nathan Chancellor
2ea4a7ba9b drm/i915/gt: Avoid uninitialized use of rpcurupei in frequency_show
When building with clang + -Wuninitialized:

drivers/gpu/drm/i915/gt/debugfs_gt_pm.c:407:7: warning: variable
'rpcurupei' is uninitialized when used here [-Wuninitialized]
                           rpcurupei,
                           ^~~~~~~~~
drivers/gpu/drm/i915/gt/debugfs_gt_pm.c:304:16: note: initialize the
variable 'rpcurupei' to silence this warning
                u32 rpcurupei, rpcurup, rpprevup;
                             ^
                              = 0
1 warning generated.

rpupei is assigned twice; based on the second argument to
intel_uncore_read, it seems this one should have been assigned to
rpcurupei.

Fixes: 9c878557b1 ("drm/i915/gt: Use the RPM config register to determine clk frequencies")
Link: https://github.com/ClangBuiltLinux/linux/issues/1016
Signed-off-by: Nathan Chancellor <natechancellor@gmail.com>
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Link: https://patchwork.freedesktop.org/patch/msgid/20200429030051.920203-1-natechancellor@gmail.com
2020-04-29 07:46:21 +01:00
Chris Wilson
f6a7c21c99 drm/i915/execlists: Verify we don't submit two identical CCIDs
Check that we do not submit two contexts into ELSP with the same CCID
[upper portion of the descriptor].

References: https://gitlab.freedesktop.org/drm/intel/-/issues/1793
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Mika Kuoppala <mika.kuoppala@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20200428184751.11257-3-chris@chris-wilson.co.uk
2020-04-28 22:17:36 +01:00
Chris Wilson
5c4a53e3b1 drm/i915/execlists: Track inflight CCID
The presumption is that by using a circular counter that is twice as
large as the maximum ELSP submission, we would never reuse the same CCID
for two inflight contexts.

However, if we continually preempt an active context such that it always
remains inflight, it can be resubmitted with an arbitrary number of
paired contexts. As each of its paired contexts will use a new CCID,
eventually it will wrap and submit two ELSP with the same CCID.

Rather than use a simple circular counter, switch over to a small bitmap
of inflight ids so we can avoid reusing one that is still potentially
active.

Closes: https://gitlab.freedesktop.org/drm/intel/-/issues/1796
Fixes: 2935ed5339 ("drm/i915: Remove logical HW ID")
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Mika Kuoppala <mika.kuoppala@linux.intel.com>
Cc: <stable@vger.kernel.org> # v5.5+
Reviewed-by: Mika Kuoppala <mika.kuoppala@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20200428184751.11257-2-chris@chris-wilson.co.uk
2020-04-28 22:17:36 +01:00
Chris Wilson
2632f174a2 drm/i915/execlists: Avoid reusing the same logical CCID
The bspec is confusing on the nature of the upper 32bits of the LRC
descriptor. Once upon a time, it said that it uses the upper 32b to
decide if it should perform a lite-restore, and so we must ensure that
each unique context submitted to HW is given a unique CCID [for the
duration of it being on the HW]. Currently, this is achieved by using
a small circular tag, and assigning every context submitted to HW a
new id. However, this tag is being cleared on repinning an inflight
context such that we end up re-using the 0 tag for multiple contexts.

To avoid accidentally clearing the CCID in the upper 32bits of the LRC
descriptor, split the descriptor into two dwords so we can update the
GGTT address separately from the CCID.

Closes: https://gitlab.freedesktop.org/drm/intel/-/issues/1796
Fixes: 2935ed5339 ("drm/i915: Remove logical HW ID")
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Mika Kuoppala <mika.kuoppala@linux.intel.com>
Cc: <stable@vger.kernel.org> # v5.5+
Reviewed-by: Mika Kuoppala <mika.kuoppala@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20200428184751.11257-1-chris@chris-wilson.co.uk
2020-04-28 22:17:36 +01:00
Chris Wilson
96a4faf524 drm/i915/selftests: Tweak the tolerance for clock ticks to 12.5%
Give a small bump for our tolerance on comparing the expected vs
measured clock ticks/time from 10% to 12.5% to accommodate a bad result
on Sandybridge that was off by 10.3%. Hopefully, that is the worst we
will see.

Closes: https://gitlab.freedesktop.org/drm/intel/-/issues/1802
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Mika Kuoppala <mika.kuoppala@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20200428114307.5153-1-chris@chris-wilson.co.uk
2020-04-28 14:25:21 +01:00
Colin Ian King
d631461d5c drm/i915/gt: fix spelling mistake "evalution" -> "evaluation"
There is a spelling mistaking in a pr_notice message. Fix it.

Signed-off-by: Colin Ian King <colin.king@canonical.com>
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Link: https://patchwork.freedesktop.org/patch/msgid/20200428084920.1035125-1-colin.king@canonical.com
2020-04-28 09:53:59 +01:00
Chris Wilson
6dc0d028f5 drm/i915/gt: Fix up clock frequency
The bspec lists both the clock frequency and the effective interval. The
interval corresponds to observed behaviour, so adjust the frequency to
match.

v2: Mika rightfully asked if we could measure the clock frequency from a
selftest.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Mika Kuoppala <mika.kuoppala@linux.intel.com>
Acked-by: Mika Kuoppala <mika.kuoppala@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20200427154554.12736-1-chris@chris-wilson.co.uk
2020-04-27 17:34:33 +01:00
Chris Wilson
4243cd5388 drm/i915/gt: Sanitize GT first
We see that if the HW doesn't actually sleep, the HW may eat the poison
we set in its write-only HWSP during sanitize:

  intel_gt_resume.part.8: 0000:00:02.0
  __gt_unpark: 0000:00:02.0
  gt_sanitize: 0000:00:02.0 force:yes
  process_csb: 0000:00:02.0 vcs0: cs-irq head=5, tail=90
  process_csb: 0000:00:02.0 vcs0: csb[0]: status=0x5a5a5a5a:0x5a5a5a5a
  assert_pending_valid: Nothing pending for promotion!

The CS TAIL pointer should have been reset by reset_csb_pointers(), so
in this case it is likely that we have read back from the CPU cache and
so we must clflush our control over that page. In doing so, push the
sanitisation to the start of the GT sequence so that our poisoning is
assuredly before we start talking to the HW.

References: https://gitlab.freedesktop.org/drm/intel/-/issues/1794
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20200427084000.10999-1-chris@chris-wilson.co.uk
2020-04-27 11:39:23 +01:00
Chris Wilson
2759e39535 drm/i915/gt: Check cacheline is valid before acquiring
The hwsp_cacheline pointer from i915_request is very, very flimsy. The
i915_request.timeline (and the hwsp_cacheline) are lost upon retiring
(after an RCU grace). Therefore we need to confirm that once we have the
right pointer for the cacheline, it is not in the process of being
retired and disposed of before we attempt to acquire a reference to the
cacheline.

<3>[  547.208237] BUG: KASAN: use-after-free in active_debug_hint+0x6a/0x70 [i915]
<3>[  547.208366] Read of size 8 at addr ffff88822a0d2710 by task gem_exec_parall/2536

<4>[  547.208547] CPU: 3 PID: 2536 Comm: gem_exec_parall Tainted: G     U            5.7.0-rc2-ged7a286b5d02d-kasan_117+ #1
<4>[  547.208556] Hardware name: Dell Inc. XPS 13 9350/, BIOS 1.4.12 11/30/2016
<4>[  547.208564] Call Trace:
<4>[  547.208579]  dump_stack+0x96/0xdb
<4>[  547.208707]  ? active_debug_hint+0x6a/0x70 [i915]
<4>[  547.208719]  print_address_description.constprop.6+0x16/0x310
<4>[  547.208841]  ? active_debug_hint+0x6a/0x70 [i915]
<4>[  547.208963]  ? active_debug_hint+0x6a/0x70 [i915]
<4>[  547.208975]  __kasan_report+0x137/0x190
<4>[  547.209106]  ? active_debug_hint+0x6a/0x70 [i915]
<4>[  547.209127]  kasan_report+0x32/0x50
<4>[  547.209257]  ? i915_gemfs_fini+0x40/0x40 [i915]
<4>[  547.209376]  active_debug_hint+0x6a/0x70 [i915]
<4>[  547.209389]  debug_print_object+0xa7/0x220
<4>[  547.209405]  ? lockdep_hardirqs_on+0x348/0x5f0
<4>[  547.209426]  debug_object_assert_init+0x297/0x430
<4>[  547.209449]  ? debug_object_free+0x360/0x360
<4>[  547.209472]  ? lock_acquire+0x1ac/0x8a0
<4>[  547.209592]  ? intel_timeline_read_hwsp+0x4f/0x840 [i915]
<4>[  547.209737]  ? i915_active_acquire_if_busy+0x66/0x120 [i915]
<4>[  547.209861]  i915_active_acquire_if_busy+0x66/0x120 [i915]
<4>[  547.209990]  ? __live_alloc.isra.15+0xc0/0xc0 [i915]
<4>[  547.210005]  ? rcu_read_lock_sched_held+0xd0/0xd0
<4>[  547.210017]  ? print_usage_bug+0x580/0x580
<4>[  547.210153]  intel_timeline_read_hwsp+0xbc/0x840 [i915]
<4>[  547.210284]  __emit_semaphore_wait+0xd5/0x480 [i915]
<4>[  547.210415]  ? i915_fence_get_timeline_name+0x110/0x110 [i915]
<4>[  547.210428]  ? lockdep_hardirqs_on+0x348/0x5f0
<4>[  547.210442]  ? _raw_spin_unlock_irq+0x2a/0x40
<4>[  547.210567]  ? __await_execution.constprop.51+0x2e0/0x570 [i915]
<4>[  547.210706]  i915_request_await_dma_fence+0x8f7/0xc70 [i915]

Fixes: 85bedbf191 ("drm/i915/gt: Eliminate the trylock for reading a timeline's hwsp")
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Cc: <stable@vger.kernel.org> # v5.6+
Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20200427093038.29219-1-chris@chris-wilson.co.uk
2020-04-27 11:39:23 +01:00
Chris Wilson
68ace460c5 drm/i915/execlists: Check preempt-timeout target before submit_ports
We evaluate *active, which is a pointer into execlists->inflight[]
during dequeue to decide how long a preempt-timeout we need to apply.
However, as soon as we do the submit_ports, the HW may send its ACK
interrupt causing us to promote execlists->pending[] tp
execlists->inflight[], overwriting the value of *active. We know *active
is only stable until we submit (as we only submit when there is no
pending promotion).

[   16.102328] BUG: KCSAN: data-race in execlists_dequeue+0x1449/0x1600 [i915]
[   16.102356]
[   16.102375] race at unknown origin, with read to 0xffff8881e9500488 of 8 bytes by task 429 on cpu 1:
[   16.102780]  execlists_dequeue+0x1449/0x1600 [i915]
[   16.103160]  __execlists_submission_tasklet+0x48/0x60 [i915]
[   16.103540]  execlists_submit_request+0x38e/0x3c0 [i915]
[   16.103940]  submit_notify+0x8f/0xc0 [i915]
[   16.104308]  __i915_sw_fence_complete+0x61/0x420 [i915]
[   16.104683]  i915_sw_fence_complete+0x58/0x80 [i915]
[   16.105054]  i915_sw_fence_commit+0x16/0x20 [i915]
[   16.105457]  __i915_request_queue+0x60/0x70 [i915]
[   16.105843]  i915_gem_do_execbuffer+0x2d6b/0x4230 [i915]
[   16.106227]  i915_gem_execbuffer2_ioctl+0x2b0/0x580 [i915]
[   16.106257]  drm_ioctl_kernel+0xe9/0x130
[   16.106279]  drm_ioctl+0x27d/0x45e
[   16.106311]  ksys_ioctl+0x89/0xb0
[   16.106336]  __x64_sys_ioctl+0x42/0x60
[   16.106370]  do_syscall_64+0x6e/0x2c0
[   16.106397]  entry_SYSCALL_64_after_hwframe+0x44/0xa9

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20200426094231.21995-1-chris@chris-wilson.co.uk
2020-04-27 11:36:59 +01:00
Mika Kuoppala
b8a1181122 drm/i915: Use indirect ctx bb to mend CMD_BUF_CCTL
Use indirect ctx bb to load cmd buffer control value
from context image to avoid corruption.

v2: add to lrc layout (Chris)
v3: end to a cacheline (Chris)
v4: add to lrc fixed (Chris)
v5: value in offset+1

Testcase: igt/i915_selftest/gt_lrc
Signed-off-by: Mika Kuoppala <mika.kuoppala@linux.intel.com>
Acked-by: Chris Wilson <chris@chris-wilson.co.uk>
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Link: https://patchwork.freedesktop.org/patch/msgid/20200424230632.30333-1-mika.kuoppala@linux.intel.com
2020-04-25 19:08:56 +01:00
Mika Kuoppala
1dd47b54ba drm/i915: Add live selftests for indirect ctx batchbuffers
Indirect ctx batchbuffers are a hw feature of which
batch can be run, by hardware, during context restoration stage.
Driver can setup a batchbuffer and also an offset into the
context image. When context image is marshalled from
memory to registers, and when the offset from the start of
context register state is equal of what driver pre-determined,
batch will run. So one can manipulate context restoration
process at cacheline granularity, given some limitations,
as you need to have rudimentaries in place before you can
run a batch.

Add selftest which will write the ring start register
to a canary spot. This will test that hardware will run a
batchbuffer for the context in question.

v2: request wait fix, naming (Chris)
v3: test order (Chris)
v4: rebase

Signed-off-by: Mika Kuoppala <mika.kuoppala@linux.intel.com>
Acked-by: Chris Wilson <chris@chris-wilson.co.uk>
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Link: https://patchwork.freedesktop.org/patch/msgid/20200424214841.28076-3-mika.kuoppala@linux.intel.com
2020-04-25 19:08:18 +01:00
Mika Kuoppala
685d21096f drm/i915: Add per ctx batchbuffer wa for timestamp
Restoration of a previous timestamp can collide
with updating the timestamp, causing a value corruption.

Combat this issue by using indirect ctx bb to
modify the context image during restoring process.

We can preload value into scratch register. From which
we then do the actual write with LRR. LRR is faster and
thus less error prone as probability of race drops.

v2: tidying (Chris)
v3: lrr for all engines
v4: grp
v5: reg bit
v6: wa_bb_offset, virtual engines (Chris)

References: HSDES#16010904313
Testcase: igt/i915_selftest/gt_lrc
Suggested-by: Joseph Koston <joseph.koston@intel.com>
Cc: Chris Wilson <chris@chris-wilson.co.uk>
Signed-off-by: Mika Kuoppala <mika.kuoppala@linux.intel.com>
Acked-by: Chris Wilson <chris@chris-wilson.co.uk>
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Link: https://patchwork.freedesktop.org/patch/msgid/20200424230546.30271-1-mika.kuoppala@linux.intel.com
2020-04-25 18:39:32 +01:00
Mika Kuoppala
168c6d231b drm/i915: Add engine scratch register to live_lrc_fixed
General purpose registers are per engine and
in a fixed location. Add to live_lrc_fixed.

Signed-off-by: Mika Kuoppala <mika.kuoppala@linux.intel.com>
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Link: https://patchwork.freedesktop.org/patch/msgid/20200424214841.28076-1-mika.kuoppala@linux.intel.com
2020-04-25 17:58:33 +01:00
Chris Wilson
9c878557b1 drm/i915/gt: Use the RPM config register to determine clk frequencies
For many configuration details within RC6 and RPS we are programming
intervals for the internal clocks. From gen11, these clocks are
configuration via the RPM_CONFIG and so for convenience, we would like
to convert to/from more natural units (ns).

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Andi Shyti <andi.shyti@intel.com>
Cc: Mika Kuoppala <mika.kuoppala@linux.intel.com>
Reviewed-by: Andi Shyti <andi.shyti@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20200424162805.25920-2-chris@chris-wilson.co.uk
2020-04-24 19:10:17 +01:00
Chris Wilson
555a322429 drm/i915/gt: Trace RPS events
Add tracek to the RPS events (interrupts, worker, enabling, threshold
selection, frequency setting), so that if we have to debug reticent HW
we have some traces to start from.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Andi Shyti <andi.shyti@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20200424162805.25920-1-chris@chris-wilson.co.uk
2020-04-24 18:38:46 +01:00
Chris Wilson
1ebf7aaf3a drm/i915/gt: Prefer soft-rc6 over RPS DOWN_TIMEOUT
The RPS DOWN_TIMEOUT interrupt is signaled after a period of rc6, and
upon receipt of that interrupt we reprogram the GPU clocks down to the
next idle notch [to help convserve power during rc6]. However, on
execlists, we benefit from soft-rc6 immediately parking the GPU and
setting idle frequencies upon idling [within a jiffie], and here the
interrupt prevents us from restarting from our last frequency.

In the process, we can simply opt for a static pm_events mask and rely
on the enable/disable interrupts to flush the worker on parking.

This will reduce the amount of oscillation observed during steady
workloads with microsleeps, as each time the rc6 timeout occurs we
immediately follow with a waitboost for a dropped frame.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Andi Shyti <andi.shyti@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20200422001703.1697-1-chris@chris-wilson.co.uk
2020-04-24 17:20:58 +01:00
Chris Wilson
50689771c8 drm/i915: Only close vma we open
The history of i915_vma_close() is confusing, as is its use. As the
lifetime of the i915_vma is currently bounded by the object it is
attached to, we needed a means of identify when a vma was no longer in
use by userspace (via the user's fd). This is further complicated by
that only ppgtt vma should be closed at the user's behest, as the ggtt
were always shared.

Now that we attach the vma to a lut on the user's context, the open
count does indicate how many unique and open context/vm are referencing
this vma from the user. As such, we can and should just use the
open_count to track when the vma is still in use by userspace.

It's a poor man's replacement for reference counting.

Closes: https://gitlab.freedesktop.org/drm/intel/issues/1193
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Andi Shyti <andi.shyti@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20200422190558.30509-1-chris@chris-wilson.co.uk
2020-04-24 11:24:45 +01:00
Mika Kuoppala
b4892e4404 drm/i915: Make define for lrc state offset
More often than not, we need a byte offset into lrc
register state from the start of the hw state. Make it so.

Signed-off-by: Mika Kuoppala <mika.kuoppala@linux.intel.com>
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Link: https://patchwork.freedesktop.org/patch/msgid/20200423182355.21837-3-mika.kuoppala@linux.intel.com
2020-04-24 00:52:14 +01:00
Mika Kuoppala
f1cc6acf22 drm/i915/selftests: Add context batchbuffers registers to live_lrc_fixed
Add per ctx bb and indirect ctx bb register locations to live_lrc_fixed
for verification.

Signed-off-by: Mika Kuoppala <mika.kuoppala@linux.intel.com>
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Link: https://patchwork.freedesktop.org/patch/msgid/20200423224159.22078-1-chris@chris-wilson.co.uk
2020-04-24 00:36:13 +01:00
Chris Wilson
b97f77baa8 drm/i915/gt: Check carefully for an idle engine in wait-for-idle
intel_gt_wait_for_idle() tries to wait until all the outstanding requests
are retired and the GPU is idle. As a side effect of retiring requests,
we may submit more work to flush any pm barriers, and so the
wait-for-idle tries to flush the background pm work and catch the new
requests. However, if the work completed in the background before we
were able to flush, it would queue the extra barrier request without us
noticing -- and so we would return from wait-for-idle with one request
remaining. (This breaks e.g. record_default_state where we need to wait
until that barrier is retired, and it may slow suspend down by causing
us to wait on the background retirement worker as opposed to immediately
retiring the barrier.)

However, since we track if there has been a submission since the engine
pm barrier, we can very quickly detect if the idle barrier is still
outstanding.

Closes: https://gitlab.freedesktop.org/drm/intel/-/issues/1763
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Matthew Auld <matthew.auld@intel.com>
Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20200423085940.28168-1-chris@chris-wilson.co.uk
2020-04-23 16:16:32 +01:00
Chris Wilson
36fe164d8d drm/i915/gt: Carefully order virtual_submission_tasklet
During the virtual engine's submission tasklet, we take the request and
insert into the submission queue on each of our siblings. This seems
quite simply, and so no problems with ordering. However, the sibling
execlists' submission tasklets may run concurrently with the virtual
engine's tasklet, submitting the request to HW before the virtual
finishes its task of telling all the siblings. If this happens, the
sibling tasklet may *reorder* the ve->sibling[] array that the virtual
engine tasklet is processing. This can *only* reorder within the
elements already processed by the virtual engine, nevertheless the
race is detected by KCSAN:

[  185.580014] BUG: KCSAN: data-race in execlists_dequeue [i915] / virtual_submission_tasklet [i915]
[  185.580054]
[  185.580076] write to 0xffff8881f1919860 of 8 bytes by interrupt on cpu 2:
[  185.580553]  execlists_dequeue+0x6ad/0x1600 [i915]
[  185.581044]  __execlists_submission_tasklet+0x48/0x60 [i915]
[  185.581517]  execlists_submission_tasklet+0xd3/0x170 [i915]
[  185.581554]  tasklet_action_common.isra.0+0x42/0x90
[  185.581585]  __do_softirq+0xc8/0x206
[  185.581613]  run_ksoftirqd+0x15/0x20
[  185.581641]  smpboot_thread_fn+0x15a/0x270
[  185.581669]  kthread+0x19a/0x1e0
[  185.581695]  ret_from_fork+0x1f/0x30
[  185.581717]
[  185.581736] read to 0xffff8881f1919860 of 8 bytes by interrupt on cpu 0:
[  185.582231]  virtual_submission_tasklet+0x10e/0x5c0 [i915]
[  185.582265]  tasklet_action_common.isra.0+0x42/0x90
[  185.582291]  __do_softirq+0xc8/0x206
[  185.582315]  run_ksoftirqd+0x15/0x20
[  185.582340]  smpboot_thread_fn+0x15a/0x270
[  185.582368]  kthread+0x19a/0x1e0
[  185.582395]  ret_from_fork+0x1f/0x30
[  185.582417]

We can prevent this race by checking for the ve->request after looking
up the sibling array.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20200423115315.26825-1-chris@chris-wilson.co.uk
2020-04-23 16:14:27 +01:00
Chris Wilson
15501287b1 drm/i915/execlists: Drop request-before-CS assertion
When we migrated to execlists, one of the conditions we wanted to test
for was whether the breadcrumb seqno was being written before the
breadcumb interrupt was delivered. This was following on from issues
observed on previous generations which were not so strongly ordered. With
the removal of the missed interrupt detection, we have not reliable
means of detecting the out-of-order seqno/interrupt but instead tried to
assert that the relationship between the CS event interrupt and the
breadwrite should be strongly ordered. However, Icelake proves it is
possible for the HW implementation to forget about minor little details
such as write ordering and so the order between *processing* the CS
event and the breadcrumb is unreliable.

Remove the unreliable assertion, but leave a debug telltale in case we
have reason to suspect.

Closes: https://gitlab.freedesktop.org/drm/intel/-/issues/1658
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Mika Kuoppala <mika.kuoppala@linux.intel.com>
Reviewed-by: Mika Kuoppala <mika.kuoppala@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20200422141749.28709-1-chris@chris-wilson.co.uk
2020-04-22 17:17:50 +01:00
Chris Wilson
c92724de6d drm/i915/selftests: Try to detect rollback during batchbuffer preemption
Since batch buffers dominant execution time, most preemption requests
should naturally occur during execution of a batch buffer. We wish to
verify that should a preemption occur within a batch buffer, when we
come to restart that batch buffer, it occurs at the interrupted
instruction and most importantly does not rollback to an earlier point.

v2: Do not clear the GPR at the start of the batch, but rely on them
being clear for new contexts.

Suggested-by: Mika Kuoppala <mika.kuoppala@linux.intel.com>
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Mika Kuoppala <mika.kuoppala@linux.intel.com>
Reviewed-by: Mika Kuoppala <mika.kuoppala@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20200422100903.25216-1-chris@chris-wilson.co.uk
2020-04-22 15:42:52 +01:00
Chris Wilson
cbb6f8805a drm/i915/selftests: Disable heartbeat around RPS interrupt testing
For verifying reciving the EI interrupts, we need to hold the GPU in
very precise conditions (in terms of C0 cycles during the EI). If we
preempt the busy load to handle the heartbeat, this may perturb the busy
load causing us to miss the interrupt.

The other tests, while not as time sensitive, may also be slightly
perturbed, so apply the heartbeat protection across all the
measurements.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Mika Kuoppala <mika.kuoppala@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20200422083855.26842-1-chris@chris-wilson.co.uk
2020-04-22 10:59:46 +01:00
Chris Wilson
33883310cd drm/i915/selftests: Unroll the CS frequency loop
Having noticed that MI_BB_START is incurring a memory stall (see the
correlation with uncore frequency), we have to unroll the loop in order
to diminish the impact of the MI_BB_START on the instruction throughput.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Mika Kuoppala <mika.kuoppala@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20200421171351.19575-1-chris@chris-wilson.co.uk
2020-04-21 20:48:45 +01:00
Chris Wilson
bd3ec9e758 drm/i915/gt: Poison residual state [HWSP] across resume.
Since we may lose the content of any buffer when we relinquish control
of the system (e.g. suspend/resume), we have to be careful not to rely
on regaining control. A good method to detect when we might be using
garbage is by always injecting that garbage prior to first use on
load/resume/etc.

v2: Drop sanitize callback on cleanup
v3: Move seqno reset to timeline enter, so we reset all timelines.
However, this is done on every activation during runtime and not reset.
The similar level of paranoia we apply to correcting context state after
a period of inactivity.

Suggested-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Cc: Venkata Ramana Nayana <venkata.ramana.nayana@intel.com>
Cc: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com>
Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20200421092504.7416-1-chris@chris-wilson.co.uk
2020-04-21 16:27:39 +01:00
Chris Wilson
cf9ba27840 drm/i915/selftests: Disable C-states when measuring RPS frequency response
Let's isolate the impact of cpu frequency selection on determing the GPU
throughput in response to selection of RPS frequencies.

For real systems, we do have to be concerned with the impact of
integrating c-states, p-states and rp-states, but for the sake of
proving whether or not RPS works, one baby step at a time.

For the record, as one would hope, it does not seem to impact on the
measured performance, but we do it anyway to reduce the number of
variables. Later, we can extend the testing to encourage the the
cpu/pkg to try and sleep while the GPU is busy.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Mika Kuoppala <mika.kuoppala@linux.intel.com>
Reviewed-by: Mika Kuoppala <mika.kuoppala@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20200421142236.8614-1-chris@chris-wilson.co.uk
Link: https://patchwork.freedesktop.org/patch/msgid/20200421142236.8614-1-chris@chris-wilson.co.uk
2020-04-21 16:24:34 +01:00
Chris Wilson
4ea6b1c456 drm/i915/selftests: Show the full scaling curve on failure
If we detect that the RPS end points do not scale perfectly, take the
time to measure all the in between values as well. We are aborting the
test, so we might as well spend the available time gathering critical
debug information instead.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Mika Kuoppala <mika.kuoppala@linux.intel.com>
Reviewed-by: Mika Kuoppala <mika.kuoppala@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20200421124636.22554-1-chris@chris-wilson.co.uk
2020-04-21 16:24:34 +01:00
Chris Wilson
74f103928d drm/i915/selftests: Show the pstate limits on any failure to reset min
We want to see the pstate limits whenever we fail to set the minimum
frequency as that may help for debugging.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Mika Kuoppala <mika.kuoppala@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20200420203040.8984-1-chris@chris-wilson.co.uk
2020-04-21 09:39:35 +01:00
Chris Wilson
e42a969e72 drm/i915/selftests: Exercise dynamic reclocking with RPS
After having testing all the RPS controls individually, we need to take
a step back and check how our RPS worker integrates them to perform
dynamic GPU reclocking. So do that by submitting a spinner and wait and
see what happens.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Mika Kuoppala <mika.kuoppala@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20200420172739.11620-6-chris@chris-wilson.co.uk
2020-04-20 20:08:07 +01:00
Chris Wilson
6b36fc9442 drm/i915/selftests: Show the pcode frequency table on error
If we encounter an error while scaling, read back the frequency tables
from the pcu.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Mika Kuoppala <mika.kuoppala@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20200420172739.11620-5-chris@chris-wilson.co.uk
2020-04-20 20:08:07 +01:00
Chris Wilson
0eaccc4b18 drm/i915/selftests: Split RPS frequency measurement
Split the frequency measurement into two modes, so that we can judge the
impact of the llc setup on top of the pure CS frequency scaling.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Mika Kuoppala <mika.kuoppala@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20200420172739.11620-4-chris@chris-wilson.co.uk
2020-04-20 20:08:07 +01:00
Chris Wilson
9938ee2e63 drm/i915/selftests: Check RPS controls
Check that the GPU does respond to our RPS frequency requests by setting
our desired frequency.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Mika Kuoppala <mika.kuoppala@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20200420172739.11620-3-chris@chris-wilson.co.uk
2020-04-20 20:08:06 +01:00
Chris Wilson
a740f5c5f6 drm/i915/selftests: Skip energy consumption tests if not controlling freq
If we can not manipulate the frequency with RPS, then comparing min/max
power consumption is pointless / misleading. We will leave the warning
about not being able to control the frequency selection via RPS to other
tests so as not to confuse this more specialised check.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Mika Kuoppala <mika.kuoppala@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20200420172739.11620-2-chris@chris-wilson.co.uk
2020-04-20 20:08:06 +01:00
Chris Wilson
4ba74e53ad drm/i915/selftests: Verify frequency scaling with RPS
One of the core tenents of reclocking the GPU is that its throughput
scales with the clock frequency. We can observe this by incrementing a
loop counter on the GPU, and compare the different execution rates at
the notional RPS frequencies.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Mika Kuoppala <mika.kuoppala@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20200420172739.11620-1-chris@chris-wilson.co.uk
2020-04-20 20:08:06 +01:00
Chris Wilson
f153f6395a drm/i915/gt: Move the late flush_submission in retire to the end
Avoid flushing the submission queue (of others) under the client's
timeline lock, but instead move it to the end so that we may catch more.

References: https://gitlab.freedesktop.org/drm/intel/-/issues/1066
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Matthew Auld <matthew.auld@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20200420125356.26614-2-chris@chris-wilson.co.uk
2020-04-20 16:56:23 +01:00
Michael J. Ruhl
31a02eb70b drm/i915: Refactor setting dma info to a common helper
DMA_MASK bit values are different for different generations.

This will become more difficult to manage over time with the open
coded usage of different versions of the device.

Fix by:
  disallow setting of dma mask in AGP path (< GEN(5) for i915,
  add dma_mask_size to the device info configuration,
  updating open code call sequence to the latest interface,
  refactoring into a common function for setting the dma segment
  and mask info

Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
Signed-off-by: Michael J. Ruhl <michael.j.ruhl@intel.com>
cc: Brian Welty <brian.welty@intel.com>
cc: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com>
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Link: https://patchwork.freedesktop.org/patch/msgid/20200417195107.68732-1-michael.j.ruhl@intel.com
2020-04-18 07:49:11 +01:00
Chris Wilson
c43dd6b414 drm/i915/selftests: Check power consumption at min/max frequencies
A basic premise of RPS is that at lower frequencies, not only do we run
slower, but we save power compared to higher frequencies. For example,
when idle, we set the minimum frequency just in case there is some
residual current. Since the power curve should be a physical
relationship, if we find no power saving it's likely that we've broken
our frequency handling, so test!

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Mika Kuoppala <mika.kuoppala@linux.intel.com>
Cc: Andi Shyti <andi.shyti@intel.com>
Reviewed-by: Andi Shyti <andi.shyti@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20200417152018.13079-2-chris@chris-wilson.co.uk
2020-04-17 18:48:51 +01:00
Chris Wilson
d4e3d455a1 drm/i915/selftests: Move gpu energy measurement into its own little lib
Move the handy utility to measure the GPU energy consumption using RAPL
msr into a common lib so that it can be reused easily.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Andi Shyti <andi.shyti@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20200417152018.13079-1-chris@chris-wilson.co.uk
2020-04-17 18:48:51 +01:00
Chris Wilson
a50717dbf4 drm/i915/selftests: Take the engine wakeref around __rps_up_interrupt
Since we are touching the device to read the registers, we are required
to ensure the device is awake at the time. Currently, we believe
ourselves to be inside the active request [thus an active engine
wakeref], but since that may be retired in the background, we can
spontaneously lose the wakeref and the ability to probe the HW.

<4> [379.686703] RPM wakelock ref not held during HW access
<4> [379.686805] WARNING: CPU: 7 PID: 4869 at ./drivers/gpu/drm/i915/intel_runtime_pm.h:115 gen12_fwtable_read32+0x233/0x300 [i915]
<4> [379.686808] Modules linked in: i915(+) vgem snd_hda_codec_hdmi mei_hdcp x86_pkg_temp_thermal coretemp crct10dif_pclmul crc32_pclmul ax88179_178a usbnet mii ghash_clmulni_intel snd_intel_dspcfg snd_hda_codec snd_hwdep snd_hda_core snd_pcm e1000e mei_me ptp mei pps_core intel_lpss_pci prime_numbers [last unloaded: i915]
<4> [379.686827] CPU: 7 PID: 4869 Comm: i915_selftest Tainted: G     U            5.7.0-rc1-CI-CI_DRM_8313+ #1
<4> [379.686830] Hardware name: Intel Corporation Tiger Lake Client Platform/TigerLake U DDR4 SODIMM RVP, BIOS TGLSFWI1.R00.2457.A13.1912190237 12/19/2019
<4> [379.686883] RIP: 0010:gen12_fwtable_read32+0x233/0x300 [i915]
<4> [379.686887] Code: d8 ea e0 0f 0b e9 19 fe ff ff 80 3d ad 12 2d 00 00 0f 85 17 fe ff ff 48 c7 c7 b0 32 3e a0 c6 05 99 12 2d 00 01 e8 2d d8 ea e0 <0f> 0b e9 fd fd ff ff 8b 05 c4 75 56 e2 85 c0 0f 85 84 00 00 00 48
<4> [379.686889] RSP: 0018:ffffc90000727970 EFLAGS: 00010286
<4> [379.686892] RAX: 0000000000000000 RBX: ffff88848cc20ee8 RCX: 0000000000000001
<4> [379.686894] RDX: 0000000080000001 RSI: ffff88843b1f0900 RDI: 00000000ffffffff
<4> [379.686896] RBP: 0000000000000000 R08: ffff88843b1f0900 R09: 0000000000000000
<4> [379.686898] R10: 0000000000000000 R11: 0000000000000000 R12: 000000000000a058
<4> [379.686900] R13: 0000000000000001 R14: ffff88848cc2bf30 R15: 00000000ffffffea
<4> [379.686902] FS:  00007f7d63f5e300(0000) GS:ffff8884a0180000(0000) knlGS:0000000000000000
<4> [379.686904] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
<4> [379.686907] CR2: 000055e5c30f4988 CR3: 000000042e190002 CR4: 0000000000760ee0
<4> [379.686910] PKRU: 55555554
<4> [379.686911] Call Trace:
<4> [379.686986]  live_rps_interrupt+0xb14/0xc10 [i915]
<4> [379.687051]  ? intel_rps_unpark+0xb0/0xb0 [i915]
<4> [379.687057]  ? __trace_bprintk+0x57/0x80
<4> [379.687143]  __i915_subtests+0xb8/0x210 [i915]
<4> [379.687222]  ? __i915_live_teardown+0x50/0x50 [i915]
<4> [379.687291]  ? __intel_gt_live_setup+0x30/0x30 [i915]
<4> [379.687361]  __run_selftests+0x112/0x170 [i915]
<4> [379.687431]  i915_live_selftests+0x2c/0x60 [i915]
<4> [379.687491]  i915_pci_probe+0x93/0x1b0 [i915]

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Andi Shyti <andi.shyti@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20200417093928.17822-2-chris@chris-wilson.co.uk
2020-04-17 14:47:30 +01:00
Chris Wilson
9d7e560f43 drm/i915/selftests: Delay spinner before waiting for an interrupt
It seems that although (perhaps because of the memory stall?) the
spinner has signaled that it has started, it still takes some time to
spin up to 100% utilisation of the HW. Since the test depends on the
full utilisation of the HW to trigger the RPS interrupt, wait a little
bit and flush the interrupt status to be sure that the event we see if
from the spinner.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Andi Shyti <andi.shyti@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20200417093928.17822-1-chris@chris-wilson.co.uk
2020-04-17 14:47:15 +01:00
Chris Wilson
23122a4d99 drm/i915/gt: Scrub execlists state on resume
Before we resume, we reset the HW so we restart from a known good state.
However, as a part of the reset process, we drain our pending CS event
queue -- and if we are resuming that does not correspond to internal
state. On setup, we are scrubbing the CS pointers, but alas only on
setup.

Apply the sanitization not just to setup, but to all resumes.

Reported-by: Venkata Ramana Nayana <venkata.ramana.nayana@intel.com>
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Venkata Ramana Nayana <venkata.ramana.nayana@intel.com>
Cc: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com>
Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Cc: Mika Kuoppala <mika.kuoppala@linux.intel.com>
Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20200416114117.3460-1-chris@chris-wilson.co.uk
2020-04-17 13:56:00 +01:00
Joonas Lahtinen
2b703bbda2 Merge drm/drm-next into drm-intel-next-queued
Backmerging in order to pull "topic/phy-compliance".

Signed-off-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
2020-04-16 14:35:16 +03:00
Chris Wilson
a080bd994c drm/i915/gt: Update PMINTRMSK holding fw
If we use a non-forcewaked write to PMINTRMSK, it does not take effect
until much later, if at all, causing a loss of RPS interrupts and no GPU
reclocking, leaving the GPU running at the wrong frequency for long
periods of time.

Reported-by: Francisco Jerez <currojerez@riseup.net>
Suggested-by: Francisco Jerez <currojerez@riseup.net>
Fixes: 35cc7f32c2 ("drm/i915/gt: Use non-forcewake writes for RPS")
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Francisco Jerez <currojerez@riseup.net>
Cc: Mika Kuoppala <mika.kuoppala@linux.intel.com>
Cc: Andi Shyti <andi.shyti@intel.com>
Reviewed-by: Mika Kuoppala <mika.kuoppala@linux.intel.com>
Reviewed-by: Andi Shyti <andi.shyti@intel.com>
Reviewed-by: Francisco Jerez <currojerez@riseup.net>
Cc: <stable@vger.kernel.org> # v5.6+
Link: https://patchwork.freedesktop.org/patch/msgid/20200415170318.16771-2-chris@chris-wilson.co.uk
2020-04-16 00:21:31 +01:00
Chris Wilson
46495adc6c drm/i915/selftests: Exercise basic RPS interrupt generation
Since we depend upon RPS generating interrupts after evaluation
intervals to determine when to up/down clock the GPU, it is imperative
that we successfully enable interrupt generation! Verify that we do see
an interrupt if we keep the GPU busy for an entire EI.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Andi Shyti <andi.shyti@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20200415170318.16771-1-chris@chris-wilson.co.uk
2020-04-16 00:21:18 +01:00
Matt Roper
2a040f0d08 drm/i915/tgl: Initialize multicast register steering for workarounds
Even though the bspec is missing gen12 register details for the MCR
selector register (0xFDC), this is confirmed by hardware folks to be a
mistake; the register does exist and we do indeed need to steer
multicast register reads to an appropriate instance the same as we did
on gen11.

Note that despite the lack of documentation we were still using the MCR
selector to read INSTDONE and such in read_subslice_reg() too.

Cc: Matt Atwood <matthew.s.atwood@intel.com>
Signed-off-by: Matt Roper <matthew.d.roper@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20200414211118.2787489-4-matthew.d.roper@intel.com
Reviewed-by: José Roberto de Souza <jose.souza@intel.com>
2020-04-15 15:29:20 -07:00
Chris Wilson
c1b5ea926d drm/i915/selftests: Check for an already completed timeslice
With timeslice yielding on a semaphore, we may complete timeslices much
faster than we were expecting and already have yielded the stuck
request. Before complaining that timeslicing is not enabled, check that
we haven't already applied the switch.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Andi Shyti <andi.shyti@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20200410081638.19893-1-chris@chris-wilson.co.uk
2020-04-10 14:15:27 +01:00
Chris Wilson
fbaa1229d3 drm/i915/selftests: Take an explicit ref for rq->batch
Since we are peeking into the batch object of the request, it is
beholden on us to hold a reference to it.

Closes: https://gitlab.freedesktop.org/drm/intel/issues/1634
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Mika Kuoppala <mika.kuoppala@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20200408091723.28937-1-chris@chris-wilson.co.uk
2020-04-08 13:40:07 +01:00
Chris Wilson
dd345efe8a drm/i915/gt: Mark up racy check of breadcrumb irq enabled
We control b->irq_enabled inside the b->irq_lock, but we check before
entering the spinlock whether or not the interrupt is currently
unmasked.

[ 1511.735208] BUG: KCSAN: data-race in __intel_breadcrumbs_disarm_irq [i915] / intel_engine_disarm_breadcrumbs [i915]
[ 1511.735231]
[ 1511.735242] write to 0xffff8881f75fc214 of 1 bytes by interrupt on cpu 2:
[ 1511.735440]  __intel_breadcrumbs_disarm_irq+0x4b/0x160 [i915]
[ 1511.735635]  signal_irq_work+0x337/0x710 [i915]
[ 1511.735652]  irq_work_run_list+0xd7/0x110
[ 1511.735666]  irq_work_run+0x1d/0x50
[ 1511.735681]  smp_irq_work_interrupt+0x21/0x30
[ 1511.735701]  irq_work_interrupt+0xf/0x20
[ 1511.735722]  __do_softirq+0x6f/0x206
[ 1511.735736]  irq_exit+0xcd/0xe0
[ 1511.735756]  do_IRQ+0x44/0xc0
[ 1511.735773]  ret_from_intr+0x0/0x1c
[ 1511.735787]  schedule+0x0/0xb0
[ 1511.735803]  worker_thread+0x194/0x670
[ 1511.735823]  kthread+0x19a/0x1e0
[ 1511.735837]  ret_from_fork+0x1f/0x30
[ 1511.735848]
[ 1511.735867] read to 0xffff8881f75fc214 of 1 bytes by task 432 on cpu 1:
[ 1511.736068]  intel_engine_disarm_breadcrumbs+0x22/0x80 [i915]
[ 1511.736263]  __engine_park+0x107/0x5d0 [i915]
[ 1511.736453]  ____intel_wakeref_put_last+0x44/0x90 [i915]
[ 1511.736648]  __intel_wakeref_put_last+0x5a/0x70 [i915]
[ 1511.736842]  intel_context_exit_engine+0xf2/0x100 [i915]
[ 1511.737044]  i915_request_retire+0x6b2/0x770 [i915]
[ 1511.737244]  retire_requests+0x7a/0xd0 [i915]
[ 1511.737438]  intel_gt_retire_requests_timeout+0x3a7/0x6f0 [i915]
[ 1511.737633]  i915_drop_caches_set+0x1e7/0x260 [i915]
[ 1511.737650]  simple_attr_write+0xfa/0x110
[ 1511.737665]  full_proxy_write+0x94/0xc0
[ 1511.737679]  __vfs_write+0x4b/0x90
[ 1511.737697]  vfs_write+0xfc/0x280
[ 1511.737718]  ksys_write+0x78/0x100
[ 1511.737732]  __x64_sys_write+0x44/0x60
[ 1511.737751]  do_syscall_64+0x6e/0x2c0
[ 1511.737769]  entry_SYSCALL_64_after_hwframe+0x44/0xa9

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Mika Kuoppala <mika.kuoppala@linux.intel.com>
Reviewed-by: Mika Kuoppala <mika.kuoppala@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20200408092916.5355-1-chris@chris-wilson.co.uk
2020-04-08 13:40:07 +01:00
Chris Wilson
32a55a109f drm/i915/gt: Mark up racy read of intel_ring.head
The intel_ring.head is updated as the requests are retired, but is
sampled at any time as we submit requests. Furthermore, it tracks
RING_HEAD which is inherently asynchronous.

[  148.630314] BUG: KCSAN: data-race in execlists_dequeue [i915] / i915_request_retire [i915]
[  148.630349]
[  148.630374] write to 0xffff8881f4e28ddc of 4 bytes by task 90 on cpu 2:
[  148.630752]  i915_request_retire+0xed/0x770 [i915]
[  148.631123]  retire_requests+0x7a/0xd0 [i915]
[  148.631491]  engine_retire+0xa6/0xe0 [i915]
[  148.631523]  process_one_work+0x3af/0x640
[  148.631552]  worker_thread+0x80/0x670
[  148.631581]  kthread+0x19a/0x1e0
[  148.631609]  ret_from_fork+0x1f/0x30
[  148.631629]
[  148.631652] read to 0xffff8881f4e28ddc of 4 bytes by task 14288 on cpu 3:
[  148.632019]  execlists_dequeue+0x1300/0x1680 [i915]
[  148.632384]  __execlists_submission_tasklet+0x48/0x60 [i915]
[  148.632770]  execlists_submit_request+0x38e/0x3c0 [i915]
[  148.633146]  submit_notify+0x8f/0xc0 [i915]
[  148.633512]  __i915_sw_fence_complete+0x5d/0x3e0 [i915]
[  148.633875]  i915_sw_fence_complete+0x58/0x80 [i915]
[  148.634238]  i915_sw_fence_commit+0x16/0x20 [i915]
[  148.634613]  __i915_request_queue+0x60/0x70 [i915]
[  148.634985]  i915_gem_do_execbuffer+0x2de0/0x42b0 [i915]
[  148.635366]  i915_gem_execbuffer2_ioctl+0x2ab/0x580 [i915]
[  148.635400]  drm_ioctl_kernel+0xe9/0x130
[  148.635429]  drm_ioctl+0x27d/0x45e
[  148.635456]  ksys_ioctl+0x89/0xb0
[  148.635482]  __x64_sys_ioctl+0x42/0x60
[  148.635510]  do_syscall_64+0x6e/0x2c0
[  148.635542]  entry_SYSCALL_64_after_hwframe+0x44/0xa9

[  645.071436] BUG: KCSAN: data-race in gen8_emit_fini_breadcrumb [i915] / i915_request_retire [i915]
[  645.071456]
[  645.071467] write to 0xffff8881efe403dc of 4 bytes by task 14668 on cpu 3:
[  645.071647]  i915_request_retire+0xed/0x770 [i915]
[  645.071824]  i915_request_create+0x6c/0x160 [i915]
[  645.072000]  i915_gem_do_execbuffer+0x206d/0x42b0 [i915]
[  645.072177]  i915_gem_execbuffer2_ioctl+0x2ab/0x580 [i915]
[  645.072194]  drm_ioctl_kernel+0xe9/0x130
[  645.072208]  drm_ioctl+0x27d/0x45e
[  645.072222]  ksys_ioctl+0x89/0xb0
[  645.072235]  __x64_sys_ioctl+0x42/0x60
[  645.072248]  do_syscall_64+0x6e/0x2c0
[  645.072263]  entry_SYSCALL_64_after_hwframe+0x44/0xa9
[  645.072275]
[  645.072285] read to 0xffff8881efe403dc of 4 bytes by interrupt on cpu 2:
[  645.072458]  gen8_emit_fini_breadcrumb+0x158/0x300 [i915]
[  645.072636]  __i915_request_submit+0x204/0x430 [i915]
[  645.072809]  execlists_dequeue+0x8e1/0x1680 [i915]
[  645.072982]  __execlists_submission_tasklet+0x48/0x60 [i915]
[  645.073154]  execlists_submit_request+0x38e/0x3c0 [i915]
[  645.073330]  submit_notify+0x8f/0xc0 [i915]
[  645.073499]  __i915_sw_fence_complete+0x5d/0x3e0 [i915]
[  645.073668]  i915_sw_fence_wake+0xc2/0x130 [i915]
[  645.073836]  __i915_sw_fence_complete+0x2cf/0x3e0 [i915]
[  645.074006]  i915_sw_fence_complete+0x58/0x80 [i915]
[  645.074175]  dma_i915_sw_fence_wake+0x3e/0x80 [i915]
[  645.074344]  signal_irq_work+0x62f/0x710 [i915]
[  645.074360]  irq_work_run_list+0xd7/0x110
[  645.074373]  irq_work_run+0x1d/0x50
[  645.074386]  smp_irq_work_interrupt+0x21/0x30
[  645.074400]  irq_work_interrupt+0xf/0x20
[  645.074414]  _raw_spin_unlock_irqrestore+0x34/0x40
[  645.074585]  execlists_submission_tasklet+0xde/0x170 [i915]
[  645.074602]  tasklet_action_common.isra.0+0x42/0x90
[  645.074617]  __do_softirq+0xc8/0x206
[  645.074629]  irq_exit+0xcd/0xe0
[  645.074642]  do_IRQ+0x44/0xc0
[  645.074654]  ret_from_intr+0x0/0x1c
[  645.074667]  finish_task_switch+0x73/0x230
[  645.074679]  __schedule+0x1c5/0x4c0
[  645.074691]  schedule+0x45/0xb0
[  645.074704]  worker_thread+0x194/0x670
[  645.074716]  kthread+0x19a/0x1e0
[  645.074729]  ret_from_fork+0x1f/0x30

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Mika Kuoppala <mika.kuoppala@linux.intel.com>
Reviewed-by: Mika Kuoppala <mika.kuoppala@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20200407221832.15465-1-chris@chris-wilson.co.uk
2020-04-08 13:40:07 +01:00
Jani Nikula
4381bbd856 drm/i915/uc: prefer struct drm_device based logging
Prefer struct drm_device based logging over struct device based logging.

No functional changes.

Cc: Wambui Karuga <wambui.karugax@gmail.com>
Reviewed-by: Wambui Karuga <wambui.karugax@gmail.com>
Signed-off-by: Jani Nikula <jani.nikula@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20200402114819.17232-17-jani.nikula@intel.com
2020-04-08 13:49:35 +03:00
Jani Nikula
dc483ba501 drm/i915/gt: prefer struct drm_device based logging
Prefer struct drm_device based logging over struct device based logging.

No functional changes.

Cc: Wambui Karuga <wambui.karugax@gmail.com>
Reviewed-by: Wambui Karuga <wambui.karugax@gmail.com>
Signed-off-by: Jani Nikula <jani.nikula@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20200402114819.17232-16-jani.nikula@intel.com
2020-04-08 13:49:35 +03:00
Jani Nikula
61d5c507e9 drm/i915/uc: prefer struct drm_device based logging
Prefer struct drm_device based logging over struct device based logging.

No functional changes.

Cc: Wambui Karuga <wambui.karugax@gmail.com>
Reviewed-by: Wambui Karuga <wambui.karugax@gmail.com>
Signed-off-by: Jani Nikula <jani.nikula@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20200402114819.17232-10-jani.nikula@intel.com
2020-04-08 13:49:35 +03:00
Chris Wilson
cf4c826d96 drm/i915/selftests: Drop vestigal timeslicing assert
Since the semaphore interrupt may cause us to yield the timeslice
immediately, we may cancel the timer before we notice the submission is
complete. The assertion is no longer valid due to the race with the
interrupt.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20200407222625.15542-1-chris@chris-wilson.co.uk
2020-04-08 09:42:17 +01:00
Chris Wilson
c4e8ba7390 drm/i915/gt: Yield the timeslice if caught waiting on a user semaphore
If we find ourselves waiting on a MI_SEMAPHORE_WAIT, either within the
user batch or in our own preamble, the engine raises a
GT_WAIT_ON_SEMAPHORE interrupt. We can unmask that interrupt and so
respond to a semaphore wait by yielding the timeslice, if we have
another context to yield to!

The only real complication is that the interrupt is only generated for
the start of the semaphore wait, and is asynchronous to our
process_csb() -- that is, we may not have registered the timeslice before
we see the interrupt. To ensure we don't miss a potential semaphore
blocking forward progress (e.g. selftests/live_timeslice_preempt) we mark
the interrupt and apply it to the next timeslice regardless of whether it
was active at the time.

v2: We use semaphores in preempt-to-busy, within the timeslicing
implementation itself! Ergo, when we do insert a preemption due to an
expired timeslice, the new context may start with the missed semaphore
flagged by the retired context and be yielded, ad infinitum. To avoid
this, read the context id at the time of the semaphore interrupt and
only yield if that context is still active.

Fixes: 8ee36e048c ("drm/i915/execlists: Minimalistic timeslicing")
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Cc: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20200407130811.17321-1-chris@chris-wilson.co.uk
2020-04-07 14:43:58 +01:00
Chris Wilson
0b72a251bf drm/i915/gt: Fill all the unused space in the GGTT
When we allocate space in the GGTT we may have to allocate a larger
region than will be populated by the object to accommodate fencing. Make
sure that this space beyond the end of the buffer points safely into
scratch space, in case the HW tries to access it anyway (e.g. fenced
access to the last tile row).

v2: Preemptively / conservatively guard gen6 ggtt as well.

Reported-by: Imre Deak <imre.deak@intel.com>
References: https://gitlab.freedesktop.org/drm/intel/-/issues/1554
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Matthew Auld <matthew.auld@intel.com>
Cc: Imre Deak <imre.deak@intel.com>
Cc: stable@vger.kernel.org
Reviewed-by: Matthew Auld <matthew.auld@intel.com>
Reviewed-by: Imre Deak <imre.deak@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20200331152348.26946-1-chris@chris-wilson.co.uk
(cherry picked from commit 4d6c185908)
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
2020-04-06 10:31:19 -07:00
Chris Wilson
848862e672 drm/i915/gt: Free request pool from virtual engines
While extremely unlikely to be populated, we could capture a request on
the virtual engine which we should free along with the virtual engine.

Fixes: 43acd6516c ("drm/i915: Keep a per-engine request pool")
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Janusz Krzysztofik <janusz.krzysztofik@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20200403203303.10903-1-chris@chris-wilson.co.uk
2020-04-03 21:50:24 +01:00
Chris Wilson
53f5da74c7 drm/i915/selftests: Wait until we start timeslicing after a submit
If we submit, we do not start timeslicing until we process the CS event
that marks the start of the context running on HW. So in the selftest,
be sure to wait until we have processed the pending events before
asserting that timeslicing has begun.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Acked-by: Michael J. Ruhl <michael.j.ruhl@intel.com>
Reviewed-by: Mika Kuoppala <mika.kuoppala@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20200403190209.21818-1-chris@chris-wilson.co.uk
2020-04-03 21:38:27 +01:00
Chris Wilson
43acd6516c drm/i915: Keep a per-engine request pool
Add a tiny per-engine request mempool so that we should always have a
request available for powermanagement allocations from tricky
contexts. This reserve is expected to be only used for kernel
contexts when barriers must be emitted [almost] without fail.

The main consumer for this reserved request is expected to be engine-pm,
for which we know that there will always be at least the previous pm
request that we can reuse under mempressure (so there should always be
a spare request for engine_park()).

This is an alternative to using a comparatively bulky mempool, which
requires custom handling for both our reserved allocation requirement
and to protect our TYPESAFE_BY_RCU slab cache. The advantage of mempool
would be that it would allow us to keep a larger per-engine request
pool. However, converting over to mempool is straightforward should the
need arise.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Janusz Krzysztofik <janusz.krzysztofik@linux.intel.com>
Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Reviewed-and-tested-by: Janusz Krzysztofik <janusz.krzysztofik@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20200402184037.21630-1-chris@chris-wilson.co.uk
2020-04-03 15:20:03 +01:00
Swathi Dhanavanthri
63d0f3ea8e drm/i915/tgl: Make Wa_14010229206 permanent
This workaround now applies to all steppings, not just A0.
Wa_1409085225 is a temporary A0-only W/A however it is
identical to Wa_14010229206 and hence the combined workaround
is made permanent.
Bspec: 52890

Signed-off-by: Swathi Dhanavanthri <swathi.dhanavanthri@intel.com>
Tested-by: Rafael Antognolli <rafael.antognolli@intel.com>
Reviewed-by: Matt Roper <matthew.d.roper@intel.com>
[mattrope: added missing blank line]
Signed-off-by: Matt Roper <matthew.d.roper@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20200326234955.16155-1-swathi.dhanavanthri@intel.com
2020-04-02 15:48:21 -07:00
Chris Wilson
98d513167f drm/i915/selftests: Check for has-reset before testing hostile contexts
In order to kill off a hostile context, we need to be able to reset the
GPU. So check that is supported prior to beginning the test.

Reported-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20200402205839.25065-1-chris@chris-wilson.co.uk
2020-04-02 22:00:54 +01:00
Chris Wilson
4c977837ba drm/i915/execlists: Peek at the next submission for error interrupts
If we receive the error interrupt before the CS interrupt, we may find
ourselves without an active request to reset, skipping the GPU reset.
All because the attempt to reset was too early.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Mika Kuoppala <mika.kuoppala@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20200401110435.30389-1-chris@chris-wilson.co.uk
2020-04-02 21:30:30 +01:00
Chris Wilson
7bcb773daf drm/i915/uc: Cleanup kerneldoc warnings
drivers/gpu/drm/i915/gt/uc/intel_uc_fw.c:205: warning: Excess function parameter 'supported' description in 'intel_uc_fw_init_early'
drivers/gpu/drm/i915/gt/uc/intel_uc_fw.c:205: warning: Excess function parameter 'platform' description in 'intel_uc_fw_init_early'
drivers/gpu/drm/i915/gt/uc/intel_uc_fw.c:205: warning: Excess function parameter 'rev' description in 'intel_uc_fw_init_early'

drivers/gpu/drm/i915/gt/uc/intel_guc_log.c:696: warning: Function parameter or member 'log' not described in 'intel_guc_log_info'
drivers/gpu/drm/i915/gt/uc/intel_guc_log.c:696: warning: Excess function parameter 'guc' description in 'intel_guc_log_info'

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Mika Kuoppala <mika.kuoppala@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20200330212254.18236-1-chris@chris-wilson.co.uk
2020-04-02 20:00:39 +01:00
Chris Wilson
0d86ee3509 drm/i915/gt: Make fence revocation unequivocal
If we must revoke the fence because the VMA is no longer present, or
because the fence no longer applies, ensure that we do and convert it
into an error if we try but cannot.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Matthew Auld <matthew.auld@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20200401210104.15907-3-chris@chris-wilson.co.uk
2020-04-01 23:34:17 +01:00
Chris Wilson
725c9ee7fc drm/i915/gt: Store the fence details on the fence
Make a copy of the object tiling parameters at the point of grabbing the
fence.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Matthew Auld <matthew.auld@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20200401210104.15907-2-chris@chris-wilson.co.uk
2020-04-01 23:34:16 +01:00
Chris Wilson
63baf4f3d5 drm/i915/gt: Only wait for GPU activity before unbinding a GGTT fence
Only GPU activity via the GGTT fence is asynchronous, we know that we
control the CPU access directly, so we only need to wait for the GPU to
stop using the fence before we relinquish it.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Matthew Auld <matthew.auld@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20200401210104.15907-1-chris@chris-wilson.co.uk
2020-04-01 23:34:16 +01:00
Chen Zhou
0d961c4610 drm/i915/gt: fix spelling mistake "undeflow" -> "underflow"
There is a spelling mistake in comment, fix it.

Signed-off-by: Chen Zhou <chenzhou10@huawei.com>
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Link: https://patchwork.freedesktop.org/patch/msgid/20200401022506.52965-1-chenzhou10@huawei.com
2020-04-01 14:38:18 +01:00
Chris Wilson
a5572d1f0d drm/i915/gt: Align engine dump active/pending
Insert a space so that the same fields between active/pending execlists
state are aligned.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Mika Kuoppala <mika.kuoppala@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20200401111554.6279-1-chris@chris-wilson.co.uk
2020-04-01 14:30:36 +01:00
Chris Wilson
4d6c185908 drm/i915/gt: Fill all the unused space in the GGTT
When we allocate space in the GGTT we may have to allocate a larger
region than will be populated by the object to accommodate fencing. Make
sure that this space beyond the end of the buffer points safely into
scratch space, in case the HW tries to access it anyway (e.g. fenced
access to the last tile row).

v2: Preemptively / conservatively guard gen6 ggtt as well.

Reported-by: Imre Deak <imre.deak@intel.com>
References: https://gitlab.freedesktop.org/drm/intel/-/issues/1554
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Matthew Auld <matthew.auld@intel.com>
Cc: Imre Deak <imre.deak@intel.com>
Cc: stable@vger.kernel.org
Reviewed-by: Matthew Auld <matthew.auld@intel.com>
Reviewed-by: Imre Deak <imre.deak@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20200331152348.26946-1-chris@chris-wilson.co.uk
2020-03-31 21:51:08 +01:00
Mika Kuoppala
708c82d59b drm/i915: Report all failed registers for ctx isolation
For CI it is enough to point out a single failure
in isolation. However it is beneficial to gather
info in logs for transients further down
the line.

Do not stop into first comparison failure but
continue probing forward.

v2: for all engines and poisons (Chris)

Cc: Chris Wilson <chris@chris-wilson.co.uk>
Signed-off-by: Mika Kuoppala <mika.kuoppala@linux.intel.com>
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Link: https://patchwork.freedesktop.org/patch/msgid/20200331135403.16906-1-mika.kuoppala@linux.intel.com
2020-03-31 21:42:12 +01:00
Chris Wilson
606727842d drm/i915/gt: Include the execlists CCID of each port in the engine dump
Since we print out EXECLISTS_STATUS in the dump, also print out the CCID
of each context so we can cross check between the two.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Mika Kuoppala <mika.kuoppala@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20200331094239.23145-1-chris@chris-wilson.co.uk
2020-03-31 21:42:12 +01:00
Chris Wilson
9171555572 drm/i915/execlists: Pause CS flow before reset
Since we may be attempting to reset an active engine, we try to freeze
it in place before resetting -- to be on the safe side. We can go one
step further if we are using the CS flow semaphore to prevent the
context switching into the next.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Mika Kuoppala <mika.kuoppala@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20200331091459.29179-2-chris@chris-wilson.co.uk
2020-03-31 21:42:12 +01:00
Chris Wilson
71a6688e81 drm/i915/selftests: Tidy up an error message for live_error_interrupt
Since we don't wait for the error interrupt to reset, restart and then
complete the guilty request, clean up the error messages.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Mika Kuoppala <mika.kuoppala@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20200331091459.29179-1-chris@chris-wilson.co.uk
2020-03-31 21:42:12 +01:00
Chris Wilson
f53ae29c0e drm/i915/gt: Include a few tracek for timeslicing
Add a few telltales to see when timeslicing is being enabled.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Mika Kuoppala <mika.kuoppala@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20200331120502.14713-1-chris@chris-wilson.co.uk
2020-03-31 21:42:12 +01:00
Chris Wilson
e2ccf0d009 drm/i915/execlists: Double check breadcrumb before crying foul
process_csb: 0000:00:02.0 bcs0: cs-irq head=4, tail=5
  process_csb: 0000:00:02.0 bcs0: csb[5]: status=0x00008002:0x60000020
  trace_ports: 0000:00:02.0 bcs0: preempted { ff84:45154! prio 2 }
  trace_ports: 0000:00:02.0 bcs0: promote { ff84:45155* prio 2 }
  trace_ports: 0000:00:02.0 bcs0: submit { ff84:45156 prio 2 }

  process_csb: 0000:00:02.0 bcs0: cs-irq head=5, tail=6
  process_csb: 0000:00:02.0 bcs0: csb[6]: status=0x00000018:0x60000020
  trace_ports: 0000:00:02.0 bcs0: completed { ff84:45155* prio 2 }
  process_csb: 0000:00:02.0 bcs0: ring:{start:0x00178000, head:0928, tail:0928, ctl:00000000, mode:00000200}
  process_csb: 0000:00:02.0 bcs0: rq:{start:00178000, head:08b0, tail:08f0, seqno:ff84:45155, hwsp:45156},
  process_csb: 0000:00:02.0 bcs0: ctx:{start:00178000, head:e000928, tail:0928},
  process_csb: GEM_BUG_ON("context completed before request")

In this sequence, we can see that although we have submitted the next
request [ff84:45156] to HW (via ELSP[]) it has not yet reported the
lite-restore. Instead, we see the completion event of the currently
active request [ff84:45155] but at the time of processing that event,
the breadcrumb has not yet been written. Though by the time we do print
out the debug info, the seqno write of ff84:45156 has landed!

Therefore there is a serialisation problem between the seqno writes and
CS events, not just between the CS buffer and its head/tail pointers as
previously observed on Icelake.

This is not a huge problem, as we don't strictly rely on the breadcrumb
to determine HW activity, but it may indicate that interrupt delivery is
before the seqno write, aka bringing back the plague of missed
interrupts from yesteryear. However, there is no indication of this
wider problem, so let's just flush the seqno read before reporting an
error. If it persists after the fresh read we can worry again.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Mika Kuoppala <mika.kuoppala@linux.intel.com>
Reviewed-by: Mika Kuoppala <mika.kuoppala@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20200330234318.30638-1-chris@chris-wilson.co.uk
2020-03-31 10:02:04 +01:00
Chris Wilson
b28b34ac85 drm/i915/execlists: Explicitly reset both reg and context runtime
Upon a GPU reset, we copy the default context image over top of the
guilty image. This will rollback the CTX_TIMESTAMP register to before
our value of ce->runtime.last. Reset both back to 0 so that we do not
encounter an underflow on the next schedule out after resume.

This should not be a huge issue in practice, as hangs should be rare in
correct code.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Reviewed-by: Mika Kuoppala <mika.kuoppala@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20200330125827.5804-1-chris@chris-wilson.co.uk
2020-03-30 21:13:50 +01:00
Chris Wilson
4b379a48de drm/i915/selftests: Check timeout before flush and cond checks
Allow a bit of leniency for the CPU scheduler to be distracted while we
flush the tasklet and so ensure that we always check the status of the
request once more before timing out.

v2: Wait until the HW acked the submit, and we do any secondary actions
for the submit (e.g. timeslices)

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Matthew Auld <matthew.auld@intel.com>
Reviewed-by: Matthew Auld <matthew.auld@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20200330121644.25277-1-chris@chris-wilson.co.uk
2020-03-30 17:56:00 +01:00
Chris Wilson
8b6d457f95 drm/i915/execlists: Include priority info in trace_ports
Add some extra information into trace_ports to help with reviewing
correctness.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Mika Kuoppala <mika.kuoppala@linux.intel.com>
Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Reviewed-by: Matthew Auld <matthew.auld@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20200330113137.24425-1-chris@chris-wilson.co.uk
2020-03-30 17:56:00 +01:00
Michal Wajdeczko
d472634ef9 drm/i915/huc: Fix HuC register used in debugfs
We report HuC status in debugfs using register read, but
we missed that on Gen11+ HuC uses different register.
Use correct one.

While here, correct placement of the colon.

Signed-off-by: Michal Wajdeczko <michal.wajdeczko@intel.com>
Cc: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com>
Cc: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Link: https://patchwork.freedesktop.org/patch/msgid/20200330113338.1713-1-michal.wajdeczko@intel.com
2020-03-30 17:56:00 +01:00
Michal Wajdeczko
2da48b1f88 drm/i915/huc: Add more errors for I915_PARAM_HUC_STATUS
There might be many reasons why we failed to successfully
load and authenticate HuC firmware, but today we only use
single error in case of no HuC hardware. Add some more
error codes for most common cases (disabled, not installed,
corrupted or mismatched firmware).

Signed-off-by: Michal Wajdeczko <michal.wajdeczko@intel.com>
Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Cc: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com>
Cc: Tony Ye <tony.ye@intel.com>
Cc: Robert M. Fosha <robert.m.fosha@intel.com>
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Link: https://patchwork.freedesktop.org/patch/msgid/20200330113302.1670-1-michal.wajdeczko@intel.com
2020-03-30 15:21:33 +01:00
Chris Wilson
35f3fd8182 drm/i915/execlists: Workaround switching back to a completed context
In what seems remarkably similar to the w/a required to not reload an
idle context with HEAD==TAIL, it appears we must prevent the HW from
switching to an idle context in ELSP[1], while simultaneously trying to
preempt the HW to run another context and a continuation of the idle
context (which is no longer idle).

We can achieve this by preventing the context from completing while we
reload a new ELSP (by applying ring_set_paused(1) across the whole of
dequeue), except this eventually fails due to a lite-restore into a
waiting semaphore does not generate an ACK. Instead, we try to avoid
making the GPU do anything too challenging and not submit a new ELSP
while the interrupts + CSB events appear to have fallen behind the
completed contexts. We expect it to catch up shortly so we queue another
tasklet execution and hope for the best.

Closes: https://gitlab.freedesktop.org/drm/intel/issues/1501
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Cc: Mika Kuoppala <mika.kuoppala@linux.intel.com>
Reviewed-by: Mika Kuoppala <mika.kuoppala@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20200327201433.21864-1-chris@chris-wilson.co.uk
2020-03-27 20:53:26 +00:00
Daniele Ceraolo Spurio
a9410a6250 drm/i915/uc: do not free err log on uc_fini
We need to keep the GuC error logs around to debug the load failure,
so we can't clean them in the error unwind, which includes uc_fini().
Moving the cleanup to driver remove ensures that the logs stick around
long enough for us to dump them.

v2: reword commit msg (John)

Signed-off-by: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com>
Cc: Michal Wajdeczko <michal.wajdeczko@intel.com>
Reviewed-by: John Harrison <John.C.Harrison@Intel.com>
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Link: https://patchwork.freedesktop.org/patch/msgid/20200326181121.16869-7-daniele.ceraolospurio@intel.com
2020-03-26 21:23:22 +00:00
Daniele Ceraolo Spurio
293a554801 drm/i915/uc: Move uC debugfs to its own folder under GT
uC is a component of the GT, so it makes sense for the uC debugfs files
to be in the GT folder. A subfolder has been used to keep the same
structure we have for the code.

v2: use intel_* prefix (Jani), rebase on new gt_debugfs_register_files,
    fix permissions for writable debugfs files.

v3: Rename files (Michal), remove blank line (Jani), fix sparse warns.

Signed-off-by: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com>
Cc: Andi Shyti <andi.shyti@intel.com>
Cc: Michal Wajdeczko <michal.wajdeczko@intel.com>
Cc: John Harrison <John.C.Harrison@Intel.com>
Cc: Matthew Brost <matthew.brost@intel.com>
Cc: Tony Ye <tony.ye@intel.com>
Cc: Jani Nikula <jani.nikula@linux.intel.com>
Reviewed-by: Andi Shyti <andi.shyti@intel.com> #v2
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Link: https://patchwork.freedesktop.org/patch/msgid/20200326181121.16869-6-daniele.ceraolospurio@intel.com
2020-03-26 21:23:03 +00:00
Daniele Ceraolo Spurio
34904bd64a drm/i915/debugfs: move uC printers and update debugfs file names
Move the printers to the respective files for clarity. The
guc_load_status debugfs has been squashed in the guc_info one, has
having separate ones wasn't very useful. The HuC debugfs has been
renamed huc_info to match.

v2: keep printing HUC_STATUS2 (Tony), avoid const->non-const
    container_of (Jani)

Suggested-by: Michal Wajdeczko <michal.wajdeczko@intel.com>
Signed-off-by: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com>
Cc: Michal Wajdeczko <michal.wajdeczko@intel.com>
Cc: John Harrison <John.C.Harrison@Intel.com>
Cc: Matthew Brost <matthew.brost@intel.com>
Cc: Tony Ye <tony.ye@intel.com>
Cc: Jani Nikula <jani.nikula@linux.intel.com>
Reviewed-by: John Harrison <John.C.Harrison@Intel.com>
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Link: https://patchwork.freedesktop.org/patch/msgid/20200326181121.16869-5-daniele.ceraolospurio@intel.com
2020-03-26 21:22:43 +00:00
Daniele Ceraolo Spurio
801a0caa62 drm/i915/huc: make "support huc" reflect HW capabilities
We currently initialize HuC support based on GuC being enabled in
modparam; this means that huc_is_supported() can return false on HW that
does have a HuC when enable_guc=0. The rationale for this behavior is
that HuC requires GuC for authentication and therefore is not supported
by itself. However, we do not allow defining HuC fw wthout GuC fw and
selecting HuC in modparam implicitly selects GuC as well, so we can't
actually hit a scenario where HuC is selected alone. Therefore, we can
flip the support check to reflect the HW capabilities and fw
availability, which is more intuitive and will make it cleaner to log
HuC the difference between not supported in HW and not selected.

Removing the difference between GuC and HuC also allows us to simplify
the init_early, since we don't need to differentiate the support based
on the type of uC.

Signed-off-by: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com>
Cc: Michal Wajdeczko <michal.wajdeczko@intel.com>
Cc: John Harrison <John.C.Harrison@Intel.com>
Cc: Matthew Brost <matthew.brost@intel.com>
Reviewed-by: Andi Shyti <andi.shyti@intel.com>
Reviewed-by: John Harrison <John.C.Harrison@Intel.com>
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Link: https://patchwork.freedesktop.org/patch/msgid/20200326181121.16869-4-daniele.ceraolospurio@intel.com
2020-03-26 21:22:01 +00:00
Andi Shyti
12df6c59b6 drm/i915/gt: allow setting generic data pointer
When registering debugfs files the intel gt debugfs library
forces a 'struct *gt' private data on the caller.

To be open to different usages make the new
"intel_gt_debugfs_register_files()"[*] function more generic by
converting the 'struct *gt' pointer to a 'void *' type.

I take the chance to rename the functions by using "intel_gt_" as
prefix instead of "debugfs_", so that "debugfs_gt_register_files()"
becomes "intel_gt_debugfs_register_files()".

Signed-off-by: Andi Shyti <andi.shyti@intel.com>
Reviewed-by: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com>
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Link: https://patchwork.freedesktop.org/patch/msgid/20200326181121.16869-2-daniele.ceraolospurio@intel.com
2020-03-26 21:20:51 +00:00
Chris Wilson
a97b786bfa drm/i915/gt: Stage the transfer of the virtual breadcrumb
We move the virtual breadcrumb from one physical engine to the next, if
the next virtual request is scheduled on a new physical engine. Since
the virtual context can only be in one signal queue, we need it to track
the current physical engine for the new breadcrumbs. However, to move
the list we need both breadcrumb locks -- and since we cannot take both
at the same time (unless we are careful and always ensure consistent
ordering) stage the movement of the signaler via the current virtual
request.

Closes: https://gitlab.freedesktop.org/drm/intel/issues/1510
Fixes: 6d06779e86 ("drm/i915: Load balancing across a virtual engine")
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20200325130059.30600-1-chris@chris-wilson.co.uk
(cherry picked from commit 6c81e21a47)
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
2020-03-26 10:21:30 -07:00
Chris Wilson
c1ed2fb9d9 drm/i915/gt: Select the deepest available parking mode for rc6
On Ivybridge, we can go lower than rc6 to rc6p. And this is required for
Ivybridge to hit the same minimum power consumption as rc6 on other
platforms, so make it so.

v2: Update selftest to include all rc6 residency counters

Note that Andi did mention that we should be converting the magic
numbers into opaque magic macros, so if they ever get reused (unlikely
given only Ivybridge used the extra modes) we'll need to pay back the
technical debt.

Closes: https://gitlab.freedesktop.org/drm/intel/issues/1518
Fixes: 730eaeb524 ("drm/i915/gt: Manual rc6 entry upon parking")
Testcase: igt/i915_pm_rc6_residency/rc6-idle
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Andi Shyti <andi.shyti@intel.com>
Cc: Mika Kuoppala <mika.kuoppala@linux.intel.com>
Cc: Imre Deak <imre.deak@intel.com>
Reviewed-by: Andi Shyti <andi.shyti@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20200324134232.8773-1-chris@chris-wilson.co.uk
(cherry picked from commit 13c5a577b3)
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
2020-03-26 10:21:30 -07:00
Chris Wilson
98479ada42 drm/i915/gt: Treat idling as a RPS downclock event
If we park/unpark faster than we can respond to RPS events, we never
will process a downclock event after expiring a waitboost, and thus we
will forever restart the GPU at max clocks even if the workload switches
and doesn't justify full power.

Closes: https://gitlab.freedesktop.org/drm/intel/issues/1500
Fixes: 3e7abf8141 ("drm/i915: Extract GT render power state management")
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Andi Shyti <andi.shyti@intel.com>
Cc: Lyude Paul <lyude@redhat.com>
Reviewed-by: Andi Shyti <andi.shyti@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20200322163225.28791-1-chris@chris-wilson.co.uk
Cc: <stable@vger.kernel.org> # v5.5+
(cherry picked from commit 21abf0bf16)
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
2020-03-26 10:21:30 -07:00
Chris Wilson
a24c57d0b3 drm/i915/gt: Cancel a hung context if already closed
Use the restored ability to check if a context is closed to decide
whether or not to immediately ban the context from further execution
after a hang.

Fixes: be90e34483 ("drm/i915/gt: Cancel banned contexts after GT reset")
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Mika Kuoppala <mika.kuoppala@linux.intel.com>
Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20200319170707.8262-2-chris@chris-wilson.co.uk
(cherry picked from commit 8e37d69913)
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
2020-03-26 10:21:30 -07:00
Chris Wilson
2e46a2a0b0 drm/i915: Use explicit flag to mark unreachable intel_context
I need to keep the GEM context around a bit longer so adding an explicit
flag for syncing execbuf with closed/abandonded contexts.

v2:
 * Use already available context flags. (Chris)

Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Link: https://patchwork.freedesktop.org/patch/msgid/20200319170707.8262-1-chris@chris-wilson.co.uk
(cherry picked from commit 207e4a71fb)
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
2020-03-26 10:21:04 -07:00
Chris Wilson
73c8bfb7fe drm/i915: Drop final few uses of drm_i915_private.engine
We've migrated all the heavy users over to the intel_gt, and can finally
drop the last few users and with that the mirror in dev_priv->engine[].

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Cc: Andi Shyti <andi.shyti@intel.com>
Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20200325234803.6175-1-chris@chris-wilson.co.uk
2020-03-26 10:50:17 +00:00
Chris Wilson
6c81e21a47 drm/i915/gt: Stage the transfer of the virtual breadcrumb
We move the virtual breadcrumb from one physical engine to the next, if
the next virtual request is scheduled on a new physical engine. Since
the virtual context can only be in one signal queue, we need it to track
the current physical engine for the new breadcrumbs. However, to move
the list we need both breadcrumb locks -- and since we cannot take both
at the same time (unless we are careful and always ensure consistent
ordering) stage the movement of the signaler via the current virtual
request.

Closes: https://gitlab.freedesktop.org/drm/intel/issues/1510
Fixes: 6d06779e86 ("drm/i915: Load balancing across a virtual engine")
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20200325130059.30600-1-chris@chris-wilson.co.uk
2020-03-25 13:59:41 +00:00
Chris Wilson
6670b413f8 drm/i915/execlists: Pull tasklet interrupt-bh local to direct submission
We dropped calling process_csb prior to handling direct submission in
order to avoid the nesting of spinlocks and lift process_csb() and the
majority of the tasklet out of irq-off. However, we do want to avoid
ksoftirqd latency in the fast path, so try and pull the interrupt-bh
local to direct submission if we can acquire the tasklet's lock.

v2: Document the read of pending[0] from outside the tasklet with
READ_ONCE.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Tvrtko Ursulin <tvrtko.ursulin@linux.intel.com>
Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20200325120227.8044-1-chris@chris-wilson.co.uk
2020-03-25 13:05:04 +00:00
Chris Wilson
032d992dcb drm/i915/selftests: Measure the energy consumed while in RC6
Measure and compare the energy consumed, as reported by the rapl MSR,
by the GPU while in RC0 and RC6 states. Throw an error if RC6 does not
at least halve the energy consumption of RC0, as this more than likely
means we failed to enter RC0 correctly.

If we can't measure the energy draw with the MSR, then it will report 0
for both measurements. Since the measurement works on all gen6+, this seems
worth flagging as an error.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Mika Kuoppala <mika.kuoppala@linux.intel.com>
Cc: Andi Shyti <andi.shyti@intel.com>
Reviewed-by: Andi Shyti <andi.shyti@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20200325101502.12591-1-chris@chris-wilson.co.uk
2020-03-25 11:33:05 +00:00
Chris Wilson
9bf7c31386 drm/i915/execlists: Drop setting sibling priority hint on virtual engines
We set the priority hint on execlists to avoid executing the tasklet for
when we know that there will be no change in execution order. However,
as we set it from the virtual engine for all siblings, but only one
physical engine may respond, we leave the hint set on the others
stopping direct submission that could take place.

If we do not set the hint, we may attempt direct submission even if we
don't expect to submit. If we set the hint, we may not do any submission
until the tasklet is run (and sometimes we may park the engine before
that has had a chance). Ergo there's only a minor ill-effect on mixed
virtual/physical engine workloads where we may try and fail to do direct
submission more often than required. (Pure virtual / engine workloads
will have redundant tasklet execution suppressed as normal.)

Closes: https://gitlab.freedesktop.org/drm/intel/issues/1522
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20200325101358.12231-1-chris@chris-wilson.co.uk
2020-03-25 10:53:46 +00:00
Chris Wilson
13c5a577b3 drm/i915/gt: Select the deepest available parking mode for rc6
On Ivybridge, we can go lower than rc6 to rc6p. And this is required for
Ivybridge to hit the same minimum power consumption as rc6 on other
platforms, so make it so.

v2: Update selftest to include all rc6 residency counters

Note that Andi did mention that we should be converting the magic
numbers into opaque magic macros, so if they ever get reused (unlikely
given only Ivybridge used the extra modes) we'll need to pay back the
technical debt.

Closes: https://gitlab.freedesktop.org/drm/intel/issues/1518
Fixes: 730eaeb524 ("drm/i915/gt: Manual rc6 entry upon parking")
Testcase: igt/i915_pm_rc6_residency/rc6-idle
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Andi Shyti <andi.shyti@intel.com>
Cc: Mika Kuoppala <mika.kuoppala@linux.intel.com>
Cc: Imre Deak <imre.deak@intel.com>
Reviewed-by: Andi Shyti <andi.shyti@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20200324134232.8773-1-chris@chris-wilson.co.uk
2020-03-24 15:53:51 +00:00
Chris Wilson
af7a272ef6 drm/i915/gt: Only delay the context barrier pm
It is strictly sufficient to only delay the intel_engine_pm_put from the
context barrier (and not from the context exit) in order to prevent the
gem_exec_nop contention. Adding the delay to the context exit incurs
noticably extra penalty for soft-rc6.

Fixes: edee52c927 ("drm/i915/gt: Delay release of engine-pm after last retirement")
Testcase: igt/i915_pm_rc6_residency/rc6-idle
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20200323192029.20723-1-chris@chris-wilson.co.uk
2020-03-23 20:38:56 +00:00
Chris Wilson
edee52c927 drm/i915/gt: Delay release of engine-pm after last retirement
Keep the engine-pm awake until the next jiffie, to avoid immediate
ping-pong under moderate load. (Forcing the idle barrier excerbates the
moderate load, dramatically increasing the driver overhead.) On the
other hand, delaying the idle-barrier slightly incurs longer rc6-off
and so more power consumption.

Closes: https://gitlab.freedesktop.org/drm/intel/issues/848
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20200323092841.22240-4-chris@chris-wilson.co.uk
2020-03-23 12:51:19 +00:00
Chris Wilson
e9037e7f9a drm/i915: Extend intel_wakeref to support delayed puts
In some cases we want to hold onto the wakeref for a little after the
last user so that we can avoid having to drop and then immediately
reacquire it. Allow the last user to specify if they would like to keep
the wakeref alive for a short hysteresis.

v2: Embrace bitfield.h for adjustable flags.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20200323103221.14444-1-chris@chris-wilson.co.uk
2020-03-23 12:51:05 +00:00
Chris Wilson
41e4065a6b drm/i915: Rely on direct submission to the queue
Drop the pretense of kicking the tasklet (used only for the defunct guc
submission backend, it should just take ownership of the submit!) and so
remove the bh-kicking from around submission.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20200323092841.22240-5-chris@chris-wilson.co.uk
2020-03-23 11:51:39 +00:00
Chris Wilson
8e87e0139a drm/i915/gt: Mark timeline->cacheline as destroyed after rcu grace period
Since we take advantage of RCU for some i915_active objects, like the
intel_timeline_cacheline, we need to delay the i915_active_fini until
after the RCU grace period and we perform the kfree -- that is until
after all RCU protected readers.

<3> [108.204873] ODEBUG: assert_init not available (active state 0) object type: i915_active hint: __cacheline_active+0x0/0x80 [i915]
<4> [108.207377] WARNING: CPU: 3 PID: 2342 at lib/debugobjects.c:488 debug_print_object+0x67/0x90
<4> [108.207400] Modules linked in: vgem snd_hda_codec_hdmi x86_pkg_temp_thermal coretemp crct10dif_pclmul crc32_pclmul snd_hda_intel ghash_clmulni_intel snd_intel_dspcfg snd_hda_codec ax88179_178a snd_hwdep usbnet btusb snd_hda_core btrtl mii btbcm btintel snd_pcm bluetooth ecdh_generic ecc i915 i2c_hid pinctrl_sunrisepoint pinctrl_intel intel_lpss_pci prime_numbers
<4> [108.207587] CPU: 3 PID: 2342 Comm: gem_exec_parall Tainted: G     U            5.6.0-rc6-CI-Patchwork_17047+ #1
<4> [108.207609] Hardware name: Google Soraka/Soraka, BIOS MrChromebox-4.10 08/25/2019
<4> [108.207639] RIP: 0010:debug_print_object+0x67/0x90
<4> [108.207668] Code: 83 c2 01 8b 4b 14 4c 8b 45 00 89 15 87 d2 8a 02 8b 53 10 4c 89 e6 48 c7 c7 38 2b 32 82 48 8b 14 d5 80 2f 07 82 e8 49 d5 b7 ff <0f> 0b 5b 83 05 c3 f6 22 01 01 5d 41 5c c3 83 05 b8 f6 22 01 01 c3
<4> [108.207692] RSP: 0018:ffffc90000e7f890 EFLAGS: 00010282
<4> [108.207723] RAX: 0000000000000000 RBX: ffffc90000e7f8b0 RCX: 0000000000000001
<4> [108.207747] RDX: 0000000080000001 RSI: ffff88817ada8cb8 RDI: 00000000ffffffff
<4> [108.207770] RBP: ffffffffa0341cc0 R08: ffff88816b5a8948 R09: 0000000000000000
<4> [108.207792] R10: 0000000000000000 R11: 0000000000000000 R12: ffffffff82322d54
<4> [108.207814] R13: ffffffffa0341cc0 R14: ffffffff83df9568 R15: ffff88816064f400
<4> [108.207839] FS:  00007f437d753700(0000) GS:ffff88817ad80000(0000) knlGS:0000000000000000
<4> [108.207863] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
<4> [108.207887] CR2: 00007f2ad1fb5000 CR3: 00000001725d8004 CR4: 00000000003606e0
<4> [108.207907] Call Trace:
<4> [108.207959]  debug_object_assert_init+0x15c/0x180
<4> [108.208475]  ? i915_active_acquire_if_busy+0x10/0x50 [i915]
<4> [108.208513]  ? rcu_read_lock_held+0x4d/0x60
<4> [108.208970]  i915_active_acquire_if_busy+0x10/0x50 [i915]
<4> [108.209380]  intel_timeline_read_hwsp+0x81/0x540 [i915]
<4> [108.210262]  __emit_semaphore_wait+0x45/0x1b0 [i915]
<4> [108.210726]  ? i915_request_await_dma_fence+0x143/0x560 [i915]
<4> [108.211156]  i915_request_await_dma_fence+0x28a/0x560 [i915]
<4> [108.211633]  i915_request_await_object+0x24a/0x3f0 [i915]
<4> [108.212102]  eb_submit.isra.47+0x58f/0x920 [i915]
<4> [108.212622]  i915_gem_do_execbuffer+0x1706/0x2c70 [i915]
<4> [108.213071]  ? i915_gem_execbuffer2_ioctl+0xc0/0x470 [i915]

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Matthew Auld <matthew.auld@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20200323092841.22240-1-chris@chris-wilson.co.uk
2020-03-23 11:16:03 +00:00
Chris Wilson
043cd2d14e drm/i915/gt: Leave rps->cur_freq on unpark
Don't override our previous frequency we used after parking, and avoid
continually spiking back to the efficient frequency for mostly idle
workloads. Trust our ability to autotune across a workload switch.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Andi Shyti <andi.shyti@intel.com>
Cc: Lyude Paul <lyude@redhat.com>
Reviewed-by: Andi Shyti <andi.shyti@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20200322163225.28791-2-chris@chris-wilson.co.uk
2020-03-22 23:03:08 +00:00
Chris Wilson
21abf0bf16 drm/i915/gt: Treat idling as a RPS downclock event
If we park/unpark faster than we can respond to RPS events, we never
will process a downclock event after expiring a waitboost, and thus we
will forever restart the GPU at max clocks even if the workload switches
and doesn't justify full power.

Closes: https://gitlab.freedesktop.org/drm/intel/issues/1500
Fixes: 3e7abf8141 ("drm/i915: Extract GT render power state management")
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Andi Shyti <andi.shyti@intel.com>
Cc: Lyude Paul <lyude@redhat.com>
Reviewed-by: Andi Shyti <andi.shyti@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20200322163225.28791-1-chris@chris-wilson.co.uk
Cc: <stable@vger.kernel.org> # v5.5+
2020-03-22 23:03:00 +00:00
Chris Wilson
bb6892b7ce drm/i915/gt: Use the correct err_unlock unwind path for a closed context
A silly cut'n'paste copied the unlocked error path and used it inside
the pin_mutex lock, we need to drop that lock before returning.

Fixes: b412c63f1c ("drm/i915/gt: Report context-is-closed prior to pinning")
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Cc: Mika Kuoppala <mika.kuoppala@linux.intel.com>
Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20200322123241.17694-1-chris@chris-wilson.co.uk
2020-03-22 13:44:24 +00:00
Chris Wilson
e50c951ea6 drm/i915/gt: Restrict gen7 w/a batch to Haswell
The residual w/a batch is causing system instablity on Ivybridge and
Baytrail under some workloads, so disable until resolved.

Closes: https://gitlab.freedesktop.org/drm/intel/issues/1405
Fixes: 47f8253d2b ("drm/i915/gen7: Clear all EU/L3 residual contexts")
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Mika Kuoppala <mika.kuoppala@linux.intel.com>
Cc: Prathap Kumar Valsan <prathap.kumar.valsan@intel.com>
Cc: Akeem G Abodunrin <akeem.g.abodunrin@intel.com>
Cc: Jani Nikula <jani.nikula@linux.intel.com>
Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Cc: Rodrigo Vivi <rodrigo.vivi@intel.com>
Acked-by: Mika Kuoppala <mika.kuoppala@linux.intel.com>
Reviewed-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20200311103640.26572-1-chris@chris-wilson.co.uk
(cherry picked from commit a62774782b)
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
2020-03-20 07:04:38 -07:00
Chris Wilson
b412c63f1c drm/i915/gt: Report context-is-closed prior to pinning
Our assertion caught that we do try to pin a closed context if userspace
is viciously racing context-closure with execbuf, so make it fail
gracefully.

Closes: https://gitlab.freedesktop.org/drm/intel/issues/1492
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Cc: Mika Kuoppala <mika.kuoppala@linux.intel.com>
Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20200320130159.3922-1-chris@chris-wilson.co.uk
2020-03-20 13:27:28 +00:00
Chris Wilson
8e37d69913 drm/i915/gt: Cancel a hung context if already closed
Use the restored ability to check if a context is closed to decide
whether or not to immediately ban the context from further execution
after a hang.

Fixes: be90e34483 ("drm/i915/gt: Cancel banned contexts after GT reset")
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Mika Kuoppala <mika.kuoppala@linux.intel.com>
Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20200319170707.8262-2-chris@chris-wilson.co.uk
2020-03-19 21:28:24 +00:00
Chris Wilson
207e4a71fb drm/i915: Use explicit flag to mark unreachable intel_context
I need to keep the GEM context around a bit longer so adding an explicit
flag for syncing execbuf with closed/abandonded contexts.

v2:
 * Use already available context flags. (Chris)

Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Link: https://patchwork.freedesktop.org/patch/msgid/20200319170707.8262-1-chris@chris-wilson.co.uk
2020-03-19 21:28:20 +00:00
Wambui Karuga
394ad36c51 drm/i915/workarounds: convert to drm_device based logging macros.
Replace the use of printk based drm logging macros with the struct
drm_device based logging macros.

Note that this converts DRM_DEBUG_DRIVER() to drm_dbg().

References: https://lists.freedesktop.org/archives/dri-devel/2020-January/253381.html
Signed-off-by: Wambui Karuga <wambui.karugax@gmail.com>
Signed-off-by: Jani Nikula <jani.nikula@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20200314183344.17603-8-wambui.karugax@gmail.com
2020-03-19 11:34:36 +02:00
Wambui Karuga
a8fa7c079f drm/i915/rps: use struct drm_device based logging macros.
Replace the use of the printk based drm logging macros with the struct
drm_device based logging macros in i915/gt/intel_rps.c. This also
involves extracting the drm_i915_private device pointer from various
intel types.

This converts the instances of DRM_DEBUG_DRIVER to drm_dbg() while not
converting DRM_DEBUG() instances due to the lack of an analogous
drm_device based macro.

References: https://lists.freedesktop.org/archives/dri-devel/2020-January/253381.html
Signed-off-by: Wambui Karuga <wambui.karugax@gmail.com>
Signed-off-by: Jani Nikula <jani.nikula@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20200314183344.17603-7-wambui.karugax@gmail.com
2020-03-19 11:34:32 +02:00
Wambui Karuga
606856f09e drm/i915/ring_submission: use drm_device based logging macros.
Replace the use of printk based drm logging macros to the struct
drm_device based logging macros in i915/gt/intel_ring_submission.c.
This was done using the following semantic patch that transforms based
on the existence of a drm_i915_private device:
@@
identifier fn, T;
@@

fn(...) {
...
struct drm_i915_private *T = ...;
<+...
(
-DRM_INFO(
+drm_info(&T->drm,
...)
|
-DRM_ERROR(
+drm_err(&T->drm,
...)
|
-DRM_WARN(
+drm_warn(&T->drm,
...)
|
-DRM_DEBUG_KMS(
+drm_dbg_kms(&T->drm,
...)
|
-DRM_DEBUG_DRIVER(
+drm_dbg(&T->drm,
...)
|
-DRM_DEBUG_ATOMIC(
+drm_dbg_atomic(&T->drm,
...)
)
...+>
}

@@
identifier fn, T;
@@

fn(...,struct drm_i915_private *T,...) {
<+...
(
-DRM_INFO(
+drm_info(&T->drm,
...)
|
-DRM_ERROR(
+drm_err(&T->drm,
...)
|
-DRM_WARN(
+drm_warn(&T->drm,
...)
|
-DRM_DEBUG_DRIVER(
+drm_dbg(&T->drm,
...)
|
-DRM_DEBUG_KMS(
+drm_dbg_kms(&T->drm,
...)
|
-DRM_DEBUG_ATOMIC(
+drm_dbg_atomic(&T->drm,
...)
)
...+>
}

New checkpatch warnings were fixed manually.

Note that this converts DRM_DEBUG_DRIVER to drm_dbg().

References: https://lists.freedesktop.org/archives/dri-devel/2020-January/253381.html
Signed-off-by: Wambui Karuga <wambui.karugax@gmail.com>
Signed-off-by: Jani Nikula <jani.nikula@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20200314183344.17603-6-wambui.karugax@gmail.com
2020-03-19 11:34:28 +02:00
Wambui Karuga
edf040f4ee drm/i915/renderstate: use struct drm_device based logging macros.
Replace the use of the printk based drm logging macros with the struct
drm_device based logging macros.

Signed-off-by: Wambui Karuga <wambui.karugax@gmail.com>
Signed-off-by: Jani Nikula <jani.nikula@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20200314183344.17603-5-wambui.karugax@gmail.com
2020-03-19 11:34:22 +02:00
Wambui Karuga
1ca6ce9332 drm/i915/rc6: convert to struct drm_device based logging macros.
Converts various instances of the printk based drm logging macros to use
the struct drm_device logging macros. This also involves extracting the
drm_i915_private device from intel types in some cases.

Note that this converts DRM_DEBUG_DRIVER() to drm_dbg().

References: https://lists.freedesktop.org/archives/dri-devel/2020-January/253381.html
Signed-off-by: Wambui Karuga <wambui.karugax@gmail.com>
Signed-off-by: Jani Nikula <jani.nikula@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20200314183344.17603-4-wambui.karugax@gmail.com
2020-03-19 11:34:16 +02:00
Wambui Karuga
91682e45ba drm/i915/lrc: convert to struct drm_device based logging macros.
Convert various instances of the printk based drm logging macros to the
struct drm_device based logging macros.

Note that this converts DRM_DEBUG_DRIVER() to drm_dbg() but does not
convert DRM_DEBUG() due to the lack of an analogous drm_device based
macro.

References: https://lists.freedesktop.org/archives/dri-devel/2020-January/253381.html
Signed-off-by: Wambui Karuga <wambui.karugax@gmail.com>
Signed-off-by: Jani Nikula <jani.nikula@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20200314183344.17603-3-wambui.karugax@gmail.com
2020-03-19 11:34:11 +02:00
Wambui Karuga
36034c95d3 drm/i915/ggtt: convert to drm_device based logging macros.
Converts various instances of the printk based drm logging macros to use
the struct drm_device based logging macros in i915/gt/intel_ggtt.c.
This change was done using the following coccinelle script that matches
based on the existence of a drm_i915_private device:
@@
identifier fn, T;
@@

fn(...) {
...
struct drm_i915_private *T = ...;
<+...
(
-DRM_INFO(
+drm_info(&T->drm,
...)
|
-DRM_ERROR(
+drm_err(&T->drm,
...)
|
-DRM_WARN(
+drm_warn(&T->drm,
...)
|
-DRM_DEBUG_KMS(
+drm_dbg_kms(&T->drm,
...)
|
-DRM_DEBUG_DRIVER(
+drm_dbg(&T->drm,
...)
|
-DRM_DEBUG_ATOMIC(
+drm_dbg_atomic(&T->drm,
...)
)
...+>
}

@@
identifier fn, T;
@@

fn(...,struct drm_i915_private *T,...) {
<+...
(
-DRM_INFO(
+drm_info(&T->drm,
...)
|
-DRM_ERROR(
+drm_err(&T->drm,
...)
|
-DRM_WARN(
+drm_warn(&T->drm,
...)
|
-DRM_DEBUG_DRIVER(
+drm_dbg(&T->drm,
...)
|
-DRM_DEBUG_KMS(
+drm_dbg_kms(&T->drm,
...)
|
-DRM_DEBUG_ATOMIC(
+drm_dbg_atomic(&T->drm,
...)
)
...+>
}

New checkpatch warnings were fixed manually.

Note that this converts DRM_DEBUG_DRIVER to drm_dbg()

References: https://lists.freedesktop.org/archives/dri-devel/2020-January/253381.html
Signed-off-by: Wambui Karuga <wambui.karugax@gmail.com>
Signed-off-by: Jani Nikula <jani.nikula@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20200314183344.17603-2-wambui.karugax@gmail.com
2020-03-19 11:34:04 +02:00
Chris Wilson
500f9ac302 drm/i915/gt: Always reschedule the new heartbeat
In order to better respond to new heartbeat intervals given via sysfs,
always reprogramme an active heartbeat upon change (i.e. use
mod_delayed_work to reschedule rather than queue_delayed_work which
ignores an already active work.)

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20200317163208.30010-1-chris@chris-wilson.co.uk
2020-03-17 18:26:15 +00:00
Lionel Landwerlin
11ecbdddf2 drm/i915/perf: introduce global sseu pinning
On Gen11 powergating half the execution units is a functional
requirement when using the VME samplers. Not fullfilling this
requirement can lead to hangs.

This unfortunately plays fairly poorly with the NOA requirements. NOA
requires a stable power configuration to maintain its configuration.

As a result using OA (and NOA feeding into it) so far has required us
to use a power configuration that can work for all contexts. The only
power configuration fullfilling this is powergating half the execution
units.

This makes performance analysis for 3D workloads somewhat pointless.

Failing to find a solution that would work for everybody, this change
introduces a new i915-perf stream open parameter that punts the
decision off to userspace. If this parameter is omitted, the existing
Gen11 behavior remains (half EU array powergating).

This change takes the initiative to move all perf related sseu
configuration into i915_perf.c

v2: Make parameter priviliged if different from default

v3: Fix context modifying its sseu config while i915-perf is enabled

v4: Always consider global sseu a privileged operation (Tvrtko)
    Override req_sseu point in intel_sseu_make_rpcs() (Tvrtko)
    Remove unrelated changes (Tvrtko)

v5: Some typos (Tvrtko)
    Process sseu param in read_properties_unlocked() (Tvrtko)

v6: Actually commit the bits from v5...
    Fixup some checkpath warnings

v7: Only compare engine uabi field (Chris)

Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20200317132222.2638719-3-lionel.g.landwerlin@intel.com
2020-03-17 15:27:55 +02:00
Chris Wilson
220a6704ff drm/i915/gt: Restore check for invalid vma for fencing
Apparently we do try and attach a fence to an invalid vma (during
execbuf) so we cannot simply assert it never happens and report EINVAL
instead.

Fixes: dec9cf9ee8 ("drm/i915/gt: Pull restoration of GGTT fences underneath the GT")
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Mika Kuoppala <mika.kuoppala@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20200316205450.15843-1-chris@chris-wilson.co.uk
2020-03-17 00:22:34 +00:00
Chris Wilson
0b6bc81dbd drm/i915/gt: Allocate i915_fence_reg array
Since the number of fence regs can vary dramactically between platforms,
allocate the array on demand so we don't waste as much space.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Mika Kuoppala <mika.kuoppala@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20200316113846.4974-4-chris@chris-wilson.co.uk
2020-03-16 20:28:29 +00:00
Chris Wilson
dec9cf9ee8 drm/i915/gt: Pull restoration of GGTT fences underneath the GT
Make the GT responsible for restoring its fence when it wakes up from
suspend.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Mika Kuoppala <mika.kuoppala@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20200316113846.4974-2-chris@chris-wilson.co.uk
2020-03-16 20:28:28 +00:00
Chris Wilson
f899f786d1 drm/i915: Move GGTT fence registers under gt/
Since the fence registers control HW detiling through the GGTT
aperture, make them a part of the intel_ggtt under gt/

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Mika Kuoppala <mika.kuoppala@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20200316113846.4974-1-chris@chris-wilson.co.uk
2020-03-16 20:28:26 +00:00
Chris Wilson
a62774782b drm/i915/gt: Restrict gen7 w/a batch to Haswell
The residual w/a batch is causing system instablity on Ivybridge and
Baytrail under some workloads, so disable until resolved.

Closes: https://gitlab.freedesktop.org/drm/intel/issues/1405
Fixes: 47f8253d2b ("drm/i915/gen7: Clear all EU/L3 residual contexts")
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Mika Kuoppala <mika.kuoppala@linux.intel.com>
Cc: Prathap Kumar Valsan <prathap.kumar.valsan@intel.com>
Cc: Akeem G Abodunrin <akeem.g.abodunrin@intel.com>
Cc: Jani Nikula <jani.nikula@linux.intel.com>
Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Cc: Rodrigo Vivi <rodrigo.vivi@intel.com>
Acked-by: Mika Kuoppala <mika.kuoppala@linux.intel.com>
Reviewed-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20200311103640.26572-1-chris@chris-wilson.co.uk
2020-03-15 10:34:01 +00:00
Matt Roper
34a77b0b7b drm/i915: Add Wa_1605460711 / Wa_1408767742 to ICL and EHL
This workaround appears under two different numbers (and with somewhat
confused stepping applicability on ICL).  Ultimately it appears we
should just implement this for all stepping of ICL and EHL.

Note that this is identical to Wa_1407928979:tgl that already exists in
our driver too...yet another number referencing the same actual
workaround.

Signed-off-by: Matt Roper <matthew.d.roper@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20200311162300.1838847-7-matthew.d.roper@intel.com
Reviewed-by: Mika Kuoppala <mika.kuoppala@linux.intel.com>
2020-03-13 09:03:17 -07:00
Matt Roper
fb899dd8ea drm/i915: Apply Wa_1406680159:icl,ehl as an engine workaround
The register this workaround updates is a render engine register in the
MCR range, so we should initialize this in rcs_engine_wa_init() rather
than gt_wa_init().

Closes: https://gitlab.freedesktop.org/drm/intel/issues/1222
Fixes: 36204d80ba ("drm/i915/icl: Wa_1406680159")
Cc: Mika Kuoppala <mika.kuoppala@linux.intel.com>
Signed-off-by: Matt Roper <matthew.d.roper@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20200311162300.1838847-6-matthew.d.roper@intel.com
Reviewed-by: Mika Kuoppala <mika.kuoppala@linux.intel.com>
2020-03-13 09:02:54 -07:00
Matt Roper
14f49be483 drm/i915: Add Wa_1406306137:icl,ehl
v2:
 - Move to context workarounds.  ROW_CHICKEN4 is part of the context
   image on gen11 (although it isn't on gen12).

Signed-off-by: Matt Roper <matthew.d.roper@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20200311162300.1838847-5-matthew.d.roper@intel.com
Reviewed-by: José Roberto de Souza <jose.souza@intel.com>
2020-03-13 09:02:21 -07:00
Matt Roper
d0ed510a8e drm/i915: Add Wa_1604278689:icl,ehl
The bspec description for this workaround tells us to program
0xFFFF_FFFF into both FBC_RT_BASE_ADDR_REGISTER_* registers, but we've
previously found that this leads to failures in CI.  Our suspicion is
that the failures are caused by this valid turning on the "address valid
bit" even though we're intentionally supplying an invalid address.
Experimentation has shown that setting all bits _except_ for the
RT_VALID bit seems to avoid these failures.

v2:
 - Mask off the RT_VALID bit.  Experimentation with CI trybot indicates
   that this is necessary to avoid reset failures on BCS.

v3:
 - Program RT_BASE before RT_BASE_UPPER so that the valid bit is turned
   off by the first write.  (Chris)

Bspec: 11388
Bspec: 33451
Signed-off-by: Matt Roper <matthew.d.roper@intel.com>
Cc: Chris Wilson <chris@chris-wilson.co.uk>
Link: https://patchwork.freedesktop.org/patch/msgid/20200311162300.1838847-4-matthew.d.roper@intel.com
Reviewed-by: José Roberto de Souza <jose.souza@intel.com>
2020-03-13 09:01:44 -07:00
Matt Roper
415d126997 drm/i915: Handle all MCR ranges
The bspec documents multiple MCR ranges; make sure they're all captured
by the driver.

Bspec: 13991, 52079
Fixes: 592a7c5e08 ("drm/i915: Extend non readable mcr range")
Cc: Mika Kuoppala <mika.kuoppala@linux.intel.com>
Signed-off-by: Matt Roper <matthew.d.roper@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20200311162300.1838847-2-matthew.d.roper@intel.com
Reviewed-by: Mika Kuoppala <mika.kuoppala@linux.intel.com>
2020-03-13 08:58:11 -07:00
Chris Wilson
bb4328f6b9 drm/i915/selftest: Add more poison patterns
Throw in the inverse patterns to create more examples of poison to use
against the LRC state.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Mika Kuoppala <mika.kuoppala@linux.intel.com>
Reviewed-by: Mika Kuoppala <mika.kuoppala@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20200313102812.30173-1-chris@chris-wilson.co.uk
2020-03-13 11:36:34 +00:00
Caz Yokoyama
175c4d9b3b Revert "drm/i915/tgl: Add extra hdc flush workaround"
This reverts commit 36a6b5d964.

The commit takes care Wa_1604544889 which was fixed on a0 stepping based on
a0 replan. So no SW workaround is required on any stepping now.

Reviewed-by: Matt Roper <matthew.d.roper@intel.com>
Signed-off-by: Caz Yokoyama <caz.yokoyama@intel.com>
Signed-off-by: José Roberto de Souza <jose.souza@intel.com>
Fixes: 36a6b5d964 ("drm/i915/tgl: Add extra hdc flush workaround")
Link: https://patchwork.freedesktop.org/patch/msgid/1c751032ce79c80c5485cae315f1a9904ce07cac.1583359940.git.caz.yokoyama@intel.com
2020-03-12 15:19:00 -07:00
Chris Wilson
22ca8a452e drm/i915/gt: Wait for RCUs frees before asserting idle on unload
During driver unload, we have many asserts that we have released our
bookkeeping structs and are idle. In some cases, these struct are
protected by RCU and we do not release them until after an RCU grace
period.

Reported-by: Maarten Lankhorst <maarten.lankhorst@linux.intel.com>
Fixes: 130a95e909 ("drm/i915/gem: Consolidate ctx->engines[] release")
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Cc: Mika Kuoppala <mika.kuoppala@linux.intel.com>
Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Reviewed-by: Mika Kuoppala <mika.kuoppala@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20200312115307.16460-1-chris@chris-wilson.co.uk
2020-03-12 20:47:24 +00:00
Tvrtko Ursulin
07bcfd1291 drm/i915/gen12: Disable preemption timeout
Allow super long OpenCL workloads which cannot be preempted within
the default timeout to run out of the box.

v2:
 * Make it stick out more and apply only to RCS. (Chris)

v3:
 * Mention platform override in kconfig. (Joonas)

Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Cc: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Cc: Michal Mrozek <michal.mrozek@intel.com>
Cc: <stable@vger.kernel.org> # v5.6+
Acked-by: Chris Wilson <chris@chris-wilson.co.uk>
Acked-by: Michal Mrozek <Michal.mrozek@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20200312115748.29970-1-tvrtko.ursulin@linux.intel.com
2020-03-12 13:46:01 +00:00
Chris Wilson
60ef5b7ac6 drm/i915/execlists: Track active elements during dequeue
Record the initial active element we use when building the next ELSP
submission, so that we can compare against it latter to see if there's
no change.

Fixes: 44d0a9c05b ("drm/i915/execlists: Skip redundant resubmission")
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Mika Kuoppala <mika.kuoppala@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20200311092624.10012-2-chris@chris-wilson.co.uk
2020-03-11 11:59:59 +00:00
Chris Wilson
408464b4cb drm/i915/gt: Pull checking rps->pm_events under the irq_lock
Avoid angering kcsan by serialising the read of the pm_events with the
write in rps_disable_interrupts.

[ 6268.713419] BUG: KCSAN: data-race in intel_rps_park [i915] / rps_work [i915]
[ 6268.713437]
[ 6268.713449] write to 0xffff8881eda8efac of 4 bytes by task 1127 on cpu 3:
[ 6268.713680]  intel_rps_park+0x136/0x260 [i915]
[ 6268.713905]  __gt_park+0x61/0xa0 [i915]
[ 6268.714128]  ____intel_wakeref_put_last+0x42/0x90 [i915]
[ 6268.714352]  __intel_wakeref_put_work+0xd3/0xf0 [i915]
[ 6268.714369]  process_one_work+0x3b1/0x690
[ 6268.714384]  worker_thread+0x80/0x670
[ 6268.714398]  kthread+0x19a/0x1e0
[ 6268.714412]  ret_from_fork+0x1f/0x30
[ 6268.714423]
[ 6268.714435] read to 0xffff8881eda8efac of 4 bytes by task 950 on cpu 2:
[ 6268.714664]  rps_work+0xc2/0x680 [i915]
[ 6268.714679]  process_one_work+0x3b1/0x690
[ 6268.714693]  worker_thread+0x80/0x670
[ 6268.714707]  kthread+0x19a/0x1e0
[ 6268.714720]  ret_from_fork+0x1f/0x30

v2: Mark all reads and writes of rpm->pm_events.

The flow of enabling/disabling rps is stronly ordered, so the writes and
interrupt generation are also strongly ordered -- just this may not be
visible to the compiler, so provide annotations.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Mika Kuoppala <mika.kuoppala@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20200311092624.10012-1-chris@chris-wilson.co.uk
2020-03-11 11:59:49 +00:00
Takashi Iwai
61f874d6e0 drm/i915/gt: Use scnprintf() for avoiding potential buffer overflow
Since snprintf() returns the would-be-output size instead of the
actual output size, the succeeding calls may go beyond the given
buffer limit.  Fix it by replacing with scnprintf().

Signed-off-by: Takashi Iwai <tiwai@suse.de>
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Link: https://patchwork.freedesktop.org/patch/msgid/20200311073256.6535-1-tiwai@suse.de
2020-03-11 10:54:59 +00:00
Chris Wilson
3a55dc895e drm/i915/execlists: Mark up data-races in virtual engines
The virtual engine passes tokens back and forth to its backing physical
engines.

[   57.372993] BUG: KCSAN: data-race in execlists_dequeue [i915] / virtual_submission_tasklet [i915]
[   57.373012]
[   57.373023] write to 0xffff8881f47324c0 of 4 bytes by interrupt on cpu 2:
[   57.373241]  execlists_dequeue+0x6fa/0x2150 [i915]
[   57.373458]  __execlists_submission_tasklet+0x48/0x60 [i915]
[   57.373677]  execlists_submission_tasklet+0xd3/0x170 [i915]
[   57.373694]  tasklet_action_common.isra.0+0x42/0xa0
[   57.373709]  __do_softirq+0xd7/0x2cd
[   57.373723]  irq_exit+0xbe/0xe0
[   57.373735]  do_IRQ+0x51/0x100
[   57.373748]  ret_from_intr+0x0/0x1c
[   57.373963]  engine_retire+0x89/0xe0 [i915]
[   57.373977]  process_one_work+0x3b1/0x690
[   57.373990]  worker_thread+0x80/0x670
[   57.374004]  kthread+0x19a/0x1e0
[   57.374017]  ret_from_fork+0x1f/0x30
[   57.374027]
[   57.374038] read to 0xffff8881f47324c0 of 4 bytes by interrupt on cpu 3:
[   57.374256]  virtual_submission_tasklet+0x27/0x5a0 [i915]
[   57.374273]  tasklet_action_common.isra.0+0x42/0xa0
[   57.374288]  __do_softirq+0xd7/0x2cd
[   57.374302]  run_ksoftirqd+0x15/0x20
[   57.374315]  smpboot_thread_fn+0x1ab/0x300
[   57.374329]  kthread+0x19a/0x1e0
[   57.374342]  ret_from_fork+0x1f/0x30

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Mika Kuoppala <mika.kuoppala@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20200310141320.24149-3-chris@chris-wilson.co.uk
2020-03-10 23:12:39 +00:00
Chris Wilson
326611ddff drm/i915: Mark up racy read of active rq->engine
As a virtual engine may change the rq->engine to point to the active
request in flight, we need to warn the compiler that an active request's
engine is volatile.

[   95.017686] write (marked) to 0xffff8881e8386b10 of 8 bytes by interrupt on cpu 2:
[   95.018123]  execlists_dequeue+0x762/0x2150 [i915]
[   95.018539]  __execlists_submission_tasklet+0x48/0x60 [i915]
[   95.018955]  execlists_submission_tasklet+0xd3/0x170 [i915]
[   95.018986]  tasklet_action_common.isra.0+0x42/0xa0
[   95.019016]  __do_softirq+0xd7/0x2cd
[   95.019043]  irq_exit+0xbe/0xe0
[   95.019068]  irq_work_interrupt+0xf/0x20
[   95.019491]  i915_request_retire+0x2c5/0x670 [i915]
[   95.019937]  retire_requests+0xa1/0xf0 [i915]
[   95.020348]  engine_retire+0xa1/0xe0 [i915]
[   95.020376]  process_one_work+0x3b1/0x690
[   95.020403]  worker_thread+0x80/0x670
[   95.020429]  kthread+0x19a/0x1e0
[   95.020454]  ret_from_fork+0x1f/0x30
[   95.020476]
[   95.020498] read to 0xffff8881e8386b10 of 8 bytes by task 8909 on cpu 3:
[   95.020918]  __i915_request_commit+0x177/0x220 [i915]
[   95.021329]  i915_gem_do_execbuffer+0x38c4/0x4e50 [i915]
[   95.021750]  i915_gem_execbuffer2_ioctl+0x2c3/0x580 [i915]
[   95.021784]  drm_ioctl_kernel+0xe4/0x120
[   95.021809]  drm_ioctl+0x297/0x4c7
[   95.021832]  ksys_ioctl+0x89/0xb0
[   95.021865]  __x64_sys_ioctl+0x42/0x60
[   95.021901]  do_syscall_64+0x6e/0x2c0
[   95.021927]  entry_SYSCALL_64_after_hwframe+0x44/0xa9

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Mika Kuoppala <mika.kuoppala@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20200310142403.5953-1-chris@chris-wilson.co.uk
2020-03-10 23:12:38 +00:00
Chris Wilson
0690e504b6 drm/i915/gt: Mark up racy reads for intel_context.inflight
When being used across multiple real engines inside a virtual engine,
the intel_context.inflight is updated atomically, and so we must
annotate the racy read from outside the owning context.

[11142.482846] BUG: KCSAN: data-race in __execlists_submission_tasklet [i915] / __execlists_submission_tasklet [i915]
[11142.482867]
[11142.482878] write (marked) to 0xffff8881f257b5e0 of 8 bytes by interrupt on cpu 2:
[11142.483107]  __execlists_submission_tasklet+0x1d33/0x2120 [i915]
[11142.483336]  execlists_submission_tasklet+0xd3/0x170 [i915]
[11142.483355]  tasklet_action_common.isra.0+0x42/0xa0
[11142.483371]  __do_softirq+0xd7/0x2cd
[11142.483384]  irq_exit+0xbe/0xe0
[11142.483401]  do_IRQ+0x51/0x100
[11142.483424]  ret_from_intr+0x0/0x1c
[11142.483446]  do_idle+0x133/0x1f0
[11142.483465]  cpu_startup_entry+0x14/0x16
[11142.483483]  start_secondary+0x120/0x180
[11142.483498]  secondary_startup_64+0xa4/0xb0
[11142.483512]
[11142.483528] read to 0xffff8881f257b5e0 of 8 bytes by interrupt on cpu 1:
[11142.483755]  __execlists_submission_tasklet+0x14e/0x2120 [i915]
[11142.483981]  execlists_submission_tasklet+0xd3/0x170 [i915]
[11142.483999]  tasklet_action_common.isra.0+0x42/0xa0
[11142.484014]  __do_softirq+0xd7/0x2cd
[11142.484028]  do_softirq_own_stack+0x2a/0x40
[11142.484046]  do_softirq.part.0+0x26/0x30
[11142.484071]  __local_bh_enable_ip+0x46/0x50
[11142.484299]  i915_gem_do_execbuffer+0x39c1/0x4e50 [i915]
[11142.484528]  i915_gem_execbuffer2_ioctl+0x2c3/0x580 [i915]
[11142.484546]  drm_ioctl_kernel+0xe4/0x120
[11142.484559]  drm_ioctl+0x297/0x4c7
[11142.484572]  ksys_ioctl+0x89/0xb0
[11142.484586]  __x64_sys_ioctl+0x42/0x60
[11142.484610]  do_syscall_64+0x6e/0x2c0
[11142.484627]  entry_SYSCALL_64_after_hwframe+0x44/0xa9

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Mika Kuoppala <mika.kuoppala@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20200310141320.24149-1-chris@chris-wilson.co.uk
2020-03-10 23:12:38 +00:00
Chris Wilson
f494960d5e drm/i915/gt: Defend against concurrent updates to execlists->active
[  206.875637] BUG: KCSAN: data-race in __i915_schedule+0x7fc/0x930 [i915]
[  206.875654]
[  206.875666] race at unknown origin, with read to 0xffff8881f7644480 of 8 bytes by task 703 on cpu 3:
[  206.875901]  __i915_schedule+0x7fc/0x930 [i915]
[  206.876130]  __bump_priority+0x63/0x80 [i915]
[  206.876361]  __i915_sched_node_add_dependency+0x258/0x300 [i915]
[  206.876593]  i915_sched_node_add_dependency+0x50/0xa0 [i915]
[  206.876824]  i915_request_await_dma_fence+0x1da/0x530 [i915]
[  206.877057]  i915_request_await_object+0x2fe/0x470 [i915]
[  206.877287]  i915_gem_do_execbuffer+0x45dc/0x4c20 [i915]
[  206.877517]  i915_gem_execbuffer2_ioctl+0x2c3/0x580 [i915]
[  206.877535]  drm_ioctl_kernel+0xe4/0x120
[  206.877549]  drm_ioctl+0x297/0x4c7
[  206.877563]  ksys_ioctl+0x89/0xb0
[  206.877577]  __x64_sys_ioctl+0x42/0x60
[  206.877591]  do_syscall_64+0x6e/0x2c0
[  206.877606]  entry_SYSCALL_64_after_hwframe+0x44/0xa9

v2: Be safe and include mb

References: https://gitlab.freedesktop.org/drm/intel/issues/1318
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Mika Kuoppala <mika.kuoppala@linux.intel.com>
Reviewed-by: Mika Kuoppala <mika.kuoppala@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20200309170540.10332-1-chris@chris-wilson.co.uk
2020-03-09 20:38:57 +00:00
Chris Wilson
ff34527103 drm/i915/gt: Mark up intel_rps.active for racy reads
We read the current state of intel_rps.active outside of the lock, so
mark up the racy access.

[  525.037073] BUG: KCSAN: data-race in intel_rps_boost [i915] / intel_rps_park [i915]
[  525.037091]
[  525.037103] write to 0xffff8881f145efa1 of 1 bytes by task 192 on cpu 2:
[  525.037331]  intel_rps_park+0x72/0x230 [i915]
[  525.037552]  __gt_park+0x61/0xa0 [i915]
[  525.037771]  ____intel_wakeref_put_last+0x42/0x90 [i915]
[  525.037991]  __intel_wakeref_put_work+0xd3/0xf0 [i915]
[  525.038008]  process_one_work+0x3b1/0x690
[  525.038022]  worker_thread+0x80/0x670
[  525.038037]  kthread+0x19a/0x1e0
[  525.038051]  ret_from_fork+0x1f/0x30
[  525.038062]
[  525.038074] read to 0xffff8881f145efa1 of 1 bytes by task 733 on cpu 3:
[  525.038304]  intel_rps_boost+0x67/0x1f0 [i915]
[  525.038535]  i915_request_wait+0x562/0x5d0 [i915]
[  525.038764]  i915_gem_object_wait_fence+0x81/0xa0 [i915]
[  525.038994]  i915_gem_object_wait_reservation+0x489/0x520 [i915]
[  525.039224]  i915_gem_wait_ioctl+0x167/0x2b0 [i915]
[  525.039241]  drm_ioctl_kernel+0xe4/0x120
[  525.039255]  drm_ioctl+0x297/0x4c7
[  525.039269]  ksys_ioctl+0x89/0xb0
[  525.039282]  __x64_sys_ioctl+0x42/0x60
[  525.039296]  do_syscall_64+0x6e/0x2c0
[  525.039311]  entry_SYSCALL_64_after_hwframe+0x44/0xa9

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Mika Kuoppala <mika.kuoppala@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20200309113623.24208-1-chris@chris-wilson.co.uk
2020-03-09 18:30:20 +00:00