Extend disabling SAMPLER_STATE prefetch workaround to gen12.
v2: Limit the WA to TGL A0 and update the WA no(Chris)
BSpec: 52890
Cc: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Mika Kuoppala <mika.kuoppala@linux.intel.com>
Signed-off-by: Radhakrishna Sripada <radhakrishna.sripada@intel.com>
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Link: https://patchwork.freedesktop.org/patch/msgid/20191113231953.24853-1-radhakrishna.sripada@intel.com
Still the saga of the hsw live_blt incoherency continues. While it did
seem that the invalidate before the breadcrumb had improved the mtbf,
nevertheless live_blt still failed. Mika's next idea was to pull the
invalidate-stall into the breadcrumb write itself.
References: 860afa0868 ("drm/i915/gt: Flush gen7 even harder")
References: https://bugs.freedesktop.org/show_bug.cgi?id=112147
Testcase: igt/i915_selftest/live_blt
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Mika Kuoppala <mika.kuoppala@linux.intel.com>
Acked-by: Mika Kuoppala <mika.kuoppala@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20191113151956.32242-1-chris@chris-wilson.co.uk
The bspec was just updated with a minor correction to entry 61 (it
shouldn't have had the SCF bit set).
v2:
- Add a MOCS_ENTRY_UNUSED() and use it to declare the
explicitly-reserved MOCS entries. (Lucas)
- Move the warning suppression from the Makefile to a #pragma that only
affects the TGL table. (Lucas)
v3:
- Entries 16 and 17 are identical to ICL now, so no need to explicitly
adjust them (or mess with compiler warning overrides).
Bspec: 45101
Fixes: 2ddf992179 ("drm/i915/tgl: Define MOCS entries for Tigerlake")
Cc: Tomasz Lis <tomasz.lis@intel.com>
Cc: Lucas De Marchi <lucas.demarchi@intel.com>
Cc: Francisco Jerez <francisco.jerez.plata@intel.com>
Cc: Jon Bloomfield <jon.bloomfield@intel.com>
Signed-off-by: Matt Roper <matthew.d.roper@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20191112224757.25116-2-matthew.d.roper@intel.com
Reviewed-by: Francisco Jerez <currojerez@riseup.net>
Reviewed-by: Lucas De Marchi <lucas.demarchi@intel.com>
Reviewed-by: Tomasz Lis <tomasz.lis@intel.com>
This reverts commit f4071997f1.
These extra EHL entries won't behave as expected without a bit more work
on the kernel side so let's drop them until that kernel work has had a
chance to land. Userspace trying to use these new entries won't get the
advantage of the new functionality these entries are meant to provide,
but at least it won't misbehave.
When we do add these back in the future, we'll probably want to
explicitly use separate tables for ICL and EHL so that userspace
software that mistakenly uses these entries (which are undefined on ICL)
sees the same behavior it sees with all the other undefined entries.
Cc: Francisco Jerez <francisco.jerez.plata@intel.com>
Cc: Jon Bloomfield <jon.bloomfield@intel.com>
Cc: Lucas De Marchi <lucas.demarchi@intel.com>
Cc: <stable@vger.kernel.org> # v5.3+
Fixes: f4071997f1 ("drm/i915/ehl: Update MOCS table for EHL")
Signed-off-by: Matt Roper <matthew.d.roper@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20191112224757.25116-1-matthew.d.roper@intel.com
Reviewed-by: Francisco Jerez <currojerez@riseup.net>
live_blt is still failing on hsw, showing the hallmark of incoherency.
Since we are fairly certain that the interrupt is after the seqno is
visible, the other possibility is that the seqno is before the writes to
memory are flushed. Throw in an TLB invalidate before the breadcrumb as
we are reasonably confident that forces a CS stall.
References: f9228f7658 ("drm/i915/gt: Try an extra flush on the Haswell blitter")
References: https://bugs.freedesktop.org/show_bug.cgi?id=112147
Testcase: igt/i915_selftest/live_blt
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Mika Kuoppala <mika.kuoppala@linux.intel.com>
Acked-by: Mika Kuoppala <mika.kuoppala@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20191112160941.23969-1-chris@chris-wilson.co.uk
On gen7, including Haswell, the MI_FLUSH_DW command is not synchronous
with the command streamer nor is there an option to make it so. To hide
this we add a large delay on the CS so that the breadcrumb should always
be visible before the interrupt. However, that does not seem to be
enough to ensure the memory is actually coherent with the read of the
breadcrumb. The breadcrumb update is a post-sync op... Throw in a
preliminary MI_FLUSH_DW before the breadcrumb update in the hope that
helps.
References: https://bugs.freedesktop.org/show_bug.cgi?id=112147
Testcase: igt/i915_selftest/live_blt
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Acked-by: Mika Kuoppala <mika.kuoppala@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20191111120957.17732-1-chris@chris-wilson.co.uk
Some basic information that is useful to know, such as how many cycles
is a MI_NOOP.
v2: Keep volatile pages pinned at all times! (Matthew)
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Anna Karas <anna.karas@intel.com>
Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Reviewed-by: Matthew Auld <matthew.auld@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20191111172716.23733-1-chris@chris-wilson.co.uk
The gem_ctx_persistence/smoketest was detecting an odd coherency issue
inside the LRC context image; that the address of the ring buffer did
not match our associated struct intel_ring. As we set the address into
the context image when we pin the ring buffer into place before the
context is active, that leaves the question of where did it get
overwritten. Either the HW context save occurred after our pin which
would imply that our idle barriers are broken, or we overwrote the
context image ourselves. It is only in reset_active() where we dabble
inside the context image outside of a serialised path from schedule-out;
but we could equally perform the operation inside schedule-in which is
then fully serialised with the context pin -- and remains serialised by
the engine pulse with kill_context(). (The only downside, aside from
doing more work inside the engine->active.lock, was the plan to merge
all the reset paths into doing their context scrubbing on schedule-out
needs more thought.)
Fixes: d12acee84f ("drm/i915/execlists: Cancel banned contexts on schedule-out")
Testcase: igt/gem_ctx_persistence/smoketest
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Tvrtko Ursulin <tvrtko.ursulin@linux.intel.com>
Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20191111133205.11590-3-chris@chris-wilson.co.uk
Having been forced to reduce Braswell back to using the aliasing ppgtt,
the coherency issue we previously observed cannot impact us. Reduce the
performance penalty imposed on all platforms from using the mfence to a
mere sfence.
References: cf66b8a0ba ("drm/i915/execlists: Apply a full mb before execution for Braswell")
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Mika Kuoppala <mika.kuoppala@linux.intel.com>
Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Reviewed-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20191110185806.17413-10-chris@chris-wilson.co.uk
If we detect a hang in a closed context, just flush all of its requests
and cancel any remaining execution along the context. Note that after
closing the context, the last reference to the context may be dropped,
leaving it only valid under RCU.
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Mika Kuoppala <mika.kuoppala@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20191111114323.5833-5-chris@chris-wilson.co.uk
We mention that we are resetting the GPU, and dump the device state for
post mortem debugging. However, while that dump contains the active
processes and the one flagged as causing the error, we do not always
include that information in dmesg. Include the name of the guilty
process in dmesg for reference.
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Mika Kuoppala <mika.kuoppala@linux.intel.com>
Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Reviewed-by: Mika Kuoppala <mika.kuoppala@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20191111114323.5833-4-chris@chris-wilson.co.uk
Inside print_request(), we query the context/timeline name. Nothing
immediately protects the context from being freed if the request is
complete -- we rely on serialisation by the caller to keep the name
valid until they finish using it. Inside intel_engine_dump(), we
generally only print the requests in the execution queue protected by the
engine->active.lock, but we also show the pending execlists ports which
are not protected and so require a rcu_read_lock to keep the pointer
valid.
[ 1695.700883] BUG: KASAN: use-after-free in i915_fence_get_timeline_name+0x53/0x90 [i915]
[ 1695.700981] Read of size 8 at addr ffff8887344f4d50 by task gem_ctx_persist/2968
[ 1695.701068]
[ 1695.701156] CPU: 1 PID: 2968 Comm: gem_ctx_persist Tainted: G U 5.4.0-rc6+ #331
[ 1695.701246] Hardware name: Intel Corporation NUC7i5BNK/NUC7i5BNB, BIOS BNKBL357.86A.0052.2017.0918.1346 09/18/2017
[ 1695.701334] Call Trace:
[ 1695.701424] dump_stack+0x5b/0x90
[ 1695.701870] ? i915_fence_get_timeline_name+0x53/0x90 [i915]
[ 1695.701964] print_address_description.constprop.7+0x36/0x50
[ 1695.702408] ? i915_fence_get_timeline_name+0x53/0x90 [i915]
[ 1695.702856] ? i915_fence_get_timeline_name+0x53/0x90 [i915]
[ 1695.702947] __kasan_report.cold.10+0x1a/0x3a
[ 1695.703390] ? i915_fence_get_timeline_name+0x53/0x90 [i915]
[ 1695.703836] i915_fence_get_timeline_name+0x53/0x90 [i915]
[ 1695.704241] print_request+0x82/0x2e0 [i915]
[ 1695.704638] ? fwtable_read32+0x133/0x360 [i915]
[ 1695.705042] ? write_timestamp+0x110/0x110 [i915]
[ 1695.705133] ? _raw_spin_lock_irqsave+0x79/0xc0
[ 1695.705221] ? refcount_inc_not_zero_checked+0x91/0x110
[ 1695.705306] ? refcount_dec_and_mutex_lock+0x50/0x50
[ 1695.705709] ? intel_engine_find_active_request+0x202/0x230 [i915]
[ 1695.706115] intel_engine_dump+0x2c9/0x900 [i915]
Fixes: c36eebd9ba ("drm/i915/gt: execlists->active is serialised by the tasklet")
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Mika Kuoppala <mika.kuoppala@linux.intel.com>
Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Reviewed-by: Mika Kuoppala <mika.kuoppala@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20191111114323.5833-1-chris@chris-wilson.co.uk
After doing some measuring, Icelake behaves on a par with Broadwell, and
without having to compromise for low power cores with long latencies, we
can reduce the powergating hysteresis so that the powersaving is enabled
faster. No impact observed on client side throughput measures (so
negligible increase in extra switching), and inspection from high
frequency polling using igt/gem_exec_nop/sequential, provided an estimate
for the upper bound before we can measure a substantial impact on
latency.
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Mika Kuoppala <mika.kuoppala@linux.intel.com>
Reviewed-by: Mika Kuoppala <mika.kuoppala@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20191110185806.17413-9-chris@chris-wilson.co.uk
Since drm provided us with a real struct file we can use for our
anonymous internal clients (mock_file), complete our transition to using
that as the primary interface (and not the mocked up struct drm_file we
previous were using).
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Matthew Auld <matthew.auld@intel.com>
Reviewed-by: Matthew Auld <matthew.auld@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20191107213929.23286-1-chris@chris-wilson.co.uk
The headers in the gem/selftests/, gt/selftests, gvt/, selftests/
directories have never been compile-tested, but it would be possible
to make them self-contained.
This commit only addresses missing <linux/types.h> and forward
struct declarations.
Signed-off-by: Masahiro Yamada <yamada.masahiro@socionext.com>
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Link: https://patchwork.freedesktop.org/patch/msgid/20191108094142.25942-1-yamada.masahiro@socionext.com
As drm now exports a method to create an anonymous struct file around a
drm_device for internal use, make use of it to avoid our horrible hacks.
Danial suggested that the mock_file_put() wrapper was suitable for
drm-core, along with the mock_drm_getfile() [and that the vestigal
mock_drm_file() in this patch should perhaps be the drm interface
itself]. However, the eventual goal is to remove the mock_drm_file() and
use the struct file and fput() directly, in this patch we take a simple
transition in that direction.
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Matthew Auld <matthew.auld@intel.com>
Reviewed-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Link: https://patchwork.freedesktop.org/patch/msgid/20191107180601.30815-3-chris@chris-wilson.co.uk
Only add the engine to the available set of uabi engines once it has
been fully initialised and we know we want it in the public set.
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Cc: Michał Wajdeczko <michal.wajdeczko@intel.com>
Cc: Andi Shyti <andi.shyti@intel.com>
Acked-by: Andi Shyti <andi.shyti@intel.com>
Reviewed-by: Mika Kuoppala <mika.kuoppala@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20191107081252.10542-17-chris@chris-wilson.co.uk
Before we grab the engine wakeref, tidy up the previous heartbeat
request. If we then abort because the engine powerwell is off, we ensure
the request is freed as we know we will not have freed it when
cancelling the work (as the work is running!).
Fixes: 841e867286 ("drm/i915/gt: Only drop heartbeat.systole if the sole owner")
References: 058179e72e ("drm/i915/gt: Replace hangcheck by heartbeats")
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Mika Kuoppala <mika.kuoppala@linux.intel.com>
Reviewed-by: Mika Kuoppala <mika.kuoppala@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20191106223410.30334-1-chris@chris-wilson.co.uk
Mika spotted that only using cancel_delayed_work() could mean that we
attempted to clear the heartbeat.systole while the worker was still
running. Rectify the situation by only touching the systole from outside
the worker if we suceeded in cancelling the worker before it could run.
The worker is expected to clean up by itself upon idling.
Reported-by: Mika Kuoppala <mika.kuoppala@linux.intel.com>
Fixes: 058179e72e ("drm/i915/gt: Replace hangcheck by heartbeats")
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Mika Kuoppala <mika.kuoppala@linux.intel.com>
Reviewed-by: Mika Kuoppala <mika.kuoppala@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20191106133129.17732-1-chris@chris-wilson.co.uk
The counter is removed from the pm wakeref count, but it remains intact
so that we can restore it upon resume. Ergo inside suspend, it may have
a value.
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Acked-by: Andi Shyti <andi.shyti@intel.com>
Reviewed-by: Mika Kuoppala <mika.kuoppala@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20191104090158.2959-1-chris@chris-wilson.co.uk
Currently we shutdown rc6 during i915_gem_resume() but this is called
during the preparation phase (i915_drm_prepare) for all suspend paths,
but we only want to shutdown rc6 for S3+. Move the actual shutdown to
i915_gem_suspend_late().
We then need to differentiate between suspend targets, to distinguish S0
(s2idle) where the device is kept awake but needs to be in a low power
mode (the same as runtime suspend) from the device suspend levels where
we lose control of HW and so must disable any HW access to dangling
memory.
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=111909
Fixes: c113236718 ("drm/i915: Extract GT render sleep (rc6) management")
Testcase: igt/gem_exec_suspend/power-S0
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Andi Shyti <andi.shyti@intel.com>
Acked-by: Andi Shyti <andi.shyti@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20191101141009.15581-4-chris@chris-wilson.co.uk
Our timelines are currently contained within an intel_gt, and we only
need to perform list/spinlock initialisation, so we can pull the
intel_timelines_init() into our intel_gt_init_early().
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20191101130406.4142-1-chris@chris-wilson.co.uk
An interesting observation made with our parallel selftests was that on
our small/single cpu systems we would call kthread_stop() before the
kthreads were spawned. If this happens, the kthread is never run at all;
completely bypassing the test.
A simple yield() from the parent will ensure that all children have the
opportunity to start before we reap them.
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Matthew Auld <matthew.auld@intel.com>
Reviewed-by: Matthew Auld <matthew.auld@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20191101084940.31838-1-chris@chris-wilson.co.uk
Recent GuC doesn't require the shared area. We still have one user in
i915 (engine reset via guc) because we haven't updated the command to
match the current guc submission flow [1]. Since the flow in guc is
about to change again, just disable the command for now and add a note
that we'll implement it as part of the new flow.
[1] https://patchwork.freedesktop.org/patch/295038/
Signed-off-by: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com>
Cc: Michal Wajdeczko <michal.wajdeczko@intel.com>
Cc: John Harrison <John.C.Harrison@Intel.com>
Cc: Matthew Brost <matthew.brost@intel.com>
Cc: Fernando Pacheco <fernando.pacheco@intel.com>
Acked-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Matthew Brost <matthew.brost@intel.com>
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Link: https://patchwork.freedesktop.org/patch/msgid/20191031013040.25803-2-daniele.ceraolospurio@intel.com
Recent GuC binaries (including all the ones we're currently using)
don't require this shared area anymore, having moved the relevant
entries into the stage pool instead. i915 itself doesn't write
anything into it either, so we can safely drop it.
Signed-off-by: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com>
Cc: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Michal Wajdeczko <michal.wajdeczko@intel.com>
Cc: John Harrison <John.C.Harrison@Intel.com>
Cc: Matthew Brost <matthew.brost@intel.com>
Acked-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Matthew Brost <matthew.brost@intel.com>
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Link: https://patchwork.freedesktop.org/patch/msgid/20191031013040.25803-1-daniele.ceraolospurio@intel.com
When checking the heartbeat pulse, we expect it to have been sent by the
time we have slept. We can verify this by checking the engine serial
number to see if that matches the predicted pulse serial. It will always
be true if, and only if, the pulse was sent by itself (as designed by
the test).
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Mika Kuoppala <mika.kuoppala@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20191031094259.23028-1-chris@chris-wilson.co.uk
GuC 35.2.0 and HuC 7.0.3 are the first production releases for TGL.
GuC 35.2 for Gen12 is interface-compatible with 33.0 on older Gens,
because the differences are related to additional blocks/commands in
the interface to support new Gen12 features. These parts of the
interface will be added when the relevant features are enabled.
v2: fix typos (Michal)
Signed-off-by: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com>
Cc: Michal Wajdeczko <michal.wajdeczko@intel.com>
Cc: Anusha Srivatsa <anusha.srivatsa@intel.com>
Reviewed-by: Michal Wajdeczko <michal.wajdeczko@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20191026003507.21769-1-daniele.ceraolospurio@intel.com
Execlists uses a scheduling quantum (a timeslice) to alternate execution
between ready-to-run contexts of equal priority. This ensures that all
users (though only if they of equal importance) have the opportunity to
run and prevents livelocks where contexts may have implicit ordering due
to userspace semaphores. However, not all workloads necessarily benefit
from timeslicing and in the extreme some sysadmin may want to disable or
reduce the timeslicing granularity.
The timeslicing mechanism can be compiled out^W^W disabled (but should
DCE!) with
./scripts/config --set-val DRM_I915_TIMESLICE_DURATION 0
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Reviewed-by: Mika Kuoppala <mika.kuoppala@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20191029091632.26281-1-chris@chris-wilson.co.uk
Commit 50d84418f5 ("drm/i915: Add i915 to i915_inject_probe_failure")
introduced new functions unfortunately named incompatibly with rules
established by commit f2db53f14d ("drm/i915: Replace "_load" with
"_probe" consequently"). Fix it for consistency.
Suggested-by: Michał Wajdeczko <michal.wajdeczko@intel.com>
Signed-off-by: Janusz Krzysztofik <janusz.krzysztofik@linux.intel.com>
Cc: Michał Wajdeczko <michal.wajdeczko@intel.com>
Cc: Michał Winiarski <michal.winiarski@intel.com>
Cc: Piotr Piórkowski <piotr.piorkowski@intel.com>
Cc: Tomasz Lis <tomasz.lis@intel.com>
Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Link: https://patchwork.freedesktop.org/patch/msgid/20191029102036.6326-2-janusz.krzysztofik@linux.intel.com
The design of the OA unit has been split into several units. We now
have a global unit (OAG) and a render specific unit (OAR). This leads
to some changes on how we program things. Some details :
OAR:
- has its own set of counter registers, they are per-context
saved/restored
- counters are not written to the circular OA buffer
- a snapshot of the counters can be acquired with
MI_RECORD_PERF_COUNT, or a single counter can be read with
MI_STORE_REGISTER_MEM.
OAG:
- has global counters that increment across context switches
- counters are written into the circular OA buffer (if requested)
v2: Fix checkpatch warnings on code style (Lucas)
v3: (Umesh)
- Update register from which tail, status and head are read
- Update logic to sample context reports
- Update whitelist mux and b counter regs
v4: Fix a bug when updating context image for new contexts (Umesh)
v5: Squash patch enabling save/restore of counters into context image
We want this so we can preempt performance queries and keep the
system responsive even when long running queries are ongoing. We
avoid doing it for all contexts.
- use LRI to modify context control (Chris)
- use MASKED_FIELD to program just the masked bits (Chris)
- disable save/restore of counters on cleanup (Chris)
v6: Do not use implicit parameters (Chris)
BSpec: 28727, 30021
Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Signed-off-by: Umesh Nerlige Ramappa <umesh.nerlige.ramappa@intel.com>
Signed-off-by: Lucas De Marchi <lucas.demarchi@intel.com>
Acked-by: Chris Wilson <chris.p.wilson@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20191025193746.47155-2-umesh.nerlige.ramappa@intel.com
We may be missing support for the mappable aperture on some platforms.
Signed-off-by: Matthew Auld <matthew.auld@intel.com>
Cc: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com>
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Link: https://patchwork.freedesktop.org/patch/msgid/20191029095856.25431-7-matthew.auld@intel.com
HWS placement restrictions can't just rely on HAS_LLC flag.
Signed-off-by: Michal Wajdeczko <michal.wajdeczko@intel.com>
Signed-off-by: Matthew Auld <matthew.auld@intel.com>
Cc: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com>
Acked-by: Chris Wilson <chris@chris-wilson.co.uk>
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Link: https://patchwork.freedesktop.org/patch/msgid/20191029095856.25431-5-matthew.auld@intel.com
While processing CSB there is no need to look at GuC submission
settings, just check if engine is configured for execlists mode.
While today GuC submission is disabled it's settings are still
based on modparam values that might not correctly reflect actual
submission status in case of any fallback. Until that is fully
fixed, use alternate method to confirm that engine really runs in
execlists mode by comparing set_default_submission vfunc.
v2: add other immediate use of new helper
Signed-off-by: Michal Wajdeczko <michal.wajdeczko@intel.com>
Cc: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Janusz Krzysztofik <janusz.krzysztofik@linux.intel.com>
Reviewed-by: Janusz Krzysztofik <janusz.krzysztofik@linux.intel.com>
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Link: https://patchwork.freedesktop.org/patch/msgid/20191028164520.31772-1-michal.wajdeczko@intel.com
drivers/gpu/drm/i915//gt/selftest_engine_heartbeat.c:255 live_heartbeat_fast() error: uninitialized symbol 'err'.
drivers/gpu/drm/i915//gt/selftest_engine_heartbeat.c:320 live_heartbeat_off() error: uninitialized symbol 'err'.
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Mika Kuoppala <mika.kuoppala@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20191025135943.12524-2-chris@chris-wilson.co.uk
The request's timeline will only contain requests from this context, in
order of execution. Therefore, we can simply look back along this
timeline to find the currently executing request.
If we do find that the current context has completed its last request,
that does not imply that all requests are completed in the context, so
only advance the ring->head up to the end of the known completions!
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20191028124125.25176-1-chris@chris-wilson.co.uk
As we use hard coded offsets for a few locations within the context
image, include those in the selftests to assert that they are valid.
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Mika Kuoppala <mika.kuoppala@linux.intel.com>
Reviewed-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20191028121803.29408-1-chris@chris-wilson.co.uk
ips uses clock delays as opposed to rps frequency bins. To fit the
delays into the same rps calculations, we need to invert the ips delays.
Fixes: 3e7abf8141 ("drm/i915: Extract GT render power state management")
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Andi Shyti <andi.shyti@intel.com>
Reviewed-by: Andi Shyti <andi.shyti@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20191026200917.1780-1-chris@chris-wilson.co.uk
Pull the GuC interrupt handlers out of i915_irq.c. They now use the GT
interrupt facilities rather than the central dispatch.
Based on a patch by Chris Wilson.
Signed-off-by: Andi Shyti <andi.shyti@intel.com>
Cc: Chris Wilson <chris@chris-wilson.co.uk>
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Link: https://patchwork.freedesktop.org/patch/msgid/20191024211642.7688-2-chris@chris-wilson.co.uk