Based on a sampling of a number of benchmarks across platforms, by
default opt for a much more lenient timeout so that we should not
adversely affect existing "good" clients.
640ms ought to be enough for anyone.
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=112169
Fixes: 3a7a92aba8 ("drm/i915/execlists: Force preemption")
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Cc: Eero Tamminen <eero.t.tamminen@intel.com>
Cc: Dmitry Rogozhkin <dmitry.v.rogozhkin@intel.com>
Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20191125162737.2161069-1-chris@chris-wilson.co.uk
(cherry picked from commit 5766a5ffc6)
Signed-off-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
In order to avoid some nasty mutex inversions, commit 09c5ab384f
("drm/i915: Keep rings pinned while the context is active") allowed the
intel_ring unpinning to be run concurrently with the next context
pinning it. Thus each step in intel_ring_unpin() needed to be atomic and
ordered in a nice onion with intel_ring_pin() so that the lifetimes
overlapped and were always safe.
Sadly, a few steps in intel_ring_unpin() were overlooked, such as
closing the read/write pointers of the ring and discarding the
intel_ring.vaddr, as these steps were not serialised with
intel_ring_pin() and so could leave the ring in disarray.
Fixes: 09c5ab384f ("drm/i915: Keep rings pinned while the context is active")
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Mika Kuoppala <mika.kuoppala@linux.intel.com>
Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20191118230254.2615942-6-chris@chris-wilson.co.uk
(cherry picked from commit a266bf4200)
Signed-off-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
The major drawback of commit 7e34f4e4aa ("drm/i915/gen8+: Add RC6 CTX
corruption WA") is that it disables RC6 while Skylake (and friends) is
active, and we do not consider the GPU idle until all outstanding
requests have been retired and the engine switched over to the kernel
context. If userspace is idle, this task falls onto our background idle
worker, which only runs roughly once a second, meaning that userspace has
to have been idle for a couple of seconds before we enable RC6 again.
Naturally, this causes us to consume considerably more energy than
before as powersaving is effectively disabled while a display server
(here's looking at you Xorg) is running.
As execlists will get a completion event as each context is completed,
we can use this interrupt to queue a retire worker bound to this engine
to cleanup idle timelines. We will then immediately notice the idle
engine (without userspace intervention or the aid of the background
retire worker) and start parking the GPU. Thus during light workloads,
we will do much more work to idle the GPU faster... Hopefully with
commensurate power saving!
v2: Watch context completions and only look at those local to the engine
when retiring to reduce the amount of excess work we perform.
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=112315
References: 7e34f4e4aa ("drm/i915/gen8+: Add RC6 CTX corruption WA")
References: 2248a28384 ("drm/i915/gen8+: Add RC6 CTX corruption WA")
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20191125105858.1718307-3-chris@chris-wilson.co.uk
(cherry picked from commit 4f88f8747f)
Signed-off-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
In the next patch, we will introduce a new asynchronous retirement
worker, fed by execlists CS events. Here we may queue a retirement as
soon as a request is submitted to HW (and completes instantly), and we
also want to process that retirement as early as possible and cannot
afford to postpone (as there may not be another opportunity to retire it
for a few seconds). To allow the new async retirer to run in parallel
with our submission, pull the __i915_request_queue (that passes the
request to HW) inside the timelines spinlock so that the retirement
cannot release the timeline before we have completed the submission.
v2: Actually to play nicely with engine_retire, we have to raise the
timeline.active_lock before releasing the HW. intel_gt_retire_requsts()
is still serialised by the outer lock so they cannot see this
intermediate state, and engine_retire is serialised by HW submission.
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20191125105858.1718307-2-chris@chris-wilson.co.uk
(cherry picked from commit 88a4655e75)
Signed-off-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
I rushed a last minute correction to cancel_port_requests() to prevent
the snooping of *execlists->active as the inflight array was being
updated, without noticing we iterated the inflight array starting from
active! Oops.
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=112387
Fixes: 97f9af78f3 ("drm/i915/gt: Mark the execlists->active as the primary volatile access")
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Mika Kuoppala <mika.kuoppala@linux.intel.com>
Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Reviewed-by: Andi Shyti <andi.shyti@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20191125112520.1760492-1-chris@chris-wilson.co.uk
(cherry picked from commit da0ef77e1e)
[Joonas: Fixed Fixes: tag to match drm-intel-next-fixes]
Signed-off-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Since we want to do a lockless read of the current active request, and
that request is written to by process_csb also without serialisation, we
need to instruct gcc to take care in reading the pointer itself.
Otherwise, we have observed execlists_active() to report 0x40.
[ 2400.760381] igt/para-4098 1..s. 2376479300us : process_csb: rcs0 cs-irq head=3, tail=4
[ 2400.760826] igt/para-4098 1..s. 2376479303us : process_csb: rcs0 csb[4]: status=0x00000001:0x00000000
[ 2400.761271] igt/para-4098 1..s. 2376479306us : trace_ports: rcs0: promote { b9c59:2622, b9c55:2624 }
[ 2400.761726] igt/para-4097 0d... 2376479311us : __i915_schedule: rcs0: -2147483648->3, inflight:0000000000000040, rq:ffff888208c1e940
which is impossible!
The answer is that as we keep the existing execlists->active pointing
into the array as we copy over that array, the unserialised read may see
a partial pointer value.
Fixes: df40306902 ("drm/i915/execlists: Lift process_csb() out of the irq-off spinlock")
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Mika Kuoppala <mika.kuoppala@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20191125094318.1630806-1-chris@chris-wilson.co.uk
(cherry picked from commit 331bf90591)
Signed-off-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
In commit a79ca656b6 ("drm/i915: Push the wakeref->count deferral to
the backend"), I erroneously concluded that we last modify the engine
inside __i915_request_commit() meaning that we could enable concurrent
submission for userspace as we enqueued this request. However, this
falls into a trap with other users of the engine->kernel_context waking
up and submitting their request before the idle-switch is queued, with
the result that the kernel_context is executed out-of-sequence most
likely upsetting the GPU and certainly ourselves when we try to retire
the out-of-sequence requests.
As such we need to hold onto the effective engine->kernel_context mutex
lock (via the engine pm mutex proxy) until we have finish queuing the
request to the engine.
v2: Serialise against concurrent intel_gt_retire_requests()
v3: Describe the hairy locking scheme with intel_gt_retire_requests()
for future reference.
v4: Combine timeline->lock and engine pm release; it's hairy.
Fixes: a79ca656b6 ("drm/i915: Push the wakeref->count deferral to the backend")
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Mika Kuoppala <mika.kuoppala@linux.intel.com>
Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20191120165514.3955081-2-chris@chris-wilson.co.uk
(cherry picked from commit 5cba288466)
Signed-off-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
The general concept was that intel_timeline.active_count was locked by
the intel_timeline.mutex. The exception was for power management, where
the engine->kernel_context->timeline could be manipulated under the
global wakeref.mutex.
This was quite solid, as we always manipulated the timeline only while
we held an engine wakeref.
And then we started retiring requests outside of struct_mutex, only
using the timelines.active_list and the timeline->mutex. There we
started manipulating intel_timeline.active_count outside of an engine
wakeref, and so introduced a race between __engine_park() and
intel_gt_retire_requests(), a race that could result in the
engine->kernel_context not being added to the active timelines and so
losing requests, which caused us to keep the system permanently powered
up [and unloadable].
The race would be easy to close if we could take the engine wakeref for
the timeline before we retire -- except timelines are not bound to any
engine and so we would need to keep all active engines awake. The
alternative is to guard intel_timeline_enter/intel_timeline_exit for use
outside of the timeline->mutex.
Fixes: e5dadff4b0 ("drm/i915: Protect request retirement with timeline->mutex")
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Matthew Auld <matthew.auld@intel.com>
Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20191120165514.3955081-1-chris@chris-wilson.co.uk
(cherry picked from commit a6edbca74b)
Signed-off-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Previously, we assumed we could use mutex_trylock() within an atomic
context, falling back to a worker if contended. However, such trickery
is illegal inside interrupt context, and so we need to always use a
worker under such circumstances. As we normally are in process context,
we can typically use a plain mutex, and only defer to a work when we
know we are being called from an interrupt path.
Fixes: 51fbd8de87 ("drm/i915/pmu: Atomically acquire the gt_pm wakeref")
References: a0855d24fc ("locking/mutex: Complain upon mutex API misuse in IRQ contexts")
References: https://bugs.freedesktop.org/show_bug.cgi?id=111626
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20191120125433.3767149-1-chris@chris-wilson.co.uk
(cherry picked from commit 07779a76ee)
Signed-off-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
When waiting for idle, serialise with any ongoing callback so that it
will have completed before completing the wait.
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20191118230254.2615942-12-chris@chris-wilson.co.uk
(cherry picked from commit f4ba0707c8)
Signed-off-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
pm_suspend_target_state is declared under CONFIG_PM_SLEEP but only
defined under CONFIG_SUSPEND. Play safe and only use the symbol if it is
both declared and defined.
Reported-by: kbuild-all@lists.01.org
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Fixes: a70a9e998e ("drm/i915: Defer rc6 shutdown to suspend_late")
Cc: Andi Shyti <andi.shyti@intel.com>
Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Reviewed-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20191120182209.3967833-1-chris@chris-wilson.co.uk
Signed-off-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Commit 750e76b4f9 ("drm/i915/gt: Move the [class][inst] lookup for
engines onto the GT") changed the engine query to iterate over uabi
engines but left the buffer size calculation look at the physical engine
count. Difference has no practical consequence but it is nicer to align
both queries.
Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Fixes: 750e76b4f9 ("drm/i915/gt: Move the [class][inst] lookup for engines onto the GT")
Cc: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com>
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Link: https://patchwork.freedesktop.org/patch/msgid/20191122104115.29610-1-tvrtko.ursulin@linux.intel.com
(cherry picked from commit 9acc99d8f2)
Signed-off-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
The bspec initially provided a single DKL PHY vswing table for both HDMI
and DP, but was recently updated to include an independent table for
HDMI.
Bspec: 49292
Fixes: 978c3e539b ("drm/i915/tgl: Add dkl phy programming sequences")
Cc: Clinton A Taylor <clinton.a.taylor@intel.com>
Cc: Lucas De Marchi <lucas.demarchi@intel.com>
Signed-off-by: Matt Roper <matthew.d.roper@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20191118180219.9309-1-matthew.d.roper@intel.com
Reviewed-by: José Roberto de Souza <jose.souza@intel.com>
(cherry picked from commit 362bfb995b)
Signed-off-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
The bspec was recently updated with new cdclk -> voltage level tables to
accommodate the new 324/326.4 cdclk values.
Bspec: 21809
Fixes: 63c9dae71d ("drm/i915/ehl: Add voltage level requirement table")
Cc: José Roberto de Souza <jose.souza@intel.com>
Cc: Vivek Kasireddy <vivek.kasireddy@intel.com>
Cc: Bob Paauwe <bob.j.paauwe@intel.com>
Signed-off-by: Matt Roper <matthew.d.roper@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20191118164412.26216-1-matthew.d.roper@intel.com
Reviewed-by: José Roberto de Souza <jose.souza@intel.com>
(cherry picked from commit d147483884)
Signed-off-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
From inside an active timeline in the execbuf ioctl, we may try to
reclaim some space in the GGTT. We need GGTT space for all objects on
!full-ppgtt platforms, and for context images everywhere. However, to
free up space in the GGTT we may need to remove some pinned objects
(e.g. context images) that require flushing the idle barriers to remove.
For this we use the big hammer of intel_gt_wait_for_idle()
However, commit 7936a22dd4 ("drm/i915/gt: Wait for new requests in
intel_gt_retire_requests()") will continue spinning on the wait if a
timeline is active but lacks requests, as is the case during execbuf
reservation. Spinning forever is quite time consuming, so revert that
commit and start again.
In practice, the effect commit 7936a22dd4 was trying to achieve is
accomplished by commit 1683d24c14 ("drm/i915/gt: Move new timelines
to the end of active_list"), so there is no immediate rush to replace
the looping.
Testcase: igt/gem_exec_reloc/basic-range
Fixes: a46bfdc83f ("drm/i915/gt: Wait for new requests in intel_gt_retire_requests()")
References: 1683d24c14 ("drm/i915/gt: Move new timelines to the end of active_list")
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Reviewed-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20191121071044.97798-1-chris@chris-wilson.co.uk
(cherry picked from commit 689122dcc3)
[Joonas: Corrected Fixes: tag ref to match drm-intel-next-fixes]
Signed-off-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
As we want to be able to run inside atomic context for retiring the
i915_active, and we are no longer allowed to abuse mutex_trylock, split
the tree management portion of i915_active.mutex into an irq-safe
spinlock.
References: a0855d24fc ("locking/mutex: Complain upon mutex API misuse in IRQ contexts")
References: https://bugs.freedesktop.org/show_bug.cgi?id=111626
Fixes: 274cbf20fd ("drm/i915: Push the i915_active.retire into a worker")
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Tvrtko Ursulin <tvrtko.ursulin@linux.intel.com>
Cc: Matthew Auld <matthew.auld@intel.com>
Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20191114172535.1116-1-chris@chris-wilson.co.uk
(cherry picked from commit c9ad602fea)
Signed-off-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
It's supposed to be just a const pointer.
Fixes: 074c77e3ec ("drm/i915/tgl: Gen-12 display loses Yf tiling and legacy CCS support")
Cc: Ville Syrjälä <ville.syrjala@linux.intel.com>
Cc: Dhinakaran Pandiyan <dhinakaran.pandiyan@intel.com>
Cc: Lucas De Marchi <lucas.demarchi@intel.com>
Reviewed-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Signed-off-by: Jani Nikula <jani.nikula@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20191115120440.17883-1-jani.nikula@intel.com
(cherry picked from commit 48ea97fabe)
Signed-off-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
On some platforms (e.g. KBL) that do not support GuC submission, but
the user enabled the GuC communication (e.g for HuC authentication)
calling the GuC EXIT_S_STATE action results in lose of ability to
enter RC6. We can remove the GuC suspend/resume entirely as we do
not need to save the GuC submission status.
Add intel_guc_submission_is_enabled() function to determine if
GuC submission is active.
v2: Do not suspend/resume the GuC on platforms that do not support
Guc Submission.
v3: Fix typo, move suspend logic to remove goto.
v4: Use intel_guc_submission_is_enabled() to check GuC submission
status.
v5: No need to look at engine to determine if submission is enabled.
Squash fix + intel_guc_submission_is_enabled() patch into one.
v6: Move resume check into intel_guc_resume() for symmetry.
Fix commit Fixes tag.
Reported-by: KiteStramuort <kitestramuort@autistici.org>
Reported-by: S. Zharkoff <s.zharkoff@gmail.com>
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=111594
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=111623
Fixes: ffd5ce22fa ("drm/i915/guc: Updates for GuC 32.0.3 firmware")
Cc: Michal Wajdeczko <michal.wajdeczko@intel.com>
Cc: Daniele Ceralo Spurio <daniele.ceraolospurio@intel.com>
Cc: Stuart Summers <stuart.summers@intel.com>
Cc: Chris Wilson <chris@chris-wilson.co.uk>
Tested-by: Tomas Janousek <tomi@nomi.cz>
Signed-off-by: Don Hiatt <don.hiatt@intel.com>
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Link: https://patchwork.freedesktop.org/patch/msgid/20191115231538.1249-1-don.hiatt@intel.com
(cherry picked from commit 82e0c5bbd6)
Signed-off-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Our callers fall into two categories, those passing timeout=0 who just
want to flush request retirements and those passing a timeout that need
to wait for submission completion (e.g. intel_gt_wait_for_idle()).
Currently, we only wait for a snapshot of timelines at the start of the
wait (but there was an expectation that new requests would cause timelines
to appear at the end). However, our callers, such as
intel_gt_wait_for_idle() before suspend, do require us to wait for the
power management requests emitted by retirement as well. If we don't,
then it takes an extra second or two for the background worker to flush
the queue and mark the GT as idle.
Fixes: 7e80576266 ("drm/i915: Drop struct_mutex from around i915_retire_requests()")
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20191114225736.616885-1-chris@chris-wilson.co.uk
(cherry picked from commit 7936a22dd4)
Signed-off-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
The workaround to disable coarse power gating is still needed on SKL
GT3/GT4 machines and since the RC6 context corruption was discovered by
the hardware team also on all GEN9 machines. Restore applying the
workaround.
Fixes: c113236718 ("drm/i915: Extract GT render sleep (rc6) management")
Testcase: igt/intel_gt_pm_late_selftests/live_rc6_ctx
Cc: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Andi Shyti <andi.shyti@intel.com>
Signed-off-by: Imre Deak <imre.deak@intel.com>
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Link: https://patchwork.freedesktop.org/patch/msgid/20191114152621.7235-1-imre.deak@intel.com
(cherry picked from commit 980f87a2ed)
Signed-off-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
fbdev uses the physical address of our framebuffer for its fb_mmap()
routine. While we need to adapt this address for the new io BAR, we have
to fix v5.4 first! The simplest fix is to restore the smem back to v5.3
and we will then probably have to implement our fbops->fb_mmap() callback
to handle local memory.
Reported-by: Neil MacLeod <freedesktop@nmacleod.com>
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=112256
Fixes: 5f889b9a61 ("drm/i915: Disregard drm_mode_config.fb_base")
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Daniel Vetter <daniel.vetter@ffwll.ch>
Cc: Maarten Lankhorst <maarten.lankhorst@linux.intel.com>
Tested-by: Neil MacLeod <freedesktop@nmacleod.com>
Reviewed-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20191113180633.3947-1-chris@chris-wilson.co.uk
(cherry picked from commit abc5520704)
Signed-off-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
I'm observing incoherence metric values, changing from run to run.
It appears the patches introducing noa wait & reconfiguration from
command stream switched places in the series multiple times during the
review. This lead to the dependency of one onto the order to go
missing...
Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Fixes: 15d0ace1f8 ("drm/i915/perf: execute OA configuration from command stream")
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Link: https://patchwork.freedesktop.org/patch/msgid/20191113154639.27144-1-lionel.g.landwerlin@intel.com
(cherry picked from commit 93937659dc)
Signed-off-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
io_mapping_map_atomic/kmap_atomic are occasionally taken in error capture
(if there is no aperture preallocated for the use of error capture), but
the error capture and compression routines are now run in normal
context:
<3> [113.316247] BUG: sleeping function called from invalid context at mm/page_alloc.c:4653
<3> [113.318190] in_atomic(): 1, irqs_disabled(): 0, pid: 678, name: debugfs_test
<4> [113.319900] no locks held by debugfs_test/678.
<3> [113.321002] Preemption disabled at:
<4> [113.321130] [<ffffffffa02506d4>] i915_error_object_create+0x494/0x610 [i915]
<4> [113.327259] Call Trace:
<4> [113.327871] dump_stack+0x67/0x9b
<4> [113.328683] ___might_sleep+0x167/0x250
<4> [113.329618] __alloc_pages_nodemask+0x26b/0x1110
<4> [113.334614] pool_alloc.constprop.19+0x14/0x60 [i915]
<4> [113.335951] compress_page+0x7c/0x100 [i915]
<4> [113.337110] i915_error_object_create+0x4bd/0x610 [i915]
<4> [113.338515] i915_capture_gpu_state+0x384/0x1680 [i915]
However, it is not a good idea to run the slow compression inside atomic
context, so we choose not to.
Fixes: 895d8ebeaa ("drm/i915: error capture with no ggtt slot")
Signed-off-by: Bruce Chang <yu.bruce.chang@intel.com>
Reviewed-by: Brian Welty <brian.welty@intel.com>
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Link: https://patchwork.freedesktop.org/patch/msgid/20191113231104.24208-1-yu.bruce.chang@intel.com
(cherry picked from commit 48715f7001)
Signed-off-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
TRANS_DDI_MST_TRANSPORT_SELECT is 2 bits wide not 3, it was taking
one bit from EDP/DSI Input Select.
Fixes: b3545e0868 ("drm/i915/tgl: add support to one DP-MST stream")
Cc: Lucas De Marchi <lucas.demarchi@intel.com>
Signed-off-by: José Roberto de Souza <jose.souza@intel.com>
Reviewed-by: Lucas De Marchi <lucas.demarchi@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20191107214559.77087-1-jose.souza@intel.com
(cherry picked from commit bb747fa5a9)
Signed-off-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
According to internal documents I found for CMP PCHs the PCI ID 0xA3C1
belongs to a CMP-V chipset. Based on the same docs the programming of
the PCH is compatible with that of KBP. Fix up my previous wrong
assumption accordingly using the SPT programming which in turn is the
basis for KBP.
The original bug reporter verified that this is the correct PCH
identification (the only way we'll program valid DDC pin-pair values to
the GMBUS register) and the Windows team uses the same identification
(that is using the KBP programming model for this PCH).
I filed the necessary Bspec update requests (BSpec/33734).
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=112051
Fixes: 37c92dc303 ("drm/i915: Add new CNL PCH ID seen on a CML platform")
Reported-and-tested-by: Cyrus <cyrus.lien@canonical.com>
Cc: Cyrus <cyrus.lien@canonical.com>
Cc: Timo Aaltonen <tjaalton@ubuntu.com>
Cc: José Roberto de Souza <jose.souza@intel.com>
Signed-off-by: Imre Deak <imre.deak@intel.com>
Reviewed-by: José Roberto de Souza <jose.souza@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20191112104608.24587-1-imre.deak@intel.com
(cherry picked from commit 50a5065f44)
Signed-off-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
- PMU "Frequency" is reported as accumulated cycles
- Avoid OOPS in dumb_create IOCTL when no CRTCs
- Mitigation for userptr put_pages deadlock with trylock_page
- Fix to avoid freeing heartbeat request too early
- Fix LRC coherency issue
- Fix Bugzilla #112212: Avoid screen corruption on MST
- Error path fix to unlock context on failed context VM SETPARAM
- Always consider holding preemption a privileged op in perf/OA
- Preload LUTs if the hw isn't currently using them to avoid color flash on VLV/CHV
- Protect context while grabbing its name for the request
- Don't resize aliasing ppGTT size
- Smaller fixes picked by tooling
Signed-off-by: Dave Airlie <airlied@redhat.com>
From: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20191114085213.GA6440@jlahtine-desk.ger.corp.intel.com
drivers/gpu/drm/vmwgfx/vmwgfx_surface.c:339:22:
warning: variable srf set but not used [-Wunused-but-set-variable]
'srf' is never used, so can be removed.
Signed-off-by: YueHaibing <yuehaibing@huawei.com>
Reviewed-by: Thomas Hellstrom <thellstrom@vmware.com>
Drivers like vmwgfx may want to test whether the dma page pool is present
or not. Since it's activated by default by TTM if compiled-in, define a
hidden configuration option that the driver can test for.
Cc: Christian König <christian.koenig@amd.com>
Signed-off-by: Thomas Hellstrom <thellstrom@vmware.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
This backmerges the branch that ended up in Linus' tree. It removes
all the changes for the rc6 patches from Linus' tree in favour of
a patch that is based on a large refactor that occured.
Otherwise it all looks good.
Signed-off-by: Dave Airlie <airlied@redhat.com>
In some circumstances the RC6 context can get corrupted. We can detect
this and take the required action, that is disable RC6 and runtime PM.
The HW recovers from the corrupted state after a system suspend/resume
cycle, so detect the recovery and re-enable RC6 and runtime PM.
v2: rebase (Mika)
v3:
- Move intel_suspend_gt_powersave() to the end of the GEM suspend
sequence.
- Add commit message.
v4:
- Rebased on intel_uncore_forcewake_put(i915->uncore, ...) API
change.
v5:
- Rebased on latest upstream gt_pm refactoring.
v6:
- s/i915_rc6_/intel_rc6_/
- Don't return a value from i915_rc6_ctx_wa_check().
v7:
- Rebased on latest gt rc6 refactoring.
Signed-off-by: Imre Deak <imre.deak@intel.com>
Signed-off-by: Mika Kuoppala <mika.kuoppala@linux.intel.com>
[airlied: pull this later version of this patch into drm-next
to make resolving the conflict mess easier.]
Signed-off-by: Dave Airlie <airlied@redhat.com>
If a process is interrupted while accessing the "gpu" debugfs file and
the drm device struct_mutex is contended, release() could return early
and fail to free related resources.
Note that the return value from release() is ignored.
Fixes: 4f776f4511 ("drm/msm/gpu: Convert the GPU show function to use the GPU state")
Cc: stable <stable@vger.kernel.org> # 4.18
Cc: Jordan Crouse <jcrouse@codeaurora.org>
Cc: Rob Clark <robdclark@gmail.com>
Reviewed-by: Rob Clark <robdclark@gmail.com>
Signed-off-by: Johan Hovold <johan@kernel.org>
Signed-off-by: Sean Paul <seanpaul@chromium.org>
Link: https://patchwork.freedesktop.org/patch/msgid/20191010131333.23635-2-johan@kernel.org
-----BEGIN PGP SIGNATURE-----
iQFSBAABCAA8FiEEq68RxlopcLEwq+PEeb4+QwBBGIYFAl3IqJQeHHRvcnZhbGRz
QGxpbnV4LWZvdW5kYXRpb24ub3JnAAoJEHm+PkMAQRiGOiUH+gOEDwid5OODaFAd
CggXugdFIlBZefKqGVNW5sjgX8pxFWHXuEMC8iNb6QXtQZdFrI6LFf9hhUDmzQtm
6y1LPxxEiTZjObMEsBNylb7tyzgujFHcAlp0Zro3w/HLCqmYTSP3FF46i2u6KZfL
XhkpM4X7R7qxlfpdhlfESv/ElRGocZe6SwXfC7pcPo5flFcmkdu9ijqhNd/6CZ/h
Nf9rTsD/wEDVUelFbgVN+LJzlaB0tsyc4Zbof07n8OsFZjhdEOop8gfM/kTBLcyY
6bh66SfDScdsNnC/l8csbPjSZRx+i+nQs67DyhGNnsSAFgHBZdC4Tb/2mDCwhCLR
dUvuYZc=
=1N6F
-----END PGP SIGNATURE-----
Merge v5.4-rc7 into drm-next
We have the i915 security fixes to backmerge, but first
let's clear the decks for other drivers to avoid a bigger
mess.
Signed-off-by: Dave Airlie <airlied@redhat.com>
The setting of MSA is done by the DDI .pre_enable() hook. And when we are
using MST, the MSA is only set to first mst stream by calling of
DDI .pre_eanble() hook. It raies issues to non-first mst streams.
Wrong MSA or missed MSA packets might show scrambled screen or wrong
screen.
This splits a setting of MSA to MST and SST cases. And In the MST case it
will call a setting of MSA after an allocating of Virtual Channel from
MST encoder pre_enable callback.
Cc: Ville Syrjälä <ville.syrjala@linux.intel.com>
Cc: Stanislav Lisovskiy <stanislav.lisovskiy@intel.com>
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=112212
Fixes: 0c06fa1560 ("drm/i915/dp: Add support of BT.2020 Colorimetry to DP MSA")
Fixes: d4a415dcda ("drm/i915: Fix MST oops due to MSA changes")
Signed-off-by: Gwan-gyeong Mun <gwan-gyeong.mun@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20191106212636.502471-1-gwan-gyeong.mun@intel.com
Reviewed-by: Lucas De Marchi <lucas.demarchi@intel.com>
[vsyrjala: nuke spurious newline]
Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
(cherry picked from commit bd8c9cca88)
Signed-off-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20191113125241.20547-1-ville.syrjala@linux.intel.com
The gem_ctx_persistence/smoketest was detecting an odd coherency issue
inside the LRC context image; that the address of the ring buffer did
not match our associated struct intel_ring. As we set the address into
the context image when we pin the ring buffer into place before the
context is active, that leaves the question of where did it get
overwritten. Either the HW context save occurred after our pin which
would imply that our idle barriers are broken, or we overwrote the
context image ourselves. It is only in reset_active() where we dabble
inside the context image outside of a serialised path from schedule-out;
but we could equally perform the operation inside schedule-in which is
then fully serialised with the context pin -- and remains serialised by
the engine pulse with kill_context(). (The only downside, aside from
doing more work inside the engine->active.lock, was the plan to merge
all the reset paths into doing their context scrubbing on schedule-out
needs more thought.)
Fixes: d12acee84f ("drm/i915/execlists: Cancel banned contexts on schedule-out")
Testcase: igt/gem_ctx_persistence/smoketest
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Tvrtko Ursulin <tvrtko.ursulin@linux.intel.com>
Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20191111133205.11590-3-chris@chris-wilson.co.uk
(cherry picked from commit 31b61f0ef9)
Signed-off-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
set_page_dirty says:
For pages with a mapping this should be done under the page lock
for the benefit of asynchronous memory errors who prefer a
consistent dirty state. This rule can be broken in some special
cases, but should be better not to.
Under those rules, it is only safe for us to use the plain set_page_dirty
calls for shmemfs/anonymous memory. Userptr may be used with real
mappings and so needs to use the locked version (set_page_dirty_lock).
However, following a try_to_unmap() we may want to remove the userptr and
so call put_pages(). However, try_to_unmap() acquires the page lock and
so we must avoid recursively locking the pages ourselves -- which means
that we cannot safely acquire the lock around set_page_dirty(). Since we
can't be sure of the lock, we have to risk skip dirtying the page, or
else risk calling set_page_dirty() without a lock and so risk fs
corruption.
Bugzilla: https://bugzilla.kernel.org/show_bug.cgi?id=203317
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=112012
Fixes: 5cc9ed4b9a ("drm/i915: Introduce mapping of user pages into video memory (userptr) ioctl")
References: cb6d7c7dc7 ("drm/i915/userptr: Acquire the page lock around set_page_dirty()")
References: 505a8ec7e1 ("Revert "drm/i915/userptr: Acquire the page lock around set_page_dirty()"")
References: 6dcc693bc5 ("ext4: warn when page is dirtied without buffers")
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Cc: stable@vger.kernel.org
Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20191111133205.11590-1-chris@chris-wilson.co.uk
(cherry picked from commit 0d4bbe3d40)
Signed-off-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
We report "frequencies" (actual-frequency, requested-frequency) as the
number of accumulated cycles so that the average frequency over that
period may be determined by the user. This means the units we report to
the user are Mcycles (or just M), not MHz.
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Cc: stable@vger.kernel.org
Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20191109105356.5273-1-chris@chris-wilson.co.uk
(cherry picked from commit e88866ef02)
Signed-off-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Inside print_request(), we query the context/timeline name. Nothing
immediately protects the context from being freed if the request is
complete -- we rely on serialisation by the caller to keep the name
valid until they finish using it. Inside intel_engine_dump(), we
generally only print the requests in the execution queue protected by the
engine->active.lock, but we also show the pending execlists ports which
are not protected and so require a rcu_read_lock to keep the pointer
valid.
[ 1695.700883] BUG: KASAN: use-after-free in i915_fence_get_timeline_name+0x53/0x90 [i915]
[ 1695.700981] Read of size 8 at addr ffff8887344f4d50 by task gem_ctx_persist/2968
[ 1695.701068]
[ 1695.701156] CPU: 1 PID: 2968 Comm: gem_ctx_persist Tainted: G U 5.4.0-rc6+ #331
[ 1695.701246] Hardware name: Intel Corporation NUC7i5BNK/NUC7i5BNB, BIOS BNKBL357.86A.0052.2017.0918.1346 09/18/2017
[ 1695.701334] Call Trace:
[ 1695.701424] dump_stack+0x5b/0x90
[ 1695.701870] ? i915_fence_get_timeline_name+0x53/0x90 [i915]
[ 1695.701964] print_address_description.constprop.7+0x36/0x50
[ 1695.702408] ? i915_fence_get_timeline_name+0x53/0x90 [i915]
[ 1695.702856] ? i915_fence_get_timeline_name+0x53/0x90 [i915]
[ 1695.702947] __kasan_report.cold.10+0x1a/0x3a
[ 1695.703390] ? i915_fence_get_timeline_name+0x53/0x90 [i915]
[ 1695.703836] i915_fence_get_timeline_name+0x53/0x90 [i915]
[ 1695.704241] print_request+0x82/0x2e0 [i915]
[ 1695.704638] ? fwtable_read32+0x133/0x360 [i915]
[ 1695.705042] ? write_timestamp+0x110/0x110 [i915]
[ 1695.705133] ? _raw_spin_lock_irqsave+0x79/0xc0
[ 1695.705221] ? refcount_inc_not_zero_checked+0x91/0x110
[ 1695.705306] ? refcount_dec_and_mutex_lock+0x50/0x50
[ 1695.705709] ? intel_engine_find_active_request+0x202/0x230 [i915]
[ 1695.706115] intel_engine_dump+0x2c9/0x900 [i915]
Fixes: c36eebd9ba ("drm/i915/gt: execlists->active is serialised by the tasklet")
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Mika Kuoppala <mika.kuoppala@linux.intel.com>
Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Reviewed-by: Mika Kuoppala <mika.kuoppala@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20191111114323.5833-1-chris@chris-wilson.co.uk
(cherry picked from commit fecffa4668)
Signed-off-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
The ordering of the checks in the existing code can lead to holding
preemption not being considered as privileged op.
Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Fixes: 9cd20ef780 ("drm/i915/perf: allow holding preemption on filtered ctx")
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
Link: https://patchwork.freedesktop.org/patch/msgid/20191111095308.2550-1-lionel.g.landwerlin@intel.com
(cherry picked from commit 0b0120d4c7)
Signed-off-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
When a jump_whitelist bitmap is reused, it needs to be cleared.
Currently this is done with memset() and the size calculation assumes
bitmaps are made of 32-bit words, not longs. So on 64-bit
architectures, only the first half of the bitmap is cleared.
If some whitelist bits are carried over between successive batches
submitted on the same context, this will presumably allow embedding
the rogue instructions that we're trying to reject.
Use bitmap_zero() instead, which gets the calculation right.
Fixes: f8c08d8fae ("drm/i915/cmdparser: Add support for backward jumps")
Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
Signed-off-by: Jon Bloomfield <jon.bloomfield@intel.com>
The LUTs are single buffered so in order to program them without
tearing we'd have to do it during vblank (actually to be 100%
effective it has to happen between start of vblank and frame start).
We have no proper mechanism for that at the moment so we just
defer loading them after the vblank waits have happened. That
is not quite sufficient (especially when committing multiple pipes
whose vblanks don't line up) so the LUT load will often leak into
the following frame causing tearing.
However in case the hardware wasn't previously using the LUT we
can preload it before setting the enable bit (which is double
buffered so won't tear). Let's determine if we can do such
preloading and make it happen. Slight variation between the
hardware requires some platforms specifics in the checks.
Hans is seeing ugly colored flash on VLV/CHV macchines (GPD win
and Asus T100HA) when the gamma LUT gets loaded for the first
time as the BIOS has left some junk in the LUT memory.
v2: Deal with uapi vs. hw crtc state split
s/GCM/CGM/ typo fix
Cc: Hans de Goede <hdegoede@redhat.com>
Fixes: 051a6d8d3c ("drm/i915: Move LUT programming to happen after vblank waits")
Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20191030190815.7359-1-ville.syrjala@linux.intel.com
Tested-by: Hans de Goede <hdegoede@redhat.com>
Reviewed-by: Hans de Goede <hdegoede@redhat.com>
(cherry picked from commit 0ccc42a2fd)
Signed-off-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Make sure we have a crtc before probing its primary plane's
max stride. Initially I thought we can't get this far without
crtcs, but looks like we can via the dumb_create ioctl.
Not sure if we shouldn't disable dumb buffer support entirely
when we have no crtcs, but that would require some amount of work
as the only thing currently being checked is dev->driver->dumb_create
which we'd have to convert to some device specific dynamic thing.
Cc: stable@vger.kernel.org
Reported-by: Mika Kuoppala <mika.kuoppala@linux.intel.com>
Fixes: aa5ca8b742 ("drm/i915: Align dumb buffer stride to 4k to allow for gtt remapping")
Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20191106172349.11987-1-ville.syrjala@linux.intel.com
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
(cherry picked from commit baea9ffe64)
Signed-off-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
The intel_dp_link_training.h include has no need or place in
intel_display.h. Include it in intel_display.c instead.
Cc: Manasi Navare <manasi.d.navare@intel.com>
Fixes: eadf6f9170 ("drm/i915/display/icl: Enable master-slaves in trans port sync")
Reviewed-by: Manasi Navare <manasi.d.navare@intel.com>
Signed-off-by: Jani Nikula <jani.nikula@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20191029103947.7535-1-jani.nikula@intel.com
(cherry picked from commit 3c954c418e)
Signed-off-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
When inside the lock, remember to unlock even if you want to leave
early.
Reported-by: Dan Carpenter <dan.carpenter@oracle.com>
Fixes: a4e7ccdac3 ("drm/i915: Move context management under GEM")
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20191106144155.25727-1-chris@chris-wilson.co.uk
(cherry picked from commit feba2b8146)
Signed-off-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>