On gen7, including Haswell, the MI_FLUSH_DW command is not synchronous
with the command streamer nor is there an option to make it so. To hide
this we add a large delay on the CS so that the breadcrumb should always
be visible before the interrupt. However, that does not seem to be
enough to ensure the memory is actually coherent with the read of the
breadcrumb. The breadcrumb update is a post-sync op... Throw in a
preliminary MI_FLUSH_DW before the breadcrumb update in the hope that
helps.
References: https://bugs.freedesktop.org/show_bug.cgi?id=112147
Testcase: igt/i915_selftest/live_blt
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Acked-by: Mika Kuoppala <mika.kuoppala@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20191111120957.17732-1-chris@chris-wilson.co.uk
Since we removed the pm hookup from the GT, the hook in
drm_i915_private.gem is unused. Remove it.
References: 18f3b2727f ("drm/i915: Remove pm park/unpark notifications")
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20191112113434.31088-1-chris@chris-wilson.co.uk
set_page_dirty says:
For pages with a mapping this should be done under the page lock
for the benefit of asynchronous memory errors who prefer a
consistent dirty state. This rule can be broken in some special
cases, but should be better not to.
Under those rules, it is only safe for us to use the plain set_page_dirty
calls for shmemfs/anonymous memory. Userptr may be used with real
mappings and so needs to use the locked version (set_page_dirty_lock).
However, following a try_to_unmap() we may want to remove the userptr and
so call put_pages(). However, try_to_unmap() acquires the page lock and
so we must avoid recursively locking the pages ourselves -- which means
that we cannot safely acquire the lock around set_page_dirty(). Since we
can't be sure of the lock, we have to risk skip dirtying the page, or
else risk calling set_page_dirty() without a lock and so risk fs
corruption.
Bugzilla: https://bugzilla.kernel.org/show_bug.cgi?id=203317
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=112012
Fixes: 5cc9ed4b9a ("drm/i915: Introduce mapping of user pages into video memory (userptr) ioctl")
References: cb6d7c7dc7 ("drm/i915/userptr: Acquire the page lock around set_page_dirty()")
References: 505a8ec7e1 ("Revert "drm/i915/userptr: Acquire the page lock around set_page_dirty()"")
References: 6dcc693bc5 ("ext4: warn when page is dirtied without buffers")
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Cc: stable@vger.kernel.org
Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20191111133205.11590-1-chris@chris-wilson.co.uk
(cherry picked from commit 0d4bbe3d40)
Signed-off-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
We report "frequencies" (actual-frequency, requested-frequency) as the
number of accumulated cycles so that the average frequency over that
period may be determined by the user. This means the units we report to
the user are Mcycles (or just M), not MHz.
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Cc: stable@vger.kernel.org
Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20191109105356.5273-1-chris@chris-wilson.co.uk
(cherry picked from commit e88866ef02)
Signed-off-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Inside print_request(), we query the context/timeline name. Nothing
immediately protects the context from being freed if the request is
complete -- we rely on serialisation by the caller to keep the name
valid until they finish using it. Inside intel_engine_dump(), we
generally only print the requests in the execution queue protected by the
engine->active.lock, but we also show the pending execlists ports which
are not protected and so require a rcu_read_lock to keep the pointer
valid.
[ 1695.700883] BUG: KASAN: use-after-free in i915_fence_get_timeline_name+0x53/0x90 [i915]
[ 1695.700981] Read of size 8 at addr ffff8887344f4d50 by task gem_ctx_persist/2968
[ 1695.701068]
[ 1695.701156] CPU: 1 PID: 2968 Comm: gem_ctx_persist Tainted: G U 5.4.0-rc6+ #331
[ 1695.701246] Hardware name: Intel Corporation NUC7i5BNK/NUC7i5BNB, BIOS BNKBL357.86A.0052.2017.0918.1346 09/18/2017
[ 1695.701334] Call Trace:
[ 1695.701424] dump_stack+0x5b/0x90
[ 1695.701870] ? i915_fence_get_timeline_name+0x53/0x90 [i915]
[ 1695.701964] print_address_description.constprop.7+0x36/0x50
[ 1695.702408] ? i915_fence_get_timeline_name+0x53/0x90 [i915]
[ 1695.702856] ? i915_fence_get_timeline_name+0x53/0x90 [i915]
[ 1695.702947] __kasan_report.cold.10+0x1a/0x3a
[ 1695.703390] ? i915_fence_get_timeline_name+0x53/0x90 [i915]
[ 1695.703836] i915_fence_get_timeline_name+0x53/0x90 [i915]
[ 1695.704241] print_request+0x82/0x2e0 [i915]
[ 1695.704638] ? fwtable_read32+0x133/0x360 [i915]
[ 1695.705042] ? write_timestamp+0x110/0x110 [i915]
[ 1695.705133] ? _raw_spin_lock_irqsave+0x79/0xc0
[ 1695.705221] ? refcount_inc_not_zero_checked+0x91/0x110
[ 1695.705306] ? refcount_dec_and_mutex_lock+0x50/0x50
[ 1695.705709] ? intel_engine_find_active_request+0x202/0x230 [i915]
[ 1695.706115] intel_engine_dump+0x2c9/0x900 [i915]
Fixes: c36eebd9ba ("drm/i915/gt: execlists->active is serialised by the tasklet")
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Mika Kuoppala <mika.kuoppala@linux.intel.com>
Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Reviewed-by: Mika Kuoppala <mika.kuoppala@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20191111114323.5833-1-chris@chris-wilson.co.uk
(cherry picked from commit fecffa4668)
Signed-off-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
The ordering of the checks in the existing code can lead to holding
preemption not being considered as privileged op.
Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Fixes: 9cd20ef780 ("drm/i915/perf: allow holding preemption on filtered ctx")
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
Link: https://patchwork.freedesktop.org/patch/msgid/20191111095308.2550-1-lionel.g.landwerlin@intel.com
(cherry picked from commit 0b0120d4c7)
Signed-off-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Pass around the intended intel_uncore for mmio access during stolen
setup, and avoid relying on the implicit magic I915_READ() macros.
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Matthew Auld <matthew.auld@intel.com>
Reviewed-by: Matthew Auld <matthew.auld@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20191111182143.23479-1-chris@chris-wilson.co.uk
Some basic information that is useful to know, such as how many cycles
is a MI_NOOP.
v2: Keep volatile pages pinned at all times! (Matthew)
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Anna Karas <anna.karas@intel.com>
Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Reviewed-by: Matthew Auld <matthew.auld@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20191111172716.23733-1-chris@chris-wilson.co.uk
The gem_ctx_persistence/smoketest was detecting an odd coherency issue
inside the LRC context image; that the address of the ring buffer did
not match our associated struct intel_ring. As we set the address into
the context image when we pin the ring buffer into place before the
context is active, that leaves the question of where did it get
overwritten. Either the HW context save occurred after our pin which
would imply that our idle barriers are broken, or we overwrote the
context image ourselves. It is only in reset_active() where we dabble
inside the context image outside of a serialised path from schedule-out;
but we could equally perform the operation inside schedule-in which is
then fully serialised with the context pin -- and remains serialised by
the engine pulse with kill_context(). (The only downside, aside from
doing more work inside the engine->active.lock, was the plan to merge
all the reset paths into doing their context scrubbing on schedule-out
needs more thought.)
Fixes: d12acee84f ("drm/i915/execlists: Cancel banned contexts on schedule-out")
Testcase: igt/gem_ctx_persistence/smoketest
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Tvrtko Ursulin <tvrtko.ursulin@linux.intel.com>
Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20191111133205.11590-3-chris@chris-wilson.co.uk
When a jump_whitelist bitmap is reused, it needs to be cleared.
Currently this is done with memset() and the size calculation assumes
bitmaps are made of 32-bit words, not longs. So on 64-bit
architectures, only the first half of the bitmap is cleared.
If some whitelist bits are carried over between successive batches
submitted on the same context, this will presumably allow embedding
the rogue instructions that we're trying to reject.
Use bitmap_zero() instead, which gets the calculation right.
Fixes: f8c08d8fae ("drm/i915/cmdparser: Add support for backward jumps")
Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
Signed-off-by: Jon Bloomfield <jon.bloomfield@intel.com>
Enable gup to retry and fault the pages outside of the mmap_sem lock in
our worker. As we are inside our worker, outside of any critical path,
we can allow the mmap_sem lock to be dropped in order to service a page
fault; this in turn allows the mm to populate the page using a slow
fault handler.
References: 5b56d49fc3 ("mm: add locked parameter to get_user_pages_remote()")
Testcase: igt/gem_userptr/userfault
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20191111133205.11590-2-chris@chris-wilson.co.uk
set_page_dirty says:
For pages with a mapping this should be done under the page lock
for the benefit of asynchronous memory errors who prefer a
consistent dirty state. This rule can be broken in some special
cases, but should be better not to.
Under those rules, it is only safe for us to use the plain set_page_dirty
calls for shmemfs/anonymous memory. Userptr may be used with real
mappings and so needs to use the locked version (set_page_dirty_lock).
However, following a try_to_unmap() we may want to remove the userptr and
so call put_pages(). However, try_to_unmap() acquires the page lock and
so we must avoid recursively locking the pages ourselves -- which means
that we cannot safely acquire the lock around set_page_dirty(). Since we
can't be sure of the lock, we have to risk skip dirtying the page, or
else risk calling set_page_dirty() without a lock and so risk fs
corruption.
Bugzilla: https://bugzilla.kernel.org/show_bug.cgi?id=203317
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=112012
Fixes: 5cc9ed4b9a ("drm/i915: Introduce mapping of user pages into video memory (userptr) ioctl")
References: cb6d7c7dc7 ("drm/i915/userptr: Acquire the page lock around set_page_dirty()")
References: 505a8ec7e1 ("Revert "drm/i915/userptr: Acquire the page lock around set_page_dirty()"")
References: 6dcc693bc5 ("ext4: warn when page is dirtied without buffers")
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Cc: stable@vger.kernel.org
Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20191111133205.11590-1-chris@chris-wilson.co.uk
The setting of MSA is done by the DDI .pre_enable() hook. And when we are
using MST, the MSA is only set to first mst stream by calling of
DDI .pre_eanble() hook. It raies issues to non-first mst streams.
Wrong MSA or missed MSA packets might show scrambled screen or wrong
screen.
This splits a setting of MSA to MST and SST cases. And In the MST case it
will call a setting of MSA after an allocating of Virtual Channel from
MST encoder pre_enable callback.
Cc: Ville Syrjälä <ville.syrjala@linux.intel.com>
Cc: Stanislav Lisovskiy <stanislav.lisovskiy@intel.com>
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=112212
Fixes: 0c06fa1560 ("drm/i915/dp: Add support of BT.2020 Colorimetry to DP MSA")
Fixes: d4a415dcda ("drm/i915: Fix MST oops due to MSA changes")
Signed-off-by: Gwan-gyeong Mun <gwan-gyeong.mun@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20191106212636.502471-1-gwan-gyeong.mun@intel.com
Reviewed-by: Lucas De Marchi <lucas.demarchi@intel.com>
[vsyrjala: nuke spurious newline]
Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Having been forced to reduce Braswell back to using the aliasing ppgtt,
the coherency issue we previously observed cannot impact us. Reduce the
performance penalty imposed on all platforms from using the mfence to a
mere sfence.
References: cf66b8a0ba ("drm/i915/execlists: Apply a full mb before execution for Braswell")
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Mika Kuoppala <mika.kuoppala@linux.intel.com>
Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Reviewed-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20191110185806.17413-10-chris@chris-wilson.co.uk
As the ftrace buffer is single shot, once dumped it will not update. As
such, it only provides information for the first bug and all subsequent
bugs are noise. The goal of CI is to have zero bugs, so taint the kernel
causing CI to reboot the machine; fix the bug and move on.
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Mika Kuoppala <mika.kuoppala@linux.intel.com>
Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Reviewed-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20191110185806.17413-13-chris@chris-wilson.co.uk
To test mmap_offset_exhaustion, we first have to fill the entire vma
manager leaving a single page. Don't assume that the vma manager is not
already fragment, and fill all the holes.
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Matthew Auld <matthew.auld@intel.com>
Reviewed-by: Matthew Auld <matthew.auld@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20191111122706.28292-2-chris@chris-wilson.co.uk
We report "frequencies" (actual-frequency, requested-frequency) as the
number of accumulated cycles so that the average frequency over that
period may be determined by the user. This means the units we report to
the user are Mcycles (or just M), not MHz.
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Cc: stable@vger.kernel.org
Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20191109105356.5273-1-chris@chris-wilson.co.uk
If we detect a hang in a closed context, just flush all of its requests
and cancel any remaining execution along the context. Note that after
closing the context, the last reference to the context may be dropped,
leaving it only valid under RCU.
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Mika Kuoppala <mika.kuoppala@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20191111114323.5833-5-chris@chris-wilson.co.uk
We mention that we are resetting the GPU, and dump the device state for
post mortem debugging. However, while that dump contains the active
processes and the one flagged as causing the error, we do not always
include that information in dmesg. Include the name of the guilty
process in dmesg for reference.
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Mika Kuoppala <mika.kuoppala@linux.intel.com>
Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Reviewed-by: Mika Kuoppala <mika.kuoppala@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20191111114323.5833-4-chris@chris-wilson.co.uk
Use a small char buffer inside the i915_gem_context to store the user
friendly name so that ctx->name has the same lifetime as the RCU
protected GEM context. That is, e.g. when using print_request() that
prints the timeline name (ctx->name), the name will not be prematurely
freed upon the context being closed and the last reference dropped.
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Cc: Matthew Auld <matthew.auld@intel.com>
Reviewed-by: Mika Kuoppala <mika.kuoppala@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20191111114323.5833-2-chris@chris-wilson.co.uk
Inside print_request(), we query the context/timeline name. Nothing
immediately protects the context from being freed if the request is
complete -- we rely on serialisation by the caller to keep the name
valid until they finish using it. Inside intel_engine_dump(), we
generally only print the requests in the execution queue protected by the
engine->active.lock, but we also show the pending execlists ports which
are not protected and so require a rcu_read_lock to keep the pointer
valid.
[ 1695.700883] BUG: KASAN: use-after-free in i915_fence_get_timeline_name+0x53/0x90 [i915]
[ 1695.700981] Read of size 8 at addr ffff8887344f4d50 by task gem_ctx_persist/2968
[ 1695.701068]
[ 1695.701156] CPU: 1 PID: 2968 Comm: gem_ctx_persist Tainted: G U 5.4.0-rc6+ #331
[ 1695.701246] Hardware name: Intel Corporation NUC7i5BNK/NUC7i5BNB, BIOS BNKBL357.86A.0052.2017.0918.1346 09/18/2017
[ 1695.701334] Call Trace:
[ 1695.701424] dump_stack+0x5b/0x90
[ 1695.701870] ? i915_fence_get_timeline_name+0x53/0x90 [i915]
[ 1695.701964] print_address_description.constprop.7+0x36/0x50
[ 1695.702408] ? i915_fence_get_timeline_name+0x53/0x90 [i915]
[ 1695.702856] ? i915_fence_get_timeline_name+0x53/0x90 [i915]
[ 1695.702947] __kasan_report.cold.10+0x1a/0x3a
[ 1695.703390] ? i915_fence_get_timeline_name+0x53/0x90 [i915]
[ 1695.703836] i915_fence_get_timeline_name+0x53/0x90 [i915]
[ 1695.704241] print_request+0x82/0x2e0 [i915]
[ 1695.704638] ? fwtable_read32+0x133/0x360 [i915]
[ 1695.705042] ? write_timestamp+0x110/0x110 [i915]
[ 1695.705133] ? _raw_spin_lock_irqsave+0x79/0xc0
[ 1695.705221] ? refcount_inc_not_zero_checked+0x91/0x110
[ 1695.705306] ? refcount_dec_and_mutex_lock+0x50/0x50
[ 1695.705709] ? intel_engine_find_active_request+0x202/0x230 [i915]
[ 1695.706115] intel_engine_dump+0x2c9/0x900 [i915]
Fixes: c36eebd9ba ("drm/i915/gt: execlists->active is serialised by the tasklet")
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Mika Kuoppala <mika.kuoppala@linux.intel.com>
Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Reviewed-by: Mika Kuoppala <mika.kuoppala@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20191111114323.5833-1-chris@chris-wilson.co.uk
After doing some measuring, Icelake behaves on a par with Broadwell, and
without having to compromise for low power cores with long latencies, we
can reduce the powergating hysteresis so that the powersaving is enabled
faster. No impact observed on client side throughput measures (so
negligible increase in extra switching), and inspection from high
frequency polling using igt/gem_exec_nop/sequential, provided an estimate
for the upper bound before we can measure a substantial impact on
latency.
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Mika Kuoppala <mika.kuoppala@linux.intel.com>
Reviewed-by: Mika Kuoppala <mika.kuoppala@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20191110185806.17413-9-chris@chris-wilson.co.uk
The LUTs are single buffered so in order to program them without
tearing we'd have to do it during vblank (actually to be 100%
effective it has to happen between start of vblank and frame start).
We have no proper mechanism for that at the moment so we just
defer loading them after the vblank waits have happened. That
is not quite sufficient (especially when committing multiple pipes
whose vblanks don't line up) so the LUT load will often leak into
the following frame causing tearing.
However in case the hardware wasn't previously using the LUT we
can preload it before setting the enable bit (which is double
buffered so won't tear). Let's determine if we can do such
preloading and make it happen. Slight variation between the
hardware requires some platforms specifics in the checks.
Hans is seeing ugly colored flash on VLV/CHV macchines (GPD win
and Asus T100HA) when the gamma LUT gets loaded for the first
time as the BIOS has left some junk in the LUT memory.
v2: Deal with uapi vs. hw crtc state split
s/GCM/CGM/ typo fix
Cc: Hans de Goede <hdegoede@redhat.com>
Fixes: 051a6d8d3c ("drm/i915: Move LUT programming to happen after vblank waits")
Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20191030190815.7359-1-ville.syrjala@linux.intel.com
Tested-by: Hans de Goede <hdegoede@redhat.com>
Reviewed-by: Hans de Goede <hdegoede@redhat.com>
(cherry picked from commit 0ccc42a2fd)
Signed-off-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Make sure we have a crtc before probing its primary plane's
max stride. Initially I thought we can't get this far without
crtcs, but looks like we can via the dumb_create ioctl.
Not sure if we shouldn't disable dumb buffer support entirely
when we have no crtcs, but that would require some amount of work
as the only thing currently being checked is dev->driver->dumb_create
which we'd have to convert to some device specific dynamic thing.
Cc: stable@vger.kernel.org
Reported-by: Mika Kuoppala <mika.kuoppala@linux.intel.com>
Fixes: aa5ca8b742 ("drm/i915: Align dumb buffer stride to 4k to allow for gtt remapping")
Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20191106172349.11987-1-ville.syrjala@linux.intel.com
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
(cherry picked from commit baea9ffe64)
Signed-off-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
The intel_dp_link_training.h include has no need or place in
intel_display.h. Include it in intel_display.c instead.
Cc: Manasi Navare <manasi.d.navare@intel.com>
Fixes: eadf6f9170 ("drm/i915/display/icl: Enable master-slaves in trans port sync")
Reviewed-by: Manasi Navare <manasi.d.navare@intel.com>
Signed-off-by: Jani Nikula <jani.nikula@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20191029103947.7535-1-jani.nikula@intel.com
(cherry picked from commit 3c954c418e)
Signed-off-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
When inside the lock, remember to unlock even if you want to leave
early.
Reported-by: Dan Carpenter <dan.carpenter@oracle.com>
Fixes: a4e7ccdac3 ("drm/i915: Move context management under GEM")
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20191106144155.25727-1-chris@chris-wilson.co.uk
(cherry picked from commit feba2b8146)
Signed-off-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Mika spotted that only using cancel_delayed_work() could mean that we
attempted to clear the heartbeat.systole while the worker was still
running. Rectify the situation by only touching the systole from outside
the worker if we suceeded in cancelling the worker before it could run.
The worker is expected to clean up by itself upon idling.
Reported-by: Mika Kuoppala <mika.kuoppala@linux.intel.com>
Fixes: 058179e72e ("drm/i915/gt: Replace hangcheck by heartbeats")
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Mika Kuoppala <mika.kuoppala@linux.intel.com>
Reviewed-by: Mika Kuoppala <mika.kuoppala@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20191106133129.17732-1-chris@chris-wilson.co.uk
(cherry picked from commit 841e867286)
Signed-off-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
When a custom powerplay table is provided, we need to update
the OD VDDC flag to avoid AVFS being enabled when it shouldn't be.
Bug: https://bugzilla.kernel.org/show_bug.cgi?id=205393
Reviewed-by: Evan Quan <evan.quan@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
This is to improve the performance in the compute mode
for vega10. For example, the original performance for a rocm
bandwidth test: 2G internal GPU copy, is about 99GB/s.
With the idle power features disabled dynamically, the porformance
is promoted to about 215GB/s.
Signed-off-by: Kenneth Feng <kenneth.feng@amd.com>
Reviewed-by: Evan Quan <evan.quan@amd.com>
Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
OD is not supported on Arcturus. Thus the
pp_od_clk_voltage sysfs interface is also not supported.
Signed-off-by: Evan Quan <evan.quan@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Reviewed-by: Kenneth Feng <kenneth.feng@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
It will cause modprobe atombios stuck problem in raven2 if it doesn't
allow direct upload save restore list from gfx driver.
So it needs to allow direct upload save restore list for raven2
temporarily.
Signed-off-by: changzhu <Changfeng.Zhu@amd.com>
Reviewed-by: Huang Rui <ray.huang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Make sure we have a crtc before probing its primary plane's
max stride. Initially I thought we can't get this far without
crtcs, but looks like we can via the dumb_create ioctl.
Not sure if we shouldn't disable dumb buffer support entirely
when we have no crtcs, but that would require some amount of work
as the only thing currently being checked is dev->driver->dumb_create
which we'd have to convert to some device specific dynamic thing.
Cc: stable@vger.kernel.org
Reported-by: Mika Kuoppala <mika.kuoppala@linux.intel.com>
Fixes: aa5ca8b742 ("drm/i915: Align dumb buffer stride to 4k to allow for gtt remapping")
Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20191106172349.11987-1-ville.syrjala@linux.intel.com
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
On gen7, we have to avoid concurrent access to the same mmio cacheline,
and so coordinate all mmio access with the uncore->lock. However, for
pmu, we want to avoid perturbing the system and disabling interrupts
unnecessarily, so restrict the w/a to gen7 where it is requied to
prevent machine hangs.
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20191108103511.20951-2-chris@chris-wilson.co.uk
We want to avoid taking forcewake when querying the performance stats,
as we wish to avoid perturbing the system under observation. (And with
the forcewake being kept alive for 1ms after use, sampling the frequency
from a 200Hz timer keeps forcewake 40% active.)
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20191108103511.20951-1-chris@chris-wilson.co.uk
In the selftests, where we are accessing a private ctx from within the
confines of a single test, we know that the ctx->vm pointer is static
and bounded by the lifetime of the test. We can use a simple helper to
provide the RCU annotations to keep sparse happy.
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20191107221201.30497-1-chris@chris-wilson.co.uk
Since drm provided us with a real struct file we can use for our
anonymous internal clients (mock_file), complete our transition to using
that as the primary interface (and not the mocked up struct drm_file we
previous were using).
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Matthew Auld <matthew.auld@intel.com>
Reviewed-by: Matthew Auld <matthew.auld@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20191107213929.23286-1-chris@chris-wilson.co.uk
The headers in the gem/selftests/, gt/selftests, gvt/, selftests/
directories have never been compile-tested, but it would be possible
to make them self-contained.
This commit only addresses missing <linux/types.h> and forward
struct declarations.
Signed-off-by: Masahiro Yamada <yamada.masahiro@socionext.com>
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Link: https://patchwork.freedesktop.org/patch/msgid/20191108094142.25942-1-yamada.masahiro@socionext.com
drm-fixes-5.4-2019-11-06:
amdgpu:
- Fix navi14 display issue root cause and revert workaround
- GPU reset scheduler interaction fix
- Fix fan boost on multi-GPU
- Gfx10 and sdma5 fixes for navi
- GFXOFF fix for renoir
- Add navi14 PCI ID
- GPUVM fix for arcturus
radeon:
- Port an SI power fix from amdgpu
Signed-off-by: Dave Airlie <airlied@redhat.com>
From: Alex Deucher <alexdeucher@gmail.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20191107032241.1021217-1-alexander.deucher@amd.com
- Do not use TBT type for non Type-C ports.
-----BEGIN PGP SIGNATURE-----
iQEcBAABAgAGBQJdwz2eAAoJEPpiX2QO6xPK2BgH/14HVq7euW204goERrqwRjWF
11HaSYbT2yG12vbwrahNJjSejj7VQbhN+TF9Fe221WG1R3XYig1SF72tpmfanKNG
u10BEXHxxuTVSPos8TCQmrspUHUDCYRyfzbByrL/g7i2oMuO1pIaFsKkFN8weu9h
EzEAc+h5k/PGrB0pN2Ez0mVKYnKB1WYkgUvQaziHKUUHh1okyQgpJkKzPfoiGQRq
CWNsfsy+YZ8XJJp12HucE1S8faphgusX82e9DuhWLizb0WMIJElq8wx/iIaeCNsr
IFFl1sePZMshq4LXmhz15NS6cqiOXt50BRjCCgJD1b4mFsPytbbMedekDhBlPSc=
=csHm
-----END PGP SIGNATURE-----
Merge tag 'drm-intel-fixes-2019-11-06' of git://anongit.freedesktop.org/drm/drm-intel into drm-fixes
- Fix HPD poll to avoid kworker consuming a lot of cpu cycles.
- Do not use TBT type for non Type-C ports.
Signed-off-by: Dave Airlie <airlied@redhat.com>
From: Rodrigo Vivi <rodrigo.vivi@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20191106213958.GA16525@intel.com
- Fix for a state dereference in atomic self-refresh helpers
- One compilation fix for c2p fbdev helpers
-----BEGIN PGP SIGNATURE-----
iHUEABYIAB0WIQRcEzekXsqa64kGDp7j7w1vZxhRxQUCXcPUHwAKCRDj7w1vZxhR
xfQ+AQDqa+ddvrlr9S0lAwv3R6iH8E9/uk1/PaKJEEyPFL6lQgEA7lNWpgWKuSDx
fpY3uqEhV1sNcCmBan968wjySi6BpwY=
=b9Yc
-----END PGP SIGNATURE-----
Merge tag 'drm-misc-fixes-2019-11-07-1' of git://anongit.freedesktop.org/drm/drm-misc into drm-fixes
- Some new documentation for GEM shmem madvise helpers
- Fix for a state dereference in atomic self-refresh helpers
- One compilation fix for c2p fbdev helpers
Signed-off-by: Dave Airlie <airlied@redhat.com>
From: Maxime Ripard <mripard@kernel.org>
Link: https://patchwork.freedesktop.org/patch/msgid/20191107082215.GA34850@gilmour.lan
Problem:
During GPU reset we call the GPU scheduler to suspend it's
thread, those two functions in amdgpu also suspend and resume
the sceduler for their needs but this can collide with GPU
reset in progress and accidently restart a suspended thread
before time.
Fix:
Serialize with GPU reset.
Signed-off-by: Andrey Grodzovsky <andrey.grodzovsky@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
When the sched thread is parked we assume ring_mirror_list is
not accessed from here.
Signed-off-by: Andrey Grodzovsky <andrey.grodzovsky@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
This reverts commit 89b3d86403.
We will do a proper fix in next patch.
Signed-off-by: Andrey Grodzovsky <andrey.grodzovsky@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Removes thread park/unpark hack from drm_sched_entity_fini and
by this fixes reactivation of scheduler thread while the thread
is supposed to be stopped.
v2: Per sched entity completion.
Signed-off-by: Andrey Grodzovsky <andrey.grodzovsky@amd.com>
Suggested-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
vega20 only requires all devices be set to same pstate level for low
pstate and not high.
Signed-off-by: Jonathan Kim <Jonathan.Kim@amd.com>
Reviewed-by: Evan Quan <Evan.Quan@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
the asic callback function of get_pcie_replay_count is not implement on navi asic,
it will cause null pinter error when read this interface.
Signed-off-by: Kevin Wang <kevin1.wang@amd.com>
Reviewed-by: Kent Russell <kent.russell@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
For driver reload test, it will report "can't enable
MSI (MSI-X already enabled)".
Signed-off-by: Emily Deng <Emily.Deng@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Explain fields like aper_base, agp_start etc. The definition
of those fields are confusing as they are from different view
(CPU or GPU). Add comments for easier understand.
Signed-off-by: Oak Zeng <Oak.Zeng@amd.com>
Reviewed-by: Alex Deucher <Alex.Deucher@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Rather than just specifying the bullet numbers from the bspec (e.g.,
"4.b") actually include the description of what the bspec wants us to
do. Steps can be renumbered or moved so including the description will
help us match the code up to the spec. Plus if we add support for new
platforms, some of the steps may be added/removed so more descriptive
comments will be useful for ensuring all of the bspec requirements are
met.
Cc: José Roberto de Souza <jose.souza@intel.com>
Cc: Lucas De Marchi <lucas.demarchi@intel.com>
Signed-off-by: Matt Roper <matthew.d.roper@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20191107174527.11165-1-matthew.d.roper@intel.com
Reviewed-by: Lucas De Marchi <lucas.demarchi@intel.com>
Reviewed-by: José Roberto de Souza <jose.souza@intel.com>
Whenever, we unbind (or change fence registers) on an object, we must
revoke any and all mmap_gtt using the previous bindings. Those user PTEs
point at the GGTT which know points into a new object, the wrong object.
Ergo, those PTEs must be cleared so that any user access provokes a new
page fault.
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Abdiel Janulgue <abdiel.janulgue@linux.intel.com>
Reviewed-by: Abdiel Janulgue <abdiel.janulgue@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20191107180601.30815-5-chris@chris-wilson.co.uk
Provide a utility function to create a vma corresponding to an mmap() of
our device. And use it to exercise the equivalent of userspace
performing a GTT mmap of our objects.
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Abdiel Janulgue <abdiel.janulgue@linux.intel.com>
Reviewed-by: Abdiel Janulgue <abdiel.janulgue@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20191107180601.30815-4-chris@chris-wilson.co.uk
As drm now exports a method to create an anonymous struct file around a
drm_device for internal use, make use of it to avoid our horrible hacks.
Danial suggested that the mock_file_put() wrapper was suitable for
drm-core, along with the mock_drm_getfile() [and that the vestigal
mock_drm_file() in this patch should perhaps be the drm interface
itself]. However, the eventual goal is to remove the mock_drm_file() and
use the struct file and fput() directly, in this patch we take a simple
transition in that direction.
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Matthew Auld <matthew.auld@intel.com>
Reviewed-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Link: https://patchwork.freedesktop.org/patch/msgid/20191107180601.30815-3-chris@chris-wilson.co.uk
Sometimes we need to create a struct file to wrap a drm_device, as it
the user were to have opened /dev/dri/card0 but to do so anonymously
(i.e. for internal use). Provide a utility method to create a struct
file with the drm_device->driver.fops, that wrap the drm_device.
v2: Restrict usage to selftests
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Daniel Vetter <daniel.vetter@ffwll.ch>
Reviewed-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Link: https://patchwork.freedesktop.org/patch/msgid/20191107180601.30815-2-chris@chris-wilson.co.uk
Currently, we only export symbols for drm-selftests which are either
compiled as modules or into the main drm builtin. However, if we want to
export symbols from drm.ko for the drivers' selftests, we require a
means of controlling that export separately. So we add a new Kconfig to
determine whether or not the EXPORT_SYMBOL_FOR_TESTS_ONLY() takes
effect.
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Daniel Vetter <daniel.vetter@ffwll.ch>
Reviewed-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Link: https://patchwork.freedesktop.org/patch/msgid/20191107180601.30815-1-chris@chris-wilson.co.uk
As we read the ctx->vm unlocked before cloning/exporting, we should
validate our reference is correct before returning it. We already do for
clone_vm() but were not so strict around get_ppgtt().
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Niranjana Vishwanathapura <niranjana.vishwanathapura@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20191106091312.12921-1-chris@chris-wilson.co.uk
Only add the engine to the available set of uabi engines once it has
been fully initialised and we know we want it in the public set.
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Cc: Michał Wajdeczko <michal.wajdeczko@intel.com>
Cc: Andi Shyti <andi.shyti@intel.com>
Acked-by: Andi Shyti <andi.shyti@intel.com>
Reviewed-by: Mika Kuoppala <mika.kuoppala@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20191107081252.10542-17-chris@chris-wilson.co.uk
The LUTs are single buffered so in order to program them without
tearing we'd have to do it during vblank (actually to be 100%
effective it has to happen between start of vblank and frame start).
We have no proper mechanism for that at the moment so we just
defer loading them after the vblank waits have happened. That
is not quite sufficient (especially when committing multiple pipes
whose vblanks don't line up) so the LUT load will often leak into
the following frame causing tearing.
However in case the hardware wasn't previously using the LUT we
can preload it before setting the enable bit (which is double
buffered so won't tear). Let's determine if we can do such
preloading and make it happen. Slight variation between the
hardware requires some platforms specifics in the checks.
Hans is seeing ugly colored flash on VLV/CHV macchines (GPD win
and Asus T100HA) when the gamma LUT gets loaded for the first
time as the BIOS has left some junk in the LUT memory.
v2: Deal with uapi vs. hw crtc state split
s/GCM/CGM/ typo fix
Cc: Hans de Goede <hdegoede@redhat.com>
Fixes: 051a6d8d3c ("drm/i915: Move LUT programming to happen after vblank waits")
Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20191030190815.7359-1-ville.syrjala@linux.intel.com
Tested-by: Hans de Goede <hdegoede@redhat.com>
Reviewed-by: Hans de Goede <hdegoede@redhat.com>
Add helper functions for sending the DSI compression mode and picture
parameter set data type packets. For the time being, limit the support
to using VESA DSC 1.1 and the default PPS. This may need updating if the
need arises for proprietary compression or non-default PPS, however keep
it simple for starters.
v2: Add missing EXPORT_SYMBOL
Cc: Vandita Kulkarni <vandita.kulkarni@intel.com>
Reviewed-by: Thierry Reding <treding@nvidia.com>
Signed-off-by: Jani Nikula <jani.nikula@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20191028150047.22048-5-jani.nikula@intel.com
The DCS command has been named SET_PARTIAL_ROWS in the DCS spec since
v1.02, for more than a decade. Rename the enumeration to match the spec.
v2: add comment about the rename (David Lechner)
Cc: David Lechner <david@lechnology.com>
Cc: Vandita Kulkarni <vandita.kulkarni@intel.com>
Reviewed-by: Thierry Reding <treding@nvidia.com>
Signed-off-by: Jani Nikula <jani.nikula@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20191028150047.22048-4-jani.nikula@intel.com
Rename picture parameter set (it's a long packet, not a long write) and
compression mode (it's not a DCS command) enumerations according to the
DSI specification. Order the types according to the spec. Use tabs
instead of spaces for indentation. Use all lower case for hex.
Cc: Vandita Kulkarni <vandita.kulkarni@intel.com>
Reviewed-by: Vandita Kulkarni <vandita.kulkarni@intel.com>
Reviewed-by: Thierry Reding <treding@nvidia.com>
Signed-off-by: Jani Nikula <jani.nikula@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20191028150047.22048-1-jani.nikula@intel.com
If Local memory is supported by hardware, we want framebuffer backing
gem objects from local memory.
if the backing obj is not from LMEM, pin_to_display is failed.
v2:
memory regions are correctly assigned to obj->memory_regions [tvrtko]
migration failure is reported as debug log [Tvrtko]
v3:
Migration is dropped. only error is reported [Daniel]
mem region check is move to pin_to_display [Chris]
v4:
s/dev_priv/i915 [chris]
v5:
i915_gem_object_is_lmem is used for detecting the obj mem type. [Matt]
cc: Matthew Auld <matthew.auld@intel.com>
Signed-off-by: Ramalingam C <ramalingam.c@intel.com>
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Link: https://patchwork.freedesktop.org/patch/msgid/20191105144414.30470-1-ramalingam.c@intel.com
The intel_dp_link_training.h include has no need or place in
intel_display.h. Include it in intel_display.c instead.
Cc: Manasi Navare <manasi.d.navare@intel.com>
Fixes: eadf6f9170 ("drm/i915/display/icl: Enable master-slaves in trans port sync")
Reviewed-by: Manasi Navare <manasi.d.navare@intel.com>
Signed-off-by: Jani Nikula <jani.nikula@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20191029103947.7535-1-jani.nikula@intel.com
So strictly speaking the existing annotation is also ok, because we
have a chain of
obj->mm.lock#I915_MM_GET_PAGES -> fs_reclaim -> obj->mm.lock
(the shrinker cannot get at an object while we're in get_pages, hence
this is safe). But it's confusing, so try to take the right subclass
of the lock.
This does a bit reduce our lockdep based checking, but then it's also
less fragile, in case we ever change the nesting around.
Signed-off-by: Daniel Vetter <daniel.vetter@intel.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Will Deacon <will@kernel.org>
Cc: linux-kernel@vger.kernel.org
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Signed-off-by: Maarten Lankhorst <maarten.lankhorst@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20191104173720.2696-3-daniel.vetter@ffwll.ch
The trouble with having a plain nesting flag for locks which do not
naturally nest (unlike block devices and their partitions, which is
the original motivation for nesting levels) is that lockdep will
never spot a true deadlock if you screw up.
This patch is an attempt at trying better, by highlighting a bit more
of the actual nature of the nesting that's going on. Essentially we
have two kinds of objects:
- objects without pages allocated, which cannot be on any lru and are
hence inaccessible to the shrinker.
- objects which have pages allocated, which are on an lru, and which
the shrinker can decide to throw out.
For the former type of object, memory allocations while holding
obj->mm.lock are permissible. For the latter they are not. And
get/put_pages transitions between the two types of objects.
This is still not entirely fool-proof since the rules might change.
But as long as we run such a code ever at runtime lockdep should be
able to observe the inconsistency and complain (like with any other
lockdep class that we've split up in multiple classes). But there are
a few clear benefits:
- We can drop the nesting flag parameter from
__i915_gem_object_put_pages, because that function by definition is
never going allocate memory, and calling it on an object which
doesn't have its pages allocated would be a bug.
- We strictly catch more bugs, since there's not only one place in the
entire tree which is annotated with the special class. All the
other places that had explicit lockdep nesting annotations we're now
going to leave up to lockdep again.
- Specifically this catches stuff like calling get_pages from
put_pages (which isn't really a good idea, if we can call get_pages
so could the shrinker). I've seen patches do exactly that.
Of course I fully expect CI will show me for the fool I am with this
one here :-)
v2: There can only be one (lockdep only has a cache for the first
subclass, not for deeper ones, and we don't want to make these locks
even slower). Still separate enums for better documentation.
Real fix: don't forget about phys objs and pin_map(), and fix the
shrinker to have the right annotations ... silly me.
v3: Forgot usertptr too ...
v4: Improve comment for pages_pin_count, drop the IMPORTANT comment
and instead prime lockdep (Chris).
v5: Appease checkpatch, no double empty lines (Chris)
v6: More rebasing over selftest changes. Also somehow I forgot to
push this patch :-/
Also format comments consistently while at it.
v7: Fix typo in commit message (Joonas)
Also drop the priming, with the lmem merge we now have allocations
while holding the lmem lock, which wreaks the generic priming I've
done in earlier patches. Should probably be resurrected when lmem is
fixed. See
commit 232a6ebae4
Author: Matthew Auld <matthew.auld@intel.com>
Date: Tue Oct 8 17:01:14 2019 +0100
drm/i915: introduce intel_memory_region
I'm keeping the priming patch locally so it wont get lost.
Cc: Matthew Auld <matthew.auld@intel.com>
Cc: Chris Wilson <chris@chris-wilson.co.uk>
Cc: "Tang, CQ" <cq.tang@intel.com>
Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk> (v5)
Reviewed-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com> (v6)
Signed-off-by: Daniel Vetter <daniel.vetter@intel.com>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Signed-off-by: Maarten Lankhorst <maarten.lankhorst@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20191105090148.30269-1-daniel.vetter@ffwll.ch
[mlankhorst: Fix commit typos pointed out by Michael Ruhl]
Before we grab the engine wakeref, tidy up the previous heartbeat
request. If we then abort because the engine powerwell is off, we ensure
the request is freed as we know we will not have freed it when
cancelling the work (as the work is running!).
Fixes: 841e867286 ("drm/i915/gt: Only drop heartbeat.systole if the sole owner")
References: 058179e72e ("drm/i915/gt: Replace hangcheck by heartbeats")
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Mika Kuoppala <mika.kuoppala@linux.intel.com>
Reviewed-by: Mika Kuoppala <mika.kuoppala@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20191106223410.30334-1-chris@chris-wilson.co.uk
Need to set the dte flag on this asic.
Port the fix from amdgpu:
5cb818b861 ("drm/amd/amdgpu: fix si_enable_smc_cac() failed issue")
Reviewed-by: Yong Zhao <yong.zhao@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Cc: stable@vger.kernel.org
It needs to add warning to update firmware in gfx9
in case that firmware is too old to have function to
realize dummy read in cp firmware.
Signed-off-by: changzhu <Changfeng.Zhu@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
The GRBM register interface is now capable of bursting 1 cycle per
register wr->wr, wr->rd much faster than previous muticycle per
transaction done interface. This has caused a problem where
status registers requiring HW to update have a 1 cycle delay, due
to the register update having to go through GRBM.
For cp ucode, it has realized dummy read in cp firmware.It covers
the use of WAIT_REG_MEM operation 1 case only.So it needs to call
gfx_v10_0_wait_reg_mem in gfx10. Besides it also needs to add warning to
update firmware in case firmware is too old to have function to realize
dummy read in cp firmware.
For sdma ucode, it hasn't realized dummy read in sdma firmware. sdma is
moved to gfxhub in gfx10. So it needs to add dummy read in driver
between amdgpu_ring_emit_wreg and amdgpu_ring_emit_reg_wait for sdma_v5_0.
Signed-off-by: changzhu <Changfeng.Zhu@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Otherwise, the feature enablement will be skipped due to wrong count.
Fixes: beff74bc6e ("drm/amdgpu: fix a race in GPU reset with IB test (v2)")
Signed-off-by: Evan Quan <evan.quan@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
fix workload bit (WORKLOAD_PPLIB_COMPUTE_BIT) map error
on vega20 and navi asic.
fix commit:
drm/amd/powerplay: add function get_workload_type_map for swsmu
Signed-off-by: Kevin Wang <kevin1.wang@amd.com>
Reviewed-by: Kenneth Feng <kenneth.feng@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Prefer using intel_encoder and pass the base where needed rather than
keeping both encoder and intel_encoder variables around.
v2: actually add all changes to the patch
Signed-off-by: Lucas De Marchi <lucas.demarchi@intel.com>
Reviewed-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20191106071715.10613-1-lucas.demarchi@intel.com
Another TGP ID has shown up, so let's add it to avoid South Display
breakage on systems that have this ID.
Cc: Lucas De Marchi <lucas.demarchi@intel.com>
Cc: José Roberto de Souza <jose.souza@intel.com>
Signed-off-by: James Ausmus <james.ausmus@intel.com>
Reviewed-by: José Roberto de Souza <jose.souza@intel.com>
Signed-off-by: Lucas De Marchi <lucas.demarchi@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20191106005526.1500-1-james.ausmus@intel.com
Clarify some areas, clean up formatting, add section for
unrecoverable error handling.
v2: fix grammatical errors
Reviewed-by: Yong Zhao <yong.zhao@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Need to set the dte flag on this asic.
Port the fix from amdgpu:
5cb818b861 ("drm/amd/amdgpu: fix si_enable_smc_cac() failed issue")
Reviewed-by: Yong Zhao <yong.zhao@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
The reference to object fence is dropped at the end of the loop.
However, it is dropped again outside the loop. The reference can be
dropped immediately after calling dma_fence_wait() in the loop and
thus the dropping operation outside the loop can be removed.
Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Pan Bian <bianpan2016@163.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
The object fence is not set to NULL after its reference is dropped. As a
result, its reference may be dropped again if error occurs after that,
which may lead to a use after free bug. To avoid the issue, fence is
explicitly set to NULL after dropping its reference.
Acked-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Pan Bian <bianpan2016@163.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Using unified VBIOS has performance drop in sriov environment.
The fix is switching to another register instead.
Signed-off-by: Eric Huang <JinhuiEric.Huang@amd.com>
Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
It needs to add warning to update firmware in gfx9
in case that firmware is too old to have function to
realize dummy read in cp firmware.
Signed-off-by: changzhu <Changfeng.Zhu@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
The GRBM register interface is now capable of bursting 1 cycle per
register wr->wr, wr->rd much faster than previous muticycle per
transaction done interface. This has caused a problem where
status registers requiring HW to update have a 1 cycle delay, due
to the register update having to go through GRBM.
For cp ucode, it has realized dummy read in cp firmware.It covers
the use of WAIT_REG_MEM operation 1 case only.So it needs to call
gfx_v10_0_wait_reg_mem in gfx10. Besides it also needs to add warning to
update firmware in case firmware is too old to have function to realize
dummy read in cp firmware.
For sdma ucode, it hasn't realized dummy read in sdma firmware. sdma is
moved to gfxhub in gfx10. So it needs to add dummy read in driver
between amdgpu_ring_emit_wreg and amdgpu_ring_emit_reg_wait for sdma_v5_0.
Signed-off-by: changzhu <Changfeng.Zhu@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
smu_enable_umd_pstate() will try to get the smu->mutex which was already
hold by its parent API smu_force_performance_level() on the call path.
Thus deadlock happens.
Signed-off-by: Evan Quan <evan.quan@amd.com>
Reviewed-by: Feifei Xu <Feifei.Xu@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
P-state switch should be performed after all devices from the hive
get initialized.
Signed-off-by: Evan Quan <evan.quan@amd.com>
Reviewed-by: Feifei Xu <Feifei.Xu@amd.com>
Reviewed-by: Jonathan Kim <Jonathan.Kim@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Added lock protection so that the p-state switch will
be guarded to be sequential. Also update the hive
pstate only all device from the hive are in the same
state.
Signed-off-by: Evan Quan <evan.quan@amd.com>
Reviewed-by: Feifei Xu <Feifei.Xu@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Otherwise, the feature enablement will be skipped due to wrong count.
Fixes: beff74bc6e ("drm/amdgpu: fix a race in GPU reset with IB test (v2)")
Signed-off-by: Evan Quan <evan.quan@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
fix workload bit (WORKLOAD_PPLIB_COMPUTE_BIT) map error
on vega20 and navi asic.
fix commit:
drm/amd/powerplay: add function get_workload_type_map for swsmu
Signed-off-by: Kevin Wang <kevin1.wang@amd.com>
Reviewed-by: Kenneth Feng <kenneth.feng@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
To fit the latest SMU firmware.
Signed-off-by: Evan Quan <evan.quan@amd.com>
Reviewed-by: Le Ma <Le.Ma@amd.com>
Reviewed-by: Kenneth Feng <kenneth.feng@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Direct uploading save/restore list via mmio register writes breaks the security
policy. Instead, the driver should pass s&r list to psp.
For all the ASICs that use rlc v2_1 headers, the driver actually upload s&r list
twice, in non-psp ucode front door loading phase and gfx pg initialization phase.
The latter is not allowed.
VG12 is the only exception where the driver still keeps legacy approach for S&R
list uploading. In theory, this can be elimnated if we have valid srcntl ucode
for VG12.
Signed-off-by: Hawking Zhang <Hawking.Zhang@amd.com>
Reviewed-by: Candice Li <Candice.Li@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Fix a static code checker warning.
v2: Drop PTR_ERR_OR_ZERO.
Signed-off-by: Andrey Grodzovsky <andrey.grodzovsky@amd.com>
Reviewed-by: Emily Deng <Emily.Deng@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
When unloading driver, need to free discovery memory.
Signed-off-by: Emily Deng <Emily.Deng@amd.com>
Reviewed-by: Xiaojie Yuan <xiaojie.yuan@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
So we know where the tables came from.
Signed-off-by: Xiaojie Yuan <xiaojie.yuan@amd.com>
Reviewed-by: Kevin Wang <kevin1.wang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
This reverts commit 967a3b85ba.
Reason for revert: Root cause of this issue is found. The workaround is not needed anymore.
Signed-off-by: Zhan Liu <zhan.liu@amd.com>
Reviewed-by: Hersen Wu <hersenxs.wu@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
[Why]
Navi10 has 6 PHY, but Navi14 only has 5 PHY, that is
because there is no ENGINE_ID_DIGD in Navi14. Without
this patch, many HDMI related issues (e.g. HDMI S3
resume failure, HDMI pink screen on boot) will be
observed.
[How]
If "eng_id" is larger than ENGINE_ID_DIGD, then
add "eng_id" by 1.
Signed-off-by: Zhan Liu <zhan.liu@amd.com>
Reviewed-by: Hersen Wu <hersenxs.wu@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Reviewed-by: Harry Wentland <harry.wentland@amd.com>
Signed-off-by: Neil Mayhew <neil@neil.mayhew.name>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
To better clarify what is happening in this function.
Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
It's safe to enable dynamic VCN powergating on raven and
raven2 for increased power savings.
Reviewed-by: James Zhu <James.Zhu@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Add xgmi pstate setting on powerplay routine.
V2: split the change of is_support_sw_smu_xgmi into a separate patch
Signed-off-by: Evan Quan <evan.quan@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Add check for is_sw_smu routine and drop check
for amdgpu_dpm which seems non-sense.
Signed-off-by: Evan Quan <evan.quan@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Pstate settings should be performed after all device of the
XGMI setup get initialized.
Signed-off-by: Evan Quan <evan.quan@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
This reverts commit 385857adb8.
Reason for revert: Root cause of this issue is found. The workaround is not needed anymore.
Signed-off-by: Zhan Liu <zhan.liu@amd.com>
Reviewed-by: Hersen Wu <hersenxs.wu@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
[Why]
Navi10 has 6 PHY, but Navi14 only has 5 PHY, that is
because there is no ENGINE_ID_DIGD in Navi14. Without
this patch, many HDMI related issues (e.g. HDMI S3
resume failure, HDMI pink screen on boot) will be
observed.
[How]
If "eng_id" is larger than ENGINE_ID_DIGD, then
add "eng_id" by 1.
Signed-off-by: Zhan Liu <zhan.liu@amd.com>
Reviewed-by: Hersen Wu <hersenxs.wu@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
[Why]
doing kthread_park()/unpark() from drm_sched_entity_fini
while GPU reset is in progress defeats all the purpose of
drm_sched_stop->kthread_park.
If drm_sched_entity_fini->kthread_unpark() happens AFTER
drm_sched_stop->kthread_park nothing prevents from another
(third) thread to keep submitting job to HW which will be
picked up by the unparked scheduler thread and try to submit
to HW but fail because the HW ring is deactivated.
[How]
grab the reset lock before calling drm_sched_entity_fini()
Signed-off-by: Shirish S <shirish.s@amd.com>
Suggested-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Andrey Grodzovsky <andrey.grodzovsky@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
These were not aligned for optimal performance for GPUVM.
Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Change rem_nsec to u32 since that's what do_div returns, this avoids the
u64 divide in the drm_print args.
Changes in v2:
- Instead of doing do_div in drm_print, make rem_nsec u32 (Ville)
Link to v1: https://patchwork.freedesktop.org/patch/msgid/20191106173622.15573-1-sean@poorly.run
Fixes: 12a280c728 ("drm/dp_mst: Add topology ref history tracking for debugging")
Cc: Juston Li <juston.li@intel.com>
Cc: Imre Deak <imre.deak@intel.com>
Cc: Ville Syrjälä <ville.syrjala@linux.intel.com>
Cc: Harry Wentland <hwentlan@amd.com>
Cc: Daniel Vetter <daniel.vetter@ffwll.ch>
Cc: Sean Paul <sean@poorly.run>
Cc: Lyude Paul <lyude@redhat.com>
Cc: Maarten Lankhorst <maarten.lankhorst@linux.intel.com>
Cc: Maxime Ripard <mripard@kernel.org>
Cc: David Airlie <airlied@linux.ie>
Cc: Daniel Vetter <daniel@ffwll.ch>
Cc: dri-devel@lists.freedesktop.org
Reviewed-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Signed-off-by: Sean Paul <seanpaul@chromium.org>
Link: https://patchwork.freedesktop.org/patch/msgid/20191106194121.164458-1-sean@poorly.run
[Why]
doing kthread_park()/unpark() from drm_sched_entity_fini
while GPU reset is in progress defeats all the purpose of
drm_sched_stop->kthread_park.
If drm_sched_entity_fini->kthread_unpark() happens AFTER
drm_sched_stop->kthread_park nothing prevents from another
(third) thread to keep submitting job to HW which will be
picked up by the unparked scheduler thread and try to submit
to HW but fail because the HW ring is deactivated.
[How]
grab the reset lock before calling drm_sched_entity_fini()
Signed-off-by: Shirish S <shirish.s@amd.com>
Suggested-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Andrey Grodzovsky <andrey.grodzovsky@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
These were not aligned for optimal performance for GPUVM.
Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
drm_self_refresh_helper_update_avg_times() was incorrectly accessing the
new incoming state after drm_atomic_helper_commit_hw_done(). But this
state might have already been superceeded by an !nonblock atomic update
resulting in dereferencing an already free'd crtc_state.
TODO I *think* this will more or less do the right thing.. althought I'm
not 100% sure if, for example, we enter psr in a nonblock commit, and
then leave psr in a !nonblock commit that overtakes the completion of
the nonblock commit. Not sure if this sort of scenario can happen in
practice. But not crashing is better than crashing, so I guess we
should either take this patch or rever the self-refresh helpers until
Sean can figure out a better solution.
Fixes: d4da4e3334 ("drm: Measure Self Refresh Entry/Exit times to avoid thrashing")
Cc: Sean Paul <seanpaul@chromium.org>
Signed-off-by: Rob Clark <robdclark@chromium.org>
[seanpaul fixed up some checkpatch warns]
Signed-off-by: Sean Paul <seanpaul@chromium.org>
Link: https://patchwork.freedesktop.org/patch/msgid/20191104173737.142558-1-robdclark@gmail.com
Fix the cx debugbus related register configuration, to collect accurate
bus data during gpu snapshot. This helps with complete snapshot dump
and also complete proper GPU recovery.
Fixes: 1707add815 ("drm/msm/a6xx: Add a6xx gpu state")
Reviewed-by: Rob Clark <robdclark@gmail.com>
Signed-off-by: Sharat Masetty <smasetty@codeaurora.org>
Signed-off-by: Sean Paul <seanpaul@chromium.org>
Link: https://patchwork.freedesktop.org/patch/339165
When inside the lock, remember to unlock even if you want to leave
early.
Reported-by: Dan Carpenter <dan.carpenter@oracle.com>
Fixes: a4e7ccdac3 ("drm/i915: Move context management under GEM")
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20191106144155.25727-1-chris@chris-wilson.co.uk
Mika spotted that only using cancel_delayed_work() could mean that we
attempted to clear the heartbeat.systole while the worker was still
running. Rectify the situation by only touching the systole from outside
the worker if we suceeded in cancelling the worker before it could run.
The worker is expected to clean up by itself upon idling.
Reported-by: Mika Kuoppala <mika.kuoppala@linux.intel.com>
Fixes: 058179e72e ("drm/i915/gt: Replace hangcheck by heartbeats")
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Mika Kuoppala <mika.kuoppala@linux.intel.com>
Reviewed-by: Mika Kuoppala <mika.kuoppala@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20191106133129.17732-1-chris@chris-wilson.co.uk
We should not be unconditionally calling release_fake_lmem_bar.
Signed-off-by: Matthew Auld <matthew.auld@intel.com>
Cc: Lucas De Marchi <lucas.demarchi@intel.com>
Cc: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Link: https://patchwork.freedesktop.org/patch/msgid/20191106123135.12441-1-matthew.auld@intel.com
The uapi vs. hw state split introduced a bug in
intel_crtc_disable_noatomic() where it's now frobbing an already
freed temp crtc state instead of adjusting the crtc state we
are really left with. Fix that by making a cleaner separation
beteen the two.
This causes explosions on any machine that boots up with pipes
already running but not hooked up to any encoder (typical
behaviour for gen2-4 VBIOS).
Cc: Maarten Lankhorst <maarten.lankhorst@linux.intel.com>
Fixes: 58d124ea27 ("drm/i915: Complete crtc hw/uapi split, v6.")
Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20191105171447.22111-1-ville.syrjala@linux.intel.com
Reviewed-by: Maarten Lankhorst <maarten.lankhorst@linux.intel.com>
If the device does not have an aperture through which we can indirectly
access and detile the buffers, simply reject the ioctl. Later we can
extend the ioctl to support different modes, but as an extension the
user must opt in and explicitly control the mmap type (viz
MMAP_OFFSET_IOCTL).
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Abdiel Janulgue <abdiel.janulgue@linux.intel.com>
Cc: Matthew Auld <matthew.auld@intel.com>
Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Reviewed-by: Abdiel Janulgue <abdiel.janulgue@linux.intel.com>
Reviewed-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20191105145305.14314-1-chris@chris-wilson.co.uk
Now that we support both reflections, we can expose 180 degree rotation
and rely on the simplify routine to convert that into REFLECT_X |
REFLECT_Y
Signed-off-by: Sean Paul <seanpaul@chromium.org>
Signed-off-by: CK Hu <ck.hu@mediatek.com>
Add support for REFLECT_X rotations.
Cc: Fritz Koenig <frkoenig@chromium.org>
Cc: Daniele Castagna <dcastagna@chromium.org>
Cc: Miguel Casas <mcasas@chromium.org>
Cc: Mark Yacoub <markyacoub@google.com>
Signed-off-by: Sean Paul <seanpaul@chromium.org>
Signed-off-by: CK Hu <ck.hu@mediatek.com>
Expose the rotation property and handle REFLECT_Y rotations.
Cc: Fritz Koenig <frkoenig@chromium.org>
Cc: Daniele Castagna <dcastagna@chromium.org>
Cc: Miguel Casas <mcasas@chromium.org>
Cc: Mark Yacoub <markyacoub@google.com>
Signed-off-by: Sean Paul <seanpaul@chromium.org>
Signed-off-by: CK Hu <ck.hu@mediatek.com>
This patch adds the ability for components to expose supported rotations
which will be exposed to userspace via a plane rotation property.
No functional changes in this patch.
Signed-off-by: Sean Paul <seanpaul@chromium.org>
Signed-off-by: CK Hu <ck.hu@mediatek.com>
This allows components to implement a .layer_check callback for their
layers which is called during atomic_check.
Signed-off-by: Sean Paul <seanpaul@chromium.org>
Signed-off-by: CK Hu <ck.hu@mediatek.com>
Instead of hard-coding which components have planes, add a helper
function to walk the components and map a plane index to a component
layer.
Signed-off-by: Sean Paul <seanpaul@chromium.org>
Signed-off-by: CK Hu <ck.hu@mediatek.com>
Add a couple of functions which enumerate the number of planes for a
component and initialize the planes for a component.
No functional changes in this patch, but it will allow us to selectively
support rotation if the component supports it.
Signed-off-by: Sean Paul <seanpaul@chromium.org>
Signed-off-by: CK Hu <ck.hu@mediatek.com>
These formats are handled in the rdma code, but for some reason they're
not published as supported formats for the planes. So add them to the
list.
Cc: Nicolas Boichat <drinkcat@chromium.org>
Cc: Daniele Castagna <dcastagna@chromium.org>
Cc: Miguel Casas <mcasas@chromium.org>
Tested-by: Miguel Casas <mcasas@chromium.org>
Signed-off-by: Sean Paul <seanpaul@chromium.org>
Signed-off-by: CK Hu <ck.hu@mediatek.com>
drivers/gpu/drm/drm_dp_mst_topology.c: In function __topology_ref_save:
drivers/gpu/drm/drm_dp_mst_topology.c:1424:6: error: implicit declaration of function stack_trace_save; did you mean stack_depot_save? [-Werror=implicit-function-declaration]
n = stack_trace_save(stack_entries, ARRAY_SIZE(stack_entries), 1);
^~~~~~~~~~~~~~~~
stack_depot_save
drivers/gpu/drm/drm_dp_mst_topology.c: In function __dump_topology_ref_history:
drivers/gpu/drm/drm_dp_mst_topology.c:1513:3: error: implicit declaration of function stack_trace_snprint; did you mean acpi_trace_point? [-Werror=implicit-function-declaration]
stack_trace_snprint(buf, PAGE_SIZE, entries, nr_entries, 4);
^~~~~~~~~~~~~~~~~~~
acpi_trace_point
stack_trace_save and stack_trace_snprint are declared in <linux/stacktrace.h>,
so there is need to include it, and <linux/stackdepot.h> is already included
by practices, so just replace <linux/stackdepot.h> by <linux/stacktrace.h>.
Signed-off-by: Chenwandun <chenwandun@huawei.com>
Reviewed-by: Lyude Paul <lyude@redhat.com>
Acked-by: Daniel Vetter <daniel@ffwll.ch>
Signed-off-by: Dave Airlie <airlied@redhat.com>
Link: https://patchwork.freedesktop.org/patch/msgid/1572515029-42087-1-git-send-email-chenwandun@huawei.com
In some circumstances the RC6 context can get corrupted. We can detect
this and take the required action, that is disable RC6 and runtime PM.
The HW recovers from the corrupted state after a system suspend/resume
cycle, so detect the recovery and re-enable RC6 and runtime PM.
v2: rebase (Mika)
v3:
- Move intel_suspend_gt_powersave() to the end of the GEM suspend
sequence.
- Add commit message.
v4:
- Rebased on intel_uncore_forcewake_put(i915->uncore, ...) API
change.
v5: rebased on gem/gt split (Mika)
Signed-off-by: Imre Deak <imre.deak@intel.com>
Signed-off-by: Mika Kuoppala <mika.kuoppala@linux.intel.com>
In BXT/APL, device 2 MMIO reads from MIPI controller requires its PLL
to be turned ON. When MIPI PLL is turned off (MIPI Display is not
active or connected), and someone (host or GT engine) tries to read
MIPI registers, it causes hard hang. This is a hardware restriction
or limitation.
Driver by itself doesn't read MIPI registers when MIPI display is off.
But any userspace application can submit unprivileged batch buffer for
execution. In that batch buffer there can be mmio reads. And these
reads are allowed even for unprivileged applications. If these
register reads are for MIPI DSI controller and MIPI display is not
active during that time, then the MMIO read operation causes system
hard hang and only way to recover is hard reboot. A genuine
process/application won't submit batch buffer like this and doesn't
cause any issue. But on a compromised system, a malign userspace
process/app can generate such batch buffer and can trigger system
hard hang (denial of service attack).
The fix is to lower the internal MMIO timeout value to an optimum
value of 950us as recommended by hardware team. If the timeout is
beyond 1ms (which will hit for any value we choose if MMIO READ on a
DSI specific register is performed without PLL ON), it causes the
system hang. But if the timeout value is lower than it will be below
the threshold (even if timeout happens) and system will not get into
a hung state. This will avoid a system hang without losing any
programming or GT interrupts, taking the worst case of lowest CDCLK
frequency and early DC5 abort into account.
Signed-off-by: Uma Shankar <uma.shankar@intel.com>
Reviewed-by: Jon Bloomfield <jon.bloomfield@intel.com>
Some of the gen instruction macros (e.g. MI_DISPLAY_FLIP) have the
length directly encoded in them. Since these are used directly in
the tables, the Length becomes part of the comparison used for
matching during parsing. Thus, if the cmd being parsed has a
different length to that in the table, it is not matched and the
cmd is accepted via the default variable length path.
Fix by masking out everything except the Opcode in the cmd tables
Cc: Tony Luck <tony.luck@intel.com>
Cc: Dave Airlie <airlied@redhat.com>
Cc: Takashi Iwai <tiwai@suse.de>
Cc: Tyler Hicks <tyhicks@canonical.com>
Signed-off-by: Jon Bloomfield <jon.bloomfield@intel.com>
Reviewed-by: Chris Wilson <chris.p.wilson@intel.com>
To keep things manageable, the pre-gen9 cmdparser does not
attempt to track any form of nested BB_START's. This did not
prevent usermode from using nested starts, or even chained
batches because the cmdparser is not strictly enforced pre gen9.
Instead, the existence of a nested BB_START would cause the batch
to be emitted in insecure mode, and any privileged capabilities
would not be available.
For Gen9, the cmdparser becomes mandatory (for BCS at least), and
so not providing any form of nested BB_START support becomes
overly restrictive. Any such batch will simply not run.
We make heavy use of backward jumps in igt, and it is much easier
to add support for this restricted subset of nested jumps, than to
rewrite the whole of our test suite to avoid them.
Add the required logic to support limited backward jumps, to
instructions that have already been validated by the parser.
Note that it's not sufficient to simply approve any BB_START
that jumps backwards in the buffer because this would allow an
attacker to embed a rogue instruction sequence within the
operand words of a harmless instruction (say LRI) and jump to
that.
We introduce a bit array to track every instr offset successfully
validated, and test the target of BB_START against this. If the
target offset hits, it is re-written to the same offset in the
shadow buffer and the BB_START cmd is allowed.
Note: This patch deliberately ignores checkpatch issues in the
cmdtables, in order to match the style of the surrounding code.
We'll correct the entire file in one go in a later patch.
v2: set dispatch secure late (Mika)
v3: rebase (Mika)
v4: Clear whitelist on each parse
Minor review updates (Chris)
v5: Correct backward jump batching
v6: fix compilation error due to struct eb shuffle (Mika)
Cc: Tony Luck <tony.luck@intel.com>
Cc: Dave Airlie <airlied@redhat.com>
Cc: Takashi Iwai <tiwai@suse.de>
Cc: Tyler Hicks <tyhicks@canonical.com>
Signed-off-by: Jon Bloomfield <jon.bloomfield@intel.com>
Signed-off-by: Mika Kuoppala <mika.kuoppala@linux.intel.com>
Reviewed-by: Chris Wilson <chris.p.wilson@intel.com>
In the next patch we will be adding a second valid
termination condition which will require a small
amount of refactoring to share logic with the BB_END
case.
Refactor all error conditions to jump to a dedicated
exit path, with 'break' reserved only for a successful
parse.
Cc: Tony Luck <tony.luck@intel.com>
Cc: Dave Airlie <airlied@redhat.com>
Cc: Takashi Iwai <tiwai@suse.de>
Cc: Tyler Hicks <tyhicks@canonical.com>
Signed-off-by: Jon Bloomfield <jon.bloomfield@intel.com>
Reviewed-by: Chris Wilson <chris.p.wilson@intel.com>
For gen9 we enable cmdparsing on the BCS ring, specifically
to catch inadvertent accesses to sensitive registers
Unlike gen7/hsw, we use the parser only to block certain
registers. We can rely on h/w to block restricted commands,
so the command tables only provide enough info to allow the
parser to delineate each command, and identify commands that
access registers.
Note: This patch deliberately ignores checkpatch issues in
favour of matching the style of the surrounding code. We'll
correct the entire file in one go in a later patch.
v3: rebase (Mika)
v4: Add RING_TIMESTAMP registers to whitelist (Jon)
Signed-off-by: Jon Bloomfield <jon.bloomfield@intel.com>
Cc: Tony Luck <tony.luck@intel.com>
Cc: Dave Airlie <airlied@redhat.com>
Cc: Takashi Iwai <tiwai@suse.de>
Cc: Tyler Hicks <tyhicks@canonical.com>
Signed-off-by: Mika Kuoppala <mika.kuoppala@linux.intel.com>
Reviewed-by: Chris Wilson <chris.p.wilson@intel.com>
In "drm/i915: Add support for mandatory cmdparsing" we introduced the
concept of mandatory parsing. This allows the cmdparser to be invoked
even when user passes batch_len=0 to the execbuf ioctl's.
However, the cmdparser needs to know the extents of the buffer being
scanned. Refactor the code to ensure the cmdparser uses the actual
object size, instead of the incoming length, if user passes 0.
Signed-off-by: Jon Bloomfield <jon.bloomfield@intel.com>
Cc: Tony Luck <tony.luck@intel.com>
Cc: Dave Airlie <airlied@redhat.com>
Cc: Takashi Iwai <tiwai@suse.de>
Cc: Tyler Hicks <tyhicks@canonical.com>
Reviewed-by: Chris Wilson <chris.p.wilson@intel.com>
For Gen7, the original cmdparser motive was to permit limited
use of register read/write instructions in unprivileged BB's.
This worked by copying the user supplied bb to a kmd owned
bb, and running it in secure mode, from the ggtt, only if
the scanner finds no unsafe commands or registers.
For Gen8+ we can't use this same technique because running bb's
from the ggtt also disables access to ppgtt space. But we also
do not actually require 'secure' execution since we are only
trying to reduce the available command/register set. Instead we
will copy the user buffer to a kmd owned read-only bb in ppgtt,
and run in the usual non-secure mode.
Note that ro pages are only supported by ppgtt (not ggtt), but
luckily that's exactly what we need.
Add the required paths to map the shadow buffer to ppgtt ro for Gen8+
v2: IS_GEN7/IS_GEN (Mika)
v3: rebase
v4: rebase
v5: rebase
Signed-off-by: Jon Bloomfield <jon.bloomfield@intel.com>
Cc: Tony Luck <tony.luck@intel.com>
Cc: Dave Airlie <airlied@redhat.com>
Cc: Takashi Iwai <tiwai@suse.de>
Cc: Tyler Hicks <tyhicks@canonical.com>
Signed-off-by: Mika Kuoppala <mika.kuoppala@linux.intel.com>
Reviewed-by: Chris Wilson <chris.p.wilson@intel.com>
The existing cmdparser for gen7 can be bypassed by specifying
batch_len=0 in the execbuf call. This is safe because bypassing
simply reduces the cmd-set available.
In a later patch we will introduce cmdparsing for gen9, as a
security measure, which must be strictly enforced since without
it we are vulnerable to DoS attacks.
Introduce the concept of 'required' cmd parsing that cannot be
bypassed by submitting zero-length bb's.
v2: rebase (Mika)
v2: rebase (Mika)
v3: fix conflict on engine flags (Mika)
Signed-off-by: Jon Bloomfield <jon.bloomfield@intel.com>
Cc: Tony Luck <tony.luck@intel.com>
Cc: Dave Airlie <airlied@redhat.com>
Cc: Takashi Iwai <tiwai@suse.de>
Cc: Tyler Hicks <tyhicks@canonical.com>
Signed-off-by: Mika Kuoppala <mika.kuoppala@linux.intel.com>
Reviewed-by: Chris Wilson <chris.p.wilson@intel.com>
The previous patch has killed support for secure batches
on gen6+, and hence the cmdparsers master tables are
now dead code. Remove them.
Signed-off-by: Jon Bloomfield <jon.bloomfield@intel.com>
Cc: Tony Luck <tony.luck@intel.com>
Cc: Dave Airlie <airlied@redhat.com>
Cc: Takashi Iwai <tiwai@suse.de>
Cc: Tyler Hicks <tyhicks@canonical.com>
Reviewed-by: Chris Wilson <chris.p.wilson@intel.com>
Retroactively stop reporting support for secure batches
through the api for gen6+ so that older binaries trigger
the fallback path instead.
Older binaries use secure batches pre gen6 to access resources
that are not available to normal usermode processes. However,
all known userspace explicitly checks for HAS_SECURE_BATCHES
before relying on the secure batch feature.
Since there are no known binaries relying on this for newer gens
we can kill secure batches from gen6, via I915_PARAM_HAS_SECURE_BATCHES.
v2: rebase (Mika)
v3: rebase (Mika)
Signed-off-by: Jon Bloomfield <jon.bloomfield@intel.com>
Cc: Tony Luck <tony.luck@intel.com>
Cc: Dave Airlie <airlied@redhat.com>
Cc: Takashi Iwai <tiwai@suse.de>
Cc: Tyler Hicks <tyhicks@canonical.com>
Signed-off-by: Mika Kuoppala <mika.kuoppala@linux.intel.com>
Reviewed-by: Chris Wilson <chris.p.wilson@intel.com>
We're about to introduce some new tables for later gens, and the
current naming for the gen7 tables will no longer make sense.
v2: rebase
Signed-off-by: Jon Bloomfield <jon.bloomfield@intel.com>
Cc: Tony Luck <tony.luck@intel.com>
Cc: Dave Airlie <airlied@redhat.com>
Cc: Takashi Iwai <tiwai@suse.de>
Cc: Tyler Hicks <tyhicks@canonical.com>
Signed-off-by: Mika Kuoppala <mika.kuoppala@linux.intel.com>
Reviewed-by: Chris Wilson <chris.p.wilson@intel.com>
The counter is removed from the pm wakeref count, but it remains intact
so that we can restore it upon resume. Ergo inside suspend, it may have
a value.
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Acked-by: Andi Shyti <andi.shyti@intel.com>
Reviewed-by: Mika Kuoppala <mika.kuoppala@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20191104090158.2959-1-chris@chris-wilson.co.uk
(cherry picked from commit 83c55ee82f)
Signed-off-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Currently we shutdown rc6 during i915_gem_resume() but this is called
during the preparation phase (i915_drm_prepare) for all suspend paths,
but we only want to shutdown rc6 for S3+. Move the actual shutdown to
i915_gem_suspend_late().
We then need to differentiate between suspend targets, to distinguish S0
(s2idle) where the device is kept awake but needs to be in a low power
mode (the same as runtime suspend) from the device suspend levels where
we lose control of HW and so must disable any HW access to dangling
memory.
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=111909
Fixes: c113236718 ("drm/i915: Extract GT render sleep (rc6) management")
Testcase: igt/gem_exec_suspend/power-S0
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Andi Shyti <andi.shyti@intel.com>
Acked-by: Andi Shyti <andi.shyti@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20191101141009.15581-4-chris@chris-wilson.co.uk
(cherry picked from commit c601cb2135)
Signed-off-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
We already track the debugfs user_forcewake on the GT, so it is natural
to pull the suspend/resume handling under gt/ as well.
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Andi Shyti <andi.shyti@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20191101141009.15581-3-chris@chris-wilson.co.uk
(cherry picked from commit 9ab3fe2d7d)
Signed-off-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
As we already do reload the kernel context in intel_gt_resume, repeating
that action inside i915_gem_resume() as well is redundant.
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Andi Shyti <andi.shyti@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20191101141009.15581-2-chris@chris-wilson.co.uk
(cherry picked from commit c8f6cfc56f)
Signed-off-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Assume all responsibility for operating on the HW to sanitize the GT
state upon load/resume in intel_gt_sanitize() itself.
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Andi Shyti <andi.shyti@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20191101141009.15581-1-chris@chris-wilson.co.uk
(cherry picked from commit 797a615357)
Signed-off-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Split out the DP specific parts, making it easier to add DSI specific
configuration. Also move the encoder specific parts towards the end, to
allow overriding generic configuration if needed. This also improves
clarity by making it clear the encoder independent configuration does
not depend on the encoder specific parts.
v2: Rebase
Cc: Manasi Navare <manasi.d.navare@intel.com>
Reviewed-by: Manasi Navare <manasi.d.navare@intel.com>
Signed-off-by: Jani Nikula <jani.nikula@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20191104141439.26312-3-jani.nikula@intel.com
Use a simple pointer to the relevant element instead of duplicating the
array subscription. No functional changes.
Cc: Manasi Navare <manasi.d.navare@intel.com>
Reviewed-by: Manasi Navare <manasi.d.navare@intel.com>
Signed-off-by: Jani Nikula <jani.nikula@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20191104141439.26312-2-jani.nikula@intel.com
Since the execlists_active() is no longer protected by the
engine->active.lock, we need to protect the request pointer with RCU to
prevent it being freed as we evaluate whether or not we need to preempt.
Fixes: df40306902 ("drm/i915/execlists: Lift process_csb() out of the irq-off spinlock")
Fixes: 13ed13a4dc ("drm/i915: Don't set queue_priority_hint if we don't kick the submission")
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Mika Kuoppala <mika.kuoppala@linux.intel.com>
Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Reviewed-by: Mika Kuoppala <mika.kuoppala@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20191104090158.2959-2-chris@chris-wilson.co.uk
(cherry picked from commit 7d14863525)
Signed-off-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Since CNP it's possible for rawclk to have two different values, 19.2
and 24 MHz. If the value indicated by SFUSE_STRAP register is different
from the power on default for PCH_RAWCLK_FREQ, we'll end up having a
mismatch between the rawclk hardware and software states after
suspend/resume. On previous platforms this used to work by accident,
because the power on defaults worked just fine.
Update the rawclk also on resume. The natural place to do this would be
intel_modeset_init_hw(), however VLV/CHV need it done before
intel_power_domains_init_hw(). Thus put it there even if it feels
slightly out of place.
v2: Call intel_update_rawclck() in intel_power_domains_init_hw() for all
platforms (Ville).
Reported-by: Shawn Lee <shawn.c.lee@intel.com>
Cc: Shawn Lee <shawn.c.lee@intel.com>
Cc: Ville Syrjala <ville.syrjala@linux.intel.com>
Reviewed-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Tested-by: Shawn Lee <shawn.c.lee@intel.com>
Signed-off-by: Jani Nikula <jani.nikula@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20191101142024.13877-1-jani.nikula@intel.com
For MST on Tiger Lake there are different moments when we need to
configure the transcoder clock select. For the first link this is in step
7.a of the spec, before training the link. For additional streams this
should be done as part of step 8.b after programming receiver VC Payload
ID.
Bspec: 49190
Signed-off-by: Lucas De Marchi <lucas.demarchi@intel.com>
Reviewed-by: José Roberto de Souza <jose.souza@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20191030012448.14937-4-lucas.demarchi@intel.com
Non-TC ports always have tc_mode == TC_PORT_TBT_ALT so it was
switching aux to TBT mode for all combo-phy ports, happily this did
not caused any issue but is better follow BSpec.
Also this is reserved bit before ICL.
Cc: Imre Deak <imre.deak@intel.com>
Signed-off-by: José Roberto de Souza <jose.souza@intel.com>
Fixes: e9b7e1422d ("drm/i915: Sanitize the terminology used for TypeC port modes")
Reviewed-by: Imre Deak <imre.deak@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20191029011014.286885-1-jose.souza@intel.com
(cherry picked from commit 4974826482)
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
For the HPD interrupt functionality the HW depends on power wells in the
display core domain to be on. Accordingly when enabling these power
wells the HPD polling logic will force an HPD detection cycle to account
for hotplug events that may have happened when such a power well was
off.
Thus a detect cycle started by polling could start a new detect cycle if
a power well in the display core domain gets enabled during detect and
stays enabled after detect completes. That in turn can lead to a
detection cycle runaway.
To prevent re-triggering a poll-detect cycle make sure we drop all power
references we acquired during detect synchronously by the end of detect.
This will let the poll-detect logic continue with polling (matching the
off state of the corresponding power wells) instead of scheduling a new
detection cycle.
Fixes: 6cfe7ec02e ("drm/i915: Remove the unneeded AUX power ref from intel_dp_detect()")
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=112125
Reported-and-tested-by: Val Kulkov <val.kulkov@gmail.com>
Reported-and-tested-by: wangqr <wqr.prg@gmail.com>
Cc: Val Kulkov <val.kulkov@gmail.com>
Cc: wangqr <wqr.prg@gmail.com>
Cc: Ville Syrjälä <ville.syrjala@linux.intel.com>
Signed-off-by: Imre Deak <imre.deak@intel.com>
Reviewed-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20191028181517.22602-1-imre.deak@intel.com
(cherry picked from commit a8ddac7c9f)
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
Lots of redundant assignments inside intel_primary_plane_create().
Get rid of them.
v2: Rebase due to fp16 landing
Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Reviewed-by: Juha-Pekka Heikkila <juhapekka.heikkila@gmail.com>
Reviewed-by: Uma Shankar <uma.shankar@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20191031165652.10868-8-ville.syrjala@linux.intel.com
Let's try to keep the pixel format arrays somewhat sorted:
1. RGB before YUV
2. smaller bpp before larger bpp
3. X before A
4. RGB before BGR
Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Reviewed-by: Juha-Pekka Heikkila <juhapekka.heikkila@gmail.com>
Reviewed-by: Uma Shankar <uma.shankar@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20191031165652.10868-7-ville.syrjala@linux.intel.com
ICL+ again supports alpha blending with 10bpc pixel formats.
Expose them.
v2: Add all the stuff I missed earlier!
Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Reviewed-by: Juha-Pekka Heikkila <juhapekka.heikkila@gmail.com>
Reviewed-by: Uma Shankar <uma.shankar@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20191031165652.10868-6-ville.syrjala@linux.intel.com
CHV pipe B sprites gained support for the 10bpc X/ARGB pixel formats.
On VLV and CHV pipe A/C these are only supported by the primary
plane. Add the require bits to expose the new formats.
v2: Reorder the formats for consistency
Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Reviewed-by: Uma Shankar <uma.shankar@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20191031165652.10868-4-ville.syrjala@linux.intel.com
Since the execlists_active() is no longer protected by the
engine->active.lock, we need to protect the request pointer with RCU to
prevent it being freed as we evaluate whether or not we need to preempt.
Fixes: df40306902 ("drm/i915/execlists: Lift process_csb() out of the irq-off spinlock")
Fixes: 13ed13a4dc ("drm/i915: Don't set queue_priority_hint if we don't kick the submission")
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Mika Kuoppala <mika.kuoppala@linux.intel.com>
Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Reviewed-by: Mika Kuoppala <mika.kuoppala@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20191104090158.2959-2-chris@chris-wilson.co.uk
The counter is removed from the pm wakeref count, but it remains intact
so that we can restore it upon resume. Ergo inside suspend, it may have
a value.
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Acked-by: Andi Shyti <andi.shyti@intel.com>
Reviewed-by: Mika Kuoppala <mika.kuoppala@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20191104090158.2959-1-chris@chris-wilson.co.uk
UAPI Changes:
- Make context persistence optional
Allow userspace to tie the context lifetime to FD lifetime,
effectively allowing Ctrl-C killing of a process to also clean
up the hardware immediately.
Compute changes: https://github.com/intel/compute-runtime/pull/228
The compute driver is shipping in Ubuntu. uAPI acked by Mesa folks.
- Put future HW and their uAPIs under STAGING & BROKEN
Introduces DRM_I915_UNSTABLE Kconfig menu for working on the new
uAPI for future HW in upstream. We already disable driver loading
by default the platform is deemed ready. This is a second level
of protection based on compile time switch (STAGING & BROKEN).
- Under DRM_I915_UNSTABLE: Add the fake lmem region on iGFX
Fake local memory region on integrated GPU through cmdline:
memmap=2G$16G i915.fake_lmem_start=0x400000000
Currently allows testing non-mappable GGTT behavior and running
kernel selftest for local memory.
Driver Changes:
- Fix Bugzilla #112084: VGA external monitor not working (Ville)
- Add support for half float framebuffers (Ville)
- Add perf support on TGL (Lionel)
- Replace hangcheck by heartbeats (Chris)
- Allow SPT PCH on all AML devices (James)
- Add new CNL PCH for CML platform (Imre)
- Allow 100 ms (Kconfig) for workloads to exit before reset (Chris, Jon, Joonas)
- Forcibly pre-empt a context after 100 ms (Kconfig) of delay (Chris)
- Make timeslice duration Kconfig configurable (Chris)
- Whitelist PS_(DEPTH|INVOCATION)_COUNT for Tigerlake (Tapani)
- Support creating LMEM objects in kernel (Matt A)
- Adjust the location of RING_MI_MODE in the context image for TGL (Chris)
- Handle AUX interrupts for TC ports (Matt R)
- Add support for devices without mappable GGTT aperture (Daniele)
- Rename "inject_load_failure" module parameter to "inject_probe_failure" (Janusz)
- Handle fused off HDCP, FBC, DMC and DSC (Jose)
- Add support to one DP-MST stream on Tigerlake (Lucas)
- Add HuC firmware (and GuC) for TGL (Daniele)
- Allow ICL+ DSI on any pipe (Ville)
- Check some transcoder timing minimum limits (Ville)
- Don't set queue_priority_hint if we don't kick the submission (Chris)
- Introduce barrier pulses along engines to flush idle/in-flight requests (Chris)
- Drop assertion that ce->pin_mutex guards state updates (Chris)
- Cancel banned contexts on schedule-out (Chris)
- Cancel contexts when hangchecking is disabled (Chris)
- Catch GTT fault errors for gen11+ planes (Matt R)
- Print in debugfs if PSR is not enabled because of sink (Jose)
- Do not set MOCS control values on dgfx (Lucas)
- Setup io-mapping for LMEM (Abdiel)
- Support kernel mapping of LMEM objects (Abdiel)
- Add LMEM selftests (Matt A)
- Initialise PMU spinlock before registering (Chris)
- Clear DKL_TX_PMD_LANE_SUS before program TC voltage swing (Jose)
- Flip interpretation of ips fmin/fmax to max rps (Chris)
- Add VBT compression parameter block definition (Jani)
- Limit the blitter sizes to ensure low preemption latency (Chris)
- Fixup block_size rounding on BLT (Matt A)
- Don't try to place HWS in non-existing mappable region (Michal Wa)
- Don't allocate the ring in stolen if we lack aperture (Matt A)
- Add AUX B & C to DC_OFF_POWER_DOMAINS for Tigerlake (Matt R)
- Avoid HPD poll detect triggering a new detect cycle (Imre)
- Document the userspace fail with possible_crtcs (Ville)
- Drop lrc header page now unused by GuC (Daniele)
- Do not switch aux to TBT mode for non-TC ports (Jose)
- Restructure code to avoid depending on i915 but smaller structs (Chris, Tvrtko, Andi)
- Remove pm park/unpark notifications (Chris)
- Avoid lockdep cross-contamination between object types (Chris)
- Restructure DSC code (Jani)
- Fix dead locking in early workload shadow (Zhenyu)
- Split the legacy submission backend from the common CS ring buffer (Chris)
- Move intel_engine_context_in/out into intel_lrc.c (Tvrtko)
- Describe perf/wakeref structure members in documentation (Anna)
- Update renamed header files names in documentation (Anna)
- Add debugs to distingiush a cd2x update from a full cdclk pll update (Ville)
- Rework atomic global state locking (Ville)
- Allow planes to declare their minimum acceptable cdclk (Ville)
- Eliminate skl_check_pipe_max_pixel_rate() and simplify skl_max_scale() (Ville)
- Making loglevel of PSR2/SU logs same (Ap)
- Capture aux page table error register (Lionel)
- Add is_dgfx to device info (Jose)
- Split gen11_irq_handler to make it shareable (Lucas)
- Encapsulate kconfig constant values inside boolean predicates (Chris)
- Split memory_region initialisation into its own file (Chris)
- Use _PICK() for CHICKEN_TRANS() and add CHICKEN_TRANS_D (Ville)
- Add perf helper macros for comparing with whitelisted registers (Umesh)
- Fix i915_inject_load_error() name to read *_probe_* (Janusz)
- Drop unused AUX register offsets (Matt R)
- Provide more information on DP AUX failures (Matt R)
- Add GAM/SFC instdone to error state (Mika)
- Always track callers to intel_rps_mark_interactive() (Chris)
- Nuke 'mode' argument to intel_get_load_detect_pipe() (Ville)
- Simplify LVDS crtc_mask and pipe_mask setup (Ville)
- Stop frobbing crtc->base.mode (Ville)
- Do s/crtc_mask/pipe_mask/ (Ville)
- Split detaching and removing the vma (Chris)
- Selftest improvements (Chris, Tvrtko, Mika, Matt A, Lionel)
- GuC code improvements (Rob, Andi, Daniele)
- Check against i915_selftest only under CONFIG_SELFTEST (Chris)
- Refine occupancy test in kill_context() (Chris)
- Start kthreads before stopping (Chris)
Signed-off-by: Dave Airlie <airlied@redhat.com>
From: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20191101104718.GA14323@jlahtine-desk.ger.corp.intel.com
The bulk of these changes is the addition of DisplayPort support for
Tegra210, Tegra186 and Tegra194. I've been running versions of this for
about three years now, so I'd consider these changes to be pretty
mature. These changes also unify the existing eDP support with the DP
support since the programming is very similar, except for a few steps
that can be easily parameterized.
The rest are a couple of fixes all over the place for minor issues, as
well as some work to support the IOMMU-backed DMA API, which in the end
turned out to also clean up a number of cases where the DMA API was not
being used correctly.
-----BEGIN PGP SIGNATURE-----
iQJHBAABCAAxFiEEiOrDCAFJzPfAjcif3SOs138+s6EFAl29isQTHHRyZWRpbmdA
bnZpZGlhLmNvbQAKCRDdI6zXfz6zoSXBEACTkVCNwi6xq2mqYl2OnxMEEGuFvMgH
bZrXPm54Md2qM0IBOXCM8GDl6aO6iwj8q+RVADSnU0XlravfTGtVPhLLNbBBgIMy
wjg5Ib6/aN1zuHTvPyvvx++LIeiG2L6PtirdCVPA9GgFCIwVt4xgw35Kekk9lHEt
hia9g9rsfuNUOSKQsBXRbgk/yxlUdk/rM/2tdWQgbpRPNttgc9wrp581jjU2Y+lG
dSkqri2Dv1loNyRAv/tRQgeA+td+lYR6zl5emsukhe4ECPvG4XaqgN+D33Eo+kDx
2MiBJAwefboJV/VjDm+dfR1d/1Qa4XCUEUMgR6P1LXgX0+5jNLOo3InPaKpwDxBw
wvESiibJiLiL3KdJfjrVpDsYDbOt1GUK2fmkLwoH4ihAl674Fvo8M/31P0czxK8X
v5JCm7POhB3vSf/chMoQEJcfklu9NzAUw8J5SJC2Oaj13uhOqAzN+uIrfe7qemtm
W4E2bj50EI+IEVbf8DRfuDCcE/eB35XJMBdip4HwOFCQPeRGGN7EsM49w+TQ9fba
0+GbyY1glxJyUyGN0v/5Lu2IYfbvdOm51pWTlvUW4TS5Lv68rJpiE/lEDmFqHPE2
69cOLe843K4mGIh1wbmT2G+dQKf/PNd/mrQHcMuUK2au849lVQeUE9xz77NuJTeK
3lPim7/95m8eSQ==
=jU8e
-----END PGP SIGNATURE-----
Merge tag 'drm/tegra/for-5.5-rc1' of git://anongit.freedesktop.org/tegra/linux into drm-next
drm/tegra: Changes for v5.5-rc1
The bulk of these changes is the addition of DisplayPort support for
Tegra210, Tegra186 and Tegra194. I've been running versions of this for
about three years now, so I'd consider these changes to be pretty
mature. These changes also unify the existing eDP support with the DP
support since the programming is very similar, except for a few steps
that can be easily parameterized.
The rest are a couple of fixes all over the place for minor issues, as
well as some work to support the IOMMU-backed DMA API, which in the end
turned out to also clean up a number of cases where the DMA API was not
being used correctly.
Signed-off-by: Dave Airlie <airlied@redhat.com>
From: Thierry Reding <thierry.reding@gmail.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20191102140116.3860545-1-thierry.reding@gmail.com
UAPI Changes:
-dma-buf: Introduce and revert dma-buf heap (Andrew/John/Sean)
Cross-subsystem Changes:
- None
Core Changes:
-dma-buf: add dynamic mapping to allow exporters to choose dma_resv lock
state on mmap/munmap (Christian)
-vram: add prepare/cleanup fb helpers to vram helpers (Thomas)
-ttm: always keep bo's on the lru + ttm cleanups (Christian)
-sched: allow a free_job routine to sleep (Steven)
-fb_helper: remove unused drm_fb_helper_defio_init() (Thomas)
Driver Changes:
-bochs/hibmc/vboxvideo: Use new vram helpers for prepare/cleanup fb (Thomas)
-amdgpu: Implement dma-buf import/export without drm helpers (Christian)
-panfrost: Simplify devfreq integration in driver (Steven)
Cc: Christian König <christian.koenig@amd.com>
Cc: Thomas Zimmermann <tzimmermann@suse.de>
Cc: Steven Price <steven.price@arm.com>
Cc: Andrew F. Davis <afd@ti.com>
Cc: John Stultz <john.stultz@linaro.org>
Cc: Sean Paul <seanpaul@chromium.org>
-----BEGIN PGP SIGNATURE-----
iQEzBAABCgAdFiEEHF6rntfJ3enn8gh8cywAJXLcr3kFAl27NYYACgkQcywAJXLc
r3k7SQgAy7MbT7MHg1NkObnWFKwbNxA0FuRV7HVuQ5AiVA5fbehGECNVI1hFZE3D
kCyL5zRgzfrbBjI9zqclonXPmjPbD//f+ufUZJ2EaWHfE5k7LUYIEunpgWlCEgUR
qqeuqjvx2+NY3gLZW6D8BIrUW79cKbggBvDa9QeE4aoMiI7/F3lpNog5LNo7TWCQ
4SGgpDqtcJ2eSBneX80zLLVnI1CKfCSiWheZVoqCTRN5bfUftGwhbXb3N/JipG37
cCblVR7D8FR7z0MQUtq2ql76tCn5SJJqloujkmUU06quYBHR5K1deB5BP+8HI1Fw
/XMqWHFo0uX7h4hFIt1MVIJ92U+b+w==
=ViNo
-----END PGP SIGNATURE-----
Merge tag 'drm-misc-next-2019-10-31' of git://anongit.freedesktop.org/drm/drm-misc into drm-next
drm-misc-next for 5.5:
UAPI Changes:
-dma-buf: Introduce and revert dma-buf heap (Andrew/John/Sean)
Cross-subsystem Changes:
- None
Core Changes:
-dma-buf: add dynamic mapping to allow exporters to choose dma_resv lock
state on mmap/munmap (Christian)
-vram: add prepare/cleanup fb helpers to vram helpers (Thomas)
-ttm: always keep bo's on the lru + ttm cleanups (Christian)
-sched: allow a free_job routine to sleep (Steven)
-fb_helper: remove unused drm_fb_helper_defio_init() (Thomas)
Driver Changes:
-bochs/hibmc/vboxvideo: Use new vram helpers for prepare/cleanup fb (Thomas)
-amdgpu: Implement dma-buf import/export without drm helpers (Christian)
-panfrost: Simplify devfreq integration in driver (Steven)
Cc: Christian König <christian.koenig@amd.com>
Cc: Thomas Zimmermann <tzimmermann@suse.de>
Cc: Steven Price <steven.price@arm.com>
Cc: Andrew F. Davis <afd@ti.com>
Cc: John Stultz <john.stultz@linaro.org>
Cc: Sean Paul <seanpaul@chromium.org>
Signed-off-by: Dave Airlie <airlied@redhat.com>
From: Sean Paul <sean@poorly.run>
Link: https://patchwork.freedesktop.org/patch/msgid/20191031193015.GA243509@art_vandelay
Avoid
drivers/gpu/drm/i915/i915_perf.c:2442:85: warning: dubious: x | !y
simply by inverting the predicate and reversing the ternary.
v2: Move the long lines into their own function so there is no confusion
on operator precedence.
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Umesh Nerlige Ramappa <umesh.nerlige.ramappa@intel.com>
Cc: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Umesh Nerlige Ramappa <umesh.nerlige.ramappa@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20191101192116.12647-1-chris@chris-wilson.co.uk
Currently we shutdown rc6 during i915_gem_resume() but this is called
during the preparation phase (i915_drm_prepare) for all suspend paths,
but we only want to shutdown rc6 for S3+. Move the actual shutdown to
i915_gem_suspend_late().
We then need to differentiate between suspend targets, to distinguish S0
(s2idle) where the device is kept awake but needs to be in a low power
mode (the same as runtime suspend) from the device suspend levels where
we lose control of HW and so must disable any HW access to dangling
memory.
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=111909
Fixes: c113236718 ("drm/i915: Extract GT render sleep (rc6) management")
Testcase: igt/gem_exec_suspend/power-S0
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Andi Shyti <andi.shyti@intel.com>
Acked-by: Andi Shyti <andi.shyti@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20191101141009.15581-4-chris@chris-wilson.co.uk
Our timelines are currently contained within an intel_gt, and we only
need to perform list/spinlock initialisation, so we can pull the
intel_timelines_init() into our intel_gt_init_early().
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20191101130406.4142-1-chris@chris-wilson.co.uk
Now that we split plane_state which I didn't want to do yet, we can
program the slave plane without requiring the master plane.
This is useful for programming bigjoiner slave planes as well. We
will no longer need the master's plane_state.
Changes since v1:
- set src/dst rectangles after copy_uapi_to_hw_state.
Changes since v2:
- Use the correct color_plane for pre-gen11 by using planar_linked_plane != NULL.
- Use drm_format_info_is_yuv_semiplanar in skl_plane_check() to fix gen11+.
Signed-off-by: Maarten Lankhorst <maarten.lankhorst@linux.intel.com>
Reviewed-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20191031112610.27608-12-maarten.lankhorst@linux.intel.com
Splitting plane state is easier than splitting crtc_state,
before plane check we copy the drm properties to hw so we can
do the same in bigjoiner later on.
We copy the state after we did all the modeset handling, but fortunately
i915 seems to be split correctly and nothing during modeset looks
at plane_state.
Changes since v1:
- Do not clear hw state on duplication.
Signed-off-by: Maarten Lankhorst <maarten.lankhorst@linux.intel.com>
Reviewed-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20191031112610.27608-11-maarten.lankhorst@linux.intel.com
Split up plane_state->base to uapi. This is done using the following patch,
ran after the previous commit that splits out any hw references:
@@
struct intel_plane_state *T;
identifier x;
@@
-T->base.x
+T->uapi.x
@@
struct intel_plane_state *T;
@@
-T->base
+T->uapi
Signed-off-by: Maarten Lankhorst <maarten.lankhorst@linux.intel.com>
Reviewed-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20191031112610.27608-10-maarten.lankhorst@linux.intel.com
Split up plane_state->base to hw. This is done using the following patch:
@@
struct intel_plane_state *T;
identifier x =~ "^(crtc|fb|alpha|pixel_blend_mode|rotation|color_encoding|color_range)$";
@@
-T->base.x
+T->hw.x
Signed-off-by: Maarten Lankhorst <maarten.lankhorst@linux.intel.com>
Reviewed-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20191031112610.27608-9-maarten.lankhorst@linux.intel.com
get_crtc_from_states() is called before plane_state is copied to uapi,
so use the uapi state there.
intel_legacy_cursor_update() could probably get away with looking at
the hw state, but for clarity always look at the uapi state.
Changes since v1:
- Convert entirety of intel_legacy_cursor_update (Ville).
Signed-off-by: Maarten Lankhorst <maarten.lankhorst@linux.intel.com>
Reviewed-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20191031112610.27608-8-maarten.lankhorst@linux.intel.com