Commit Graph

26416 Commits

Author SHA1 Message Date
Peter Antoine
0ccdacf694 drm/i915/mocs: Program MOCS for all engines on init
Allow for the MOCS to be programmed for all engines.
Currently we program the MOCS when the first render batch
goes through. This works on most platforms but fails on
platforms that do not run a render batch early,
i.e. headless servers. The patch now programs all initialised engines
on init and the RCS is programmed again within the initial batch. This
is done for predictable consistency with regards to the hardware
context.

Hardware context loading sets the values of the MOCS for RCS
and L3CC. Programming them from within the batch makes sure that
the render context is valid, no matter what the previous state of
the saved-context was.

v2: posted correct version to the mailing list.
v3: moved programming to within engine->init_hw() (Chris Wilson)
v4: code formatting and white-space changes. (Chris Wilson)

Testcase: igt/gem_mocs_settings
Signed-off-by: Peter Antoine <peter.antoine@intel.com>
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Link: http://patchwork.freedesktop.org/patch/msgid/1460556205-6644-1-git-send-email-chris@chris-wilson.co.uk
2016-04-14 10:45:40 +01:00
Chris Wilson
aa9b78104f drm/i915: Late request cancellations are harmful
Conceptually, each request is a record of a hardware transaction - we
build up a list of pending commands and then either commit them to
hardware, or cancel them. However, whilst building up the list of
pending commands, we may modify state outside of the request and make
references to the pending request. If we do so and then cancel that
request, external objects then point to the deleted request leading to
both graphical and memory corruption.

The easiest example is to consider object/VMA tracking. When we mark an
object as active in a request, we store a pointer to this, the most
recent request, in the object. Then we want to free that object, we wait
for the most recent request to be idle before proceeding (otherwise the
hardware will write to pages now owned by the system, or we will attempt
to read from those pages before the hardware is finished writing). If
the request was cancelled instead, that wait completes immediately. As a
result, all requests must be committed and not cancelled if the external
state is unknown.

All that remains of i915_gem_request_cancel() users are just a couple of
extremely unlikely allocation failures, so remove the API entirely.

A consequence of committing all incomplete requests is that we generate
excess breadcrumbs and fill the ring much more often with dummy work. We
have completely undone the outstanding_last_seqno optimisation.

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=93907
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Daniel Vetter <daniel.vetter@ffwll.ch>
Cc: Tvrtko Ursulin <tvrtko.ursulin@linux.intel.com>
Cc: stable@vger.kernel.org
Reviewed-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Link: http://patchwork.freedesktop.org/patch/msgid/1460565315-7748-16-git-send-email-chris@chris-wilson.co.uk
2016-04-14 10:45:40 +01:00
Chris Wilson
fcb5106d66 drm/i915: Reorganise legacy context switch to cope with late failure
After mi_set_context() succeeds, we need to update the state of the
engine's last_context. This ensures that we hold a pin on the context
whilst the hardware may write to it. However, since we didn't complete
the post-switch setup of the context, we need to force the subsequent
use of the same context to complete the setup (which means updating
should_skip_switch()).

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Daniel Vetter <daniel.vetter@ffwll.ch>
Cc: Tvrtko Ursulin <tvrtko.ursulin@linux.intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/1460565315-7748-15-git-send-email-chris@chris-wilson.co.uk
2016-04-14 10:45:40 +01:00
Chris Wilson
e1a8daa2d1 drm/i915: Split out !RCS legacy context switching
Having the !RCS legacy context switch threaded through the RCS switching
code makes it much harder to follow and understand. In the next patch, I
want to fix a bug handling the incomplete switch, this is made much
simpler if we segregate the two paths now.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Daniel Vetter <daniel@ffwll.ch>
Cc: Tvrtko Ursulin <tvrtko.ursulin@linux.intel.com>
Reviewed-by: Daniel Vetter <daniel@ffwll.ch>
Link: http://patchwork.freedesktop.org/patch/msgid/1460565315-7748-14-git-send-email-chris@chris-wilson.co.uk
2016-04-14 10:45:40 +01:00
Chris Wilson
349f2ccff9 drm/i915: Move the mb() following release-mmap into release-mmap
As paranoia, we want to ensure that the CPU's PTEs have been revoked for
the object before we return from i915_gem_release_mmap(). This allows us
to rely on there being no outstanding memory accesses from userspace
and guarantees serialisation of the code against concurrent access just
by calling i915_gem_release_mmap().

v2: Reduce the mb() into a wmb() following the revoke.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Tvrtko Ursulin <tvrtko.ursulin@linux.intel.com>
Cc: "Goel, Akash" <akash.goel@intel.com>
Cc: Daniel Vetter <daniel.vetter@ffwll.ch>
Reviewed-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Link: http://patchwork.freedesktop.org/patch/msgid/1460565315-7748-13-git-send-email-chris@chris-wilson.co.uk
2016-04-14 10:45:40 +01:00
Chris Wilson
a687a43a48 drm/i915: Force ringbuffers to not be at offset 0
For reasons unknown Sandybridge GT1 (at least) will eventually hang when
it encounters a ring wraparound at offset 0. The test case that
reproduces the bug reliably forces a large number of interrupted context
switches, thereby causing very frequent ring wraparounds, but there are
similar bug reports in the wild with the same symptoms, seqno writes
stop just before the wrap and the ringbuffer at address 0. It is also
timing crucial, but adding various delays hasn't helped pinpoint where
the window lies.

Whether the fault is restricted to the ringbuffer itself or the GTT
addressing is unclear, but moving the ringbuffer fixes all the hangs I
have been able to reproduce.

References: (e.g.) https://bugs.freedesktop.org/show_bug.cgi?id=93262
Testcase: igt/gem_exec_whisper/render-contexts-interruptible #snb-gt1
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: stable@vger.kernel.org
Acked-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Link: http://patchwork.freedesktop.org/patch/msgid/1460565315-7748-12-git-send-email-chris@chris-wilson.co.uk
2016-04-14 10:45:40 +01:00
Chris Wilson
e9135c4f08 drm/i915: Prevent machine death on Ivybridge context switching
Two concurrent writes into the same register cacheline has the chance of
killing the machine on Ivybridge and other gen7. This includes LRI
emitted from the command parser.  The MI_SET_CONTEXT itself serves as
serialising barrier and prevents the pair of register writes in the first
packet from triggering the fault.  However, if a second switch-context
immediately occurs then we may have two adjacent blocks of LRI to the
same registers which may then trigger the hang. To counteract this we
need to insert a delay after the second register write using SRM.

This is easiest to reproduce with something like
igt/gem_ctx_switch/interruptible that triggers back-to-back context
switches (with no operations in between them in the command stream,
which requires the execbuf operation to be interrupted after the
MI_SET_CONTEXT) but can be observed sporadically elsewhere when running
interruptible igt. No reports from the wild though, so it must be of low
enough frequency that no one has correlated the random machine freezes
with i915.ko

The issue was introduced with
commit 2c55018347 [v3.19]
Author: Chris Wilson <chris@chris-wilson.co.uk>
Date:   Tue Dec 16 10:02:27 2014 +0000

    drm/i915: Disable PSMI sleep messages on all rings around context switches

Testcase: igt/gem_ctx_switch/render-interruptible #ivb
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Daniel Vetter <daniel@ffwll.ch>
Cc: Ville Syrjälä <ville.syrjala@linux.intel.com>
Cc: stable@vger.kernel.org
Reviewed-by: Daniel Vetter <daniel@ffwll.ch>
Link: http://patchwork.freedesktop.org/patch/msgid/1460565315-7748-11-git-send-email-chris@chris-wilson.co.uk
2016-04-14 10:45:40 +01:00
Chris Wilson
804e59a830 drm/i915: Suppress error message when GPU resets are disabled
If we do not have lowlevel support for reseting the GPU, or if the user
has explicitly disabled reseting the device, the failure is expected.
Since it is an expected failure, we should be using a lower priority
message than *ERROR*, perhaps NOTICE. In the absence of DRM_NOTICE, just
emit the expected failure as a DEBUG message.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Daniel Vetter <daniel.vetter@ffwll.ch>
Reviewed-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Link: http://patchwork.freedesktop.org/patch/msgid/1460565315-7748-10-git-send-email-chris@chris-wilson.co.uk
2016-04-14 10:45:40 +01:00
Chris Wilson
f4457ae71f drm/i915: Prevent leaking of -EIO from i915_wait_request()
Reporting -EIO from i915_wait_request() has proven very troublematic
over the years, with numerous hard-to-reproduce bugs cropping up in the
corner case of where a reset occurs and the code wasn't expecting such
an error.

If the we reset the GPU or have detected a hang and wish to reset the
GPU, the request is forcibly complete and the wait broken. Currently, we
report either -EAGAIN or -EIO in order for the caller to retreat and
restart the wait (if appropriate) after dropping and then reacquiring
the struct_mutex (essential to allow the GPU reset to proceed). However,
if we take the view that the request is complete (no further work will
be done on it by the GPU because it is dead and soon to be reset), then
we can proceed with the task at hand and then drop the struct_mutex
allowing the reset to occur. This transfers the burden of checking
whether it is safe to proceed to the caller, which in all but one
instance it is safe - completely eliminating the source of all spurious
-EIO.

Of note, we only have two API entry points where we expect that
userspace can observe an EIO. First is when submitting an execbuf, if
the GPU is terminally wedged, then the operation cannot succeed and an
-EIO is reported. Secondly, existing userspace uses the throttle ioctl
to detect an already wedged GPU before starting using HW acceleration
(or to confirm that the GPU is wedged after an error condition). So if
the GPU is wedged when the user calls throttle, also report -EIO.

v2: Split more carefully the change to i915_wait_request() and assorted
ABI from the reset handling.
v3: Add a couple of WARN_ON(EIO) to the interruptible modesetting code
so that we don't start to leak EIO there in future (and break our hang
resistant modesetting).

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Daniel Vetter <daniel.vetter@ffwll.ch>
Reviewed-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Link: http://patchwork.freedesktop.org/patch/msgid/1460565315-7748-9-git-send-email-chris@chris-wilson.co.uk
Link: http://patchwork.freedesktop.org/patch/msgid/1460565315-7748-1-git-send-email-chris@chris-wilson.co.uk
2016-04-14 10:45:40 +01:00
Chris Wilson
f7e5838bb3 drm/i915: Simplify reset_counter handling during atomic modesetting
Now that the reset_counter is stored on the request, we can rearrange
the code to handle reading the counter versus waiting during the atomic
modesetting for readibility (by deleting the hairiest of codes).

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Daniel Vetter <daniel.vetter@ffwll.ch>
Reviewed-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Link: http://patchwork.freedesktop.org/patch/msgid/1460565315-7748-8-git-send-email-chris@chris-wilson.co.uk
2016-04-14 10:45:40 +01:00
Chris Wilson
299259a3a9 drm/i915: Store the reset counter when constructing a request
As the request is only valid during the same global reset epoch, we can
record the current reset_counter when constructing the request and reuse
it when waiting upon that request in future. This removes a very hairy
atomic check serialised by the struct_mutex at the time of waiting and
allows us to transfer those waits to a central dispatcher for all
waiters and all requests.

PS: With per-engine resets, we obviously cannot assume a global reset
epoch for the requests - a per-engine epoch makes the most sense. The
challenge then is how to handle checking in the waiter for when to break
the wait, as the fine-grained reset may also want to requeue the
request (i.e. the assumption that just because the epoch changes the
request is completed may be broken - or we just avoid breaking that
assumption with the fine-grained resets).

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Daniel Vetter <daniel.vetter@ffwll.ch>
Reviewed-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Link: http://patchwork.freedesktop.org/patch/msgid/1460565315-7748-7-git-send-email-chris@chris-wilson.co.uk
2016-04-14 10:45:40 +01:00
Chris Wilson
d98c52cf4f drm/i915: Tighten reset_counter for reset status
In the reset_counter, we use two bits to track a GPU hang and reset. The
low bit is a "reset-in-progress" flag that we set to signal when we need
to break waiters in order for the recovery task to grab the mutex. As
soon as the recovery task has the mutex, we can clear that flag (which
we do by incrementing the reset_counter thereby incrementing the gobal
reset epoch). By clearing that flag when the recovery task holds the
struct_mutex, we can forgo a second flag that simply tells GEM to ignore
the "reset-in-progress" flag.

The second flag we store in the reset_counter is whether the
reset failed and we consider the GPU terminally wedged. Whilst this flag
is set, all access to the GPU (at least through GEM rather than direct mmio
access) is verboten.

PS: Fun is in store, as in the future we want to move from a global
reset epoch to a per-engine reset engine with request recovery.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Daniel Vetter <daniel.vetter@ffwll.ch>
Reviewed-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Link: http://patchwork.freedesktop.org/patch/msgid/1460565315-7748-6-git-send-email-chris@chris-wilson.co.uk
2016-04-14 10:45:40 +01:00
Chris Wilson
7f1847ebf4 drm/i915: Simplify checking of GPU reset_counter in display pageflips
If we, when we store the reset_counter for the operation, we ensure that
it is not in a wedged or in the middle of a reset, we can then assert that
if any reset occurs the reset_counter must change. Later we can just
compare the operation's reset epoch against the current counter to see
if we need to abort the operation (to handle the hang).

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Daniel Vetter <daniel.vetter@ffwll.ch>
Reviewed-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Link: http://patchwork.freedesktop.org/patch/msgid/1460565315-7748-5-git-send-email-chris@chris-wilson.co.uk
2016-04-14 10:45:40 +01:00
Chris Wilson
c19ae989b0 drm/i915: Hide the atomic_read(reset_counter) behind a helper
This is principally a little bit of syntatic sugar to hide the
atomic_read()s throughout the code to retrieve the current reset_counter.
It also provides the other utility functions to check the reset state on the
already read reset_counter, so that (in later patches) we can read it once
and do multiple tests rather than risk the value changing between tests.

v2: Be more strict on converting existing i915_reset_in_progress() over to
the more verbose i915_reset_in_progress_or_wedged().

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Daniel Vetter <daniel.vetter@ffwll.ch>
Reviewed-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Link: http://patchwork.freedesktop.org/patch/msgid/1460565315-7748-4-git-send-email-chris@chris-wilson.co.uk
2016-04-14 10:45:40 +01:00
Chris Wilson
d501b1d2b1 drm/i915: Add GEM debugging Kconfig option
Currently there is a #define to enable extra BUG_ON for debugging
requests and associated activities. I want to expand its use to cover
all of GEM internals (so that we can saturate the code with asserts).
We can add a Kconfig option to make it easier to enable - with the usual
caveats of not enabling unless explicitly requested.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Reviewed-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/1460565315-7748-3-git-send-email-chris@chris-wilson.co.uk
2016-04-14 10:45:40 +01:00
Chris Wilson
e73bdd206a drm/i915: Disentangle i915_drv.h includes
Separate out the layers of includes (linux, drm, intel, i915) so that it
is a little easier to order our definitions between our multiple
reentrant headers. A couple of headers needed fixes to make them more
standalone (forgotten includes, forward declarations etc).

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/1460565315-7748-2-git-send-email-chris@chris-wilson.co.uk
2016-04-14 10:45:40 +01:00
Chris Wilson
0a793ad34f drm/i915: Force clean compilation with -Werror
Our driver compiles clean (nowadays thanks to 0day) but for me, at least,
it would be beneficial if the compiler threw an error rather than a
warning when it found a piece of suspect code. (I use this to
compile-check patch series and want to break on the first compiler error
in order to fix the patch.)

v2: Kick off a new "Debugging" submenu for i915.ko

At this point, we applied it to the kernel and promptly kicked it out
again as it broke buildbots (due to a compiler warning on 32bits):

commit 908d759b21
Author: Daniel Vetter <daniel.vetter@ffwll.ch>
Date:   Tue May 26 07:46:21 2015 +0200

    Revert "drm/i915: Force clean compilation with -Werror"

v3: Avoid enabling -Werror for allyesconfig/allmodconfig builds, using
COMPILE_TEST as a suitable proxy suggested by Andrew Morton. (Damien)
Only make the option available for EXPERT to reinforce that the option
should not be casually enabled.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Jani Nikula <jani.nikula@intel.com>
Cc: Damien Lespiau <damien.lespiau@intel.com>
Reviewed-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Link: http://patchwork.freedesktop.org/patch/msgid/1460565315-7748-1-git-send-email-chris@chris-wilson.co.uk
2016-04-14 10:45:40 +01:00
Mika Kuoppala
c02e85a06e drm/i915: Calculate edram size
With gen9+ the edram capabilities are defined so
that we can calculate the edram (ellc) size accordingly.

Note that there are undefined combinations for some subset of
edram capability bits. Return the closest size for undefined indexes.
Even if we get it wrong with beginning of future gen enabling, the size
information is currently only used for boot message and in debugfs entry.

v2: Use function instead of hard to read macro (Daniel)
v3: s/INTEL_INFO/INTEL_GEN (Matthew)

Signed-off-by: Mika Kuoppala <mika.kuoppala@intel.com>
Reviewed-by: Matthew Auld <matthew.auld@intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/1460557604-7126-2-git-send-email-mika.kuoppala@intel.com
2016-04-14 12:27:37 +03:00
Mika Kuoppala
3accaf7e73 drm/i915: Store and use edram capabilities
Store the edram capabilities instead of only the size of
edram. This is preparatory patch to allow edram size calculation
based on edram capability bits for gen9+. With gen9 the
edram is behind llc and is a separate entity. With hsw/bdw
it was more of a victim cache for LLC so the name 'eLLC' might
be warranted. Regardless, rename all mentions of eLLC to EDRAM to
clear the confusion.

v2: return bytes for edram size (Chris)
    s/eLLC/eDRAM in output if we are gen > 8

v3: rebase, INTEL_GEN (Chris)

Signed-off-by: Mika Kuoppala <mika.kuoppala@intel.com>
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
2016-04-14 12:27:37 +03:00
Mika Kuoppala
666fbcf5c2 drm/i915: Don't program eLLC IDI hash mask for gen9+
For gen9 onwards, eDRAM is a true memory side cache. So
there is no need to program idi hash mask as it is for eLLC
only.

v2: INTEL_GEN (Chris), s/has/hash (Matthew)

Signed-off-by: Mika Kuoppala <mika.kuoppala@intel.com>
Reviewed-by: Matthew Auld <matthew.auld@intel.com>
2016-04-14 12:27:37 +03:00
Jani Nikula
3cb26e26ed drm/i915/opregion: remove unnecessary ifdefs on CONFIG_ACPI
The whole file is ignored on CONFIG_ACPI=n.

Reviewed-by: Ander Conselvan de Oliveira <conselvan2@gmail.com>
Signed-off-by: Jani Nikula <jani.nikula@intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/1460127589-8357-1-git-send-email-jani.nikula@intel.com
2016-04-13 15:54:29 +03:00
Michał Winiarski
ce81a65c79 drm/i915: Adjust size of PIPE_CONTROL used for gen8 render seqno write
We started to use PIPE_CONTROL to write render ring seqno in order to
combat seqno write vs interrupt generation problems. This was introduced
by commit 7c17d37737 ("drm/i915: Use ordered seqno write interrupt
generation on gen8+ execlists").

On gen8+ size of PIPE_CONTROL with Post Sync Operation should be
6 dwords. When we're using older 5-dword variant it's possible to
observe inconsistent values written by PIPE_CONTROL with Post
Sync Operation from user batches, resulting in rendering corruptions.

v2: Fix BAT failures
v3: Comments on alignment and thrashing high dword of seqno (Chris)
v4: Updated commit msg (Mika)

Testcase: igt/gem_pipe_control_store_loop/*-qword-write
Issue: VIZ-7393
Cc: stable@vger.kernel.org
Cc: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Mika Kuoppala <mika.kuoppala@intel.com>
Cc: Abdiel Janulgue <abdiel.janulgue@linux.intel.com>
Signed-off-by: Michał Winiarski <michal.winiarski@intel.com>
Reviewed-by: Mika Kuoppala <mika.kuoppala@intel.com>
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
Tested-by: Abdiel Janulgue <abdiel.janulgue@linux.intel.com>
Signed-off-by: Mika Kuoppala <mika.kuoppala@intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/1460469115-26002-1-git-send-email-michal.winiarski@intel.com
2016-04-13 15:34:51 +03:00
Mika Kuoppala
97ea6be161 drm/i915/skl: Fix spurious gpu hang with gt3/gt4 revs
Experiments with heaven 4.0 benchmark and skylake gt3e (rev 0xa)
suggest that WaForceContextSaveRestoreNonCoherent is needed for all
revs. Extending this to all revs cures a gpu hang with rev 0xa when
running heaven4.0 gpu benchmark.

We have been here before, with problems enabling gt4e and extending
up to revision F0 instead of false claims of bspec of E0 only. See
commit <e238659ddd88> ("drm/i915/skl: Default to noncoherent access
up to F0"). In retrospect we should have covered this with this big
blanket back then already, as E0 vs F0 discrepancy was suspicious
enough.

Previously the WaForceEnableNonCoherent has been tied to
context non-coherence, atleast in relevant hsds. So keep this tie
and extended this alongside.

Cc: Abdiel Janulgue <abdiel.janulgue@linux.intel.com>
Cc: Ben Widawsky <benjamin.widawsky@intel.com>
Cc: Timo Aaltonen <tjaalton@ubuntu.com>
Cc: stable@vger.kernel.org
Reported-by: Mike Lothian <mike@fireburn.co.uk>
References: https://bugs.freedesktop.org/show_bug.cgi?id=93491
Signed-off-by: Mika Kuoppala <mika.kuoppala@intel.com>
Reviewed-by: Ben Widawsky <benjamin.widawsky@intel.com>
Tested-by: Timo Aaltonen <tjaalton@ubuntu.com>
Link: http://patchwork.freedesktop.org/patch/msgid/1459860977-27751-2-git-send-email-mika.kuoppala@intel.com
2016-04-13 15:30:25 +03:00
Mika Kuoppala
185c66e57c drm/i915/skl: Fix rc6 based gpu/system hang
For all gt3 and gt4 skylake variants, extend the usage of
WaRsDisableCoarsePowerGating for all revisions. Without this
gt3 and gt4 skylakes up to atleast rev 0xa can gpu hang or
system hang.

Cc: Abdiel Janulgue <abdiel.janulgue@linux.intel.com>
Cc: Ben Widawsky <benjamin.widawsky@intel.com>
Cc: Timo Aaltonen <tjaalton@ubuntu.com>
Cc: <stable@vger.kernel.org>
Reported-by: Mikael Djurfeldt <mikael@djurfeldt.com>
References: https://bugs.freedesktop.org/show_bug.cgi?id=94161
Signed-off-by: Mika Kuoppala <mika.kuoppala@intel.com>
Reviewed-by: Ben Widawsky <benjamin.widawsky@intel.com>
Tested-by: Timo Aaltonen <tjaalton@ubuntu.com>
Link: http://patchwork.freedesktop.org/patch/msgid/1459860977-27751-1-git-send-email-mika.kuoppala@intel.com
2016-04-13 15:28:25 +03:00
Jani Nikula
3f10e82fe1 drm/i915: add INTEL_GEN() helper shorthand for INTEL_INFO()->gen
Sudden realization:

$ grep -ho "INTEL_INFO([^)]*)->[a-zA-Z0-9_]*" *.[ch] | sed 's/.*->//' |\
  sort | uniq -c | sort -rn | head -5
  446 gen
   24 num_pipes
   10 ring_mask
    9 color
    4 subslice_per_slice

Acked-by: Chris Wilson <chris@chris-wilson.co.uk>
Signed-off-by: Jani Nikula <jani.nikula@intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/1460022497-29304-1-git-send-email-jani.nikula@intel.com
2016-04-13 14:36:14 +03:00
Tvrtko Ursulin
7d774cacf0 drm/i915: Use new i915_gem_object_pin_map for LRC
We can use the new pin/lazy unpin API for simplicity
and more performance in the execlist submission paths.

v2:
  * Fix error handling and convert more users.
  * Compact some names for readability.

v3:
  * intel_lr_context_free was not unpinning.
  * Special case for GPU reset which otherwise unbalances
    the HWS object pages pin count by running the engine
    initialization only (not destructors).

v4:
  * Rebased on top of hws setup/init split.

Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Cc: Chris Wilson <chris@chris-wilson.co.uk>
Link: http://patchwork.freedesktop.org/patch/msgid/1460472042-1998-1-git-send-email-tvrtko.ursulin@linux.intel.com
[tursulin: renames: s/hwd/hws/, s/obj_addr/vaddr/]
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
2016-04-13 10:47:54 +01:00
Tvrtko Ursulin
04794adbdd drm/i915: Split execlists hardware status page initialisation from setup
Split the hardware status page into setup and initialisation,
where setup means setting up the driver state to support the
engine, and initialization means programming the hardware
with the before set up state.

This way the design matches the design of the engine setup/init
code which is split in the same fashion and it enables the
stages to be used in a balanced fashion (engine setup - hws
setup, engine init - hws init).

This will enable the upcoming improvements to slot in without
any kludges on the GPU reset path.

Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Suggested-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
2016-04-13 10:44:32 +01:00
Ville Syrjälä
b521973b45 drm/i915: Don't read out port_clock on CHV when DPLL is disabled
Check whether the DPLL is even enabled before readoing out the dividers
and trying to derive port_clock on CHV. We already did this on VLV.

Also remove the comment "MIPI" comment from the VLV code since we call
this function whenever the pipe is enabled.

Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/1458052809-23426-9-git-send-email-ville.syrjala@linux.intel.com
Reviewed-by: Jani Nikula <jani.nikula@intel.com>
2016-04-12 21:17:47 +03:00
Ville Syrjälä
7f7d8dd62c drm/i915: Dump pfit PGM_RATIOS as hex
pgm_ratios in stored as a register value in pipe config, so let's dump
this one as hex as well.

Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/1458052809-23426-15-git-send-email-ville.syrjala@linux.intel.com
Reviewed-by: Jani Nikula <jani.nikula@intel.com>
2016-04-12 21:17:35 +03:00
Ville Syrjälä
ae9ec62bda drm/i915: Fix CHV DSI PLL refclk during state readout
Use the proper refclock frequency (100MHz) when reading out the
current DSI clock on CHV.

Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/1458052809-23426-13-git-send-email-ville.syrjala@linux.intel.com
Reviewed-by: Jani Nikula <jani.nikula@intel.com>
2016-04-12 21:12:02 +03:00
Ville Syrjälä
f00b56896e drm/i915: Power down the DSI PLL before reconfiguring it
On VLV at least, the BIOS may leave the DSI PLL enabled in some wonky
state where it just refuses to lock. Simply disabling the PLL before
reconfiguring it is not enough to fix it, but power gating the PLL
prior to reconfiguring does work.

This happens on BYT FFRD8 when booting with HDMI connected so the DSI
display will not be lit up by the BIOS.

Also we can remove the code for BXT that disables the PLL before
enabling it again.

v2: s/vlv/intel/ since BXT made thing generic
v3: Remove the BXT disable PLL before enable trick

Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/1458052809-23426-11-git-send-email-ville.syrjala@linux.intel.com
Acked-by: Jani Nikula <jani.nikula@intel.com>
2016-04-12 21:11:51 +03:00
Ville Syrjälä
50dd63a27b drm/i915: Change lfsr_converts[] to u16
All the values in the DSI PLL LFSR seed table fit into 9bits, so change
the type to u16 from u32 to save a bit of space.

 drivers/gpu/drm/i915/i915.ko:
-.rodata                        90824
+.rodata                        90760

Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/1458052809-23426-10-git-send-email-ville.syrjala@linux.intel.com
Reviewed-by: Jani Nikula <jani.nikula@intel.com>
2016-04-12 21:11:35 +03:00
Ville Syrjälä
522bad5b5e Revert "drm/i915: Limit the auto arming of mmio debugs on vlv/chv"
Enable the unclaimd register detection stuff on vlv/chv since we've now
fixed the known problems during suspend.

This reverts commit c81eeea6c1.

Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/1460382992-28728-11-git-send-email-ville.syrjala@linux.intel.com
Reviewed-by: Imre Deak <imre.deak@intel.com>
2016-04-12 19:09:39 +03:00
Ville Syrjälä
71b8b41d5b drm/i915: Move DPINVGTT setup to vlv_display_irq_reset()
DPINVGTT lives inside the disp2d power well so we can't frob it unless
we know the power well is active. Let's this stuff into
vlv_display_irq_reset() which is only called at the right times so that
we don't get unclaimed register access errors.

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=94164
Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/1460382992-28728-10-git-send-email-ville.syrjala@linux.intel.com
Reviewed-by: Imre Deak <imre.deak@intel.com>
2016-04-12 19:09:21 +03:00
Ville Syrjälä
766078df43 drm/i915: Move vlv_init_display_clock_gating() to the display power well
The registers frobbed by vlv_init_display_clock_gating() libve inside
the disp2d power well, so frobbing them while the power well is down
results in unclaimed register access warning (and of course the values
won't stick). Let's do this setup after we know the power well is
enabled.

It's also worth noting that DSPCLK_GATE_D and CBR1_VLV lose their state
when the power well goes down, but fortunately the values we've been
writing are actually the reset defaults.

MI_ARB_VLV actually retains its value even if the power well was turned
off, we just can't access it while the power well is down.

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=94164
Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/1460382992-28728-9-git-send-email-ville.syrjala@linux.intel.com
Reviewed-by: Imre Deak <imre.deak@intel.com>
2016-04-12 19:09:07 +03:00
Ville Syrjälä
6b7eafc1b4 drm/i915: Warn if irq_mask isn't ~0 during vlv/cvh display irq postinstall
We expect vlv_display_irq_reset() to have been called prior to
vlv_display_irq_postinstall() so let's WARN if that isn't the case.

Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/1460382992-28728-8-git-send-email-ville.syrjala@linux.intel.com
Reviewed-by: Imre Deak <imre.deak@intel.com>
2016-04-12 19:08:51 +03:00
Ville Syrjälä
9ab981f22b drm/i915: Use GEN5_IRQ_INIT() in vlv_display_irq_postinstall()
Replace the hand rolled IMR/IER setup in vlv_display_irq_postinstall()
with GEN5_IRQ_INIT(). Also rename the iir_mask to enable_mask to avoid
consusion since we no longer deal with IIR here.

Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/1460382992-28728-7-git-send-email-ville.syrjala@linux.intel.com
Reviewed-by: Imre Deak <imre.deak@intel.com>
2016-04-12 19:08:32 +03:00
Ville Syrjälä
d6c6980358 drm/i915: Clear display interrupt before enabling when turning on the power well
For a bit of extra paranoia make sure the display irqs are all cleared
before we enabled them when turning on the power well. This should
really be the case already since the power well was off which resets
everything.

Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/1460382992-28728-6-git-send-email-ville.syrjala@linux.intel.com
Reviewed-by: Imre Deak <imre.deak@intel.com>
2016-04-12 19:07:38 +03:00
Ville Syrjälä
8bb613068a drm/i915: Move vlv/chv display irq code to a more logical place
Reshuffle the code a bit to move the vlv/chv display irq functions away
from the main irq hooks, next to the other sub (de,gt,etc.) hooks.

v2: Rebased due to changes in vlv_display_irq_reset()

Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Reviewed-by: Imre Deak <imre.deak@intel.com> (v1)
Link: http://patchwork.freedesktop.org/patch/msgid/1460476604-2035-1-git-send-email-ville.syrjala@linux.intel.com
2016-04-12 19:07:24 +03:00
Ville Syrjälä
9918271efc drm/i915: Skip display irq setup if display irqs aren't flagged as enabled
During runtime PM we'll be reinitializing interrupt support from the
ground up. However since the display power well will be off at that
time, well end up with a ton of unclaimed register accesses from the
display irq setup. Since we turned off the power well already before
runtime suspend, we've flagged display irqs as disabled during runtime
PM transitions. So we can just check that flag to see if we should do
skip display irqs during irq setup.

During driver load display irqs will be flagged as enabled since we've
turned on the power well already, however the power well code will have
skipped the display irq setup since irq support as a whole wasn't yet
enabled when the power well was enabled. So we'll want to do the display
irq setup in that case.

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=94164
Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/1460382992-28728-4-git-send-email-ville.syrjala@linux.intel.com
Reviewed-by: Imre Deak <imre.deak@intel.com>
2016-04-12 19:07:13 +03:00
Ville Syrjälä
ad22d10654 drm/i915: Fix up vlv/chv display irq setup
The vlv/chv display irq setup was a bit of mess after I ran out of steam
when working on it last. Fix it up so that we just have a _reset() and
_postinstall() hooks for the display irqs, and use those consistently.

v2: Clear out pipestat_irq_mask[] and PIPE_FIFO_UNDERRUN_STATUS in
    vlv_display_irq_reset() (Imre)

Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Reviewed-by: Imre Deak <imre.deak@intel.com> (v1)
Link: http://patchwork.freedesktop.org/patch/msgid/1460476574-1921-1-git-send-email-ville.syrjala@linux.intel.com
2016-04-12 19:06:59 +03:00
Ville Syrjälä
93de68f940 drm/i915: Remove "VLV magic" from irq setup
No clue what this is supposed to achieve. I think it's been there since
the very beginning, so presumably some kind of kludge for very early
silicon. Let's just throw it out.

Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/1460382992-28728-2-git-send-email-ville.syrjala@linux.intel.com
Reviewed-by: Imre Deak <imre.deak@intel.com>
2016-04-12 19:06:52 +03:00
Ville Syrjälä
6b23f3e8a6 drm/i915: Replace ILK eDP underrun suppression with something better
The underruns we were seeing when enabling eDP port A on ILK seem to
have been caused by prematurely clearing the LP1+ watermark values when
disabling LP1+ watermarks. Now that the watermarks are handled
properly, we can rip out the underrun suppression around the port A
enable.

We still need to worry about the underruns on FDI when enabling
the eDP PLL. But as Bspec tells us, we can avoid that by a vblank
wait on the pipe driving FDI just prior to enabling the eDP PLL.

Cc: Daniel Vetter <daniel.vetter@ffwll.ch>
Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/1459536799-18109-4-git-send-email-ville.syrjala@linux.intel.com
Reviewed-by: Patrik Jakobsson <patrik.jakobsson@linux.intel.com>
2016-04-12 19:02:21 +03:00
Ville Syrjälä
1204d5baa8 drm/i915: Make sure LP1+ watermarks levels are preserved when going from 1 to 2 pipes
Once again ILK is unhappy if we clear out the LP1+ watermark levels
outright, and instead we must disable the levels we don't want while
still leaving the actual programmed watermark levels intact.

Fixes underruns on the already enabled pipe when programming watermarks
while enabling the second pipe.

Cc: Daniel Vetter <daniel.vetter@ffwll.ch>
Cc: Matt Roper <matthew.d.roper@intel.com>
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=93787
Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/1459536799-18109-3-git-send-email-ville.syrjala@linux.intel.com
Reviewed-by: Patrik Jakobsson <patrik.jakobsson@linux.intel.com>
2016-04-12 19:01:59 +03:00
Ville Syrjälä
b2c0593a0c drm/i915: Try to shut up more ILK underruns
Take a bigger hammer to the underrun suppression on ILK. Instead of
trying to suppress them at specific points in the modeset sequence just
silence them across the entire sequence. This gets rid of some underruns
at least on my ILK. Note that this changes SNB and IVB to follow the
same approach just to keep the code less convoluted. The difference is
that on those platforms we won't suppress CPU underruns for port A since
it doesn't seem to be necessary.

My ILK has port A eDP and two PCH HDMI ports, so I can't be sure this is
as effective on other PCH port types. Perhaps we still need some of
Daniel's extra vblank waits [2]?

I've still been able to trigger an underrun on the other pipe, but
fixing that perhaps needs the LP1+ disable trick I implemented here [1]
which never got merged.

A few details which hamper stress testing on my ILK are that sometimes
the PCH transcoder gets messed up and refuses to shut down, and sometimes
even the panel power sequencer apparently gets stuck on the always on
position.

[1] https://lists.freedesktop.org/archives/intel-gfx/2014-March/041317.html
[2] https://lists.freedesktop.org/archives/intel-gfx/2016-January/086397.html

v2: Add a note that we also get underruns when enabling PCH ports

Cc: Daniel Vetter <daniel.vetter@ffwll.ch>
Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Reviewed-by: Daniel Vetter <daniel.vetter@ffwll.ch> (v1)
Link: http://patchwork.freedesktop.org/patch/msgid/1459536799-18109-2-git-send-email-ville.syrjala@linux.intel.com
Reviewed-by: Patrik Jakobsson <patrik.jakobsson@linux.intel.com>
2016-04-12 19:01:35 +03:00
Tvrtko Ursulin
3756685a18 drm/i915: Only grab correct forcewake for the engine with execlists
Rather than blindly waking up all forcewake domains on command
submission, we can teach each engine what is (or are) the correct
one to take.

On platforms with multiple forcewake domains like VLV, CHV, SKL
and BXT, this has the potential of lowering the GPU and CPU
power use and submission latency.

To implement it we add a function named
intel_uncore_forcewake_for_reg whose purpose is to query which
forcewake domains need to be taken to read or write a specific
register with raw mmio accessors.

These enables the execlists engine setup  to query which
forcewake domains are relevant per engine on the currently
running platform.

v2:
  * Kerneldoc.
  * Split from intel_uncore.c macro extraction, WARN_ON,
    no warns on old platforms. (Chris Wilson)

v3:
  * Single domain per engine, mention all registers,
    bi-directional function and a new name, fix handling
    of gen6 and gen7 writes. (Chris Wilson)

Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Cc: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
Link: http://patchwork.freedesktop.org/patch/msgid/1460468251-14069-1-git-send-email-tvrtko.ursulin@linux.intel.com
2016-04-12 15:35:22 +01:00
Tvrtko Ursulin
a70ecc16d0 drm/i915: Remove forcewake request registers from the shadowed table
Chris Wilson points out that we can remove them from the array
since they are always written to with raw accessors.

Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
2016-04-12 15:35:22 +01:00
Tvrtko Ursulin
6863b76c62 drm/i915: Extract knowledge of register forcewake domains
Knowledge of which register per platform belonds in which
forcewake domain was embedded in the MMIO accessors themselves.

Extract it into standalone macros so they can be used from
new code in the following patches.

This causes GCC to compile some of the MMIO accessors slightly
differently and grows the code a tiny amount. But none of the
growth is on the fast-path so it does not matter hugely.

Affected sizes before:

00000000000026f0 00000000000001a5 t gen6_read16
0000000000002390 00000000000001a5 t gen6_read32
00000000000028a0 00000000000001a5 t gen6_read64

00000000000061d0 000000000000019e t gen8_write16
0000000000006510 000000000000019d t gen8_write32
0000000000006370 000000000000019d t gen8_write64
00000000000021f0 000000000000019d t gen8_write8

Affected sizes after:

0000000000002840 00000000000001aa t gen6_read16
00000000000024e0 00000000000001a9 t gen6_read32
00000000000029f0 00000000000001a9 t gen6_read64

0000000000004f20 00000000000001b5 t gen8_write16
0000000000004ba0 00000000000001b4 t gen8_write32
00000000000050e0 00000000000001b4 t gen8_write64
0000000000004d60 00000000000001b4 t gen8_write8

Other MMIO accessors are not affected in size.

Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Acked-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
2016-04-12 15:35:22 +01:00
Tvrtko Ursulin
4e1176dd61 drm/i915: Do not serialize forcewake acquire across domains
On platforms with multiple forcewake domains it seems more efficient
to request all desired ones and then to wait for acks to avoid
needlessly serializing on each domain.

v2: Rebase.

Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
Link: http://patchwork.freedesktop.org/patch/msgid/1460045074-1006-1-git-send-email-tvrtko.ursulin@linux.intel.com
2016-04-12 14:30:41 +01:00
Tvrtko Ursulin
33c582c10a drm/i915: Simplify for_each_fw_domain iterators
As the vast majority of users do not use the domain id variable,
we can eliminate it from the iterator and also change the latter
using the same principle as was recently done for for_each_engine.

For a couple of callers which do need the domain mask, store it
in the domain array (which already has the domain id), then both
can be retrieved thence.

Result is clearer code and smaller generated binary, especially
in the tight fw get/put loops. Also, relationship between domain
id and mask is no longer assumed in the macro.

v2: Improve grammar in the commit message and rename the
    iterator to for_each_fw_domain_masked for consistency.
    (Dave Gordon)

Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Dave Gordon <david.s.gordon@intel.com>
2016-04-12 14:30:41 +01:00