Commit Graph

26379 Commits

Author SHA1 Message Date
Ville Syrjälä
d6c6980358 drm/i915: Clear display interrupt before enabling when turning on the power well
For a bit of extra paranoia make sure the display irqs are all cleared
before we enabled them when turning on the power well. This should
really be the case already since the power well was off which resets
everything.

Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/1460382992-28728-6-git-send-email-ville.syrjala@linux.intel.com
Reviewed-by: Imre Deak <imre.deak@intel.com>
2016-04-12 19:07:38 +03:00
Ville Syrjälä
8bb613068a drm/i915: Move vlv/chv display irq code to a more logical place
Reshuffle the code a bit to move the vlv/chv display irq functions away
from the main irq hooks, next to the other sub (de,gt,etc.) hooks.

v2: Rebased due to changes in vlv_display_irq_reset()

Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Reviewed-by: Imre Deak <imre.deak@intel.com> (v1)
Link: http://patchwork.freedesktop.org/patch/msgid/1460476604-2035-1-git-send-email-ville.syrjala@linux.intel.com
2016-04-12 19:07:24 +03:00
Ville Syrjälä
9918271efc drm/i915: Skip display irq setup if display irqs aren't flagged as enabled
During runtime PM we'll be reinitializing interrupt support from the
ground up. However since the display power well will be off at that
time, well end up with a ton of unclaimed register accesses from the
display irq setup. Since we turned off the power well already before
runtime suspend, we've flagged display irqs as disabled during runtime
PM transitions. So we can just check that flag to see if we should do
skip display irqs during irq setup.

During driver load display irqs will be flagged as enabled since we've
turned on the power well already, however the power well code will have
skipped the display irq setup since irq support as a whole wasn't yet
enabled when the power well was enabled. So we'll want to do the display
irq setup in that case.

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=94164
Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/1460382992-28728-4-git-send-email-ville.syrjala@linux.intel.com
Reviewed-by: Imre Deak <imre.deak@intel.com>
2016-04-12 19:07:13 +03:00
Ville Syrjälä
ad22d10654 drm/i915: Fix up vlv/chv display irq setup
The vlv/chv display irq setup was a bit of mess after I ran out of steam
when working on it last. Fix it up so that we just have a _reset() and
_postinstall() hooks for the display irqs, and use those consistently.

v2: Clear out pipestat_irq_mask[] and PIPE_FIFO_UNDERRUN_STATUS in
    vlv_display_irq_reset() (Imre)

Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Reviewed-by: Imre Deak <imre.deak@intel.com> (v1)
Link: http://patchwork.freedesktop.org/patch/msgid/1460476574-1921-1-git-send-email-ville.syrjala@linux.intel.com
2016-04-12 19:06:59 +03:00
Ville Syrjälä
93de68f940 drm/i915: Remove "VLV magic" from irq setup
No clue what this is supposed to achieve. I think it's been there since
the very beginning, so presumably some kind of kludge for very early
silicon. Let's just throw it out.

Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/1460382992-28728-2-git-send-email-ville.syrjala@linux.intel.com
Reviewed-by: Imre Deak <imre.deak@intel.com>
2016-04-12 19:06:52 +03:00
Ville Syrjälä
6b23f3e8a6 drm/i915: Replace ILK eDP underrun suppression with something better
The underruns we were seeing when enabling eDP port A on ILK seem to
have been caused by prematurely clearing the LP1+ watermark values when
disabling LP1+ watermarks. Now that the watermarks are handled
properly, we can rip out the underrun suppression around the port A
enable.

We still need to worry about the underruns on FDI when enabling
the eDP PLL. But as Bspec tells us, we can avoid that by a vblank
wait on the pipe driving FDI just prior to enabling the eDP PLL.

Cc: Daniel Vetter <daniel.vetter@ffwll.ch>
Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/1459536799-18109-4-git-send-email-ville.syrjala@linux.intel.com
Reviewed-by: Patrik Jakobsson <patrik.jakobsson@linux.intel.com>
2016-04-12 19:02:21 +03:00
Ville Syrjälä
1204d5baa8 drm/i915: Make sure LP1+ watermarks levels are preserved when going from 1 to 2 pipes
Once again ILK is unhappy if we clear out the LP1+ watermark levels
outright, and instead we must disable the levels we don't want while
still leaving the actual programmed watermark levels intact.

Fixes underruns on the already enabled pipe when programming watermarks
while enabling the second pipe.

Cc: Daniel Vetter <daniel.vetter@ffwll.ch>
Cc: Matt Roper <matthew.d.roper@intel.com>
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=93787
Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/1459536799-18109-3-git-send-email-ville.syrjala@linux.intel.com
Reviewed-by: Patrik Jakobsson <patrik.jakobsson@linux.intel.com>
2016-04-12 19:01:59 +03:00
Ville Syrjälä
b2c0593a0c drm/i915: Try to shut up more ILK underruns
Take a bigger hammer to the underrun suppression on ILK. Instead of
trying to suppress them at specific points in the modeset sequence just
silence them across the entire sequence. This gets rid of some underruns
at least on my ILK. Note that this changes SNB and IVB to follow the
same approach just to keep the code less convoluted. The difference is
that on those platforms we won't suppress CPU underruns for port A since
it doesn't seem to be necessary.

My ILK has port A eDP and two PCH HDMI ports, so I can't be sure this is
as effective on other PCH port types. Perhaps we still need some of
Daniel's extra vblank waits [2]?

I've still been able to trigger an underrun on the other pipe, but
fixing that perhaps needs the LP1+ disable trick I implemented here [1]
which never got merged.

A few details which hamper stress testing on my ILK are that sometimes
the PCH transcoder gets messed up and refuses to shut down, and sometimes
even the panel power sequencer apparently gets stuck on the always on
position.

[1] https://lists.freedesktop.org/archives/intel-gfx/2014-March/041317.html
[2] https://lists.freedesktop.org/archives/intel-gfx/2016-January/086397.html

v2: Add a note that we also get underruns when enabling PCH ports

Cc: Daniel Vetter <daniel.vetter@ffwll.ch>
Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Reviewed-by: Daniel Vetter <daniel.vetter@ffwll.ch> (v1)
Link: http://patchwork.freedesktop.org/patch/msgid/1459536799-18109-2-git-send-email-ville.syrjala@linux.intel.com
Reviewed-by: Patrik Jakobsson <patrik.jakobsson@linux.intel.com>
2016-04-12 19:01:35 +03:00
Tvrtko Ursulin
3756685a18 drm/i915: Only grab correct forcewake for the engine with execlists
Rather than blindly waking up all forcewake domains on command
submission, we can teach each engine what is (or are) the correct
one to take.

On platforms with multiple forcewake domains like VLV, CHV, SKL
and BXT, this has the potential of lowering the GPU and CPU
power use and submission latency.

To implement it we add a function named
intel_uncore_forcewake_for_reg whose purpose is to query which
forcewake domains need to be taken to read or write a specific
register with raw mmio accessors.

These enables the execlists engine setup  to query which
forcewake domains are relevant per engine on the currently
running platform.

v2:
  * Kerneldoc.
  * Split from intel_uncore.c macro extraction, WARN_ON,
    no warns on old platforms. (Chris Wilson)

v3:
  * Single domain per engine, mention all registers,
    bi-directional function and a new name, fix handling
    of gen6 and gen7 writes. (Chris Wilson)

Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Cc: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
Link: http://patchwork.freedesktop.org/patch/msgid/1460468251-14069-1-git-send-email-tvrtko.ursulin@linux.intel.com
2016-04-12 15:35:22 +01:00
Tvrtko Ursulin
a70ecc16d0 drm/i915: Remove forcewake request registers from the shadowed table
Chris Wilson points out that we can remove them from the array
since they are always written to with raw accessors.

Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
2016-04-12 15:35:22 +01:00
Tvrtko Ursulin
6863b76c62 drm/i915: Extract knowledge of register forcewake domains
Knowledge of which register per platform belonds in which
forcewake domain was embedded in the MMIO accessors themselves.

Extract it into standalone macros so they can be used from
new code in the following patches.

This causes GCC to compile some of the MMIO accessors slightly
differently and grows the code a tiny amount. But none of the
growth is on the fast-path so it does not matter hugely.

Affected sizes before:

00000000000026f0 00000000000001a5 t gen6_read16
0000000000002390 00000000000001a5 t gen6_read32
00000000000028a0 00000000000001a5 t gen6_read64

00000000000061d0 000000000000019e t gen8_write16
0000000000006510 000000000000019d t gen8_write32
0000000000006370 000000000000019d t gen8_write64
00000000000021f0 000000000000019d t gen8_write8

Affected sizes after:

0000000000002840 00000000000001aa t gen6_read16
00000000000024e0 00000000000001a9 t gen6_read32
00000000000029f0 00000000000001a9 t gen6_read64

0000000000004f20 00000000000001b5 t gen8_write16
0000000000004ba0 00000000000001b4 t gen8_write32
00000000000050e0 00000000000001b4 t gen8_write64
0000000000004d60 00000000000001b4 t gen8_write8

Other MMIO accessors are not affected in size.

Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Acked-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
2016-04-12 15:35:22 +01:00
Tvrtko Ursulin
4e1176dd61 drm/i915: Do not serialize forcewake acquire across domains
On platforms with multiple forcewake domains it seems more efficient
to request all desired ones and then to wait for acks to avoid
needlessly serializing on each domain.

v2: Rebase.

Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
Link: http://patchwork.freedesktop.org/patch/msgid/1460045074-1006-1-git-send-email-tvrtko.ursulin@linux.intel.com
2016-04-12 14:30:41 +01:00
Tvrtko Ursulin
33c582c10a drm/i915: Simplify for_each_fw_domain iterators
As the vast majority of users do not use the domain id variable,
we can eliminate it from the iterator and also change the latter
using the same principle as was recently done for for_each_engine.

For a couple of callers which do need the domain mask, store it
in the domain array (which already has the domain id), then both
can be retrieved thence.

Result is clearer code and smaller generated binary, especially
in the tight fw get/put loops. Also, relationship between domain
id and mask is no longer assumed in the macro.

v2: Improve grammar in the commit message and rename the
    iterator to for_each_fw_domain_masked for consistency.
    (Dave Gordon)

Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Dave Gordon <david.s.gordon@intel.com>
2016-04-12 14:30:41 +01:00
Tvrtko Ursulin
a57a4a67e5 drm/i915: Use consistent forcewake auto-release timeout across kernel configs
Because it is based on jiffies, current implementation releases the
forcewake at any time between straight away and between 1ms and 10ms,
depending on the kernel configuration (CONFIG_HZ).

This is probably not what has been desired, since the dynamics of keeping
parts of the GPU awake should not be correlated with this kernel
configuration parameter.

Change the auto-release mechanism to use hrtimers and set the timeout to
1ms with a 1ms of slack. This should make the GPU power consistent
across kernel configs, and timer slack should enable some timer coalescing
where multiple force-wake domains exist, or with unrelated timers.

For GlBench/T-Rex this decreases the number of forcewake releases from
~480 to ~300 per second, and for a heavy combined OGL/OCL test from
~670 to ~360 (HZ=1000 kernel).

Even though this reduction can be attributed to the average release period
extending from 0-1ms to 1-2ms, as discussed above, it will make the
forcewake timeout consistent for different CONFIG_HZ values.

Real life measurements with the above workload has shown that, with this
patch, both manage to auto-release the forcewake between 2-4 times per
10ms, even though the number of forcewake gets is dramatically different.

T-Rex requests between 5-10 explicit gets and 5-10 implict gets in each
10ms period, while the OGL/OCL test requests 250 and 380 times in the same
period.

The two data points together suggest that the nature of the forwake
accesses is bursty and that further changes and potential timeout
extensions, or moving the start of timeout from the first to the last
automatic forcewake grab, should be carefully measured for power and
performance effects.

v2:
  * Commit spelling. (Dave Gordon)
  * More discussion on numbers in the commit. (Chris Wilson)

Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Reviewed-by: Dave Gordon <david.s.gordon@intel.com>
Cc: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
2016-04-12 14:30:41 +01:00
Ville Syrjälä
a05628195a drm/i915: Get panel_type from OpRegion panel details
We've had problems on several occasions with using the panel type
from the VBT block 40. Usually it seems to be 2, which often
doesn't give us the correct timings for the panel. After some
more digging I found a way to get a panel type via the OpRegion
SWSCI GBDA "Get Panel Details" method. Let's try to use it.

The spec has this to say about the output:
"Bits [15:8] - Panel Type
 Bits contain the panel type user setting from CMOS
 00h = Not Valid, use default Panel Type & Timings from VBT
 01h - 0Fh = Panel Number"

Another version of the spec lists the valid range as 1-16, which makes
more sense since VBT supports 16 panels. Based on actual results
from Rob's G45, 1-16 is what we need to accept.

The other bits in the output don't look relevant for the problem at
hand.

The input is specified as:
"Bits [31:4] - Reserved
 Reserved (must be zero)
 Bits [3:0] - Panel Number
 These bits contain the sequential index of Panel, starting at 0 and
 counting upwards from the first integrated Internal Flat-Panel Display
 Encoder present, and then from the first external Display Encoder
 (e.g., S/DVO-B then S/DVO-C) which supports Internal Flat-Panels.
 0h - 0Fh = Panel number"

For now I've just hardcoded the input panel number as 0. That would seem
like a decent choise for LVDS. Not so sure about eDP when port != A.

v2: Accept values 1-16
    Filter out bogus results in opregion code (Jani)
    Add debug logging for all the different branches (Jani)

Cc: Jani Nikula <jani.nikula@linux.intel.com>
Cc: Rob Kramer <rob@solution-space.com>
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=94825
Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/1460359431-11003-1-git-send-email-ville.syrjala@linux.intel.com
Reviewed-by: Jani Nikula <jani.nikula@intel.com>
Tested-by: Rob Kramer <rob@solution-space.com>
2016-04-12 13:23:43 +03:00
Ville Syrjälä
3e845c7a40 drm/i915: Replace the static panel_type variable with dev_priv->vbt.panel_type
Store the extracted panel_type under dev_priv.vbt instead of keeping
around a static variable for it.

Cc: Rob Kramer <rob@solution-space.com>
Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Reviewed-by: Jani Nikula <jani.nikula@intel.com>
2016-04-12 13:23:43 +03:00
Ville Syrjälä
eeeebea6cb drm/i915: Reject panel_type > 0xf from VBT
VBT can only contain 16 panel entries, indexed with the panel_type.
To play it safe we should reject panel_type > 0xf, so that we don't
read past the valid data.

v2: Add debug logging (Jani)

Cc: Jani Nikula <jani.nikula@linux.intel.com>
Cc: Rob Kramer <rob@solution-space.com>
Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Reviewed-by: Jani Nikula <jani.nikula@intel.com> (v1)
Link: http://patchwork.freedesktop.org/patch/msgid/1460359329-10817-1-git-send-email-ville.syrjala@linux.intel.com
2016-04-12 13:23:42 +03:00
Ville Syrjälä
706778013b drm/i915: Make GMBUS timeout message DRM_DEBUG_KMS
There's no real reason the user should care that we're about to fall
back to bitbanging, so let's change the message from DRM_INFO to
DRM_DEBUG_KMS.

Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/1457366220-29409-5-git-send-email-ville.syrjala@linux.intel.com
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=94890
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
2016-04-12 13:23:17 +03:00
Ville Syrjälä
3e4d44e0fa drm/i915: Restore GMBUS operation after a failed bit-banging fallback
When the GMBUS based i2c transfer times out, we try to fall back to
bit-banging and retry the operation that way. However if the bit-banging
attempt also fails, we should probably go back to the GMBUS method for
the next attempt. Maybe there simply wasn't anyone one the bus at this
time.

There's also a bit of a mess going on with the force_bit handling.
It's supposed to be a ref count actually, and it is as far as
intel_gmbus_force_bit() is concerned. But it's treated as just a
flag by the timeout based bit-banging fallback. I suppose that's
fine since we should never end up in the timeout fallback case
if force_bit was already non-zero. However now that we want to restore
things back to where they were after the bit-banging attempt failed,
we're going to have to do things a bit differently to avoid clobbering
the force_bit count as set up by intel_gmbus_force_bit(). So let's
dedicate the high bit as a flag for the low level timeout based fallback
and treat the rest of the bits as a ref count just as before.

Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/1457366220-29409-4-git-send-email-ville.syrjala@linux.intel.com
Reviewed-by: Jani Nikula <jani.nikula@intel.com>
2016-04-12 13:20:58 +03:00
Ville Syrjälä
ade754ec11 drm/i915: Protect force_bit with gmbus_mutex
Extend the protection of gmbus_mutex around the force_bit
RMW in intel_gmbus_force_bit(), in case someone gets the
idea of calling it from a separate thread while there's
other stuff happening on the same bus.

Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/1457366220-29409-3-git-send-email-ville.syrjala@linux.intel.com
Reviewed-by: Jani Nikula <jani.nikula@intel.com>
2016-04-12 13:20:58 +03:00
Chris Wilson
f470b19095 drm/i915/userptr: Store i915 backpointer for i915_mm_struct
Since we only ever use the drm_i915_private from the stored
i915_mm_struct->dev, save some electrons by storing the right
backpointer.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Cc: Michał Winiarski <michal.winiarski@intel.com>
Reviewed-by: Michał Winiarski <michal.winiarski@intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/1459864801-28606-3-git-send-email-chris@chris-wilson.co.uk
2016-04-11 20:39:16 +01:00
Chris Wilson
40313f0cd0 drm/i915/userptr: Hold mmref whilst calling get-user-pages
Holding a reference to the containing task_struct is not sufficient to
prevent the mm_struct from being reaped under memory pressure. If this
happens whilst we are calling get_user_pages(), explosions erupt -
sometimes an immediate GPF, sometimes page flag corruption. To prevent
the target mm from being reaped as we are reading from it, acquire a
reference before we begin.

Testcase: igt/gem_shrink/*userptr
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Cc: Michał Winiarski <michal.winiarski@intel.com>
Cc: stable@vger.kernel.org
Reviewed-by: Michał Winiarski <michal.winiarski@intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/1459864801-28606-2-git-send-email-chris@chris-wilson.co.uk
2016-04-11 20:39:01 +01:00
Chris Wilson
393afc2c3f drm/i915/userptr: Flush cancellations before mmu-notifier invalidate returns
In order to ensure that all invalidations are completed before the
operation returns to userspace (i.e. before the munmap() syscall returns)
we need to wait upon the outstanding operations.

We are allowed to block inside the invalidate_range_start callback, and
as struct_mutex is the inner lock with mmap_sem we can wait upon the
struct_mutex without provoking lockdep into warning about a deadlock.
However, we don't actually want to wait upon outstanding rendering
whilst holding the struct_mutex if we can help it otherwise we also
block other processes from submitting work to the GPU. So first we do a
wait without the lock and then when we reacquire the lock, we double
check that everything is ready for removing the invalidated pages.

Finally to wait upon the outstanding unpinning tasks, we create a
private workqueue as a means to conveniently wait upon all at once. The
drawback is that this workqueue is per-mm, so any threads concurrently
invalidating objects will wait upon each other. The advantage of using
the workqueue is that we can wait in parallel for completion of
rendering and unpinning of several objects (of particular importance if
the process terminates with a whole mm full of objects).

v2: Apply a cup of tea to the changelog.

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=94699
Testcase: igt/gem_userptr_blits/sync-unmap-cycles
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Cc: Michał Winiarski <michal.winiarski@intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/1459864801-28606-1-git-send-email-chris@chris-wilson.co.uk
Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
2016-04-11 20:38:33 +01:00
Daniel Vetter
ba3150ac38 drm/i915: Update DRIVER_DATE to 20160411
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
2016-04-11 20:20:18 +02:00
Daniel Vetter
3970285319 Linux 4.6-rc3
-----BEGIN PGP SIGNATURE-----
 Version: GnuPG v1
 
 iQEcBAABAgAGBQJXCva8AAoJEHm+PkMAQRiGXBoIAIkrjxdbuT2nS9A3tHwkiFXa
 6/Th1UjbNaoLuZ+MckQHayAD9NcWY9lVjOUmFsSiSWMCQK/rTWDl8x5ITputrY2V
 VuhrJCwI7huEtu6GpRaJaUgwtdOjhIHz1Ue2MCdNIbKX3l+LjVyyJ9Vo8rruvZcR
 fC7kiivH04fYX58oQ+SHymCg54ny3qJEPT8i4+g26686m11hvZLI3UAs2PAn6ut+
 atCjxdQ4yLN3DWsbjuA7wYGWhTgFloxL4TIoisuOUc3FXnSi/ivIbXZvu4lUfisz
 LA2JBhfII3AEMBWG9xfGbXPijJTT4q7yNlTD0oYcnMtAt/Roh2F04asqB1LetEY=
 =bri6
 -----END PGP SIGNATURE-----

Merge tag 'v4.6-rc3' into drm-intel-next-queued

Linux 4.6-rc3

Backmerge requested by Chris Wilson to make his patches apply cleanly.
Tiny conflict in vmalloc.c with the (properly acked and all) patch in
drm-intel-next:

commit 4da56b99d9
Author: Chris Wilson <chris@chris-wilson.co.uk>
Date:   Mon Apr 4 14:46:42 2016 +0100

    mm/vmap: Add a notifier for when we run out of vmap address space

and Linus' tree.

Signed-off-by: Daniel Vetter <daniel.vetter@intel.com>
2016-04-11 19:25:13 +02:00
Chris Wilson
fb8621d3be drm/i915: Avoid allocating a vmap arena for a single page
If we want a contiguous mapping of a single page sized object, we can
forgo using vmap() and just use a regular kmap(). Note that this is only
suitable if the desired pgprot_t is compatible.

v2: Use is_vmalloc_addr()

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Tvrtko Ursulin <tvrtko.ursulin@linux.intel.com>
Cc: Dave Gordon <david.s.gordon@intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/1460113874-17366-7-git-send-email-chris@chris-wilson.co.uk
Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@linux.intel.com>
2016-04-11 17:13:36 +01:00
Chris Wilson
f2a85e1975 drm,i915: Introduce drm_malloc_gfp()
I have instances where I want to use drm_malloc_ab() but with a custom
gfp mask. And with those, where I want a temporary allocation, I want to
try a high-order kmalloc() before using a vmalloc().

So refactor my usage into drm_malloc_gfp().

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: dri-devel@lists.freedesktop.org
Cc: Ville Syrjälä <ville.syrjala@linux.intel.com>
Reviewed-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Acked-by: Dave Airlie <airlied@redhat.com>
Link: http://patchwork.freedesktop.org/patch/msgid/1460113874-17366-6-git-send-email-chris@chris-wilson.co.uk
2016-04-11 17:13:10 +01:00
Chris Wilson
eae2c43b12 drm/i915/shrinker: Restrict vmap purge to objects with vmaps
When called because we have run out of vmap address space, we only need
to recover objects that have vmappings and not all.

v2: Start using is_vmalloc_addr()

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Tvrtko Ursulin <tvrtko.ursulin@linux.intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/1460113874-17366-5-git-send-email-chris@chris-wilson.co.uk
Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@linux.intel.com>
2016-04-11 17:12:16 +01:00
Chris Wilson
0a798eb92e drm/i915: Refactor duplicate object vmap functions
We now have two implementations for vmapping a whole object, one for
dma-buf and one for the ringbuffer. If we couple the mapping into the
obj->pages lifetime, then we can reuse an obj->mapping for both and at
the same time couple it into the shrinker. There is a third vmapping
routine in the cmdparser that maps only a range within the object, for
the time being that is left alone, but will eventually use these routines
in order to cache the mapping between invocations.

v2: Mark the failable kmalloc() as __GFP_NOWARN (vsyrjala)
v3: Call unpin_vmap from the right dmabuf unmapper

v4: Rename vmap to map as we don't wish to imply the type of mapping
involved, just that it contiguously maps the object into kernel space.
Add kerneldoc and lockdep annotations

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Tvrtko Ursulin <tvrtko.ursulin@linux.intel.com>
Cc: Dave Gordon <david.s.gordon@intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/1460113874-17366-4-git-send-email-chris@chris-wilson.co.uk
Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@linux.intel.com>
2016-04-11 17:11:44 +01:00
Chris Wilson
d2cad5358b drm/i915: Consolidate common error handling in intel_pin_and_map_ringbuffer_obj
After we pin the ringbuffer into the GGTT, all error paths need to unpin
it again. Move this common step into one block, and make the unable to
iomap error code consistent (i.e. treat it as out of memory to avoid
confusing it with a invalid argument).

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/1460113874-17366-3-git-send-email-chris@chris-wilson.co.uk
2016-04-11 17:11:08 +01:00
Chris Wilson
6d19245f18 drm/i915/dmabuf: Tighten struct_mutex for unmap_dma_buf
We only need the struct_mutex to manipulate the pages_pin_count on the
object, we do not need to hold our BKL when freeing the exported
scatterlist.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/1460113874-17366-2-git-send-email-chris@chris-wilson.co.uk
2016-04-11 17:10:56 +01:00
Tim Gore
b1e429fe3b drm/i915: implement WaClearTdlStateAckDirtyBits
This is to fix a GPU hang seen with mid thread pre-emption
and pooled EUs.

v2. Use IS_BXT_REVID instead of IS_BROXTON and INTEL_REVID

v3. And use correct type for register addresses

Signed-off-by: Tim Gore <tim.gore@intel.com>
Reviewed-by: Arun Siluvery <arun.siluvery@linux.intel.com>
Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/1458571049-854-1-git-send-email-tim.gore@intel.com
2016-04-11 11:59:43 +01:00
Dongwon Kim
25a5670533 drm/i915/bxt: Reversed polarity of PORT_PLL_REF_SEL bit
For BXT, description of polarities of PORT_PLL_REF_SEL
has been reversed for newer Gen9LP steppings according to the
recent update in Bspec. This bit now should be set for
"Non-SSC" mode for all Gen9LP starting from B0 stepping.

v2: Only B0 and newer stepping should be affected by this
change.

Signed-off-by: Dongwon Kim <dongwon.kim@intel.com>
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=94866
Reviewed-by: Imre Deak <imre.deak@intel.com>
Signed-off-by: Imre Deak <imre.deak@intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/1458176773-26925-1-git-send-email-dongwon.kim@intel.com
2016-04-11 13:02:23 +03:00
Maarten Lankhorst
c0ead7039a drm/i915: Rename hw state checker to hw state verifier.
Check functions are used by atomic to see if the new state will
be allowed. There's also a hw state checker which checks afterwards
that the committed state is correct. Rename it to hw state verifier
to reduce some confusion.

Suggested-by: Matt Roper <matthew.d.roper@intel.com>
Signed-off-by: Maarten Lankhorst <maarten.lankhorst@linux.intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/56FB8785.8020506@linux.intel.com
Reviewed-by: Matt Roper <matthew.d.roper@intel.com>
2016-04-11 10:34:17 +02:00
Maarten Lankhorst
f6d1973db2 drm/i915: Move modeset state verifier calls.
The modeset state verifier no longer has full access to the hardware,
instead it should only verify affected crtc's.

Looking for disabled stuff can be verified immediately after all crtc
disables have completed, while each enabled crtc can be verified right
after being enabled.

Signed-off-by: Maarten Lankhorst <maarten.lankhorst@linux.intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/1458741487-23801-3-git-send-email-maarten.lankhorst@linux.intel.com
Reviewed-by: Matt Roper <matthew.d.roper@intel.com>
[mlankhorst: check -> verify]
2016-04-11 10:33:30 +02:00
Maarten Lankhorst
e7c8454475 drm/i915: Make modeset state verifier take crtc as argument.
This will make it easier to keep the crtc checker when atomic
commit is reworked for asynchronous commits. This prevents checking
crtc's that were not part of the state. It's safe to verify disabled
encoders, connectors and dpll's that are not part of the state,
because during modeset connection_mutex is held.

Signed-off-by: Maarten Lankhorst <maarten.lankhorst@linux.intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/1458741487-23801-2-git-send-email-maarten.lankhorst@linux.intel.com
Reviewed-by: Matt Roper <matthew.d.roper@intel.com>
[mlankhorst: Extend commit message and rename check to verify.]
2016-04-11 10:32:56 +02:00
Chris Wilson
5dd8e50c27 drm/i915: Replace manual barrier() with READ_ONCE() in HWS accessor
When reading from the HWS page, we use barrier() to prevent the compiler
optimising away the read from the volatile (may be updated by the GPU)
memory address. This is more suited to READ_ONCE(); make it so.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Daniel Vetter <daniel.vetter@ffwll.ch>
Cc: Mika Kuoppala <mika.kuoppala@intel.com>
Reviewed-by: Mika Kuoppala <mika.kuoppala@intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/1460195877-20520-5-git-send-email-chris@chris-wilson.co.uk
2016-04-09 12:09:59 +01:00
Chris Wilson
0d317ce99e drm/i915: Use simplest form for flushing the single cacheline in the HWS
Rather than call a function to compute the matching cachelines and
clflush them, just call the clflush *instruction* directly. We also know
that we can use the unpatched plain clflush rather than the clflushopt
alternative.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Mika Kuoppala <mika.kuoppala@intel.com>
Cc: Imre Deak <imre.deak@intel.com>
Reviewed-by: Mika Kuoppala <mika.kuoppala@intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/1460195877-20520-4-git-send-email-chris@chris-wilson.co.uk
2016-04-09 12:09:45 +01:00
Chris Wilson
12471ba87a drm/i915: Harden detection of missed interrupts
Only declare a missed interrupt if we find that the GPU is idle with
waiters and a hangcheck interval has passed in which no new user
interrupts have been raised.

v2: Clear the stuck interrupt marker between successful batches

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Mika Kuoppala <mika.kuoppala@intel.com>
Reviewed-by: Mika Kuoppala <mika.kuoppala@intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/1460195877-20520-3-git-send-email-chris@chris-wilson.co.uk
2016-04-09 12:09:29 +01:00
Chris Wilson
c04e0f3b4e drm/i915: Separate out the seqno-barrier from engine->get_seqno
In order to simplify future patches, extract the
lazy_coherency optimisation our of the engine->get_seqno() vfunc into
its own callback.

v2: Rename the barrier to engine->irq_seqno_barrier to try and better
reflect that the barrier is only required after the user interrupt before
reading the seqno (to ensure that the seqno update lands in time as we
do not have strict seqno-irq ordering on all platforms).

Reviewed-by: Dave Gordon <david.s.gordon@intel.com> [#v2]

v3: Comments for hangcheck paranoia. Mika wanted to keep the extra
barrier inside the hangcheck, just in case. I can argue that it doesn't
provide a barrier against anything, but the side-effects of applying the
barrier may prevent a false declaration of a hung GPU.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Mika Kuoppala <mika.kuoppala@intel.com>
Cc: Dave Gordon <david.s.gordon@intel.com>
Reviewed-by: Mika Kuoppala <mika.kuoppala@intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/1460195877-20520-2-git-send-email-chris@chris-wilson.co.uk
2016-04-09 12:09:05 +01:00
Chris Wilson
9b9ed30936 drm/i915: Remove forcewake dance from seqno/irq barrier on legacy gen6+
In order to ensure seqno/irq coherency, we currently read a ring register.
The mmio transaction following the interrupt delays the inspection of
the seqno long enough for the MI_STORE_DWORD_IMM to update the CPU
cache. However, it is only the memory timing that is important for the
purposes of the delay, we do not need nor desire the extra forcewake.

v3: Update commentary

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Daniel Vetter <daniel.vetter@ffwll.ch>
Cc: Mika Kuoppala <mika.kuoppala@intel.com>
Reviewed-by: Mika Kuoppala <mika.kuoppala@intel.com> [v2]
Link: http://patchwork.freedesktop.org/patch/msgid/1460195877-20520-1-git-send-email-chris@chris-wilson.co.uk
2016-04-09 12:08:53 +01:00
Akash Goel
782f6bc0ab drm/i915: Fixup the free space logic in ring_prepare
Currently for the case where there is enough space at the end of Ring
buffer for accommodating only the base request, the wrapround is done
immediately and as a result the base request gets added at the start
of Ring buffer. But there may not be enough free space at the beginning
to accommodate the base request, as before the wraparound, the wait was
effectively done for the reserved_size free space from the start of
Ring buffer. In such a case there is a potential of Ring buffer overflow,
the instructions at the head of Ring (ACTHD) can get overwritten.

Since the base request can fit in the remaining space, there is no need
to wraparound immediately. The wraparound will anyway happen later when
the reserved part starts getting used.

Cc: Chris Wilson <chris@chris-wilson.co.uk>
Signed-off-by: Akash Goel <akash.goel@intel.com>
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Link: http://patchwork.freedesktop.org/patch/msgid/1457688402-10411-1-git-send-email-akash.goel@intel.com
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: stable@vger.kernel.org
2016-04-09 11:37:48 +01:00
Chris Wilson
cffa781e59 drm/i915: Simplify check for idleness in hangcheck
Having fixed the tracking of the engine's last_submitted_seqno, we can
now rely on it for detecting when the engine is idle (and not have to
touch the requests pointer).

Testcase: igt/gem_exec_whisper
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Mika Kuoppala <mika.kuoppala@linux.intel.com>
Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/1460010558-10705-9-git-send-email-chris@chris-wilson.co.uk
Reviewed-by: Mika Kuoppala <mika.kuoppala@linux.intel.com>
2016-04-08 11:45:05 +01:00
Chris Wilson
7c90b7de73 drm/i915: Apply a mb between emitting the request and hangcheck
Seal the request and mark it as pending execution before we submit it to
hardware. We assume that the actual submission cannot fail (that
guarantee is provided by preallocating space in the request for the
submission). As we may inspect this state without holding any locks
during hangcheck we should apply a barrier to ensure that we do
not see a more recent value in the HWS than we are tracking.

Based on a patch by Mika Kuoppala.

Suggested-by: Mika Kuoppala <mika.kuoppala@linux.intel.com>
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Mika Kuoppala <mika.kuoppala@linux.intel.com>
Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/1460010558-10705-8-git-send-email-chris@chris-wilson.co.uk
Reviewed-by: Mika Kuoppala <mika.kuoppala@linux.intel.com>
2016-04-08 11:44:48 +01:00
Chris Wilson
01347126f4 drm/i915: Reset engine->last_submitted_seqno
When we change the current seqno, we also need to remember to reset the
last_submitted_seqno for the engine.

Testcase: igt/gem_exec_whisper
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Mika Kuoppala <mika.kuoppala@linux.intel.com>
Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Reviewed-by: Mika Kuoppala <mika.kuoppala@linux.intel.com>
Reviewed-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/1460010558-10705-7-git-send-email-chris@chris-wilson.co.uk
2016-04-08 11:44:22 +01:00
Chris Wilson
a058d93482 drm/i915: Reset semaphore page for gen8
An oversight is that when we wrap the seqno, we need to reset the hw
semaphore counters to 0. We did this for gen6 and gen7 and forgot to do
so for the new implementation required for gen8 (legacy).

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Mika Kuoppala <mika.kuoppala@linux.intel.com>
Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Reviewed-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/1460010558-10705-6-git-send-email-chris@chris-wilson.co.uk
2016-04-08 11:43:59 +01:00
Chris Wilson
8c12672ee2 drm/i915: Refactor gen8 semaphore offset calculation
We reuse the same calculation into two macros, and I want to add a third
user. Time to refactor.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/1460010558-10705-5-git-send-email-chris@chris-wilson.co.uk
2016-04-08 11:43:48 +01:00
Chris Wilson
29dcb57026 drm/i915: Move the hw semaphore initialisation from GEM to the engine
Since we are setting engine local values that are tied to the hardware,
move it out of i915_gem_init_seqno() into the intel_ring_init_seqno()
backend, next to where the other hw semaphore registers are written.

v2: Make the explanatory comment about always resetting the semaphores to
0 irrespective of the value of the reset seqno.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Mika Kuoppala <mika.kuoppala@linux.intel.com>
Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Reviewed-by: Mika Kuoppala <mika.kuoppala@linux.intel.com>
Reviewed-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/1460010558-10705-4-git-send-email-chris@chris-wilson.co.uk
2016-04-08 11:43:38 +01:00
Chris Wilson
d04bce48a5 drm/i915: Remove unneeded drm_device pointer from intel_ring_init_seqno()
We only use drm_i915_private within the function, so delete the unneeded
drm_device local.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Mika Kuoppala <mika.kuoppala@linux.intel.com>
Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Reviewed-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/1460010558-10705-3-git-send-email-chris@chris-wilson.co.uk
2016-04-08 11:43:25 +01:00
Chris Wilson
2ed53a94d8 drm/i915: On GPU reset, set the HWS breadcrumb to the last seqno
After the GPU reset and we discard all of the incomplete requests, mark
the GPU as having advanced to the last_submitted_seqno (as having
completed the requests and ready for fresh work). The impact of this is
negligible, as all the requests will be considered completed by this
point, it just brings the HWS into line with expectations for external
viewers.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Mika Kuoppala <mika.kuoppala@linux.intel.com>
Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Reviewed-by: Mika Kuoppala <mika.kuoppala@linux.intel.com>
Reviewed-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/1460010558-10705-2-git-send-email-chris@chris-wilson.co.uk
2016-04-08 11:43:11 +01:00