Commit Graph

707074 Commits

Author SHA1 Message Date
Michal Wajdeczko
9f65208f25 drm/i915/guc: Update Guc messages on load failure
In case of GuC firmware loading failure we were reporting
DRM_ERROR also for case when GuC loading was not strictly
required.

Signed-off-by: Michal Wajdeczko <michal.wajdeczko@intel.com>
Cc: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Cc: Sagar Arun Kamble <sagar.a.kamble@intel.com>
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
Signed-off-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20171016144724.17244-14-michal.wajdeczko@intel.com
2017-10-16 18:53:30 +03:00
Michal Wajdeczko
4502e9ec82 drm/i915/uc: Unify firmware loading
Firmware loading for GuC and HuC are similar. Move common code
into generic function for maximum reuse.

v2: change message levels (Chris)

Signed-off-by: Michal Wajdeczko <michal.wajdeczko@intel.com>
Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Cc: Sagar Arun Kamble <sagar.a.kamble@intel.com>
Cc: Anusha Srivatsa <anusha.srivatsa@intel.com>
Cc: Sujaritha Sundaresan <sujaritha.sundaresan@intel.com>
Cc: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
Signed-off-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20171016144724.17244-13-michal.wajdeczko@intel.com
2017-10-16 18:53:29 +03:00
Michal Wajdeczko
f1e86cecf1 drm/i915: Update DMC firmware load error messages
Some of the error messages from DMC load were too generic and
may be confusing for the user. Lets explicitly add DMC words there.
Also as homepage of DMC firmware is same as for the GuC and Huc,
lets reuse URL definition.

Signed-off-by: Michal Wajdeczko <michal.wajdeczko@intel.com>
Cc: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Imre Deak <imre.deak@intel.com>
Cc: Mika Kuoppala <mika.kuoppala@intel.com>
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
Signed-off-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20171016144724.17244-12-michal.wajdeczko@intel.com
2017-10-16 18:53:28 +03:00
Michal Wajdeczko
1e913d27ce drm/i915/uc: Add message with firmware url
In case of firmware fetch failure we should help user
find latest firmware. URL macro duplication from csr.c
will be fixed in upcoming patch.

Suggested-by: Chris Wilson <chris@chris-wilson.co.uk>
Signed-off-by: Michal Wajdeczko <michal.wajdeczko@intel.com>
Cc: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
Signed-off-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20171016144724.17244-11-michal.wajdeczko@intel.com
2017-10-16 18:53:28 +03:00
Michal Wajdeczko
5f99afdbf1 drm/i915/uc: Improve debug messages in firmware fetch
Time to remove less important info and make messages clear
and consistent.

v2: change some message levels (Chris)
v3: restore DRM_WARN (Chris)

Signed-off-by: Michal Wajdeczko <michal.wajdeczko@intel.com>
Cc: Sagar Arun Kamble <sagar.a.kamble@intel.com>
Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Cc: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk> #2
Signed-off-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20171016144724.17244-10-michal.wajdeczko@intel.com
2017-10-16 18:53:27 +03:00
Michal Wajdeczko
86ffc31211 drm/i915/guc: Pick better place for Guc final status message
GuC status message printed right after firmware upload may be too
optimistic, as we may fail on subsequent steps. Move that message
to the end of intel_uc_init_hw where we know the status for sure.

v2: use dev_info (Chris)

Signed-off-by: Michal Wajdeczko <michal.wajdeczko@intel.com>
Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Cc: Sagar Arun Kamble <sagar.a.kamble@intel.com>
Cc: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
Signed-off-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20171016144724.17244-9-michal.wajdeczko@intel.com
2017-10-16 18:53:27 +03:00
Michal Wajdeczko
afb3484f92 drm/i915/uc: Check all firmwares against WOPCM size
Both GuC and HuC firmwares will be moved into WOPCM so
we can check ucode size early for both cases.

Suggested-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Signed-off-by: Michal Wajdeczko <michal.wajdeczko@intel.com>
Cc: Sagar Arun Kamble <sagar.a.kamble@intel.com>
Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Cc: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Signed-off-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20171016144724.17244-8-michal.wajdeczko@intel.com
2017-10-16 18:53:26 +03:00
Michal Wajdeczko
cd5a917e35 drm/i915/guc: Reorder functions in intel_guc_fw.c
Functions should be defined in their use order.

Signed-off-by: Michal Wajdeczko <michal.wajdeczko@intel.com>
Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
Signed-off-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20171016144724.17244-7-michal.wajdeczko@intel.com
2017-10-16 18:53:26 +03:00
Michal Wajdeczko
e8668bbcb0 drm/i915/guc: Rename intel_guc_loader.c to intel_guc_fw.c
Remaining functions in intel_guc_loader.c were focused around
GuC firmware. Rename them to match object-verb pattern and
rename file itself.

Suggested-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Signed-off-by: Michal Wajdeczko <michal.wajdeczko@intel.com>
Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Cc: Sagar Arun Kamble <sagar.a.kamble@intel.com>
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
Signed-off-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20171016144724.17244-6-michal.wajdeczko@intel.com
2017-10-16 18:53:25 +03:00
Michal Wajdeczko
d9e2e0143c drm/i915/guc: Move doc near related definitions
After GuC code reorg some documentation was left in wrong
place. Move it closer to corresponding definitions.

v2: use consistent name for the GuC (Sagar)

Signed-off-by: Michal Wajdeczko <michal.wajdeczko@intel.com>
Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Cc: Sagar Arun Kamble <sagar.a.kamble@intel.com>
Reviewed-by: Sagar Arun Kamble <sagar.a.kamble@intel.com>
Signed-off-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20171016144724.17244-5-michal.wajdeczko@intel.com
2017-10-16 18:53:24 +03:00
Michal Wajdeczko
fdc6d7319e drm/i915/guc: Small fixups post code move
Existing code needs some style fixes. To avoid polluting
pure move patch, do it now as separate step.

Suggested-by: Chris Wilson <chris@chris-wilson.co.uk>
Signed-off-by: Michal Wajdeczko <michal.wajdeczko@intel.com>
Cc: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
Signed-off-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20171016144724.17244-4-michal.wajdeczko@intel.com
2017-10-16 18:53:23 +03:00
Michal Wajdeczko
5d53be45a8 drm/i915/guc: Move GuC boot param initialization out of xfer
We want to keep ucode xfer functions separate from other
initialization. Once separated, add explicit forcewake.

v2: use BLITTER domain only and add comment (Daniele)

Suggested-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Signed-off-by: Michal Wajdeczko <michal.wajdeczko@intel.com>
Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Cc: Sagar Arun Kamble <sagar.a.kamble@intel.com>
Cc: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com>
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk> #1
Reviewed-by: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com>
Signed-off-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20171016144724.17244-3-michal.wajdeczko@intel.com
2017-10-16 18:53:22 +03:00
Michal Wajdeczko
46f1e8b3de drm/i915: Move intel_guc_wopcm_size to intel_guc.c
Function intel_guc_wopcm_size didn't fit well in the old place.
With this move upcoming cleanup will be simpler.

Suggested-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Signed-off-by: Michal Wajdeczko <michal.wajdeczko@intel.com>
Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Cc: Sagar Arun Kamble <sagar.a.kamble@intel.com>
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
Signed-off-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20171016144724.17244-2-michal.wajdeczko@intel.com
2017-10-16 18:53:22 +03:00
Weinan Li
1fd51d9d97 drm/i915: enable to read CSB and CSB write pointer from HWSP in GVT-g VM
Let GVT-g VM read the CSB and CSB write pointer from virtual HWSP, not all
the host support this feature, need to check the BIT(3) of caps in PVINFO.

v3 : Remove unnecessary comments.
v4 : Separate VM enable patch with GVT-g implementation patch due to code
dependency.
v5 : Use inline for GVT virtual HWSP caps check function.
v6 : Comments refine.

Signed-off-by: Weinan Li <weinan.z.li@intel.com>
Cc: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
Signed-off-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/1508039725-1066-1-git-send-email-weinan.z.li@intel.com
2017-10-16 13:56:29 +03:00
Chris Wilson
1210d38890 drm/i915: Use bdw_ddi_translations_fdi for Broadwell
The compiler warns:

	drivers/gpu/drm/i915/intel_ddi.c:118:35: warning: ‘bdw_ddi_translations_fdi’ defined but not used

Lo and behold, if we look at intel_ddi_get_buf_trans_fdi(), it uses
hsw_ddi_translations_fdi[] for both Haswell and *Broadwell*

Fixes: 7d1c42e679 ("drm/i915: Refactor code to select the DDI buf translation table")
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Ville Syrjälä <ville.syrjala@linux.intel.com>
Cc: David Weinehall <david.weinehall@linux.intel.com>
Cc: Jani Nikula <jani.nikula@linux.intel.com>
Cc: <stable@vger.kernel.org> # v4.12+
Link: https://patchwork.freedesktop.org/patch/msgid/20171013154735.27163-1-chris@chris-wilson.co.uk
Reviewed-by: Jani Nikula <jani.nikula@linux.intel.com>
Reviewed-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
2017-10-13 21:02:52 +01:00
Chris Wilson
5896a5c8c9 drm/i915: Always stop the rings before a missing GPU reset
Always try to stop the rings, even if the GPU reset itself has been
disabled (via modparam i915.reset). This should at least stop the hw
from spinning in the background consuming resources (e.g. power and
memory bandwidth) letting the system rest-in-peace.

References: https://bugs.freedesktop.org/show_bug.cgi?id=103260
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Mika Kuoppala <mika.kuoppala@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20171013131218.18013-2-chris@chris-wilson.co.uk
Reviewed-by: Mika Kuoppala <mika.kuoppala@intel.com>
2017-10-13 20:57:29 +01:00
Chris Wilson
7836cd02f2 drm/i915: Keep the rings stopped until they have been re-initialized
Before modifying the ring register (RING_START, HEAD, TAIL, CTL) we
first make sure it is stopped (or else the hw may not resample the
registers). However, we do not need to let the hw restart until after we
have reprogrammed all the rings. This should help prevent situations
where pending operations on the ring may resume (because we are trying
to re-initialize following an unsuccessful GPU hang, i.e. from
i915_gem_unset_wedged).

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=103260
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Mika Kuoppala <mika.kuoppala@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20171013131218.18013-1-chris@chris-wilson.co.uk
Reviewed-by: Mika Kuoppala <mika.kuoppala@intel.com>
2017-10-13 20:57:29 +01:00
Chris Wilson
5d031f4e16 drm/i915: Stop asserting on set-wedged vs nop_submit_request ordering
Since the removal of the stop_machine(), it is allowed and expected for
the nop_submit_request() and nop_complete_submit_request() to run in
parallel to the i915_gem_set_wedged() processing. As such we can no
longer assert that i915_gem_set_wedged() has completed inside the
stop_machine prior to the individual nop_submit_request execution.

Fixes: af7a8ffad9 ("drm/i915: Use rcu instead of stop_machine in set_wedged")
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Daniel Vetter <daniel.vetter@intel.com>
Cc: Mika Kuoppala <mika.kuoppala@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20171012204019.3557-1-chris@chris-wilson.co.uk
Reviewed-by: Daniel Vetter <daniel.vetter@intel.com>
Reviewed-by: Mika Kuoppala <mika.kuoppala@intel.com>
2017-10-13 20:57:29 +01:00
Ville Syrjälä
15d05f0e77 drm/i915: Split intel_enable_ddi() into DP and HDMI variants
Untangle intel_enable_ddi() by splitting it into DP and HDMI specific
variants.

v2: Keep using intel_ddi_get_encoder_port() for now

Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20171010121207.570-10-ville.syrjala@linux.intel.com
Reviewed-by: Jani Nikula <jani.nikula@intel.com>
2017-10-13 21:00:14 +03:00
Ville Syrjälä
45e0327e28 drm/i915: Plumb crtc_state etc. directly to intel_ddi_pre_enable_{dp,hdmi}()
Rather that plumb the link parameters separately to
intel_ddi_pre_enable_dp() let's just pass the entire crtc state.

intel_ddi_pre_enable_hdmi() already took the crtc state, but for some
reason intel_ddi_pre_enable() still wanted to extract has_infoframe
from therein and pass it in separately. Let's not do that since it's
pointless.

v2: Rebase due to more code getting pulled into the DDI hooks

Cc: Jani Nikula <jani.nikula@intel.com>
Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20171010121207.570-9-ville.syrjala@linux.intel.com
Reviewed-by: Jani Nikula <jani.nikula@intel.com>
2017-10-13 20:56:03 +03:00
Ville Syrjälä
33f083f002 drm/i915: Split intel_disable_ddi() into DP vs. HDMI variants
Untangle intel_disable_ddi() by splitting it into DP and HDMI specific
variants.

Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20171010121207.570-8-ville.syrjala@linux.intel.com
Reviewed-by: Jani Nikula <jani.nikula@intel.com>
2017-10-13 20:55:00 +03:00
Ville Syrjälä
680b71c201 drm/i915: Remove useless eDP check from intel_ddi_pre_enable_dp()
intel_edp_panel_on() will itself do the is_edp() check, so the caller
doesn't have to bother. Pre-DDI code doesn't bother, so let's follow the
same approach for DDI.

Reviewed-by: Jani Nikula <jani.nikula@intel.com>
Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20171010121207.570-7-ville.syrjala@linux.intel.com
2017-10-13 20:54:29 +03:00
Ville Syrjälä
f45f3da7c4 drm/i915: Split intel_ddi_post_disable() into DP vs. HDMI variants
To clean up the mess in intel_ddi_post_disable() split it into two
clean variants for HDMI and DP.

v2: Rebase due to MST DPMS changes

Reviewed-by: Jani Nikula <jani.nikula@intel.com> #v1
Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20171010121207.570-6-ville.syrjala@linux.intel.com
2017-10-13 20:53:43 +03:00
Ville Syrjälä
fb0bd3bd10 drm/i915: Inline the required bits of intel_ddi_post_disable() into intel_ddi_fdi_post_disable()
To untangle the mess that is intel_ddi_post_disable() move the the bits
needed by FDI into intel_ddi_fdi_post_disable(). This way we can stop
worrying about FDI in intel_ddi_post_disable().

Reviewed-by: Jani Nikula <jani.nikula@intel.com>
Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20171010121207.570-5-ville.syrjala@linux.intel.com
2017-10-13 20:53:14 +03:00
Ville Syrjälä
e725f6456f drm/i915: Extract intel_disable_ddi_buf()
Extract the code to disable the DDI_BUF_CTL into small helper. This
will allows us to detangle the encoder type mess in
intel_ddi_post_disable().

v2: Keep using intel_ddi_get_encoder_port() for now

Reviewed-by: Jani Nikula <jani.nikula@intel.com>
Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20171010121207.570-4-ville.syrjala@linux.intel.com
2017-10-13 20:50:56 +03:00
Ville Syrjälä
6b8506d575 drm/i915: Extract intel_ddi_clk_disable()
Pull the code to disable the port clock into a function. We already have
the intel_ddi_clk_select() counterpart.

v2: Keep using intel_ddi_get_encoder_port() for now (Chris)

Cc: Chris Wilson <chris@chris-wilson.co.uk>
Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20171010121207.570-3-ville.syrjala@linux.intel.com
Reviewed-by: Jani Nikula <jani.nikula@intel.com>
2017-10-13 20:50:36 +03:00
Ville Syrjälä
40b2be419f drm/i915: Dump 'output_types' in crtc state dump
To make it easier to debug things let's dump the output types bitmask in
the crtc state dump. And to make life that much better, let's pretty
print it as a a human reaadable string as well.

v2: Have the caller pass in the buffer (Chris)
    #undef OUTPUT_TYPE (Jani)

Cc: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Jani Nikula <jani.nikula@linux.intel.com>
Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20171010121207.570-2-ville.syrjala@linux.intel.com
Reviewed-by: Jani Nikula <jani.nikula@intel.com>
2017-10-13 20:49:08 +03:00
Harsha Sharma
c3ed110386 drm/i915: Replace *_reference/unreference() or *_ref/unref with _get/put()
Replace instances of drm_framebuffer_reference/unreference() with
*_get/put() suffixes and drm_dev_unref with *_put() suffix
because get/put is shorter and consistent with the
kernel use of *_get/put suffixes.
Done with following coccinelle semantic patch

@@
expression ex;
@@

(
-drm_framebuffer_unreference(ex);
+drm_framebuffer_put(ex);
|
-drm_dev_unref(ex);
+drm_dev_put(ex);
|
-drm_framebuffer_reference(ex);
+drm_framebuffer_get(ex);
)

Signed-off-by: Harsha Sharma <harshasharmaiitr@gmail.com>
[danvet: Drop the drm_dev_put change for now, to make the patch apply
with out a backmerge.]
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Link: https://patchwork.freedesktop.org/patch/msgid/20171009120643.11953-1-harshasharmaiitr@gmail.com
2017-10-13 16:53:59 +02:00
Mika Kahola
4d58443ddd drm/i915: Get rid of hardcoded pipes
Favor for_each_pipe() macro when looping through pipes.

Signed-off-by: Mika Kahola <mika.kahola@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/1507890286-16214-1-git-send-email-mika.kahola@intel.com
Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
2017-10-13 17:04:58 +03:00
Shashank Sharma
a2fc4bd61e drm/i915: Add retries for LSPCON detection
We read the dp dual mode Adapter identifier to detect the
LSPCON device. It's been observed from the CI testing that in
few cases, this read can get delayed or fail. For such scenarios,
LSPCON vendors suggest to retry the read operation.

This patch adds retry in the probe function, while reading
LSPCON identifier.

V3: added this patch in the series

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=102294
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=102295
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=102359
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=103186
Cc: Ville Syrjala <ville.syrjala@linux.intel.com>
Cc: Imre Deak <imre.deak@intel.com>
Signed-off-by: Shashank Sharma <shashank.sharma@intel.com>
Reviewed-by: Imre Deak <imre.deak@intel.com>
Signed-off-by: Jani Nikula <jani.nikula@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/1507630064-17908-4-git-send-email-shashank.sharma@intel.com
2017-10-13 12:15:10 +03:00
Shashank Sharma
d18aef0f75 drm/i915: Don't give up waiting on INVALID_MODE
Our current logic to read LSPCON's current mode, stops retries and
breaks wait-loop, if it gets LSPCON_MODE_INVALID as return from the
core function. This doesn't allow us to try reading the mode again.

This patch removes this condition and allows retries reading
the currnt mode until timeout.

This also fixes/prevents some of the noise in form of debug messages
while running IGT CI test cases.

V2: rebase, added r-b
V2: changed some debug message levels from debug->error and
    error->debug in lspcon_get_current_mode function.
V3: Rebase

Cc: Imre Deak <imre.deak@intel.com>
Cc: Daniel Vetter <daniel.vetter@intel.com>

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=102294
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=102295
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=102359
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=103186
Reviewed-by: Imre Deak <imre.deak@intel.com>
Signed-off-by: Shashank Sharma <shashank.sharma@intel.com>
Signed-off-by: Mahesh Kumar <Mahesh1.kumar@intel.com>
Signed-off-by: Jani Nikula <jani.nikula@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/1507630064-17908-3-git-send-email-shashank.sharma@intel.com
2017-10-13 12:14:56 +03:00
Shashank Sharma
f687e25a7a drm: Add retries for lspcon mode detection
From the CI builds, its been observed that during a driver
reload/insert, dp dual mode read function sometimes fails to
read from LSPCON device over i2c-over-aux channel.

This patch:
- adds some delay and few retries, allowing a scope for these
  devices to settle down and respond.
- changes one error log's level from ERROR->DEBUG as we want
  to call it an error only after all the retries are exhausted.

V2: Addressed review comments from Jani (for loop for retry)
V3: Addressed review comments from Imre (break on partial read too)
V3: Addressed review comments from Ville/Imre (Add the retries
    exclusively for LSPCON, not for all dp_dual_mode devices)
V4: Added r-b from Imre, sending it to dri-devel (Jani)

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=102294
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=102295
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=102359
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=103186
Cc: Ville Syrjala <ville.syrjala@linux.intel.com>
Cc: Imre Deak <imre.deak@intel.com>
Cc: Jani Nikula <jani.nikula@linux.intel.com>
Reviewed-by: Imre Deak <imre.deak@intel.com>
Acked-by: Dave Airlie <airlied@gmail.com>
Signed-off-by: Shashank Sharma <shashank.sharma@intel.com>
Signed-off-by: Jani Nikula <jani.nikula@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/1507826408-19322-1-git-send-email-shashank.sharma@intel.com
2017-10-13 12:13:54 +03:00
James Ausmus
8f5f63d558 drm/i915/bdw: Fix DP_AUX_CH_CTL_TIME_OUT setting
Per BSpec, 400us is "BDW+ Do not use this setting." - not just PORT_A.
Set BDW to 600us unconditionally.

v2:
-Split in to two patches (Rodrigo)

Cc: Jani Nikula <jani.nikula@linux.intel.com>
Cc: Ville Syrjälä <ville.syrjala@linux.intel.com>
Signed-off-by: James Ausmus <james.ausmus@intel.com>
Reviewed-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
Signed-off-by: Jani Nikula <jani.nikula@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20171012213037.4245-2-james.ausmus@intel.com
2017-10-13 10:51:18 +03:00
James Ausmus
6fa228ba96 drm/i915: Fix DP_AUX_CH_CTL_TIME_OUT naming
Rename DP_AUX_CH_CTL_TIME_OUT_1600us to DP_AUX_CH_CTL_TIME_OUT_MAX, as
the meaning of the (3 << 26) value varies per platform, but it's always the
maximum timeout for that platform. Pre-CNL it means 1600us, and for CNL
it means 3200us.

v2:
-Split in to two patches (Rodrigo)

Cc: Rodrigo Vivi <rodrigo.vivi@intel.com>
Signed-off-by: James Ausmus <james.ausmus@intel.com>
Reviewed-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
Signed-off-by: Jani Nikula <jani.nikula@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20171012213037.4245-1-james.ausmus@intel.com
2017-10-13 10:50:58 +03:00
Chris Wilson
9c1477e83e drm/i915/selftests: Exercise adding requests to a full GGTT
A bug recently encountered involved the issue where are we were
submitting requests to different ppGTT, each would pin a segment of the
GGTT for its logical context and ring. However, this is invisible to
eviction as we do not tie the context/ring VMA to a request and so do
not automatically wait upon it them (instead they are marked as pinned,
preventing eviction entirely). Instead the eviction code must flush those
contexts by switching to the kernel context. This selftest tries to
fill the GGTT with contexts to exercise a path where the
switch-to-kernel-context failed to make forward progress and we fail
with ENOSPC.

v2: Make the hole in the filled GGTT explicit.
v3: Swap out the arbitrary timeout for a private notification from
i915_gem_evict_something()

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20171012125726.14736-3-chris@chris-wilson.co.uk
Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
2017-10-12 21:06:26 +01:00
Chris Wilson
214707fc2c drm/i915/selftests: Wrap a timer into a i915_sw_fence
For some selftests, we want to issue requests but delay them going to
hardware. Furthermore, we don't want those requests to block
indefinitely (or else we may hang the driver and block testing) so we
want to employ a timeout. So naturally we want a fence that is
automatically signaled by a timer.

v2: Add kselftests.
v3: Limit the API available to selftests; there isn't an overwhelming
reason to export it universally.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20171012125726.14736-2-chris@chris-wilson.co.uk
Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
2017-10-12 21:06:26 +01:00
Chris Wilson
55b4f1ce2f drm/i915: Fix eviction when the GGTT is idle but full
In the full-ppgtt world, we can fill the GGTT full of context objects.
These context objects are currently implicitly tracked by the requests
that pin them i.e. they are only unpinned when the request is completed
and retired, but we do not have the link from the vma to the request
(anymore). In order to unpin those contexts, we have to issue another
request and wait upon the switch to the kernel context.

The bug during eviction was that we assumed that a full GGTT meant we
would have requests on the GGTT timeline, and so we missed situations
where those requests where merely in flight (and when even they have not
yet been submitted to hw yet). The fix employed here is to change the
already-is-idle test to no look at the execution timeline, but count the
outstanding requests and then check that we have switched to the kernel
context. Erring on the side of overkill here just means that we stall a
little longer than may be strictly required, but we only expect to hit
this path in extreme corner cases where returning an erroneous error is
worse than the delay.

v2: Logical inversion when swapping over branches.

Fixes: 80b204bce8 ("drm/i915: Enable multiple timelines")
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20171012125726.14736-1-chris@chris-wilson.co.uk
2017-10-12 21:06:26 +01:00
Ville Syrjälä
4d90f2d507 drm/i915: Start tracking PSR state in crtc state
Add the minimal amount of PSR tracking into the crtc state. This allows
precomputing the possibility of using PSR correctly, and it means we can
safely call the psr enable/disable functions for any DP endcoder.

As a nice bonus we get rid of some more crtc->config usage, which we
want to kill off eventually.

v2: Fix 'goto unlock' fail in intel_psr_enable() (Jani)
    Check intel_dp_is_edp() in is_edp_psr() (Jani)

Cc: Rodrigo Vivi <rodrigo.vivi@intel.com>
Cc: Jani Nikula <jani.nikula@intel.com>
Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20171012130201.21318-1-ville.syrjala@linux.intel.com
Reviewed-by: Jani Nikula <jani.nikula@intel.com>
Acked-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
2017-10-12 21:18:00 +03:00
Jani Nikula
fa9caf0b6e drm/i915: Update DRIVER_DATE to 20171012
Signed-off-by: Jani Nikula <jani.nikula@intel.com>
2017-10-12 21:05:11 +03:00
Joonas Lahtinen
612dde7ec3 drm/i915: Simplify intel_sanitize_enable_ppgtt
Remove dead code around has_aliasing_ppgtt condition.

Suggested-by: Colin Ian King <colin.king@canonical.com>
Signed-off-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Cc: Colin Ian King <colin.king@canonical.com>
Cc: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
Link: https://patchwork.freedesktop.org/patch/msgid/20171010143355.16577-1-joonas.lahtinen@linux.intel.com
2017-10-12 11:26:46 +03:00
Chris Wilson
7c78142337 drm/i915/userptr: Drop struct_mutex before cleanup
Purely to silence lockdep, as we know that no bo can exist at this time
and so the inversion is impossible. Nevertheless, lockdep currently
warns on unload:

[  137.522565] WARNING: possible circular locking dependency detected
[  137.522568] 4.14.0-rc4-CI-CI_DRM_3209+ #1 Tainted: G     U
[  137.522570] ------------------------------------------------------
[  137.522572] drv_module_relo/1532 is trying to acquire lock:
[  137.522574]  ("i915-userptr-acquire"){+.+.}, at: [<ffffffff8109a831>] flush_workqueue+0x91/0x540
[  137.522581]
               but task is already holding lock:
[  137.522583]  (&dev->struct_mutex){+.+.}, at: [<ffffffffa014fb3f>] i915_gem_fini+0x3f/0xc0 [i915]
[  137.522605]
               which lock already depends on the new lock.

[  137.522608]
               the existing dependency chain (in reverse order) is:
[  137.522611]
               -> #3 (&dev->struct_mutex){+.+.}:
[  137.522615]        __lock_acquire+0x1420/0x15e0
[  137.522618]        lock_acquire+0xb0/0x200
[  137.522621]        __mutex_lock+0x86/0x9b0
[  137.522623]        mutex_lock_interruptible_nested+0x1b/0x20
[  137.522640]        i915_mutex_lock_interruptible+0x51/0x130 [i915]
[  137.522657]        i915_gem_fault+0x20b/0x720 [i915]
[  137.522660]        __do_fault+0x1e/0x80
[  137.522662]        __handle_mm_fault+0xa08/0xed0
[  137.522664]        handle_mm_fault+0x156/0x300
[  137.522666]        __do_page_fault+0x2c5/0x570
[  137.522668]        do_page_fault+0x28/0x250
[  137.522671]        page_fault+0x22/0x30
[  137.522672]
               -> #2 (&mm->mmap_sem){++++}:
[  137.522677]        __lock_acquire+0x1420/0x15e0
[  137.522679]        lock_acquire+0xb0/0x200
[  137.522682]        down_read+0x3e/0x70
[  137.522699]        __i915_gem_userptr_get_pages_worker+0x141/0x240 [i915]
[  137.522701]        process_one_work+0x233/0x660
[  137.522704]        worker_thread+0x4e/0x3b0
[  137.522706]        kthread+0x152/0x190
[  137.522708]        ret_from_fork+0x27/0x40
[  137.522710]
               -> #1 ((&work->work)){+.+.}:
[  137.522714]        __lock_acquire+0x1420/0x15e0
[  137.522717]        lock_acquire+0xb0/0x200
[  137.522719]        process_one_work+0x206/0x660
[  137.522721]        worker_thread+0x4e/0x3b0
[  137.522723]        kthread+0x152/0x190
[  137.522725]        ret_from_fork+0x27/0x40
[  137.522727]
               -> #0 ("i915-userptr-acquire"){+.+.}:
[  137.522731]        check_prev_add+0x430/0x840
[  137.522733]        __lock_acquire+0x1420/0x15e0
[  137.522735]        lock_acquire+0xb0/0x200
[  137.522738]        flush_workqueue+0xb4/0x540
[  137.522740]        drain_workqueue+0xd4/0x1b0
[  137.522742]        destroy_workqueue+0x1c/0x200
[  137.522758]        i915_gem_cleanup_userptr+0x15/0x20 [i915]
[  137.522770]        i915_gem_fini+0x5f/0xc0 [i915]
[  137.522782]        i915_driver_unload+0x122/0x180 [i915]
[  137.522794]        i915_pci_remove+0x19/0x30 [i915]
[  137.522797]        pci_device_remove+0x39/0xb0
[  137.522800]        device_release_driver_internal+0x15d/0x220
[  137.522803]        driver_detach+0x40/0x80
[  137.522805]        bus_remove_driver+0x58/0xd0
[  137.522807]        driver_unregister+0x2c/0x40
[  137.522809]        pci_unregister_driver+0x36/0xb0
[  137.522828]        i915_exit+0x1a/0x8b [i915]
[  137.522831]        SyS_delete_module+0x18c/0x1e0
[  137.522834]        entry_SYSCALL_64_fastpath+0x1c/0xb1
[  137.522835]
               other info that might help us debug this:

[  137.522838] Chain exists of:
                 "i915-userptr-acquire" --> &mm->mmap_sem --> &dev->struct_mutex

[  137.522844]  Possible unsafe locking scenario:

[  137.522846]        CPU0                    CPU1
[  137.522848]        ----                    ----
[  137.522850]   lock(&dev->struct_mutex);
[  137.522852]                                lock(&mm->mmap_sem);
[  137.522854]                                lock(&dev->struct_mutex);
[  137.522857]   lock("i915-userptr-acquire");
[  137.522859]
                *** DEADLOCK ***

[  137.522862] 3 locks held by drv_module_relo/1532:
[  137.522864]  #0:  (&dev->mutex){....}, at: [<ffffffff8161d47b>] device_release_driver_internal+0x2b/0x220
[  137.522869]  #1:  (&dev->mutex){....}, at: [<ffffffff8161d489>] device_release_driver_internal+0x39/0x220
[  137.522873]  #2:  (&dev->struct_mutex){+.+.}, at: [<ffffffffa014fb3f>] i915_gem_fini+0x3f/0xc0 [i915]
[  137.522888]
               stack backtrace:
[  137.522891] CPU: 0 PID: 1532 Comm: drv_module_relo Tainted: G     U          4.14.0-rc4-CI-CI_DRM_3209+ #1
[  137.522894] Hardware name:                  /NUC7i5BNB, BIOS BNKBL357.86A.0048.2017.0704.1415 07/04/2017
[  137.522897] Call Trace:
[  137.522900]  dump_stack+0x68/0x9f
[  137.522902]  print_circular_bug+0x235/0x3c0
[  137.522905]  ? lockdep_init_map_crosslock+0x20/0x20
[  137.522908]  check_prev_add+0x430/0x840
[  137.522919]  ? i915_gem_fini+0x5f/0xc0 [i915]
[  137.522922]  ? __kernel_text_address+0x12/0x40
[  137.522925]  ? __save_stack_trace+0x66/0xd0
[  137.522928]  __lock_acquire+0x1420/0x15e0
[  137.522930]  ? __lock_acquire+0x1420/0x15e0
[  137.522933]  ? lockdep_init_map_crosslock+0x20/0x20
[  137.522936]  ? __this_cpu_preempt_check+0x13/0x20
[  137.522938]  lock_acquire+0xb0/0x200
[  137.522940]  ? flush_workqueue+0x91/0x540
[  137.522943]  flush_workqueue+0xb4/0x540
[  137.522945]  ? flush_workqueue+0x91/0x540
[  137.522948]  ? __mutex_unlock_slowpath+0x43/0x2c0
[  137.522951]  ? trace_hardirqs_on_caller+0xe3/0x1b0
[  137.522954]  drain_workqueue+0xd4/0x1b0
[  137.522956]  ? drain_workqueue+0xd4/0x1b0
[  137.522958]  destroy_workqueue+0x1c/0x200
[  137.522975]  i915_gem_cleanup_userptr+0x15/0x20 [i915]
[  137.522987]  i915_gem_fini+0x5f/0xc0 [i915]
[  137.523000]  i915_driver_unload+0x122/0x180 [i915]
[  137.523015]  i915_pci_remove+0x19/0x30 [i915]
[  137.523018]  pci_device_remove+0x39/0xb0
[  137.523021]  device_release_driver_internal+0x15d/0x220
[  137.523023]  driver_detach+0x40/0x80
[  137.523026]  bus_remove_driver+0x58/0xd0
[  137.523028]  driver_unregister+0x2c/0x40
[  137.523030]  pci_unregister_driver+0x36/0xb0
[  137.523049]  i915_exit+0x1a/0x8b [i915]
[  137.523052]  SyS_delete_module+0x18c/0x1e0
[  137.523055]  entry_SYSCALL_64_fastpath+0x1c/0xb1
[  137.523057] RIP: 0033:0x7f7bd0609287
[  137.523059] RSP: 002b:00007ffef694bc18 EFLAGS: 00000246 ORIG_RAX: 00000000000000b0
[  137.523062] RAX: ffffffffffffffda RBX: ffffffff81493f33 RCX: 00007f7bd0609287
[  137.523065] RDX: 0000000000000001 RSI: 0000000000000800 RDI: 0000564f999f9fc8
[  137.523067] RBP: ffffc90005c4ff88 R08: 0000000000000000 R09: 0000000000000080
[  137.523069] R10: 00007f7bd20ef8c0 R11: 0000000000000246 R12: 0000000000000000
[  137.523072] R13: 00007ffef694be00 R14: 0000000000000000 R15: 0000000000000000
[  137.523075]  ? __this_cpu_preempt_check+0x13/0x20

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Tvrtko Ursulin <tvrtko.ursulin@linux.intel.com>
Cc: Daniel Vetter <daniel.vetter@ffwll.ch>
Link: https://patchwork.freedesktop.org/patch/msgid/20171011141857.14161-1-chris@chris-wilson.co.uk
Reviewed-by: Daniel Vetter <daniel.vetter@ffwll.ch>
2017-10-12 09:11:32 +01:00
Jani Nikula
a8a08886ef drm/i915/dp: limit sink rates based on rate
Get rid of redundant intel_dp_num_rates(). We can simply look at the
rate and limit based on that.

Cc: Manasi Navare <manasi.d.navare@intel.com>
Reviewed-by: <manasi.d.navare@intel.com>
Signed-off-by: Jani Nikula <jani.nikula@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20171009092959.29021-3-jani.nikula@intel.com
2017-10-11 22:04:33 +03:00
Jani Nikula
fc603ca7f8 drm/i915/dp: centralize max source rate conditions more
Turn intel_dp_source_supports_hbr2() into a simple helper to query the
pre-filled source rates array, and move the conditions about which
platforms support which rates to the single point of truth in
intel_dp_set_source_rates().

This also reduces the code paths you have to think about in the source
rates initialization in intel_dp_set_source_rates(), making it easier to
grasp.

Cc: Manasi Navare <manasi.d.navare@intel.com>
Reviewed-by: Manasi Navare <manasi.d.navare@intel.com>
Signed-off-by: Jani Nikula <jani.nikula@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20171009092959.29021-2-jani.nikula@intel.com
2017-10-11 22:04:33 +03:00
Ville Syrjälä
3c7b6b3c4f drm/i915: Allow PCH platforms fall back to BIOS LVDS mode
With intel_encoder_current_mode() using the normal state readout code it
actually works on PCH platforms as well. So let's nuke the PCH check from
intel_lvds_init(). I suppose there aren't any machines that actually
need this, but at least we get to eliminate a few lines of code, and one
FIXME.

Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20171009161951.22420-2-ville.syrjala@linux.intel.com
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
2017-10-11 21:43:30 +03:00
Ville Syrjälä
de33081567 drm/i915: Reuse normal state readout for LVDS/DVO fixed mode
Reuse the normal state readout code to get the fixed mode for LVDS/DVO
encoders. This removes some partially duplicated state readout code
from LVDS/DVO encoders. The duplicated code wasn't actually even
populating the negative h/vsync flags, leading to possible state checker
complaints. The normal readout code populates that stuff fully.

Cc: Chris Wilson <chris@chris-wilson.co.uk>
Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20171009161951.22420-1-ville.syrjala@linux.intel.com
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
Tested-by: Chris Wilson <chris@chris-wilson.co.uk>
2017-10-11 19:44:19 +03:00
Daniel Vetter
af7a8ffad9 drm/i915: Use rcu instead of stop_machine in set_wedged
stop_machine is not really a locking primitive we should use, except
when the hw folks tell us the hw is broken and that's the only way to
work around it.

This patch tries to address the locking abuse of stop_machine() from

commit 20e4933c47
Author: Chris Wilson <chris@chris-wilson.co.uk>
Date:   Tue Nov 22 14:41:21 2016 +0000

    drm/i915: Stop the machine as we install the wedged submit_request handler

Chris said parts of the reasons for going with stop_machine() was that
it's no overhead for the fast-path. But these callbacks use irqsave
spinlocks and do a bunch of MMIO, and rcu_read_lock is _real_ fast.

To stay as close as possible to the stop_machine semantics we first
update all the submit function pointers to the nop handler, then call
synchronize_rcu() to make sure no new requests can be submitted. This
should give us exactly the huge barrier we want.

I pondered whether we should annotate engine->submit_request as __rcu
and use rcu_assign_pointer and rcu_dereference on it. But the reason
behind those is to make sure the compiler/cpu barriers are there for
when you have an actual data structure you point at, to make sure all
the writes are seen correctly on the read side. But we just have a
function pointer, and .text isn't changed, so no need for these
barriers and hence no need for annotations.

Unfortunately there's a complication with the call to
intel_engine_init_global_seqno:

- Without stop_machine we must hold the corresponding spinlock.

- Without stop_machine we must ensure that all requests are marked as
  having failed with dma_fence_set_error() before we call it. That
  means we need to split the nop request submission into two phases,
  both synchronized with rcu:

  1. Only stop submitting the requests to hw and mark them as failed.

  2. After all pending requests in the scheduler/ring are suitably
  marked up as failed and we can force complete them all, also force
  complete by calling intel_engine_init_global_seqno().

This should fix the followwing lockdep splat:

======================================================
WARNING: possible circular locking dependency detected
4.14.0-rc3-CI-CI_DRM_3179+ #1 Tainted: G     U
------------------------------------------------------
kworker/3:4/562 is trying to acquire lock:
 (cpu_hotplug_lock.rw_sem){++++}, at: [<ffffffff8113d4bc>] stop_machine+0x1c/0x40

but task is already holding lock:
 (&dev->struct_mutex){+.+.}, at: [<ffffffffa0136588>] i915_reset_device+0x1e8/0x260 [i915]

which lock already depends on the new lock.

the existing dependency chain (in reverse order) is:

-> #6 (&dev->struct_mutex){+.+.}:
       __lock_acquire+0x1420/0x15e0
       lock_acquire+0xb0/0x200
       __mutex_lock+0x86/0x9b0
       mutex_lock_interruptible_nested+0x1b/0x20
       i915_mutex_lock_interruptible+0x51/0x130 [i915]
       i915_gem_fault+0x209/0x650 [i915]
       __do_fault+0x1e/0x80
       __handle_mm_fault+0xa08/0xed0
       handle_mm_fault+0x156/0x300
       __do_page_fault+0x2c5/0x570
       do_page_fault+0x28/0x250
       page_fault+0x22/0x30

-> #5 (&mm->mmap_sem){++++}:
       __lock_acquire+0x1420/0x15e0
       lock_acquire+0xb0/0x200
       __might_fault+0x68/0x90
       _copy_to_user+0x23/0x70
       filldir+0xa5/0x120
       dcache_readdir+0xf9/0x170
       iterate_dir+0x69/0x1a0
       SyS_getdents+0xa5/0x140
       entry_SYSCALL_64_fastpath+0x1c/0xb1

-> #4 (&sb->s_type->i_mutex_key#5){++++}:
       down_write+0x3b/0x70
       handle_create+0xcb/0x1e0
       devtmpfsd+0x139/0x180
       kthread+0x152/0x190
       ret_from_fork+0x27/0x40

-> #3 ((complete)&req.done){+.+.}:
       __lock_acquire+0x1420/0x15e0
       lock_acquire+0xb0/0x200
       wait_for_common+0x58/0x210
       wait_for_completion+0x1d/0x20
       devtmpfs_create_node+0x13d/0x160
       device_add+0x5eb/0x620
       device_create_groups_vargs+0xe0/0xf0
       device_create+0x3a/0x40
       msr_device_create+0x2b/0x40
       cpuhp_invoke_callback+0xc9/0xbf0
       cpuhp_thread_fun+0x17b/0x240
       smpboot_thread_fn+0x18a/0x280
       kthread+0x152/0x190
       ret_from_fork+0x27/0x40

-> #2 (cpuhp_state-up){+.+.}:
       __lock_acquire+0x1420/0x15e0
       lock_acquire+0xb0/0x200
       cpuhp_issue_call+0x133/0x1c0
       __cpuhp_setup_state_cpuslocked+0x139/0x2a0
       __cpuhp_setup_state+0x46/0x60
       page_writeback_init+0x43/0x67
       pagecache_init+0x3d/0x42
       start_kernel+0x3a8/0x3fc
       x86_64_start_reservations+0x2a/0x2c
       x86_64_start_kernel+0x6d/0x70
       verify_cpu+0x0/0xfb

-> #1 (cpuhp_state_mutex){+.+.}:
       __lock_acquire+0x1420/0x15e0
       lock_acquire+0xb0/0x200
       __mutex_lock+0x86/0x9b0
       mutex_lock_nested+0x1b/0x20
       __cpuhp_setup_state_cpuslocked+0x53/0x2a0
       __cpuhp_setup_state+0x46/0x60
       page_alloc_init+0x28/0x30
       start_kernel+0x145/0x3fc
       x86_64_start_reservations+0x2a/0x2c
       x86_64_start_kernel+0x6d/0x70
       verify_cpu+0x0/0xfb

-> #0 (cpu_hotplug_lock.rw_sem){++++}:
       check_prev_add+0x430/0x840
       __lock_acquire+0x1420/0x15e0
       lock_acquire+0xb0/0x200
       cpus_read_lock+0x3d/0xb0
       stop_machine+0x1c/0x40
       i915_gem_set_wedged+0x1a/0x20 [i915]
       i915_reset+0xb9/0x230 [i915]
       i915_reset_device+0x1f6/0x260 [i915]
       i915_handle_error+0x2d8/0x430 [i915]
       hangcheck_declare_hang+0xd3/0xf0 [i915]
       i915_hangcheck_elapsed+0x262/0x2d0 [i915]
       process_one_work+0x233/0x660
       worker_thread+0x4e/0x3b0
       kthread+0x152/0x190
       ret_from_fork+0x27/0x40

other info that might help us debug this:

Chain exists of:
  cpu_hotplug_lock.rw_sem --> &mm->mmap_sem --> &dev->struct_mutex

 Possible unsafe locking scenario:

       CPU0                    CPU1
       ----                    ----
  lock(&dev->struct_mutex);
                               lock(&mm->mmap_sem);
                               lock(&dev->struct_mutex);
  lock(cpu_hotplug_lock.rw_sem);

 *** DEADLOCK ***

3 locks held by kworker/3:4/562:
 #0:  ("events_long"){+.+.}, at: [<ffffffff8109c64a>] process_one_work+0x1aa/0x660
 #1:  ((&(&i915->gpu_error.hangcheck_work)->work)){+.+.}, at: [<ffffffff8109c64a>] process_one_work+0x1aa/0x660
 #2:  (&dev->struct_mutex){+.+.}, at: [<ffffffffa0136588>] i915_reset_device+0x1e8/0x260 [i915]

stack backtrace:
CPU: 3 PID: 562 Comm: kworker/3:4 Tainted: G     U          4.14.0-rc3-CI-CI_DRM_3179+ #1
Hardware name:                  /NUC7i5BNB, BIOS BNKBL357.86A.0048.2017.0704.1415 07/04/2017
Workqueue: events_long i915_hangcheck_elapsed [i915]
Call Trace:
 dump_stack+0x68/0x9f
 print_circular_bug+0x235/0x3c0
 ? lockdep_init_map_crosslock+0x20/0x20
 check_prev_add+0x430/0x840
 ? irq_work_queue+0x86/0xe0
 ? wake_up_klogd+0x53/0x70
 __lock_acquire+0x1420/0x15e0
 ? __lock_acquire+0x1420/0x15e0
 ? lockdep_init_map_crosslock+0x20/0x20
 lock_acquire+0xb0/0x200
 ? stop_machine+0x1c/0x40
 ? i915_gem_object_truncate+0x50/0x50 [i915]
 cpus_read_lock+0x3d/0xb0
 ? stop_machine+0x1c/0x40
 stop_machine+0x1c/0x40
 i915_gem_set_wedged+0x1a/0x20 [i915]
 i915_reset+0xb9/0x230 [i915]
 i915_reset_device+0x1f6/0x260 [i915]
 ? gen8_gt_irq_ack+0x170/0x170 [i915]
 ? work_on_cpu_safe+0x60/0x60
 i915_handle_error+0x2d8/0x430 [i915]
 ? vsnprintf+0xd1/0x4b0
 ? scnprintf+0x3a/0x70
 hangcheck_declare_hang+0xd3/0xf0 [i915]
 ? intel_runtime_pm_put+0x56/0xa0 [i915]
 i915_hangcheck_elapsed+0x262/0x2d0 [i915]
 process_one_work+0x233/0x660
 worker_thread+0x4e/0x3b0
 kthread+0x152/0x190
 ? process_one_work+0x660/0x660
 ? kthread_create_on_node+0x40/0x40
 ret_from_fork+0x27/0x40
Setting dangerous option reset - tainting kernel
i915 0000:00:02.0: Resetting chip after gpu hang
Setting dangerous option reset - tainting kernel
i915 0000:00:02.0: Resetting chip after gpu hang

v2: Have 1 global synchronize_rcu() barrier across all engines, and
improve commit message.

v3: We need to protect the seqno update with the timeline spinlock (in
set_wedged) to avoid racing with other updates of the seqno, like we
already do in nop_submit_request (Chris).

v4: Use two-phase sequence to plug the race Chris spotted where we can
complete requests before they're marked up with -EIO.

v5: Review from Chris:
- simplify nop_submit_request.
- Add comment to rcu_read_lock section.
- Align comments with the new style.

v6: Remove unused variable to appease CI.

Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=102886
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=103096
Cc: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Mika Kuoppala <mika.kuoppala@intel.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Marta Lofstedt <marta.lofstedt@intel.com>
Signed-off-by: Daniel Vetter <daniel.vetter@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20171011091019.1425-1-daniel.vetter@ffwll.ch
2017-10-11 17:51:21 +02:00
Sagar Arun Kamble
37d933fc17 drm/i915: Introduce separate status variable for RC6 and LLC ring frequency setup
Defined new struct intel_rc6 to hold RC6 specific state and
intel_ring_pstate to hold ring specific state.

v2: s/intel_ring_pstate/intel_llc_pstate. Removed checks from
autoenable_* functions. (Chris)

Signed-off-by: Sagar Arun Kamble <sagar.a.kamble@intel.com>
Cc: Imre Deak <imre.deak@intel.com>
Cc: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Cc: Radoslaw Szwichtenberg <radoslaw.szwichtenberg@intel.com>
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
Link: https://patchwork.freedesktop.org/patch/msgid/1507360055-19948-13-git-send-email-sagar.a.kamble@intel.com
Acked-by: Imre Deak <imre.deak@intel.com>
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Link: https://patchwork.freedesktop.org/patch/msgid/20171010213010.7415-12-chris@chris-wilson.co.uk
2017-10-11 08:57:05 +01:00
Sagar Arun Kamble
fc77426a8d drm/i915: Create generic functions to control RC6, RPS
Prepared generic functions intel_enable_rc6, intel_disable_rc6,
intel_enable_rps and intel_disable_rps functions to setup RC6/RPS
based on platforms.

v2: Make intel_enable/disable_rc6/rps static. (Chris)

v3: Added lockdep_assert_held(dev_priv->pcu_lock) in new generic
functions. (Chris)
Removed WARN_ON(&dev_priv->pcu_lock) from lower level functions as generic
function now has lockdep_assert. Rebase.

Signed-off-by: Sagar Arun Kamble <sagar.a.kamble@intel.com>
Cc: Imre Deak <imre.deak@intel.com>
Cc: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Cc: Radoslaw Szwichtenberg <radoslaw.szwichtenberg@intel.com>
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
Link: https://patchwork.freedesktop.org/patch/msgid/1507360055-19948-12-git-send-email-sagar.a.kamble@intel.com
Acked-by: Imre Deak <imre.deak@intel.com>
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Link: https://patchwork.freedesktop.org/patch/msgid/20171010213010.7415-11-chris@chris-wilson.co.uk
2017-10-11 08:57:05 +01:00
Sagar Arun Kamble
0870a2a4a3 drm/i915: Create generic function to setup LLC ring frequency table
Prepared intel_update_ring_freq function to setup ring frequency
for applicable platforms determined by macro HAS_LLC.

v2: Replaced NEEDS_RING_FREQ_UPDATE with HAS_LLC macro. (Chris)
    Added check while calling from intel_enable_gt_powersave.

v3: s/intel_update_ring_freq/intel_enable_llc_pstate and created
new placeholder function intel_disable_llc_pstate. (Chris)

Signed-off-by: Sagar Arun Kamble <sagar.a.kamble@intel.com>
Cc: Imre Deak <imre.deak@intel.com>
Cc: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Cc: Radoslaw Szwichtenberg <radoslaw.szwichtenberg@intel.com>
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
Link: https://patchwork.freedesktop.org/patch/msgid/1507360055-19948-11-git-send-email-sagar.a.kamble@intel.com
Acked-by: Imre Deak <imre.deak@intel.com>
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Link: https://patchwork.freedesktop.org/patch/msgid/20171010213010.7415-10-chris@chris-wilson.co.uk
2017-10-11 08:57:04 +01:00
Sagar Arun Kamble
771decb0b4 drm/i915: Rename intel_enable_rc6 to intel_rc6_enabled
This function gives the status of RC6, whether disabled or if
enabled then which state. intel_enable_rc6 will be used for
enabling RC6 in the next patch.

v2: Rebase.

v3: Rebase.

Signed-off-by: Sagar Arun Kamble <sagar.a.kamble@intel.com>
Cc: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Imre Deak <imre.deak@intel.com>
Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Reviewed-by: Radoslaw Szwichtenberg <radoslaw.szwichtenberg@intel.com> #1
Reviewed-by: Ewelina Musial <ewelina.musial@intel.com> #1
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
Link: https://patchwork.freedesktop.org/patch/msgid/1507360055-19948-10-git-send-email-sagar.a.kamble@intel.com
Acked-by: Imre Deak <imre.deak@intel.com>
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Link: https://patchwork.freedesktop.org/patch/msgid/20171010213010.7415-9-chris@chris-wilson.co.uk
2017-10-11 08:57:02 +01:00