Manual conversion of the printk based logging macros to the new struct
drm_based logging macros in drm/i915/gt/intel_ggtt.c.
Also includes extracting the struct drm_i915_private device from various
intel types to use in the new macros.
Signed-off-by: Wambui Karuga <wambui.karugax@gmail.com>
Signed-off-by: Jani Nikula <jani.nikula@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20200128071437.9284-3-wambui.karugax@gmail.com
Only the first and the last nodes were being added to
ref->preallocated_barriers.
Renaming variables to make it more easy to read.
Fixes: 8413502238 ("drm/i915/gt: Drop mutex serialisation between context pin/unpin")
Cc: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Maarten Lankhorst <maarten.lankhorst@linux.intel.com>
Signed-off-by: José Roberto de Souza <jose.souza@intel.com>
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Link: https://patchwork.freedesktop.org/patch/msgid/20200129232345.84512-1-jose.souza@intel.com
A couple of OOPS fixes, fixes for TU1xx if firmware isn't available,
better behaviour in the face of GPU faults, and a patch to make HD
audio work again after runpm changes.
Signed-off-by: Dave Airlie <airlied@redhat.com>
From: Ben Skeggs <skeggsb@gmail.com>
Link: https://patchwork.freedesktop.org/patch/msgid/ <CACAvsv4xcLF6Ahh7UYEesn-wBEksd2da+ghusBAdODMrH7Sz2A@mail.gmail.com
Now that we have offline error capture and can reset an engine from
inside an atomic context while also preserving the GPU state for
post-mortem analysis, it is time to handle error interrupts thrown by
the command parser.
This provides a much, much faster mechanism for us to detect known
problems than using heartbeats/hangchecks, and also provides a mechanism
for when those are disabled. However, it is limited to problems the HW
can detect in the CS and so not a complete solution for detecting lockups.
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Reviewed-by: Mika Kuoppala <mika.kuoppala@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20200128204318.4182039-2-chris@chris-wilson.co.uk
We write to execlists->pending[0] in process_csb() to acknowledge the
completion of the ESLP update, outside of the main spinlock. When we
check the current status of the previous submission in
__execlists_submission_tasklet() we should therefore use READ_ONCE() to
reflect and document the unsynchronized read.
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Reviewed-by: Mika Kuoppala <mika.kuoppala@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20200128171614.3845825-1-chris@chris-wilson.co.uk
Measure the memcpy bw between our CPU accessible regions, trying all
supported mapping combinations(WC, WB) across various sizes.
v2:
use smaller sizes
throw in memcpy32/memcpy64/memcpy_from_wc
Signed-off-by: Matthew Auld <matthew.auld@intel.com>
Cc: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Link: https://patchwork.freedesktop.org/patch/msgid/20200129093343.194570-1-matthew.auld@intel.com
Without relaxing this requirement, TU10x boards will fail to load without
an updated linux-firmware, and TU11x will completely fail to load because
FW isn't available yet.
Signed-off-by: Ben Skeggs <bskeggs@redhat.com>
This fixes an oops on TU11x GPUs where SEC2 attempts to register its falcon,
and triggers a NULL-pointer deref because ACR isn't yet supported.
Signed-off-by: Ben Skeggs <bskeggs@redhat.com>
The implementations for most channel types contains a map of methods to
priv registers in order to provide debugging info when a disp exception
has been raised.
This info is missing from the implementation of PIO channels as they're
rather simplistic already, however, if an exception is raised by one of
them, we'd end up triggering a NULL-pointer deref. Not ideal...
Bugzilla: https://bugzilla.kernel.org/show_bug.cgi?id=206299
Signed-off-by: Ben Skeggs <bskeggs@redhat.com>
This patch adds the support for the notification of HD-audio hotplug
via the already existing drm_audio_component framework. This allows
us more reliable hotplug notification and ELD transfer without
accessing HD-audio bus; it's more efficient, and more importantly, it
works without waking up the runtime PM.
The implementation is rather simplistic: nouveau driver provides the
get_eld ops for HD-audio, and it notifies the audio hotplug via
pin_eld_notify callback upon each nv50_audio_enable() and _disable()
call. As the HD-audio pin assignment seems corresponding to the CRTC,
the crtc->index number is passed directly as the zero-based port
number.
The bind and unbind callbacks handle the device-link so that it
assures the PM call order.
Link: https://lore.kernel.org/r/20190722143815.7339-3-tiwai@suse.de
Reviewed-by: Lyude Paul <lyude@redhat.com>
Cc: Jaroslav Kysela <perex@perex.cz>
Signed-off-by: Takashi Iwai <tiwai@suse.de>
Signed-off-by: Ben Skeggs <bskeggs@redhat.com>
Don't confuse the poor developer by writing a negative value as a very
large positive, as the flow of requests is already complex enough.
Reported-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20200128151647.3820659-1-chris@chris-wilson.co.uk
The user (e.g. gem_eio) can manipulate the driver into wedging itself,
allowing the user to trigger voluminous logging of inconsequential
details. If we lift the dump to direct calls to intel_gt_set_wedged(),
out of the intel_reset failure handling, we keep the detail logging for
what we expect are true HW or test failures without being tricked.
Reported-by: Tomi Sarvela <tomi.p.sarvela@intel.com>
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Mika Kuoppala <mika.kuoppala@linux.intel.com>
Cc: Tomi Sarvela <tomi.p.sarvela@intel.com>
Reviewed-by: Mika Kuoppala <mika.kuoppala@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20200127231540.3302516-6-chris@chris-wilson.co.uk
We always use a deferred bottom-half (either tasklet or irq_work) for
processing the response to an interrupt which means we can recombine the
GT irq ack+handler into one. This simplicity is important in later
patches as we will need to handle and then ack multiple interrupt levels
before acking the GT and master interrupts.
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Mika Kuoppala <mika.kuoppala@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20200127231540.3302516-2-chris@chris-wilson.co.uk
We don't want to report errors on the internal contexts to userspace,
suppressing their own, so treat them as simulated errors. These mostly
arise inside selftests and so are simulated anyway. For the rest, we can
rely on the normal debug channels in CI.
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Mika Kuoppala <mika.kuoppala@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20200128113426.3711294-1-chris@chris-wilson.co.uk
There are several spelling mistakes in PP_ASSERT_WITH_CODE messages.
Fix these.
Signed-off-by: Colin Ian King <colin.king@canonical.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
A for-loop is iterating from 0 up to 1000 however the loop variable count
is a u8 and hence not large enough. Fix this by making count an int.
Also remove the redundant initialization of count since this is never used
and add { } on the loop statement make the loop block clearer.
v2: drop useless else (Walter Harms)
Addresses-Coverity: ("Operands don't affect result")
Fixes: ed581a0ace ("drm/amd/display: wait for update when setting dpg test pattern")
Signed-off-by: Colin Ian King <colin.king@canonical.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
This allocation isn't required and can fail when resuming from suspend.
Bug: https://gitlab.freedesktop.org/drm/amd/issues/1009
Signed-off-by: Dor Askayo <dor.askayo@gmail.com>
Reviewed-by: Leo Li <sunpeng.li@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Coreboot seems to have a problem correctly setting up access to the stolen VRAM
on KV/KB. Use the direct access only when necessary.
Signed-off-by: Christian König <christian.koenig@amd.com>
Reported-and-tested-by: Fredrik Bruhn <fredrik.bruhn@unibap.com>
Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
This reverts commit 80adaebd2d.
[WHY]
This change was working around a regression that occured in this:
commit 0301ccbaf6 ("drm/amd/display: DP Compliance 400.1.1 failure")
With the fix to run verify_link_cap when the SINK_COUNT of
dongles becomes non-zero this change is no longer needed.
Cc: Louis Li <Ching-shih.Li@amd.com>
Cc: Wenjing Liu <Wenjing.Liu@amd.com>
Cc: Hersen Wu <hersenxs.wu@amd.com>
Cc: Eric Yang <Eric.Yang2@amd.com>
Reviewed-by: Wenjing Liu <Wenjing.Liu@amd.com>
Signed-off-by: Harry Wentland <harry.wentland@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
[WHY]
Two years ago the patch referenced by the Fixes tag stopped running
dp_verify_link_cap_with_retries during DP detection when the reason
for the detection was a short-pulse interrupt. This effectively meant
that we were no longer doing the verify_link_cap training on active
dongles when their SINK_COUNT changed from 0 to 1.
A year ago this was partly remedied with:
commit 80adaebd2d ("drm/amd/display: Don't skip link training for empty dongle")
This made sure that we trained the dongle on initial hotplug (without
connected downstream devices).
This is all fine and dandy if it weren't for the fact that there are
some dongles on the market that don't like link training when SINK_COUNT
is 0 These dongles will in fact indicate a SINK_COUNT of 0 immediately
after hotplug, even when a downstream device is connected, and then
trigger a shortpulse interrupt indicating a SINK_COUNT change to 1.
In order to play nicely we will need our policy to not link train an
active DP dongle when SINK_COUNT is 0 but ensure we train it when the
SINK_COUNT changes to 1.
[HOW]
Call dp_verify_link_cap_with_retries on detection even when the detection
is triggered from a short pulse interrupt.
With this change we can also revert this commit which we'll do in a separate
follow-up change:
commit 80adaebd2d ("drm/amd/display: Don't skip link training for empty dongle")
Fixes: 0301ccbaf6 ("drm/amd/display: DP Compliance 400.1.1 failure")
Suggested-by: Louis Li <Ching-shih.Li@amd.com>
Tested-by: Louis Li <Ching-shih.Li@amd.com>
Cc: Wenjing Liu <Wenjing.Liu@amd.com>
Cc: Hersen Wu <hersenxs.wu@amd.com>
Cc: Eric Yang <Eric.Yang2@amd.com>
Reviewed-by: Wenjing Liu <Wenjing.Liu@amd.com>
Signed-off-by: Harry Wentland <harry.wentland@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Switch to a blacklist so we can disable specific boards
that are problematic.
v2: make the blacklist non-raven specific.
Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
There is a spelling mistake in a DRM_ERROR message. Fix it.
Signed-off-by: Colin Ian King <colin.king@canonical.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Fixes coccicheck warning:
drivers/gpu/drm/amd/powerplay/hwmgr/smu7_hwmgr.c:723:2-50: WARNING: Assignment of 0/1 to bool variable
drivers/gpu/drm/amd/powerplay/hwmgr/smu7_hwmgr.c:733:3-52: WARNING: Assignment of 0/1 to bool variable
drivers/gpu/drm/amd/powerplay/hwmgr/smu7_hwmgr.c:747:3-51: WARNING: Assignment of 0/1 to bool variable
Reported-by: Hulk Robot <hulkci@huawei.com>
Signed-off-by: zhengbin <zhengbin13@huawei.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
expand sched_list definition for better understanding.
Also fix a typo atleast -> at least
Signed-off-by: Nirmoy Das <nirmoy.das@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
bo_va_list is list_head, so initialize it.
Signed-off-by: xinhui pan <xinhui.pan@amd.com>
Reviewed-by: Feifei Xu <Feifei.Xu@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Use inst_idx relacing inst in SOC15_DPG_MODE macro to avoid confusion.
Signed-off-by: James Zhu <James.Zhu@amd.com>
Reviewed-by: Leo Liu <leo.liu@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Fix typo error, should be inst_idx instead of inst.
Signed-off-by: James Zhu <James.Zhu@amd.com>
Reviewed-by: Leo Liu <leo.liu@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Fix vcn2.5 instance issue, vcn0 and vcn1 have same register offset
Signed-off-by: James Zhu <James.Zhu@amd.com>
Reviewed-by: Leo Liu <leo.liu@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>