In commit 5a7d202b15, a logical AND was erroneously changed to an OR,
causing WaIncreaseLatencyIPCEnabled to be enabled unconditionally for
kabylake and coffeelake, even when IPC is disabled. Fix the logic so
that WaIncreaseLatencyIPCEnabled is only used when IPC is enabled.
Fixes: 5a7d202b15 ("drm/i915: Drop WaIncreaseLatencyIPCEnabled/1140 for cnl")
Cc: stable@vger.kernel.org # 5.3.x+
Signed-off-by: Sultan Alsawaf <sultan@kerneltoast.com>
Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20200430214654.51314-1-sultan@kerneltoast.com
All these ROUNDING_FACTORs and whatnot are making this thing hard to
read. Get rid of them. And let's massage some of the fractions to
give us less questionable intermediate results and perhaps less
divisions.
Also looks like a good helping of 64bit math stuff is needed to
avoid some of overflows present in the current code. There
might still be a few overflows, namely when calculating
link_clks_available/samples_room (would require a huge hblank
though), and potentially when calculating hblank_rise (not sure
how large link_clks_active can get).
It looks like we're still not calculating exactly what the spec says
since we truncate tu_data and tu_line early. But I'm too lazy to
figure out if we could avoid that.
v2: Fix typo in commit msg (Uma)
Remove ROUNDING_FACTOR define (Uma)
s/5*link_clk+5*cdclk/5*(link_clk+cdclk)/ (Chris)
Cc: Anshuman Gupta <anshuman.gupta@intel.com>
Cc: Uma Shankar <uma.shankar@intel.com>
Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20200429185457.26235-3-ville.syrjala@linux.intel.com
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Uma Shankar <uma.shankar@intel.com>
Since the code seems insistent on using the variable names from the
bspec formulat, let's be consistent and use those names for all
the things. For some reason 'link_clk' and 'lanes' were left out
in the code until now.
Cc: Anshuman Gupta <anshuman.gupta@intel.com>
Cc: Uma Shankar <uma.shankar@intel.com>
Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20200429185457.26235-2-ville.syrjala@linux.intel.com
Reviewed-by: Uma Shankar <uma.shankar@intel.com>
mode.vrefresh is rounded to the nearest integer. You don't want to use
it anywhere that requires precision. Also I want to nuke it.
vtotal*vrefresh == 1000*clock/htotal, so let's use the latter.
Cc: Anshuman Gupta <anshuman.gupta@intel.com>
Cc: Uma Shankar <uma.shankar@intel.com>
Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20200429185457.26235-1-ville.syrjala@linux.intel.com
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Uma Shankar <uma.shankar@intel.com>
Remove all the stepping dependent cnl workarounds. Bspec lists
more steppings than this so presumably these are classed as
pre-production. And this is cnl after all so no one should
really care anyway.
Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20200430125822.21985-2-ville.syrjala@linux.intel.com
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
Display WA #1105 says that FBC requires PLANE_STRIDE to be a multiple
of 512 bytes on gen9 and glk.
This is definitely true for glk as certain tests (such as
igt/kms_big_fb/linear-16bpp-rotate-0) are now failing when the
display resolution results in a plane stride which is not a
multiple of 512 bytes.
Curiously I was not able to reproduce this on a KBL. First I
suspected that our use of the FBC override stride explain this,
but after trying to use the override stride on glk the test
still failed. I did try both the old CHICKEN_MISC_4 way and
the new FBC_STRIDE way, neither had any effect on the result.
Anyways, we need this at least on glk. But let's trust the spec
and apply the w/a for all gen9 as well, despite being unable to
reproduce the problem.
v2: s/FBC_CHICKEN/FBC_STRIDE/ in commit msg
Cc: José Roberto de Souza <jose.souza@intel.com>
Fixes: 691f7ba58d ("drm/i915/display/fbc: Make fences a nice-to-have for GEN9+")
Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20200429101034.8208-2-ville.syrjala@linux.intel.com
Reviewed-by: Matt Roper <matthew.d.roper@intel.com>
That is a preparation patch before next one where we
introduce old_bw_state and a bunch of other changes
as well.
In a review comment it was suggested to split out
at least that renaming into a separate patch, what
is done here.
v2: Removed spurious space
Reviewed-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Signed-off-by: Stanislav Lisovskiy <stanislav.lisovskiy@intel.com>
Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20200423075902.21892-8-stanislav.lisovskiy@intel.com
We need to calculate SAGV mask also in a non-modeset
commit, however currently active_pipes are only calculated
for modesets in global atomic state, thus now we will be
tracking those also in bw_state in order to be able to
properly access global data.
v2: - Removed pre/post plane SAGV updates from modeset(Ville)
- Now tracking active pipes in intel_can_enable_sagv(Ville)
v3: - lock global state if active_pipes change as well(Ville)
Signed-off-by: Stanislav Lisovskiy <stanislav.lisovskiy@intel.com>
Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20200430195634.7666-1-stanislav.lisovskiy@intel.com
Future platforms require per-crtc SAGV evaluation
and serializing global state when those are changed
from different commits.
v2: - Add has_sagv check to intel_crtc_can_enable_sagv
so that it sets bit in reject mask.
- Use bw_state in intel_pre/post_plane_enable_sagv
instead of atomic state
v3: - Fixed rebase conflict, now using
intel_atomic_crtc_state_for_each_plane_state in
order to call it from atomic check
v4: - Use fb modifier from plane state
v5: - Make intel_has_sagv static again(Ville)
- Removed unnecessary NULL assignments(Ville)
- Removed unnecessary SAGV debug(Ville)
- Call intel_compute_sagv_mask only for modesets(Ville)
- Serialize global state only if sagv results change, but
not mask itself(Ville)
v6: - use lock global state instead of serialize(Ville)
v7: - use both global state lock and serialize depending on
if we need to change only global state or access hw
(Ville)
Signed-off-by: Stanislav Lisovskiy <stanislav.lisovskiy@intel.com>
Cc: Ville Syrjälä <ville.syrjala@intel.com>
Cc: James Ausmus <james.ausmus@intel.com>
Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20200430191757.18206-1-stanislav.lisovskiy@intel.com
The older arches did not convert MI_STORE_DATA_IMM to using the GTT, but
left them writing to a physical address. The notes suggest that the
primary reason would be so that the writes were cache coherent, as the
CPU cache uses physical tagging. As such we did not implement the
legacy variant of MI_STORE_DATA_IMM and so left all the relocations
synchronous -- but with a small function to convert from the vma address
into the physical address, we can implement asynchronous relocs on these
older arches, fixing up a few tests that require them.
In order to be able to test the legacy paths, refactor the gpu
relocations so that we can hook them up to a selftest.
v2: Use an array of offsets not enum labels for the selftest
v3: Refactor the common igt_hexdump()
Closes: https://gitlab.freedesktop.org/drm/intel/-/issues/757
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20200504140629.28240-1-chris@chris-wilson.co.uk
It is required that a chained batch be in the same address domain as its
parent, and also that must be specified in the command for earlier gen
as it is not inferred from the chaining until gen6.
Fixes: 964a9b0f61 ("drm/i915/gem: Use chained reloc batches")
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20200504125149.4396-1-chris@chris-wilson.co.uk
We only need the device wakeref on freeing the objects if we have to
unbind the object from the global GTT, or otherwise update device
information. If the objects are clean, we never need the wakeref, so
avoid taking until required.
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Janusz Krzysztofik <janusz.krzysztofik@linux.intel.com>
Reviewed-by: Janusz Krzysztofik <janusz.krzysztofik@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20200503171513.18704-1-chris@chris-wilson.co.uk
If at first we don't succeed, try try again.
Not all engines may support the MI ops we need to perform asynchronous
relocation patching, and so we end up falling back to a synchronous
operation that has a liability of blocking. However, Tvrtko pointed out
we don't need to use the same engine to perform the relocations as we
are planning to execute the execbuf on, and so if we switch over to a
working engine, we can perform the relocation asynchronously. The user
execbuf will be queued after the relocations by virtue of fencing.
This patch creates a new context per execbuf requiring asynchronous
relocations on an unusable engines. This is perhaps a bit excessive and
can be ameliorated by a small context cache, but for the moment we only
need it for working around a little used engine on Sandybridge, and only
if relocations are actually required to an active batch buffer.
Now we just need to teach the relocation code to handle physical
addressing for gen2/3, and we should then have universal support!
Suggested-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Testcase: igt/gem_exec_reloc/basic-spin # snb
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20200501192945.22215-3-chris@chris-wilson.co.uk
As we can now keep chaining together a relocation batch to process any
number of relocations, we can keep building that relocation batch for
all of the target vma. This avoiding emitting a new request into the
ring for each target, consuming precious ring space and a potential
stall.
v2: Propagate the failure from submitting the relocation batch.
Testcase: igt/gem_exec_reloc/basic-wide-active
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20200501192945.22215-2-chris@chris-wilson.co.uk
The ring is a precious resource: we anticipate to only use a few hundred
bytes for a request, and only try to reserve that before we start. If we
go beyond our guess in building the request, then instead of waiting at
the start of execbuf before we hold any locks or other resources, we
may trigger a wait inside a critical region. One example is in using gpu
relocations, where currently we emit a new MI_BB_START from the ring
every time we overflow a page of relocation entries. However, instead of
insert the command into the precious ring, we can chain the next page of
relocation entries as MI_BB_START from the end of the previous.
v2: Delay the emit_bb_start until after all the chained vma
synchronisation is complete. Since the buffer pool batches are idle, this
_should_ be a no-op, but one day we may some fancy async GPU bindings
for new vma!
v3: Use pool/batch consitently, once we start thinking in terms of the
batch vma, use batch->obj.
v4: Explain the magic number 4.
Tvrtko spotted that we lose propagation of the error for failing to
submit the relocation request; that's easier to fix up in the next
patch.
Testcase: igt/gem_exec_reloc/basic-many-active
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20200501192945.22215-1-chris@chris-wilson.co.uk
Create gfx_v10_0_rlc_funcs_sriov for nv12 with rlcg_write function pointers be initialized
so driver can use RLCG to write aceess CSIB and CP_ME_CNTL registers when nv12 in sriov mode
Signed-off-by: shaoyunl <shaoyun.liu@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
The queue mask used for set_resources always assumes the queue number
per pipe is 8, so KFD needs to align with that by using function
amdgpu_queue_mask_bit_to_set_resource_bit().
Signed-off-by: Yong Zhao <Yong.Zhao@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Rename it to amdgpu_queue_mask_bit_to_set_resource_bit() to be more
specific about its functionality. KFD will use it later.
Signed-off-by: Yong Zhao <Yong.Zhao@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
IP discovery is only supported in Navi series and onwards.
There is no need to reserve a portion of vram as discovery
tmr region for pre-Navi adapters.
Signed-off-by: Hawking Zhang <Hawking.Zhang@amd.com>
Reviewed-by: Likun Gao <Likun.Gao@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
This is to prepare for initializing discovery tmr size per
ASIC type
Signed-off-by: Hawking Zhang <Hawking.Zhang@amd.com>
Reviewed-by: Likun Gao <Likun.Gao@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Use a common method to set queue mask before set kiq resource.
The value of queue mask must suitablt for the designated form.
Signed-off-by: Likun Gao <Likun.Gao@amd.com>
Reviewed-by: Huang Rui <ray.huang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Rename hardware IP name from UVD to VCN to reduce confusion.
Hardware IP name UVD and VCN are equivalent for VCN2.5 asics.
Use name VCN for future VCN based asics.
V2: update description
V3: rebase
Signed-off-by: Boyuan Zhang <boyuan.zhang@amd.com>
Reviewed-by: James Zhu <james.zhu@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
gdb uses ptrace() to peek and poke bytes of the target's address space.
The driver must implement an vm_ops->access() handler or else gdb will
be unable to inspect the pointer and report it as out-of-bounds.
Worse than useless as it causes immediate suspicion of the valid GTT
pointer, distracting the poor programmer trying to find his bug.
v2: Write-protect readonly objects (Matthew).
Testcase: igt/gem_mmap_gtt/ptrace
Testcase: igt/gem_mmap_offset/ptrace
Suggested-by: Kristian H. Kristensen <hoegsberg@google.com>
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Matthew Auld <matthew.auld@intel.com>
Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Cc: Maciej Patelczyk <maciej.patelczyk@intel.com>
Cc: Kristian H. Kristensen <hoegsberg@google.com>
Reviewed-by: Matthew Auld <matthew.auld@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20200501145120.18830-1-chris@chris-wilson.co.uk
In order to allow userspace to rely on timeslicing to reorder their
batches, we must support preemption of those user batches. Declare
timeslicing as an explicit property that is a combination of having the
kernel support and HW support.
Suggested-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Fixes: 8ee36e048c ("drm/i915/execlists: Minimalistic timeslicing")
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20200501122249.12417-1-chris@chris-wilson.co.uk
Kerneldoc comments should describe all function parameters.
Signed-off-by: Felix Kuehling <Felix.Kuehling@amd.com>
Reviewed-by: Kent Russell <kent.russell@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Corrected two function names. Added a missing space.
Signed-off-by: Felix Kuehling <Felix.Kuehling@amd.com>
Reviewed-by: Kent Russell <kent.russell@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
With previous golden settings, compute task can't use
reserved LDS (32K) on CU0 and CU1. On 64K LDS system,
if compute work group allocate more than 32K LDS, then
it can't be dispatched to CU0 and CU1 because of the
reservation. This enables compute task to use reserved
LDS on CU0 and CU1.
Signed-off-by: Oak Zeng <Oak.Zeng@amd.com>
Acked-by: Alex Deucher <alexander.deucher@amd.com>
Reviewed-by: Felix Kuehling <felix.kuehling@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Felix Kuehling <Felix.Kuehling@amd.com>
PCI domain has moved to 32-bits to accommodate virtualization,
so a 32-bit integer is exposed for domain to reflect this change.
Domain can be found in here:
/sys/class/kfd/kfd/topology/nodes/X/properties
Where X is the card number
Signed-off-by: Ori Messinger <ori.messinger@amd.com>
Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com>
Signed-off-by: Felix Kuehling <Felix.Kuehling@amd.com>
Acked-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Fixes gcc '-Wunused-but-set-variable' warning:
drivers/gpu/drm/amd/display/dc/dcn10/dcn10_stream_encoder.c:1277:11: warning: variable ‘speakers’ set but not used [-Wunused-but-set-variable]
It is introduced by commit 0c41891c81 ("drm/amd/display:
Refactor stream encoder for HW review"), but never used, so remove it.
Reported-by: Hulk Robot <hulkci@huawei.com>
Signed-off-by: Zheng Bin <zhengbin13@huawei.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Fixes gcc '-Wunused-but-set-variable' warning:
drivers/gpu/drm/amd/display/dc/dce/dce_stream_encoder.c:1339:11: warning: variable ‘speakers’ set but not used [-Wunused-but-set-variable]
It is introduced by commit 4562236b3b ("drm/amd/dc:
Add dc display driver (v2)"), but never used, so remove it.
Reported-by: Hulk Robot <hulkci@huawei.com>
Signed-off-by: Zheng Bin <zhengbin13@huawei.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Fixes gcc '-Wunused-but-set-variable' warning:
drivers/gpu/drm/amd/display/dc/dcn10/dcn10_dpp.c:137:11: warning: variable ‘pixel_width’ set but not used [-Wunused-but-set-variable]
It is introduced by commit 70ccab6040 ("drm/amdgpu/display:
Add core dc support for DCN"), but never used, so remove it.
Reported-by: Hulk Robot <hulkci@huawei.com>
Signed-off-by: Zheng Bin <zhengbin13@huawei.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Fixes gcc '-Wunused-but-set-variable' warning:
drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c:8665:13: warning: variable ‘dc’ set but not used [-Wunused-but-set-variable]
It is not used since commit d1ebfdd8d0 ("drm/amd/display:
Unify psr feature flags")
Reported-by: Hulk Robot <hulkci@huawei.com>
Signed-off-by: Zheng Bin <zhengbin13@huawei.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Fixes gcc '-Wunused-but-set-variable' warning:
drivers/gpu/drm/amd/amdgpu/vcn_v2_5.c:1170:39: warning: variable ‘direct_poll’ set but not used [-Wunused-but-set-variable]
It is introduced by commit 7daaebfea5 ("drm/amdgpu:
add VCN2.5 sriov start for Arctrus"), but never used,
so remove it.
Reviewed-by: Christian König <christian.koenig@amd.com>
Reported-by: Hulk Robot <hulkci@huawei.com>
Signed-off-by: Zheng Bin <zhengbin13@huawei.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Fixes gcc '-Wunused-but-set-variable' warning:
drivers/gpu/drm/amd/amdgpu/vcn_v2_0.c:1848:39: warning: variable ‘direct_poll’ set but not used [-Wunused-but-set-variable]
It is introduced by commit dd26858a9c ("drm/amdgpu:
implement initialization part on VCN2.0 for SRIOV"), but never used,
so remove it.
Reviewed-by: Christian König <christian.koenig@amd.com>
Reported-by: Hulk Robot <hulkci@huawei.com>
Signed-off-by: Zheng Bin <zhengbin13@huawei.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Fixes gcc '-Wunused-but-set-variable' warning:
drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c:1211:26: warning: variable ‘priority’ set but not used
It is not used since commit 33abcb1f5a ("drm/amdgpu:
set compute queue priority at mqd_init")
Reviewed-by: Christian König <christian.koenig@amd.com>
Reported-by: Hulk Robot <hulkci@huawei.com>
Signed-off-by: Zheng Bin <zhengbin13@huawei.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
The dcn20_validate_bandwidth function would have code touching the
incorrect registers emitted outside of the boundaries of the
DC_FP_START/END macros, at least on ppc64le. Work around the
problem by wrapping the whole function instead.
Signed-off-by: Daniel Kolesa <daniel@octaforge.org>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Once should generally be enough for diagnosing what lead up to it,
repeating it over and over can be pretty annoying.
Signed-off-by: Michel Dänzer <mdaenzer@redhat.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Maybe no need to check ws before kmalloc, kmalloc will check
itself, kmalloc`s logic is if ptr is NULL, kmalloc will just
return
Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Bernard Zhao <bernard@vivo.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
RAS TA shall notify driver with flags of error specifics
Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com>
Signed-off-by: John Clements <john.clements@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Update interface to match latest TA
Organized input/output structures to better maintain backward compatiblity in the future
Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com>
Signed-off-by: John Clements <john.clements@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Parse return status from TA to determine error severity
Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com>
Signed-off-by: John Clements <john.clements@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
At default, the autosuspend delay of audio controller is 3S. If the
gpu reset is triggered within 3S(after audio controller idle),
the audio controller may be unable into suspended state. Then
the sudden gpu reset will cause some audio errors. The change
here is targeted to resolve this.
However if the audio controller is in use when the gpu reset
triggered, this change may be still not enough to put the
audio controller into suspend state. Under this case, the
gpu reset will still proceed but there will be a warning
message printed("failed to suspend display audio").
V2: limit this for BACO and mode1 reset only
V3: try 1st to use pm_runtime_autosuspend_expiration() to
query how much time is left. Use default setting on
failure
Signed-off-by: Evan Quan <evan.quan@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
The struct member 'asic_setup' was assigned twice, let's remove one:
static const struct pp_hwmgr_func smu10_hwmgr_funcs = {
......
.asic_setup = NULL,
......
.asic_setup = smu10_setup_asic_task,
......
};
This fixes the following coccicheck warning:
drivers/gpu/drm/amd/powerplay/hwmgr/smu10_hwmgr.c:1357:52-53:
asic_setup: first occurrence line 1360, second occurrence line 1388
Signed-off-by: Jason Yan <yanaijie@huawei.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Fix the following coccicheck warning:
drivers/gpu/drm/amd/display/dc/dcn21/dcn21_init.c:31:51-52:
exit_optimized_pwr_state: first occurrence line 86, second occurrence
line 92
drivers/gpu/drm/amd/display/dc/dcn21/dcn21_init.c:31:51-52:
optimize_pwr_state: first occurrence line 85, second occurrence line 91
drivers/gpu/drm/amd/display/dc/dcn21/dcn21_init.c:31:51-52:
set_cursor_attribute: first occurrence line 71, second occurrence line
89
drivers/gpu/drm/amd/display/dc/dcn21/dcn21_init.c:31:51-52:
set_cursor_position: first occurrence line 70, second occurrence line 88
drivers/gpu/drm/amd/display/dc/dcn21/dcn21_init.c:31:51-52:
set_cursor_sdr_white_level: first occurrence line 72, second occurrence
line 90
Signed-off-by: Jason Yan <yanaijie@huawei.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Track GPU VRAM usage on a per process basis and report it through
sysfs.
Signed-off-by: Mukul Joshi <mukul.joshi@amd.com>
Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
In the file drm_dp_helper.h we have a macro named
DP_DSC_THROUGHPUT_MODE_{0,1}_UPSUPPORTED, the correct name should be
DP_DSC_THROUGHPUT_MODE_{0,1}_UNSUPPORTED. This commits adjusts this typo
in the header file and in other places that attempt to access this
macro.
Reviewed-by: Harry Wentland <harry.wentland@amd.com>
Signed-off-by: Rodrigo Siqueira <Rodrigo.Siqueira@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20200429184142.1867987-1-Rodrigo.Siqueira@amd.com
Since the introduction of 'soft-rc6', we aim to park the device quickly
and that results in frequent idling of the whole device. Currently upon
idling we free the batch buffer pool, and so this renders the cache
ineffective for many workloads. If we want to have an effective cache of
recently allocated buffers available for reuse, we need to decouple that
cache from the engine powermanagement and make it timer based. As there
is no reason then to keep it within the engine (where it once made
retirement order easier to track), we can move it up the hierarchy to the
owner of the memory allocations.
v2: Hook up to debugfs/drop_caches to clear the cache on demand.
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Maarten Lankhorst <maarten.lankhorst@linux.intel.com>
Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20200430111819.10262-2-chris@chris-wilson.co.uk
The struct member 'set_config' was assigned twice:
static const struct drm_crtc_funcs ast_crtc_funcs = {
.reset = ast_crtc_reset,
.set_config = drm_crtc_helper_set_config,
......
.set_config = drm_atomic_helper_set_config,
......
};
Since the second one is which we use now in fact, we can remove the
first one.
This fixes the following coccicheck warning:
drivers/gpu/drm/ast/ast_mode.c:932:50-51: set_config: first occurrence
line 934, second occurrence line 937
Signed-off-by: Jason Yan <yanaijie@huawei.com>
Signed-off-by: Thomas Zimmermann <tzimmermann@suse.de>
Reviewed-by: Thomas Zimmermann <tzimmermann@suse.de>
Link: https://patchwork.freedesktop.org/patch/msgid/20200429141010.8445-1-yanaijie@huawei.com
Extend coverage of the blitter client by exercising conversion to and
from tiled sources. In the process we perform spot checks to verify that
the tiling/detiling is being applied correctly, along with position
invariance of the tiling parameters.
Signed-off-by: Zbigniew Kempczyński <zbigniew.kempczynski@intel.com>
Cc: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Link: https://patchwork.freedesktop.org/patch/msgid/20200430064957.14942-1-chris@chris-wilson.co.uk
We reduced the clocks slowly after a boost event based on the
observation that the smoothness of animations suffered. However, since
reducing the evalution intervals, we should be able to respond to the
rapidly fluctuating workload of a simple desktop animation and so
restore the more aggressive downclocking.
References: 2a8862d2f3 ("drm/i915: Reduce the RPS shock")
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Andi Shyti <andi.shyti@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20200429205446.3259-6-chris@chris-wilson.co.uk
We treat parking as a manual RPS timeout event, and downclock the GPU
for the next unpark and batch execution. However, having restored the
aggressive downclocking and observed that we have very light workloads
whose only interaction is through the manual parking events, carry over
the aggressive downclocking to the fake RPS events.
References: 21abf0bf16 ("drm/i915/gt: Treat idling as a RPS downclock event")
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Andi Shyti <andi.shyti@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20200429205446.3259-5-chris@chris-wilson.co.uk
As with the realisation for soft-rc6, we respond to idling the engines
within microseconds, far faster than the response times for HW RC6 and
RPS. Furthermore, our fast parking upon idle, prevents HW RPS from
running for many desktop workloads, as the RPS evaluation intervals are
on the order of tens of milliseconds, but the typical workload is just a
couple of milliseconds, but yet we still need to determine the best
frequency for user latency versus power.
Recognising that the HW evaluation intervals are a poor fit, and that
they were deprecated [in bspec at least] from gen10, start to wean
ourselves off them and replace the EI with a timer and our accurate
busy-stats. The principle benefit of manually evaluating RPS intervals
is that we can be more responsive for better performance and powersaving
for both spiky workloads and steady-state.
Closes: https://gitlab.freedesktop.org/drm/intel/-/issues/1698
Fixes: 98479ada42 ("drm/i915/gt: Treat idling as a RPS downclock event")
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Mika Kuoppala <mika.kuoppala@linux.intel.com>
Cc: Andi Shyti <andi.shyti@intel.com>
Reviewed-by: Andi Shyti <andi.shyti@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20200429205446.3259-4-chris@chris-wilson.co.uk
In the near future, we will utilize the busy-stats on each engine to
approximate the C0 cycles of each, and use that as an input to a manual
RPS mechanism. That entails having busy-stats always enabled and so we
can remove the enable/disable routines and simplify the pmu setup. As a
consequence of always having the stats enabled, we can also show the
current active time via sysfs/engine/xcs/active_time_ns.
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20200429205446.3259-1-chris@chris-wilson.co.uk
Some older versions of gcc badly optimize code that passes
an inline function argument into another function by reference,
causing huge stack usage:
drivers/gpu/drm/bridge/tc358768.c: In function 'tc358768_bridge_pre_enable':
drivers/gpu/drm/bridge/tc358768.c:840:1: error: the frame size of 2256 bytes is larger than 2048 bytes [-Werror=frame-larger-than=]
Use a temporary variable as a workaround and add a comment pointing
to the gcc bug.
Fixes: ff1ca6397b ("drm/bridge: Add tc358768 driver")
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Reviewed-by: Tomi Valkeinen <tomi.valkeinen@ti.com>
Signed-off-by: Sam Ravnborg <sam@ravnborg.org>
Link: https://patchwork.freedesktop.org/patch/msgid/20200428215408.4111675-1-arnd@arndb.de
We need to keep the default context state around to instantiate new
contexts (aka golden rendercontext), and we also keep it pinned while
the engine is active so that we can quickly reset a hanging context.
However, the default contexts are large enough to merit keeping in
swappable memory as opposed to kernel memory, so we store them inside
shmemfs. Currently, we use the normal GEM objects to create the default
context image, but we can throw away all but the shmemfs file.
This greatly simplifies the tricky power management code which wants to
run underneath the normal GT locking, and we definitely do not want to
use any high level objects that may appear to recurse back into the GT.
Though perhaps the primary advantage of the complex GEM object is that
we aggressively cache the mapping, but here we are recreating the
vm_area everytime time we unpark. At the worst, we add a lightweight
cache, but first find a microbenchmark that is impacted.
Having started to create some utility functions to make working with
shmemfs objects easier, we can start putting them to wider use, where
GEM objects are overkill, such as storing persistent error state.
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Matthew Auld <matthew.auld@intel.com>
Cc: Ramalingam C <ramalingam.c@intel.com>
Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Reviewed-by: Matthew Auld <matthew.auld@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20200429172429.6054-1-chris@chris-wilson.co.uk
Let's just calculate the hsync rate on demand. No point in wasting
space storing it and risking the cached value getting out of sync
with reality.
v2: Move drm_mode_hsync() next to its only users
Drop the TODO
Reviewed-by: Sam Ravnborg <sam@ravnborg.org>
Reviewed-by: Emil Velikov <emil.velikov@collabora.com> #v1
Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20200428171940.19552-2-ville.syrjala@linux.intel.com
If intel_context_create() fails then it leads to an error pointer
dereference. I shuffled things around to make error handling easier.
Fixes: 1dd47b54ba ("drm/i915: Add live selftests for indirect ctx batchbuffers")
Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com>
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Andi Shyti <andi.shyti@intel.com>
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Link: https://patchwork.freedesktop.org/patch/msgid/20200429132425.GE815283@mwanda
When building with clang + -Wuninitialized:
drivers/gpu/drm/i915/gt/debugfs_gt_pm.c:407:7: warning: variable
'rpcurupei' is uninitialized when used here [-Wuninitialized]
rpcurupei,
^~~~~~~~~
drivers/gpu/drm/i915/gt/debugfs_gt_pm.c:304:16: note: initialize the
variable 'rpcurupei' to silence this warning
u32 rpcurupei, rpcurup, rpprevup;
^
= 0
1 warning generated.
rpupei is assigned twice; based on the second argument to
intel_uncore_read, it seems this one should have been assigned to
rpcurupei.
Fixes: 9c878557b1 ("drm/i915/gt: Use the RPM config register to determine clk frequencies")
Link: https://github.com/ClangBuiltLinux/linux/issues/1016
Signed-off-by: Nathan Chancellor <natechancellor@gmail.com>
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Link: https://patchwork.freedesktop.org/patch/msgid/20200429030051.920203-1-natechancellor@gmail.com
The presumption is that by using a circular counter that is twice as
large as the maximum ELSP submission, we would never reuse the same CCID
for two inflight contexts.
However, if we continually preempt an active context such that it always
remains inflight, it can be resubmitted with an arbitrary number of
paired contexts. As each of its paired contexts will use a new CCID,
eventually it will wrap and submit two ELSP with the same CCID.
Rather than use a simple circular counter, switch over to a small bitmap
of inflight ids so we can avoid reusing one that is still potentially
active.
Closes: https://gitlab.freedesktop.org/drm/intel/-/issues/1796
Fixes: 2935ed5339 ("drm/i915: Remove logical HW ID")
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Mika Kuoppala <mika.kuoppala@linux.intel.com>
Cc: <stable@vger.kernel.org> # v5.5+
Reviewed-by: Mika Kuoppala <mika.kuoppala@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20200428184751.11257-2-chris@chris-wilson.co.uk
The bspec is confusing on the nature of the upper 32bits of the LRC
descriptor. Once upon a time, it said that it uses the upper 32b to
decide if it should perform a lite-restore, and so we must ensure that
each unique context submitted to HW is given a unique CCID [for the
duration of it being on the HW]. Currently, this is achieved by using
a small circular tag, and assigning every context submitted to HW a
new id. However, this tag is being cleared on repinning an inflight
context such that we end up re-using the 0 tag for multiple contexts.
To avoid accidentally clearing the CCID in the upper 32bits of the LRC
descriptor, split the descriptor into two dwords so we can update the
GGTT address separately from the CCID.
Closes: https://gitlab.freedesktop.org/drm/intel/-/issues/1796
Fixes: 2935ed5339 ("drm/i915: Remove logical HW ID")
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Mika Kuoppala <mika.kuoppala@linux.intel.com>
Cc: <stable@vger.kernel.org> # v5.5+
Reviewed-by: Mika Kuoppala <mika.kuoppala@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20200428184751.11257-1-chris@chris-wilson.co.uk
The current GWS usage model will only allows a single GWS-enabled
process to be active on the GPU at once. This ensures that a
barrier-using kernel gets a known amount of GPU hardware, to
prevent deadlock due to inability to go beyond the GWS barrier.
The HWS watches how many GWS entries are assigned to each process,
and goes into over-subscription mode when two processes need more
than the 64 that are available. The current KFD method for working
with this is to allocate all 64 GWS entries to each GWS-capable
process.
When more than one GWS-enabled process is in the runlist, we must
make sure the runlist is in over-subscription mode, so that the
HWS gets a chained RUN_LIST packet and continues scheduling
kernels.
Signed-off-by: Joseph Greathouse <Joseph.Greathouse@amd.com>
Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com>
Signed-off-by: Felix Kuehling <Felix.Kuehling@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Rather than only enabling GWS support based on the hws_gws_support
modparm, also check whether the GPU's HWS firmware supports GWS.
Leave the old modparm in place in case users want to test GWS
on GPUs not yet in the support list.
v2: fix broken syntax from the first patch.
Signed-off-by: Joseph Greathouse <Joseph.Greathouse@amd.com>
Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Add a new kfd ioctl to allocate queue GWS. Queue
GWS is released on queue destroy.
v2: re-introduce this API with the following fixes squashed in:
- drm/amdkfd: fix null pointer dereference on dev
- drm/amdkfd: Return proper error code for gws alloc API
- drm/amdkfd: Remove GPU ID in GWS queue creation
Signed-off-by: Oak Zeng <Oak.Zeng@amd.com>
Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Pass unlocked flag value to amdgpu_vm_update_params.unlocked
struct member at amdgpu_vm_bo_update_mapping.
Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Sierra <alex.sierra@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
For HMM support we need the ability to invalidate PTEs from
a MM callback where we can't lock the root PD.
Add a new flag to better support this instead of assuming
that all invalidation updates are unlocked.
Signed-off-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
To avoid confusion with direct ring submissions rename bottom
of pipe VM table changes to immediate updates.
Signed-off-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Fix the coding style, move and rename the definitions to
better match what they are supposed to be doing.
Signed-off-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
We still need to add the VM update fences to the root PD.
So make sure to never sync to those implicitely.
Signed-off-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
We don't support secure operation on compute rings at the
moment so reject them.
Reviewed-by: Andrey Grodzovsky <andrey.grodzovsky@amd.com>
Reviewed-by: Aaron Liu <aaron.liu@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
When the node is larger than 4GB we overrun the size calculation.
Fix this by correctly limiting the size to the window as well.
Signed-off-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
This should allow us to also support VRAM->GTT moves.
v2: fix missing vram_base_adjustment
Signed-off-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Huang Rui <ray.huang@amd.com>
Tested-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Cleanup amdgpu_ttm_copy_mem_to_mem by using fewer variables
for the same value.
Rename amdgpu_map_buffer to amdgpu_ttm_map_buffer, move it
to avoid the forward decleration, cleanup by moving the map
decission into the function and add some documentation.
No functional change.
v2: add some more cleanup suggested by Felix
Signed-off-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com>
Reviewed-by: Huang Rui <ray.huang@amd.com>
Tested-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Since commit "Move to a per-IB secure flag (TMZ)",
we've been seeing hangs in GFX. We need to send
FRAME CONTROL stop/start back-to-back, every time
we flip the TMZ flag. That is, when we transition
from TMZ to non-TMZ we have to send a stop with
TMZ followed by a start with non-TMZ, and
similarly for transitioning from non-TMZ into TMZ.
This patch implements this, thus fixing the GFX
hang.
v1 -> v2:
As suggested by Luben, and accept part of implemetation from this patch:
- Put "secure" closed to the loop and use optimization
- Change "secure" to bool again, and move "secure == -1" out of loop.
v3: Small fixes/optimizations.
Reported-and-Tested-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
Signed-off-by: Huang Rui <ray.huang@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Luben Tuikov <luben.tuikov@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Add fine-grained per-ASIC TMZ support.
At the moment TMZ support is experimental for all
ASICs which support it.
Signed-off-by: Luben Tuikov <luben.tuikov@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Swapping out encrypted BOs doesn't work because they can't change
their physical location without going through a bounce copy.
As a workaround disable evicting encrypted BOs to the system
domain for now.
Signed-off-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Tested-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
This way we should be at least able to move buffers from VRAM to GTT.
Signed-off-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Tested-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
This is necessary for TMZ handling during buffer moves and scanout.
Signed-off-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Tested-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
While the current amdgpu doesn't support TMZ, it will return the error if user
mode would like to allocate secure buffer.
v2: we didn't need this checking anymore.
v3: only print message once time.
Signed-off-by: Huang Rui <ray.huang@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
Acked-by: Nirmoy Das <Nirmoy.Das@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Move from a per-CS secure flag (TMZ) to a per-IB
secure flag.
Signed-off-by: Luben Tuikov <luben.tuikov@amd.com>
Reviewed-by: Huang Rui <ray.huang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
[Why]
Hubp needs to know whether a buffer is being scanned out from the trusted
memory zone or not.
[How]
Check for the TMZ flag on the amdgpu_bo and set the tmz_surface flag in
dc_plane_address accordingly.
Signed-off-by: Harry Wentland <harry.wentland@amd.com>
Signed-off-by: Bhawanpreet Lakha <Bhawanpreet.Lakha@amd.com>
Acked-by: Bhawanpreet Lakha <Bhawanpreet.Lakha@amd.com>
Reviewed-by: Nicholas Kazlauskas <nicholas.kazlauskas@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Implement an accessor of adev->tmz.enabled. Let not
code around access it as "if (adev->tmz.enabled)"
as the organization may change. Instead...
Recruit "bool amdgpu_is_tmz(adev)" to return
exactly this Boolean value. That is, this function
is now an accessor of an already initialized and
set adev and adev->tmz.
Add "void amdgpu_gmc_tmz_set(adev)" to check and
set adev->gmc.tmz_enabled at initialization
time. After which one uses "bool
amdgpu_is_tmz(adev)" to query whether adev
supports TMZ.
Also, remove circular header file include.
v2: Remove amdgpu_tmz.[ch] as requested.
v3: Move TMZ into GMC.
Signed-off-by: Luben Tuikov <luben.tuikov@amd.com>
Acked-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
The alignment should match the page size for secure buffer, so we didn't
configure it anymore.
Signed-off-by: Huang Rui <ray.huang@amd.com>
Reviewed-by: Aaron Liu <aaron.liu@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
This patch enables TMZ bit in FRAME_CONTROL for gfx10.
Signed-off-by: Aaron Liu <aaron.liu@amd.com>
Reviewed-by: Huang Rui <ray.huang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Enable sdma TMZ mode via setting TMZ bit in sdma copy pkt
for sdma v5.
Signed-off-by: Aaron Liu <aaron.liu@amd.com>
Reviewed-by: Huang Rui <ray.huang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Enable sdma TMZ mode via setting TMZ bit in sdma copy pkt
for sdma v4
Signed-off-by: Aaron Liu <aaron.liu@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Huang Rui <ray.huang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
This patch expands amdgpu_copy_buffer interface with tmz parameter.
Signed-off-by: Aaron Liu <aaron.liu@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Huang Rui <ray.huang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
This patch expands sdma copy_buffer interface with tmz parameter.
Signed-off-by: Aaron Liu <aaron.liu@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Huang Rui <ray.huang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
If a buffer object is secure, i.e. created with
AMDGPU_GEM_CREATE_ENCRYPTED, then the TMZ bit of
the PTEs that belong the buffer object should be
set.
v1: design and draft the skeletion of TMZ bits setting on PTEs (Alex)
v2: return failure once create secure BO on non-TMZ platform (Ray)
v3: amdgpu_bo_encrypted() only checks the BO (Luben)
v4: move TMZ flag setting into amdgpu_vm_bo_update (Christian)
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Reviewed-by: Huang Rui <ray.huang@amd.com>
Signed-off-by: Huang Rui <ray.huang@amd.com>
Signed-off-by: Luben Tuikov <luben.tuikov@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
Mark a job as secure, if and only if the command
submission flag has the secure flag set.
v2: fix the null job pointer while in vmid 0
submission.
v3: Context --> Command submission.
v4: filling cs parser with cs->in.flags
v5: move the job secure flag setting out of amdgpu_cs_submit()
Signed-off-by: Huang Rui <ray.huang@amd.com>
Signed-off-by: Luben Tuikov <luben.tuikov@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
This patch expands the context control function to support trusted flag while we
want to set command buffer in trusted mode.
Signed-off-by: Huang Rui <ray.huang@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Acked-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
This patch expands the emit_tmz function to support trusted flag while we want
to set command buffer in trusted mode.
Signed-off-by: Huang Rui <ray.huang@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Acked-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
This patch adds tmz bit in frame control pm4 packet, and it will used in future.
Signed-off-by: Huang Rui <ray.huang@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Add a function to check tmz capability with kernel parameter and ASIC type.
v2: use a per device tmz variable instead of global amdgpu_tmz.
v3: refine the comments for the function. (Luben)
v4: add amdgpu_tmz.c/h for future use.
Signed-off-by: Huang Rui <ray.huang@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
This patch to add amdgpu_tmz structure which stores all tmz related fields.
Signed-off-by: Huang Rui <ray.huang@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
This patch adds tmz parameter to enable/disable
the feature in the amdgpu kernel module. Nomally,
by default, it should be auto (rely on the
hardware capability).
But right now, it need to set "off" to avoid
breaking other developers' work because it's not
totally completed.
Will set "auto" till the feature is stable and
completely verified.
v2: add "auto" option for future use.
Signed-off-by: Huang Rui <ray.huang@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Define the TMZ (encryption) bit in the page table entry (PTE) for
Raven and newer asics.
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Reviewed-by: Huang Rui <ray.huang@amd.com>
[Why]
Current locking scheme for cursor can result in a flip missing
its vsync, deferring it for one or more vsyncs. Result is a
potential for stuttering when cursor is moved.
[How]
Use cursor update lock so that flips are not blocked while cursor
is being programmed.
Signed-off-by: Aric Cyr <aric.cyr@amd.com>
Reviewed-by: Nicholas Kazlauskas <Nicholas.Kazlauskas@amd.com>
Acked-by: Aurabindo Pillai <aurabindo.pillai@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
[Why&How]
modules/color calculates various colour operations which are translated
to abstracted HW. DCE 5-12 had almost no important changes, but
starting with DCN1, every new generation comes with fairly major
differences in color pipeline.
We would hack it with some DCN checks, but a better approach is to
abstract color pipe capabilities so modules/DM can decide mapping to
HW block based on logical capabilities,
Signed-off-by: Krunoslav Kovac <Krunoslav.Kovac@amd.com>
Reviewed-by: Aric Cyr <Aric.Cyr@amd.com>
Acked-by: Anthony Koo <Anthony.Koo@amd.com>
Acked-by: Aurabindo Pillai <aurabindo.pillai@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
[Why & How]
Add set backlight to hw sequencer, dmu communication will
be handled in hw sequencer for new asics.
Signed-off-by: Yongqiang Sun <yongqiang.sun@amd.com>
Reviewed-by: Anthony Koo <Anthony.Koo@amd.com>
Acked-by: Aurabindo Pillai <aurabindo.pillai@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
[Why]
For debugging, it can be useful to be able to modify the dummy
p-state latency, this will make it easier to do so.
Signed-off-by: Joshua Aberback <joshua.aberback@amd.com>
Reviewed-by: Wesley Chalmers <Wesley.Chalmers@amd.com>
Acked-by: Aurabindo Pillai <aurabindo.pillai@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
[Why]
We read memory that we shouldn't be touching if the struct isn't
a full union dmub_rb_cmd.
[How]
Fix up all the callers and functions that take in the dmub_cmd_header
to use the dmub_rb_cmd instead.
Signed-off-by: Nicholas Kazlauskas <nicholas.kazlauskas@amd.com>
Reviewed-by: Tony Cheng <Tony.Cheng@amd.com>
Acked-by: Aurabindo Pillai <aurabindo.pillai@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
[WHY]
The downspread percentage was copied over from a previous version
of the display_mode_lib spreadsheet. This value has been updated,
and the previous value is too high to allow for such modes as
4K120hz. The new value is sufficient for such modes.
[HOW]
Update the value in dcn21_resource to match the spreadsheet.
Signed-off-by: Sung Lee <sung.lee@amd.com>
Reviewed-by: Yongqiang Sun <yongqiang.sun@amd.com>
Acked-by: Aurabindo Pillai <aurabindo.pillai@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
[Why]
Fixes the following scenario:
- Flip has been prepared sometime during the frame, update pending
- Cursor update happens right when VUPDATE would happen
- OPTC lock acquired, VUPDATE is blocked until next frame
- Flip is delayed potentially infinitely
With the igt@kms_cursor_legacy cursor-vs-flip-legacy test we can
observe nearly *13* frames of delay for some flips on Navi.
[How]
Apply the Raven workaround generically. When close enough to VUPDATE
block cursor updates from occurring from the dc_stream_set_cursor_*
helpers.
This could perhaps be a little smarter by checking if there were
pending updates or flips earlier in the frame on the HUBP side before
applying the delay, but this should be fine for now.
This fixes the kms_cursor_legacy test.
Signed-off-by: Nicholas Kazlauskas <nicholas.kazlauskas@amd.com>
Reviewed-by: Aric Cyr <Aric.Cyr@amd.com>
Acked-by: Aurabindo Pillai <aurabindo.pillai@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
[WHY & HOW]
Viewport limit was set to 16 pixels due to an issue with MPO
on small viewports. This restriction does not apply and the
viewport limit can now be lowered.
Signed-off-by: Sung Lee <sung.lee@amd.com>
Reviewed-by: Dmytro Laktyushkin <Dmytro.Laktyushkin@amd.com>
Acked-by: Aurabindo Pillai <aurabindo.pillai@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
[WHY & HOW]
If building scaling parameters fails, validation
should also fail.
Signed-off-by: Sung Lee <sung.lee@amd.com>
Reviewed-by: Dmytro Laktyushkin <Dmytro.Laktyushkin@amd.com>
Acked-by: Aurabindo Pillai <aurabindo.pillai@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Currently RN SOC bounding box update assumes we will get at least
2 clock states from SMU. This isn't always true and because of special
casing on first clock state we end up with low disp, dpp, dsc and phy
clocks.
This change removes the special casing allowing the first state to
acquire correct clocks.
Signed-off-by: Dmytro Laktyushkin <Dmytro.Laktyushkin@amd.com>
Reviewed-by: Eric Yang <eric.yang2@amd.com>
Acked-by: Aurabindo Pillai <aurabindo.pillai@amd.com>
Acked-by: Tony Cheng <Tony.Cheng@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Check before programming the register since it isn't present on
all IPs using this code.
Signed-off-by: Dmytro Laktyushkin <Dmytro.Laktyushkin@amd.com>
Reviewed-by: Eric Bernstein <Eric.Bernstein@amd.com>
Acked-by: Aurabindo Pillai <aurabindo.pillai@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
[Why]
Some include paths don't need to have relative paths
And some types missing
[How]
make some changes to headers and modify include path
Signed-off-by: Anthony Koo <Anthony.Koo@amd.com>
Reviewed-by: Tony Cheng <Tony.Cheng@amd.com>
Acked-by: Aurabindo Pillai <aurabindo.pillai@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
[Why]
The new metadata offset is located at the end of the firmware binary
without any additional padding.
Firmware state is currently larger than 1024 bytes so new firmware state
will hang when trying to access any data above 1024 bytes.
[How]
Specify the correct offset based on legacy vs new loading method.
Signed-off-by: Nicholas Kazlauskas <nicholas.kazlauskas@amd.com>
Reviewed-by: Yongqiang Sun <yongqiang.sun@amd.com>
Acked-by: Aurabindo Pillai <aurabindo.pillai@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
[WHY]
If mode is not supported, pipe split should not be disabled.
This may cause more modes to fail.
[HOW]
Check for mode support before disabling pipe split.
This commit was previously reverted as it was thought to
have problems, but those issues have been resolved.
Signed-off-by: Sung Lee <sung.lee@amd.com>
Reviewed-by: Yongqiang Sun <yongqiang.sun@amd.com>
Acked-by: Aurabindo Pillai <aurabindo.pillai@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Since the VExpress setup in pl111_vexpress.c is now just a single
function call, let's move it into pl111_versatile.c and we can further
simplify pl111_versatile_init() by moving the other pieces for VExpress
into pl111_vexpress_clcd_init().
Cc: Eric Anholt <eric@anholt.net>
Cc: dri-devel@lists.freedesktop.org
Acked-by: Sam Ravnborg <sam@ravnborg.org>
Reviewed-by: Linus Walleij <linus.walleij@linaro.org>
Signed-off-by: Rob Herring <robh@kernel.org>
Link: https://patchwork.freedesktop.org/patch/msgid/20200409013947.12667-4-robh@kernel.org
The init VExpress variants currently instantiates a 'muxfpga' driver for
the sole purpose of getting a regmap for it. There's no reason to
instantiate a driver and doing so just complicates things. The muxfpga
driver also isn't unregistered properly on module unload. Let's
just simplify all this this by just calling
devm_regmap_init_vexpress_config() directly.
Cc: Eric Anholt <eric@anholt.net>
Cc: dri-devel@lists.freedesktop.org
Acked-by: Sam Ravnborg <sam@ravnborg.org>
Reviewed-by: Linus Walleij <linus.walleij@linaro.org>
Signed-off-by: Rob Herring <robh@kernel.org>
Link: https://patchwork.freedesktop.org/patch/msgid/20200409013947.12667-3-robh@kernel.org
This fixes GPU hangs due to cache coherency issues.
Bump the driver version. Split out from the original patch.
Signed-off-by: Marek Olšák <marek.olsak@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
Tested-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
This fixes GPU hangs due to cache coherency issues.
v2: Split the version bump to a separate patch
Signed-off-by: Marek Olšák <marek.olsak@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
Tested-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Wait for tiles off after unpause to fix transcode timeout issue.
It is a work around.
Signed-off-by: James Zhu <James.Zhu@amd.com>
Reviewed-by: Leo Liu <leo.liu@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
hwmgr->pm_en is initialized at hwmgr_hw_init.
during amdgpu_device_init, there is amdgpu_asic_reset that calls to
soc15_asic_reset (for V320 usecase, Vega10 asic), in which:
1) soc15_asic_reset_method calls to pp_get_asic_baco_capability (pm_en)
2) soc15_asic_baco_reset calls to pp_set_asic_baco_state (pm_en)
pm_en is used in the above two cases while it has not yet been initialized
So avoid using pm_en in the above two functions for V320 passthrough.
Reviewed-by: Evan Quan <evan.quan@amd.com>
Signed-off-by: Tiecheng Zhou <Tiecheng.Zhou@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
This reverts commit c520787623.
The commit being reverted changed the wrong place, it should have
changed in func get_asic_baco_capability.
Reviewed-by: Evan Quan <evan.quan@amd.com>
Signed-off-by: Tiecheng Zhou <Tiecheng.Zhou@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Some processes, such as systemd, are only polling for EPOLLERR|EPOLLHUP.
As drm_file uses unkeyed wakeups, such a poll receives many spurious
wakeups from uninteresting events.
Use keyed wakeups to allow the wakeup target to more efficiently discard
these uninteresting events.
Signed-off-by: Kenny Levinsen <kl@kl.wtf>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Link: https://patchwork.freedesktop.org/patch/msgid/20200424145103.3048-1-kl@kl.wtf
In order to surface the ASIC revision to user level, we want
to put it into the HSA topology. This can be because different
ASIC revisions may require user-level software to do different
things (e.g. patch code for things that are changed in later
hardware revisions).
The ASIC revision from the hardware is maximum of 4 bits at this
time, so put it into 4 of the open bits in the HSA capability.
Then user-level software can use this capability information to
know -- for each ASIC -- what revision-based things must be done.
Signed-off-by: Joseph Greathouse <Joseph.Greathouse@amd.com>
Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
This is only needed for hotpluggable connectors set up after
drm_dev_register().
Reviewed-by: Thomas Zimemrmann <tzimmermann@suse.de>
Acked-by: Sam Ravnborg <sam@ravnborg.org>
Signed-off-by: Daniel Vetter <daniel.vetter@intel.com>
Cc: Dave Airlie <airlied@redhat.com>
Cc: Thomas Zimmermann <tzimmermann@suse.de>
Cc: Gerd Hoffmann <kraxel@redhat.com>
Cc: Daniel Vetter <daniel.vetter@ffwll.ch>
Cc: Sam Ravnborg <sam@ravnborg.org>
Cc: Emil Velikov <emil.velikov@collabora.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20200415074034.175360-59-daniel.vetter@ffwll.ch
Upcasting using a container_of macro is more typesafe, faster and
easier for the compiler to optimize.
Acked-by: Eric Anholt <eric@anholt.net>
Acked-by: Sam Ravnborg <sam@ravnborg.org>
Signed-off-by: Daniel Vetter <daniel.vetter@intel.com>
Cc: Dave Airlie <airlied@redhat.com>
Cc: Gerd Hoffmann <kraxel@redhat.com>
Cc: Daniel Vetter <daniel.vetter@ffwll.ch>
Cc: "Noralf Trønnes" <noralf@tronnes.org>
Cc: Sam Ravnborg <sam@ravnborg.org>
Cc: Eric Anholt <eric@anholt.net>
Cc: Thomas Zimmermann <tzimmermann@suse.de>
Cc: virtualization@lists.linux-foundation.org
Link: https://patchwork.freedesktop.org/patch/msgid/20200415074034.175360-37-daniel.vetter@ffwll.ch
Already using devm_drm_dev_init, so very simple replacment.
Acked-by: Noralf Trønnes <noralf@tronnes.org>
Acked-by: Sam Ravnborg <sam@ravnborg.org>
Signed-off-by: Daniel Vetter <daniel.vetter@intel.com>
Cc: Dave Airlie <airlied@redhat.com>
Cc: Gerd Hoffmann <kraxel@redhat.com>
Cc: Daniel Vetter <daniel.vetter@ffwll.ch>
Cc: Sam Ravnborg <sam@ravnborg.org>
Cc: "Noralf Trønnes" <noralf@tronnes.org>
Cc: Rob Herring <robh@kernel.org>
Cc: Thomas Zimmermann <tzimmermann@suse.de>
Cc: virtualization@lists.linux-foundation.org
Cc: Emil Velikov <emil.velikov@collabora.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20200415074034.175360-36-daniel.vetter@ffwll.ch
Komeda uses the component framework, which does open/close a new
devres group around all the bind callbacks. Which means we can use
devm_ functions for managing the drm_device cleanup, with leaking
stuff in case of deferred probes or other reasons to unbind
components, or the component_master.
Also note that this fixes a double-free in the probe unroll code, bot
drm_dev_put and kfree(kms) result in the kms allocation getting freed.
Aside: komeda_bind could be cleaned up a lot, devm_kfree is a bit
redundant. Plus I'm not clear on why there's suballocations for
mdrv->mdev and mdrv->kms. Plus I'm not sure the lifetimes are correct
with all that devm_kzalloc usage ... That structure layout is also the
reason why komeda still uses drm_device->dev_private and can't easily
be replaced with a proper container_of upcasting. I'm pretty sure that
there's endless amounts of hotunplug/hotremove bugs in there with all
the unprotected dereferencing of drm_device->dev_private.
Reviewed-by: James Qian Wang <james.qian.wang@arm.com>
Signed-off-by: Daniel Vetter <daniel.vetter@intel.com>
Cc: "James (Qian) Wang" <james.qian.wang@arm.com>
Cc: Liviu Dudau <liviu.dudau@arm.com>
Cc: Mihail Atanassov <mihail.atanassov@arm.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20200415074034.175360-33-daniel.vetter@ffwll.ch
Upcasting using a container_of macro is more typesafe, faster and
easier for the compiler to optimize.
v2: Move misplaced removal of double-assignment to this patch (Sam)
Reviewed-by: Linus Walleij <linus.walleij@linaro.org> (v1)
Acked-by: Sam Ravnborg <sam@ravnborg.org>
Signed-off-by: Daniel Vetter <daniel.vetter@intel.com>
Cc: Linus Walleij <linus.walleij@linaro.org>
Link: https://patchwork.freedesktop.org/patch/msgid/20200415074034.175360-30-daniel.vetter@ffwll.ch
Already using devm_drm_dev_init, so very simple replacment.
v2: Move misplaced double-assignement to next patch (Sam)
Acked-by: Sam Ravnborg <sam@ravnborg.org>
Reviewed-by: Linus Walleij <linus.walleij@linaro.org>
Signed-off-by: Daniel Vetter <daniel.vetter@intel.com>
Cc: Linus Walleij <linus.walleij@linaro.org>
Link: https://patchwork.freedesktop.org/patch/msgid/20200415074034.175360-29-daniel.vetter@ffwll.ch
Not used anymore since the switch to suspend/resume helpers.
Tested-by: Jyri Sarha <jsarha@ti.com>
Acked-by: Sam Ravnborg <sam@ravnborg.org>
Reviewed-by: Tomi Valkeinen <tomi.valkeinen@ti.com>
Signed-off-by: Daniel Vetter <daniel.vetter@intel.com>
Cc: Jyri Sarha <jsarha@ti.com>
Cc: Tomi Valkeinen <tomi.valkeinen@ti.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20200415074034.175360-26-daniel.vetter@ffwll.ch
Upcasting using a container_of macro is more typesafe, faster and
easier for the compiler to optimize.
Tested-by: Jyri Sarha <jsarha@ti.com>
Acked-by: Sam Ravnborg <sam@ravnborg.org>
Reviewed-by: Tomi Valkeinen <tomi.valkeinen@ti.com>
Signed-off-by: Daniel Vetter <daniel.vetter@intel.com>
Cc: Jyri Sarha <jsarha@ti.com>
Cc: Tomi Valkeinen <tomi.valkeinen@ti.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20200415074034.175360-25-daniel.vetter@ffwll.ch
Already using devm_drm_dev_init, so very simple replacment.
Tested-by: Jyri Sarha <jsarha@ti.com>
Acked-by: Sam Ravnborg <sam@ravnborg.org>
Reviewed-by: Tomi Valkeinen <tomi.valkeinen@ti.com>
Signed-off-by: Daniel Vetter <daniel.vetter@intel.com>
Cc: Jyri Sarha <jsarha@ti.com>
Cc: Tomi Valkeinen <tomi.valkeinen@ti.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20200415074034.175360-24-daniel.vetter@ffwll.ch
Upcasting using a container_of macro is more typesafe, faster and
easier for the compiler to optimize.
Acked-by: Sam Ravnborg <sam@ravnborg.org>
Reviewed-by: Hans de Goede <hdegoede@redhat.com>
Signed-off-by: Daniel Vetter <daniel.vetter@intel.com>
Cc: Hans de Goede <hdegoede@redhat.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20200415074034.175360-23-daniel.vetter@ffwll.ch
Already using devm_drm_dev_init, so very simple replacment.
Acked-by: David Lechner <david@lechnology.com>
Acked-by: Sam Ravnborg <sam@ravnborg.org>
Signed-off-by: Daniel Vetter <daniel.vetter@intel.com>
Cc: "Noralf Trønnes" <noralf@tronnes.org>
Cc: Sam Ravnborg <sam@ravnborg.org>
Cc: Daniel Vetter <daniel.vetter@ffwll.ch>
Cc: Eric Anholt <eric@anholt.net>
Cc: David Lechner <david@lechnology.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20200415074034.175360-19-daniel.vetter@ffwll.ch
Already using devm_drm_dev_init, so very simple replacment.
Aside: There was an oddity in the old code, we allocated priv but in
the error path we've freed priv->dbidev ...
Acked-by: David Lechner <david@lechnology.com>
Acked-by: Sam Ravnborg <sam@ravnborg.org>
Signed-off-by: Daniel Vetter <daniel.vetter@intel.com>
Cc: David Lechner <david@lechnology.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20200415074034.175360-14-daniel.vetter@ffwll.ch
We're mostly there already, just a handful of places that didn't use
the to_udl container_of cast. To make sure no new appear, don't set
->dev_private.
Reviewed-by: Thomas Zimmermann <tzimmermann@suse.de>
Acked-by: Sam Ravnborg <sam@ravnborg.org>
Signed-off-by: Daniel Vetter <daniel.vetter@intel.com>
Cc: Dave Airlie <airlied@redhat.com>
Cc: Sean Paul <sean@poorly.run>
Cc: Emil Velikov <emil.l.velikov@gmail.com>
Cc: Thomas Zimmermann <tzimmermann@suse.de>
Cc: Daniel Vetter <daniel.vetter@ffwll.ch>
Cc: Alexios Zavras <alexios.zavras@intel.com>
Cc: Laurent Pinchart <laurent.pinchart@ideasonboard.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: "José Roberto de Souza" <jose.souza@intel.com>
Cc: Sam Ravnborg <sam@ravnborg.org>
Cc: Gerd Hoffmann <kraxel@redhat.com>
Cc: Allison Randal <allison@lohutok.net>
Link: https://patchwork.freedesktop.org/patch/msgid/20200415074034.175360-13-daniel.vetter@ffwll.ch
With Thomas' patch to clean up fbdev init this is a rather standard
conversion to the new wrapper macro.
v2: Rebase on top of Thomas' patches to remove the return value from
drm_fbdev_generic_setup()
v3: Update commit message to reflect the reality of the rebased patch
(Sam)
Reviewed-by: Thomas Zimmermann <tzimmermann@suse.de>
Acked-by: Sam Ravnborg <sam@ravnborg.org>
Signed-off-by: Daniel Vetter <daniel.vetter@intel.com>
Cc: Noralf Trønnes <noralf@tronnes.org>
Cc: Dave Airlie <airlied@redhat.com>
Cc: Sean Paul <sean@poorly.run>
Cc: Thomas Zimmermann <tzimmermann@suse.de>
Cc: Daniel Vetter <daniel.vetter@ffwll.ch>
Cc: Emil Velikov <emil.l.velikov@gmail.com>
Cc: Sam Ravnborg <sam@ravnborg.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Link: https://patchwork.freedesktop.org/patch/msgid/20200415074034.175360-12-daniel.vetter@ffwll.ch
We already have it in v3d_dev->drm.dev with zero additional pointer
chasing. Personally I don't like duplicated pointers like this
because:
- reviewers need to check whether the pointer is for the same or
different objects if there's multiple
- compilers have an easier time too
To avoid having to pull in some big headers I implemented the casting
function as a macro instead of a static inline. Typechecking thanks to
container_of still assured.
But also a bit a bikeshed, so feel free to ignore.
v2: More parens for v3d_to_pdev macro (checkpatch)
Acked-by: Eric Anholt <eric@anholt.net>
Signed-off-by: Daniel Vetter <daniel.vetter@intel.com>
Cc: Eric Anholt <eric@anholt.net>
Link: https://patchwork.freedesktop.org/patch/msgid/20200415074034.175360-11-daniel.vetter@ffwll.ch
We already have it in v3d_dev->drm.dev with zero additional pointer
chasing. Personally I don't like duplicated pointers like this
because:
- reviewers need to check whether the pointer is for the same or
different objects if there's multiple
- compilers have an easier time too
But also a bit a bikeshed, so feel free to ignore.
Acked-by: Eric Anholt <eric@anholt.net>
Signed-off-by: Daniel Vetter <daniel.vetter@intel.com>
Cc: Eric Anholt <eric@anholt.net>
Link: https://patchwork.freedesktop.org/patch/msgid/20200415074034.175360-10-daniel.vetter@ffwll.ch
Aside from deleting all the cleanup code we're now also setting a name
for the pool
Acked-by: Sam Ravnborg <sam@ravnborg.org>
Reviewed-by: Hans de Goede <hdegoede@redhat.com>
Signed-off-by: Daniel Vetter <daniel.vetter@intel.com>
Cc: Hans de Goede <hdegoede@redhat.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20200415074034.175360-7-daniel.vetter@ffwll.ch
Allows us to drop the cleanup code on the floor.
Sam noticed in his review:
> With this change we avoid calling pci_disable_device()
> twise in case vbox_mm_init() fails.
> Once in vbox_hw_fini() and once in the error path.
v2: Include Sam's review remarks
v3: Fix typo in commit summary (Thomas Zimmermann)
Acked-by: Sam Ravnborg <sam@ravnborg.org>
Reviewed-by: Hans de Goede <hdegoede@redhat.com>
Signed-off-by: Daniel Vetter <daniel.vetter@intel.com>
Cc: Hans de Goede <hdegoede@redhat.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20200415074034.175360-6-daniel.vetter@ffwll.ch
We use the baseclass pattern here, so lets to the proper (and more
typesafe) upcasting.
Acked-by: Sam Ravnborg <sam@ravnborg.org>
Acked-by: Thomas Zimmermann <tzimmermann@suse.de>
Reviewed-by: Hans de Goede <hdegoede@redhat.com>
Signed-off-by: Daniel Vetter <daniel.vetter@intel.com>
Cc: Hans de Goede <hdegoede@redhat.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20200415074034.175360-5-daniel.vetter@ffwll.ch
Add a new macro helper to combine the usual init sequence in drivers,
consisting of a kzalloc + devm_drm_dev_init + drmm_add_final_kfree
triplet. This allows us to remove the rather unsightly
drmm_add_final_kfree from all currently merged drivers.
The kerneldoc is only added for this new function. Existing kerneldoc
and examples will be udated at the very end, since once all drivers
are converted over to devm_drm_dev_alloc we can unexport a lot of
interim functions and make the documentation for driver authors a lot
cleaner and less confusing. There will be only one true way to
initialize a drm_device at the end of this, which is going to be
devm_drm_dev_alloc.
v2:
- Actually explain what this is for in the commit message (Sam)
- Fix checkpatch issues (Sam)
Acked-by: Noralf Trønnes <noralf@tronnes.org>
Cc: Noralf Trønnes <noralf@tronnes.org>
Reviewed-by: Thomas Zimmermann <tzimmermann@suse.de>
Reviewed-by: Sam Ravnborg <sam@ravnborg.org>
Cc: Sam Ravnborg <sam@ravnborg.org>
Cc: Paul Kocialkowski <paul.kocialkowski@bootlin.com>
Cc: Laurent Pinchart <laurent.pinchart@ideasonboard.com>
Signed-off-by: Daniel Vetter <daniel.vetter@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20200415074034.175360-2-daniel.vetter@ffwll.ch
drivers/gpu/drm/omapdrm/dss/venc.c:211:33:
warning: 'venc_config_pal_bdghi' defined but not used [-Wunused-const-variable=]
static const struct venc_config venc_config_pal_bdghi = {
^~~~~~~~~~~~~~~~~~~~~
It is never used, remove it.
Reported-by: Hulk Robot <hulkci@huawei.com>
Signed-off-by: YueHaibing <yuehaibing@huawei.com>
Signed-off-by: Tomi Valkeinen <tomi.valkeinen@ti.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20200415132105.43636-1-yuehaibing@huawei.com
While we support using both tx slots for sideband transmissions, it
appears that DisplayPort devices in the field didn't end up doing a very
good job of supporting it. From section 5.2.1 of the DP 2.0
specification:
There are MST Sink/Branch devices in the field that do not handle
interleaved message transactions.
To facilitate message transaction handling by downstream devices, an
MST Source device shall generate message transactions in an atomic
manner (i.e., the MST Source device shall not concurrently interleave
multiple message transactions). Therefore, an MST Source device shall
clear the Message_Sequence_No value in the Sideband_MSG_Header to 0.
This might come as a bit of a surprise since the vast majority of hubs
will support using both tx slots even if they don't support interleaved
message transactions, and we've also been using both tx slots since MST
was introduced into the kernel.
However, there is one device we've had trouble getting working
consistently with MST for so long that we actually assumed it was just
broken: the infamous Dell P2415Qb. Previously this monitor would appear
to work sometimes, but in most situations would end up timing out
LINK_ADDRESS messages almost at random until you power cycled the whole
display. After reading section 5.2.1 in the DP 2.0 spec, some closer
investigation into this infamous display revealed it was only ever
timing out on sideband messages in the second TX slot.
Sure enough, avoiding the second TX slot has suddenly made this monitor
function perfectly for the first time in five years. And since they
explicitly mention this in the specification, I doubt this is the only
monitor out there with this issue. This might even explain explain the
seemingly harmless garbage sideband responses we would occasionally see
with MST hubs!
So - rewrite our sideband TX handlers to only support one TX slot. In
order to simplify our sideband handling now that we don't support
transmitting to multiple MSTBs at once, we also move all state tracking
for down replies from mstbs to the topology manager.
Signed-off-by: Lyude Paul <lyude@redhat.com>
Fixes: ad7f8a1f9c ("drm/helper: add Displayport multi-stream helper (v0.6)")
Cc: Sean Paul <sean@poorly.run>
Cc: "Lin, Wayne" <Wayne.Lin@amd.com>
Cc: <stable@vger.kernel.org> # v3.17+
Reviewed-by: Sean Paul <sean@poorly.run>
Link: https://patchwork.freedesktop.org/patch/msgid/20200424181308.770749-1-lyude@redhat.com
The '>' expression itself is bool, no need to convert it to bool again.
This fixes the following coccicheck warning:
drivers/gpu/drm/amd/display/dc/core/dc_link_ddc.c:602:28-33: WARNING:
conversion to bool not needed here
Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Jason Yan <yanaijie@huawei.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
The '==' expression itself is bool, no need to convert it to bool again.
This fixes the following coccicheck warning:
drivers/gpu/drm/amd/display/dc/dcn20/dcn20_mpc.c:455:70-75: WARNING:
conversion to bool not needed here
Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Jason Yan <yanaijie@huawei.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
The '>' expression itself is bool, no need to convert it to bool again.
This fixes the following coccicheck warning:
drivers/gpu/drm/amd/amdgpu/amdgpu_device.c:3004:68-73: WARNING:
conversion to bool not needed here
Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Jason Yan <yanaijie@huawei.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Due to hardware bug that when RSMU UMC index is disabled,
clear EccErrCnt at the first UMC instance will clean up all other
EccErrCnt registes from other instances at the same time. This
will break the correctable error count log in EccErrCnt register
once querying it. So decouple both to make error count query workable.
Signed-off-by: Guchun Chen <guchun.chen@amd.com>
Reviewed-by: Tao Zhou <tao.zhou1@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
This makes consistent with other regsiters' access in this module.
Signed-off-by: Guchun Chen <guchun.chen@amd.com>
Reviewed-by: Tao Zhou <tao.zhou1@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
CG/PG ungate is already performed in ip_suspend_phase1. Otherwise,
the CG/PG ungate will be performed twice. That will cause gfxoff
disablement is performed twice also on runpm enter while gfxoff
enablemnt once on rump exit. That will put gfxoff into disabled
state.
Signed-off-by: Evan Quan <evan.quan@amd.com>
Acked-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
This sequence change should be safe as what did in ip_suspend_phase1
is to suspend DCE only. And this is a prerequisite for coming
redundant cg/pg ungate dropping.
Signed-off-by: Evan Quan <evan.quan@amd.com>
Acked-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Driver steered p-state switching is designed for Vega20 only.
Also simplify early return for temporary disable due to SMU FW
bug.
Signed-off-by: Jonathan Kim <Jonathan.Kim@amd.com>
Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Fixes coccicheck warning:
drivers/gpu/drm/amd/amdgpu/gfx_v9_0.c:2534:2-3: Unneeded semicolon
Reported-by: Hulk Robot <hulkci@huawei.com>
Signed-off-by: Zheng Bin <zhengbin13@huawei.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
This fixes the following warning detected when running make with W=1
drivers/gpu/drm/rockchip//cdn-dp-core.c:1112:5: warning: no previous
prototype for ‘cdn_dp_suspend’ [-Wmissing-prototypes]
drivers/gpu/drm/rockchip//cdn-dp-core.c:1126:5: warning: no previous
prototype for ‘cdn_dp_resume’ [-Wmissing-prototypes]
Signed-off-by: Enric Balletbo i Serra <enric.balletbo@collabora.com>
Signed-off-by: Heiko Stuebner <heiko@sntech.de>
Link: https://patchwork.freedesktop.org/patch/msgid/20200426161653.7710-1-enric.balletbo@collabora.com
The IRQ postinstall handling had open-coded pipe fault mask selection
that never got updated for gen11. Switch it to use
gen8_de_pipe_fault_mask() to ensure we don't miss updates for new
platforms.
Cc: José Roberto de Souza <jose.souza@intel.com>
Fixes: d506a65d56 ("drm/i915: Catch GTT fault errors for gen11+ planes")
Signed-off-by: Matt Roper <matthew.d.roper@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20200424231423.4065231-1-matthew.d.roper@intel.com
Reviewed-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
The bspec lists both the clock frequency and the effective interval. The
interval corresponds to observed behaviour, so adjust the frequency to
match.
v2: Mika rightfully asked if we could measure the clock frequency from a
selftest.
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Mika Kuoppala <mika.kuoppala@linux.intel.com>
Acked-by: Mika Kuoppala <mika.kuoppala@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20200427154554.12736-1-chris@chris-wilson.co.uk
Number of endpoints could exceed the fix value MAX_ENDPOINTS(2).
Instead of increase simply this value, the number of endpoint
could be read from device tree. Load sequence has been a little
rework to take care of several panel or bridge which can be
connected/disconnected or enable/disable.
Signed-off-by: Yannick Fertre <yannick.fertre@st.com>
Signed-off-by: Benjamin Gaignard <benjamin.gaignard@st.com>
Link: https://patchwork.freedesktop.org/patch/msgid/1582877258-1112-1-git-send-email-yannick.fertre@st.com
We see that if the HW doesn't actually sleep, the HW may eat the poison
we set in its write-only HWSP during sanitize:
intel_gt_resume.part.8: 0000:00:02.0
__gt_unpark: 0000:00:02.0
gt_sanitize: 0000:00:02.0 force:yes
process_csb: 0000:00:02.0 vcs0: cs-irq head=5, tail=90
process_csb: 0000:00:02.0 vcs0: csb[0]: status=0x5a5a5a5a:0x5a5a5a5a
assert_pending_valid: Nothing pending for promotion!
The CS TAIL pointer should have been reset by reset_csb_pointers(), so
in this case it is likely that we have read back from the CPU cache and
so we must clflush our control over that page. In doing so, push the
sanitisation to the start of the GT sequence so that our poisoning is
assuredly before we start talking to the HW.
References: https://gitlab.freedesktop.org/drm/intel/-/issues/1794
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20200427084000.10999-1-chris@chris-wilson.co.uk
We evaluate *active, which is a pointer into execlists->inflight[]
during dequeue to decide how long a preempt-timeout we need to apply.
However, as soon as we do the submit_ports, the HW may send its ACK
interrupt causing us to promote execlists->pending[] tp
execlists->inflight[], overwriting the value of *active. We know *active
is only stable until we submit (as we only submit when there is no
pending promotion).
[ 16.102328] BUG: KCSAN: data-race in execlists_dequeue+0x1449/0x1600 [i915]
[ 16.102356]
[ 16.102375] race at unknown origin, with read to 0xffff8881e9500488 of 8 bytes by task 429 on cpu 1:
[ 16.102780] execlists_dequeue+0x1449/0x1600 [i915]
[ 16.103160] __execlists_submission_tasklet+0x48/0x60 [i915]
[ 16.103540] execlists_submit_request+0x38e/0x3c0 [i915]
[ 16.103940] submit_notify+0x8f/0xc0 [i915]
[ 16.104308] __i915_sw_fence_complete+0x61/0x420 [i915]
[ 16.104683] i915_sw_fence_complete+0x58/0x80 [i915]
[ 16.105054] i915_sw_fence_commit+0x16/0x20 [i915]
[ 16.105457] __i915_request_queue+0x60/0x70 [i915]
[ 16.105843] i915_gem_do_execbuffer+0x2d6b/0x4230 [i915]
[ 16.106227] i915_gem_execbuffer2_ioctl+0x2b0/0x580 [i915]
[ 16.106257] drm_ioctl_kernel+0xe9/0x130
[ 16.106279] drm_ioctl+0x27d/0x45e
[ 16.106311] ksys_ioctl+0x89/0xb0
[ 16.106336] __x64_sys_ioctl+0x42/0x60
[ 16.106370] do_syscall_64+0x6e/0x2c0
[ 16.106397] entry_SYSCALL_64_after_hwframe+0x44/0xa9
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20200426094231.21995-1-chris@chris-wilson.co.uk
Use indirect ctx bb to load cmd buffer control value
from context image to avoid corruption.
v2: add to lrc layout (Chris)
v3: end to a cacheline (Chris)
v4: add to lrc fixed (Chris)
v5: value in offset+1
Testcase: igt/i915_selftest/gt_lrc
Signed-off-by: Mika Kuoppala <mika.kuoppala@linux.intel.com>
Acked-by: Chris Wilson <chris@chris-wilson.co.uk>
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Link: https://patchwork.freedesktop.org/patch/msgid/20200424230632.30333-1-mika.kuoppala@linux.intel.com
Indirect ctx batchbuffers are a hw feature of which
batch can be run, by hardware, during context restoration stage.
Driver can setup a batchbuffer and also an offset into the
context image. When context image is marshalled from
memory to registers, and when the offset from the start of
context register state is equal of what driver pre-determined,
batch will run. So one can manipulate context restoration
process at cacheline granularity, given some limitations,
as you need to have rudimentaries in place before you can
run a batch.
Add selftest which will write the ring start register
to a canary spot. This will test that hardware will run a
batchbuffer for the context in question.
v2: request wait fix, naming (Chris)
v3: test order (Chris)
v4: rebase
Signed-off-by: Mika Kuoppala <mika.kuoppala@linux.intel.com>
Acked-by: Chris Wilson <chris@chris-wilson.co.uk>
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Link: https://patchwork.freedesktop.org/patch/msgid/20200424214841.28076-3-mika.kuoppala@linux.intel.com
Restoration of a previous timestamp can collide
with updating the timestamp, causing a value corruption.
Combat this issue by using indirect ctx bb to
modify the context image during restoring process.
We can preload value into scratch register. From which
we then do the actual write with LRR. LRR is faster and
thus less error prone as probability of race drops.
v2: tidying (Chris)
v3: lrr for all engines
v4: grp
v5: reg bit
v6: wa_bb_offset, virtual engines (Chris)
References: HSDES#16010904313
Testcase: igt/i915_selftest/gt_lrc
Suggested-by: Joseph Koston <joseph.koston@intel.com>
Cc: Chris Wilson <chris@chris-wilson.co.uk>
Signed-off-by: Mika Kuoppala <mika.kuoppala@linux.intel.com>
Acked-by: Chris Wilson <chris@chris-wilson.co.uk>
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Link: https://patchwork.freedesktop.org/patch/msgid/20200424230546.30271-1-mika.kuoppala@linux.intel.com
drivers/gpu/drm/panel/panel-truly-nt35597.c:493:31: warning: variable ‘config’ set but not used [-Wunused-but-set-variable]
const struct nt35597_config *config;
^~~~~~
Signed-off-by: YueHaibing <yuehaibing@huawei.com>
Signed-off-by: Sam Ravnborg <sam@ravnborg.org>
Link: https://patchwork.freedesktop.org/patch/msgid/20200417101401.19388-1-yuehaibing@huawei.com
Since commit 89958b7cd9 ("drm/bridge: panel: Infer connector type from
panel by default"), drm_panel_bridge_add() and their variants can return
NULL and an error pointer. This is fine but none of the actual users of
the API are checking for the NULL value. Instead of change all the
users, seems reasonable to return an error pointer instead. So change
the returned value for those functions when the connector type is unknown.
Suggested-by: Laurent Pinchart <laurent.pinchart@ideasonboard.com>
Signed-off-by: Enric Balletbo i Serra <enric.balletbo@collabora.com>
Reviewed-by: Laurent Pinchart <laurent.pinchart@ideasonboard.com>
Signed-off-by: Sam Ravnborg <sam@ravnborg.org>
Link: https://patchwork.freedesktop.org/patch/msgid/20200416210654.2468805-1-enric.balletbo@collabora.com
The panel connector type should be set by the panel not the bridge, so
remove the connector_type assignment.
Signed-off-by: Enric Balletbo i Serra <enric.balletbo@collabora.com>
Reviewed-by: Laurent Pinchart <laurent.pinchart@ideasonboard.com>
Signed-off-by: Sam Ravnborg <sam@ravnborg.org>
Link: https://patchwork.freedesktop.org/patch/msgid/20200416164404.2418426-2-enric.balletbo@collabora.com
We only hold the active spinlock while dumping the error state, and this
does not prevent another thread from retiring the request -- as it is
quite possible that despite us capturing the current state, the GPU has
completed the request. As such, it is dangerous to dereference state
below the request as it may already be freed, and the simplest way to
avoid the danger is not include it in the error state.
Closes: https://gitlab.freedesktop.org/drm/intel/-/issues/1788
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Mika Kuoppala <mika.kuoppala@linux.intel.com>
Cc: Andi Shyti <andi.shyti@intel.com>
Reviewed-by: Mika Kuoppala <mika.kuoppala@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20200424191410.27570-1-chris@chris-wilson.co.uk
For many configuration details within RC6 and RPS we are programming
intervals for the internal clocks. From gen11, these clocks are
configuration via the RPM_CONFIG and so for convenience, we would like
to convert to/from more natural units (ns).
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Andi Shyti <andi.shyti@intel.com>
Cc: Mika Kuoppala <mika.kuoppala@linux.intel.com>
Reviewed-by: Andi Shyti <andi.shyti@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20200424162805.25920-2-chris@chris-wilson.co.uk
Add tracek to the RPS events (interrupts, worker, enabling, threshold
selection, frequency setting), so that if we have to debug reticent HW
we have some traces to start from.
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Andi Shyti <andi.shyti@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20200424162805.25920-1-chris@chris-wilson.co.uk
The RPS DOWN_TIMEOUT interrupt is signaled after a period of rc6, and
upon receipt of that interrupt we reprogram the GPU clocks down to the
next idle notch [to help convserve power during rc6]. However, on
execlists, we benefit from soft-rc6 immediately parking the GPU and
setting idle frequencies upon idling [within a jiffie], and here the
interrupt prevents us from restarting from our last frequency.
In the process, we can simply opt for a static pm_events mask and rely
on the enable/disable interrupts to flush the worker on parking.
This will reduce the amount of oscillation observed during steady
workloads with microsleeps, as each time the rc6 timeout occurs we
immediately follow with a waitboost for a dropped frame.
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Andi Shyti <andi.shyti@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20200422001703.1697-1-chris@chris-wilson.co.uk
The variable option is being initialized with a value that is
never read and it is being updated later with a new value. The
initialization is redundant and can be removed.
Addresses-Coverity: ("Unused value")
Signed-off-by: Colin Ian King <colin.king@canonical.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Currently the error returns paths are unlocking lock kiq->ring_lock
however it seems this should be dev->gfx.kiq.ring_lock as this
is the lock that is being locked and unlocked around the ring
operations. This looks like a bug, but it's not. The kiq is just
a local variable pointing to the same structure. Make it consistent.
Signed-off-by: Colin Ian King <colin.king@canonical.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
The variable ret is being initialized with a value that is never read
and it is being updated later with a new value. The initialization is
redundant and can be removed.
Addresses-Coverity: ("Unused value")
Signed-off-by: Colin Ian King <colin.king@canonical.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Wait for the oldest sequence on the ring
to be signaled in order to make sure there
will be no command overrun.
v2: fix coding stype and remove abs operation
v3: remove the initialization of variable r
Signed-off-by: Yintian Tao <yttao@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
to 5s to satisfy WHOLE GPU reset which need 3+ seconds to
finish
Signed-off-by: Monk Liu <Monk.Liu@amd.com>
Acked-by: Yintian Tao <yttao@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
because nv12 SRIOV support one vf mode
Signed-off-by: Monk Liu <Monk.Liu@amd.com>
Acked-by: Yintian Tao <yttao@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
for ARCTURUS+ ASICS, we always support SW_SMU for bare-metal
and for SRIOV one_vf_mode
Signed-off-by: Monk Liu <Monk.Liu@amd.com>
Acked-by: Yintian Tao <yttao@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Pass the entire connector state to intel_{gmch,pch}_panel_fitting().
For now we just need to get at .scaling_mode but in the future we'll
want access to the margin properties as well.
v2: Deal with intel_dp_ycbcr420_config()
Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20200422161917.17389-5-ville.syrjala@linux.intel.com
Reviewed-by: Manasi Navare <manasi.d.navare@intel.com>
Make things a bit more abstract by replacing the pch_pfit.pos/size
raw register values with a drm_rect. Makes it slighly more convenient
to eg. compute the scaling factors.
v2: Use drm_rect_init()
Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20200422161917.17389-3-ville.syrjala@linux.intel.com
Reviewed-by: Manasi Navare <manasi.d.navare@intel.com>
Most of the pfit functions are of the form:
func()
{
if (pfit_enabled) {
...
}
}
Flip the pfit_enabled check around to flatten the functions.
And while we're touching all this let's do the usual
s/pipe_config/crtc_state/ replacement.
Reviewed-by: Manasi Navare <manasi.d.navare@intel.com>
Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20200422161917.17389-2-ville.syrjala@linux.intel.com
Fix skl_update_scaler_crtc() to deal with different scaling
modes correctly. The current implementation assumes
DRM_MODE_SCALE_FULLSCREEN. Fortunately we don't expose any
border properties currently so the code does actually end
up doing the right thing (assigning a scaler for pfit).
The code does need to be fixed before any borders are
exposed.
Also we have redundant calls to skl_update_scaler_crtc() in
dp/hdmi .compute_config() which can be nuked. They were anyway
called before we had even computed the pfit state so were
basically nonsense. The real call we need to keep is in
intel_crtc_atomic_check().
v2: Deal witrh skl_update_scaler_crtc() in intel_dp_ycbcr420_config()
Reviewed-by: Manasi Navare <manasi.d.navare@intel.com>
Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20200422161917.17389-1-ville.syrjala@linux.intel.com
Enable runtime pm by default so GPU suspend when idle
for 200ms. This value can be changed by
autosuspend_delay_ms in device's power sysfs dir.
On Allwinner H3 lima_device_resume takes ~40us and
lima_device_suspend takes ~20us.
Tested-by: Bhushan Shah <bshah@kde.org>
Reviewed-by: Vasily Khoruzhick <anarsoul@gmail.com>
Signed-off-by: Qiang Yu <yuq825@gmail.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20200421133551.31481-11-yuq825@gmail.com
Add driver pm system and runtime hardware resume/suspend ops.
Note this won't enable runtime pm of the device yet.
v2:
Do clock and power gating when suspend/resume.
Tested-by: Bhushan Shah <bshah@kde.org>
Reviewed-by: Vasily Khoruzhick <anarsoul@gmail.com>
Signed-off-by: Qiang Yu <yuq825@gmail.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20200421133551.31481-10-yuq825@gmail.com
We need to flush TLB anyway before every task start, and the
page directory will be set to empty vm after suspend/resume,
so always set it to the task vm even no ctx switch happens.
Tested-by: Bhushan Shah <bshah@kde.org>
Reviewed-by: Vasily Khoruzhick <anarsoul@gmail.com>
Signed-off-by: Qiang Yu <yuq825@gmail.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20200421133551.31481-5-yuq825@gmail.com