We want to make DC less chatty but still allow bug reporters to
provide more detailed logs.
Signed-off-by: Harry Wentland <harry.wentland@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
No more need since Andrey's change to use drm_crtc's version
Signed-off-by: Harry Wentland <harry.wentland@amd.com>
Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Until I've had time to test it better.
bug: https://bugs.freedesktop.org/show_bug.cgi?id=102372
Reviewed-by: Harry Wentland <harry.wentland@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Enable DC for DCE8 APUs.
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Reviewed-by: Harry Wentland <harry.wentland@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Reviewed-by: Harry Wentland <harry.wentland@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Harry Wentland <harry.wentland@amd.com>
Reviewed-by: Tony Cheng <Tony.Cheng@amd.com>
Acked-by: Harry Wentland <Harry.Wentland@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Harry Wentland <harry.wentland@amd.com>
Reviewed-by: Tony Cheng <Tony.Cheng@amd.com>
Acked-by: Harry Wentland <Harry.Wentland@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Andrey Grodzovsky <Andrey.Grodzovsky@amd.com>
Reviewed-by: Leo Li <sunpeng.li@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Hook up dc_surface creation/destruction to dm_plane_state.
Rename amdgpu_drm_plane_state to dm_plane_state and do
minor cleanups.
Signed-off-by: Andrey Grodzovsky <Andrey.Grodzovsky@amd.com>
Reviewed-by: Harry Wentland <Harry.Wentland@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
This flag is needed to pass several of IGT test cases.
Signed-off-by: Andrey Grodzovsky <Andrey.Grodzovsky@amd.com>
Acked-by: Alex Deucher <alexander.deucher@amd.com>
Reviewed-by: Harry Wentland <harry.wentland@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Link index is an unnecessery level of inderection when
calling from kernel i2c/aux transfer into DAL.
Signed-off-by: Andrey Grodzovsky <Andrey.Grodzovsky@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
This patch introduces amdgpu_drm_plane_state
structure, which subclasses drm_plane_state and
holds data suitable for configuring hardware.
It switches reset(), atomic_duplicate_state()
& atomic_destroy_state() functions to new internal
implementation, earlier they were pointing to
drm core functions.
TESTS(On Chromium OS on Stoney Only)
* Builds without compilation errors.
* 'plane_test' passes for XR24 format
based Overlay plane.
* Chromium OS ui comes up.
Signed-off-by: Shirish S <shirish.s@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Current design has per-crtc-plane model.
As a result, for asic's that support underlay,
are unable to expose it to user space for modesetting.
To enable this, the drm driver intialisation now runs
for number of surfaces instead of stream/crtc.
This patch plumbs surface capabilities to drm framework
so that it can be effectively used by user space.
Tests: (On Chromium OS for Stoney Only)
* 'modetest -p' now shows additional plane
with YUV capabilities in case of CZ and ST.
* 'plane_test' fails with below error:
[drm:amdgpu_dm_connector_atomic_set_property [amdgpu]] *ERROR* Unsupported screen depth 0
as ther is no support for YUYV
* Checked multimonitor display works fine
Signed-off-by: Shirish S <shirish.s@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Jordan Lazare <Jordan.Lazare@amd.com>
Reviewed-by: Harry Wentland <Harry.Wentland@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
This wires DCE12 support into DC and enables it.
Signed-off-by: Harry Wentland <harry.wentland@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
dc_target does not fit well into DRM framework so removed it.
This will prevent the driver from leveraging the pipe-split
code for tiled displays, so will have to be handled at a higher
level. Most places that used dc_target now directly use dc_stream
instead.
Signed-off-by: Aric Cyr <aric.cyr@amd.com>
Acked-by: Harry Wentland <Harry.Wentland@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Virtualization don't need the dc, disable it.
Signed-off-by: Xiangliang Yu <Xiangliang.Yu@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Acked-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Monk Liu <Monk.Liu@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Supported DCE versions: 8.0, 10.0, 11.0, 11.2
v2: rebase against 4.11
Signed-off-by: Harry Wentland <harry.wentland@amd.com>
Acked-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
This is needed to ensure every single DC commit builds. Reverting
this again when it's no longer needed by DC.
This reverts commit 98da65d5e3.
Signed-off-by: Harry Wentland <harry.wentland@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
It was not clear. The rest of the driver is MIT/X11.
Reviewed-by: Christian König <christian.koenig@amd.com>
Acked-by: Dave Airlie <airlied@redhat.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
v2: drop hdp invalidate/flush.
v3: honor pgoff during prime mmap. Add a barrier after cpu access.
v4: drop begin/end_cpu_access() for now, revisit later.
Signed-off-by: Samuel Li <Samuel.Li@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Just set the CPU access required flag when we pin it.
Signed-off-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
this fix memory leak due to request_firmware after driver
unloaded
v2:
release gmc firmware for gmc6/7/8 as well
Signed-off-by: Monk Liu <Monk.Liu@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
fix missing finish uvd enc_ring.
v2:
since the adev pointer check in already in ring_fini
so drop the check outsider
Signed-off-by: Monk Liu <Monk.Liu@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
this way after KIQ MQD released in drv unloading, CPC
can still let KIQ access this MQD thus RLCV SAVE_VF
will not fail
v2:
always use VRAM domain for KIQ MQD no matter BM or SRIOV
Signed-off-by: Monk Liu <Monk.Liu@amd.com>
Acked-by: Alex Deucher <alexander.deucher@amd.com>
Acked-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
v2:
move kcq_disable out of SRIOV, make it genearal
Signed-off-by: Monk Liu <Monk.Liu@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
only with this way we can debug the VMC page fault issue
Signed-off-by: Monk Liu <Monk.Liu@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Use it to replace the hard coded value in amdgpu_vm_bo_update_mapping().
Signed-off-by: Yong Zhao <yong.zhao@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
When max_bytes is not 8 bytes aligned and bo size is larger than
max_bytes, the last 8 bytes in a ttm node may be left unchanged.
For example, on pre SDMA 4.0, max_bytes = 0x1fffff, and the bo size
is 0x200000, the problem will happen.
In order to fix the problem, we separately store the max nums of
PTEs/PDEs a single operation can set in amdgpu_vm_pte_funcs
structure, rather than inferring it from bytes limit of SDMA
constant fill, i.e. fill_max_bytes.
Together with the fix, we replace the hard code value "10" in
amdgpu_vm_bo_update_mapping() with the corresponding values from
structure amdgpu_vm_pte_funcs.
Signed-off-by: Yong Zhao <yong.zhao@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Yong Zhao <yong.zhao@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Use 2MB fragment size by default for older hardware generations as well.
Signed-off-by: Christian König <christian.koenig@amd.com>
Acked-by: John Bridgman <john.bridgman@amd.com>
Reviewed-by: Roger He <Hongbo.He@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
SRIOV doesn't implement PMC capability of PCIe, so it can't update
power state by reading PMC register.
Currently, amdgpu driver doesn't disable pci device when removing
driver, the enable_cnt of pci device will not be decrease to 0.
When reloading driver, pci_enable_device will do nothing as
enable_cnt is not zero. And power state will not be updated as PMC
is not support.
So current_state of pci device is not D0 state and pci_enable_msi
return fail.
Add pci_disable_device when remmoving driver to fix the issue.
Signed-off-by: Xiangliang.Yu <Xiangliang.Yu@amd.com>
Acked-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Eric Huang <JinHuiEric.Huang@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
We discovered that on some devices even with iommu enabled
you can access all of system memory through the iommu translation.
Therefore, we revert the read method to the translation only service
and drop the write method completely.
Signed-off-by: Tom St Denis <tom.stdenis@amd.com>
Reviewed-by: Christan König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
GPU reset will require all hw doing hw_init thus
ucode_init_bo will be invoked again, which lead to
memory leak
skip the fw_buf allocation during sriov gpu reset to avoid
memory leak.
Signed-off-by: Monk Liu <Monk.Liu@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
otherwise a gpu hang will make application couldn't be killed
under timedout=0 mode
v2:
Fix memoryleak job/job->s_fence issue
unlock mn
remove the ERROR msg after waiting being interrupted
Signed-off-by: Monk Liu <Monk.Liu@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
RLC need CSB registers initiated under SRIOV during world switch
otherwise the clear state buffer behav will not be recovered to
current VF scheme after switch back
Signed-off-by: Monk Liu <Monk.Liu@amd.com>
Acked-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
increase timeout to 12 seconds,because there may have multiple
FLR waiting for done, the waiting time of events may be long,
increase to 12s to reduce timeout failure.
Signed-off-by: Horace Chen <horace.chen@amd.com>
Acked-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
bo_free on csa is too late to put in amdgpu_fini because that
time ttm is already finished,
Move it earlier to avoid the page fault.
Signed-off-by: Monk Liu <Monk.Liu@amd.com>
Signed-off-by: Horace Chen <horace.chen@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Monk Liu <Monk.Liu@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Monk Liu <Monk.Liu@amd.com>
Acked-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
FRAME_CONTROL(begin) is needed for vega10 due to ucode logic change,
it can fix some CTS random fail under gfx preemption enabled mode.
Signed-off-by: Monk Liu <Monk.Liu@amd.com>
Acked-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
At least for SRIOV we found reload PSP fw during
gpu reset cause PSP hang.
Signed-off-by: Monk Liu <Monk.Liu@amd.com>
Acked-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
currently in_reset is only used in sriov gpu reset, and it
will be used for other non-gfx hw component later, like
PSP, so move it from gfx to adev and rename to in_sriov_reset
make more sense.
Signed-off-by: Monk Liu <Monk.Liu@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Monk Liu <Monk.Liu@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Monk Liu <Monk.Liu@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
V2
Signed-off-by: Ken Wang <Ken.Wang@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Acked-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Tom St Denis <tom.stdenis@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Tom St Denis <tom.stdenis@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
(v2): Add domain to iova debugfs
(v3): Add true read/write methods to access system memory of pages
mapped to the device
(v4): Move get_domain call out of loop and return on error
(v5): Just use kmap/kunmap
Signed-off-by: Tom St Denis <tom.stdenis@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
(v2): add domains and avoid strcmp
fix checkpatch.pl WARNING:
Prefer 'unsigned int' to bare use of 'unsigned'
Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Rex Zhu <Rex.Zhu@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Requested by SRIOV, the clearance of the bit moved into firmware
Signed-off-by: Leo Liu <leo.liu@amd.com>
Acked-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Port it from sdma4 for wptr polling usage.
Signed-off-by: Xiangliang.Yu <Xiangliang.Yu@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
When hypervisor triggering FLR for one of VFs, need to enable sdma
wptr polling to avoid missing wptr update if enabling doorbell.
Signed-off-by: Xiangliang.Yu <Xiangliang.Yu@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
IH tracks pending retry faults in a hash table for fast lookup in
interrupt context. Each VM has a short FIFO of pending VM faults for
processing in a bottom half.
The IH prescreening stage adds retry faults and filters out repeated
retry interrupts to minimize the impact of interrupt storms.
It's the VM's responsibility remove pending faults once they are
handled. For now this is only done when the VM is destroyed.
v2:
- Made the hash table smaller and the FIFO longer. I never want the
FIFO to fill up, because that would make prescreen take longer.
128 pending page faults should be enough to keep migrations busy.
Signed-off-by: Felix Kuehling <Felix.Kuehling@amd.com>
Acked-by: Christian König <christian.koenig@amd.com> (v1)
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
To filter out high-frequency interrupts that can be safely ignored.
v2: squash in trivial typo fix for si (Alex)
Signed-off-by: Felix Kuehling <Felix.Kuehling@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Allows assigning a PASID to a VM for identifying VMs involved in page
faults. The global PASID manager is also exported in the KFD
interface so that AMDGPU and KFD can share the PASID space.
PASIDs of different sizes can be requested. On APUs, the PASID size
is deterined by the capabilities of the IOMMU. So KFD must be able
to allocate PASIDs in a smaller range.
Signed-off-by: Felix Kuehling <Felix.Kuehling@amd.com>
Acked-by: Alex Deucher <alexander.deucher@amd.com>
Reviewed-by: Oded Gabbay <oded.gabbay@gmail.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Make sure vm->root.bo is not left reserved if amdgpu_bo_kmap fails.
Signed-off-by: Felix Kuehling <Felix.Kuehling@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
currently, for CI asics,
use dpm by default, amdgpu.dpm=-1.
when set amdgpu.dpm=1, enable powplay.
when set amdgpu.dpm=0, disable both dpm and powerplay.
when powerplay is stable on CI asics, ci_dpm will
be removed.
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Rex Zhu <Rex.Zhu@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
put amd_pm_funcs table in struct powerplay for all
asics.
Signed-off-by: Rex Zhu <Rex.Zhu@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
renamed amdgpu_dpm_funcs and moved to amd_shared.h
so can shared with powerplay.
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Rex Zhu <Rex.Zhu@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Tom St Denis <tom.stdenis@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Several users have complained that the tile table update broke Oland
support. Despite several attempts to fix it, the root cause is still
unknown at this point and no solution is available. As it is not
acceptable to leave a known regression breaking a major functionality
in the kernel for several releases, let's just reverse this
optimization for now. It can be implemented again later if and only
if the breakage is understood and fixed.
As there were no complaints for Hainan so far, only the Oland part of
the offending commit is reverted. Optimization is preserved on
Hainan, so this commit isn't an actual revert of the original.
This fixes bug #194761:
https://bugzilla.kernel.org/show_bug.cgi?id=194761
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Signed-off-by: Jean Delvare <jdelvare@suse.de>
Fixes: f8d9422ef8 ("drm/amdgpu: update tile table for oland/hainan")
Cc: Flora Cui <Flora.Cui@amd.com>
Cc: Junwei Zhang <Jerry.Zhang@amd.com>
Cc: Alex Deucher <alexander.deucher@amd.com>
Cc: Marek Olšák <maraeo@gmail.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Cc: stable@vger.kernel.org
Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Tom St Denis <tom.stdenis@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
We adjusted the BO flags for USWC handling, but those never took effect
because the placement was passed in instead of generated inside this
function.
v2: better commit message
Signed-off-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Nobody is actually using this and it causes a bunch of unused and buggy code.
Signed-off-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
This always allocated on PAGE_SIZE alignment.
Signed-off-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Fixes crash when trying to unload the amdgpu module before the fbdev
framebuffer was initialized, which can happen since the DRM fbdev helper
code supports deferred setup.
Acked-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Michel Dänzer <michel.daenzer@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
There is no guarantee that the last BO_VA actually needed an update.
Additional to that all command submissions must wait for moved BOs to
be cleared, not just the first one.
v2: Don't overwrite any newer fence.
Signed-off-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Chunming Zhou <david1.zhou@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
This reverts commit 10e709cb29.
The patch doesn't work at all:
1. The CS can still be blocked because of amdgpu_ctx_add_fence().
2. The order of submission isn't correct any more.
3. We could end up using freed up memory because we now drop the
ctx reference to early.
This needs to be fixed cleanly by doing the context handling after the BO
handling, but this is a larger task just avoid the obvious crashes for now.
Signed-off-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Monk Liu monk.liu@amd.com
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
All users of a VM must always wait for updates with always
valid BOs to be completed.
v2: remove debugging leftovers, rename struct member
Signed-off-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Roger He <Hongbo.He@amd.com>
Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
- v2: share code with CHIP_VEGA10 case
Signed-off-by: Evan Quan <evan.quan@amd.com>
Reviewed-by: Junwei Zhang <Jerry.Zhang@amd.com>
Acked-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Huang Rui <ray.huang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Otherwise, the ring will fail to create on next resume.
Signed-off-by: Evan Quan <evan.quan@amd.com>
Reviewed-by: Junwei Zhang <Jerry.Zhang@amd.com>
Acked-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Huang Rui <ray.huang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
- v2: reuse the ring stop api in ring destory
Signed-off-by: Evan Quan <evan.quan@amd.com>
Reviewed-by: Junwei Zhang <Jerry.Zhang@amd.com>
Acked-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Huang Rui <ray.huang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
- fw_size in psp_v10_0_prep_cmd_buf is wrongly set as 0
- fixed the wrong calculation of psp_write_ptr_reg in psp_v10_0_cmd_submit
Signed-off-by: Evan Quan <evan.quan@amd.com>
Reviewed-by: Junwei Zhang <Jerry.Zhang@amd.com>
Acked-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Huang Rui <ray.huang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Just some cleanup.
Signed-off-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Just some cleanup.
Signed-off-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
This is quite controversial because it adds another lock which is held during
page table updates, but I don't see much other option.
v2: allow multiple updates to be in flight at the same time
v3: simplify the patch, take the read side only once
v4: correctly fix rebase conflict
Signed-off-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Use the VM instead of the BO list to find the BO for a virtual address.
This fixes UVD/VCE in physical mode with VM local BOs.
Signed-off-by: Christian König <christian.koenig@amd.com>
Acked-by: Leo Liu <leo.liu@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
When we need to find the mapping we need sysvm access anyway.
Signed-off-by: Christian König <christian.koenig@amd.com>
Acked-by: Leo Liu <leo.liu@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Instead take the callback lock during the final parts of CS.
This should solve the last remaining locking order problems with BO reservations.
v2: rebase, make dummy functions static inline
v3: add one more missing inline and comments
Signed-off-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Allow at least some parallel processing.
Signed-off-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Instead of moving them in the MMU notifier move them during CS.
v2: still mark pages as accessed/dirty
Signed-off-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com> (v1)
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Instead use a counter to figure out if we need to set new pages or not.
Signed-off-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
This didn't helped as intended, just simplify the code.
v2: unlock mmap_sem in the error path as well
Signed-off-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
When amdgpu_vm_frag_ptes calls amdgpu_vm_update_ptes and the pt
has a shadow PT we mirror all the write to the shadow PT too, which
results in twice the commands.
Signed-off-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
This reverts commit 10e709cb29.
The patch doesn't work at all:
1. The CS can still be blocked because of amdgpu_ctx_add_fence().
2. The order of submission isn't correct any more.
3. We could end up using freed up memory because we now drop the
ctx reference to early.
This needs to be fixed cleanly by doing the context handling after the BO
handling, but this is a larger task just avoid the obvious crashes for now.
Signed-off-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Monk Liu monk.liu@amd.com
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Move calling put_page into the unpopulate callback. Otherwise we mess up the pages
reference count when it is unbound multiple times.
Signed-off-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
first is incorrect if hit NULL/signaled fence
Signed-off-by: Monk Liu <monk.liu@amd.com>
Reviewed-by: Chunming Zhou <David1.Zhou@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Allow interval trees to quickly check for overlaps to avoid unnecesary
tree lookups in interval_tree_iter_first().
As of this patch, all interval tree flavors will require using a
'rb_root_cached' such that we can have the leftmost node easily
available. While most users will make use of this feature, those with
special functions (in addition to the generic insert, delete, search
calls) will avoid using the cached option as they can do funky things
with insertions -- for example, vma_interval_tree_insert_after().
[jglisse@redhat.com: fix deadlock from typo vm_lock_anon_vma()]
Link: http://lkml.kernel.org/r/20170808225719.20723-1-jglisse@redhat.com
Link: http://lkml.kernel.org/r/20170719014603.19029-12-dave@stgolabs.net
Signed-off-by: Davidlohr Bueso <dbueso@suse.de>
Signed-off-by: Jérôme Glisse <jglisse@redhat.com>
Acked-by: Christian König <christian.koenig@amd.com>
Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Acked-by: Doug Ledford <dledford@redhat.com>
Acked-by: Michael S. Tsirkin <mst@redhat.com>
Cc: David Airlie <airlied@linux.ie>
Cc: Jason Wang <jasowang@redhat.com>
Cc: Christian Benvenuti <benve@cisco.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
-----BEGIN PGP SIGNATURE-----
iQIcBAABAgAGBQJZpRPIAAoJEAx081l5xIa+kCIP/2m2q0jBmCATvXXwrMBH0zNk
4lm9yIfl9pmluJP97aklvkeKF77chhost76+hv+0sQ9ZsJD8koHWv5WyTHEs7Cfn
NpmtGPqYlIZsWNSwW0OFF/XzllgLCVEWa+W/7ryYzPZrSEZr6Ge4HE0qS3LfuLJv
K89amZWHkP5ysPZ1uxRBzHtZfNAhdyjYVTUntCR7gj3DYv3yNdeZu+/epfcWK2w/
Q+ggoy644vX/yzy5L5zCGL/J1BjStDuec7sgAKTlNx4TwBUmp2wsfhEdovQBGFiu
t5PHMajvrBRqSJWDIAZSUfjQzIMSz517J9LWeChU7KtAClNJQJEabbu4CoX4aEmG
UbSzEe0IxnxQ4842jcqQXZ+mevlNIEIBVSNR7dXi17jL3Ts+APQgrYjRJYVk2ipg
uQ9TwkeVVu2WRGyU8iRQrXAZI7+O3p4UnbNPjeG2qACD2Ur7Z3n7b0mhNFPOLzO4
gbIv4D6CcUB/vltl+vhZTW3P50oMCVSq8ScCpY8CGo29mZ5vypj5PTS+W8FsyY3Z
ypyMqWg/DyxKlOoO+aK8EmXuZmgtDR4kb8asltH/S1A0NZkzjrFkKgs10Cp6EjJy
Zz1BWa1KKEpdN6yp+jrbJKjf9MJ7K2RPGv3bxWnCCdNv4j49rk4t3IHqvcihddsd
XXFQB5zE7Pz0ROi/VkXR
=5fxW
-----END PGP SIGNATURE-----
Merge tag 'drm-for-v4.14' of git://people.freedesktop.org/~airlied/linux
Pull drm updates from Dave Airlie:
"This is the main drm pull request for 4.14 merge window.
I'm sending this early, as my continuing journey into fatherhood is
occurring really soon now, I'm going to be mostly useless for the next
couple of weeks, though I may be able to read email, I doubt I'll be
doing much patch applications or git sending. If anything urgent pops
up I've asked Daniel/Jani/Alex/Sean to try and direct stuff towards
you.
Outside drm changes:
Some rcar-du updates that touch the V4L tree, all acks should be in
place. It adds one export to the radix tree code for new i915 use
case. There are some minor AGP cleanups (don't see that too often).
Changes to the vbox driver in staging to avoid breaking compilation.
Summary:
core:
- Atomic helper fixes
- Atomic UAPI fixes
- Add YCBCR 4:2:0 support
- Drop set_busid hook
- Refactor fb_helper locking
- Remove a bunch of internal APIs
- Add a bunch of better default handlers
- Format modifier/blob plane property added
- More internal header refactoring
- Make more internal API names consistent
- Enhanced syncobj APIs (wait/signal/reset/create signalled)
bridge:
- Add Synopsys Designware MIPI DSI host bridge driver
tiny:
- Add Pervasive Displays RePaper displays
- Add support for LEGO MINDSTORMS EV3 LCD
i915:
- Lots of GEN10/CNL support patches
- drm syncobj support
- Skylake+ watermark refactoring
- GVT vGPU 48-bit ppgtt support
- GVT performance improvements
- NOA change ioctl
- CCS (color compression) scanout support
- GPU reset improvements
amdgpu:
- Initial hugepage support
- BO migration logic rework
- Vega10 improvements
- Powerplay fixes
- Stop reprogramming the MC
- Fixes for ACP audio on stoney
- SR-IOV fixes/improvements
- Command submission overhead improvements
amdkfd:
- Non-dGPU upstreaming patches
- Scratch VA ioctl
- Image tiling modes
- Update PM4 headers for new firmware
- Drop all BUG_ONs.
nouveau:
- GP108 modesetting support.
- Disable MSI on big endian.
vmwgfx:
- Add fence fd support.
msm:
- Runtime PM improvements
exynos:
- NV12MT support
- Refactor KMS drivers
imx-drm:
- Lock scanout channel to improve memory bw
- Cleanups
etnaviv:
- GEM object population fixes
tegra:
- Prep work for Tegra186 support
- PRIME mmap support
sunxi:
- HDMI support improvements
- HDMI CEC support
omapdrm:
- HDMI hotplug IRQ support
- Big driver cleanup
- OMAP5 DSI support
rcar-du:
- vblank fixes
- VSP1 updates
arcgpu:
- Minor fixes
stm:
- Add STM32 DSI controller driver
dw_hdmi:
- Add support for Rockchip RK3399
- HDMI CEC support
atmel-hlcdc:
- Add 8-bit color support
vc4:
- Atomic fixes
- New ioctl to attach a label to a buffer object
- HDMI CEC support
- Allow userspace to dictate rendering order on submit ioctl"
* tag 'drm-for-v4.14' of git://people.freedesktop.org/~airlied/linux: (1074 commits)
drm/syncobj: Add a signal ioctl (v3)
drm/syncobj: Add a reset ioctl (v3)
drm/syncobj: Add a syncobj_array_find helper
drm/syncobj: Allow wait for submit and signal behavior (v5)
drm/syncobj: Add a CREATE_SIGNALED flag
drm/syncobj: Add a callback mechanism for replace_fence (v3)
drm/syncobj: add sync obj wait interface. (v8)
i915: Use drm_syncobj_fence_get
drm/syncobj: Add a race-free drm_syncobj_fence_get helper (v2)
drm/syncobj: Rename fence_get to find_fence
drm: kirin: Add mode_valid logic to avoid mode clocks we can't generate
drm/vmwgfx: Bump the version for fence FD support
drm/vmwgfx: Add export fence to file descriptor support
drm/vmwgfx: Add support for imported Fence File Descriptor
drm/vmwgfx: Prepare to support fence fd
drm/vmwgfx: Fix incorrect command header offset at restart
drm/vmwgfx: Support the NOP_ERROR command
drm/vmwgfx: Restart command buffers after errors
drm/vmwgfx: Move irq bottom half processing to threads
drm/vmwgfx: Don't use drm_irq_[un]install
...
The header comment in include/trace/define_trace.h specifies that the
TRACE_INCLUDE_PATH needs to be relative to the define_trace.h header
rather than the trace file including it. Most instances get that wrong
and work around it by adding the $(src) directory to the include path.
While this works, it is preferable to refer to the correct path to the
trace file in the first place and avoid any workaround.
Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Thierry Reding <treding@nvidia.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Even though fini returns 0 always it could theoretically
fail in the future. Might as well return it instead of 0.
Signed-off-by: Tom St Denis <tom.stdenis@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Tom St Denis <tom.stdenis@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Tom St Denis <tom.stdenis@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Tom St Denis <tom.stdenis@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Right now there's only one but the rest of the code is being
setup to support more so might as well fix this up too.
Signed-off-by: Tom St Denis <tom.stdenis@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Otherwise we lose the NO_EVICT flag and can try to evict pinned BOs.
Signed-off-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Only move BOs to the moved/relocated list when they aren't already on a list.
This prevents accidential removal from the evicted list.
Signed-off-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
This can improve performance for some cases.
v2 (chk): handle all sizes, simplify the patch quite a bit
v3 (chk): adjust dw estimation as well
v4 (chk): use single loop, make end mask 64bit
Signed-off-by: Roger He <Hongbo.He@amd.com>
Signed-off-by: Christian König <christian.koenig@amd.com>
Tested-by: Roger He <Hongbo.He@amd.com>
Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com>
Reviewed-by: Chunming Zhou <david1.zhou@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Calls to mmu_notifier_invalidate_page() were replaced by calls to
mmu_notifier_invalidate_range() and are now bracketed by calls to
mmu_notifier_invalidate_range_start()/end()
Remove now useless invalidate_page callback.
Signed-off-by: Jérôme Glisse <jglisse@redhat.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
Cc: amd-gfx@lists.freedesktop.org
Cc: Felix Kuehling <Felix.Kuehling@amd.com>
Cc: Christian König <christian.koenig@amd.com>
Cc: Alex Deucher <alexander.deucher@amd.com>
Cc: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Andrea Arcangeli <aarcange@redhat.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Tom St Denis <tom.stdenis@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Tom St Denis <tom.stdenis@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Tom St Denis <tom.stdenis@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Tom St Denis <tom.stdenis@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Tom St Denis <tom.stdenis@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Tom St Denis <tom.stdenis@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Make it consistent in style with the other CG/PG enable functions...
Signed-off-by: Tom St Denis <tom.stdenis@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Tom St Denis <tom.stdenis@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Tom St Denis <tom.stdenis@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Tom St Denis <tom.stdenis@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Tom St Denis <tom.stdenis@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Tom St Denis <tom.stdenis@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Tom St Denis <tom.stdenis@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Add the IOCTL interface so that applications can allocate per VM BOs.
Still WIP since not all corner cases are tested yet, but this reduces average
CS overhead for 10K BOs from 21ms down to 48us.
v2: add some extra checks, remove the WIP tag
v3: rename new flag to AMDGPU_GEM_CREATE_VM_ALWAYS_VALID
Signed-off-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Per VM BOs are handled like VM PDs and PTs. They are always valid and don't
need to be specified in the BO lists.
v2: validate PDs/PTs first
Signed-off-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Don't allow them to be GEM imported into another process.
Signed-off-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com>
Acked-by: Chunming Zhou <david1.zhou@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
We need to refer to the parent instead of the root BO for multi
level page tables on Vega10. Also don't set the PDE_PTE bit.
v2: Don't set the PDE_PTE bit either.
Signed-off-by: Christian König <christian.koenig@amd.com>
Reviewed-and-Tested-by: Roger He <Hongbo.He@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
This way we can safely call it on SI as well.
v2: fix type in commit message
Signed-off-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Andy Shevchenko <andy.shevchenko@gmail.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
The src isn't used any more after GART hack removal.
Signed-off-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Chunming Zhou <david1.zhou@amd.com>
Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Keep track off relocated PDs/PTs instead of walking and checking all PDs.
v2: fix root PD handling
Signed-off-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com> (v1)
Reviewed-by: Chunming Zhou <david1.zhou@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Kfree on NULL pointer is a no-op and therefore checking is redundant.
Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Himanshu Jha <himanshujha199640@gmail.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Instead of validating all page tables when one was evicted,
track which one needs a validation.
v2: simplify amdgpu_vm_ready as well
Signed-off-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com> (v1)
Reviewed-by: Chunming Zhou <david1.zhou@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Except for the reference count all other members are protected
by the VM PD being reserved.
Signed-off-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Chunming Zhou <david1.zhou@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
We changed this to use an extra list a while back, but for the next
series I need a separate flag again.
v2: reorder to avoid unlocked list access
Signed-off-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Chunming Zhou <david1.zhou@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Instead of using the vm_state use a separate flag to note
that the BO was moved.
v2: reorder patches to avoid temporary lockless access
Signed-off-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Chunming Zhou <david1.zhou@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Allows writing data to vram via debugfs.
Signed-off-by: Tom St Denis <tom.stdenis@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
(v2): Call get_user before holding spinlock.
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
To allocate additional space for the dynamic cu masks.
Confirmed with the hw team that we only need 1 dword
for the mask. The mask is the same for each SE so
you only need 1 dword.
Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Confirmed with the hw team. It's the same for all asics.
Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Those are certainly not kernel allocations, instead set the NO_CPU_ACCESS flag.
Signed-off-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Chunming Zhou <david1.zhou@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Stop checking the mapped BO itself, cause that one is
certainly not a page table.
Additional to that move the code into amdgpu_vm.c
Signed-off-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Chunming Zhou <david1.zhou@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
That somehow got lost.
Signed-off-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Chunming Zhou <david1.zhou@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
sysfs is more stable, and doesn't require root to access
Signed-off-by: Kent Russell <kent.russell@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Add 2 debugfs files, one that contains the VBIOS version, and one that
contains the VBIOS itself. These won't change after initialization,
so we can add the VBIOS version when we parse the atombios information.
This ensures that we can find out the VBIOS version, even when the dmesg
buffer fills up, and makes it easier to associate which VBIOS version is
for which GPU on mGPU configurations. Set the size to 20 characters in
case of some weird VBIOS version that exceeds the expected 17 character
format (3-8-3\0). The VBIOS dump also allows for easy debugging
v2: Move to debugfs, clarify commit message, add VBIOS dump file
Signed-off-by: Kent Russell <kent.russell@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Tom St Denis <tom.stdenis@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Switches the AMDGPU driver over to the TTM tracepoint and removes
our old one. Now you can enable traces before loading the module
and trace all mappings.
Signed-off-by: Tom St Denis <tom.stdenis@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
(v2): Use struct device instead of pci in trace.
Newer versions of the CP firmware require changes in how the driver
initializes the hw block.
Change the firmware name for new firmware to maintain compatibility with
older kernels.
Acked-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Evan Quan <evan.quan@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Remove a redundant identical return statement, it has no use.
Detected by CoverityScan, CID#1454586 ("Structurally dead code")
Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com>
Signed-off-by: Colin Ian King <colin.king@canonical.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Check memory allocation failure and return -ENOMEM in such a case.
'num_post_dep_syncobjs' still has to be set to 0 before the test in order
to have it initialized if 'amdgpu_cs_parser_fini()' is called to free
resources.
The calling graph would be, in such a case!
failure in amdgpu_cs_process_syncobj_out_dep()
---> error code returned by amdgpu_cs_dependencies()
--> amdgpu_cs_parser_fini() is called
Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Christophe JAILLET <christophe.jaillet@wanadoo.fr>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
BANK_SELECT should always be FRAGMENT_SIZE + 3 due to 8-entry (2^3)
per cache line in L2 TLB for Vega10.
v2: agd: fix warning
Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Roger He <Hongbo.He@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
The function is called only once and doesn't do anything special.
Signed-off-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Reviewed-by: Roger He <Hongbo.He@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Use ttm_bo_mem_space instead of manually allocating GART space.
This allows us to evict BOs when there isn't enought GART space any more.
Signed-off-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
This isn't used since we don't map evicted BOs to GART any more.
Signed-off-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Reviewed-by: Roger He <Hongbo.He@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
KIQ doesn't really use the GPU scheduler. The base
drivers generally use the KIQ ring directly rather than
submitting IBs. However, amdgpu_sched_hw_submission
(which defaults to 2) limits the number of outstanding
fences to 2. KFD uses the KIQ for TLB flushes and the
2 fence limit hurts performance when there are several KFD
processes running.
v2: move some expressions to one line
change KIQ sched_hw_submission to at least 16
v3: bump to 256
Reviewed-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Move the asic specific code into the IP modules.
Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Be more explicit and add comments explaining each case.
Also s/gart/GART/ in the parameter string as per Felix'
suggestion.
Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Set the shadow flag on the shadow and not the parent, always bind shadow BOs
during allocation instead of manually, use the reservation_object wrappers
to grab the lock.
This fixes a couple of issues with binding the shadow BOs as well as correctly
evicting them when memory becomes tight.
Signed-off-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Chunming Zhou <david1.zhou@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
We need a larger gart for asics that do not support GPUVM on all
engines (e.g., MM) to make sure we have enough space for all
gtt buffers in physical mode. Change the default size based on
the asic type.
Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
For virtual display, it uses software timer to emulate the vsync interrupt,
it doesn't have high precision, so doesn't support disable vblank immediately.
BUG: SWDEV-129274
Signed-off-by: Emily Deng <Emily.Deng@amd.com>
Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Correctly detect system memory mappings when using CPU and don't use
huge pages for them.
Avoid incorrectly translating a physical page table GPU address when
splitting a huge page while mapping system memory.
Signed-off-by: Felix Kuehling <Felix.Kuehling@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
The function has far more in common with drm_syncobj_find than with
any in the get/put functions.
Signed-off-by: Jason Ekstrand <jason@jlekstrand.net>
Acked-by: Christian König <christian.koenig@amd.com> (v1)
Signed-off-by: Dave Airlie <airlied@redhat.com>
Remove a redundant identical return statement, it has no use.
Detected by CoverityScan, CID#1454586 ("Structurally dead code")
Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com>
Signed-off-by: Colin Ian King <colin.king@canonical.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Check memory allocation failure and return -ENOMEM in such a case.
'num_post_dep_syncobjs' still has to be set to 0 before the test in order
to have it initialized if 'amdgpu_cs_parser_fini()' is called to free
resources.
The calling graph would be, in such a case!
failure in amdgpu_cs_process_syncobj_out_dep()
---> error code returned by amdgpu_cs_dependencies()
--> amdgpu_cs_parser_fini() is called
Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Christophe JAILLET <christophe.jaillet@wanadoo.fr>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
BANK_SELECT should always be FRAGMENT_SIZE + 3 due to 8-entry (2^3)
per cache line in L2 TLB for Vega10.
v2: agd: fix warning
Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Roger He <Hongbo.He@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
The function is called only once and doesn't do anything special.
Signed-off-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Reviewed-by: Roger He <Hongbo.He@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Use ttm_bo_mem_space instead of manually allocating GART space.
This allows us to evict BOs when there isn't enought GART space any more.
Signed-off-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
This isn't used since we don't map evicted BOs to GART any more.
Signed-off-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Reviewed-by: Roger He <Hongbo.He@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
KIQ doesn't really use the GPU scheduler. The base
drivers generally use the KIQ ring directly rather than
submitting IBs. However, amdgpu_sched_hw_submission
(which defaults to 2) limits the number of outstanding
fences to 2. KFD uses the KIQ for TLB flushes and the
2 fence limit hurts performance when there are several KFD
processes running.
v2: move some expressions to one line
change KIQ sched_hw_submission to at least 16
v3: bump to 256
Reviewed-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Move the asic specific code into the IP modules.
Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Be more explicit and add comments explaining each case.
Also s/gart/GART/ in the parameter string as per Felix'
suggestion.
Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Set the shadow flag on the shadow and not the parent, always bind shadow BOs
during allocation instead of manually, use the reservation_object wrappers
to grab the lock.
This fixes a couple of issues with binding the shadow BOs as well as correctly
evicting them when memory becomes tight.
Signed-off-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Chunming Zhou <david1.zhou@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
We need a larger gart for asics that do not support GPUVM on all
engines (e.g., MM) to make sure we have enough space for all
gtt buffers in physical mode. Change the default size based on
the asic type.
Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
For virtual display, it uses software timer to emulate the vsync interrupt,
it doesn't have high precision, so doesn't support disable vblank immediately.
BUG: SWDEV-129274
Signed-off-by: Emily Deng <Emily.Deng@amd.com>
Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Correctly detect system memory mappings when using CPU and don't use
huge pages for them.
Avoid incorrectly translating a physical page table GPU address when
splitting a huge page while mapping system memory.
Signed-off-by: Felix Kuehling <Felix.Kuehling@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
This is the amdkfd pull request for 4.14 merge window.
AMD has started cleaning the pipe and sending patches from their internal
development to the upstream community.
The plan as I understand it is to first get all the non-dGPU patches to
upstream and then move to upstream dGPU support.
The patches here are relevant only for Kaveri and Carrizo.
The following is a summary of the changes:
- Add new IOCTL to set a Scratch memory VA
- Update PM4 headers for new firmware that support scratch memory
- Support image tiling mode
- Remove all uses of BUG_ON
- Various Bug fixes and coding style fixes
* tag 'drm-amdkfd-next-2017-08-18' of git://people.freedesktop.org/~gabbayo/linux: (24 commits)
drm/amdkfd: Implement image tiling mode support v2
drm/amdgpu: Add kgd kfd interface get_tile_config() v2
drm/amdkfd: Adding new IOCTL for scratch memory v2
drm/amdgpu: Add kgd/kfd interface to support scratch memory v2
drm/amdgpu: Program SH_STATIC_MEM_CONFIG globally, not per-VMID
drm/amd: Update MEC HQD loading code for KFD
drm/amdgpu: Disable GFX PG on CZ
drm/amdkfd: Update PM4 packet headers
drm/amdkfd: Clamp EOP queue size correctly on Gfx8
drm/amdkfd: Add more error printing to help bringup v2
drm/amdkfd: Handle remaining BUG_ONs more gracefully v2
drm/amdkfd: Allocate gtt_sa_bitmap in long units
drm/amdkfd: Fix doorbell initialization and finalization
drm/amdkfd: Remove BUG_ONs for NULL pointer arguments
drm/amdkfd: Remove usage of alloc(sizeof(struct...
drm/amdkfd: Fix goto usage v2
drm/amdkfd: Change x==NULL/false references to !x
drm/amdkfd: Consolidate and clean up log commands
drm/amdkfd: Clean up KFD style errors and warnings v2
drm/amdgpu: Remove hard-coded assumptions about compute pipes
...
mmVGT_INDEX_TYPE has no default value, need to make sure
it's initialized when gfx is initialized.
Signed-off-by: Ken Wang <Ken.Wang@amd.com>
Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Leo Liu <leo.liu@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Allow overrides on the command line.
v2: agd: sqaush in spelling fix and bogus default value warning
Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Roger He <Hongbo.He@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
adds fragment_size in the vm_manager structure and
implements hardware setup for it.
Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Roger He <Hongbo.He@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
That better describes what happens here with the BO.
Signed-off-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Split that into vm_bo_base and bo_va to allow other uses as well.
Signed-off-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Just add the flags to the addr field as well.
v2: add some more comments that the flag is for huge pages.
Signed-off-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
We now properly kmap all BOs after validation.
Signed-off-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Move the CSA bo_va from the VM to the fpriv structure.
Signed-off-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
The shadow handling isn't implemented completely for userspace BOs and
the kernel sets the VRAM_CONTIGUOUS as necessary.
Signed-off-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
update the list first to avoid redundant checks.
Signed-off-by: Chunming Zhou <David1.Zhou@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Cc: stable@vger.kernel.org
Looks like a better place for this.
v2: use atomic64_t members instead
Signed-off-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
It doesn't make much sense to count those numbers twice.
v2: use and atomic64_t instead
Signed-off-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Instead of the separate switch/case in the calling function.
Signed-off-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
The BO manager has its own lock and doesn't use the lru_lock.
Signed-off-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Provide the drm printer directly instead of just the callback.
Signed-off-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
This helps map DMA addresses back to physical addresses.
Signed-off-by: Tom St Denis <tom.stdenis@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
(v2): Added tracepoints for USERPTR, SG mappings, and
SWIOTBL mappings. Reformatted trace call perform
PCI decoding internal to the trace.
(v3): Add unmap tracepoints as well
(v4): Move traces into separate functions
Those values weren't correct. This should result in quite some speedup.
Signed-off-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com>
Acked-by: Chunming Zhou <david1.zhou@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
No need to do this on every CS.
v2: remove all other bind, reorder code
Signed-off-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com>
Acked-by: Chunming Zhou <david1.zhou@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
This should save us a bunch of command submission overhead.
v2: move the LRU move to the right place to avoid the move for the root BO
and handle the shadow BOs as well. This turned out to be a bug fix because
the move needs to happen before the kmap.
Signed-off-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com>
Acked-by: Chunming Zhou <david1.zhou@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
More features for 4.14. Nothing too major here. I have a few more additional
patches for large page support in vega10 among other things, but they require
some resevation object patches from drm-misc-next, so I'll send that request
once you've pulled the latest drm-misc-next. Highlights:
- Fixes for ACP audio on stoney
- SR-IOV fixes for vega10
- various powerplay fixes
- lots of code clean up
* 'drm-next-4.14' of git://people.freedesktop.org/~agd5f/linux: (62 commits)
drm/amdgpu/gfx7: fix function name
drm/amd/amdgpu: Disabling Power Gating for Stoney platform
drm/amd/amdgpu: Added a quirk for Stoney platform
drm/amdgpu: jt_size was wrongly counted twice
drm/amdgpu: fix missing endian-safe guard
drm/amdgpu: ignore digest_size when loading sdma fw for raven
drm/amdgpu: Uninitialized variable in amdgpu_ttm_backend_bind()
drm/amd/powerplay: fix coding style in hwmgr.c
drm/amd/powerplay: refine dmesg info under powerplay.
drm/amdgpu: don't finish the ring if not initialized
drm/radeon: Fix preferred typo
drm/amdgpu: Fix preferred typo
drm/radeon: Fix stolen typo
drm/amdgpu: Fix stolen typo
drm/amd/powerplay: fix coccinelle warnings in vega10_hwmgr.c
drm/amdgpu: set gfx_v9_0_ip_funcs as static
drm/radeon: switch to drm_*{get,put} helpers
drm/amdgpu: switch to drm_*{get,put} helpers
drm/amd/powerplay: add CZ profile support
drm/amd/powerplay: fix PSI not enabled by kmd
...
v2:
* Shortened headline
* Removed write_config_static_mem, it gets initialized by gfx_v?_0_gpu_init
* Renamed alloc_memory_of_scratch to set_scratch_backing_va
* Made set_scratch_backing_va a void function
* Documented set_scratch_backing in kgd_kfd_interface.h
Signed-off-by: Moses Reuben <moses.reuben@amd.com>
Signed-off-by: Felix Kuehling <Felix.Kuehling@amd.com>
Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com>
This register only has a single instance in the hardware. Its value
applies to all VMIDS.
Signed-off-by: Felix Kuehling <Felix.Kuehling@amd.com>
Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com>
Various bug fixes and improvements that accumulated over the last two
years.
Signed-off-by: Felix Kuehling <Felix.Kuehling@amd.com>
Acked-by: Oded Gabbay <oded.gabbay@gmail.com>
Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com>
It's causing problems with user mode queues and the HIQ, and can
lead to hard hangs during boot after programming RLC_CP_SCHEDULERS.
Signed-off-by: Felix Kuehling <Felix.Kuehling@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com>
Using checkpatch.pl -f <file> showed a number of style issues. This
patch addresses as many of them as possible. Some long lines have been
left for readability, but attempts to minimize them have been made.
v2: Broke long lines in gfx_v7 get_fw_version
Signed-off-by: Kent Russell <kent.russell@amd.com>
Signed-off-by: Felix Kuehling <Felix.Kuehling@amd.com>
Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com>
Remove hard-coded assumption that the first compute pipe is
reserved for amdgpu. Pipe 0 actually means pipe 0 now.
Signed-off-by: Felix Kuehling <Felix.Kuehling@amd.com>
Reviewed-by: Oded Gabbay <oded.gabbay@gmail.com>
Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com>
Was using the wrong prefix (gmc rather than gfx). The function
is related to the gfx hw, not gmc. This also makes it consistent
with the naming in gfx8.
Reviewed-by: Harry Wentland <harry.wentland@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Power Gating is disabled in Stoney platform.
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Vijendar Mukunda <Vijendar.Mukunda@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Added DW_I2S_QUIRK_16BIT_IDX_OVERRIDE quirk for Stoney.
Supported format and bus width for I2S controller read
from I2S Component Parameter registers.
These are ready only registers.
For Stoney, I2S Component Parameter registers are programmed
to support 32 bit format and 4 bytes bus width only.
By setting this quirk,It will override 32 bit format with
16 bit format and 2 bytes as bus width for Stoney.
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Vijendar Mukunda <Vijendar.Mukunda@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Evan Quan <evan.quan@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
digest_size has been retired from sdma v4 fw
Signed-off-by: Hawking Zhang <Hawking.Zhang@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Reviewed-by: Evan Quan <evan.quan@amd.com>
Reviewed-by: Junwei Zhang <Jerry.Zhang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
My static checker complains that it's possible for "r" to be
uninitialized. It used to be set to zero so this returns it to the old
behavior.
Fixes: 98a7f88ce9 ("drm/amdgpu: bind BOs with GTT space allocated directly v2")
Reviewed-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>
Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
If a ring is not initialized, it also should not be finished.
For example, in Vega10's SR-IOV environment, UVD's decode ring is not
initialized, but will be finnished in amdgpu_uvd_sw_fini, because UVD
driver put all the uvd decode ring's finish operation into
amdgpu_uvd_sw_fini function, while not uvd_vXXX_0_sw_fini. This will
lead to amdgpu module unloading failure.
Signed-off-by: Trigger Huang <trigger.huang@amd.com>
Reviewed-by: Monk Liu <monk.liu@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Change "prefered" to "preferred"
Signed-off-by: Kent Russell <kent.russell@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Change "stollen" to "stolen"
Signed-off-by: Kent Russell <kent.russell@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
We won't use this member in other files, so set it static.
Signed-off-by: Huang Rui <ray.huang@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
drm_*_reference() and drm_*_unreference() functions are just
compatibility alias for drm_*_get() and drm_*_put() and should not be
used by new code. So convert all users of compatibility functions to use
the new APIs.
Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Cihangir Akturk <cakturk@gmail.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
May waste a bit of memory, but simplifies the interface
significantly.
v2: convert internal accounting to use 256bit slots
Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
There are no external users of function amdgpu_atif_handler so it can
be static.
Signed-off-by: Jean Delvare <jdelvare@suse.de>
Cc: Alex Deucher <alexander.deucher@amd.com>
Cc: "Christian König" <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Include a missing header to get rid of the following warning:
drivers/gpu/drm/amd/amdgpu/amdgpu_pm.c:65:6: warning: no previous prototype for ‘amdgpu_pm_acpi_event_handler’ [-Wmissing-prototypes]
void amdgpu_pm_acpi_event_handler(struct amdgpu_device *adev)
^
Signed-off-by: Jean Delvare <jdelvare@suse.de>
Cc: Alex Deucher <alexander.deucher@amd.com>
Cc: "Christian König" <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Include a missing header to get rid of the following warning:
drivers/gpu/drm/amd/amdgpu/dce_v6_0.c:521:6: warning: no previous prototype for 'dce_v6_0_disable_dce' [-Wmissing-prototypes]
void dce_v6_0_disable_dce(struct amdgpu_device *adev)
^
Signed-off-by: Jean Delvare <jdelvare@suse.de>
Cc: Alex Deucher <alexander.deucher@amd.com>
Cc: "Christian König" <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
As I was staring at the si_init_golden_registers code, I noticed that
the Pitcairn initialization silently falls through the Cape Verde
initialization, and the Oland initialization falls through the Hainan
initialization. However there is no comment stating that this is
intentional, and the radeon driver doesn't have any such fallthrough,
so I suspect this is not supposed to happen.
Signed-off-by: Jean Delvare <jdelvare@suse.de>
Fixes: 62a3755341 ("drm/amdgpu: add si implementation v10")
Cc: Ken Wang <Qingqing.Wang@amd.com>
Cc: Alex Deucher <alexander.deucher@amd.com>
Cc: "Marek Olšák" <maraeo@gmail.com>
Cc: "Christian König" <christian.koenig@amd.com>
Cc: Flora Cui <Flora.Cui@amd.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Cc: stable@vger.kernel.org
Move it up before ring enablement with all of the other
engine setup and explicitly disable it for bare metal.
Cc: Frank Min <Frank.Min@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
We already allocate this as part of the ring structure,
use that instead.
Cc: Frank Min <Frank.Min@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
The wb buffer is in system memory, not vram so the flush
is useless.
Cc: Frank Min <Frank.Min@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
No functional change until wptr polling uses this
location (future patch).
v2: use WRITE_ONCE
Cc: Frank Min <Frank.Min@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
kfd2kgd is device-specific, so it should not be a global variable.
Merge amdgpu_amdkfd_load_interface and amdgpu_amdkfd_device_probe
so that it's only needed as a local variable in one function.
Signed-off-by: Felix Kuehling <Felix.Kuehling@amd.com>
Reviewed-by: Oded Gabbay <oded.gabbay@gmail.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
nbio registers are not used in this file.
Reviewed-by: Christian König <christian.koenig@amd.com>
Cc: Frank Min <Frank.Min@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
The gfxhub and mmhub code are now helpers for gmc rather
than standalone IPs. When that changes these were left
over. Remove them.
Acked-by: Huang Rui <ray.huang@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Use a lower case b to be consistent with the other wb functions.
Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
We achieved that by setting S(SYSTEM) and P(PDE as PTE) bit to 1 for
PDEs and setting S bit to 1 for PTEs when the corresponding addresses
are not occupied by gpu driver allocated buffers.
Signed-off-by: Yong Zhao <Yong.Zhao@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
The parameter init_value contains the value to which we initialized
VRAM bo when AMDGPU_GEM_CREATE_VRAM_CLEARED flag is set.
Signed-off-by: Yong Zhao <Yong.Zhao@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Saves us even more loc.
Signed-off-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Saves us quite a bunch of loc.
Signed-off-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Same as amdgpu_bo_create_kernel, but keeps the BO reserved.
Signed-off-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Make allocating the new BO optional.
Signed-off-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Save some memory because only one of those is used at all times.
Signed-off-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>