Some registers have separate PF and VF copies, so we can safely
access them through MMIO.
In addition, we are sometimes forbidden from using KIQ to access
registers, for example in interrupt context.
We need a macro that always bypasses KIQ for register access.
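The idea can be sketched in a small userspace model. This is only an illustration, not the driver code: struct fake_dev, dev_rreg(), REGS_NO_KIQ and the two-argument macros are hypothetical names, and the "KIQ path" is reduced to a counter.

```c
#include <stdbool.h>
#include <stdint.h>

#define NUM_REGS    16u
#define REGS_NO_KIQ (1u << 0)   /* hypothetical flag: never route through KIQ */

/* Hypothetical stand-in for the device; not the real amdgpu structure. */
struct fake_dev {
    bool is_vf;                 /* running as an SR-IOV virtual function */
    uint32_t regs[NUM_REGS];    /* stand-in for the MMIO register file */
    unsigned int kiq_reads;     /* reads that took the modelled KIQ path */
};

static uint32_t dev_rreg(struct fake_dev *dev, uint32_t reg, uint32_t flags)
{
    if (dev->is_vf && !(flags & REGS_NO_KIQ)) {
        /* Model of a read submitted through the KIQ ring. */
        dev->kiq_reads++;
        return dev->regs[reg % NUM_REGS];
    }
    /* Direct MMIO read, usable where KIQ is not allowed. */
    return dev->regs[reg % NUM_REGS];
}

/* Default accessor: may use KIQ when running as a VF. */
#define RREG32(dev, reg)        dev_rreg((dev), (reg), 0)
/* Accessor that always bypasses KIQ. */
#define RREG32_NO_KIQ(dev, reg) dev_rreg((dev), (reg), REGS_NO_KIQ)
```

A caller in interrupt context would use RREG32_NO_KIQ() in this model, while ordinary VF code keeps using RREG32().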
Signed-off-by: Monk Liu <Monk.Liu@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Reviewed-by: Xiangliang Yu <Xiangliang.Yu@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Change-Id: Ica8f86577a50d817119de4b4fb95068dc72652a9
Signed-off-by: Monk Liu <Monk.Liu@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
By using ttm_bo_init_reserved instead of the manual initialization of
the reservation object, the reservation lock will be properly unlocked
and destroyed when the TTM BO initialization fails.
Actual deadlocks caused by the missing unlock should have been fixed
by "drm/ttm: never add BO that failed to validate to the LRU list",
superseding the flawed fix in commit 38fc4856ad ("drm/amdgpu: fix
a potential deadlock in amdgpu_bo_create_restricted()").
This change fixes remaining recursive locking errors that can be seen
with lock debugging enabled, and avoids the error of freeing a locked
mutex.
As an additional minor bonus, buffers created with resv == NULL and
the AMDGPU_GEM_CREATE_VRAM_CLEARED flag are now only added to the
global LRU list after the fill commands have been issued.
v2: use amdgpu_bo_unreserve instead of ttm_bo_unreserve
Fixes: 12a852219583 ("drm/amdgpu: improve AMDGPU_GEM_CREATE_VRAM_CLEARED handling (v2)")
Signed-off-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
This variant of ttm_bo_init returns the validated buffer object with
the reservation lock held when resv == NULL. This is convenient for
callers that want to use the BO immediately, e.g. for initializing its
contents.
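The calling convention can be illustrated with a userspace model. This is a sketch under assumptions, not the TTM implementation: struct fake_bo and both functions are hypothetical, a pthread mutex stands in for the reservation lock, validation is simulated by a flag, and the resv != NULL case is not modelled.

```c
#include <pthread.h>
#include <stdbool.h>

/* Hypothetical stand-in for a buffer object. */
struct fake_bo {
    pthread_mutex_t resv;   /* stand-in for the reservation lock */
    int contents;
};

/*
 * On success, returns 0 with bo->resv held by the caller.
 * On failure, returns -1 with the lock already unlocked and destroyed.
 * validate_ok simulates whether validation succeeds.
 */
static int fake_bo_init_reserved(struct fake_bo *bo, bool validate_ok)
{
    if (pthread_mutex_init(&bo->resv, NULL) != 0)
        return -1;
    pthread_mutex_lock(&bo->resv);
    if (!validate_ok) {
        pthread_mutex_unlock(&bo->resv);
        pthread_mutex_destroy(&bo->resv);
        return -1;
    }
    bo->contents = 0;
    return 0;
}

/* Caller that uses the object immediately, while it is still reserved. */
static int fake_bo_create_filled(struct fake_bo *bo, int fill, bool validate_ok)
{
    if (fake_bo_init_reserved(bo, validate_ok) != 0)
        return -1;              /* nothing left to unlock on this path */
    bo->contents = fill;        /* initialize contents under the lock */
    pthread_mutex_unlock(&bo->resv);
    return 0;
}
```

The point of the convention is that the error path is handled entirely inside the init function, so the caller never has to unlock or free a lock it may not own.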
Signed-off-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
As the comment says: callers of ttm_bo_init cannot rely on having the
only reference to the BO when the function returns successfully.
Signed-off-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
This reverts commit 38fc4856ad, which
introduces a use-after-free.
The underlying bug should be properly fixed with "drm/ttm: never add BO
that failed to validate to the LRU list".
Cc: zhoucm1 <david1.zhou@amd.com>
Signed-off-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
Tested-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Fixes a potential race condition in amdgpu that looks as follows:
Task 1: attempt ttm_bo_init, but ttm_bo_validate fails
Task 1: add BO to global list anyway
Task 2: grabs hold of the BO, waits on its reservation lock
Task 1: releases its reference of the BO; never gives up the
reservation lock
The patch "drm/amdgpu: fix a potential deadlock in
amdgpu_bo_create_restricted()" attempts to fix that by releasing
the reservation lock in amdgpu code; unfortunately, it introduces
a use-after-free when this race _doesn't_ happen.
This patch should fix the race properly by never adding the BO
to the global list in the first place.
Cc: zhoucm1 <david1.zhou@amd.com>
Signed-off-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
Tested-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
This includes shader/memory clocks, temperature, GPU load, etc.
v2: - add sub-queries for AMDGPU_INFO_GPU_SENSOR_*
- do not break the ABI
v3: - return -ENOENT when amdgpu_dpm == 0
- expose more sensor queries
v4: - s/GPU_POWER/GPU_AVG_POWER/
- improve VDDNB/VDDGFX query description
- fix amdgpu_dpm check
v5: - agd: fix warning
v6: - agd: bump version
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
read_sensor() has been recently implemented for dpm based boards
which means amdgpu_sensors can now be exposed.
v2: - make sure read_sensor is not NULL on dpm chips
- keep sanity check for powerplay chips
v3: - make sure amdgpu_dpm != 0
Cc: Tom St Denis <tom.stdenis@amd.com>
Reviewed-by: Tom St Denis <tom.stdenis@amd.com>
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Add the GPU temperature, the shader clock and, where available, the
memory clock (as well as the GPU load on CI). The main goal is
to expose this information to userspace, as radeon does.
v2: - add AMDGPU_PP_SENSOR_GPU_LOAD on CI
- update the commit description
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Set alignment mode to unaligned on CIK to align with amdgpu. This is
needed for unaligned loads to work properly in mesa. The current setting
requires dword alignment.
Signed-off-by: Marek Olšák <marek.olsak@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
v2: new approach fixing this by registering a fence callback for
all users of the VM on teardown
v3: agd: rebase
Signed-off-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Don't assume kmalloc will always succeed.
v2: agd: rebase
Signed-off-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
When two VMs stop using PRT support at the same time we might
not disable it in the right order otherwise.
v2: agd: rebase
Signed-off-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Xie <AlexBin.Xie@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Those should be 64bit, even on a 32bit system.
Signed-off-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Reviewed-by: Junwei Zhang <Jerry.Zhang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
This update allows sensors to return more than 1 value and
indicates to the caller how many bytes are written.
The debugfs interface has been updated to handle reading all
of the values. Simply seek to the enum value (multiplied
by 4) and then read as many bytes as the sensor provides.
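A userspace reader following that description could look roughly like this. It is a sketch only: read_sensor_values() is a hypothetical helper, the caller is assumed to have opened the sensors file already, and the caller must know how many 32-bit values the chosen sensor provides.

```c
#include <stdint.h>
#include <sys/types.h>
#include <unistd.h>

/*
 * Hypothetical helper: read nvals 32-bit values of sensor sensor_idx
 * from an already opened file descriptor.
 * Returns 0 on success, -1 on a seek error, read error or short read.
 */
static int read_sensor_values(int fd, unsigned int sensor_idx,
                              uint32_t *vals, size_t nvals)
{
    size_t want = nvals * sizeof(vals[0]);
    ssize_t got;

    /* Seek to the enum value multiplied by 4. */
    if (lseek(fd, (off_t)sensor_idx * 4, SEEK_SET) == (off_t)-1)
        return -1;

    /* Read exactly as many bytes as the sensor is expected to provide. */
    got = read(fd, vals, want);
    if (got < 0 || (size_t)got != want)
        return -1;
    return 0;
}
```

Unbuffered lseek()/read() is used on purpose, so that the request size seen by the file is exactly the size the caller asked for.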
(v2): Don't set size to 4 before reading GPU_POWER
(v3): agd: rebase
Signed-off-by: Tom St Denis <tom.stdenis@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Enable/disable the handling globally for now and
print a warning when we enable it for the first time.
v2: set correct register
Signed-off-by: Junwei Zhang <Jerry.Zhang@amd.com>
Signed-off-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Enable/disable the handling globally for now and
print a warning when we enable it for the first time.
v2: set correct register
Signed-off-by: Junwei Zhang <Jerry.Zhang@amd.com>
Signed-off-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Enable/disable the handling globally for now and
print a warning when we enable it for the first time.
v2: write to the correct register, adjust bits to that hw generation
v3: fix compilation, add the missing register bit definitions
Signed-off-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Up to GFX8 we can only enable PRT support globally, but with the next hardware
generation we can do this on a per-page basis.
Keep the interface consistent by adding PRT mappings and enable
support globally on current hardware when the first mapping is made.
v2: disable PRT support delayed and on all error paths
v3: PRT and other permissions are mutually exclusive,
PRT mappings don't need a BO.
v4: update PRT mappings during CS as well, make va_flags 64bit
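The "enable globally on the first mapping" behaviour amounts to reference counting. A minimal userspace sketch, assuming hypothetical names (struct prt_dev, prt_map_add(), prt_map_remove()) and a boolean in place of the hardware register; it disables immediately on the last removal and does not model the delayed disable from v2:

```c
#include <pthread.h>
#include <stdbool.h>

/* Hypothetical model of a device with a single global PRT switch. */
struct prt_dev {
    pthread_mutex_t lock;       /* protects the two fields below */
    unsigned int num_prt_maps;  /* live PRT mappings across all VMs */
    bool prt_enabled;           /* stand-in for the hardware register */
};

/* Called when a PRT mapping is created: the first one enables PRT. */
static void prt_map_add(struct prt_dev *dev)
{
    pthread_mutex_lock(&dev->lock);
    if (dev->num_prt_maps++ == 0)
        dev->prt_enabled = true;
    pthread_mutex_unlock(&dev->lock);
}

/* Called when a PRT mapping goes away: the last one disables PRT. */
static void prt_map_remove(struct prt_dev *dev)
{
    pthread_mutex_lock(&dev->lock);
    if (dev->num_prt_maps > 0 && --dev->num_prt_maps == 0)
        dev->prt_enabled = false;
    pthread_mutex_unlock(&dev->lock);
}
```

Updating the counter and the switch under one lock keeps the enable and disable decisions ordered even when two users drop their mappings at the same time.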
Signed-off-by: Junwei Zhang <Jerry.Zhang@amd.com>
Signed-off-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Future hardware generations can handle PRT flags on a per page basis,
but current hardware can only turn it on globally.
Add the basic handling for both, a global callback to enable/disable
triggered by setting a per mapping flag.
v2: agd: rebase fixes
Signed-off-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
For PRT support we need mappings which aren't backed by any memory.
v2: fix parameter checking
Signed-off-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
No suspend is invoked, so after a VF FLR by the host we just
call hw_init to reinitialize the IPs.
Signed-off-by: Monk Liu <Monk.Liu@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Eric Huang <JinHuiEric.Huang@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
The additional outputs are:
vddc power in watts;
vddci power in watts;
max gpu power in watts;
average gpu power in watts.
Signed-off-by: Eric Huang <JinHuiEric.Huang@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
As well as fix print format for uint32_t type.
Signed-off-by: Eric Huang <JinHuiEric.Huang@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Eric Huang <JinHuiEric.Huang@amd.com>
Acked-by: Rex Zhu <Rex.Zhu@amd.com>
Acked-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Eric Huang <JinHuiEric.Huang@amd.com>
Acked-by: Rex Zhu <Rex.Zhu@amd.com>
Acked-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Eric Huang <JinHuiEric.Huang@amd.com>
Acked-by: Rex Zhu <Rex.Zhu@amd.com>
Acked-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Eric Huang <JinHuiEric.Huang@amd.com>
Acked-by: Rex Zhu <Rex.Zhu@amd.com>
Acked-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Eric Huang <JinHuiEric.Huang@amd.com>
Acked-by: Rex Zhu <Rex.Zhu@amd.com>
Acked-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Eric Huang <JinHuiEric.Huang@amd.com>
Acked-by: Rex Zhu <Rex.Zhu@amd.com>
Acked-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Add the sysfs entries pp_gfx_power_profile and
pp_compute_power_profile, which give the user a way to set a
power profile through the parameters minimum sclk, minimum mclk,
activity threshold, up hysteresis and down hysteresis. This is
only possible when power_dpm_force_performance_level is at its
default value "auto". The entries are readable and writable. Example:
  echo 500 800 20 0 5 > /sys/class/drm/card0/device/pp_*_power_profile
  cat /sys/class/drm/card0/device/pp_*_power_profile
  500 800 20 0 5
Note: the first parameter is sclk in MHz, the second is mclk in MHz,
the third is the activity threshold in percent, the fourth is the up
hysteresis in ms and the fifth is the down hysteresis in ms.
  echo set > /sys/class/drm/card0/device/pp_*_power_profile
sets the power profile state if it exists.
  echo reset > /sys/class/drm/card0/device/pp_*_power_profile
restores the default state and clears the previous setting.
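The accepted inputs can be summarized by a small parser. This is an illustrative userspace sketch, not the driver's store routine: struct power_profile, enum profile_cmd and parse_profile_input() are hypothetical names, and no range checking is done on the values.

```c
#include <stdio.h>
#include <string.h>

/* Hypothetical container for the five parameters, in the documented order. */
struct power_profile {
    unsigned int min_sclk_mhz;
    unsigned int min_mclk_mhz;
    unsigned int activity_pct;
    unsigned int up_hyst_ms;
    unsigned int down_hyst_ms;
};

enum profile_cmd {
    PROFILE_CMD_INVALID,
    PROFILE_CMD_VALUES,   /* five numbers were given */
    PROFILE_CMD_SET,      /* the word "set" */
    PROFILE_CMD_RESET     /* the word "reset" */
};

static enum profile_cmd parse_profile_input(const char *buf,
                                            struct power_profile *p)
{
    char word[8];
    char extra;
    unsigned int v[5];

    /* A single keyword with nothing after it. */
    if (sscanf(buf, "%7s %c", word, &extra) == 1) {
        if (strcmp(word, "set") == 0)
            return PROFILE_CMD_SET;
        if (strcmp(word, "reset") == 0)
            return PROFILE_CMD_RESET;
    }

    /* Exactly five numbers; a sixth token makes sscanf return 6. */
    if (sscanf(buf, "%u %u %u %u %u %c",
               &v[0], &v[1], &v[2], &v[3], &v[4], &extra) == 5) {
        p->min_sclk_mhz = v[0];
        p->min_mclk_mhz = v[1];
        p->activity_pct = v[2];
        p->up_hyst_ms = v[3];
        p->down_hyst_ms = v[4];
        return PROFILE_CMD_VALUES;
    }
    return PROFILE_CMD_INVALID;
}
```

Parsing "500 800 20 0 5" fills the structure in the order given in the note, while "set" and "reset" are reported as commands without touching it.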
Signed-off-by: Eric Huang <JinHuiEric.Huang@amd.com>
Acked-by: Rex Zhu <Rex.Zhu@amd.com>
Acked-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Monk Liu <Monk.Liu@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Monk Liu <Monk.Liu@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
This lock is used for sriov_gpu_reset: only the holder of this mutex
may run sriov_gpu_reset.
Several sources can trigger a GPU reset under SRIOV:
1) a submission times out and a reset is triggered voluntarily
2) the engine detects an invalid instruction and a reset is triggered voluntarily
3) the hypervisor finds a world switch hang, triggers an FLR and notifies the
guest to do a reset.
All of them need to be handled, so we need a mutex to protect the
consistency of the reset routine.
Signed-off-by: Monk Liu <Monk.Liu@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Monk Liu <Monk.Liu@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Implement SRIOV gpu_reset for future use.
It will be called from:
1) job timeout
2) privileged access or instruction error interrupt
3) hypervisor detecting a VF hang
v2: agd: rebase on upstream
Signed-off-by: Monk Liu <Monk.Liu@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
The sw part is only invoked once, during sw_init.
The hw part is invoked on first driver load and again on every resume.
Therefore we cannot allocate the MQD in hw_init/resume; the MQD is
only allocated in the sw_init routine,
and the hw_init routine only kmaps and programs it.
Signed-off-by: Monk Liu <Monk.Liu@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
We ultimately want to re-use this for bare metal,
so no need to have vf checks in the KIQ code itself
since kiq itself is currently only used in VF cases.
Signed-off-by: Monk Liu <Monk.Liu@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Monk Liu <Monk.Liu@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
This is a fix for SRIOV:
MQD soft init/fini is now invoked by sw_init to
allocate the BO for the compute MQD, instead of the
original scheme in which hw_init allocates the MQD.
If hw_init allocates the MQD, then resume allocates
it again, which leads to a memory leak after the
driver has recovered from a hang.
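The split can be modelled in userspace as follows. This is a sketch under assumptions: struct fake_ring and the three functions are hypothetical, malloc() stands in for the BO allocation and memset() for programming the MQD.

```c
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-in for a compute ring and its MQD buffer. */
struct fake_ring {
    void *mqd;
    size_t mqd_size;
    unsigned int allocs;    /* how often the MQD buffer was allocated */
};

/* Runs once per driver load: the only place that allocates. */
static int fake_ring_sw_init(struct fake_ring *ring, size_t size)
{
    ring->mqd = malloc(size);
    if (!ring->mqd)
        return -1;
    ring->mqd_size = size;
    ring->allocs++;
    return 0;
}

/* Runs on load, on every resume and after a reset: no allocation here. */
static void fake_ring_hw_init(struct fake_ring *ring)
{
    memset(ring->mqd, 0, ring->mqd_size);   /* stand-in for programming the MQD */
}

/* Runs once on unload and releases the single allocation. */
static void fake_ring_sw_fini(struct fake_ring *ring)
{
    free(ring->mqd);
    ring->mqd = NULL;
    ring->mqd_size = 0;
}
```

However many times the hardware path runs in this model, the allocation count stays at one, which is the property the fix is after.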
Signed-off-by: Monk Liu <Monk.Liu@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Introduce a new mqd member in the ring for later use.
We need to keep a clean copy of the MQD in order to
recover compute rings from a hang.
Signed-off-by: Monk Liu <Monk.Liu@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
The CG and PG functions change the engine's clock and power gating
state, which is not appropriate for a VF device, because a single VF
does not see the whole picture of the engine's overall workload.
Signed-off-by: Monk Liu <Monk.Liu@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
The CPU is inefficient at clearing the framebuffer, especially under
virtualization. Loading the driver then takes so long that the
mailbox handshake times out.
Signed-off-by: Pixel Ding <Pixel.Ding@amd.com>
Reviewed-by: Edward O'Callaghan <funfunctor@folklore1984.net>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
ib_pool init must happen prior to fbdev_init, otherwise
amdgpu_sa_bo_new reports an error
(amdgpu_sa.c:323).
fbdev_init calls ttm_validate, which in turn calls
amdgpu_sa_bo_new.
v2:
move fbdev_init behind the ib test.
Signed-off-by: Monk Liu <Monk.Liu@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
VFs use KIQ to access registers. When a VM fault occurs, the driver
can't get back the fence of the KIQ submission and runs into a CPU
soft lockup.
Signed-off-by: Pixel Ding <Pixel.Ding@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Junwei Zhang <Jerry.Zhang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
When multiple VFs try to enter exclusive mode at the same time, the
looping mechanism does not ensure that each of them gets it, because it
only loops over active VFs; the last one then has to wait for a long
interval.
Signed-off-by: Pixel Ding <Pixel.Ding@amd.com>
Reviewed-by: Xiangliang.Yu <Xiangliang.Yu@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>