Commit Graph

2043 Commits

Author SHA1 Message Date
Monk Liu
9cfb06920e drm/amdgpu: fix memory leak during TDR test(v2)
fix system memory leak

v2:
fix coding style

Signed-off-by: Monk Liu <Monk.Liu@amd.com>
Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2020-02-25 11:01:25 -05:00
Evan Quan
c16904b0f3 drm/amd/powerplay: correct the way for checking SMU_FEATURE_BACO_BIT support
Since 'smu_feature_is_enabled(smu, SMU_FEATURE_BACO_BIT)' will always return
false considering the 'smu_system_features_control(smu, false)' disabled
all SMU features.

Signed-off-by: Evan Quan <evan.quan@amd.com>
Reviewed-by: Kenneth Feng <kenneth.feng@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2020-02-13 16:16:45 -05:00
Kent Russell
00151afc6f drm/powerplay: Ratelimit PP_ASSERT warnings
In certain situations the message could be reported dozens-to-hundreds of
times, based on how often the function is called.
E.g. If MCLK DPM, any calls to get/set MCLK will result in a failure
message, potentially flooding dmesg. Ratelimit the warnings to avoid
this flood.

Signed-off-by: Kent Russell <kent.russell@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2020-02-12 16:04:31 -05:00
Evan Quan
79275af61e drm/amd/powerplay: always refetch the enabled features status on dpm enablement
Otherwise, the cached dpm features status may be inconsistent under some
case(e.g. baco reset of Navi asic).

Signed-off-by: Evan Quan <evan.quan@amd.com>
Acked-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2020-02-12 16:04:18 -05:00
Alex Deucher
d7c7195466 drm/amdgpu/powerplay: fix baco check for vega20
We need to handle the runpm case as well as GPU reset.

Reviewed-by: Evan Quan <evan.quan@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2020-02-11 11:51:58 -05:00
Alex Deucher
5d8b936df2 drm/amdgpu/smu: properly handle runpm/suspend/reset
We need some special handling when using baco vs. other
things.

Reviewed-by: Evan Quan <evan.quan@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2020-02-11 11:51:55 -05:00
Alex Deucher
0a28eee97b drm/amdgpu:/navi10: use the ODCAP enum to index the caps array
Rather than the FEATURE_ID flags.  Avoids a possible reading past
the end of the array.

Reviewed-by: Evan Quan <evan.quan@amd.com>
Reported-by: Aleksandr Mezin <mezin.alexander@gmail.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2020-02-11 11:51:49 -05:00
Alex Deucher
17b9998441 drm/amdgpu: update smu_v11_0_pptable.h
Update to the latest changes.

Reviewed-by: Evan Quan <evan.quan@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2020-02-11 11:51:44 -05:00
Prike Liang
6a52d4641c drm/amd/powerplay: suppress nonsupport profile mode overrun message
SMU12 not support WORKLOAD_DEFAULT_BIT and WORKLOAD_PPLIB_POWER_SAVING_BIT.

Signed-off-by: Prike Liang <Prike.Liang@amd.com>
Reviewed-by: Evan Quan <evan.quan@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2020-02-11 11:51:22 -05:00
Evan Quan
54c96f8679 drm/amd/powerplay: update smu11_driver_if_navi10.h
To pair the latest SMU firmwares.

Signed-off-by: Evan Quan <evan.quan@amd.com>
Reviewed-by: Xiaojie Yuan <xiaojie.yuan@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2020-02-11 11:51:14 -05:00
Jack Zhang
4bcbc25ce7 drm/amdgpu/sriov set driver_table address in VF
With the recent patch to unify VRAM address for driver
table(a83f82e). VF cannot dump table info any more because
SMU_MSG_SetDriverDramAddrHigh/Low were deleted in the
function of smu_update_table.

Therefore, VF also needs to set driver_table address
in smu_hw_init to fix this regression issue.

Signed-off-by: Jack Zhang <Jack.Zhang1@amd.com>
Acked-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2020-02-11 11:48:58 -05:00
Evan Quan
4a6f8f01ef drm/amd/powerplay: handle features disablement for baco reset in SMU FW
SMU FW will handle the features disablement for baco reset
on Arcturus.

Signed-off-by: Evan Quan <evan.quan@amd.com>
Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2020-02-07 11:44:43 -05:00
Jack Zhang
86b93fd62d drm/amdgpu/sriov Don't send msg when smu suspend
For sriov and pp_onevf_mode, do not send message to set smu
status, because smu doesn't support these messages under VF.

Besides, it should skip smu_suspend when pp_onevf_mode is disabled.

Signed-off-by: Jack Zhang <Jack.Zhang1@amd.com>
Acked-by: Alex Deucher <alexander.deucher@amd.com>
Acked-by: Evan Quan <evan.quan@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2020-02-07 11:44:24 -05:00
Alex Deucher
7b913a76a6 drm/amdgpu: update default voltage for boot od table for navi1x
It needed to be updated as well so it will show the proper values
if you reset to the defaults.

Bug: https://gitlab.freedesktop.org/drm/amd/issues/1020
Reviewed-by: Evan Quan <evan.quan@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2020-02-04 23:30:38 -05:00
Alex Deucher
1064ad4aee drm/amdgpu/smu10: fix smu10_get_clock_by_type_with_voltage
Cull out 0 clocks to avoid a warning in DC.

Bug: https://gitlab.freedesktop.org/drm/amd/issues/963
Reviewed-by: Evan Quan <evan.quan@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2020-02-04 10:37:52 -05:00
Alex Deucher
4d0a72b660 drm/amdgpu/smu10: fix smu10_get_clock_by_type_with_latency
Only send non-0 clocks to DC for validation.  This mirrors
what the windows driver does.

Bug: https://gitlab.freedesktop.org/drm/amd/issues/963
Reviewed-by: Evan Quan <evan.quan@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2020-02-04 10:37:52 -05:00
Alex Deucher
0531aa6eb3 drm/amdgpu: fetch default VDDC curve voltages (v2)
Ask the SMU for the default VDDC curve voltage values.  This
properly reports the VDDC values in the OD interface.

v2: only update if the original values are 0

Bug: https://gitlab.freedesktop.org/drm/amd/issues/1020
Reviewed-by: Evan Quan <evan.quan@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Cc: stable@vger.kernel.org # 5.5.x
2020-02-04 10:37:52 -05:00
Matt Coffin
93c5f1f66c drm/amdgpu/smu_v11_0: Correct behavior of restoring default tables (v2)
Previously, the syfs functionality for restoring the default powerplay
table was sourcing it's information from the currently-staged powerplay
table.

This patch adds a step to cache the first overdrive table that we see on
boot, so that it can be used later to "restore" the powerplay table

v2: sqaush my original with Matt's fix

Bug: https://gitlab.freedesktop.org/drm/amd/issues/1020
Signed-off-by: Matt Coffin <mcoffin13@gmail.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Cc: stable@vger.kernel.org # 5.5.x
2020-02-04 10:37:51 -05:00
Alex Deucher
ee23a518fd drm/amdgpu/navi10: add OD_RANGE for navi overclocking
So users can see the range of valid values.

Bug: https://gitlab.freedesktop.org/drm/amd/issues/1020
Reviewed-by: Evan Quan <evan.quan@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Cc: stable@vger.kernel.org # 5.5.x
2020-02-04 10:37:51 -05:00
Alex Deucher
45826e9c4e drm/amdgpu/navi: fix index for OD MCLK
You can only adjust the max mclk, not the min.

Bug: https://gitlab.freedesktop.org/drm/amd/issues/1020
Reviewed-by: Evan Quan <evan.quan@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Cc: stable@vger.kernel.org # 5.5.x
2020-02-04 10:37:51 -05:00
Evan Quan
1cf8c930b3 drm/amd/powerplay: fix navi10 system intermittent reboot issue V2
This workaround is needed only for Navi10 12 Gbps SKUs.

V2: added SMU firmware version guard

Signed-off-by: Evan Quan <evan.quan@amd.com>
Reviewed-by: Feifei Xu <Feifei.Xu@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Cc: stable@vger.kernel.org
2020-02-04 10:37:08 -05:00
Alex Deucher
e0d5322c29 drm/amdgpu/navi10: add mclk to navi10_get_clock_by_type_with_latency
Doesn't seem to be used, but add it just in case.

Reviewed-by: Matt Coffin <mcoffin13@gmail.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2020-01-30 17:15:43 -05:00
Colin Ian King
269a0bf79b drm/amd/powerplay: fix spelling mistake "Attemp" -> "Attempt"
There are several spelling mistakes in PP_ASSERT_WITH_CODE messages.
Fix these.

Signed-off-by: Colin Ian King <colin.king@canonical.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2020-01-27 16:46:45 -05:00
zhengbin
a16afcdd8c drm/amd/powerplay: use true, false for bool variable in smu7_hwmgr.c
Fixes coccicheck warning:

drivers/gpu/drm/amd/powerplay/hwmgr/smu7_hwmgr.c:723:2-50: WARNING: Assignment of 0/1 to bool variable
drivers/gpu/drm/amd/powerplay/hwmgr/smu7_hwmgr.c:733:3-52: WARNING: Assignment of 0/1 to bool variable
drivers/gpu/drm/amd/powerplay/hwmgr/smu7_hwmgr.c:747:3-51: WARNING: Assignment of 0/1 to bool variable

Reported-by: Hulk Robot <hulkci@huawei.com>
Signed-off-by: zhengbin <zhengbin13@huawei.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2020-01-27 16:46:44 -05:00
Jack Zhang
1291bd70a2 drm/amdgpu/sriov skip the update of SMU_TABLE_ACTIVITY_MONITOR_COEFF
There's no need to dump ACTIVITY_MONITOR_COEFF under VF.
Therefore, Skip the update of SMU_TABLE_ACTIVITY_MONITOR_COEFF
under SRIOV VF.

Signed-off-by: Jack Zhang <Jack.Zhang1@amd.com>
Reviewed-by: Evan Quan <evan.quan@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2020-01-16 13:35:45 -05:00
Aaron Liu
6ca476bab8 drm/amd/powerplay: update SMU12_DRIVER_IF_VERSION to 11
This patch updates SMU12_DRIVER_IF_VERSION to 11.

Signed-off-by: Aaron Liu <aaron.liu@amd.com>
Reviewed-by: Huang Rui <ray.huang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2020-01-14 10:18:10 -05:00
Alex Deucher
40c9e7b578 drm/amdgpu/powerplay: fix warning in smu_v11_0.c
Cast to make min() happy.  The values are well within
range.

Reviewed-by: Evan Quan <evan.quan@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2020-01-14 10:18:09 -05:00
Kenneth Feng
b1ffd1e309 drm/amd/powerplay: sw ctf for arcturus
change the sw ctf setting to smu_v11_0_set_thermal_range()
since software_shutdown_temp shares the same definition and
name in all the smu11 project.

Signed-off-by: Kenneth Feng <kenneth.feng@amd.com>
Reviewed-by: Evan Quan <evan.quan@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2020-01-14 10:18:09 -05:00
Evan Quan
a64c9e15e6 drm/amd/powerplay: cleanup the interfaces for powergate setting through SMU
Provided an unified entry point. And fixed the confusing that the API
usage is conflict with what the naming implies.

Signed-off-by: Evan Quan <evan.quan@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2020-01-07 12:01:54 -05:00
Evan Quan
e0aa4a92f7 drm/amd/powerplay: issue proper hdp flush for table transferring
Guard the content consistence between the view of GPU and CPU
during the table transferring.

Signed-off-by: Evan Quan <evan.quan@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2020-01-07 12:01:48 -05:00
Evan Quan
29a4596064 drm/amd/powerplay: refine code to support no-dpm case
With "dpm=0", there will be no DPM enabled. The code
needs to be refined to support this.

Signed-off-by: Evan Quan <evan.quan@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2020-01-07 12:01:40 -05:00
Evan Quan
ce0d0ec339 drm/amd/powerplay: unified VRAM address for driver table interaction with SMU V2
By this, we can avoid to pass in the VRAM address on every table
transferring. That puts extra unnecessary traffics on SMU on
some cases(e.g. polling the amdgpu_pm_info sysfs interface).

V2: document what the driver table is for and how it works

Signed-off-by: Evan Quan <evan.quan@amd.com>
Acked-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2020-01-07 12:01:32 -05:00
Evan Quan
9fa1ed5bf6 drm/amd/powerplay: cache the watermark settings on system memory
So that we do not need to allocate a piece of VRAM for it. This
is a preparation for coming change which unifies the VRAM address
for all driver tables interaction with SMU.

Signed-off-by: Evan Quan <evan.quan@amd.com>
Acked-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2020-01-07 12:01:15 -05:00
Kevin Wang
d5ec4b4568 drm/amdgpu/smu: custom pstate profiling clock frequence for navi series asics
add navi10 & navi14 pstate profiling clock value support.

Signed-off-by: Kevin Wang <kevin1.wang@amd.com>
Reviewed-by: Kenneth Feng <kenneth.feng@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2020-01-07 12:01:09 -05:00
Evan Quan
a210d69872 drm/amd/powerplay: add smu11_driver_if_arcturus.h new OOB members
This is to fit the latest SMC firmware and it's backward compatible.

Signed-off-by: Evan Quan <evan.quan@amd.com>
Reviewed-by: Kenneth Feng <kenneth.feng@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2020-01-07 11:58:30 -05:00
Jack Zhang
895bd048fb amd/amdgpu/sriov tdr enablement with pp_onevf_mode
Under sriov and pp_onevf mode,
1.take resume instead of hw_init for smc recover to avoid
potential memory leak.

2.add return condition inside smc resume function for
sriov_pp_onevf_mode and pm_enabled param.

Signed-off-by: Jack Zhang <Jack.Zhang1@amd.com>
Acked-by: Evan Quan <evan.quan@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2020-01-07 11:58:19 -05:00
Jack Zhang
c2a801af31 amd/amdgpu/sriov enable onevf mode for ARCTURUS VF
Before, initialization of smu ip block would be skipped
for sriov ASICs. But if there's only one VF being used,
guest driver should be able to dump some HW info such as
clks, temperature,etc.

To solve this, now after onevf mode is enabled, host
driver will notify guest. If it's onevf mode, guest will
do smu hw_init and skip some steps in normal smu hw_init
flow because host driver has already done it for smu.

With this fix, guest app can talk with smu and dump hw
information from smu.

v2: refine the logic for pm_enabled.Skip hw_init by not
changing pm_enabled.
v3: refine is_support_sw_smu and fix some indentation
issue.

Signed-off-by: Jack Zhang <Jack.Zhang1@amd.com>
Acked-by: Evan Quan <evan.quan@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2020-01-07 11:58:10 -05:00
Evan Quan
6a876844e4 drm/amd/powerplay: retrieve the enabled feature mask from cache
This is why those feature mask members designed for. And this
can reduce the SMU workload.

Signed-off-by: Evan Quan <evan.quan@amd.com>
Acked-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2020-01-07 11:58:03 -05:00
Evan Quan
e42877b8ba drm/amd/powerplay: avoid deadlock on Vega20 swSMU routine
The lock required was already hold by its parent API.

Signed-off-by: Evan Quan <evan.quan@amd.com>
Reviewed-by: Feifei Xu <Feifei.Xu@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2020-01-07 11:57:54 -05:00
Kevin Wang
e4b613e0b2 drm/amdgpu/smu: add helper function smu_get_dpm_level_range() for smu driver
this function can help smu driver to query dpm level clock range from
smu firmware.

Signed-off-by: Kevin Wang <kevin1.wang@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2020-01-07 11:57:41 -05:00
Likun Gao
e78adc5a1d drm/amdgpu/powerplay: fix NULL pointer issue when SMU disabled
Fix smu related NULL pointer issue which occurs when SMU is disabled.

Signed-off-by: Likun Gao <Likun.Gao@amd.com>
Reviewed-by: Evan Quan <evan.quan@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2020-01-07 11:57:28 -05:00
Kevin Wang
d2f925ff9a drm/amdgpu/smu: use unified variable smu->is_apu to check apu asic platform
use unified variable smu->is_apu to check apu asic in smu driver.

related patch:
drm/amd/powerplay: bypass dpm_context null pointer check guard for some
smu series

Signed-off-by: Kevin Wang <kevin1.wang@amd.com>
Reviewed-by: Huang Rui <ray.huang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2020-01-07 11:57:22 -05:00
zhengbin
e1ab862d89 drm/amd/powerplay: use true, false for bool variable in vega20_hwmgr.c
Fixes coccicheck warning:

drivers/gpu/drm/amd/powerplay/hwmgr/vega20_hwmgr.c:875:1-31: WARNING: Assignment of 0/1 to bool variable

Reported-by: Hulk Robot <hulkci@huawei.com>
Signed-off-by: zhengbin <zhengbin13@huawei.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2020-01-07 11:56:16 -05:00
Alex Deucher
337443d0e2 drm/amdgpu/smu: make the set_performance_level logic easier to follow
Have every asic provide a callback for this rather than a mix
of generic and asic specific code.

Reviewed-by: Evan Quan <evan.quan@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2020-01-07 11:56:09 -05:00
Evan Quan
18d7ab9899 drm/amd/powerplay: support custom power profile setting
Support custom power profile mode settings on Arcturus.

Signed-off-by: Evan Quan <evan.quan@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2020-01-07 11:55:18 -05:00
Alex Deucher
468288863e drm/amdgpu/smu: add peak profile support for navi12
Add defined peak sclk for navi12 peak profile mode.

Reviewed-by: Evan Quan <evan.quan@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-12-23 14:59:57 -05:00
Alex Deucher
d24d26540b drm/amdgpu/smu/navi: Adjust default behavior for peak sclk profile
Fetch the sclk from the pptable if there is no specified sclk for
the board.

Reviewed-by: Evan Quan <evan.quan@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-12-23 14:59:57 -05:00
Yintian Tao
fe8a87d71f drm/amd/powerplay: skip disable dynamic state management
Under sriov, the disable operation is no allowed.

Signed-off-by: Yintian Tao <yttao@amd.com>
Reviewed-by: Kenneth Feng <kenneth.feng@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-12-19 10:10:05 -05:00
Alex Deucher
0371e2fba4 drm/amdgpu/smu: add metrics table lock for vega20 (v2)
To protect access to the metrics table.

v2: unlock on error

Bug: https://gitlab.freedesktop.org/drm/amd/issues/900
Reviewed-by: Kevin Wang <kevin1.wang@amd.com>
Reviewed-by: Evan Quan <evan.quan@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-12-18 16:09:14 -05:00
Alex Deucher
ed09a629bb drm/amdgpu/smu: add metrics table lock for renoir (v2)
To protect access to the metrics table.

v2: unlock on error

Bug: https://gitlab.freedesktop.org/drm/amd/issues/900
Reviewed-by: Kevin Wang <kevin1.wang@amd.com>
Reviewed-by: Evan Quan <evan.quan@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-12-18 16:09:13 -05:00