Commit Graph

5876 Commits

Author SHA1 Message Date
Andrey Grodzovsky
f2bd8a0ed7 drm/amdgpu: Fix amdgpu_display_supported_domains logic.
Add restriction to dissallow GTT domain if the relevant BO
doesn't have USWC flag set to avoid the APU hang scenario.

Signed-off-by: Andrey Grodzovsky <andrey.grodzovsky@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-07-30 23:48:32 -05:00
Alex Deucher
a3a09142f4 drm/amdgpu: put the SMC into the proper state on reset/unload
When doing a GPU reset or unloading the driver, we need to
put the SMU into the apprpriate state for the re-init after
the reset or unload to reliably work.

I don't think this is necessary for BACO because the SMU actually
controls the BACO state to it needs to be active.

For suspend (S3), the asic is put into D3 so the SMU would be
powered down so I don't think we need to put the SMU into
any special state.

Reviewed-by: Evan Quan <evan.quan@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-07-30 23:24:44 -05:00
Alex Deucher
2ddc6c3ef9 drm/amdgpu: add reset_method asic callback for navi
Navi uses either mode1 or baco depending on various
conditions.

Reviewed-by: Evan Quan <evan.quan@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-07-30 23:24:17 -05:00
Alex Deucher
ee360c0b7c drm/amdgpu: add reset_method asic callback for soc15
APUs only support mode2 reset.  dGPUs use either mode1 or
baco depending on various conditions.

Reviewed-by: Evan Quan <evan.quan@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-07-30 23:24:13 -05:00
Alex Deucher
9bc1932f5c drm/amdgpu: add reset_method asic callback for vi
VI always uses the legacy pci based reset.

Reviewed-by: Evan Quan <evan.quan@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-07-30 23:24:10 -05:00
Alex Deucher
6d0f50dafe drm/amdgpu: add reset_method asic callback for cik
CIK always uses the legacy pci based reset.

Reviewed-by: Evan Quan <evan.quan@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-07-30 23:24:06 -05:00
Alex Deucher
dd81eede77 drm/amdgpu: add reset_method asic callback for si
SI always uses the legacy pci based reset.

Reviewed-by: Evan Quan <evan.quan@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-07-30 23:23:56 -05:00
Alex Deucher
0cf3c64f29 drm/amdgpu: add an asic callback to determine the reset method
Sometimes the driver may have to behave differently depending
on the method we are using to reset the GPU.

Reviewed-by: Evan Quan <evan.quan@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-07-30 23:23:44 -05:00
Evan Quan
f0d2a7dc11 drm/amd/powerplay: fix null pointer dereference around dpm state relates
DPM state relates are not supported on the new SW SMU ASICs. But still
it's not OK to trigger null pointer dereference on accessing them.

Signed-off-by: Evan Quan <evan.quan@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-07-30 23:23:29 -05:00
Evan Quan
fcd90fee8a drm/amd/powerplay: minor fixes around SW SMU power and fan setting
Add checking for possible invalid input and null pointer. And
drop redundant code.

Signed-off-by: Evan Quan <evan.quan@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-07-30 23:23:21 -05:00
Shirish S
1c42591591 drm/amd/display: enable S/G for RAVEN chip
enables gpu_vm_support in dm and adds
AMDGPU_GEM_DOMAIN_GTT as supported domain

v2:
Move BO placement logic into amdgpu_display_supported_domains

v3:
Use amdgpu_bo_validate_uswc in amdgpu_display_supported_domains.

v4:
amdgpu_bo_validate_uswc moved to sepperate patch.

Signed-off-by: Shirish S <shirish.s@amd.com>
Signed-off-by: Andrey Grodzovsky <andrey.grodzovsky@amd.com>
Reviewed-by: Nicholas Kazlauskas <nicholas.kazlauskas@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-07-30 23:23:12 -05:00
Andrey Grodzovsky
ddcb7fc62f drm/amdgpu: Add check for USWC support for amdgpu_display_supported_domains
This verifies we don't add GTT as allowed domain for APUs when USWC
is disabled.

Signed-off-by: Andrey Grodzovsky <andrey.grodzovsky@amd.com>
Acked-by: Michel Dänzer <michel.daenzer@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-07-30 23:22:55 -05:00
Andrey Grodzovsky
3d1b8ec76b drm/amdgpu: Create helper to clear AMDGPU_GEM_CREATE_CPU_GTT_USWC
Move the logic to clear AMDGPU_GEM_CREATE_CPU_GTT_USWC in
amdgpu_bo_do_create into standalone helper so it can be reused
in other functions.

v4:
Switch to return bool.

v5: Fix typos.

Signed-off-by: Andrey Grodzovsky <andrey.grodzovsky@amd.com>
Acked-by: Michel Dänzer <michel.daenzer@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-07-30 23:22:48 -05:00
Andrey Grodzovsky
e4c4073b01 drm/amdgpu: Fix hard hang for S/G display BOs.
HW requires for caching to be unset for scanout BO
mappings when the BO placement is in GTT memory.
Usually the flag to unset is passed from user mode
but for FB mode this was missing.

v2:
Keep all BO placement logic in amdgpu_display_supported_domains

Suggested-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Andrey Grodzovsky <andrey.grodzovsky@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Tested-by: Shirish S <shirish.s@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-07-30 23:22:41 -05:00
Jonathan Kim
24f9aacfb0 drm/amdgpu: adding xgmi error monitoring
monitor xgmi errors via mc pie status through fica registers.

Signed-off-by: Jonathan Kim <Jonathan.Kim@amd.com>
Reviewed-by: Kent Russell <Kent.Russell@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-07-30 23:22:34 -05:00
Jonathan Kim
64671c0fdc drm/amdgpu: add perfmon and fica atomics for df
adding perfmon and fica atomic operations to adhere to data fabrics finite
state machine requirements for indirect register access.

Signed-off-by: Jonathan Kim <Jonathan.Kim@amd.com>
Reviewed-by: Kent Russell <Kent.Russell@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-07-30 23:22:26 -05:00
tiancyin
5f4814deab drm/amdgpu/gmc10: fix pte mytpe field error for navi14
navi14 share same PTE format with navi10.

Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com>
Signed-off-by: tiancyin <tianci.yin@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-07-30 23:19:35 -05:00
James Zhu
6913848087 drm/amdgpu: use VCN firmware offset for cache window
Since we are using the signed FW now, and also using PSP firmware loading,
but it's still potential to break driver when loading FW directly
instead of PSP, so we should add offset.

Signed-off-by: James Zhu <James.Zhu@amd.com>
Reviewed-by: Leo Liu <leo.liu@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-07-30 23:19:28 -05:00
Evan Quan
780f3a9c5b drm/amd/powerplay: some cosmetic fixes
Drop redundant check, duplicate check, duplicate setting
and fix the return value.

Signed-off-by: Evan Quan <evan.quan@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-07-30 23:19:13 -05:00
Chuhong Yuan
911d8b3069 drm/amdgpu: Use dev_get_drvdata where possible
Instead of using to_pci_dev + pci_get_drvdata,
use dev_get_drvdata to make code simpler.

Signed-off-by: Chuhong Yuan <hslester96@gmail.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-07-30 23:18:41 -05:00
Kent Russell
4cab85afe9 drm/amdkfd: Fix byte align on VegaM
This was missed during the addition of VegaM support

Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Kent Russell <kent.russell@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-07-30 23:17:35 -05:00
Alex Deucher
95ccc15508 drm/amdgpu/smu: move fan rpm query into the asic specific code
On vega20, there is an SMU message to query it.  On navi, it's fetched
from the metrics table.

Reviewed-by: Evan Quan <evan.quan@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-07-22 14:57:39 -05:00
Hawking Zhang
d52d6de280 drm/amdgpu: set sdma irq src num according to sdma instances
Otherwise, it will cause driver access non-existing sdma registers
in gpu reset code path

Signed-off-by: Hawking Zhang <Hawking.Zhang@amd.com>
Reviewed-by: Le Ma <Le.Ma@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-07-22 14:57:31 -05:00
Leo Liu
53ef3969dd drm/amdgpu: use VCN firmware offset for cache window
Since we are using the signed FW now, and also using PSP firmware loading,
but it's still potential to break driver when loading FW directly
instead of PSP, so we should add offset.

Signed-off-by: Leo Liu <leo.liu@amd.com>
Acked-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-07-18 14:18:07 -05:00
Hawking Zhang
33c976c961 drm/amdgpu: drop ras self test
this function is not needed any more. error injection is
the only way to validate ras but it can't be executed in
amdgpu_ras_init, where gpu is even not initialized

Signed-off-by: Hawking Zhang <Hawking.Zhang@amd.com>
Reviewed-by: Feifei Xu <Feifei.Xu@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-07-18 14:18:07 -05:00
Hawking Zhang
a5dd40ca81 drm/amdgpu: only allow error injection to UMC IP block
error injection to other IP blocks (except UMC) will be enabled
until RAS feature stablize on those IP blocks

Signed-off-by: Hawking Zhang <Hawking.Zhang@amd.com>
Reviewed-by: Feifei Xu <Feifei.Xu@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-07-18 14:18:07 -05:00
Hawking Zhang
4d249d3abd drm/amdgpu: disable GFX RAS by default
GFX RAS has not been stablized yet. disable GFX ras until
it is fully funcitonal.

Signed-off-by: Hawking Zhang <Hawking.Zhang@amd.com>
Reviewed-by: Feifei Xu <Feifei.Xu@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-07-18 14:18:07 -05:00
Hawking Zhang
fb2a36075a drm/amdgpu: do not create ras debugfs/sysfs node for ASICs that don't have ras ability
driver shouldn't init any ras debugfs/sysfs node for ASICs that don't have ras
hardware ability

Signed-off-by: Hawking Zhang <Hawking.Zhang@amd.com>
Reviewed-by: Feifei Xu <Feifei.Xu@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-07-18 14:18:07 -05:00
Joseph Greathouse
fbdc5d8d84 drm/amdgpu: Default disable GDS for compute VMIDs
The GDS and GWS blocks default to allowing all VMIDs to
access all entries. Graphics VMIDs can handle setting
these limits when the driver launches work. However,
compute workloads under HWS control don't go through the
kernel driver. Instead, HWS firmware should set these
limits when a process is put into a VMID slot.

Disable access to these devices by default by turning off
all mask bits (for OA) and setting BASE=SIZE=0 (for GDS
and GWS) for all compute VMIDs. If a process wants to use
these resources, they can request this from the HWS
firmware (when such capabilities are enabled). HWS will
then handle setting the base and limit for the process when
it is assigned to a VMID.

This will also prevent user kernels from getting 'stuck' in
GWS by accident if they write GWS-using code but HWS
firmware is not set up to handle GWS reset. Until HWS is
enabled to handle GWS properly, all GWS accesses will
MEM_VIOL fault the kernel.

v2: Move initialization outside of SRBM mutex

Signed-off-by: Joseph Greathouse <Joseph.Greathouse@amd.com>
Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-07-18 14:18:07 -05:00
Alex Deucher
a08a4dae7a drm/amdgpu: flag arcturus as experimental for now
Current support will only work in internal engineering
boards.

Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-07-18 14:18:07 -05:00
Alex Deucher
ad91b134a2 drm/amdgpu: drop unused function definitions
These were dropped and the headers never got cleaned up.

Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-07-18 14:18:07 -05:00
James Zhu
1da418ba65 drm/amdgpu:add all VCN rings into schedule request queue
Add all VCN instances' decode/encode/jpeg decode rings into
drm_sched_rq list.

Signed-off-by: James Zhu <James.Zhu@amd.com>
Reviewed-by: Leo Liu <leo.liu@amd.com>
Reviewed-by: Boyuan Zhang <boyuan.zhang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-07-18 14:18:07 -05:00
Le Ma
69d4de94f8 drm/amdgpu: enable all 8 sdma instances for Arcturus silicon
The more 6 sdma instances work fine now with DF fix in vbios:
  * mmDF_PIE_AON_MiscClientsEnable(0x1c728)=0x3fe(DF_ALL_INSTANCE)
       [9:4]MmhubsEnable=3f (change from 0)

Signed-off-by: Le Ma <le.ma@amd.com>
Reviewed-by: Evan Quan <evan.quan@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-07-18 14:18:07 -05:00
Yong Zhao
5ddd4a9a7c drm/amdgpu: Add more detail to the VM fault printing
With the printing, we don't need to parse the value on our own any more.

Signed-off-by: Yong Zhao <Yong.Zhao@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-07-18 14:18:06 -05:00
Le Ma
fc1e272e8d drm/amdgpu: limit sdma instances to 2 for Arcturus in BU phase
Another 6 sdma instances do not work at present. Disable them to unblock KFD
for silicon bringup as a workaround

Signed-off-by: Le Ma <le.ma@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-07-18 14:18:06 -05:00
Hawking Zhang
f9cf36fcaf drm/amdgpu: skip gfx 9 common golden settings for arct
They are not needed by arct

Signed-off-by: Hawking Zhang <Hawking.Zhang@amd.com>
Reviewed-by: Feifei Xu <Feifei.Xu@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-07-18 14:18:06 -05:00
Yong Zhao
54bd77f3d0 amd/powerplay: No SW XGMI dpm for Arcturus rev 2
xgmi dpm is handled by the SMU.

Signed-off-by: Yong Zhao <Yong.Zhao@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-07-18 14:18:06 -05:00
Le Ma
a80955176d drm/amdgpu: clean up nonexistent firmware declaration for Arcturus
CPG firmwares are not used.

Signed-off-by: Le Ma <le.ma@amd.com>
Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com>
Reviewed-by: Evan Quan <evan.quan@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-07-18 14:18:06 -05:00
Hawking Zhang
22f5ea4ca0 drm/amdgpu: init gds config for arct
arct has 4KB gds (4 banks inside) so the max_wave_id
should be 0xfff

Signed-off-by: Hawking Zhang <Hawking.Zhang@amd.com>
Reviewed-by: Le Ma <Le.Ma@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-07-18 14:18:06 -05:00
Hawking Zhang
bfa3a9bb98 drm/amdgpu: keep stolen memory for arct
Any dce register read back from arct is invalid. use hard code
stolen memory for arct until we validate the s3.

Signed-off-by: Hawking Zhang <Hawking.Zhang@amd.com>
Reviewed-by: Le Ma <Le.Ma@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-07-18 14:18:06 -05:00
Hawking Zhang
d57c3d5634 drm/amdgpu: init arct external rev id
Properly set the external silicon revision id.

Signed-off-by: Hawking Zhang <Hawking.Zhang@amd.com>
Reviewed-by: Le Ma <Le.Ma@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-07-18 14:18:06 -05:00
Hawking Zhang
582870de56 drm/amdgpu: add arct gc golden settings
Golden GC register settings from the hw team.

Signed-off-by: Hawking Zhang <Hawking.Zhang@amd.com>
Reviewed-by: Le Ma <Le.Ma@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-07-18 14:18:06 -05:00
Hawking Zhang
ca1961a2f5 drm/amdgpu: add arct sdma golden settings
Golden SDMA register settings from the hw team.

Signed-off-by: Hawking Zhang <Hawking.Zhang@amd.com>
Reviewed-by: Le Ma <Le.Ma@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-07-18 14:18:06 -05:00
Feifei Xu
48c69cda45 drm/amdgpu: add pci DID for Arcturus GL-XL.
Add device ids for Arcturus.

Signed-off-by: Feifei Xu <Feifei.Xu@amd.com>
Reviewed-by: Le Ma <Le.Ma@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-07-18 14:18:06 -05:00
Le Ma
6c54afc7e8 drm/amdgpu: assign fb_start/end in mmhub v9.4 interface
Align with mmhub v1.0.

Signed-off-by: Le Ma <le.ma@amd.com>
Reviewed-by: Feifei Xu <Feifei.Xu@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-07-18 14:18:06 -05:00
James Zhu
cd1fd7b381 drm/amdgpu: add harvest support for Arcturus
Add VCN harvest support for Arcturus

Signed-off-by: James Zhu <James.Zhu@amd.com>
Reviewed-by: Leo Liu <leo.liu@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-07-18 14:18:06 -05:00
James Zhu
fa739f4b06 drm/amdgpu: add multiple instances support for Arcturus
Arcturus has dual-VCN. Need add multiple instances support for Arcturus.

Signed-off-by: James Zhu <James.Zhu@amd.com>
Reviewed-by: Leo Liu <leo.liu@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-07-18 14:18:05 -05:00
James Zhu
c01b6a1d38 drm/amdgpu: modify amdgpu_vcn to support multiple instances
Arcturus has dual-VCN. Need Restruct amdgpu_device::vcn to support
multiple vcns. There are no any logical changes here

Signed-off-by: James Zhu <James.Zhu@amd.com>
Reviewed-by: Leo Liu <leo.liu@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-07-18 14:18:05 -05:00
James Zhu
989b6a0549 drm/amdgpu: add vcn nbio doorbell range setting for 2nd vcn instance
add vcn nbio doorbell range setting for 2nd vcn instance

Signed-off-by: James Zhu <James.Zhu@amd.com>
Reviewed-by: Leo Liu <leo.liu@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-07-18 14:18:05 -05:00
James Zhu
8b75a521c0 drm/amdgpu/: increase AMDGPU_MAX_RINGS to add 2nd vcn instance
increase AMDGPU_MAX_RINGS to add 2nd vcn instance

Signed-off-by: James Zhu <James.Zhu@amd.com>
Reviewed-by: Leo Liu <leo.liu@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-07-18 14:18:05 -05:00