Commit Graph

7072 Commits

Author SHA1 Message Date
Rohit Khaire
75ddb640e1 drm/amdgpu: Don't write GCVM_L2_CNTL* regs on navi12 VF
This change disables programming of GCVM_L2_CNTL* regs on VF.

Signed-off-by: Rohit Khaire <Rohit.Khaire@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2020-02-26 14:17:33 -05:00
Daniel Vetter
f3ed67395d drm/amdgpu: Drop DRIVER_USE_AGP
This doesn't do anything except auto-init drm_agp support when you
call drm_get_pci_dev(). Which amdgpu stopped doing with

commit b58c11314a
Author: Alex Deucher <alexander.deucher@amd.com>
Date:   Fri Jun 2 17:16:31 2017 -0400

    drm/amdgpu: drop deprecated drm_get_pci_dev and drm_put_dev

No idea whether this was intentional or accidental breakage, but I
guess anyone who manages to boot a this modern gpu behind an agp
bridge deserves a price. A price I never expect anyone to ever collect
:-)

Cc: Alex Deucher <alexander.deucher@amd.com>
Cc: "Christian König" <christian.koenig@amd.com>
Cc: Hawking Zhang <Hawking.Zhang@amd.com>
Cc: Xiaojie Yuan <xiaojie.yuan@amd.com>
Cc: Evan Quan <evan.quan@amd.com>
Cc: "Tianci.Yin" <tianci.yin@amd.com>
Cc: "Marek Olšák" <marek.olsak@amd.com>
Cc: Hans de Goede <hdegoede@redhat.com>
Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Daniel Vetter <daniel.vetter@intel.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2020-02-26 14:17:33 -05:00
Tom St Denis
669e2f91e4 drm/amd/amdgpu: Add gfxoff debugfs entry
Write a 32-bit value of zero to disable GFXOFF and write a 32-bit
value of non-zero to enable GFXOFF.

Signed-off-by: Tom St Denis <tom.stdenis@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2020-02-26 14:17:33 -05:00
Nirmoy Das
c6fc97f9bc drm/amdgpu: use amdgpu_ring_test_helper when possible
amdgpu_ring_test_helper already handles ring->sched.ready correctly

Signed-off-by: Nirmoy Das <nirmoy.das@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2020-02-26 14:17:33 -05:00
Christian König
42e5fee65e drm/amdgpu: add VM update fences back to the root PD v2
Add update fences to the root PD while mapping BOs.

Otherwise PDs freed during the mapping won't wait for
updates to finish and can cause corruptions.

v2: rebased on drm-misc-next

Signed-off-by: Christian König <christian.koenig@amd.com>
Fixes: 90b69cdc5f drm/amdgpu: stop adding VM updates fences to the resv obj
Reviewed-by: xinhui pan <xinhui.pan@amd.com>
Tested-by: Luben Tuikov <luben.tuikov@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2020-02-26 14:17:33 -05:00
Nirmoy Das
6f9f960472 drm/amdgpu: cleanup amdgpu_ring_fini
cleanup amdgpu_ring_fini to check the prerequisites before changing ring->sched.ready

Signed-off-by: Nirmoy Das <nirmoy.das@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2020-02-26 14:17:33 -05:00
John Clements
ef1caf48bd drm/amdgpu: Add Arcturus D342 page retire support
Check Arcturus SKU type to select I2C address of page retirement EEPROM

Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com>
Signed-off-by: John Clements <john.clements@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2020-02-26 14:17:32 -05:00
Hawking Zhang
938065d4cb drm/amdgpu: toggle DF-Cstate to protect DF reg access
driver needs to take DF out Cstate before any DF register
access. otherwise, the DF register may not be accessible.

Signed-off-by: Hawking Zhang <Hawking.Zhang@amd.com>
Acked-by: Evan Quan <evan.quan@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2020-02-26 14:17:32 -05:00
Hawking Zhang
19744f5f2d drm/amdgpu: move get_xgmi_relative_phy_addr to amdgpu_xgmi.c
centralize all the xgmi related function to amdgpu_xgmi.c

Signed-off-by: Hawking Zhang <Hawking.Zhang@amd.com>
Acked-by: Evan Quan <evan.quan@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2020-02-26 14:17:32 -05:00
Hawking Zhang
53e0f1e6be drm/amdgpu: add dpm helper function for DF Cstate control
The helper function hides software smu and legacy powerplay
implementation for DF Cstate control.

Signed-off-by: Hawking Zhang <Hawking.Zhang@amd.com>
Reviewed-by: Evan Quan <evan.quan@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2020-02-26 14:17:32 -05:00
Evan Quan
995da6cc4c drm/amdgpu: update psp firmwares loading sequence V2
For those ASICs with DF Cstate management centralized to PMFW,
TMR setup should be performed between pmfw loading and other
non-psp firmwares loading.

V2: skip possible SMU firmware reloading

Signed-off-by: Evan Quan <evan.quan@amd.com>
Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2020-02-26 14:17:32 -05:00
xinhui pan
f4a3c42b5c drm/amdgpu: Remove kfd eviction fence before release bo (v2)
No need to trigger eviction as the memory mapping will not be used
anymore.

All pt/pd bos share same resv, hence the same shared eviction fence.
Everytime page table is freed, the fence will be signled and that cuases
kfd unexcepted evictions.

v2: squash in 32 bit fix

CC: Christian König <christian.koenig@amd.com>
CC: Felix Kuehling <felix.kuehling@amd.com>
CC: Alex Deucher <alexander.deucher@amd.com>
Acked-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com>
Signed-off-by: xinhui pan <xinhui.pan@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2020-02-26 14:17:32 -05:00
Daniel Vetter
8a3bddf67c drm/amdgpu: Drop DRIVER_USE_AGP
This doesn't do anything except auto-init drm_agp support when you
call drm_get_pci_dev(). Which amdgpu stopped doing with

commit b58c11314a
Author: Alex Deucher <alexander.deucher@amd.com>
Date:   Fri Jun 2 17:16:31 2017 -0400

    drm/amdgpu: drop deprecated drm_get_pci_dev and drm_put_dev

No idea whether this was intentional or accidental breakage, but I
guess anyone who manages to boot a this modern gpu behind an agp
bridge deserves a price. A price I never expect anyone to ever collect
:-)

Cc: Alex Deucher <alexander.deucher@amd.com>
Cc: "Christian König" <christian.koenig@amd.com>
Cc: Hawking Zhang <Hawking.Zhang@amd.com>
Cc: Xiaojie Yuan <xiaojie.yuan@amd.com>
Cc: Evan Quan <evan.quan@amd.com>
Cc: "Tianci.Yin" <tianci.yin@amd.com>
Cc: "Marek Olšák" <marek.olsak@amd.com>
Cc: Hans de Goede <hdegoede@redhat.com>
Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Daniel Vetter <daniel.vetter@intel.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Cc: stable@vger.kernel.org
2020-02-26 14:02:41 -05:00
Shirish S
a3ed353cf8 amdgpu/gmc_v9: save/restore sdpif regs during S3
fixes S3 issue with IOMMU + S/G  enabled @ 64M VRAM.

Suggested-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Shirish S <shirish.s@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Cc: stable@vger.kernel.org
2020-02-25 11:30:42 -05:00
Felix Kuehling
b80cd524ac drm/amdgpu: Improve Vega20 XGMI TLB flush workaround
Using a heavy-weight TLB flush once is not sufficient. Concurrent
memory accesses in the same TLB cache line can re-populate TLB entries
from stale texture cache (TC) entries while the heavy-weight TLB
flush is in progress. To fix this race condition, perform another TLB
flush after the heavy-weight one, when TC is known to be clean.

Move the workaround into the low-level TLB flushing functions. This way
they apply to amdgpu as well, and KIQ-based TLB flush only needs to
synchronize once.

Signed-off-by: Felix Kuehling <Felix.Kuehling@amd.com>
Acked-by: Christian König <christian.koenig@amd.com>
Reviewed-by: shaoyun liu <shaoyun.liu@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2020-02-25 11:01:57 -05:00
Monk Liu
82c4ebfa35 drm/amdgpu: fix psp ucode not loaded in bare-metal
for bare-metal we alawys need to load sys/sos/kdb

Signed-off-by: Monk Liu <Monk.Liu@amd.com>
Reviewed-by: Evan Quan <evan.quan@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2020-02-25 11:01:49 -05:00
Shirish S
c2ecd79bec amdgpu/gmc_v9: save/restore sdpif regs during S3
fixes S3 issue with IOMMU + S/G  enabled @ 64M VRAM.

Suggested-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Shirish S <shirish.s@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2020-02-25 11:01:26 -05:00
Alex Deucher
91aeda1811 drm/amdgpu/discovery: make the discovery code less chatty
Make the IP block base output debug only.

Reviewed-by: Xiaojie Yuan <xiaojie.yuan@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2020-02-25 11:01:25 -05:00
Monk Liu
6325b38d89 drm/amdgpu: fix colliding of preemption
what:
some os preemption path is messed up with world switch preemption

fix:
cleanup those logics so os preemption not mixed with world switch

this patch is a general fix for issues comes from SRIOV MCBP, but
there is still UMD side issues not resovlved yet, so this patch
cannot fix all world switch bug.

Signed-off-by: Monk Liu <Monk.Liu@amd.com>
Acked-by: Hawking Zhang <Hawking.Zhang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2020-02-25 11:01:25 -05:00
Monk Liu
f77a9c920a drm/amdgpu: cleanup some incorrect reg access for SRIOV
1)
we shouldn't load PSP kdb and sys/sos for VF, they are
supposed to be handled by hypervisor

2)
ih reroute doesn't work on VF thus we should avoid calling
it, besides VF should not use those PSP register sets for PF

3)
shouldn't load SMU ucode under SRIOV, otherwise PSP would report
error

Signed-off-by: Monk Liu <Monk.Liu@amd.com>
Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2020-02-25 11:01:25 -05:00
Evan Quan
14008574a3 drm/amdgpu: drop the non-sense firmware version check on arcturus
As the firmware versions of arcturus are different from other gfx9
ASICs. And the warning("CP firmware version too old, please update!")
caused by this check can be eliminated.

Signed-off-by: Evan Quan <evan.quan@amd.com>
Acked-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2020-02-19 10:36:26 -05:00
changzhu
f61f01b14d drm/amdgpu: add is_raven_kicker judgement for raven1
The rlc version of raven_kicer_rlc is different from the legacy rlc
version of raven_rlc. So it needs to add a judgement function for
raven_kicer_rlc and avoid disable GFXOFF when loading raven_kicer_rlc.

Signed-off-by: changzhu <Changfeng.Zhu@amd.com>
Reviewed-by: Huang Rui <ray.huang@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2020-02-19 10:36:26 -05:00
Guchun Chen
3cd4f61859 drm/amdgpu: record non-zero error counter info in NBIO before resetting GPU
When NBIO's RAS error happens, before trigging GPU reset, it's needed
to record error counter information, which can correct the error counter
value missed issue when reading from debugfs.

Signed-off-by: Guchun Chen <guchun.chen@amd.com>
Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2020-02-19 10:36:26 -05:00
Guchun Chen
313c8fd33e drm/amdgpu: log on non-zero error conter per IP before GPU reset
Once sync flood interrupt is triggered by RAS error, before
actual GPU recovery job, it's necessary to log on and print
non-zero error counter, this will help user knows where the
RAS error source is from quickly.

Signed-off-by: Guchun Chen <guchun.chen@amd.com>
Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2020-02-19 10:36:26 -05:00
changzhu
debcf83770 drm/amdgpu: add is_raven_kicker judgement for raven1
The rlc version of raven_kicer_rlc is different from the legacy rlc
version of raven_rlc. So it needs to add a judgement function for
raven_kicer_rlc and avoid disable GFXOFF when loading raven_kicer_rlc.

Signed-off-by: changzhu <Changfeng.Zhu@amd.com>
Reviewed-by: Huang Rui <ray.huang@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2020-02-19 10:33:42 -05:00
Maxime Ripard
28f2aff1ca Linux 5.6-rc2
-----BEGIN PGP SIGNATURE-----
 
 iQFSBAABCAA8FiEEq68RxlopcLEwq+PEeb4+QwBBGIYFAl5JsVQeHHRvcnZhbGRz
 QGxpbnV4LWZvdW5kYXRpb24ub3JnAAoJEHm+PkMAQRiGHZEH+wddtJO4dZk5TZdF
 KZB2w2ldsCvrBZAmas1TVvm8ncvUf+ATUZcSIzvZ3YbLmxsuLF2Cz3kD+n+36Mvy
 ejwq8Scl7jwnouYps/Gfd6rRj/uCafqST4qp15GMGeiy2ST4A8dJrv5IAgZhD8/N
 SN1bSr1AXpZ2JlEzzLDQ/NdVoNMS6IzCOsaINZcc60/XQoQZFRBWamMJFqu+CmXD
 SBJOybQNFJhziy45cGZSAl+67sSCcoPftwTs0Stu4CJsvFWRb3MsbNTDS51Hjcc4
 3tdgGOhoNXzyzZr96MEAHmiaW4VLQv0PGgUfOajE35viMz48OrwCTru8Kbuae3XM
 YHV4qJk=
 =yiem
 -----END PGP SIGNATURE-----
gpgsig -----BEGIN PGP SIGNATURE-----
 
 iHUEABYIAB0WIQRcEzekXsqa64kGDp7j7w1vZxhRxQUCXkpecgAKCRDj7w1vZxhR
 xU4CAQD/8ZzfYroiiEI7FHEgvnuldGYZqA9kbNAO2Vueeac0OgD+Jojewdxy+pJ9
 dxfA/POdtzx3hOdN+U1YgaDC0hbXwww=
 =XgKq
 -----END PGP SIGNATURE-----

Merge v5.6-rc2 into drm-misc-next

Lyude needs some patches in 5.6-rc2 and we didn't bring drm-misc-next
forward yet, so it looks like a good occasion.

Signed-off-by: Maxime Ripard <maxime@cerno.tech>
2020-02-17 10:34:34 +01:00
Alex Deucher
b08c3ed609 drm/amdgpu/gfx10: disable gfxoff when reading rlc clock
Otherwise we readback all ones.  Fixes rlc counter
readback while gfxoff is active.

Reviewed-by: Xiaojie Yuan <xiaojie.yuan@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Cc: stable@vger.kernel.org
2020-02-14 12:58:58 -05:00
Alex Deucher
120cf95930 drm/amdgpu/gfx9: disable gfxoff when reading rlc clock
Otherwise we readback all ones.  Fixes rlc counter
readback while gfxoff is active.

Reviewed-by: Xiaojie Yuan <xiaojie.yuan@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Cc: stable@vger.kernel.org
2020-02-14 12:58:58 -05:00
Alex Deucher
c657b936ea drm/amdgpu/soc15: fix xclk for raven
It's 25 Mhz (refclk / 4).  This fixes the interpretation
of the rlc clock counter.

Acked-by: Evan Quan <evan.quan@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Cc: stable@vger.kernel.org
2020-02-14 12:58:58 -05:00
Bhawanpreet Lakha
c6f8c44044 drm/amd/display: fix dtm unloading
there was a type in the terminate command.

We should be calling psp_dtm_unload() instead of psp_hdcp_unload()

Fixes: 143f230533 ("drm/amdgpu: psp DTM init")
Signed-off-by: Bhawanpreet Lakha <Bhawanpreet.Lakha@amd.com>
Reviewed-by: Feifei Xu <Feifei.Xu@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2020-02-14 12:58:58 -05:00
Thomas Zimmermann
e3eff4b5d9 drm/amdgpu: Convert to CRTC VBLANK callbacks
VBLANK callbacks in struct drm_driver are deprecated in favor of
their equivalents in struct drm_crtc_funcs. Convert amdgpu over.

v2:
	* don't wrap existing functions; change signature instead

Signed-off-by: Thomas Zimmermann <tzimmermann@suse.de>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20200123135943.24140-6-tzimmermann@suse.de
2020-02-13 13:08:13 +01:00
Thomas Zimmermann
ea702333e5 drm/amdgpu: Convert to struct drm_crtc_helper_funcs.get_scanout_position()
The callback struct drm_driver.get_scanout_position() is deprecated in
favor of struct drm_crtc_helper_funcs.get_scanout_position(). Convert
amdgpu over.

Signed-off-by: Thomas Zimmermann <tzimmermann@suse.de>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20200123135943.24140-5-tzimmermann@suse.de
2020-02-13 13:08:13 +01:00
Dan Carpenter
434cbcb1bd drm/amdgpu: return -EFAULT if copy_to_user() fails
The copy_to_user() function returns the number of bytes remaining to be
copied, but we want to return a negative error code to the user.

Fixes: 030d5b97a5 ("drm/amdgpu: use amdgpu_device_vram_access in amdgpu_ttm_vram_read")
Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2020-02-12 16:04:41 -05:00
Alex Deucher
72b4c01d66 drm/amdgpu/gfx10: disable gfxoff when reading rlc clock
Otherwise we readback all ones.  Fixes rlc counter
readback while gfxoff is active.

Reviewed-by: Xiaojie Yuan <xiaojie.yuan@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2020-02-12 16:04:41 -05:00
Alex Deucher
e5f134958d drm/amdgpu/gfx9: disable gfxoff when reading rlc clock
Otherwise we readback all ones.  Fixes rlc counter
readback while gfxoff is active.

Reviewed-by: Xiaojie Yuan <xiaojie.yuan@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2020-02-12 16:04:40 -05:00
Alex Deucher
b90c4d667c drm/amdgpu/soc15: fix xclk for raven
It's 25 Mhz (refclk / 4).  This fixes the interpretation
of the rlc clock counter.

Acked-by: Evan Quan <evan.quan@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2020-02-12 16:04:40 -05:00
Bhawanpreet Lakha
c786530b21 drm/amd/display: fix dtm unloading
there was a type in the terminate command.

We should be calling psp_dtm_unload() instead of psp_hdcp_unload()

Fixes: 143f230533 ("drm/amdgpu: psp DTM init")
Signed-off-by: Bhawanpreet Lakha <Bhawanpreet.Lakha@amd.com>
Reviewed-by: Feifei Xu <Feifei.Xu@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2020-02-12 16:03:22 -05:00
Alex Deucher
4fdda2e66d drm/amdgpu/runpm: enable runpm on baco capable VI+ asics
Seems to work reliably on VI+ except for a few so enable runpm barring
those where baco for runtime power management is not supported.

[rajneesh] Picked https://patchwork.freedesktop.org/patch/335402/ to
enable runtime pm with baco for kfd. Also fixed a checkpatch warning and
added extra checks for VEGA20 and ARCTURUS.

Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Rajneesh Bhardwaj <rajneesh.bhardwaj@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2020-02-12 16:01:03 -05:00
Rajneesh Bhardwaj
9593f4d6a6 drm/amdkfd: refactor runtime pm for baco
So far the kfd driver implemented same routines for runtime and system
wide suspend and resume (s2idle or mem). During system wide suspend the
kfd aquires an atomic lock that prevents any more user processes to
create queues and interact with kfd driver and amd gpu. This mechanism
created problem when amdgpu device is runtime suspended with BACO
enabled. Any application that relies on kfd driver fails to load because
the driver reports a locked kfd device since gpu is runtime suspended.

However, in an ideal case, when gpu is runtime  suspended the kfd driver
should be able to:

 - auto resume amdgpu driver whenever a client requests compute service
 - prevent runtime suspend for amdgpu  while kfd is in use

This change refactors the amdgpu and amdkfd drivers to support BACO and
runtime power management.

Reviewed-by: Oak Zeng <oak.zeng@amd.com>
Reviewed-by: Felix Kuehling <felix.kuehling@amd.com>
Signed-off-by: Rajneesh Bhardwaj <rajneesh.bhardwaj@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2020-02-12 16:00:54 -05:00
Rajneesh Bhardwaj
70bedd68e7 drm/amdgpu: Fix missing error check in suspend
amdgpu_device_suspend might return an error code since it can be called
from both runtime and system suspend flows. Add the missing return code
in case of a failure.

Reviewed-by: Oak Zeng <oak.zeng@amd.com>
Reviewed-by: Felix Kuehling <felix.kuehling@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Rajneesh Bhardwaj <rajneesh.bhardwaj@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2020-02-12 16:00:25 -05:00
Guchun Chen
a934f9d866 drm/amdgpu: correct comment to clear up the confusion
Former comment looks to be one intended behavior in code,
actually it's not. So correct it.

Suggested-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Guchun Chen <guchun.chen@amd.com>
Acked-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2020-02-11 15:40:12 -05:00
James Zhu
b5336bfd6f drm/amdgpu/vcn2.5: fix warning
Fix warning during switching to dpg pause mode for
VCN firmware Version ENC: 1.1 DEC: 1 VEP: 0 Revision: 16

Signed-off-by: James Zhu <James.Zhu@amd.com>
Acked-by: Leo Liu <leo.liu@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2020-02-11 15:37:22 -05:00
Guchun Chen
2cabe0d4cd drm/amdgpu: limit GDS clearing workaround in cold boot sequence
GDS clear workaround will cause gfx failure in suspend/resume case.

[   98.679559] [drm:amdgpu_device_ip_late_init [amdgpu]] *ERROR* late_init of IP block <gfx_v9_0> failed -110
[   98.679561] PM: dpm_run_callback(): pci_pm_resume+0x0/0xa0 returns -110
[   98.679562] PM: Device 0000:03:00.0 failed to resume async: error -110

As this workaround is specific to the HW bug of GDS's ECC error
existing in cold boot up, so bypass this workaround in suspend/
resume case after booting up.

Signed-off-by: Guchun Chen <guchun.chen@amd.com>
Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2020-02-11 15:37:02 -05:00
Jonathan Kim
46d1da733f drm/amdgpu: fix amdgpu pmu to use hwc->config instead of hwc->conf
hwc->conf was designated specifically for AMD APU IOMMU purposes.  This
could cause problems in performance and/or function since APU IOMMU
implementation is elsewhere.  Also hwc->conf and hwc->config are
different members of an anonymous union so hwc->conf aliases as
hw->last_tag.

Signed-off-by: Jonathan Kim <Jonathan.Kim@amd.com>
Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2020-02-11 15:35:54 -05:00
James Zhu
f4d0242b7b drm/amdgpu/vcn2.5: fix DPG mode power off issue on instance 1
Support pause_state for multiple instance, and it will fix vcn2.5 DPG mode
power off issue on instance 1.

Signed-off-by: James Zhu <James.Zhu@amd.com>
Reviewed-by: Leo Liu <leo.liu@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2020-02-11 15:10:36 -05:00
Alex Deucher
f0f7ddfc34 drm/amdgpu: add flag for runtime suspend
So we know whether we in are in runtime suspend or
system suspend.

Reviewed-by: Evan Quan <evan.quan@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2020-02-11 11:51:52 -05:00
xinhui pan
a6605c43f9 drm/amdgpu: Do not move root PT bo to relocated list
As root PD has no parent, we just need move its status to idle.

Suggested-by: Christian König <christian.koenig@amd.com>
Signed-off-by: xinhui pan <xinhui.pan@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
CC: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2020-02-11 11:51:38 -05:00
Guchun Chen
278628fa46 drm/amdgpu: correct comment to clear up the confusion
Former comment looks to be one intended behavior in code,
actually it's not. So correct it.

Suggested-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Guchun Chen <guchun.chen@amd.com>
Acked-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2020-02-11 11:51:31 -05:00
James Zhu
3b4a18a355 drm/amdgpu/vcn2.5: fix warning
Fix warning during switching to dpg pause mode for
VCN firmware Version ENC: 1.1 DEC: 1 VEP: 0 Revision: 16

Signed-off-by: James Zhu <James.Zhu@amd.com>
Acked-by: Leo Liu <leo.liu@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2020-02-11 11:49:23 -05:00
Guchun Chen
ea6f0931c1 drm/amdgpu: limit GDS clearing workaround in cold boot sequence
GDS clear workaround will cause gfx failure in suspend/resume case.

[   98.679559] [drm:amdgpu_device_ip_late_init [amdgpu]] *ERROR* late_init of IP block <gfx_v9_0> failed -110
[   98.679561] PM: dpm_run_callback(): pci_pm_resume+0x0/0xa0 returns -110
[   98.679562] PM: Device 0000:03:00.0 failed to resume async: error -110

As this workaround is specific to the HW bug of GDS's ECC error
existing in cold boot up, so bypass this workaround in suspend/
resume case after booting up.

Signed-off-by: Guchun Chen <guchun.chen@amd.com>
Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2020-02-11 11:49:16 -05:00
Linus Torvalds
c16b99d6c5 drm fixes for 5.6-rc1
tegra:
 - merge window regression fixes
 
 nouveau:
 - couple of volta/turing modesetting fixes
 
 amdgpu:
 - EDC fixes for Arcturus
 - GDDR6 memory training fixe
 - Fix for reading gfx clockgating registers while in GFXOFF state
 - i2c freq fixes
 - Misc display fixes
 - TLB invalidation fix when using semaphores
 - VCN 2.5 instancing fixes
 - Switch raven1 gfxoff to a blacklist
 - Coreboot workaround for KV/KB
 - Root cause dongle fixes for display and revert workaround
 - Enable GPU reset for renoir and navi
 - Navi overclocking fixes
 - Fix up confusing warnings in display clock validation on raven
 
 amdkfd:
 - SDMA fix
 
 radeon:
 - Misc LUT fixes
 -----BEGIN PGP SIGNATURE-----
 
 iQIcBAABAgAGBQJePNasAAoJEAx081l5xIa+axgP+wZbUkJZYt75p5thhptp2CY3
 235yIZQcDy0T+J/34Agv0e5DfK2hCafPIVv7dIV1A9AjWxhVzIoZmD2mHSLKnxVb
 MsfemKsrm1WO9KezDkDIEOgkN5rY811xTIsg89VSjyiiPXQmwAcXQBbmIBHg9YtT
 7ttek1afJv7h0HD4W2JdTxLEeKn9lpZHUXESXz+xoE8OUMDizMgmbt8nWXlbFpcw
 Tx1EmxZ3jbCYGsrcQBOLURZiwzOb2zbXJ9o8BgNdomvtI69l2/3mMonavw+IgVq+
 aiHMgWViVsDDP8ijn8uBa7EEqtGWuqTRaL/Ei71w2MGkG7S3Nz4PxbLbl5P9NHAc
 D3OsiaXGdNhoSGfGD6sr7TlOIEo7gjLGKJ3fMW15XmBF4isb0lwlnIB7dTiU9Zaq
 UBMpNA9/Vtty2MtT1SkVKQrC+Mujl4lAYssWhcrh1/RX0Ij6do4aFswy17dbVg3Q
 LIigj1quIfNPXONb0fwkg5JGQyOMnrYs0Q6SkrhQfSR2UzsNWnGoBRyROcZipNnQ
 STaM8F6eEi7RbFUaT0uWL71GBxM00PyjQgpf5fnrOiKp6rfgK3B3ThpsVoAQ2OW4
 CnodlCa1uxIkIgQdcoG/3vT5B6nzpdhgRVjHt+N5sraLjOZFZHxex0YmJAYXcN+/
 QR8ARNgILJvA14xCsOT7
 =C0nE
 -----END PGP SIGNATURE-----

Merge tag 'drm-next-2020-02-07' of git://anongit.freedesktop.org/drm/drm

Pull drm fixes from Dave Airlie:
 "Just some fixes for this merge window: the tegra changes fix some
  regressions in the merge, nouveau has a few modesetting fixes.

  The amdgpu fixes are bit bigger, but they contain a couple of weeks of
  fixes, and don't seem to contain anything that isn't really a fix.

  Summary:

  tegra:
   - merge window regression fixes

  nouveau:
   - couple of volta/turing modesetting fixes

  amdgpu:
   - EDC fixes for Arcturus
   - GDDR6 memory training fixe
   - Fix for reading gfx clockgating registers while in GFXOFF state
   - i2c freq fixes
   - Misc display fixes
   - TLB invalidation fix when using semaphores
   - VCN 2.5 instancing fixes
   - Switch raven1 gfxoff to a blacklist
   - Coreboot workaround for KV/KB
   - Root cause dongle fixes for display and revert workaround
   - Enable GPU reset for renoir and navi
   - Navi overclocking fixes
   - Fix up confusing warnings in display clock validation on raven

  amdkfd:
   - SDMA fix

  radeon:
   - Misc LUT fixes"

* tag 'drm-next-2020-02-07' of git://anongit.freedesktop.org/drm/drm: (90 commits)
  gpu: host1x: Set DMA direction only for DMA-mapped buffer objects
  drm/tegra: Reuse IOVA mapping where possible
  drm/tegra: Relax IOMMU usage criteria on old Tegra
  drm/amd/dm/mst: Ignore payload update failures
  drm/amdgpu: update default voltage for boot od table for navi1x
  drm/amdgpu/smu10: fix smu10_get_clock_by_type_with_voltage
  drm/amdgpu/smu10: fix smu10_get_clock_by_type_with_latency
  drm/amdgpu/display: handle multiple numbers of fclks in dcn_calcs.c (v2)
  drm/amdgpu: fetch default VDDC curve voltages (v2)
  drm/amdgpu/smu_v11_0: Correct behavior of restoring default tables (v2)
  drm/amdgpu/navi10: add OD_RANGE for navi overclocking
  drm/amdgpu/navi: fix index for OD MCLK
  drm/amd/display: Fix HW/SW state mismatch
  drm/amd/display: Fix a typo when computing dsc configuration
  drm/amd/powerplay: fix navi10 system intermittent reboot issue V2
  drm/amdkfd: Fix a bug in SDMA RLC queue counting under HWS mode
  drm/amd/display: Only enable cursor on pipes that need it
  drm/nouveau/kms/gv100-: avoid sending a core update until the first modeset
  drm/nouveau/kms/gv100-: move window ownership setup into modesetting path
  drm/nouveau/disp/gv100-: halt NV_PDISP_FE_RM_INTR_STAT_CTRL_DISP_ERROR storms
  ...
2020-02-07 12:46:08 -08:00
Christian König
dd1ab79910 drm/amdgpu: use amdgpu_device_vram_access in amdgpu_ttm_access_memory v2
Make use of the better performance here as well.

This patch is only compile tested!

v2: fix calculation bug pointed out by Felix

Signed-off-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com>
Acked-by: Jonathan Kim <Jonathan.Kim@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2020-02-07 11:45:39 -05:00
Christian König
030d5b97a5 drm/amdgpu: use amdgpu_device_vram_access in amdgpu_ttm_vram_read
This speeds up the access quite a bit from 2.2 MB/s to
2.9 MB/s on 32bit and 12,8 MB/s on 64bit.

Signed-off-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com>
Acked-by: Jonathan Kim <Jonathan.Kim@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2020-02-07 11:45:33 -05:00
Christian König
c12b84d6e0 drm/amdgpu: use the BAR if possible in amdgpu_device_vram_access v2
This should speed up debugging VRAM access a lot.

v2: add HDP flush/invalidate

Signed-off-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com>
Acked-by: Jonathan Kim <Jonathan.Kim@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2020-02-07 11:45:25 -05:00
Christian König
ce05ac56e6 drm/amdgpu: optimize amdgpu_device_vram_access a bit.
Only write the _HI register when necessary.

Signed-off-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com>
Acked-by: Jonathan Kim <Jonathan.Kim@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2020-02-07 11:45:17 -05:00
Jonathan Kim
42d708db8e drm/amdgpu: fix amdgpu pmu to use hwc->config instead of hwc->conf
hwc->conf was designated specifically for AMD APU IOMMU purposes.  This
could cause problems in performance and/or function since APU IOMMU
implementation is elsewhere.  Also hwc->conf and hwc->config are
different members of an anonymous union so hwc->conf aliases as
hw->last_tag.

Signed-off-by: Jonathan Kim <Jonathan.Kim@amd.com>
Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2020-02-07 11:45:05 -05:00
James Zhu
80ff3e10c8 drm/amdgpu/vcn2.5: fix DPG mode power off issue on instance 1
Support pause_state for multiple instance, and it will fix vcn2.5 DPG mode
power off issue on instance 1.

Signed-off-by: James Zhu <James.Zhu@amd.com>
Reviewed-by: Leo Liu <leo.liu@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2020-02-07 11:44:51 -05:00
Jack Zhang
86b93fd62d drm/amdgpu/sriov Don't send msg when smu suspend
For sriov and pp_onevf_mode, do not send message to set smu
status, because smu doesn't support these messages under VF.

Besides, it should skip smu_suspend when pp_onevf_mode is disabled.

Signed-off-by: Jack Zhang <Jack.Zhang1@amd.com>
Acked-by: Alex Deucher <alexander.deucher@amd.com>
Acked-by: Evan Quan <evan.quan@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2020-02-07 11:44:24 -05:00
Hawking Zhang
0b9d37609a drm/amdgpu: move xgmi init/fini to xgmi_add/remove_device call (v2)
For sriov, psp ip block has to be initialized before
ih block for the dynamic register programming interface
that needed for vf ih ring buffer. On the other hand,
current psp ip block hw_init function will initialize
xgmi session which actaully depends on interrupt to
return session context. This results an empty xgmi ta
session id and later failures on all the xgmi ta cmd
invoked from vf. xgmi ta session initialization has to
be done after ih ip block hw_init call.

to unify xgmi session init/fini for both bare-metal
sriov virtualization use scenario, move xgmi ta init
to xgmi_add_device call, and accordingly terminate xgmi
ta session in xgmi_remove_device call.

The existing suspend/resume sequence will not be changed.

v2: squash in return fix from Nirmoy

Signed-off-by: Hawking Zhang <Hawking.Zhang@amd.com>
Reviewed-by: Frank Min <Frank.Min@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2020-02-06 15:04:36 -05:00
Christian König
9f3cc18d19 drm/amdgpu: rework synchronization of VM updates v4
If provided we only sync to the BOs reservation
object and no longer to the root PD.

v2: update comment, cleanup amdgpu_bo_sync_wait_resv
v3: use correct reservation object while clearing
v4: fix typo in amdgpu_bo_sync_wait_resv

Signed-off-by: Christian König <christian.koenig@amd.com>
Tested-by: Tom St Denis <tom.stdenis@amd.com>
Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2020-02-04 23:30:39 -05:00
Christian König
4939d973b6 drm/amdgpu: simplify and fix amdgpu_sync_resv
No matter what we always need to sync to moves.

Signed-off-by: Christian König <christian.koenig@amd.com>
Tested-by: Tom St Denis <tom.stdenis@amd.com>
Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2020-02-04 23:30:39 -05:00
Christian König
fe6796ac12 drm/amdgpu: allow higher level PD invalidations
Allow partial invalidation on unallocated PDs. This is useful when we
need to silence faults to stop interrupt floods on Vega.

Signed-off-by: Christian König <christian.koenig@amd.com>
Tested-by: Tom St Denis <tom.stdenis@amd.com>
Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2020-02-04 23:30:39 -05:00
Christian König
7d28efe0c3 drm/amdgpu: return EINVAL instead of ENOENT in the VM code
That we can't find a PD above the root is expected can only happen if
we try to update a larger range than actually managed by the VM.

Signed-off-by: Christian König <christian.koenig@amd.com>
Tested-by: Tom St Denis <tom.stdenis@amd.com>
Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2020-02-04 23:30:39 -05:00
Christian König
bfcd6c69e4 drm/amdgpu: fix parentheses in amdgpu_vm_update_ptes
For the root PD mask can be 0xffffffff as well which would
overrun to 0 if we don't cast it before we add one.

Signed-off-by: Christian König <christian.koenig@amd.com>
Tested-by: Tom St Denis <tom.stdenis@amd.com>
Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2020-02-04 23:30:39 -05:00
Christian König
46cf5f7626 drm/amdgpu: make sure to never allocate PDs/PTs for invalidations
Make sure that we never allocate a page table for an invalidation operation.

Signed-off-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2020-02-04 23:30:39 -05:00
Christian König
55cdd4e9b9 drm/amdgpu: drop unnecessary restriction for huge root PDEs
The root PD can also contain huge PDEs.

Signed-off-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2020-02-04 23:30:39 -05:00
Christian König
ef48d4b39e drm/amdgpu: stop using amdgpu_bo_gpu_offset in the VM backend
We need to update page tables without any lock held.

Signed-off-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2020-02-04 23:30:39 -05:00
Christian König
5d3196605d drm/amdgpu: rework job synchronization v2
For unlocked page table updates we need to be able
to sync to fences of a specific VM.

v2: use SYNC_ALWAYS in the UVD code

Signed-off-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2020-02-04 23:30:39 -05:00
Christian König
114fbc3195 drm/amdgpu: use the VM as job owner
For HMM we need to rework how VM synchronization works, so instead of the filp use VM as job owner.

Signed-off-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2020-02-04 23:30:39 -05:00
Christian König
81417bea87 drm/amdgpu: explicitly sync VM update to PDs/PTs
Explicitly sync VM updates to the moving fence in PDs and PTs.

Signed-off-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2020-02-04 23:30:39 -05:00
Nathan Chancellor
968162204a drm/amdgpu: Fix implicit enum conversion in gfx_v9_4_ras_error_inject
Clang warns:

../drivers/gpu/drm/amd/amdgpu/gfx_v9_4.c:967:35: warning: implicit
conversion from enumeration type 'enum amdgpu_ras_block' to different
enumeration type 'enum ta_ras_block' [-Wenum-conversion]
        block_info.block_id = info->head.block;
                            ~ ~~~~~~~~~~~^~~~~
1 warning generated.

Use the function added in commit 828cfa2909 ("drm/amdgpu: Fix amdgpu
ras to ta enums conversion") that handles this conversion explicitly.

Fixes: 4c461d89db ("drm/amdgpu: add RAS support for the gfx block of Arcturus")
Link: https://github.com/ClangBuiltLinux/linux/issues/849
Signed-off-by: Nathan Chancellor <natechancellor@gmail.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2020-01-30 17:15:43 -05:00
Stephen Rothwell
7044cb6c20 amdgpu: using vmalloc requires includeing vmalloc.h
Fixes: 240c811ccd ("drm/amdgpu: fix VRAM partially encroached issue in GDDR6 memory training(V2)")
Signed-off-by: Stephen Rothwell <sfr@canb.auug.org.au>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2020-01-30 17:15:42 -05:00
Nirmoy Das
977f7e1068 drm/amdgpu: allocate entities on demand
Currently we pre-allocate entities and fences for all the HW IPs on
context creation and some of which are might never be used.

This patch tries to resolve entity/fences wastage by creating entity
only when needed.

v2: allocate memory for entity and fences together

Signed-off-by: Nirmoy Das <nirmoy.das@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2020-01-30 17:15:42 -05:00
Joseph Greathouse
18c6b74e7c drm/amdgpu: Enable DISABLE_BARRIER_WAITCNT for Arcturus
In previous gfx9 parts, S_BARRIER shader instructions are implicitly
S_WAITCNT 0 instructions as well. This setting turns off that
mechanism in Arcturus and beyond. With this, shaders must follow the
ISA guide insofar as putting in explicit S_WAITCNT operations even
after an S_BARRIER.

v2: Fix patch title to list component

Signed-off-by: Joseph Greathouse <Joseph.Greathouse@amd.com>
Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2020-01-30 17:15:27 -05:00
Linus Torvalds
9f68e3655a drm pull for 5.6-rc1
uapi:
 - dma-buf heaps added (and fixed)
 - command line add support for panel oreientation
 - command line allow overriding penguin count
 
 drm:
 - mipi dsi definition updates
 - lockdep annotations for dma_resv
 - remove dma-buf kmap/kunmap support
 - constify fb_ops in all fbdev drivers
 - MST fix for daisy chained hotplug-
 - CTA-861-G modes with VIC >= 193 added
 - fix drm_panel_of_backlight export
 - LVDS decoder support
 - more device based logging support
 - scanline alighment for dumb buffers
 - MST DSC helpers
 
 scheduler:
 - documentation fixes
 - job distribution improvements
 
 panel:
 - Logic PD type 28 panel support
 - Jimax8729d MIPI-DSI
 - igenic JZ4770
 - generic DSI devicetree bindings
 - sony acx424AKP panel
 - Leadtek LTK500HD1829
 - xinpeng XPP055C272
 - AUO B116XAK01
 - GiantPlus GPM940B0
 - BOE NV140FHM-N49
 - Satoz SAT050AT40H12R2
 - Sharp LS020B1DD01D panels.
 
 ttm:
 - use blocking WW lock
 
 i915:
 - hw/uapi state separation
 - Lock annotation improvements
 - selftest improvements
 - ICL/TGL DSI VDSC support
 - VBT parsing improvments
 - Display refactoring
 - DSI updates + fixes
 - HDCP 2.2 for CFL
 - CML PCI ID fixes
 - GLK+ fbc fix
 - PSR fixes
 - GEN/GT refactor improvments
 - DP MST fixes
 - switch context id alloc to xarray
 - workaround updates
 - LMEM debugfs support
 - tiled monitor fixes
 - ICL+ clock gating programming removed
 - DP MST disable sequence fixed
 - LMEM discontiguous object maps
 - prefaulting for discontiguous objects
 - use LMEM for dumb buffers if possible
 - add LMEM mmap support
 
 amdgpu:
 - enable sync object timelines for vulkan
 - MST atomic routines
 - enable MST DSC support
 - add DMCUB display microengine support
 - DC OEM i2c support
 - Renoir DC fixes
 - Initial HDCP 2.x support
 - BACO support for Arcturus
 - Use BACO for runtime PM power save
 - gfxoff on navi10
 - gfx10 golden updates and fixes
 - DCN support on POWER
 - GFXOFF for raven1 refresh
 - MM engine idle handlers cleanup
 - 10bpc EDP panel fixes
 - renoir watermark fixes
 - SR-IOV fixes
 - Arcturus VCN fixes
 - GDDR6 training fixes
 - freesync fixes
 - Pollock support
 
 amdkfd:
 - unify more codepath with amdgpu
 - use KIQ to setup HIQ rather than MMIO
 
 radeon:
 - fix vma fault handler race
 - PPC DMA fix
 - register check fixes for r100/r200
 
 nouveau:
 - mmap_sem vs dma_resv fix
 - rewrite the ACR secure boot code for Turing
 - TU10x graphics engine support (TU11x pending)
 - Page kind mapping for turing
 - 10-bit LUT support
 - GP10B Tegra fixes
 - HD audio regression fix
 
 hisilicon/hibmc:
 - use generic fbdev code and helpers
 
 rockchip:
 - dsi/px30 support
 
 virtio:
 - fb damage support
 - static some functions
 
 vc4:
 - use dma_resv lock wrappers
 
 msm:
 - use dma_resv lock wrappers
 - sc7180 display + DSI support
 - a618 support
 - UBWC support improvements
 
 vmwgfx:
 - updates + new logging uapi
 
 exynos:
 - enable/disable callback cleanups
 
 etnaviv:
 - use dma_resv lock wrappers
 
 atmel-hlcdc:
 - clock fixes
 
 mediatek:
 - cmdq support
 - non-smooth cursor fixes
 - ctm property support
 
 sun4i:
 - suspend support
 - A64 mipi dsi support
 
 rcar-du:
 - Color management module support
 - LVDS encoder dual-link support
 - R8A77980 support
 
 analogic:
 - add support for an6345
 
 ast:
 - atomic modeset support
 - primary plane garbage fix
 
 arcgpu:
 - fixes for fourcc handling
 
 tegra:
 - minor fixes and improvments
 
 mcde:
 - vblank support
 
 meson:
 - OSD1 plane AFBC commit
 
 gma500:
 - add pageflip support
 - reomve global drm_dev
 
 komeda:
 - tweak debugfs output
 - d32 support
 - runtime PM suppotr
 
 udl:
 - use generic shmem helpers
 - cleanup and fixes
 -----BEGIN PGP SIGNATURE-----
 
 iQIcBAABAgAGBQJeMm6RAAoJEAx081l5xIa+vN8P/0j4jEOv+KIinAhoH+LG3EpD
 m2TUuu5OQIoBrcCoWOgFBk3wqYpw6PdMBdkXh+5sE5lfeBynp8oC3Bin+QsHJE05
 eGBpZtHe+70MQb0Eha+Aic0hchvBKzRnq6i0MYSIHn6afs76dLmF8knTjycxrvV5
 Xu1Z3WDmjzqgWF9ja5JCD6fby11seP5RrwObYKVikO35QQyJJwGSGKgu5rq/pByK
 /n0PCnCOINuL0Lz6J9qexdh/0/XYFQilRC31GJNlKbDSFuECF0GOEzEE/xUBW/pI
 dLh2YwIIygm18Gar9PgvMwXJn3BfzQ0qEJsf+HlQeNw9iLgbHpp2AsTxHTE87OGe
 R/y85taW3jGjPsNOKZOeLpvg/Ro8l8ZipLApvDCG2O22DThg/cd6NDjZxl1FJfRH
 acDG/JdgPo5MbdRAH/cM1WuFS9gEM+0BeSQ5gCjtPakF+X4Vz+ABFDLMRJoaejkJ
 q8DG32TQXELQx0RMghsqK7YCWGfl+2alA1u9w6TgJh9Rq4iVckvpDeqAZnK1Adkc
 87g957Tl0n6FA4wJj/t5jrceiLRMJAm/rBK+R3GZNfWrgx4bHbCmb4fZDZsrFzph
 nbAjNJ5kOchrFCaRR47ULby6+Q14MAFbkWq4Crfu4YDdzUkTPpep6pi2GIe8w0rV
 P0hdYOYJf6LUda0utuQX
 =oFrI
 -----END PGP SIGNATURE-----

Merge tag 'drm-next-2020-01-30' of git://anongit.freedesktop.org/drm/drm

Pull drm updates from Davbe Airlie:
 "This is the main pull request for graphics for 5.6. Usual selection of
  changes all over.

  I've got one outstanding vmwgfx pull that touches mm so kept it
  separate until after all of this lands. I'll try and get it to you
  soon after this, but it might be early next week (nothing wrong with
  code, just my schedule is messy)

  This also hits a lot of fbdev drivers with some cleanups.

  Other notables:
   - vulkan timeline semaphore support added to syncobjs
   - nouveau turing secureboot/graphics support
   - Displayport MST display stream compression support

  Detailed summary:

  uapi:
   - dma-buf heaps added (and fixed)
   - command line add support for panel oreientation
   - command line allow overriding penguin count

  drm:
   - mipi dsi definition updates
   - lockdep annotations for dma_resv
   - remove dma-buf kmap/kunmap support
   - constify fb_ops in all fbdev drivers
   - MST fix for daisy chained hotplug-
   - CTA-861-G modes with VIC >= 193 added
   - fix drm_panel_of_backlight export
   - LVDS decoder support
   - more device based logging support
   - scanline alighment for dumb buffers
   - MST DSC helpers

  scheduler:
   - documentation fixes
   - job distribution improvements

  panel:
   - Logic PD type 28 panel support
   - Jimax8729d MIPI-DSI
   - igenic JZ4770
   - generic DSI devicetree bindings
   - sony acx424AKP panel
   - Leadtek LTK500HD1829
   - xinpeng XPP055C272
   - AUO B116XAK01
   - GiantPlus GPM940B0
   - BOE NV140FHM-N49
   - Satoz SAT050AT40H12R2
   - Sharp LS020B1DD01D panels.

  ttm:
   - use blocking WW lock

  i915:
   - hw/uapi state separation
   - Lock annotation improvements
   - selftest improvements
   - ICL/TGL DSI VDSC support
   - VBT parsing improvments
   - Display refactoring
   - DSI updates + fixes
   - HDCP 2.2 for CFL
   - CML PCI ID fixes
   - GLK+ fbc fix
   - PSR fixes
   - GEN/GT refactor improvments
   - DP MST fixes
   - switch context id alloc to xarray
   - workaround updates
   - LMEM debugfs support
   - tiled monitor fixes
   - ICL+ clock gating programming removed
   - DP MST disable sequence fixed
   - LMEM discontiguous object maps
   - prefaulting for discontiguous objects
   - use LMEM for dumb buffers if possible
   - add LMEM mmap support

  amdgpu:
   - enable sync object timelines for vulkan
   - MST atomic routines
   - enable MST DSC support
   - add DMCUB display microengine support
   - DC OEM i2c support
   - Renoir DC fixes
   - Initial HDCP 2.x support
   - BACO support for Arcturus
   - Use BACO for runtime PM power save
   - gfxoff on navi10
   - gfx10 golden updates and fixes
   - DCN support on POWER
   - GFXOFF for raven1 refresh
   - MM engine idle handlers cleanup
   - 10bpc EDP panel fixes
   - renoir watermark fixes
   - SR-IOV fixes
   - Arcturus VCN fixes
   - GDDR6 training fixes
   - freesync fixes
   - Pollock support

  amdkfd:
   - unify more codepath with amdgpu
   - use KIQ to setup HIQ rather than MMIO

  radeon:
   - fix vma fault handler race
   - PPC DMA fix
   - register check fixes for r100/r200

  nouveau:
   - mmap_sem vs dma_resv fix
   - rewrite the ACR secure boot code for Turing
   - TU10x graphics engine support (TU11x pending)
   - Page kind mapping for turing
   - 10-bit LUT support
   - GP10B Tegra fixes
   - HD audio regression fix

  hisilicon/hibmc:
   - use generic fbdev code and helpers

  rockchip:
   - dsi/px30 support

  virtio:
   - fb damage support
   - static some functions

  vc4:
   - use dma_resv lock wrappers

  msm:
   - use dma_resv lock wrappers
   - sc7180 display + DSI support
   - a618 support
   - UBWC support improvements

  vmwgfx:
   - updates + new logging uapi

  exynos:
   - enable/disable callback cleanups

  etnaviv:
   - use dma_resv lock wrappers

  atmel-hlcdc:
   - clock fixes

  mediatek:
   - cmdq support
   - non-smooth cursor fixes
   - ctm property support

  sun4i:
   - suspend support
   - A64 mipi dsi support

  rcar-du:
   - Color management module support
   - LVDS encoder dual-link support
   - R8A77980 support

  analogic:
   - add support for an6345

  ast:
   - atomic modeset support
   - primary plane garbage fix

  arcgpu:
   - fixes for fourcc handling

  tegra:
   - minor fixes and improvments

  mcde:
   - vblank support

  meson:
   - OSD1 plane AFBC commit

  gma500:
   - add pageflip support
   - reomve global drm_dev

  komeda:
   - tweak debugfs output
   - d32 support
   - runtime PM suppotr

  udl:
   - use generic shmem helpers
   - cleanup and fixes"

* tag 'drm-next-2020-01-30' of git://anongit.freedesktop.org/drm/drm: (1998 commits)
  drm/nouveau/fb/gp102-: allow module to load even when scrubber binary is missing
  drm/nouveau/acr: return error when registering LSF if ACR not supported
  drm/nouveau/disp/gv100-: not all channel types support reporting error codes
  drm/nouveau/disp/nv50-: prevent oops when no channel method map provided
  drm/nouveau: support synchronous pushbuf submission
  drm/nouveau: signal pending fences when channel has been killed
  drm/nouveau: reject attempts to submit to dead channels
  drm/nouveau: zero vma pointer even if we only unreference it rather than free
  drm/nouveau: Add HD-audio component notifier support
  drm/nouveau: fix build error without CONFIG_IOMMU_API
  drm/nouveau/kms/nv04: remove set but not used variable 'width'
  drm/nouveau/kms/nv50: remove set but not unused variable 'nv_connector'
  drm/nouveau/mmu: fix comptag memory leak
  drm/nouveau/gr/gp10b: Use gp100_grctx and gp100_gr_zbc
  drm/nouveau/pmu/gm20b,gp10b: Fix Falcon bootstrapping
  drm/exynos: Rename Exynos to lowercase
  drm/exynos: change callback names
  drm/mst: Don't do atomic checks over disabled managers
  drm/amdgpu: add the lost mutex_init back
  drm/amd/display: skip opp blank or unblank if test pattern enabled
  ...
2020-01-30 08:04:01 -08:00
Alex Deucher
2cb44fb093 drm/amdgpu: enable GPU reset by default on renoir
Everything is in place.

Reviewed-by: Andrey Grodzovsky <andrey.grodzovsky@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2020-01-27 16:46:45 -05:00
Alex Deucher
658c663947 drm/amdgpu: enable GPU reset by default on Navi
Has been working fine for a while.

Reviewed-by: Andrey Grodzovsky <andrey.grodzovsky@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2020-01-27 16:46:45 -05:00
Christian König
77171eade8 drm/amdgpu: add coreboot workaround for KV/KB
Coreboot seems to have a problem correctly setting up access to the stolen VRAM
on KV/KB. Use the direct access only when necessary.

Signed-off-by: Christian König <christian.koenig@amd.com>
Reported-and-tested-by: Fredrik Bruhn <fredrik.bruhn@unibap.com>
Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2020-01-27 16:46:45 -05:00
Alex Deucher
276cc92945 drm/amdgpu: original raven doesn't support full asic reset
So don't use it.

Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2020-01-27 16:46:45 -05:00
Alex Deucher
7af2a5771e drm/amdgpu: attempt to enable gfxoff on more raven1 boards (v2)
Switch to a blacklist so we can disable specific boards
that are problematic.

v2: make the blacklist non-raven specific.

Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2020-01-27 16:46:45 -05:00
Colin Ian King
b20dcd72c1 drm/amd/amdgpu: fix spelling mistake "to" -> "too"
There is a spelling mistake in a DRM_ERROR message. Fix it.

Signed-off-by: Colin Ian King <colin.king@canonical.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2020-01-27 16:46:45 -05:00
xinhui pan
f583cc57ba drm/amdgpu: initialize bo_va_list when add gws to process
bo_va_list is list_head, so initialize it.

Signed-off-by: xinhui pan <xinhui.pan@amd.com>
Reviewed-by: Feifei Xu <Feifei.Xu@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2020-01-27 16:46:44 -05:00
James Zhu
55bbb747ec drm/amdgpu/vcn: use inst_idx relacing inst
Use inst_idx relacing inst in SOC15_DPG_MODE macro to avoid confusion.

Signed-off-by: James Zhu <James.Zhu@amd.com>
Reviewed-by: Leo Liu <leo.liu@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2020-01-27 16:46:44 -05:00
James Zhu
a455573214 drm/amdgpu/vcn: fix typo error
Fix typo error, should be inst_idx instead of inst.

Signed-off-by: James Zhu <James.Zhu@amd.com>
Reviewed-by: Leo Liu <leo.liu@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2020-01-27 16:46:44 -05:00
James Zhu
326b523eeb drm/amdgpu/vcn: fix vcn2.5 instance issue
Fix vcn2.5 instance issue, vcn0 and vcn1 have same register offset

Signed-off-by: James Zhu <James.Zhu@amd.com>
Reviewed-by: Leo Liu <leo.liu@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2020-01-27 16:46:44 -05:00
James Zhu
62884a7bf3 drm/amdgpu/vcn2.5: fix a bug for the 2nd vcn instance (v2)
Fix a bug for the 2nd vcn instance at start and stop.

v2: squash in unused label removal.

Signed-off-by: James Zhu <James.Zhu@amd.com>
Reviewed-by: Leo Liu <leo.liu@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2020-01-27 16:46:44 -05:00
James Zhu
b650121726 drm/amdgpu/vcn: Share vcn_v2_0_dec_ring_test_ring to vcn2.5
Share vcn_v2_0_dec_ring_test_ring to vcn2.5 to support
vcn software ring.

Signed-off-by: James Zhu <James.Zhu@amd.com>
Reviewed-by: Leo Liu <leo.liu@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2020-01-27 16:46:44 -05:00
Felix Kuehling
fa34edbed4 drm/amdgpu: Use the correct flush_type in flush_gpu_tlb_pasid
The flush_type was incorrectly hard-coded to 0 when calling falling back
to MMIO-based invalidation in flush_gpu_tlb_pasid.

Fixes: ea930000a6 ("drm/amdgpu: export function to flush TLB via pasid")
Signed-off-by: Felix Kuehling <Felix.Kuehling@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Oak Zeng <Oak.Zeng@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2020-01-27 16:46:44 -05:00
Felix Kuehling
37c58ddf57 drm/amdgpu: Fix TLB invalidation request when using semaphore
Use a more meaningful variable name for the invalidation request
that is distinct from the tmp variable that gets overwritten when
acquiring the invalidation semaphore.

Fixes: 4ed8a03740 ("drm/amdgpu: invalidate mmhub semaphore workaround in gmc9/gmc10")
Signed-off-by: Felix Kuehling <Felix.Kuehling@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Yong Zhao <Yong.Zhao@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2020-01-27 16:46:36 -05:00
Chris Wilson
7e13ad8964 drm: Avoid drm_global_mutex for simple inc/dec of dev->open_count
Since drm_global_mutex is a true global mutex across devices, we don't
want to acquire it unless absolutely necessary. For maintaining the
device local open_count, we can use atomic operations on the counter
itself, except when making the transition to/from 0. Here, we tackle the
easy portion of delaying acquiring the drm_global_mutex for the final
release by using atomic_dec_and_mutex_lock(), leaving the global
serialisation across the device opens.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Thomas Hellström (VMware) <thomas_os@shipmail.org>
Reviewed-by: Thomas Hellström <thellstrom@vmware.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20200124130107.125404-1-chris@chris-wilson.co.uk
2020-01-24 17:41:34 +00:00
Alex Deucher
23fe1390c7 drm/amdgpu: remove the experimental flag for renoir
Should work properly with the latest sbios on 5.5 and newer
kernels.

Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2020-01-23 12:14:53 -05:00
Nirmoy Das
63e3ab9a82 drm/amdgpu: individualize fence allocation per entity
Allocate fences for each entity and remove ctx->fences reference as
fences should be bound to amdgpu_ctx_entity instead amdgpu_ctx.

Signed-off-by: Nirmoy Das <nirmoy.das@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2020-01-22 16:55:27 -05:00
Tianci.Yin
7db1d560a4 Revert "drm/amdgpu: fix modprobe failure of the secondary GPU when GDDR6 training enabled(V5)"
This reverts commit 9e44147862.

The patch will be replaced with a better solution, revert it.

Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Tianci.Yin <tianci.yin@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2020-01-22 16:55:27 -05:00
Tianci.Yin
240c811ccd drm/amdgpu: fix VRAM partially encroached issue in GDDR6 memory training(V2)
[why]
In GDDR6 BIST training, a certain mount of bottom VRAM will be encroached by
UMC, that causes problems(like GTT corrupted and page fault observed).

[how]
Saving the content of this bottom VRAM to system memory before training, and
restoring it after training to avoid VRAM corruption.

Acked-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Tianci.Yin <tianci.yin@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2020-01-22 16:55:27 -05:00
Nirmoy Das
a9d4fe2fd6 drm/amdgpu: remove unnecessary conversion to bool
Better clean that up before some automation starts to complain about it

Signed-off-by: Nirmoy Das <nirmoy.das@amd.com>
Acked-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2020-01-22 16:55:27 -05:00
Dennis Li
4c461d89db drm/amdgpu: add RAS support for the gfx block of Arcturus
Implement functions to do the RAS error injection and
query EDC counter.

Signed-off-by: Dennis Li <Dennis.Li@amd.com>
Reviewed-by: Guchun Chen <guchun.chen@amd.com>
Reviewed-by: Tao Zhou <tao.zhou1@amd.com>
Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2020-01-22 16:36:30 -05:00
Dennis Li
504c5e72d7 drm/amdgpu: abstract EDC counter clear to a separated function
1. Add IP prefix for the IP related codes.
2. Refactor the code to clear EDC counter.

Signed-off-by: Dennis Li <Dennis.Li@amd.com>
Reviewed-by: Guchun Chen <guchun.chen@amd.com>
Reviewed-by: Tao Zhou <tao.zhou1@amd.com>
Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2020-01-22 16:36:14 -05:00
Dennis Li
5e66403e4d drm/amdgpu: refine the security check for RAS functions
To avoid calling RAS related functions when RAS feature isn't
supported in hardware. Change to check supported features, instead
of checking asic type.

v2: reuse amdgpu_ras_is_supported function, instead of introducing
a new flag for hardware ras feature.

Signed-off-by: Dennis Li <Dennis.Li@amd.com>
Reviewed-by: Guchun Chen <guchun.chen@amd.com>
Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com>
Reviewed-by: Tao Zhou <tao.zhou1@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2020-01-22 16:36:04 -05:00
Dennis Li
39aa0ef163 drm/amdgpu: enable RAS feature for more mmhub sub-blocks of Acrturus
Compared with Vg20, the size of mmhub range is changed from 2 to 8.

Signed-off-by: Dennis Li <Dennis.Li@amd.com>
Reviewed-by: Guchun Chen <guchun.chen@amd.com>
Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2020-01-22 16:35:56 -05:00
chen gong
e3cd03603d drm/amdgpu: read gfx register using RREG32_KIQ macro
Reading CP_MEM_SLP_CNTL register with RREG32_SOC15 macro will lead to
hang when GPU is in "gfxoff" state.
I do a uniform substitution here.

Signed-off-by: chen gong <curry.gong@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Acked-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2020-01-22 16:34:29 -05:00
chen gong
c68dbcd8f9 drm/amdgpu: add kiq version interface for RREG32/WREG32
Reading some registers by mmio will result in hang when GPU is in
"gfxoff" state.This problem can be solved by GPU in "ring command
packages" way.

Signed-off-by: chen gong <curry.gong@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Acked-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2020-01-22 16:34:22 -05:00
chen gong
d33a99c4b6 drm/amdgpu: provide a generic function interface for reading/writing register by KIQ
Move amdgpu_virt_kiq_rreg/amdgpu_virt_kiq_wreg function to amdgpu_gfx.c,
and rename them to amdgpu_kiq_rreg/amdgpu_kiq_wreg.Make it generic and
flexible.

Signed-off-by: chen gong <curry.gong@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Reviewed-by: Huang Rui <ray.huang@amd.com>
Acked-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2020-01-22 16:34:14 -05:00
John Clements
a6c44d2538 drm/amdgpu: added support to get mGPU DRAM base
resolves issue with RAS error injection in mGPU configuration

Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com>
Signed-off-by: John Clements <john.clements@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2020-01-22 16:34:07 -05:00
Alex Sierra
36a1707afd drm/amdgpu: modify packet size for pm4 flush tlbs
[Why]
PM4 packet size for flush message was oversized.

[How]
Packet size adjusted to allocate flush + fence packets.

Signed-off-by: Alex Sierra <alex.sierra@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2020-01-22 16:33:52 -05:00
Pan, Xinhui
bd05221123 drm/amdgpu: add the lost mutex_init back
Initialize notifier_lock.

Bug: https://gitlab.freedesktop.org/drm/amd/issues/1016
Reviewed-by: Feifei Xu <Feifei.Xu@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: xinhui pan <xinhui.pan@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2020-01-17 16:03:09 -05:00
Hawking Zhang
e9d4cf918f drm/amdgpu: add arcturus to gpu recovery check code path
support check if dirver should try gpu recovery for
arcturus

Signed-off-by: Hawking Zhang <Hawking.Zhang@amd.com>
Reviewed-by: Guchun Chen <guchun.chen@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2020-01-16 13:40:54 -05:00
Hawking Zhang
93af20f74e drm/amdgpu: check if driver should try recovery in ras recovery path
To allow the flexibilty for user to disable gpu recovery
in RAS recovery path by module parameter amdgpu_gpu_recovery

Signed-off-by: Hawking Zhang <Hawking.Zhang@amd.com>
Reviewed-by: Guchun Chen <guchun.chen@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2020-01-16 13:40:47 -05:00
Evan Quan
2ac0d68697 drm/amd/powerplay: a quick fix for the deadlock issue below
NFO: task ocltst:2028 blocked for more than 120 seconds.
     Tainted: G           OE     5.0.0-37-generic #40~18.04.1-Ubuntu
echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
cltst          D    0  2028   2026 0x00000000
all Trace:
__schedule+0x2c0/0x870
schedule+0x2c/0x70
schedule_preempt_disabled+0xe/0x10
__mutex_lock.isra.9+0x26d/0x4e0
__mutex_lock_slowpath+0x13/0x20
? __mutex_lock_slowpath+0x13/0x20
mutex_lock+0x2f/0x40
amdgpu_dpm_set_powergating_by_smu+0x64/0xe0 [amdgpu]
gfx_v8_0_enable_gfx_static_mg_power_gating+0x3c/0x70 [amdgpu]
gfx_v8_0_set_powergating_state+0x66/0x260 [amdgpu]
amdgpu_device_ip_set_powergating_state+0x62/0xb0 [amdgpu]
pp_dpm_force_performance_level+0xe7/0x100 [amdgpu]
amdgpu_set_dpm_forced_performance_level+0x129/0x330 [amdgpu]

Fixes: a64c9e15e6 ("drm/amd/powerplay: cleanup the interfaces for powergate setting through SMU")
Signed-off-by: Evan Quan <evan.quan@amd.com>
Reported-by: Rui Teng <Rui.Teng@amd.com>
Acked-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2020-01-16 13:38:23 -05:00
Huang Rui
0e5b7a9528 drm/amdgpu: only set cp active field for kiq queue
The mec ucode will set the CP_HQD_ACTIVE bit while the queue is mapped by
MAP_QUEUES packet. So we only need set cp active field for kiq queue.

Signed-off-by: Huang Rui <ray.huang@amd.com>
Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2020-01-16 13:38:16 -05:00
Alex Deucher
27414cd42a drm/amdgpu/pm: clean up return types
count is size_t so don't use negative values.

Reviewed-by: Evan Quan <evan.quan@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2020-01-16 13:38:02 -05:00
James Zhu
0c0dab86d9 drm/amdgpu/vcn2.5: implement indirect DPG SRAM mode
Implement indirect DPG SRAM mode for vcn2.5

Signed-off-by: James Zhu <James.Zhu@amd.com>
Reviewed-by: Leo Liu <leo.liu@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2020-01-16 13:37:47 -05:00
James Zhu
8484df9601 drm/amdgpu/vcn2.5: add dpg pause mode
Add dpg pause mode support for vcn2.5

Signed-off-by: James Zhu <James.Zhu@amd.com>
Reviewed-by: Leo Liu <leo.liu@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2020-01-16 13:37:41 -05:00
James Zhu
d2a2c64f53 drm/amdgpu/vcn2.5: add DPG mode start and stop
Add DPG mode start and stop functions for vcn2.5

v2: Correct firmware ucode index in vcn_v2_5_mc_resume_dpg_mode

Signed-off-by: James Zhu <James.Zhu@amd.com>
Reviewed-by: Leo Liu <leo.liu@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2020-01-16 13:37:34 -05:00
James Zhu
45cec87cd6 drm/amdgpu/vcn: move macro from vcn2.0 to share amdgpu_vcn (v2)
Move macro from vcn2.0 to amdgpu_vcn to share with vcn2.5

v2: squash in macro fix

Signed-off-by: James Zhu <James.Zhu@amd.com>
Reviewed-by: Leo Liu <leo.liu@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2020-01-16 13:37:34 -05:00
James Zhu
5db86843e8 drm/amdgpu/vcn: support multiple instance direct SRAM read and write (v2)
Add multiple instance direct SRAM read and write support for vcn2.5

v2: squash in indexing fix

Signed-off-by: James Zhu <James.Zhu@amd.com>
Reviewed-by: Leo Liu <leo.liu@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2020-01-16 13:36:47 -05:00
James Zhu
597e6ac3a7 drm/amdgpu/vcn: support multiple-instance dpg pause mode
Add multiple-instance dpg pause mode support for VCN2.5

Signed-off-by: James Zhu <James.Zhu@amd.com>
Reviewed-by: Leo Liu <leo.liu@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2020-01-16 13:35:51 -05:00
Tianci.Yin
9e44147862 drm/amdgpu: fix modprobe failure of the secondary GPU when GDDR6 training enabled(V5)
[why]
In dual GPUs scenario, stolen_size is assigned to zero on the secondary GPU,
since there is no pre-OS console using that memory. Then the bottom region of
VRAM was allocated as GTT, unfortunately a small region of bottom VRAM was
encroached by UMC firmware during GDDR6 BIST training, this cause page fault.

[how]
Forcing stolen_size to 3MB, then the bottom region of VRAM was
allocated as stolen memory, GTT corruption avoid.

Reviewed-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Feifei Xu <Feifei.Xu@amd.com>
Signed-off-by: Tianci.Yin <tianci.yin@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2020-01-16 13:35:37 -05:00
Tianci.Yin
6a1094ab68 drm/amdgpu/gfx10: update gfx golden settings for navi14
remove registers: mmSPI_CONFIG_CNTL
add registers: mmSPI_CONFIG_CNTL_1

Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Tianci.Yin <tianci.yin@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2020-01-16 13:35:30 -05:00
Tianci.Yin
7b7041f892 drm/amdgpu/gfx10: update gfx golden settings
remove registers: mmSPI_CONFIG_CNTL
add registers: mmSPI_CONFIG_CNTL_1

Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Tianci.Yin <tianci.yin@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2020-01-16 13:35:23 -05:00
shaoyunl
b4df2823ec drm/amdgpu: check rlc_g firmware pointer is valid before using it
In SRIOV, rlc_g firmware is loaded by host, guest driver won't load it which will
cause the rlc_fw pointer is null

Signed-off-by: shaoyunl <shaoyun.liu@amd.com>
Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2020-01-16 13:35:17 -05:00
Christian König
971fe55545 drm/amdgpu: drop amdgpu_job.owner
Entirely unused.

Signed-off-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2020-01-16 13:35:10 -05:00
Nirmoy Das
55414ad5c9 drm/amdgpu: error out on entity with no run queue
Disabled HW IP's entity initialized with NULL rq. We should not
process any submit request from userspace for a disabled HW IP.

Signed-off-by: Nirmoy Das <nirmoy.das@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2020-01-16 13:35:03 -05:00
Huang Rui
8eee00f615 drm/amdkfd: use map_queues for hiq on gfx v10 as well
To align with gfx v9, we use the map_queues packet to load hiq MQD.

Signed-off-by: Huang Rui <ray.huang@amd.com>
Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2020-01-16 13:34:57 -05:00
Aaron Liu
35cd89d5a6 drm/amdkfd: use kiq to load the mqd of hiq queue for gfx v9 (v6)
There is an issue that CP will check the HIQ queue to be configured and mapped
with KIQ ring, otherwise, it will be unable to read back the secure buffer while
the gfxoff is enabled even with trusted IP blocks.

v1 -> v2:
- Fix to remove surplus set_resources packets.
- Fill the whole configuration in MQD.
- Change the author as Aaron because he addressed the key point of this issue.
- Add kiq ring lock.

v2 -> v3:
- Free the lock while in error return case.
- Remove the programming only needed by the queue is unmapped.

v3 -> v4:
- Remove doorbell programming because it's used for restarting queue.
- Remove CP scheduler programming because map_queue packet will handle this.

v4 -> v5:
- Remove cp_hqd_active because mec ucode will enable it while use map_queues.
- Revise goto out_unlock.
- Correct the right doorbell offset for HIQ that kfd driver assigned in the
  packet.

v5 -> v6:
- Merge Arcturus fix into this patch because it will get oops in Arcturus
  platform.

Reported-by: Lisa Saturday <Lisa.Saturday@amd.com>
Signed-off-by: Aaron Liu <aaron.liu@amd.com>
Signed-off-by: Huang Rui <ray.huang@amd.com>
Reviewed-and-Tested-by: Aaron Liu <aaron.liu@amd.com>
Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2020-01-16 13:34:50 -05:00
Alex Sierra
d175e9acf6 drm/amdgpu: flush TLB functions removal from kfd2kgd interface
[Why]
kfd2kgd interface will be deprecated. This removal only covers TLB
invalidation for now. They have been replaced in amdgpu_amdkfd API.

[How]
TLB invalidate functions removed from the different amdkfd_gfx_v*
versions.

Signed-off-by: Alex Sierra <alex.sierra@amd.com>
Acked-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2020-01-16 13:34:42 -05:00
Alex Sierra
ffa022696f drm/amdgpu: GPU TLB flush API moved to amdgpu_amdkfd
[Why]
TLB flush method has been deprecated using kfd2kgd interface.
This implementation is now on the amdgpu_amdkfd API.

[How]
TLB flush functions now implemented in amdgpu_amdkfd.

Signed-off-by: Alex Sierra <alex.sierra@amd.com>
Acked-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2020-01-16 13:34:33 -05:00
Alex Sierra
ea930000a6 drm/amdgpu: export function to flush TLB via pasid
This can be used directly from amdgpu and amdkfd to invalidate
TLB through pasid.
It supports gmc v7, v8, v9 and v10.

Signed-off-by: Alex Sierra <alex.sierra@amd.com>
Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2020-01-16 13:34:27 -05:00
Alex Sierra
4f01f1e58e drm/amdgpu: replace kcq enable/disable functions on gfx_v9
[Why]
There are HW-indpendent functions that enables and disables kcq. These functions use
the kiq_pm4_funcs implementation.

[How]
Local kcq enable and disable functions removed and replace it by the generic kcq
enable under amdgpu_gfx

Signed-off-by: Alex Sierra <alex.sierra@amd.com>
Acked-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2020-01-16 13:34:19 -05:00
Alex Sierra
58e508b6be drm/amdgpu: implement tlbs invalidate on gfx9 gfx10
tlbs invalidate pointer function added to kiq_pm4_funcs struct.
This way, tlb flush can be done through kiq member.
TLBs invalidatation implemented for gfx9 and gfx10.

Signed-off-by: Alex Sierra <alex.sierra@amd.com>
Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2020-01-16 13:34:11 -05:00
Alex Sierra
f167ea6a14 drm/amdgpu: kiq pm4 function implementation for gfx_v9
Functions implemented from kiq_pm4_funcs struct members
for gfx_v9 version.

Signed-off-by: Alex Sierra <alex.sierra@amd.com>
Acked-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2020-01-16 13:34:04 -05:00
Alex Sierra
a269e44989 drm/amdgpu: Avoid reclaim fs while eviction lock
[Why]
Avoid reclaim filesystem while eviction lock is held called from
MMU notifier.

[How]
Setting PF_MEMALLOC_NOFS flags while eviction mutex is locked.
Using memalloc_nofs_save / memalloc_nofs_restore API.

Signed-off-by: Alex Sierra <alex.sierra@amd.com>
Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2020-01-16 13:33:16 -05:00
Christian König
5e791166d3 drm/ttm: nuke invalidate_caches callback
Another completely unused feature.

Signed-off-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Huang Rui <ray.huang@amd.com>
Link: https://patchwork.freedesktop.org/patch/348265/
2020-01-16 16:35:07 +01:00
Aaron Liu
f2360e333b drm/amdgpu: update goldensetting for renoir
Update mmSDMA0_UTCL1_WATERMK golden setting for renoir.

Signed-off-by: Aaron Liu <aaron.liu@amd.com>
Reviewed-by: Huang Rui <ray.huang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2020-01-15 17:52:21 -05:00
Alex Deucher
a9ffe2a983 drm/amdgpu/debugfs: properly handle runtime pm
If driver debugfs files are accessed, power up the GPU
when necessary.

Reviewed-by: Evan Quan <evan.quan@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2020-01-14 10:20:34 -05:00
Alex Deucher
b9a9294b91 drm/amdgpu/pm: properly handle runtime pm
If power management sysfs or debugfs files are accessed,
power up the GPU when necessary.

Reviewed-by: Evan Quan <evan.quan@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2020-01-14 10:20:34 -05:00
Flora Cui
f81110b852 drm/amdgpu: add header file for macro SZ_1M
Fixes: 4dee6e4ca5 ("drm/amdgpu: use linux size macro to simplify ONE_Kib & One_Mib")
Signed-off-by: Flora Cui <flora.cui@amd.com>
Reviewed-by: Kevin Wang <kevin1.wang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2020-01-14 10:18:58 -05:00
Alex Deucher
a2e4b418c6 drm/amdgpu/psp: declare navi1x ta firmware
So that it gets included in the initrd.  At the moment
this is optional firmware that contains support for HDCP.

Reviewed-by: Evan Quan <evan.quan@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2020-01-14 10:18:58 -05:00
Joseph Greathouse
22d39fe729 drm/amdgpu: Match TC hash settings to DF settings (v2)
On Arcturus, data fabric hashing is set by the VBIOS, and
affects which addresses map to which memory channels. The
gfx core's caches also need to know this mapping, but the
hash settings for these these caches is set by the driver.

This change queries the DF to understand how the VBIOS
configured DF, then matches the TC hash configuration bits
to do the same thing.

v2: squash in warning fix

Signed-off-by: Joseph Greathouse <Joseph.Greathouse@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2020-01-14 10:18:58 -05:00
Joseph Greathouse
bdf84a80e0 drm/amdgpu: Create generic DF struct in adev
The only data fabric information the adev struct currently
contains is a function pointer table. In the near future,
we will be adding some cached DF information into adev. As
such, this patch creates a new amdgpu_df struct for adev.
Right now, it only containst the old function pointer table,
but new stuff will be added soon.

Signed-off-by: Joseph Greathouse <Joseph.Greathouse@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2020-01-14 10:18:41 -05:00
John Clements
eee2eabafe drm/amdgpu: preserve RSMU UMC index mode state
between UMC RAS err register access restore previous RSMU UMC index mode state

Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com>
Signed-off-by: John Clements <john.clements@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2020-01-14 10:18:10 -05:00
John Clements
9c8c81fe7d drm/amdgpu: disable XGMI TA unload for arcturus
in event of GPU reset, XGMI TA unload causes unrecoverable GPU hang

Acked-by: Hawking Zhang <Hawking.Zhang@amd.com>
Signed-off-by: John Clements <john.clements@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2020-01-14 10:18:10 -05:00
Aaron Liu
d8459d1b7f drm/amdgpu: update goldensetting for renoir
Update mmSDMA0_UTCL1_WATERMK golden setting for renoir.

Signed-off-by: Aaron Liu <aaron.liu@amd.com>
Reviewed-by: Huang Rui <ray.huang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2020-01-14 10:18:10 -05:00
Alex Deucher
1499bcc7a2 drm/amdgpu/gmc10: free stolen memory in late_init
We don't need to store the pre-OS console memory after
the driver has loaded so free it.

Reviewed-by: Huang Rui <ray.huang@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2020-01-14 10:18:10 -05:00
Alex Deucher
bbde7162f7 drm/amdgpu/gmc10: remove dead code
Leftover from bring up.  We look up the actual pre-OS memory usage
value later in the same function.

Reviewed-by: Huang Rui <ray.huang@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2020-01-14 10:18:10 -05:00
Alex Deucher
403c1ef0d2 drm/amdgpu: enable S/G display on PCO and RV2 (v2)
It should work on all Raven variants, but some users have
reported issues with original Raven with IOMMU enabled.
So far there have been no issues observed with PCO or RV2.

v2: split out the dm init and domain changes into separate
    patches.

Acked-by: Harry Wentland <harry.wentland@amd.com>
Acked-by: Huang Rui <ray.huang@amd.com>
Acked-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2020-01-14 10:18:09 -05:00
Alex Deucher
d44394a9e1 drm/amdgpu/gfx9: remove unused sdma headers
All of the sdma stuff these were used for moves to
the sdma code, so remove them.

Reviewed-by: Huang Rui <ray.huang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2020-01-14 10:18:09 -05:00
Hawking Zhang
49da2ccd2d drm/amdgpu: check sdma ras funcs pointer before accessing
sdma ras funcs are not supported by ASIC prior
to vega20

Signed-off-by: Hawking Zhang <Hawking.Zhang@amd.com>
Reviewed-by: Le Ma <Le.Ma@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2020-01-14 10:18:09 -05:00
Guchun Chen
5d4667ec33 drm/amdgpu: calculate MCUMC_ADDRT0 per asic's UMC offset
Hardcoded offset is not friendly. And another benifit of this
patch is to keep read and write access to this register be
consistent with other similar UMC regsiters  in this file.

Signed-off-by: Guchun Chen <guchun.chen@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2020-01-14 10:18:09 -05:00
Tiecheng Zhou
df5e984c8b drm/amdgpu/sriov: workaround on rev_id for Navi12 under sriov
guest vm gets 0xffffffff when reading RCC_DEV0_EPF0_STRAP0,
as a consequence, the rev_id and external_rev_id are wrong.

workaround it by hardcoding the rev_id to 0, which is the default value.

v2. add comment in the code

Signed-off-by: Tiecheng Zhou <Tiecheng.Zhou@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2020-01-14 10:18:09 -05:00
Hawking Zhang
5e62db9df6 drm/amdgpu: read sdma edc counter to clear the counters
SDMA edc counter registers were added in gfx edc counters
array. When querying gfx error counter in that array, there
is no way to differentiate sdma instance number for different
asic and then results to NULL pointer access when trying to
read sdma register base address for instances greater
than 2 on Vega20.
In addition, this also results to wrong gfx error counters
since it actually added sdma edc counters.
Therefore, sdma edc counter registers should be separated
from gfx edc counter regsiter array and only get initialized
when driver tries to enable sdma ras.

Signed-off-by: Hawking Zhang <Hawking.Zhang@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2020-01-14 10:18:08 -05:00