Commit Graph

8027 Commits

Author SHA1 Message Date
Felix Kuehling
11b29c9e25 drm/amdkfd: Fix incorrect use of process->mm
This mm_struct pointer should never be dereferenced. If running in
a user thread, just use current->mm. If running in a kernel worker
use get_task_mm to get a safe reference to the mm_struct.

Reviewed-by: Oded Gabbay <oded.gabbay@gmail.com>
Acked-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Felix Kuehling <Felix.Kuehling@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2018-10-04 11:37:25 -04:00
Shirish S
987bf11644 drm/amd/display: Signal hw_done() after waiting for flip_done()
In amdgpu_dm_commit_tail(), wait until flip_done() is signaled before
we signal hw_done().

[Why]

This is to temporarily address a paging error that occurs when a
nonblocking commit contends with another commit, particularly in a
mirrored display configuration where at least 2 CRTCs are updated.
The error occurs in drm_atomic_helper_wait_for_flip_done(), when we
attempt to access the contents of new_crtc_state->commit.

Here's the sequence for a mirrored 2 display setup (irrelevant steps
left out for clarity):

**THREAD 1**                        | **THREAD 2**
                                    |
Initialize atomic state for flip    |
                                    |
Queue worker                        |
                                   ...

                                    | Do work for flip
                                    |
                                    | Signal hw_done() on CRTC 1
                                    | Signal hw_done() on CRTC 2
                                    |
                                    | Wait for flip_done() on CRTC 1

                                <---- **PREEMPTED BY THREAD 1**

Initialize atomic state for cursor  |
update (1)                          |
                                    |
Do cursor update work on both CRTCs |
                                    |
Clear atomic state (2)              |
**DONE**                            |
                                   ...
                                    |
                                    | Wait for flip_done() on CRTC 2
                                    | *ERROR*
                                    |

The issue starts with (1). When the atomic state is initialized, the
current CRTC states are duplicated to be the new_crtc_states, and
referenced to be the old_crtc_states. (The new_crtc_states are to be
filled with update data.)

Some things to note:

* Due to the mirrored configuration, the cursor updates on both CRTCs.

* At this point, the pflip IRQ has already been handled, and flip_done
  signaled on all CRTCs. The cursor commit can therefore continue.

* The old_crtc_states used by the cursor update are the **same states**
  as the new_crtc_states used by the flip worker.

At (2), the old_crtc_state is freed (*), and the cursor commit
completes. We then context switch back to the flip worker, where we
attempt to access the new_crtc_state->commit object. This is
problematic, as this state has already been freed.

(*) Technically, 'state->crtcs[i].state' is freed, which was made to
    reference old_crtc_state in drm_atomic_helper_swap_state()

[How]

By moving hw_done() after wait_for_flip_done(), we're guaranteed that
the new_crtc_state (from the flip worker's perspective) still exists.
This is because any other commit will be blocked, waiting for the
hw_done() signal.

Note that both the i915 and imx drivers have this sequence flipped
already, masking this problem.

Signed-off-by: Shirish S <shirish.s@amd.com>
Signed-off-by: Leo Li <sunpeng.li@amd.com>
Reviewed-by: Harry Wentland <harry.wentland@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2018-10-04 11:21:03 -04:00
Bhawanpreet Lakha
fbbdadf2fa drm/amd/display: Fix Edid emulation for linux
[Why]
EDID emulation didn't work properly for linux, as we stop programming
if nothing is connected physically.

[How]
We get a flag from DRM when we want to do edid emulation. We check if
this flag is true and nothing is connected physically, if so we only
program the front end using VIRTUAL_SIGNAL.

Signed-off-by: Bhawanpreet Lakha <Bhawanpreet.Lakha@amd.com>
Reviewed-by: Harry Wentland <Harry.Wentland@amd.com>
Acked-by: Leo Li <sunpeng.li@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2018-09-27 10:05:21 -05:00
Roman Li
599760d6d0 drm/amd/display: Fix Vega10 lightup on S3 resume
[Why]
There have been a few reports of Vega10 display remaining blank
after S3 resume. The regression is caused by workaround for mode
change on Vega10 - skip set_bandwidth if stream count is 0.
As a result we skipped dispclk reset on suspend, thus on resume
we may skip the clock update assuming it hasn't been changed.
On some systems it causes display blank or 'out of range'.

[How]
Revert "drm/amd/display: Fix Vega10 black screen after mode change"
Verified that it hadn't cause mode change regression.

Signed-off-by: Roman Li <Roman.Li@amd.com>
Reviewed-by: Sun peng Li <Sunpeng.Li@amd.com>
Acked-by: Leo Li <sunpeng.li@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2018-09-27 10:03:12 -05:00
Rex Zhu
61ea6f5831 drm/amdgpu: Fix vce work queue was not cancelled when suspend
The vce cancel_delayed_work_sync never be called.
driver call the function in error path.

This caused the A+A suspend hang when runtime pm enebled.
As we will visit the smu in the idle queue. this will cause
smu hang because the dgpu has been suspend, and the dgpu also
will be waked up. As the smu has been hang, so the dgpu resume
will failed.

Reviewed-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Feifei Xu <Feifei.Xu@amd.com>
Signed-off-by: Rex Zhu <Rex.Zhu@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Cc: stable@vger.kernel.org
2018-09-27 10:01:20 -05:00
Yong Zhao
44d8cc6f1a drm/amdkfd: Fix ATS capablity was not reported correctly on some APUs
Because CRAT_CU_FLAGS_IOMMU_PRESENT was not set in some BIOS crat, we
need to workaround this.

For future compatibility, we also overwrite the bit in capability according
to the value of needs_iommu_device.

Acked-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Yong Zhao <Yong.Zhao@amd.com>
Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com>
Signed-off-by: Felix Kuehling <Felix.Kuehling@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2018-09-20 10:25:23 -05:00
Yong Zhao
15426dbb65 drm/amdkfd: Change the control stack MTYPE from UC to NC on GFX9
CWSR fails on Raven if the control stack is MTYPE_UC, which is used
for regular GART mappings. As a workaround we map it using MTYPE_NC.

The MEC firmware expects the control stack at one page offset from the
start of the MQD so it is part of the MQD allocation on GFXv9. AMDGPU
added a memory allocation flag just for this purpose.

Acked-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Yong Zhao <yong.zhao@amd.com>
Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com>
Signed-off-by: Felix Kuehling <Felix.Kuehling@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2018-09-20 10:25:17 -05:00
Amber Lin
caaa4c8a6b drm/amdgpu: Fix SDMA HQD destroy error on gfx_v7
A wrong register bit was examinated for checking SDMA status so it reports
false failures. This typo only appears on gfx_v7. gfx_v8 checks the correct
bit.

Acked-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Amber Lin <Amber.Lin@amd.com>
Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com>
Signed-off-by: Felix Kuehling <Felix.Kuehling@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2018-09-20 10:25:01 -05:00
Alex Deucher
30f3984ede drm/amdgpu: add new polaris pci id
Add new pci id.

Reviewed-by: Rex Zhu <Rex.Zhu@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Cc: stable@vger.kernel.org
2018-09-19 22:35:23 -05:00
Christian König
0165de9832 drm/amdgpu: fix error handling in amdgpu_cs_user_fence_chunk
Slowly leaking memory one page at a time :)

Signed-off-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Andrey Grodzovsky <andrey.grodzovsky@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2018-09-11 16:35:00 -05:00
Emily Deng
3a74987b24 drm/amdgpu: move PSP init prior to IH in gpu reset
since we use PSP to program IH regs now

Signed-off-by: Monk Liu <Monk.Liu@amd.com>
Acked-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Huang Rui <ray.huang@amd.com>
Signed-off-by: Emily Deng <Emily.Deng@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2018-09-10 22:58:21 -05:00
Tao Zhou
68ebc13ea4 drm/amdgpu: Fix SDMA hang in prt mode v2
Fix SDMA hang in prt mode, clear XNACK_WATERMARK in reg SDMA0_UTCL1_WATERMK to avoid the issue

Affected ASICs: VEGA10 VEGA12 RV1 RV2

v2: add reg clear for SDMA1

Signed-off-by: Tao Zhou <tao.zhou1@amd.com>
Tested-by: Yukun Li <yukun1.li@amd.com>
Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com>
Acked-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2018-09-10 22:56:27 -05:00
Christian König
b463d4e53c drm/amdgpu: fix amdgpu_mn_unlock() in the CS error path
Avoid unlocking a lock we never locked.

Signed-off-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Junwei Zhang <Jerry.Zhang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2018-09-10 22:53:29 -05:00
Dave Airlie
185c3cfaca Merge branch 'drm-fixes-4.19' of git://people.freedesktop.org/~agd5f/linux into drm-fixes
Fixes for 4.19:
- SR-IOV fixes
- Kasan and page fault fix on device removal
- S3 stability fix for CZ/ST
- VCE regression fixes for CIK parts
- Avoid holding the mn_lock when allocating memory
- DC memory leak fix
- BO eviction fix

Signed-off-by: Dave Airlie <airlied@redhat.com>
From: Alex Deucher <alexdeucher@gmail.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20180829202555.2653-1-alexander.deucher@amd.com
2018-08-30 11:34:14 +10:00
Emily Deng
6ddd9769db drm/amdgpu: Need to set moved to true when evict bo
Fix the VMC page fault when the running sequence is as below:
1.amdgpu_gem_create_ioctl
2.ttm_bo_swapout->amdgpu_vm_bo_invalidate, as not called
amdgpu_vm_bo_base_init, so won't called
list_add_tail(&base->bo_list, &bo->va). Even the bo was evicted,
it won't set the bo_base->moved.
3.drm_gem_open_ioctl->amdgpu_vm_bo_base_init, here only called
list_move_tail(&base->vm_status, &vm->evicted), but not set the
bo_base->moved.
4.amdgpu_vm_bo_map->amdgpu_vm_bo_insert_map, as the bo_base->moved is
not set true, the function amdgpu_vm_bo_insert_map will call
list_move(&bo_va->base.vm_status, &vm->moved)
5.amdgpu_cs_ioctl won't validate the swapout bo, as it is only in the
moved list, not in the evict list. So VMC page fault occurs.

Signed-off-by: Emily Deng <Emily.Deng@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2018-08-28 12:42:48 -05:00
Rex Zhu
2f4e7db0f7 drm/amdgpu: Remove duplicated power source update
when ac/dc switch, driver will be notified by acpi event.
then the power source will be updated. so don't need to
get power source when set power state.

Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Rex Zhu <Rex.Zhu@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2018-08-27 14:55:00 -05:00
SivapiriyanKumarasamy
e7603dadd3 drm/amd/display: Fix memory leak caused by missed dc_sink_release
[Why]
There is currently an intermittent hang from a memory leak in
DTN stress testing. It is caused by unfreed memory during driver
disable.

[How]
Do a dc_sink_release in the case that skips it incorrectly.

Signed-off-by: SivapiriyanKumarasamy <sivapiriyan.kumarasamy@amd.com>
Reviewed-by: Aric Cyr <Aric.Cyr@amd.com>
Acked-by: Leo Li <sunpeng.li@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2018-08-27 14:53:37 -05:00
Christian König
4a2de54dc1 drm/amdgpu: fix holding mn_lock while allocating memory
We can't hold the mn_lock while allocating memory.

Signed-off-by: Christian König <christian.koenig@amd.com>
Acked-by: Chunming Zhou <david1.zhou@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2018-08-27 14:50:56 -05:00
Rex Zhu
72ef23de20 drm/amdgpu: Power on uvd block when hw_fini
when hw_fini/suspend, smu only need to power on uvd block
if uvd pg is supported, don't need to call uvd to do hw_init.

v2: fix typo in patch descriptions and comments.

Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Tested-by: Michel Dänzer <michel.daenzer@amd.com>
Signed-off-by: Rex Zhu <Rex.Zhu@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2018-08-27 14:49:48 -05:00
Rex Zhu
2ab4d0e742 drm/amdgpu: Update power state at the end of smu hw_init.
For SI/Kv, the power state is managed by function
amdgpu_pm_compute_clocks.

when dpm enabled, we should call amdgpu_pm_compute_clocks
to update current power state instand of set boot state.

this change can fix the oops when kfd driver was enabled on Kv.

Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Tested-by: Michel Dänzer <michel.daenzer@amd.com>
Signed-off-by: Rex Zhu <Rex.Zhu@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2018-08-27 14:49:33 -05:00
Rex Zhu
6d39df146f drm/amdgpu: Fix vce initialize failed on Kaveri/Mullins
Forgot to add vce pg support via smu for Kaveri/Mullins.

Fixes: 561a5c83eadd ("drm/amd/pp: Unify powergate_uvd/vce/mmhub
                       to set_powergating_by_smu")

v2: refine patch descriptions suggested by Michel

Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Tested-by: Michel Dänzer <michel.daenzer@amd.com>
Signed-off-by: Rex Zhu <Rex.Zhu@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2018-08-27 14:49:15 -05:00
Rex Zhu
8ef23364b6 drm/amdgpu: Enable/disable gfx PG feature in rlc safe mode
This is required by gfx hw and can fix the rlc hang when
do s3 stree test on Cz/St.

Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Hang Zhou <hang.zhou@amd.com>
Signed-off-by: Rex Zhu <Rex.Zhu@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2018-08-27 14:48:55 -05:00
Felix Kuehling
fca5d95997 drm/amdgpu: Adjust the VM size based on system memory size v2
Set the VM size based on system memory size between the ASIC-specific
limits given by min_vm_size and max_bits. GFXv9 GPUs will keep their
default VM size of 256TB (48 bit). Only older GPUs will adjust VM size
depending on system memory size.

This makes more VM space available for ROCm applications on GFXv8 GPUs
that want to map all available VRAM and system memory in their SVM
address space.

v2:
* Clarify comment
* Round up memory size before >> 30
* Round up automatic vm_size to power of two

Signed-off-by: Felix Kuehling <Felix.Kuehling@amd.com>
Acked-by: Junwei Zhang <Jerry.Zhang@amd.com>
Reviewed-by: Huang Rui <ray.huang@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2018-08-27 14:48:25 -05:00
Linus Torvalds
5e8704ac1c amdgpu and panel/misc fixes.
-----BEGIN PGP SIGNATURE-----
 
 iQIbBAABAgAGBQJbf4GtAAoJEAx081l5xIa+WzsP8gNAadlpbPP3zmMoVwCKPeOR
 YZBeJxc/C9EXR8PfsMzo5kuUFm6rf7F9hYCaqmJj+exKCEGSwRAsHxBk2Ll9PbHm
 7iFf7HjnGlU6swRtHOniLV2c1XxyUJGafkHBET6/SarqHfTW3QvaGTAxAKmxpNSr
 qY7m1SyU1JKV2sw9N4xODLMRvoG/Y0OaiAgzOkUTbrSNomCfWVTGV/Pat7pDh/3z
 jbeeTGaMLYJISKWBMfTeN4lxex+b2txGAMBs5OQvwetD4bvsstbp1tdkB7KNmISm
 RrOJOTlD2ZhZ/Yo2i3JETa3v6nhAw6+zeOq7o/eiBtnkKSYb57hL42toKx/k4YV4
 pdTnkImrYtVOzFj39MvByISSD0XaM7nuP769DkGsRjlPa1O5LnpDcawsNuc+brqY
 wQQ/CBhh614YfTcQKRoKkeZSr5CjuQILwX5VVQFiL7DJaHXpfqK7IXUQ99zF1SF+
 V9Byb6BZpfDYEsN9biqxGTqIYEs8FWi9NP7uK8vQeMaT+9P5/ntVQVIfl5rhp3W8
 RtZc1F5NBVc3XTJOcgpcfyM5gz7L31NDHbWRxgIsCMWDFfmTxxSirJSc8EE+2fqn
 EtNzuzcVKBmTvUrcQsuclC4iymDAOnCjSwdSVTg19loi84y9UKx7BIg1pqYku3hA
 Yihn27r0Ftyd4cXdgu4=
 =kMpO
 -----END PGP SIGNATURE-----

Merge tag 'drm-next-2018-08-24' of git://anongit.freedesktop.org/drm/drm

Pull drm fixes from Dave Airlie:
 "Just a couple of fixes"

  One MAINTAINERS address change, two panels fixes, and set of amdgpu
  fixes (build fixes, display fixes and some others)"

* tag 'drm-next-2018-08-24' of git://anongit.freedesktop.org/drm/drm:
  drm/edid: Add 6 bpc quirk for SDC panel in Lenovo B50-80
  drm/amd/display: Don't build DCN1 when kcov is enabled
  Revert "drm/amdgpu/display: Replace CONFIG_DRM_AMD_DC_DCN1_0 with CONFIG_X86"
  drm/amdgpu/display: disable eDP fast boot optimization on DCE8
  drm/amdgpu: fix amdgpu_amdkfd_remove_eviction_fence v3
  drm/amdgpu: fix incorrect use of drm_file->pid
  drm/amdgpu: fix incorrect use of fcheck
  drm/powerplay: enable dpm under pass-through
  drm/amdgpu: access register without KIQ
  drm/amdgpu: set correct base for THM/NBIF/MP1 IP
  drm/amd/display: fix dentist did ranges
  drm/amd/display: make dp_ss_off optional
  drm/amd/display: fix dp_ss_control vbios flag parsing
  drm/amd/display: Do not retain link settings
  MAINTAINERS: drm-misc: Change seanpaul's email address
  drm/panel: simple: tv123wam: Add unprepare delay
2018-08-24 09:22:54 -07:00
Rex Zhu
a296b16270 drm/amd/display: Fix bug use wrong pp interface
Used wrong pp interface, the original interface is
exposed by dpm on SI and paritial CI.

Pointed out by Francis David <david.francis@amd.com>

v2: dal only need to set min_dcefclk and min_fclk to smu.
    so use display_clock_voltage_request interface,
    instand of update all display configuration.

Acked-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Rex Zhu <Rex.Zhu@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2018-08-24 11:16:58 -05:00
Andrey Grodzovsky
eb7e5cfced drm/amdgpu: Fix page fault and kasan warning on pci device remove.
Problem:
When executing echo 1 > /sys/class/drm/card0/device/remove kasan warning
as bellow and page fault happen because adev->gart.pages already freed by the
time amdgpu_gart_unbind is called.

BUG: KASAN: user-memory-access in amdgpu_gart_unbind+0x98/0x180 [amdgpu]
Write of size 8 at addr 0000000000003648 by task bash/1828
CPU: 2 PID: 1828 Comm: bash Tainted: G        W  O      4.18.0-rc1-dev+ #29
Hardware name: Gigabyte Technology Co., Ltd. AX370-Gaming/AX370-Gaming-CF, BIOS F3 06/19/2017
Call Trace:
dump_stack+0x71/0xab
kasan_report+0x109/0x390
amdgpu_gart_unbind+0x98/0x180 [amdgpu]
ttm_tt_unbind+0x43/0x60 [ttm]
ttm_bo_move_ttm+0x83/0x1c0 [ttm]
ttm_bo_handle_move_mem+0xb97/0xd00 [ttm]
ttm_bo_evict+0x273/0x530 [ttm]
ttm_mem_evict_first+0x29c/0x360 [ttm]
ttm_bo_force_list_clean+0xfc/0x210 [ttm]
ttm_bo_clean_mm+0xe7/0x160 [ttm]
amdgpu_ttm_fini+0xda/0x1d0 [amdgpu]
amdgpu_bo_fini+0xf/0x60 [amdgpu]
gmc_v8_0_sw_fini+0x36/0x70 [amdgpu]
amdgpu_device_fini+0x2d0/0x7d0 [amdgpu]
amdgpu_driver_unload_kms+0x6a/0xd0 [amdgpu]
drm_dev_unregister+0x79/0x180 [drm]
amdgpu_pci_remove+0x2a/0x60 [amdgpu]
pci_device_remove+0x5b/0x100
device_release_driver_internal+0x236/0x360
pci_stop_bus_device+0xbf/0xf0
pci_stop_and_remove_bus_device_locked+0x16/0x30
remove_store+0xda/0xf0
kernfs_fop_write+0x186/0x220
__vfs_write+0xcc/0x330
vfs_write+0xe6/0x250
ksys_write+0xb1/0x140
do_syscall_64+0x77/0x1e0
entry_SYSCALL_64_after_hwframe+0x44/0xa9
RIP: 0033:0x7f66ebbb32c0

Fix:
Split gmc_v{6,7,8,9}_0_gart_fini to postpone amdgpu_gart_fini to after
memory managers are shut down since gart unbind happens
as part of this procedure

Signed-off-by: Andrey Grodzovsky <andrey.grodzovsky@amd.com>
Reviewed-by: Junwei Zhang <Jerry.Zhang@amd.com>
Acked-by: Huang Rui <ray.huang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2018-08-22 16:36:49 -05:00
Emily Deng
2f40c6eac7 amdgpu: fix multi-process hang issue
SWDEV-146499: hang during multi vulkan process testing

cause:
the second frame's PREAMBLE_IB have clear-state
and LOAD actions, those actions ruin the pipeline
that is still doing process in the previous frame's
work-load IB.

fix:
need insert pipeline sync if have context switch for
SRIOV (because only SRIOV will report PREEMPTION flag
to UMD)

Signed-off-by: Monk Liu <Monk.Liu@amd.com>
Signed-off-by: Emily Deng <Emily.Deng@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2018-08-22 16:05:20 -05:00
Christian König
d98ff24e8e drm/amdgpu: fix preamble handling
At this point the command submission can still be interrupted.

Signed-off-by: Christian König <christian.koenig@amd.com>
Acked-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2018-08-22 16:05:00 -05:00
Christian König
8604ffcbf0 drm/amdgpu: fix VM clearing for the root PD
We need to figure out the address after validating the BO, not before.

Signed-off-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com>
Reviewed-by: Junwei Zhang <Jerry.Zhang@amd.com>
Reviewed-by: Huang Rui <ray.huang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2018-08-22 16:04:14 -05:00
Michal Hocko
93065ac753 mm, oom: distinguish blockable mode for mmu notifiers
There are several blockable mmu notifiers which might sleep in
mmu_notifier_invalidate_range_start and that is a problem for the
oom_reaper because it needs to guarantee a forward progress so it cannot
depend on any sleepable locks.

Currently we simply back off and mark an oom victim with blockable mmu
notifiers as done after a short sleep.  That can result in selecting a new
oom victim prematurely because the previous one still hasn't torn its
memory down yet.

We can do much better though.  Even if mmu notifiers use sleepable locks
there is no reason to automatically assume those locks are held.  Moreover
majority of notifiers only care about a portion of the address space and
there is absolutely zero reason to fail when we are unmapping an unrelated
range.  Many notifiers do really block and wait for HW which is harder to
handle and we have to bail out though.

This patch handles the low hanging fruit.
__mmu_notifier_invalidate_range_start gets a blockable flag and callbacks
are not allowed to sleep if the flag is set to false.  This is achieved by
using trylock instead of the sleepable lock for most callbacks and
continue as long as we do not block down the call chain.

I think we can improve that even further because there is a common pattern
to do a range lookup first and then do something about that.  The first
part can be done without a sleeping lock in most cases AFAICS.

The oom_reaper end then simply retries if there is at least one notifier
which couldn't make any progress in !blockable mode.  A retry loop is
already implemented to wait for the mmap_sem and this is basically the
same thing.

The simplest way for driver developers to test this code path is to wrap
userspace code which uses these notifiers into a memcg and set the hard
limit to hit the oom.  This can be done e.g.  after the test faults in all
the mmu notifier managed memory and set the hard limit to something really
small.  Then we are looking for a proper process tear down.

[akpm@linux-foundation.org: coding style fixes]
[akpm@linux-foundation.org: minor code simplification]
Link: http://lkml.kernel.org/r/20180716115058.5559-1-mhocko@kernel.org
Signed-off-by: Michal Hocko <mhocko@suse.com>
Acked-by: Christian König <christian.koenig@amd.com> # AMD notifiers
Acked-by: Leon Romanovsky <leonro@mellanox.com> # mlx and umem_odp
Reported-by: David Rientjes <rientjes@google.com>
Cc: "David (ChunMing) Zhou" <David1.Zhou@amd.com>
Cc: Paolo Bonzini <pbonzini@redhat.com>
Cc: Alex Deucher <alexander.deucher@amd.com>
Cc: David Airlie <airlied@linux.ie>
Cc: Jani Nikula <jani.nikula@linux.intel.com>
Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Cc: Rodrigo Vivi <rodrigo.vivi@intel.com>
Cc: Doug Ledford <dledford@redhat.com>
Cc: Jason Gunthorpe <jgg@ziepe.ca>
Cc: Mike Marciniszyn <mike.marciniszyn@intel.com>
Cc: Dennis Dalessandro <dennis.dalessandro@intel.com>
Cc: Sudeep Dutt <sudeep.dutt@intel.com>
Cc: Ashutosh Dixit <ashutosh.dixit@intel.com>
Cc: Dimitri Sivanich <sivanich@sgi.com>
Cc: Boris Ostrovsky <boris.ostrovsky@oracle.com>
Cc: Juergen Gross <jgross@suse.com>
Cc: "Jérôme Glisse" <jglisse@redhat.com>
Cc: Andrea Arcangeli <aarcange@redhat.com>
Cc: Felix Kuehling <felix.kuehling@amd.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2018-08-22 10:52:44 -07:00
Leo (Sunpeng) Li
9d1d02ff36 drm/amd/display: Don't build DCN1 when kcov is enabled
DCN1 contains code that utilizes fp math. When
CONFIG_KCOV_INSTRUMENT_ALL and CONFIG_KCOV_ENABLE_COMPARISONS are
enabled, build errors are found. See this earlier patch for details:

https://lists.freedesktop.org/archives/dri-devel/2018-August/186131.html

As a short term solution, disable CONFIG_DRM_AMD_DC_DCN1_0 when
KCOV_INSTRUMENT_ALL and KCOV_ENABLE_COMPARISONS are enabled. In
addition, make it a fully derived config, taking into account
CONFIG_X86.

Acked-by: Alex Deucher <alexander.deucher@amd.com>
Acked-by: Arnd Bergmann <arnd@arndb.de>
Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>
Signed-off-by: Leo (Sunpeng) Li <sunpeng.li@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2018-08-21 14:33:59 -05:00
Leo (Sunpeng) Li
dc37a9a08d Revert "drm/amdgpu/display: Replace CONFIG_DRM_AMD_DC_DCN1_0 with CONFIG_X86"
This reverts commit 8624c3c4dbfe24fc6740687236a2e196f5f4bfb0.

We need CONFIG_DRM_AMD_DC_DCN1_0 to guard code that is using fp math.

Acked-by: Alex Deucher <alexander.deucher@amd.com>
Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>
Signed-off-by: Leo (Sunpeng) Li <sunpeng.li@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2018-08-21 14:32:28 -05:00
Alex Deucher
95f05a3a2e drm/amdgpu/display: disable eDP fast boot optimization on DCE8
Seems to cause blank screens.

Bug: https://bugs.freedesktop.org/show_bug.cgi?id=106940
Reviewed-by: Harry Wentland <harry.wentland@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2018-08-21 14:23:17 -05:00
Christian König
e6f8d26ebb drm/amdgpu: fix amdgpu_amdkfd_remove_eviction_fence v3
Fix quite a number of bugs here. Unfortunately only compile tested.

v2: fix copy&paste error
v3: fix 80 chars issue in comment

Signed-off-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2018-08-21 14:19:26 -05:00
Christian König
c4aed87630 drm/amdgpu: fix incorrect use of drm_file->pid
That's the PID of the creator of the file (usually the X server) and not
the end user of the file.

Signed-off-by: Christian König <christian.koenig@amd.com>
Acked-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
CC: stable@vger.kernel.org
2018-08-21 14:19:18 -05:00
Christian König
bce31d4c1a drm/amdgpu: fix incorrect use of fcheck
The usage isn't RCU protected.

Signed-off-by: Christian König <christian.koenig@amd.com>
Acked-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
CC: stable@vger.kernel.org
2018-08-21 14:19:10 -05:00
Yintian Tao
11a88c2e92 drm/powerplay: enable dpm under pass-through
Repeat enable dpm under pass-through because there is no actually
hardware-fini and real power-off when guest vm shutdown or reboot.
Otherwise, under pass-through it will be failed to populate populate
and upload SCLK MCLK DPM levels due to zero of pcie_speed_table.count.

Signed-off-by: Yintian Tao <yttao@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2018-08-21 14:18:21 -05:00
Yintian Tao
fa1d04e9a8 drm/amdgpu: access register without KIQ
there is no need to access register such as mmSMC_IND_INDEX_11
and mmSMC_IND_DATA_11 through KIQ because they are VF-copy.

Signed-off-by: Yintian Tao <yttao@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2018-08-21 14:17:36 -05:00
Evan Quan
bde0781561 drm/amdgpu: set correct base for THM/NBIF/MP1 IP
Set correct address base for vega20.

Signed-off-by: Evan Quan <evan.quan@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Reviewed-by: Huang Rui <ray.huang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2018-08-21 14:16:50 -05:00
Dmytro Laktyushkin
39a3cd6783 drm/amd/display: fix dentist did ranges
Dentist did ranges were incomplete as max setting has an unusual
divider step up of 66.

Signed-off-by: Dmytro Laktyushkin <Dmytro.Laktyushkin@amd.com>
Reviewed-by: Charlene Liu <Charlene.Liu@amd.com>
Acked-by: Bhawanpreet Lakha <Bhawanpreet.Lakha@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2018-08-21 14:16:43 -05:00
Dmytro Laktyushkin
66b198ffc9 drm/amd/display: make dp_ss_off optional
dp_ss_off flag doesn't need to be set, so we create a link_init
function if it is needed by an asic

Signed-off-by: Dmytro Laktyushkin <Dmytro.Laktyushkin@amd.com>
Reviewed-by: Eric Bernstein <Eric.Bernstein@amd.com>
Acked-by: Bhawanpreet Lakha <Bhawanpreet.Lakha@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2018-08-21 14:16:35 -05:00
Dmytro Laktyushkin
16747b2109 drm/amd/display: fix dp_ss_control vbios flag parsing
dp_ss_control = 0 means ss is off, we had a typo where
we would double not dp_ss_control while setting dp_ss_off
flag

Signed-off-by: Dmytro Laktyushkin <Dmytro.Laktyushkin@amd.com>
Reviewed-by: Eric Bernstein <Eric.Bernstein@amd.com>
Acked-by: Bhawanpreet Lakha <Bhawanpreet.Lakha@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2018-08-21 14:16:28 -05:00
Samson Tam
8f7040b8f2 drm/amd/display: Do not retain link settings
Do not retrain link settings if lane count and link rate are both
unknown.  Causes driver to be stuck reading VBIOS register after
removing emulated connection.

Signed-off-by: Samson Tam <Samson.Tam@amd.com>
Reviewed-by: Anthony Koo <Anthony.Koo@amd.com>
Acked-by: Bhawanpreet Lakha <Bhawanpreet.Lakha@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2018-08-21 14:16:20 -05:00
Michel Dänzer
c9533d1bca drm/amdgpu: Use kvmalloc for allocating UVD/VCE/VCN BO backup memory
The allocated size can be (at least?) as large as megabytes, and
there's no need for it to be physically contiguous.

May avoid spurious failures to initialize / suspend the corresponding
block while there's memory pressure.

Bugzilla: https://bugs.freedesktop.org/107432
Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Michel Dänzer <michel.daenzer@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2018-08-16 12:59:11 -05:00
Nicholas Kazlauskas
dddc0557e3 drm/amd/display: Guard against null crtc in CRC IRQ
[Why]

A null pointer deference can occur if crtc is null in
amdgpu_dm_crtc_handle_crc_irq. This can happen if get_crtc_by_otg_inst
returns NULL during dm_crtc_high_irq, leading to a hang in some IGT
test cases.

[How]

Check that CRTC is non-null before accessing its fields.

Signed-off-by: Nicholas Kazlauskas <nicholas.kazlauskas@amd.com>
Reviewed-by: Sun peng Li <Sunpeng.Li@amd.com>
Acked-by: Leo Li <sunpeng.li@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2018-08-13 17:29:44 -05:00
Mikita Lipski
433149130c drm/amd/display: Pass connector id when executing VBIOS CT
[why]
Older ASICs require both phys_id and connector_id
to execute bios command table. If we are not passing the
right connector_id - it can lead to a black screen.

[how]
Set connector_obj_id when executing vbios command table

Signed-off-by: Mikita Lipski <mikita.lipski@amd.com>
Reviewed-by: Hersen Wu <hersenxs.wu@amd.com>
Acked-by: Leo Li <sunpeng.li@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Cc: stable@vger.kernel.org
2018-08-13 17:29:37 -05:00
Mikita Lipski
ad8960a6cb drm/amd/display: Check if clock source in use before disabling
[why]
We are disabling clock source while other pipes are still using
it, because we don't verify the number of pipes that share it.

[how]
- Adding a function in resources to return the number of pipes
sharing the clock source.
- Checking that no one is sharing the clock source before disabling

Signed-off-by: Mikita Lipski <mikita.lipski@amd.com>
Reviewed-by: Harry Wentland <Harry.Wentland@amd.com>
Acked-by: Leo Li <sunpeng.li@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Cc: stable@vger.kernel.org
2018-08-13 17:29:11 -05:00
Mikita Lipski
fc69009e35 drm/amd/display: Allow clock sharing b/w HDMI and DVI
[why]
HDMI and DVI share the same PHY clock and single link
DVI and HDMI both use 4 lanes, so they should be allowed
to be sharing the same clock source if all other parameters
are satisfied.

[how]
Change a check for general DVI to Dual DVI.

Signed-off-by: Mikita Lipski <mikita.lipski@amd.com>
Reviewed-by: Tony Cheng <Tony.Cheng@amd.com>
Reviewed-by: Harry Wentland <Harry.Wentland@amd.com>
Acked-by: Leo Li <sunpeng.li@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2018-08-13 17:24:21 -05:00
Jerry (Fangzhi) Zuo
7cb5285507 drm/amd/display: Fix warning observed in mode change on Vega
[Why]
DOUBLE_BUFFER_EN bit is getting cleared before enable blanking.
That leads to CRTC_BLANK_DATA_EN is getting updated immediately.

[How]
Get DOUBLE_BUFFER_EN bit set, the same as DCE110.

Signed-off-by: Jerry (Fangzhi) Zuo <Jerry.Zuo@amd.com>
Reviewed-by: Charlene Liu <Charlene.Liu@amd.com>
Acked-by: Leo Li <sunpeng.li@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2018-08-13 17:22:08 -05:00
Charlene Liu
321f65a623 drm/amd/display: fix single link DVI has no display
Signed-off-by: Charlene Liu <charlene.liu@amd.com>
Reviewed-by: Dmytro Laktyushkin <Dmytro.Laktyushkin@amd.com>
Acked-by: Leo Li <sunpeng.li@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2018-08-13 17:21:49 -05:00