Doing it earlier hits a WARN_ON_ONCE in amdgpu_bo_gpu_offset.
Fixes: "drm/amdgpu: remove gart.table_addr"
Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Michel Dänzer <michel.daenzer@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Fix the VMC page fault that occurs with the following sequence:
1. amdgpu_gem_create_ioctl
2. ttm_bo_swapout->amdgpu_vm_bo_invalidate: since amdgpu_vm_bo_base_init
has not been called yet, list_add_tail(&base->bo_list, &bo->va) was never
called either. So even though the bo was evicted, bo_base->moved is not
set.
3. drm_gem_open_ioctl->amdgpu_vm_bo_base_init: this only calls
list_move_tail(&base->vm_status, &vm->evicted), but does not set
bo_base->moved.
4. amdgpu_vm_bo_map->amdgpu_vm_bo_insert_map: because bo_base->moved is
not set to true, amdgpu_vm_bo_insert_map calls
list_move(&bo_va->base.vm_status, &vm->moved).
5. amdgpu_cs_ioctl does not validate the swapped-out bo, as it is only on
the moved list, not on the evicted list, so a VMC page fault occurs.
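A minimal sketch of the kind of fix this sequence implies; the predicate name is hypothetical and this is not necessarily the exact upstream hunk:
  /* In amdgpu_vm_bo_base_init(): a BO that is already evicted must also be
   * flagged as moved, so amdgpu_vm_bo_insert_map() keeps it on the evicted
   * list and the next CS validates it before use. */
  if (bo_is_evicted(bo)) {                /* hypothetical predicate */
          base->moved = true;             /* was never set in steps 2 and 3 */
          list_move_tail(&base->vm_status, &vm->evicted);
  }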
Signed-off-by: Emily Deng <Emily.Deng@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
It is safer to make full-access include both phase 1 and phase 2.
Then accessing special registers in either phase 1 or phase 2 will not
block any shutdown or suspend process under virtualization.
Signed-off-by: Yintian Tao <yttao@amd.com>
Acked-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Looks like a copy&paste error to me.
Signed-off-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Reviewed-by: Junwei Zhang <Jerry.Zhang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
After merging KFD into amdgpu, move module parameters defined in KFD to
amdgpu_drv.c, where other module parameters are declared.
v2: add kernel-doc comments
v3: rebase and fix parameter variable name (Alex)
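For illustration, a sketch of the declaration style the parameters are moved to; the parameter name, default, and description here are assumptions, not the exact hunk:
  /**
   * DOC: sched_policy (int)
   * Illustrative only: one KFD parameter declared in amdgpu_drv.c with a
   * kernel-doc comment.
   */
  int sched_policy = 0;
  module_param(sched_policy, int, 0444);
  MODULE_PARM_DESC(sched_policy, "KFD queue scheduling policy");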
Signed-off-by: Amber Lin <Amber.Lin@amd.com>
Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
After amdkfd is merged into amdgpu, CONFIG_HSA_AMD_MODULE no longer exists.
Signed-off-by: Amber Lin <Amber.Lin@amd.com>
Acked-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Since KFD is only supported by a single GPU driver, it makes sense to merge
amdgpu and amdkfd into one module. This patch is the initial step: merge
Kconfig and Makefile.
v2: also remove kfd from drm Kconfig
Signed-off-by: Amber Lin <Amber.Lin@amd.com>
Acked-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
The fault reports the page number where the fault happened and not
the exact faulting address. Update the print message to reflect that.
Signed-off-by: Andrey Grodzovsky <andrey.grodzovsky@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
[Why]
The dce_i2c_hw code contained four functions that were only
called in one place and did not have a clearly delineated
purpose.
[How]
Inline these functions, keeping the same functionality.
This is not a functional change.
The functions disable_i2c_hw_engine and release_engine_dce_hw were
pulled into their respective callers.
The most interesting part of this change is the acquire functions.
dce_i2c_hw_engine_acquire_engine was pulled into
dce_i2c_engine_acquire_hw, and dce_i2c_engine_acquire_hw was pulled
into acquire_i2c_hw_engine.
Some notes to show that this change is not functional:
-Failure conditions in any function resulted in a cascade of calls that
ended in a 'return NULL'.
Those are replaced with a direct 'return NULL'.
-The variable result is the one from dce_i2c_hw_engine_acquire_engine.
The boolean result used as part of return logic was removed.
-As the second half of dce_i2c_hw_engine_acquire_engine only executes
when that function returns true and therefore exits the do-while
loop in dce_i2c_engine_acquire_hw, those lines were moved outside
of the loop.
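A rough sketch of the resulting shape of the combined acquire path; the helper names, retry count, and other details are illustrative, not the exact code:
  static struct dce_i2c_hw *acquire_i2c_hw_engine(struct resource_context *res_ctx,
                                                  struct dce_i2c_hw *dce_i2c_hw)
  {
          bool result;
          int retries = 50;

          do {
                  result = try_acquire_engine(dce_i2c_hw);   /* hypothetical helper */
                  if (result)
                          break;
                  msleep(1);
          } while (--retries);

          if (!result)
                  return NULL;  /* former failure cascades collapse to direct returns */

          /* the second half of dce_i2c_hw_engine_acquire_engine only ran when
           * the loop exited successfully, so it now sits after the loop */
          setup_acquired_engine(dce_i2c_hw);                 /* hypothetical helper */
          return dce_i2c_hw;
  }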
Signed-off-by: David Francis <David.Francis@amd.com>
Acked-by: Leo Li <sunpeng.li@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
[Why]
On PCO and up, whenever the SMU receives a message indicating that the
active display count = 0, the SMU will turn off the 48MHz TMDP reference
clock by writing 1 to TMDP_48M_Refclk_Driver_PWDN. Once this clock is
off, no PHY register will respond to register access. This means our
current sequence of notifying the display count along with requesting
clocks will cause the driver to hang when accessing PHY registers after
the display count goes to 0.
[How]
Separate the PPSMC_MSG_SetDisplayCount message from the SMU messages
that request clocks, and have display own the sequencing of this message
so that we can send it at the appropriate time.
Do not redundantly power off the HW when entering S3 or S4, since
display should already have been called to disable all streams, and the
ASIC will soon be powered down.
Signed-off-by: Eric Yang <Eric.Yang2@amd.com>
Reviewed-by: Tony Cheng <Tony.Cheng@amd.com>
Acked-by: Leo Li <sunpeng.li@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
[Why]
The i2c code contains two structs that contain the same
information as i2c_payload
[How]
Replace references to those structs with references to
i2c_payload
dce_i2c_transaction_request->status was written to but never read,
so all references to it are removed
Signed-off-by: David Francis <David.Francis@amd.com>
Reviewed-by: Jordan Lazare <Jordan.Lazare@amd.com>
Acked-by: Leo Li <sunpeng.li@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
[Why]
Logging hardware state can be done by triggering a write to the
debugfs file. It would also be useful to be able to read the hardware
state from the debugfs file, to generate a clean log without
timestamps.
[How]
Usage: cat /sys/kernel/debug/dri/0/amdgpu_dm_dtn_log
Threading is an obvious concern when dealing with multiple debugfs
operations and blocking on global state in dm or dc seems unfavorable.
Adding an extra parameter for the debugfs log context state is the
implementation done here. Existing code that made use of DTN_INFO
and its associated macros needed to be refactored to support this.
We don't know the size of the log in advance, so the log string is
reallocated dynamically. Once the log has been generated, it's copied
into the user-supplied buffer for the debugfs read. This allows for
seeking support, but it's worth noting that, unlike triggering output
via dmesg, the hardware state might change in-between reads if your
buffer size is too small.
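A minimal sketch of such a read handler, assuming a hypothetical log-context struct and log-generation helper; the real driver code differs in details:
  struct dtn_log_ctx {
          char *buf;      /* dynamically grown log string */
          size_t size;    /* allocated size */
          size_t pos;     /* bytes of log written so far */
  };

  static ssize_t dtn_log_read(struct file *f, char __user *ubuf,
                              size_t size, loff_t *ppos)
  {
          struct amdgpu_device *adev = file_inode(f)->i_private;
          struct dtn_log_ctx log_ctx = { 0 };
          ssize_t ret;

          /* regenerate the log on each read; with a small user buffer the
           * hardware state may change between successive reads */
          dm_log_hw_state(adev, &log_ctx);        /* hypothetical helper */

          ret = simple_read_from_buffer(ubuf, size, ppos, log_ctx.buf, log_ctx.pos);
          kfree(log_ctx.buf);
          return ret;
  }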
Signed-off-by: Nicholas Kazlauskas <nicholas.kazlauskas@amd.com>
Reviewed-by: Jordan Lazare <Jordan.Lazare@amd.com>
Acked-by: Leo Li <sunpeng.li@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
[Why]
Good spelling and grammar make comments
more pleasant and clearer.
Linux has coding standards for comments
that we should try to follow.
[How]
Fix obvious spelling and grammar issues.
Ensure all comments use '/*' and '*/' and that multi-line comments
follow the Linux convention.
Remove line-of-stars comments that do not separate sections
of code, and comments referring to lines of code that have
since been removed.
Signed-off-by: David Francis <David.Francis@amd.com>
Reviewed-by: Nicholas Kazlauskas <Nicholas.Kazlauskas@amd.com>
Acked-by: Leo Li <sunpeng.li@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
[Why]
There is currently an intermittent hang from a memory leak in
DTN stress testing. It is caused by unfreed memory during driver
disable.
[How]
Do a dc_sink_release in the case that skips it incorrectly.
Signed-off-by: SivapiriyanKumarasamy <sivapiriyan.kumarasamy@amd.com>
Reviewed-by: Aric Cyr <Aric.Cyr@amd.com>
Acked-by: Leo Li <sunpeng.li@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
[Why]
Now that we "scale" time delays correctly on Maximus (as of diags svn
r170115), the forced "35 ms" wait time becomes 35 ms * 500 = 17.5
seconds, which is far too long. Even having to repeat polling a
register once causes excessive delays on Maximus.
[How]
Just use the regular wait time passed to the generic_reg_wait()
function. This is sufficient for Maximus now, and it also means that
there's one less "Maximus-only" code path in DAL.
Also disable the "REG_WAIT taking a while:" message on Maximus, since
things do take a while longer there and 1-2ms delays are not uncommon
(and nothing to worry about).
Signed-off-by: Ken Chalmers <ken.chalmers@amd.com>
Reviewed-by: Eric Bernstein <Eric.Bernstein@amd.com>
Acked-by: Leo Li <sunpeng.li@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
[Why]
We have logging methods for printing hardware state for newer ASICs
but no way to trigger the log output.
[How]
Add support for triggering the output via writing to a debugfs file
entry. Log output currently goes into dmesg for convenience, but
accessing via a read should be possible later.
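A minimal sketch of the write-as-trigger idea; the dump helper name is hypothetical:
  static ssize_t dtn_log_write(struct file *f, const char __user *ubuf,
                               size_t size, loff_t *ppos)
  {
          struct amdgpu_device *adev = file_inode(f)->i_private;

          /* the written bytes are ignored; any write dumps the state to dmesg */
          if (size > 0)
                  dm_dump_hw_state_to_dmesg(adev);        /* hypothetical helper */

          return size;
  }
Triggering would then be, e.g., echo 1 > /sys/kernel/debug/dri/0/amdgpu_dm_dtn_log.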
Signed-off-by: Nicholas Kazlauskas <nicholas.kazlauskas@amd.com>
Reviewed-by: Jordan Lazare <Jordan.Lazare@amd.com>
Acked-by: Leo Li <sunpeng.li@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
[Why]
drm_handle_vblank is deprecated. Use drm_crtc_handle_vblank instead.
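For illustration, the substitution looks roughly like this; the variable names are placeholders from typical amdgpu code, not the exact hunk:
  /* before (deprecated): */
  drm_handle_vblank(adev->ddev, amdgpu_crtc->crtc_id);
  /* after: */
  drm_crtc_handle_vblank(&amdgpu_crtc->base);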
Signed-off-by: Leo (Sunpeng) Li <sunpeng.li@amd.com>
Reviewed-by: David Francis <David.Francis@amd.com>
Acked-by: Leo Li <sunpeng.li@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Tony Cheng <tony.cheng@amd.com>
Reviewed-by: Steven Chiu <Steven.Chiu@amd.com>
Acked-by: Leo Li <sunpeng.li@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
[Why]
The function pointers of the dce_i2c_hw struct were never
accessed from outside dce_i2c_hw.c and had only one version.
As function pointers take up space and make debugging difficult,
and they are not needed in this case, they should be removed.
[How]
Remove the dce_i2c_hw_funcs struct and make static all
functions that were previously a part of it. Reorder
the functions in dce_i2c_hw.c.
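Sketched in terms of a hypothetical helper, the call style changes roughly as follows:
  /* before: reached through the per-object dce_i2c_hw_funcs table */
  dce_i2c_hw->funcs->release_engine(dce_i2c_hw);

  /* after: the table is gone; the helper is a file-local static in dce_i2c_hw.c */
  release_engine(dce_i2c_hw);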
Signed-off-by: David Francis <David.Francis@amd.com>
Reviewed-by: Sun peng Li <Sunpeng.Li@amd.com>
Acked-by: Leo Li <sunpeng.li@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Should work on Vega10 as well, but with an obvious performance hit.
Older APUs can be enabled as well, but will probably be more work.
v2: fix error checking
v3: use more general check
Signed-off-by: Christian König <christian.koenig@amd.com>
Acked-by: Andrey Grodzovsky <andrey.grodzovsky@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Helper to get the PDE for a PD/PT.
v2: improve documentation
Signed-off-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Junwei Zhang <Jerry.Zhang@amd.com>
Reviewed-by: Huang Rui <ray.huang@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Add the necessary handling.
Signed-off-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Junwei Zhang <Jerry.Zhang@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Reviewed-by: Huang Rui <ray.huang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Add a helper function to figure them out only once.
v2: fix typo with memset
v3: rebase on kfd changes (Alex)
Signed-off-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Junwei Zhang <Jerry.Zhang@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Reviewed-by: Huang Rui <ray.huang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Just another leftover from radeon.
Signed-off-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Reviewed-by: Junwei Zhang <Jerry.Zhang@amd.com>
Acked-by: Huang Rui <ray.huang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
We can't hold the mn_lock while allocating memory.
Signed-off-by: Christian König <christian.koenig@amd.com>
Acked-by: Chunming Zhou <david1.zhou@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
No waiting for a fence is done here anymore.
Signed-off-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Chunming Zhou <david1.zhou@amd.com>
Reviewed-by: Junwei Zhang <Jerry.Zhang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
1. Initialize kiq before initializing the gfx ring.
2. Set the kiq ring ready immediately after kiq initializes
successfully.
3. Split the function gfx_v9_0_kiq_resume into two functions:
gfx_v9_0_kiq_resume is for kiq initialization,
gfx_v9_0_kcq_resume is for kcq initialization.
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Rex Zhu <Rex.Zhu@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
1. Initialize kiq before initializing the gfx ring.
2. Set the kiq ring ready immediately after kiq initializes
successfully.
3. Split the function gfx_v8_0_kiq_resume into two functions:
gfx_v8_0_kiq_resume is for kiq initialization,
gfx_v8_0_kcq_resume is for kcq initialization.
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Rex Zhu <Rex.Zhu@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Send all kcq unmap_queue packets and then wait for
completion.
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Acked-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Rex Zhu <Rex.Zhu@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
There are no logical changes here.
1. If kcq can be enabled via kiq, we don't need to
do a kiq ring test.
2. The amdgpu_ring_test_ring function can be used to
wait for ring completion, so remove the duplicated code.
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Acked-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Rex Zhu <Rex.Zhu@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Send all kcq unmap_queue packets and then wait for
completion.
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Acked-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Rex Zhu <Rex.Zhu@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
There are no logical changes here.
1. If kcq can be enabled via kiq, we don't need to
do a kiq ring test.
2. The amdgpu_ring_test_ring function can be used to
wait for ring completion, so remove the duplicated code.
v2: alloc 6 (not 7) dws for unmap_queues
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Acked-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Rex Zhu <Rex.Zhu@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
As we have unified powergate_uvd/vce/mmhub into set_powergating_by_smu,
and set_powergating_by_smu is supported by both dpm and powerplay,
remove the else case.
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Rex Zhu <Rex.Zhu@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
On hw_fini/suspend, the smu only needs to power on the uvd block
if uvd pg is supported; there is no need to call uvd to do hw_init.
v2: fix typo in patch descriptions and comments.
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Tested-by: Michel Dänzer <michel.daenzer@amd.com>
Signed-off-by: Rex Zhu <Rex.Zhu@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
For SI/KV, the power state is managed by the function
amdgpu_pm_compute_clocks.
When dpm is enabled, we should call amdgpu_pm_compute_clocks
to update the current power state instead of setting the boot state.
This change fixes the oops when the kfd driver is enabled on KV.
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Tested-by: Michel Dänzer <michel.daenzer@amd.com>
Signed-off-by: Rex Zhu <Rex.Zhu@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Forgot to add vce pg support via smu for Kaveri/Mullins.
Fixes: 561a5c83eadd ("drm/amd/pp: Unify powergate_uvd/vce/mmhub to set_powergating_by_smu")
v2: refine patch descriptions suggested by Michel
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Tested-by: Michel Dänzer <michel.daenzer@amd.com>
Signed-off-by: Rex Zhu <Rex.Zhu@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
On an AC/DC switch, the driver will be notified by an ACPI event and
the power source will be updated, so there is no need to get the
power source when setting the power state.
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Rex Zhu <Rex.Zhu@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
This is required by the gfx hw and fixes the rlc hang when
running the S3 stress test on CZ/ST.
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Hang Zhou <hang.zhou@amd.com>
Signed-off-by: Rex Zhu <Rex.Zhu@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Set the VM size based on system memory size between the ASIC-specific
limits given by min_vm_size and max_bits. GFXv9 GPUs will keep their
default VM size of 256TB (48 bit). Only older GPUs will adjust VM size
depending on system memory size.
This makes more VM space available for ROCm applications on GFXv8 GPUs
that want to map all available VRAM and system memory in their SVM
address space.
v2:
* Clarify comment
* Round up memory size before >> 30
* Round up automatic vm_size to power of two
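A sketch of the sizing rule described above; the scaling factor and the variable names are illustrative, not the exact driver code:
  /* system memory in GiB, rounded up before the shift */
  uint64_t phys_ram_gb = (totalram_bytes + (1ULL << 30) - 1) >> 30;

  /* derive an automatic VM size, rounded up to a power of two, then clamp it
   * to the ASIC limits given by min_vm_size and max_bits */
  uint64_t vm_size = roundup_pow_of_two(phys_ram_gb * 3);  /* factor is illustrative */
  vm_size = max_t(uint64_t, vm_size, min_vm_size);
  vm_size = min_t(uint64_t, vm_size, 1ULL << (max_bits - 30));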
Signed-off-by: Felix Kuehling <Felix.Kuehling@amd.com>
Acked-by: Junwei Zhang <Jerry.Zhang@amd.com>
Reviewed-by: Huang Rui <ray.huang@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Try to kill waves on the SQ.
Signed-off-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Huang Rui <ray.huang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Try to kill waves on the SQ.
v2: only for the GFX ring for now.
Signed-off-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Huang Rui <ray.huang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Try to kill waves on the SQ.
Signed-off-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Huang Rui <ray.huang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Instead of hammering hard on the GPU try a soft recovery first.
v2: reorder code a bit
v3: increase timeout to 10ms, increment GPU reset counter
v4: squash in compile fix (Christian)
Signed-off-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Huang Rui <ray.huang@amd.com>
The new bulk moving functionality is ready and the overhead of moving PD/PT BOs
to the LRU is fixed, so move them onto the LRU again.
Signed-off-by: Huang Rui <ray.huang@amd.com>
Tested-by: Mike Lothian <mike@fireburn.co.uk>
Tested-by: Dieter Nützel <Dieter@nuetzel-hh.de>
Acked-by: Chunming Zhou <david1.zhou@amd.com>
Reviewed-by: Junwei Zhang <Jerry.Zhang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
This continues the work on bulk moving based on the proposal by Christian.
Background:
The amdgpu driver moves all PD/PT and per-VM BOs onto the idle list, and then
moves each of them to the end of the LRU list one by one. This causes a large
number of BOs to be moved to the end of the LRU and seriously impacts
performance.
Christian then provided a workaround that avoids moving PD/PT BOs on the LRU
with the patch:
Commit 0bbf32026cf5ba41e9922b30e26e1bed1ecd38ae ("drm/amdgpu: band aid
validating VM PTs")
However, the final solution should bulk move all PD/PT and per-VM BOs on the
LRU instead of moving them one by one.
Whenever amdgpu_vm_validate_pt_bos() is called and we have BOs which need to be
validated we move all BOs together to the end of the LRU without dropping the
lock for the LRU.
While doing so we note the beginning and end of this block in the LRU list.
Now when amdgpu_vm_validate_pt_bos() is called and we don't have anything to do,
we don't move every BO one by one, but instead cut the LRU list into pieces so
that we bulk move everything to the end in just one operation.
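A condensed sketch of the two paths described above, assuming a per-VM bulk-move tracker; the names follow the helpers added later in this series, but the body is illustrative:
  struct amdgpu_vm_bo_base *bo_base;

  /* something to validate: move every VM BO to the LRU tail in one pass and
   * record the first/last entry of the block in vm->lru_bulk_move */
  list_for_each_entry(bo_base, &vm->idle, vm_status)
          ttm_bo_move_to_lru_tail(&bo_base->bo->tbo, &vm->lru_bulk_move);

  /* nothing to validate: cut the remembered block out of the LRU and splice
   * it to the tail in a single operation */
  spin_lock(&glob->lru_lock);             /* glob: the ttm_bo_global */
  ttm_bo_bulk_move_lru_tail(&vm->lru_bulk_move);
  spin_unlock(&glob->lru_lock);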
Test data:
+--------------+-----------------+-----------+---------------------------------------+
|              |The Talos        |Clpeak(OCL)|BusSpeedReadback(OCL)                  |
|              |Principle(Vulkan)|           |                                       |
+--------------+-----------------+-----------+---------------------------------------+
|              |                 |           |0.319 ms(1K) 0.314 ms(2K) 0.308 ms(4K) |
| Original     | 147.7 FPS       | 76.86 us  |0.307 ms(8K) 0.310 ms(16K)             |
+--------------+-----------------+-----------+---------------------------------------+
| Original + WA|                 |           |0.254 ms(1K) 0.241 ms(2K)              |
|(don't move   | 162.1 FPS       | 42.15 us  |0.230 ms(4K) 0.223 ms(8K) 0.204 ms(16K)|
|PT BOs on LRU)|                 |           |                                       |
+--------------+-----------------+-----------+---------------------------------------+
| Bulk move    | 163.1 FPS       | 40.52 us  |0.244 ms(1K) 0.252 ms(2K) 0.213 ms(4K) |
|              |                 |           |0.214 ms(8K) 0.225 ms(16K)             |
+--------------+-----------------+-----------+---------------------------------------+
After testing with the above three benchmarks (Vulkan and OpenCL), we can see
a visible improvement over the original, and the result is even better than
the original with the workaround.
v2: move all BOs, including those on the idle, relocated, and moved lists, to
the end of the LRU and keep them together.
v3: remove an unused parameter and use list_for_each_entry instead of the safe
variant.
v4: call amdgpu_vm_move_to_lru_tail after command submission; at that time all
BOs will be back on the idle list.
v5: remove amdgpu_vm_move_to_lru_tail_by_list(), use bulk_moveable instead of
validated, and also move ttm_bo_bulk_move_lru_tail() into
amdgpu_vm_move_to_lru_tail().
v6: clean up and fix the return value.
Signed-off-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Huang Rui <ray.huang@amd.com>
Tested-by: Mike Lothian <mike@fireburn.co.uk>
Tested-by: Dieter Nützel <Dieter@nuetzel-hh.de>
Acked-by: Chunming Zhou <david1.zhou@amd.com>
Reviewed-by: Junwei Zhang <Jerry.Zhang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
This function allows us to bulk move a group of BOs to the tail of their LRU.
The positions of the group of BOs are stored in the (first, last)
bulk_move_pos structure.
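A minimal sketch of the structure and the core of the helper; the field and function names are close to, but not necessarily identical with, the final code:
  struct ttm_lru_bulk_move_pos {
          struct ttm_buffer_object *first;
          struct ttm_buffer_object *last;
  };

  static void bulk_move_pos_to_tail(struct list_head *lru,
                                    struct ttm_lru_bulk_move_pos *pos)
  {
          if (!pos->first)
                  return;         /* nothing recorded for this LRU */

          /* detach the whole [first, last] block and splice it to the tail */
          list_bulk_move_tail(lru, &pos->first->lru, &pos->last->lru);
  }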
Signed-off-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Huang Rui <ray.huang@amd.com>
Tested-by: Mike Lothian <mike@fireburn.co.uk>
Tested-by: Dieter Nützel <Dieter@nuetzel-hh.de>
Acked-by: Chunming Zhou <david1.zhou@amd.com>
Reviewed-by: Junwei Zhang <Jerry.Zhang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
When moving a BO to the end of the LRU, we need to remember the BO's position.
Make sure all moved BOs are in between "first" and "last", so they can be
bulk moved together.
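The bookkeeping amounts to something like this sketch (field names illustrative):
  /* as each BO is moved to the LRU tail, remember the first one and keep
   * updating the last, so the contiguous block can be bulk moved later */
  if (!pos->first)
          pos->first = bo;
  pos->last = bo;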
Signed-off-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Huang Rui <ray.huang@amd.com>
Tested-by: Mike Lothian <mike@fireburn.co.uk>
Tested-by: Dieter Nützel <Dieter@nuetzel-hh.de>
Acked-by: Chunming Zhou <david1.zhou@amd.com>
Reviewed-by: Junwei Zhang <Jerry.Zhang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Add a helper to get the root PD address and remove the workarounds from
the GMC9 code for that.
Signed-off-by: Christian König <christian.koenig@amd.com>
Acked-by: Felix Kuehling <Felix.Kuehling@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
We can easily figure out the address on the fly.
Signed-off-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Junwei Zhang <Jerry.Zhang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
sed -i "s/gart.robj/gart.bo/" drivers/gpu/drm/amd/amdgpu/*.c
sed -i "s/gart.robj/gart.bo/" drivers/gpu/drm/amd/amdgpu/*.h
Just cleaning up radeon leftovers.
Signed-off-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Huang Rui <ray.huang@amd.com>
Reviewed-by: Junwei Zhang <Jerry.Zhang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>