linux_dsm_epyc7002

mirror of https://github.com/AuxXxilium/linux_dsm_epyc7002.git synced 2024-12-22 13:02:19 +07:00

Author	SHA1	Message	Date
Tao Zhou	34cc4fd9ff	drm/amdgpu: move umc ras irq functions to umc block move umc ras irq functions from gmc v9 to generic umc block, these functions are relevant to umc and they can be shared among all generations of umc Signed-off-by: Tao Zhou <tao.zhou1@amd.com> Reviewed-by: Guchun Chen <guchun.chen@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2019-10-03 09:11:01 -05:00
Tao Zhou	f5f06e21e9	drm/amdgpu: update parameter of ras_ih_cb change struct ras_err_data err_data to void err_data, align with umc code and the callback's declaration in each ras block could pay no attention to the structure type Signed-off-by: Tao Zhou <tao.zhou1@amd.com> Reviewed-by: Guchun Chen <guchun.chen@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2019-10-03 09:11:01 -05:00
Monk Liu	e7da754b00	drm/amdgpu: fix an UMC hw arbitrator bug(v3) issue: the UMC6 h/w bug is that when MCLK is doing the switch in the middle of a page access being preempted by high priority client (e.g. DISPLAY) then UMC and the mclk switch would stuck there due to deadlock how: fixed by disabling auto PreChg for UMC to avoid high priority client preempting other client's access on the same page, thus the deadlock could be avoided v2: put the patch in callback of UMC6 v3: rename the callback to "init_registers" Signed-off-by: Monk Liu <Monk.Liu@amd.com> Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Reviewed-by: Hawking Zhang <hawking.zhang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2019-10-03 09:11:01 -05:00
Marek Olšák	6de088a08d	drm/amdgpu: remove gfx9 NGG Never used. Signed-off-by: Marek Olšák <marek.olsak@amd.com> Acked-by: Christian König <christian.koenig@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2019-10-03 09:11:01 -05:00
Alex Deucher	631cdbd27e	drm/amdgpu/atomfirmware: simplify the interface to get vram info fetch both the vram type and width in one function call. This avoids having to parse the same data table twice to get the two pieces of data. Reviewed-by: Xiaojie Yuan <xiaojie.yuan@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2019-10-03 09:11:01 -05:00
Alex Deucher	bd5520273c	drm/amdgpu/atomfirmware: use proper index for querying vram type (v3) The index is stored in scratch register 4 after asic init. Use that index. No functional change since all asics in a family use the same type of vram (G5, G6, HBM) and that is all we use at the monent, but if we ever need to query other info, we will now have the proper index. v2: module array is variable sized, handle that. v3: fix off by one in array handling Reviewed-by: Xiaojie Yuan <xiaojie.yuan@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2019-10-03 09:11:01 -05:00
Shirish S	52510a4035	drm/amdgpu/psp: silence response status warning log the response status related error to the driver's debug log since psp response status is not 0 even though there was no problem while the command was submitted. This warning misleads, hence this change. Signed-off-by: Shirish S <shirish.s@amd.com> Acked-by: Huang Rui <ray.huang@amd.com> Reviewed-by: Feifei Xu <Feifei.Xu@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2019-10-03 09:11:01 -05:00
Jesse Zhang	df99ac0fcc	drm/amd/amdgpu:Fix compute ring unable to detect hang. When compute fence did not signal, compute ring cannot detect hardware hang because its timeout value is set to be infinite by default. In SR-IOV and passthrough mode, if user does not declare custome timeout value for compute ring, then use gfx ring timeout value as default. So that when there is a ture hardware hang, compute ring can detect it. Signed-off-by: Jesse Zhang <zhexi.zhang@amd.com> Reviewed-by: Christian König <christian.koenig@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2019-10-03 09:11:01 -05:00
chen gong	90a08351f7	drm/amdgpu: Use mode2 mode to perform GPU RESET for Renoir Renoir need to use mode2 mode to implement GPU RESET Signed-off-by: chen gong <curry.gong@amd.com> Reviewed-by: Aaron Liu <aaron.liu@amd.com> Reviewed-by: Andrey Grodzovsky <andrey.grodzovsky@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2019-10-03 09:11:01 -05:00
Yong Zhao	40463bdc22	drm/amdkfd: Sync gfx10 kfd2kgd_calls function pointers get_hive_id was not set. Also, adjust the function setting sequence. Signed-off-by: Yong Zhao <Yong.Zhao@amd.com> Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2019-10-03 09:11:01 -05:00
Yong Zhao	c637b36aea	drm/amdkfd: Fix NULL pointer dereference for set_scratch_backing_va() Currently this function pointer is missing for GFX10. Considering it is a void function since GFX9, fix it by checking the function pointer before dereferencing it. Signed-off-by: Yong Zhao <Yong.Zhao@amd.com> Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2019-10-03 09:11:01 -05:00
Yong Zhao	812330eb69	drm/amdkfd: Add an error print if SDMA RLC is not idle The message will be useful when troubleshooting the issues. Signed-off-by: Yong Zhao <Yong.Zhao@amd.com> Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2019-10-03 09:11:00 -05:00
Le Ma	05ba0095fb	drm/amdgpu: correct condition check for psp rlc autoload Otherwise non-autoload case will go into the wrong routine and fail. Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com> Signed-off-by: Le Ma <le.ma@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2019-10-03 09:11:00 -05:00
Hawking Zhang	1f01cd9905	drm/amdgpu: add command id in psp response failure message For better clarification of issue. Signed-off-by: Hawking Zhang <Hawking.Zhang@amd.com> Reviewed-by: Le Ma <Le.Ma@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2019-10-03 09:11:00 -05:00
Le Ma	90c88dab8e	drm/amdgpu: enable psp front door loading by default on Arcturus Front door firmware loading is done via the psp rather than the driver. Signed-off-by: Le Ma <le.ma@amd.com> Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2019-10-03 09:11:00 -05:00
Le Ma	9a018e5a85	drm/amdgpu: disable vcn ip block for front door loading on Arcturus Needs more work to enable via front door loading. Signed-off-by: Le Ma <le.ma@amd.com> Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2019-10-03 09:11:00 -05:00
Tianci.Yin	4db37544ce	drm/amdgpu/gfx10: add support for wks firmware loading load different cp firmware according to the DID and RID Reviewed-by: Feifei Xu <Feifei.Xu@amd.com> Signed-off-by: Tianci.Yin <tianci.yin@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2019-10-03 09:11:00 -05:00
Alex Deucher	f77c7109c0	drm/amdgpu/ras: fix and update the documentation for RAS Add new sections to amdgpu.rst, fix up formatting issues, add additional documentation to each section. Acked-by: Christian König <christian.koenig@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2019-10-03 09:11:00 -05:00
Alex Deucher	a667b75c1e	drm/amdgpu: fix documentation for amdgpu_pm.c Fix DOC link name, clean up formatting in pp_dpm_* section. Acked-by: Christian König <christian.koenig@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2019-10-03 09:11:00 -05:00
Alex Deucher	fc9c7f8470	drm/amdgpu/ih: fix documentation in amdgpu_irq_dispatch Fix parameters. Acked-by: Christian König <christian.koenig@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2019-10-03 09:11:00 -05:00
Alex Deucher	1d614ded87	drm/amdgpu/vm: fix up documentation in amdgpu_vm.c Missing parameters, wrong comment type, etc. Acked-by: Christian König <christian.koenig@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2019-10-03 09:11:00 -05:00
Alex Deucher	4d8e54d2b9	drm/amdgpu/mn: fix documentation for amdgpu_mn_read_lock Document the new parameter. Fixes: `93065ac753` ("mm, oom: distinguish blockable mode for mmu notifiers") Acked-by: Christian König <christian.koenig@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2019-10-03 09:11:00 -05:00
Alex Deucher	ebc52c1692	drm/amdgpu: fix documentation for amdgpu_gem_prime_export Drop extra function parameter. Acked-by: Christian König <christian.koenig@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2019-10-03 09:11:00 -05:00
yu kuai	d0580c09c6	drm/amdgpu: remove excess function parameter description Fixes gcc warning: drivers/gpu/drm/amd/amdgpu/vcn_v2_5.c:431: warning: Excess function parameter 'sw' description in 'vcn_v2_5_disable_clock_gating' drivers/gpu/drm/amd/amdgpu/vcn_v2_5.c:550: warning: Excess function parameter 'sw' description in 'vcn_v2_5_enable_clock_gating' Fixes: `cbead2bdfc` ("drm/amdgpu: add VCN2.5 VCPU start and stop") Signed-off-by: yu kuai <yukuai3@huawei.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2019-10-03 09:11:00 -05:00
Guchun Chen	e53aec7e41	drm/amdgpu: enable full ras by default Enable full ras by default, user does not need to enable it by boot parameter. Signed-off-by: Guchun Chen <guchun.chen@amd.com> Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2019-10-03 09:11:00 -05:00
Jiange Zhao	57d4f3b7fd	drm/amdgpu/SRIOV: add navi12 pci id for SRIOV (v2) Add Navi12 PCI id support. v2: flag as experimental for now (Alex) Signed-off-by: Jiange Zhao <Jiange.Zhao@amd.com> Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2019-10-03 09:10:59 -05:00
Tianci.Yin	7677b0dbce	drm/amdgpu/gfx10: update gfx golden settings for navi14 update registers: mmUTCL1_CTRL Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com> Signed-off-by: Tianci.Yin <tianci.yin@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2019-10-03 09:10:59 -05:00
Tianci.Yin	aa4604b6e4	drm/amdgpu/gfx10: update gfx golden settings update registers: mmUTCL1_CTRL Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com> Signed-off-by: Tianci.Yin <tianci.yin@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2019-10-03 09:10:59 -05:00
Alex Deucher	ade9a34e7d	drm/amdgpu: flag navi12 and 14 as experimental for 5.4 We can remove this later as things get closer to launch. Reviewed-by: Xiaojie Yuan <xiaojie.yuan@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2019-10-03 09:10:59 -05:00
Alex Deucher	01b40c98ed	drm/amdgpu/psp: invalidate the hdp read cache before reading the psp response Otherwise we may get stale data. Reviewed-by: Feifei Xu <Feifei.Xu@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2019-10-03 09:10:59 -05:00
Alex Deucher	e8186eeccb	drm/amdgpu/psp: flush HDP write fifo after submitting cmds to the psp We need to make sure the fifo is flushed before we ask the psp to process the commands. Reviewed-by: Feifei Xu <Feifei.Xu@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2019-10-03 09:10:59 -05:00
Guchun Chen	5222d26146	drm/amdgpu: remove redundant variable definition No need to define the same variables in each loop of the function. Signed-off-by: Guchun Chen <guchun.chen@amd.com> Reviewed-by: Tao Zhou <tao.zhou1@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2019-10-03 09:10:59 -05:00
Guchun Chen	8a3e801f19	drm/amdgpu: avoid null pointer dereference null ptr should be checked first to avoid null ptr access Signed-off-by: Guchun Chen <guchun.chen@amd.com> Reviewed-by: Tao Zhou <tao.zhou1@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2019-10-03 09:10:59 -05:00
Hawking Zhang	fec6a08aae	drm/amdgpu: do not init mec2 jt for renoir For ASICs like renoir/arct, driver doesn't need to load mec2 jt. when mec1 jt is loaded, mec2 jt will be loaded automatically since the write is actaully broadcasted to both. We need to more time to test other gfx9 asic. but for now we should be able to draw conclusion that mec2 jt is not needed for renoir and arct. Signed-off-by: Hawking Zhang <Hawking.Zhang@amd.com> Reviewed-by: Feifei Xu <Feifei.Xu@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2019-10-03 09:10:59 -05:00
Hawking Zhang	2011eaea21	drm/amdgpu: add psp ip block for arct enable psp block for firmware loading and other security feature setup. Signed-off-by: Hawking Zhang <Hawking.Zhang@amd.com> Reviewed-by: Feifei Xu <Feifei.Xu@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2019-10-03 09:10:59 -05:00
Alex Deucher	a142ba8800	drm/amdgpu/ras: use GPU PAGE_SIZE/SHIFT for reserving pages We are reserving vram pages so they should be aligned to the GPU page size. Reviewed-by: Tao Zhou <tao.zhou1@amd.com> Reviewed-by: Christian König <christian.koenig@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2019-10-03 09:10:59 -05:00
Xiaojie Yuan	ec51d3facd	drm/amdgpu/discovery: get gpu info from ip discovery table except soc_bounding_box which is not integrated in discovery table yet Signed-off-by: Xiaojie Yuan <xiaojie.yuan@amd.com> Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2019-10-03 09:10:59 -05:00
Tao Zhou	afa44809a4	drm/amdgpu: use GPU PAGE SHIFT for umc retired page umc retired page belongs to vram and it should be aligned to gpu page size Signed-off-by: Tao Zhou <tao.zhou1@amd.com> Reviewed-by: Guchun Chen <guchun.chen@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2019-10-03 09:10:59 -05:00
Tianci.Yin	57516cdd74	drm/amdgpu: add navi12 pci id Add Navi12 PCI id support. Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com> Signed-off-by: Tianci.Yin <tianci.yin@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2019-10-03 09:10:59 -05:00
Tao Zhou	ae115c81ec	drm/amdgpu: replace DRM_ERROR with DRM_WARN in ras_reserve_bad_pages There are two cases of reserve error should be ignored: 1) a ras bad page has been allocated (used by someone); 2) a ras bad page has been reserved (duplicate error injection for one page); DRM_ERROR is unnecessary for the failure of bad page reserve Signed-off-by: Tao Zhou <tao.zhou1@amd.com> Reviewed-by: Guchun Chen <guchun.chen@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2019-10-03 09:10:59 -05:00
Adam Zerella	879e723df3	docs: drm/amdgpu: Resolve build warnings Some of the documentation formatting could be improved which will resolve some Sphinx amdgpu build warnings e.g WARNING: Unexpected indentation. WARNING: Block quote ends without a blank line; unexpected unindent. WARNING: Inline emphasis start-string without end-string. Signed-off-by: Adam Zerella <adam.zerella@gmail.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2019-10-03 09:10:58 -05:00
Alex Deucher	63b2b5e91b	drm/amdgpu/vm: fix documentation for amdgpu_vm_bo_param Add new parameters. Reviewed-by: Christian König <christian.koenig@amd.com> Reviewed-by: Andrey Grodzovsky <andrey.grodzovsky@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2019-10-03 09:10:58 -05:00
Bhawanpreet Lakha	143f230533	drm/amdgpu: psp DTM init DTM is the display topology manager. This is needed to communicate with psp about the display configurations. This patch adds -Loading the firmware -The functions and definitions for communication with the firmware v2: Fix formatting Signed-off-by: Bhawanpreet Lakha <Bhawanpreet.Lakha@amd.com> Reviewed-by: Harry Wentland <harry.wentland@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2019-10-03 09:10:58 -05:00
Bhawanpreet Lakha	ed19a9a2bb	drm/amdgpu: psp HDCP init This patch adds -Loading the firmware -The functions and definitions for communication with the firmware v2: Fix formatting Signed-off-by: Bhawanpreet Lakha <Bhawanpreet.Lakha@amd.com> Reviewed-by: Harry Wentland <harry.wentland@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2019-10-03 09:10:58 -05:00
Arnd Bergmann	29174a4310	drm/amdgpu: hide another #warning An earlier patch of mine disabled some #warning statements that get in the way of build testing, but then another instance was added around the same time. Remove that as well. Fixes: `b5203d16ae` ("drm/amd/amdgpu: hide #warning for missing DC config") Fixes: `e1c14c4339` ("drm/amdgpu: Enable DC on Renoir") Signed-off-by: Arnd Bergmann <arnd@arndb.de> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2019-10-02 12:58:33 -05:00
Arnd Bergmann	ec3e5c0f0c	drm/amdgpu: make pmu support optional, again When CONFIG_PERF_EVENTS is disabled, we cannot compile the pmu portion of the amdgpu driver: drivers/gpu/drm/amd/amdgpu/amdgpu_pmu.c:48:38: error: no member named 'hw' in 'struct perf_event' struct hw_perf_event *hwc = &event->hw; ~~~~~ ^ drivers/gpu/drm/amd/amdgpu/amdgpu_pmu.c:51:13: error: no member named 'attr' in 'struct perf_event' if (event->attr.type != event->pmu->type) ~~~~~ ^ ... The same bug was already fixed by commit `d155bef063` ("amdgpu: make pmu support optional") but broken again by what looks like an incorrectly rebased patch. Fixes: `64f55e6292` ("drm/amdgpu: Add RAS EEPROM table.") Signed-off-by: Arnd Bergmann <arnd@arndb.de> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2019-10-02 12:58:33 -05:00
Navid Emamdoost	57be09c6e8	drm/amdgpu: fix multiple memory leaks in acp_hw_init In acp_hw_init there are some allocations that needs to be released in case of failure: 1- adev->acp.acp_genpd should be released if any allocation attemp for adev->acp.acp_cell, adev->acp.acp_res or i2s_pdata fails. 2- all of those allocations should be released if mfd_add_hotplug_devices or pm_genpd_add_device fail. 3- Release is needed in case of time out values expire. Reviewed-by: Christian König <christian.koenig@amd.com> Signed-off-by: Navid Emamdoost <navid.emamdoost@gmail.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2019-10-02 12:58:33 -05:00
Marek Olšák	815fb4c9d7	drm/amdgpu: return tcc_disabled_mask to userspace UMDs need this for correct programming of harvested chips. Signed-off-by: Marek Olšák <marek.olsak@amd.com> Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2019-10-02 12:58:33 -05:00
Alex Deucher	49379032aa	drm/amdgpu: don't increment vram lost if we are in hibernation We reset the GPU as part of our hibernation sequence so we need to make sure we don't mark vram as lost in that case. Bug: https://bugs.freedesktop.org/show_bug.cgi?id=111879 Acked-by: Christian König <christian.koenig@amd.com> Reviewed-by: Huang Rui <ray.huang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2019-10-02 12:58:19 -05:00
Christian König	69f08e68af	drm/amdgpu: revert "disable bulk moves for now" This reverts commit `a213c2c7e2`. The changes to fix this should have landed in 5.1. Signed-off-by: Christian König <christian.koenig@amd.com> Reviewed-by: Zhou, David(ChunMing) <David1.Zhou@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2019-10-02 12:23:03 -05:00
Linus Torvalds	289991ce1c	drm fixes for 5.4-rc1 core: - Some cleanups and fixes in the self-refresh helpers - Some cleanups and fixes in the atomic helpers amdgpu: - Fix a 64 bit divide - Prevent a memory leak in a failure case in dc - Load proper gfx firmware on navi14 variants - Add more navi12 and navi14 PCI ids - Misc fixes for renoir - Fix bandwidth issues with multiple displays on vega20 - Support for Dali - Fix a possible oops with KFD on hawaii - Fix for backlight level after resume on some APUs - Other misc fixes panfrost: - Multiple panfrost fixes for regulator support and page fault handling -----BEGIN PGP SIGNATURE----- iQIcBAABAgAGBQJdjZr6AAoJEAx081l5xIa+nboP+gPWzx45Q3IsbnaZmcdFTFEf +/XgScoFcv5Uhd3aXtrSYDvPnSyNXpGsV5ccE/FtxNd4G75n20tPFxGNhjzyXfdc B2x1IRgc82W1KxYwwDlmd+f/86h6uthFkh1ToKN3hsHWNm8Wu8AgoJnoWvqwluf9 natSFnQPQIvcADpbpyk8FiNcXvMg0qrKQ8aj3uPxqUs4/ftigzez+5vYJOkktoEJ NFtlouVvIZejVo6l4Q5ebXXsol7On02iHUszpdJtb5FxMcBQwAafewCGln2622cA 8ooWmekZNtoHUH3WmUlrs7TqPKtoOqIEkMO8UvCJDwBB4/ft8sJRPDKFgk547E/8 Htv6MZXCSOT+/XxebM/wHijOg3MQVjPzO9s73YSmkytzGZVQ/Fgohl/6W+bN/xAm j/huS5ZozengAldFJHG4wruSk820Vzx736x2pk+9sbpf7PdFDIpuZus8U8wHc411 hu3S2397IxyX4XswLg8BTaIOhCXwT7CluQ9mYD1THPgRzG5YPha8JelTcwwlVsD9 2Cw6mCUAqydHHMboWQnEhRXhuhVfGlPAdJTsdyoI6zdXYqU/ThihJPBgG0wSq0y0 fAsj/9NRqSzg6hk9vm1QdCeOthKOuAZ0PgLcVHI1RNSwEyrN8yOupVwe7+Mn+q2z UNbfr2qXGqKxn6rqUy2W =yCyF -----END PGP SIGNATURE----- Merge tag 'drm-next-2019-09-27' of git://anongit.freedesktop.org/drm/drm Pull drm fixes from Dave Airlie: "Fixes built up over the past 1.5 weeks or so, it's two weeks of amdgpu, some core cleanups and some panfrost fixes. I also finally figured out why my desktop was slow to do a bunch of stuff (someone gave it an IPv6 address which can't reach anything!). core: - Some cleanups and fixes in the self-refresh helpers - Some cleanups and fixes in the atomic helpers amdgpu: - Fix a 64 bit divide - Prevent a memory leak in a failure case in dc - Load proper gfx firmware on navi14 variants - Add more navi12 and navi14 PCI ids - Misc fixes for renoir - Fix bandwidth issues with multiple displays on vega20 - Support for Dali - Fix a possible oops with KFD on hawaii - Fix for backlight level after resume on some APUs - Other misc fixes panfrost: - Multiple panfrost fixes for regulator support and page fault handling" * tag 'drm-next-2019-09-27' of git://anongit.freedesktop.org/drm/drm: (34 commits) drm/amd/display: prevent memory leak drm/amdgpu/gfx10: add support for wks firmware loading drm/amdgpu/display: include slab.h in dcn21_resource.c drm/amdgpu/display: fix 64 bit divide drm/panfrost: Prevent race when handling page fault drm/panfrost: Remove NULL checks for regulator drm/panfrost: Fix regulator_get_optional() misuse drm: Measure Self Refresh Entry/Exit times to avoid thrashing drm: Fix kerneldoc and remove unused struct member in self_refresh helper drm/atomic: Rename crtc_state->pageflip_flags to async_flip drm/atomic: Reject FLIP_ASYNC unconditionally drm/atomic: Take the atomic toys away from X drm/amdgpu: flag navi12 and 14 as experimental for 5.4 drm/kms: Duct-tape for mode object lifetime checks drm/amdgpu: add navi12 pci id drm/amdgpu: add navi14 PCI ID for work station SKU drm/amdkfd: Swap trap temporary registers in gfx10 trap handler drm/amd/powerplay: implement sysfs for getting dpm clock drm/amd/display: Restore backlight brightness after system resume drm/amd/display: Implement voltage limitation for dali ...	2019-09-27 11:13:35 -07:00
Andrey Konovalov	35f3fc87be	drm/amdgpu: untag user pointers This patch is a part of a series that extends kernel ABI to allow to pass tagged user pointers (with the top byte set to something else other than 0x00) as syscall arguments. In amdgpu_gem_userptr_ioctl() and amdgpu_amdkfd_gpuvm.c/init_user_pages() an MMU notifier is set up with a (tagged) userspace pointer. The untagged address should be used so that MMU notifiers for the untagged address get correctly matched up with the right BO. This patch untag user pointers in amdgpu_gem_userptr_ioctl() for the GEM case and in amdgpu_amdkfd_gpuvm_ alloc_memory_of_gpu() for the KFD case. This also makes sure that an untagged pointer is passed to amdgpu_ttm_tt_get_user_pages(), which uses it for vma lookups. Link: http://lkml.kernel.org/r/d684e1df08f2ecb6bc292e222b64fa9efbc26e69.1563904656.git.andreyknvl@google.com Signed-off-by: Andrey Konovalov <andreyknvl@google.com> Reviewed-by: Kees Cook <keescook@chromium.org> Suggested-by: Felix Kuehling <Felix.Kuehling@amd.com> Acked-by: Felix Kuehling <Felix.Kuehling@amd.com> Cc: Al Viro <viro@zeniv.linux.org.uk> Cc: Catalin Marinas <catalin.marinas@arm.com> Cc: Dave Hansen <dave.hansen@intel.com> Cc: Eric Auger <eric.auger@redhat.com> Cc: Jens Wiklander <jens.wiklander@linaro.org> Cc: Khalid Aziz <khalid.aziz@oracle.com> Cc: Mauro Carvalho Chehab <mchehab+samsung@kernel.org> Cc: Mike Rapoport <rppt@linux.ibm.com> Cc: Vincenzo Frascino <vincenzo.frascino@arm.com> Cc: Will Deacon <will@kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2019-09-25 17:51:41 -07:00
Tianci.Yin	1e94b43813	drm/amdgpu/gfx10: add support for wks firmware loading load different cp firmware according to the DID and RID Reviewed-by: Feifei Xu <Feifei.Xu@amd.com> Signed-off-by: Tianci.Yin <tianci.yin@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2019-09-24 13:25:15 -05:00
Linus Torvalds	84da111de0	hmm related patches for 5.4 This is more cleanup and consolidation of the hmm APIs and the very strongly related mmu_notifier interfaces. Many places across the tree using these interfaces are touched in the process. Beyond that a cleanup to the page walker API and a few memremap related changes round out the series: - General improvement of hmm_range_fault() and related APIs, more documentation, bug fixes from testing, API simplification & consolidation, and unused API removal - Simplify the hmm related kconfigs to HMM_MIRROR and DEVICE_PRIVATE, and make them internal kconfig selects - Hoist a lot of code related to mmu notifier attachment out of drivers by using a refcount get/put attachment idiom and remove the convoluted mmu_notifier_unregister_no_release() and related APIs. - General API improvement for the migrate_vma API and revision of its only user in nouveau - Annotate mmu_notifiers with lockdep and sleeping region debugging Two series unrelated to HMM or mmu_notifiers came along due to dependencies: - Allow pagemap's memremap_pages family of APIs to work without providing a struct device - Make walk_page_range() and related use a constant structure for function pointers -----BEGIN PGP SIGNATURE----- iQIzBAABCgAdFiEEfB7FMLh+8QxL+6i3OG33FX4gmxoFAl1/nnkACgkQOG33FX4g mxqaRg//c6FqowV1pQlLutvAOAgMdpzfZ9eaaDKngy9RVQxz+k/MmJrdRH/p/mMA Pq93A1XfwtraGKErHegFXGEDk4XhOustVAVFwvjyXO41dTUdoFVUkti6ftbrl/rS 6CT+X90jlvrwdRY7QBeuo7lxx7z8Qkqbk1O1kc1IOracjKfNJS+y6LTamy6weM3g tIMHI65PkxpRzN36DV9uCN5dMwFzJ73DWHp1b0acnDIigkl6u5zp6orAJVWRjyQX nmEd3/IOvdxaubAoAvboNS5CyVb4yS9xshWWMbH6AulKJv3Glca1Aa7QuSpBoN8v wy4c9+umzqRgzgUJUe1xwN9P49oBNhJpgBSu8MUlgBA4IOc3rDl/Tw0b5KCFVfkH yHkp8n6MP8VsRrzXTC6Kx0vdjIkAO8SUeylVJczAcVSyHIo6/JUJCVDeFLSTVymh EGWJ7zX2iRhUbssJ6/izQTTQyCH3YIyZ5QtqByWuX2U7ZrfkqS3/EnBW1Q+j+gPF Z2yW8iT6k0iENw6s8psE9czexuywa/Lttz94IyNlOQ8rJTiQqB9wLaAvg9hvUk7a kuspL+JGIZkrL3ouCeO/VA6xnaP+Q7nR8geWBRb8zKGHmtWrb5Gwmt6t+vTnCC2l olIDebrnnxwfBQhEJ5219W+M1pBpjiTpqK/UdBd92A4+sOOhOD0= =FRGg -----END PGP SIGNATURE----- Merge tag 'for-linus-hmm' of git://git.kernel.org/pub/scm/linux/kernel/git/rdma/rdma Pull hmm updates from Jason Gunthorpe: "This is more cleanup and consolidation of the hmm APIs and the very strongly related mmu_notifier interfaces. Many places across the tree using these interfaces are touched in the process. Beyond that a cleanup to the page walker API and a few memremap related changes round out the series: - General improvement of hmm_range_fault() and related APIs, more documentation, bug fixes from testing, API simplification & consolidation, and unused API removal - Simplify the hmm related kconfigs to HMM_MIRROR and DEVICE_PRIVATE, and make them internal kconfig selects - Hoist a lot of code related to mmu notifier attachment out of drivers by using a refcount get/put attachment idiom and remove the convoluted mmu_notifier_unregister_no_release() and related APIs. - General API improvement for the migrate_vma API and revision of its only user in nouveau - Annotate mmu_notifiers with lockdep and sleeping region debugging Two series unrelated to HMM or mmu_notifiers came along due to dependencies: - Allow pagemap's memremap_pages family of APIs to work without providing a struct device - Make walk_page_range() and related use a constant structure for function pointers" * tag 'for-linus-hmm' of git://git.kernel.org/pub/scm/linux/kernel/git/rdma/rdma: (75 commits) libnvdimm: Enable unit test infrastructure compile checks mm, notifier: Catch sleeping/blocking for !blockable kernel.h: Add non_block_start/end() drm/radeon: guard against calling an unpaired radeon_mn_unregister() csky: add missing brackets in a macro for tlb.h pagewalk: use lockdep_assert_held for locking validation pagewalk: separate function pointers from iterator data mm: split out a new pagewalk.h header from mm.h mm/mmu_notifiers: annotate with might_sleep() mm/mmu_notifiers: prime lockdep mm/mmu_notifiers: add a lockdep map for invalidate_range_start/end mm/mmu_notifiers: remove the __mmu_notifier_invalidate_range_start/end exports mm/hmm: hmm_range_fault() infinite loop mm/hmm: hmm_range_fault() NULL pointer bug mm/hmm: fix hmm_range_fault()'s handling of swapped out pages mm/mmu_notifiers: remove unregister_no_release RDMA/odp: remove ib_ucontext from ib_umem RDMA/odp: use mmu_notifier_get/put for 'struct ib_ucontext_per_mm' RDMA/mlx5: Use odp instead of mr->umem in pagefault_mr RDMA/mlx5: Use ib_umem_start instead of umem.address ...	2019-09-21 10:07:42 -07:00
Alex Deucher	e16a7cbced	drm/amdgpu: flag navi12 and 14 as experimental for 5.4 We can remove this later as things get closer to launch. Reviewed-by: Xiaojie Yuan <xiaojie.yuan@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2019-09-18 08:29:30 -05:00
Tianci.Yin	10e85054f9	drm/amdgpu: add navi12 pci id Add Navi12 PCI id support. Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com> Signed-off-by: Tianci.Yin <tianci.yin@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2019-09-17 14:49:19 -05:00
Tianci.Yin	a82e163bca	drm/amdgpu: add navi14 PCI ID for work station SKU Add the navi14 PCI device id. Reviewed-by: Feifei Xu <Feifei.Xu@amd.com> Signed-off-by: Tianci.Yin <tianci.yin@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2019-09-17 14:45:17 -05:00
Felix Kuehling	dcafbd50f2	drm/amdgpu: Fix KFD-related kernel oops on Hawaii Hawaii needs to flush caches explicitly, submitting an IB in a user VMID from kernel mode. There is no s_fence in this case. Fixes: `eb3961a574` ("drm/amdgpu: remove fence context from the job") Signed-off-by: Felix Kuehling <Felix.Kuehling@amd.com> Reviewed-by: Christian König <christian.koenig@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2019-09-17 14:38:35 -05:00
Prike Liang	4f3a2c1077	drm/amd/amdgpu: power up sdma engine when S3 resume back The sdma_v4 should be ungated when the IP resume back, otherwise it will hang up and resume time out error. Signed-off-by: Prike Liang <Prike.Liang@amd.com> Reviewed-by: Evan Quan <evan.quan@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2019-09-17 14:37:58 -05:00
Trek	73d8e6c7b8	drm/amdgpu: Check for valid number of registers to read Do not try to allocate any amount of memory requested by the user. Instead limit it to 128 registers. Actually the longest series of consecutive allowed registers are 48, mmGB_TILE_MODE0-31 and mmGB_MACROTILE_MODE0-15 (0x2644-0x2673). Bug: https://bugs.freedesktop.org/show_bug.cgi?id=111273 Signed-off-by: Trek <trek00@inbox.ru> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2019-09-17 14:37:45 -05:00
Aaron Liu	df794f679b	drm/amdgpu: remove program of lbpw for renoir These is no LBPW on Renoir. So removing program of lbpw for renoir. Signed-off-by: Aaron Liu <aaron.liu@amd.com> Reviewed-by: Huang Rui <ray.huang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2019-09-17 14:26:24 -05:00
Andrey Grodzovsky	103efdc1ea	drm/amdgpu: Remove clock gating restore. Restoring clock gating break SMU opeartion afterwards, avoid this until this further invistigated with SMU. Signed-off-by: Andrey Grodzovsky <andrey.grodzovsky@amd.com> Reviewed-by: Tao Zhou <tao.zhou1@amd.com> Reviewed-and-tested-by: Guchun Chen <guchun.chen@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2019-09-17 14:25:01 -05:00
Christian König	de7b45babd	drm/amdgpu: cleanup creating BOs at fixed location (v2) The placement is something TTM/BO internal and the RAS code should avoid touching that directly. Add a helper to create a BO at a fixed location and use that instead. v2: squash in fixes (Alex) Signed-off-by: Christian König <christian.koenig@amd.com> Reviewed-by: Guchun Chen <guchun.chen@amd.com> Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2019-09-17 08:06:54 -05:00
José Roberto de Souza	62afb4ad42	drm/connector: Allow max possible encoders to attach to a connector Currently we restrict the number of encoders that can be linked to a connector to 3, increase it to match the maximum number of encoders that can be initialized(32). To more effiently do that lets switch from an array of encoder ids to bitmask. v2: Fixing missed return on amdgpu_dm_connector_to_encoder() Suggested-by: Ville Syrjälä <ville.syrjala@linux.intel.com> Cc: Ville Syrjälä <ville.syrjala@linux.intel.com> Cc: Alex Deucher <alexander.deucher@amd.com> Cc: dri-devel@lists.freedesktop.org Cc: intel-gfx@lists.freedesktop.org Cc: nouveau@lists.freedesktop.org Cc: amd-gfx@lists.freedesktop.org Reviewed-by: Ville Syrjälä <ville.syrjala@linux.intel.com> Signed-off-by: Dhinakaran Pandiyan <dhinakaran.pandiyan@intel.com> Signed-off-by: José Roberto de Souza <jose.souza@intel.com> Signed-off-by: Manasi Navare <manasi.d.navare@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20190913232857.389834-2-jose.souza@intel.com	2019-09-16 15:13:53 -07:00
Andrey Grodzovsky	db338e1663	drm/amdgpu:Fix EEPROM checksum calculation. Fix checksum calculation after manually resetting the table. Unify reset and empty EEPROM init flow. Protect the table reset with lock. Signed-off-by: Andrey Grodzovsky <andrey.grodzovsky@amd.com> Reviewed-by: Guchun Chen <guchun.chen@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2019-09-16 15:30:44 -05:00
Guchun Chen	012dd14d1d	drm/amdgpu: fix ras ctrl debugfs node leak Use debugfs_remove_recursive to remove the whole debugfs directory instead of removing the node one by one. Signed-off-by: Guchun Chen <guchun.chen@amd.com> Reviewed-by: Christian König <christian.koenig@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2019-09-16 15:30:38 -05:00
Christian König	1313dacfad	drm/amdgpu: trace if a PD/PT update is done directly This is usfull for debugging. Signed-off-by: Christian König <christian.koenig@amd.com> Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2019-09-16 15:30:32 -05:00
Christian König	bc51c1e56f	drm/amdgpu: drop double HDP flush in the VM code Already done in the CPU based backend code. Signed-off-by: Christian König <christian.koenig@amd.com> Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2019-09-16 15:30:27 -05:00
Christian König	fc39d903eb	drm/amdgpu: cleanup coding style in the VM code a bit No functional change. Signed-off-by: Christian König <christian.koenig@amd.com> Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2019-09-16 15:30:13 -05:00
Christian König	03fb560f2e	drm/amdgpu: revert "disable bulk moves for now" This reverts commit `a213c2c7e2`. The changes to fix this should have landed in 5.1. Signed-off-by: Christian König <christian.koenig@amd.com> Reviewed-by: Zhou, David(ChunMing) <David1.Zhou@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2019-09-16 15:29:37 -05:00
Jiange Zhao	393993ac0c	drm/amdgpu/SRIOV: Navi12 SRIOV VF gets GTT base With changes in PSP and HV, SRIOV VF will handle vram gtt location just like bare metal. There is no need to differentiate it anymore. Signed-off-by: Jiange Zhao <Jiange.Zhao@amd.com> Reviewed-by: Emily Deng <Emily.Deng@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2019-09-16 15:29:04 -05:00
Aaron Liu	28faa17ee8	drm/amdgpu: remove program of lbpw for renoir These is no LBPW on Renoir. So removing program of lbpw for renoir. Signed-off-by: Aaron Liu <aaron.liu@amd.com> Reviewed-by: Huang Rui <ray.huang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2019-09-16 15:28:57 -05:00
zhong jiang	2032324682	drm/amdgpu: remove the redundant null checks debugfs_remove and kfree has taken the null check in account. hence it is unnecessary to check it. Just remove the condition. No functional change. This issue was detected by using the Coccinelle software. Signed-off-by: zhong jiang <zhongjiang@huawei.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2019-09-16 10:42:56 -05:00
Jean Delvare	ae2a349597	drm/amd: be quiet when no SAD block is found It is fine for displays without audio functionality to not provide any SAD block in their EDID. Do not log an error in that case, just return quietly. This fixes half of bug fdo#107825: https://bugs.freedesktop.org/show_bug.cgi?id=107825 Reviewed-by: Harry Wentland <harry.wentland@amd.com> Signed-off-by: Jean Delvare <jdelvare@suse.de> Cc: Alex Deucher <alexander.deucher@amd.com> Cc: "Christian König" <christian.koenig@amd.com> Cc: "David (ChunMing) Zhou" <David1.Zhou@amd.com> Cc: David Airlie <airlied@linux.ie> Cc: Daniel Vetter <daniel@ffwll.ch> Cc: Harry Wentland <harry.wentland@amd.com> Cc: Leo Li <sunpeng.li@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2019-09-16 10:42:56 -05:00
Trek	13238d4fa6	drm/amdgpu: Check for valid number of registers to read Do not try to allocate any amount of memory requested by the user. Instead limit it to 128 registers. Actually the longest series of consecutive allowed registers are 48, mmGB_TILE_MODE0-31 and mmGB_MACROTILE_MODE0-15 (0x2644-0x2673). Bug: https://bugs.freedesktop.org/show_bug.cgi?id=111273 Signed-off-by: Trek <trek00@inbox.ru> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2019-09-16 10:42:56 -05:00
Kent Russell	3e103fc301	Revert "drm/amdgpu/nbio7.4: add hw bug workaround for vega20" This reverts commit `e01f2d4189`. VG20 did not require this workaround, as the fix is in the VBIOS. Leave VG10/12 workaround as some older shipped cards do not have the VBIOS fix in place, and the kernel workaround is required in those situations Signed-off-by: Kent Russell <kent.russell@amd.com> Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2019-09-16 10:42:55 -05:00
Christian König	ec671737f8	drm/amdgpu: add graceful VM fault handling v3 Next step towards HMM support. For now just silence the retry fault and optionally redirect the request to the dummy page. v2: make sure the VM is not destroyed while we handle the fault. v3: fix VM destroy check, cleanup comments Signed-off-by: Christian König <christian.koenig@amd.com> Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2019-09-16 10:42:55 -05:00
Christian König	b65709a921	drm/amdgpu: reserve the root PD while freeing PASIDs Free the pasid only while the root PD is reserved. This prevents use after free in the page fault handling. Signed-off-by: Christian König <christian.koenig@amd.com> Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2019-09-16 10:42:55 -05:00
Christian König	061468c405	drm/amdgpu: allocate PDs/PTs with no_gpu_wait in a page fault While handling a page fault we can't wait for other ongoing GPU operations or we can potentially run into deadlocks. Signed-off-by: Christian König <christian.koenig@amd.com> Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2019-09-16 10:42:55 -05:00
Christian König	0f6064d6af	drm/amdgpu: allow direct submission of clears For handling PD/PT clears directly in the fault handler. Signed-off-by: Christian König <christian.koenig@amd.com> Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2019-09-16 10:42:55 -05:00
Christian König	acb476f541	drm/amdgpu: allow direct submission of PTE updates For handling PTE updates directly in the fault handler. Signed-off-by: Christian König <christian.koenig@amd.com> Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2019-09-16 10:42:55 -05:00
Christian König	807e299409	drm/amdgpu: allow direct submission of PDE updates v2 For handling PDE updates directly in the fault handler. v2: fix typo in comment Signed-off-by: Christian König <christian.koenig@amd.com> Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2019-09-16 10:42:55 -05:00
Christian König	47ca7efa4c	drm/amdgpu: allow direct submission in the VM backends v2 This allows us to update page tables directly while in a page fault. v2: use direct/delayed entities and still wait for moves Signed-off-by: Christian König <christian.koenig@amd.com> Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2019-09-16 10:42:55 -05:00
Christian König	a2cf324785	drm/amdgpu: split the VM entity into direct and delayed For page fault handling we need to use a direct update which can't be blocked by ongoing user CS. Signed-off-by: Christian König <christian.koenig@amd.com> Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2019-09-16 10:42:55 -05:00
Christian König	6817bf283b	drm/amdgpu: grab the id mgr lock while accessing passid_mapping Need to make sure that we actually dropping the right fence. Could be done with RCU as well, but to complicated for a fix. Signed-off-by: Christian König <christian.koenig@amd.com> Reviewed-by: Chunming Zhou <david1.zhou@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2019-09-16 10:16:28 -05:00
Jiange Zhao	1b65782468	drm/amdgpu/SRIOV: Navi12 SRIOV VF doesn't load TOC In SRIOV case, the autoload sequence is the same as bare metal, except VF won't load TOC. Signed-off-by: Jiange Zhao <Jiange.Zhao@amd.com> Reviewed-by: Emily Deng <Emily.Deng@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2019-09-16 10:16:21 -05:00
Jiange Zhao	a4ac7693f8	drm/amdgpu/SRIOV: Navi10/12 VF doesn't support SMU In SRIOV case, SMU and powerplay are handled in HV. VF shouldn't have control over SMU and powerplay. Signed-off-by: Jiange Zhao <Jiange.Zhao@amd.com> Reviewed-by: Emily Deng <Emily.Deng@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2019-09-16 10:16:14 -05:00
Prike Liang	a90a24d581	drm/amd/amdgpu: power up sdma engine when S3 resume back The sdma_v4 should be ungated when the IP resume back, otherwise it will hang up and resume time out error. Signed-off-by: Prike Liang <Prike.Liang@amd.com> Reviewed-by: Evan Quan <evan.quan@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2019-09-16 10:16:07 -05:00
Jiange Zhao	b05b69036f	drm/amdgpu: For Navi12 SRIOV VF, register mailbox functions Mailbox functions and interrupts are only for Navi12 VF. Register functions and irqs during initialization. Reviewed-by: Emily Deng <Emily.Deng@amd.com> Signed-off-by: Jiange Zhao <Jiange.Zhao@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2019-09-16 10:15:20 -05:00
Jack Zhang	51c0f58e9f	drm/amdgpu/sriov: add ring_stop before ring_create in psp v11 code psp v11 code missed ring stop in ring create function(VMR) while psp v3.1 code had the code. This will cause VM destroy1 fail and psp ring create fail. For SIOV-VF, ring_stop should not be deleted in ring_create function. Signed-off-by: Jack Zhang <Jack.Zhang1@amd.com> Reviewed-by: Feifei Xu <Feifei.Xu@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2019-09-16 10:11:28 -05:00
Evan Quan	0e0b89c0d7	drm/amd/powerplay: properly set mp1 state for SW SMU suspend/reset routine Set mp1 state properly for SW SMU suspend/reset routine. Signed-off-by: Evan Quan <evan.quan@amd.com> Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2019-09-16 10:10:58 -05:00
Felix Kuehling	d950800e79	drm/amdgpu: Fix KFD-related kernel oops on Hawaii Hawaii needs to flush caches explicitly, submitting an IB in a user VMID from kernel mode. There is no s_fence in this case. Fixes: `eb3961a574` ("drm/amdgpu: remove fence context from the job") Signed-off-by: Felix Kuehling <Felix.Kuehling@amd.com> Reviewed-by: Christian König <christian.koenig@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2019-09-16 10:10:06 -05:00
Andrey Grodzovsky	708901a666	drm/amdgpu: Fix mutex lock from atomic context. Problem: amdgpu_ras_reserve_bad_pages was moved to amdgpu_ras_reset_gpu because writing to EEPROM during ASIC reset was unstable. But for ERREVENT_ATHUB_INTERRUPT amdgpu_ras_reset_gpu is called directly from ISR context and so locking is not allowed. Also it's irrelevant for this partilcular interrupt as this is generic RAS interrupt and not memory errors specific. Fix: Avoid calling amdgpu_ras_reserve_bad_pages if not in task context. Signed-off-by: Andrey Grodzovsky <andrey.grodzovsky@amd.com> Reviewed-by: Tao Zhou <tao.zhou1@amd.com> Reviewed-by: Guchun Chen <guchun.chen@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2019-09-16 10:09:59 -05:00
Jiange Zhao	3636169cc0	drm/amdgpu: Add SRIOV mailbox backend for Navi1x Mimic the ones for Vega10, add mailbox backend for Navi1x Reviewed-by: Emily Deng <Emily.Deng@amd.com> Signed-off-by: Jiange Zhao <Jiange.Zhao@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2019-09-16 10:09:52 -05:00
Guchun Chen	1a3f2e8c3c	drm/amdgpu: implement ras query function for pcie bif ras error query funtionality implementation Signed-off-by: Tao Zhou <tao.zhou1@amd.com> Signed-off-by: Guchun Chen <guchun.chen@amd.com> Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2019-09-16 10:09:44 -05:00
Guchun Chen	d7bd680d40	drm/amdgpu: support pcie bif ras query and inject Call pcie bif ras query/inject in amdgpu ras. Signed-off-by: Tao Zhou <tao.zhou1@amd.com> Signed-off-by: Guchun Chen <guchun.chen@amd.com> Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2019-09-16 10:09:30 -05:00
Guchun Chen	52652ef286	drm/amdgpu: add ras error query count interface for nbio Add the interface query_ras_error_count for nbio. Signed-off-by: Tao Zhou <tao.zhou1@amd.com> Signed-off-by: Guchun Chen <guchun.chen@amd.com> Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2019-09-16 10:09:21 -05:00
Tianci.Yin	ff9d097193	drm/amdgpu: fix CPDMA hang in PRT mode for VEGA10 add and_mask since the programming logic of golden setting changed Reviewed-by: Tao Zhou <tao.zhou1@amd.com> Signed-off-by: Tianci.Yin <tianci.yin@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2019-09-16 10:09:14 -05:00
Hawking Zhang	f317035288	drm/amdgpu: enable error injection to XGMI block via debugfs allow inject error to XGMI block via debugfs node ras_ctrl Signed-off-by: Hawking Zhang <Hawking.Zhang@amd.com> Reviewed-by: Guchun Chen <guchun.chen@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2019-09-16 10:09:08 -05:00
Hawking Zhang	029fbd437e	drm/amdgpu: initialize ras structures for xgmi block (v2) init ras common interface and fs node for xgmi block v2: remove unnecessary physical node number check before invoking amdgpu_xgmi_ras_late_init Signed-off-by: Hawking Zhang <Hawking.Zhang@amd.com> Reviewed-by: Guchun Chen <guchun.chen@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2019-09-16 10:08:51 -05:00
Andrey Grodzovsky	084fe13b2c	drm/amdgpu: Allow to reset to EERPOM table. The table grows quickly during debug/development effort when multiple RAS errors are injected. Allow to avoid this by setting table header back to empty if needed. v2: Switch to debugfs entry instead of load time parameter. Signed-off-by: Andrey Grodzovsky <andrey.grodzovsky@amd.com> Reviewed-by: Tao Zhou <tao.zhou1@amd.com> Reviewed-by: Guchun Chen <guchun.chen@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2019-09-16 10:06:34 -05:00
Andrey Grodzovsky	d01b400b1a	drm/amdgpu: Add amdgpu_ras_eeprom_reset_table This will allow to reset the table on the fly. Signed-off-by: Andrey Grodzovsky <andrey.grodzovsky@amd.com> Reviewed-by: Tao Zhou <tao.zhou1@amd.com> Reviewed-by: Guchun Chen <guchun.chen@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2019-09-16 10:06:27 -05:00
Tao Zhou	d99659a062	drm/amdgpu: rename umc ras_init to err_cnt_init this interface is related to specific version of umc, distinguish it from ras_late_init Signed-off-by: Tao Zhou <tao.zhou1@amd.com> Reviewed-by: Guchun Chen <guchun.chen@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2019-09-16 10:06:20 -05:00
Tao Zhou	4930aabe7c	drm/amdgpu: move umc ras init to umc block move umc ras init from ras module to umc block, generic ras module should pay less attention to specific ras block. Signed-off-by: Tao Zhou <tao.zhou1@amd.com> Reviewed-by: Guchun Chen <guchun.chen@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2019-09-16 10:06:12 -05:00
Tao Zhou	86edcc7dba	drm/amdgpu: move umc late init from gmc to umc block umc late init is umc specific, it's more suitable to be put in umc block Signed-off-by: Tao Zhou <tao.zhou1@amd.com> Reviewed-by: Guchun Chen <guchun.chen@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2019-09-16 10:06:03 -05:00
Guchun Chen	1bd252c57b	drm/amdgpu: remove duplicated header file include amdgpu_ras.h is already included. Signed-off-by: Guchun Chen <guchun.chen@amd.com> Reviewed-by: Tao Zhou <tao.zhou1@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2019-09-16 10:05:29 -05:00
Shirish S	a35ad98bf9	drm/amdgpu: remove needless usage of #ifdef define sched_policy in case CONFIG_HSA_AMD is not enabled, with this there is no need to check for CONFIG_HSA_AMD else where in driver code. Suggested-by: Felix Kuehling <Felix.Kuehling@amd.com> Signed-off-by: Shirish S <shirish.s@amd.com> Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2019-09-16 10:05:20 -05:00
Shirish S	8c9f69bc5c	drm/amdgpu: fix build error without CONFIG_HSA_AMD If CONFIG_HSA_AMD is not set, build fails: drivers/gpu/drm/amd/amdgpu/amdgpu_device.o: In function `amdgpu_device_ip_early_init': drivers/gpu/drm/amd/amdgpu/amdgpu_device.c:1626: undefined reference to `sched_policy' Use CONFIG_HSA_AMD to guard this. Fixes: 1abb680ad371 ("drm/amdgpu: disable gfxoff while use no H/W scheduling policy") Signed-off-by: Shirish S <shirish.s@amd.com> Reviewed-by: Huang Rui <ray.huang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2019-09-16 10:05:07 -05:00
Andrey Grodzovsky	4d1337d2e9	drm/amdgpu: Avoid RAS recovery init when no RAS support. Fixes driver load regression on APUs. Signed-off-by: Andrey Grodzovsky <andrey.grodzovsky@amd.com> Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2019-09-16 09:59:35 -05:00
Christian König	cbfae36cea	drm/amdgpu: cleanup PTE flag generation v3 Move the ASIC specific code into a new callback function. v2: mask the flags for SI and CIK instead of a BUG_ON(). v3: remove last missed BUG_ON(). Signed-off-by: Christian König <christian.koenig@amd.com> Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com> Reviewed-by: Huang Rui <ray.huang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2019-09-16 09:59:29 -05:00
Christian König	71776b6dae	drm/amdgpu: cleanup mtype mapping Unify how we map the UAPI flags to the PTE hardware flags for a mapping. Only the MTYPE is actually ASIC dependent, all other flags should be copied over 1 to 1 and ASIC differences are handled later on. Signed-off-by: Christian König <christian.koenig@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2019-09-16 09:59:21 -05:00
Tianci.Yin	1dd077bbba	drm/amdgpu: add navi14 PCI ID for work station SKU Add the navi14 PCI device id. Reviewed-by: Feifei Xu <Feifei.Xu@amd.com> Signed-off-by: Tianci.Yin <tianci.yin@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2019-09-16 09:59:14 -05:00
Philip Yang	cde85ac247	drm/amdgpu: check if nbio->ras_if exist To avoid NULL function pointer access. This happens on VG10, reboot command hangs and have to power off/on to reboot the machine. This is serial console log: [ OK ] Reached target Unmount All Filesystems. [ OK ] Reached target Final Step. Starting Reboot... [ 305.696271] systemd-shutdown[1]: Syncing filesystems and block devices. [ 306.947328] systemd-shutdown[1]: Sending SIGTERM to remaining processes... [ 306.963920] systemd-journald[1722]: Received SIGTERM from PID 1 (systemd-shutdow). [ 307.322717] systemd-shutdown[1]: Sending SIGKILL to remaining processes... [ 307.336472] systemd-shutdown[1]: Unmounting file systems. [ 307.454202] EXT4-fs (sda2): re-mounted. Opts: errors=remount-ro [ 307.480523] systemd-shutdown[1]: All filesystems unmounted. [ 307.486537] systemd-shutdown[1]: Deactivating swaps. [ 307.491962] systemd-shutdown[1]: All swaps deactivated. [ 307.497624] systemd-shutdown[1]: Detaching loop devices. [ 307.504418] systemd-shutdown[1]: All loop devices detached. [ 307.510418] systemd-shutdown[1]: Detaching DM devices. [ 307.565907] sd 2:0:0:0: [sda] Synchronizing SCSI cache [ 307.731313] BUG: kernel NULL pointer dereference, address: 0000000000000000 [ 307.738802] #PF: supervisor read access in kernel mode [ 307.744326] #PF: error_code(0x0000) - not-present page [ 307.749850] PGD 0 P4D 0 [ 307.752568] Oops: 0000 [#1] SMP PTI [ 307.756314] CPU: 3 PID: 1 Comm: systemd-shutdow Not tainted 5.2.0-rc1-kfd-yangp #453 [ 307.764644] Hardware name: ASUS All Series/Z97-PRO(Wi-Fi ac)/USB 3.1, BIOS 9001 03/07/2016 [ 307.773580] RIP: 0010:soc15_common_hw_fini+0x33/0xc0 [amdgpu] [ 307.779760] Code: 89 fb e8 60 f5 ff ff f6 83 50 df 01 00 04 75 3d 48 8b b3 90 7d 00 00 48 c7 c7 17 b8 530 [ 307.799967] RSP: 0018:ffffac9483153d40 EFLAGS: 00010286 [ 307.805585] RAX: 0000000000000000 RBX: ffff9eb299da0000 RCX: 0000000000000006 [ 307.813261] RDX: 0000000000000000 RSI: ffff9eb29e3508a0 RDI: ffff9eb29e350000 [ 307.820935] RBP: ffff9eb299da0000 R08: 0000000000000000 R09: 0000000000000000 [ 307.828609] R10: 0000000000000000 R11: 0000000000000000 R12: ffff9eb299dbd1f8 [ 307.836284] R13: ffffffffc04f8368 R14: ffff9eb29cebd130 R15: 0000000000000000 [ 307.843959] FS: 00007f06721c9940(0000) GS:ffff9eb2a18c0000(0000) knlGS:0000000000000000 [ 307.852663] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 [ 307.858842] CR2: 0000000000000000 CR3: 000000081d798005 CR4: 00000000001606e0 [ 307.866516] Call Trace: [ 307.869169] amdgpu_device_ip_suspend_phase2+0x80/0x110 [amdgpu] [ 307.875654] ? amdgpu_device_ip_suspend_phase1+0x4d/0xd0 [amdgpu] [ 307.882230] amdgpu_device_ip_suspend+0x2e/0x60 [amdgpu] [ 307.887966] amdgpu_pci_shutdown+0x2f/0x40 [amdgpu] [ 307.893211] pci_device_shutdown+0x31/0x60 [ 307.897613] device_shutdown+0x14c/0x1f0 [ 307.901829] kernel_restart+0xe/0x50 [ 307.905669] __do_sys_reboot+0x1df/0x210 [ 307.909884] ? task_work_run+0x73/0xb0 [ 307.913914] ? trace_hardirqs_off_thunk+0x1a/0x1c [ 307.918970] do_syscall_64+0x4a/0x1c0 [ 307.922904] entry_SYSCALL_64_after_hwframe+0x49/0xbe [ 307.928336] RIP: 0033:0x7f0671cf8373 [ 307.932176] Code: 64 89 01 48 83 c8 ff c3 66 2e 0f 1f 84 00 00 00 00 00 0f 1f 44 00 00 89 fa be 69 19 128 [ 307.952384] RSP: 002b:00007ffdd1723d68 EFLAGS: 00000202 ORIG_RAX: 00000000000000a9 [ 307.960527] RAX: ffffffffffffffda RBX: 0000000001234567 RCX: 00007f0671cf8373 [ 307.968201] RDX: 0000000001234567 RSI: 0000000028121969 RDI: 00000000fee1dead [ 307.975875] RBP: 00007ffdd1723dd0 R08: 0000000000000000 R09: 0000000000000000 [ 307.983550] R10: 0000000000000002 R11: 0000000000000202 R12: 00007ffdd1723dd8 [ 307.991224] R13: 0000000000000000 R14: 0000001b00000004 R15: 00007ffdd17240c8 [ 307.998901] Modules linked in: xt_MASQUERADE nfnetlink iptable_nat xt_addrtype xt_conntrack nf_nat nf_cos [ 308.026505] CR2: 0000000000000000 [ 308.039998] RIP: 0010:soc15_common_hw_fini+0x33/0xc0 [amdgpu] [ 308.046180] Code: 89 fb e8 60 f5 ff ff f6 83 50 df 01 00 04 75 3d 48 8b b3 90 7d 00 00 48 c7 c7 17 b8 530 [ 308.066392] RSP: 0018:ffffac9483153d40 EFLAGS: 00010286 [ 308.072013] RAX: 0000000000000000 RBX: ffff9eb299da0000 RCX: 0000000000000006 [ 308.079689] RDX: 0000000000000000 RSI: ffff9eb29e3508a0 RDI: ffff9eb29e350000 [ 308.087366] RBP: ffff9eb299da0000 R08: 0000000000000000 R09: 0000000000000000 [ 308.095042] R10: 0000000000000000 R11: 0000000000000000 R12: ffff9eb299dbd1f8 [ 308.102717] R13: ffffffffc04f8368 R14: ffff9eb29cebd130 R15: 0000000000000000 [ 308.110394] FS: 00007f06721c9940(0000) GS:ffff9eb2a18c0000(0000) knlGS:0000000000000000 [ 308.119099] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 [ 308.125280] CR2: 0000000000000000 CR3: 000000081d798005 CR4: 00000000001606e0 [ 308.135304] printk: systemd-shutdow: 3 output lines suppressed due to ratelimiting [ 308.143518] Kernel panic - not syncing: Attempted to kill init! exitcode=0x00000009 [ 308.151798] Kernel Offset: 0x15000000 from 0xffffffff81000000 (relocation range: 0xffffffff80000000-0xff) [ 308.171775] ---[ end Kernel panic - not syncing: Attempted to kill init! exitcode=0x00000009 ]--- Signed-off-by: Philip Yang <Philip.Yang@amd.com> Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2019-09-16 09:58:52 -05:00
Xiaojie Yuan	bfa603aa5e	drm/amdgpu: fix null pointer deref in firmware header printing v2: declare as (struct common_firmware_header *) type because struct xxx_firmware_header inherits from it When CE's ucode_id(8) is used to get sdma_hdr, we will be accessing an unallocated amdgpu_firmware_info instance. This issue appears on rhel7.7 with gcc 4.8.5. Newer compilers might have optimized out such 'defined but not referenced' variable. [ 1120.798564] BUG: unable to handle kernel NULL pointer dereference at 000000000000000a [ 1120.806703] IP: [<ffffffffc0e3c9b3>] psp_np_fw_load+0x1e3/0x390 [amdgpu] [ 1120.813693] PGD 80000002603ff067 PUD 271b8d067 PMD 0 [ 1120.818931] Oops: 0000 [#1] SMP [ 1120.822245] Modules linked in: amdgpu(OE+) amdkcl(OE) amd_iommu_v2 amdttm(OE) amd_sched(OE) xt_CHECKSUM ipt_MASQUERADE nf_nat_masquerade_ipv4 tun bridge stp llc devlink ip6t_rpfilter ip6t_REJECT nf_reject_ipv6 ipt_REJECT nf_reject_ipv4 xt_conntrack ebtable_nat ip6table_nat nf_conntrack_ipv6 nf_defrag_ipv6 nf_nat_ipv6 ip6table_mangle ip6table_security ip6table_raw iptable_nat nf_conntrack_ipv4 nf_defrag_ipv4 nf_nat_ipv4 nf_nat iptable_mangle iptable_security iptable_raw nf_conntrack libcrc32c ip_set nfnetlink ebtable_filter ebtables ip6table_filter ip6_tables iptable_filter sunrpc dm_mirror dm_region_hash dm_log dm_mod intel_pmc_core intel_powerclamp coretemp intel_rapl joydev kvm_intel eeepc_wmi asus_wmi kvm sparse_keymap iTCO_wdt irqbypass rfkill crc32_pclmul snd_hda_codec_realtek mxm_wmi ghash_clmulni_intel intel_wmi_thunderbolt iTCO_vendor_support snd_hda_codec_generic snd_hda_codec_hdmi aesni_intel lrw gf128mul glue_helper ablk_helper sg cryptd pcspkr snd_hda_intel snd_hda_codec snd_hda_core snd_hwdep snd_seq snd_seq_device snd_pcm snd_timer snd pinctrl_sunrisepoint pinctrl_intel soundcore acpi_pad mei_me wmi mei i2c_i801 pcc_cpufreq ip_tables ext4 mbcache jbd2 sd_mod crc_t10dif crct10dif_generic i915 i2c_algo_bit iosf_mbi drm_kms_helper e1000e syscopyarea sysfillrect sysimgblt fb_sys_fops ahci libahci drm ptp libata crct10dif_pclmul crct10dif_common crc32c_intel serio_raw pps_core drm_panel_orientation_quirks video i2c_hid [ 1120.954136] CPU: 4 PID: 2426 Comm: modprobe Tainted: G OE ------------ 3.10.0-1062.el7.x86_64 #1 [ 1120.964390] Hardware name: System manufacturer System Product Name/Z170-A, BIOS 1302 11/09/2015 [ 1120.973321] task: ffff991ef1e3c1c0 ti: ffff991ee625c000 task.ti: ffff991ee625c000 [ 1120.981020] RIP: 0010:[<ffffffffc0e3c9b3>] [<ffffffffc0e3c9b3>] psp_np_fw_load+0x1e3/0x390 [amdgpu] [ 1120.990483] RSP: 0018:ffff991ee625f950 EFLAGS: 00010202 [ 1120.995935] RAX: 0000000000000002 RBX: ffff991edf6b2d38 RCX: ffff991edf6a0000 [ 1121.003391] RDX: 0000000000000000 RSI: ffff991f01d13898 RDI: ffffffffc110afb3 [ 1121.010706] RBP: ffff991ee625f9b0 R08: 0000000000000000 R09: 0000000000000000 [ 1121.018029] R10: 00000000000004c4 R11: ffff991ee625f64e R12: ffff991edf6b3220 [ 1121.025353] R13: ffff991edf6a0000 R14: 0000000000000008 R15: ffff991edf6b2d30 [ 1121.032666] FS: 00007f97b0c0b740(0000) GS:ffff991f01d00000(0000) knlGS:0000000000000000 [ 1121.041000] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 [ 1121.046880] CR2: 000000000000000a CR3: 000000025e604000 CR4: 00000000003607e0 [ 1121.054239] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000 [ 1121.061631] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400 [ 1121.068938] Call Trace: [ 1121.071494] [<ffffffffc0e3dba8>] psp_hw_init+0x218/0x270 [amdgpu] [ 1121.077886] [<ffffffffc0da3188>] amdgpu_device_fw_loading+0xe8/0x160 [amdgpu] [ 1121.085296] [<ffffffffc0e3b34c>] ? vega10_ih_irq_init+0x4bc/0x730 [amdgpu] [ 1121.092534] [<ffffffffc0da5c75>] amdgpu_device_init+0x1495/0x1c90 [amdgpu] [ 1121.099675] [<ffffffffc0da9cab>] amdgpu_driver_load_kms+0x8b/0x2f0 [amdgpu] [ 1121.106888] [<ffffffffc01b25cf>] drm_dev_register+0x12f/0x1d0 [drm] [ 1121.113419] [<ffffffffa4dcdfd8>] ? pci_enable_device_flags+0xe8/0x140 [ 1121.120183] [<ffffffffc0da260a>] amdgpu_pci_probe+0xca/0x170 [amdgpu] [ 1121.126919] [<ffffffffa4dcf97a>] local_pci_probe+0x4a/0xb0 [ 1121.132622] [<ffffffffa4dd10c9>] pci_device_probe+0x109/0x160 [ 1121.138607] [<ffffffffa4eb4205>] driver_probe_device+0xc5/0x3e0 [ 1121.144766] [<ffffffffa4eb4603>] __driver_attach+0x93/0xa0 [ 1121.150507] [<ffffffffa4eb4570>] ? __device_attach+0x50/0x50 [ 1121.156422] [<ffffffffa4eb1da5>] bus_for_each_dev+0x75/0xc0 [ 1121.162213] [<ffffffffa4eb3b7e>] driver_attach+0x1e/0x20 [ 1121.167771] [<ffffffffa4eb3620>] bus_add_driver+0x200/0x2d0 [ 1121.173590] [<ffffffffa4eb4c94>] driver_register+0x64/0xf0 [ 1121.179345] [<ffffffffa4dd0905>] __pci_register_driver+0xa5/0xc0 [ 1121.185593] [<ffffffffc099f000>] ? 0xffffffffc099efff [ 1121.190914] [<ffffffffc099f0a4>] amdgpu_init+0xa4/0xb0 [amdgpu] [ 1121.197101] [<ffffffffa4a0210a>] do_one_initcall+0xba/0x240 [ 1121.202901] [<ffffffffa4b1c90a>] load_module+0x271a/0x2bb0 [ 1121.208598] [<ffffffffa4dad740>] ? ddebug_proc_write+0x100/0x100 [ 1121.214894] [<ffffffffa4b1ce8f>] SyS_init_module+0xef/0x140 [ 1121.220698] [<ffffffffa518bede>] system_call_fastpath+0x25/0x2a [ 1121.226870] Code: b4 01 60 a2 00 00 31 c0 e8 83 60 33 e4 41 8b 47 08 48 8b 4d d0 48 c7 c7 b3 af 10 c1 48 69 c0 68 07 00 00 48 8b 84 01 60 a2 00 00 <48> 8b 70 08 31 c0 48 89 75 c8 e8 56 60 33 e4 48 8b 4d d0 48 c7 [ 1121.247422] RIP [<ffffffffc0e3c9b3>] psp_np_fw_load+0x1e3/0x390 [amdgpu] [ 1121.254432] RSP <ffff991ee625f950> [ 1121.258017] CR2: 000000000000000a [ 1121.261427] ---[ end trace e98b35387ede75bd ]--- Signed-off-by: Xiaojie Yuan <xiaojie.yuan@amd.com> Fixes: `c5fb912653` ("drm/amdgpu: add firmware header printing for psp fw loading (v2)") Reviewed-by: Kevin Wang <kevin1.wang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2019-09-16 09:56:01 -05:00
Huang Rui	4042a18872	drm/amdkfd: enable renoir while device probes This patch is to add asic flag to enable device probe during kfd init. Signed-off-by: Huang Rui <ray.huang@amd.com> Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2019-09-16 09:55:50 -05:00
Huang Rui	aa978594cf	drm/amdgpu: disable gfxoff while use no H/W scheduling policy While gfxoff is enabled, the mmVM_XXX registers will be 0xfffffff while the GFX is in "off" state. KFD queue creattion doesn't use ring based method, so it will trigger a VM fault. Signed-off-by: Huang Rui <ray.huang@amd.com> Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2019-09-16 09:55:41 -05:00
Felix Kuehling	7cae706193	drm/amdgpu: Disable retry faults in VMID0 There is no point retrying page faults in VMID0. Those faults are always fatal. Signed-off-by: Felix Kuehling <Felix.Kuehling@amd.com> Reviewed-and-Tested-by: Huang Rui <ray.huang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2019-09-16 09:54:34 -05:00
Yong Zhao	4e66d7d215	drm/amdgpu: Add a kernel parameter for specifying the asic type As more and more new asics start to reuse the old device IDs before launch, there is a need to quickly override the existing asic type corresponding to the reused device ID through a kernel parameter. With this, engineers no longer need to rely on local hack patches, facilitating cooperation across teams. Signed-off-by: Yong Zhao <Yong.Zhao@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2019-09-16 09:54:25 -05:00
Alex Deucher	bb42eda284	drm/amdgpu/irq: check if nbio funcs exist We need to check if the nbios funcs exist before checking the individual pointers. Signed-off-by: Alex Deucher <alexander.deucher@amd.com> Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com>	2019-09-16 09:54:14 -05:00
Tao Zhou	1a6fc071e1	drm/amdgpu: move the call of ras recovery_init and bad page reserve to proper place ras recovery_init should be called after ttm init, bad page reserve should be put in front of gpu reset since i2c may be unstable during gpu reset. add cleanup for recovery_init and recovery_fini v2: add more comment and print. remove cancel_work_sync in recovery_init. Signed-off-by: Tao Zhou <tao.zhou1@amd.com> Reviewed-by: Guchun Chen <guchun.chen@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2019-09-13 17:50:47 -05:00
Tao Zhou	87d2b92f1e	drm/amdgpu: save umc error records save umc error records to ras bad page array v2: add bad pages before gpu reset v3: add NULL check for adev->umc.funcs Signed-off-by: Tao Zhou <tao.zhou1@amd.com> Signed-off-by: Andrey Grodzovsky <andrey.grodzovsky@amd.com> Reviewed-by: Guchun Chen <guchun.chen@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2019-09-13 17:50:40 -05:00
Tao Zhou	78ad00c903	drm/amdgpu: Hook EEPROM table to RAS support eeprom records load and save for ras, move EEPROM records storing to bad page reserving v2: remove redundant check for con->eh_data Signed-off-by: Tao Zhou <tao.zhou1@amd.com> Signed-off-by: Andrey Grodzovsky <andrey.grodzovsky@amd.com> Reviewed-by: Guchun Chen <guchun.chen@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2019-09-13 17:50:32 -05:00
Tao Zhou	9dc23a6325	drm/amdgpu: change ras bps type to eeprom table record structure change bps type from retired page to eeprom table record, prepare for saving umc error records to eeprom Signed-off-by: Tao Zhou <tao.zhou1@amd.com> Reviewed-by: Guchun Chen <guchun.chen@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2019-09-13 17:50:26 -05:00
Andrey Grodzovsky	4bc2234077	drm/madgpu: Fix EEPROM Checksum calculation. Fix typo which messed up the calculation. Signed-off-by: Andrey Grodzovsky <andrey.grodzovsky@amd.com> Reviewed-by: Tao Zhou <tao.zhou1@amd.com> Reviewed-and-tested-by: Guchun Chen <guchun.chen@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2019-09-13 17:50:19 -05:00
Andrey Grodzovsky	4d25fba4e3	drm/amdgpu: Remove clock gating restore. Restoring clock gating break SMU opeartion afterwards, avoid this until this further invistigated with SMU. Signed-off-by: Andrey Grodzovsky <andrey.grodzovsky@amd.com> Reviewed-by: Tao Zhou <tao.zhou1@amd.com> Reviewed-and-tested-by: Guchun Chen <guchun.chen@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2019-09-13 17:50:10 -05:00
Yong Zhao	050091ab6e	drm/amdkfd: Query kfd device info by CHIP id instead of pci device id This optimizes out the pci device id usage in KFD and makes the code more maintainable. Signed-off-by: Yong Zhao <Yong.Zhao@amd.com> Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2019-09-13 17:49:45 -05:00
Felix Kuehling	cd05c86510	drm/amdgpu: Disable page faults while reading user wptrs These wptrs must be pinned and GPU accessible when this is called from hqd_load functions. So they should never fault. This resolves a circular lock dependency issue involving four locks including the DQM lock and mmap_sem. Signed-off-by: Felix Kuehling <Felix.Kuehling@amd.com> Reviewed-by: Oak Zeng <Oak.Zeng@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2019-09-13 17:49:38 -05:00
John Clements	337c200756	drm/amdgpu: clean up load TMR sequence Removed redundant goto statement Signed-off-by: John Clements <john.clements@amd.com> Reviewed-by: Feifei Xu <Feifei.Xu@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2019-09-13 17:49:11 -05:00
John Clements	4fb60b02fb	drm/amdgpu: enable TA load support in Arcturus Add support for loading XGMI/RAS FW Signed-off-by: John Clements <john.clements@amd.com> Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2019-09-13 17:49:02 -05:00
Tao Zhou	c5b6e585b2	drm/amdgpu: change r type to int in gmc_v9_0_late_init change r type from bool to int, suitable for both bool and int return value Signed-off-by: Tao Zhou <tao.zhou1@amd.com> Reviewed-by: Christian König <christian.koenig@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2019-09-13 17:48:51 -05:00
Jack Zhang	f1d59e00ff	drm/amd/amdgpu: add sw_fini interface for df_funcs add sw_fini interface of df_funcs. This interface will remove sysfs file of df_cntr_avail function. The old behavior only create sysfs of df_cntr_avail in sw_init, but never remove it for lack of sw_fini interface. With this,driver will report create sysfs fail when it's loaded for the second time. Signed-off-by: Jack Zhang <Jack.Zhang1@amd.com> Reviewed-by: Jonathan Kim <Jonathan.Kim@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2019-09-13 17:42:15 -05:00
Hawking Zhang	9dc9134258	drm/amdgpu: init UMC & RSMU register base address UMC RAS feature requires access to UMC & RSMU registers Signed-off-by: Hawking Zhang <Hawking.Zhang@amd.com> Reviewed-by: Tao Zhou <tao.zhou1@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2019-09-13 17:42:08 -05:00
Hawking Zhang	1c70d3d9c4	drm/amdgpu/nbio: switch to amdgpu_nbio_ras_late_init helper function amdgpu_nbio_ras_late_init is used to init nbio specfic ras debugfs/sysfs node and nbio specific interrupt handler. It can be shared among nbio generations Signed-off-by: Hawking Zhang <Hawking.Zhang@amd.com> Reviewed-by: Tao Zhou <tao.zhou1@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2019-09-13 17:42:02 -05:00
Hawking Zhang	47930de4aa	drm/amdgpu/mmhub: switch to amdgpu_mmhub_ras_late_init helper function amdgpu_mmhub_ras_late_init is used to init mmhub specfic ras debugfs/sysfs node and mmhub specific interrupt handler. It can be shared among mmhub generations Signed-off-by: Hawking Zhang <Hawking.Zhang@amd.com> Reviewed-by: Tao Zhou <tao.zhou1@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2019-09-13 17:41:55 -05:00
Hawking Zhang	bfcf62c2a5	drm/amdgpu/sdma: switch to amdgpu_sdma_ras_late_init helper function amdgpu_sdma_ras_late_init is used to init sdma specfic ras debugfs/sysfs node and sdma specific interrupt handler. It can be shared among sdma generations Signed-off-by: Hawking Zhang <Hawking.Zhang@amd.com> Reviewed-by: Tao Zhou <tao.zhou1@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2019-09-13 17:41:49 -05:00
Hawking Zhang	6caeee7a70	drm/amdgpu/gfx: switch to amdgpu_gfx_ras_late_init helper function amdgpu_gfx_ras_late_init is used to init gfx specfic ras debugfs/sysfs node and gfx specific interrupt handler. It can be shared among gfx generations Signed-off-by: Hawking Zhang <Hawking.Zhang@amd.com> Reviewed-by: Tao Zhou <tao.zhou1@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2019-09-13 17:41:42 -05:00
Hawking Zhang	a85eff14da	drm/amdgpu/gmc: switch to amdgpu_gmc_ras_late_init helper function amdgpu_gmc_ras_late_init is used to init gmc specfic ras debugfs/sysfs node and gmc specific interrupt handler. It can be shared among gmc generations. Signed-off-by: Hawking Zhang <Hawking.Zhang@amd.com> Reviewed-by: Tao Zhou <tao.zhou1@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2019-09-13 17:41:36 -05:00
Hawking Zhang	d094aea312	drm/amdgpu: set ip specific ras interface pointer to NULL after free it to prevent access to dangling pointers Signed-off-by: Hawking Zhang <Hawking.Zhang@amd.com> Reviewed-by: Tao Zhou <tao.zhou1@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2019-09-13 17:41:29 -05:00
Andrey Grodzovsky	d5ea093eeb	dmr/amdgpu: Add system auto reboot to RAS. In case of RAS error allow user configure auto system reboot through ras_ctrl. This is also part of the temproray work around for the RAS hang problem. v4: Use latest kernel API for disk sync. Signed-off-by: Andrey Grodzovsky <andrey.grodzovsky@amd.com> Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2019-09-13 17:41:17 -05:00
Andrey Grodzovsky	7c6e68c777	drm/amdgpu: Avoid HW GPU reset for RAS. Problem: Under certain conditions, when some IP bocks take a RAS error, we can get into a situation where a GPU reset is not possible due to issues in RAS in SMU/PSP. Temporary fix until proper solution in PSP/SMU is ready: When uncorrectable error happens the DF will unconditionally broadcast error event packets to all its clients/slave upon receiving fatal error event and freeze all its outbound queues, err_event_athub interrupt will be triggered. In such case and we use this interrupt to issue GPU reset. THe GPU reset code is modified for such case to avoid HW reset, only stops schedulers, deatches all in progress and not yet scheduled job's fences, set error code on them and signals. Also reject any new incoming job submissions from user space. All this is done to notify the applications of the problem. v2: Extract amdgpu_amdkfd_pre/post_reset from amdgpu_device_lock/unlock_adev Move amdgpu_job_stop_all_jobs_on_sched to amdgpu_job.c Remove print param from amdgpu_ras_query_error_count v3: Update based on prevoius bug fixing patch to properly call amdgpu_amdkfd_pre_reset for other XGMI hive memebers. Signed-off-by: Andrey Grodzovsky <andrey.grodzovsky@amd.com> Acked-by: Felix Kuehling <Felix.Kuehling@amd.com> Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2019-09-13 17:41:05 -05:00
Andrey Grodzovsky	12ffa55da6	drm/amdgpu: Fix bugs in amdgpu_device_gpu_recover in XGMI case. Issue 1: In XGMI case amdgpu_device_lock_adev for other devices in hive was called to late, after access to their repsective schedulers. So relocate the lock to the begining of accessing the other devs. Issue 2: Using amdgpu_device_ip_need_full_reset to switch the device list from all devices in hive to the single 'master' device who owns this reset call is wrong because when stopping schedulers we iterate all the devices in hive but when restarting we will only reactivate the 'master' device. Also, in case amdgpu_device_pre_asic_reset conlcudes that full reset IS needed we then have to stop schedulers for all devices in hive and not only the 'master' but with amdgpu_device_ip_need_full_reset we already missed the opprotunity do to so. So just remove this logic and always stop and start all schedulers for all devices in hive. Also minor cleanup and print fix. v4: Minor coding style fix. Signed-off-by: Andrey Grodzovsky <andrey.grodzovsky@amd.com> Acked-by: Felix Kuehling <Felix.Kuehling@amd.com> Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2019-09-13 17:39:05 -05:00
Christian König	43ce6bab7b	drm/amdgpu: remove amdgpu_cs_try_evict Trying to evict things from the current working set doesn't work that well anymore because of per VM BOs. Rely on reserving VRAM for page tables to avoid contention. Signed-off-by: Christian König <christian.koenig@amd.com> Reviewed-by: Chunming Zhou <david1.zhou@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2019-09-13 17:38:56 -05:00
Christian König	9d1b3c7805	drm/amdgpu: reserve at least 4MB of VRAM for page tables v2 This hopefully helps reduce the contention for page tables. v2: adjust maximum reported VRAM size as well Signed-off-by: Christian König <christian.koenig@amd.com> Reviewed-by: Chunming Zhou <david1.zhou@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2019-09-13 17:38:47 -05:00
Christian König	629be20395	drm/amdgpu: use moving fence instead of exclusive for VM updates Make VM updates depend on the moving fence instead of the exclusive one. Makes it less likely to actually have a dependency. Signed-off-by: Christian König <christian.koenig@amd.com> Reviewed-by: Chunming Zhou <david1.zhou@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2019-09-13 17:38:38 -05:00
Hawking Zhang	39857252e5	drm/amdgpu: only apply gds clearing workaround when ras is supported gds clearing workaround should only be applied on asics that support gfx ras Signed-off-by: Hawking Zhang <Hawking.Zhang@amd.com> Reviewed-by: Tao Zhou <tao.zhou1@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2019-09-13 17:36:29 -05:00
Hawking Zhang	8bf2485aec	drm/amdgpu: fix memory leak when ras is not supported on specific ip block free ras_if if ras is not supported Signed-off-by: Hawking Zhang <Hawking.Zhang@amd.com> Reviewed-by: Tao Zhou <tao.zhou1@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2019-09-13 17:36:22 -05:00
Hawking Zhang	4ce71be67b	drm/amdgpu: check mmhub_funcs pointer before refering to it mmhub callback functions are not initialized for all the ASICs Signed-off-by: Hawking Zhang <Hawking.Zhang@amd.com> Reviewed-by: Tao Zhou <tao.zhou1@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2019-09-13 17:36:16 -05:00
Felix Kuehling	17da41bf00	drm/amdgpu: Remove unnecessary TLB workaround (v2) This workaround is better handled in user mode in a way that doesn't require allocating extra memory and breaking userptr BOs. The TLB bug is a performance bug, not a functional or security bug. Hence it is safe to remove this kernel part of the workaround to allow a better workaround using only virtual address alignments in user mode. v2: Removed VI_BO_SIZE_ALIGN definition Signed-off-by: Felix Kuehling <Felix.Kuehling@amd.com> Reviewed-by: Christian König <christian.koenig@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2019-09-13 17:36:08 -05:00
Felix Kuehling	e0253d083c	drm/amdgpu: Use optimal mtypes and PTE bits for Arcturus For compute VRAM allocations on Arturus use the new RW mtype for non-coherent local memory, CC mtype for coherent local memory and PTE_SNOOPED bit for invalidating non-dirty cache lines on remote XGMI mappings. Signed-off-by: Felix Kuehling <Felix.Kuehling@amd.com> Acked-by: Christian König <christian.koenig@amd.com> Tested-by: Amber Lin <Amber.Lin@amd.com> Reviewed-by: Shaoyun Liu <Shaoyun.Liu@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2019-09-13 17:36:01 -05:00
Felix Kuehling	d0ba51b1ca	drm/amdgpu: Determing PTE flags separately for each mapping (v3) The same BO can be mapped with different PTE flags by different GPUs. Therefore determine the PTE flags separately for each mapping instead of storing them in the KFD buffer object. Add a helper function to determine the PTE flags to be extended with ASIC and memory-type-specific logic in subsequent commits. v2: Split Arcturus-specific MTYPE changes into separate commit v3: Fix return type of get_pte_flags to uint64_t Signed-off-by: Felix Kuehling <Felix.Kuehling@amd.com> Acked-by: Christian König <christian.koenig@amd.com> Reviewed-by: Shaoyun Liu <Shaoyun.Liu@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2019-09-13 17:35:55 -05:00
Oak Zeng	093e48c04d	drm/amdgpu: Support new arcturus mtype Arcturus repurposed mtype WC to RW. Modify gmc functions to support the new mtype Signed-off-by: Oak Zeng <Oak.Zeng@amd.com> Signed-off-by: Felix Kuehling <Felix.Kuehling@amd.com> Reviewed-by: Christian König <christian.koenig@amd.com> Reviewed-by: Shaoyun Liu <Shaoyun.Liu@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2019-09-13 17:35:48 -05:00
Hawking Zhang	22e1d14fef	drm/amdgpu: switch to amdgpu_ras_late_init for nbio v7_4 (v2) call helper function in late init phase to handle ras init for nbio ip block v2: init local var r to 0 in case the function return failure on asics that don't have ras_late_init implementation Signed-off-by: Hawking Zhang <Hawking.Zhang@amd.com> Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Reviewed-by: Tao Zhou <tao.zhou1@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2019-09-13 17:11:05 -05:00
Hawking Zhang	9ad1dc295b	drm/amdgpu: add ras_late_init callback function for nbio v7_4 (v3) ras_late_init callback function will be used to do common ras init in late init phase. v2: call ras_late_fini to do cleanup when fails to enable interrupt v3: rename sysfs/debugfs node name to pcie_bif_xxx Signed-off-by: Hawking Zhang <Hawking.Zhang@amd.com> Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Reviewed-by: Tao Zhou <tao.zhou1@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2019-09-13 17:11:05 -05:00
Hawking Zhang	dda79907a7	drm/amdgpu: add mmhub ras_late_init callback function (v2) The function will be called in late init phase to do mmhub ras init v2: check ras_late_init function pointer before invoking the function Signed-off-by: Hawking Zhang <Hawking.Zhang@amd.com> Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Reviewed-by: Tao Zhou <tao.zhou1@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2019-09-13 17:11:05 -05:00
Hawking Zhang	2452e7783c	drm/amdgpu: switch to amdgpu_ras_late_init for gmc v9 block (v2) call helper function in late init phase to handle ras init for gmc ip block v2: call ras_late_fini to do clean up when fail to enable interrupt Signed-off-by: Hawking Zhang <Hawking.Zhang@amd.com> Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Reviewed-by: Tao Zhou <tao.zhou1@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2019-09-13 17:11:05 -05:00
Hawking Zhang	7d0a31e8cc	drm/amdgpu: switch to amdgpu_ras_late_init for sdma v4 block (v2) call helper function in late init phase to handle ras init for sdma ip block v2: call ras_late_fini to do clean up when fail to enable interrupt Signed-off-by: Hawking Zhang <Hawking.Zhang@amd.com> Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Reviewed-by: Tao Zhou <tao.zhou1@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2019-09-13 17:11:04 -05:00
Hawking Zhang	63fa48db49	drm/amdgpu: switch to amdgpu_ras_late_init for gfx v9 block (v2) call helper function in late init phase to handle ras init for gfx ip block v2: call ras_late_fini to do clean up when fail to enable interrupt Signed-off-by: Hawking Zhang <Hawking.Zhang@amd.com> Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Reviewed-by: Tao Zhou <tao.zhou1@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2019-09-13 17:11:04 -05:00
Hawking Zhang	b293e891b0	drm/amdgpu: add helper function to do common ras_late_init/fini (v3) In late_init for ras, the helper function will be used to 1). disable ras feature if the IP block is masked as disabled 2). send enable feature command if the ip block was masked as enabled 3). create debugfs/sysfs node per IP block 4). register interrupt handler v2: check ih_info.cb to decide add interrupt handler or not v3: add ras_late_fini for cleanup all the ras fs node and remove interrupt handler Signed-off-by: Hawking Zhang <Hawking.Zhang@amd.com> Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Reviewed-by: Tao Zhou <tao.zhou1@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2019-09-13 17:11:04 -05:00
Hawking Zhang	a344db8e5e	drm/amdgpu: poll ras_controller_irq and err_event_athub_irq status For the hardware that can not enable BIF ring for IH cookies for both ras_controller_irq and err_event_athub_irq, the driver has to poll the status register in irq handling and ack the hardware properly when there is interrupt triggered Signed-off-by: Hawking Zhang <Hawking.Zhang@amd.com> Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2019-09-13 17:11:04 -05:00
Hawking Zhang	4e644fffb5	drm/amdgpu: add ras_controller and err_event_athub interrupt support Ras controller interrupt and Ras err event athub interrupt are two dedicated interrupts for RAS support. Signed-off-by: Hawking Zhang <Hawking.Zhang@amd.com> Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2019-09-13 17:11:04 -05:00
Hawking Zhang	4241863afc	drm/amdgpu/nbio: add functions to query ras specific interrupt status ras_controller_interrupt and err_event_interrupt are ras specific interrupts. add functions to check their status and ack them if they are generated. both funcitons should only be invoked in ISR when BIF ring is disabled or even not initialized. Signed-off-by: Hawking Zhang <Hawking.Zhang@amd.com> Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2019-09-13 17:11:03 -05:00
Hawking Zhang	bebc076285	drm/amdgpu: switch to new amdgpu_nbio structure no functional change, just switch to new structures Signed-off-by: Hawking Zhang <Hawking.Zhang@amd.com> Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2019-09-13 17:11:03 -05:00
Hawking Zhang	078ef4e932	drm/amdgpu: add new amdgpu nbio header file More nbio funcitonalities will be added and nbio could be treated as an ip block like gfx/sdma.etc Signed-off-by: Hawking Zhang <Hawking.Zhang@amd.com> Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2019-09-13 17:11:03 -05:00
Gerd Hoffmann	e7bf74d0aa	drm/amdgpu: switch to gem vma offset manager Pass gem vma_offset_manager to ttm_bo_device_init(), so ttm uses it instead of its own embedded struct. This makes some gem functions (specifically drm_gem_object_lookup) work on ttm objects. Signed-off-by: Gerd Hoffmann <kraxel@redhat.com> Reviewed-by: Daniel Vetter <daniel.vetter@ffwll.ch> Reviewed-by: Christian König <christian.koenig@amd.com> Link: http://patchwork.freedesktop.org/patch/msgid/20190905070509.22407-6-kraxel@redhat.com	2019-09-11 08:03:24 +02:00
Gerd Hoffmann	9d6f4484e8	drm/ttm: turn ttm_bo_device.vma_manager into a pointer Rename the embedded struct vma_offset_manager, new name is _vma_manager. ttm_bo_device.vma_manager changed to a pointer. The ttm_bo_device_init() function gets an additional vma_manager argument which allows to initialize ttm with a different vma manager. When passing NULL the embedded _vma_manager is used. All callers are updated to pass NULL, so the behavior doesn't change. Signed-off-by: Gerd Hoffmann <kraxel@redhat.com> Reviewed-by: Daniel Vetter <daniel.vetter@ffwll.ch> Reviewed-by: Christian König <christian.koenig@amd.com> Link: http://patchwork.freedesktop.org/patch/msgid/20190905070509.22407-2-kraxel@redhat.com	2019-09-11 08:03:23 +02:00
Dave Airlie	9ed45a209a	Merge tag 'drm-next-5.4-2019-08-30' of git://people.freedesktop.org/~agd5f/linux into drm-next drm-next-5.4-2019-08-30: amdgpu: - Add DC support for Renoir - Add some GPUVM hw bug workarounds - add support for the smu11 i2c controller - GPU reset vram lost bug fixes - Navi1x powergating fixes - Navi12 power fixes - Renoir power fixes - Misc bug fixes and cleanups Signed-off-by: Dave Airlie <airlied@redhat.com> From: Alex Deucher <alexdeucher@gmail.com> Link: https://patchwork.freedesktop.org/patch/msgid/20190830212650.5055-1-alexander.deucher@amd.com	2019-09-06 16:40:28 +10:00
Petr Cvek	20c14ee135	drm/amdgpu: Fix undefined dm_ip_block for navi12 There is missing "if defined" CONFIG_DRM_AMD_DC block for non DC configurations. This will cause link error. The patch is fixing that. Bug: https://bugs.freedesktop.org/show_bug.cgi?id=110979 Signed-off-by: Petr Cvek <petrcvekcz@gmail.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2019-08-30 15:37:17 -05:00
Aaron Liu	537e3bbfee	drm/amdgpu: fix no interrupt issue for renoir emu (v2) In renoir's vega10_ih model, there's a security change in mmIH_CHICKEN register, that limits IH to use physical address (FBPA, GPA) directly. Those chicken bits need to be programmed first. Signed-off-by: Aaron Liu <aaron.liu@amd.com> Reviewed-by: Huang Rui <ray.huang@amd.com> Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com> Acked-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2019-08-30 15:37:17 -05:00
Andrey Grodzovsky	0b2d2c2eec	drm/amdgpu: Handle job is NULL use case in amdgpu_device_gpu_recover This should be checked at all places job is accessed. Signed-off-by: Andrey Grodzovsky <andrey.grodzovsky@amd.com> Reviewed-by: Christian König <christian.koenig@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2019-08-30 15:02:39 -05:00
Roman Li	e1c14c4339	drm/amdgpu: Enable DC on Renoir Enable DC support for renoir. Acked-by: Harry Wentland <harry.wentland@amd.com> Signed-off-by: Roman Li <Roman.Li@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2019-08-29 15:52:34 -05:00
Prike Liang	334ffd0daa	drm/amdgpu: Initialize and update SDMA power gating Init SDMA HW base configuration and enable idle INT for rn. Signed-off-by: Prike Liang <Prike.Liang@amd.com> Reviewed-by: Aaron Liu <aaron.liu@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2019-08-29 15:52:32 -05:00
Tianci.Yin	12842d02c7	drm/amdgpu/psp: keep TMR in visible vram region for SRIOV Fix compute ring test failure in sriov scenario. Reviewed-by: Christian König <christian.koenig@amd.com> Signed-off-by: Tianci.Yin <tianci.yin@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2019-08-29 15:52:32 -05:00
Tianci.Yin	994dcfaa7e	drm/amdgpu: keep the stolen memory in visible vram region stolen memory should be fixed in visible region. Reviewed-by: Christian König <christian.koenig@amd.com> Signed-off-by: Tianci.Yin <tianci.yin@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2019-08-29 15:52:32 -05:00
Colin Ian King	92ead9fa6f	drm/amdgpu: fix spelling mistake "jumpimng" -> "jumping" There is a spelling mistake in a DRM_DEBUG_DRIVER debug message. Fix it. Signed-off-by: Colin Ian King <colin.king@canonical.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2019-08-29 15:52:32 -05:00
Alex Deucher	53fd9b5ae8	drm/amdgpu/virtual_dce: drop error message in hw_init No need to add new asic cases. This is a sw display implementation, so just drop the error message so when we add new asics, all we have to do is add the virtual dce IP module. Reviewed-by: Xiaojie Yuan <xiaojie.yuan@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2019-08-29 15:52:32 -05:00
Jean Delvare	77efe48a72	drm/amdgpu/si: fix ASIC tests Comparing adev->family with CHIP constants is not correct. adev->family can only be compared with AMDGPU_FAMILY constants and adev->asic_type is the struct member to compare with CHIP constants. They are separate identification spaces. Signed-off-by: Jean Delvare <jdelvare@suse.de> Fixes: `62a3755341` ("drm/amdgpu: add si implementation v10") Cc: Ken Wang <Qingqing.Wang@amd.com> Cc: Alex Deucher <alexander.deucher@amd.com> Cc: "Christian König" <christian.koenig@amd.com> Cc: "David (ChunMing) Zhou" <David1.Zhou@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2019-08-29 15:52:32 -05:00
Jean Delvare	1cdd229bec	drm/amd/amdgpu: hide voltage and power sensors on SI and KV parts The driver does not support these sensors yet and there is no point in creating sysfs attributes which will always return an error. Signed-off-by: Jean Delvare <jdelvare@suse.de> Cc: Alex Deucher <alexander.deucher@amd.com> Cc: "Christian König" <christian.koenig@amd.com> Cc: "David (ChunMing) Zhou" <David1.Zhou@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2019-08-29 15:52:32 -05:00
Monk Liu	e352625796	drm/amdgpu: introduce vram lost for reset (v2) for SOC15/vega10 the BACO reset & mode1 would introduce vram lost in high end address range, current kmd's vram lost checking cannot catch it since it only check very ahead visible frame buffer v2: cover NV as well Signed-off-by: Monk Liu <Monk.Liu@amd.com> Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2019-08-29 15:52:32 -05:00
Xiaojie Yuan	5ef3b8acdc	drm/amdgpu: enable athub powergating for navi12 Signed-off-by: Xiaojie Yuan <xiaojie.yuan@amd.com> Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2019-08-29 15:52:32 -05:00
Xiaojie Yuan	c1653ea05b	drm/amdgpu: enable vcn powergating for navi12 Signed-off-by: Xiaojie Yuan <xiaojie.yuan@amd.com> Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2019-08-29 15:52:31 -05:00
Hawking Zhang	317f9cc97b	drm/amdgpu: correct in_suspend setting for navi series in_suspend flag should be set in amdgpu_device_suspend/resume in pairs, instead of gfx10 ip suspend/resume function. Signed-off-by: Hawking Zhang <Hawking.Zhang@amd.com> Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2019-08-29 15:52:26 -05:00
Aaron Liu	c072b0c24e	drm/amdgpu: fix GFXOFF on Picasso and Raven2 For picasso(adev->pdev->device == 0x15d8)&raven2(adev->rev_id >= 0x8), firmware is sufficient to support gfxoff. In commit `98f58ada2d`, for picasso&raven2, return directly and cause gfxoff disabled. Fixes: `98f58ada2d` ("drm/amdgpu/gfx9: update pg_flags after determining if gfx off is possible") Reviewed-by: Huang Rui <ray.huang@amd.com> Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Aaron Liu <aaron.liu@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2019-08-27 10:38:40 -05:00
Kai-Heng Feng	c7b33cfb3c	drm/amdgpu: Add APTX quirk for Dell Latitude 5495 Needs ATPX rather than _PR3 to really turn off the dGPU. This can save ~5W when dGPU is runtime-suspended. Signed-off-by: Kai-Heng Feng <kai.heng.feng@canonical.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2019-08-27 10:09:12 -05:00
Andrey Grodzovsky	691bac9d09	drm/amdgpu: Vega20 SMU I2C HW engine controller. Implement HW I2C enigne controller to be used by the RAS EEPROM table manager. This is based on code from ATITOOLs. v2: Rename the file and all function prefixes to smu_v11_0_i2c By Luben's observation always fill the TX fifo to full so we don't have garbadge interpreted by the slave as valid data. v3: Remove preemption disable as the HW I2C controller will not stop the clock on empty TX fifo and so it's not critical to keep not empty queue. Switch to fast mode 400 khz SCL clock for faster read and write. v5: Restore clock gating before releasing I2C bus and fix some style comments. v6: squash in warning fix, fix includes (Alex) Signed-off-by: Andrey Grodzovsky <andrey.grodzovsky@amd.com> Reviewed-by: Luben Tuikov <Luben.Tuikov@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2019-08-27 09:17:35 -05:00
Andrey Grodzovsky	64f55e6292	drm/amdgpu: Add RAS EEPROM table. Add RAS EEPROM table manager to eanble RAS errors to be stored upon appearance and retrived on driver load. v2: Fix some prints. v3: Fix checksum calculation. Make table record and header structs packed to do correct byte value sum. Fix record crossing EEPROM page boundry. v4: Fix byte sum val calculation for record - look at sizeof(record). Fix some style comments. v5: Add description to EEPROM_TABLE_RECORD_SIZE and syntax fixes. Signed-off-by: Andrey Grodzovsky <andrey.grodzovsky@amd.com> Reviewed-by: Luben Tuikov <Luben.Tuikov@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2019-08-27 08:17:14 -05:00
Gang Ba	250af743c0	Revert "drm/amdgpu: free up the first paging queue v2" This reverts commit `4f8bc72fbf`. It turned out that a single reserved queue wouldn't be sufficient for page fault handling. Signed-off-by: Gang Ba <gaba@amd.com> Reviewed-by: Christian König <christian.koenig@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2019-08-27 08:16:26 -05:00
Xiaojie Yuan	534991731c	drm/amdgpu: add dummy read for some GCVM status registers The GRBM register interface is now capable of bursting 1 cycle per register wr->wr, wr->rd much faster than previous muticycle per transaction done interface. This has caused a problem where status registers requiring HW to update have a 1 cycle delay, due to the register update having to go through GRBM. SW may operate on an incorrect value if they write a register and immediately check the corresponding status register. Registers requiring HW to clear or set fields may be delayed by 1 cycle. For example, 1. write VM_INVALIDATE_ENG0_REQ mask = 5a 2. read VM_INVALIDATE_ENG0_ACK till the ack is same as the request mask = 5a a. HW will reset VM_INVALIDATE_ENG0_ACK = 0 until invalidation is complete 3. write VM_INVALIDATE_ENG0_REQ mask = 5a 4. read VM_INVALIDATE_ENG0_ACK till the ack is same as the request mask = 5a a. First read of VM_INVALIDATE_ENG0_ACK = 5a instead of 0 b. Second read of VM_INVALIDATE_ENG0_ACK = 0 because the remote GRBM h/w register takes one extra cycle to be cleared c. In this case, SW will see a false ACK if they exit on first read Affected registers (only GC variant) \| Recommended Dummy Read --------------------------------------+---------------------------- VM_INVALIDATE_ENG_ACK \| VM_INVALIDATE_ENG_REQ VM_L2_STATUS \| VM_L2_STATUS VM_L2_PROTECTION_FAULT_STATUS \| VM_L2_PROTECTION_FAULT_STATUS VM_L2_PROTECTION_FAULT_ADDR_HI/LO32 \| VM_L2_PROTECTION_FAULT_ADDR_HI/LO32 VM_L2_IH_LOG_BUSY \| VM_L2_IH_LOG_BUSY MC_VM_L2_PERFCOUNTER_HI/LO \| MC_VM_L2_PERFCOUNTER_HI/LO ATC_L2_PERFCOUNTER_HI/LO \| ATC_L2_PERFCOUNTER_HI/LO ATC_L2_PERFCOUNTER2_HI/LO \| ATC_L2_PERFCOUNTER2_HI/LO Signed-off-by: Xiaojie Yuan <xiaojie.yuan@amd.com> Reviewed-by: Jack Xiao <Jack.Xiao@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2019-08-27 08:15:32 -05:00
Dave Airlie	578d2342ec	Merge tag 'drm-next-5.4-2019-08-23' of git://people.freedesktop.org/~agd5f/linux into drm-next drm-next-5.4-2019-08-23: amdgpu: - Enable power features on Navi12 - Enable power features on Arcturus - RAS updates - Initial Renoir APU support - Enable power featyres on Renoir - DC gamma fixes - DCN2 fixes - GPU reset support for Picasso - Misc cleanups and fixes scheduler: - Possible race fix Signed-off-by: Dave Airlie <airlied@redhat.com> From: Alex Deucher <alexdeucher@gmail.com> Link: https://patchwork.freedesktop.org/patch/msgid/20190823202620.3870-1-alexander.deucher@amd.com	2019-08-27 17:22:15 +10:00
Alex Deucher	bad4c3e665	drm/amdgpu: set adev->num_vmhubs for gmc6,7,8 So that we properly handle them on older asics. Fixes: `3ff985485b` ("drm/amdgpu: Export function to flush TLB of specific vm hub") Tested-by: Tom St Denis <tom.stdenis@amd.com> Reviewed-by: Xiaojie Yuan <xiaojie.yuan@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2019-08-23 11:35:25 -05:00
Guchun Chen	64cc5414fb	drm/amdgpu: correct ras error count type Use unsigned long type for the same ras count variable. This will avoid overflow on 64 bit system. Signed-off-by: Guchun Chen <guchun.chen@amd.com> Reviewed-by: Tao Zhou <tao.zhou1@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2019-08-23 11:30:32 -05:00
Gerd Hoffmann	35616a4aa9	drm: drop resource_id parameter from drm_fb_helper_remove_conflicting_pci_framebuffers Not needed any more for remove_conflicting_pci_framebuffers calls. Signed-off-by: Gerd Hoffmann <kraxel@redhat.com> Reviewed-by: Daniel Vetter <daniel.vetter@ffwll.ch> Link: http://patchwork.freedesktop.org/patch/msgid/20190822090645.25410-3-kraxel@redhat.com	2019-08-23 10:48:31 +02:00
Thong Thai	8540098492	drm/amdgpu: enable VCN DPG for Renoir This will enable indirect SRAM loading for VCN DPG mode initialization. Signed-off-by: Thong Thai <thong.thai@amd.com> Reviewed-by: Leo Liu <leo.liu@amd.com> Reviewed-by: Prike Liang <Prike.Liang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2019-08-22 17:48:46 -05:00
Thong Thai	134b1461ea	Revert "drm/amdgpu: use direct loading on renoir vcn for the moment" This reverts commit `444a0fea51`. We are ready to enable it now. Signed-off-by: Thong Thai <thong.thai@amd.com> Reviewed-by: Leo Liu <leo.liu@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2019-08-22 17:48:46 -05:00
Aaron Liu	f13580a947	drm/amdgpu: update gc/sdma goldensetting for rn This patch updates gc/sdma goldensetting for renoir Signed-off-by: Aaron Liu <aaron.liu@amd.com> Reviewed-by: Huang Rui <ray.huang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2019-08-22 17:48:46 -05:00
Prike Liang	9a868d8bbb	drm/amdgpu: enable SDMA power gating for rn Enable SDMA PG flag during device ip early init. Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Prike Liang <Prike.Liang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2019-08-22 17:40:58 -05:00
Prike Liang	91c5b6b326	drm/amdgpu/sdma4: set sdma clock gating for rn Add support for SDMA clockgating on RN. Signed-off-by: Prike Liang <Prike.Liang@amd.com> Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2019-08-22 17:40:58 -05:00
Prike Liang	2f47d6492b	drm/amdgpu/mmhub1: set mmhub clock gating for rn setup mmhub clockgating. Signed-off-by: Prike Liang <Prike.Liang@amd.com> Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2019-08-22 17:40:58 -05:00
Prike Liang	8db63b7c38	drm/amdgpu: enable DF clock gating for rn Enable DF clock gating during DF IP early init. Signed-off-by: Prike Liang <Prike.Liang@amd.com> Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2019-08-22 17:40:58 -05:00
Prike Liang	e2ef3b70e8	drm/amdgpu: enable athub clock gating for rn Enable athub MG and LS clock gating. Signed-off-by: Prike Liang <Prike.Liang@amd.com> Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2019-08-22 17:40:58 -05:00
Prike Liang	91ec8bbb88	drm/amdgpu: enable IH clock gating for rn Enable IH clock gating during IH block initialized. Signed-off-by: Prike Liang <Prike.Liang@amd.com> Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2019-08-22 17:40:58 -05:00
Prike Liang	753c929cc7	drm/amdgpu: enable vcn clock gating for rn Enable VCN middle grain clock gating. Signed-off-by: Prike Liang <Prike.Liang@amd.com> Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2019-08-22 17:40:58 -05:00
Prike Liang	de273070c5	drm/amdgpu: enable rom clock gating for rn Enable rom light sleep clock gating. Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Prike Liang <Prike.Liang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2019-08-22 17:40:58 -05:00
Prike Liang	9deac0a415	drm/amdgpu: enable HDP clock gating for rn Enable HDP light sleep clock gating. Signed-off-by: Prike Liang <Prike.Liang@amd.com> Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2019-08-22 17:40:58 -05:00
Prike Liang	d98930f52e	drm/amdgpu: enable BIF clock gating for rn Enable BIF light sleep clock gating. Signed-off-by: Prike Liang <Prike.Liang@amd.com> Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2019-08-22 17:40:58 -05:00
Prike Liang	ef0e7d08a5	drm/amdgpu: enable sdma clock gating for rn Enable sdma middle grain and light sleep clock gating. Signed-off-by: Prike Liang <Prike.Liang@amd.com> Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2019-08-22 17:40:58 -05:00
Prike Liang	a2d15255ea	drm/amdgpu: enable mmhub clock gating for rn Enable mmhub midle grain and light sleep clock gating. Signed-off-by: Prike Liang <Prike.Liang@amd.com> Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2019-08-22 17:40:58 -05:00
Prike Liang	ec3636a53a	drm/amdgpu: enable gfx clock gating for rn Enable gfx cg/mg/cp etc clock gating. Signed-off-by: Prike Liang <Prike.Liang@amd.com> Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2019-08-22 17:40:58 -05:00
Aaron Liu	9f21e9ee7f	drm/amdgpu: add and enable gfxoff feature This patch updates gfxoff feature. Signed-off-by: Aaron Liu <aaron.liu@amd.com> Reviewed-by: Huang Rui <ray.huang@amd.com> Reviewed-by: Evan Quan <evan.quan@amd.com> Reviewed-by: Kevin Wang <kevin1.wang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2019-08-22 17:37:39 -05:00
Aaron Liu	1268795511	drm/amdgpu: add set_gfx_cgpg implement (v2) add set_gfx_cgpg implement v2: check if using sw_smu (Alex) Signed-off-by: Aaron Liu <aaron.liu@amd.com> Reviewed-by: Huang Rui <ray.huang@amd.com> Reviewed-by: Evan Quan <evan.quan@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2019-08-22 17:37:36 -05:00
Aaron Liu	97222cfac7	drm/amdgpu/powerplay: add power up/down SDMA interfaces for renoir 1.Implement PowerUpSDMA/PowerDownSDMA interfaces in the swSMU for renoir 2.adjust smu ip block ahead of gfx&sdma ip block Signed-off-by: Aaron Liu <aaron.liu@amd.com> Reviewed-by: Huang Rui <ray.huang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2019-08-22 17:37:05 -05:00
Aaron Liu	5dbbe6a77d	drm/amdgpu/powerplay: add smu ip block for renoir (v2) add swSMU [smu_v12_0] for renoir v2: whitespace fixes (Alex) Signed-off-by: Aaron Liu <aaron.liu@amd.com> Reviewed-by: Huang Rui <ray.huang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2019-08-22 17:36:58 -05:00
Xiaojie Yuan	9e48495017	drm/amdgpu/sdma5: fix number of sdma5 trap irq types for navi1x v2: set num_types based on num_instances navi1x has 2 sdma engines but commit "e7b58d03b678 drm/amdgpu: reorganize sdma v4 code to support more instances" changes the max number of sdma irq types (AMDGPU_SDMA_IRQ_LAST) from 2 to 8 which causes amdgpu_irq_gpu_reset_resume_helper() to recover irq of sdma engines with following logic: (enable irq for sdma0) * 1 time (enable irq for sdma1) * 1 time (disable irq for sdma1) * 6 times as a result, after gpu reset, interrupt for sdma1 is lost. Signed-off-by: Xiaojie Yuan <xiaojie.yuan@amd.com> Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2019-08-22 17:25:01 -05:00
Christian König	75e1cafde1	drm/amdgpu: fix dma_fence_wait without reference We need to grab a reference to the fence we wait for. Signed-off-by: Christian König <christian.koenig@amd.com> Reviewed-by: Chunming Zhou <david1.zhou@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2019-08-22 17:17:53 -05:00
Frank.Min	ea207b29ae	amd/amdgpu: add Arcturus vf DID support Reviewed-by: Feifei Xu <Feifei.Xu@amd.com> Signed-off-by: Frank.Min <Frank.Min@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2019-08-22 17:17:12 -05:00
Frank.Min	9d4f837aa0	drm/amdgpu: unity mc base address for arcturus arcturus for sriov would use the unified mc base address Reviewed-by: Christian König <christian.koenig@amd.com> Signed-off-by: Frank.Min <Frank.Min@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2019-08-22 17:15:21 -05:00
Frank.Min	81c274c473	drm/amdgpu: disable agp for sriov Since agp is not used for sriov, just disable it Reviewed-by: Christian König <christian.koenig@amd.com> Signed-off-by: Frank.Min <Frank.Min@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2019-08-22 17:15:06 -05:00
YueHaibing	252d2a5246	drm/amdgpu: remove duplicated include from gfx_v9_0.c Remove duplicated include. Signed-off-by: YueHaibing <yuehaibing@huawei.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2019-08-21 22:18:54 -05:00
YueHaibing	6892c1f866	drm/amdgpu: remove set but not used variable 'psp_enabled' Fixes gcc '-Wunused-but-set-variable' warning: drivers/gpu/drm/amd/amdgpu/nv.c: In function 'nv_common_early_init': drivers/gpu/drm/amd/amdgpu/nv.c:471:7: warning: variable 'psp_enabled' set but not used [-Wunused-but-set-variable] It's not used since inroduction in commit `c6b6a42175` ("drm/amdgpu: add navi10 common ip block (v3)") Signed-off-by: YueHaibing <yuehaibing@huawei.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2019-08-21 22:18:51 -05:00
Nicolai Hähnle	5a6a4c9d1b	drm/amdgpu: prevent memory leaks in AMDGPU_CS ioctl Error out if the AMDGPU_CS ioctl is called with multiple SYNCOBJ_OUT and/or TIMELINE_SIGNAL chunks, since otherwise the last chunk wins while the allocated array as well as the reference counts of sync objects are leaked. Signed-off-by: Nicolai Hähnle <nicolai.haehnle@amd.com> Reviewed-by: Christian König <christian.koenig@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2019-08-21 22:18:38 -05:00
Kenneth Feng	6da6c27928	drm/amd/amdgpu: disable MMHUB PG for navi10 Disable MMHUB PG for navi10 according to the production requirement. Signed-off-by: Kenneth Feng <kenneth.feng@amd.com> Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com> Reviewed-by: Kevin Wang <kevin1.wang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2019-08-21 22:18:05 -05:00
Evan Quan	9aef809b5c	drm/amd/powerplay: expose supported clock domains only through sysfs Do not expose those unsupported clock domains through sysfs on Arcturus. Signed-off-by: Evan Quan <evan.quan@amd.com> Reviewed-by: Kevin Wang <kevin1.wang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2019-08-21 22:17:28 -05:00
Tianci.Yin	828d6fde7f	drm/amdgpu/psp: move TMR to cpu invisible vram region so that more visible vram can be available for umd. Reviewed-by: Christian König <christian.koenig@amd.com>. Signed-off-by: Tianci.Yin <tianci.yin@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2019-08-21 22:16:45 -05:00
Xiaojie Yuan	50e275e880	drm/amdgpu: remove redundant argument for psp_funcs::cmd_submit callback Signed-off-by: Xiaojie Yuan <xiaojie.yuan@amd.com> Reviewed-by: Evan Quan <evan.quan@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2019-08-21 22:16:37 -05:00
Feifei Xu	51bfac71ca	drm/amdgpu: Set no-retry as default. This is to improve performance. Signed-off-by: Feifei Xu <Feifei.Xu@amd.com> Tested-by: Candice Li <candice.li@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2019-08-21 22:16:18 -05:00
Xiaojie Yuan	c5fb912653	drm/amdgpu: add firmware header printing for psp fw loading (v2) firmware header information is printed for direct fw loading but not added for psp fw loading yet v2: squash in warning fix (Alex) Signed-off-by: Xiaojie Yuan <xiaojie.yuan@amd.com> Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2019-08-21 22:16:18 -05:00
Xiaojie Yuan	6c2243efa0	drm/amdgpu: fix debug level for ppt offset/size Signed-off-by: Xiaojie Yuan <xiaojie.yuan@amd.com> Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2019-08-21 22:15:28 -05:00
Xiaojie Yuan	cc216214ac	drm/amdgpu: remove special autoload handling for navi12 s/r list in rlc firmware is ready, so remove the special autoload handling Signed-off-by: Xiaojie Yuan <xiaojie.yuan@amd.com> Acked-by: Alex Deucher <alexander.deucher@amd.com> Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2019-08-21 22:15:14 -05:00
Alex Deucher	b05f65d772	drm/amdgpu/gfx9: update pg_flags after determining if gfx off is possible We need to set certain power gating flags after we determine if the firmware version is sufficient to support gfxoff. Previously we set the pg flags in early init, but we later we might have disabled gfxoff if the firmware versions didn't support it. Move adding the additional pg flags after we determine whether or not to support gfxoff. Fixes: `005440066f` ("drm/amdgpu: enable gfxoff again on raven series (v2)") Tested-by: Kai-Heng Feng <kai.heng.feng@canonical.com> Tested-by: Tom St Denis <tom.stdenis@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> Cc: Kai-Heng Feng <kai.heng.feng@canonical.com>	2019-08-21 22:15:13 -05:00
Jason Gunthorpe	daa138a58c	Merge branch 'odp_fixes' into hmm.git From rdma.git Jason Gunthorpe says: ==================== This is a collection of general cleanups for ODP to clarify some of the flows around umem creation and use of the interval tree. ==================== The branch is based on v5.3-rc5 due to dependencies, and is being taken into hmm.git due to dependencies in the next patches. * odp_fixes: RDMA/mlx5: Use odp instead of mr->umem in pagefault_mr RDMA/mlx5: Use ib_umem_start instead of umem.address RDMA/core: Make invalidate_range a device operation RDMA/odp: Use kvcalloc for the dma_list and page_list RDMA/odp: Check for overflow when computing the umem_odp end RDMA/odp: Provide ib_umem_odp_release() to undo the allocs RDMA/odp: Split creating a umem_odp from ib_umem_get RDMA/odp: Make the three ways to create a umem_odp clear RMDA/odp: Consolidate umem_odp initialization RDMA/odp: Make it clearer when a umem is an implicit ODP umem RDMA/odp: Iterate over the whole rbtree directly RDMA/odp: Use the common interval tree library instead of generic RDMA/mlx5: Fix MR npages calculation for IB_ACCESS_HUGETLB Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>	2019-08-21 20:58:18 -03:00
Dave Airlie	5f680625d9	drm-misc-next for 5.4: UAPI Changes: Cross-subsystem Changes: Core Changes: - dma-buf: add reservation_object_fences helper, relax reservation_object_add_shared_fence, remove reservation_object seq number (and then restored) - dma-fence: Shrinkage of the dma_fence structure, Merge dma_fence_signal and dma_fence_signal_locked, Store the timestamp in struct dma_fence in a union with cb_list Driver Changes: - More dt-bindings YAML conversions - More removal of drmP.h includes - dw-hdmi: Support get_eld and various i2s improvements - gm12u320: Few fixes - meson: Global cleanup - panfrost: Few refactors, Support for GPU heap allocations - sun4i: Support for DDC enable GPIO - New panels: TI nspire, NEC NL8048HL11, LG Philips LB035Q02, Sharp LS037V7DW01, Sony ACX565AKM, Toppoly TD028TTEC1 Toppoly TD043MTEA1 -----BEGIN PGP SIGNATURE----- iHUEABYIAB0WIQRcEzekXsqa64kGDp7j7w1vZxhRxQUCXVqvpwAKCRDj7w1vZxhR xa3RAQDzAnt5zeesAxX4XhRJzHoCEwj2PJj9Re6xMJ9PlcfcvwD+OS+bcB6jfiXV Ug9IBd/DqjlmD9G9MxFxfSV946rksAw= =8uv4 -----END PGP SIGNATURE----- Merge tag 'drm-misc-next-2019-08-19' of git://anongit.freedesktop.org/drm/drm-misc into drm-next drm-misc-next for 5.4: UAPI Changes: Cross-subsystem Changes: Core Changes: - dma-buf: add reservation_object_fences helper, relax reservation_object_add_shared_fence, remove reservation_object seq number (and then restored) - dma-fence: Shrinkage of the dma_fence structure, Merge dma_fence_signal and dma_fence_signal_locked, Store the timestamp in struct dma_fence in a union with cb_list Driver Changes: - More dt-bindings YAML conversions - More removal of drmP.h includes - dw-hdmi: Support get_eld and various i2s improvements - gm12u320: Few fixes - meson: Global cleanup - panfrost: Few refactors, Support for GPU heap allocations - sun4i: Support for DDC enable GPIO - New panels: TI nspire, NEC NL8048HL11, LG Philips LB035Q02, Sharp LS037V7DW01, Sony ACX565AKM, Toppoly TD028TTEC1 Toppoly TD043MTEA1 Signed-off-by: Dave Airlie <airlied@redhat.com> [airlied: fixup dma_resv rename fallout] From: Maxime Ripard <maxime.ripard@bootlin.com> Link: https://patchwork.freedesktop.org/patch/msgid/20190819141923.7l2adietcr2pioct@flea	2019-08-21 16:44:41 +10:00
Jason Gunthorpe	c7d8b7824f	hmm: use mmu_notifier_get/put for 'struct hmm' This is a significant simplification, it eliminates all the remaining 'hmm' stuff in mm_struct, eliminates krefing along the critical notifier paths, and takes away all the ugly locking and abuse of page_table_lock. mmu_notifier_get() provides the single struct hmm per struct mm which eliminates mm->hmm. It also directly guarantees that no mmu_notifier op callback is callable while concurrent free is possible, this eliminates all the krefs inside the mmu_notifier callbacks. The remaining krefs in the range code were overly cautious, drivers are already not permitted to free the mirror while a range exists. Link: https://lore.kernel.org/r/20190806231548.25242-6-jgg@ziepe.ca Reviewed-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Ralph Campbell <rcampbell@nvidia.com> Tested-by: Ralph Campbell <rcampbell@nvidia.com> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>	2019-08-20 09:35:02 -03:00
Chris Wilson	b016cd6ed4	dma-buf: Restore seqlock around dma_resv updates This reverts `67c97fb79a` ("dma-buf: add reservation_object_fences helper") `dd7a7d1ff2` ("drm/i915: use new reservation_object_fences helper") `0e1d8083bd` ("dma-buf: further relax reservation_object_add_shared_fence") `5d344f58da` ("dma-buf: nuke reservation_object seq number") The scenario that defeats simply grabbing a set of shared/exclusive fences and using them blissfully under RCU is that any of those fences may be reallocated by a SLAB_TYPESAFE_BY_RCU fence slab cache. In this scenario, while keeping the rcu_read_lock we need to establish that no fence was changed in the dma_resv after a read (or full) memory barrier. Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Cc: Chris Wilson <chris@chris-wilson.co.uk> Cc: Daniel Vetter <daniel.vetter@ffwll.ch> Acked-by: Christian König <christian.koenig@amd.com> Link: https://patchwork.freedesktop.org/patch/msgid/20190814182401.25009-1-chris@chris-wilson.co.uk	2019-08-16 12:40:58 +01:00
Andrey Grodzovsky	c43b849f89	drm/amdgpu: Use new mode2 reset interface for RV. Integrate the mode2 reset into rest sequence. v2: Check ppfuncs pointer for NULL Signed-off-by: Andrey Grodzovsky <andrey.grodzovsky@amd.com> Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Reviewed-by: Evan Quan <evan.quan@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2019-08-15 11:00:44 -05:00
Andrey Grodzovsky	b1f5b4538e	dmr/amdgpu: Fix compile error with CONFIG_DRM_AMDGPU_GART_DEBUGFS Double defintion of 'i' Signed-off-by: Andrey Grodzovsky <andrey.grodzovsky@amd.com> Reviewed-by: Christian König <christian.koenig@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2019-08-15 10:59:17 -05:00
Gang Ba	108b4d928c	drm/amd/amdgpu: Update VM function pointer When VM state changed and system in large bar mode, make sure to use CPU update function, otherwise use SDMA function. Reviewed-by: Christian König <christian.koenig@amd.com> Signed-off-by: Gang Ba <gaba@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2019-08-15 10:58:28 -05:00
Yong Zhao	8b7d6157f2	drm/amdgpu: Set VM_L2_CNTL.PDE_FAULT_CLASSIFICATION to 0 for GFX10 We have done this for pre-GFX10 asics, but GFX10 did not pick up the new change. The below is the commit message for that change. This is recommended by HW designers. Previously when it was set to 1, the PDE walk error in VM fault will be treated as PERMISSION_OR_INVALID_PAGE_FAULT rather than usually expected OTHER_FAULT. As a result, the retry control in VM_CONTEXT*_CNTL will change accordingly. The above behavior is kind of abnormal. Furthermore, the PDE_FAULT_CLASSIFICATION == 1 feature was targeted for very old ASICs and it never made it way to production. Therefore, we should set it to 0. Signed-off-by: Yong Zhao <Yong.Zhao@amd.com> Reviewed-by: Christian König <christian.koenig@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2019-08-15 10:58:14 -05:00
Yong Zhao	5d36d4c976	drm/amdgpu: Add more page fault info printing for GFX10 The printing we did for GFX9 was not propogated to GFX10 somehow, so fix it now. Signed-off-by: Yong Zhao <Yong.Zhao@amd.com> Reviewed-by: Christian König <christian.koenig@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2019-08-15 10:58:08 -05:00
Yong Zhao	4e0ae5e214	drm/amdgpu: Add printing for RW extracted from VM_L2_PROTECTION_FAULT_STATUS RW is also useful in most cases. Signed-off-by: Yong Zhao <Yong.Zhao@amd.com> Reviewed-by: Christian König <christian.koenig@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2019-08-15 10:58:02 -05:00
Oak Zeng	5413fce4b2	drm/amdkfd/gfx10: Calling amdgpu functions to invalidate TLB Calling amdgpu function to invalidate TLB, instead of using a kfd implementation. Delete the kfd local TLB invalidation implementation. Signed-off-by: Oak Zeng <Oak.Zeng@amd.com> Reviewed-by: Christian Konig <christian.koenig@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2019-08-15 10:57:55 -05:00
Oak Zeng	3ff985485b	drm/amdgpu: Export function to flush TLB of specific vm hub This is for kfd to reuse amdgpu TLB invalidation function. On gfx10, kfd only needs to flush TLB on gfx hub but not on mm hub. So export a function for KFD flush TLB only on specific hub. Signed-off-by: Oak Zeng <Oak.Zeng@amd.com> Reviewed-by: Christian Konig <christian.koenig@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2019-08-15 10:57:48 -05:00
Stephen Rothwell	2568cedc13	drm/amdgpu: MODULE_FIRMWARE requires linux/module.h Fixes: `6a7a0bdbfa` ("drm/amdgpu: add psp_v12_0 for renoir (v2)") Reviewed-by: Huang Rui <ray.huang@amd.com> Signed-off-by: Stephen Rothwell <sfr@canb.auug.org.au> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2019-08-15 10:52:01 -05:00
Tao Zhou	d6e0cbb152	drm/amdgpu: implement querying ras error count for mmhub get mmhub ea ras error count by accessing EDC_CNT register Signed-off-by: Tao Zhou <tao.zhou1@amd.com> Reviewed-by: Guchun Chen <guchun.chen@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2019-08-15 10:51:50 -05:00
Kevin Wang	f0f50dcfd4	drm/amdgpu: use exiting amdgpu_ctx_total_num_entities function simplify driver code. Signed-off-by: Kevin Wang <kevin1.wang@amd.com> Reviewed-by: Christian König <christian.koenig@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2019-08-15 10:51:45 -05:00
Kevin Wang	b81e57fbf9	drm/amdgpu: fix typo error amdgput -> amdgpu fix typo error: change function name from "amdgput_ctx_total_num_entities" to "amdgpu_ctx_total_num_entities". Signed-off-by: Kevin Wang <kevin1.wang@amd.com> Reviewed-by: Christian König <christian.koenig@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2019-08-15 10:51:37 -05:00
Christoph Hellwig	244511f386	drm/amdgpu: simplify and cleanup setting the dma mask Use dma_set_mask_and_coherent to set both masks in one go, and remove the no longer required fallback, as the kernel now always accepts larger than required DMA masks. Fail the driver probe if we can't set the DMA mask, as that means the system can only support a larger mask. Reviewed-by: Christian König <christian.koenig@amd.com> Signed-off-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2019-08-15 10:51:01 -05:00
Christoph Hellwig	90489ce18c	drm/amdgpu: handle PCIe root ports with addressing limitations amdgpu uses a need_dma32 flag to indicate to the drm core that some allocations need to be done using GFP_DMA32, but it only checks the device addressing capabilities to make that decision. Unfortunately PCIe root ports that have limited addressing exist as well. Use the dma_addressing_limited instead to also take those into account. Reviewed-by: Christian König <christian.koenig@amd.com> Signed-off-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2019-08-15 10:51:00 -05:00
Christian König	52791eeec1	dma-buf: rename reservation_object to dma_resv Be more consistent with the naming of the other DMA-buf objects. Signed-off-by: Christian König <christian.koenig@amd.com> Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk> Link: https://patchwork.freedesktop.org/patch/323401/	2019-08-13 09:09:30 +02:00
Pierre-Eric Pelloux-Prayer	17b6d2d528	drm/amdgpu: fix gfx9 soft recovery The SOC15_REG_OFFSET() macro wasn't used, making the soft recovery fail. v2: use WREG32_SOC15 instead of WREG32 + SOC15_REG_OFFSET Signed-off-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com> Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Reviewed-by: Christian König <christian.koenig@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> Cc: stable@vger.kernel.org	2019-08-12 22:01:17 -05:00
Alex Deucher	b8cf3219cc	drm/amdgpu: flag renoir as experimental for now The current code won't likely work on production hw when it ships so leave it as experimental until it's ready. Acked-by: Huang Rui <ray.huang@amd.com> Reviewed-by: Huang Rui <ray.huang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2019-08-12 12:47:51 -05:00
Huang Rui	c9d0ca8528	drm/amdgpu: skip mec2 jump table loading for renoir Renoir need not load mec2 jump table with psp. Acked-by: Huang Rui <ray.huang@amd.com> Signed-off-by: Huang Rui <ray.huang@amd.com> Reviewed-by: Aaron Liu <aaron.liu@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2019-08-12 12:47:51 -05:00

... 3 4 5 6 7 ...

6538 Commits