linux_dsm_epyc7002

mirror of https://github.com/AuxXxilium/linux_dsm_epyc7002.git synced 2024-12-22 11:47:53 +07:00

Author	SHA1	Message	Date
Heiner Kallweit	babe2ef342	drm/amdkfd: Use pci_dev_id() helper Use new helper pci_dev_id() to simplify the code. Signed-off-by: Heiner Kallweit <hkallweit1@gmail.com> Signed-off-by: Bjorn Helgaas <bhelgaas@google.com> Acked-by: Christian König <christian.koenig@amd.com>	2019-04-29 16:12:35 -05:00
Amber Lin	0da8b10e36	drm/amdgpu: get_fw_version isn't ASIC specific The method of getting the firmware version is the same across ASICs, so remove the ASIC-specific implementations and create a single one in amdgpu_amdkfd.c. The newly created get_fw_version simply reads fw_version from adev->gfx rather than parsing the ucode header. Signed-off-by: Amber Lin <Amber.Lin@amd.com> Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2019-04-19 11:32:40 -05:00
Dave Airlie	f06ddb5309	Linux 5.1-rc5 BackMerge v5.1-rc5 into drm-next Need rc5 for udl fix to add udl cleanups on top. Signed-off-by: Dave Airlie <airlied@redhat.com>	2019-04-15 15:51:49 +10:00
Alex Deucher	e7ad88553a	drm/amdkfd: Add picasso pci id Picasso is a new raven variant. Reviewed-by: Kent Russell <kent.russell@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2019-04-04 10:20:34 -05:00
Alex Deucher	20d059278e	Revert "drm/amdkfd: avoid HMM change cause circular lock" This reverts commit `8dd69e69f4`. This depends on an HMM fix which is not upstream yet. Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2019-03-28 10:15:49 -05:00
Eric Huang	9b54d20176	drm/amdkfd: add RAS ECC event support (v3) The RAS ECC event is combined with the GPU reset event, because ECC interrupts are caused by uncorrectable errors that trigger a GPU reset. v2: Fix misleading-indentation warning v3: fix build with CONFIG_HSA_AMD disabled Signed-off-by: Eric Huang <JinhuiEric.Huang@amd.com> Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com> Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2019-03-19 15:36:51 -05:00
Eric Huang	0dee45a25a	drm/amdkfd: add RAS capabilities in topology for Vega20 (v2) It is to collaborate with HSA_CAPABILITY in libhsakmt. v2: squash in NULL pointer check Signed-off-by: Eric Huang <JinhuiEric.Huang@amd.com> Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com> Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2019-03-19 15:36:51 -05:00
Philip Yang	8dd69e69f4	drm/amdkfd: avoid HMM change cause circular lock There is a circular lock dependency between the gfx and kfd paths with the HMM change: lock(dqm) -> bo::reserve -> amdgpu_mn_lock. To avoid this, move init/uninit_mqd() out of lock(dqm), to remove nested locking between mmap_sem and bo::reserve. The locking order is: bo::reserve -> amdgpu_mn_lock(p->mn) Signed-off-by: Philip Yang <Philip.Yang@amd.com> Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com> Acked-by: Christian König <christian.koenig@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2019-03-19 15:03:37 -05:00
Kevin Wang	cac734c2db	drm/amdkfd: use init_mqd function to allocate object for hid_mqd (CI) If the legacy method is used to allocate the object, a WARNING call trace is triggered when mqd_hiq runs its uninit code, e.g. (S3 suspend test): [ 34.918944] Call Trace: [ 34.918948] [<ffffffff92961dc1>] dump_stack+0x19/0x1b [ 34.918950] [<ffffffff92297648>] __warn+0xd8/0x100 [ 34.918951] [<ffffffff9229778d>] warn_slowpath_null+0x1d/0x20 [ 34.918991] [<ffffffffc03ce1fe>] uninit_mqd_hiq_sdma+0x4e/0x50 [amdgpu] [ 34.919028] [<ffffffffc03d0ef7>] uninitialize+0x37/0xe0 [amdgpu] [ 34.919064] [<ffffffffc03d15a6>] kernel_queue_uninit+0x16/0x30 [amdgpu] [ 34.919086] [<ffffffffc03d26c2>] pm_uninit+0x12/0x20 [amdgpu] [ 34.919107] [<ffffffffc03d4915>] stop_nocpsch+0x15/0x20 [amdgpu] [ 34.919129] [<ffffffffc03c1dce>] kgd2kfd_suspend.part.4+0x2e/0x50 [amdgpu] [ 34.919150] [<ffffffffc03c2667>] kgd2kfd_suspend+0x17/0x20 [amdgpu] [ 34.919171] [<ffffffffc03c103a>] amdgpu_amdkfd_suspend+0x1a/0x20 [amdgpu] [ 34.919187] [<ffffffffc02ec428>] amdgpu_device_suspend+0x88/0x3a0 [amdgpu] [ 34.919189] [<ffffffff922e22cf>] ? enqueue_entity+0x2ef/0xbe0 [ 34.919205] [<ffffffffc02e8220>] amdgpu_pmops_suspend+0x20/0x30 [amdgpu] [ 34.919207] [<ffffffff925c56ff>] pci_pm_suspend+0x6f/0x150 [ 34.919208] [<ffffffff925c5690>] ? pci_pm_freeze+0xf0/0xf0 [ 34.919210] [<ffffffff926b45c6>] dpm_run_callback+0x46/0x90 [ 34.919212] [<ffffffff926b49db>] __device_suspend+0xfb/0x2a0 [ 34.919213] [<ffffffff926b4b9f>] async_suspend+0x1f/0xa0 [ 34.919214] [<ffffffff922c918f>] async_run_entry_fn+0x3f/0x130 [ 34.919216] [<ffffffff922b9d4f>] process_one_work+0x17f/0x440 [ 34.919217] [<ffffffff922bade6>] worker_thread+0x126/0x3c0 [ 34.919218] [<ffffffff922bacc0>] ? manage_workers.isra.25+0x2a0/0x2a0 [ 34.919220] [<ffffffff922c1c31>] kthread+0xd1/0xe0 [ 34.919221] [<ffffffff922c1b60>] ? insert_kthread_work+0x40/0x40 [ 34.919222] [<ffffffff92974c1d>] ret_from_fork_nospec_begin+0x7/0x21 [ 34.919224] [<ffffffff922c1b60>] ? insert_kthread_work+0x40/0x40 [ 34.919224] ---[ end trace 38cd9f65c963adad ]--- Signed-off-by: Kevin Wang <kevin1.wang@amd.com> Reviewed-by: Oak Zeng <Oak.Zeng@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2019-02-27 22:19:07 -05:00
Dave Airlie	fbac3c48fa	Merge branch 'drm-next-5.1' of git://people.freedesktop.org/~agd5f/linux into drm-next Fixes for 5.1: amdgpu: - Fix missing fw declaration after dropping old CI DPM code - Fix debugfs access to registers beyond the MMIO bar size - Fix context priority handling - Add missing license on some new files - Various cleanups and bug fixes radeon: - Fix missing break in CS parser for evergreen - Various cleanups and bug fixes sched: - Fix entities with 0 run queues Signed-off-by: Dave Airlie <airlied@redhat.com> From: Alex Deucher <alexdeucher@gmail.com> Link: https://patchwork.freedesktop.org/patch/msgid/20190221214134.3308-1-alexander.deucher@amd.com	2019-02-22 15:56:42 +10:00
Yong Zhao	234441dd49	drm/amdkfd: Optimize out sdma doorbell array in kgd2kfd_shared_resources We can directly calculate sdma doorbell indexes in the process doorbell pages through the doorbell_index structure in amdgpu_device, so no need to cache them in kgd2kfd_shared_resources any more. This alleviates the adaptation needs when new SDMA configurations are introduced. Signed-off-by: Yong Zhao <Yong.Zhao@amd.com> Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2019-02-18 18:00:50 -05:00
Yong Zhao	1f86805adc	drm/amdkfd: Fix bugs regarding CP queue doorbell mask on SOC15 Reserved doorbells for SDMA IH and VCN were not properly masked out when allocating doorbells for CP user queues. This patch fixed that. Signed-off-by: Yong Zhao <Yong.Zhao@amd.com> Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2019-02-18 18:00:41 -05:00
Yong Zhao	7452394310	drm/amdkfd: Move a constant definition around The similar definitions should be consecutive. Signed-off-by: Yong Zhao <Yong.Zhao@amd.com> Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2019-02-18 17:59:56 -05:00
Dave Airlie	c06de56121	Linux 5.0-rc7 Merge v5.0-rc7 into drm-next Backmerging for nouveau and imx that needed some fixes for next pulls. Signed-off-by: Dave Airlie <airlied@redhat.com>	2019-02-18 13:27:15 +10:00
Nathan Chancellor	6d3d8065bb	drm/amdkfd: Fix if preprocessor statement above kfd_fill_iolink_info_for_cpu Clang warns: drivers/gpu/drm/amd/amdgpu/../amdkfd/kfd_crat.c:866:5: warning: 'CONFIG_X86_64' is not defined, evaluates to 0 [-Wundef] ^ 1 warning generated. Fixes: `d1c234e2cd` ("drm/amdkfd: Allow building KFD on ARM64 (v2)") Signed-off-by: Nathan Chancellor <natechancellor@gmail.com> Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com> Signed-off-by: Felix Kuehling <Felix.Kuehling@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2019-02-05 18:10:28 -05:00
Felix Kuehling	bbdf514fe5	drm/amdkfd: Don't assign dGPUs to APU topology devices dGPUs need their own topology devices. Don't assign them to APU topology devices with CPU cores. Bug: https://github.com/RadeonOpenCompute/ROCK-Kernel-Driver/issues/66 Signed-off-by: Felix Kuehling <Felix.Kuehling@amd.com> Tested-by: Elias Konstantinidis <ekondis@gmail.com> Acked-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2019-01-14 15:59:50 -05:00
Felix Kuehling	d1c234e2cd	drm/amdkfd: Allow building KFD on ARM64 (v2) ifdef x86_64 specific code. Allow enabling CONFIG_HSA_AMD on ARM64. v2: Fixed a compiler warning due to an unused variable CC: Mark Nutter <Mark.Nutter@arm.com> Signed-off-by: Felix Kuehling <Felix.Kuehling@amd.com> Tested-by: Mark Nutter <Mark.Nutter@arm.com> Acked-by: Alex Deucher <alexander.deucher@amd.com> Acked-by: Christian König <christian.koenig@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2019-01-14 15:59:37 -05:00
Felix Kuehling	b8fe05247d	drm/amdkfd: Don't assign dGPUs to APU topology devices dGPUs need their own topology devices. Don't assign them to APU topology devices with CPU cores. Bug: https://github.com/RadeonOpenCompute/ROCK-Kernel-Driver/issues/66 Signed-off-by: Felix Kuehling <Felix.Kuehling@amd.com> Tested-by: Elias Konstantinidis <ekondis@gmail.com> Acked-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2019-01-14 15:37:33 -05:00
Felix Kuehling	df1dd4f4a7	drm/amdkfd: Allow building KFD on ARM64 (v2) ifdef x86_64 specific code. Allow enabling CONFIG_HSA_AMD on ARM64. v2: Fixed a compiler warning due to an unused variable CC: Mark Nutter <Mark.Nutter@arm.com> Signed-off-by: Felix Kuehling <Felix.Kuehling@amd.com> Tested-by: Mark Nutter <Mark.Nutter@arm.com> Acked-by: Alex Deucher <alexander.deucher@amd.com> Acked-by: Christian König <christian.koenig@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2019-01-14 15:37:15 -05:00
Amber Lin	308176d6f6	drm/amdgpu: Remove kgd2kfd function pointers kgd2kfd function pointers and global kgd2kfd pointer are no longer in use. Signed-off-by: Amber Lin <Amber.Lin@amd.com> Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com> Acked-by: Christian König <christian.koenig@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2019-01-14 15:04:29 -05:00
Amber Lin	2d3d25b616	drm/amdgpu: Relocate kgd2kfd function declaration Since amdkfd is merged into amdgpu module and amdgpu can access amdkfd directly, move declaration of kgd2kfd functions from kfd_priv.h to amdgpu_amdkfd.h Signed-off-by: Amber Lin <Amber.Lin@amd.com> Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com> Acked-by: Christian König <christian.koenig@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2019-01-14 15:04:28 -05:00
Linus Torvalds	0fe4e2d5cd	drm i915 gvt, amdgpu, core fixes Merge tag 'drm-next-2019-01-05' of git://anongit.freedesktop.org/drm/drm Pull drm fixes from Dave Airlie: "Happy New Year, just decloaking from leave to get some stuff from the last week in before rc1: core: - two regression fixes for damage blob and atomic i915 gvt: - Some missed GVT fixes from the original pull amdgpu: - new PCI IDs - SR-IOV fixes - DC fixes - Vega20 fixes" * tag 'drm-next-2019-01-05' of git://anongit.freedesktop.org/drm/drm: (53 commits) drm: Put damage blob when destroy plane state drm: fix null pointer dereference on null state pointer drm/amdgpu: Add new VegaM pci id drm/ttm: Use drm_debug_printer for all ttm_bo_mem_space_debug output drm/amdgpu: add Vega20 PSP ASD firmware loading drm/amd/display: Fix MST dp_blank REG_WAIT timeout drm/amd/display: validate extended dongle caps drm/amd/display: Use div_u64 for flip timestamp ns to ms drm/amdgpu/uvd:Change uvd ring name convention drm/amd/powerplay: add Vega20 LCLK DPM level setting support drm/amdgpu: print process info when job timeout drm/amdgpu/nbio7.4: add hw bug workaround for vega20 drm/amdgpu/nbio6.1: add hw bug workaround for vega10/12 drm/amd/display: Optimize passive update planes. drm/amd/display: verify lane status before exiting verify link cap drm/amd/display: Fix bug with not updating VSP infoframe drm/amd/display: Add retry to read ddc_clock pin drm/amd/display: Don't skip link training for empty dongle drm/amd/display: Wait edp HPD to high in detect_sink drm/amd/display: fix surface update sequence ...	2019-01-05 18:25:19 -08:00
Linus Torvalds	96d4f267e4	Remove 'type' argument from access_ok() function Nobody has actually used the type (VERIFY_READ vs VERIFY_WRITE) argument of the user address range verification function since we got rid of the old racy i386-only code to walk page tables by hand. It existed because the original 80386 would not honor the write protect bit when in kernel mode, so you had to do COW by hand before doing any user access. But we haven't supported that in a long time, and these days the 'type' argument is a purely historical artifact. A discussion about extending 'user_access_begin()' to do the range checking resulted this patch, because there is no way we're going to move the old VERIFY_xyz interface to that model. And it's best done at the end of the merge window when I've done most of my merges, so let's just get this done once and for all. This patch was mostly done with a sed-script, with manual fix-ups for the cases that weren't of the trivial 'access_ok(VERIFY_xyz' form. There were a couple of notable cases: - csky still had the old "verify_area()" name as an alias. - the iter_iov code had magical hardcoded knowledge of the actual values of VERIFY_{READ,WRITE} (not that they mattered, since nothing really used it) - microblaze used the type argument for a debug printout but other than those oddities this should be a total no-op patch. I tried to fix up all architectures, did fairly extensive grepping for access_ok() uses, and the changes are trivial, but I may have missed something. Any missed conversion should be trivially fixable, though. Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2019-01-03 18:57:57 -08:00
Arun KS	9705bea5f8	mm: convert zone->managed_pages to atomic variable totalram_pages, zone->managed_pages and totalhigh_pages updates are protected by managed_page_count_lock, but readers never care about it. Convert these variables to atomic to avoid readers potentially seeing a store tear. This patch converts zone->managed_pages. Subsequent patches will convert totalram_pages, totalhigh_pages and eventually managed_page_count_lock will be removed. Main motivation was that managed_page_count_lock handling was complicating things. It was discussed at length here: https://lore.kernel.org/patchwork/patch/995739/#1181785 So it seems better to remove the lock and convert the variables to atomic, with preventing potential store-to-read tearing as a bonus. Link: http://lkml.kernel.org/r/1542090790-21750-3-git-send-email-arunks@codeaurora.org Signed-off-by: Arun KS <arunks@codeaurora.org> Suggested-by: Michal Hocko <mhocko@suse.com> Suggested-by: Vlastimil Babka <vbabka@suse.cz> Reviewed-by: Konstantin Khlebnikov <khlebnikov@yandex-team.ru> Reviewed-by: David Hildenbrand <david@redhat.com> Acked-by: Michal Hocko <mhocko@suse.com> Acked-by: Vlastimil Babka <vbabka@suse.cz> Reviewed-by: Pavel Tatashin <pasha.tatashin@soleen.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2018-12-28 12:11:47 -08:00
Linus Torvalds	4971f090aa	drm pull request for 4.21-rc1 Merge tag 'drm-next-2018-12-14' of git://anongit.freedesktop.org/drm/drm Pull drm updates from Dave Airlie: "Core: - shared fencing staging removal - drop transactional atomic helpers and move helpers to new location - DP/MST atomic cleanup - Leasing cleanups and drop EXPORT_SYMBOL - Convert drivers to atomic helpers and generic fbdev. - removed deprecated obj_ref/unref in favour of get/put - Improve dumb callback documentation - MODESET_LOCK_BEGIN/END helpers panels: - CDTech panels, Banana Pi Panel, DLC1010GIG, - Olimex LCD-O-LinuXino, Samsung S6D16D0, Truly NT35597 WQXGA, - Himax HX8357D, simulated RTSM AEMv8. - GPD Win2 panel - AUO G101EVN010 vgem: - render node support ttm: - move global init out of drivers - fix LRU handling for ghost objects - Support for simultaneous submissions to multiple engines scheduler: - timeout/fault handling changes to help GPU recovery - helpers for hw with preemption support i915: - Scaler/Watermark fixes - DP MST + powerwell fixes - PSR fixes - Break long get/put shmemfs pages - Icelake fixes - Icelake DSI video mode enablement - Engine workaround improvements amdgpu: - freesync support - GPU reset enabled on CI, VI, SOC15 dGPUs - ABM support in DC - KFD support for vega12/polaris12 - SDMA paging queue on vega - More amdkfd code sharing - DCC scanout on GFX9 - DC kerneldoc - Updated SMU firmware for GFX8 chips - XGMI PSP + hive reset support - GPU reset - DC trace support - Powerplay updates for newer Polaris - Cursor plane update fast path - kfd dma-buf support virtio-gpu: - add EDID support vmwgfx: - pageflip with damage support nouveau: - Initial Turing TU104/TU106 modesetting support msm: - a2xx gpu support for apq8060 and imx5 - a2xx gpummu support - mdp4 display support for apq8060 - DPU fixes and cleanups - enhanced profiling support - debug object naming interface - get_iova/page pinning decoupling tegra: - Tegra194 host1x, VIC and display support enabled - Audio over HDMI for Tegra186 and Tegra194 exynos: - DMA/IOMMU refactoring - plane alpha + blend mode support - Color format fixes for mixer driver rcar-du: - R8A7744 and R8A77470 support - R8A77965 LVDS support imx: - fbdev emulation fix - multi-tiled scalling fixes - SPDX identifiers rockchip - dw_hdmi support - dw-mipi-dsi + dual dsi support - mailbox read size fix qxl: - fix cursor pinning vc4: - YUV support (scaling + cursor) v3d: - enable TFU (Texture Formatting Unit) mali-dp: - add support for linear tiled formats sun4i: - Display Engine 3 support - H6 DE3 mixer 0 support - H6 display engine support - dw-hdmi support - H6 HDMI phy support - implicit fence waiting - BGRX8888 support meson: - Overlay plane support - implicit fence waiting - HDMI 1.4 4k modes bridge: - i2c fixes for sii902x" * tag 'drm-next-2018-12-14' of git://anongit.freedesktop.org/drm/drm: (1403 commits) drm/amd/display: Add fast path for cursor plane updates drm/amdgpu: Enable GPU recovery by default for CI drm/amd/display: Fix duplicating scaling/underscan connector state drm/amd/display: Fix unintialized max_bpc state values Revert "drm/amd/display: Set RMX_ASPECT as default" drm/amdgpu: Fix stub function name drm/msm/dpu: Fix clock issue after bind failure drm/msm/dpu: Clean up dpu_media_info.h static inline functions drm/msm/dpu: Further cleanups for static inline functions drm/msm/dpu: Cleanup the debugfs functions drm/msm/dpu: Remove dpu_irq and unused functions drm/msm: Make irq_postinstall optional drm/msm/dpu: Cleanup callers of dpu_hw_blk_init drm/msm/dpu: Remove unused functions drm/msm/dpu: Remove dpu_crtc_is_enabled() drm/msm/dpu: Remove dpu_crtc_get_mixer_height drm/msm/dpu: Remove dpu_dbg drm/msm: dpu: Remove crtc_lock drm/msm: dpu: Remove vblank_requested flag from dpu_crtc drm/msm: dpu: Separate crtc assignment from vblank enable ...	2018-12-25 11:48:26 -08:00
Felix Kuehling	e98bdb8061	drm/amdkfd: Fix handling of return code of dma_buf_get On errors, dma_buf_get returns a negative error code, rather than NULL. Reported-by: Dan Carpenter <dan.carpenter@oracle.com> Signed-off-by: Felix Kuehling <Felix.Kuehling@amd.com> Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2018-12-18 17:39:11 -05:00
Alex Deucher	9bd206f89f	drm/amdkfd: add new vega20 pci id New vega20 id. Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2018-12-10 15:27:45 -05:00
Alex Deucher	756e16bf79	drm/amdkfd: add new vega10 pci ids New vega10 ids. Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> Cc: stable@vger.kernel.org	2018-12-10 15:27:32 -05:00
Felix Kuehling	b408a54884	drm/amdkfd: Add support for doorbell BOs This allows user mode to map doorbell pages into GPUVM address space. That way GPUs can submit to user mode queues (self-dispatch). Acked-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Felix Kuehling <Felix.Kuehling@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2018-12-07 18:14:00 -05:00
Felix Kuehling	1dde0ea95b	drm/amdkfd: Add DMABuf import functionality This is used for interoperability between ROCm compute and graphics APIs. It allows importing graphics driver BOs into the ROCm SVM address space for zero-copy GPU access. The API is split into two steps (query and import) to allow user mode to manage the virtual address space allocation for the imported buffer. Acked-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Felix Kuehling <Felix.Kuehling@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2018-12-07 18:13:54 -05:00
Felix Kuehling	3704d56e1a	drm/amdkfd: Add NULL-pointer check top_dev->gpu is NULL for CPUs. Avoid dereferencing it if NULL. Acked-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Felix Kuehling <Felix.Kuehling@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2018-12-07 18:13:48 -05:00
Brajeswar Ghosh	b8b3ede2de	drm/amd/amdkfd: Remove duplicate header Remove gca/gfx_8_0_enum.h which is included more than once Signed-off-by: Brajeswar Ghosh <brajeswar.linux@gmail.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2018-11-26 15:54:39 -05:00
Yong Zhao	a53a11a835	drm/amdkfd: Workaround PASID missing in gfx9 interrupt payload under non HWS This is a known gfx9 HW issue, and this change can perfectly workaround the issue. Signed-off-by: Yong Zhao <Yong.Zhao@amd.com> Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com> Signed-off-by: Felix Kuehling <Felix.Kuehling@amd.com> Acked-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2018-11-19 16:38:14 -05:00
Yong Zhao	00557f4131	drm/amdkfd: Adjust the debug message in KFD ISR This makes debug message get printed even when there is early return. Signed-off-by: Yong Zhao <Yong.Zhao@amd.com> Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com> Signed-off-by: Felix Kuehling <Felix.Kuehling@amd.com> Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2018-11-19 16:38:14 -05:00
Gang Ba	846a44d7e9	drm/amdkfd: Added Vega12 and Polaris12 for KFD. Add Vega12 and Polaris12 device info and device IDs to KFD. Signed-off-by: Gang Ba <gaba@amd.com> Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com> Signed-off-by: Felix Kuehling <Felix.Kuehling@amd.com> Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2018-11-19 16:38:13 -05:00
Yong Zhao	4e6c6fc19d	drm/amdkfd: Replace mqd with mqd_mgr as the variable name for mqd_manager This will make reading code much easier. This fixes a few spots missed in a previous commit with the same title. Signed-off-by: Yong Zhao <yong.zhao@amd.com> Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com> Signed-off-by: Felix Kuehling <Felix.Kuehling@amd.com> Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2018-11-19 16:38:13 -05:00
Christian König	2383a767c0	drm/amdkfd: fix interrupt spin lock Vega10 has multiple interrupt rings, so this can be called from multiple callers at the same time resulting in: [ 71.779334] ================================ [ 71.779406] WARNING: inconsistent lock state [ 71.779478] 4.19.0-rc1+ #44 Tainted: G W [ 71.779565] -------------------------------- [ 71.779637] inconsistent {IN-HARDIRQ-W} -> {HARDIRQ-ON-W} usage. [ 71.779740] kworker/6:1/120 [HC0[0]:SC0[0]:HE1:SE1] takes: [ 71.779832] 00000000ad761971 (&(&kfd->interrupt_lock)->rlock){?...}, at: kgd2kfd_interrupt+0x75/0x100 [amdgpu] [ 71.780058] {IN-HARDIRQ-W} state was registered at: [ 71.780115] _raw_spin_lock+0x2c/0x40 [ 71.780180] kgd2kfd_interrupt+0x75/0x100 [amdgpu] [ 71.780248] amdgpu_irq_callback+0x6c/0x150 [amdgpu] [ 71.780315] amdgpu_ih_process+0x88/0x100 [amdgpu] [ 71.780380] amdgpu_irq_handler+0x20/0x40 [amdgpu] [ 71.780409] __handle_irq_event_percpu+0x49/0x2a0 [ 71.780436] handle_irq_event_percpu+0x30/0x70 [ 71.780461] handle_irq_event+0x37/0x60 [ 71.780484] handle_edge_irq+0x83/0x1b0 [ 71.780506] handle_irq+0x1f/0x30 [ 71.780526] do_IRQ+0x53/0x110 [ 71.780544] ret_from_intr+0x0/0x22 [ 71.780566] cpuidle_enter_state+0xaa/0x330 [ 71.780591] do_idle+0x203/0x280 [ 71.780610] cpu_startup_entry+0x6f/0x80 [ 71.780634] start_secondary+0x1b0/0x200 [ 71.780657] secondary_startup_64+0xa4/0xb0 Fix this by always using irq save spin locks. Signed-off-by: Christian König <christian.koenig@amd.com> Acked-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2018-11-05 15:49:38 -05:00
Yong Zhao	435e2f9709	drm/amdkfd: page_table_base already have the flags needed The flags are added when calling amdgpu_gmc_pd_addr(). Signed-off-by: Yong Zhao <Yong.Zhao@amd.com> Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2018-11-05 14:21:14 -05:00
Yong Zhao	deb99d7c4f	drm/amdkfd: Delete a duplicate statement in set_pasid_vmid_mapping() The same statement is later done in kgd_set_pasid_vmid_mapping(), so no need to do it in set_pasid_vmid_mapping(). Signed-off-by: Yong Zhao <Yong.Zhao@amd.com> Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2018-11-05 14:21:13 -05:00
Amber Lin	7cd52c917a	drm/amdkfd: Add proper prefix to functions Add amdgpu_amdkfd_ prefix to amdgpu functions served for amdkfd usage. v2: fix indentation Signed-off-by: Amber Lin <Amber.Lin@amd.com> Acked-by: Christian König <christian.koenig@amd.com> Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2018-11-05 14:21:08 -05:00
Amber Lin	5b87245faf	drm/amdkfd: Simplify kfd2kgd interface After amdkfd module is merged into amdgpu, KFD can call amdgpu directly and no longer needs to use the function pointer. Replace those function pointers with functions if they are not ASIC dependent. Signed-off-by: Amber Lin <Amber.Lin@amd.com> Acked-by: Christian König <christian.koenig@amd.com> Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2018-11-05 14:21:07 -05:00
zhong jiang	6dfeb11a4b	drm/amdkfd: Use kmemdup instead of duplicating its function kmemdup() already implements kmalloc() + memcpy(). Prefer kmemdup() over an open-coded implementation. Signed-off-by: zhong jiang <zhongjiang@huawei.com> Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com> Signed-off-by: Felix Kuehling <Felix.Kuehling@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2018-11-05 14:20:36 -05:00
YueHaibing	ae5c59a83b	drm/amdkfd: Remove set but not used variable 'preempt_all_queues' Fixes gcc '-Wunused-but-set-variable' warning: drivers/gpu/drm/amd/amdkfd/kfd_device_queue_manager.c: In function 'destroy_queue_cpsch': drivers/gpu/drm/amd/amdkfd/kfd_device_queue_manager.c:1366:7: warning: variable 'preempt_all_queues' set but not used [-Wunused-but-set-variable] It has never been used since its introduction in commit `992839ad64` ("drm/amdkfd: Add static user-mode queues support") Signed-off-by: YueHaibing <yuehaibing@huawei.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2018-10-10 14:49:42 -05:00
Felix Kuehling	1b19aa5aa8	drm/amdkfd: Fix incorrect use of process->mm This mm_struct pointer should never be dereferenced. If running in a user thread, just use current->mm. If running in a kernel worker use get_task_mm to get a safe reference to the mm_struct. Signed-off-by: Felix Kuehling <Felix.Kuehling@amd.com> Reviewed-by: Oded Gabbay <oded.gabbay@gmail.com> Acked-by: Christian König <christian.koenig@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2018-10-09 17:06:19 -05:00
Shaoyun Liu	006a0b3d86	drm/amdkfd: Remove the requirement for atomic Ops on vg20 Firmware has a workaround that replaces atomic ops with read-modify-write on the CP side. Users should not expect atomic ops on system memory to work normally if the system does not support them. Signed-off-by: Shaoyun Liu <Shaoyun.Liu@amd.com> Acked-by: Alex Deucher <alexander.deucher@amd.com> Reviewed-By: Felix Kuehling <Felix.Kuehling@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2018-09-27 09:39:57 -05:00
Shaoyun Liu	22a3a2941b	drm/amdkfd: Vega20 bring up on amdkfd side Add Vega20 device IDs, device info and enable it in KFD. Signed-off-by: Shaoyun Liu <Shaoyun.Liu@amd.com> Acked-by: Alex Deucher <alexander.deucher@amd.com> Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Felix Kuehling <Felix.Kuehling@amd.com>	2018-09-26 21:09:18 -05:00
Shaoyun Liu	e715c6d0ea	drm/amd: Interface change to support 64 bit page_table_base amdgpu_gpuvm_get_process_page_dir should return the page table address in the format expected by the pm4_map_process packet for all ASIC generations. Signed-off-by: Shaoyun Liu <Shaoyun.Liu@amd.com> Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Felix Kuehling <Felix.Kuehling@amd.com> Acked-by: Alex Deucher <alexander.deucher@amd.com>	2018-09-26 21:09:17 -05:00
Shaoyun Liu	d50941892e	drm/amdkfd: Make the number of SDMA queues variable Vega20 supports 8 SDMA queues per engine Signed-off-by: Shaoyun Liu <Shaoyun.Liu@amd.com> Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Felix Kuehling <Felix.Kuehling@amd.com> Acked-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2018-09-26 21:09:16 -05:00
Jay Cornwall	5df099e8bc	drm/amdkfd: Add wavefront context save state retrieval ioctl Wavefront context save data is of interest to userspace clients for debugging static wavefront state. The MQD contains two parameters required to parse the control stack and the control stack itself is kept in the MQD from gfx9 onwards. Add an ioctl to fetch the context save area and control stack offsets and to copy the control stack to a userspace address if it is kept in the MQD. Signed-off-by: Jay Cornwall <Jay.Cornwall@amd.com> Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com> Signed-off-by: Felix Kuehling <Felix.Kuehling@amd.com> Acked-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2018-09-26 21:09:15 -05:00
Felix Kuehling	5ade6c9c35	drm/amdkfd: Report SDMA firmware version in the topology Also save the version in struct kfd_dev so we only need to query it once. Signed-off-by: Philip Yang <Philip.Yang@amd.com> Signed-off-by: Felix Kuehling <Felix.Kuehling@amd.com> Acked-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2018-09-26 21:09:14 -05:00
Emily Deng	6d12aa8741	drm/amdkfd: KFD doesn't support TONGA SRIOV The KFD module doesn't support TONGA SRIOV; initializing the KFD module in a TONGA SRIOV environment causes the compute ring IB test to fail. Signed-off-by: Emily Deng <Emily.Deng@amd.com> Reviewed-by: Shaoyun.liu <Shaoyun.liu@amd.com> Signed-off-by: Felix Kuehling <Felix.Kuehling@amd.com> Acked-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2018-09-26 21:09:14 -05:00
Eric Huang	d35f00d8ec	drm/amdkfd: reflect atomic support in IO link properties Add the flags of properties according to Asic type and pcie capabilities. Signed-off-by: Eric Huang <JinHuiEric.Huang@amd.com> Signed-off-by: Felix Kuehling <Felix.Kuehling@amd.com> Acked-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2018-09-26 21:09:13 -05:00
Dave Airlie	bf78296ab1	This is the 4.19-rc5 stable release BackMerge v4.19-rc5 into drm-next Sean Paul requested an -rc5 backmerge from some sun4i fixes. Signed-off-by: Dave Airlie <airlied@redhat.com>	2018-09-27 11:06:46 +10:00
Yong Zhao	44d8cc6f1a	drm/amdkfd: Fix ATS capability was not reported correctly on some APUs Because CRAT_CU_FLAGS_IOMMU_PRESENT was not set in some BIOS crat, we need to work around this. For future compatibility, we also overwrite the bit in capability according to the value of needs_iommu_device. Acked-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Yong Zhao <Yong.Zhao@amd.com> Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com> Signed-off-by: Felix Kuehling <Felix.Kuehling@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2018-09-20 10:25:23 -05:00
Yong Zhao	15426dbb65	drm/amdkfd: Change the control stack MTYPE from UC to NC on GFX9 CWSR fails on Raven if the control stack is MTYPE_UC, which is used for regular GART mappings. As a workaround we map it using MTYPE_NC. The MEC firmware expects the control stack at one page offset from the start of the MQD so it is part of the MQD allocation on GFXv9. AMDGPU added a memory allocation flag just for this purpose. Acked-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Yong Zhao <yong.zhao@amd.com> Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com> Signed-off-by: Felix Kuehling <Felix.Kuehling@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2018-09-20 10:25:17 -05:00
shaoyunl	67f7cf9f76	drm/amdkfd: Only add bi-directional iolink on GPU with XGMI or largebar (v2) v2: compile fix Signed-off-by: shaoyunl <Shaoyun.Liu@amd.com> Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2018-09-10 22:49:33 -05:00
Shaoyun Liu	ae9a25aea7	drm/amdkfd: Generate xGMI direct iolink Generate xGMI iolink for upper level usage Signed-off-by: Shaoyun Liu <Shaoyun.Liu@amd.com> Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com> Acked-by: Christian König <christian.koenig@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2018-09-10 22:49:00 -05:00
Shaoyun Liu	aa64ca38ed	drm/amdkfd: Add new iolink type defines Update the iolink type defines according to the new thunk spec Signed-off-by: Shaoyun Liu <Shaoyun.Liu@amd.com> Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com> Acked-by: Christian König <christian.koenig@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2018-09-10 22:48:51 -05:00
Shaoyun Liu	0c1690e38b	drm/amdkfd: kfd expose the hive_id of the device through its node properties Thunk will generate the XGMI topology information when necessary with the hive_id for each specified device Signed-off-by: Shaoyun Liu <Shaoyun.Liu@amd.com> Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com> Acked-by: Christian König <christian.koenig@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2018-09-10 22:48:43 -05:00
Amber Lin	2690262ec9	drm/amdgpu: Relocate some definitions v2 Move some KFD-related (but used in amdgpu_drv.c) definitions from kfd_priv.h to kgd_kfd_interface.h so we don't need to include kfd_priv.h in amdgpu_drv.c. This fixes a build failure when AMDGPU is enabled but MMU_NOTIFIER is not. This patch also disables KFD-related module options when HSA_AMD is not enabled. v2: rebase (Alex) Signed-off-by: Amber Lin <Amber.Lin@amd.com> Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2018-08-29 12:41:50 -05:00
Oak Zeng	bf47afbabf	drm/amdkfd: Release an acquired process vm For compute vm acquired from amdgpu, vm.pasid is managed by kfd. Decouple pasid from such vm on process destroy to avoid duplicate pasid release. Signed-off-by: Oak Zeng <Oak.Zeng@amd.com> Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2018-08-29 12:35:00 -05:00
Oak Zeng	1685b01a85	drm/amdgpu: Set pasid for compute vm (v2) To turn an amdgpu vm into a compute vm, the old pasid is freed and replaced with a pasid managed by kfd. Kfd can't reuse the original pasid allocated by amdgpu because kfd uses a different pasid policy from amdgpu. For example, all graphics devices share the same pasid in a process. v2: rebase (Alex) Signed-off-by: Oak Zeng <Oak.Zeng@amd.com> Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2018-08-29 12:34:49 -05:00
Amber Lin	521fb7d021	drm/amdgpu: Move KFD parameters to amdgpu (v3) After merging KFD into amdgpu, move module parameters defined in KFD to amdgpu_drv.c, where other module parameters are declared. v2: add kernel-doc comments v3: rebase and fix parameter variable name (Alex) Signed-off-by: Amber Lin <Amber.Lin@amd.com> Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2018-08-28 11:51:11 -05:00
Amber Lin	04d5e27658	drm/amdgpu: Merge amdkfd into amdgpu Since KFD is only supported by single GPU driver, it makes sense to merge amdgpu and amdkfd into one module. This patch is the initial step: merge Kconfig and Makefile. v2: also remove kfd from drm Kconfig Signed-off-by: Amber Lin <Amber.Lin@amd.com> Acked-by: Christian König <christian.koenig@amd.com> Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2018-08-28 11:22:42 -05:00
Felix Kuehling	b5aa3f4aef	drm/amdkfd: Call kfd2kgd.set_compute_idle User mode queue submissions don't go through KFD. Therefore we don't know exactly when compute is idle or not idle. We use the existence of user mode queues on a device as an approximation. register_process is called when the first queue of a process is created. Conversely unregister_process is called when the last queue is destroyed. The first process that is registered takes compute out of idle. The last process that is unregistered sets compute back to idle. Signed-off-by: Felix Kuehling <Felix.Kuehling@amd.com> Reviewed-by: Eric Huang <JinHuiEric.Huang@amd.com> Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com>	2018-07-16 19:10:37 -04:00
Felix Kuehling	39e7f33186	drm/amdkfd: Add CU-masking ioctl to KFD CU-masking allows a KFD client to control the set of CUs used by a user mode queue for executing compute dispatches. This can be used for optimizing the partitioning of the GPU and minimize conflicts between concurrent tasks. Signed-off-by: Flora Cui <flora.cui@amd.com> Signed-off-by: Kent Russell <kent.russell@amd.com> Signed-off-by: Eric Huang <JinHuiEric.Huang@amd.com> Signed-off-by: Felix Kuehling <Felix.Kuehling@amd.com> Acked-by: Oded Gabbay <oded.gabbay@gmail.com> Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com>	2018-07-14 19:05:59 -04:00
Yong Zhao	4d663df658	drm/amdkfd: Enable Raven for KFD Add DID and kfd_device_info for Raven. Signed-off-by: Yong Zhao <Yong.Zhao@amd.com> Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com> Signed-off-by: Felix Kuehling <Felix.Kuehling@amd.com> Acked-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com>	2018-07-13 16:17:48 -04:00
Yong Zhao	359cecdd49	drm/amdkfd: Optimize out some duplicated code in kfd_signal_iommu_event() memory_exception_data is already initialized for not-present faults. It only needs to be overridden for permission faults. Signed-off-by: Yong Zhao <yong.zhao@amd.com> Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com> Signed-off-by: Felix Kuehling <Felix.Kuehling@amd.com> Acked-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com>	2018-07-13 16:17:47 -04:00
Yong Zhao	8725aecac3	drm/amdkfd: Workaround to accommodate Raven too many PPR issue On Raven multiple PPRs can be queued up by the hardware. When the first of those requests is handled by the IOMMU driver, the memory access succeeds. After that the application may be done with the memory and unmap it. At that point the page table entries are invalidated, but there are still outstanding duplicate PPRs for those addresses. When the IOMMU driver processes those duplicate requests, it finds invalid page table entries and triggers an invalid PPR fault. As a workaround, don't signal invalid PPR faults on Raven to avoid segfaulting applications that haven't done anything wrong. As a side effect, real GPU memory access faults may go unnoticed by the application. Signed-off-by: Yong Zhao <yong.zhao@amd.com> Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com> Signed-off-by: Felix Kuehling <Felix.Kuehling@amd.com> Acked-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com>	2018-07-13 16:17:46 -04:00
Yong Zhao	eab69801cf	drm/amdkfd: Avoid flooding dmesg on Raven due to IOMMU issues On Raven Invalid PPRs (peripheral page requests) can be reported because multiple PPRs can be still queued when memory is freed. Apply a rate limit to avoid flooding the log in this case. Signed-off-by: Yong Zhao <yong.zhao@amd.com> Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com> Signed-off-by: Felix Kuehling <Felix.Kuehling@amd.com> Acked-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com>	2018-07-13 16:17:45 -04:00
Yong Zhao	98bb92222e	drm/amdkfd: Make SDMA engine number an ASIC-dependent variable On Raven there is only one SDMA engine instead of previously assumed two, so we need to adapt our code to this new scenario. Signed-off-by: Yong Zhao <yong.zhao@amd.com> Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com> Signed-off-by: Felix Kuehling <Felix.Kuehling@amd.com> Acked-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com>	2018-07-13 16:17:44 -04:00
Yong Zhao	f3ed5df84c	drm/amdkfd: Consolidate duplicate memory banks info in topology If there are several memory banks that have the same properties in CRAT, we aggregate them into one memory bank. This cleans up memory banks on APUs (e.g. Raven) where the CRAT reports each memory channel as a separate bank. This only confuses user mode, which only deals with virtual memory. Signed-off-by: Yong Zhao <yong.zhao@amd.com> Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com> Signed-off-by: Felix Kuehling <Felix.Kuehling@amd.com> Acked-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com>	2018-07-13 16:17:43 -04:00
Yong Zhao	e7016d8e6f	drm/amdkfd: Clean up reference of radeon Signed-off-by: Yong Zhao <yong.zhao@amd.com> Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com> Signed-off-by: Felix Kuehling <Felix.Kuehling@amd.com> Acked-by: Christian König <christian.koenig@amd.com> Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com>	2018-07-11 22:33:08 -04:00
Yong Zhao	8d5f355290	drm/amdkfd: Replace mqd with mqd_mgr as the variable name for mqd_manager This will make reading code much easier. Signed-off-by: Yong Zhao <yong.zhao@amd.com> Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com> Signed-off-by: Felix Kuehling <Felix.Kuehling@amd.com> Acked-by: Christian König <christian.koenig@amd.com> Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com>	2018-07-11 22:33:07 -04:00
Yong Zhao	2b281977f5	drm/amdkfd: Use module parameters noretry as the internal variable name This makes all module parameters use the same form. Meanwhile clean up the surrounding code. Signed-off-by: Yong Zhao <yong.zhao@amd.com> Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com> Signed-off-by: Felix Kuehling <Felix.Kuehling@amd.com> Acked-by: Christian König <christian.koenig@amd.com> Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com>	2018-07-11 22:33:06 -04:00
Yong Zhao	0e9a860c72	drm/amdkfd: Introduce KFD module parameter halt_if_hws_hang This avoids triggering a GPU reset or otherwise changing the HW state. Instead KFD will hang, which allows HW debugging tools to analyze the problem. Signed-off-by: Yong Zhao <yong.zhao@amd.com> Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com> Signed-off-by: Felix Kuehling <Felix.Kuehling@amd.com> Acked-by: Christian König <christian.koenig@amd.com> Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com>	2018-07-11 22:33:05 -04:00
Shaoyun Liu	a29ec470b1	drm/amdkfd: Add debugfs interface to trigger HWS hang Signed-off-by: Shaoyun Liu <Shaoyun.Liu@amd.com> Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com> Signed-off-by: Felix Kuehling <Felix.Kuehling@amd.com> Acked-by: Christian König <christian.koenig@amd.com> Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com>	2018-07-11 22:33:04 -04:00
Shaoyun Liu	951df6d9cf	drm/amdkfd: Fix kernel queue 64 bit doorbell offset calculation The bitmap index calculation should reverse the logic used on allocation so it will clear the same bit used on allocation Signed-off-by: Shaoyun Liu <Shaoyun.Liu@amd.com> Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com> Signed-off-by: Felix Kuehling <Felix.Kuehling@amd.com> Acked-by: Christian König <christian.koenig@amd.com> Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com>	2018-07-11 22:33:01 -04:00
Shaoyun Liu	73ea648d92	drm/amdkfd: Implement hang detection in KFD and call amdgpu The reset will be performed in a new hw_exception work thread to handle HWS hang without blocking the thread that detected the hang. Signed-off-by: Shaoyun Liu <Shaoyun.Liu@amd.com> Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com> Signed-off-by: Felix Kuehling <Felix.Kuehling@amd.com> Acked-by: Christian König <christian.koenig@amd.com> Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com>	2018-07-11 22:32:58 -04:00
Shaoyun Liu	e42051d213	drm/amdkfd: Implement GPU reset handlers in KFD Lock KFD and evict existing queues on reset. Notify user mode by signaling hw_exception events. Signed-off-by: Shaoyun Liu <Shaoyun.Liu@amd.com> Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com> Signed-off-by: Felix Kuehling <Felix.Kuehling@amd.com> Acked-by: Christian König <christian.koenig@amd.com> Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com>	2018-07-11 22:32:56 -04:00
Shaoyun Liu	e3b7a96774	drm/amdkfd: Add gpu reset interface and place holder Signed-off-by: Shaoyun Liu <Shaoyun.Liu@amd.com> Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com> Signed-off-by: Felix Kuehling <Felix.Kuehling@amd.com> Acked-by: Christian König <christian.koenig@amd.com> Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com>	2018-07-11 22:32:54 -04:00
Lan Xiao	58e6988612	drm/amdkfd: fix zero reading of VMID and PASID for Hawaii Upon VM Fault, the VMID and PASID written by HW are zeros in Hawaii. Instead of reading from ih_ring_entry, read directly from the registers. This workaround fixes the soft hang issues caused by mishandled VM Faults in Hawaii. Signed-off-by: Lan Xiao <Lan.Xiao@amd.com> Signed-off-by: Felix Kuehling <Felix.Kuehling@amd.com> Acked-by: Christian König <christian.koenig@amd.com> Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com>	2018-07-11 22:32:51 -04:00
shaoyunl	2640c3facb	drm/amdkfd: Handle VM faults in KFD 1. Pre-GFX9 the amdgpu ISR saves the vm-fault status and address per vmid. amdkfd needs to get the information from amdgpu through the new get_vm_fault_info interface. On GFX9 and later, all the required information is in the IH ring. 2. amdkfd unmaps all queues from the faulting process and creates a new run-list without the guilty process. 3. amdkfd notifies the runtime of the vm fault trap via EVENT_TYPE_MEMORY. Signed-off-by: shaoyun liu <shaoyun.liu@amd.com> Signed-off-by: Felix Kuehling <Felix.Kuehling@amd.com> Acked-by: Christian König <christian.koenig@amd.com> Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com>	2018-07-11 22:32:50 -04:00
Moses Reuben	101fee63cb	drm/amdkfd: send SIGSEGV to process upon KFD_EVENT_TYPE_MEMORY Signed-off-by: Moses Reuben <moses.reuben@amd.com> Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com> Signed-off-by: Felix Kuehling <Felix.Kuehling@amd.com> Acked-by: Christian König <christian.koenig@amd.com> Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com>	2018-07-11 22:32:48 -04:00
Wei Lu	e47cb828eb	drm/amdkfd: Fix error codes in kfd_get_process Return ERR_PTR(-EINVAL) if kfd_get_process fails to find the process. This fixes kernel oopses when a child process calls KFD ioctls with a file descriptor inherited from the parent process. Signed-off-by: Wei Lu <wei.lu2@amd.com> Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com> Signed-off-by: Felix Kuehling <Felix.Kuehling@amd.com> Acked-by: Christian König <christian.koenig@amd.com> Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com>	2018-07-11 22:32:47 -04:00
Jay Cornwall	a60d811b2b	drm/amdkfd: Fix race between scheduler and context restore The scheduler may raise SQ_WAVE_STATUS.SPI_PRIO via SQ_CMD before context restore has completed. Restoring SPI_PRIO=0 after this point may cause context save to fail as the lower priority wavefronts are not selected for execution among spin-waiting wavefronts. Leave SPI_PRIO at its SPI-initialized or scheduler-raised value. v2: Also fix race with exception handler Signed-off-by: Jay Cornwall <Jay.Cornwall@amd.com> Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com> Signed-off-by: Felix Kuehling <Felix.Kuehling@amd.com> Acked-by: Christian König <christian.koenig@amd.com> Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com>	2018-07-11 22:32:46 -04:00
Felix Kuehling	1cd106ecfc	drm/amdkfd: Stop using GFP_NOIO explicitly This is no longer needed with the memalloc_nofs_save/restore in dqm_lock/unlock. Signed-off-by: Felix Kuehling <Felix.Kuehling@amd.com> Acked-by: Christian König <christian.koenig@amd.com> Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com>	2018-07-11 22:32:45 -04:00
Felix Kuehling	efeaed4d98	drm/amdkfd: Reliably prevent reclaim-FS while holding DQM lock This is needed to prevent deadlocks when MMU notifiers run in reclaim-FS context and take the DQM lock for userptr evictions. Previously this was done by making all memory allocations under DQM locks GFP_NOIO. This is error prone. Using memalloc_nofs_save/restore will reliably affect all memory allocations anywhere in the kernel while the DQM lock is held. Signed-off-by: Felix Kuehling <Felix.Kuehling@amd.com> Acked-by: Christian König <christian.koenig@amd.com> Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com>	2018-07-11 22:32:44 -04:00
Arnd Bergmann	0337976f40	drm/admkfd use modern ktime accessors getrawmonotonic64() and get_monotonic_boottime64() are deprecated because of the nonstandard naming. The replacement functions ktime_get_raw_ns() and ktime_get_boot_ns() also simplify the callers. Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com> Signed-off-by: Arnd Bergmann <arnd@arndb.de> Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com>	2018-07-11 14:41:00 +02:00
Dave Airlie	c76f0b2cc2	Merge tag 'drm-amdkfd-next-2018-05-14' of git://people.freedesktop.org/~gabbayo/linux into drm-next This is amdkfd pull for 4.18. The major new features are: - Add support for GFXv9 dGPUs (VEGA) - Add support for userptr memory mapping In addition, there are a couple of small fixes and improvements, such as: - Fix lock handling - Fix rollback packet in kernel kfd_queue - Optimize kfd signal handling - Fix CP hang in APU Signed-off-by: Dave Airlie <airlied@redhat.com> Link: https://patchwork.freedesktop.org/patch/msgid/20180514070126.GA1827@odedg-x270	2018-05-15 16:06:08 +10:00
Randy Dunlap	7bbc0b950f	drm/amdkfd: fix build, select MMU_NOTIFIER When CONFIG_MMU_NOTIFIER is not enabled, struct mmu_notifier has an incomplete type definition, which causes build errors. ../drivers/gpu/drm/amd/amdkfd/kfd_priv.h:607:22: error: field 'mmu_notifier' has incomplete type ../include/linux/kernel.h:979:32: error: dereferencing pointer to incomplete type ../include/linux/kernel.h:980:18: error: dereferencing pointer to incomplete type ../drivers/gpu/drm/amd/amdkfd/kfd_process.c:434:2: error: implicit declaration of function 'mmu_notifier_unregister_no_release' [-Werror=implicit-function-declaration] ../drivers/gpu/drm/amd/amdkfd/kfd_process.c:435:2: error: implicit declaration of function 'mmu_notifier_call_srcu' [-Werror=implicit-function-declaration] ../drivers/gpu/drm/amd/amdkfd/kfd_process.c:438:21: error: variable 'kfd_process_mmu_notifier_ops' has initializer but incomplete type ../drivers/gpu/drm/amd/amdkfd/kfd_process.c:439:2: error: unknown field 'release' specified in initializer ../drivers/gpu/drm/amd/amdkfd/kfd_process.c:439:2: warning: excess elements in struct initializer [enabled by default] ../drivers/gpu/drm/amd/amdkfd/kfd_process.c:439:2: warning: (near initialization for 'kfd_process_mmu_notifier_ops') [enabled by default] ../drivers/gpu/drm/amd/amdkfd/kfd_process.c:534:2: error: implicit declaration of function 'mmu_notifier_register' [-Werror=implicit-function-declaration] Signed-off-by: Randy Dunlap <rdunlap@infradead.org> Tested-by: Anders Roxell <anders.roxell@linaro.org> Reviewed-by: Oded Gabbay <oded.gabbay@gmail.com> Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com>	2018-04-24 12:50:04 +03:00
Andres Rodriguez	1cf6cc74bb	drm/amdkfd: fix clock counter retrieval for node without GPU Currently if a user requests clock counters for a node without a GPU resource we will always return EINVAL. Instead if no GPU resource is attached, fill the gpu_clock_counter argument with zeroes so that we may proceed and return valid CPU counters. Signed-off-by: Andres Rodriguez <andres.rodriguez@amd.com> Signed-off-by: Felix Kuehling <Felix.Kuehling@amd.com> Reviewed-by: Oded Gabbay <oded.gabbay@gmail.com> Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com>	2018-04-24 12:34:44 +03:00
Wei Yongjun	ded5e5622c	drm/amdkfd: Fix the error return code in kfd_ioctl_unmap_memory_from_gpu() Passing a NULL pointer to PTR_ERR will result in a return value of 0 indicating success, which is clearly not what is intended here. This patch returns -EINVAL instead. v2: change ret code to -ENODEV Fixes: `5ec7e02854` ("drm/amdkfd: Add ioctls for GPUVM memory management") Signed-off-by: Wei Yongjun <weiyongjun1@huawei.com> Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com> Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com>	2018-04-24 12:14:55 +03:00
kbuild test robot	a4efd3a4e6	drm/amdkfd: kfd_dev_is_large_bar() can be static Fixes: `5ec7e02854` ("drm/amdkfd: Add ioctls for GPUVM memory management") Signed-off-by: Fengguang Wu <fengguang.wu@intel.com> Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com>	2018-04-24 12:05:27 +03:00
Laura Abbott	af47b39027	drm/amdkfd: Remove vla There's an ongoing effort to remove VLAs[1] from the kernel to eventually turn on -Wvla. Switch to a constant value that covers all hardware. [1] https://lkml.org/lkml/2018/3/7/621 Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com> Acked-by: Christian König <christian.koenig@amd.com> Signed-off-by: Laura Abbott <labbott@redhat.com> Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com>	2018-04-13 14:24:12 -07:00
Felix Kuehling	c129db1206	drm/amdkfd: Add sanity checks in IRQ handlers Only accept interrupts from KFD VMIDs. Just checking for a PASID may not be enough because amdgpu started using PASIDs to map VM faults to processes. Warn if an IRQ doesn't have a valid PASID (indicating a firmware bug). Suggested-by: Shaoyun Liu <Shaoyun.Liu@amd.com> Suggested-by: Oak Zeng <Oak.Zeng@amd.com> Signed-off-by: Felix Kuehling <Felix.Kuehling@amd.com> Reviewed-by: Oded Gabbay <oded.gabbay@gmail.com> Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com>	2018-05-01 17:56:12 -04:00
Shaoyun Liu	2533f0741e	drm/amdkfd: Remove queue node when destroy queue failed HWS may hang in the middle of destroy queue, remove the queue from the process queue list so it won't be freed again in the future Signed-off-by: Shaoyun Liu <Shaoyun.Liu@amd.com> Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com> Signed-off-by: Felix Kuehling <Felix.Kuehling@amd.com> Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com>	2018-05-01 17:56:11 -04:00
Ben Goz	bfdcbfd255	drm/amdkfd: Locking PM mutex while allocating IB buffer Signed-off-by: Ben Goz <ben.goz@amd.com> Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com> Signed-off-by: Felix Kuehling <Felix.Kuehling@amd.com> Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com>	2018-05-01 17:56:10 -04:00
Felix Kuehling	ccb76b149e	drm/amdkfd: Remove initialization of cp_hqd_ib_control on CIK The initialization is not necessary. amd-kfd-staging and ROCm releases have worked without it for two years. Signed-off-by: Felix Kuehling <Felix.Kuehling@amd.com> Reviewed-by: Oded Gabbay <oded.gabbay@gmail.com> Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com>	2018-05-01 17:56:09 -04:00
Felix Kuehling	eeb27b7eb3	drm/amdkfd: Fix signal handling performance again It turns out that idr_for_each_entry is really slow compared to just iterating over the slots. Based on measurements the difference is estimated to be about a factor 64. That means using idr_for_each_entry is only worth it with very few allocated events. Signed-off-by: Felix Kuehling <Felix.Kuehling@amd.com> Reviewed-by: Oded Gabbay <oded.gabbay@gmail.com> Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com>	2018-05-01 17:56:08 -04:00
Yong Zhao	f8ea72d097	drm/amdkfd: Fix CP soft hang on APUs The problem happens on Raven and Carrizo. The context save handler should not clear the high bits of PC_HI before extracting the bits of IB_STS. The bug is not relevant to VEGA10 until we enable demand paging. Signed-off-by: Jay Cornwall <Jay.Cornwall@amd.com> Signed-off-by: Yong Zhao <yong.zhao@amd.com> Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com> Signed-off-by: Felix Kuehling <Felix.Kuehling@amd.com> Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com>	2018-05-01 17:56:07 -04:00
Yong Zhao	0db54b24ad	drm/amdkfd: Separate trap handler assembly code and its hex values Since the assembly code is inside "#if 0", it is ineffective. Despite that, during debugging, we need to change the assembly code, extract it into a separate file and compile the new file into hex values using sp3. That process also requires us to remove "#if 0" and modify lines starting with "#", so that sp3 can successfully compile the new file. With this change, all the above chore is no longer needed, and cwsr_trap_handler_gfx*.asm can be directly used by sp3 to generate its hex values. Signed-off-by: Yong Zhao <yong.zhao@amd.com> Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com> Signed-off-by: Felix Kuehling <Felix.Kuehling@amd.com> Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com>	2018-05-01 17:56:06 -04:00
Felix Kuehling	a2e94158b8	drm/amdkfd: Remove redundant include of amd-iommu.h Signed-off-by: Felix Kuehling <Felix.Kuehling@amd.com> Reviewed-by: Oded Gabbay <oded.gabbay@gmail.com> Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com>	2018-05-01 17:56:05 -04:00
Philip Yang	fa7e65147e	drm/amdkfd: use %px to print user space address instead of %p Signed-off-by: Philip Yang <Philip.Yang@amd.com> Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com> Signed-off-by: Felix Kuehling <Felix.Kuehling@amd.com> Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com>	2018-05-01 17:56:04 -04:00
Jay Cornwall	2774c63ef3	drm/amdkfd: Use volatile MTYPE in default/alternate apertures MTYPE_NC_NV (0) marks scalar/vector L1 cache lines as non-volatile. Cache lines loaded through these apertures are intended to be invalidated before (and sometimes during) a dispatch. The non-volatile qualifier prevents these cache lines from being distinguished from those loaded through the private aperture. Use MTYPE_NC (1) instead on both Gfx7 and Gfx8. This allows the compiler to use the BUFFER_WBINVL1_VOL instruction and is a precursor to automatic per-dispatch scalar/vector L1 volatile invalidation. Signed-off-by: Jay Cornwall <Jay.Cornwall@amd.com> Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com> Signed-off-by: Felix Kuehling <Felix.Kuehling@amd.com> Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com>	2018-05-01 17:56:03 -04:00
Jay Cornwall	87e6d4e077	drm/amdkfd: Reduce priority of context-saving waves before spin-wait Synchronization between context-saving wavefronts is achieved by sending a SAVEWAVE message to the SPI and then spin-waiting for a response. These spin-waiting wavefronts may inhibit the progress of other wavefronts in the context save handler, leading to the synchronization condition never being achieved. Before spin-waiting, reduce the priority of each wavefront to guarantee forward progress in the others. Signed-off-by: Jay Cornwall <Jay.Cornwall@amd.com> Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com> Signed-off-by: Felix Kuehling <Felix.Kuehling@amd.com> Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com>	2018-05-01 17:56:02 -04:00
Oak Zeng	24f48a4203	drm/amdkfd: Dump HQD of HIQ Signed-off-by: Oak Zeng <Oak.Zeng@amd.com> Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com> Signed-off-by: Felix Kuehling <Felix.Kuehling@amd.com> Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com>	2018-05-01 17:56:01 -04:00
Dan Carpenter	8feaccf71d	drm/amdkfd: Integer overflows in ioctl args->n_devices is a u32 that comes from the user. The multiplication could overflow on 32 bit systems possibly leading to privilege escalation. Fixes: `5ec7e02854` ("drm/amdkfd: Add ioctls for GPUVM memory management") Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com> Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com> Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com>	2018-04-24 16:35:49 +03:00
Felix Kuehling	389056e5fe	drm/amdkfd: Add Vega10 topology and device info * Report 64-bit doorbells as HSA_CAP_DOORBELL_TYPE_2_0 in topology * Report cache information in topology (duplicates GFXv8 info for now) * Add device info for Vega10 support in KFD Raven is not enabled at this time as it needs additional changes in DQM to work with a single SDMA engine. Signed-off-by: Felix Kuehling <Felix.Kuehling@amd.com> Reviewed-by: Oded Gabbay <oded.gabbay@gmail.com> Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com>	2018-04-10 17:33:18 -04:00
welu	6106dce955	drm/amdkfd: Try to enable atomics for all GPUs Report failure to enable atomics only on GPUs that require them. This allows GPUs that don't require atomics to function, but can benefit if they are available. This is the case for Vega10, which doesn't use atomics for basic functioning of the MEC, AQL and HWS microcode. So it can work without atomics. But shader programs can still use atomic instructions on systems that support PCIe atomics. Signed-off-by: welu <Wei.Lu2@amd.com> Signed-off-by: Felix Kuehling <Felix.Kuehling@amd.com> Reviewed-by: Oded Gabbay <oded.gabbay@gmail.com> Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com>	2018-04-10 17:33:17 -04:00
Felix Kuehling	3e76c2399b	drm/amdkfd: Add GFXv9 CWSR trap handler Signed-off-by: Shaoyun Liu <Shaoyun.Liu@amd.com> Signed-off-by: Jay Cornwall <Jay.Cornwall@amd.com> Signed-off-by: Felix Kuehling <Felix.Kuehling@amd.com> Reviewed-by: Oded Gabbay <oded.gabbay@gmail.com> Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com>	2018-04-10 17:33:16 -04:00
Felix Kuehling	70a31d16cc	drm/amdkfd: Support flat memory apertures for GFXv9 Signed-off-by: Felix Kuehling <Felix.Kuehling@amd.com> Reviewed-by: Oded Gabbay <oded.gabbay@gmail.com> Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com>	2018-04-10 17:33:15 -04:00
Felix Kuehling	6aac0a48b0	drm/amdkfd: Remove limit on number of GPUs (follow-up) This condition was missed in a previous commit with the same title. Signed-off-by: Felix Kuehling <Felix.Kuehling@amd.com> Reviewed-by: Oded Gabbay <oded.gabbay@gmail.com> Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com>	2018-04-10 17:33:14 -04:00
Felix Kuehling	9d7d024816	drm/amdkfd: Add 64-bit doorbell and wptr support to kernel queue v2: Removed redundant 0x before %p. Signed-off-by: Felix Kuehling <Felix.Kuehling@amd.com> Reviewed-by: Oded Gabbay <oded.gabbay@gmail.com> Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com>	2018-04-08 22:03:51 -04:00
Felix Kuehling	bebfd2f412	drm/amdkfd: Fix kernel queue rollback_packet kq->queue->properties.write_ptr is a GPU address which can't be dereferenced in the kernel. Use kq->wptr_kernel instead, which is the kernel CPU address of the same buffer. Signed-off-by: Felix Kuehling <Felix.Kuehling@amd.com> Reviewed-by: Oded Gabbay <oded.gabbay@gmail.com> Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com>	2018-04-10 17:33:12 -04:00
Felix Kuehling	2a26fbfe80	drm/amdkfd: Fix goto usage Missed a spot in previous cleanup commit: Remove gotos that do not feature any common cleanup, and use gotos instead of repeating cleanup commands. According to kernel.org: "The goto statement comes in handy when a function exits from multiple locations and some common work such as cleanup has to be done. If there is no cleanup needed then just return directly." Signed-off-by: Kent Russell <kent.russell@amd.com> Signed-off-by: Felix Kuehling <Felix.Kuehling@amd.com> Reviewed-by: Oded Gabbay <oded.gabbay@gmail.com> Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com>	2018-04-10 17:33:11 -04:00
Felix Kuehling	ca750681bc	drm/amdkfd: Add SOC15 interrupt processing support Signed-off-by: Shaoyun Liu <Shaoyun.Liu@amd.com> Signed-off-by: Oak Zeng <Oak.Zeng@amd.com> Signed-off-by: Felix Kuehling <Felix.Kuehling@amd.com> Reviewed-by: Oded Gabbay <oded.gabbay@gmail.com> Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com>	2018-04-10 17:33:10 -04:00
Felix Kuehling	bed4f11025	drm/amdkfd: Add GFXv9 device queue manager Signed-off-by: John Bridgman <john.bridgman@amd.com> Signed-off-by: Felix Kuehling <Felix.Kuehling@amd.com> Reviewed-by: Oded Gabbay <oded.gabbay@gmail.com> Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com>	2018-04-10 17:33:09 -04:00
Felix Kuehling	b91d43dd01	drm/amdkfd: Add GFXv9 MQD manager Signed-off-by: John Bridgman <john.bridgman@amd.com> Signed-off-by: Jay Cornwall <Jay.Cornwall@amd.com> Signed-off-by: Felix Kuehling <Felix.Kuehling@amd.com> Reviewed-by: Oded Gabbay <oded.gabbay@gmail.com> Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com>	2018-04-10 17:33:08 -04:00
Felix Kuehling	454150b1f9	drm/amdkfd: Add GFXv9 PM4 packet writer functions Signed-off-by: Shaoyun Liu <Shaoyun.Liu@amd.com> Signed-off-by: Felix Kuehling <Felix.Kuehling@amd.com> Reviewed-by: Oded Gabbay <oded.gabbay@gmail.com> Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com>	2018-04-10 17:33:07 -04:00
Felix Kuehling	f6e27ff19d	drm/amdkfd: Move packet writer functions into ASIC-specific file This is in preparation for GFXv9 (Vega10) which uses incompatible PM4 packet formats from previous ASIC generations. Signed-off-by: Shaoyun Liu <Shaoyun.Liu@amd.com> Signed-off-by: Felix Kuehling <Felix.Kuehling@amd.com> Reviewed-by: Oded Gabbay <oded.gabbay@gmail.com> Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com>	2018-04-10 17:33:06 -04:00
Felix Kuehling	ef568db792	drm/amdkfd: Implement doorbell allocation for SOC15 Allocate doorbells according to the doorbell routing information on SOC15 ASICs (Vega10 and later). On older ASICs we continue to use the queue_id as the doorbell ID to maintain compatibility with the Thunk. Signed-off-by: Shaoyun Liu <Shaoyun.Liu@amd.com> Signed-off-by: Felix Kuehling <Felix.Kuehling@amd.com> Reviewed-by: Oded Gabbay <oded.gabbay@gmail.com> Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com>	2018-04-10 17:33:05 -04:00
Harish Kasiviswanathan	df03ef9342	drm/amdkfd: Clean up KFD_MMAP_ offset handling Use bit-rotate for better clarity and remove _MASK from the #defines as these represent mmap types. Centralize all the parsing of the mmap offset in kfd_mmap and add device parameter to doorbell and reserved_mem map functions. Encode gpu_id into upper bits of vm_pgoff. This frees up the lower bits for encoding the doorbell ID on Vega10. Signed-off-by: Harish Kasiviswanathan <Harish.Kasiviswanathan@amd.com> Signed-off-by: Felix Kuehling <Felix.Kuehling@amd.com> Reviewed-by: Oded Gabbay <oded.gabbay@gmail.com> Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com>	2018-04-10 17:33:04 -04:00
Felix Kuehling	ada2b29c4a	drm/amdkfd: Make doorbell size ASIC-dependent This prepares for GFXv9 (Vega10), which has 64-bit doorbells. Signed-off-by: Felix Kuehling <Felix.Kuehling@amd.com> Reviewed-by: Oded Gabbay <oded.gabbay@gmail.com> Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com>	2018-04-10 17:33:03 -04:00
Linus Torvalds	320b164abb	main drm pull request for v4.17 -----BEGIN PGP SIGNATURE----- iQIcBAABAgAGBQJavDxYAAoJEAx081l5xIa+pCoP/iwjuxkSTdJpZUx5g0daGkCK O18moGqGPChb7qJovfHqCKZ1f9PGulQt7SxwFzzJXNbv0PbfMA/Og0EhMLBImb+Q VfYgq2vJLpmkikgcI5fBrzs9DRMQKKobGIzw24VS7IkPYA7d8KgAyywBwG0+LUFR G3sobClgapsfaUcleb3ZOeDwymGkGCuuYRpYE4giHtuMDIxCWLePKJKOaOIq8o6P A1557EvSbKuLQGI9X50jzJOoBE3TKRQYkzuM1GthdOF8RHaMNcFy44lDNO030HwZ hzwAIg5Izhu16PqZGyEdIQ6SJTv3isRJWEciPnOsijvjl1li3ehMdQfhGISa/jZO ivEGd32kaactiT0jJ5OyexergEViCPVKCIORksSIk46L84luDva9L22A3yu0mf3F ixB63bAiLH7Py77kH3DmeJdqhMxlVZXCbdBVFDvzZvY4O3Mx0Dv9mmN/nw1FVCFH scSYnXea9/o4IY5yGASU6FAUJEEGu20HAN12oHJw7/taqV/gbbEos3F7AGmjJE0f qe6Rt/8fwi7Lhm2va6EoOo6yltH/gL4/AgnsN76VzppNGbaIv7W8Qa4Y/ES1lAE1 SATAEUJfU8kiLrVOolIElPbgfdJwv8TzoxiKB5wK/eoH20wf4BTmOuBMviaL2qXK Sz6wihq+IlMXW7Y7pIl/ =DrA+ -----END PGP SIGNATURE----- Merge tag 'drm-for-v4.17' of git://people.freedesktop.org/~airlied/linux Pull drm updates from Dave Airlie: "Cannonlake and Vega12 support are probably the two major things. This pull lacks nouveau, Ben had some unforeseen leave and a few other blockers so we'll see how things look or maybe leave it for this merge window.
core: - Device links to handle sound/gpu pm dependency - Color encoding/range properties - Plane clipping into plane check helper - Backlight helpers - DP TP4 + HBR3 helper support amdgpu: - Vega12 support - Enable DC by default on all supported GPUs - Powerplay restructuring and cleanup - DC bandwidth calc updates - DC backlight on pre-DCE11 - TTM backing store dropping support - SR-IOV fixes - Adding "wattman" like functionality - DC crc support - Improved DC dual-link handling amdkfd: - GPUVM support for dGPU - KFD events for dGPU - Enable PCIe atomics for dGPUs - HSA process eviction support - Live-lock fixes for process eviction - VM page table allocation fix for large-bar systems panel: - Raydium RM68200 - AUO G104SN02 V2 - KEO TX31D200VM0BAA - ARM Versatile panels i915: - Cannonlake support enabled - AUX-F port support added - Icelake base enabling until internal milestone of forcewake support - Query uAPI interface (used for GPU topology information currently) - Compressed framebuffer support for sprites - kmem cache shrinking when GPU is idle - Avoid boosting GPU when waited item is being processed already - Avoid retraining LSPCON link unnecessarily - Decrease request signaling latency - Deprecation of I915_SET_COLORKEY_NONE - Kerneldoc and compiler warning cleanup for upcoming CI enforcements - Full range ycbcr toggling - HDCP support i915/gvt: - Big refactor for shadow ppgtt - KBL context save/restore via LRI cmd (Weinan) - Properly unmap dma for guest page (Changbin) vmwgfx: - Lots of various improvements etnaviv: - Use the drm gpu scheduler - prep work for GC7000L support vc4: - fix alpha blending - Expose perf counters to userspace pl111: - Bandwidth checking/limiting - Versatile panel support sun4i: - A83T HDMI support - A80 support - YUV plane support - H3/H5 HDMI support omapdrm: - HPD support for DVI connector - remove lots of static variables msm: - DSI updates from 10nm / SDM845 - fix for race condition with a3xx/a4xx fence completion irq - 
some refactoring/prep work for eventual a6xx support (ie. when we have a userspace) - a5xx debugfs enhancements - some mdp5 fixes/cleanups to prepare for eventually merging writeback - support (ie. when we have a userspace) tegra: - mmap() fixes for fbdev devices - Overlay plane for hw cursor fix - dma-buf cache maintenance support mali-dp: - YUV->RGB conversion support rockchip: - rk3399/chromebook fixes and improvements rcar-du: - LVDS support move to drm bridge - DT bindings for R8A77995 - Driver/DT support for R8A77970 tilcdc: - DRM panel support" * tag 'drm-for-v4.17' of git://people.freedesktop.org/~airlied/linux: (1646 commits) drm/i915: Fix hibernation with ACPI S0 target state drm/i915/execlists: Use a locked clear_bit() for synchronisation with interrupt drm/i915: Specify which engines to reset following semaphore/event lockups drm/i915/dp: Write to SET_POWER dpcd to enable MST hub. drm/amdkfd: Use ordered workqueue to restore processes drm/amdgpu: Fix acquiring VM on large-BAR systems drm/amd/pp: clean header file hwmgr.h drm/amd/pp: use mlck_table.count for array loop index limit drm: Fix uabi regression by allowing garbage mode->type from userspace drm/amdgpu: Add an ATPX quirk for hybrid laptop drm/amdgpu: fix spelling mistake: "asssert" -> "assert" drm/amd/pp: Add new asic support in pp_psm.c drm/amd/pp: Clean up powerplay code on Vega12 drm/amd/pp: Add smu irq handlers for legacy asics drm/amd/pp: Fix set wrong temperature range on smu7 drm/amdgpu: Don't change preferred domian when fallback GTT v5 drm/vmwgfx: Bump version patchlevel and date drm/vmwgfx: use monotonic event timestamps drm/vmwgfx: Unpin the screen object backup buffer when not used drm/vmwgfx: Stricter count of legacy surface device resources ...	2018-04-02 07:59:23 -07:00
Felix Kuehling	6b95e7973a	drm/amdkfd: Add quiesce_mm and resume_mm to kgd2kfd_calls These interfaces allow KGD to stop and resume all GPU user mode queue access to a process address space. This is needed for handling MMU notifiers of userptrs mapped for GPU access in KFD VMs. Signed-off-by: Felix Kuehling <Felix.Kuehling@amd.com> Reviewed-by: Oded Gabbay <oded.gabbay@gmail.com> Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com>	2018-03-23 15:32:32 -04:00
Felix Kuehling	d1853f42b6	drm/amdkfd: GFP_NOIO while holding locks taken in MMU notifier When an MMU notifier runs in memory reclaim context, it can deadlock trying to take locks that are already held in the thread causing the memory reclaim. The solution is to avoid memory reclaim while holding locks that are taken in MMU notifiers by using GFP_NOIO. This commit fixes memory allocations done while holding the dqm->lock which is needed in the MMU notifier (dqm->ops.evict_process_queues). Signed-off-by: Felix Kuehling <Felix.Kuehling@amd.com> Reviewed-by: Oded Gabbay <oded.gabbay@gmail.com> Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com>	2018-03-23 15:32:31 -04:00
Felix Kuehling	1679ae8f8f	drm/amdkfd: Use ordered workqueue to restore processes Restoring multiple processes concurrently can lead to live-locks where each process prevents the other from validating all its BOs. v2: fix duplicate check of same variable Signed-off-by: Felix Kuehling <Felix.Kuehling@amd.com> Reviewed-by: Oded Gabbay <oded.gabbay@gmail.com> Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com>	2018-03-23 15:30:36 -04:00
Felix Kuehling	72a01d231d	drm/amdkfd: Deallocate SDMA queues correctly Deallocate SDMA queues during abnormal process termination and when queue creation fails after the SDMA allocation. Signed-off-by: Felix Kuehling <Felix.Kuehling@amd.com> Reviewed-by: Oded Gabbay <oded.gabbay@gmail.com> Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com>	2018-03-23 15:30:34 -04:00
Felix Kuehling	c70a362687	drm/amdkfd: Fix scratch memory with HWS enabled Program sh_hidden_private_base_vmid correctly in the map-process PM4 packet. Signed-off-by: Felix Kuehling <Felix.Kuehling@amd.com> Reviewed-by: Oded Gabbay <oded.gabbay@gmail.com> Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com>	2018-03-23 15:30:33 -04:00
Felix Kuehling	374200b154	drm/amdkfd: Add module option for testing large-BAR functionality Simulate large-BAR system by exporting only visible memory. This limits the amount of available VRAM to the size of the BAR, but enables CPU access to VRAM. Signed-off-by: Felix Kuehling <Felix.Kuehling@amd.com> Acked-by: Oded Gabbay <oded.gabbay@gmail.com> Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com>	2018-03-15 17:27:53 -04:00
Felix Kuehling	0fc8011f89	drm/amdkfd: Kmap event page for dGPUs The events page must be accessible in user mode by the GPU and CPU as well as in kernel mode by the CPU. On dGPUs user mode virtual addresses are managed by the Thunk's GPU memory allocation code. Therefore we can't allocate the memory in kernel mode like we do on APUs. But KFD still needs to map the memory for kernel access. To facilitate this, the Thunk provides the buffer handle of the events page to KFD when creating the first event. Signed-off-by: Felix Kuehling <Felix.Kuehling@amd.com> Acked-by: Christian König <christian.koenig@amd.com> Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com>	2018-03-15 17:27:52 -04:00
Felix Kuehling	5ec7e02854	drm/amdkfd: Add ioctls for GPUVM memory management v2: * Fix error handling after kfd_bind_process_to_device in kfd_ioctl_map_memory_to_gpu v3: * Add ioctl to acquire VM from a DRM FD v4: * Return number of successful map/unmap operations in failure cases * Facilitate partial retry after failed map/unmap * Added comments with parameter descriptions to new APIs * Defined AMDKFD_IOC_FREE_MEMORY_OF_GPU write-only Signed-off-by: Felix Kuehling <Felix.Kuehling@amd.com> Acked-by: Christian König <christian.koenig@amd.com> Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com>	2018-03-15 17:27:51 -04:00
Felix Kuehling	552764b680	drm/amdkfd: Add TC flush on VMID deallocation for Hawaii On GFX7 the CP does not perform a TC flush when queues are unmapped. To avoid TC eviction from accessing an invalid VMID, flush it explicitly before releasing a VMID. v2: Fix unnecessary list_for_each_entry_safe v3: Moved allocation to kfd_process_device_init_vm Signed-off-by: Amber Lin <Amber.Lin@amd.com> Signed-off-by: Felix Kuehling <Felix.Kuehling@amd.com> Acked-by: Christian König <christian.koenig@amd.com> Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com>	2018-03-15 17:27:50 -04:00
Felix Kuehling	f35751b870	drm/amdkfd: Allocate CWSR trap handler memory for dGPUs Add helpers for allocating GPUVM memory in kernel mode and use them to allocate memory for the CWSR trap handler. v2: Use dev instead of pdd->dev in kfd_process_free_gpuvm v3: * Cleaned up and simplified kfd_process_alloc_gpuvm * Moved allocation for dGPU to kfd_process_device_init_vm Signed-off-by: Felix Kuehling <Felix.Kuehling@amd.com> Acked-by: Christian König <christian.koenig@amd.com> Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com>	2018-03-15 17:27:49 -04:00
Felix Kuehling	52b29d7334	drm/amdkfd: Add per-process IDR for buffer handles Also used for cleaning up on process termination. v2: Refactored cleanup on process termination Signed-off-by: Felix Kuehling <Felix.Kuehling@amd.com> Acked-by: Christian König <christian.koenig@amd.com> Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com>	2018-03-15 17:27:48 -04:00
Felix Kuehling	d01994c24c	drm/amdkfd: Aperture setup for dGPUs Set up the GPUVM aperture for SVM (shared virtual memory) that allows sharing a part of virtual address space between GPUs and CPUs. Report the size of the GPUVM aperture that is supported by KGD accurately. The low part of the GPUVM aperture is reserved for kernel use. This is for kernel-allocated buffers that are only accessed on the GPU: - CWSR trap handler - IB for submitting commands in user-mode context from kernel mode Signed-off-by: Felix Kuehling <Felix.Kuehling@amd.com> Acked-by: Christian König <christian.koenig@amd.com> Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com>	2018-03-15 17:27:47 -04:00
Felix Kuehling	c7bcbfa4f8	drm/amdkfd: Remove limit on number of GPUs Currently the number of GPUs is limited by aperture placement options available on GFX7 and GFX8 hardware. This limitation is not necessary. Scratch and LDS represent per-work-item and per-work-group storage respectively. Different work-items and work-groups use the same virtual address to access their own data. Work running on different GPUs is by definition in different work-groups (different dispatches, in fact). That means the same virtual addresses can be used for these apertures on different GPUs. Add a new AMDKFD_IOC_GET_PROCESS_APERTURES_NEW ioctl that removes the artificial limitation on the number of GPUs that can be supported. The new ioctl allows user mode to query the number of GPUs to allocate enough memory for all GPUs to be reported. This deprecates AMDKFD_IOC_GET_PROCESS_APERTURES. Signed-off-by: Felix Kuehling <Felix.Kuehling@amd.com> Acked-by: Christian König <christian.koenig@amd.com> Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com>	2018-03-15 17:27:46 -04:00
Oak Zeng	7c9b717196	drm/amdkfd: Populate DRM render device minor Populate DRM render device minor in kfd topology Signed-off-by: Oak Zeng <Oak.Zeng@amd.com> Signed-off-by: Felix Kuehling <Felix.Kuehling@amd.com> Acked-by: Christian König <christian.koenig@amd.com> Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com>	2018-03-15 17:27:45 -04:00
Felix Kuehling	b84394e206	drm/amdkfd: Create KFD VMs on demand Instead of creating all VMs on process creation, create them when a process is bound to a device. This will later allow registering an existing VM from a DRM render node FD at runtime, before the process is bound to the device. This way the render node VM can be used for KFD instead of creating our own redundant VM. Signed-off-by: Felix Kuehling <Felix.Kuehling@amd.com> Acked-by: Christian König <christian.koenig@amd.com> Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com>	2018-03-15 17:27:44 -04:00
Arnd Bergmann	48a4438718	drm/amdkfd: fix uninitialized variable use When CONFIG_ACPI is disabled, we never initialize the acpi_table structure in kfd_create_crat_image_virtual: drivers/gpu/drm/amd/amdkfd/kfd_crat.c: In function 'kfd_create_crat_image_virtual': drivers/gpu/drm/amd/amdkfd/kfd_crat.c:888:40: error: 'acpi_table' may be used uninitialized in this function [-Werror=maybe-uninitialized] The undefined behavior also happens for any other acpi_get_table() failure, but then the compiler can't warn about it. This adds an error check that prevents the structure from being used in error, avoiding both the undefined behavior and the warning about it. Fixes: `520b8fb755` ("drm/amdkfd: Add topology support for CPUs") Signed-off-by: Arnd Bergmann <arnd@arndb.de> Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com>	2018-03-15 17:49:40 +01:00
Felix Kuehling	26103436da	drm/amdkfd: Implement KFD process eviction/restore When the TTM memory manager in KGD evicts BOs, all user mode queues potentially accessing these BOs must be evicted temporarily. Once user mode queues are evicted, the eviction fence is signaled, allowing the migration of the BO to proceed. A delayed worker is scheduled to restore all the BOs belonging to the evicted process and restart its queues. During suspend/resume of the GPU we also evict all processes to allow KGD to save BOs in system memory, since VRAM will be lost. v2: * Account for eviction when updating of q->is_active in MQD manager Signed-off-by: Harish Kasiviswanathan <Harish.Kasiviswanathan@amd.com> Signed-off-by: Felix Kuehling <Felix.Kuehling@amd.com> Acked-by: Oded Gabbay <oded.gabbay@gmail.com> Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com>	2018-02-06 20:32:45 -05:00
Felix Kuehling	403575c44e	drm/amdkfd: Add GPUVM virtual address space to PDD Create/destroy the GPUVM context during PDD creation/destruction. Get VM page table base and program it during process registration (HWS) or VMID allocation (non-HWS). v2: * Used dev instead of pdd->dev in kfd_flush_tlb Signed-off-by: Felix Kuehling <Felix.Kuehling@amd.com> Acked-by: Oded Gabbay <oded.gabbay@gmail.com> Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com>	2018-02-06 20:32:44 -05:00
Harish Kasiviswanathan	4252bf6866	drm/amdkfd: Remove unaligned memory access Unaligned atomic operations can cause problems on some CPU architectures. Use simpler bitmask operations instead. Atomic bit manipulations are not necessary since dqm->lock is held during these operations. Signed-off-by: Harish Kasiviswanathan <Harish.Kasiviswanathan@amd.com> Signed-off-by: Felix Kuehling <Felix.Kuehling@amd.com> Acked-by: Oded Gabbay <oded.gabbay@gmail.com> Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com>	2018-02-06 20:32:42 -05:00
Gustavo A. R. Silva	2e3dca5365	drm/amdkfd: Fix potential NULL pointer dereferences In case kfd_get_process_device_data returns null, there are some null pointer dereferences in functions kfd_bind_processes_to_device and kfd_unbind_processes_from_device. Fix this by printing a WARN_ON for PDDs that aren't found and skip them with continue statements. Addresses-Coverity-ID: 1463794 ("Dereference null return value") Addresses-Coverity-ID: 1463772 ("Dereference null return value") Suggested-by: Felix Kuehling <felix.kuehling@amd.com> Signed-off-by: Gustavo A. R. Silva <garsilva@embeddedor.com> Reviewed-by: Oded Gabbay <oded.gabbay@gmail.com> Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com>	2018-01-10 17:15:09 -06:00
Oded Gabbay	a1235e10ee	drm/amdkfd: add ull suffix to 64bit defines Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com> Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com>	2018-01-10 12:55:17 +02:00
Yong Zhao	40a526dc1e	drm/amdkfd: don't always call execute_queues_cpsch() When destroying an inactive queue, we don't need to call execute_queues_cpsch. Signed-off-by: Yong Zhao <yong.zhao@amd.com> Reviewed-by: Oak Zeng <oak.zeng@amd.com> Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com>	2018-01-02 13:10:50 -05:00
Yong Zhao	9e8272240b	drm/amdkfd: Fix return value 0 when execute_queues_cpsch fails Signed-off-by: Yong Zhao <yong.zhao@amd.com> Reviewed-by: Oak Zeng <oak.zeng@amd.com> Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com>	2018-01-02 13:10:49 -05:00
Harish Kasiviswanathan	b441093e40	drm/amdkfd: Ignore ACPI CRAT for non-APU systems Some AMD motherboards without an APU have a broken CRAT table which causes KFD initialization failures or incorrect information about NUMA nodes, CPU cores or system memory. Ignore CRAT tables without GPUs and rely on KFD's code to create a CRAT table for the CPU. Signed-off-by: Harish Kasiviswanathan <Harish.Kasiviswanathan@amd.com> Signed-off-by: Felix Kuehling <Felix.Kuehling@amd.com> Reviewed-by: Oded Gabbay <oded.gabbay@gmail.com> Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com>	2017-12-08 23:09:04 -05:00
Felix Kuehling	ebcfd1e276	drm/amdkfd: Module option to disable CRAT table Some systems have broken CRAT tables. Add a module option to ignore a CRAT table. Signed-off-by: Felix Kuehling <Felix.Kuehling@amd.com> Reviewed-by: Oded Gabbay <oded.gabbay@gmail.com> Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com>	2017-12-08 23:09:03 -05:00
Ben Goz	413e85d5d3	drm/amdkfd: Add AQL Queue Memory flag on topology This is needed for enabling a user-mode workaround for an AQL queue wrapping HW bug on Tonga. Signed-off-by: Ben Goz <ben.goz@amd.com> Signed-off-by: Felix Kuehling <Felix.Kuehling@amd.com> Acked-by: Oded Gabbay <oded.gabbay@gmail.com> Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com>	2017-12-08 23:09:02 -05:00
Philip Cox	70f372bffc	drm/amdkfd: Fixup incorrect info in the CZ CRAT table * Wrong value for max_waves_per_simd * Missing ATC capability bit Signed-off-by: Philip Cox <Philip.Cox@amd.com> Signed-off-by: Kent Russell <kent.russell@amd.com> Signed-off-by: Felix Kuehling <Felix.Kuehling@amd.com> Acked-by: Oded Gabbay <oded.gabbay@gmail.com> Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com>	2017-12-08 23:09:01 -05:00
Amber Lin	f475734729	drm/amdkfd: Add perf counters to topology For hardware blocks whose performance counters are accessed via MMIO registers, KFD provides the support for those privileged blocks. IOMMU is one of those privileged blocks. Most performance counter properties required by Thunk are available at /sys/bus/event_source/devices/amd_iommu. This patch adds properties to topology in KFD sysfs for information not available in /sys/bus/event_source/devices/amd_iommu. They are shown at /sys/devices/virtual/kfd/kfd/topology/nodes/0/perf/iommu/ formatted as /sys/devices/virtual/kfd/kfd/topology/nodes/0/perf/<block>/<property>, i.e. /sys/devices/virtual/kfd/kfd/topology/nodes/0/perf/iommu/max_concurrent. For dGPUs, which don't have IOMMU, nothing appears under /sys/devices/virtual/kfd/kfd/topology/nodes/0/perf. Signed-off-by: Amber Lin <Amber.Lin@amd.com> Signed-off-by: Kent Russell <kent.russell@amd.com> Signed-off-by: Felix Kuehling <Felix.Kuehling@amd.com> Reviewed-by: Oded Gabbay <oded.gabbay@gmail.com> Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com>	2017-12-08 23:09:00 -05:00
Harish Kasiviswanathan	3a87177eb1	drm/amdkfd: Add topology support for dGPUs Generate and parse VCRAT tables for dGPUs in kfd_topology_add_device. Some information that isn't available in the CRAT table is patched into the topology after parsing. HSA_CAP_DOORBELL_TYPE_1_0 is dependent on the ASIC feature CP_HQD_PQ_CONTROL.SLOT_BASED_WPTR, which was not introduced in VI until Carrizo. Report HSA_CAP_DOORBELL_TYPE_PRE_1_0 on Tonga ASICs. v2: Added #include <linux/pci.h> to kfd_crat.c to make it compile Signed-off-by: Harish Kasiviswanathan <Harish.Kasiviswanathan@amd.com> Signed-off-by: Ben Goz <ben.goz@amd.com> Signed-off-by: Amber Lin <Amber.Lin@amd.com> Signed-off-by: Jay Cornwall <Jay.Cornwall@amd.com> Signed-off-by: Kent Russell <kent.russell@amd.com> Signed-off-by: Felix Kuehling <Felix.Kuehling@amd.com> Acked-by: Oded Gabbay <oded.gabbay@gmail.com> Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com>	2017-12-08 23:08:59 -05:00
Felix Kuehling	520b8fb755	drm/amdkfd: Add topology support for CPUs Currently, the KFD topology information is generated by parsing the CRAT (ACPI) table. However, at present CRAT table is available only for AMD APUs. To support CPUs on systems without a CRAT table, the KFD driver will create a Virtual CRAT (VCRAT) table and then the existing code will parse that table to generate topology. Signed-off-by: Harish Kasiviswanathan <Harish.Kasiviswanathan@amd.com> Signed-off-by: Felix Kuehling <Felix.Kuehling@amd.com> Reviewed-by: Oded Gabbay <oded.gabbay@gmail.com> Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com>	2017-12-08 23:08:58 -05:00
Harish Kasiviswanathan	bc0c75a367	drm/amdkfd: Fix sibling_map[] size Change kfd_cache_properties.sibling_map[256] to kfd_cache_properties.sibling_map[32]. Since, CRAT uses bitmap for sibling_map, it is more efficient to use bitmap in the kfd structure also. Signed-off-by: Harish Kasiviswanathan <Harish.Kasiviswanathan@amd.com> Signed-off-by: Felix Kuehling <Felix.Kuehling@amd.com> Reviewed-by: Oded Gabbay <oded.gabbay@gmail.com> Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com>	2017-12-08 23:08:57 -05:00
Felix Kuehling	175b926335	drm/amdkfd: Simplify counting of memory banks Only count memory banks in one place. Ignore redundant num_banks entry in crat_subtype_computeunit. Signed-off-by: Felix Kuehling <Felix.Kuehling@amd.com> Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com>	2017-12-08 23:08:56 -05:00
Felix Kuehling	42aa8793d7	drm/amdkfd: Turn verbose topology messages into pr_debug Signed-off-by: Harish Kasiviswanathan <Harish.Kasiviswanathan@amd.com> Signed-off-by: Felix Kuehling <Felix.Kuehling@amd.com> Reviewed-by: Oded Gabbay <oded.gabbay@gmail.com> Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com>	2017-12-08 23:08:55 -05:00
Harish Kasiviswanathan	4f2937bfff	drm/amdkfd: sync IOLINK defines to thunk spec Current thunk spec v1.07 dated Feb 1, 2016 v2: fix indentation Signed-off-by: Harish Kasiviswanathan <Harish.Kasiviswanathan@amd.com> Signed-off-by: Felix Kuehling <Felix.Kuehling@amd.com> Reviewed-by: Oded Gabbay <oded.gabbay@gmail.com> Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com>	2017-12-08 23:08:54 -05:00
Harish Kasiviswanathan	6d82eb0ef2	drm/amdkfd: Support enumerating non-GPU devices Modify kfd_topology_enum_kfd_devices(..) function to support non-GPU nodes. The function returned NULL when it encountered non-GPU (say CPU) nodes. This caused kfd_ioctl_create_event and kfd_init_apertures to fail for Intel + Tonga. kfd_topology_enum_kfd_devices will now parse all the nodes and return valid kfd_dev for nodes with GPU. Signed-off-by: Harish Kasiviswanathan <Harish.Kasiviswanathan@amd.com> Signed-off-by: Felix Kuehling <Felix.Kuehling@amd.com> Reviewed-by: Oded Gabbay <oded.gabbay@gmail.com> Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com>	2017-12-08 23:08:53 -05:00
Harish Kasiviswanathan	4f449311e9	drm/amdkfd: Decouple CRAT parsing from device list update Currently, CRAT parsing is intertwined with topology_device_list and hence repeated calls to kfd_parse_crat_table() will fail. Decouple kfd_parse_crat_table() and topology_device_list. kfd_parse_crat_table() will parse CRAT and add topology devices to a temporary list temp_topology_device_list and then kfd_topology_update_device_list will move contents from temporary list to master list. Signed-off-by: Harish Kasiviswanathan <Harish.Kasiviswanathan@amd.com> Signed-off-by: Kent Russell <kent.russell@amd.com> Signed-off-by: Felix Kuehling <Felix.Kuehling@amd.com> Reviewed-by: Oded Gabbay <oded.gabbay@gmail.com> Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com>	2017-12-08 23:08:52 -05:00
Harish Kasiviswanathan	8e05247d4c	drm/amdkfd: Reorganize CRAT fetching from ACPI Reorganize and rename kfd_topology_get_crat_acpi function. In this way acpi_get_table(..) needs to be called only once. This will also aid in dGPU topology implementation. Signed-off-by: Harish Kasiviswanathan <Harish.Kasiviswanathan@amd.com> Signed-off-by: Kent Russell <kent.russell@amd.com> Signed-off-by: Felix Kuehling <Felix.Kuehling@amd.com> Reviewed-by: Oded Gabbay <oded.gabbay@gmail.com> Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com>	2017-12-08 23:08:51 -05:00
Felix Kuehling	174de876d6	drm/amdkfd: Group up CRAT related functions Take CRAT related functions out of kfd_topology.c and place them in kfd_crat.c. This is the initial step of supporting more CRAT features, i.e. creating virtual CRAT table for KFD devices without CRAT. v2: Minor cleanup that was missed previously because code moved around Signed-off-by: Amber Lin <Amber.Lin@amd.com> Signed-off-by: Kent Russell <kent.russell@amd.com> Signed-off-by: Felix Kuehling <Felix.Kuehling@amd.com> Reviewed-by: Oded Gabbay <oded.gabbay@gmail.com> Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com>	2017-12-08 23:08:49 -05:00
Yong Zhao	5108d76840	drm/amdkfd: Fix memory leaks in kfd topology Kobject created using kobject_create_and_add() can be freed using kobject_put() when there is no reference any more. However, kobject memory allocated with kzalloc() has to set up a release callback in order to free it when the counter decreases to 0. Otherwise it causes memory leak. Signed-off-by: Yong Zhao <yong.zhao@amd.com> Signed-off-by: Felix Kuehling <Felix.Kuehling@amd.com> Reviewed-by: Oded Gabbay <oded.gabbay@gmail.com> Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com>	2017-12-08 23:08:48 -05:00
Harish Kasiviswanathan	d63f0ba27a	drm/amdkfd: Topology: Fix location_id Fix location_id format to match Thunk specification. Signed-off-by: Harish Kasiviswanathan <Harish.Kasiviswanathan@amd.com> Signed-off-by: Felix Kuehling <Felix.Kuehling@amd.com> Reviewed-by: Oded Gabbay <oded.gabbay@gmail.com> Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com>	2017-12-08 23:08:47 -05:00
Flora Cui	f7ce2fade6	drm/amdkfd: Update number of compute unit from KGD Overwrite the active simd_count from KGD at driver loading time. This is based on the assumption that register GC_USER_SHADER_ARRAY_CONFIG won’t get changed. V2: remove the incorrect simd_count reported at loading module. Signed-off-by: Flora Cui <flora.cui@amd.com> Reviewed-by: Yair Shachar <yair.shachar@amd.com> Signed-off-by: Felix Kuehling <Felix.Kuehling@amd.com> Acked-by: Oded Gabbay <oded.gabbay@gmail.com> Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com>	2017-12-08 23:08:46 -05:00
Harish Kasiviswanathan	0504cccf34	drm/amdkfd: Stop using get_vmem_size KGD-KFD interface get_vmem_size() is deprecated. Instead use get_local_mem_info(). Signed-off-by: Harish Kasiviswanathan <Harish.Kasiviswanathan@amd.com> Signed-off-by: Ben Goz <ben.goz@amd.com> Signed-off-by: Felix Kuehling <Felix.Kuehling@amd.com> Acked-by: Oded Gabbay <oded.gabbay@gmail.com> Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com>	2017-12-08 23:08:43 -05:00
Felix Kuehling	64d1c3a43a	drm/amdkfd: Centralize IOMMUv2 code and make it conditional dGPUs work without IOMMUv2. Make IOMMUv2 initialization dependent on ASIC information. Also allow building KFD without IOMMUv2 support. This is still useful for dGPUs and prepares for enabling KFD on architectures that don't support AMD IOMMUv2. v2: * Centralize IOMMUv2 code to avoid #ifdefs in too many places v3: * Imply AMD_IOMMU_V2 in Kconfig Signed-off-by: Felix Kuehling <Felix.Kuehling@amd.com> Acked-by: Christian Konig <christian.koenig@amd.com> Reviewed-by: Oded Gabbay <oded.gabbay@gmail.com> Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com>	2017-12-08 19:22:12 -05:00
Felix Kuehling	a3084e6c52	drm/amdkfd: Add dGPU device IDs and device info v2: remove needs_iommu field as it doesn't exist CC: linux-pci@vger.kernel.org Signed-off-by: Felix Kuehling <Felix.Kuehling@amd.com> Reviewed-by: Oded Gabbay <oded.gabbay@gmail.com> Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com>	2018-01-04 17:17:47 -05:00
Felix Kuehling	1d63669885	drm/amdkfd: Add dGPU support to kernel_queue_init Recognize dGPU ASIC families. Signed-off-by: Felix Kuehling <Felix.Kuehling@amd.com> Reviewed-by: Oded Gabbay <oded.gabbay@gmail.com> Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com>	2018-01-04 17:17:46 -05:00
Felix Kuehling	ee04955af6	drm/amdkfd: Add dGPU support to the MQD manager On dGPUs don't set ATC addressing bits and use MTYPE_UC for coherent memory. Signed-off-by: Felix Kuehling <Felix.Kuehling@amd.com> Acked-by: Oded Gabbay <oded.gabbay@gmail.com> Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com>	2018-01-04 17:17:45 -05:00
Felix Kuehling	97672cbe3d	drm/amdkfd: Add dGPU support to the device queue manager GFXv7 and v8 dGPUs use a different addressing mode for KFD compared to APUs (GPUVM64 vs HSA64). And dGPUs don't support MTYPE_CC. They use MTYPE_UC instead for memory that requires coherency. Signed-off-by: Felix Kuehling <Felix.Kuehling@amd.com> Acked-by: Oded Gabbay <oded.gabbay@gmail.com> Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com>	2018-01-04 17:17:44 -05:00
Felix Kuehling	d146c5a719	drm/amdkfd: Make sched_policy a per-device setting Some dGPUs don't support HWS. Allow them to use a per-device sched_policy that may be different from the global default. Signed-off-by: Felix Kuehling <Felix.Kuehling@amd.com> Reviewed-by: Oded Gabbay <oded.gabbay@gmail.com> Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com>	2018-01-04 17:17:43 -05:00
Felix Kuehling	3ee2d00cfb	drm/amdkfd: Conditionally enable PCIe atomics This will be needed for most dGPUs. CC: linux-pci@vger.kernel.org Signed-off-by: Felix Kuehling <Felix.Kuehling@amd.com> Reviewed-by: Oded Gabbay <oded.gabbay@gmail.com> Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com>	2018-01-04 17:17:41 -05:00
Gustavo A. R. Silva	3f866f5f04	drm/amdkfd: Use ARRAY_SIZE macro in kfd_build_sysfs_node_entry Use ARRAY_SIZE instead of dividing sizeof array with sizeof an element. This issue was detected with the help of Coccinelle. Signed-off-by: Gustavo A. R. Silva <garsilva@embeddedor.com> Reviewed-by: Felix Kuehling<Felix.Kuehling@amd.com> Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com>	2018-01-18 18:39:55 -06:00
Yong Zhao	c0ede1f8dc	drm/amdkfd: Simplify locking during process creation Also fixes error handling if kfd_process_init_cwsr fails. Signed-off-by: Yong Zhao <yong.zhao@amd.com> Signed-off-by: Felix Kuehling <Felix.Kuehling@amd.com> Reviewed-by: Oded Gabbay <oded.gabbay@gmail.com> Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com>	2017-11-27 18:29:56 -05:00
Felix Kuehling	de1450a559	drm/amdkfd: Factor PDD destruction out of kfd_process_wq_release Signed-off-by: Felix Kuehling <Felix.Kuehling@amd.com> Reviewed-by: Oded Gabbay <oded.gabbay@gmail.com> Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com>	2017-11-27 18:29:55 -05:00
Felix Kuehling	2d9b36f983	drm/amdkfd: Reduce nesting in kfd_create_process_device_data Signed-off-by: Felix Kuehling <Felix.Kuehling@amd.com> Reviewed-by: Oded Gabbay <oded.gabbay@gmail.com> Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com>	2017-11-27 18:29:54 -05:00
Yong Zhao	82c16b4280	drm/amdkfd: Return NULL if kfd_lookup_process_by_pasid fails If no matching process is found, return NULL instead of a pointer to the last process in the kfd_processes_table. Signed-off-by: Yong Zhao <yong.zhao@amd.com> Signed-off-by: Felix Kuehling <Felix.Kuehling@amd.com> Reviewed-by: Oded Gabbay <oded.gabbay@gmail.com> Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com>	2017-11-27 18:29:53 -05:00
Felix Kuehling	abb208a8d4	drm/amdkfd: Use ref count to prevent kfd_process destruction Use a reference counter instead of a lock to prevent process destruction while functions running out of process context are using the kfd_process structure. In many cases these functions don't need the structure to be locked. In the few cases that really do need the process lock, take it explicitly. This helps simplify lock dependencies between the process lock and other locks, particularly amdgpu and mm_struct locks. This will be important when amdgpu calls back to amdkfd for memory evictions. Signed-off-by: Felix Kuehling <Felix.Kuehling@amd.com> Acked-by: Christian König <christian.koenig@amd.com> Reviewed-by: Oded Gabbay <oded.gabbay@gmail.com> Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com>	2017-11-27 18:29:52 -05:00
Felix Kuehling	5ce10687ae	drm/amdkfd: Make kfd_process reference counted This will be used to eliminate the use of the process lock for preventing concurrent process destruction. This will simplify lock dependencies between KFD and KGD. This also simplifies the process destruction in a few ways: * Don't allocate work struct dynamically * Remove unnecessary hack that increments mm reference counter * Remove unnecessary process locking during destruction Signed-off-by: Felix Kuehling <Felix.Kuehling@amd.com> Reviewed-by: Oded Gabbay <oded.gabbay@gmail.com> Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com>	2017-11-27 18:29:51 -05:00
Felix Kuehling	c7b1243eef	drm/amdkfd: Get reference to lead_thread task struct Increment the kfd_process.lead_thread's reference counter to make it safe to dereference. This is needed for getting a safe reference to the process' mm_struct. Signed-off-by: Felix Kuehling <Felix.Kuehling@amd.com> Reviewed-by: Oded Gabbay <oded.gabbay@gmail.com> Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com>	2017-11-27 18:29:50 -05:00
Felix Kuehling	851a645efd	drm/amdkfd: Add debugfs support to KFD This commit adds several debugfs entries for kfd: kfd/hqds: dumps all HQDs on all GPUs for KFD-controlled compute and SDMA RLC queues kfd/mqds: dumps all MQDs of all KFD processes on all GPUs kfd/rls: dumps HWS runlists on all GPUs Signed-off-by: Yong Zhao <yong.zhao@amd.com> Signed-off-by: Felix Kuehling <Felix.Kuehling@amd.com> Reviewed-by: Oded Gabbay <oded.gabbay@gmail.com> Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com>	2017-11-27 18:29:49 -05:00
Felix Kuehling	36582fa516	drm/amdkfd: Fix oversubscription accounting Don't count SDMA queues towards compute HQD oversubscription when deciding whether to create a chained runlist. Signed-off-by: Jay Cornwall <Jay.Cornwall@amd.com> Signed-off-by: Felix Kuehling <Felix.Kuehling@amd.com> Acked-by: Oded Gabbay <oded.gabbay@gmail.com> Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com>	2017-11-27 18:29:46 -05:00
Felix Kuehling	a99c6d4fdc	drm/amdkfd: map multiple processes to HW scheduler Allow HWS to execute multiple processes on the hardware concurrently. The number of concurrent processes is limited by the number of VMIDs allocated to the HWS. A module parameter can be used for limiting this further or turning it off altogether (mainly for debugging purposes). Signed-off-by: Yong Zhao <yong.zhao@amd.com> Signed-off-by: Jay Cornwall <Jay.Cornwall@amd.com> Signed-off-by: Felix Kuehling <Felix.Kuehling@amd.com> Acked-by: Oded Gabbay <oded.gabbay@gmail.com> Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com>	2017-11-27 18:29:45 -05:00
Kent Russell	8f8fb9b9d0	drm/amdkfd: Fix printing pointer cast Just print a pointer instead of casting v2: Remove the 0x prefix, since %p prints that automatically, and remove it from one other spot as well Signed-off-by: Kent Russell <kent.russell@amd.com> Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com> Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com>	2017-12-04 06:50:17 -05:00
Philip Yang	3c0b428090	drm/amdkfd: Add crash protection in debugger register path After debugger is registered, the pqm_destroy_queue fails because is_debug is true, the queue should not be removed from process_queue_list since the count is not reduced. Test application calls debugger unregister without register debugger, add null pointer check protection to avoid crash for this case Signed-off-by: Philip Yang <Philip.Yang@amd.com> Signed-off-by: Felix Kuehling <Felix.Kuehling@amd.com> Reviewed-by: Oded Gabbay <oded.gabbay@gmail.com> Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com>	2017-11-27 18:29:44 -05:00
Yong Zhao	b46cb7d70e	drm/amdkfd: Delete a useless parameter from create_queue function pointer Signed-off-by: Yong Zhao <yong.zhao@amd.com> Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com> Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com>	2017-11-24 18:10:54 -05:00
Felix Kuehling	d7b9bd2248	drm/amdkfd: Add support for user-mode trap handlers A second-level user mode trap handler can be installed. The CWSR trap handler jumps to the secondary trap handler conditionally for any conditions not handled by it. This can be used e.g. for debugging or catching math exceptions. When CWSR is disabled, the user mode trap handler is installed as first level trap handler. Signed-off-by: Shaoyun.liu <shaoyun.liu@amd.com> Signed-off-by: Jay Cornwall <Jay.Cornwall@amd.com> Signed-off-by: Felix Kuehling <Felix.Kuehling@amd.com> Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com>	2017-11-14 16:41:20 -05:00
Felix Kuehling	373d708089	drm/amdkfd: Add CWSR support This hardware feature allows the GPU to preempt shader execution in the middle of a compute wave, save the state and restore it later to resume execution. Memory for saving the state is allocated per queue in user mode and the address and size passed to the create_queue ioctl. The size depends on the number of waves that can be in flight simultaneously on a given ASIC. Signed-off-by: Shaoyun.liu <shaoyun.liu@amd.com> Signed-off-by: Yong Zhao <yong.zhao@amd.com> Signed-off-by: Felix Kuehling <Felix.Kuehling@amd.com> Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com>	2017-11-14 16:41:19 -05:00
Felix Kuehling	449fea6126	drm/amdkfd: Add trap handler for CWSR The trap handler is like an interrupt handler running on the GPU compute unit. It is needed for supporting CWSR (compute wave save/restore). This file defines an array with the pre-compiled GFXv8 shader ISA. The assembly code is included for reference in #if 0 ... #endif. Signed-off-by: Shaoyun.liu <shaoyun.liu@amd.com> Signed-off-by: Jay Cornwall <Jay.Cornwall@amd.com> Signed-off-by: Felix Kuehling <Felix.Kuehling@amd.com> Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com>	2017-11-14 16:41:18 -05:00
Felix Kuehling	b20cd0df15	drm/amdkfd: Cleanup qpd.pqm initialization The PQM doesn't change after process creation. So initialize it in kfd_create_process_device_data. Signed-off-by: Felix Kuehling <Felix.Kuehling@amd.com> Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com>	2017-11-14 16:41:17 -05:00
Felix Kuehling	115c8c4104	drm/amdkfd: Use order_base_2 to get log2 of buffes sizes Replace (ffs(size) - 1) with order_base_2(size) as a more straight forward way to get log2 of buffer sizes. Signed-off-by: Felix Kuehling <Felix.Kuehling@amd.com> Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com>	2017-11-06 14:52:28 -05:00
Felix Kuehling	6d56693025	drm/amdkfd: Hardware DWORD size is 4 bytes Don't use sizeof(uint32_t) or similar types for hardware or firmware DWORD size. The hardware and firmware don't care about Linux types. Signed-off-by: Felix Kuehling <Felix.Kuehling@amd.com> Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com>	2017-11-06 14:52:27 -05:00
Philip Cox	5aaf2befd4	drm/amdkfd: Implement amdkfd SDMA functions for VI Signed-off-by: Philip Cox <Philip.Cox@amd.com> Signed-off-by: shaoyun liu <shaoyun.liu@amd.com> Signed-off-by: Jay Cornwall <Jay.Cornwall@amd.com> Signed-off-by: Felix Kuehling <Felix.Kuehling@amd.com> Reviewed-by: Oded Gabbay <oded.gabbay@gmail.com> Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com>	2017-11-01 19:22:02 -04:00
Felix Kuehling	97b9ad12ba	drm/amdkfd: Use ASIC-specific SDMA MQD type Signed-off-by: shaoyun liu <shaoyun.liu@amd.com> Signed-off-by: Felix Kuehling <Felix.Kuehling@amd.com> Reviewed-by: Oded Gabbay <oded.gabbay@gmail.com> Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com>	2017-11-01 19:22:01 -04:00
Felix Kuehling	7ce66118aa	drm/amd: Update kgd_kfd interface for resuming SDMA queues Add wptr and mm parameters to hqd_sdma_load and pass these parameters from device_queue_manager through the mqd_manager. SDMA doesn't support polling while the engine believes it's idle. The driver must update the wptr. The new parameters will be used for looking up the updated value from the specified mm when SDMA queues are resumed after being disabled. Signed-off-by: Felix Kuehling <Felix.Kuehling@amd.com> Reviewed-by: Oded Gabbay <oded.gabbay@gmail.com> Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com>	2017-11-01 19:21:58 -04:00
Alex Deucher	e2874a3c8c	drm/amdgpu: add license to Makefiles Was missing license text. Acked-by: Harry Wentland <harry.wentland@amd.com> Acked-by: Felix Kuehling <Felix.Kuehling@amd.com> Acked-by: Christian König <christian.koenig@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2017-12-04 11:47:55 -05:00
Dave Airlie	662e704007	Merge tag 'drm-amdkfd-fixes-2017-11-26' of git://people.freedesktop.org/~gabbayo/linux into drm-fixes This is amdkfd pull request for -rc2. It contains three small fixes to the CIK SDMA code, compilation error fix in kfd_ioctl.h and fix to accessing a pointer after it was released. * tag 'drm-amdkfd-fixes-2017-11-26' of git://people.freedesktop.org/~gabbayo/linux: uapi: fix linux/kfd_ioctl.h userspace compilation errors drm/amdkfd: fix amdkfd use-after-free GP fault drm/amdkfd: Fix SDMA oversubsription handling drm/amdkfd: Fix SDMA ring buffer size calculation drm/amdgpu: Fix SDMA load/unload sequence on HWS disabled mode	2017-12-01 09:14:46 +10:00
Randy Dunlap	c393e9b2d5	drm/amdkfd: fix amdkfd use-after-free GP fault Fix GP fault caused by dev_info() reference to a struct device* after the device has been freed (use after free). kfd_chardev_exit() frees the device so 'kfd_device' should not be used after calling kfd_chardev_exit(). Signed-off-by: Randy Dunlap <rdunlap@infradead.org> Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com>	2017-11-26 11:31:32 +02:00

616 Commits