Commit Graph

104 Commits

Author SHA1 Message Date
Laura Abbott
af47b39027 drm/amdkfd: Remove vla
There's an ongoing effort to remove VLAs[1] from the kernel to eventually
turn on -Wvla. Switch to a constant value that covers all hardware.

[1] https://lkml.org/lkml/2018/3/7/621

Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com>
Acked-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Laura Abbott <labbott@redhat.com>
Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com>
2018-04-13 14:24:12 -07:00
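The VLA removal in af47b39027 can be illustrated with a minimal user-space sketch. The commit message does not name the array or the constant, so `MAX_WATCH_POINTS`, its value, and `sum_watch_points` are hypothetical stand-ins; the point is only the pattern of sizing the array for the largest supported hardware and bounds-checking the runtime count.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical upper bound that "covers all hardware"; the real
 * constant and its value are not given in the commit message. */
#define MAX_WATCH_POINTS 4

/* Before (variable-length array, rejected by -Wvla):
 *     uint32_t values[num_points];
 * After: a fixed-size array plus an explicit bounds check. */
int sum_watch_points(const uint32_t *in, size_t num_points, uint64_t *sum)
{
    uint32_t values[MAX_WATCH_POINTS];  /* constant size, no VLA */
    size_t i;

    assert(in != NULL && sum != NULL);
    if (num_points > MAX_WATCH_POINTS)
        return -1;                      /* would overflow the fixed array */

    *sum = 0;
    for (i = 0; i < num_points; i++) {
        values[i] = in[i];
        *sum += values[i];
    }
    return 0;
}
```

The trade-off is a slightly larger stack frame on hardware that needs fewer entries, in exchange for a stack size that is known at compile time.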
Felix Kuehling
9d7d024816 drm/amdkfd: Add 64-bit doorbell and wptr support to kernel queue
v2: Removed redundant 0x before %p.

Signed-off-by: Felix Kuehling <Felix.Kuehling@amd.com>
Reviewed-by: Oded Gabbay <oded.gabbay@gmail.com>
Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com>
2018-04-08 22:03:51 -04:00
Felix Kuehling
ca750681bc drm/amdkfd: Add SOC15 interrupt processing support
Signed-off-by: Shaoyun Liu <Shaoyun.Liu@amd.com>
Signed-off-by: Oak Zeng <Oak.Zeng@amd.com>
Signed-off-by: Felix Kuehling <Felix.Kuehling@amd.com>
Reviewed-by: Oded Gabbay <oded.gabbay@gmail.com>
Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com>
2018-04-10 17:33:10 -04:00
Felix Kuehling
bed4f11025 drm/amdkfd: Add GFXv9 device queue manager
Signed-off-by: John Bridgman <john.bridgman@amd.com>
Signed-off-by: Felix Kuehling <Felix.Kuehling@amd.com>
Reviewed-by: Oded Gabbay <oded.gabbay@gmail.com>
Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com>
2018-04-10 17:33:09 -04:00
Felix Kuehling
b91d43dd01 drm/amdkfd: Add GFXv9 MQD manager
Signed-off-by: John Bridgman <john.bridgman@amd.com>
Signed-off-by: Jay Cornwall <Jay.Cornwall@amd.com>
Signed-off-by: Felix Kuehling <Felix.Kuehling@amd.com>
Reviewed-by: Oded Gabbay <oded.gabbay@gmail.com>
Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com>
2018-04-10 17:33:08 -04:00
Felix Kuehling
454150b1f9 drm/amdkfd: Add GFXv9 PM4 packet writer functions
Signed-off-by: Shaoyun Liu <Shaoyun.Liu@amd.com>
Signed-off-by: Felix Kuehling <Felix.Kuehling@amd.com>
Reviewed-by: Oded Gabbay <oded.gabbay@gmail.com>
Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com>
2018-04-10 17:33:07 -04:00
Felix Kuehling
f6e27ff19d drm/amdkfd: Move packet writer functions into ASIC-specific file
This is in preparation for GFXv9 (Vega10), which uses PM4 packet formats
that are incompatible with previous ASIC generations.

Signed-off-by: Shaoyun Liu <Shaoyun.Liu@amd.com>
Signed-off-by: Felix Kuehling <Felix.Kuehling@amd.com>
Reviewed-by: Oded Gabbay <oded.gabbay@gmail.com>
Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com>
2018-04-10 17:33:06 -04:00
Felix Kuehling
ef568db792 drm/amdkfd: Implement doorbell allocation for SOC15
Allocate doorbells according to the doorbell routing information on
SOC15 ASICs (Vega10 and later). On older ASICs we continue to use the
queue_id as the doorbell ID to maintain compatibility with the Thunk.

Signed-off-by: Shaoyun Liu <Shaoyun.Liu@amd.com>
Signed-off-by: Felix Kuehling <Felix.Kuehling@amd.com>
Reviewed-by: Oded Gabbay <oded.gabbay@gmail.com>
Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com>
2018-04-10 17:33:05 -04:00
Harish Kasiviswanathan
df03ef9342 drm/amdkfd: Clean up KFD_MMAP_ offset handling
Use bit-rotate for better clarity and remove _MASK from the #defines as
these represent mmap types.

Centralize all the parsing of the mmap offset in kfd_mmap and add device
parameter to doorbell and reserved_mem map functions.

Encode gpu_id into upper bits of vm_pgoff. This frees up the lower bits
for encoding the doorbell ID on Vega10.

Signed-off-by: Harish Kasiviswanathan <Harish.Kasiviswanathan@amd.com>
Signed-off-by: Felix Kuehling <Felix.Kuehling@amd.com>
Reviewed-by: Oded Gabbay <oded.gabbay@gmail.com>
Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com>
2018-04-10 17:33:04 -04:00
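The offset encoding described in df03ef9342 might look roughly like the sketch below. The bit positions, field widths, and every `MMAP_*` and `mmap_*` name are assumptions chosen for illustration; the sketch also works on a 64-bit byte offset rather than on `vm_pgoff`, which is counted in pages.

```c
#include <assert.h>
#include <stdint.h>

/* Assumed layout: mmap type in bits 62..63, gpu_id in bits 46..61,
 * everything below bit 46 free for other data (e.g. a doorbell ID). */
#define MMAP_TYPE_SHIFT     62
#define MMAP_TYPE_MASK      (0x3ULL << MMAP_TYPE_SHIFT)
#define MMAP_TYPE_DOORBELL  (0x3ULL << MMAP_TYPE_SHIFT)
#define MMAP_TYPE_EVENTS    (0x2ULL << MMAP_TYPE_SHIFT)
#define MMAP_GPU_ID_SHIFT   46
#define MMAP_GPU_ID_MASK    (0xFFFFULL << MMAP_GPU_ID_SHIFT)

/* Pack type, gpu_id and the low payload into one offset. */
static inline uint64_t mmap_encode(uint64_t type, uint16_t gpu_id, uint64_t low)
{
    assert((low & (MMAP_TYPE_MASK | MMAP_GPU_ID_MASK)) == 0);
    return type | ((uint64_t)gpu_id << MMAP_GPU_ID_SHIFT) | low;
}

/* Centralized parsing: each field is extracted in exactly one place. */
static inline uint64_t mmap_type(uint64_t off)
{
    return off & MMAP_TYPE_MASK;
}

static inline uint16_t mmap_gpu_id(uint64_t off)
{
    return (uint16_t)((off & MMAP_GPU_ID_MASK) >> MMAP_GPU_ID_SHIFT);
}

static inline uint64_t mmap_low(uint64_t off)
{
    return off & ~(MMAP_TYPE_MASK | MMAP_GPU_ID_MASK);
}
```

Because the type constants are already shifted into place, they are compared directly against `mmap_type(off)`, which is one way to read the commit's remark that these defines are types rather than masks.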
Felix Kuehling
ada2b29c4a drm/amdkfd: Make doorbell size ASIC-dependent
This prepares for GFXv9 (Vega10), which has 64-bit doorbells.

Signed-off-by: Felix Kuehling <Felix.Kuehling@amd.com>
Reviewed-by: Oded Gabbay <oded.gabbay@gmail.com>
Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com>
2018-04-10 17:33:03 -04:00
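To make the effect of ada2b29c4a concrete, the following sketch derives doorbell offsets from a per-ASIC doorbell size. The enum and function names are hypothetical, and the 4-byte size for older ASICs is an assumption; the commit message only states that GFXv9 has 64-bit doorbells.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical ASIC classification for illustration only. */
enum asic_family { ASIC_PRE_GFX9, ASIC_GFX9 };

/* Doorbell size in bytes: 8 on GFXv9 (64-bit doorbells), assumed 4 before. */
size_t doorbell_size(enum asic_family f)
{
    return f == ASIC_GFX9 ? 8 : 4;
}

/* Byte offset of doorbell `index` within the doorbell aperture. */
size_t doorbell_offset(enum asic_family f, unsigned int index)
{
    return (size_t)index * doorbell_size(f);
}

/* Number of doorbells that fit in one page of the given size. */
unsigned int doorbells_per_page(enum asic_family f, size_t page_size)
{
    assert(page_size > 0);
    return (unsigned int)(page_size / doorbell_size(f));
}
```

Any code that previously hard-coded the doorbell size would, under this sketch, call `doorbell_size()` instead.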
Felix Kuehling
6b95e7973a drm/amdkfd: Add quiesce_mm and resume_mm to kgd2kfd_calls
These interfaces allow KGD to stop and resume all GPU user mode queue
access to a process address space. This is needed for handling MMU
notifiers of userptrs mapped for GPU access in KFD VMs.

Signed-off-by: Felix Kuehling <Felix.Kuehling@amd.com>
Reviewed-by: Oded Gabbay <oded.gabbay@gmail.com>
Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com>
2018-03-23 15:32:32 -04:00
Felix Kuehling
1679ae8f8f drm/amdkfd: Use ordered workqueue to restore processes
Restoring multiple processes concurrently can lead to live-locks
where each process prevents the other from validating all its BOs.

v2: fix duplicate check of same variable

Signed-off-by: Felix Kuehling <Felix.Kuehling@amd.com>
Reviewed-by: Oded Gabbay <oded.gabbay@gmail.com>
Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com>
2018-03-23 15:30:36 -04:00
Felix Kuehling
374200b154 drm/amdkfd: Add module option for testing large-BAR functionality
Simulate large-BAR system by exporting only visible memory. This
limits the amount of available VRAM to the size of the BAR, but
enables CPU access to VRAM.

Signed-off-by: Felix Kuehling <Felix.Kuehling@amd.com>
Acked-by: Oded Gabbay <oded.gabbay@gmail.com>
Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com>
2018-03-15 17:27:53 -04:00
Felix Kuehling
0fc8011f89 drm/amdkfd: Kmap event page for dGPUs
The events page must be accessible in user mode by the GPU and CPU
as well as in kernel mode by the CPU. On dGPUs user mode virtual
addresses are managed by the Thunk's GPU memory allocation code.
Therefore we can't allocate the memory in kernel mode like we do
on APUs. But KFD still needs to map the memory for kernel access.
To facilitate this, the Thunk provides the buffer handle of the
events page to KFD when creating the first event.

Signed-off-by: Felix Kuehling <Felix.Kuehling@amd.com>
Acked-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com>
2018-03-15 17:27:52 -04:00
Felix Kuehling
5ec7e02854 drm/amdkfd: Add ioctls for GPUVM memory management
v2:
* Fix error handling after kfd_bind_process_to_device in
  kfd_ioctl_map_memory_to_gpu
v3:
* Add ioctl to acquire VM from a DRM FD
v4:
* Return number of successful map/unmap operations in failure cases
* Facilitate partial retry after failed map/unmap
* Added comments with parameter descriptions to new APIs
* Defined AMDKFD_IOC_FREE_MEMORY_OF_GPU write-only

Signed-off-by: Felix Kuehling <Felix.Kuehling@amd.com>
Acked-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com>
2018-03-15 17:27:51 -04:00
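The v4 notes of 5ec7e02854 mention returning the number of successful map/unmap operations so that a failed call can be retried partially. A minimal sketch of that calling convention follows; `map_to_devices`, `map_fn`, and `demo_map` are hypothetical names and do not reflect the real ioctl structures.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical per-device operation: 0 on success, negative on failure. */
typedef int (*map_fn)(unsigned int device_id);

/* Map devices[*n_success .. n_devices-1] in order. On failure the
 * function returns the error and leaves *n_success at the index of the
 * failed device, so the caller can retry from there. */
int map_to_devices(const unsigned int *devices, size_t n_devices,
                   size_t *n_success, map_fn map_one)
{
    assert(devices != NULL && n_success != NULL && map_one != NULL);
    while (*n_success < n_devices) {
        int err = map_one(devices[*n_success]);
        if (err)
            return err;
        (*n_success)++;
    }
    return 0;
}

/* Demonstration callback: fails exactly once, on device 2. */
static int demo_fail_once = 1;

int demo_map(unsigned int device_id)
{
    if (device_id == 2 && demo_fail_once) {
        demo_fail_once = 0;
        return -1;
    }
    return 0;
}
```

A caller starts with `*n_success = 0`, and after a failure simply calls the function again with the same arguments; devices that were already mapped are skipped.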
Felix Kuehling
552764b680 drm/amdkfd: Add TC flush on VMID deallocation for Hawaii
On GFX7 the CP does not perform a TC flush when queues are unmapped.
To avoid TC eviction from accessing an invalid VMID, flush it
explicitly before releasing a VMID.

v2: Fix unnecessary list_for_each_entry_safe
v3: Moved allocation to kfd_process_device_init_vm

Signed-off-by: Amber Lin <Amber.Lin@amd.com>
Signed-off-by: Felix Kuehling <Felix.Kuehling@amd.com>
Acked-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com>
2018-03-15 17:27:50 -04:00
Felix Kuehling
52b29d7334 drm/amdkfd: Add per-process IDR for buffer handles
Also used for cleaning up on process termination.

v2: Refactored cleanup on process termination

Signed-off-by: Felix Kuehling <Felix.Kuehling@amd.com>
Acked-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com>
2018-03-15 17:27:48 -04:00
Felix Kuehling
d01994c24c drm/amdkfd: Aperture setup for dGPUs
Set up the GPUVM aperture for SVM (shared virtual memory) that allows
sharing a part of virtual address space between GPUs and CPUs.

Report the size of the GPUVM aperture that is supported by KGD accurately.

The low part of the GPUVM aperture is reserved for kernel use. This is
for kernel-allocated buffers that are only accessed on the GPU:
- CWSR trap handler
- IB for submitting commands in user-mode context from kernel mode

Signed-off-by: Felix Kuehling <Felix.Kuehling@amd.com>
Acked-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com>
2018-03-15 17:27:47 -04:00
Felix Kuehling
b84394e206 drm/amdkfd: Create KFD VMs on demand
Instead of creating all VMs on process creation, create them when
a process is bound to a device. This will later allow registering
an existing VM from a DRM render node FD at runtime, before the
process is bound to the device. This way the render node VM can be
used for KFD instead of creating our own redundant VM.

Signed-off-by: Felix Kuehling <Felix.Kuehling@amd.com>
Acked-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com>
2018-03-15 17:27:44 -04:00
Felix Kuehling
26103436da drm/amdkfd: Implement KFD process eviction/restore
When the TTM memory manager in KGD evicts BOs, all user mode queues
potentially accessing these BOs must be evicted temporarily. Once
user mode queues are evicted, the eviction fence is signaled,
allowing the migration of the BO to proceed.

A delayed worker is scheduled to restore all the BOs belonging to
the evicted process and restart its queues.

During suspend/resume of the GPU we also evict all processes to allow
KGD to save BOs in system memory, since VRAM will be lost.

v2:
* Account for eviction when updating q->is_active in MQD manager

Signed-off-by: Harish Kasiviswanathan <Harish.Kasiviswanathan@amd.com>
Signed-off-by: Felix Kuehling <Felix.Kuehling@amd.com>
Acked-by: Oded Gabbay <oded.gabbay@gmail.com>
Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com>
2018-02-06 20:32:45 -05:00
Felix Kuehling
403575c44e drm/amdkfd: Add GPUVM virtual address space to PDD
Create/destroy the GPUVM context during PDD creation/destruction.
Get VM page table base and program it during process registration
(HWS) or VMID allocation (non-HWS).

v2:
* Used dev instead of pdd->dev in kfd_flush_tlb

Signed-off-by: Felix Kuehling <Felix.Kuehling@amd.com>
Acked-by: Oded Gabbay <oded.gabbay@gmail.com>
Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com>
2018-02-06 20:32:44 -05:00
Felix Kuehling
64d1c3a43a drm/amdkfd: Centralize IOMMUv2 code and make it conditional
dGPUs work without IOMMUv2. Make IOMMUv2 initialization dependent on
ASIC information. Also allow building KFD without IOMMUv2 support.
This is still useful for dGPUs and prepares for enabling KFD on
architectures that don't support AMD IOMMUv2.

v2:
* Centralize IOMMUv2 code to avoid #ifdefs in too many places

v3:
* Imply AMD_IOMMU_V2 in Kconfig

Signed-off-by: Felix Kuehling <Felix.Kuehling@amd.com>
Acked-by: Christian Konig <christian.koenig@amd.com>
Reviewed-by: Oded Gabbay <oded.gabbay@gmail.com>
Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com>
2017-12-08 19:22:12 -05:00
Felix Kuehling
ee04955af6 drm/amdkfd: Add dGPU support to the MQD manager
On dGPUs don't set ATC addressing bits and use MTYPE_UC for coherent
memory.

Signed-off-by: Felix Kuehling <Felix.Kuehling@amd.com>
Acked-by: Oded Gabbay <oded.gabbay@gmail.com>
Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com>
2018-01-04 17:17:45 -05:00
Felix Kuehling
3ee2d00cfb drm/amdkfd: Conditionally enable PCIe atomics
This will be needed for most dGPUs.

CC: linux-pci@vger.kernel.org
Signed-off-by: Felix Kuehling <Felix.Kuehling@amd.com>
Reviewed-by: Oded Gabbay <oded.gabbay@gmail.com>
Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com>
2018-01-04 17:17:41 -05:00
Oded Gabbay
a1235e10ee drm/amdkfd: add ull suffix to 64bit defines
Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com>
Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com>
2018-01-10 12:55:17 +02:00
Felix Kuehling
ebcfd1e276 drm/amdkfd: Module option to disable CRAT table
Some systems have broken CRAT tables. Add a module option to ignore
a CRAT table.

Signed-off-by: Felix Kuehling <Felix.Kuehling@amd.com>
Reviewed-by: Oded Gabbay <oded.gabbay@gmail.com>
Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com>
2017-12-08 23:09:03 -05:00
Harish Kasiviswanathan
3a87177eb1 drm/amdkfd: Add topology support for dGPUs
Generate and parse VCRAT tables for dGPUs in kfd_topology_add_device.

Some information that isn't available in the CRAT table is patched
into the topology after parsing.

HSA_CAP_DOORBELL_TYPE_1_0 is dependent on the ASIC feature
CP_HQD_PQ_CONTROL.SLOT_BASED_WPTR, which was not introduced in VI
until Carrizo. Report HSA_CAP_DOORBELL_TYPE_PRE_1_0 on Tonga ASICs.

v2: Added #include <linux/pci.h> to kfd_crat.c to make it compile

Signed-off-by: Harish Kasiviswanathan <Harish.Kasiviswanathan@amd.com>
Signed-off-by: Ben Goz <ben.goz@amd.com>
Signed-off-by: Amber Lin <Amber.Lin@amd.com>
Signed-off-by: Jay Cornwall <Jay.Cornwall@amd.com>
Signed-off-by: Kent Russell <kent.russell@amd.com>
Signed-off-by: Felix Kuehling <Felix.Kuehling@amd.com>
Acked-by: Oded Gabbay <oded.gabbay@gmail.com>
Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com>
2017-12-08 23:08:59 -05:00
Felix Kuehling
520b8fb755 drm/amdkfd: Add topology support for CPUs
Currently, the KFD topology information is generated by parsing the CRAT
(ACPI) table. However, at present the CRAT table is available only for AMD
APUs. To support CPUs on systems without a CRAT table, the KFD driver will
create a Virtual CRAT (VCRAT) table and then the existing code will parse
that table to generate topology.

Signed-off-by: Harish Kasiviswanathan <Harish.Kasiviswanathan@amd.com>
Signed-off-by: Felix Kuehling <Felix.Kuehling@amd.com>
Reviewed-by: Oded Gabbay <oded.gabbay@gmail.com>
Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com>
2017-12-08 23:08:58 -05:00
Harish Kasiviswanathan
6d82eb0ef2 drm/amdkfd: Support enumerating non-GPU devices
Modify kfd_topology_enum_kfd_devices(..) function to support non-GPU
nodes. The function returned NULL when it encountered non-GPU (say CPU)
nodes. This caused kfd_ioctl_create_event and kfd_init_apertures to fail
for Intel + Tonga.

kfd_topology_enum_kfd_devices will now parse all the nodes and return
valid kfd_dev for nodes with GPU.

Signed-off-by: Harish Kasiviswanathan <Harish.Kasiviswanathan@amd.com>
Signed-off-by: Felix Kuehling <Felix.Kuehling@amd.com>
Reviewed-by: Oded Gabbay <oded.gabbay@gmail.com>
Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com>
2017-12-08 23:08:53 -05:00
Felix Kuehling
abb208a8d4 drm/amdkfd: Use ref count to prevent kfd_process destruction
Use a reference counter instead of a lock to prevent process
destruction while functions running out of process context are using
the kfd_process structure. In many cases these functions don't need
the structure to be locked. In the few cases that really do need the
process lock, take it explicitly.

This helps simplify lock dependencies between the process lock and
other locks, particularly amdgpu and mm_struct locks. This will be
important when amdgpu calls back to amdkfd for memory evictions.

Signed-off-by: Felix Kuehling <Felix.Kuehling@amd.com>
Acked-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Oded Gabbay <oded.gabbay@gmail.com>
Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com>
2017-11-27 18:29:52 -05:00
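The idea behind abb208a8d4, keeping an object alive with a reference count instead of holding a lock, can be sketched in user space with C11 atomics. The kernel has its own reference-counting primitives, so `struct process` and the `process_*` functions here are illustrative stand-ins, not the driver's code.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdlib.h>

struct process {
    atomic_int ref;  /* number of outstanding references */
    int pasid;       /* hypothetical payload */
};

/* Create an object holding one reference owned by the caller. */
struct process *process_create(int pasid)
{
    struct process *p = malloc(sizeof(*p));

    if (p == NULL)
        return NULL;
    atomic_init(&p->ref, 1);
    p->pasid = pasid;
    return p;
}

/* Take an extra reference, e.g. before using p outside process context. */
void process_ref(struct process *p)
{
    int old = atomic_fetch_add(&p->ref, 1);

    assert(old > 0);  /* must already hold a valid reference */
    (void)old;
}

/* Drop a reference; frees the object and returns 1 when it was the last. */
int process_unref(struct process *p)
{
    int old = atomic_fetch_sub(&p->ref, 1);

    assert(old > 0);
    if (old == 1) {
        free(p);
        return 1;
    }
    return 0;
}
```

A worker that holds its own reference can read the structure without taking the process lock, and takes that lock only when it needs mutual exclusion, not merely to keep the object alive.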
Felix Kuehling
5ce10687ae drm/amdkfd: Make kfd_process reference counted
This will be used to eliminate the use of the process lock for
preventing concurrent process destruction. This will simplify lock
dependencies between KFD and KGD.

This also simplifies the process destruction in a few ways:
* Don't allocate work struct dynamically
* Remove unnecessary hack that increments mm reference counter
* Remove unnecessary process locking during destruction

Signed-off-by: Felix Kuehling <Felix.Kuehling@amd.com>
Reviewed-by: Oded Gabbay <oded.gabbay@gmail.com>
Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com>
2017-11-27 18:29:51 -05:00
Felix Kuehling
851a645efd drm/amdkfd: Add debugfs support to KFD
This commit adds several debugfs entries for kfd:

kfd/hqds: dumps all HQDs on all GPUs for KFD-controlled compute and
    SDMA RLC queues

kfd/mqds: dumps all MQDs of all KFD processes on all GPUs

kfd/rls: dumps HWS runlists on all GPUs

Signed-off-by: Yong Zhao <yong.zhao@amd.com>
Signed-off-by: Felix Kuehling <Felix.Kuehling@amd.com>
Reviewed-by: Oded Gabbay <oded.gabbay@gmail.com>
Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com>
2017-11-27 18:29:49 -05:00
Felix Kuehling
a99c6d4fdc drm/amdkfd: map multiple processes to HW scheduler
Allow HWS to execute multiple processes on the hardware
concurrently. The number of concurrent processes is limited by
the number of VMIDs allocated to the HWS.

A module parameter can be used to limit this further or to turn
it off altogether (mainly for debugging purposes).

Signed-off-by: Yong Zhao <yong.zhao@amd.com>
Signed-off-by: Jay Cornwall <Jay.Cornwall@amd.com>
Signed-off-by: Felix Kuehling <Felix.Kuehling@amd.com>
Acked-by: Oded Gabbay <oded.gabbay@gmail.com>
Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com>
2017-11-27 18:29:45 -05:00
Felix Kuehling
373d708089 drm/amdkfd: Add CWSR support
This hardware feature allows the GPU to preempt shader execution in
the middle of a compute wave, save the state and restore it later
to resume execution.

Memory for saving the state is allocated per queue in user mode and
the address and size passed to the create_queue ioctl. The size
depends on the number of waves that can be in flight simultaneously
on a given ASIC.

Signed-off-by: Shaoyun.liu <shaoyun.liu@amd.com>
Signed-off-by: Yong Zhao <yong.zhao@amd.com>
Signed-off-by: Felix Kuehling <Felix.Kuehling@amd.com>
Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com>
2017-11-14 16:41:19 -05:00
Felix Kuehling
97b9ad12ba drm/amdkfd: Use ASIC-specific SDMA MQD type
Signed-off-by: shaoyun liu <shaoyun.liu@amd.com>
Signed-off-by: Felix Kuehling <Felix.Kuehling@amd.com>
Reviewed-by: Oded Gabbay <oded.gabbay@gmail.com>
Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com>
2017-11-01 19:22:01 -04:00
Felix Kuehling
894a8293aa drm/amdkfd: Minor cleanups
These were missed previously when rebasing changes for upstreaming.

v2: Remove redundant sched_policy conditions

Signed-off-by: Felix Kuehling <Felix.Kuehling@amd.com>
Reviewed-by: Oded Gabbay <oded.gabbay@gmail.com>
Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com>
2017-11-01 19:21:33 -04:00
Yong Zhao
ab40cba303 drm/amdkfd: Clean up the data structure in kfd_process
A list of per-process queues is maintained in the
kfd_process_queue_manager, so the queues array in kfd_process is
redundant and in fact unused.

Signed-off-by: Yong Zhao <yong.zhao@amd.com>
Signed-off-by: Felix Kuehling <Felix.Kuehling@amd.com>
Reviewed-by: Oded Gabbay <oded.gabbay@gmail.com>
Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com>
2017-11-01 19:21:26 -04:00
Andres Rodriguez
48e876a20e drm/amdkfd: use a high priority workqueue for IH work
In systems under heavy load the IH work may experience significant
scheduling delays.

Under load + system workqueue:
    Max Latency: 7.023695 ms
    Avg Latency: 0.263994 ms

Under load + high priority workqueue:
    Max Latency: 1.162568 ms
    Avg Latency: 0.163213 ms

Further work is required to measure the impact of per-cpu settings on IH
performance.

Signed-off-by: Andres Rodriguez <andres.rodriguez@amd.com>
Signed-off-by: Felix Kuehling <Felix.Kuehling@amd.com>
Acked-by: Oded Gabbay <oded.gabbay@gmail.com>
Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com>
2017-10-27 19:35:34 -04:00
Andres Rodriguez
04ad47bd14 drm/amdkfd: use standard kernel kfifo for IH
Replace our implementation of a lockless ring buffer with the standard
linux kernel kfifo.

We shouldn't maintain our own version of a standard data structure.

Signed-off-by: Andres Rodriguez <andres.rodriguez@amd.com>
Signed-off-by: Felix Kuehling <Felix.Kuehling@amd.com>
Acked-by: Oded Gabbay <oded.gabbay@gmail.com>
Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com>
2017-10-27 19:35:31 -04:00
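For readers unfamiliar with the data structure that 04ad47bd14 replaces, here is a minimal single-threaded sketch of a power-of-two ring buffer with free-running indices, the general scheme that kfifo-style queues use. It is not the kernel's kfifo API: `struct ring` and the `ring_*` functions are hypothetical, and the memory barriers needed for real concurrent producer/consumer use are omitted.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define RING_SIZE 8u  /* must be a power of two so masking works */

struct ring {
    uint32_t data[RING_SIZE];
    unsigned int in;   /* total entries ever written (free-running) */
    unsigned int out;  /* total entries ever read (free-running) */
};

/* Entries currently stored; unsigned wrap-around keeps this correct. */
unsigned int ring_len(const struct ring *r)
{
    return r->in - r->out;
}

/* Returns 1 if the entry was stored, 0 if the ring is full. */
int ring_put(struct ring *r, uint32_t v)
{
    if (ring_len(r) == RING_SIZE)
        return 0;
    r->data[r->in & (RING_SIZE - 1)] = v;
    r->in++;
    return 1;
}

/* Returns 1 and stores the oldest entry in *v, or 0 if the ring is empty. */
int ring_get(struct ring *r, uint32_t *v)
{
    assert(v != NULL);
    if (ring_len(r) == 0)
        return 0;
    *v = r->data[r->out & (RING_SIZE - 1)];
    r->out++;
    return 1;
}
```

Only the producer writes `in` and only the consumer writes `out`, which is what makes a lock-free single-producer, single-consumer variant possible once proper barriers are added.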
Felix Kuehling
b9a5d0a5db drm/amdkfd: Make event limit dependent on user mode mapping size
This allows increasing the KFD_SIGNAL_EVENT_LIMIT in kfd_ioctl.h
without breaking processes built with older kfd_ioctl.h versions.

Signed-off-by: Felix Kuehling <Felix.Kuehling@amd.com>
Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com>
2017-10-27 19:35:29 -04:00
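One way to read b9a5d0a5db is that the usable number of events is derived from the size of the signal page that user mode actually mapped, capped by the driver's own limit. The sketch below assumes 8 bytes per signal slot and a driver limit of 4096; both values and all names are assumptions, not taken from the commit.

```c
#include <assert.h>
#include <stddef.h>

#define SLOT_SIZE          8u     /* assumed bytes per signal slot */
#define DRIVER_EVENT_LIMIT 4096u  /* hypothetical driver-side maximum */

/* Events usable by a process that mapped `mapping_size` bytes. */
unsigned int event_limit_for_mapping(size_t mapping_size)
{
    size_t slots;

    /* The mapping is expected to hold a whole number of slots. */
    assert(mapping_size % SLOT_SIZE == 0);
    slots = mapping_size / SLOT_SIZE;
    return slots < DRIVER_EVENT_LIMIT ? (unsigned int)slots
                                      : DRIVER_EVENT_LIMIT;
}
```

Under these assumptions, a process built against an older header with a smaller limit maps a smaller page and is simply given fewer event slots, so raising the limit in the header does not break it.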
Felix Kuehling
482f07775c drm/amdkfd: Simplify event ID and signal slot management
Signal slots are identical to event IDs.

Replace the used_slot_bitmap and events hash table with an IDR to
allocate and lookup event IDs and signal slots more efficiently.

Signed-off-by: Felix Kuehling <Felix.Kuehling@amd.com>
Acked-by: Oded Gabbay <oded.gabbay@gmail.com>
Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com>
2017-10-27 19:35:27 -04:00
Felix Kuehling
50cb7dd94c drm/amdkfd: Simplify events page allocator
The first event page is always big enough to handle all events.
Handling of multiple events pages is not supported by user mode, and
not necessary.

Signed-off-by: Yong Zhao <yong.zhao@amd.com>
Signed-off-by: Felix Kuehling <Felix.Kuehling@amd.com>
Acked-by: Oded Gabbay <oded.gabbay@gmail.com>
Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com>
2017-10-27 19:35:26 -04:00
Felix Kuehling
fdf0c8332a drm/amdkfd: Clean up kfd_wait_on_events
Cleaned up the code while resolving some potential bugs and
inconsistencies in the process.

Clean-ups:
* Remove enum kfd_event_wait_result, which duplicates
  KFD_IOC_EVENT_RESULT definitions
* alloc_event_waiters can be called without holding p->event_mutex
* Return an error code from copy_signaled_event_data instead of bool
* Clean up error handling code paths to minimize duplication in
  kfd_wait_on_events

Fixes:
* Consistently return an error code from kfd_wait_on_events and set
  wait_result to KFD_IOC_WAIT_RESULT_FAIL in all failure cases.
* Always call free_waiters while holding p->event_mutex
* copy_signaled_event_data might sleep. Don't call it while the task state
  is TASK_INTERRUPTIBLE.

Signed-off-by: Felix Kuehling <Felix.Kuehling@amd.com>
Acked-by: Oded Gabbay <oded.gabbay@gmail.com>
Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com>
2017-10-27 19:35:22 -04:00
Felix Kuehling
9b56bb1154 drm/amdkfd: Don't dereference kfd_process.mm
The kfd_process doesn't own a reference to the mm_struct, so it can
disappear without warning even while the kfd_process still exists.

Therefore, avoid dereferencing the kfd_process.mm pointer and make
it opaque. Use get_task_mm to get a temporary reference to the mm
when it's needed.

v2: removed unnecessary WARN_ON

Signed-off-by: Felix Kuehling <Felix.Kuehling@amd.com>
Reviewed-by: Oded Gabbay <oded.gabbay@gmail.com>
Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com>
2017-10-27 19:35:19 -04:00
Felix Kuehling
bc920fd4f4 drm/amdkfd: Clean up process queue management
Removed unused num_concurrent_processes.

Implemented counting of queues in QPD. This makes counting the queue
list repeatedly in several places unnecessary.

Signed-off-by: Felix Kuehling <Felix.Kuehling@amd.com>
Reviewed-by: Oded Gabbay <oded.gabbay@gmail.com>
Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com>
2017-09-27 00:09:54 -04:00
Yong Zhao
e6f791b1b0 drm/amdkfd: Compress unnecessary function parameters
Signed-off-by: Yong Zhao <yong.zhao@amd.com>
Reviewed-by: Oded Gabbay <oded.gabbay@gmail.com>
Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com>
2017-09-27 00:09:53 -04:00
Felix Kuehling
9fd3f1bfae drm/amdkfd: Improve process termination handling
Separate device queue termination from process queue manager
termination. Unmap all queues at once instead of one at a time.
Unmap device queues before the PASID is unbound, in the
kfd_process_iommu_unbind_callback.

When resetting wavefronts in non-HWS mode, do it before the VMID is
released.

Signed-off-by: Ben Goz <ben.goz@amd.com>
Signed-off-by: shaoyun liu <shaoyun.liu@amd.com>
Signed-off-by: Amber Lin <Amber.Lin@amd.com>
Signed-off-by: Yong Zhao <Yong.Zhao@amd.com>
Signed-off-by: Felix Kuehling <Felix.Kuehling@amd.com>
Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com>
2017-09-27 00:09:52 -04:00
Yong Zhao
7da2bcf876 drm/amdkfd: Avoid name confusion involved in queue unmapping
When unmapping the queues from HW scheduler, there are two actions:
reset and preempt. So naming the variables with only preempt is
inappropriate.

Functions such as destroy_queues_cpsch actually unmap the queues on the
HW scheduler rather than destroy them. Change the name to reflect that
fact. In addition, there is already a function called
destroy_queue_cpsch(), which does destroy a queue, and its name is very
close to destroy_queues_cpsch(), resulting in confusion.

Signed-off-by: Yong Zhao <yong.zhao@amd.com>
Signed-off-by: Felix Kuehling <Felix.Kuehling@amd.com>
Reviewed-by: Oded Gabbay <oded.gabbay@gmail.com>
Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com>
2017-09-27 00:09:48 -04:00
Yong Zhao
e596b90338 drm/amdkfd: Reuse CHIP_* from amdgpu v2
There are already CHIP_* definitions under amd_shared.h file on amdgpu
side, so KFD should reuse them rather than defining new ones.

Using enum for asic type requires default cases on switch statements
to prevent compiler warnings. WARN on unsupported ASICs. It should never
get there because KFD should not be initialized on unsupported devices.

v2: Replace BUG() with WARN and error return

Signed-off-by: Yong Zhao <Yong.Zhao@amd.com>
Signed-off-by: Felix Kuehling <Felix.Kuehling@amd.com>
Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com>
2017-09-20 18:10:19 -04:00
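The switch pattern described in e596b90338 (a default case that warns and returns an error instead of calling BUG()) might look like the sketch below. `CHIP_KAVERI` and `CHIP_CARRIZO` are real amdgpu enum names, but the enum here is a reduced local copy, and `CHIP_EXAMPLE_UNSUPPORTED`, `enum ops_family`, and `select_asic_ops` are hypothetical; `fprintf` stands in for the kernel's WARN.

```c
#include <assert.h>
#include <errno.h>
#include <stdio.h>

/* Reduced local stand-in for the shared CHIP_* enum. */
enum asic_type { CHIP_KAVERI, CHIP_CARRIZO, CHIP_EXAMPLE_UNSUPPORTED };

/* Hypothetical result: which family-specific code path to use. */
enum ops_family { OPS_CIK, OPS_VI };

int select_asic_ops(enum asic_type type, enum ops_family *ops)
{
    assert(ops != NULL);
    switch (type) {
    case CHIP_KAVERI:
        *ops = OPS_CIK;
        return 0;
    case CHIP_CARRIZO:
        *ops = OPS_VI;
        return 0;
    default:
        /* Should be unreachable; warn and fail instead of crashing. */
        fprintf(stderr, "WARN: unexpected ASIC type %d\n", (int)type);
        return -EINVAL;
    }
}
```

The default case both silences the compiler's unhandled-enumerator warning and turns an "impossible" state into a recoverable error.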
Yong Zhao
44008d7a87 drm/amdkfd: Use VMID bitmap from KGD v2
The hard-coded values related to VMID were removed in KFD, as those
values can be calculated in the KFD initialization function.

v2: remove unnecessary local variable

Signed-off-by: Yong Zhao <yong.zhao@amd.com>
Signed-off-by: Felix Kuehling <Felix.Kuehling@amd.com>
Reviewed-by: Oded Gabbay <oded.gabbay@gmail.com>
Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com>
2017-09-20 18:10:18 -04:00