linux_dsm_epyc7002

mirror of https://github.com/AuxXxilium/linux_dsm_epyc7002.git synced 2024-12-21 10:06:00 +07:00

Author	SHA1	Message	Date
Chunming Zhou	e8deea2d4b	drm/amdgpu: add entity only when first job come umd somtimes will create a context for every ring, that means some entities wouldn't be used at all. Signed-off-by: Chunming Zhou <David1.Zhou@amd.com> Reviewed-by: Christian König <christian.koenig@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2015-12-14 19:41:19 -05:00
Nicolai Hähnle	786b521908	drm/amdgpu: fix race condition in amd_sched_entity_push_job As soon as we leave the spinlock after the job has been added to the job queue, we can no longer rely on the job's data to be available. I have seen a null-pointer dereference due to sched == NULL in amd_sched_wakeup via amd_sched_entity_push_job and amd_sched_ib_submit_kernel_helper. Since the latter initializes sched_job->sched with the address of the ring scheduler, which is guaranteed to be non-NULL, this race appears to be a likely culprit. Signed-off-by: Nicolai Hähnle <nicolai.haehnle@amd.com> Bugzilla: https://bugs.freedesktop.org/attachment.cgi?bugid=93079 Reviewed-by: Christian König <christian.koenig@amd.com>	2015-12-04 11:26:52 -05:00
Chunming Zhou	d033a6de80	drm/amd: abstract kernel rq and normal rq to priority of run queue Allows us to set priorities in the scheduler. Signed-off-by: Chunming Zhou <David1.Zhou@amd.com> Reviewed-by: Christian König <christian.koenig@amd.com> Reviewed-by: Junwei Zhang <Jerry.Zhang@amd.com>	2015-12-02 15:54:33 -05:00
Christian König	3d65193635	drm/amdgpu: move dependency handling out of atomic section v2 This way the driver isn't limited in the dependency handling callback. v2: remove extra check in amd_sched_entity_pop_job() Signed-off-by: Christian König <christian.koenig@amd.com> Reviewed-by: Chunming Zhou <david1.zhou@amd.com>	2015-11-23 12:20:15 -05:00
Christian König	393a0bd437	drm/amdgpu: optimize scheduler fence handling We only need to wait for jobs to be scheduled when the dependency is from the same scheduler. Signed-off-by: Christian König <christian.koenig@amd.com> Reviewed-by: Chunming Zhou <david1.zhou@amd.com>	2015-11-23 12:19:58 -05:00
Christian König	e284022163	drm/amdgpu: fix incorrect mutex usage v3 Before this patch the scheduler fence was created when we push the job into the queue, so we could only get the fence after pushing it. The mutex now was necessary to prevent the thread pushing the jobs to the hardware from running faster than the thread pushing the jobs into the queue. Otherwise the thread pushing jobs into the queue would have accessed possible freed up memory when it tries to get a reference to the fence. So what you get in the end is thread A: mutex_lock(&job->lock); ... Kick of thread B. ... mutex_unlock(&job->lock); And thread B: mutex_lock(&job->lock); .... mutex_unlock(&job->lock); kfree(job); I'm actually not sure if I'm still up to date on this, but this usage pattern used to be not allowed with mutexes. See here as well https://lwn.net/Articles/575460/. v2: remove unrelated changes, fix missing owner v3: rebased, add more commit message Signed-off-by: Christian König <christian.koenig@amd.com> Reviewed-by: Alex Deucher <alexander.deucher@amd.com>	2015-11-16 11:05:58 -05:00
Christian König	4a56228337	drm/amdgpu: cleanup scheduler fence get/put dance The code was correct, but getting two references when the ownership is linearly moved on is a bit awkward and just overhead. Signed: Christian König <christian.koenig@amd.com> Reviewed-by: Alex Deucher <alexander.deucher@amd.com>	2015-11-16 11:05:58 -05:00
Chunming Zhou	7034decf6a	drm/amdgpu: add command submission workflow tracepoint OGL needs these tracepoints to investigate performance issue. Change-Id: I5e58187d061253f7d665dfce8e4e163ba91d3e2b Signed-off-by: Chunming Zhou <David1.Zhou@amd.com>	2015-11-16 11:05:57 -05:00
Chunming Zhou	f5617f9dde	drm/amd: add kmem cache for sched fence Change-Id: I45bb8ff10ef05dc3b15e31a77fbcf31117705f11 Signed-off-by: Chunming Zhou <David1.Zhou@amd.com> Reviewed-by: Christian König <christian.koenig@amd.com>	2015-11-16 11:05:51 -05:00
Christian König	424839a6a9	drm/amdgpu: fix stoping the scheduler timeout cancel_delayed_work_sync is forbidden in interrupt context. Signed-off-by: Christian König <christian.koenig@amd.com> Reviewed-by: Chunming Zhou <david1.zhou@amd.com> Reviewed-by: Alex Deucher <alexander.deucher@amd.com>	2015-11-04 12:29:21 -05:00
Dave Airlie	32544d0215	drm/amd/scheduler: don't oops on failure to load In two places amdgpu tries to tear down something it hasn't initalised when failing. This is what happens when you enable experimental support on topaz which then fails in ring init. This patch allows it to fail cleanly. agd: Split out from from the original patch since the scheduler is a driver independent. Reviewed-by: Chunming Zhou <david1.zhou@amd.com> Reviewed-by: Christian König <christian.koenig@amd.com> Signed-off-by: Dave Airlie <airlied@redhat.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> Cc: stable@vger.kernel.org	2015-11-03 11:15:29 -05:00
Christian König	fe537d003f	drm/amdgpu: ignore scheduler fences from the same entity We are going to submit them before the job anyway. Signed-off-by: Christian König <christian.koenig@amd.com> Reviewed-by: Chunming Zhou <david1.zhou@amd.com>	2015-10-28 17:04:18 -04:00
Junwei Zhang	2fcef6ec87	drm/amdgpu: fix lockup when clean pending fences The first lockup fence will lock the fence list of scheduler. Then cancel the delayed workqueues for all clean pending fences without waiting the workqueues to finish. Change-Id: I9bec826de1aa49d587b0662f3fb4a95333979429 Signed-off-by: Junwei Zhang <Jerry.Zhang@amd.com> Reviewed-by: Christian König <christian.koenig@amd.com>	2015-10-14 16:20:32 -04:00
Junwei Zhang	2440ff2c91	drm/amdgpu: add timer to fence to detect scheduler lockup Change-Id: I67e987db0efdca28faa80b332b75571192130d33 Signed-off-by: Junwei Zhang <Jerry.Zhang@amd.com> Reviewed-by: David Zhou <david1.zhou@amd.com> Reviewed-by: Christian König <christian.koenig@amd.com>	2015-10-14 16:16:42 -04:00
Christian König	4f839a243d	drm/amdgpu: more scheduler cleanups v2 Embed the scheduler into the ring structure instead of allocating it. Use the ring name directly instead of the id. v2: rebased, whitespace cleanup Signed-off-by: Christian König <christian.koenig@amd.com> Reviewed-by: Junwei Zhang <Jerry.Zhang@amd.com> Reviewed-by: Chunming Zhou<david1.zhou@amd.com>	2015-09-23 17:23:39 -04:00
Christian König	9b398fa5c2	drm/amdgpu: rename fence->scheduler to sched v2 Just to be consistent with the other members. v2: rename the ring member as well. Signed-off-by: Christian König <christian.koenig@amd.com> Reviewed-by: Junwei Zhang <Jerry.Zhang@amd.com> (v1) Reviewed-by: Chunming Zhou<david1.zhou@amd.com>	2015-09-23 17:23:37 -04:00
Christian König	0f75aee751	drm/amdgpu: cleanup entity init Reorder the fields and properly return the kfifo_alloc error code. Signed-off-by: Christian König <christian.koenig@amd.com> Reviewed-by: Junwei Zhang <Jerry.Zhang@amd.com> Reviewed-by: Chunming Zhou<david1.zhou@amd.com>	2015-09-23 17:23:37 -04:00
Junwei Zhang	4c7eb91cae	drm/amdgpu: refine the job naming for amdgpu_job and amdgpu_sched_job Use consistent naming across functions. Reviewed-by: Christian König <christian.koenig@amd.com> Reviewed-by: David Zhou <david1.zhou@amd.com> Signed-off-by: Junwei Zhang <Jerry.Zhang@amd.com>	2015-09-23 17:23:36 -04:00
Christian König	1886d1a9ca	drm/amdgpu: remove process_job callback from the scheduler Just free the resources immediately after submitting the job. Signed-off-by: Christian König <christian.koenig@amd.com> Reviewed-by: Chunming Zhou <david1.zhou@amd.com> Reviewed-by: Junwei Zhang <Jerry.Zhang@amd.com> Reviewed-by: Jammy Zhou <Jammy.Zhou@amd.com>	2015-09-23 17:23:33 -04:00
Christian König	258f3f99d5	drm/amdgpu: move scheduler fence callback into fence v2 And call the processed callback directly after submitting the job. v2: split adding error handling into separate patch. Signed-off-by: Christian König <christian.koenig@amd.com> Reviewed-by: Chunming Zhou <david1.zhou@amd.com> Reviewed-by: Junwei Zhang <Jerry.Zhang@amd.com> Reviewed-by: Jammy Zhou <Jammy.Zhou@amd.com>	2015-09-23 17:23:32 -04:00
Christian König	27439fcac0	drm/amdgpu: signal scheduler fence when hw submission fails v3 Otherwise the resource blocked by it will never be reclaimed. v2: add DRM_ERROR. v3: fix typo in commit message Signed-off-by: Christian König <christian.koenig@amd.com> Reviewed-by: Junwei Zhang <Jerry.Zhang@amd.com> Reviewed-by: Chunming Zhou<david1.zhou@amd.com> Reviewed-by: Jammy Zhou <Jammy.Zhou@amd.com>	2015-09-23 17:23:31 -04:00
Chunming Zhou	353da3c520	drm/amdgpu: add tracepoint for scheduler (v2) track sched job status like the length of job queue and hw job queue. v2: fix build after rebase Signed-off-by: Chunming Zhou <david1.zhou@amd.com> Reviewed-by: Jammy Zhou <Jammy.Zhou@amd.com> Reviewed-by: Christian König <christian.koenig@amd.com>	2015-09-23 17:23:31 -04:00
Alex Deucher	5134e999cb	drm/amdgpu: fix warning in scheduler This should never happen so warn when the count does not equal the expected size. Reviewed-by: Christian König <christian.koenig@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2015-09-04 11:04:04 -04:00
Chunming Zhou	c9f0fe5e19	drm/amdgpu: make wait_event uninterruptible in push_job with interruptible, the push_job maybe return -ERESTARTSYS, then result in push_job error. E.g. bug trace: [ 181.618860] *****amdgpu_copy_buffer:fence->seq:0x0000000048d8758b, contxt:1207959552, ref:683967304, r:-512 [ 181.618929] BUG: unable to handle kernel paging request at ffffffff811aa266 [ 181.625887] IP: [<ffffffff81548ffc>] reservation_object_add_excl_fence+0x3c/0x120 ... [ 181.859767] [<ffffffff811aa266>] ? unmap_mapping_range+0x66/0x110 [ 181.865928] [<ffffffffc0608ac1>] ttm_bo_move_accel_cleanup+0x41/0x3c0 [ttm] [ 181.872971] [<ffffffffc062d382>] amdgpu_move_blit.isra.18+0x122/0x150 [amdgpu] [ 181.880254] [<ffffffff811aa266>] ? unmap_mapping_range+0x66/0x110 [ 181.886420] [<ffffffffc062d709>] amdgpu_bo_move+0xa9/0x200 [amdgpu] [ 181.892753] [<ffffffffc0606e8d>] ttm_bo_handle_move_mem+0x26d/0x5c0 [ttm] Signed-off-by: Chunming Zhou <david1.zhou@amd.com> Reviewed-by: Christian König <christian.koenig@amd.com>	2015-09-02 12:19:53 -04:00
Christian König	e61235db62	drm/amdgpu: add scheduler dependency callback v2 This way the scheduler doesn't wait in it's work thread any more. v2: fix race conditions Signed-off-by: Christian König <christian.koenig@amd.com> Reviewed-by: Chunming Zhou <david1.zhou@amd.com> Reviewed-by: Jammy Zhou <Jammy.Zhou@amd.com>	2015-08-28 15:04:17 -04:00
Christian König	69bd5bf13a	drm/amdgpu: let the scheduler work more with jobs v2 v2: fix another race condition Signed-off-by: Christian König <christian.koenig@amd.com> Reviewed-by: Chunming Zhou <david1.zhou@amd.com> Reviewed-by: Jammy Zhou <Jammy.Zhou@amd.com>	2015-08-28 15:04:17 -04:00
Christian König	c2b6bd7e91	drm/amdgpu: fix wait queue handling in the scheduler Freeing up a queue after signalling it isn't race free. Signed-off-by: Christian König <christian.koenig@amd.com> Reviewed-by: Jammy Zhou <Jammy.Zhou@amd.com> Reviewed-by: Chunming Zhou <david1.zhou@amd.com>	2015-08-26 17:55:07 -04:00
Christian König	bd755d0870	drm/amdgpu: remove extra parameters from scheduler callbacks Signed-off-by: Christian König <christian.koenig@amd.com> Acked-by: Alex Deucher <alexander.deucher@amd.com> Reviewed-by: Jammy Zhou <Jammy.Zhou@amd.com> Reviewed-by: Chunming Zhou <david1.zhou@amd.com>	2015-08-26 17:54:10 -04:00
Christian König	88079006dc	drm/amdgpu: wake up scheduler only when neccessary Signed-off-by: Christian König <christian.koenig@amd.com> Acked-by: Alex Deucher <alexander.deucher@amd.com> Reviewed-by: Jammy Zhou <Jammy.Zhou@amd.com> Reviewed-by: Chunming Zhou <david1.zhou@amd.com>	2015-08-26 17:53:23 -04:00
Christian König	062c7fb3eb	drm/amdgpu: remove entity idle timeout v2 Removing the entity from scheduling can deadlock the whole system. Wait forever till the remaining IBs are scheduled. v2: fix comment as well Signed-off-by: Christian König <christian.koenig@amd.com> Acked-by: Alex Deucher <alexander.deucher@amd.com> Reviewed-by: Jammy Zhou <Jammy.Zhou@amd.com> Reviewed-by: Chunming Zhou <david1.zhou@amd.com> (v1)	2015-08-26 17:52:18 -04:00
Chunming Zhou	f38fdfddfa	drm/amdgpu: add priv data to sched Signed-off-by: Chunming Zhou <david1.zhou@amd.com> Reviewed-by: Christian König <christian.koenig@amd.com>	2015-08-25 10:52:18 -04:00
Chunming Zhou	84f76ea6b0	drm/amdgpu: add owner for sched fence Signed-off-by: Chunming Zhou <david1.zhou@amd.com> Reviewed-by: Christian König <christian.koenig@amd.com>	2015-08-25 10:51:32 -04:00
Christian König	c14692f0a7	drm/amdgpu: remove entity reference from sched fence Entity don't live as long as scheduler fences. Signed-off-by: Christian König <christian.koenig@amd.com> Reviewed-by: Chunming Zhou <david1.zhou@amd.com>	2015-08-25 10:50:42 -04:00
Christian König	6c859274f3	drm/amdgpu: fix and cleanup amd_sched_entity_push_job Calling schedule() is probably the worse things we can do. Signed-off-by: Christian König <christian.koenig@amd.com> Reviewed-by: Chunming Zhou <david1.zhou@amd.com>	2015-08-25 10:49:57 -04:00
Christian König	69f7dd652c	drm/amdgpu: remove unused parameters to amd_sched_create Signed-off-by: Christian König <christian.koenig@amd.com> Reviewed-by: Chunming Zhou <david1.zhou@amd.com>	2015-08-25 10:47:41 -04:00
Christian König	1fca766b24	drm/amdgpu: remove sched_lock It isn't protecting anything. Signed-off-by: Christian König <christian.koenig@amd.com> Reviewed-by: Chunming Zhou <david1.zhou@amd.com>	2015-08-25 10:46:46 -04:00
Christian König	b034b572f2	drm/amdgpu: remove prepare_job callback Not used any more. Signed-off-by: Christian König <christian.koenig@amd.com> Reviewed-by: Chunming Zhou <david1.zhou@amd.com>	2015-08-25 10:46:02 -04:00
Christian König	d54fdb94b2	drm/amdgpu: cleanup a scheduler function name Signed-off-by: Christian König <christian.koenig@amd.com> Reviewed-by: Chunming Zhou <david1.zhou@amd.com>	2015-08-25 10:44:57 -04:00
Christian König	e688b72822	drm/amdgpu: reorder scheduler functions Keep run queue, entity and scheduler handling together. Signed-off-by: Christian König <christian.koenig@amd.com> Reviewed-by: Chunming Zhou <david1.zhou@amd.com>	2015-08-25 10:44:23 -04:00
Christian König	f495659821	drm/amdgpu: fix scheduler thread creation error checking Signed-off-by: Christian König <christian.koenig@amd.com> Reviewed-by: Chunming Zhou <david1.zhou@amd.com>	2015-08-25 10:43:46 -04:00
Christian König	aef4852eed	drm/amdgpu: fix entity wakeup race condition That actually didn't worked at all. Signed-off-by: Christian König <christian.koenig@amd.com> Reviewed-by: Chunming Zhou <david1.zhou@amd.com>	2015-08-25 10:42:30 -04:00
Christian König	f85a6dd9eb	drm/amdgpu: cleanup entity picking Cleanup function name, stop checking scheduler ready twice, but check if kernel thread should stop instead. Signed-off-by: Christian König <christian.koenig@amd.com> Reviewed-by: Chunming Zhou <david1.zhou@amd.com>	2015-08-25 10:41:52 -04:00
Christian König	9788ec4032	drm/amdgpu: remove some more unused entity members v2 None of them are used any more. v2: fix type in error message Signed-off-by: Christian König <christian.koenig@amd.com> Reviewed-by: Chunming Zhou <david1.zhou@amd.com>	2015-08-25 10:40:55 -04:00
Christian König	c746ba2223	drm/amdgpu: rework scheduler submission handling. Remove active_hw_rq and it's protecting queue_lock, they are unused. User 32bit atomic for hw_rq_count, 64bits for counting to three is a bit overkill. Cleanup the function name and remove incorrect comments. Signed-off-by: Christian König <christian.koenig@amd.com> Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Reviewed-by: Jammy Zhou <Jammy.Zhou@amd.com>	2015-08-25 10:39:31 -04:00
Christian König	ce882e6dc2	drm/amdgpu: remove v_seq handling from the scheduler v2 Simply not used any more. Only keep 32bit atomic for fence sequence numbering. v2: trivial rebase Signed-off-by: Christian König <christian.koenig@amd.com> Reviewed-by: Alex Deucher <alexander.deucher@amd.com> (v1) Reviewed-by: Jammy Zhou <Jammy.Zhou@amd.com> (v1) Reviewed-by: Chunming Zhou <david1.zhou@amd.com> (v1)	2015-08-25 10:39:16 -04:00
Christian König	2b184d8dbc	drm/amdgpu: use a spinlock instead of a mutex for the rq More appropriate and fixes some nasty lockdep warnings. Signed-off-by: Christian König <christian.koenig@amd.com> Reviewed-by: Chunming Zhou <david1.zhou@amd.com>	2015-08-20 17:03:47 -04:00
Chunming Zhou	bb977d3711	drm/amdgpu: abstract amdgpu_job for scheduler Signed-off-by: Chunming Zhou <david1.zhou@amd.com> Reviewed-by: Christian König <christian.koenig@amd.com>	2015-08-20 17:00:35 -04:00
Christian König	432a4ff8b7	drm/amdgpu: cleanup sheduler rq handling v2 Rework run queue implementation, especially remove the odd list handling. v2: cleanup the code only, no algorithem change. Signed-off-by: Christian König <christian.koenig@amd.com> Reviewed-by: Chunming Zhou <david1.zhou@amd.com>	2015-08-17 16:51:22 -04:00
Chunming Zhou	1c8f805af9	drm/amdgpu: fix unnecessary wake up decrease CPU extra overhead. Signed-off-by: Chunming Zhou <david1.zhou@amd.com> Reviewed-by: Jammy Zhou <Jammy.Zhou@amd.com> Reviewed-by: Christian König <christian.koenig@amd.com>	2015-08-17 16:51:20 -04:00
Chunming Zhou	281b422301	drm/amdgpu: add reference for fence fix fence is released when pass to fence sometimes. add reference for it. Signed-off-by: Chunming Zhou <david1.zhou@amd.com> Reviewed-by: Christian König <christian.koenig@amd.com>	2015-08-17 16:51:17 -04:00

72 Commits