linux_dsm_epyc7002

mirror of https://github.com/AuxXxilium/linux_dsm_epyc7002.git synced 2024-12-28 10:25:26 +07:00

Author	SHA1	Message	Date
Mika Kuoppala	2e19af9438	drm/i915/tgl: Wa_1409600907 To avoid possible hang, we need to add depth stall if we flush the depth cache. Signed-off-by: Mika Kuoppala <mika.kuoppala@linux.intel.com> Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk> Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Link: https://patchwork.freedesktop.org/patch/msgid/20191015154449.10338-8-mika.kuoppala@linux.intel.com	2019-10-15 18:23:10 +01:00
Mika Kuoppala	36a6b5d964	drm/i915/tgl: Add extra hdc flush workaround In order to ensure constant caches are invalidated properly with a0, we need extra hdc flush after invalidation. v2: use IS_TGL_REVID (Chris) References: HSDES#1604544889 Signed-off-by: Mika Kuoppala <mika.kuoppala@linux.intel.com> Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk> Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Link: https://patchwork.freedesktop.org/patch/msgid/20191015154449.10338-4-mika.kuoppala@linux.intel.com	2019-10-15 18:16:51 +01:00
Mika Kuoppala	4aa0b5d457	drm/i915/tgl: Add HDC Pipeline Flush Add hdc pipeline flush to ensure memory state is coherent in L3 when we are done. v2: Flush also in breadcrumbs (Chris) Cc: Chris Wilson <chris@chris-wilson.co.uk> Signed-off-by: Mika Kuoppala <mika.kuoppala@linux.intel.com> Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk> Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Link: https://patchwork.freedesktop.org/patch/msgid/20191015154449.10338-3-mika.kuoppala@linux.intel.com	2019-10-15 18:15:59 +01:00
Mika Kuoppala	62037ffff2	drm/i915/tgl: Include ro parts of l3 to invalidate Aim for completeness and invalidate also the ro parts in l3 cache. This might allow to get rid of the preparser disable/enable workaround on invalidation path. Cc: Chris Wilson <chris@chris-wilson.co.uk> Signed-off-by: Mika Kuoppala <mika.kuoppala@linux.intel.com> Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk> Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Link: https://patchwork.freedesktop.org/patch/msgid/20191015154449.10338-2-mika.kuoppala@linux.intel.com	2019-10-15 18:13:50 +01:00
Chris Wilson	8b390c1581	drm/i915/execlists: Clear semaphore immediately upon ELSP promotion There is no significance to our delay before clearing the semaphore the engine is waiting on, so release it as soon as we acknowledge the CS update following our preemption request. This should allow the GPU to resume work earlier, if it was stuck on the semaphore at the end of a request. Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20191015093204.25693-1-chris@chris-wilson.co.uk	2019-10-15 11:51:13 +01:00
Chris Wilson	3c00660db1	drm/i915/execlists: Assert tasklet is locked for process_csb() We rely on only the tasklet being allowed to call into process_csb(), so assert that is locked when we do. As the tasklet uses a simple bitlock, there is no strong lockdep checking so we must make do with a plain assertion that the tasklet is running and assume that we are the tasklet! v2: Fixup intel_gt_sanitize() to prepare each engine for the reset so that the locks are marked as held during the reset v3: Check for existent function pointers for very early sanitisation. Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com> Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20191014121336.30137-1-chris@chris-wilson.co.uk	2019-10-14 21:10:59 +01:00
Chris Wilson	89b6d1831d	drm/i915/execlists: Tweak virtual unsubmission Since commit `e2144503bf` ("drm/i915: Prevent bonded requests from overtaking each other on preemption") we have restricted requests to run on their chosen engine across preemption events. We can take this restriction into account to know that we will want to resubmit those requests onto the same physical engine, and so can shortcircuit the virtual engine selection process and keep the request on the same engine during unwind. References: `e2144503bf` ("drm/i915: Prevent bonded requests from overtaking each other on preemption") Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com> Reviewed-by: Ramlingam C <ramalingam.c@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20191013203012.25208-1-chris@chris-wilson.co.uk	2019-10-14 12:51:17 +01:00
Chris Wilson	c3eb54aad9	drm/i915: Mark up "sentinel" requests Sometimes we want to emit a terminator request, a request that flushes the pipeline and allows no request to come after it. This can be used for a "preempt-to-idle" to ensure that upon processing the context-switch to that request, all other active contexts have been flushed. Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20191012070136.32058-1-chris@chris-wilson.co.uk	2019-10-12 08:51:17 +01:00
Chris Wilson	d8ad5f5261	drm/i915/execlists: Prevent merging requests with conflicting flags We set out-of-bound parameters inside the i915_requests.flags field, such as disabling preemption or marking the end-of-context. We should not coalesce consecutive requests if they have differing instructions as we only inspect the last active request in a context. Thus if we allow a later request to be merged into the same execution context, it will mask any of the earlier flags. References: `2a98f4e65b` ("drm/i915: add infrastructure to hold off preemption on a request") Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Cc: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20191011190325.10979-9-chris@chris-wilson.co.uk	2019-10-12 07:54:52 +01:00
Chris Wilson	cbbf278778	drm/i915/execlists: Only mark incomplete requests as -EIO on cancelling Only the requests that have not completed do we want to change the status of to signal the -EIO when cancelling the inflight set of requests upon wedging. Reported-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com> Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com> Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20191011103345.26013-1-chris@chris-wilson.co.uk	2019-10-11 13:07:24 +01:00
Chris Wilson	c97fb526ca	drm/i915/execlists: Leave tell-tales as to why pending[] is bad Before we BUG out with bad pending state, leave a telltale as to which test failed. Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Cc: Mika Kuoppala <mika.kuoppala@linux.intel.com> Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com> Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20191010071434.31195-2-chris@chris-wilson.co.uk	2019-10-11 09:43:06 +01:00
Chris Wilson	bd9bec5b6a	drm/i915/execlists: Mark up expected state during reset Move the BUG_ON around slightly and add some explanations for each to try and capture the expected state more carefully. We want to compare the expected active state of our bookkeeping as compared to the tracked HW state. References: https://bugs.freedesktop.org/show_bug.cgi?id=111937 Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Reviewed-by: Mika Kuoppala <mika.kuoppala@linux.intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20191010083242.1387-1-chris@chris-wilson.co.uk	2019-10-10 13:52:34 +01:00
Daniele Ceraolo Spurio	9d41318c4e	drm/i915/tgl: simplify the lrc register list for !RCS There are small differences between the blitter and the video engines in the xcs context image (e.g. registers 0x200 and 0x204 only exist on the blitter). Since we never explicitly set a value for those register and given that we don't need to update the offsets in the lrc image when we change engine within the class for virtual engine because the HW can handle that, instead of having a separate define for the BCS we can just restrict the programming to the part we're interested in, which is common across the engines. Bspec: 45584 Signed-off-by: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com> Cc: Chris Wilson <chris@chris-wilson.co.uk> Cc: Mika Kuoppala <mika.kuoppala@linux.intel.com> Cc: Stuart Summers <stuart.summers@intel.com> Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk> Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Link: https://patchwork.freedesktop.org/patch/msgid/20191009230424.6507-2-daniele.ceraolospurio@intel.com	2019-10-10 10:14:42 +01:00
Daniele Ceraolo Spurio	ba2c74da52	drm/i915/tgl: the BCS engine supports relative MMIO The specs don't mention any specific HW limitation on the blitter and manual inspection shows that the HW does set the relative MMIO bit in the LRI of the blitter context image, so we can remove our limitations. Signed-off-by: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com> Cc: Chris Wilson <chris@chris-wilson.co.uk> Cc: John Harrison <John.C.Harrison@Intel.com> Cc: Mika Kuoppala <mika.kuoppala@linux.intel.com> Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk> Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Link: https://patchwork.freedesktop.org/patch/msgid/20191009230424.6507-1-daniele.ceraolospurio@intel.com	2019-10-10 10:12:18 +01:00
Chris Wilson	c949ae4314	drm/i915/execlists: Protect peeking at execlists->active Now that we dropped the engine->active.lock serialisation from around process_csb(), direct submission can run concurrently to the interrupt handler. As such execlists->active may be advanced as we dequeue, dropping the reference to the request. We need to employ our RCU request protection to ensure that the request is not freed too early. Fixes: `df40306902` ("drm/i915/execlists: Lift process_csb() out of the irq-off spinlock") Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Cc: Mika Kuoppala <mika.kuoppala@linux.intel.com> Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com> Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20191009100955.21477-1-chris@chris-wilson.co.uk	2019-10-09 19:46:40 +01:00
Chris Wilson	20af04f3dd	drm/i915/execlists: Assign virtual_engine->uncore from first sibling Copy across the engine->uncore shortcut to the virtual_engine from its first physical engine, similar to the handling of the engine->gt backpointer. Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20191008070342.4045-1-chris@chris-wilson.co.uk	2019-10-08 10:14:29 +01:00
Chris Wilson	08ad9a3846	drm/i915/execlists: Fix annotation for decoupling virtual request As we may signal a request and take the engine->active.lock within the signaler, the engine submission paths have to use a nested annotation on their requests -- but we guarantee that we can never submit on the same engine as the signaling fence. <4>[ 723.763281] WARNING: possible circular locking dependency detected <4>[ 723.763285] 5.3.0-g80fa0e042cdb-drmtip_379+ #1 Tainted: G U <4>[ 723.763288] ------------------------------------------------------ <4>[ 723.763291] gem_exec_await/1388 is trying to acquire lock: <4>[ 723.763294] ffff93a7b53221d8 (&engine->active.lock){..-.}, at: execlists_submit_request+0x2b/0x1e0 [i915] <4>[ 723.763378] but task is already holding lock: <4>[ 723.763381] ffff93a7c25f6d20 (&i915_request_get(rq)->submit/1){-.-.}, at: __i915_sw_fence_complete+0x1b2/0x250 [i915] <4>[ 723.763420] which lock already depends on the new lock. <4>[ 723.763423] the existing dependency chain (in reverse order) is: <4>[ 723.763427] -> #2 (&i915_request_get(rq)->submit/1){-.-.}: <4>[ 723.763434] _raw_spin_lock_irqsave_nested+0x39/0x50 <4>[ 723.763478] __i915_sw_fence_complete+0x1b2/0x250 [i915] <4>[ 723.763513] intel_engine_breadcrumbs_irq+0x3aa/0x5e0 [i915] <4>[ 723.763600] cs_irq_handler+0x49/0x50 [i915] <4>[ 723.763659] gen11_gt_irq_handler+0x17b/0x280 [i915] <4>[ 723.763690] gen11_irq_handler+0x54/0xf0 [i915] <4>[ 723.763695] __handle_irq_event_percpu+0x41/0x2d0 <4>[ 723.763699] handle_irq_event_percpu+0x2b/0x70 <4>[ 723.763702] handle_irq_event+0x2f/0x50 <4>[ 723.763706] handle_edge_irq+0xee/0x1a0 <4>[ 723.763709] do_IRQ+0x7e/0x160 <4>[ 723.763712] ret_from_intr+0x0/0x1d <4>[ 723.763717] __slab_alloc.isra.28.constprop.33+0x4f/0x70 <4>[ 723.763720] kmem_cache_alloc+0x28d/0x2f0 <4>[ 723.763724] vm_area_dup+0x15/0x40 <4>[ 723.763727] dup_mm+0x2dd/0x550 <4>[ 723.763730] copy_process+0xf21/0x1ef0 <4>[ 723.763734] _do_fork+0x71/0x670 <4>[ 723.763737] __se_sys_clone+0x6e/0xa0 <4>[ 723.763741] do_syscall_64+0x4f/0x210 <4>[ 723.763744] entry_SYSCALL_64_after_hwframe+0x49/0xbe <4>[ 723.763747] -> #1 (&(&rq->lock)->rlock#2){-.-.}: <4>[ 723.763752] _raw_spin_lock+0x2a/0x40 <4>[ 723.763789] __unwind_incomplete_requests+0x3eb/0x450 [i915] <4>[ 723.763825] __execlists_submission_tasklet+0x9ec/0x1d60 [i915] <4>[ 723.763864] execlists_submission_tasklet+0x34/0x50 [i915] <4>[ 723.763874] tasklet_action_common.isra.5+0x47/0xb0 <4>[ 723.763878] __do_softirq+0xd8/0x4ae <4>[ 723.763881] irq_exit+0xa9/0xc0 <4>[ 723.763883] smp_apic_timer_interrupt+0xb7/0x280 <4>[ 723.763887] apic_timer_interrupt+0xf/0x20 <4>[ 723.763892] cpuidle_enter_state+0xae/0x450 <4>[ 723.763895] cpuidle_enter+0x24/0x40 <4>[ 723.763899] do_idle+0x1e7/0x250 <4>[ 723.763902] cpu_startup_entry+0x14/0x20 <4>[ 723.763905] start_secondary+0x15f/0x1b0 <4>[ 723.763908] secondary_startup_64+0xa4/0xb0 <4>[ 723.763911] -> #0 (&engine->active.lock){..-.}: <4>[ 723.763916] __lock_acquire+0x15d8/0x1ea0 <4>[ 723.763919] lock_acquire+0xa6/0x1c0 <4>[ 723.763922] _raw_spin_lock_irqsave+0x33/0x50 <4>[ 723.763956] execlists_submit_request+0x2b/0x1e0 [i915] <4>[ 723.764002] submit_notify+0xa8/0x13c [i915] <4>[ 723.764035] __i915_sw_fence_complete+0x81/0x250 [i915] <4>[ 723.764054] i915_sw_fence_wake+0x51/0x64 [i915] <4>[ 723.764054] __i915_sw_fence_complete+0x1ee/0x250 [i915] <4>[ 723.764054] dma_i915_sw_fence_wake_timer+0x14/0x20 [i915] <4>[ 723.764054] dma_fence_signal_locked+0x9e/0x1c0 <4>[ 723.764054] dma_fence_signal+0x1f/0x40 <4>[ 723.764054] vgem_fence_signal_ioctl+0x67/0xc0 [vgem] <4>[ 723.764054] drm_ioctl_kernel+0x83/0xf0 <4>[ 723.764054] drm_ioctl+0x2f3/0x3b0 <4>[ 723.764054] do_vfs_ioctl+0xa0/0x6f0 <4>[ 723.764054] ksys_ioctl+0x35/0x60 <4>[ 723.764054] __x64_sys_ioctl+0x11/0x20 <4>[ 723.764054] do_syscall_64+0x4f/0x210 <4>[ 723.764054] entry_SYSCALL_64_after_hwframe+0x49/0xbe <4>[ 723.764054] other info that might help us debug this: <4>[ 723.764054] Chain exists of: &engine->active.lock --> &(&rq->lock)->rlock#2 --> &i915_request_get(rq)->submit/1 <4>[ 723.764054] Possible unsafe locking scenario: <4>[ 723.764054] CPU0 CPU1 <4>[ 723.764054] ---- ---- <4>[ 723.764054] lock(&i915_request_get(rq)->submit/1); <4>[ 723.764054] lock(&(&rq->lock)->rlock#2); <4>[ 723.764054] lock(&i915_request_get(rq)->submit/1); <4>[ 723.764054] lock(&engine->active.lock); <4>[ 723.764054] * DEADLOCK * Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=111862 Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20191004194758.19679-1-chris@chris-wilson.co.uk	2019-10-07 21:44:02 +01:00
Chris Wilson	2935ed5339	drm/i915: Remove logical HW ID With the introduction of ctx->engines[] we allow multiple logical contexts to be used on the same engine (e.g. with virtual engines). According to bspec, aach logical context requires a unique tag in order for context-switching to occur correctly between them. [Simple experiments show that it is not so easy to trick the HW into performing a lite-restore with matching logical IDs, though my memory from early Broadwell experiments do suggest that it should be generating lite-restores.] We only need to keep a unique tag for the active lifetime of the context, and for as long as we need to identify that context. The HW uses the tag to determine if it should use a lite-restore (why not the LRCA?) and passes the tag back for various status identifies. The only status we need to track is for OA, so when using perf, we assign the specific context a unique tag. v2: Calculate required number of tags to fill ELSP. Fixes: `976b55f0e1` ("drm/i915: Allow a context to define its set of engines") Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=111895 Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Acked-by: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com> Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20191004134015.13204-14-chris@chris-wilson.co.uk	2019-10-04 15:39:30 +01:00
Chris Wilson	44d0a9c05b	drm/i915/execlists: Skip redundant resubmission If we unwind the active requests, and on resubmission discover that we intend to preempt the active contexts with themselves, simply skip the ELSP submission. Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20191003210100.22250-1-chris@chris-wilson.co.uk	2019-10-04 12:52:24 +01:00
Chris Wilson	fcde8c7eea	drm/i915/selftests: Exercise potential false lite-restore If execlists's lite-restore is based on the common GEM context tag rather than the per-intel_context LRCA, then a context switch between two intel_contexts on the same engine derived from the same GEM context will perform a lite-restore instead of a full context switch. We can exploit this by poisoning the ringbuffer of the first context and trying to trick a simple RING_TAIL update (i.e. lite-restore) v2: Also check what happens if preempt ce[0] with ce[1] (both instances on the same engine from the same parent context) [Tvrtko] Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com> Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20191002183459.26614-1-chris@chris-wilson.co.uk	2019-10-03 00:26:02 +01:00
Chris Wilson	f8db4d051b	drm/i915: Initialise breadcrumb lists on the virtual engine With deferring the breadcrumb signalling to the virtual engine (thanks preempt-to-busy) we need to make sure the lists and irq-worker are ready to send a signal. [41958.710544] BUG: kernel NULL pointer dereference, address: 0000000000000000 [41958.710553] #PF: supervisor write access in kernel mode [41958.710556] #PF: error_code(0x0002) - not-present page [41958.710558] PGD 0 P4D 0 [41958.710562] Oops: 0002 [#1] SMP [41958.710565] CPU: 0 PID: 0 Comm: swapper/0 Tainted: G U 5.3.0+ #207 [41958.710568] Hardware name: Intel Corporation NUC7i5BNK/NUC7i5BNB, BIOS BNKBL357.86A.0052.2017.0918.1346 09/18/2017 [41958.710602] RIP: 0010:i915_request_enable_breadcrumb+0xe1/0x130 [i915] [41958.710605] Code: 8b 44 24 30 48 89 41 08 48 89 08 48 8b 85 98 01 00 00 48 8d 8d 90 01 00 00 48 89 95 98 01 00 00 49 89 4c 24 28 49 89 44 24 30 <48> 89 10 f0 80 4b 30 10 c6 85 88 01 00 00 00 e9 1a ff ff ff 48 83 [41958.710609] RSP: 0018:ffffc90000003de0 EFLAGS: 00010046 [41958.710612] RAX: 0000000000000000 RBX: ffff888735424480 RCX: ffff8887cddb2190 [41958.710614] RDX: ffff8887cddb3570 RSI: ffff888850362190 RDI: ffff8887cddb2188 [41958.710617] RBP: ffff8887cddb2000 R08: ffff8888503624a8 R09: 0000000000000100 [41958.710619] R10: 0000000000000001 R11: 0000000000000000 R12: ffff8887cddb3548 [41958.710622] R13: 0000000000000000 R14: 0000000000000046 R15: ffff888850362070 [41958.710625] FS: 0000000000000000(0000) GS:ffff88885ea00000(0000) knlGS:0000000000000000 [41958.710628] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 [41958.710630] CR2: 0000000000000000 CR3: 0000000002c09002 CR4: 00000000001606f0 [41958.710633] Call Trace: [41958.710636] <IRQ> [41958.710668] __i915_request_submit+0x12b/0x160 [i915] [41958.710693] virtual_submit_request+0x67/0x120 [i915] [41958.710720] __unwind_incomplete_requests+0x131/0x170 [i915] [41958.710744] execlists_dequeue+0xb40/0xe00 [i915] [41958.710771] execlists_submission_tasklet+0x10f/0x150 [i915] [41958.710776] tasklet_action_common.isra.17+0x41/0xa0 [41958.710781] __do_softirq+0xc8/0x221 [41958.710785] irq_exit+0xa6/0xb0 [41958.710788] smp_apic_timer_interrupt+0x4d/0x80 [41958.710791] apic_timer_interrupt+0xf/0x20 [41958.710794] </IRQ> Fixes: `cb2377a919` ("drm/i915: Fixup preempt-to-busy vs reset of a virtual request") Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Cc: Mika Kuoppala <mika.kuoppala@linux.intel.com> Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com> Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20191001103518.9113-1-chris@chris-wilson.co.uk	2019-10-01 11:46:52 +01:00
Michał Winiarski	e123752374	drm/i915/execlists: Use per-process HWSP as scratch Some of our commands (MI_FLUSH_DW / PIPE_CONTROL) require a post-sync write operation to be performed. Currently we're using dedicated VMA for PIPE_CONTROL and global HWSP for MI_FLUSH_DW. On execlists platforms, each of our contexts has an area that can be used as scratch space. Let's use that instead. Signed-off-by: Michał Winiarski <michal.winiarski@intel.com> Cc: Chris Wilson <chris@chris-wilson.co.uk> Cc: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com> Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk> Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Link: https://patchwork.freedesktop.org/patch/msgid/20190926133142.2838-2-chris@chris-wilson.co.uk	2019-09-26 18:44:35 +01:00
Chris Wilson	f9d4eae25d	drm/i915/execlists: Simplify gen12_csb_parse Having decided that we only care about the promotion predicate, we can simplify gen12_csb_parse to simply check whether we need to jump to a new queue. Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Cc: Mika Kuoppala <mika.kuoppala@linux.intel.com> Cc: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com> Reviewed-by: Andi Shyti <andi.shyti@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20190925130845.17952-1-chris@chris-wilson.co.uk	2019-09-25 19:26:47 +01:00
Chris Wilson	7dc56af526	drm/i915/selftests: Verify the LRC register layout between init and HW Before we submit the first context to HW, we need to construct a valid image of the register state. This layout is defined by the HW and should match the layout generated by HW when it saves the context image. Asserting that this should be equivalent should help avoid any undefined behaviour and verify that we haven't missed anything important! Of course, having insisted that the initial register state within the LRC should match that returned by HW, we need to ensure that it does. v2: Drop the RELATIVE_MMIO flag from gen11, we ignore it for constructing the lrc image. Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Cc: Mika Kuoppala <mika.kuoppala@linux.intel.com> Cc: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com> Reviewed-by: Mika Kuoppala <mika.kuoppala@linux.intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20190924145950.3011-1-chris@chris-wilson.co.uk	2019-09-24 17:27:19 +01:00
Chris Wilson	e2144503bf	drm/i915: Prevent bonded requests from overtaking each other on preemption Force bonded requests to run on distinct engines so that they cannot be shuffled onto the same engine where timeslicing will reverse the order. A bonded request will often wait on a semaphore signaled by its master, creating an implicit dependency -- if we ignore that implicit dependency and allow the bonded request to run on the same engine and before its master, we will cause a GPU hang. [Whether it will hang the GPU is debatable, we should keep on timeslicing and each timeslice should be "accidentally" counted as forward progress, in which case it should run but at one-half to one-third speed.] We can prevent this inversion by restricting which engines we allow ourselves to jump to upon preemption, i.e. baking in the arrangement established at first execution. (We should also consider capturing the implicit dependency using i915_sched_add_dependency(), but first we need to think about the constraints that requires on the execution/retirement ordering.) Fixes: `8ee36e048c` ("drm/i915/execlists: Minimalistic timeslicing") References: `ee1136908e` ("drm/i915/execlists: Virtual engine bonding") Testcase: igt/gem_exec_balancer/bonded-slice Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Cc: Mika Kuoppala <mika.kuoppala@linux.intel.com> Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com> Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20190923152844.8914-3-chris@chris-wilson.co.uk	2019-09-23 20:44:14 +01:00
Chris Wilson	cb2377a919	drm/i915: Fixup preempt-to-busy vs reset of a virtual request Due to the nature of preempt-to-busy the execlists active tracking and the schedule queue may become temporarily desync'ed (between resubmission to HW and its ack from HW). This means that we may have unwound a request and passed it back to the virtual engine, but it is still inflight on the HW and may even result in a GPU hang. If we detect that GPU hang and try to reset, the hanging request->engine will no longer match the current engine, which means that the request is not on the execlists active list and we should not try to find an older incomplete request. Given that we have deduced this must be a request on a virtual engine, it is the single active request in the context and so must be guilty (as the context is still inflight, it is prevented from being executed on another engine as we process the reset). Fixes: `22b7a426bb` ("drm/i915/execlists: Preempt-to-busy") Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Cc: Mika Kuoppala <mika.kuoppala@linux.intel.com> Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com> Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20190923152844.8914-2-chris@chris-wilson.co.uk	2019-09-23 20:44:14 +01:00
Chris Wilson	b647c7df01	drm/i915: Fixup preempt-to-busy vs resubmission of a virtual request As preempt-to-busy leaves the request on the HW as the resubmission is processed, that request may complete in the background and even cause a second virtual request to enter queue. This second virtual request breaks our "single request in the virtual pipeline" assumptions. Furthermore, as the virtual request may be completed and retired, we lose the reference the virtual engine assumes is held. Normally, just removing the request from the scheduler queue removes it from the engine, but the virtual engine keeps track of its singleton request via its ve->request. This pointer needs protecting with a reference. v2: Drop unnecessary motion of rq->engine = owner Fixes: `22b7a426bb` ("drm/i915/execlists: Preempt-to-busy") Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Cc: Mika Kuoppala <mika.kuoppala@linux.intel.com> Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com> Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20190923152844.8914-1-chris@chris-wilson.co.uk	2019-09-23 20:43:59 +01:00
Chris Wilson	0d7cf7bc15	drm/i915/execlists: Refactor -EIO markup of hung requests Pull setting -EIO on the hung requests into its own utility function. Having allowed ourselves to short-circuit submission of completed requests, we can now do the mark_eio() prior to submission and avoid some redundant operations. Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20190923110056.15176-4-chris@chris-wilson.co.uk	2019-09-23 16:21:38 +01:00
Chris Wilson	c0bb487dc1	drm/i915: Only enqueue already completed requests If we are asked to submit a completed request, just move it onto the active-list without modifying it's payload. If we try to emit the modified payload of a completed request, we risk racing with the ring->head update during retirement which may advance the head past our breadcrumb and so we generate a warning for the emission being behind the RING_HEAD. v2: Commentary for the sneaky, shared responsibility between functions. v3: Spelling mistakes and bonus assertion Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com> Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20190923110056.15176-3-chris@chris-wilson.co.uk	2019-09-23 16:21:37 +01:00
Chris Wilson	3231f8c011	drm/i915/execlists: Drop redundant list_del_init(&rq->sched.link) Since amalgamating the queued and active lists in commit `422d7df4f0` ("drm/i915: Replace engine->timeline with a plain list"), performing a i915_request_submit() will remove the request from the execlists priority queue. References: `422d7df4f0` ("drm/i915: Replace engine->timeline with a plain list") Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com> Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20190923110056.15176-2-chris@chris-wilson.co.uk	2019-09-23 16:21:36 +01:00
Chris Wilson	ae911b23d2	drm/i915/execlists: Relax assertion for a pinned context image on reset A gpu hang can occur at any time, given a sufficiently angry gpu. An example is when it forgets to perform a context-switch at the end of a request, leaving us with a hanging GPU on a completed request. Here, we may retire the request, only leaving its context alive via the active barrier. When we reset the GPU on a completed request, we do not modify its context image (just updating the ring state) and can safely defer the assertion that we have the image pinned and ready to modify. Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=111639 Fixes: `dffa8feb30` ("drm/i915/perf: Assert locking for i915_init_oa_perf_state()") Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Cc: Mika Kuoppala <mika.kuoppala@linux.intel.com> Reviewed-by: Mika Kuoppala <mika.kuoppala@linux.intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20190923110056.15176-1-chris@chris-wilson.co.uk	2019-09-23 16:21:36 +01:00
Chris Wilson	d19d71fc2b	drm/i915: Mark i915_request.timeline as a volatile, rcu pointer The request->timeline is only valid until the request is retired (i.e. before it is completed). Upon retiring the request, the context may be unpinned and freed, and along with it the timeline may be freed. We therefore need to be very careful when chasing rq->timeline that the pointer does not disappear beneath us. The vast majority of users are in a protected context, either during request construction or retirement, where the timeline->mutex is held and the timeline cannot disappear. It is those few off the beaten path (where we access a second timeline) that need extra scrutiny -- to be added in the next patch after first adding the warnings about dangerous access. One complication, where we cannot use the timeline->mutex itself, is during request submission onto hardware (under spinlocks). Here, we want to check on the timeline to finalize the breadcrumb, and so we need to impose a second rule to ensure that the request->timeline is indeed valid. As we are submitting the request, it's context and timeline must be pinned, as it will be used by the hardware. Since it is pinned, we know the request->timeline must still be valid, and we cannot submit the idle barrier until after we release the engine->active.lock, ergo while submitting and holding that spinlock, a second thread cannot release the timeline. v2: Don't be lazy inside selftests; hold the timeline->mutex for as long as we need it, and tidy up acquiring the timeline with a bit of refactoring (i915_active_add_request) Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com> Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20190919111912.21631-1-chris@chris-wilson.co.uk	2019-09-20 10:24:09 +01:00
Chris Wilson	c45e788d95	drm/i915/tgl: Suspend pre-parser across GTT invalidations Before we execute a batch, we must first issue any and all TLB invalidations so that batch picks up the new page table entries. Tigerlake's preparser is weakening our post-sync CS_STALL inside the invalidate pipe-control and allowing the loading of the batch buffer before we have setup its page table (and so it loads the wrong page and executes indefinitely). The igt_cs_tlb indicates that this issue can only be observed on rcs, even though the preparser is common to all engines. Alternatively, we could do TLB shootdown via mmio on updating the GTT. By inserting the pre-parser disable inside EMIT_INVALIDATE, we will also accidentally fixup execution that writes into subsequent batches, such as gem_exec_whisper and even relocations performed on the GPU. We should be careful not to allow this disable to become baked into the uABI! The issue is that if userspace relies on our disabling of the HW optimisation, when we are ready to enable that optimisation, userspace will then be broken... Testcase: igt/i915_selftests/live_gtt/igt_cs_tlb Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=111753 Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Cc: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com> Cc: Mika Kuoppala <mika.kuoppala@linux.intel.com> Acked-by: Mika Kuoppala <mika.kuoppala@linux.intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20190919151811.9526-1-chris@chris-wilson.co.uk	2019-09-20 09:47:52 +01:00
Chris Wilson	c210e85b8f	drm/i915/tgl: Extend MI_SEMAPHORE_WAIT On Tigerlake, MI_SEMAPHORE_WAIT grew an extra dword, so be sure to update the length field and emit that extra parameter and any padding noop as required. v2: Define the token shift while we are adding the updated MI_SEMAPHORE_WAIT v3: Use int instead of bool in the addition so that readers are not left wondering about the intricacies of the C spec. Now they just have to worry what the integer value of a boolean operation is... Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Cc: Mika Kuoppala <mika.kuoppala@linux.intel.com> Cc: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com> Cc: Michal Winiarski <michal.winiarski@intel.com> Reviewed-by: Mika Kuoppala <mika.kuoppala@linux.intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20190917123055.28965-1-chris@chris-wilson.co.uk	2019-09-17 15:33:21 +01:00
Chris Wilson	ee73e2795b	drm/i915/tgl: Disable preemption while being debugged We see failures where the context continues executing past a preemption event, eventually leading to situations where a request has executed before we have event submitted it to HW! It seems like tgl is ignoring our RING_TAIL updates, but more likely is that there is a missing update required for our semaphore waits around preemption. v2: And disable internal semaphore usage Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Cc: Mika Kuoppala <mika.kuoppala@linux.intel.com> Acked-by: Mika Kuoppala <mika.kuoppala@linux.intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20190912132313.12751-1-chris@chris-wilson.co.uk	2019-09-12 20:45:23 +01:00
Chris Wilson	a17592effd	drm/i915/execlists: Ensure the context is reloaded after a GPU reset After we manipulate the context to allow replay after a GPU reset, force that context to be reloaded. This should be a layer of paranoia, for if the GPU was reset, the context will no longer be resident! Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Cc: Mika Kuoppala <mika.kuoppala@linux.intel.com> Acked-by: Mika Kuoppala <mika.kuoppala@linux.intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20190912092933.4729-2-chris@chris-wilson.co.uk	2019-09-12 12:59:46 +01:00
Chris Wilson	582a6f90aa	drm/i915/execlists: Add a paranoid flush of the CSB pointers upon reset After a GPU reset, we need to drain all the CS events so that we have an accurate picture of the execlists state at the time of the reset. Be paranoid and force a read of the CSB write pointer from memory. Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Cc: Mika Kuoppala <mika.kuoppala@linux.intel.com> Reviewed-by: Mika Kuoppala <mika.kuoppala@linux.intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20190912092933.4729-1-chris@chris-wilson.co.uk	2019-09-12 12:59:45 +01:00
Chris Wilson	198d253366	drm/i915/execlists: Ignore lost completion events Icelake hit an issue where it missed reporting a completion event and instead jumped straight to a idle->active event (skipping over the active->idle and not even hitting the lite-restore preemption). 661497511us : process_csb: rcs0 cs-irq head=11, tail=0 661497512us : process_csb: rcs0 csb[0]: status=0x10008002:0x00000020 [lite-restore] 661497512us : trace_ports: rcs0: preempted { 28cc8:11052, 0:0 } 661497513us : trace_ports: rcs0: promote { 28cc8:11054, 0:0 } 661497514us : __i915_request_submit: rcs0 fence 28cc8:11056, current 11052 661497514us : __execlists_submission_tasklet: rcs0: queue_priority_hint:-2147483648, submit:yes 661497515us : trace_ports: rcs0: submit { 28cc8:11056, 0:0 } 661497530us : process_csb: rcs0 cs-irq head=0, tail=1 661497530us : process_csb: rcs0 csb[1]: status=0x10008002:0x00000020 [lite-restore] 661497531us : trace_ports: rcs0: preempted { 28cc8:11054!, 0:0 } 661497535us : trace_ports: rcs0: promote { 28cc8:11056, 0:0 } 661497540us : __i915_request_submit: rcs0 fence 28cc8:11058, current 11054 661497544us : __execlists_submission_tasklet: rcs0: queue_priority_hint:-2147483648, submit:yes 661497545us : trace_ports: rcs0: submit { 28cc8:11058, 0:0 } 661497553us : process_csb: rcs0 cs-irq head=1, tail=2 661497553us : process_csb: rcs0 csb[2]: status=0x10000001:0x00000000 [idle->active] 661497574us : process_csb: process_csb:1538 GEM_BUG_ON(*execlists->active) Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Cc: Mika Kuoppala <mika.kuoppala@linux.intel.com> Reviewed-by: Mika Kuoppala <mika.kuoppala@linux.intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20190907084334.28952-1-chris@chris-wilson.co.uk	2019-09-10 11:39:59 +01:00
Chris Wilson	fa9a09f150	drm/i915/execlists: Clear STOP_RING bit on reset During reset, we try to ensure no forward progress of the CS prior to the reset by setting the STOP_RING bit in RING_MI_MODE. Since gen9, this register is context saved and do we end up in the odd situation where we save the STOP_RING bit and so try to stop the engine again immediately upon resume. This is quite unexpected and causes us to complain about an early CS completion event! Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=111514 Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Cc: Mika Kuoppala <mika.kuoppala@linux.intel.com> Reviewed-by: Mika Kuoppala <mika.kuoppala@linux.intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20190910080208.4223-1-chris@chris-wilson.co.uk	2019-09-10 11:04:17 +01:00
Chris Wilson	d810583fc2	drm/i915/execlists: Remove incorrect BUG_ON for schedule-out As we may unwind incomplete requests (for preemption) prior to processing the CSB and the schedule-out events, we may update rq->engine (resetting it to point back to the parent virtual engine) prior to calling execlists_schedule_out(), invalidating the assertion that the request still points to the inflight engine. (The likelihood of this is increased if the CSB interrupt processing is pushed to the ksoftirqd for being too slow and direct submission overtakes it.) Tvrtko summarised it as: "So unwind from direct submission resets rq->engine and races with process_csb from the tasklet which notices request has actually completed." Reported-by: Vinay Belgaumkar <vinay.belgaumkar@intel.com> Fixes: `df40306902` ("drm/i915/execlists: Lift process_csb() out of the irq-off spinlock") Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Cc: Mika Kuoppala <mika.kuoppala@linux.intel.com> Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com> Cc: Vinay Belgaumkar <vinay.belgaumkar@intel.com> Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20190907105046.19934-1-chris@chris-wilson.co.uk	2019-09-09 11:28:06 +01:00
Michel Thierry	5bf05dc58d	drm/i915/tgl: Register state context definition for Gen12 Gen12 has subtle changes in the reg state context offsets (some fields are gone, some are in a different location), compared to previous Gens. The simplest approach seems to be keeping Gen12 (and future platform) changes apart from the previous gens, while keeping the registers that are contiguous in functions we can reuse. v2: alias, virtual engine, rpcs, prune unused regs v3: use engine base (Daniele), take ctx_bb for all Bspec: 46255 Cc: Michal Wajdeczko <michal.wajdeczko@intel.com> Cc: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com> Cc: Chris Wilson <chris@chris-wilson.co.uk> Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com> Cc: José Roberto de Souza <jose.souza@intel.com> Signed-off-by: Michel Thierry <michel.thierry@intel.com> Signed-off-by: Lucas De Marchi <lucas.demarchi@intel.com> Signed-off-by: Mika Kuoppala <mika.kuoppala@linux.intel.com> Reviewed-by: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com> Tested-by: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com> [ickle: Tweaked the GEM_WARN_ON after settling on a compromise with Daniele] Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Link: https://patchwork.freedesktop.org/patch/msgid/20190906122314.2146-2-mika.kuoppala@linux.intel.com	2019-09-06 18:12:58 +01:00
Mika Kuoppala	cdb736fa8b	drm/i915: Use engine relative LRIs on context setup Daniele pointed out that relative mmio works differently in on context restore. Instead of adding the engine mmio base to offset, it masks out the base and adds bits [12:2] to current engine base. This should allow us to construct context register state to be applicable to all instances, including virtual. And avoid the trouble of updating the registers on virtual instances when submitting work. v2: only enable for gen12 for now (Mika) v3: make enabling readable (Chris) Bspec: 20206 Cc: Chris Wilson <chris@chris-wilson.co.uk> Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com> Cc: Michal Wajdeczko <michal.wajdeczko@intel.com> Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com> Cc: Lucas De Marchi <lucas.demarchi@intel.com> Cc: John Harrison <John.C.Harrison@Intel.com> Suggested-by: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com> Signed-off-by: Mika Kuoppala <mika.kuoppala@linux.intel.com> Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk> Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Link: https://patchwork.freedesktop.org/patch/msgid/20190906134957.25909-1-mika.kuoppala@linux.intel.com	2019-09-06 18:12:25 +01:00
Chris Wilson	dffa8feb30	drm/i915/perf: Assert locking for i915_init_oa_perf_state() We use the context->pin_mutex to serialise updates to the OA config and the registers values written into each new context. Document this relationship and assert we do hold the context->pin_mutex as used by gen8_configure_all_contexts() to serialise updates to the OA config itself. v2: Add a white-lie for when we call intel_gt_resume() from init. v3: Lie while we have the context pinned inside atomic reset. Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Cc: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> #v1 Link: https://patchwork.freedesktop.org/patch/msgid/20190830181929.18663-1-chris@chris-wilson.co.uk	2019-08-31 16:08:28 +01:00
Chris Wilson	0b718ba1e8	drm/i915/gtt: Downgrade Cherryview back to aliasing-ppgtt With the upcoming change in timing (dramatically reducing the latency between manipulating the ppGTT and execution), no amount of tweaking could save Cherryview, it would always fail to invalidate its TLB. Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Acked-by: Matthew Auld <matthew.auld@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20190830180000.24608-2-chris@chris-wilson.co.uk	2019-08-30 20:49:56 +01:00
Chris Wilson	11988e3938	drm/i915/execlists: Try rearranging breadcrumb flush The addition of the DC_FLUSH failed to ensure sanctity of the post-sync write as CI immediately got a completion CS-event before the breadcrumb was coherent. So let's try the other idea of moving the post-sync write into the CS_STALL. References: https://bugs.freedesktop.org/show_bug.cgi?id=111514 References: `e8f6b4952e` ("drm/i915/execlists: Flush the post-sync breadcrumb write harder") Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Cc: Mika Kuoppala <mika.kuoppala@linux.intel.com> Acked-by: Mika Kuoppala <mika.kuoppala@linux.intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20190829081150.10271-2-chris@chris-wilson.co.uk	2019-08-29 23:47:36 +01:00
Chris Wilson	e8f6b4952e	drm/i915/execlists: Flush the post-sync breadcrumb write harder Quite rarely we see that the CS completion event fires before the breadcrumb is coherent, which presumably is a result of the CS_STALL not waiting for the post-sync operation. Try throwing in a DC_FLUSH into the following pipecontrol to see if that makes any difference. Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Cc: Mika Kuoppala <mika.kuoppala@linux.intel.com> Acked-by: Mika Kuoppala <mika.kuoppala@linux.intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20190827120615.31390-1-chris@chris-wilson.co.uk	2019-08-28 14:05:31 +01:00
Daniele Ceraolo Spurio	8a9a982767	drm/i915: use a separate context for gpu relocs The CS pre-parser can pre-fetch commands across memory sync points and starting from gen12 it is able to pre-fetch across BB_START and BB_END boundaries as well, so when we emit gpu relocs the pre-parser might fetch the target location of the reloc before the memory write lands. The parser can't pre-fetch across the ctx switch, so we use a separate context to guarantee that the memory is synchronized before the parser can get to it. Note that there is no risk of the CS doing a lite restore from the reloc context to the user context, even if the two have the same hw_id, because since gen11 the CS also checks the LRCA when deciding if it can lite-restore. v2: limit new context to gen12+, release in eb_destroy, add a comment in emit_fini_breadcrumb (Chris). Suggested-by: Chris Wilson <chris@chris-wilson.co.uk> Signed-off-by: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com> Cc: Chris Wilson <chris@chris-wilson.co.uk> Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk> Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Link: https://patchwork.freedesktop.org/patch/msgid/20190827185805.21799-1-daniele.ceraolospurio@intel.com	2019-08-27 21:14:43 +01:00
Chris Wilson	cccdce1dd0	drm/i915: Make engine's batch pool safe for use with virtual engines A virtual engine itself does not have a batch pool, but we can gleefully use any of its siblings instead. Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Cc: Matthew Auld <matthew.auld@intel.com> Reviewed-by: Matthew Auld <matthew.auld@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20190827135935.3831-1-chris@chris-wilson.co.uk	2019-08-27 16:42:12 +01:00
Chris Wilson	a20ab592d1	drm/i915/execlists: Set priority hint prior to submission Since we now run process_csb() outside of the engine->active.lock, we can process a CS-event immediately upon our ELSP write. As we currently inspect the pending queue after the ELSP write, there is an opportunity for a CS-event to update the pending queue before we can read it, making ourselves chases an invalid pointer. Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=111427 Fixes: `df40306902` ("drm/i915/execlists: Lift process_csb() out of the irq-off spinlock") Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Cc: Mika Kuoppala <mika.kuoppala@linux.intel.com> Reviewed-by: Mika Kuoppala <mika.kuoppala@linux.intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20190821142336.21609-1-chris@chris-wilson.co.uk	2019-08-21 17:32:27 +01:00
Lucas De Marchi	13e53c5c53	drm/i915/tgl: Introduce initial Tiger Lake workarounds Add empty workaround hooks for Tiger Lake. The workarounds will be added on separate patches. We were already applying WaRsForcewakeAddDelayForAck, which is indeed still valid, so also update the comment. Cc: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com> Signed-off-by: Lucas De Marchi <lucas.demarchi@intel.com> Reviewed-by: Radhakrishna Sripada <radhakrishna.sripada@intel.com> Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Link: https://patchwork.freedesktop.org/patch/msgid/20190817093902.2171-21-lucas.demarchi@intel.com	2019-08-20 15:23:33 +01:00

1 2 3

127 Commits