linux_dsm_epyc7002

mirror of https://github.com/AuxXxilium/linux_dsm_epyc7002.git synced 2024-12-27 15:15:02 +07:00

Author	SHA1	Message	Date
Chris Wilson	97c1635397	drm/i915/execlists: Ensure the tasklet is decoupled upon shutdown As we only cancel the timers asynchronously, they may still be running on another CPU as we shutdown, raising one last softirq. So be safe and make sure the tasklet is flushed before destroying the engine's memory. Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Acked-by: Mika Kuoppala <mika.kuoppala@linux.intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20191129172542.1222810-1-chris@chris-wilson.co.uk	2019-11-29 20:09:14 +00:00
Chris Wilson	7ce596a803	drm/i915/gem: Take timeline->mutex to walk list-of-requests Though the context is closed and so no more requests can be added to the timeline, retirement can still be removing requests. It can even be removing the very request we are inspecting and so cause us to wander into dead links. Serialise with the retirement by taking the timeline->mutex used for guarding the timeline->requests list. Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=112404 Fixes: `4a31741521` ("drm/i915/gem: Refine occupancy test in kill_context()") Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com> Cc: Mika Kuoppala <mika.kuoppala@linux.intel.com> Cc: Matthew Auld <matthew.auld@intel.com> Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com> Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20191129151845.1092933-1-chris@chris-wilson.co.uk	2019-11-29 20:09:14 +00:00
Ville Syrjälä	8d9875b47a	drm/i915: Don't set undefined bits in dirty_pipes skl_commit_modeset_enables() straight up compares dirty_pipes with a bitmask of already committed pipes. If we set bits in dirty_pipes for non-existent pipes that comparison will never work right. So let's limit ourselves to bits that exist. And we'll do the same for the active_pipes_changed bitmask. Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20191011200949.7839-5-ville.syrjala@linux.intel.com Reviewed-by: Stanislav Lisovskiy <stanislav.lisovskiy@intel.com>	2019-11-29 21:49:16 +02:00
Chris Wilson	d92f77deef	Revert "drm/i915: use a separate context for gpu relocs" Since commit `c45e788d95` ("drm/i915/tgl: Suspend pre-parser across GTT invalidations"), we now disable the advanced preparser on Tigerlake for the invalidation phase at the start of the batch, we no longer need to emit the GPU relocations from a second context as they are now flushed inlined. References: `8a9a982767` ("drm/i915: use a separate context for gpu relocs") References: `c45e788d95` ("drm/i915/tgl: Suspend pre-parser across GTT invalidations") Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Cc: Mika Kuoppala <mika.kuoppala@linux.intel.com> Cc: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com> Reviewed-by: Mika Kuoppala <mika.kuoppala@linux.intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20191129124846.949100-1-chris@chris-wilson.co.uk	2019-11-29 14:23:53 +00:00
Chris Wilson	0cb7da1062	drm/i915/selftests: Wait only on the expected barrier Wait on only the last request on the kernel_context after emitting a barrier so that we do not wait for everything in general and by doing so cause an accidental emission of the barrier! Bugzilla; https://bugs.freedesktop.org/show_bug.cgi?id=112405 Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20191129103455.744389-1-chris@chris-wilson.co.uk	2019-11-29 14:23:53 +00:00
Chris Wilson	b006869c6e	drm/i915/selftests: Always lock the drm_mm around insert/remove Be paranoid and make sure the drm_mm is locked whenever we insert/remove our own nodes. Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Reviewed-by: Matthew Auld <matthew.auld@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20191129095659.665381-1-chris@chris-wilson.co.uk	2019-11-29 14:23:53 +00:00
Chris Wilson	6930573279	drm/i915/selftests: Use sgt_iter for huge_pages_free Use the normal sgt_iter to walk the pages scatterlist on free so that we handle the error path correctly. Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=112225 Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Reviewed-by: Matthew Auld <matthew.auld@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20191128232946.546831-1-chris@chris-wilson.co.uk	2019-11-29 14:23:53 +00:00
Michel Thierry	ff690b2111	drm/i915/tgl: Implement Wa_1604555607 Implement Wa_1604555607 (set the DS pairing timer to 128 cycles). FF_MODE2 is part of the register state context, that's why it is implemented here. At TGL A0 stepping, FF_MODE2 register read back is broken, hence disabling the WA verification. v2: Rebased on top of the WA refactoring (Oscar) v3: Correctly add to ctx_workarounds_init (Michel) v4: uncore read is used [Tvrtko] Macros as used for MASK definition [Chris] v5: Skip the Wa_1604555607 verification [Ram] i915 ptr retrieved from engine. [Tvrtko] v6: Added wa_add as a wrapper for __wa_add [Chris] wa_add is directly called instead of new wrapper [tvrtko] BSpec: 19363 HSDES: 1604555607 Signed-off-by: Michel Thierry <michel.thierry@intel.com> Signed-off-by: Ramalingam C <ramlingam.c@intel.com> Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com> [v5] Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com> Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20191128021005.3350-1-ramalingam.c@intel.com	2019-11-29 11:48:20 +00:00
Chris Wilson	952d1a6b0f	drm/i915/selftests: Drop local vm reference! After obtaining a local reference to the vm from the context, remember to drop it before it goes out of scope! Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Cc: Matthew Auld <matthew.auld@intel.com> Reviewed-by: Matthew Auld <matthew.auld@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20191128185402.110678-1-chris@chris-wilson.co.uk	2019-11-28 20:11:13 +00:00
Chris Wilson	212d9994d0	drm/i915/selftests: Count the number of engines used Don't rely on the RUNTIME_INFO() when we loop over a particular context and only run on a filtered set of engines. Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20191127223252.3777141-1-chris@chris-wilson.co.uk	2019-11-28 11:59:13 +00:00
Chris Wilson	7983990ca9	drm/i915/selftests: Try to show where the pulse went We have a case of a mysteriously absent pulse, so dump the engine details to see if we can find out what happened to it. References: https://bugs.freedesktop.org/show_bug.cgi?id=112405 Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Reviewed-by: Mika Kuoppala <mika.kuoppala@linux.intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20191128102546.3857140-1-chris@chris-wilson.co.uk	2019-11-28 11:39:50 +00:00
Chris Wilson	cd30a50317	drm/i915/gem: Excise the per-batch whitelist from the context One does not lightly add a new hidden struct_mutex dependency deep within the execbuf bowels! The immediate suspicion in seeing the whitelist cached on the context, is that it is intended to be preserved between batches, as the kernel is quite adept at caching small allocations itself. But no, it's sole purpose is to serialise command submission in order to save a kmalloc on a slow, slow path! By removing the whitelist dependency from the context, our freedom to chop the big struct_mutex is greatly augmented. v2: s/set_bit/__set_bit/ as the whitelist shall never be accessed concurrently. Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com> Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com> Cc: Mika Kuoppala <mika.kuoppala@linux.intel.com> Reviewed-by: Mika Kuoppala <mika.kuoppala@linux.intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20191128113424.3885958-1-chris@chris-wilson.co.uk	2019-11-28 11:39:50 +00:00
Chris Wilson	e3f3a0f269	drm/i915/gt: Defer breadcrumb processing to after the irq handler The design of our interrupt handlers is that we ack the receipt of the interrupt first, inside the critical section where the master interrupt control is off and other cpus cannot start processing the next interrupt; and then process the interrupt events afterwards. However, Icelake introduced a whole new set of banked GT_IIR that are inherently serialised and slow to retrieve the IIR and must be processed within the critical section. We can still push our breadcrumbs out of this critical section by using our irq_worker. On bdw+, this should not make too much of a difference as we only slightly defer the breadcrumbs, but on icl+ this should make a big difference to our throughput of interrupts from concurrently executing engines. Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20191127115813.3345823-1-chris@chris-wilson.co.uk	2019-11-27 17:02:14 +00:00
Chris Wilson	df9f85d858	drm/i915: Serialise i915_active_fence_set() with itself The expected downside to commit `58b4c1a07a` ("drm/i915: Reduce nested prepare_remote_context() to a trylock") was that it would need to return -EAGAIN to userspace in order to resolve potential mutex inversion. Such an unsightly round trip is unnecessary if we could atomically insert a barrier into the i915_active_fence, so make it happen. Currently, we use the timeline->mutex (or some other named outer lock) to order insertion into the i915_active_fence (and so individual nodes of i915_active). Inside __i915_active_fence_set, we only need then serialise with the interrupt handler in order to claim the timeline for ourselves. However, if we remove the outer lock, we need to ensure the order is intact between not only multiple threads trying to insert themselves into the timeline, but also with the interrupt handler completing the previous occupant. We use xchg() on insert so that we have an ordered sequence of insertions (and each caller knows the previous fence on which to wait, preserving the chain of all fences in the timeline), but we then have to cmpxchg() in the interrupt handler to avoid overwriting the new occupant. The only nasty side-effect is having to temporarily strip off the RCU-annotations to apply the atomic operations, otherwise the rules are much more conventional! Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=112402 Fixes: `58b4c1a07a` ("drm/i915: Reduce nested prepare_remote_context() to a trylock") Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com> Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20191127134527.3438410-1-chris@chris-wilson.co.uk	2019-11-27 17:02:14 +00:00
Chris Wilson	730eaeb524	drm/i915/gt: Manual rc6 entry upon parking Now that we rapidly park the GT when the GPU idles, we often find ourselves idling faster than the RC6 promotion timer. Thus if we tell the GPU to enter RC6 manually as we park, we can do so quicker (by around 50ms, half an EI on average) and marginally increase our powersaving across all execlists platforms. v2: Now with a selftest to check we can enter RC6 manually Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Cc: Andi Shyti <andi.shyti@intel.com> Cc: Mika Kuoppala <mika.kuoppala@linux.intel.com> Cc: Imre Deak <imre.deak@intel.com> Acked-by: Imre Deak <imre.deak@intel.com> Reviewed-by: Andi Shyti <andi.shyti@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20191127095657.3209854-1-chris@chris-wilson.co.uk	2019-11-27 12:53:27 +00:00
Clint Taylor	7e7129dcbd	drm/i915: Disable display interrupts during display IRQ handler During the Display Interrupt Service routine the Display Interrupt Enable bit must be disabled, The interrupts handled, then the Display Interrupt Enable bit must be set to prevent possible missed interrupts. Bspec: 49212 V2: Change Title to remove SDE reference. V3: Fix TAB spacing. Cc: Lucas De Marchi <lucas.demarchi@intel.com> Cc: Aditya Swarup <aditya.swarup@intel.com> Reviewed-by: Matt Roper <matthew.d.roper@intel.com> Signed-off-by: Clint Taylor <clinton.a.taylor@intel.com> Signed-off-by: Lucas De Marchi <lucas.demarchi@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20191121201455.2558-1-clinton.a.taylor@intel.com	2019-11-26 16:56:57 -08:00
Kai Vehmanen	071309814d	drm/i915/dp: fix DP audio for PORT_A on gen12+ Starting with gen12, PORT_A can be connected to a transcoder with audio support. Modify the existing logic that disabled audio on PORT_A unconditionally. Signed-off-by: Kai Vehmanen <kai.vehmanen@linux.intel.com> Reviewed-by: Matt Roper <matthew.d.roper@intel.com> Signed-off-by: Matt Roper <matthew.d.roper@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20191125125313.17584-1-kai.vehmanen@linux.intel.com	2019-11-26 16:12:44 -08:00
Stanislav Lisovskiy	9b93daa93e	drm/i915: Support more QGV points According to BSpec 53998, there is a mask of max 8 SAGV/QGV points we need to support. Bumping this up to keep the CI happy(currently preventing tests to run), until all SAGV changes land. v2: Fix second plane where QGV points were hardcoded as well. v3: Change the naming of I915_NUM_SAGV_POINTS to be I915_NUM_QGV_POINTS, as more meaningful (Ville Syrjälä) Reviewed-by: Ville Syrjälä <ville.syrjala@linux.intel.com> Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=112189 Signed-off-by: Stanislav Lisovskiy <stanislav.lisovskiy@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20191125160800.14740-1-stanislav.lisovskiy@intel.com [vsyrjala: Add missing braces around else (checkpatch), fix Bugzilla tag] Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com>	2019-11-26 18:27:37 +02:00
Chris Wilson	58b4c1a07a	drm/i915: Reduce nested prepare_remote_context() to a trylock On context retiring, we may invoke the kernel_context to unpin this context. Elsewhere, we may use the kernel_context to modify this context. This currently leads to an AB-BA lock inversion, so we need to back-off from the contended lock, and repeat. Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=111732 Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Fixes: `a9877da2d6` ("drm/i915/oa: Reconfigure contexts on the fly") Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20191126065521.2331017-1-chris@chris-wilson.co.uk	2019-11-26 12:45:45 +00:00
Chris Wilson	5766a5ffc6	drm/i915: Default to a more lenient forced preemption timeout Based on a sampling of a number of benchmarks across platforms, by default opt for a much more lenient timeout so that we should not adversely affect existing "good" clients. 640ms ought to be enough for anyone. Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=112169 Fixes: `3a7a92aba8` ("drm/i915/execlists: Force preemption") Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com> Cc: Eero Tamminen <eero.t.tamminen@intel.com> Cc: Dmitry Rogozhkin <dmitry.v.rogozhkin@intel.com> Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20191125162737.2161069-1-chris@chris-wilson.co.uk	2019-11-26 09:38:44 +00:00
Chris Wilson	34f5fe1243	drm/i915/selftests: Move mock_vma to the heap to reduce stack_frame An i915_vma struct on the stack may push the frame over the limit, if set conservatively, so move it to the heap. Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Cc: Matthew Auld <matthew.auld@intel.com> Reviewed-by: Matthew Auld <matthew.auld@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20191125124856.1761176-1-chris@chris-wilson.co.uk	2019-11-25 15:09:14 +00:00
Chris Wilson	4f88f8747f	drm/i915/gt: Schedule request retirement when timeline idles The major drawback of commit `7e34f4e4aa` ("drm/i915/gen8+: Add RC6 CTX corruption WA") is that it disables RC6 while Skylake (and friends) is active, and we do not consider the GPU idle until all outstanding requests have been retired and the engine switched over to the kernel context. If userspace is idle, this task falls onto our background idle worker, which only runs roughly once a second, meaning that userspace has to have been idle for a couple of seconds before we enable RC6 again. Naturally, this causes us to consume considerably more energy than before as powersaving is effectively disabled while a display server (here's looking at you Xorg) is running. As execlists will get a completion event as each context is completed, we can use this interrupt to queue a retire worker bound to this engine to cleanup idle timelines. We will then immediately notice the idle engine (without userspace intervention or the aid of the background retire worker) and start parking the GPU. Thus during light workloads, we will do much more work to idle the GPU faster... Hopefully with commensurate power saving! v2: Watch context completions and only look at those local to the engine when retiring to reduce the amount of excess work we perform. Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=112315 References: `7e34f4e4aa` ("drm/i915/gen8+: Add RC6 CTX corruption WA") References: `2248a28384` ("drm/i915/gen8+: Add RC6 CTX corruption WA") Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com> Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20191125105858.1718307-3-chris@chris-wilson.co.uk	2019-11-25 13:17:18 +00:00
Chris Wilson	88a4655e75	drm/i915/gt: Adapt engine_park synchronisation rules for engine_retire In the next patch, we will introduce a new asynchronous retirement worker, fed by execlists CS events. Here we may queue a retirement as soon as a request is submitted to HW (and completes instantly), and we also want to process that retirement as early as possible and cannot afford to postpone (as there may not be another opportunity to retire it for a few seconds). To allow the new async retirer to run in parallel with our submission, pull the __i915_request_queue (that passes the request to HW) inside the timelines spinlock so that the retirement cannot release the timeline before we have completed the submission. v2: Actually to play nicely with engine_retire, we have to raise the timeline.active_lock before releasing the HW. intel_gt_retire_requsts() is still serialised by the outer lock so they cannot see this intermediate state, and engine_retire is serialised by HW submission. Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com> Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20191125105858.1718307-2-chris@chris-wilson.co.uk	2019-11-25 13:17:18 +00:00
Chris Wilson	de5825beae	drm/i915: Serialise with engine-pm around requests on the kernel_context As the engine->kernel_context is used within the engine-pm barrier, we have to be careful when emitting requests outside of the barrier, as the strict timeline locking rules do not apply. Instead, we must ensure the engine_park() cannot be entered as we build the request, which is simplest by taking an explicit engine-pm wakeref around the request construction. Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20191125105858.1718307-1-chris@chris-wilson.co.uk	2019-11-25 13:17:18 +00:00
Chris Wilson	da0ef77e1e	drm/i915/execlists: Fixup cancel_port_requests() I rushed a last minute correction to cancel_port_requests() to prevent the snooping of *execlists->active as the inflight array was being updated, without noticing we iterated the inflight array starting from active! Oops. Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=112387 Fixes: `331bf90591` ("drm/i915/gt: Mark the execlists->active as the primary volatile access") Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Cc: Mika Kuoppala <mika.kuoppala@linux.intel.com> Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com> Reviewed-by: Andi Shyti <andi.shyti@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20191125112520.1760492-1-chris@chris-wilson.co.uk	2019-11-25 13:17:18 +00:00
Chris Wilson	331bf90591	drm/i915/gt: Mark the execlists->active as the primary volatile access Since we want to do a lockless read of the current active request, and that request is written to by process_csb also without serialisation, we need to instruct gcc to take care in reading the pointer itself. Otherwise, we have observed execlists_active() to report 0x40. [ 2400.760381] igt/para-4098 1..s. 2376479300us : process_csb: rcs0 cs-irq head=3, tail=4 [ 2400.760826] igt/para-4098 1..s. 2376479303us : process_csb: rcs0 csb[4]: status=0x00000001:0x00000000 [ 2400.761271] igt/para-4098 1..s. 2376479306us : trace_ports: rcs0: promote { b9c59:2622, b9c55:2624 } [ 2400.761726] igt/para-4097 0d... 2376479311us : __i915_schedule: rcs0: -2147483648->3, inflight:0000000000000040, rq:ffff888208c1e940 which is impossible! The answer is that as we keep the existing execlists->active pointing into the array as we copy over that array, the unserialised read may see a partial pointer value. Fixes: `df40306902` ("drm/i915/execlists: Lift process_csb() out of the irq-off spinlock") Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Reviewed-by: Mika Kuoppala <mika.kuoppala@linux.intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20191125094318.1630806-1-chris@chris-wilson.co.uk	2019-11-25 09:45:37 +00:00
Chris Wilson	bae21dacd7	drm/i915: Switch kunmap() to take the page not vaddr On converting from kunmap_atomic() to kunamp() one must remember the latter takes the struct page, the former the vaddr. Fixes: `48715f7001` ("drm/i915: Avoid atomic context for error capture") Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Reviewed-by: Mika Kuoppala <mika.kuoppala@linux.intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20191125091409.1630385-1-chris@chris-wilson.co.uk	2019-11-25 09:24:31 +00:00
Chris Wilson	3b054a1c03	drm/i915/selftests: Include the subsubtest name for live_parallel_engines Include the name of the failing subsubtest, should it fails. Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Reviewed-by: Matthew Auld <matthew.auld@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20191123191547.925360-1-chris@chris-wilson.co.uk	2019-11-23 21:49:27 +00:00
Tvrtko Ursulin	9acc99d8f2	drm/i915/query: Align flavour of engine data lookup Commit `750e76b4f9` ("drm/i915/gt: Move the [class][inst] lookup for engines onto the GT") changed the engine query to iterate over uabi engines but left the buffer size calculation look at the physical engine count. Difference has no practical consequence but it is nicer to align both queries. Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com> Fixes: `750e76b4f9` ("drm/i915/gt: Move the [class][inst] lookup for engines onto the GT") Cc: Chris Wilson <chris@chris-wilson.co.uk> Cc: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com> Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk> Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Link: https://patchwork.freedesktop.org/patch/msgid/20191122104115.29610-1-tvrtko.ursulin@linux.intel.com	2019-11-23 19:33:12 +00:00
Juston Li	6025ba1204	drm/i915: coffeelake supports hdcp2.2 This includes other platforms that utilize the same gen graphics as CFL: AML, WHL and CML. Signed-off-by: Juston Li <juston.li@intel.com> Reviewed-by: Ramalingam C <ramalingam.c@intel.com> Signed-off-by: Lyude Paul <lyude@redhat.com> Link: https://patchwork.freedesktop.org/patch/msgid/20191011181918.29618-1-juston.li@intel.com	2019-11-22 14:11:58 -05:00
Chris Wilson	e8e61f105a	drm/i915/selftests: Flush the active callbacks Before checking the current i915_active state for the asynchronous work we submitted, flush any ongoing callback. This ensures that our sampling is robust and does not sporadically fail due to bad timing as the work is running on another cpu. v2: Drop the fence callback sync, retiring under the lock should be good enough to synchronize with engine_retire() and the intel_gt_retire_requests() background worker. Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20191122132404.690440-1-chris@chris-wilson.co.uk	2019-11-22 17:54:12 +00:00
Chris Wilson	cfd821b243	drm/i915/selftests: Force bonded submission to overlap Bonded request submission is designed to allow requests to execute in parallel as laid out by the user. If the master request is already finished before its bonded pair is submitted, the pair were not destined to run in parallel and we lose the information about the master engine to dictate selection of the secondary. If the second request was required to be run on a particular engine in a virtual set, that should have been specified, rather than left to the whims of a random unconnected requests! In the selftest, I made the mistake of not ensuring the master would overlap with its bonded pairs, meaning that it could indeed complete before we submitted the bonds. Those bonds were then free to select any available engine in their virtual set, and not the one expected by the test. Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com> Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20191122112152.660743-1-chris@chris-wilson.co.uk	2019-11-22 13:06:36 +00:00
Chris Wilson	67a3acaab7	drm/i915: Use a ctor for TYPESAFE_BY_RCU i915_request As we start peeking into requests for longer and longer, e.g. incorporating use of spinlocks when only protected by an rcu_read_lock(), we need to be careful in how we reset the request when recycling and need to preserve any barriers that may still be in use as the request is reset for reuse. Quoting Linus Torvalds: > If there is refcounting going on then why use SLAB_TYPESAFE_BY_RCU? .. because the object can be accessed (by RCU) after the refcount has gone down to zero, and the thing has been released. That's the whole and only point of SLAB_TYPESAFE_BY_RCU. That flag basically says: "I may end up accessing this object after it has been free'd, because there may be RCU lookups in flight" This has nothing to do with constructors. It's ok if the object gets reused as an object of the same type and does not get re-initialized, because we're perfectly fine seeing old stale data. What it guarantees is that the slab isn't shared with any other kind of object, _and_ that the underlying pages are free'd after an RCU quiescent period (so the pages aren't shared with another kind of object either during an RCU walk). And it doesn't necessarily have to have a constructor, because the thing that a RCU walk will care about is (a) guaranteed to be an object that has been on some RCU list (so it's not a "new" object) (b) the RCU walk needs to have logic to verify that it's still the same object and hasn't been re-used as something else. In contrast, a SLAB_TYPESAFE_BY_RCU memory gets free'd and re-used immediately, but because it gets reused as the same kind of object, the RCU walker can "know" what parts have meaning for re-use, in a way it couidn't if the re-use was random. That said, it is subtle, and people should be careful. > So the re-use might initialize the fields lazily, not necessarily using a ctor. If you have a well-defined refcount, and use "atomic_inc_not_zero()" to guard the speculative RCU access section, and use "atomic_dec_and_test()" in the freeing section, then you should be safe wrt new allocations. If you have a completely new allocation that has "random stale content", you know that it cannot be on the RCU list, so there is no speculative access that can ever see that random content. So the only case you need to worry about is a re-use allocation, and you know that the refcount will start out as zero even if you don't have a constructor. So you can think of the refcount itself as always having a zero constructor, BUT you need to be careful with ordering. In particular, whoever does the allocation needs to then set the refcount to a non-zero value after it has initialized all the other fields. And in particular, it needs to make sure that it uses the proper memory ordering to do so. NOTE! One thing to be very worried about is that re-initializing whatever RCU lists means that now the RCU walker may be walking on the wrong list so the walker may do the right thing for this particular entry, but it may miss walking other entries. So then you can get spurious lookup failures, because the RCU walker never walked all the way to the end of the right list. That ends up being a much more subtle bug. Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com> Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20191122094924.629690-1-chris@chris-wilson.co.uk	2019-11-22 10:47:38 +00:00
Chris Wilson	f05bfce334	drm/i915/selftests: Shorten infinite wait for sseu Use our more regular igt_flush_test() to bind the wait-for-idle and error out instead of waiting around forever on critical failure. Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com> Reviewed-by: Stuart Summers <stuart.summers@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20191121233021.507400-1-chris@chris-wilson.co.uk	2019-11-22 08:44:57 +00:00
Chris Wilson	1ff2f9e26c	drm/i915/selftests: Always hold a reference on a waited upon request Whenever we wait on a request, make sure we actually hold a reference to it and that it cannot be retired/freed on another CPU! Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Acked-by: Mika Kuoppala <mika.kuoppala@linux.intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20191121071044.97798-4-chris@chris-wilson.co.uk	2019-11-21 16:14:57 +00:00
Chris Wilson	93b0e8fe47	drm/i915: Mark intel_wakeref_get() as a sleeper Assume that intel_wakeref_get() may take the mutex, and perform other sleeping actions in the course of its callbacks and so use might_sleep() to ensure that all callers abide. Anything that cannot sleep has to use e.g. intel_wakeref_get_if_active() to guarantee its avoidance of the non-atomic paths. Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Reviewed-by: Mika Kuoppala <mika.kuoppala@linux.intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20191121130528.309474-1-chris@chris-wilson.co.uk	2019-11-21 13:22:04 +00:00
Chris Wilson	c95d31c3df	drm/i915/execlists: Lock the request while validating it during promotion Since the request is already on the HW as we perform its validation, it and even its subsequent barrier may be concurrently retired before we process the assertions. If it is retired already and so off the HW, our assertions become void and we need to ignore them. Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=112363 Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Reviewed-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20191121103546.146487-1-chris@chris-wilson.co.uk	2019-11-21 12:14:45 +00:00
Chris Wilson	090a82e916	drm/i915/gt: Hold request reference while waiting for w/a verification As we wait upon a request, we must be holding a reference to it, and be wary that i915_request_add() consumes the passed in reference. Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com> Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20191121093326.134774-1-chris@chris-wilson.co.uk	2019-11-21 11:54:46 +00:00
Chris Wilson	2d0fb25136	drm/i915: Serialise with remote retirement Since retirement may be running in a worker on another CPU, it may be skipped in the local intel_gt_wait_for_idle(). To ensure the state is consistent for our sanity checks upon load, serialise with the remote retirer by waiting on the timeline->mutex. Outside of this use case, e.g. on suspend or module unload, we expect the slack to be picked up by intel_gt_pm_wait_for_idle() and so prefer to put the special case serialisation with retirement in its single user, for now at least. Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com> Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20191121071044.97798-2-chris@chris-wilson.co.uk	2019-11-21 11:53:55 +00:00
Chris Wilson	689122dcc3	Revert "drm/i915/gt: Wait for new requests in intel_gt_retire_requests()" From inside an active timeline in the execbuf ioctl, we may try to reclaim some space in the GGTT. We need GGTT space for all objects on !full-ppgtt platforms, and for context images everywhere. However, to free up space in the GGTT we may need to remove some pinned objects (e.g. context images) that require flushing the idle barriers to remove. For this we use the big hammer of intel_gt_wait_for_idle() However, commit `7936a22dd4` ("drm/i915/gt: Wait for new requests in intel_gt_retire_requests()") will continue spinning on the wait if a timeline is active but lacks requests, as is the case during execbuf reservation. Spinning forever is quite time consuming, so revert that commit and start again. In practice, the effect commit `7936a22dd4` was trying to achieve is accomplished by commit `1683d24c14` ("drm/i915/gt: Move new timelines to the end of active_list"), so there is no immediate rush to replace the looping. Testcase: igt/gem_exec_reloc/basic-range Fixes: `7936a22dd4` ("drm/i915/gt: Wait for new requests in intel_gt_retire_requests()") References: `1683d24c14` ("drm/i915/gt: Move new timelines to the end of active_list") Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com> Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com> Reviewed-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20191121071044.97798-1-chris@chris-wilson.co.uk	2019-11-21 07:59:30 +00:00
Stuart Summers	e18417b48b	drm/i915: Use intel_gt_pm_put_async in GuC submission path GuC submission path can be called from an interrupt context and so should use a worker to avoid holding a mutex. References: `07779a76ee` ("drm/i915: Mark up the calling context for intel_wakeref_put()") Signed-off-by: Stuart Summers <stuart.summers@intel.com> Cc: Chris Wilson <chris@chris-wilson.co.uk> Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk> Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Link: https://patchwork.freedesktop.org/patch/msgid/20191120211321.88021-1-stuart.summers@intel.com	2019-11-20 21:23:10 +00:00
Chris Wilson	e435c608e8	drm/i915/gt: Fixup config ifdeffery for pm_suspend_target_state pm_suspend_target_state is declared under CONFIG_PM_SLEEP but only defined under CONFIG_SUSPEND. Play safe and only use the symbol if it is both declared and defined. Reported-by: kbuild-all@lists.01.org Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Fixes: `a70a9e998e` ("drm/i915: Defer rc6 shutdown to suspend_late") Cc: Andi Shyti <andi.shyti@intel.com> Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com> Reviewed-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20191120182209.3967833-1-chris@chris-wilson.co.uk	2019-11-20 20:34:44 +00:00
Chris Wilson	88cec4973d	drm/i915/gt: Declare timeline.lock to be irq-free Now that we never allow the intel_wakeref callbacks to be invoked from interrupt context, we do not need the irqsafe spinlock for the timeline. Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20191120170858.3965380-1-chris@chris-wilson.co.uk	2019-11-20 17:12:11 +00:00
Chris Wilson	5cba288466	drm/i915/gt: Unlock engine-pm after queuing the kernel context switch In commit `a79ca656b6` ("drm/i915: Push the wakeref->count deferral to the backend"), I erroneously concluded that we last modify the engine inside __i915_request_commit() meaning that we could enable concurrent submission for userspace as we enqueued this request. However, this falls into a trap with other users of the engine->kernel_context waking up and submitting their request before the idle-switch is queued, with the result that the kernel_context is executed out-of-sequence most likely upsetting the GPU and certainly ourselves when we try to retire the out-of-sequence requests. As such we need to hold onto the effective engine->kernel_context mutex lock (via the engine pm mutex proxy) until we have finish queuing the request to the engine. v2: Serialise against concurrent intel_gt_retire_requests() v3: Describe the hairy locking scheme with intel_gt_retire_requests() for future reference. v4: Combine timeline->lock and engine pm release; it's hairy. Fixes: `a79ca656b6` ("drm/i915: Push the wakeref->count deferral to the backend") Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Cc: Mika Kuoppala <mika.kuoppala@linux.intel.com> Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com> Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20191120165514.3955081-2-chris@chris-wilson.co.uk	2019-11-20 16:58:26 +00:00
Chris Wilson	a6edbca74b	drm/i915/gt: Close race between engine_park and intel_gt_retire_requests The general concept was that intel_timeline.active_count was locked by the intel_timeline.mutex. The exception was for power management, where the engine->kernel_context->timeline could be manipulated under the global wakeref.mutex. This was quite solid, as we always manipulated the timeline only while we held an engine wakeref. And then we started retiring requests outside of struct_mutex, only using the timelines.active_list and the timeline->mutex. There we started manipulating intel_timeline.active_count outside of an engine wakeref, and so introduced a race between __engine_park() and intel_gt_retire_requests(), a race that could result in the engine->kernel_context not being added to the active timelines and so losing requests, which caused us to keep the system permanently powered up [and unloadable]. The race would be easy to close if we could take the engine wakeref for the timeline before we retire -- except timelines are not bound to any engine and so we would need to keep all active engines awake. The alternative is to guard intel_timeline_enter/intel_timeline_exit for use outside of the timeline->mutex. Fixes: `e5dadff4b0` ("drm/i915: Protect request retirement with timeline->mutex") Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Cc: Matthew Auld <matthew.auld@intel.com> Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com> Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20191120165514.3955081-1-chris@chris-wilson.co.uk	2019-11-20 16:57:33 +00:00
Chris Wilson	07779a76ee	drm/i915: Mark up the calling context for intel_wakeref_put() Previously, we assumed we could use mutex_trylock() within an atomic context, falling back to a worker if contended. However, such trickery is illegal inside interrupt context, and so we need to always use a worker under such circumstances. As we normally are in process context, we can typically use a plain mutex, and only defer to a work when we know we are being called from an interrupt path. Fixes: `51fbd8de87` ("drm/i915/pmu: Atomically acquire the gt_pm wakeref") References: `a0855d24fc` ("locking/mutex: Complain upon mutex API misuse in IRQ contexts") References: https://bugs.freedesktop.org/show_bug.cgi?id=111626 Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com> Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20191120125433.3767149-1-chris@chris-wilson.co.uk	2019-11-20 15:59:23 +00:00
Stuart Summers	8a126392b7	drm/i915: Do not initialize display BW when display not available When display is not available, finding the memory bandwidth available for display is not useful. Skip this sequence here. References: HSDES 1209978255 Signed-off-by: Stuart Summers <stuart.summers@intel.com> Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20191120011016.18049-1-stuart.summers@intel.com	2019-11-20 17:43:47 +02:00
Stuart Summers	e7862f476e	Skip MCHBAR queries when display is not available Platforms without display do not map the MCHBAR MMIO into the GFX device BAR. Skip this sequence when display is not available. Signed-off-by: Stuart Summers <stuart.summers@intel.com> Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20191120004505.149516-1-stuart.summers@intel.com	2019-11-20 17:43:47 +02:00
Ville Syrjälä	7451a074bf	drm/i915: Change .crtc_enable/disable() calling convention Just pass the atomic state+crtc to the .crtc_enable() .crtc_disable(). Life is easier when you don't have to think whether to pass the old or the new crtc state. Reviewed-by: Manasi Navare <manasi.d.navare@intel.com> Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20191118164430.27265-11-ville.syrjala@linux.intel.com	2019-11-20 17:43:47 +02:00
Ville Syrjälä	502d871459	drm/i915: s/pipe_config/new_crtc_state/ in .crtc_enable() Rename pipe_config to new_crtc_state in the .crtc_enable() hooks. The 'pipe_config' name is a zombie that we need to finally put down. Reviewed-by: Manasi Navare <manasi.d.navare@intel.com> Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20191118164430.27265-10-ville.syrjala@linux.intel.com	2019-11-20 17:43:47 +02:00

1 2 3 4 5 ...

875186 Commits