linux_dsm_epyc7002

mirror of https://github.com/AuxXxilium/linux_dsm_epyc7002.git synced 2024-12-28 11:18:45 +07:00

Author	SHA1	Message	Date
Mika Kuoppala	0c7c0c8e6f	drm/i915/gen12: Flush L3 Flush TDL,L3 and EUs Signed-off-by: Mika Kuoppala <mika.kuoppala@linux.intel.com> Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk> Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Link: https://patchwork.freedesktop.org/patch/msgid/20200506144734.29297-3-mika.kuoppala@linux.intel.com	2020-05-07 07:44:41 +01:00
Mika Kuoppala	32d7171ee2	drm/i915/gen12: Fix HDC pipeline flush HDC pipeline flush is bit on the first dword of the PIPE_CONTROL, not the second. Make it so. v2: function naming (Chris) Signed-off-by: Mika Kuoppala <mika.kuoppala@linux.intel.com> Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk> Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Link: https://patchwork.freedesktop.org/patch/msgid/20200506144734.29297-2-mika.kuoppala@linux.intel.com	2020-05-07 07:44:41 +01:00
Mika Kuoppala	f02ac414ba	Revert "drm/i915/tgl: Include ro parts of l3 to invalidate" This reverts commit `62037ffff2`. L3 ro cache invalidation is part of the dword0 of pipe control. Also it is not relevant to this gen. Signed-off-by: Mika Kuoppala <mika.kuoppala@linux.intel.com> Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk> Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Link: https://patchwork.freedesktop.org/patch/msgid/20200506144734.29297-1-mika.kuoppala@linux.intel.com	2020-05-07 07:44:40 +01:00
Chris Wilson	24fe5f2ab2	drm/i915: Propagate error from completed fences We need to preserve fatal errors from fences that are being terminated as we hook them up. Fixes: `ef46884975` ("drm/i915: Propagate fence errors") Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com> Cc: Matthew Auld <matthew.auld@intel.com> Reviewed-by: Matthew Auld <matthew.auld@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20200506162136.3325-1-chris@chris-wilson.co.uk	2020-05-06 18:23:14 +01:00
Matt Roper	9b2383a7ac	drm/i915/icp: Add Wa_14010685332 We need to toggle a SDE chicken bit on and then off as the final step when disabling interrupts in preparation for runtime suspend. Bspec: 33450 Bspec: 8402 Cc: Bob Paauwe <bob.j.paauwe@intel.com> Signed-off-by: Matt Roper <matthew.d.roper@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20200501213701.371443-1-matthew.d.roper@intel.com Reviewed-by: Bob Paauwe <bob.j.paauwe@intel.com>	2020-05-05 14:26:46 -07:00
Chris Wilson	977253df64	drm/i915/gt: Stop holding onto the pinned_default_state As we only restore the default context state upon banning a context, we only need enough of the state to run the ring and nothing more. That is we only need our bare protocontext. Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com> Cc: Mika Kuoppala <mika.kuoppala@linux.intel.com> Cc: Andi Shyti <andi.shyti@intel.com> Reviewed-by: Andi Shyti <andi.shyti@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20200504180745.15645-1-chris@chris-wilson.co.uk	2020-05-05 21:12:33 +01:00
Chris Wilson	b68be5c623	drm/i915/execlists: Record the active CCID from before reset If we cannot trust the reset will flush out the CS event queue such that process_csb() reports an accurate view of HW, we will need to search the active and pending contexts to determine which was actually running at the time we issued the reset. Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Reviewed-by: Mika Kuoppala <mika.kuoppala@linux.intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20200505084629.31365-1-chris@chris-wilson.co.uk	2020-05-05 12:05:40 +01:00
Stanislav Lisovskiy	f136c58a0d	drm/i915: Added required new PCode commands We need new PCode request commands and reply codes to be added as a preparation patch for QGV points restricting for new SAGV support. v2: - Extracted those changes into separate patch (Ville Syrjälä) v3: - Moved new PCode masks to another place from PCode commands(Ville) v4: - Moved new PCode masks to correspondent PCode command, with indentation(Ville) - Changed naming to ICL_ instead of GEN11_ to fit more nicely into existing definition style. Signed-off-by: Stanislav Lisovskiy <stanislav.lisovskiy@intel.com> Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20200505102247.32452-5-stanislav.lisovskiy@intel.com	2020-05-05 13:59:55 +03:00
Imre Deak	054318c7e3	drm/i915/tgl+: Fix interrupt handling for DP AUX transactions Unmask/enable AUX interrupts on all ports on TGL+. So far the interrupts worked only on port A, which meant each transaction on other ports took 10ms. Cc: <stable@vger.kernel.org> # v5.4+ Signed-off-by: Imre Deak <imre.deak@intel.com> Reviewed-by: Ville Syrjälä <ville.syrjala@linux.intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20200504075828.20348-1-imre.deak@intel.com	2020-05-05 11:59:48 +03:00
Chris Wilson	25fd6de315	drm/i915/gt: Small tidy of gen8+ breadcrumb emission Use a local to shrink a line under 80 columns, and refactor the common emit_xcs_breadcrumb() wrapper of ggtt-write. Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Reviewed-by: Mika Kuoppala <mika.kuoppala@linux.intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20200504180507.6017-1-chris@chris-wilson.co.uk	2020-05-05 09:16:59 +01:00
Chris Wilson	8757797ff9	drm/i915/selftests: Repeat the rps clock frequency measurement Repeat the measurement of the clock frequency a few times and use the median to try and reduce the systematic measurement error. Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Reviewed-by: Mika Kuoppala <mika.kuoppala@linux.intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20200504044903.7626-6-chris@chris-wilson.co.uk	2020-05-04 18:21:28 +01:00
Chris Wilson	0065e5f5cc	drm/i915/display: Warn if the FBC is still writing to stolen on removal If the FBC is still writing into stolen, it will overwrite any future users of that stolen region. Check before release, just to ease any concerns -- we can remove it again later if it is barking up the wrong tree. References: https://gitlab.freedesktop.org/drm/intel/-/issues/1635 Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Reviewed-by: Ville Syrjälä <ville.syrjala@linux.intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20200503180034.20010-1-chris@chris-wilson.co.uk	2020-05-04 17:11:51 +01:00
Sultan Alsawaf	690d22dafa	drm/i915: Don't enable WaIncreaseLatencyIPCEnabled when IPC is disabled In commit `5a7d202b15`, a logical AND was erroneously changed to an OR, causing WaIncreaseLatencyIPCEnabled to be enabled unconditionally for kabylake and coffeelake, even when IPC is disabled. Fix the logic so that WaIncreaseLatencyIPCEnabled is only used when IPC is enabled. Fixes: `5a7d202b15` ("drm/i915: Drop WaIncreaseLatencyIPCEnabled/1140 for cnl") Cc: stable@vger.kernel.org # 5.3.x+ Signed-off-by: Sultan Alsawaf <sultan@kerneltoast.com> Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20200430214654.51314-1-sultan@kerneltoast.com	2020-05-04 18:55:41 +03:00
Ville Syrjälä	2dd43144e8	drm/i915: Streamline the artihmetic All these ROUNDING_FACTORs and whatnot are making this thing hard to read. Get rid of them. And let's massage some of the fractions to give us less questionable intermediate results and perhaps less divisions. Also looks like a good helping of 64bit math stuff is needed to avoid some of overflows present in the current code. There might still be a few overflows, namely when calculating link_clks_available/samples_room (would require a huge hblank though), and potentially when calculating hblank_rise (not sure how large link_clks_active can get). It looks like we're still not calculating exactly what the spec says since we truncate tu_data and tu_line early. But I'm too lazy to figure out if we could avoid that. v2: Fix typo in commit msg (Uma) Remove ROUNDING_FACTOR define (Uma) s/5*link_clk+5*cdclk/5*(link_clk+cdclk)/ (Chris) Cc: Anshuman Gupta <anshuman.gupta@intel.com> Cc: Uma Shankar <uma.shankar@intel.com> Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20200429185457.26235-3-ville.syrjala@linux.intel.com Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk> Reviewed-by: Uma Shankar <uma.shankar@intel.com>	2020-05-04 18:44:53 +03:00
Ville Syrjälä	41ee86d6ee	drm/i915: Rename variables to be consistent with bspec Since the code seems insistent on using the variable names from the bspec formula, let's be consistent and use those names for all the things. For some reason 'link_clk' and 'lanes' were left out in the code until now. Cc: Anshuman Gupta <anshuman.gupta@intel.com> Cc: Uma Shankar <uma.shankar@intel.com> Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20200429185457.26235-2-ville.syrjala@linux.intel.com Reviewed-by: Uma Shankar <uma.shankar@intel.com>	2020-05-04 18:44:53 +03:00
Ville Syrjälä	d19b29be65	drm/i915: Nuke mode.vrefresh usage mode.vrefresh is rounded to the nearest integer. You don't want to use it anywhere that requires precision. Also I want to nuke it. vtotal*vrefresh == 1000*clock/htotal, so let's use the latter. Cc: Anshuman Gupta <anshuman.gupta@intel.com> Cc: Uma Shankar <uma.shankar@intel.com> Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20200429185457.26235-1-ville.syrjala@linux.intel.com Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk> Reviewed-by: Uma Shankar <uma.shankar@intel.com>	2020-05-04 18:44:52 +03:00
Ville Syrjälä	dab3aff7b1	drm/i915: Remove cnl pre-prod workarounds Remove all the stepping dependent cnl workarounds. Bspec lists more steppings than this so presumably these are classed as pre-production. And this is cnl after all so no one should really care anyway. Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20200430125822.21985-2-ville.syrjala@linux.intel.com Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>	2020-05-04 18:44:52 +03:00
Ville Syrjälä	25444ca6cb	drm/i915/fbc: Require linear fb stride to be multiple of 512 bytes on gen9/glk Display WA #1105 says that FBC requires PLANE_STRIDE to be a multiple of 512 bytes on gen9 and glk. This is definitely true for glk as certain tests (such as igt/kms_big_fb/linear-16bpp-rotate-0) are now failing when the display resolution results in a plane stride which is not a multiple of 512 bytes. Curiously I was not able to reproduce this on a KBL. First I suspected that our use of the FBC override stride explain this, but after trying to use the override stride on glk the test still failed. I did try both the old CHICKEN_MISC_4 way and the new FBC_STRIDE way, neither had any effect on the result. Anyways, we need this at least on glk. But let's trust the spec and apply the w/a for all gen9 as well, despite being unable to reproduce the problem. v2: s/FBC_CHICKEN/FBC_STRIDE/ in commit msg Cc: José Roberto de Souza <jose.souza@intel.com> Fixes: `691f7ba58d` ("drm/i915/display/fbc: Make fences a nice-to-have for GEN9+") Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20200429101034.8208-2-ville.syrjala@linux.intel.com Reviewed-by: Matt Roper <matthew.d.roper@intel.com>	2020-05-04 18:44:52 +03:00
Stanislav Lisovskiy	9ff79708c5	drm/i915: Rename bw_state to new_bw_state That is a preparation patch before next one where we introduce old_bw_state and a bunch of other changes as well. In a review comment it was suggested to split out at least that renaming into a separate patch, what is done here. v2: Removed spurious space Reviewed-by: Ville Syrjälä <ville.syrjala@linux.intel.com> Signed-off-by: Stanislav Lisovskiy <stanislav.lisovskiy@intel.com> Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20200423075902.21892-8-stanislav.lisovskiy@intel.com	2020-05-04 18:44:52 +03:00
Stanislav Lisovskiy	ecab0f3d05	drm/i915: Track active_pipes in bw_state We need to calculate SAGV mask also in a non-modeset commit, however currently active_pipes are only calculated for modesets in global atomic state, thus now we will be tracking those also in bw_state in order to be able to properly access global data. v2: - Removed pre/post plane SAGV updates from modeset(Ville) - Now tracking active pipes in intel_can_enable_sagv(Ville) v3: - lock global state if active_pipes change as well(Ville) Signed-off-by: Stanislav Lisovskiy <stanislav.lisovskiy@intel.com> Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20200430195634.7666-1-stanislav.lisovskiy@intel.com	2020-05-04 18:44:52 +03:00
Stanislav Lisovskiy	9728889f42	drm/i915: Use bw state for per crtc SAGV evaluation Future platforms require per-crtc SAGV evaluation and serializing global state when those are changed from different commits. v2: - Add has_sagv check to intel_crtc_can_enable_sagv so that it sets bit in reject mask. - Use bw_state in intel_pre/post_plane_enable_sagv instead of atomic state v3: - Fixed rebase conflict, now using intel_atomic_crtc_state_for_each_plane_state in order to call it from atomic check v4: - Use fb modifier from plane state v5: - Make intel_has_sagv static again(Ville) - Removed unnecessary NULL assignments(Ville) - Removed unnecessary SAGV debug(Ville) - Call intel_compute_sagv_mask only for modesets(Ville) - Serialize global state only if sagv results change, but not mask itself(Ville) v6: - use lock global state instead of serialize(Ville) v7: - use both global state lock and serialize depending on if we need to change only global state or access hw (Ville) Signed-off-by: Stanislav Lisovskiy <stanislav.lisovskiy@intel.com> Cc: Ville Syrjälä <ville.syrjala@intel.com> Cc: James Ausmus <james.ausmus@intel.com> Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20200430191757.18206-1-stanislav.lisovskiy@intel.com	2020-05-04 18:44:52 +03:00
Chris Wilson	e3d291301f	drm/i915/gem: Implement legacy MI_STORE_DATA_IMM The older arches did not convert MI_STORE_DATA_IMM to using the GTT, but left them writing to a physical address. The notes suggest that the primary reason would be so that the writes were cache coherent, as the CPU cache uses physical tagging. As such we did not implement the legacy variant of MI_STORE_DATA_IMM and so left all the relocations synchronous -- but with a small function to convert from the vma address into the physical address, we can implement asynchronous relocs on these older arches, fixing up a few tests that require them. In order to be able to test the legacy paths, refactor the gpu relocations so that we can hook them up to a selftest. v2: Use an array of offsets not enum labels for the selftest v3: Refactor the common igt_hexdump() Closes: https://gitlab.freedesktop.org/drm/intel/-/issues/757 Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com> Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20200504140629.28240-1-chris@chris-wilson.co.uk	2020-05-04 15:15:04 +01:00
Chris Wilson	f5b62bdbb6	drm/i915/gem: Specify address type for chained reloc batches It is required that a chained batch be in the same address domain as its parent, and also that must be specified in the command for earlier gen as it is not inferred from the chaining until gen6. Fixes: `964a9b0f61` ("drm/i915/gem: Use chained reloc batches") Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20200504125149.4396-1-chris@chris-wilson.co.uk	2020-05-04 14:28:48 +01:00
Chris Wilson	378974f7f9	drm/i915: Allow some leniency in PCU reads Extend the timeout for pcode reads to 20ms as they should not be performed along critical paths, and succeeding after a short delay is better than failing entirely. References: https://gitlab.freedesktop.org/drm/intel/-/issues/1800 Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Reviewed-by: Mika Kuoppala <mika.kuoppala@linux.intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20200504044903.7626-1-chris@chris-wilson.co.uk	2020-05-04 11:12:37 +01:00
Chris Wilson	6983dafa31	drm/i915/gem: Lazily acquire the device wakeref for freeing objects We only need the device wakeref on freeing the objects if we have to unbind the object from the global GTT, or otherwise update device information. If the objects are clean, we never need the wakeref, so avoid taking until required. Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Cc: Janusz Krzysztofik <janusz.krzysztofik@linux.intel.com> Reviewed-by: Janusz Krzysztofik <janusz.krzysztofik@linux.intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20200503171513.18704-1-chris@chris-wilson.co.uk	2020-05-04 11:12:37 +01:00
Chris Wilson	389b7f00c7	drm/i915/gt: Sanitize RPS interrupts upon resume Currently we clear and disable the RPS pm interrupts on module load, and presume that they remain disabled forevermore. However, the mask is cleared on suspend and so after resume they may start showing up again unexpectedly. Closes: https://gitlab.freedesktop.org/drm/intel/-/issues/1811 Fixes: `8e99299a04` ("drm/i915/gt: Track use of RPS interrupts in flags") Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Cc: Andi Shyti <andi@etezian.org> Reviewed-by: Andi Shyti <andi@etezian.org> Link: https://patchwork.freedesktop.org/patch/msgid/20200502173512.32353-1-chris@chris-wilson.co.uk	2020-05-03 08:24:36 +01:00
Chris Wilson	6f576d6277	drm/i915/gem: Try an alternate engine for relocations If at first we don't succeed, try try again. Not all engines may support the MI ops we need to perform asynchronous relocation patching, and so we end up falling back to a synchronous operation that has a liability of blocking. However, Tvrtko pointed out we don't need to use the same engine to perform the relocations as we are planning to execute the execbuf on, and so if we switch over to a working engine, we can perform the relocation asynchronously. The user execbuf will be queued after the relocations by virtue of fencing. This patch creates a new context per execbuf requiring asynchronous relocations on an unusable engines. This is perhaps a bit excessive and can be ameliorated by a small context cache, but for the moment we only need it for working around a little used engine on Sandybridge, and only if relocations are actually required to an active batch buffer. Now we just need to teach the relocation code to handle physical addressing for gen2/3, and we should then have universal support! Suggested-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com> Testcase: igt/gem_exec_reloc/basic-spin # snb Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com> Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20200501192945.22215-3-chris@chris-wilson.co.uk	2020-05-01 22:56:16 +01:00
Chris Wilson	0e97fbb080	drm/i915/gem: Use a single chained reloc batches for a single execbuf As we can now keep chaining together a relocation batch to process any number of relocations, we can keep building that relocation batch for all of the target vma. This avoiding emitting a new request into the ring for each target, consuming precious ring space and a potential stall. v2: Propagate the failure from submitting the relocation batch. Testcase: igt/gem_exec_reloc/basic-wide-active Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com> Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20200501192945.22215-2-chris@chris-wilson.co.uk	2020-05-01 22:56:15 +01:00
Chris Wilson	964a9b0f61	drm/i915/gem: Use chained reloc batches The ring is a precious resource: we anticipate to only use a few hundred bytes for a request, and only try to reserve that before we start. If we go beyond our guess in building the request, then instead of waiting at the start of execbuf before we hold any locks or other resources, we may trigger a wait inside a critical region. One example is in using gpu relocations, where currently we emit a new MI_BB_START from the ring every time we overflow a page of relocation entries. However, instead of inserting the command into the precious ring, we can chain the next page of relocation entries as MI_BB_START from the end of the previous. v2: Delay the emit_bb_start until after all the chained vma synchronisation is complete. Since the buffer pool batches are idle, this _should_ be a no-op, but one day we may have some fancy async GPU bindings for new vma! v3: Use pool/batch consistently, once we start thinking in terms of the batch vma, use batch->obj. v4: Explain the magic number 4. Tvrtko spotted that we lose propagation of the error for failing to submit the relocation request; that's easier to fix up in the next patch. Testcase: igt/gem_exec_reloc/basic-many-active Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20200501192945.22215-1-chris@chris-wilson.co.uk	2020-05-01 22:56:15 +01:00
Chris Wilson	9f909e215f	drm/i915: Implement vm_ops->access for gdb access into mmaps gdb uses ptrace() to peek and poke bytes of the target's address space. The driver must implement an vm_ops->access() handler or else gdb will be unable to inspect the pointer and report it as out-of-bounds. Worse than useless as it causes immediate suspicion of the valid GTT pointer, distracting the poor programmer trying to find his bug. v2: Write-protect readonly objects (Matthew). Testcase: igt/gem_mmap_gtt/ptrace Testcase: igt/gem_mmap_offset/ptrace Suggested-by: Kristian H. Kristensen <hoegsberg@google.com> Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Cc: Matthew Auld <matthew.auld@intel.com> Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com> Cc: Maciej Patelczyk <maciej.patelczyk@intel.com> Cc: Kristian H. Kristensen <hoegsberg@google.com> Reviewed-by: Matthew Auld <matthew.auld@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20200501145120.18830-1-chris@chris-wilson.co.uk	2020-05-01 17:30:47 +01:00
Chris Wilson	a211da9c77	drm/i915/gt: Make timeslicing an explicit engine property In order to allow userspace to rely on timeslicing to reorder their batches, we must support preemption of those user batches. Declare timeslicing as an explicit property that is a combination of having the kernel support and HW support. Suggested-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com> Fixes: `8ee36e048c` ("drm/i915/execlists: Minimalistic timeslicing") Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com> Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20200501122249.12417-1-chris@chris-wilson.co.uk	2020-05-01 15:17:33 +01:00
Chris Wilson	3b55cdeb8f	drm/i915/pmu: Keep a reference to module while active While a perf event is open, keep a reference to the module so we don't remove the driver internals mid-sampling. Testcase: igt/perf_pmu/module-unload Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com> Cc: stable@vger.kernel.org Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20200430183324.23984-1-chris@chris-wilson.co.uk	2020-05-01 09:24:34 +01:00
Chris Wilson	16e8745967	drm/i915/gt: Move the batch buffer pool from the engine to the gt Since the introduction of 'soft-rc6', we aim to park the device quickly and that results in frequent idling of the whole device. Currently upon idling we free the batch buffer pool, and so this renders the cache ineffective for many workloads. If we want to have an effective cache of recently allocated buffers available for reuse, we need to decouple that cache from the engine powermanagement and make it timer based. As there is no reason then to keep it within the engine (where it once made retirement order easier to track), we can move it up the hierarchy to the owner of the memory allocations. v2: Hook up to debugfs/drop_caches to clear the cache on demand. Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Cc: Maarten Lankhorst <maarten.lankhorst@linux.intel.com> Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com> Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20200430111819.10262-2-chris@chris-wilson.co.uk	2020-04-30 19:12:02 +01:00
Joonas Lahtinen	230982d8d8	drm/i915: Update DRIVER_DATE to 20200430 Signed-off-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>	2020-04-30 11:13:21 +03:00
Joonas Lahtinen	8b46ed57f3	Merge tag 'gvt-next-2020-04-22' of https://github.com/intel/gvt-linux into drm-intel-next-queued gvt-next-2020-04-22 - remove non-upstream xen support bits (Christoph) - guest context shadow copy optimization (Yan) - guest context tracking for shadow skip optimization (Yan) Signed-off-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com> From: Zhenyu Wang <zhenyuw@linux.intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20200422051230.GH11247@zhen-hp.sh.intel.com	2020-04-30 10:53:21 +03:00
Zbigniew Kempczyński	79eb8c7f01	drm/i915/selftests: Add tiled blits selftest Extend coverage of the blitter client by exercising conversion to and from tiled sources. In the process we perform spot checks to verify that the tiling/detiling is being applied correctly, along with position invariance of the tiling parameters. Signed-off-by: Zbigniew Kempczyński <zbigniew.kempczynski@intel.com> Cc: Chris Wilson <chris@chris-wilson.co.uk> Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com> Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk> Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Link: https://patchwork.freedesktop.org/patch/msgid/20200430064957.14942-1-chris@chris-wilson.co.uk	2020-04-30 08:31:12 +01:00
Chris Wilson	de3b4d9361	drm/i915/gt: Restore aggressive post-boost downclocking We reduced the clocks slowly after a boost event based on the observation that the smoothness of animations suffered. However, since reducing the evaluation intervals, we should be able to respond to the rapidly fluctuating workload of a simple desktop animation and so restore the more aggressive downclocking. References: `2a8862d2f3` ("drm/i915: Reduce the RPS shock") Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Reviewed-by: Andi Shyti <andi.shyti@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20200429205446.3259-6-chris@chris-wilson.co.uk	2020-04-30 00:57:38 +01:00
Chris Wilson	3f88dde6ee	drm/i915/gt: Apply the aggressive downclocking to parking We treat parking as a manual RPS timeout event, and downclock the GPU for the next unpark and batch execution. However, having restored the aggressive downclocking and observed that we have very light workloads whose only interaction is through the manual parking events, carry over the aggressive downclocking to the fake RPS events. References: `21abf0bf16` ("drm/i915/gt: Treat idling as a RPS downclock event") Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Reviewed-by: Andi Shyti <andi.shyti@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20200429205446.3259-5-chris@chris-wilson.co.uk	2020-04-30 00:57:37 +01:00
Chris Wilson	36d516be86	drm/i915/gt: Switch to manual evaluation of RPS As with the realisation for soft-rc6, we respond to idling the engines within microseconds, far faster than the response times for HW RC6 and RPS. Furthermore, our fast parking upon idle, prevents HW RPS from running for many desktop workloads, as the RPS evaluation intervals are on the order of tens of milliseconds, but the typical workload is just a couple of milliseconds, but yet we still need to determine the best frequency for user latency versus power. Recognising that the HW evaluation intervals are a poor fit, and that they were deprecated [in bspec at least] from gen10, start to wean ourselves off them and replace the EI with a timer and our accurate busy-stats. The principal benefit of manually evaluating RPS intervals is that we can be more responsive for better performance and powersaving for both spiky workloads and steady-state. Closes: https://gitlab.freedesktop.org/drm/intel/-/issues/1698 Fixes: `98479ada42` ("drm/i915/gt: Treat idling as a RPS downclock event") Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Cc: Mika Kuoppala <mika.kuoppala@linux.intel.com> Cc: Andi Shyti <andi.shyti@intel.com> Reviewed-by: Andi Shyti <andi.shyti@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20200429205446.3259-4-chris@chris-wilson.co.uk	2020-04-30 00:57:37 +01:00
Chris Wilson	8e99299a04	drm/i915/gt: Track use of RPS interrupts in flags Use the new intel_rps.flags field to store whether or not interrupts are being used with RPS. Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Reviewed-by: Andi Shyti <andi@etezian.org> Link: https://patchwork.freedesktop.org/patch/msgid/20200429205446.3259-3-chris@chris-wilson.co.uk	2020-04-30 00:57:36 +01:00
Chris Wilson	9bad2adbdd	drm/i915/gt: Move rps.enabled/active to flags Pull the boolean intel_rps.enabled and intel_rps.active into a single flags field, in preparation for more. Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Reviewed-by: Andi Shyti <andi.shyti@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20200429205446.3259-2-chris@chris-wilson.co.uk	2020-04-30 00:57:35 +01:00
Chris Wilson	426d0073fb	drm/i915/gt: Always enable busy-stats for execlists In the near future, we will utilize the busy-stats on each engine to approximate the C0 cycles of each, and use that as an input to a manual RPS mechanism. That entails having busy-stats always enabled and so we can remove the enable/disable routines and simplify the pmu setup. As a consequence of always having the stats enabled, we can also show the current active time via sysfs/engine/xcs/active_time_ns. Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com> Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20200429205446.3259-1-chris@chris-wilson.co.uk	2020-04-30 00:57:34 +01:00
Chris Wilson	be1cb55a07	drm/i915/gt: Keep a no-frills swappable copy of the default context state We need to keep the default context state around to instantiate new contexts (aka golden rendercontext), and we also keep it pinned while the engine is active so that we can quickly reset a hanging context. However, the default contexts are large enough to merit keeping in swappable memory as opposed to kernel memory, so we store them inside shmemfs. Currently, we use the normal GEM objects to create the default context image, but we can throw away all but the shmemfs file. This greatly simplifies the tricky power management code which wants to run underneath the normal GT locking, and we definitely do not want to use any high level objects that may appear to recurse back into the GT. Though perhaps the primary advantage of the complex GEM object is that we aggressively cache the mapping, but here we are recreating the vm_area every time we unpark. At the worst, we add a lightweight cache, but first find a microbenchmark that is impacted. Having started to create some utility functions to make working with shmemfs objects easier, we can start putting them to wider use, where GEM objects are overkill, such as storing persistent error state. Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Cc: Matthew Auld <matthew.auld@intel.com> Cc: Ramalingam C <ramalingam.c@intel.com> Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com> Reviewed-by: Matthew Auld <matthew.auld@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20200429172429.6054-1-chris@chris-wilson.co.uk	2020-04-29 19:02:37 +01:00
Dan Carpenter	8c35a19576	drm/i915/selftests: fix error handling in __live_lrc_indirect_ctx_bb() If intel_context_create() fails then it leads to an error pointer dereference. I shuffled things around to make error handling easier. Fixes: `1dd47b54ba` ("drm/i915: Add live selftests for indirect ctx batchbuffers") Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com> Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk> Reviewed-by: Andi Shyti <andi.shyti@intel.com> Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Link: https://patchwork.freedesktop.org/patch/msgid/20200429132425.GE815283@mwanda	2020-04-29 15:16:35 +01:00
Chris Wilson	24aac336ff	drm/i915: Avoid dereferencing a dead context Once the intel_context is closed, the GEM context may be freed and so the link from intel_context.gem_context is invalid. <3>[ 219.782944] BUG: KASAN: use-after-free in intel_engine_coredump_alloc+0x1bc3/0x2250 [i915] <3>[ 219.782996] Read of size 8 at addr ffff8881d7dff0b8 by task kworker/0:1/12 <4>[ 219.783052] CPU: 0 PID: 12 Comm: kworker/0:1 Tainted: G U 5.7.0-rc2-g1f3ffd7683d54-kasan_118+ #1 <4>[ 219.783055] Hardware name: System manufacturer System Product Name/Z170 PRO GAMING, BIOS 3402 04/26/2017 <4>[ 219.783105] Workqueue: events heartbeat [i915] <4>[ 219.783109] Call Trace: <4>[ 219.783113] <IRQ> <4>[ 219.783119] dump_stack+0x96/0xdb <4>[ 219.783177] ? intel_engine_coredump_alloc+0x1bc3/0x2250 [i915] <4>[ 219.783182] print_address_description.constprop.6+0x16/0x310 <4>[ 219.783239] ? intel_engine_coredump_alloc+0x1bc3/0x2250 [i915] <4>[ 219.783295] ? intel_engine_coredump_alloc+0x1bc3/0x2250 [i915] <4>[ 219.783300] __kasan_report+0x137/0x190 <4>[ 219.783359] ? intel_engine_coredump_alloc+0x1bc3/0x2250 [i915] <4>[ 219.783366] kasan_report+0x32/0x50 <4>[ 219.783426] intel_engine_coredump_alloc+0x1bc3/0x2250 [i915] <4>[ 219.783481] execlists_reset+0x39c/0x13d0 [i915] <4>[ 219.783494] ? mark_held_locks+0x9e/0xe0 <4>[ 219.783546] ? execlists_hold+0xfc0/0xfc0 [i915] <4>[ 219.783551] ? lockdep_hardirqs_on+0x348/0x5f0 <4>[ 219.783557] ? _raw_spin_unlock_irqrestore+0x34/0x60 <4>[ 219.783606] ? execlists_submission_tasklet+0x118/0x3a0 [i915] <4>[ 219.783615] tasklet_action_common.isra.14+0x13b/0x410 <4>[ 219.783623] ? __do_softirq+0x1e4/0x9a7 <4>[ 219.783630] __do_softirq+0x226/0x9a7 <4>[ 219.783643] do_softirq_own_stack+0x2a/0x40 <4>[ 219.783647] </IRQ> <4>[ 219.783692] ? heartbeat+0x3e2/0x10f0 [i915] <4>[ 219.783696] do_softirq.part.13+0x49/0x50 <4>[ 219.783700] __local_bh_enable_ip+0x1a2/0x1e0 <4>[ 219.783748] heartbeat+0x409/0x10f0 [i915] <4>[ 219.783801] ? __live_idle_pulse+0x9f0/0x9f0 [i915] <4>[ 219.783806] ? lock_acquire+0x1ac/0x8a0 <4>[ 219.783811] ? process_one_work+0x811/0x1870 <4>[ 219.783827] ? rcu_read_lock_sched_held+0x9c/0xd0 <4>[ 219.783832] ? rcu_read_lock_bh_held+0xb0/0xb0 <4>[ 219.783836] ? _raw_spin_unlock_irq+0x1f/0x40 <4>[ 219.783845] process_one_work+0x8ca/0x1870 <4>[ 219.783848] ? lock_acquire+0x1ac/0x8a0 <4>[ 219.783852] ? worker_thread+0x1d0/0xb80 <4>[ 219.783864] ? pwq_dec_nr_in_flight+0x2c0/0x2c0 <4>[ 219.783870] ? do_raw_spin_lock+0x129/0x290 <4>[ 219.783886] worker_thread+0x82/0xb80 <4>[ 219.783895] ? __kthread_parkme+0xaf/0x1b0 <4>[ 219.783902] ? process_one_work+0x1870/0x1870 <4>[ 219.783906] kthread+0x34e/0x420 <4>[ 219.783911] ? kthread_create_on_node+0xc0/0xc0 <4>[ 219.783918] ret_from_fork+0x3a/0x50 <3>[ 219.783950] Allocated by task 1264: <4>[ 219.783975] save_stack+0x19/0x40 <4>[ 219.783978] __kasan_kmalloc.constprop.3+0xa0/0xd0 <4>[ 219.784029] i915_gem_create_context+0xa2/0xab8 [i915] <4>[ 219.784081] i915_gem_context_create_ioctl+0x1fa/0x450 [i915] <4>[ 219.784085] drm_ioctl_kernel+0x1d8/0x270 <4>[ 219.784088] drm_ioctl+0x676/0x930 <4>[ 219.784092] ksys_ioctl+0xb7/0xe0 <4>[ 219.784096] __x64_sys_ioctl+0x6a/0xb0 <4>[ 219.784100] do_syscall_64+0x94/0x530 <4>[ 219.784103] entry_SYSCALL_64_after_hwframe+0x49/0xb3 <3>[ 219.784120] Freed by task 12: <4>[ 219.784141] save_stack+0x19/0x40 <4>[ 219.784145] __kasan_slab_free+0x130/0x180 <4>[ 219.784148] kmem_cache_free_bulk+0x1bd/0x500 <4>[ 219.784152] kfree_rcu_work+0x1d8/0x890 <4>[ 219.784155] process_one_work+0x8ca/0x1870 <4>[ 219.784158] worker_thread+0x82/0xb80 <4>[ 219.784162] kthread+0x34e/0x420 <4>[ 219.784165] ret_from_fork+0x3a/0x50 Fixes: `2e46a2a0b0` ("drm/i915: Use explicit flag to mark unreachable intel_context") Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Acked-by: Akeem G Abodunrin <akeem.g.abodunrin@intel.com> Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20200428090255.10035-1-chris@chris-wilson.co.uk	2020-04-29 15:16:00 +01:00
Nathan Chancellor	2ea4a7ba9b	drm/i915/gt: Avoid uninitialized use of rpcurupei in frequency_show When building with clang + -Wuninitialized: drivers/gpu/drm/i915/gt/debugfs_gt_pm.c:407:7: warning: variable 'rpcurupei' is uninitialized when used here [-Wuninitialized] rpcurupei, ^~~~~~~~~ drivers/gpu/drm/i915/gt/debugfs_gt_pm.c:304:16: note: initialize the variable 'rpcurupei' to silence this warning u32 rpcurupei, rpcurup, rpprevup; ^ = 0 1 warning generated. rpupei is assigned twice; based on the second argument to intel_uncore_read, it seems this one should have been assigned to rpcurupei. Fixes: `9c878557b1` ("drm/i915/gt: Use the RPM config register to determine clk frequencies") Link: https://github.com/ClangBuiltLinux/linux/issues/1016 Signed-off-by: Nathan Chancellor <natechancellor@gmail.com> Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk> Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Link: https://patchwork.freedesktop.org/patch/msgid/20200429030051.920203-1-natechancellor@gmail.com	2020-04-29 07:46:21 +01:00
Chris Wilson	f6a7c21c99	drm/i915/execlists: Verify we don't submit two identical CCIDs Check that we do not submit two contexts into ELSP with the same CCID [upper portion of the descriptor]. References: https://gitlab.freedesktop.org/drm/intel/-/issues/1793 Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Reviewed-by: Mika Kuoppala <mika.kuoppala@linux.intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20200428184751.11257-3-chris@chris-wilson.co.uk	2020-04-28 22:17:36 +01:00
Chris Wilson	5c4a53e3b1	drm/i915/execlists: Track inflight CCID The presumption is that by using a circular counter that is twice as large as the maximum ELSP submission, we would never reuse the same CCID for two inflight contexts. However, if we continually preempt an active context such that it always remains inflight, it can be resubmitted with an arbitrary number of paired contexts. As each of its paired contexts will use a new CCID, eventually it will wrap and submit two ELSP with the same CCID. Rather than use a simple circular counter, switch over to a small bitmap of inflight ids so we can avoid reusing one that is still potentially active. Closes: https://gitlab.freedesktop.org/drm/intel/-/issues/1796 Fixes: `2935ed5339` ("drm/i915: Remove logical HW ID") Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Cc: Mika Kuoppala <mika.kuoppala@linux.intel.com> Cc: <stable@vger.kernel.org> # v5.5+ Reviewed-by: Mika Kuoppala <mika.kuoppala@linux.intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20200428184751.11257-2-chris@chris-wilson.co.uk	2020-04-28 22:17:36 +01:00
Chris Wilson	2632f174a2	drm/i915/execlists: Avoid reusing the same logical CCID The bspec is confusing on the nature of the upper 32bits of the LRC descriptor. Once upon a time, it said that it uses the upper 32b to decide if it should perform a lite-restore, and so we must ensure that each unique context submitted to HW is given a unique CCID [for the duration of it being on the HW]. Currently, this is achieved by using a small circular tag, and assigning every context submitted to HW a new id. However, this tag is being cleared on repinning an inflight context such that we end up re-using the 0 tag for multiple contexts. To avoid accidentally clearing the CCID in the upper 32bits of the LRC descriptor, split the descriptor into two dwords so we can update the GGTT address separately from the CCID. Closes: https://gitlab.freedesktop.org/drm/intel/-/issues/1796 Fixes: `2935ed5339` ("drm/i915: Remove logical HW ID") Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Cc: Mika Kuoppala <mika.kuoppala@linux.intel.com> Cc: <stable@vger.kernel.org> # v5.5+ Reviewed-by: Mika Kuoppala <mika.kuoppala@linux.intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20200428184751.11257-1-chris@chris-wilson.co.uk	2020-04-28 22:17:36 +01:00
Matt Atwood	f9d77427c3	drm/i915/tgl: Wa_14011059788 Reflect recent Bspec changes v2: fix whitespace, typo Signed-off-by: Matt Atwood <matthew.s.atwood@intel.com> Reviewed-by: Radhakrishna Sripada <Radhakrishna.sripada@intel.com> Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20200415193535.14597-1-matthew.s.atwood@intel.com	2020-04-28 11:14:34 -07:00
Chris Wilson	96a4faf524	drm/i915/selftests: Tweak the tolerance for clock ticks to 12.5% Give a small bump for our tolerance on comparing the expected vs measured clock ticks/time from 10% to 12.5% to accommodate a bad result on Sandybridge that was off by 10.3%. Hopefully, that is the worst we will see. Closes: https://gitlab.freedesktop.org/drm/intel/-/issues/1802 Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Reviewed-by: Mika Kuoppala <mika.kuoppala@linux.intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20200428114307.5153-1-chris@chris-wilson.co.uk	2020-04-28 14:25:21 +01:00
Colin Ian King	d631461d5c	drm/i915/gt: fix spelling mistake "evalution" -> "evaluation" There is a spelling mistake in a pr_notice message. Fix it. Signed-off-by: Colin Ian King <colin.king@canonical.com> Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk> Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Link: https://patchwork.freedesktop.org/patch/msgid/20200428084920.1035125-1-colin.king@canonical.com	2020-04-28 09:53:59 +01:00
Matt Roper	869129ee0c	drm/i915: Use proper fault mask in interrupt postinstall too The IRQ postinstall handling had open-coded pipe fault mask selection that never got updated for gen11. Switch it to use gen8_de_pipe_fault_mask() to ensure we don't miss updates for new platforms. Cc: José Roberto de Souza <jose.souza@intel.com> Fixes: `d506a65d56` ("drm/i915: Catch GTT fault errors for gen11+ planes") Signed-off-by: Matt Roper <matthew.d.roper@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20200424231423.4065231-1-matthew.d.roper@intel.com Reviewed-by: Ville Syrjälä <ville.syrjala@linux.intel.com>	2020-04-27 11:36:41 -07:00
Chris Wilson	6dc0d028f5	drm/i915/gt: Fix up clock frequency The bspec lists both the clock frequency and the effective interval. The interval corresponds to observed behaviour, so adjust the frequency to match. v2: Mika rightfully asked if we could measure the clock frequency from a selftest. Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Cc: Mika Kuoppala <mika.kuoppala@linux.intel.com> Acked-by: Mika Kuoppala <mika.kuoppala@linux.intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20200427154554.12736-1-chris@chris-wilson.co.uk	2020-04-27 17:34:33 +01:00
Chris Wilson	4243cd5388	drm/i915/gt: Sanitize GT first We see that if the HW doesn't actually sleep, the HW may eat the poison we set in its write-only HWSP during sanitize: intel_gt_resume.part.8: 0000:00:02.0 __gt_unpark: 0000:00:02.0 gt_sanitize: 0000:00:02.0 force:yes process_csb: 0000:00:02.0 vcs0: cs-irq head=5, tail=90 process_csb: 0000:00:02.0 vcs0: csb[0]: status=0x5a5a5a5a:0x5a5a5a5a assert_pending_valid: Nothing pending for promotion! The CS TAIL pointer should have been reset by reset_csb_pointers(), so in this case it is likely that we have read back from the CPU cache and so we must clflush our control over that page. In doing so, push the sanitisation to the start of the GT sequence so that our poisoning is assuredly before we start talking to the HW. References: https://gitlab.freedesktop.org/drm/intel/-/issues/1794 Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20200427084000.10999-1-chris@chris-wilson.co.uk	2020-04-27 11:39:23 +01:00
Chris Wilson	2759e39535	drm/i915/gt: Check cacheline is valid before acquiring The hwsp_cacheline pointer from i915_request is very, very flimsy. The i915_request.timeline (and the hwsp_cacheline) are lost upon retiring (after an RCU grace). Therefore we need to confirm that once we have the right pointer for the cacheline, it is not in the process of being retired and disposed of before we attempt to acquire a reference to the cacheline. <3>[ 547.208237] BUG: KASAN: use-after-free in active_debug_hint+0x6a/0x70 [i915] <3>[ 547.208366] Read of size 8 at addr ffff88822a0d2710 by task gem_exec_parall/2536 <4>[ 547.208547] CPU: 3 PID: 2536 Comm: gem_exec_parall Tainted: G U 5.7.0-rc2-ged7a286b5d02d-kasan_117+ #1 <4>[ 547.208556] Hardware name: Dell Inc. XPS 13 9350/, BIOS 1.4.12 11/30/2016 <4>[ 547.208564] Call Trace: <4>[ 547.208579] dump_stack+0x96/0xdb <4>[ 547.208707] ? active_debug_hint+0x6a/0x70 [i915] <4>[ 547.208719] print_address_description.constprop.6+0x16/0x310 <4>[ 547.208841] ? active_debug_hint+0x6a/0x70 [i915] <4>[ 547.208963] ? active_debug_hint+0x6a/0x70 [i915] <4>[ 547.208975] __kasan_report+0x137/0x190 <4>[ 547.209106] ? active_debug_hint+0x6a/0x70 [i915] <4>[ 547.209127] kasan_report+0x32/0x50 <4>[ 547.209257] ? i915_gemfs_fini+0x40/0x40 [i915] <4>[ 547.209376] active_debug_hint+0x6a/0x70 [i915] <4>[ 547.209389] debug_print_object+0xa7/0x220 <4>[ 547.209405] ? lockdep_hardirqs_on+0x348/0x5f0 <4>[ 547.209426] debug_object_assert_init+0x297/0x430 <4>[ 547.209449] ? debug_object_free+0x360/0x360 <4>[ 547.209472] ? lock_acquire+0x1ac/0x8a0 <4>[ 547.209592] ? intel_timeline_read_hwsp+0x4f/0x840 [i915] <4>[ 547.209737] ? i915_active_acquire_if_busy+0x66/0x120 [i915] <4>[ 547.209861] i915_active_acquire_if_busy+0x66/0x120 [i915] <4>[ 547.209990] ? __live_alloc.isra.15+0xc0/0xc0 [i915] <4>[ 547.210005] ? rcu_read_lock_sched_held+0xd0/0xd0 <4>[ 547.210017] ? print_usage_bug+0x580/0x580 <4>[ 547.210153] intel_timeline_read_hwsp+0xbc/0x840 [i915] <4>[ 547.210284] __emit_semaphore_wait+0xd5/0x480 [i915] <4>[ 547.210415] ? i915_fence_get_timeline_name+0x110/0x110 [i915] <4>[ 547.210428] ? lockdep_hardirqs_on+0x348/0x5f0 <4>[ 547.210442] ? _raw_spin_unlock_irq+0x2a/0x40 <4>[ 547.210567] ? __await_execution.constprop.51+0x2e0/0x570 [i915] <4>[ 547.210706] i915_request_await_dma_fence+0x8f7/0xc70 [i915] Fixes: `85bedbf191` ("drm/i915/gt: Eliminate the trylock for reading a timeline's hwsp") Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com> Cc: <stable@vger.kernel.org> # v5.6+ Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20200427093038.29219-1-chris@chris-wilson.co.uk	2020-04-27 11:39:23 +01:00
Chris Wilson	68ace460c5	drm/i915/execlists: Check preempt-timeout target before submit_ports We evaluate active, which is a pointer into execlists->inflight[] during dequeue to decide how long a preempt-timeout we need to apply. However, as soon as we do the submit_ports, the HW may send its ACK interrupt causing us to promote execlists->pending[] to execlists->inflight[], overwriting the value of active. We know *active is only stable until we submit (as we only submit when there is no pending promotion). [ 16.102328] BUG: KCSAN: data-race in execlists_dequeue+0x1449/0x1600 [i915] [ 16.102356] [ 16.102375] race at unknown origin, with read to 0xffff8881e9500488 of 8 bytes by task 429 on cpu 1: [ 16.102780] execlists_dequeue+0x1449/0x1600 [i915] [ 16.103160] __execlists_submission_tasklet+0x48/0x60 [i915] [ 16.103540] execlists_submit_request+0x38e/0x3c0 [i915] [ 16.103940] submit_notify+0x8f/0xc0 [i915] [ 16.104308] __i915_sw_fence_complete+0x61/0x420 [i915] [ 16.104683] i915_sw_fence_complete+0x58/0x80 [i915] [ 16.105054] i915_sw_fence_commit+0x16/0x20 [i915] [ 16.105457] __i915_request_queue+0x60/0x70 [i915] [ 16.105843] i915_gem_do_execbuffer+0x2d6b/0x4230 [i915] [ 16.106227] i915_gem_execbuffer2_ioctl+0x2b0/0x580 [i915] [ 16.106257] drm_ioctl_kernel+0xe9/0x130 [ 16.106279] drm_ioctl+0x27d/0x45e [ 16.106311] ksys_ioctl+0x89/0xb0 [ 16.106336] __x64_sys_ioctl+0x42/0x60 [ 16.106370] do_syscall_64+0x6e/0x2c0 [ 16.106397] entry_SYSCALL_64_after_hwframe+0x44/0xa9 Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com> Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20200426094231.21995-1-chris@chris-wilson.co.uk	2020-04-27 11:36:59 +01:00
Nick Desaulniers	9f4069b055	drm/i915: re-disable -Wframe-address The top level Makefile disables this warning. When building an i386_defconfig with Clang, this warning is triggered a whole bunch via includes of headers from perf. Link: https://github.com/ClangBuiltLinux/continuous-integration/pull/182 Signed-off-by: Nick Desaulniers <ndesaulniers@google.com> Reviewed-by: Nathan Chancellor <natechancellor@gmail.com> Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Link: https://patchwork.freedesktop.org/patch/msgid/20200426214215.139435-1-ndesaulniers@google.com	2020-04-27 09:58:27 +01:00
Mika Kuoppala	b8a1181122	drm/i915: Use indirect ctx bb to mend CMD_BUF_CCTL Use indirect ctx bb to load cmd buffer control value from context image to avoid corruption. v2: add to lrc layout (Chris) v3: end to a cacheline (Chris) v4: add to lrc fixed (Chris) v5: value in offset+1 Testcase: igt/i915_selftest/gt_lrc Signed-off-by: Mika Kuoppala <mika.kuoppala@linux.intel.com> Acked-by: Chris Wilson <chris@chris-wilson.co.uk> Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Link: https://patchwork.freedesktop.org/patch/msgid/20200424230632.30333-1-mika.kuoppala@linux.intel.com	2020-04-25 19:08:56 +01:00
Mika Kuoppala	1dd47b54ba	drm/i915: Add live selftests for indirect ctx batchbuffers Indirect ctx batchbuffers are a hw feature of which batch can be run, by hardware, during context restoration stage. Driver can setup a batchbuffer and also an offset into the context image. When context image is marshalled from memory to registers, and when the offset from the start of context register state is equal of what driver pre-determined, batch will run. So one can manipulate context restoration process at cacheline granularity, given some limitations, as you need to have rudimentaries in place before you can run a batch. Add selftest which will write the ring start register to a canary spot. This will test that hardware will run a batchbuffer for the context in question. v2: request wait fix, naming (Chris) v3: test order (Chris) v4: rebase Signed-off-by: Mika Kuoppala <mika.kuoppala@linux.intel.com> Acked-by: Chris Wilson <chris@chris-wilson.co.uk> Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Link: https://patchwork.freedesktop.org/patch/msgid/20200424214841.28076-3-mika.kuoppala@linux.intel.com	2020-04-25 19:08:18 +01:00
Mika Kuoppala	685d21096f	drm/i915: Add per ctx batchbuffer wa for timestamp Restoration of a previous timestamp can collide with updating the timestamp, causing a value corruption. Combat this issue by using indirect ctx bb to modify the context image during restoring process. We can preload value into scratch register. From which we then do the actual write with LRR. LRR is faster and thus less error prone as probability of race drops. v2: tidying (Chris) v3: lrr for all engines v4: grp v5: reg bit v6: wa_bb_offset, virtual engines (Chris) References: HSDES#16010904313 Testcase: igt/i915_selftest/gt_lrc Suggested-by: Joseph Koston <joseph.koston@intel.com> Cc: Chris Wilson <chris@chris-wilson.co.uk> Signed-off-by: Mika Kuoppala <mika.kuoppala@linux.intel.com> Acked-by: Chris Wilson <chris@chris-wilson.co.uk> Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Link: https://patchwork.freedesktop.org/patch/msgid/20200424230546.30271-1-mika.kuoppala@linux.intel.com	2020-04-25 18:39:32 +01:00
Mika Kuoppala	168c6d231b	drm/i915: Add engine scratch register to live_lrc_fixed General purpose registers are per engine and in a fixed location. Add to live_lrc_fixed. Signed-off-by: Mika Kuoppala <mika.kuoppala@linux.intel.com> Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk> Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Link: https://patchwork.freedesktop.org/patch/msgid/20200424214841.28076-1-mika.kuoppala@linux.intel.com	2020-04-25 17:58:33 +01:00
Chris Wilson	9669a50799	drm/i915: Drop rq->ring->vma peeking from error capture We only hold the active spinlock while dumping the error state, and this does not prevent another thread from retiring the request -- as it is quite possible that despite us capturing the current state, the GPU has completed the request. As such, it is dangerous to dereference state below the request as it may already be freed, and the simplest way to avoid the danger is not include it in the error state. Closes: https://gitlab.freedesktop.org/drm/intel/-/issues/1788 Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Cc: Mika Kuoppala <mika.kuoppala@linux.intel.com> Cc: Andi Shyti <andi.shyti@intel.com> Reviewed-by: Mika Kuoppala <mika.kuoppala@linux.intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20200424191410.27570-1-chris@chris-wilson.co.uk	2020-04-24 22:14:35 +01:00
Chris Wilson	9c878557b1	drm/i915/gt: Use the RPM config register to determine clk frequencies For many configuration details within RC6 and RPS we are programming intervals for the internal clocks. From gen11, these clocks are configured via the RPM_CONFIG and so for convenience, we would like to convert to/from more natural units (ns). Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Cc: Andi Shyti <andi.shyti@intel.com> Cc: Mika Kuoppala <mika.kuoppala@linux.intel.com> Reviewed-by: Andi Shyti <andi.shyti@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20200424162805.25920-2-chris@chris-wilson.co.uk	2020-04-24 19:10:17 +01:00
Chris Wilson	555a322429	drm/i915/gt: Trace RPS events Add tracek to the RPS events (interrupts, worker, enabling, threshold selection, frequency setting), so that if we have to debug reticent HW we have some traces to start from. Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Reviewed-by: Andi Shyti <andi.shyti@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20200424162805.25920-1-chris@chris-wilson.co.uk	2020-04-24 18:38:46 +01:00
Chris Wilson	1ebf7aaf3a	drm/i915/gt: Prefer soft-rc6 over RPS DOWN_TIMEOUT The RPS DOWN_TIMEOUT interrupt is signaled after a period of rc6, and upon receipt of that interrupt we reprogram the GPU clocks down to the next idle notch [to help conserve power during rc6]. However, on execlists, we benefit from soft-rc6 immediately parking the GPU and setting idle frequencies upon idling [within a jiffy], and here the interrupt prevents us from restarting from our last frequency. In the process, we can simply opt for a static pm_events mask and rely on the enable/disable interrupts to flush the worker on parking. This will reduce the amount of oscillation observed during steady workloads with microsleeps, as each time the rc6 timeout occurs we immediately follow with a waitboost for a dropped frame. Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Reviewed-by: Andi Shyti <andi.shyti@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20200422001703.1697-1-chris@chris-wilson.co.uk	2020-04-24 17:20:58 +01:00
Ville Syrjälä	7db8736db0	drm/i915: Split some long lines Split some overly long lines. Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20200420200610.31798-4-ville.syrjala@linux.intel.com Reviewed-by: José Roberto de Souza <jose.souza@intel.com>	2020-04-24 17:59:59 +03:00
Ville Syrjälä	8fdda38549	drm/i915: Introduce .set_idle_link_train() vfunc Relocate a bunch of DDI specific code from intel_dp.c to intel_ddi.c by introducing a .set_idle_link_train() vfunc. Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20200420200610.31798-3-ville.syrjala@linux.intel.com Reviewed-by: José Roberto de Souza <jose.souza@intel.com>	2020-04-24 17:57:15 +03:00
Ville Syrjälä	fb83f72c48	drm/i915: Introduce .set_signal_levels() vfunc Sort out some of the mess between intel_ddi.c and intel_dp.c by introducing a .set_signal_levels() vfunc. Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20200420200610.31798-2-ville.syrjala@linux.intel.com Reviewed-by: José Roberto de Souza <jose.souza@intel.com>	2020-04-24 17:53:26 +03:00
Ville Syrjälä	eee3f91195	drm/i915: Introduce .set_link_train() vfunc Sort out some of the mess between intel_ddi.c and intel_dp.c by introducing a .set_link_train() vfunc. Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20200420200610.31798-1-ville.syrjala@linux.intel.com Reviewed-by: José Roberto de Souza <jose.souza@intel.com>	2020-04-24 17:45:44 +03:00
Ville Syrjälä	d7ff281c6d	drm/i915: Have pfit calculations return an error code Change intel_{gmch,pch}_panel_fitting() to return a normal error vs. success int. We'll need this later to validate that the margin properties aren't misconfigured. Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20200422161917.17389-6-ville.syrjala@linux.intel.com Reviewed-by: Manasi Navare <manasi.d.navare@intel.com>	2020-04-24 17:37:22 +03:00
Ville Syrjälä	4cecc7c0cc	drm/i915: Pass connector state to pfit calculations Pass the entire connector state to intel_{gmch,pch}_panel_fitting(). For now we just need to get at .scaling_mode but in the future we'll want access to the margin properties as well. v2: Deal with intel_dp_ycbcr420_config() Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20200422161917.17389-5-ville.syrjala@linux.intel.com Reviewed-by: Manasi Navare <manasi.d.navare@intel.com>	2020-04-24 17:33:35 +03:00
Ville Syrjälä	f650af72e5	drm/i915: s/pipe_config/crtc_state/ in pfit functions Follow the new naming convention and call the crtc state "crtc_state", and while at it drop the redundant crtc argument. Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20200422161917.17389-4-ville.syrjala@linux.intel.com Reviewed-by: Manasi Navare <manasi.d.navare@intel.com>	2020-04-24 17:30:25 +03:00
Ville Syrjälä	35dd95b4ee	drm/i915: Use drm_rect to store the pfit window pos/size Make things a bit more abstract by replacing the pch_pfit.pos/size raw register values with a drm_rect. Makes it slightly more convenient to eg. compute the scaling factors. v2: Use drm_rect_init() Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20200422161917.17389-3-ville.syrjala@linux.intel.com Reviewed-by: Manasi Navare <manasi.d.navare@intel.com>	2020-04-24 17:28:37 +03:00
Ville Syrjälä	eac9c58539	drm/i915: Flatten a bunch of the pfit functions Most of the pfit functions are of the form: func() { if (pfit_enabled) { ... } } Flip the pfit_enabled check around to flatten the functions. And while we're touching all this let's do the usual s/pipe_config/crtc_state/ replacement. Reviewed-by: Manasi Navare <manasi.d.navare@intel.com> Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20200422161917.17389-2-ville.syrjala@linux.intel.com	2020-04-24 17:21:32 +03:00
Ville Syrjälä	c5a01ec757	drm/i915: Fix skl+ non-scaled pfit modes Fix skl_update_scaler_crtc() to deal with different scaling modes correctly. The current implementation assumes DRM_MODE_SCALE_FULLSCREEN. Fortunately we don't expose any border properties currently so the code does actually end up doing the right thing (assigning a scaler for pfit). The code does need to be fixed before any borders are exposed. Also we have redundant calls to skl_update_scaler_crtc() in dp/hdmi .compute_config() which can be nuked. They were anyway called before we had even computed the pfit state so were basically nonsense. The real call we need to keep is in intel_crtc_atomic_check(). v2: Deal with skl_update_scaler_crtc() in intel_dp_ycbcr420_config() Reviewed-by: Manasi Navare <manasi.d.navare@intel.com> Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20200422161917.17389-1-ville.syrjala@linux.intel.com	2020-04-24 17:16:46 +03:00
Chris Wilson	50689771c8	drm/i915: Only close vma we open The history of i915_vma_close() is confusing, as is its use. As the lifetime of the i915_vma is currently bounded by the object it is attached to, we needed a means of identifying when a vma was no longer in use by userspace (via the user's fd). This is further complicated by that only ppgtt vma should be closed at the user's behest, as the ggtt were always shared. Now that we attach the vma to a lut on the user's context, the open count does indicate how many unique and open context/vm are referencing this vma from the user. As such, we can and should just use the open_count to track when the vma is still in use by userspace. It's a poor man's replacement for reference counting. Closes: https://gitlab.freedesktop.org/drm/intel/issues/1193 Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Reviewed-by: Andi Shyti <andi.shyti@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20200422190558.30509-1-chris@chris-wilson.co.uk	2020-04-24 11:24:45 +01:00
Mika Kuoppala	b4892e4404	drm/i915: Make define for lrc state offset More often than not, we need a byte offset into lrc register state from the start of the hw state. Make it so. Signed-off-by: Mika Kuoppala <mika.kuoppala@linux.intel.com> Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk> Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Link: https://patchwork.freedesktop.org/patch/msgid/20200423182355.21837-3-mika.kuoppala@linux.intel.com	2020-04-24 00:52:14 +01:00
Mika Kuoppala	f1cc6acf22	drm/i915/selftests: Add context batchbuffers registers to live_lrc_fixed Add per ctx bb and indirect ctx bb register locations to live_lrc_fixed for verification. Signed-off-by: Mika Kuoppala <mika.kuoppala@linux.intel.com> Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk> Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Link: https://patchwork.freedesktop.org/patch/msgid/20200423224159.22078-1-chris@chris-wilson.co.uk	2020-04-24 00:36:13 +01:00
Chris Wilson	cbfd3a0c5a	drm/i915/selftests: Add request throughput measurement to perf Under ideal circumstances, the driver should be able to keep the GPU fully saturated with work. Measure how close to ideal we get under the harshest of conditions with no user payload. v2: Also measure throughput using only one thread. Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Reviewed-by: Andi Shyti <andi.shyti@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20200422074203.9799-1-chris@chris-wilson.co.uk	2020-04-23 16:40:30 +01:00
Chris Wilson	b97f77baa8	drm/i915/gt: Check carefully for an idle engine in wait-for-idle intel_gt_wait_for_idle() tries to wait until all the outstanding requests are retired and the GPU is idle. As a side effect of retiring requests, we may submit more work to flush any pm barriers, and so the wait-for-idle tries to flush the background pm work and catch the new requests. However, if the work completed in the background before we were able to flush, it would queue the extra barrier request without us noticing -- and so we would return from wait-for-idle with one request remaining. (This breaks e.g. record_default_state where we need to wait until that barrier is retired, and it may slow suspend down by causing us to wait on the background retirement worker as opposed to immediately retiring the barrier.) However, since we track if there has been a submission since the engine pm barrier, we can very quickly detect if the idle barrier is still outstanding. Closes: https://gitlab.freedesktop.org/drm/intel/-/issues/1763 Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Cc: Matthew Auld <matthew.auld@intel.com> Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com> Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20200423085940.28168-1-chris@chris-wilson.co.uk	2020-04-23 16:16:32 +01:00
Chris Wilson	36fe164d8d	drm/i915/gt: Carefully order virtual_submission_tasklet During the virtual engine's submission tasklet, we take the request and insert into the submission queue on each of our siblings. This seems quite simple, and so no problems with ordering. However, the sibling execlists' submission tasklets may run concurrently with the virtual engine's tasklet, submitting the request to HW before the virtual finishes its task of telling all the siblings. If this happens, the sibling tasklet may reorder the ve->sibling[] array that the virtual engine tasklet is processing. This can only reorder within the elements already processed by the virtual engine, nevertheless the race is detected by KCSAN: [ 185.580014] BUG: KCSAN: data-race in execlists_dequeue [i915] / virtual_submission_tasklet [i915] [ 185.580054] [ 185.580076] write to 0xffff8881f1919860 of 8 bytes by interrupt on cpu 2: [ 185.580553] execlists_dequeue+0x6ad/0x1600 [i915] [ 185.581044] __execlists_submission_tasklet+0x48/0x60 [i915] [ 185.581517] execlists_submission_tasklet+0xd3/0x170 [i915] [ 185.581554] tasklet_action_common.isra.0+0x42/0x90 [ 185.581585] __do_softirq+0xc8/0x206 [ 185.581613] run_ksoftirqd+0x15/0x20 [ 185.581641] smpboot_thread_fn+0x15a/0x270 [ 185.581669] kthread+0x19a/0x1e0 [ 185.581695] ret_from_fork+0x1f/0x30 [ 185.581717] [ 185.581736] read to 0xffff8881f1919860 of 8 bytes by interrupt on cpu 0: [ 185.582231] virtual_submission_tasklet+0x10e/0x5c0 [i915] [ 185.582265] tasklet_action_common.isra.0+0x42/0x90 [ 185.582291] __do_softirq+0xc8/0x206 [ 185.582315] run_ksoftirqd+0x15/0x20 [ 185.582340] smpboot_thread_fn+0x15a/0x270 [ 185.582368] kthread+0x19a/0x1e0 [ 185.582395] ret_from_fork+0x1f/0x30 [ 185.582417] We can prevent this race by checking for the ve->request after looking up the sibling array. 
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com> Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20200423115315.26825-1-chris@chris-wilson.co.uk	2020-04-23 16:14:27 +01:00
Imre Deak	8372e3227f	drm/i915/icl: Fix timeout handling during TypeC AUX power well enabling Fix the check for when an AUX power well enabling timeout is expected on a legacy TypeC port. Fixes: `89e01caac6` ("drm/i915: Use single set of AUX powerwell ops for gen11+") Cc: Matt Roper <matthew.d.roper@intel.com> Cc: José Roberto de Souza <jose.souza@intel.com> Signed-off-by: Imre Deak <imre.deak@intel.com> Reviewed-by: José Roberto de Souza <jose.souza@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20200422123440.19522-1-imre.deak@intel.com	2020-04-23 14:26:13 +03:00
Chris Wilson	15501287b1	drm/i915/execlists: Drop request-before-CS assertion When we migrated to execlists, one of the conditions we wanted to test for was whether the breadcrumb seqno was being written before the breadcrumb interrupt was delivered. This was following on from issues observed on previous generations which were not so strongly ordered. With the removal of the missed interrupt detection, we have no reliable means of detecting the out-of-order seqno/interrupt but instead tried to assert that the relationship between the CS event interrupt and the breadcrumb write should be strongly ordered. However, Icelake proves it is possible for the HW implementation to forget about minor little details such as write ordering and so the order between processing the CS event and the breadcrumb is unreliable. Remove the unreliable assertion, but leave a debug telltale in case we have reason to suspect. Closes: https://gitlab.freedesktop.org/drm/intel/-/issues/1658 Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Cc: Mika Kuoppala <mika.kuoppala@linux.intel.com> Reviewed-by: Mika Kuoppala <mika.kuoppala@linux.intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20200422141749.28709-1-chris@chris-wilson.co.uk	2020-04-22 17:17:50 +01:00
Chris Wilson	cb593e5d2b	drm/i915/gem: Hold obj->vma.lock over for_each_ggtt_vma() While the ggtt vma are protected by their object lifetime, the list continues until it hits a non-ggtt vma, and that vma is not protected and may be freed as we inspect it. Hence, we require the obj->vma.lock to protect the list as we iterate. An example of forgetting to hold the obj->vma.lock is [1642834.464973] general protection fault, probably for non-canonical address 0xdead000000000122: 0000 [#1] SMP PTI [1642834.464977] CPU: 3 PID: 1954 Comm: Xorg Not tainted 5.6.0-300.fc32.x86_64 #1 [1642834.464979] Hardware name: LENOVO 20ARS25701/20ARS25701, BIOS GJET94WW (2.44 ) 09/14/2017 [1642834.465021] RIP: 0010:i915_gem_object_set_tiling+0x2c0/0x3e0 [i915] [1642834.465024] Code: 8b 84 24 18 01 00 00 f6 c4 80 74 59 49 8b 94 24 a0 00 00 00 49 8b 84 24 e0 00 00 00 49 8b 74 24 10 48 8b 92 30 01 00 00 89 c7 <80> ba 0a 06 00 00 03 0f 87 86 00 00 00 ba 00 00 08 00 b9 00 00 10 [1642834.465025] RSP: 0018:ffffa98780c77d60 EFLAGS: 00010282 [1642834.465028] RAX: ffff8d232bfb2578 RBX: 0000000000000002 RCX: ffff8d25873a0000 [1642834.465029] RDX: dead000000000122 RSI: fffff0af8ac6e408 RDI: 000000002bfb2578 [1642834.465030] RBP: ffff8d25873a0000 R08: ffff8d252bfb5638 R09: 0000000000000000 [1642834.465031] R10: 0000000000000000 R11: ffff8d252bfb5640 R12: ffffa987801cb8f8 [1642834.465032] R13: 0000000000001000 R14: ffff8d233e972e50 R15: ffff8d233e972d00 [1642834.465034] FS: 00007f6a3d327f00(0000) GS:ffff8d25926c0000(0000) knlGS:0000000000000000 [1642834.465036] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 [1642834.465037] CR2: 00007f6a2064d000 CR3: 00000002fb57c001 CR4: 00000000001606e0 [1642834.465038] Call Trace: [1642834.465083] i915_gem_set_tiling_ioctl+0x122/0x230 [i915] [1642834.465121] ? i915_gem_object_set_tiling+0x3e0/0x3e0 [i915] [1642834.465151] drm_ioctl_kernel+0x86/0xd0 [drm] [1642834.465156] ? avc_has_perm+0x3b/0x160 [1642834.465178] drm_ioctl+0x206/0x390 [drm] [1642834.465216] ? 
i915_gem_object_set_tiling+0x3e0/0x3e0 [i915] [1642834.465221] ? selinux_file_ioctl+0x122/0x1c0 [1642834.465226] ? __do_munmap+0x24b/0x4d0 [1642834.465231] ksys_ioctl+0x82/0xc0 [1642834.465235] __x64_sys_ioctl+0x16/0x20 [1642834.465238] do_syscall_64+0x5b/0xf0 [1642834.465243] entry_SYSCALL_64_after_hwframe+0x44/0xa9 [1642834.465245] RIP: 0033:0x7f6a3d7b047b [1642834.465247] Code: 0f 1e fa 48 8b 05 1d aa 0c 00 64 c7 00 26 00 00 00 48 c7 c0 ff ff ff ff c3 66 0f 1f 44 00 00 f3 0f 1e fa b8 10 00 00 00 0f 05 <48> 3d 01 f0 ff ff 73 01 c3 48 8b 0d ed a9 0c 00 f7 d8 64 89 01 48 [1642834.465249] RSP: 002b:00007ffe71adba28 EFLAGS: 00000246 ORIG_RAX: 0000000000000010 [1642834.465251] RAX: ffffffffffffffda RBX: 000055f99048fa40 RCX: 00007f6a3d7b047b [1642834.465253] RDX: 00007ffe71adba30 RSI: 00000000c0106461 RDI: 000000000000000e [1642834.465254] RBP: 0000000000000002 R08: 000055f98f3f1798 R09: 0000000000000002 [1642834.465255] R10: 0000000000001000 R11: 0000000000000246 R12: 0000000000000080 [1642834.465257] R13: 000055f98f3f1690 R14: 00000000c0106461 R15: 00007ffe71adba30 Now to take the spinlock during the list iteration, we need to break it down into two phases. In the first phase under the lock, we cannot sleep and so must defer the actual work to a second list, protected by the ggtt->mutex. We also need to hold the spinlock during creation of a new vma to serialise with updates of the tiling on the object. Reported-by: Dave Airlie <airlied@redhat.com> Fixes: `2850748ef8` ("drm/i915: Pull i915_vma_pin under the vm->mutex") Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com> Cc: Dave Airlie <airlied@redhat.com> Cc: <stable@vger.kernel.org> # v5.5+ Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20200422072805.17340-1-chris@chris-wilson.co.uk	2020-04-22 15:43:56 +01:00
Chris Wilson	c92724de6d	drm/i915/selftests: Try to detect rollback during batchbuffer preemption Since batch buffers dominate execution time, most preemption requests should naturally occur during execution of a batch buffer. We wish to verify that should a preemption occur within a batch buffer, when we come to restart that batch buffer, it occurs at the interrupted instruction and most importantly does not roll back to an earlier point. v2: Do not clear the GPR at the start of the batch, but rely on them being clear for new contexts. Suggested-by: Mika Kuoppala <mika.kuoppala@linux.intel.com> Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Cc: Mika Kuoppala <mika.kuoppala@linux.intel.com> Reviewed-by: Mika Kuoppala <mika.kuoppala@linux.intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20200422100903.25216-1-chris@chris-wilson.co.uk	2020-04-22 15:42:52 +01:00
Chris Wilson	cbb6f8805a	drm/i915/selftests: Disable heartbeat around RPS interrupt testing For verifying receipt of the EI interrupts, we need to hold the GPU in very precise conditions (in terms of C0 cycles during the EI). If we preempt the busy load to handle the heartbeat, this may perturb the busy load causing us to miss the interrupt. The other tests, while not as time sensitive, may also be slightly perturbed, so apply the heartbeat protection across all the measurements. Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Reviewed-by: Mika Kuoppala <mika.kuoppala@linux.intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20200422083855.26842-1-chris@chris-wilson.co.uk	2020-04-22 10:59:46 +01:00
Chris Wilson	33883310cd	drm/i915/selftests: Unroll the CS frequency loop Having noticed that MI_BB_START is incurring a memory stall (see the correlation with uncore frequency), we have to unroll the loop in order to diminish the impact of the MI_BB_START on the instruction throughput. Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Reviewed-by: Mika Kuoppala <mika.kuoppala@linux.intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20200421171351.19575-1-chris@chris-wilson.co.uk	2020-04-21 20:48:45 +01:00
Chris Wilson	bd3ec9e758	drm/i915/gt: Poison residual state [HWSP] across resume. Since we may lose the content of any buffer when we relinquish control of the system (e.g. suspend/resume), we have to be careful not to rely on regaining control. A good method to detect when we might be using garbage is by always injecting that garbage prior to first use on load/resume/etc. v2: Drop sanitize callback on cleanup v3: Move seqno reset to timeline enter, so we reset all timelines. However, this is done on every activation during runtime and not reset. The similar level of paranoia we apply to correcting context state after a period of inactivity. Suggested-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com> Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com> Cc: Venkata Ramana Nayana <venkata.ramana.nayana@intel.com> Cc: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com> Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20200421092504.7416-1-chris@chris-wilson.co.uk	2020-04-21 16:27:39 +01:00
Chris Wilson	cf9ba27840	drm/i915/selftests: Disable C-states when measuring RPS frequency response Let's isolate the impact of cpu frequency selection on determining the GPU throughput in response to selection of RPS frequencies. For real systems, we do have to be concerned with the impact of integrating c-states, p-states and rp-states, but for the sake of proving whether or not RPS works, one baby step at a time. For the record, as one would hope, it does not seem to impact on the measured performance, but we do it anyway to reduce the number of variables. Later, we can extend the testing to encourage the cpu/pkg to try and sleep while the GPU is busy. Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Cc: Mika Kuoppala <mika.kuoppala@linux.intel.com> Reviewed-by: Mika Kuoppala <mika.kuoppala@linux.intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20200421142236.8614-1-chris@chris-wilson.co.uk	2020-04-21 16:24:34 +01:00
Chris Wilson	4ea6b1c456	drm/i915/selftests: Show the full scaling curve on failure If we detect that the RPS end points do not scale perfectly, take the time to measure all the in between values as well. We are aborting the test, so we might as well spend the available time gathering critical debug information instead. Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Cc: Mika Kuoppala <mika.kuoppala@linux.intel.com> Reviewed-by: Mika Kuoppala <mika.kuoppala@linux.intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20200421124636.22554-1-chris@chris-wilson.co.uk	2020-04-21 16:24:34 +01:00
Chris Wilson	74f103928d	drm/i915/selftests: Show the pstate limits on any failure to reset min We want to see the pstate limits whenever we fail to set the minimum frequency as that may help for debugging. Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Reviewed-by: Mika Kuoppala <mika.kuoppala@linux.intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20200420203040.8984-1-chris@chris-wilson.co.uk	2020-04-21 09:39:35 +01:00
Pankaj Bharadiya	007ff34e61	drm/i915/display/vlv_dsi: Prefer drm_WARN_ON over WARN_ON struct drm_device specific drm_WARN* macros include device information in the backtrace, so we know what device the warnings originate from. Prefer drm_WARN_ON over WARN_ON. Signed-off-by: Pankaj Bharadiya <pankaj.laxminarayan.bharadiya@intel.com> Signed-off-by: Jani Nikula <jani.nikula@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20200406112800.23762-14-pankaj.laxminarayan.bharadiya@intel.com	2020-04-21 11:23:17 +03:00
Pankaj Bharadiya	e278f07679	drm/i915/display/overlay: Prefer drm_WARN_ON over WARN_ON struct drm_device specific drm_WARN* macros include device information in the backtrace, so we know what device the warnings originate from. Prefer drm_WARN_ON over WARN_ON. Signed-off-by: Pankaj Bharadiya <pankaj.laxminarayan.bharadiya@intel.com> Signed-off-by: Jani Nikula <jani.nikula@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20200406112800.23762-11-pankaj.laxminarayan.bharadiya@intel.com	2020-04-21 10:54:41 +03:00
Pankaj Bharadiya	8d641574f3	drm/i915/display/global_state: Prefer drm_WARN* over WARN* struct drm_device specific drm_WARN* macros include device information in the backtrace, so we know what device the warnings originate from. Prefer drm_WARN* over WARN* calls. Signed-off-by: Pankaj Bharadiya <pankaj.laxminarayan.bharadiya@intel.com> Signed-off-by: Jani Nikula <jani.nikula@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20200406112800.23762-10-pankaj.laxminarayan.bharadiya@intel.com	2020-04-21 10:54:28 +03:00
Pankaj Bharadiya	a7f2ad3929	drm/i915/display/frontbuffer: Prefer drm_WARN_ON over WARN_ON struct drm_device specific drm_WARN* macros include device information in the backtrace, so we know what device the warnings originate from. Prefer drm_WARN_ON over WARN_ON. Signed-off-by: Pankaj Bharadiya <pankaj.laxminarayan.bharadiya@intel.com> Signed-off-by: Jani Nikula <jani.nikula@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20200406112800.23762-9-pankaj.laxminarayan.bharadiya@intel.com	2020-04-21 10:54:22 +03:00
Pankaj Bharadiya	4ad53ededf	drm/i915/display/dpll_mgr: Prefer drm_WARN_ON over WARN_ON struct drm_device specific drm_WARN* macros include device information in the backtrace, so we know what device the warnings originate from. Prefer drm_WARN_ON over WARN_ON at places where struct drm_device pointer can be extracted. Signed-off-by: Pankaj Bharadiya <pankaj.laxminarayan.bharadiya@intel.com> Signed-off-by: Jani Nikula <jani.nikula@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20200406112800.23762-8-pankaj.laxminarayan.bharadiya@intel.com	2020-04-21 10:53:53 +03:00
Pankaj Bharadiya	ce04ecd9cf	drm/i915/display/display: Prefer drm_WARN_ON over WARN_ON struct drm_device specific drm_WARN* macros include device information in the backtrace, so we know what device the warnings originate from. Prefer drm_WARN_ON over WARN_ON at places where struct drm_device pointer can be extracted. Signed-off-by: Pankaj Bharadiya <pankaj.laxminarayan.bharadiya@intel.com> Signed-off-by: Jani Nikula <jani.nikula@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20200406112800.23762-5-pankaj.laxminarayan.bharadiya@intel.com	2020-04-21 09:50:40 +03:00
Pankaj Bharadiya	8b4f2137cc	drm/i915/display/ddi: Prefer drm_WARN* over WARN* struct drm_device specific drm_WARN* macros include device information in the backtrace, so we know what device the warnings originate from. Prefer drm_WARN* over WARN* calls. Signed-off-by: Pankaj Bharadiya <pankaj.laxminarayan.bharadiya@intel.com> Signed-off-by: Jani Nikula <jani.nikula@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20200406112800.23762-4-pankaj.laxminarayan.bharadiya@intel.com	2020-04-21 09:49:54 +03:00
Pankaj Bharadiya	1e6850ee4c	drm/i915/display/atomic_plane: Prefer drm_WARN_ON over WARN_ON struct drm_device specific drm_WARN* macros include device information in the backtrace, so we know what device the warnings originate from. Prefer drm_WARN_ON over WARN_ON. Signed-off-by: Pankaj Bharadiya <pankaj.laxminarayan.bharadiya@intel.com> Signed-off-by: Jani Nikula <jani.nikula@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20200406112800.23762-3-pankaj.laxminarayan.bharadiya@intel.com	2020-04-21 09:49:30 +03:00

64954 Commits