Right now if sink reported any PSR error or if it fails to
acknowledge the PSR wakeup it sets a flag and do not attempt to
enable PSR anymore. That is the safest approach to avoid repetitive
glitches and allowed us to have PSR enabled by default.
But from time to time even good PSR panels have a PSR error, causing
tests to fail. And for now we are not yet to the point were we could
try to recover from PSR errors, so lets add this information to the
debugfs so IGT can check if PSR is disabled because of sink errors or
not and eliminate this noise from CI runs.
Cc: Dhinakaran Pandiyan <dhinakaran.pandiyan@intel.com>
Cc: Rodrigo Vivi <rodrigo.vivi@intel.com>
Cc: Matt Roper <matthew.d.roper@intel.com>
Cc: Ap Kamal <kamal.ap@intel.com>
Signed-off-by: José Roberto de Souza <jose.souza@intel.com>
Reviewed-by: Ramalingam C <ramalingam.c@intel.com>
Reviewed-by: Matt Roper <matthew.d.roper@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20191023214932.94679-1-jose.souza@intel.com
Replace sampling the engine state every so often with a periodic
heartbeat request to measure the health of an engine. This is coupled
with the forced-preemption to allow long running requests to survive so
long as they do not block other users.
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Cc: Jon Bloomfield <jon.bloomfield@intel.com>
Reviewed-by: Jon Bloomfield <jon.bloomfield@intel.com>
Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20191023133108.21401-5-chris@chris-wilson.co.uk
Creating and opening the GuC log relay file enables and starts
the relay potentially before the caller is ready to consume logs.
Change the behavior so that relay starts only on an explicit call
to the write function (with a value of '1'). Other values flush
the log relay as before.
v2: Style changes and fix typos. Add guc_log_relay_stop()
function. (Daniele)
Cc: Matthew Brost <matthew.brost@intel.com>
Cc: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com>
Cc: Michal Wajdeczko <michal.wajdeczko@intel.com>
Signed-off-by: Robert M. Fosha <robert.m.fosha@intel.com>
Reviewed-by: Matthew Brost <matthew.brost@intel.com>
Signed-off-by: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20191022163754.23870-1-robert.m.fosha@intel.com
Reduce verbosity in code by renaming dsc_params member of crtc state to
simply dsc. There is enough context for this to be clear. No functional
changes.
Cc: Manasi Navare <manasi.d.navare@intel.com>
Reviewed-by: Manasi Navare <manasi.d.navare@intel.com>
Signed-off-by: Jani Nikula <jani.nikula@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20191022133414.8293-1-jani.nikula@intel.com
The HW performs swizzling as part of its fence tiling inside the Global
GTT. We already do the probing of the HW settings from the GGTT setup,
complete the picture by storing the information as part of the GGTT. The
primary benefit is the consistency of our probe routines do not break
the i915_ggtt encapsulation.
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Cc: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com>
Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20191016143234.4075-2-chris@chris-wilson.co.uk
NOA configuration take some amount of time to apply. That amount of
time depends on the size of the GT. There is no documented time for
this. For example, past experimentations with powergating
configuration changes seem to indicate a 60~70us delay. We go with
500us as default for now which should be over the required amount of
time (according to HW architects).
v2: Don't forget to save/restore registers used for the wait (Chris)
v3: Name used CS_GPR registers (Chris)
Fix compile issue due to rebase (Lionel)
v4: Fix save/restore helpers (Umesh)
v5: Move noa_wait from drm_i915_private to i915_perf_stream (Lionel)
v6: Add missing struct declarations in i915_perf.h
Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Link: https://patchwork.freedesktop.org/patch/msgid/20191012072308.30312-2-chris@chris-wilson.co.uk
Sometimes a test has to wait for RCU to complete a grace period and
perform its callbacks, for example waiting for a close(fd) to actually
perform the fput(filp) and so trigger all the callbacks such as closing
GEM contexts. There is no trivial means of triggering an RCU barrier
from userspace, so add one for our convenience in
debugfs/i915_drop_caches
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Matthew Auld <matthew.william.auld@gmail.com>
Reviewed-by: Matthew Auld <matthew.william.auld@gmail.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20191011173823.20432-1-chris@chris-wilson.co.uk
Adding DC3CO counter in i915_dmc_info debugfs will be
useful for DC3CO validation.
DMC firmware uses DMC_DEBUG3 register as DC3CO counter
register on TGL, as per B.Specs DMC_DEBUG3 is general
purpose register.
v1: comment modification for DMC_DBUG3.
using GEN >= 12 check instead of IS_TIGERLAKE()
to print DMC_DEBUG3 counter value.
Cc: Jani Nikula <jani.nikula@intel.com>
Cc: Imre Deak <imre.deak@intel.com>
Cc: Animesh Manna <animesh.manna@intel.com>
Reviewed-by: Imre Deak <imre.deak@intel.com>
Signed-off-by: Anshuman Gupta <anshuman.gupta@intel.com>
Signed-off-by: Imre Deak <imre.deak@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20191003081738.22101-7-anshuman.gupta@intel.com
Keep track of the GEM contexts underneath i915->gem.contexts and assign
them their own lock for the purposes of list management.
v2: Focus on lock tracking; ctx->vm is protected by ctx->mutex
v3: Correct split with removal of logical HW ID
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20191004134015.13204-15-chris@chris-wilson.co.uk
With the introduction of ctx->engines[] we allow multiple logical
contexts to be used on the same engine (e.g. with virtual engines).
According to bspec, aach logical context requires a unique tag in order
for context-switching to occur correctly between them. [Simple
experiments show that it is not so easy to trick the HW into performing
a lite-restore with matching logical IDs, though my memory from early
Broadwell experiments do suggest that it should be generating
lite-restores.]
We only need to keep a unique tag for the active lifetime of the
context, and for as long as we need to identify that context. The HW
uses the tag to determine if it should use a lite-restore (why not the
LRCA?) and passes the tag back for various status identifies. The only
status we need to track is for OA, so when using perf, we assign the
specific context a unique tag.
v2: Calculate required number of tags to fill ELSP.
Fixes: 976b55f0e1 ("drm/i915: Allow a context to define its set of engines")
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=111895
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Acked-by: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com>
Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20191004134015.13204-14-chris@chris-wilson.co.uk
wait_for_timelines is essentially the same loop as retiring requests
(with an extra timeout), so merge the two into one routine.
v2: i915_retire_requests_timeout and keep VT'd w/a as !interruptible
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20191004134015.13204-10-chris@chris-wilson.co.uk
We don't need to hold struct_mutex now for retiring requests, so drop it
from i915_retire_requests() and i915_gem_wait_for_idle(), finally
removing I915_WAIT_LOCKED for good.
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20191004134015.13204-8-chris@chris-wilson.co.uk
Gen12 has dual-subslices (DSS), which compared to gen11 subslices have
some duplicated resources/paths. Although DSS behave similarly to 2
subslices, instead of splitting this and presenting userspace with bits
not directly representative of hardware resources, present userspace
with a subslice_mask made up of DSS bits instead.
v2: GEM_BUG_ON on mask size (Lionel)
Bspec: 29547
Bspec: 12247
Cc: Kelvin Gardiner <kelvin.gardiner@intel.com>
Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Cc: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
CC: Radhakrishna Sripada <radhakrishna.sripada@intel.com>
Cc: Michel Thierry <michel.thierry@intel.com> #v1
Cc: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com>
Cc: José Roberto de Souza <jose.souza@intel.com>
Signed-off-by: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com>
Signed-off-by: James Ausmus <james.ausmus@intel.com>
Signed-off-by: Oscar Mateo <oscar.mateo@intel.com>
Signed-off-by: Sudeep Dutt <sudeep.dutt@intel.com>
Signed-off-by: Stuart Summers <stuart.summers@intel.com>
Signed-off-by: Mika Kuoppala <mika.kuoppala@linux.intel.com>
Acked-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Lucas De Marchi <lucas.demarchi@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20190913075137.18476-2-chris@chris-wilson.co.uk
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Without it we get:
Unclaimed read from register 0x1e1110
WARNING: CPU: 2 PID: 1029 at drivers/gpu/drm/i915/intel_uncore.c:1101 __unclaimed_reg_debug+0x40/0x50 [i915]
Call Trace:
fwtable_read32+0x233/0x300 [i915]
i915_interrupt_info+0xa73/0xd60 [i915]
seq_read+0xdb/0x3c0
full_proxy_read+0x51/0x80
vfs_read+0x9e/0x160
ksys_read+0x8f/0xe0
do_syscall_64+0x55/0x1c0
entry_SYSCALL_64_after_hwframe+0x49/0xbe
Cc: Chris Wilson <chris@chris-wilson.co.uk>
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=109824
Signed-off-by: Arkadiusz Hiler <arkadiusz.hiler@intel.com>
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Link: https://patchwork.freedesktop.org/patch/msgid/20190912125418.23115-2-arkadiusz.hiler@intel.com
As we track when we put the GT device to sleep upon idling, we can use
that callback to sample the current rc6 counters and record the
timestamp for estimating samples after that point while asleep.
v2: Stick to using ktime_t
v3: Track user_wakerefs that interfere with the new
intel_gt_pm_wait_for_idle
v4: No need for parked/unparked estimation if !CONFIG_PM
v5: Keep timer park/unpark logic as was
v6: Refactor duplicated estimate/update rc6 logic
v7: Pull intel_get_pm_get_if_awake() out from the pmu->lock.
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=105010
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20190912124813.19225-1-chris@chris-wilson.co.uk
There's no easy way of checking whether iommu is enabled for the GPU
(you can grep dmesg if you know the device, or you can grep
i915_gpu_info if that's available). We do have a central
i915_capabilities with the intent of listing such pertinent information,
so add the iommu status.
Suggested-by: Martin Peres <martin.peres@linux.intel.com>
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Martin Peres <martin.peres@linux.intel.com>
Cc: Tomi Sarvela <tomi.p.sarvela@intel.com>
Cc: Mika Kuoppala <mika.kuoppala@linux.intel.com>
Acked-by: Martin Peres <martin.peres@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20190911114655.9254-1-chris@chris-wilson.co.uk
If we make sure we grab a strong reference to each object as we dump it,
we can reduce the locks outside of our iterators to an rcu_read_lock.
This should prevent errors like:
[ 2138.371911] BUG: KASAN: use-after-free in per_file_stats+0x43/0x380 [i915]
[ 2138.371924] Read of size 8 at addr ffff888223651000 by task cat/8293
[ 2138.371947] CPU: 0 PID: 8293 Comm: cat Not tainted 5.3.0-rc6-CI-Custom_4352+ #1
[ 2138.371953] Hardware name: To Be Filled By O.E.M. To Be Filled By O.E.M./J4205-ITX, BIOS P1.40 07/14/2017
[ 2138.371959] Call Trace:
[ 2138.371974] dump_stack+0x7c/0xbb
[ 2138.372099] ? per_file_stats+0x43/0x380 [i915]
[ 2138.372108] print_address_description+0x73/0x3a0
[ 2138.372231] ? per_file_stats+0x43/0x380 [i915]
[ 2138.372352] ? per_file_stats+0x43/0x380 [i915]
[ 2138.372362] __kasan_report+0x14e/0x192
[ 2138.372489] ? per_file_stats+0x43/0x380 [i915]
[ 2138.372502] kasan_report+0xe/0x20
[ 2138.372625] per_file_stats+0x43/0x380 [i915]
[ 2138.372751] ? i915_panel_show+0x110/0x110 [i915]
[ 2138.372761] idr_for_each+0xa7/0x160
[ 2138.372773] ? idr_get_next_ul+0x110/0x110
[ 2138.372782] ? do_raw_spin_lock+0x10a/0x1d0
[ 2138.372923] print_context_stats+0x264/0x510 [i915]
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Tested-by: David Weinehall <david.weinehall@linux.intel.com>
Reviewed-by: Mika Kuoppala <mika.kuoppala@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20190903062133.27360-1-chris@chris-wilson.co.uk
obj->pin_global was originally used as a means to keep the shrinker off
the active scanout, but we use the vma->pin_count itself for that and
the obj->frontbuffer to delay shrinking active framebuffers. The other
role that obj->pin_global gained was for spotting display objects inside
GEM and working harder to keep those coherent; for which we can again
simply inspect obj->frontbuffer directly.
Coming up next, we will want to manipulate the pin_global counter
outside of the principle locks, so would need to make pin_global atomic.
However, since obj->frontbuffer is already managed atomically, it makes
sense to use that the primary key for display objects instead of having
pin_global.
Ville pointed out the principle difference is that obj->frontbuffer is
set for as long as an intel_framebuffer is attached to an object, but
obj->pin_global was only raised for as long as the object was active. In
practice, this means that we consider the object as being on the scanout
for longer than is strictly required, causing us to be more proactive in
flushing -- though it should be true that we would have flushed
eventually when the back became the front, except that on the flip path
that flush is async but when hit from another ioctl it will be
synchronous.
v2: i915_gem_object_is_framebuffer()
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Mika Kuoppala <mika.kuoppala@linux.intel.com>
Cc: Ville Syrjälä <ville.syrjala@linux.intel.com>
Reviewed-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20190902040303.14195-5-chris@chris-wilson.co.uk
enum port is a mess now because it no longer matches the spec
at all. Let's start to dig ourselves out of this hole by
reducing our reliance on port_name(). This should at least make
a bunch of debug messages a bit more sensible while we think how
to fill the the hole properly.
Based on the following cocci script with a lot of manual cleanup
(all the format strings etc.):
@@
expression E;
@@
(
- port_name(E->port)
+ E->base.base.id, E->base.name
|
- port_name(E.port)
+ E.base.base.id, E.base.name
)
@@
enum port P;
expression E;
@@
P = E->port
<...
- port_name(P)
+ E->base.base.id, E->base.name
...>
@@
enum port P;
expression E;
@@
P = E.port
<...
- port_name(P)
+ E.base.base.id, E.base.name
...>
@@
expression E;
@@
{
- enum port P = E;
... when != P
}
Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20190830182719.32608-1-ville.syrjala@linux.intel.com
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
There is a difference in BSpec's and the driver's designation of DDI
ports. BSpec uses the following names:
- before GEN11:
BSpec/driver:
port A/B/C/D etc
- GEN11:
BSpec/driver:
port A-F
- GEN12:
BSpec:
port A/B/C for combo PHY ports
port TC1-6 for Type C PHY ports
driver:
port A-I.
The driver's port D name matches BSpec's TC1 port name.
So far power domains were named according to the BSpec designation, to
make it easier to match the code against the specification. That however
can be confusing when a power domain needs to be matched to a port on
GEN12+. To resolve that use the driver's port A-I designation for power
domain names too and rename the corresponding power wells so that they
reflect the mapping from the driver's to BSpec's port name.
Cc: Lucas De Marchi <lucas.demarchi@intel.com>
Cc: Ville Syrjälä <ville.syrjala@linux.intel.com>
Signed-off-by: Imre Deak <imre.deak@intel.com>
Reviewed-by: Stanislav Lisovskiy <stanislav.lisovskiy@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20190823100711.27833-1-imre.deak@intel.com
Currently, the subslice_mask runtime parameter is stored as an
array of subslices per slice. Expand the subslice mask array to
better match what is presented to userspace through the
I915_QUERY_TOPOLOGY_INFO ioctl. The index into this array is
then calculated:
slice * subslice stride + subslice index / 8
v2: Fix 32-bit build
v3: Use new helper function in SSEU workaround warning message
v4: Use GEM_BUG_ON to force developers to use valid SSEU configurations
per platform (Chris)
Signed-off-by: Stuart Summers <stuart.summers@intel.com>
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Link: https://patchwork.freedesktop.org/patch/msgid/20190823160307.180813-12-stuart.summers@intel.com
Add a new function to copy subslices for a specified slice
between intel_sseu structures for the purpose of determining
power-gate status. Note that currently ss_stride has a max
of 1.
Signed-off-by: Stuart Summers <stuart.summers@intel.com>
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Link: https://patchwork.freedesktop.org/patch/msgid/20190823160307.180813-11-stuart.summers@intel.com
Add a new function to allow each platform to set maximum
slice, subslice, and EU information to reduce code duplication.
Signed-off-by: Stuart Summers <stuart.summers@intel.com>
Reviewed-by: Mika Kuoppala <mika.kuoppala@linux.intel.com>
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Link: https://patchwork.freedesktop.org/patch/msgid/20190823160307.180813-3-stuart.summers@intel.com
Use a local variable to find SSEU runtime information
in various debugfs functions.
v2: Remove extra line breaks per feedback from Chris
Signed-off-by: Stuart Summers <stuart.summers@intel.com>
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Link: https://patchwork.freedesktop.org/patch/msgid/20190823160307.180813-2-stuart.summers@intel.com
PSR registers are a mess, some have the full address while others just
have the additional offset from psr_mmio_base.
For BDW+ psr_mmio_base is nothing more than TRANSCODER_EDP_OFFSET +
0x800 and using it makes more difficult for people with an PSR
register address or PSR register name from from BSpec as i915 also
don't match the BSpec names.
For HSW psr_mmio_base is _DDI_BUF_CTL_A + 0x800 and PSR registers are
only available in DDIA.
Other reason to make relative to transcoder is that since BDW every
transcoder have PSR registers, so in theory it should be possible to
have PSR enabled in a non-eDP transcoder.
So for BDW+ we can use _TRANS2() to get the register offset of any
PSR register in any transcoder while for HSW we have _HSW_PSR_ADJ
that will calculate the register offset for the single PSR instance,
noting that we are already guarded about trying to enable PSR in other
port than DDIA on HSW by the 'if (dig_port->base.port != PORT_A)' in
intel_psr_compute_config(), this check should only be valid for HSW
and will be changed in future.
PSR2 registers and PSR_EVENT was added after Haswell so that is why
_PSR_ADJ() is not used in some macros.
The only registers that can not be relative to transcoder are
PSR_IMR and PSR_IIR that are not relative to anything, so keeping it
hardcoded. That changed for TGL but it will be handled in another
patch.
Also removing BDW_EDP_PSR_BASE from GVT because it is not used as it
is the only PSR register that GVT have.
v5:
- Macros changed to be more explicit about HSW (Dhinakaran)
- Squashed with the patch that added the tran parameter to the
macros (Dhinakaran)
v6:
- Checking for interruption errors after module reload in the
transcoder that will be used (Dhinakaran)
- Using lowercase to the registers offsets
v7:
- Removing IS_HASWELL() from registers macros(Jani)
Cc: Dhinakaran Pandiyan <dhinakaran.pandiyan@intel.com>
Cc: Rodrigo Vivi <rodrigo.vivi@intel.com>
Cc: Jani Nikula <jani.nikula@linux.intel.com>
Cc: Ville Syrjälä <ville.syrjala@linux.intel.com>
Cc: Zhi Wang <zhi.a.wang@intel.com>
Reviewed-by: Lucas De Marchi <lucas.demarchi@intel.com>
Signed-off-by: José Roberto de Souza <jose.souza@intel.com>
Signed-off-by: Lucas De Marchi <lucas.demarchi@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20190820223325.27490-1-jose.souza@intel.com
Since we want to revoke the ggtt vma from only under the ggtt->mutex, we
need to move protection of the userfault tracking from the struct_mutex
to the ggtt->mutex.
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Matthew Auld <matthew.auld@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20190822060914.2671-2-chris@chris-wilson.co.uk
We can reduce the locking for fence registers from the dev->struct_mutex
to a local mutex. We could introduce a mutex for the sole purpose of
tracking the fence acquisition, except there is a little bit of overlap
with the fault tracking, so use the i915_ggtt.mutex as it covers both.
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Mika Kuoppala <mika.kuoppala@linux.intel.com>
Reviewed-by: Matthew Auld <matthew.auld@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20190822060914.2671-1-chris@chris-wilson.co.uk
As we plan to continue driver load after GuC initialization
failure, we can't assume that GuC log data will be available
just because GuC was initially enabled. We must check that
GuC is still running instead.
Signed-off-by: Michal Wajdeczko <michal.wajdeczko@intel.com>
Cc: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com>
Cc: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Link: https://patchwork.freedesktop.org/patch/msgid/20190818095204.31568-2-michal.wajdeczko@intel.com
Move the active tracking for the frontbuffer operations out of the
i915_gem_object and into its own first class (refcounted) object. In the
process of detangling, we switch from low level request tracking to the
easier i915_active -- with the plan that this avoids any potential
atomic callbacks as the frontbuffer tracking wishes to sleep as it
flushes.
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Matthew Auld <matthew.auld@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20190816074635.26062-1-chris@chris-wilson.co.uk
The engine->guc_id is GuC FW defined and it is not guaranteed to be
below I915_NUM_ENGINES, so we shouldn't use it with the i915-defined
client->submissions, as we might overflow.
Instead of fixing it, just get rid of client->submissions, because the
information we get from it is not interesting anymore now that we only
have 1 client.
Signed-off-by: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com>
Cc: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Michal Wajdeczko <michal.wajdeczko@intel.com>
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Link: https://patchwork.freedesktop.org/patch/msgid/20190814002145.29056-1-daniele.ceraolospurio@intel.com
We were using the last_fence to track the last request that used this
vma that might be interpreted by a fence register and forced ourselves
to wait for this request before modifying any fence register that
overlapped our vma. Due to requirement that we need to track any XY_BLT
command, linear or tiled, this in effect meant that we have to track the
vma for its active lifespan anyway, so we can forgo the explicit
last_fence tracking and just use the whole vma->active.
Another solution would be to pipeline the register updates, and would
help resolve some long running stalls for gen3 (but only gen 2 and 3!)
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Matthew Auld <matthew.auld@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20190812174804.26180-1-chris@chris-wilson.co.uk
Before we start upon our great GT interrupt refactor, throw out the
cruft! In this case, it is an unloved debugfs showing the current ips
status, a fairly meaningless bunch of numbers that we are not checking.
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Andi Shyti <andi.shyti@intel.com>
Acked-by: Andi Shyti <andi.shyti@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20190810090329.6966-1-chris@chris-wilson.co.uk
Multiple uncore structures will share the debug infrastructure, so
move it to a common place and add extra locking around it.
Also, since we now have a separate object, it is cleaner to have
dedicated functions working on the object to stop and restart the
mmio debug. Apart from the cosmetic changes, this patch introduces
2 functional updates:
- All calls to check_for_unclaimed_mmio will now return false when
the debug is suspended, not just the ones that are active only when
i915_modparams.mmio_debug is set. If we don't trust the result of the
check while a user is doing mmio access then we shouldn't attempt the
check anywhere.
- i915_modparams.mmio_debug is not save/restored anymore around user
access. The value is now never touched by the kernel while debug is
disabled so no need for save/restore.
v2: squash mmio_debug patches, restrict mmio_debug lock usage (Chris)
Signed-off-by: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com>
Cc: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Link: https://patchwork.freedesktop.org/patch/msgid/20190809063116.7527-1-chris@chris-wilson.co.uk
Currently we walk the entire list of obj->vma for each obj within a file
to find the matching vma of this context. Since we know we are searching
for a particular vma bound to a user context, we can use the rbtree to
search for it rather than repeatedly walk everything.
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Matthew Auld <matthew.auld@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20190808162407.28121-1-chris@chris-wilson.co.uk
As we need to acquire a mutex to serialise the final
intel_wakeref_put, we need to ensure that we are in process context at
that time. However, we want to allow operation on the intel_wakeref from
inside timer and other hardirq context, which means that need to defer
that final put to a workqueue.
Inside the final wakeref puts, we are safe to operate in any context, as
we are simply marking up the HW and state tracking for the potential
sleep. It's only the serialisation with the potential sleeping getting
that requires careful wait avoidance. This allows us to retain the
immediate processing as before (we only need to sleep over the same
races as the current mutex_lock).
v2: Add a selftest to ensure we exercise the code while lockdep watches.
v3: That test was extremely loud and complained about many things!
v4: Not a whale!
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=111295
References: https://bugs.freedesktop.org/show_bug.cgi?id=111245
References: https://bugs.freedesktop.org/show_bug.cgi?id=111256
Fixes: 18398904ca ("drm/i915: Only recover active engines")
Fixes: 51fbd8de87 ("drm/i915/pmu: Atomically acquire the gt_pm wakeref")
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Cc: Mika Kuoppala <mika.kuoppala@linux.intel.com>
Reviewed-by: Mika Kuoppala <mika.kuoppala@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20190808202758.10453-1-chris@chris-wilson.co.uk
Everything about the file is about display, and mostly about types
related to display. Move under display/ as intel_display_types.h to
reflect the facts.
There's still plenty to clean up, but start off with moving the file
where it logically belongs and naming according to contents.
v2: fix the include guard name in the renamed file
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
Signed-off-by: Jani Nikula <jani.nikula@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20190806113933.11799-1-jani.nikula@intel.com
To maintain a fast lookup from a GT centric irq handler, we want the
engine lookup tables on the intel_gt. To avoid having multiple copies of
the same multi-dimension lookup table, move the generic user engine
lookup into an rbtree (for fast and flexible indexing).
v2: Split uabi_instance cf uabi_class
v3: Set uabi_class/uabi_instance after collating all engines to provide a
stable uabi across parallel unordered construction.
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Cc: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com>
Reviewed-by: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com> #v2
Link: https://patchwork.freedesktop.org/patch/msgid/20190806124300.24945-2-chris@chris-wilson.co.uk
Switch to tracking activity via i915_active on individual nodes, only
keeping a list of retired objects in the cache, and reaping the cache
when the engine itself idles.
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Matthew Auld <matthew.auld@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20190804124826.30272-2-chris@chris-wilson.co.uk
The shrinker cannot touch objects used by the contexts (logical state
and ring). Currently we mark those as "pin_global" to let the shrinker
skip over them, however, if we remove them from the shrinker lists
entirely, we don't event have to include them in our shrink accounting.
By keeping the unshrinkable objects in our shrinker tracking, we report
a large number of objects available to be shrunk, and leave the shrinker
deeply unsatisfied when we fail to reclaim those. The shrinker will
persist in trying to reclaim the unavailable objects, forcing the system
into a livelock (not even hitting the dread oomkiller).
v2: Extend unshrinkable protection for perma-pinned scratch and guc
allocations (Tvrtko)
v3: Notice that we should be pinned when marking unshrinkable and so the
link cannot be empty; merge duplicate paths.
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Reviewed-by: Matthew Auld <matthew.auld@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20190802212137.22207-1-chris@chris-wilson.co.uk