Commit Graph

19786 Commits

Author SHA1 Message Date
Jani Nikula
aab30b85c9 drm/i915: ensure more headers remain self-contained
Add more headers to the header test list:

* i915_drv.h
* i915_params.h
* i915_reg.h
* intel_drv.h
* intel_uncore.h

Happily they already are self-contained, but keep them that way.

Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
Signed-off-by: Jani Nikula <jani.nikula@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/f660e7e1258b81d50475fa73f610eb3312c83424.1556540889.git.jani.nikula@intel.com
2019-04-30 14:29:11 +03:00
Lucas De Marchi
da17223e85 drm/i915: do not mix workaround with normal flow
Separate the two comments: one is a workaround and the other is a sanity
check. We could just compare != 1, but let's treat them differently due
to having different meaning.

Signed-off-by: Lucas De Marchi <lucas.demarchi@intel.com>
Reviewed-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20190404230426.15837-4-lucas.demarchi@intel.com
2019-04-30 02:25:37 -07:00
Lucas De Marchi
323b0a82ef drm/i915: reorder if chain to have last gen first
Reorder if/else so we check for gen >= 11 first, similar to most of
other checks in the driver.

Signed-off-by: Lucas De Marchi <lucas.demarchi@intel.com>
Reviewed-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20190404230426.15837-3-lucas.demarchi@intel.com
2019-04-30 02:25:37 -07:00
Lucas De Marchi
fcfec1fc98 drm/i915/icl: fix step numbers in icl_display_core_init()
At some point the spec was changed and we never updated the numbers to
match it. Let's try once more to keep them in sync.

Signed-off-by: Lucas De Marchi <lucas.demarchi@intel.com>
Reviewed-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20190404230426.15837-2-lucas.demarchi@intel.com
2019-04-30 02:25:37 -07:00
Tvrtko Ursulin
0fc2273b9a drm/i915/icl: Whitelist GEN9_SLICE_COMMON_ECO_CHICKEN1
WaEnableStateCacheRedirectToCS context workaround configures the L3 cache
to benefit 3d workloads but media has different requirements.

Remove the workaround and whitelist the register to allow any userspace
configure the behaviour to their liking.

v2:
 * Remove the workaround apart from adding the whitelist.

Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Cc: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Cc: kevin.ma@intel.com
Cc: xiaogang.li@intel.com
Acked-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Acked-by: Anuj Phogat <anuj.phogat@gmail.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20190418100634.984-1-tvrtko.ursulin@linux.intel.com
Fixes: f63c7b4880 ("drm/i915/icl: WaEnableStateCacheRedirectToCS")
Reviewed-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
[tursulin: Anuj reported no GPU hangs or performance regressions with old
 Mesa on patched kernel.]
2019-04-30 07:50:58 +01:00
Chris Wilson
62c8e42345 drm/i915: Skip unused contexts for context_barrier_task()
If the context has not been used yet, it needs no barrier, and in the
process fix up the selftest in mock_contexts.

Testcase: igt/gem_ctx_clone/vm
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20190429090735.326-1-chris@chris-wilson.co.uk
2019-04-29 11:12:37 +01:00
Chris Wilson
46472b3efb drm/i915: Move i915_request_alloc into selftests/
Having transitioned GEM over to using intel_context as its primary means
of tracking the GEM context and engine combined and using
i915_request_create(), we can move the older i915_request_alloc()
helper function into selftests/ where the remaining users are confined.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20190426163336.15906-9-chris@chris-wilson.co.uk
2019-04-26 18:32:20 +01:00
Chris Wilson
0268444607 drm/i915: Remove intel_context.active_link
We no longer need to track the active intel_contexts within each engine,
allowing us to drop a tricky mutex_lock from inside unpin (which may
occur inside fs_reclaim).

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20190426163336.15906-8-chris@chris-wilson.co.uk
2019-04-26 18:32:17 +01:00
Chris Wilson
5e2a0419ef drm/i915: Switch back to an array of logical per-engine HW contexts
We switched to a tree of per-engine HW context to accommodate the
introduction of virtual engines. However, we plan to also support
multiple instances of the same engine within the GEM context, defeating
our use of the engine as a key to looking up the HW context. Just
allocate a logical per-engine instance and always use an index into the
ctx->engines[]. Later on, this ctx->engines[] may be replaced by a user
specified map.

v2: Add for_each_gem_engine() helper to iterator within the engines lock
v3: intel_context_create_request() helper
v4: s/unsigned long/unsigned int/ 4 billion engines is quite enough.
v5: Push iterator locking to caller

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20190426163336.15906-7-chris@chris-wilson.co.uk
2019-04-26 18:32:11 +01:00
Chris Wilson
11334c6aad drm/i915: Split engine setup/init into two phases
In the next patch, we require the engine vfuncs setup prior to
initialising the pinned kernel contexts, so split the vfunc setup from
the engine initialisation and call it earlier.

v2: s/setup_xcs/setup_common/ for intel_ring_submission_setup()

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20190426163336.15906-6-chris@chris-wilson.co.uk
2019-04-26 18:32:07 +01:00
Chris Wilson
6b736de574 drm/i915: Pass intel_context to intel_context_pin_lock()
Move the intel_context_instance() to the caller so that we can decouple
ourselves from one context instance per engine.

v2: Rename pin_lock() to lock_pinned(), hopefully that is clearer.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20190426163336.15906-5-chris@chris-wilson.co.uk
2019-04-26 18:32:05 +01:00
Chris Wilson
1b1ae40721 drm/i915/selftests: Pass around intel_context for sseu
Combine the (i915_gem_context, intel_engine) into a single parameter,
the intel_context for convenience and later simplification.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20190426163336.15906-4-chris@chris-wilson.co.uk
2019-04-26 18:32:04 +01:00
Chris Wilson
f7f28de7e5 drm/i915/selftests: Use the real kernel context for sseu isolation tests
Simply the setup slightly for the sseu selftests to use the actual
kernel_context.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20190426163336.15906-3-chris@chris-wilson.co.uk
2019-04-26 18:32:03 +01:00
Chris Wilson
fa9f668141 drm/i915: Export intel_context_instance()
We want to pass in a intel_context into intel_context_pin() and that
requires us to first be able to lookup the intel_context!

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20190426163336.15906-2-chris@chris-wilson.co.uk
2019-04-26 18:32:00 +01:00
Chris Wilson
251d46b087 drm/i915/gvt: Pin the per-engine GVT shadow contexts
Our eventual goal is to rid request construction of struct_mutex, with
the short term step of lifting the struct_mutex requirements into the
higher levels (i.e. the caller must ensure that the context is already
pinned into the GTT). In this patch, we pin GVT's shadow context upon
allocation and so keep them pinned into the GGTT for as long as the
virtual machine is alive, and so we can use the simpler request
construction path safe in the knowledge that the hard work is already
done.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Zhenyu Wang <zhenyuw@linux.intel.com>
Acked-by: Zhenyu Wang <zhenyuw@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20190426163336.15906-1-chris@chris-wilson.co.uk
2019-04-26 18:31:57 +01:00
Jani Nikula
b226c3491b Merge drm/drm-next into drm-intel-next-queued
Get gvt-fixes back to dinq.

Signed-off-by: Jani Nikula <jani.nikula@intel.com>
2019-04-26 18:22:59 +03:00
Ville Syrjälä
f61a8f36c4 drm/i915: Clean up cherryview_load_luts()
I like my functions simple, so split up the low level bits from
cherryview_load_luts() into separate functions. Also rename the
whole thing to chv_load_luts() to match the new world order.

Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20190408121815.30142-1-ville.syrjala@linux.intel.com
Reviewed-by: Swati Sharma <swati2.sharma@intel.com>
2019-04-26 18:14:52 +03:00
Ville Syrjälä
d428ca17ea drm/i915: Fix ICL output CSC programming
When I refactored the code into its own function I accidentally
misplaced the <<16 shifts for some of the registers causing us
to lose the blue channel entirely.

We should really find a way to test this...

Cc: Uma Shankar <uma.shankar@intel.com>
Fixes: d2c19b06d6 ("drm/i915: Clean up ilk/icl pipe/output CSC programming")
Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20190425192419.24931-1-ville.syrjala@linux.intel.com
Reviewed-by: Swati Sharma <swati2.sharma@intel.com>
2019-04-26 18:14:52 +03:00
Chris Wilson
9ce9bdb00d drm/i915: Enable render context support for gen4 (Broadwater to Cantiga)
Broadwater and the rest of gen4  do support being able to saving and
reloading context specific registers between contexts, providing isolation
of the basic GPU state (as programmable by userspace). This allows
userspace to assume that the GPU retains their state from one batch to the
next, minimising the amount of state it needs to reload and manually save
across batches.

v2: CONSTANT_BUFFER woes

Running through piglit turned up an interesting issue, a GPU hang inside
the context load. The context image includes the CONSTANT_BUFFER command
that loads an address into a on-gpu buffer, and the context load was
executing that immediately. However, since it was reading from the GTT
there is no guarantee that the GTT retains the same configuration as
when the context was saved, resulting in stray reads and a GPU hang.

Having tried issuing a CONSTANT_BUFFER (to disable the command) from the
ring before saving the context to no avail, we resort to patching out
the instruction inside the context image before loading.

This does impose that gen4 always reissues CONSTANT_BUFFER commands on
each batch, but due to the use of a shared GTT that was and will remain
a requirement.

v3: ECOSKPD to the rescue

Ville found the magic bit in the ECOSKPD to disable saving and restoring
the CONSTANT_BUFFER from the context image, thereby completely avoiding
the GPU hangs from chasing invalid pointers. This appears to be the
default behaviour for gen5, and so we just need to tweak gen4 to match.

v4: Fix spelling of ECOSKPD and discover it already exists

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Ville Syrjälä <ville.syrjala@linux.intel.com>
Cc: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Link: https://patchwork.freedesktop.org/patch/msgid/20190419172720.5462-1-chris@chris-wilson.co.uk
2019-04-26 11:39:17 +01:00
Chris Wilson
1215d28e72 drm/i915: Enable render context support for Ironlake (gen5)
Ironlake does support being able to saving and reloading context specific
registers between contexts, providing isolation of the basic GPU state
(as programmable by userspace). This allows userspace to assume that the
GPU retains their state from one batch to the next, minimising the
amount of state it needs to reload, or manually save and restore.

v2: Fix off-by-one in reading CXT_SIZE, and add a comment that the
CXT_SIZE and context-layout do not match in bspec, but the difference is
irrelevant as we overallocate the full page anyway (Ville).

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Ville Syrjälä <ville.syrjala@linux.intel.com>
Cc: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20190419111749.3910-2-chris@chris-wilson.co.uk
2019-04-26 11:39:17 +01:00
Chris Wilson
928f8f4231 drm/i915/ringbuffer: EMIT_INVALIDATE *before* switch context
Despite what I think the prm recommends, commit f2253bd985
("drm/i915/ringbuffer: EMIT_INVALIDATE after switch context") turned out
to be a huge mistake when enabling Ironlake contexts as the GPU would
hang on either a MI_FLUSH or PIPE_CONTROL immediately following the
MI_SET_CONTEXT of an active mesa context (more vanilla contexts, e.g.
simple rendercopies with igt, do not suffer).

Ville found the following clue,

  "[DevCTG+]: For the invalidate operation of the pipe control, the
   following pointers are affected. The
   invalidate operation affects the restore of these packets. If the pipe
   control invalidate operation is completed
   before the context save, the indirect pointers will not be restored from
   memory.
   1. Pipeline State Pointer
   2. Media State Pointer
   3. Constant Buffer Packet"

which suggests by us emitting the INVALIDATE prior to the MI_SET_CONTEXT,
we prevent the context-restore from chasing the dangling pointers within
the image, and explains why this likely prevents the GPU hang.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20190419111749.3910-1-chris@chris-wilson.co.uk
2019-04-26 11:37:58 +01:00
Chris Wilson
e0516e8364 drm/i915: Move sandybride pcode access to intel_sideband.c
sandybride_pcode is another sideband, so move it to their new home.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20190426081725.31217-8-chris@chris-wilson.co.uk
2019-04-26 10:20:47 +01:00
Chris Wilson
063203c013 drm/i915: Merge sandybridge_pcode_(read|write)
These routines are identical except in the nature of the value parameter.
For writes it is a pure in-param, but for a read, we need an out-param.
Since they differ in a single line, merge the two routines into one.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Imre Deak <imre.deak@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20190426081725.31217-7-chris@chris-wilson.co.uk
2019-04-26 10:20:46 +01:00
Chris Wilson
7531942861 drm/i915: Merge sbi read/write into a single accessor
Since intel_sideband_read and intel_sideband_write differ by only a
couple of lines (depending on whether we feed the value in or out),
merge the two into a single common accessor.

v2: Restore vlv_flisdsi_read() lost during rebasing.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20190426081725.31217-6-chris@chris-wilson.co.uk
2019-04-26 10:20:45 +01:00
Chris Wilson
56c5098ffc drm/i915: Separate sideband declarations to intel_sideband.h
Split the sideback declarations out of the ginormous i915_drv.h

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20190426081725.31217-5-chris@chris-wilson.co.uk
2019-04-26 10:20:39 +01:00
Chris Wilson
ebb5eb7d73 drm/i915: Replace pcu_lock with sb_lock
We now have two locks for sideband access. The general one covering
sideband access across all generation, sb_lock, and a specific one
covering sideband access via the punit on vlv/chv. After lifting the
sb_lock around the punit into the callers, the pcu_lock is now redudant
and can be separated from its other use to regulate RPS (essentially
giving RPS a lock all of its own).

v2: Extract a couple of minor bug fixes.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Sagar Arun Kamble <sagar.a.kamble@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20190426081725.31217-4-chris@chris-wilson.co.uk
2019-04-26 10:20:35 +01:00
Chris Wilson
337fa6e04d drm/i915: Lift sideband locking for vlv_punit_(read|write)
Lift the sideband acquisition for vlv_punit_read and vlv_punit_write
into their callers, so that we can lock the sideband once for a sequence
of operations, rather than perform the heavyweight acquisition on each
request.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20190426081725.31217-3-chris@chris-wilson.co.uk
2019-04-26 10:20:32 +01:00
Chris Wilson
221c78623e drm/i915: Lift acquiring the vlv punit magic to a common sb-get
As we now employ a very heavy pm_qos around the punit access, we want to
minimise the number of synchronous requests by performing one for the
whole punit sequence rather than around individual accesses. The
sideband lock is used for this, so push the pm_qos into the sideband
lock acquisition and release, moving it from the lowlevel punit rw
routine to the callers. In the first step, we move the punit magic into
the common sideband lock so that we can acquire a bunch of ports
simultaneously, and if need be extend the workaround protection later.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20190426081725.31217-2-chris@chris-wilson.co.uk
2019-04-26 10:20:28 +01:00
Chris Wilson
a75d035fed drm/i915: Disable preemption and sleeping while using the punit sideband
While we talk to the punit over its sideband, we need to prevent the cpu
from sleeping in order to prevent a potential machine hang.

Note that by itself, it appears that pm_qos_update_request (via
intel_idle) doesn't provide a sufficient barrier to ensure that all core
are indeed awake (out of Cstate) and that the package is awake. To do so,
we need to supplement the pm_qos with a manual ping on_each_cpu.

v2: Restrict the heavy-weight wakeup to just the ISOF_PORT_PUNIT, there
is insufficient evidence to implicate a wider problem atm. Similarly,
restrict the w/a to Valleyview, as Cherryview doesn't have an angry cadre
of users.

The working theory, courtesy of Ville and Hans, is the issue lies within
the power delivery and so is likely to be unit and board specific and
occurs when both the unit/fw require extra power at the same time as the
cpu package is changing its own power state.

References: https://bugzilla.kernel.org/show_bug.cgi?id=109051
References: https://bugs.freedesktop.org/show_bug.cgi?id=102657
References: https://bugzilla.kernel.org/show_bug.cgi?id=195255
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Mika Kuoppala <mika.kuoppala@linux.intel.com>
Cc: Hans de Goede <hdegoede@redhat.com>
Cc: Ville Syrjälä <ville.syrjala@linux.intel.com>
Reviewed-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20190426081725.31217-1-chris@chris-wilson.co.uk
2019-04-26 10:20:26 +01:00
Chris Wilson
1f2b4a7edb drm/i915: Allow multiple user handles to the same VM
It was noted that we made the same mistake for VM_ID as for object
handles, whereby we ensured that we only allocated a single handle for
one ppgtt. This has the unfortunate consequence for userspace that they
need to reference count the handles to avoid destroying an active ID. If
we allow multiple handles to the same ppgtt, userspace can freely
unreference any handle they own without fear of destroying the same
handle in use elsewhere.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20190425054333.27299-1-chris@chris-wilson.co.uk
2019-04-25 07:33:57 +01:00
Chris Wilson
8f2a1057d6 drm/i915: Explicitly pin the logical context for execbuf
In order to separate the reservation phase of building a request from
its emission phase, we need to pull some of the request alloc activities
from deep inside i915_request to the surface, GEM_EXECBUFFER.

v2: Be frivolous, use a local drm_i915_private.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20190425050143.811-1-chris@chris-wilson.co.uk
2019-04-25 06:35:35 +01:00
Chris Wilson
79ffac8599 drm/i915: Invert the GEM wakeref hierarchy
In the current scheme, on submitting a request we take a single global
GEM wakeref, which trickles down to wake up all GT power domains. This
is undesirable as we would like to be able to localise our power
management to the available power domains and to remove the global GEM
operations from the heart of the driver. (The intent there is to push
global GEM decisions to the boundary as used by the GEM user interface.)

Now during request construction, each request is responsible via its
logical context to acquire a wakeref on each power domain it intends to
utilize. Currently, each request takes a wakeref on the engine(s) and
the engines themselves take a chipset wakeref. This gives us a
transition on each engine which we can extend if we want to insert more
powermangement control (such as soft rc6). The global GEM operations
that currently require a struct_mutex are reduced to listening to pm
events from the chipset GT wakeref. As we reduce the struct_mutex
requirement, these listeners should evaporate.

Perhaps the biggest immediate change is that this removes the
struct_mutex requirement around GT power management, allowing us greater
flexibility in request construction. Another important knock-on effect,
is that by tracking engine usage, we can insert a switch back to the
kernel context on that engine immediately, avoiding any extra delay or
inserting global synchronisation barriers. This makes tracking when an
engine and its associated contexts are idle much easier -- important for
when we forgo our assumed execution ordering and need idle barriers to
unpin used contexts. In the process, it means we remove a large chunk of
code whose only purpose was to switch back to the kernel context.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Cc: Imre Deak <imre.deak@intel.com>
Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20190424200717.1686-5-chris@chris-wilson.co.uk
2019-04-24 22:26:49 +01:00
Chris Wilson
2ccdf6a1c3 drm/i915: Pass intel_context to i915_request_create()
Start acquiring the logical intel_context and using that as our primary
means for request allocation. This is the initial step to allow us to
avoid requiring struct_mutex for request allocation along the
perma-pinned kernel context, but it also provides a foundation for
breaking up the complex request allocation to handle different scenarios
inside execbuf.

For the purpose of emitting a request from inside retirement (see the
next patch for engine power management), we also need to lift control
over the timeline mutex to the caller.

v2: Note that the request carries the active reference upon construction.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20190424200717.1686-4-chris@chris-wilson.co.uk
2019-04-24 22:25:35 +01:00
Chris Wilson
6eee33e87f drm/i915: Introduce context->enter() and context->exit()
We wish to start segregating the power management into different control
domains, both with respect to the hardware and the user interface. The
first step is that at the lowest level flow of requests, we want to
process a context event (and not a global GEM operation). In this patch,
we introduce the context callbacks that in future patches will be
redirected to per-engine interfaces leading to global operations as
required.

The intent is that this will be guarded by the timeline->mutex, except
that retiring has not quite finished transitioning over from being
guarded by struct_mutex. So at the moment it is protected by
struct_mutex with a reminded to switch.

v2: Rename default handlers to intel_context_enter_engine.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20190424200717.1686-3-chris@chris-wilson.co.uk
2019-04-24 22:25:32 +01:00
Chris Wilson
23c3c3d04f drm/i915: Pull the GEM powermangement coupling into its own file
Split out the powermanagement portion (GT wakeref, suspend/resume) of
GEM from i915_gem.c into its own file.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20190424200717.1686-2-chris@chris-wilson.co.uk
2019-04-24 22:25:28 +01:00
Chris Wilson
d91e657876 drm/i915: Introduce struct intel_wakeref
For controlling runtime pm of the GT and engines, we would like to have
a callback to do extra work the first time we wake up and the last time
we drop the wakeref. This first/last access needs serialisation and so
we encompass a mutex with the regular intel_wakeref_t tracker.

v2: Drop the _once naming and report the errors.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc; Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20190424200717.1686-1-chris@chris-wilson.co.uk
2019-04-24 22:25:26 +01:00
Chris Wilson
112ed2d31a drm/i915: Move GraphicsTechnology files under gt/
Start partitioning off the code that talks to the hardware (GT) from the
uapi layers and move the device facing code under gt/

One casualty is s/intel_ringbuffer.h/intel_engine.h/ with the plan to
subdivide that header and body further (and split out the submission
code from the ringbuffer and logical context handling). This patch aims
to be simple motion so git can fixup inflight patches with little mess.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Acked-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Acked-by: Jani Nikula <jani.nikula@intel.com>
Acked-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20190424174839.7141-1-chris@chris-wilson.co.uk
2019-04-24 21:01:46 +01:00
Chris Wilson
86554f48e5 drm/i915/selftests: Verify whitelist of context registers
The RING_NONPRIV allows us to add registers to a whitelist that allows
userspace to modify them. Ideally such registers should be safe and
saved within the context such that they do not impact system behaviour
for other users. This selftest verifies that those registers we do add
are (a) then writable by userspace and (b) only affect a single client.

Opens:
- Is GEN9_SLICE_COMMON_ECO_CHICKEN1 really write-only?

v2: Remove the blatant copy-paste.
v3: Emulate userspace register writes via the batch again.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20190424110941.9869-1-chris@chris-wilson.co.uk
2019-04-24 18:37:36 +01:00
Chris Wilson
09407579ab drm/i915: Store the default sseu setup on the engine
As we push for better compartmentalisation, it is more convenient to
copy the default sseu configuration from the engine into the derived
logical context, than it is to dig it out from i915->runtime_info.

v2: Use intel_sseu_from_device_info() to describe the converter

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20190424095134.30249-1-chris@chris-wilson.co.uk
2019-04-24 16:37:20 +01:00
Imre Deak
447811a686 drm/i915/icl: Fix MG_DP_MODE() register programming
Fix the order of lane, port parameters passed to the register macro.

Note that this was already partly fixed by commit
37fc7845df ("drm/i915: Call MG_DP_MODE() macro with the right parameters order")

While at it simplify things by using the macro directly instead of an
unnecessary redirection via an array.

v2:
- Add a note the commit message about simplifying things. (José)

Fixes: 58106b7d81 ("drm/i915: Make MG PHY macros semantically consistent")
Cc: José Roberto de Souza <jose.souza@intel.com>
Cc: Lucas De Marchi <lucas.demarchi@intel.com>
Cc: Aditya Swarup <aditya.swarup@intel.com>
Signed-off-by: Imre Deak <imre.deak@intel.com>
Reviewed-by: Lucas De Marchi <lucas.demarchi@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20190419071026.32370-1-imre.deak@intel.com
(cherry picked from commit 9c11b12184)
Signed-off-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
2019-04-24 09:39:11 +03:00
Chris Wilson
929eec99f5 drm/i915: Avoid use-after-free in reporting create.size
We have to avoid chasing after a userspace race!

<3>[  473.114328] BUG: KASAN: use-after-free in i915_gem_create+0x1d2/0x1f0 [i915]
<3>[  473.114389] Read of size 8 at addr ffff88815bf1d840 by task gem_flink_race/1541

<4>[  473.114464] CPU: 1 PID: 1541 Comm: gem_flink_race Tainted: G     U            5.1.0-rc4-g7d07e025e786-kasan_88+ #1
<4>[  473.114469] Hardware name: To Be Filled By O.E.M. To Be Filled By O.E.M./J4205-ITX, BIOS P1.10 09/29/2016
<4>[  473.114474] Call Trace:
<4>[  473.114488]  dump_stack+0x7c/0xbb
<4>[  473.114612]  ? i915_gem_create+0x1d2/0x1f0 [i915]
<4>[  473.114621]  print_address_description+0x65/0x270
<4>[  473.114728]  ? i915_gem_create+0x1d2/0x1f0 [i915]
<4>[  473.114839]  ? i915_gem_create+0x1d2/0x1f0 [i915]
<4>[  473.114848]  kasan_report+0x149/0x18d
<4>[  473.114962]  ? i915_gem_create+0x1d2/0x1f0 [i915]
<4>[  473.115069]  i915_gem_create+0x1d2/0x1f0 [i915]
<4>[  473.115176]  ? i915_gem_object_create.part.28+0x4b0/0x4b0 [i915]
<4>[  473.115289]  ? i915_gem_dumb_create+0x1a0/0x1a0 [i915]
<4>[  473.115297]  drm_ioctl_kernel+0x192/0x260
<4>[  473.115306]  ? drm_ioctl_permit+0x280/0x280
<4>[  473.115326]  drm_ioctl+0x67c/0x960
<4>[  473.115438]  ? i915_gem_dumb_create+0x1a0/0x1a0 [i915]
<4>[  473.115448]  ? drm_getstats+0x20/0x20
<4>[  473.115459]  ? __lock_acquire+0xa66/0x3fe0
<4>[  473.115474]  ? _raw_spin_unlock_irqrestore+0x39/0x60
<4>[  473.115485]  ? debug_object_active_state+0x2ea/0x4e0
<4>[  473.115496]  ? debug_show_all_locks+0x2d0/0x2d0
<4>[  473.115513]  do_vfs_ioctl+0x18d/0xfa0
<4>[  473.115522]  ? check_flags.part.27+0x440/0x440
<4>[  473.115532]  ? ioctl_preallocate+0x1a0/0x1a0
<4>[  473.115547]  ? __fget+0x2ac/0x410
<4>[  473.115561]  ? __ia32_sys_dup3+0xb0/0xb0
<4>[  473.115569]  ? rwlock_bug.part.0+0x90/0x90
<4>[  473.115590]  ksys_ioctl+0x35/0x70
<4>[  473.115597]  ? lockdep_hardirqs_off+0x1cb/0x2b0
<4>[  473.115608]  __x64_sys_ioctl+0x6a/0xb0
<4>[  473.115614]  ? lockdep_hardirqs_on+0x342/0x590
<4>[  473.115623]  do_syscall_64+0x97/0x400
<4>[  473.115633]  entry_SYSCALL_64_after_hwframe+0x49/0xbe
<4>[  473.115641] RIP: 0033:0x7fce590d55d7
<4>[  473.115649] Code: b3 66 90 48 8b 05 b1 48 2d 00 64 c7 00 26 00 00 00 48 c7 c0 ff ff ff ff c3 66 2e 0f 1f 84 00 00 00 00 00 b8 10 00 00 00 0f 05 <48> 3d 01 f0 ff ff 73 01 c3 48 8b 0d 81 48 2d 00 f7 d8 64 89 01 48
<4>[  473.115655] RSP: 002b:00007fce4d525ba8 EFLAGS: 00000246 ORIG_RAX: 0000000000000010
<4>[  473.115662] RAX: ffffffffffffffda RBX: 0000000000000000 RCX: 00007fce590d55d7
<4>[  473.115667] RDX: 00007fce4d525c10 RSI: 00000000c010645b RDI: 0000000000000007
<4>[  473.115672] RBP: 00007fce4d525c10 R08: 00007fce4d526700 R09: 00007fce4d526700
<4>[  473.115677] R10: 0000000000000054 R11: 0000000000000246 R12: 00000000c010645b
<4>[  473.115682] R13: 0000000000000007 R14: 0000000000000000 R15: 00007ffe0e4a7450

<3>[  473.115731] Allocated by task 1541:
<4>[  473.115766]  kmem_cache_alloc+0xce/0x290
<4>[  473.115895]  i915_gem_object_create.part.28+0x1c/0x4b0 [i915]
<4>[  473.116000]  i915_gem_create+0xe3/0x1f0 [i915]
<4>[  473.116008]  drm_ioctl_kernel+0x192/0x260
<4>[  473.116013]  drm_ioctl+0x67c/0x960
<4>[  473.116020]  do_vfs_ioctl+0x18d/0xfa0
<4>[  473.116026]  ksys_ioctl+0x35/0x70
<4>[  473.116032]  __x64_sys_ioctl+0x6a/0xb0
<4>[  473.116038]  do_syscall_64+0x97/0x400
<4>[  473.116044]  entry_SYSCALL_64_after_hwframe+0x49/0xbe

<3>[  473.116071] Freed by task 1542:
<4>[  473.116101]  kmem_cache_free+0xb7/0x2f0
<4>[  473.116205]  __i915_gem_free_objects+0x7d4/0xe10 [i915]
<4>[  473.116311]  i915_gem_create_ioctl+0xaa/0xd0 [i915]
<4>[  473.116318]  drm_ioctl_kernel+0x192/0x260
<4>[  473.116323]  drm_ioctl+0x67c/0x960
<4>[  473.116330]  do_vfs_ioctl+0x18d/0xfa0
<4>[  473.116335]  ksys_ioctl+0x35/0x70
<4>[  473.116341]  __x64_sys_ioctl+0x6a/0xb0
<4>[  473.116347]  do_syscall_64+0x97/0x400
<4>[  473.116354]  entry_SYSCALL_64_after_hwframe+0x49/0xbe

Testcase: igt/gem_flink_race/flink_close
Fixes: e163484afa ("drm/i915: Update size upon return from GEM_CREATE")
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Michał Winiarski <michal.winiarski@intel.com>
Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20190417132507.27133-1-chris@chris-wilson.co.uk
(cherry picked from commit 9953402349)
Signed-off-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
2019-04-24 09:39:07 +03:00
Dave Airlie
8d8f6f7044 drm-misc-next for v5.2:
UAPI Changes:
 - Document which feature flags belong to which command in virtio_gpu.h
 - Make the FB_DAMAGE_CLIPS available for atomic userspace only, it's useless for legacy.
 
 Cross-subsystem Changes:
 - Add device tree bindings for lg,acx467akm-7 panel and ST-Ericsson Multi Channel Display Engine MCDE
 - Add parameters to the device tree bindings for tfp410
 - iommu/io-pgtable: Add ARM Mali midgard MMU page table format
 - dma-buf: Only do a 64-bits seqno compare when driver explicitly asks for it, else wraparound.
 - Use the 64-bits compare for dma-fence-chains
 
 Core Changes:
 - Make the fb conversion functions use __iomem dst.
 - Rename drm_client_add to drm_client_register
 - Move intel_fb_initial_config to core.
 - Add a drm_gem_objects_lookup helper
 - Add drm_gem_fence_array helpers, and use it in lima.
 - Add drm_format_helper.c to kerneldoc.
 
 Driver Changes:
 - Add panfrost driver for mali midgard/bitfrost.
 - Converts bochs to use the simple display type.
 - Small fixes to sun4i, tinydrm, ti-fp410.
 - Fid aspeed's Kconfig options.
 - Make some symbols/functions static in lima, sun4i and meson.
 - Add a driver for the lg,acx467akm-7 panel.
 -----BEGIN PGP SIGNATURE-----
 
 iQIzBAABCgAdFiEEuXvWqAysSYEJGuVH/lWMcqZwE8MFAly4N1oACgkQ/lWMcqZw
 E8OJNQ//fk8+8TkupJTiBYjsAIbo4pRrWa29zQFhiqEhG2kpDESr1YjkeN3uQ+wg
 R6laOUt1jMNiV/BRBUlsv95cbZ0eda3mDPyFOqni8qBoUf0o4NMgLhuLK6qPeMyN
 76u/VZCwYl1AZGiyi33xMABWZ0WmRkwaGxG8gDW/pGU2klRX6T7XgZPDnXrX/jJC
 xq9QUWsOKoBVXX6OtYQPiL6WslbHh97sPzzs5dljJuLOWIug6xFWarUYK6vPnzU1
 HVNcu2/oS6VMCRAW+Ocf4dlcfIN7PvKL984AOcZH3SLB5qabhbjB14e10RivJGZx
 O2yhqNsdF42HcvA08EnwzNvtNS9Rj/GNuw95KHEU+pKZGZ6dQo/fFivm2DoeOqub
 piQlTjVqrHhpNhKg+h8Bd5jUQjx97TPy+PjFtjvCZznZpp8SK8T12yrN+MK4W1ml
 vzMYSaMWiUYNbdixSQH0L90i555uMQgOXh53mKNEovPkh1SKlcsMbONuJIEphpne
 jOT89O9AhtwGu4179cTHRPWpsDcWw/Uoji5wcWZkjeBWdwjwXDGXiIHePN1KAcw9
 qBERJs+yVujgmAvjnJwbe78QlYB1+wgtPdvWAGR0gmu31J1WVL1ADMLFe0YQ5RIz
 huXetCYJzYzv8PxWAPUkky/vUFq4EZQkEQFzGQPtrZje5sH2mko=
 =7Hyr
 -----END PGP SIGNATURE-----

Merge tag 'drm-misc-next-2019-04-18' of git://anongit.freedesktop.org/drm/drm-misc into drm-next

drm-misc-next for v5.2:

UAPI Changes:
- Document which feature flags belong to which command in virtio_gpu.h
- Make the FB_DAMAGE_CLIPS available for atomic userspace only, it's useless for legacy.

Cross-subsystem Changes:
- Add device tree bindings for lg,acx467akm-7 panel and ST-Ericsson Multi Channel Display Engine MCDE
- Add parameters to the device tree bindings for tfp410
- iommu/io-pgtable: Add ARM Mali midgard MMU page table format
- dma-buf: Only do a 64-bits seqno compare when driver explicitly asks for it, else wraparound.
- Use the 64-bits compare for dma-fence-chains

Core Changes:
- Make the fb conversion functions use __iomem dst.
- Rename drm_client_add to drm_client_register
- Move intel_fb_initial_config to core.
- Add a drm_gem_objects_lookup helper
- Add drm_gem_fence_array helpers, and use it in lima.
- Add drm_format_helper.c to kerneldoc.

Driver Changes:
- Add panfrost driver for mali midgard/bitfrost.
- Converts bochs to use the simple display type.
- Small fixes to sun4i, tinydrm, ti-fp410.
- Fid aspeed's Kconfig options.
- Make some symbols/functions static in lima, sun4i and meson.
- Add a driver for the lg,acx467akm-7 panel.

Signed-off-by: Dave Airlie <airlied@redhat.com>

From: Maarten Lankhorst <maarten.lankhorst@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/737ad994-213d-45b5-207a-b99d795acd21@linux.intel.com
2019-04-24 10:12:50 +10:00
Dave Airlie
b1c4f7fead Merge tag 'drm-intel-next-2019-04-17' of git://anongit.freedesktop.org/drm/drm-intel into drm-next
UAPI Changes:

- uAPI "Fixes:" patch for the upcoming kernel 5.1, included here too

  We have an Ack from the media folks (only current user) for this
  late tweak

Cross-subsystem Changes:

- ALSA: hda: Fix racy display power access (Takashi, Chris)

Driver Changes:

- DDI and MIPI-DSI clocks fixes for Icelake (Vandita)
- Fix Icelake frequency change/locking (RPS) (Mika)
- Temporarily disable ppGTT read-only bit on Icelake (Mika)
- Add missing Icelake W/As (Mika)
- Enable 12 deep CSB status FIFO on Icelake (Mika)
- Inherit more Icelake code for Elkhartlake (Bob, Jani)

- Handle catastrophic error on engine reset (Mika)
- Shortcut readiness to reset check (Mika)
- Regression fix for GEM_BUSY causing us to report a mixed uabi-class request as not busy (Chris)
- Revert back to max link rate and lane count on eDP (Jani)
- Fix pipe BPP readout for BXT/GLK DSI (Ville)
- Set DP min_bpp to 8*3 for non-RGB output formats (Ville)
- Enable coarse preemption boundaries for Gen8 (Chris)
- Do not enable FEC without DSC (Ville)
- Restore correct BXT DDI latency optim setting calculation (Ville)
- Always reset context's RING registers to avoid running workload twice during reset (Chris)
- Set GPU wedged on driver unload (Janusz)
- Consolidate two similar barries from timeline into one (Chris)
- Only reset the pinned kernel contexts on resume (Chris)
- Wakeref tracking improvements (Chris, Imre)
- Lockdep fixes for shrinker interactions (Chris)
- Bump ready tasks ahead of busywaits in prep of semaphore use (Chris)

- Huge step in splitting display code into fine grained files (Jani)
- Refactor the IRQ init/reset macros for code saving (Paulo)
- Convert IRQ initialization code to uncore MMIO access (Paulo)
- Convert workarounds code to use uncore MMIO access (Chris)
- Nuke drm_crtc_state and use intel_atomic_state instead (Manasi)
- Update SKL clock-gating WA (Radhakrishna, Ville)
- Isolate GuC reset code flow (Chris)
- Expose force_dsc_enable through debugfs (Manasi)
- Header standalone compile testing framework (Jani)
- Code cleanups to reduce driver footprint (Chris)
- PSR code fixes and cleanups (Jose)
- Sparse and kerneldoc updates (Chris)
- Suppress spurious combo PHY B warning (Vile)

Signed-off-by: Dave Airlie <airlied@redhat.com>
From: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20190418080426.GA6409@jlahtine-desk.ger.corp.intel.com
2019-04-24 10:02:20 +10:00
Radhakrishna Sripada
51eb1a1de7 drm/i915/icl: Fix clockgating issue when using scalers
Fixes the clock-gating issue when pipe scaling is enabled.
(Lineage #2006604312)

V2: Fix typo in headline(Chris)
    Handle the non double buffered nature of the register(Ville)
V3: Fix checkpatch warning. BAT failure for V2 on gen3 looks unrelated.
V4: Split the icl and skl wa's(Ville)
V5: Split the checks for icl and skl(Ville)
V6: Correct the flipped checks in intel_pre_plane_update(Ville)
V7: Use enum for pipe and extend the WA for plane scalers(Ville)
V8: Eliminate the redundant use of pch_pfit(Ville)

Cc: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Ville Syrjala <ville.syrjala@linux.intel.com>
Cc: Rodrigo Vivi <rodrigo.vivi@intel.com>
Cc: Clint Taylor <clinton.a.taylor@intel.com>
Cc: Aditya Swarup <aditya.swarup@intel.com>
Signed-off-by: Radhakrishna Sripada <radhakrishna.sripada@intel.com>
Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20190417185901.14833-1-radhakrishna.sripada@intel.com
2019-04-23 21:30:24 +03:00
Ville Syrjälä
372b9ffb57 drm/i915: Fix skl+ max plane width
The spec has changed since skl_max_plane_width() was written.
Now the SKL limits are lower than what they were initially, and
GLK and ICL have different limits. Update the code to match the
spec.

Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20190418195907.23912-1-ville.syrjala@linux.intel.com
Reviewed-by: José Roberto de Souza <jose.souza@intel.com>
2019-04-23 19:34:51 +03:00
Imre Deak
9c11b12184 drm/i915/icl: Fix MG_DP_MODE() register programming
Fix the order of lane, port parameters passed to the register macro.

Note that this was already partly fixed by commit
37fc7845df ("drm/i915: Call MG_DP_MODE() macro with the right parameters order")

While at it simplify things by using the macro directly instead of an
unnecessary redirection via an array.

v2:
- Add a note the commit message about simplifying things. (José)

Fixes: 58106b7d81 ("drm/i915: Make MG PHY macros semantically consistent")
Cc: José Roberto de Souza <jose.souza@intel.com>
Cc: Lucas De Marchi <lucas.demarchi@intel.com>
Cc: Aditya Swarup <aditya.swarup@intel.com>
Signed-off-by: Imre Deak <imre.deak@intel.com>
Reviewed-by: Lucas De Marchi <lucas.demarchi@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20190419071026.32370-1-imre.deak@intel.com
2019-04-23 11:05:07 +03:00
Chris Wilson
2d6692e642 drm/i915: Start writeback from the shrinker
When we are called to relieve mempressue via the shrinker, the only way
we can make progress is either by discarding unwanted pages (those
objects that userspace has marked MADV_DONTNEED) or by reclaiming the
dirty objects via swap. As we know that is the only way to make further
progress, we can initiate the writeback as we invalidate the objects.
This means the objects we put onto the inactive anon lru list are
already marked for reclaim+writeback and so will trigger a wait upon the
writeback inside direct reclaim, greatly improving the success rate of
direct reclaim on i915 objects.

The corollary is that we may start a slow swap on opportunistic
mempressure from the likes of the compaction + migration kthreads. This
is limited by those threads only being allowed to shrink idle pages, but
also that if we reactivate the page before it is swapped out by gpu
activity, we only page the cost of repinning the page. The cost is most
felt when an object is reused after mempressure, which hopefully
excludes the latency sensitive tasks (as we are just extending the
impact of swap thrashing to them).

Apparently this is not the first time we've had this idea. Back in
commit 5537252b6b ("drm/i915: Invalidate our pages under memory
pressure") we wanted to start writeback but settled on invalidate after
Hugh Dickins warned us about a possibility of a deadlock within shmemfs
if we started writeback from shrink_slab. Looking at the callchain,
using writeback from i915_gem_shrink should be equivalent to the pageout
also employed by shrink_slab, i.e. it should not be any riskier afaict.

v2: Leave mmapings intact. At this point, the only mmapings of our
objects will be via CPU mmaps on the shmemfs filp, which are
out-of-scope for our LRU tracking. Instead leave those pages to the
inactive anon LRU page list for aging and pageout as normal.

v3: Be selective on which paths trigger writeback, in particular
excluding paths shrinking just to reclaim vm space (e.g. mmap, vmap
reapers) and avoid starting writeback on the entire process space from
within the pm freezer.

References: https://bugs.freedesktop.org/show_bug.cgi?id=108686
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Mika Kuoppala <mika.kuoppala@linux.intel.com>
Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Cc: Matthew Auld <matthew.auld@intel.com>
Cc: Daniel Vetter <daniel.vetter@ffwll.ch>
Cc: Michal Hocko <mhocko@suse.com>
Reviewed-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com> #v1
Link: https://patchwork.freedesktop.org/patch/msgid/20190420115539.29081-1-chris@chris-wilson.co.uk
2019-04-20 15:06:31 +01:00
Fernando Pacheco
f3c2b76ef2 drm/i915/selftests: Check that gpu reset is usable from atomic context
GPU reset is now available with GuC enabled, so re-enable our check that
this reset is usable from atomic context.

Signed-off-by: Fernando Pacheco <fernando.pacheco@intel.com>
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Link: https://patchwork.freedesktop.org/patch/msgid/20190419230015.18121-6-fernando.pacheco@intel.com
2019-04-20 08:20:08 +01:00
Fernando Pacheco
40d211ef62 Revert "drm/i915/guc: Disable global reset"
We have now prepared the guc reset paths to avoid taking struct_mutex, or
any other lock, and so it is now safe to re-enable.

References: fe62365f9f ("drm/i915/guc: Disable global reset")
Signed-off-by: Fernando Pacheco <fernando.pacheco@intel.com>
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Link: https://patchwork.freedesktop.org/patch/msgid/20190419230015.18121-5-fernando.pacheco@intel.com
2019-04-20 08:20:06 +01:00
Fernando Pacheco
fc488b5903 drm/i915/uc: Place uC firmware in upper range of GGTT
Currently we pin the GuC or HuC firmware image just before uploading.
Perma-pin during uC initialization instead and use the range reserved at
the top of the address space.

Moving the firmware resulted in needing to:
- use an additional pinning for the rsa signature which will be used
  during HuC auth as addresses above GUC_GGTT_TOP do not map through GTT.

v2: Remove call to set to gtt domain
    Do not restore fw gtt mapping unconditionally
    Separate out pin/unpin functions and drop usage of pin/unpin
    Use uc_fw init/fini functions to bind/unbind fw object

v3: Bind is only needed during xfer (Chris)
    Remove attempts to bind outside of xfer (Chris)
    Mark fw bind/unbind static

Signed-off-by: Fernando Pacheco <fernando.pacheco@intel.com>
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Link: https://patchwork.freedesktop.org/patch/msgid/20190419230015.18121-4-fernando.pacheco@intel.com
2019-04-20 08:19:32 +01:00