Commit Graph

77 Commits

Author SHA1 Message Date
Ben Widawsky
7e0d96bc03 drm/i915: Use multiple VMs -- the point of no return
As with processes which run on the CPU, the goal of multiple VMs is to
provide process isolation. Specific to GEN, there is also the ability to
map more objects per process (2GB each instead of 2Gb-2k total).

For the most part, all the pipes have been laid, and all we need to do
is remove asserts and actually start changing address spaces with the
context switch. Since prior to this we've converted the setting of the
page tables to a streamed version, this is quite easy.

One important thing to point out (since it'd been hotly contested) is
that with this patch, every context created will have it's own address
space (provided the HW can do it).

v2: Disable BDW on rebase

NOTE: I tried to make this commit as small as possible. I needed one
place where I could "turn everything on" and that is here. It could be
split into finer commits, but I didn't really see much point.

Cc: Eric Anholt <eric@anholt.net>
Cc: Daniel Vetter <daniel.vetter@ffwll.ch>
Signed-off-by: Ben Widawsky <ben@bwidawsk.net>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
2013-12-18 16:24:52 +01:00
Daniel Vetter
3d7f0f9dcc Merge commit drm-intel-fixes into topic/ppgtt
I need the tricky do_switch fix before I can merge the final piece of
the ppgtt enabling puzzle. Otherwise the conflict will be a real pain
to resolve since the do_switch hunk from -fixes must be placed at the
exact right place within a hunk in the next patch.

Conflicts:
	drivers/gpu/drm/i915/i915_gem_context.c
	drivers/gpu/drm/i915/i915_gem_execbuffer.c
	drivers/gpu/drm/i915/intel_display.c

Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
2013-12-18 16:23:37 +01:00
Ben Widawsky
41bde5535a drm/i915: Get context early in execbuf
We need to have the address space when reserving space for the objects.
Since the address space and context are tied together, and reserve
occurs before context switch (for good reason), we must lookup our
context earlier in the process.

This leaves some room for optimizations where we no longer need to use
ctx_id in certain places. This will be addressed in a subsequent patch.

Important tricky bit:
Because slow relocations during execbuffer drop struct_mutex

Perhaps it would be best to acquire the reference when we get the
context, but I'll save that for another day (note I have written the
patch before, and I found the changes required to be uglier than this).

Note that since we currently access everything via context id, and not
the data structure this is fine, though not desirable. The next change
attempts to get the context only once via the context ID idr lookup, and
as such, the following can happen:

CTX-A is created, refcount = 1
CTX-A execbuf, mutex dropped
close IOCTL called on CTX-A, refcount = 0
CTX-A resumes in execbuf.

v2: Rebased on top of
commit b6359918b8
Author: Mika Kuoppala <mika.kuoppala@linux.intel.com>
Date:   Wed Oct 30 15:44:16 2013 +0200

    drm/i915: add i915_get_reset_stats_ioctl

v3: Rebased on top of
commit 25b3dfc87b
Author: Mika Westerberg <mika.westerberg@linux.intel.com>
Date:   Tue Nov 12 11:57:30 2013 +0200

Author: Mika Kuoppala <mika.kuoppala@linux.intel.com>
Date:   Tue Nov 26 16:14:33 2013 +0200

    drm/i915: check context reset stats before relocations

Signed-off-by: Ben Widawsky <ben@bwidawsk.net>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
2013-12-18 15:52:42 +01:00
Ben Widawsky
c482972a08 drm/i915: Piggy back hangstats off of contexts
To simplify the codepaths somewhat, we can simply always create a
context. Contexts already keep hangstat information. This prevents us
from having to differentiate at other parts in the code.

There is allocation overhead, but it should not be measurable.

Signed-off-by: Ben Widawsky <ben@bwidawsk.net>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
2013-12-18 15:51:58 +01:00
Ben Widawsky
0eea67eb26 drm/i915: Create a per file_priv default context
Every file will get it's own context, and we use this context instead of
the default context. The default context still exists for future
shrinker usage as well as reset handling.

v2: Updated to address Mika's recent context guilty changes
Some more changes around this come up in later patches as well.

v3: Use a fake context to avoid allocation for the !HAS_HW_CONTEXT case.
I've tried the alternatives. This looks the best to me.
Removed hangstat stuff from v2 - for a separate patch
Demote failed PPGTT set to DRM_DEBUG_DRIVER since it can now be invoked
easily from userspace.

Signed-off-by: Ben Widawsky <ben@bwidawsk.net>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
2013-12-18 15:44:29 +01:00
Ben Widawsky
bdf4fd7ea0 drm/i915: Do aliasing PPGTT init with contexts
We have a default context which suits the aliasing PPGTT well. Tie them
together so it looks like any other context/PPGTT pair. This makes the
code cleaner as it won't have to special case aliasing as often.

The patch has one slightly tricky part in the default context creation
function. In the future (and on aliased setup) we create a new VM for a
context (potentially). However, if we have aliasing PPGTT, which occurs
at this point in time for all platforms GEN6+, we can simply manage the
refcounting to allow things to behave as normal. Now is a good time to
recall that the aliasing_ppgtt doesn't have a real VM, it uses the GGTT
drm_mm.

Signed-off-by: Ben Widawsky <ben@bwidawsk.net>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
2013-12-18 15:32:14 +01:00
Ben Widawsky
c7c48dfdff drm/i915: Add VM to context
Pretty straightforward so far except for the bit about the refcounting.
The PPGTT will potentially be shared amongst multiple contexts. Because
contexts themselves have a refcounted lifecycle, the easiest way to
manage this will be to refcount the PPGTT. To acheive this, we piggy
back off of the existing context refcount, and will increment and
decrement the PPGTT refcount with context creation, and destruction.

To put it more clearly, if context A, and context B both use PPGTT 0, we
can't free the PPGTT until both A, and B are destroyed.

Note that because the PPGTT is permanently pinned (for now), it really
just matters for the PPGTT destruction, as opposed to making space under
memory pressure.

Signed-off-by: Ben Widawsky <ben@bwidawsk.net>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
2013-12-18 15:31:20 +01:00
Ben Widawsky
a45d0f6a7f drm/i915: Generalize default context setup
The plan to to make every file descriptor have a default context. To
accommodate this, generalize out default context setup function so it
can be used at file open time.

Signed-off-by: Ben Widawsky <ben@bwidawsk.net>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
2013-12-18 15:27:56 +01:00
Ben Widawsky
2fa48d8d4a drm/i915: Split context enabling from init
We **need** to do this for exactly 1 reason, because we want to embed a
PPGTT into the context, but we don't want to special case the default
context.

To achieve that, we must be able to initialize contexts after the GTT is
setup (so we can allocate and pin the default context's BO), but before
the PPGTT and rings are initialized. This is because, currently, context
initialization requires ring usage. We don't have rings until after the
GTT is setup. If we split the enabling part of context initialization,
the part requiring the ringbuffer, we can untangle this, and then later
embed the PPGTT

Incidentally this allows us to also adhere to the original design of
context init/fini in future patches: they were only ever meant to be
called at driver load and unload.

v2: Move hw_contexts_disabled test in i915_gem_context_enable() (Chris)

v3: BUG_ON after checking for disabled contexts. Or else it blows up pre
gen6 (Ben)

v4: Forward port
Modified enable for each ring, since that patch is earlier in the series
Dropped ring arg from create_default_context so it can be used by others

Signed-off-by: Ben Widawsky <ben@bwidawsk.net>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
2013-12-18 15:27:55 +01:00
Ben Widawsky
acce9ffa48 drm/i915: Better reset handling for contexts
This patch adds to changes for contexts on reset:
Sets last context to default - this will prevent the context switch
happening after a reset. That switch is not possible because the
rings are hung during reset and context switch requires reset. This
behavior will need to be reworked in the future, but this is what we
want for now.

In the future, we'll also want to reset the guilty context to
uninitialized. We should wait for ARB_Robustness related code to land
for that.

This is somewhat for paranoia.  Because we really don't know what the
GPU was doing when it hung, or the state it was in (mid context write,
for example), later restoring the context is a bad idea. By setting the
flag to not initialized, the next load of that context will not restore
the state, and thus on the subsequent switch away from the context will
overwrite the old data.

NOTE: This code needs a fixup when we actually have multiple VMs. The
issue that can occur is inactive objects in a VM will need to be
destroyed before the last context unref. This can now happen via the
fake switch introduced in this patch (and it other ways in the future)

Signed-off-by: Ben Widawsky <ben@bwidawsk.net>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
2013-12-18 15:27:54 +01:00
Ben Widawsky
0009e46cd5 drm/i915: Track which ring a context ran on
Previously we dropped the association of a context to a ring. It is
however very important to know which ring a context ran on (we could
have reused the other member, but I was nitpicky).

This is very important when we switch address spaces, which unlike
context objects, do change per ring.

As an example, if we have:

        RCS   BCS
ctx            A
ctx      A
ctx      B
ctx            B

Without tracking the last ring B ran on, we wouldn't know to switch the
address space on BCS in the last row.

As a result, we no longer need to track which ring a context "belongs"
to, as it never really made much sense anyway.

Signed-off-by: Ben Widawsky <ben@bwidawsk.net>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
2013-12-18 15:27:54 +01:00
Ben Widawsky
67e3d2979b drm/i915: Permit contexts on all rings
If we want to use contexts in more abstract terms (specifically with
PPGTT in mind), we need to allow them to be specified for any ring.

Since the upcoming patches will bring about the use of multiple address
spaces, and each ring needs to have an address space programmed (which
we intend to do at context switch time), we can no longer only use RCS.

With multiple rings having a last context, we must now unreference these
contexts.

NOTE: This commit requires an update to intel-gpu-tools to make it not
fail.

v2: Rebased with some logical conflicts.
Squashed in the context fini refcount patch

Signed-off-by: Ben Widawsky <ben@bwidawsk.net>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
2013-12-18 15:27:53 +01:00
Ben Widawsky
b731d33d05 drm/i915: relax context alignment
With the introduction of contexts per fd in the future, one can easily
envision more contexts being used. We do not have an easy remedy to
reduce the space requirements of the contexts, we can make things
slightly better by using less stringent alignments on later hardware.

Ville: Since I can almost predict you'll point this out. I can no longer
find the docs which specify the 64k requirement on certain gen6 SKUs. If
you'd like to change that too, be my guest.

CC: Ville Syrjälä <ville.syrjala@linux.intel.com>
Signed-off-by: Ben Widawsky <ben@bwidawsk.net>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
2013-12-18 15:27:52 +01:00
Ben Widawsky
e422b888eb drm/i915: Add a context open function
We'll be doing a bit more stuff with each file, so having our own open
function should make things clean.

This also allows us to easily add conditionals for stuff we don't want
to do when we don't have HW contexts.

Signed-off-by: Ben Widawsky <ben@bwidawsk.net>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
2013-12-18 15:27:51 +01:00
Ben Widawsky
6f65e29aca drm/i915: Create bind/unbind abstraction for VMAs
To sum up what goes on here, we abstract the vma binding, similarly to
the previous object binding. This helps for distinguishing legacy
binding, versus modern binding. To keep the code churn as minimal as
possible, I am leaving in insert_entries(). It serves as the per
platform pte writing basically. bind_vma and insert_entries do share a
lot of similarities, and I did have designs to combine the two, but as
mentioned already... too much churn in an already massive patchset.

What follows are the 3 commits which existed discretely in the original
submissions. Upon rebasing on Broadwell support, it became clear that
separation was not good, and only made for more error prone code. Below
are the 3 commit messages with all their history.

drm/i915: Add bind/unbind object functions to VMA
drm/i915: Use the new vm [un]bind functions
drm/i915: reduce vm->insert_entries() usage

drm/i915: Add bind/unbind object functions to VMA

As we plumb the code with more VM information, it has become more
obvious that the easiest way to deal with bind and unbind is to simply
put the function pointers in the vm, and let those choose the correct
way to handle the page table updates. This change allows many places in
the code to simply be vm->bind, and not have to worry about
distinguishing PPGTT vs GGTT.

Notice that this patch has no impact on functionality. I've decided to
save the actual change until the next patch because I think it's easier
to review that way. I'm happy to squash the two, or let Daniel do it on
merge.

v2:
Make ggtt handle the quirky aliasing ppgtt
Add flags to bind object to support above
Don't ever call bind/unbind directly for PPGTT until we have real, full
PPGTT (use NULLs to assert this)
Make sure we rebind the ggtt if there already is a ggtt binding.  This
happens on set cache levels.
Use VMA for bind/unbind (Daniel, Ben)

v3: Reorganize ggtt_vma_bind to be more concise and easier to read
(Ville). Change logic in unbind to only unbind ggtt when there is a
global mapping, and to remove a redundant check if the aliasing ppgtt
exists.

v4: Make the bind function a bit smarter about the cache levels to avoid
unnecessary multiple remaps. "I accept it is a wart, I think unifying
the pin_vma / bind_vma could be unified later" (Chris)
Removed the git notes, and put version info here. (Daniel)

v5: Update the comment to not suck (Chris)

v6:
Move bind/unbind to the VMA. It makes more sense in the VMA structure
(always has, but I was previously lazy). With this change, it will allow
us to keep a distinct insert_entries.

Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
Signed-off-by: Ben Widawsky <ben@bwidawsk.net>

drm/i915: Use the new vm [un]bind functions

Building on the last patch which created the new function pointers in
the VM for bind/unbind, here we actually put those new function pointers
to use.

Split out as a separate patch to aid in review. I'm fine with squashing
into the previous patch if people request it.

v2: Updated to address the smart ggtt which can do aliasing as needed
Make sure we bind to global gtt when mappable and fenceable. I thought
we could get away without this initialy, but we cannot.

v3: Make the global GTT binding explicitly use the ggtt VM for
bind_vma(). While at it, use the new ggtt_vma helper (Chris)

At this point the original mailing list thread diverges. ie.

v4^:
use target_obj instead of obj for gen6 relocate_entry
vma->bind_vma() can be called safely during pin. So simply do that
instead of the complicated conditionals.
Don't restore PPGTT bound objects on resume path
Bug fix in resume path for globally bound Bos
Properly handle secure dispatch
Rebased on vma bind/unbind conversion

Signed-off-by: Ben Widawsky <ben@bwidawsk.net>

drm/i915: reduce vm->insert_entries() usage

FKA: drm/i915: eliminate vm->insert_entries()

With bind/unbind function pointers in place, we no longer need
insert_entries. We could, and want, to remove clear_range, however it's
not totally easy at this point. Since it's used in a couple of place
still that don't only deal in objects: setup, ppgtt init, and restore
gtt mappings.

v2: Don't actually remove insert_entries, just limit its usage. It will
be useful when we introduce gen8. It will always be called from the vma
bind/unbind.

Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk> (v1)
Signed-off-by: Ben Widawsky <ben@bwidawsk.net>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
2013-12-18 15:27:50 +01:00
Ben Widawsky
d7f46fc4e7 drm/i915: Make pin count per VMA
Signed-off-by: Ben Widawsky <ben@bwidawsk.net>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
2013-12-18 15:27:49 +01:00
Daniel Vetter
acc240d41e drm/i915: Fix use-after-free in do_switch
So apparently under ridiculous amounts of memory pressure we can get
into trouble in do_switch when we try to move the old hw context
backing storage object onto the active lists.

With list debugging enabled that usually results in us chasing a
poisoned pointer - which means we've hit upon a vma that has been
removed from all lrus with list_del (and then deallocated, so it's a
real use-after free).

Ian Lister has done some great callchain chasing and noticed that we
can reenter do_switch:

i915_gem_do_execbuffer()

i915_switch_context()

do_switch()
   from = ring->last_context;
   i915_gem_object_pin()

      i915_gem_object_bind_to_gtt()
         ret = drm_mm_insert_node_in_range_generic();
         // If the above call fails then it will try i915_gem_evict_something()
         // If that fails it will call i915_gem_evict_everything() ...
	 i915_gem_evict_everything()
	    i915_gpu_idle()
	       i915_switch_context(DEFAULT_CONTEXT)

Like with everything else where the shrinker or eviction code can
invalidate pointers we need to reload relevant state.

Note that there's no need to recheck whether a context switch is still
required because:

- Doing a switch to the same context is harmless (besides wasting a
  bit of energy).

- This can only happen with the default context. But since that one's
  pinned we'll never call down into evict_everything under normal
  circumstances. Note that there's a little driver bringup fun
  involved namely that we could recourse into do_switch for the
  initial switch. Atm we're fine since we assign the context pointer
  only after the call to do_switch at driver load or resume time. And
  in the gpu reset case we skip the entire setup sequence (which might
  be a bug on its own, but definitely not this one here).

Cc'ing stable since apparently ChromeOS guys are seeing this in the
wild (and not just on artificial stress tests), see the reference.

Note that in upstream code doesn't calle evict_everything directly
from evict_something, that's an extension in this product branch. But
we can still hit upon this bug (and apparently we do, see the linked
backtraces). I've noticed this while trying to construct a testcase
for this bug and utterly failed to provoke it. It looks like we need
to driver the system squarly into the lowmem wall and provoke the
shrinker to evict the context object by doing the last-ditch
evict_everything call.

Aside: There's currently no means to get a badly-fragmenting hw
context object away from a bad spot in the upstream code. We should
fix this by at least adding some code to evict_something to handle hw
contexts.

References: https://code.google.com/p/chromium/issues/detail?id=248191
Reported-by: Ian Lister <ian.lister@intel.com>
Cc: Ian Lister <ian.lister@intel.com>
Cc: stable@vger.kernel.org
Cc: Ben Widawsky <benjamin.widawsky@intel.com>
Cc: Stéphane Marchesin <marcheu@chromium.org>
Cc: Bloomfield, Jon <jon.bloomfield@intel.com>
Tested-by: Rafael Barbalho <rafael.barbalho@intel.com>
Reviewed-by: Ian Lister <ian.lister@intel.com>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
2013-12-06 13:09:11 +01:00
Chris Wilson
0d1430a3f4 drm/i915: Hold mutex across i915_gem_release
Inorder to serialise the closing of the file descriptor and its
subsequent release of client requests with i915_gem_free_request(), we
need to hold the struct_mutex in i915_gem_release(). Failing to do so
has the potential to trigger an OOPS, later with a use-after-free.

Testcase: igt/gem_close_race
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=70874
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=71029
Reported-by: Eric Anholt <eric@anholt.net>
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: stable@vger.kernel.org
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
2013-12-04 16:57:02 +01:00
Ben Widawsky
2f88542607 drm/i915: Remove defunct ctx switch comments
Signed-off-by: Ben Widawsky <ben@bwidawsk.net>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
2013-11-26 10:12:16 +01:00
Daniel Vetter
c09cd6e969 Merge branch 'backlight-rework' into drm-intel-next-queued
Pull in Jani's backlight rework branch. This was merged through a
separate branch to be able to sort out the Broadwell conflicts
properly before pulling it into the main development branch.

Conflicts:
	drivers/gpu/drm/i915/intel_display.c

Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
2013-11-15 10:02:39 +01:00
Ben Widawsky
8897644a6d drm/i915/bdw: HW context support
BDW context sizes varies a bit.

v2: Squash in fixup for the hw context size from Ben.

Signed-off-by: Ben Widawsky <ben@bwidawsk.net> (v1)
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
2013-11-08 18:09:37 +01:00
Ben Widawsky
8245be3139 drm/i915: Require HW contexts (when possible)
v2: Fixed the botched locking on init_hw failure in i915_reset (Ville)
Call cleanup_ringbuffer on failed context create in init_hw (Ville)

v3: Add dev argument ti clean_ringbuffer

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Signed-off-by: Ben Widawsky <ben@bwidawsk.net>
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@gmail.com>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
2013-11-07 09:35:44 +01:00
Ben Widawsky
71b76d004f drm/i915: cleanup context fini
I had this lying around from he original PPGTT series, and thought we
might try to get it in by itself.

With the introduction of context refcounting we never explicitly
ref/unref the backing object. As such, the previous fix was a bit wonky.

Aside from fixing the above, this patch also puts us in good shape for
an upcoming patch which allows a failure to occur in between
context_init and the first do_switch.

CC: Mika Kuoppala <mika.kuoppala@linux.intel.com>
Signed-off-by: Ben Widawsky <ben@bwidawsk.net>
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
2013-10-16 11:08:30 +02:00
Ben Widawsky
e2d05a8b1e drm/i915: Convert active API to VMA
Even though we track object activity and not VMA, because we have the
active_list be based on the VM, it makes the most sense to use VMAs in
the APIs.

NOTE: Daniel intends to eventually rip out active/inactive LRUs, but for
now, leave them be.

v2: Remove leftover hunk from the previous patch which didn't keep
i915_gem_object_move_to_active. That patch had to rely on the ring to
get the dev instead of the obj. (Chris)

Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
Signed-off-by: Ben Widawsky <ben@bwidawsk.net>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
2013-10-01 07:45:21 +02:00
Ben Widawsky
3ccfd19dea drm/i915: Do remaps for all contexts
On both Ivybridge and Haswell, row remapping information is saved and
restored with context. This means, we never actually properly supported
the l3 remapping because our sysfs interface is asynchronous (and not
tied to any context), and the known faulty HW would be reused by the
next context to run.

Not that due to the asynchronous nature of the sysfs entry, there is no
point modifying the registers for the existing context. Instead we set a
flag for all contexts to load the correct remapping information on the
next run. Interested clients can use debugfs to determine whether or not
the row has been remapped.

One could propose at this point that we just do the remapping in the
kernel. I guess since we have to maintain the sysfs interface anyway,
I'm not sure how useful it is, and I do like keeping the policy in
userspace; (it wasn't my original decision to make the
interface the way it is, so I'm not attached).

v2: Force a context switch when we have a remap on the next switch.
(Ville)
Don't let userspace use the interface with disabled contexts.

v3: Don't force a context switch, just let it nop
Improper context slice remap initialization, 1<<1 instead of 1<<i, but I
rewrote it to avoid a second round of confusion.
Error print moved to error path (All Ville)
Added a comment on why the slice remap initialization happens.

CC: Ville Syrjälä <ville.syrjala@linux.intel.com>
Signed-off-by: Ben Widawsky <ben@bwidawsk.net>
Reviewed-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
2013-09-19 20:39:56 +02:00
Ben Widawsky
a33afea5ff drm/i915: Keep a list of all contexts
I have implemented this patch before without creating a separate list
(I'm having trouble finding the links, but the messages ids are:
<1364942743-6041-2-git-send-email-ben@bwidawsk.net>
<1365118914-15753-9-git-send-email-ben@bwidawsk.net>)

However, the code is much simpler to just use a list and it makes the
code from the next patch a lot more pretty.

As you'll see in the next patch, the reason for this is to be able to
specify when a context needs to get L3 remapping. More details there.

Signed-off-by: Ben Widawsky <ben@bwidawsk.net>
Reviewed-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
2013-09-19 20:39:43 +02:00
Damien Lespiau
508842a036 drm/i915: It's its!
Signed-off-by: Damien Lespiau <damien.lespiau@intel.com>
Acked-by: Ben Widawsky <ben@bwidawsk.net>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
2013-09-04 17:34:54 +02:00
Chris Wilson
c0321e2c5a drm/i915: Do not add an interrupt for a context switch
We use the request to ensure we hold a reference to the context for the
duration that it remains in use by the ring. Each request only holds a
reference to the current context, hence we emit a request after
switching contexts with the final reference to the old context. However,
the extra interrupt caused by that request is not useful (no timing
critical function will wait for the context object), instead the overhead
of servicing the IRQ shows up in some (lightweight) benchmarks. In order
to keep the useful property of using the request to manage the context
lifetime, we want to add a dummy request that is associated with the
interrupt from the subsequent real request following the batch.

The extra interrupt was added as a side-effect of using
i915_add_request() in

commit 112522f678
Author: Chris Wilson <chris@chris-wilson.co.uk>
Date:   Thu May 2 16:48:07 2013 +0300

    drm/i915: put context upon switching

v2: Daniel convinced me that the request here was solely for context
lifetime tracking and that we have the active ref to keep the object
alive whilst the MI_SET_CONTEXT. So the only concern then is which
context should get the blame for MI_SET_CONTEXT failing. The old scheme
added a request for the old context so that any hang upto and including
the switch away would mark the old context as guilty. Now any hang here
implicates the new context. However since we have already gone through a
complete flush with the last context in its last request, and all that
lies in no-man's-land is an invalidate flush and the MI_SET_CONTEXT, we
should be safe in not unduly placing blame on the new context.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Ben Widawsky <ben@bwidawsk.net>
Cc: Paulo Zanoni <paulo.r.zanoni@intel.com>
Reviewed-by: Damien Lespiau <damien.lespiau@intel.com>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
2013-09-04 17:34:53 +02:00
Ben Widawsky
ca191b1313 drm/i915: mm_list is per VMA
formerly: "drm/i915: Create VMAs (part 5) - move mm_list"

The mm_list is used for the active/inactive LRUs. Since those LRUs are
per address space, the link should be per VMx .

Because we'll only ever have 1 VMA before this point, it's not incorrect
to defer this change until this point in the patch series, and doing it
here makes the change much easier to understand.

Shamelessly manipulated out of Daniel:
"active/inactive stuff is used by eviction when we run out of address
space, so needs to be per-vma and per-address space. Bound/unbound otoh
is used by the shrinker which only cares about the amount of memory used
and not one bit about in which address space this memory is all used in.
Of course to actual kick out an object we need to unbind it from every
address space, but for that we have the per-object list of vmas."

v2: only bump GGTT LRU in i915_gem_object_set_to_gtt_domain (Chris)

v3: Moved earlier in the series

v4: Add dropped message from v3

Signed-off-by: Ben Widawsky <ben@bwidawsk.net>
[danvet: Frob patch to apply and use vma->node.size directly as
discused with Ben. Also drop a needles BUG_ON before move_to_inactive,
the function itself has the same check.]
[danvet 2nd: Rebase on top of the lost "drm/i915: Cleanup more of VMA
in destroy", specifically unlink the vma from the mm_list in
vma_unbind (to keep it symmetric with bind_to_vm) instead of
vma_destroy.]
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
2013-08-08 14:06:58 +02:00
Chris Wilson
350ec881d9 drm/i915: Rename I915_CACHE_MLC_LLC to L3_LLC for Ivybridge
MLC_LLC was never validated for Sandybridge and was superseded by a new
level of cacheing for the GPU in Ivybridge. Update our names to be
consistent with usage, and in the process stop setting the unwanted bit
on Sandybridge.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
[danvet: s/BUG/WARN_ON(1) bikeshed.]
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
2013-08-06 16:35:30 +02:00
Ben Widawsky
c37e220461 drm/i915: Add VM to pin
To verbalize it, one can say, "pin an object into the given address
space." The semantics of pinning remain the same otherwise.

Certain objects will always have to be bound into the global GTT.
Therefore, global GTT is a special case, and keep a special interface
around for it (i915_gem_obj_ggtt_pin).

v2: s/i915_gem_ggtt_pin/i915_gem_obj_ggtt_pin

Signed-off-by: Ben Widawsky <ben@bwidawsk.net>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
2013-08-05 19:04:09 +02:00
Chris Wilson
11fa338404 drm/i915: Fix retrieval of hangcheck stats
The default context is always supported (as it contains the global
hangcheck stats) and the contexts for hangcheck are not limited
to any ring.

References: https://bugs.freedesktop.org/show_bug.cgi?id=65845
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Mika Kuoppala <mika.kuoppala@intel.com>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
2013-07-16 10:40:25 +02:00
Ben Widawsky
f343c5f647 drm/i915: Getter/setter for object attributes
Soon we want to gut a lot of our existing assumptions how many address
spaces an object can live in, and in doing so, embed the drm_mm_node in
the object (and later the VMA).

It's possible in the future we'll want to add more getter/setter
methods, but for now this is enough to enable the VMAs.

v2: Reworked commit message (Ben)
Added comments to the main functions (Ben)
sed -i "s/i915_gem_obj_set_color/i915_gem_obj_ggtt_set_color/" drivers/gpu/drm/i915/*.[ch]
sed -i "s/i915_gem_obj_bound/i915_gem_obj_ggtt_bound/" drivers/gpu/drm/i915/*.[ch]
sed -i "s/i915_gem_obj_size/i915_gem_obj_ggtt_size/" drivers/gpu/drm/i915/*.[ch]
sed -i "s/i915_gem_obj_offset/i915_gem_obj_ggtt_offset/" drivers/gpu/drm/i915/*.[ch]
(Daniel)

v3: Rebased on new reserve_node patch
Changed DRM_DEBUG_KMS to actually work (will need fixing later)

Signed-off-by: Ben Widawsky <ben@bwidawsk.net>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
2013-07-08 22:04:34 +02:00
Ben Widawsky
a0de80a0e0 drm/i915: Fix context sizes on HSW
With updates to the spec, we can actually see the context layout, and
how many dwords are allocated. That table suggests we need 70720 bytes
per HW context. Rounded up, this is 18 pages. Looking at what lives
after the current 4 pages we use, I can't see too much important (mostly
it's d3d related), but there are a couple of things which look scary. I
am hopeful this can explain some of our odd HSW failures.

v2: Make the context only 17 pages. The power context space isn't used
ever, and execlists aren't used in our driver, making the actual total
66944 bytes.

v3: Add a comment to the code. (Jesse & Paulo)

Reported-by: "Azad, Vinit" <vinit.azad@intel.com>
Cc: stable@vger.kernel.org
Reviewed-by: Jesse Barnes <jbarnes@virtuousgeek.org>
Signed-off-by: Ben Widawsky <ben@bwidawsk.net>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
2013-07-01 11:14:54 +02:00
Mika Kuoppala
0025c0772d drm/i915: change i915_add_request to macro
Only execbuffer needed all the parameters on i915_add_request().
By putting __i915_add_request behind macro, all current callsites
become cleaner. Following patch will introduce a new parameter
for __i915_add_request. With this patch, only the relevant callsite
will reflect the change making commit smaller and easier to understand.

v2: _i915_add_request as function name (Chris Wilson)

v3: change name __i915_add_request and fix ordering of params (Ben Widawsky)

Signed-off-by: Mika Kuoppala <mika.kuoppala@intel.com>
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
Acked-by: Ben Widawsky <ben@bwidawsk.net>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
2013-06-13 17:42:15 +02:00
Mika Kuoppala
c0bb617a70 drm/i915: add i915_gem_context_get_hang_stats()
To get context hang statistics for specified context,
add i915_gem_context_get_hang_stats().

For arb-robustness, every context needs to have its own
hang statistics tracking. Added function will return
the user specified context statistics or in case of
default context, statistics from drm_i915_file_private.

v2: handle default context inside get_reset_state

v3: return struct pointer instead of passing it in as param
    (Chris Wilson)

Signed-off-by: Mika Kuoppala <mika.kuoppala@intel.com>
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
Acked-by: Ben Widawsky <ben@bwidawsk.net>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
2013-06-13 17:42:15 +02:00
Ben Widawsky
bb0364130f drm/i915: context debug messages
Add some debug messages to help figure out what goes wrong on context
initialization.

Later in the PPGTT series, I ended up having a lot of failures after
reset. In many cases it was extra difficult to debug because I hadn't
even realized that contexts failed to reinitialize after reset (again an
artifact of some later patches).

This fairly benign patch does help debug some potential issues which
arise later.

Signed-off-by: Ben Widawsky <ben@bwidawsk.net>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
2013-05-31 20:53:58 +02:00
Damien Lespiau
8693a82487 drm/i915: Add references to some workaround we implement
We did not mention the workaround name when implementing those. This
should help us track what we already implement.

Signed-off-by: Damien Lespiau <damien.lespiau@intel.com>
Reviewed-by: Imre Deak <imre.deak@intel.com>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
2013-05-10 21:56:34 +02:00
Ben Widawsky
186507e9e8 drm/i915: Assert mutex_is_locked on context lookup
Because our context refcounting doesn't grab a ref at lookup time, it is
unsafe to do so without the lock.

NOTE: We don't have an easy way to put the assertion in the lookup
function which is where this really belongs. Context switching is good
enough because it actually asserts even more correctness by protecting
the default_context.

Signed-off-by: Ben Widawsky <ben@bwidawsk.net>
Reviewed-by: Jesse Barnes <jbarnes@virtuousgeek.org>
[danvet: s/BUG/WARN/]
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
2013-05-06 11:30:29 +02:00
Chris Wilson
112522f678 drm/i915: put context upon switching
In order to be notified of when the context and all of its associated
objects is idle (for if the context maps to a ppgtt) we need a callback
from the retire handler. We can arrange this by using the kref_get/put
of the context for request tracking and by inserting a request to
demarque the switch away from the old context.

[Ben: fixed minor error to patch compile, AND s/last_context/from/]
Signed-off-by: Ben Widawsky <ben@bwidawsk.net>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
2013-05-06 11:20:48 +02:00
Mika Kuoppala
168f836602 drm/i915: unreference default context on module unload
Before module unload is called, gpu_idle() will switch
to default context. This will increment ref count of base
object as the default context is 'running' on module unload
time. Unreference the drm object so that when context
is freed, base object is freed as well.

v2: added comment to explain the refcounts (Ben Widawsky)

Signed-off-by: Mika Kuoppala <mika.kuoppala@intel.com>
Reviewed-by: Ben Widawsky <ben@bwidawsk.net>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
2013-05-03 18:19:56 +02:00
Mika Kuoppala
dce3271b1e drm/i915: reference count for i915_hw_contexts
Enabling PPGTT and also the need to track which context was guilty of
gpu hang (arb robustness enabling) have put pressure for struct i915_hw_context
to be more than just a placeholder for hw context state.

In order to track object lifetime properly in a multi peer usage, add reference
counting for i915_hw_context.

v2: track i915_hw_context pointers instead of using ctx_ids
(from Chris Wilson)

v3 (Ben): Get rid of do_release() and handle refcounting more compactly.
(recommended by Chis)

v4: kref_* put inside static inlines (Daniel Vetter)
remove code duplication on freeing context (Chris Wilson)

v5: idr_remove and ctx->file_priv = NULL in destroy ioctl (Chris)
This actually will cause a problem if one destroys a context and later
refers to the idea of the context (multiple contexts may have the same
id, but only 1 will exist in the idr).

v6: Strip out the request related stuff. Reworded commit message.
Got rid of do_destroy and introduced i915_gem_context_release_handle,
suggested by Chris Wilson.

v7: idr_remove can't be called inside idr_for_each (Chris Wilson)

Signed-off-by: Ben Widawsky <ben@bwidawsk.net> (v5)
Signed-off-by: Mika Kuoppala <mika.kuoppala@intel.com> (v7)
Reviewed-by: Ben Widawsky <ben@bwidawsk.net>
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
[danvet: Squash sob lines, the patch ping-ponged between Ben and Mika
a bit ...]
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
2013-04-30 23:40:13 +02:00
Chris Wilson
4615d4c9e2 drm/i915: Use MLC (l3$) for context objects
Enabling context support increases SwapBuffers latency by about 20%
(measured on an i7-3720qm). We can offset that loss slightly by enabling
faster caching for the contexts. As they are not backed by any
particular cache (such as the sampler or render caches) our only option
is to select the generic mid-level cache. This reduces the latency of
the swap by about 5%.

Oddly this effect can be observed running smokin-guns on IVB at
1280x1024:
Using BLT copies for swaps: 151.67 fps
Using Render copies for swaps (unpatched):  141.70 fps
With contexts disabled: 150.23 fps
With contexts in L3$: 150.77 fps

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Ben Widawsky <ben@bwidawsk.net>
Cc: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
2013-04-18 09:43:11 +02:00
Tejun Heo
c8c470afe3 drm/i915: convert to idr_alloc()
Convert to the much saner new idr interface.

Signed-off-by: Tejun Heo <tj@kernel.org>
Acked-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Acked-by: David Airlie <airlied@linux.ie>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2013-02-27 19:10:16 -08:00
Ben Widawsky
f73f760725 drm/i915/ctx: Remove bad invariant
It's not that the assertion is incorrect, but rather that we can call
do_destroy early in loading, and we will falsely BUG().

Since contexts have been in for a while now, and in the internal APIs
are pretty stable, it should be fairly safe to remove this.

v2: Remove unused dev_priv, and dev

Signed-off-by: Ben Widawsky <ben@bwidawsk.net>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
2013-02-15 10:30:40 +01:00
Ben Widawsky
07ea0d85ac drm/i915: Clarify HW context size logic
This was a rebase error from when the patches originally landed. Since
the context size is unsigned, there is also no use in checking if it's
less than 0.

The existing code is not really wrong, but it's not simple as it should
be.

Signed-off-by: Ben Widawsky <ben@bwidawsk.net>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
2013-02-15 10:30:36 +01:00
Chris Wilson
9d7730914f drm/i915: Preallocate next seqno before touching the ring
Based on the work by Mika Kuoppala, we realised that we need to handle
seqno wraparound prior to committing our changes to the ring. The most
obvious point then is to grab the seqno inside intel_ring_begin(), and
then to reuse that seqno for all ring operations until the next request.
As intel_ring_begin() can fail, the callers must already be prepared to
handle such failure and so we can safely add further checks.

This patch looks like it should be split up into the interface
changes and the tweaks to move seqno wrapping from the execbuffer into
the core seqno increment. However, I found no easy way to break it into
incremental steps without introducing further broken behaviour.

v2: Mika found a silly mistake and a subtle error in the existing code;
inside i915_gem_retire_requests() we were resetting the sync_seqno of
the target ring based on the seqno from this ring - which are only
related by the order of their allocation, not retirement. Hence we were
applying the optimisation that the rings were synchronised too early,
fortunately the only real casualty there is the handling of seqno
wrapping.

v3: Do not forget to reset the sync_seqno upon module reinitialisation,
ala resume.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Mika Kuoppala <mika.kuoppala@intel.com>
Bugzilla: https://bugzilla.redhat.com/show_bug.cgi?id=863861
Reviewed-by: Mika Kuoppala <mika.kuoppala@intel.com> [v2]
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
2012-11-29 11:43:52 +01:00
Ben Widawsky
f94982b002 drm/i915: Allocate the proper size for contexts.
Whoops. This was fixed previously, but not sure how it got lost. It's
not needed for -fixes or stable because at the moment
drm_i915_file_private is way bigger than i915_hw_context (by 120 bytes
on my 64b build).

Signed-off-by: Ben Widawsky <ben@bwidawsk.net>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
2012-11-11 23:51:47 +01:00
Dave Airlie
1f31c69dac Merge branch 'drm-intel-fixes' of git://people.freedesktop.org/~danvet/drm-intel into drm-next
Daniel writes:

Bigger -fixes pile, mostly because I've included Ajax' DP dongle stuff,
as discussed on irc. Otherwise just small things:
- regression fix to finally make 6bpc auto-dither on dp work (Jani)
- reinstate an snb ctx w/a that accidentally got lost in a rework (Chris)
- fixup the DP train sequence, logic-goof-up uncovered by Coverty (Chris)
- fix set_caching locking (Ben)
- fix spurious segfault on con-current gtt mmap faulting (Dimitry and Mika)
- some pageflip correctness fixes (still hunting down some issues, but
  these are the worst offenders of confused code that we've tracked down
  thus far) from Chris and me
- fixup swizzling settings on vlv (Jesse)
- gt_mode w/a from Ben added, fixes snb gt1 rc6+hw ctx hangs.

* 'drm-intel-fixes' of git://people.freedesktop.org/~danvet/drm-intel:
  drm/i915: Fix GT_MODE default value
  drm/i915: don't frob the vblank ts in finish_page_flip
  drm/i915: call drm_handle_vblank before finish_page_flip
  drm/i915: print warning if vmi915_gem_fault error is not handled
  drm/i915: EBUSY status handling added to i915_gem_fault().
  drm/i915: Try harder to complete DP training pattern 1
  drm/i915: set swizzling to none on VLV
  drm/dp: Make sink count DP 1.2 aware
  drm/dp: Document DP spec versions for various DPCD registers
  drm/i915/dp: Be smarter about connection sense for branch devices
  drm/i915/dp: Fetch downstream port info if needed during DPCD fetch
  drm/dp: Update DPCD defines
  drm: Export drm_probe_ddc()
  drm/i915: Flush the pending flips on the CRTC before modification
  drm/i915: Actually invalidate the TLB for the SandyBridge HW contexts w/a
  drm/i915: Fix set_caching locking
  drm/i915: use adjusted_mode instead of mode for checking the 6bpc force flag
2012-10-07 21:13:54 +10:00
Linus Torvalds
612a9aab56 Merge branch 'drm-next' of git://people.freedesktop.org/~airlied/linux
Pull drm merge (part 1) from Dave Airlie:
 "So first of all my tree and uapi stuff has a conflict mess, its my
  fault as the nouveau stuff didn't hit -next as were trying to rebase
  regressions out of it before we merged.

  Highlights:
   - SH mobile modesetting driver and associated helpers
   - some DRM core documentation
   - i915 modesetting rework, haswell hdmi, haswell and vlv fixes, write
     combined pte writing, ilk rc6 support,
   - nouveau: major driver rework into a hw core driver, makes features
     like SLI a lot saner to implement,
   - psb: add eDP/DP support for Cedarview
   - radeon: 2 layer page tables, async VM pte updates, better PLL
     selection for > 2 screens, better ACPI interactions

  The rest is general grab bag of fixes.

  So why part 1? well I have the exynos pull req which came in a bit
  late but was waiting for me to do something they shouldn't have and it
  looks fairly safe, and David Howells has some more header cleanups
  he'd like me to pull, that seem like a good idea, but I'd like to get
  this merge out of the way so -next dosen't get blocked."

Tons of conflicts mostly due to silly include line changes, but mostly
mindless.  A few other small semantic conflicts too, noted from Dave's
pre-merged branch.

* 'drm-next' of git://people.freedesktop.org/~airlied/linux: (447 commits)
  drm/nv98/crypt: fix fuc build with latest envyas
  drm/nouveau/devinit: fixup various issues with subdev ctor/init ordering
  drm/nv41/vm: fix and enable use of "real" pciegart
  drm/nv44/vm: fix and enable use of "real" pciegart
  drm/nv04/dmaobj: fixup vm target handling in preparation for nv4x pcie
  drm/nouveau: store supported dma mask in vmmgr
  drm/nvc0/ibus: initial implementation of subdev
  drm/nouveau/therm: add support for fan-control modes
  drm/nouveau/hwmon: rename pwm0* to pmw1* to follow hwmon's rules
  drm/nouveau/therm: calculate the pwm divisor on nv50+
  drm/nouveau/fan: rewrite the fan tachometer driver to get more precision, faster
  drm/nouveau/therm: move thermal-related functions to the therm subdev
  drm/nouveau/bios: parse the pwm divisor from the perf table
  drm/nouveau/therm: use the EXTDEV table to detect i2c monitoring devices
  drm/nouveau/therm: rework thermal table parsing
  drm/nouveau/gpio: expose the PWM/TOGGLE parameter found in the gpio vbios table
  drm/nouveau: fix pm initialization order
  drm/nouveau/bios: check that fixed tvdac gpio data is valid before using it
  drm/nouveau: log channel debug/error messages from client object rather than drm client
  drm/nouveau: have drm debugging macros build on top of core macros
  ...
2012-10-03 23:29:23 -07:00