To get the fun stuff out of the way, the legacy hws is allocated by
userspace when the gpu needs a gfx hws. And there's no reference-counting
going on, so userspace can simply screw everyone over.
At least it's not as horrible as i810, where the ringbuffer is allocated
by userspace ...
We can't fix this disaster, but we can at least tidy up the code a
bit to make things clearer:
- Drop the drm ioremap indirection.
- Add a new new read_legacy_status_page to paper over the differences
between the legacy gfx hws and the physical hws shared with the
new ringbuffer code.
- Add a pointer in dev_priv->dri1 for the cpu addresses - that one is
an iomem remapping as opposed to all other hw status pages. This is
just prep work to make sparse happy.
Acked-by: Jesse Barnes <jbarnes@virtuousgeek.org>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
We kzalloc dev_priv, and we never use hws_map in intel_ringbuffer.c.
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
Acked-by: Jesse Barnes <jbarnes@virtuousgeek.org>
Signed-Off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
They're now in intel_pm.c, so group them a bit better.
Acked-by: Jesse Barnes <jbarnes@virtuousgeek.org>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
We now have a nice home for power management code, so let's use it!
v2: Resolve conflict agains "Only enable IPS polling for gen5"
Acked-by: Jesse Barnes <jbarnes@virtuousgeek.org>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Unfortunately there has been dri1 userspace that used gem to manage
the gtt and hence also needed cliprects in the execbuf ioctl. So
we can't ever remove that code without breaking the ioctl abi.
But at least we can disable it on gen5+, because these horrible
versions of mesa have not supported these chips.
Acked-by: Jesse Barnes <jbarnes@virtuousgeek.org>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Wohoo!
Now we only need to move all the gem/kms stuff that accidentally
landed in i915_dma.c out of it, and this will be our legacy dri1
grave-yard.
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
Acked-by: Jesse Barnes <jbarnes@virtuousgeek.org>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
... and hide it in i915_dma.c.
This way all the legacy stuff dealing with READ_BREADCRUMB and
LP_RING and friends is in i915_dma.c.
v2: Rebase on top of Chris Wilson's rework irq handling code.
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
Acked-by: Jesse Barnes <jbarnes@virtuousgeek.org>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Let's just get this out of the way.
v2: Rebase against ENODEV changes.
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
Acked-by: Jesse Barnes <jbarnes@virtuousgeek.org>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
We never supported dri1 on gen5+.
VLV never had that code, so no need to remove it.
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
Acked-by: Jesse Barnes <jbarnes@virtuousgeek.org>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
This is a pretty racy way to close these races, and we have
much better means to cope with these races meanwhile: For
non-broken userspace we correctly wait for any outstanding
rendering, for broken userspace the hangcheck will save the
day.
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
Acked-by: Jesse Barnes <jbarnes@virtuousgeek.org>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
The LP refers to 'low priority' as opposed to the high priority
ring on gen2/3. So lets constrain its use to the code of that era.
Unfortunately we can't yet completely remove the associated
macros from common headers and shove them into i915_dma.c to
the other dri1 legacy support code, a few cleanups are still
missing for that.
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
Acked-by: Jesse Barnes <jbarnes@virtuousgeek.org>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Assigned in setparam, used never.
I didn't bother to dig through the archives to figure out what
this was supposed to do.
Acked-by: Jesse Barnes <jbarnes@virtuousgeek.org>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Even the horrible gen3 XvMC code has learned to do this
right by the time xf86-video-intel releases learned to do
kernel modesetting. So we can just disallow this.
Acked-by: Jesse Barnes <jbarnes@virtuousgeek.org>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
... and shove allow_batchbuffer in there. More dragons will
follow suit.
There's the curious case that we allow this for KMS ...
Acked-by: Jesse Barnes <jbarnes@virtuousgeek.org>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
i915_dma.c contains most of the old dri1 horror-show, so move
the remaining bits there, too. The code has been removed and
the only thing left are some stubs to ensure that userspace
doesn't try to use this stuff. vblank_pipe_set only returns 0
without any side-effects, so we can even stub it out with
the canonical drm_noop.
v2: Rebase against ENODEV changes.
Acked-by: Jesse Barnes <jbarnes@virtuousgeek.org>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
vblank_pipe was intended to be used for tracking DRI1 state. However,
the vblank_pipe reported to DRI1 is fixed to umask both pipes, and the
dev_priv->vblank_pipe unused and superfluous.
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
On SandyBridge IPS was entirely implemented in hardware and not reliant
on the driver monitoring power consumption and feeding back desired run
states, so the hardware is able to adapt quicker and more flexibly. Which
is a huge relief for us as we no longer have to carry empirically
derived magic algorithms.
Yet despite the advance in technology, the driver was still doing its
IPS polling on all machines. Restrict it to the only supported hardware,
Clarkdale/Arrandale.
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Jesse Barnes <jbarnes@virtuousgeek.org>
Signed-Off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Ben Widawsky <ben@bwidawsk.net>
Signed-Off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
It turns out throttle had an almost identical bit of code to do the
wait. Now we can call the new helper directly. This is just a bonus,
and not needed for the overall series.
v2: remove irq_get/put which is now in __wait_seqno (Ben)
Signed-off-by: Ben Widawsky <ben@bwidawsk.net>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
It's about to go away anyway. Just here to help bisection.
Signed-off-by: Ben Widawsky <ben@bwidawsk.net>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
i915_wait_request is actually a fairly large function encapsulating
quite a few different operations. Because being able to wait on seqnos
in various conditions is useful, extracting that bit of code to a helper
function seems useful
v2: pull the irq_get/put as well (Ben)
Signed-off-by: Ben Widawsky <ben@bwidawsk.net>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
The only time irq_get should fail is during unload or suspend. Both of
these points should try to quiesce the GPU before disabling interrupts
and so the atomic polling should never occur.
This was recommended by Chris Wilson as a way of reducing added
complexity to the polled wait which I introduced in an RFC patch.
09:57 < ickle_> it's only there as a fudge for waiting after irqs
after uninstalled during s&r, we aren't actually meant to hit it
09:57 < ickle_> so maybe we should just kill the code there and fix the breakage
v2: return -ENODEV instead of -EBUSY when irq_get fails
Signed-off-by: Ben Widawsky <ben@bwidawsk.net>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
The waiting_seqno is not terribly useful, and as such we can remove it
so that we'll be able to extract lockless code.
v2: Keep the information for error_state (Chris)
Check if ring is initialized in hangcheck (Chris)
Capture the waiting ring (Chris)
Signed-off-by: Ben Widawsky <ben@bwidawsk.net>
[danvet: add some bikeshed to clarify a comment.]
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
This extra bit of interrupt enabling code doesn't belong in the wait
seqno function. If anything we should pull it out to a helper so the
throttle code can also use it. The history is a bit vague, but I am
going to attempt to just dump it, unless someone can argue otherwise.
Removing this allows for a shared lock free wait seqno function. To keep
tabs on this issue though, the IER value is stored on error capture
(recommended by Chris Wilson)
v2: fixed typo EIR->IER (Ben)
Fix some white space (Ben)
Move IER capture to globally instead of per ring (Ben)
Signed-off-by: Ben Widawsky <ben@bwidawsk.net>
[danvet: ier is a 16 bit reg on gen2!]
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
This originates from a hack by me to quickly fix a bug in an earlier
patch where we needed control over whether or not waiting on a seqno
actually did any retire list processing. Since the two operations aren't
clearly related, we should pull the parameter out of the wait function,
and make the caller responsible for retiring if the action is desired.
The only function call site which did not get an explicit retire_request call
(on purpose) is i915_gem_inactive_shrink(). That code was already calling
retire_request a second time.
v2: don't modify any behavior excepit i915_gem_inactive_shrink(Daniel)
Signed-off-by: Ben Widawsky <ben@bwidawsk.net>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
I've missed this one.
v2: Chris Wilson noticed another register.
v3: Color choice improvements.
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
Signed-Off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Jesse Barnes <jbarnes@virtuousgeek.org>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Since there is only one remaining user of I915_INTERRUPT_ENABLE_FIX,
expand it at the callsite. Quoting Jesse Barnes:
"I'd really like to get rid of these defines at the top of i915_irq.c.
Some are unused and the others just make you check for the right bits
everytime your read the code."
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Jesse Barnes <jbarnes@virtuousgeek.org>
[danvet: Add bikeshed suggested by Jesse.]
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
We appear to allow too many pending pageflips as evidenced by an
apparent pin-leak. So borrow the pageflip completion logic from i8xx for
handling PendingFlip in a robust manner.
v2: Address Jesse's reminders about the nuances of gen3 IRQ handling.
References: https://bugzilla.kernel.org/show_bug.cgi?id=41882
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Jesse Barnes <jbarnes@virtuousgeek.org>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Bring the for-each-pipe loops together so that the code is easier on the
eyes.
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Jesse Barnes <jbarnes@virtuousgeek.org>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Jesse Barnes <jbarnes@virtuousgeek.org>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
A couple of miscellaneous cleanups as well to move per-loop condition
variables within the scope of the loop and the update of the DRI1
breadcrumb to the tail of the function.
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Jesse Barnes <jbarnes@virtuousgeek.org>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
And a couple of miscellaneous cleanups to the main body of the IRQ loop;
move per-loop condition variables within the scope of the loop and move
the old DRI1 breadcrumb to the tail of the function and so only execute
it once.
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Jesse Barnes <jbarnes@virtuousgeek.org>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
On later gen3, you are able to select the meaning of the FlipPending
status bit in IIR and change it to FlipDone. This was sometimes done by
the BIOS leading to confusion on just how pageflipping worked on gen3.
Simplify the implementation by using the legacy meaning for all gen3
machines.
Note: this makes all gen3 machines equally broken...
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Jesse Barnes <jbarnes@virtuousgeek.org>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
In preparation for rewriting the gen3 irq handler.
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
And remove the cargo-culted copy from the valleyview irq handler.
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Jesse Barnes <jbarnes@virtuousgeek.org>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
The waitqueues are already initialised during ring initialisation so
kill the redundant and duplicated code to do so in each generations IRQ
installer.
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Jesse Barnes <jbarnes@virtuousgeek.org>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Rather than duplicate similar code across the IRQ installers, perform
the initialisation of the workers upfront. This will lead to simpler
teardown and quiescent code as we can assume that the workers have
been initialised.
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Jesse Barnes <jbarnes@virtuousgeek.org>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Calling these when gem assumes full control of the hw won't end
in anything else than tears. So be a bit more paranoid here.
Just serves as documentation.
v2: Bail out with ENODEV as suggested by Chris Wilson.
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Always true these days. It has been added originally to work
around some issues with the agp layer in 2.6.29:
commit ac5c4e7618
Author: Dave Airlie <airlied@redhat.com>
Date: Fri Dec 19 15:38:34 2008 +1000
drm/i915: GEM on PAE has problems - disable it for now.
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
Signed-Off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
We always set it so there's no point in checking. We could
instead add a bit that tells us whether gem is actually
initialized (i.e. either kms or gem_init_ioctl called), but
that's imho not worth it.
So just rip it out.
There's a little change in the wait_ring timeout, but we've never
run with anything else than the 60 second timeout, even on dri1
userspace.
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
This ioctl used in a kms driver is only useful to create massive
havoc.
v2: Bail out with -ENODEV as suggested by Chris Wilson.
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
Signed-Off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Also ditch the cargo-culted dev_priv checks - either we have a
giant hole in our setup code or this is useless. Plainly bogus
to check for it in either case.
v2: Chris Wilson noticed that I've missed one bogus dev_priv check.
v3: The check in the overlay code is redundant (Chris)
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
The specs recommend that this bit be set on PineView. No reason is
given, but it sounds like a powersaving bit that we should expect the
BIOS to be setting...
v2: Rebase on top of _MASKED_ENABLE_BIT
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
We slightly modify the initialisation sequence to move the
initialisation of the memory managers earlier and in particular before
probing outputs and detecting any existing output configuration. This is
essential if we wish to track preallocated objects and preserve them
whilst initialising GEM.
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
The use of the mm_list by deferred-free breaks the following patches to
extend the range of objects tracked. We can simplify things if we just
make the unbind during free uninterrutible.
Note that unbinding should never fail, because we hold an additional
reference on every active object. Only the ilk vt-d workaround breaks
this, but already takes care of not failing by waiting for the gpu to
quiescent non-interruptible. But the existence of the deferred free
list casted some doubts on this theory, hence WARN if the unbind fails
and only then retry non-interruptible.
We can kill this additional code after a release in case the theory is
indeed right and no one has hit that WARN.
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Simplify object tracking by removing the inactive but pinned list. The
only place where this was used is for counting the available memory,
which is just as easy performed by checking all objects on the rare
occasions it is required (application startup). For ease of debugging,
we keep the reporting of pinned objects through the error-state and
debugfs.
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
This was only used by one external caller who would just be as happy
with evict-everything, so perform the replacement and make the function
private.
In the process we note that unbinding the inactive list should not fail,
and make it a warning instead.
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Currently, we only bump the inactive LRU of an object when we bind
into the GTT for a page-fault. As the object may be used many times
before its mapping is zapped, we do not mark it as active as
frequently as we should. Userspace should be calling set-to-GTT-domain
before each pointer deference (for synchronous access) and so is a
good place to mark the buffer as active.
Marking the buffer as recently used places it at the end of the
inactive eviction queue, though still before anything with outstanding
rendering. This reduces the likelihood of evicting a buffer that is
going to be used again by the CPU in the near future. This way we can
hopefully avoid to kick out upload buffers right before we use them on
the gpu.
Note that we need to check that the object is not active or pinned,
for otherwise we create havoc on the active/pinned lists, which also
use obj->mm_list.
The active lists are sorted by and evicted in last GPU rendering
order, access by the CPU to a still active buffer therefore does not
affect its eviction ordering. Pinned objects are currently excluded
from eviction, therefore the only list that we need to bump for GTT
access by the CPU is the inactive list.
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
[danvet: Added further explanations to the commit message as discussed
on irc.]
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>