Commit Graph

119 Commits

Author SHA1 Message Date
Sean Paul
f01db988ef drm/i915: Add wait_for in init_ring_common
I have seen a number of "blt ring initialization failed" messages
where the ctl or start registers are not the correct value. Upon further
inspection, if the code just waited a little bit, it would read the
correct value. Adding the wait_for to these reads should eliminate the
issue.

Signed-off-by: Sean Paul <seanpaul@chromium.org>
Reviewed-by: Ben Widawsky <ben@bwidawsk.net>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
2012-03-18 19:10:06 +01:00
Chris Wilson
a71d8d9452 drm/i915: Record the tail at each request and use it to estimate the head
By recording the location of every request in the ringbuffer, we know
that in order to retire the request the GPU must have finished reading
it and so the GPU head is now beyond the tail of the request. We can
therefore provide a conservative estimate of where the GPU is reading
from in order to avoid having to read back the ring buffer registers
when polling for space upon starting a new write into the ringbuffer.

A secondary effect is that this allows us to convert
intel_ring_buffer_wait() to use i915_wait_request() and so consolidate
upon the single function to handle the complicated task of waiting upon
the GPU. A necessary precaution is that we need to make that wait
uninterruptible to match the existing conditions as all the callers of
intel_ring_begin() have not been audited to handle ERESTARTSYS
correctly.

By using a conservative estimate for the head, and always processing all
outstanding requests first, we prevent a race condition between using
the estimate and direct reads of I915_RING_HEAD which could result in
the value of the head going backwards, and the tail overflowing once
again. We are also careful to mark any request that we skip over in
order to free space in ring as consumed which provides a
self-consistency check.

Given sufficient abuse, such as a set of unthrottled GPU bound
cairo-traces, avoiding the use of I915_RING_HEAD gives a 10-20% boost on
Sandy Bridge (i5-2520m):
  firefox-paintball  18927ms -> 15646ms: 1.21x speedup
  firefox-fishtank   12563ms -> 11278ms: 1.11x speedup
which is a mild consolation for the performance those traces achieved from
exploiting the buggy autoreported head.

v2: Add a few more comments and make request->tail a conservative
estimate as suggested by Daniel Vetter.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
[danvet: resolve conflicts with retirement defering and the lack of
the autoreport head removal (that will go in through -fixes).]
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
2012-02-15 14:26:03 +01:00
Daniel Vetter
99ffa1629d drm/i915: enable forcewake voodoo also for gen6
We still have reports of missed irqs even on Sandybridge with the
HWSTAM workaround in place. Testing by the bug reporter gets rid of
them with the forcewake voodoo and no HWSTAM writes.

Because I've slightly botched the rebasing I've left out the ACTHD
readback which is also required to get IVB working. Seems to still
work on the tester's machine, so I think we should go with the more
minmal approach on SNB. Especially since I've only found weak evidence
for holding forcewake while waiting for an interrupt to arrive, but
none for the ACTHD readback.

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=45181
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=45332
Tested-by: Nicolas Kalkhof nkalkhof()at()web.de
Acked-by: Chris Wilson <chris@chris-wilson.co.uk>
Signed-Off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
2012-02-13 10:57:07 +01:00
Daniel Vetter
53d227f282 drm/i915: fixup seqno allocation logic for lazy_request
Currently we reserve seqnos only when we emit the request to the ring
(by bumping dev_priv->next_seqno), but start using it much earlier for
ring->oustanding_lazy_request. When 2 threads compete for the gpu and
run on two different rings (e.g. ddx on blitter vs. compositor)
hilarity ensued, especially when we get constantly interrupted while
reserving buffers.

Breakage seems to have been introduced in

commit 6f392d5486
Author: Chris Wilson <chris@chris-wilson.co.uk>
Date:   Sat Aug 7 11:01:22 2010 +0100

    drm/i915: Use a common seqno for all rings.

This patch fixes up the seqno reservation logic by moving it into
i915_gem_next_request_seqno. The ring->add_request functions now
superflously still return the new seqno through a pointer, that will
be refactored in the next patch.

Note that with this change we now unconditionally allocate a seqno,
even when ->add_request might fail because the rings are full and the
gpu died. But this does not open up a new can of worms because we can
already leave behind an outstanding_request_seqno if e.g. the caller
gets interrupted with a signal while stalling for the gpu in the
eviciton paths. And with the bugfix we only ever have one seqno
allocated per ring (and only that ring), so there are no ordering
issues with multiple outstanding seqnos on the same ring.

v2: Keep i915_gem_get_seqno (but move it to i915_gem.c) to make it
clear that we only have one seqno counter for all rings. Suggested by
Chris Wilson.

v3: As suggested by Chris Wilson use i915_gem_next_request_seqno
instead of ring->oustanding_lazy_request to make the follow-up
refactoring more clearly correct. Also improve the commit message
with issues discussed on irc.

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=45181
Tested-by: Nicolas Kalkhof nkalkhof()at()web.de
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
Signed-Off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
2012-02-13 10:55:57 +01:00
Daniel Vetter
9edd576d89 Merge remote-tracking branch 'airlied/drm-fixes' into drm-intel-next-queued
Back-merge from drm-fixes into drm-intel-next to sort out two things:

- interlaced support: -fixes contains a bugfix to correctly clear
  interlaced configuration bits in case the bios sets up an interlaced
  mode and we want to set up the progressive mode (current kernels
  don't support interlaced). The actual feature work to support
  interlaced depends upon (and conflicts with) this bugfix.

- forcewake voodoo to workaround missed IRQ issues: -fixes only enabled
  this for ivybridge, but some recent bug reports indicate that we
  need this on Sandybridge, too. But in a slightly different flavour
  and with other fixes and reworks on top. Additionally there are some
  forcewake cleanup patches heading to -next that would conflict with
  currrent -fixes.

Signed-Off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
2012-02-10 17:14:49 +01:00
Daniel Vetter
6a233c7887 drm/i915/ringbuffer: kill snb blt workaround
This was just to facilitate product enablement with pre-production hw.
Allows us to kill quite a bit of cruft.

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Eric Anholt <eric@anholt.net>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
2012-01-29 17:50:40 +01:00
Daniel Vetter
96154f2fab drm/i915: switch ring->id to be a real id
... and add a helpr function for the places where we want a flag.

This way we can use ring->id to index into arrays.

v2: Resurrect the missing beautification-space Chris Wilson noted.
I'm moving this space around because I'll reuse ring_str in the next
patch.

Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Ben Widawsky <ben@bwidawsk.net>
Reviewed-by: Eugeni Dodonov <eugeni.dodonov@intel.com>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
2012-01-29 17:32:58 +01:00
Eric Anholt
8d79c3490a drm/i915: Remove the MI_FLUSH_ENABLE setting.
We have always been using the wrong bit -- it's bit 12.  However, the
bit also doesn't do anything -- hardware has always accepted the
MI_FLUSH command even when it was specced not to.

Given that there is only one MI_FLUSH emitted in all of the driver
stack on gen6+ (in i965_video.c of the 2d driver, and it should be
using other code to do its flush instead), just remove the MI_FLUSH
enable instead of trying to fix it.

Signed-off-by: Eric Anholt <eric@anholt.net>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Ben Widawsky <ben@bwidawsk.net>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
2012-01-25 09:32:21 +01:00
Keith Packard
8f0fc977f5 Revert "drm/i915: Work around gen7 BLT ring synchronization issues."
This reverts commit 42ff6572e5.

New forcewake voodoo makes this no longer necessary.

Acked-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Signed-off-by: Keith Packard <keithp@keithp.com>
2012-01-20 10:20:44 -08:00
Daniel Vetter
4cd53c0c8b drm/i915: paper over missed irq issues with force wake voodoo
Two things seem to do the trick on my ivb machine here:
- prevent the gt from powering down while waiting for seqno
  notification interrupts by grabbing the force_wake in get_irq (and
  dropping it in put_irq again).
- ordering writes from the ring's CS by reading a CS register, ACTHD
  seems to work.

Only the blt&bsd ring on ivb seem to be massively affected by this,
but for paranoia do this dance also on the render ring and on snb
(i.e. all gpus with forcewake).

Tested with Eric's glCopyPixels loop which without this patch scores a
missed irq every few seconds.

This patch needs my forcewake rework to use a spinlock instead of
dev->struct_mutex.

After crawling through docs a lot I've found the following nugget:

Internal doc "SNB GT PM Programming Guide", Section 4.3.1:

"GT does not generate interrupts while in RC6 (by design)"

So it looks like rc6 and irq generation are indeed related.

v2: Improve the comment per Eugeni Dodonov's suggestion.

v3: Add the documentation snipped. Also restrict the w/a to ivb only
for -fixes, as suggested by Keith Packard.

Cc: stable@kernel.org
Cc: Eric Anholt <eric@anholt.net>
Cc: Kenneth Graunke <kenneth@whitecape.org>
Cc: Eugeni Dodonov <eugeni.dodonov@intel.com>
Tested-by: Eugeni Dodonov <eugeni.dodonov@intel.com>
Reviewed-by: Eugeni Dodonov <eugeni.dodonov@intel.com>
Signed-Off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Signed-off-by: Keith Packard <keithp@keithp.com>
2012-01-19 12:28:57 -08:00
Daniel Vetter
e6bfaf8542 drm/i915: don't bail out of intel_wait_ring_buffer too early
In the pre-gem days with non-existing hangcheck and gpu reset code,
this timeout of 3 seconds was pretty important to avoid stuck
processes.

But now we have the hangcheck code in gem that goes to great length
to ensure that the gpu is really dead before declaring it wedged.

So there's no need for this timeout anymore. Actually it's even harmful
because we can bail out too early (e.g. with xscreensaver slip)
when running giant batchbuffers. And our code isn't robust enough
to properly unroll any state-changes, we pretty much rely on the gpu
reset code cleaning up the mess (like cache tracking, fencing state,
active list/request tracking, ...).

With this change intel_begin_ring can only fail when the gpu is
wedged, and it will return -EAGAIN (like wait_request in case the
gpu reset is still outstanding).

v2: Chris Wilson noted that on resume timers aren't running and hence
we won't ever get kicked out of this loop by the hangcheck code. Use
an insanely large timeout instead for the HAS_GEM case to prevent
resume bugs from totally hanging the machine.

Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
Acked-by: Ben Widawsky <ben@bwidawsk.net>
Reviewed-by: Eugeni Dodonov <eugeni.dodonov@intel.com>
Signed-off-by: Keith Packard <keithp@keithp.com>
2012-01-03 10:26:31 -08:00
Eric Anholt
42ff6572e5 drm/i915: Work around gen7 BLT ring synchronization issues.
Previous to this commit, testing easily reproduced a failure where the
seqno would apparently arrive after the IRQ associated with it, with test programs as simple as:

for (;;) {
    glCopyPixels(0, 0, 1, 1);
    glFinish();
}

Various workarounds we've seen for previous generations didn't work to
fix this issue, so until new information comes in, replace the IRQ
waits on the BLT ring with polling.

Signed-off-by: Eric Anholt <eric@anholt.net>
Tested-by: Eugeni Dodonov <eugeni.dodonov@intel.com>
Reviewed-by: Eugeni Dodonov <eugeni.dodonov@intel.com>
Acked-by: Kenneth Graunke <kenneth@whitecape.org>
Signed-off-by: Keith Packard <keithp@keithp.com>
2012-01-03 09:31:15 -08:00
Ben Widawsky
84f9f938be drm/i915: Force sync command ordering (Gen6+)
The docs say this is required for Gen7, and since the bit was added for
Gen6, we are also setting it there pit pf paranoia. Particularly as
Chris points out, if PIPE_CONTROL counts as a 3d state packet.

This was found through doc inspection by Ken and applies to Gen6+;

Reported-by: Kenneth Graunke <kenneth@whitecape.org>
Signed-off-by: Ben Widawsky <ben@bwidawsk.net>
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Reviewed-by: Eric Anholt <eric@anholt.net>
Signed-off-by: Keith Packard <keithp@keithp.com>
2012-01-03 09:09:44 -08:00
Jesse Barnes
8d31528703 drm/i915: Use PIPE_CONTROL for flushing on gen6+.
v2 by danvet: Use a new flag to flush the render target cache on gen6+
(hw reuses the old write flush bit), as suggested by Ben Widawsdy.

Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
[danvet: this seems to fix cairo-perf-trace hangs on my snb]
Signed-Off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Signed-off-by: Keith Packard <keithp@keithp.com>
2011-10-20 15:26:41 -07:00
Kenneth Graunke
9d971b3753 drm/i915: Rename PIPE_CONTROL bit defines to be less terse.
"STALL_AT_SCOREBOARD" is much clearer than "STALL_EN" now that there are
several different kinds of stalls.  Also, "INSTRUCTION_CACHE_INVALIDATE"
is a lot easier to understand at a glance than the terse "IS_FLUSH."

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
[danvet: use INVALIDATE for ro cache flags for more consistency]
Signed-Off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Signed-off-by: Keith Packard <keithp@keithp.com>
2011-10-20 15:26:40 -07:00
Kenneth Graunke
fcbc34e4dc drm/i915: Remove implied length of 2 from GFX_OP_PIPE_CONTROL #define.
Not all PIPE_CONTROLs have a length of 2, so remove it from the #define
and make each invocation specify the desired length.

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
[danvet: implement style suggestion from Ben Widawsdy]
Signed-Off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Signed-off-by: Keith Packard <keithp@keithp.com>
2011-10-20 15:26:40 -07:00
Ben Widawsky
c8c99b0f0d drm/i915: Dumb down the semaphore logic
While I think the previous code is correct, it was hard to follow and
hard to debug. Since we already have a ring abstraction, might as well
use it to handle the semaphore updates and compares.

I don't expect this code to make semaphores better or worse, but you
never know...

v2:
Remove magic per Keith's suggestions.
Ran Daniel's gem_ring_sync_loop test on this.

v3:
Ignored one of Keith's suggestions.

v4:
Removed some bloat per Daniel's recommendation.

Cc: Daniel Vetter <daniel.vetter@ffwll.ch>
Cc: Keith Packard <keithp@keithp.com>
Signed-off-by: Ben Widawsky <ben@bwidawsk.net>
Signed-off-by: Keith Packard <keithp@keithp.com>
2011-09-21 14:52:41 -07:00
Akshay Joshi
0206e353a0 Drivers: i915: Fix all space related issues.
Various issues involved with the space character were generating
warnings in the checkpatch.pl file. This patch removes most of those
warnings.

Signed-off-by: Akshay Joshi <me@akshayjoshi.com>
Signed-off-by: Keith Packard <keithp@keithp.com>
2011-09-19 18:01:47 -07:00
Jesse Barnes
b095cd0a0c drm/i915: set GFX_MODE to pre-Ivybridge default value even on Ivybridge
Prior to Ivybridge, the GFX_MODE would default to 0x800, meaning that
MI_FLUSH would flush the TLBs in addition to the rest of the caches
indicated in the MI_FLUSH command.  However starting with Ivybridge, the
register defaults to 0x2800 out of reset, meaning that to invalidate the
TLB we need to use PIPE_CONTROL.  Since we're not doing that yet, go
back to the old default so things work.

v2: don't forget to actually *clear* the new bit

Reviewed-by: Eric Anholt <eric@anholt.net>
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
Tested-by: Kenneth Graunke <kenneth@whitecape.org>
Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
2011-08-19 11:57:12 -07:00
Keith Packard
df7976797f Merge branch 'drm-intel-fixes' into drm-intel-next 2011-07-22 13:40:42 -07:00
Keith Packard
f3234706a7 drm/i915: Initialize RCS ring status page address in intel_render_ring_init_dri
Physically-addressed hardware status pages are initialized early in
the driver load process by i915_init_phys_hws. For UMS environments,
the ring structure is not initialized until the X server starts. At
that point, the entire ring structure is re-initialized with all new
values. Any values set in the ring structure (including
ring->status_page.page_addr) will be lost when the ring is
re-initialized.

This patch moves the initialization of the status_page.page_addr value
to intel_render_ring_init_dri.

Signed-off-by: Keith Packard <keithp@keithp.com>
Cc: stable@kernel.org
2011-07-22 13:36:52 -07:00
Chris Wilson
e4ffd173a1 drm/i915: Add an interface to dynamically change the cache level
[anholt v2: Don't forget that when going from cached to uncached, we
haven't been tracking the write domain from the CPU perspective, since
we haven't needed it for GPU coherency.]

[ickle v3: We also need to make sure we relinquish any fences on older
chipsets and clear the GTT for sane domain tracking.]

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Signed-off-by: Eric Anholt <eric@anholt.net>
Reviewed-by: Daniel Vetter <daniel.vetter@ffwll.ch>
2011-06-09 21:51:16 -07:00
Feng, Boqun
8547920fc6 drm/i915: clean up unused ring_get_irq/ring_put_irq functions
This patch depends on patch "drm/i915: fix user irq miss in BSD ring on
g4x".
Once the previous patch apply, ring_get_irq/ring_put_irq become unused.
So simply remove them.

Signed-off-by: Feng, Boqun <boqun.feng@intel.com>
Reviewed-by: Xiang, Haihao <haihao.xiang@intel.com>
Signed-off-by: Keith Packard <keithp@keithp.com>
2011-05-16 12:54:16 -07:00
Feng, Boqun
5bfa1063a7 drm/i915: fix user irq miss in BSD ring on g4x
On g4x, user interrupt in BSD ring is missed.
This is because though g4x and ironlake share the same bsd_ring,
their interrupt control interfaces have _two_ differences.

1.different irq enable/disable functions:
On g4x are i915_enable_irq and i915_disable_irq.
On ironlake are ironlake_enable_irq and ironlake_disable_irq.
2.different irq flag:
On g4x user interrupt flag in BSD ring on is I915_BSD_USER_INTERRUPT.
On ironlake is GT_BSD_USER_INTERRUPT

Old bsd_ring_get/put_irq call ring_get_irq and ring_get_irq.
ring_get_irq and ring_put_irq only call ironlake_enable/disable_irq.
So comes the irq miss on g4x.

To fix this, as other rings' code do, conditionally call different
functions(i915_enable/disable_irq and ironlake_enable/disable_irq)
and use different interrupt flags in bsd_ring_get/put_irq.

Signed-off-by: Feng, Boqun <boqun.feng@intel.com>
Reviewed-by: Xiang, Haihao <haihao.xiang@intel.com>
Cc: stable@kernel.org
Signed-off-by: Keith Packard <keithp@keithp.com>
2011-05-16 12:54:05 -07:00
Eric Anholt
4593010b68 drm/i915: Update the location of the ringbuffers' HWS_PGA registers for IVB.
They have been moved from the ringbuffer groups to their own group it
looks like.  Fixes GPU hangs on gnome startup.

Signed-off-by: Eric Anholt <eric@anholt.net>
Reviewed-by: Jesse Barnes <jbarnes@virtuousgeek.org>
Signed-off-by: Keith Packard <keithp@keithp.com>
2011-05-13 18:12:52 -07:00
Jesse Barnes
65d3eb1e06 drm/i915: ring support for Ivy Bridge
Use Sandy Bridge paths in a few places.

Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
Reviewed-by: Keith Packard <keithp@keithp.com>
Signed-off-by: Keith Packard <keithp@keithp.com>
2011-05-13 17:10:33 -07:00
Chris Wilson
93dfb40cd8 drm/i915: Rename agp_type to cache_level
... to clarify just how we use it inside the driver and remove the
confusion of the poorly matching agp_type names. We still need to
translate through agp_type for interface into the fake AGP driver.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Signed-off-by: Eric Anholt <eric@anholt.net>
Reviewed-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Signed-off-by: Keith Packard <keithp@keithp.com>
2011-05-10 13:56:43 -07:00
Ben Widawsky
96f298aa9c drm/1915: ringbuffer wait for idle function
Added a new function which waits for the ringbuffer space to be equal to
(total - 8). This is the empty condition of the ringbuffer, and
equivalent to head==tail.

Also modified two users of this functionality elsewhere in the code.

Signed-off-by: Ben Widawsky <ben@bwidawsk.net>
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
2011-05-10 13:56:40 -07:00
Chris Wilson
b259f6730c drm/i915: Move the irq wait queue initialisation into the ring init
Required so that we don't obliterate the queue if initialising the
rings after the global IRQ handler is installed.

[Jesse, you recently looked at refactoring the IRQ installation
routines, does moving the initialisation of ring buffer data structures away
from that routine make sense in your grand scheme?]

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Jesse Barnes <jbarnes@virtuousgeek.org>
Reviewed-by: Ben Widawsky <ben@bwidawsk.net>
Signed-off-by: Keith Packard <keithp@keithp.com>
2011-05-10 13:19:12 -07:00
Chris Wilson
36d527dead drm/i915: Restore missing command flush before interrupt on BLT ring
We always skipped flushing the BLT ring if the request flush did not
include the RENDER domain. However, this neglects that we try to flush
the COMMAND domain after every batch and before the breadcrumb interrupt
(to make sure the batch is indeed completed prior to the interrupt
firing and so insuring CPU coherency). As a result of the missing flush,
incoherency did indeed creep in, most notable when using lots of command
buffers and so potentially rewritting an active command buffer (i.e.
the GPU was still executing from it even though the following interrupt
had already fired and the request/buffer retired).

As all ring->flush routines now have the same preconditions, de-duplicate
and move those checks up into i915_gem_flush_ring().

Fixes gem_linear_blit.

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=35284
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Tested-by: mengmeng.meng@intel.com
2011-03-23 09:17:01 +00:00
Chris Wilson
9035a97a32 Merge branch 'drm-intel-fixes' into drm-intel-next
Grab the latest stabilisation bits from -fixes and some suspend and
resume fixes from linus.

Conflicts:
	drivers/gpu/drm/i915/i915_drv.h
	drivers/gpu/drm/i915/i915_irq.c
2011-02-16 09:44:30 +00:00
Chris Wilson
db53a30261 drm/i915: Refine tracepoints
A lot of minor tweaks to fix the tracepoints, improve the outputting for
ftrace, and to generally make the tracepoints useful again. It is a start
and enough to begin identifying performance issues and gaps in our
coverage.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
2011-02-07 14:59:18 +00:00
Chris Wilson
71a77e07d0 drm/i915: Invalidate TLB caches on SNB BLT/BSD rings
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: stable@kernel.org
2011-02-02 15:52:38 +00:00
Chris Wilson
21dd373486 drm/i915: Defer reporting EIO until we try to use the GPU
Instead of reporting EIO upfront in the entrance of an ioctl that may or
may not attempt to use the GPU, defer the actual detection of an invalid
ioctl to when we issue a GPU instruction. This allows us to continue to
use bo in video memory (via pread/pwrite and mmap) after the GPU has hung.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
2011-01-27 11:06:07 +00:00
Chris Wilson
29ee399131 drm/i915: Silence a few -Wunused-but-set-variable
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
2011-01-25 10:33:11 +00:00
Chris Wilson
bdd92c9ad2 Merge branch 'drm-intel-fixes' into drm-intel-next
Merge important suspend and resume regression fixes and resolve the
small conflict.

Conflicts:
	drivers/gpu/drm/i915/i915_dma.c
2011-01-24 23:45:32 +00:00
Chris Wilson
c7dca47bd6 drm/i915/ringbuffer: Fix use of stale HEAD position whilst polling for space
During suspend, Linus found that his machine would hang for 3 seconds,
and identified that intel_ring_buffer_wait() was the culprit:

"Because from looking at the code, I get the notion that
"intel_read_status_page()" may not be exact. But what happens if that
inexact value matches our cached ring->actual_head, so we never even
try to read the exact case? Does it _stay_ inexact for arbitrarily
long times? If so, we might wait for the ring to empty forever (well,
until the timeout - the behavior I see), even though the ring really
_is_ empty."

As the reported HEAD position is only updated every time it crosses a
64k boundary, whilst draining the ring it is indeed likely to remain one
value. If that value matches the last known HEAD position, we never read
the true value from the register and so trigger a timeout.

Reported-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
2011-01-20 17:26:57 +00:00
Chris Wilson
c0c06bd244 drm/i915/ringbuffer: Kill an annoyingly frequent debug message
This is better handled through the tracepoints and just clutters the
debug logs.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
2011-01-20 11:21:40 +00:00
Chris Wilson
e8616b6ced drm/i915: Initialise ring vfuncs for old DRI paths
We weren't setting up the vfunc table when initialising the old DRI
ringbuffer, leading to such OOPSes as:

BUG: unable to handle kernel NULL pointer dereference at (null)
IP: [<(null)>] (null)
PGD 10c441067 PUD 1185e5067 PMD 0
Oops: 0010 [#1] PREEMPT SMP
last sysfs file: /sys/class/dmi/id/chassis_asset_tag
CPU 3
Modules linked in: i915 drm_kms_helper drm fb fbdev i2c_algo_bit
cfbcopyarea video backlight output cfbimgblt cfbfillrect autofs4 ipv6
nfs lockd fscache nfs_acl auth_rpcgss sunrpc coretemp hwmon_vid mousedev
usbhid hid option usb_wwan snd_hda_codec_via asus_atk0110 atl1e
usbserial snd_hda_intel snd_hda_codec firmware_class snd_hwdep snd_pcm
snd_seq snd_timer snd_seq_device processor parport_pc thermal snd
thermal_sys parport 8250_pnp button rng_core rtc_cmos shpchp hwmon
rtc_core ehci_hcd pci_hotplug uhci_hcd soundcore tpm_tis i2c_i801
rtc_lib tpm serio_raw snd_page_alloc tpm_bios i2c_core usbcore psmouse
intel_agp sg pcspkr sr_mod evdev cdrom ext3 jbd mbcache dm_mod sd_mod
ata_piix libata scsi_mod unix
Jan 18 15:49:29 lithui kernel:
Pid: 3605, comm: Xorg Not tainted 2.6.36.2 #5 P5KPL-CM/System Product
Name
RIP: 0010:[<0000000000000000>]  [<(null)>] (null)
RSP: 0018:ffff8801150d1d40  EFLAGS: 00010202
RAX: 000000000001ffff RBX: ffff88011a011b00 RCX: 000000000001a704
RDX: ffff880118566028 RSI: ffff880118566028 RDI: ffff880117876800
RBP: ffff8801150d1d48 R08: ffff8801195fe300 R09: 00000000c0086444
R10: 0000000000000001 R11: 0000000000003206 R12: ffff880117876800
R13: ffff880118566000 R14: ffff880117876820 R15: ffff8801150d1df8
FS:  00007f1038d456e0(0000) GS:ffff880001780000(0000)
knlGS:0000000000000000
CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: 0000000000000000 CR3: 00000001187e7000 CR4: 00000000000006e0
DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400
Process Xorg (pid: 3605, threadinfo ffff8801150d0000, task
ffff88011b016e40)
Stack:
ffffffffa043b8e6 ffff8801150d1d98 ffffffffa041768b dead000000000000
<0> 0000000000000048 00007f1023f2a000 0000000000000044 0000000000000008
<0> ffff88010d26bd80 ffff880117876800 ffff8801150d1df8 ffff8801150d1ea8
Call Trace:
[<ffffffffa043b8e6>] ? intel_ring_advance+0x16/0x20 [i915]
[<ffffffffa041768b>] i915_irq_emit+0x15b/0x240 [i915]
[<ffffffffa03ea7b1>] drm_ioctl+0x1f1/0x460 [drm]
[<ffffffffa0417530>] ? i915_irq_emit+0x0/0x240 [i915]
[<ffffffff810dd8f1>] ? do_sync_read+0xd1/0x120
[<ffffffff81025b1f>] ? do_page_fault+0x1df/0x3d0
[<ffffffff810ed5c7>] do_vfs_ioctl+0x97/0x550
[<ffffffff8115c2ea>] ? security_file_permission+0x7a/0x90
[<ffffffff810edb19>] sys_ioctl+0x99/0xa0
[<ffffffff810024ab>] system_call_fastpath+0x16/0x1b
Code:  Bad RIP value.
RIP  [<(null)>] (null)
RSP <ffff8801150d1d40>
CR2: 0000000000000000

Reported-by: Herbert Xu <herbert@gondor.apana.org.au>
Tested-by: Herbert Xu <herbert@gondor.apana.org.au>
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=29153
Bugzilla: https://bugzilla.kernel.org/show_bug.cgi?id=23172
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: stable@kernel.org
2011-01-20 11:20:53 +00:00
Chris Wilson
0dc79fb2a3 drm/i915: Make the ring IMR handling private
As the IMR for the USER interrupts are not modified elsewhere, we can
separate the spinlock used for these from that of hpd and pipestats.
Those two IMR are manipulated under an IRQ and so need heavier locking.

Reported-and-tested-by: Alexey Fisher <bug-track@fisher-privat.net>
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
2011-01-11 20:43:58 +00:00
Chris Wilson
01a03331e5 drm/i915/ringbuffer: Simplify the ring irq refcounting
... and move it under the spinlock to gain the appropriate memory
barriers.

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=32752
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
2011-01-11 20:43:57 +00:00
Chris Wilson
0f46832fab drm/i915: Mask USER interrupts on gen6 (until required)
Otherwise we may consume 20% of the CPU just handling IRQs whilst
rendering. Ouch.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
2011-01-11 20:43:56 +00:00
Chris Wilson
b72f3acb71 drm/i915: Handle ringbuffer stalls when flushing
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
2011-01-11 20:43:55 +00:00
Chris Wilson
55249baaa5 drm/i915: Workaround erratum on i830 for TAIL pointer within last 2 cachelines
On i830 if the tail pointer is set to within 2 cachelines of the end of
the buffer, the chip may hang. So instead if the tail were to land in
that location, we pad the end of the buffer with NOPs, and start again
at the beginning.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
2011-01-11 20:35:41 +00:00
Chris Wilson
c6df541c00 Revert "drm/i915: Avoid using PIPE_CONTROL on Ironlake"
Restore PIPE_CONTROL once again just for Ironlake, as it appears that
MI_USER_INTERRUPT does not have the same coherency guarantees, that is
on Ironlake the interrupt following a GPU write is not guaranteed to
arrive after the write is coherent from the CPU, as it does on the
other generations.

Reported-by: Zhenyu Wang <zhenyuw@linux.intel.com>
Reported-by: Shuang He <shuang.he@intel.com>
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=32402
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
2010-12-15 10:15:25 +00:00
Chris Wilson
b13c2b96bf drm/i915/ringbuffer: Make IRQ refcnting atomic
In order to enforce the correct memory barriers for irq get/put, we need
to perform the actual counting using atomic operations.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
2010-12-14 11:34:46 +00:00
Chris Wilson
8d5203ca62 Merge branch 'drm-intel-fixes' into drm-intel-next 2010-12-09 20:22:04 +00:00
Chris Wilson
8c0a6bfef1 drm/i915/ringbuffer: Handle wrapping of the autoreported HEAD
If the tail advances beyond the autoreport HEAD value, then we need to
fallback to an uncached read of the HEAD register in order to ascertain
the correct amount of remaining space in the ringbuffer.

Reported-by: Fang, Xun <xunx.fang@intel.com>
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=32259
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
2010-12-09 12:53:19 +00:00
Chris Wilson
1a1c69762a Merge branch 'drm-intel-fixes' into drm-intel-next
Conflicts:
	drivers/gpu/drm/i915/i915_gem.c
	drivers/gpu/drm/i915/intel_dp.c
2010-12-07 23:02:08 +00:00
Chris Wilson
88f23b8fa3 drm/i915: Avoid using PIPE_CONTROL on Ironlake
The workaround is hideous and we are using the STORE_DWORD on all other
generations on all other rings, so use for the gen5 render ring as
well.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
2010-12-05 23:18:14 +00:00