A lot of 945GMs have had stability issues for a long time, this manifested as X hangs, blitter engine hangs, and lots of crashes.
one such report is at:
https://bugs.freedesktop.org/show_bug.cgi?id=20560
along with numerous distro bugzillas.
This only took a week of digging and hair ripping to figure out.
Tracked down and tested on a 945GM Lenovo T60,
previously running
x11perf -copypixwin500
or
x11perf -copywinpix500
repeatedly would cause the GPU to wedge within 4 or 5 tries, with random busy bits set.
After this patch no hangs were observed.
cc: stable@kernel.org
Signed-off-by: Dave Airlie <airlied@redhat.com>
The i915 memory arbiter has a register full of configuration
bits which are currently not defined in the driver header file.
Signed-off-by: Keith Packard <keithp@keithp.com>
cc: stable@kernel.org
Signed-off-by: Dave Airlie <airlied@redhat.com>
* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/anholt/drm-intel:
drm/i915: fix page flip finish vs. prepare on plane B
drm/i915: change default panel fitting mode to preserve aspect ratio
drm/i915: fix uninitialized variable warning in i915_setup_compression()
drm/i915: take struct_mutex in i915_dma_cleanup()
drm/i915: Fix CRT hotplug regression in 2.6.35-rc1
i915: fix ironlake edp panel setup (v4)
drm/i915: don't access FW_BLC_SELF on 965G
drm/i915: Account for space on the ring buffer consumed whilst wrapping.
drm/i915: gen3 page flipping fixes
drm/i915: don't queue flips during a flip pending event
drm/i915: Fix incorrect intel_ring_begin size in BSD ringbuffer.
drm/i915: Turn on 945 self-refresh only if single CRTC is active
drm/i915/gen4: Fix interrupt setup ordering
drm/i915: Use RSEN instead of HTPLG for tfp410 monitor detection.
drm/i915: Move non-phys cursors into the GTT
Revert "drm/i915: Don't enable pipe/plane/VCO early (wait for DPMS on)."
(Included the "fix page flip finish vs. prepare on plane B" patch from
Jesse on top of the pull request from Eric. -- Linus)
Since commit 4bdadb9785 ("drm/i915:
Selectively enable self-reclaim"), we've been passing GFP_MOVABLE to the
i915 page allocator where we weren't before due to some over-eager
removal of the page mapping gfp_flags games the code used to play.
This caused hibernate on Intel hardware to result in a lot of memory
corruptions on resume. See for example
http://bugzilla.kernel.org/show_bug.cgi?id=13811
Reported-by: Evengi Golov (in bugzilla)
Signed-off-by: Dave Airlie <airlied@redhat.com>
Tested-by: M. Vefa Bicakci <bicave@superonline.com>
Cc: stable@kernel.org
Cc: Chris Wilson <chris@chris-wilson.co.uk>
Cc: KOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com>
Cc: Hugh Dickins <hugh.dickins@tiscali.co.uk>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
We did this a long time ago in the DDX driver, but now this fix belongs
in the kernel.
Preserving the aspect ratio is a nicer default.
Fixes https://bugs.freedesktop.org/show_bug.cgi?id=18033.
Tested-by: Josh Triplett <josh@freedesktop.org>
Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
Signed-off-by: Eric Anholt <eric@anholt.net>
Fixes:
drivers/gpu/drm/i915/i915_dma.c: In function ‘i915_setup_compression’:
drivers/gpu/drm/i915/i915_dma.c:1311: error: ‘compressed_llb’ may be used uninitialized in this function
Signed-off-by: Prarit Bhargava <prarit@redhat.com>
Signed-off-by: Eric Anholt <eric@anholt.net>
intel_cleanup_ring_buffer() calls drm_gem_object_unreference() (as
opposed to drm_gem_object_unreference_unlocked()) so it needs to be
called with "struct_mutex" held. If we don't hold the lock, it triggers
a BUG_ON(!mutex_is_locked(&dev->struct_mutex));
I also audited the other places that call intel_cleanup_ring_buffer()
and they all hold the lock so they're OK.
This was introduced in: 8187a2b70e "drm/i915: introduce
intel_ring_buffer structure (V2)" and it's a regression from v2.6.34.
Addresses: https://bugzilla.kernel.org/show_bug.cgi?id=16247
Signed-off-by: Dan Carpenter <error27@gmail.com>
Reported-by: Benny Halevy <bhalevy@panasas.com>
Tested-by: Benny Halevy <bhalevy@panasas.com>
Signed-off-by: Eric Anholt <eric@anholt.net>
Commit 7a772c492f has two bugs which
made the hotplug problems on my laptop worse instead of better.
First, it did not, in fact, disable the CRT plug interrupt -- it
disabled all the other hotplug interrupts. It seems rather doubtful
that that bit of the patch fixed anything, so let's just remove it.
(If you want to add it back, you probably meant ~CRT_HOTPLUG_INT_EN.)
Second, on at least my GM45, setting CRT_HOTPLUG_ACTIVATION_PERIOD_64
and CRT_HOTPLUG_VOLTAGE_COMPARE_50 (when they were previously unset)
causes a hotplug interrupt about three seconds later. The old code
never restored PORT_HOTPLUG_EN so this could only happen once, but
they new code restores those registers. So just set those bits when
we set up the interrupt in the first place.
Signed-off-by: Andy Lutomirski <luto@mit.edu>
Signed-off-by: Eric Anholt <eric@anholt.net>
The eDP spec claims a 20% overhead for the 8:10 encoding scheme used
on the wire. Take this into account when picking the lane/clock speed
for the panel.
v3: some panels are out of spec, try our best to deal with them, don't
refuse modes on eDP panels, and try the largest allowed settings if
all else fails on eDP.
v4: fix stupid typo, forgot to git add before amending.
Fixes several reports in bugzilla:
https://bugs.freedesktop.org/show_bug.cgi?id=28070
Signed-off-by: Dave Airlie <airlied@redhat.com>
Signed-off-by: Eric Anholt <eric@anholt.net>
The register offset for FW_BLC_SELF is a totally different set of bits
on Broadwater (it's actually MI_RDRET_STATE), so don't treat it like
FW_BLC_SELF on 965G chips.
Fixes bug https://bugs.freedesktop.org/show_bug.cgi?id=26874.
Cc: stable@kernel.org
Tested-by: Norman Yarvin <yarvin@yarchive.net>
Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
Signed-off-by: Eric Anholt <eric@anholt.net>
If we fill the tail of the physical ring buffer with NOOP when wrapping,
we need to account for the reduction in available space.
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Signed-off-by: Eric Anholt <eric@anholt.net>
Gen3 chips have slightly different flip commands, and also contain a bit
that indicates whether a "flip pending" interrupt means the flip has
been queued or has been completed.
So implement support for the gen3 flip command, and make sure we use the
flip pending interrupt correctly depending on the value of ECOSKPD bit
0.
Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
Signed-off-by: Eric Anholt <eric@anholt.net>
Hardware will set the flip pending ISR bit as soon as it receives the
flip instruction, and (supposedly) clear it once the flip completes
(e.g. at the next vblank). If we try to send down a flip instruction
while the ISR bit is set, the hardware can become very confused, and we
may never receive the corresponding flip pending interrupt, effectively
hanging the chip.
Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
Signed-off-by: Eric Anholt <eric@anholt.net>
The ring_begin API was taking a number of bytes, while all of our
other begin/end macros take number of dwords. Change the API over to
dwords to prevent future bugs.
Signed-off-by: Zou Nan hai <nanhai.zou@intel.com>
Signed-off-by: Eric Anholt <eric@anholt.net>
Enable self-refresh on 945 when just one CRTC is activated.
Otherwise user would get display flicker with dual display.
This fixes https://bugs.freedesktop.org/show_bug.cgi?id=27667
Signed-off-by: Li Peng <peng.li@intel.com>
Reviewed-by: Adam Jackson <ajax@redhat.com>
Signed-off-by: Eric Anholt <eric@anholt.net>
This reverts commit cfecde435d, since it
seems to cause some systems to not come up with any video output at all
(or video that only comes on when X starts up).
Fixes bugzilla:
http://bugzilla.kernel.org/show_bug.cgi?id=16163
Reported-and-tested-by: David John <davidjon@xenontk.org>
Tested-by: Nick Bowler <nbowler@elliptictech.com>
Acked-by: Carl Worth <cworth@cworth.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
The previous commit fixes the problem, these commits make sure we actually
fail properly if it happens again.
I've squashed the commits from Chris since they are all fixing one issue.
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Signed-off-by: Dave Airlie <airlied@redhat.com>
(regression fix since fbdev/kms rework).
My fb rework didn't remember about the 84/65s.
Reported-by: Ondrej Zary <linux@rainbow-software.org>
Tested-by: Ondrej Zary <linux@rainbow-software.org>
Signed-off-by: Dave Airlie <airlied@redhat.com>
Cursors need to be in the GTT domain when being accessed by the GPU.
Previously this was a fortuitous byproduct of userspace using pwrite()
to upload the image data into the cursor. The redundant clflush was
removed in commit 9b8c4a and so the image was no longer being flushed
out of the caches into main memory. One could also devise a scenario
where the cursor was rendered by the GPU, prior to being attached as the
cursor, resulting in similar corruption due to the missing MI_FLUSH.
Fixes:
Bug 28335 - Cursor corruption caused by commit 9b8c4a0b21https://bugs.freedesktop.org/show_bug.cgi?id=28335
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Reported-and-tested-by: Jeff Chua <jeff.chua.linux@gmail.com>
Tested-by: Linus Torvalds <torvalds@linux-foundation.org>
Reported-by: Andy Isaacson <adi@hexapodia.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Unmask, then enable interrupts, then enable interrupt sources; matches
PCH ordering. The old way (sources, enable, unmask) gives a window
during which interrupt conditions would appear in ISR but would never
reach IIR and thus never raise an IRQ. Since interrupts only trigger
on rising edges in ISR, this would lead to conditions where (for
example) output hotplugging would never fire an interrupt because it
was already stuck on in ISR.
Also, since we know IIR and PIPExSTAT have been cleared during
irq_preinstall, don't clear them again during irq_postinstall, nothing
good can come of that.
Signed-off-by: Adam Jackson <ajax@redhat.com>
Signed-off-by: Eric Anholt <eric@anholt.net>
* 'drm-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/airlied/drm-2.6: (41 commits)
drm/radeon/kms: make sure display hw is disabled when suspending
drm/vmwgfx: Allow userspace to change default layout. Bump minor.
drm/vmwgfx: Fix framebuffer modesetting
drm/vmwgfx: Fix vga save / restore with display topology.
vgaarb: use MIT license
vgaarb: convert pr_devel() to pr_debug()
drm: fix typos in Linux DRM Developer's Guide
drm/radeon/kms/pm: voltage fixes
drm/radeon/kms/pm: radeon_set_power_state fixes
drm/radeon/kms/pm: patch default power state with default clocks/voltages on r6xx+
drm/radeon/kms/pm: enable SetVoltage on r7xx/evergreen
drm/radeon/kms/pm: add support for SetVoltage cmd table (V2)
drm/radeon/kms/evergreen: add initial CS parser
drm/kms: disable/enable poll around switcheroo on/off
drm/nouveau: fixup confusion over which handle the DSM is hanging off.
drm/nouveau: attempt to get bios from ACPI v3
drm/nv50: cast IGP memory location to u64 before shifting
drm/ttm: Fix ttm_page_alloc.c
drm/ttm: Fix cached TTM page allocation.
drm/vmwgfx: Remove some leftover debug messages.
...
Cursors need to be in the GTT domain when being accessed by the GPU.
Previously this was a fortuitous byproduct of userspace using pwrite()
to upload the image data into the cursor. The redundant clflush was
removed in commit 9b8c4a and so the image was no longer being flushed
out of the caches into main memory. One could also devise a scenario
where the cursor was rendered by the GPU, prior to being attached as the
cursor, resulting in similar corruption due to the missing MI_FLUSH.
Fixes:
Bug 28335 - Cursor corruption caused by commit 9b8c4a0b21https://bugs.freedesktop.org/show_bug.cgi?id=28335
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Tested-by: Arkadiusz Miśkiewicz <arekm@maven.pl>
Signed-off-by: Eric Anholt <eric@anholt.net>
This reverts commit cfecde435d.
The commit was first created as an attempt to fix LVDS initialiazation
on Ironlake. Testing revealed that it didn't fix that, but it was
assumed to still be correct anyway.
Subsequent testing has revealed that this commit has caused other
regressions:
* Change in VBlank interrupt frequency causing 60% 3D performance regression
http://bugs.freedesktop.org/show_bug.cgi?id=27698
* Black screen on G45
http://bugs.freedesktop.org/show_bug.cgi?id=27733
So revert this buggy code for now to revisit later when we can fix
actual bugs without causing these regressions.
Signed-off-by: Carl Worth <cworth@cworth.org>
Signed-off-by: Eric Anholt <eric@anholt.net>
This will let userland only try to use the new media decode
functionality when the appropriate kernel is present.
Signed-off-by: Zou Nan hai <nanhai.zou@intel.com>
Signed-off-by: Eric Anholt <eric@anholt.net>
I'm actually kind of shocked that it works at all otherwise.
Signed-off-by: Adam Jackson <ajax@redhat.com>
Signed-off-by: Eric Anholt <eric@anholt.net>
With splitted engines on Sandybridge, each engine has its own
interrupt control as well. This unmasks the interrupt to properly
enable pipe control notify event for render engine.
Signed-off-by: Zhenyu Wang <zhenyuw@linux.intel.com>
Signed-off-by: Eric Anholt <eric@anholt.net>
Sandybridge(Gen6) has new format for PIPE_CONTROL command,
the flush and post-op control are in dword 1 now. This
changes command length field for difference between Ironlake
and Sandybridge.
I tried to test this with noop request and issue PIPE_CONTROL
command for each sequence and track notify interrupts, which
seems work fine. Hopefully we don't need workaround like on
Ironlake for Sandybridge.
Signed-off-by: Zhenyu Wang <zhenyuw@linux.intel.com>
Signed-off-by: Eric Anholt <eric@anholt.net>
Since we now get_user_pages() outside of the mutex prior to performing
the copy, we kmap() the page inside the copy routine and so need to
perform an ordinary memcpy() and not copy_from_user().
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Signed-off-by: Eric Anholt <eric@anholt.net>
As we do not have a requirement to be atomic and avoid sleeping whilst
performing the slow copy for shmem based pread and pwrite, we can use
kmap instead, thus simplifying the code.
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Signed-off-by: Eric Anholt <eric@anholt.net>
We can avoid an early clflush when pwriting if we use the current CPU
write domain rather than moving the object to the GTT domain for the
purposes of the pwrite. This has the advantage of not flushing the
presumably hot data that we want to upload into the bo, and of ascribing
the clflush to the execution when profiling.
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Signed-off-by: Eric Anholt <eric@anholt.net>
The callers expect us to cleanup any partially initialised structures
before reporting the error.
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Signed-off-by: Eric Anholt <eric@anholt.net>
If the object is bigger than the entire aperture, reject it early
before evicting everything in a vain attempt to find space.
v2: Use E2BIG as suggested by Owain G. Ainsworth.
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: stable@kernel.org
Signed-off-by: Eric Anholt <eric@anholt.net>
This particular warning is harmless as we emit during the normal
pinning process where the batch buffer requires more fences than is
available without eviction. Only if we fail to evict enough fences does
this become a problem, so include the requested number of fences in the
ultimate *error* message.
v2: Remember to compile test even trial patches to remove warnings.
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Signed-off-by: Eric Anholt <eric@anholt.net>
Add the pitch that we about to write into the control register along
with the base, offset and coordinates that go into the other control
registers.
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Signed-off-by: Eric Anholt <eric@anholt.net>
If the FBC is already disabled, then we do not even attempt to disable
FBC and so there is no point emitting a debug statement at that point,
having already emitted one saying why we are disabling FBC.
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Signed-off-by: Eric Anholt <eric@anholt.net>
Nesting domain changes will cause confusion when trying to interpret the
tracepoints describing the sequence of changes for the object, as well
as obscuring the order of operations for the reader of the code.
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Signed-off-by: Eric Anholt <eric@anholt.net>
Delay taking the mutex until we need to and ensure that we hold the
spinlock when resetting unpin_work on the error path. Also defer the
debugging print messages until after we have released the spinlock.
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Jesse Barnes <jbarnes@virtuousgeek.org>
Cc: Kristian Høgsberg <krh@bitplanet.net>
Reviewed-by: Jesse Barnes <jbarnes@virtuousgeek.org>
Signed-off-by: Eric Anholt <eric@anholt.net>
Only report an error if the GPU has actually detected one, otherwise we
are just hung.
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Signed-off-by: Eric Anholt <eric@anholt.net>
Pineview with DDR3 memory has different latencies to enable CxSR.
This patch updates CxSR latency table to add Pineview DDR3 latency
configuration. It also adds one flag "is_ddr3" for checking DDR3
setting in MCHBAR.
Cc: Shaohua Li <shaohua.li@intel.com>
Cc: Zhao Yakui <yakui.zhao@intel.com>
Signed-off-by: Li Peng <peng.li@intel.com>
Signed-off-by: Eric Anholt <eric@anholt.net>
The "encoder" variable can never be null because it is used as loop
cursor in a list_for_each_entry() loop.
Signed-off-by: Dan Carpenter <error27@gmail.com>
Signed-off-by: Eric Anholt <eric@anholt.net>
The "connector" variable is used as the cursor in a
list_for_each_entry() and it's always non-null so we don't need to check
it.
Signed-off-by: Dan Carpenter <error27@gmail.com>
Signed-off-by: Eric Anholt <eric@anholt.net>
For real HDMI sink, CPT HDMI port has to set 'HDMI' mode flag
in order to make HDMI audio work correctly.
This is required patch for drm/i915 to enable HDMI audio on CPT PCH,
ALSA patch is at http://mailman.alsa-project.org/pipermail/alsa-devel/2010-May/027601.html
Tested-by: Fengguang Wu <fengguang.wu@intel.com>
Signed-off-by: Zhenyu Wang <zhenyuw@linux.intel.com>
Signed-off-by: Eric Anholt <eric@anholt.net>
This saves a whooping 7 dwords. Zero functional changes. Because
some of the refcounts are rather tightly calculated, I've put
BUG_ONs in the code to check for overflows.
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Signed-off-by: Eric Anholt <eric@anholt.net>