Various issues involved with the space character were generating
warnings in the checkpatch.pl file. This patch removes most of those
warnings.
Signed-off-by: Akshay Joshi <me@akshayjoshi.com>
Signed-off-by: Keith Packard <keithp@keithp.com>
IVB uses the same interrupt reg layout as SNB, so add an IS_GEN7 to the
interrupt debugfs file.
Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
Signed-off-by: Keith Packard <keithp@keithp.com>
Expose the SNB+ cache sharing policy register in debugfs. The new file,
i915_cache_sharing, has 4 values, 0-3, with 0 being "max uncore
resources" and 3 being the minimum. Exposing this control should make
benchmarking easier and help us choose a good default.
Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
Signed-off-by: Keith Packard <keithp@keithp.com>
Mainly for use in debugging and benchmarking, this file allows the user
to control the max frequency used by the GPU. Frequency may still vary
based on workload (if the frequency is set to higher than the minimum)
but won't go over the newly set value.
Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
Signed-off-by: Keith Packard <keithp@keithp.com>
They use the same register interfaces, so we can simply enable the
existing code on IVB.
v2:
- resolve conflict with ring freq scaling, we can enable it too
v3:
- resolve conflict again, this time on drm-intel-next
Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
Signed-off-by: Keith Packard <keithp@keithp.com>
Reported-by: Pavel Roskin <proski@gnu.org>
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=38777
Cc: Chris Wilson <chris@chris-wilson.co.uk>
Signed-off-by: Ben Widawsky <ben@bwidawsk.net>
Signed-off-by: Keith Packard <keithp@keithp.com>
The ring frequency scaling table tells the PCU to treat certain GPU
frequencies as if they were a given CPU frequency for purposes of
scaling the ring frequency. Normally the PCU will scale the ring
frequency based on the CPU P-state, but with the table present, it will
also take the GPU frequency into account.
The main downside of keeping the ring frequency high while the CPU is
at a low frequency (or asleep altogether) is increased power
consumption. But then if you're keeping your GPU busy, you probably
want the extra performance.
v2:
- add units to debug table header (from Eric)
- use tsc_khz as a fallback if the cpufreq driver doesn't give us a freq
(from Chris)
v3:
- fix comments & debug output
- remove unneeded force wake get/put
Reviewed-by: Ben Widawsky <ben@bwidawsk.net>
Tested-by: Eric Anholt <eric@anholt.net>
Reviewed-by: Eric Anholt <eric@anholt.net>
Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
Signed-off-by: Keith Packard <keithp@keithp.com>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Keith Packard <keithp@keithp.com>
Signed-off-by: Keith Packard <keithp@keithp.com>
FBC has too many corner cases that we don't currently deal with, so
disable it by default so we can enable more important features like RC6,
which conflicts in some configurations.
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=31742
Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
Reviewed-by: Ben Widawsky <ben@bwidawsk.net>
Signed-off-by: Keith Packard <keithp@keithp.com>
Forcewake needs to register itself with drm to use the remove function.
The file also should be read only.
Signed-off-by: Ben Widawsky <ben@bwidawsk.net>
Signed-off-by: Keith Packard <keithp@keithp.com>
forcewake is controlled by the open and close of the debugfs file. This
assures that buggy applications cannot cause the GT to stay on forever.
Signed-off-by: Ben Widawsky <ben@bwidawsk.net>
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Signed-off-by: Keith Packard <keithp@keithp.com>
Found by the new strict checking for the mutex being held whilst
manipulating the forcewake status.
Signed-off-by: Ben Widawsky <ben@bwidawsk.net>
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Signed-off-by: Keith Packard <keithp@keithp.com>
Provide a reference count to track the forcewake state of the GPU and
give a safe mechanism for userspace to wake the GT. This also potentially
saves a UC read if the GT is known to be awake already.
The reference count is atomic, but the register access and hardware wake
sequence is protected by struct_mutex.
Signed-off-by: Ben Widawsky <ben@bwidawsk.net>
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
... to clarify just how we use it inside the driver and remove the
confusion of the poorly matching agp_type names. We still need to
translate through agp_type for interface into the fake AGP driver.
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Signed-off-by: Eric Anholt <eric@anholt.net>
Reviewed-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Signed-off-by: Keith Packard <keithp@keithp.com>
Currently this is only useful for the rc6 stuff. But this would also be
useful when I finally get around to the logical context + ppgtt stuff.
Signed-off-by: Ben Widawsky <ben@bwidawsk.net>
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Signed-off-by: Keith Packard <keithp@keithp.com>
Fix up the debug file to report the right frequencies. On SNB, we program
the PCU with a frequency ratio, which is multiplied by 100MHz on the CPU
side. But GFX only runs at half that, so report it as such to avoid
confusion.
Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Keith Packard <keithp@keithp.com>
Whilst the GT is powered down (rc6), writes to MMADDR are placed in a
FIFO by the System Agent. This is a limited resource, only 64 entries, of
which 20 are reserved for Display and PCH writes, and so we must take
care not to queue up too many writes. To avoid this, there is counter
which we can poll to ensure there are sufficient free entries in the
fifo.
"Issuing a write to a full FIFO is not supported; at worst it could
result in corruption or a system hang."
Reported-and-Tested-by: Matt Turner <mattst88@gmail.com>
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=34056
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
We had some conversions over to the _PIPE macros, but didn't get
everything. So hide the per-pipe regs with an _ (still used in a few
places for legacy) and add a few _PIPE based macros, then make sure
everyone uses them.
[update: remove usage of non-existent no-op macro]
[update 2: keep modesetting suspend/resume code, update to new reg names]
Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
[ickle: stylistic cleanups for checkpatch and taste]
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
These make us increase our frequency much more readily, and decrease
them only after significant idle time, resulting in a 20% performance
increase for nexuiz.
Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Move code around and invoke iomem annotation in a few more places in
order to silence sparse. Still a few more iomem annotations to go...
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Re-enable rc6 support on Ironlake for power savings. Adds a debugfs
file to check current RC state, adds a missing workaround for Ironlake
MI_SET_CONTEXT instructions, and renames MCHBAR_RENDER_STANDBY to
RSTDBYCTL to match the docs.
Keep RC6 and the power context disabled on pre-ILK. It only seems to
hang and doesn't save any power.
Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Add an interrupt handler for switching graphics frequencies and handling
PM interrupts. This should allow for increased performance when busy
and lower power consumption when idle.
Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Add the support of memory self-refresh on Sandybridge, which is now
support 3 levels of watermarks and the source of the latency values
for watermarks has changed.
On Sandybridge, the LP0 WM value is not hardcoded any more. All the
latency value is now should be extracted from MCHBAR SSKPD register.
And the MCHBAR base address is changed, too.
For the WM values, if any calculated watermark values is larger than
the maximum value that can be programmed into the associated watermark
register, that watermark must be disabled.
Signed-off-by: Yuanhan Liu <yuanhan.liu@linux.intel.com>
[ickle: remove duplicate compute routines and fixup for checkpatch]
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
The bulk of the change is to convert the growing list of rings into an
array so that the relationship between the rings and the semaphore sync
registers can be easily computed.
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Simply remove our accounting of objects inside the aperture, keeping
only track of what is in the aperture and its current usage. This
removes the over-complication of BUGs that were attempting to keep the
accounting correct and also removes the overhead of the accounting on
the hot-paths.
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Having seen the effects of erroneous fencing on the batchbuffer, a
useful sanity check is to record the fence registers at the time of an
error.
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
When trying to diagnose mysterious errors on resume, capture the
display register contents as well.
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
The pinned buffers are useful for diagnosing errors in setting up state
for the chipset, which may not necessarily be 'active' at the time of
the error, e.g. the cursor buffer object.
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
So long as we adhere to the fence registers rules for alignment and no
overlaps (including with unfenced accesses to linear memory) and account
for the tiled access in our size allocation, we do not have to allocate
the full fenced region for the object. This allows us to fight the bloat
tiling imposed on pre-i965 chipsets and frees up RAM for real use. [Inside
the GTT we still suffer the additional alignment constraints, so it doesn't
magic allow us to render larger scenes without stalls -- we need the
expanded GTT and fence pipelining to overcome those...]
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
This holds error state from the main graphics arbiter mainly involving
the DMA engine and address translation.
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
More precisely: For those that _need_ to be mappable. Also add two
BUG_ONs in fault and pin to check the consistency of the mappable
flag.
Changes in v2:
- Add tracking of gtt mappable space (to notice mappable/unmappable
balancing issues).
- Improve the mappable working set tracking by tracking fault and pin
separately.
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>