Backmerge Linus tree after rc5 + drm-fixes went in.
There were a few amdkfd conflicts I wanted to avoid,
and Ben requested this for nouveau also.
Conflicts:
drivers/gpu/drm/amd/amdkfd/Makefile
drivers/gpu/drm/amd/amdkfd/kfd_chardev.c
drivers/gpu/drm/amd/amdkfd/kfd_mqd_manager.c
drivers/gpu/drm/amd/amdkfd/kfd_priv.h
drivers/gpu/drm/amd/include/kgd_kfd_interface.h
drivers/gpu/drm/i915/intel_runtime_pm.c
drivers/gpu/drm/radeon/radeon_kfd.c
If CONFIG_DEBUG_MUTEXES is set, the mutex->owner field is only cleared
if the mutex debugging is enabled which introduces a race in our
mutex_is_locked_by() - i.e. we may inspect the old owner value before it
is acquired by the new task.
This is the root cause of this error:
diff --git a/kernel/locking/mutex-debug.c b/kernel/locking/mutex-debug.c
index 5cf6731..3ef3736 100644
--- a/kernel/locking/mutex-debug.c
+++ b/kernel/locking/mutex-debug.c
@@ -80,13 +80,13 @@ void debug_mutex_unlock(struct mutex *lock)
DEBUG_LOCKS_WARN_ON(lock->owner != current);
DEBUG_LOCKS_WARN_ON(!lock->wait_list.prev && !lock->wait_list.next);
- mutex_clear_owner(lock);
}
/*
* __mutex_slowpath_needs_to_unlock() is explicitly 0 for debug
* mutexes so that we can do it here after we've verified state.
*/
+ mutex_clear_owner(lock);
atomic_set(&lock->count, 1);
}
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=87955
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: stable@vger.kernel.org
Reviewed-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Signed-off-by: Jani Nikula <jani.nikula@intel.com>
Like Ivybridge, we have reports that we get random hangs when flipping
with multiple pipes. Extend
commit 2a92d5bca1
Author: Chris Wilson <chris@chris-wilson.co.uk>
Date: Tue Jul 8 10:40:29 2014 +0100
drm/i915: Disable RCS flips on Ivybridge
to also apply to Haswell.
Reported-and-tested-by: Scott Tsai <scottt.tw@gmail.com>
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=87759
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Cc: stable@vger.kernel.org # 2a92d5bca1 drm/i915: Disable RCS flips on Ivybridge
Cc: stable@vger.kernel.org
Signed-off-by: Jani Nikula <jani.nikula@intel.com>
We apply the RPS interrupt workaround on VLV everywhere except when
writing the mask directly during idling the GPU. For consistency do this
also there.
While at it also extend the code comment about affected platforms.
I couldn't reproduce the issue on VLV fixed by this workaround, by
removing the workaround from everywhere, while it's 100% reproducible on
SNB using igt/gem_reset_stats/ban-ctx-render. So also add a note that
it hasn't been verified if the workaround really applies to VLV/CHV.
Suggested-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Signed-off-by: Imre Deak <imre.deak@intel.com>
Reviewed-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Signed-off-by: Jani Nikula <jani.nikula@intel.com>
In
commit dbea3cea69
Author: Imre Deak <imre.deak@intel.com>
Date: Mon Dec 15 18:59:28 2014 +0200
drm/i915: sanitize RPS resetting during GPU reset
we disable RPS interrupts during GPU resetting, but don't apply the
necessary GEN6 HW workaround. This leads to a HW lockup during a
subsequent "looping batchbuffer" workload. This is triggered by the
testcase that submits exactly this kind of workload after a simulated
GPU reset. I'm not sure how likely the bug would have triggered
otherwise, since we would have applied the workaround anyway shortly
after the GPU reset, when enabling GT powersaving from the deferred
work.
This may also fix unrelated issues, since during driver loading /
suspending we also disable RPS interrupts and so we also had a short
window during the rest of the loading / resuming where a similar
workload could run without the workaround applied.
v2:
- separate the fix to route RPS interrupts to the CPU on GEN9 too
to a separate patch (Daniel)
Bisected-by: Ander Conselvan de Oliveira <ander.conselvan.de.oliveira@intel.com>
Testcase: igt/gem_reset_stats/ban-ctx-render
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=87429
Signed-off-by: Imre Deak <imre.deak@intel.com>
Reviewed-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Signed-off-by: Jani Nikula <jani.nikula@intel.com>
GEN8+ HW has the option to route PM interrupts to either the CPU or to
GT. For GEN8 this was already set correctly to routing to CPU, but not
for GEN9, so fix this. Note that when disabling RPS interrupts this was
set already correctly, though in that case it didn't matter much except
for the possibility of spurious interrupts.
Signed-off-by: Imre Deak <imre.deak@intel.com>
Reviewed-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Signed-off-by: Jani Nikula <jani.nikula@intel.com>
- plane handling refactoring from Matt Roper and Gustavo Padovan in prep for
atomic updates
- fixes and more patches for the seqno to request transformation from John
- docbook for fbc from Rodrigo
- prep work for dual-link dsi from Gaurav Signh
- crc fixes from Ville
- special ggtt views infrastructure from Tvrtko Ursulin
- shadow patch copying for the cmd parser from Brad Volkin
- execlist and full ppgtt by default on gen8, for testing for now
* tag 'drm-intel-next-2014-12-19' of git://anongit.freedesktop.org/drm-intel: (131 commits)
drm/i915: Update DRIVER_DATE to 20141219
drm/i915: Hold runtime PM during plane commit
drm/i915: Organize bind_vma funcs
drm/i915: Organize INSTDONE report for future.
drm/i915: Organize PDP regs report for future.
drm/i915: Organize PPGTT init
drm/i915: Organize Fence registers for future enablement.
drm/i915: tame the chattermouth (v2)
drm/i915: Warn about missing context state workarounds only once
drm/i915: Use true PPGTT in Gen8+ when execlists are enabled
drm/i915: Skip gunit save/restore for cherryview
drm/i915/chv: Use timeout mode for RC6 on chv
drm/i915: Add GPGPU_THREADS_DISPATCHED to the register whitelist
drm/i915: Tidy up execbuffer command parsing code
drm/i915: Mark shadow batch buffers as purgeable
drm/i915: Use batch length instead of object size in command parser
drm/i915: Use batch pools with the command parser
drm/i915: Implement a framework for batch buffer pools
drm/i915: fix use after free during eDP encoder destroying
drm/i915/skl: Skylake also supports DP MST
...
I've had these since before -rc1, but they missed my last pull
request. Real bug fixes and mostly cc: stable material.
* tag 'drm-intel-next-fixes-2014-12-30' of git://anongit.freedesktop.org/drm-intel:
drm/i915: add missing rpm ref to i915_gem_pwrite_ioctl
Revert "drm/i915: Preserve VGACNTR bits from the BIOS"
drm/i915: Don't call intel_prepare_page_flip() multiple times on gen2-4
drm/i915: Kill check_power_well() calls
This reverts commit 355a701838.
This had some bad side effects under normal operation, and should
have been dropped earlier.
Signed-off-by: Dave Airlie <airlied@redhat.com>
misc i915 fixes.
* tag 'drm-intel-next-fixes-2014-12-17' of git://anongit.freedesktop.org/drm-intel:
drm/i915: Disable PSMI sleep messages on all rings around context switches
drm/i915: Force the CS stall for invalidate flushes
drm/i915: Invalidate media caches on gen7
drm/i915: sanitize RPS resetting during GPU reset
drm/i915: move RPS PM_IER enabling to gen6_enable_rps_interrupts
drm/i915: vlv: fix IRQ masking when uninstalling interrupts
Without this RPM ref we can hit the device suspended WARN via:
i915_gem_object_pin()->ggtt_bind_vma->gen6_ggtt_insert_entries(). I
noticed this on my BYT while keeping the i915 device in runtime
suspended state for a while. I chose this place to take the ref to
avoid the possible deadlock via the mutex_lock taken both later in this
function and in the runtime suspend handler. This can happen if an RPM
suspend event is queued and need to be flushed before taking the RPM
ref.
Testcase: igt/pm_rpm/gem-evict-pwrite
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=87363
Signed-off-by: Imre Deak <imre.deak@intel.com>
Signed-off-by: Jani Nikula <jani.nikula@intel.com>
The VGA_2X_MODE bit apparently affects the display even when the VGA
plane is disabled. The bit will set by the BIOS when the panel width
is at least 1280 pixels. So by preserving the bit from the BIOS we
end up with corrupted display on machines with such high res panels.
I only have 1024x768 panels on my gen2 machines so never ran into
this problem.
The original reason for preserving the VGACNTR register was to make
my 830 survive S3 with acpi_sleep=s3_bios option. However after
further 830 fixes that option is no longer needed to make S3 work
and preserving VGACNTR doesn't seem to be necessary without it,
so we can just revert the entire patch.
This reverts
commit 69769f9a42
Author: Ville Syrjälä <ville.syrjala@linux.intel.com>
Date: Fri Aug 15 01:22:08 2014 +0300
drm/i915: Preserve VGACNTR bits from the BIOS
Cc: Bruno Prémont <bonbons@linux-vserver.org>
Cc: stable@vger.kernel.org
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=87171
Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Signed-off-by: Jani Nikula <jani.nikula@intel.com>
The flip stall detector kicks in when pending>=INTEL_FLIP_COMPLETE. That
means if we first call intel_prepare_page_flip() but don't call
intel_finish_page_flip(), the next stall check will erroneosly think
the page flip was somehow stuck.
With enough debug spew emitted from the interrupt handler my 830 hangs
when this happens. My theory is that the previous vblank interrupt gets
sufficiently delayed that the handler will see the pending bit set in
IIR, but ISR still has the bit set as well (ie. the flip was processed
by CS but didn't complete yet). In this case the handler will proceed
to call intel_check_page_flip() immediately after
intel_prepare_page_flip(). It then tries to print a backtrace for the
stuck flip WARN, which apparetly results in way too much debug spew
delaying interrupt processing further. That then seems to cause an
endless loop in the interrupt handler, and the machine is dead until
the watchdog kicks in and reboots. At least limiting the number of
iterations of the loop in the interrupt handler also prevented the
hang.
So it seems better to not call intel_prepare_page_flip() without
immediately calling intel_finish_page_flip(). The IIR/ISR trickery
avoids races here so this is a perfectly safe thing to do.
v2: Fix typo in commit message (checkpatch)
Cc: stable@vger.kernel.org
Bugzilla: https://bugzilla.kernel.org/show_bug.cgi?id=88381
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=85888
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Signed-off-by: Jani Nikula <jani.nikula@intel.com>
pps_{lock,unlock}() call intel_display_power_{get,put}() outside
pps_mutes to avoid deadlocks with the power_domain mutex. In theory
during aux transfers we should usually have the relevant power domain
references already held by some higher level code, so this should not
result in much overhead (exception being userspace i2c-dev access).
However thanks to the check_power_well() calls in
intel_display_power_{get/put}() we end up doing a few Punit reads for
each aux transfer. Obviously doing this for each byte transferred via
i2c-over-aux is not a good idea.
I can't think of a good way to keep check_power_well() while eliminating
the overhead, so let's just remove check_power_well() entirely.
Fixes a driver init time regression introduced by:
commit 773538e860
Author: Ville Syrjälä <ville.syrjala@linux.intel.com>
Date: Thu Sep 4 14:54:56 2014 +0300
drm/i915: Reset power sequencer pipe tracking when disp2d is off
Credit goes to Jani for figuring this out.
v2: Add the regression note in the commit message.
Cc: stable@vger.kernel.org (v3.18+)
Cc: Egbert Eich <eich@suse.de>
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=86201
Tested-by: Wendy Wang <wendy.wang@intel.com>
Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Signed-off-by: Jani Nikula <jani.nikula@intel.com>
During plane operations, we read/write some registers that only operate
properly if we're not runtime suspended. At the moment we're not
holding the runtime PM reference across the whole plane operation, so
there's a potential for problems.
This issue was already partially addressed by commit
commit d6dd6843ff
Author: Paulo Zanoni <paulo.r.zanoni@intel.com>
Date: Fri Aug 15 15:59:32 2014 -0300
drm/i915: fix plane/cursor handling when runtime suspended
which took care of holding the runtime PM reference during the pin and
fence operations for plane updates. However there are still a few
actual plane registers that we also need to hold the runtime PM
reference for. Recent refactoring patches in preparation for atomic
have rearranged the code and made it increasingly likely that the
hardware will have time to suspend between the pin/fence operation and
the actual register writes. Examples of such registers are the stuff
touched by ivb_get_colorkey.
The solution here grabs the runtime PM reference around the 'commit'
operation for planes, which should cover all the relevant register
reads/writes.
Note that this has only been exposed with
commit 6beb8c23eb
Author: Matt Roper <matthew.d.roper@intel.com>
Date: Mon Dec 1 15:40:14 2014 -0800
drm/i915: Consolidate plane 'prepare' functions (v2)
so doesn't need to be ported to 3.19.
Cc: Paulo Zanoni <paulo.r.zanoni@intel.com>
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=87180
Signed-off-by: Matt Roper <matthew.d.roper@intel.com>
Testcase: igt/pm-rpm/legacy-planes
Reviewed-by: Paulo Zanoni <paulo.r.zanoni@intel.com>
[danvet: Augment commit message with information Paulo supplied.]
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Let's be optimistic that for future platforms this will remain the same
and reorg a bit.
This reorg in if blocks instead of switch make life easier for future
platform support addition.
Cc: Damien Lespiau <damien.lespiau@intel.com>
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
Reviewed-by: Paulo Zanoni <paulo.r.zanoni@intel.com>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Let's be optimistic that for future platforms this will remain the same
and reorg a bit.
This reorg in if blocks instead of switch make life easier for future
platform support addition.
Cc: Damien Lespiau <damien.lespiau@intel.com>
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
Reviewed-by: Paulo Zanoni <paulo.r.zanoni@intel.com>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Let's be optimistic that for future platforms this will remain the same
and reorg a bit.
This reorg in if blocks instead of switch make life easier for future
platform support addition.
Cc: Damien Lespiau <damien.lespiau@intel.com>
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
Reviewed-by: Paulo Zanoni <paulo.r.zanoni@intel.com>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Let's be optimistic that for future platforms memory management doesn't change
that much and reuse gen8 function for PPGTT init.
Cc: Damien Lespiau <damien.lespiau@intel.com>
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
Reviewed-by: Paulo Zanoni <paulo.r.zanoni@intel.com>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Let's be optimistic that for future platforms this will remain the same
and reorg a bit.
This reorg in if blocks instead of switch make life easier for future
platform support addition.
v2: Jani pointed out I was missing reg_830 for some gen3 platforms. So let's make
this platforms subcases of Gen checks.
Cc: Jani Nikula <jani.nikula@intel.com>
Cc: Damien Lespiau <damien.lespiau@intel.com>
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
Reviewed-by: Paulo Zanoni <paulo.r.zanoni@intel.com>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
There exists a current workaround to prevent a hang on context switch
should the ring go to sleep in the middle of the restore,
WaProgramMiArbOnOffAroundMiSetContext (applicable to all gen7+). In
spite of disabling arbitration (which prevents the ring from powering
down during the critical section) we were still hitting hangs that had
the hallmarks of the known erratum. That is we are still seeing hangs
"on the last instruction in the context restore". By comparing -nightly
(broken) with requests (working), we were able to deduce that it was the
semaphore LRI cross-talk that reproduced the original failure. The key
was that requests implemented deferred semaphore signalling, and
disabling that, i.e. emitting the semaphore signal to every other ring
after every batch restored the frequent hang. Explicitly disabling PSMI
sleep on the RCS ring was insufficient, all the rings had to be awake to
prevent the hangs. Fortunately, we can reduce the wakelock to the
MI_SET_CONTEXT operation itself, and so should be able to limit the extra
power implications.
Since the MI_ARB_ON_OFF workaround is listed for all gen7 and above
products, we should apply this extra hammer for all of the same
platforms despite so far that we have only been able to reproduce the
hang on certain ivb and hsw models. The last question is whether we want
to always use the extra hammer or only when we know semaphores are in
operation. At the moment, we only use LRI on non-RCS rings for
semaphores, but that may change in the future with the possibility of
reintroducing this bug under subtle conditions.
v2: Make it explicit that the PSMI LRI are an extension to the original
workaround for the other rings.
v3: Bikeshedding variable names and whitespacing
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=80660
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=83677
Cc: Simon Farnsworth <simon@farnz.org.uk>
Cc: Daniel Vetter <daniel@ffwll.ch>
Cc: Ville Syrjälä <ville.syrjala@linux.intel.com>
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Tested-by: Peter Frühberger <fritsch@xbmc.org>
Reviewed-by: Daniel Vetter <daniel@ffwll.ch>
Cc: stable@vger.kernel.org
Signed-off-by: Jani Nikula <jani.nikula@intel.com>
In order to act as a full command barrier by itself, we need to tell the
pipecontrol to actually stall the command streamer while the flush runs.
We require the full command barrier before operations like
MI_SET_CONTEXT, which currently rely on a prior invalidate flush.
References: https://bugs.freedesktop.org/show_bug.cgi?id=83677
Cc: Simon Farnsworth <simon@farnz.org.uk>
Cc: Daniel Vetter <daniel@ffwll.ch>
Cc: Ville Syrjälä <ville.syrjala@linux.intel.com>
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: stable@vger.kernel.org
Signed-off-by: Jani Nikula <jani.nikula@intel.com>
In the gen7 pipe control there is an extra bit to flush the media
caches, so let's set it during cache invalidation flushes.
v2: Rename to MEDIA_STATE_CLEAR to be more inline with spec.
Cc: Simon Farnsworth <simon@farnz.org.uk>
Cc: Ville Syrjälä <ville.syrjala@linux.intel.com>
Cc: Daniel Vetter <daniel.vetter@ffwll.ch>
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Cc: stable@vger.kernel.org
Signed-off-by: Jani Nikula <jani.nikula@intel.com>
Many distro's have mechanism in place to collect and automatically file
bugs for failed WARN()s. And since i915 has a lot of hw state sanity
checks which result in WARN(), it generates quite a lot of noise which
is somewhat disconcerting to the end user.
Separate out the internal hw-is-in-the-state-I-expected checks into
I915_STATE_WARN()s and allow configuration via i915.verbose_checks module
param about whether this will generate a full blown stacktrace or just
DRM_ERROR(). The new moduleparam defaults to true, so by default there
is no change in behavior. And even when disabled, you will still get
an error message logged.
v2: paint the macro names blue, clarify that the default behavior
remains the same as before
Signed-off-by: Rob Clark <robdclark@gmail.com>
Acked-by: Jani Nikula <jani.nikula@intel.com>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Otherwise, new platforms without workarounds will hit this warning for
every new context created.
Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Signed-off-by: Michel Thierry <michel.thierry@intel.com>
Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
In Gen8+, full ppgtt needs execlist, otherwise the ctx switch can hang.
Also remove the current restriction, a user should be able to explicitly set
ppgtt=2.
Note, this patch considers that execlist support has been enabled by
default on Gen8.
v2: Remove non-default restriction and clarify commit message (Daniel)
Cc: Daniel Vetter <daniel@ffwll.ch>
Signed-off-by: Michel Thierry <michel.thierry@intel.com>
[danvet: s/comment/commit message/ in the commit message since that's
what Michel meant as per our irc discussion.]
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
With cherryview onwards, Gunit hardware itself save and restore all the
Gunit registers. Skipping the "vlv_save_gunit_s0ix_state" &
"vlv_restore_gunit_s0ix_state" for cherryview in S3/S0ix sequence.
Signed-off-by: Deepak S <deepak.s@linux.intel.com>
Reviewed-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Higher RC6 residency is observed using timeout mode
instead of EI mode. It's Recommended to use TO Method for RC6.
v2: Add comment about timeout threshold. (Tom)
Signed-off-by: Deepak S <deepak.s@linux.intel.com>
Reviewed-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
This will allow us to read the number of dispatched compute threads
for GL_ARB_pipeline_statistics_query.
Signed-off-by: Jordan Justen <jordan.l.justen@intel.com>
Cc: Ben Widawsky <ben@bwidawsk.net>
Reviewed-by: Ben Widawsky <ben@bwidawsk.net>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Move it to a separate function since the main do_execbuffer function
already has so much going on.
v2:
- Move pin/unpin calls inside i915_parse_cmds() (Chris W, v4 7/7
feedback)
Issue: VIZ-4719
Signed-off-by: Brad Volkin <bradley.d.volkin@intel.com>
Reviewed-By: Jon Bloomfield <jon.bloomfield@intel.com>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
By adding a new exec_entry flag, we cleanly mark the shadow objects
as purgeable after they are on the active list.
v2:
- Move 'shadow_batch_obj->madv = I915_MADV_WILLNEED' inside _get
fnc (danvet, from v4 6/7 feedback)
v3:
- Remove duplicate 'madv = I915_MADV_WILLNEED' (danvet, from v6 4/5)
Issue: VIZ-4719
Signed-off-by: Brad Volkin <bradley.d.volkin@intel.com>
Reviewed-By: Jon Bloomfield <jon.bloomfield@intel.com>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Previously we couldn't trust the user-supplied batch length because
it came directly from userspace (i.e. untrusted code). It would have
affected what commands software parsed without regard to what hardware
would actually execute, leaving a potential hole.
With the parser now copying the user supplied batch buffer and writing
MI_NOP commands to any space after the copied region, we can safely use
the batch length input. This should be a performance win as the actual
batch length is frequently much smaller than the allocated object size.
v2: Fix handling of non-zero batch_start_offset
Issue: VIZ-4719
Signed-off-by: Brad Volkin <bradley.d.volkin@intel.com>
Reviewed-By: Jon Bloomfield <jon.bloomfield@intel.com>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
This patch sets up all of the tracking and copying necessary to
use batch pools with the command parser and dispatches the copied
(shadow) batch to the hardware.
After this patch, the parser is in 'enabling' mode.
Note that performance takes a hit from the copy in some cases
and will likely need some work. At a rough pass, the memcpy
appears to be the bottleneck. Without having done a deeper
analysis, two ideas that come to mind are:
1) Copy sections of the batch at a time, as they are reached
by parsing. Might improve cache locality.
2) Copy only up to the userspace-supplied batch length and
memset the rest of the buffer. Reduces the number of reads.
v2:
- Remove setting the capacity of the pool
- One global pool instead of per-ring pools
- Replace batch_obj with shadow_batch_obj and hook into eb->vmas
- Memset any space in the shadow batch beyond what gets copied
- Rebased on execlist prep refactoring
v3:
- Rebase on chained batch handling
- Squash in setting the secure dispatch flag
- Add a note about the interaction w/secure dispatch pinning
- Check for request->batch_obj == NULL in i915_gem_free_request
v4:
- Fix read domains for shadow_batch_obj
- Remove the set_to_gtt_domain call from i915_parse_cmds
- ggtt_pin/unpin in the parser block to simplify error handling
- Check USES_FULL_PPGTT before setting DISPATCH_SECURE flag
- Remove i915_gem_batch_pool_put calls
v5:
- Move 'pending_read_domains |= I915_GEM_DOMAIN_COMMAND' after
the parser (danvet, from v4 0/7 feedback)
Issue: VIZ-4719
Signed-off-by: Brad Volkin <bradley.d.volkin@intel.com>
Reviewed-By: Jon Bloomfield <jon.bloomfield@intel.com>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
This adds a small module for managing a pool of batch buffers.
The only current use case is for the command parser, as described
in the kerneldoc in the patch. The code is simple, but separating
it out makes it easier to change the underlying algorithms and to
extend to future use cases should they arise.
The interface is simple: init to create an empty pool, fini to
clean it up, get to obtain a new buffer. Note that all buffers are
expected to be inactive before cleaning up the pool.
Locking is currently based on the caller holding the struct_mutex.
We already do that in the places where we will use the batch pool
for the command parser.
v2:
- s/BUG_ON/WARN_ON/ for locking assertions
- Remove the cap on pool size
- Switch from alloc/free to init/fini
v3:
- Idiomatic looping structure in _fini
- Correct handling of purged objects
- Don't return a buffer that's too much larger than needed
v4:
- Rebased to latest -nightly
v5:
- Remove _put() function and clean up comments to match
v6:
- Move purged check inside the loop (danvet, from v4 1/7 feedback)
v7:
- Use single list instead of two. (Chris W)
- s/active_list/cache_list
- Squashed in debug patches (Chris W)
drm/i915: Add a batch pool debugfs file
It provides some useful information about the buffers in
the global command parser batch pool.
v2: rebase on global pool instead of per-ring pools
v3: rebase
drm/i915: Add batch pool details to i915_gem_objects debugfs
To better account for the potentially large memory consumption
of the batch pool.
v8:
- Keep cache in LRU order (danvet, from v6 1/5 feedback)
Issue: VIZ-4719
Signed-off-by: Brad Volkin <bradley.d.volkin@intel.com>
Reviewed-By: Jon Bloomfield <jon.bloomfield@intel.com>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
After
commit a18c0af171
uthor: Thierry Reding <treding@nvidia.com>
Date: Wed Dec 10 11:38:49 2014 +0100
drm: Zero out DRM object memory upon cleanup
we will use the eDP encoder during destroying it. Fix this by calling
drm_encoder_cleanup() at a point when the encoder is not used any more.
This caused a NULL pointer dereference in pps_lock(), I can't see that
it caused any other problem.
All the other encoders seem to call drm_encoder_cleanup() at a safe
place.
Signed-off-by: Imre Deak <imre.deak@intel.com>
Reviewed-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Reviewed-by: Thierry Reding <treding@nvidia.com>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Atm, we don't disable RPS interrupts and related work items before
resetting the GPU. This may interfere with the following GPU
initialization and cause RPS interrupts to show up in PM_IIR too early
before calling gen6_enable_rps_interrupts() (triggering a WARN there).
Solve this by disabling RPS interrupts and flushing any related work
items before resetting the GPU.
v2:
- split out the common parts of the gt suspend and the new gt reset
functions (Paulo)
v3:
- remove the check for UMS, it's a NOP nowadays (Daniel)
Reported-by: He, Shuang <shuang.he@intel.com>
Testcase: igt/gem_reset_stats/ban-render
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=86644
Signed-off-by: Imre Deak <imre.deak@intel.com>
Reviewed-by: Paulo Zanoni <paulo.r.zanoni@intel.com>
Signed-off-by: Jani Nikula <jani.nikula@intel.com>
Paulo noticed that we don't enable RPS interrupts via PM_IER in
gen6_enable_rps_interrupts(). This wasn't a problem so far, since the
only place we disabled RPS interrupts was during system/runtime suspend
and after that we reenable all interrupts in the IRQ pre/postinstall
hooks.
In the next patch we'll disable/reenable RPS interrupts during GPU reset
too, but not call IRQ uninstall, pre/postinstall hooks, so there the
above wouldn't work. The logical place for programming PM_IER is
gen6_enable_rps_interrupts() and this also makes the function more
symmetric with gen6_disable_rps_interrupts(), so move the programming
there from the postinstall hooks.
Note that these changes don't affect the ILK RPS interrupt code, which
could be sanitized in a similar way. But that can be done as a
follow-up.
Credits-to: Paulo Zanoni <paulo.r.zanoni@intel.com>
Signed-off-by: Imre Deak <imre.deak@intel.com>
Reviewed-by: Paulo Zanoni <paulo.r.zanoni@intel.com>
Signed-off-by: Jani Nikula <jani.nikula@intel.com>
I've checked that TRANS_DDI_MODE, DP_TP_CTL MST bits are identical to
HSW/BDW on SKL, as well as the long vs short HPD bits. So we have a good
chance to be working as well as prevous platforms.
Signed-off-by: Damien Lespiau <damien.lespiau@intel.com>
Reviewed-by: Dave Airlie <airlied@redhat.com>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
2 pieces of code need to read out the DDI clock: the DDI encoder and the
MST encoder .get_config() vfuncs.
Until now the SKL read out code was only in the former, so let's move
the pre and post SKL logic in intel_ddi_clock_get() and this this one
everywhere.
Signed-off-by: Damien Lespiau <damien.lespiau@intel.com>
Reviewed-by: Dave Airlie <airlied@redhat.com>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
LFP brighness control from the VBT block 43 indicates which
controller is used for brightness.
LFP1 brightness control method:
Bit 7-4 = This field controller number of the brightnes controller.
0 = Controller 0
1 = Controller 1
2 = Controller 2
3 = Controller 3
Others = Reserved
Bits 3-0 = This field specifies the brightness control pin to be used on the
platform.
0 = PMIC pin is used for brightness control
1 = LPSS PWM is used for brightness control
2 = Display DDI is used for brightness control
3 = CABC method to control brightness
Others = Reserved
Adding the above fields in dev_priv->vbt and corresponding changes in
parse_backlight()
v2: Jani's review comments addressed
- Move PWM definitions to intel_bios.h
- Moving vbt_version to intel_vbt_data
- Rename brightness to bl_ctrl_data
- Logging just control_pin instead of string
- Avoid adding vbt_version in dev_priv
- Since only DDI option is available as of now, let control pin DDI
affect dev_priv->vbt.backlight.present
v3: Jani's review comments addressed
- Drop control_pin
- Use bdb->version
- set controller to 0 instead of using control pin define
- check controller bounds
- remove superfluous changes in intel_parse_bios
Signed-off-by: Deepak M <m.deepak@intel.com>
Signed-off-by: Vandana Kannan <vandana.kannan@intel.com>
Reviewed-by: Jani Nikula <jani.nikula@intel.com>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
We were incorreectly bypassing the flush everytime which led to fifo
underrun when more than one plane is enabled.
Signed-off-by: Sonika Jindal <sonika.jindal@intel.com>
Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Reviewed-by: Satheeshakrishna M<satheeshakrishna.m@intel.com>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Execlist support in the i915 driver is now considered good enough for the
feature to be enabled by default on Gen8 and later and routinely tested.
Adjusted i915 parameters structure initialization to reflect this and updated
the comment in intel_sanitize_enable_execlists().
There's still work to do before we can let the wider massive onto it,
but there's still time left before the 3.20 cutoff.
v2: Update the MODULE_PARM_DESC too.
Issue: VIZ-2020
Signed-off-by: Thomas Daniel <thomas.daniel@intel.com>
[danvet: Add note that there's still some work left to do.]
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
The pipe wm parameters is not correctly updated with sprite parameters
because it copies them for each plane from plane_list to the sprite
offset in pipe wm parameters. Since plane_list also contains primary and
cursor planes, we end up updating wrong params for sprites.
Signed-off-by: Sonika Jindal <sonika.jindal@intel.com>
Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
A short section describing background, implementation and intended usage.
v2:
* Align section name between template and DOC comment. (Michel Thierry)
For: VIZ-4544
Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Reviewed-by: Michel Thierry <michel.thierry@intel.com>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Things like reliable GGTT mappings and mirrored 2d-on-3d display will need
to map objects into the same address space multiple times.
Added a GGTT view concept and linked it with the VMA to distinguish between
multiple instances per address space.
New objects and GEM functions which do not take this new view as a parameter
assume the default of zero (I915_GGTT_VIEW_NORMAL) which preserves the
previous behaviour.
This now means that objects can have multiple VMA entries so the code which
assumed there will only be one also had to be modified.
Alternative GGTT views are supposed to borrow DMA addresses from obj->pages
which is DMA mapped on first VMA instantiation and unmapped on the last one
going away.
v2:
* Removed per view special casing in i915_gem_ggtt_prepare /
finish_object in favour of creating and destroying DMA mappings
on first VMA instantiation and last VMA destruction. (Daniel Vetter)
* Simplified i915_vma_unbind which does not need to count the GGTT views.
(Daniel Vetter)
* Also moved obj->map_and_fenceable reset under the same check.
* Checkpatch cleanups.
v3:
* Only retire objects once the last VMA is unbound.
v4:
* Keep scatter-gather table for alternative views persistent for the
lifetime of the VMA.
* Propagate binding errors to callers and handle appropriately.
v5:
* Explicitly look for normal GGTT view in i915_gem_obj_bound to align
usage in i915_gem_object_ggtt_unpin. (Michel Thierry)
* Change to single if statement in i915_gem_obj_to_ggtt. (Michel Thierry)
* Removed stray semi-colon in i915_gem_object_set_cache_level.
For: VIZ-4544
Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Cc: Daniel Vetter <daniel.vetter@ffwll.ch>
Reviewed-by: Michel Thierry <michel.thierry@intel.com>
[danvet: Drop hunk from i915_gem_shrink since it's just prettification
but upsets a __must_check warning.]
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
According to updated BSpec, Render/Common/media Wells register range changed.
Updating the same to match the spec and avoid extra forcewake for none
forcewake range.
v2: Update media forcewake range (Ville)
Signed-off-by: Deepak S <deepak.s@linux.intel.com>
Reviewed-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
From now on for both DSI Ports A & C, the seq_port value has been
set to 0. seq_port value is parsed from Sequence block#53 of VBT.
So, for packets that needs to be read/write for DSI single link on
Port A and Port C will now be based on the DVO port from VBT block 2,
instead of seq_port.
Signed-off-by: Gaurav K Singh <gaurav.k.singh@intel.com>
Reviewed-by: Jani Nikula <jani.nikula@intel.com>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Faster feedback to errors is always better. This is inspired by the
addition to WARN_ONs to mask/enable helpers for registers to make sure
callers have the arguments ordered correctly: Pretty much always the
arguments are static.
We use WARN_ON(1) a lot in default switch statements though where we
should always handle all cases. So add a new macro specifically for
that.
The idea to use __builtin_constant_p is from Chris Wilson.
v2: Use the ({}) gcc-ism to avoid the static inline, suggested by
Dave. My first attempt used __cond as the temp var, which is the same
used by BUILD_BUG_ON, but with inverted sense. Hilarity ensued, so
sprinkle i915 into the name.
Also use a temporary variable to only evaluate the condition once,
suggested by Damien.
v3: It's crazy but apparently 32bit gcc can't compile out the
BUILD_BUG_ON in a lot of cases and just falls over. I have no idea
why, but until clue grows just disable this nifty idea on 32bit
builds. Reported by 0-day builder.
v4: Got it all wrong, apparently its the gcc version. We need 4.9+.
Now reported by Imre.
v5: Chris suggested to add the case to MISSING_CASE for speedier
debug.
v6: Even some gcc 4.9 versions don't see through the maze, so give up
for now. Keep the skeleton and MISSING_CASE stuff though.
Cc: Imre Deak <imre.deak@intel.com>
Cc: Damien Lespiau <damien.lespiau@intel.com>
Cc: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Jani Nikula <jani.nikula@linux.intel.com>
Cc: Dave Gordon <david.s.gordon@intel.com>
Signed-off-by: Daniel Vetter <daniel.vetter@intel.com>
We consistently use the _irq_handler postfix for functions called in
hardirq context. Especially when it's a non-static function hardirq is
a crazy enough calling context to warrant this level of ocd. So rename
it.
Cc: Thomas Daniel <thomas.daniel@intel.com>
Reviewed-by: Thomas Daniel <thomas.daniel@intel.com>
Signed-off-by: Daniel Vetter <daniel.vetter@intel.com>