Commit Graph

31323 Commits

Author SHA1 Message Date
Chris Wilson
85e17f5974 drm/i915: Move the global sync optimisation to the timeline
Currently we try to reduce the number of synchronisations (now the
number of requests we need to wait upon) by noting that if we have
earlier waited upon a request, all subsequent requests in the timeline
will be after the wait. This only applies to requests in this timeline,
as other timelines will not be ordered by that waiter.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/20161028125858.23563-30-chris@chris-wilson.co.uk
2016-10-28 20:53:54 +01:00
Chris Wilson
caddfe7192 drm/i915: Defer breadcrumb emission
Move the actual emission of the breadcrumb for closing the request from
i915_add_request() to the submit callback. (It can be moved later when
required.) This allows us to defer the allocation of the global_seqno
from request construction to actual submission, allowing us to emit the
requests out of order (wrt to the order of their construction, they
still will only be executed one all of their dependencies are resolved
including that all earlier requests on their timeline have been
submitted.) We have to specialise how we then emit the request in order
to write into the preallocated space, rather than at the tail of the
ringbuffer (which will have been advanced by the addition of new
requests).

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/20161028125858.23563-29-chris@chris-wilson.co.uk
2016-10-28 20:53:54 +01:00
Chris Wilson
98f29e8d90 drm/i915: Record space required for breadcrumb emission
In the next patch, we will use deferred breadcrumb emission. That requires
reserving sufficient space in the ringbuffer to emit the breadcrumb, which
first requires us to know how large the breadcrumb is.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/20161028125858.23563-28-chris@chris-wilson.co.uk
2016-10-28 20:53:53 +01:00
Chris Wilson
9b81d556b1 drm/i915: Rename ->emit_request to ->emit_breadcrumb
Now that the emission of the request tail and its submission to hardware
are two separate steps, engine->emit_request() is confusing.
engine->emit_request() is called to emit the breadcrumb commands for the
request into the ring, name it such (engine->emit_breadcrumb).

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/20161028125858.23563-27-chris@chris-wilson.co.uk
2016-10-28 20:53:53 +01:00
Chris Wilson
65e4760e39 drm/i915: Introduce a global_seqno for each request
Though we will have multiple timelines, we still have a single timeline
of execution. This we can use to provide an execution and retirement order
of requests. This keeps tracking execution of requests simple, and vital
for preserving a single waiter (i.e. so that we can order the waiters so
that only the earliest to wakeup need be woken). To accomplish this we
distinguish the seqno used to order requests per-context (external) and
that used internally for execution.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/20161028125858.23563-26-chris@chris-wilson.co.uk
2016-10-28 20:53:53 +01:00
Chris Wilson
4680816be3 drm/i915: Wait first for submission, before waiting for request completion
In future patches, we will no longer be able to wait on a static global
seqno and instead have to break our wait up into phases. First we wait
for the global seqno assignment (upon submission to hardware), and once
submitted we wait for the hardware to complete.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/20161028125858.23563-25-chris@chris-wilson.co.uk
2016-10-28 20:53:52 +01:00
Chris Wilson
3033acab07 drm/i915: Queue the idling context switch after all other timelines
Before suspend, we wait for the switch to the kernel context. In order
for all the other context images to be complete upon suspend, that
switch must be the last operation by the GPU (i.e. this idling request
must not overtake any pending requests). To make this request execute last,
we make it depend on every other inflight request.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/20161028125858.23563-24-chris@chris-wilson.co.uk
2016-10-28 20:53:52 +01:00
Chris Wilson
73cb97010d drm/i915: Combine seqno + tracking into a global timeline struct
Our timelines are more than just a seqno. They also provide an ordered
list of requests to be executed. Due to the restriction of handling
individual address spaces, we are limited to a timeline per address
space but we use a fence context per engine within.

Our first step to introducing independent timelines per context (i.e. to
allow each context to have a queue of requests to execute that have a
defined set of dependencies on other requests) is to provide a timeline
abstraction for the global execution queue.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/20161028125858.23563-23-chris@chris-wilson.co.uk
2016-10-28 20:53:51 +01:00
Chris Wilson
c004a90b72 drm/i915: Restore nonblocking awaits for modesetting
After combining the dma-buf reservation object and the GEM reservation
object, we lost the ability to do a nonblocking wait on the i915 request
(as we blocked upon the reservation object during prepare_fb). We can
instead convert the reservation object into a fence upon which we can
asynchronously wait (including a forced timeout in case the DMA fence is
never signaled).

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/20161028125858.23563-22-chris@chris-wilson.co.uk
2016-10-28 20:53:51 +01:00
Chris Wilson
d07f0e59b2 drm/i915: Move GEM activity tracking into a common struct reservation_object
In preparation to support many distinct timelines, we need to expand the
activity tracking on the GEM object to handle more than just a request
per engine. We already use the struct reservation_object on the dma-buf
to handle many fence contexts, so integrating that into the GEM object
itself is the preferred solution. (For example, we can now share the same
reservation_object between every consumer/producer using this buffer and
skip the manual import/export via dma-buf.)

v2: Reimplement busy-ioctl (by walking the reservation object), postpone
the ABI change for another day. Similarly use the reservation object to
find the last_write request (if active and from i915) for choosing
display CS flips.

Caveats:

 * busy-ioctl: busy-ioctl only reports on the native fences, it will not
warn of stalls (in set-domain-ioctl, pread/pwrite etc) if the object is
being rendered to by external fences. It also will not report the same
busy state as wait-ioctl (or polling on the dma-buf) in the same
circumstances. On the plus side, it does retain reporting of which
*i915* engines are engaged with this object.

 * non-blocking atomic modesets take a step backwards as the wait for
render completion blocks the ioctl. This is fixed in a subsequent
patch to use a fence instead for awaiting on the rendering, see
"drm/i915: Restore nonblocking awaits for modesetting"

 * dynamic array manipulation for shared-fences in reservation is slower
than the previous lockless static assignment (e.g. gem_exec_lut_handle
runtime on ivb goes from 42s to 66s), mainly due to atomic operations
(maintaining the fence refcounts).

 * loss of object-level retirement callbacks, emulated by VMA retirement
tracking.

 * minor loss of object-level last activity information from debugfs,
could be replaced with per-vma information if desired

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/20161028125858.23563-21-chris@chris-wilson.co.uk
2016-10-28 20:53:50 +01:00
Chris Wilson
f0cd518206 drm/i915: Use lockless object free
Having moved the locked phase of freeing an object to a separate worker,
we can now declare to the core that we only need the unlocked variant of
driver->gem_free_object, and can use the simple unreference internally.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/20161028125858.23563-20-chris@chris-wilson.co.uk
2016-10-28 20:53:50 +01:00
Chris Wilson
fbbd37b36f drm/i915: Move object release to a freelist + worker
We want to hide the latency of releasing objects and their backing
storage from the submission, so we move the actual free to a worker.
This allows us to switch to struct_mutex freeing of the object in the
next patch.

Furthermore, if we know that the object we are dereferencing remains valid
for the duration of our access, we can forgo the usual synchronisation
barriers and atomic reference counting. To ensure this we defer freeing
an object til after an RCU grace period, such that any lookup of the
object within an RCU read critical section will remain valid until
after we exit that critical section. We also employ this delay for
rate-limiting the serialisation on reallocation - we have to slow down
object creation in order to prevent resource starvation (in particular,
files).

v2: Return early in i915_gem_tiling() ioctl to skip over superfluous
work on error.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/20161028125858.23563-19-chris@chris-wilson.co.uk
2016-10-28 20:53:49 +01:00
Chris Wilson
40e62d5d6b drm/i915: Acquire the backing storage outside of struct_mutex in set-domain
As we can locklessly (well struct_mutex-lessly) acquire the backing
storage, do so in set-domain-ioctl to reduce the contention on the
struct_mutex.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/20161028125858.23563-18-chris@chris-wilson.co.uk
2016-10-28 20:53:49 +01:00
Chris Wilson
fe115628d5 drm/i915: Implement pwrite without struct-mutex
We only need struct_mutex within pwrite for a brief window where we need
to serialise with rendering and control our cache domains. Elsewhere we
can rely on the backing storage being pinned, and forgive userspace any
races against us.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/20161028125858.23563-17-chris@chris-wilson.co.uk
2016-10-28 20:53:48 +01:00
Chris Wilson
bb6dc8d96b drm/i915: Implement pread without struct-mutex
We only need struct_mutex within pread for a brief window where we need
to serialise with rendering and control our cache domains. Elsewhere we
can rely on the backing storage being pinned, and forgive userspace any
races against us.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/20161028125858.23563-16-chris@chris-wilson.co.uk
2016-10-28 20:53:48 +01:00
Chris Wilson
7dd737f377 drm/i915/dmabuf: Acquire the backing storage outside of struct_mutex
Use the per-object mm.lock to allocate the backing storage (and hold a
reference to it across the dmabuf access) without resorting to
struct_mutex.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/20161028125858.23563-15-chris@chris-wilson.co.uk
2016-10-28 20:53:48 +01:00
Chris Wilson
1233e2db19 drm/i915: Move object backing storage manipulation to its own locking
Break the allocation of the backing storage away from struct_mutex into
a per-object lock. This allows parallel page allocation, provided we can
do so outside of struct_mutex (i.e. set-domain-ioctl, pwrite, GTT
fault), i.e. before execbuf! The increased cost of the atomic counters
are hidden behind i915_vma_pin() for the typical case of execbuf, i.e.
as the object is typically bound between execbufs, the page_pin_count is
static. The cost will be felt around set-domain and pwrite, but offset
by the improvement from reduced struct_mutex contention.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/20161028125858.23563-14-chris@chris-wilson.co.uk
2016-10-28 20:53:47 +01:00
Chris Wilson
03ac84f183 drm/i915: Pass around sg_table to get_pages/put_pages backend
The plan is to move obj->pages out from under the struct_mutex into its
own per-object lock. We need to prune any assumption of the struct_mutex
from the get_pages/put_pages backends, and to make it easier we pass
around the sg_table to operate on rather than indirectly via the obj.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/20161028125858.23563-13-chris@chris-wilson.co.uk
2016-10-28 20:53:47 +01:00
Chris Wilson
a4f5ea64f0 drm/i915: Refactor object page API
The plan is to make obtaining the backing storage for the object avoid
struct_mutex (i.e. use its own locking). The first step is to update the
API so that normal users only call pin/unpin whilst working on the
backing storage.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Reviewed-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/20161028125858.23563-12-chris@chris-wilson.co.uk
2016-10-28 20:53:46 +01:00
Chris Wilson
d2a84a76a3 drm/i915: Use radixtree to jump start intel_partial_pages()
We can use the radixtree index of the obj->pages to find the start
position of the desired partial range.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/20161028125858.23563-11-chris@chris-wilson.co.uk
2016-10-28 20:53:46 +01:00
Chris Wilson
96d7763452 drm/i915: Use a radixtree for random access to the object's backing storage
A while ago we switched from a contiguous array of pages into an sglist,
for that was both more convenient for mapping to hardware and avoided
the requirement for a vmalloc array of pages on every object. However,
certain GEM API calls (like pwrite, pread as well as performing
relocations) do desire access to individual struct pages. A quick hack
was to introduce a cache of the last access such that finding the
following page was quick - this works so long as the caller desired
sequential access. Walking backwards, or multiple callers, still hits a
slow linear search for each page. One solution is to store each
successful lookup in a radix tree.

v2: Rewrite building the radixtree for clarity, hopefully.

v3: Rearrange execbuf to avoid calling i915_gem_object_get_sg() from
within an atomic section and so relax the allocation context to a simple
GFP_KERNEL and mutex.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/20161028125858.23563-10-chris@chris-wilson.co.uk
2016-10-28 20:53:45 +01:00
Chris Wilson
4c7d62c6b8 drm/i915: Markup GEM API with lockdep asserts
Add lockdep_assert_held(struct_mutex) to the API preamble of the
internal GEM interfaces.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/20161028125858.23563-9-chris@chris-wilson.co.uk
2016-10-28 20:53:45 +01:00
Chris Wilson
4e50f082ac drm/i915: Reuse the active golden render state batch
The golden render state is constant, but we recreate the batch setting
it up for every new context. If we keep that batch in a volatile cache
we can safely reuse it whenever we need to initialise a new context. We
mark the pages as purgeable and use the shrinker to recover pages from
the batch whenever we face memory pressues, recreating that batch afresh
on the next new context.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Joonas Lahtinen <joonas.lahtien@linux.intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/20161028125858.23563-8-chris@chris-wilson.co.uk
2016-10-28 20:53:44 +01:00
Chris Wilson
920cf41949 drm/i915: Introduce an internal allocator for disposable private objects
Quite a few of our objects used for internal hardware programming do not
benefit from being swappable or from being zero initialised. As such
they do not benefit from using a shmemfs backing storage and since they
are internal and never directly exposed to the user, we do not need to
worry about providing a filp. For these we can use an
drm_i915_gem_object wrapper around a sg_table of plain struct page. They
are not swap backed and not automatically pinned. If they are reaped
by the shrinker, the pages are released and the contents discarded. For
the internal use case, this is fine as for example, ringbuffers are
pinned from being written by a request to be read by the hardware. Once
they are idle, they can be discarded entirely. As such they are a good
match for execlist ringbuffers and a small variety of other internal
objects.

In the first iteration, this is limited to the scratch batch buffers we
use (for command parsing and state initialisation).

v2: Allocate physically contiguous pages, where possible.
v3: Reduce maximum order on subsequent requests following an allocation
failure.
v4: Fix up mismatch between swiotlb segment size and page count (it
counts in 2k units, not 4k pages)

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Tvrtko Ursulin <tvrtko.ursulin@linux.intel.com>
Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@linux.intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/20161028125858.23563-7-chris@chris-wilson.co.uk
2016-10-28 20:53:44 +01:00
Chris Wilson
e95433c73a drm/i915: Rearrange i915_wait_request() accounting with callers
Our low-level wait routine has evolved from our generic wait interface
that handled unlocked, RPS boosting, waits with time tracking. If we
push our GEM fence tracking to use reservation_objects (required for
handling multiple timelines), we lose the ability to pass the required
information down to i915_wait_request(). However, if we push the extra
functionality from i915_wait_request() to the individual callsites
(i915_gem_object_wait_rendering and i915_gem_wait_ioctl) that make use
of those extras, we can both simplify our low level wait and prepare for
extending the GEM interface for use of reservation_objects.

v2: Rewrite i915_wait_request() kerneldocs

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Matthew Auld <matthew.william.auld@gmail.com>
Reviewed-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/20161028125858.23563-4-chris@chris-wilson.co.uk
2016-10-28 20:53:43 +01:00
Chris Wilson
f8a7fde456 drm/i915: Defer active reference until required
We only need the active reference to keep the object alive after the
handle has been deleted (so as to prevent a synchronous gem_close). Why
then pay the price of a kref on every execbuf when we can insert that
final active ref just in time for the handle deletion?

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/20161028125858.23563-6-chris@chris-wilson.co.uk
2016-10-28 20:53:43 +01:00
Chris Wilson
2e36991a8a drm/i915: Remove unused i915_gem_active_wait() in favour of _unlocked()
Since we only use the more generic unlocked variant, just rename it as
the normal i915_gem_active_wait(). The temporary cost is that we need to
always acquire the reference in a RCU safe manner, but the benefit is
that we will combine the common paths.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/20161028125858.23563-5-chris@chris-wilson.co.uk
2016-10-28 20:53:43 +01:00
Chris Wilson
c92ac094a9 drm/i915: Remove superfluous wait_for_error() from throttle-ioctl
The throttle-ioctl never touches the struct_mutex. It does, however, as
part of its ABI report whether the hardware is terminally wedged. For
that purposes, it only has to report the current state and not incur the
cost of checking/waiting every invocation, as we do not have to wait for
a reset before waiting on a request to ensure completion (that is baked
into the wait request implementation).

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/20161028125858.23563-3-chris@chris-wilson.co.uk
2016-10-28 20:53:42 +01:00
Chris Wilson
7e941861c9 drm/i915: Allow i915_sw_fence_await_sw_fence() to allocate
In forthcoming patches, we want to be able to dynamically allocate the
wait_queue_t used whilst awaiting. This is more convenient if we extend
the i915_sw_fence_await_sw_fence() to perform the allocation for us if
we pass in a gfp mask as an alternative than a preallocated struct.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/20161028125858.23563-2-chris@chris-wilson.co.uk
2016-10-28 20:53:42 +01:00
Chris Wilson
b52992c06c drm/i915: Support asynchronous waits on struct fence from i915_gem_request
We will need to wait on DMA completion (as signaled via struct fence)
before executing our i915_gem_request. Therefore we want to expose a
method for adding the await on the fence itself to the request.

v2: Add a comment detailing a failure to handle a signal-on-any
fence-array.
v3: Pretend that magic numbers don't exist.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/20161028125858.23563-1-chris@chris-wilson.co.uk
2016-10-28 20:53:41 +01:00
Chris Wilson
fc0990903c drm/i915: Remove insert-page shortcut from execbuf relocate_iomap()
We are not allowed to touch the GTT entries underneath an atomic section,
as they take a rpm wakelock (which is illegal from atomic context) and
in the near future acquiring the DMA address for a page within an object
may sleep for an allocation. This makes the current shortcircuit in
relocation_iomap() for performing a second relocation on an adjacent page
illegal, and we need to release the atomic iomapping, lookup the DMA,
insert it into the GTT before reentering the atomic iomap section.

As it happens, this is precisely what we do on if we are using an
iomapping over the full object and not just a single page and by
removing the shortcut, we do the right thing.

Fixes: 9c870d0367 ("drm/i915: Use RPM as the barrier for controlling...")
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/20161028142756.3850-1-chris@chris-wilson.co.uk
Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
2016-10-28 20:53:31 +01:00
Matt Roper
2c4b49a0f7 drm/i915: Use macro in place of open-coded for_each_universal_plane loop
This was the only use of (misleadingly-named) intel_num_planes()
function, so we can remove it as well.

Signed-off-by: Matt Roper <matthew.d.roper@intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/1477522291-10874-3-git-send-email-matthew.d.roper@intel.com
Reviewed-by: Paulo Zanoni <paulo.r.zanoni@intel.com>
2016-10-28 11:27:19 -07:00
Matt Roper
8b364b41ce drm/i915: Rename for_each_plane -> for_each_universal_plane
This macro's name is a bit misleading; it doesn't actually iterate over
all planes since it omits the cursor plane.  Its only uses are in gen9
code which is using it to iterate over the universal planes (which we
treat as primary+sprites); in these cases the legacy cursor registers
are programmed independently if necessary.  The macro's iterator value
(0 for primary plane, spritenum+1 for each secondary plane) also isn't
meaningful outside the gen9 context where the hardware considers them to
all be "universal" planes that follow this numbering.

This is just a renaming/clarification patch with no functional change.
However it will make the subsequent patches more clear.

Signed-off-by: Matt Roper <matthew.d.roper@intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/1477522291-10874-2-git-send-email-matthew.d.roper@intel.com
Reviewed-by: Paulo Zanoni <paulo.r.zanoni@intel.com>
2016-10-28 11:26:36 -07:00
Navare, Manasi D
40dba34112 drm/i915: Change the placement of some static functions in intel_dp.c
These static helper functions are required to be used during
fallback link rate implemnetation so they need to be placed at the top
of the file.

v3:
* Add cleanup to other patch (Mika Kahola)
v2:
* Dont move around functions declared in intel_drv.h (Rodrigo Vivi)

Cc: Jani Nikula <jani.nikula@linux.intel.com>
Cc: Daniel Vetter <daniel.vetter@intel.com>
Cc: Ville Syrjala <ville.syrjala@linux.intel.com>
Signed-off-by: Manasi Navare <manasi.d.navare@intel.com>
Reviewed-by: Mika Kahola <mika.kahola@intel.com>
Signed-off-by: Jani Nikula <jani.nikula@intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/1477524358-16563-4-git-send-email-manasi.d.navare@intel.com
2016-10-28 15:06:31 +03:00
Ander Conselvan de Oliveira
ed37892e6d drm/i915: Address broxton phy registers based on phy and channel number
The port registers related to the phys in broxton map to different
channels and specific phys. Make that mapping explicit.

v2: Pass enum dpio_phy to macros instead of mmio base. (Imre)

v3: Fix typo in macros. (Imre)

v4: Also change variables from u32 to enum dpio_phy. (Imre)
    Remove leftovers from previous version. (Imre)

v5: Actually git add the changes.

Signed-off-by: Ander Conselvan de Oliveira <ander.conselvan.de.oliveira@intel.com>
Reviewed-by: Imre Deak <imre.deak@intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/1476863940-6019-1-git-send-email-ander.conselvan.de.oliveira@intel.com
2016-10-28 12:25:24 +03:00
Ander Conselvan de Oliveira
e7583f7b10 drm/i915: Add location of the Rcomp resistor to bxt_ddi_phy_info
Use struct bxt_ddi_phy_info to hold information of where the Rcomp
resistor is located, instead of hard coding it in the init sequence.

Note that this moves the enabling of the phy with the Rcomp resistor out
of the power well enable code. That should be safe since
bxt_ddi_phy_init() is called while the power domains lock is held, and
that is the only way that function gets called, so there is no
possibility of a concurrent phy enable caused by a power domain get
call.

v2: Replace comment about lock with lockdep_assert_held()  (Imre)

Signed-off-by: Ander Conselvan de Oliveira <ander.conselvan.de.oliveira@intel.com>
Reviewed-by: Imre Deak <imre.deak@intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/62d209950ad48484564f3e793cf247cf62572a39.1475770848.git-series.ander.conselvan.de.oliveira@intel.com
2016-10-28 12:25:03 +03:00
Ander Conselvan de Oliveira
842d416654 drm/i915: Create a struct to hold information about the broxton phys
Information about which phy is dual channel is hardcoded in the phy init
sequence. Split that to a separate struct so the init sequence is more
generic.

v2: Restore mangled part that ended up in following patch. (Imre)

Signed-off-by: Ander Conselvan de Oliveira <ander.conselvan.de.oliveira@intel.com>
Reviewed-by: Imre Deak <imre.deak@intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/9102f4c984044126057e4fdd1b91a615ff25fae6.1475770848.git-series.ander.conselvan.de.oliveira@intel.com
2016-10-28 12:24:51 +03:00
Ander Conselvan de Oliveira
b6e08203cc drm/i915: Move broxton vswing sequence to intel_dpio_phy.c
The vswing sequence is related to the DPIO phy, so move it closer to the
rest of DPIO phy related code.

Signed-off-by: Ander Conselvan de Oliveira <ander.conselvan.de.oliveira@intel.com>
Reviewed-by: Imre Deak <imre.deak@intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/59aa5c85a115c5cbed81e793f20cd7b9f8de694b.1475770848.git-series.ander.conselvan.de.oliveira@intel.com
2016-10-28 12:24:45 +03:00
Ander Conselvan de Oliveira
f38861b814 drm/i915: Move DPIO phy documentation section to intel_dpio_phy.c
Move the DPIO phy documentation section to intel_dpio_phy.c, since that
is a more suitable place now that there is a source file dedicated for
those phys.

Signed-off-by: Ander Conselvan de Oliveira <ander.conselvan.de.oliveira@intel.com>
Reviewed-by: Imre Deak <imre.deak@intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/55a2d38c15c06a8c5bce498b28decc03948f0224.1475770848.git-series.ander.conselvan.de.oliveira@intel.com
2016-10-28 12:24:37 +03:00
Ander Conselvan de Oliveira
47a6bc61b8 drm/i915: Move broxton phy code to intel_dpio_phy.c
The phy in broxton is also a dpio phy, similar to cherryview but with
programming through MMIO. So move the code together with the other
similar phys.

Signed-off-by: Ander Conselvan de Oliveira <ander.conselvan.de.oliveira@intel.com>
Reviewed-by: Imre Deak <imre.deak@intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/d611de6d256593cf904172db7ff27f164480c228.1475770848.git-series.ander.conselvan.de.oliveira@intel.com
2016-10-28 12:24:01 +03:00
Ander Conselvan de Oliveira
b284eedaf7 drm/i915: Pass lane count to bxt_ddi_phy_calc_lane_optmin_mask()
Pass lane count to bxt_ddi_phy_calc_lane_optmin_mask() instead of having
it extract that number from a pipe_config to decouple the phy code from
intel_crtc_state.

Signed-off-by: Ander Conselvan de Oliveira <ander.conselvan.de.oliveira@intel.com>
Reviewed-by: Imre Deak <imre.deak@intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/a4977e0207e594953c4f9d1b5f2ef972a8679e74.1475770848.git-series.ander.conselvan.de.oliveira@intel.com
2016-10-28 12:23:53 +03:00
Ander Conselvan de Oliveira
362624c9ba drm/i915: Explicitly map broxton DPIO power wells to phys
The mapping from the BXT_DPIO_CMN_* power wells to their respective phys
required a detour implemented in the bxt_power_well_to_phy() function.
Instead, embed that information directly into the power_well struct, by
resurrecting the data field.

Signed-off-by: Ander Conselvan de Oliveira <ander.conselvan.de.oliveira@intel.com>
Reviewed-by: Imre Deak <imre.deak@intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/7fe97582fa08c7340ce6a3b6b0ea3e72a73182d7.1475770848.git-series.ander.conselvan.de.oliveira@intel.com
2016-10-28 12:23:45 +03:00
Ander Conselvan de Oliveira
01c3faa70b drm/i915: Rename struct i915_power_well field data to id
Calling it data seems to imply arbitrary data can be associated with the
power well. However, that field is used for look ups and expected to be
unique, so rename it.

Signed-off-by: Ander Conselvan de Oliveira <ander.conselvan.de.oliveira@intel.com>
Reviewed-by: Imre Deak <imre.deak@intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/f3916c3c5bfa793b0fc870fd44007a3ff425194d.1475770848.git-series.ander.conselvan.de.oliveira@intel.com
2016-10-28 12:23:30 +03:00
Daniel Vetter
96583ddbec Merge remote-tracking branch 'airlied/drm-next' into drm-intel-next-queued
Backmerge latest drm-next to pull in the s/fence/dma_fence/ rework,
needed before we merge more i915 fencing patches.

Signed-off-by: Daniel Vetter <daniel.vetter@intel.com>
2016-10-28 09:14:08 +02:00
Dave Airlie
fb422950c6 Merge branch 'linux-4.9' of git://github.com/skeggsb/linux into drm-next
Karol's work which greatly improves volt/clock changes on a
heap of boards, nothing too exciting beyond a random collection of fixes.

* 'linux-4.9' of git://github.com/skeggsb/linux: (33 commits)
  drm/nouveau/fb/nv50: defer DMA mapping of scratch page to oneinit() hook
  drm/nouveau/fb/gf100: defer DMA mapping of scratch page to oneinit() hook
  drm/nouveau/pci: set streaming DMA mask early
  drm/nouveau/kms: add Maxwell to backlight initialization
  drm/nouveau/bar/nv50: fix bar2 vm size
  drm/nouveau/disp: remove unused function in sorg94.c
  drm/nouveau/volt: use kernel's 64-bit signed division function
  drm/nouveau/core: add missing header dependencies
  drm/nouveau/gr/nv3x: add 0x0597 kelvin 3d class support
  drm/nouveau/drm/nouveau: add a LED driver for the NVIDIA logo
  drm/nouveau/fb/ram: Use Kepler implementation on Maxwell
  drm/nouveau/volt: Make use of cvb coefficients
  drm/nouveau/volt/gf100-: Add speedo
  drm/nouveau/volt: Add implementation for gf100
  drm/nouveau/bios/vmap: unk0 field is the mode
  drm/nouveau/volt: Don't require perfect fit
  drm/nouveau/clk: Allow boosting only when NvBoost is set
  drm/nouveau/bios: Add parsing of VPSTATE table
  drm/nouveau/clk: Respect voltage limits in nvkm_cstate_prog
  drm/nouveau/clk: Fixup cstate selection
  ...
2016-10-28 14:24:56 +10:00
Dave Airlie
220196b384 Merge tag 'topic/drm-misc-2016-10-27' of git://anongit.freedesktop.org/git/drm-intel into drm-next
Pull request already again to get the s/fence/dma_fence/ stuff in and
allow everyone to resync. Otherwise really just misc stuff all over, and a
new bridge driver.

* tag 'topic/drm-misc-2016-10-27' of git://anongit.freedesktop.org/git/drm-intel:
  drm/bridge: fix platform_no_drv_owner.cocci warnings
  drm/bridge: fix semicolon.cocci warnings
  drm: Print some debug/error info during DP dual mode detect
  drm: mark drm_of_component_match_add dummy inline
  drm/bridge: add Silicon Image SiI8620 driver
  dt-bindings: add Silicon Image SiI8620 bridge bindings
  video: add header file for Mobile High-Definition Link (MHL) interface
  drm: convert DT component matching to component_match_add_release()
  dma-buf: Rename struct fence to dma_fence
  dma-buf/fence: add an lockdep_assert_held()
  drm/dp: Factor out helper to distinguish between branch and sink devices
  drm/edid: Only print the bad edid when aborting
  drm/msm: add missing header dependencies
  drm/msm/adreno: move function declarations to header file
  drm/i2c/tda998x: mark symbol static where possible
  doc: add missing docbook parameter for fence-array
  drm: RIP mode_config->rotation_property
  drm/msm/mdp5: Advertize 180 degree rotation
  drm/msm/mdp5: Use per-plane rotation property
2016-10-28 11:33:52 +10:00
Rex Zhu
3495a10357 drm/amdgpu: turn on/off uvd clock when dpm enable/disable on CI
Signed-off-by: Rex Zhu <Rex.Zhu@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2016-10-27 15:18:58 -04:00
Rex Zhu
415282b15e drm/amdgpu: disable dpm before turn off clock when vce idle.
v2: move return value check as well

Signed-off-by: Rex Zhu <Rex.Zhu@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2016-10-27 15:18:48 -04:00
Rex Zhu
4be5097ccb drm/amdgpu: enable uvd bypass mode for CI/VI.
Signed-off-by: Rex Zhu <Rex.Zhu@amd.com>
Acked-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2016-10-27 15:18:38 -04:00
Rex Zhu
3f767e3d07 drm/amdgpu: just not load smc firmware if smu is already running
Signed-off-by: Rex Zhu <Rex.Zhu@amd.com>
Acked-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2016-10-27 15:18:29 -04:00
Rex Zhu
86f8c599b0 drm/amdgpu: when suspend, set boot state instand of disable dpm.
fix pm-hibernate bug, when suspend/resume, dpm start failed.

Signed-off-by: Rex Zhu <Rex.Zhu@amd.com>
Acked-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2016-10-27 15:18:19 -04:00
Huang Rui
8ed8147abc drm/amdgpu: use failed label to handle context init failure
Signed-off-by: Huang Rui <ray.huang@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2016-10-27 15:18:09 -04:00
Tvrtko Ursulin
1353ec3833 drm/i915: Correct pipe fault reporting string
Newline somehow ended up in the middle of the line.

Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Reviewed-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
Link: http://patchwork.freedesktop.org/patch/msgid/1477572512-4030-1-git-send-email-tvrtko.ursulin@linux.intel.com
2016-10-27 15:08:43 +01:00
Daniel Vetter
f6c499eca0 Merge tag 'gvt-next-2016-10-27' of https://github.com/01org/gvt-linux into drm-intel-next-queued
gvt-next-2016-10-27

- Resolve current left build issue with ACPI=n and 32bit kernel
- TLB workaround from Arkadiusz
- vGPU reset fix from Ping
- workload scheduler nesting sleep fix from Changbin
- more misc fixes for sparse warnings and cleanups

Signed-off-by: Daniel Vetter <daniel.vetter@intel.com>
2016-10-27 10:33:17 +02:00
kbuild test robot
56df51d003 drm/bridge: fix platform_no_drv_owner.cocci warnings
drivers/gpu/drm/bridge/sil-sii8620.c:1556:3-8: No need to set .owner here. The core will do it.

 Remove .owner field if calls are used which set it automatically

Generated by: scripts/coccinelle/api/platform_no_drv_owner.cocci

CC: Andrzej Hajda <a.hajda@samsung.com>
Signed-off-by: Fengguang Wu <fengguang.wu@intel.com>
Signed-off-by: Archit Taneja <architt@codeaurora.org>
Link: http://patchwork.freedesktop.org/patch/msgid/20161026165836.GA98766@lkp-sb04.lkp.intel.com
2016-10-27 11:35:23 +05:30
kbuild test robot
3a81e96094 drm/bridge: fix semicolon.cocci warnings
drivers/gpu/drm/bridge/sil-sii8620.c:988:2-3: Unneeded semicolon

 Remove unneeded semicolon.

Generated by: scripts/coccinelle/misc/semicolon.cocci

CC: Andrzej Hajda <a.hajda@samsung.com>
Signed-off-by: Fengguang Wu <fengguang.wu@intel.com>
Signed-off-by: Archit Taneja <architt@codeaurora.org>
Link: http://patchwork.freedesktop.org/patch/msgid/20161026165836.GA98907@lkp-sb04.lkp.intel.com
2016-10-27 11:34:21 +05:30
Du, Changbin
e45d7b7f47 drm/i915/gvt: fix nested sleeping issue
We cannot use blocking method mutex_lock inside a wait loop.
Here we invoke pick_next_workload() which needs acquire a
mutex in our "condition" experssion. Then we go into a another
of the going-to-sleep sequence and changing the task state.
This is a dangerous. Let's rewrite the wait sequence to avoid
nested sleeping.

v2: fix do...while loop exit condition (zhenyu)
v3: rebase to gvt-staging branch

Signed-off-by: Du, Changbin <changbin.du@intel.com>
Signed-off-by: Zhenyu Wang <zhenyuw@linux.intel.com>
2016-10-27 11:20:42 +08:00
Bing Niu
6fb5082a8c drm/i915/gvt: throw error basing on execlist submit result
throw error message in elsp emulation handler basing on execlist
submit result. guest will trigger tdr process for recovering, gvt
just follow guest's desire.

v2: populate error to top of mmio emulation logic, comments from
zhenyu

Signed-off-by: Bing Niu <bing.niu@intel.com>
Signed-off-by: Zhenyu Wang <zhenyuw@linux.intel.com>
2016-10-27 11:20:42 +08:00
Ping Gao
23736d1b1b drm/i915/gvt: add full vGPU reset support
Full vGPU reset need to release all the shadow PPGGT pages to avoid
unnecessary write-protect and also should re-initialize pvinfo after
resetting vregs to keep pvinfo correct.

Signed-off-by: Ping Gao <ping.a.gao@intel.com>
Signed-off-by: Zhenyu Wang <zhenyuw@linux.intel.com>
2016-10-27 11:20:19 +08:00
Anusha Srivatsa
1c00164d4c drm/i915/DMC/KBL: Load DMC on KBL using the no_stepping_info array
Currently, for display there is only one DMC image for KBL.
Remove the stepping_info table for KBL and use the no_stepping_info
array for loading the firmware.

v2: Removed the block of code as pointed out by Rodrigo to make the
loads as generic as possible.

Cc: Rodrigo Vivi <rodrigo.vivi@intel.com>
Signed-off-by: Anusha Srivatsa <anusha.srivatsa@intel.com>
Reviewed-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
Acked-by: Imre Deak <imre.deak@intel.com>
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/1477355301-7035-1-git-send-email-anusha.srivatsa@intel.com
2016-10-26 14:20:53 -07:00
Imre Deak
9ff7a1b0ba drm: Print some debug/error info during DP dual mode detect
There's at least one LSPCON device that occasionally returns an unexpected
adaptor ID which leads to a failed detect. Print some debug info to help
debugging this and future cases. Also print an error for an unexpected
adaptor ID, so users can report it.

v2:
- s/adapter/adaptor/ and add code comment about incorrect type 1 adaptor
  IDs. (Ville)

Cc: dri-devel@lists.freedesktop.org
Cc: Ville Syrjälä <ville.syrjala@linux.intel.com>
Signed-off-by: Imre Deak <imre.deak@intel.com>
Reviewed-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Signed-off-by: Sean Paul <seanpaul@chromium.org>
Link: http://patchwork.freedesktop.org/patch/msgid/1477499359-12001-1-git-send-email-imre.deak@intel.com
2016-10-26 15:57:11 -04:00
Ville Syrjälä
da064b47c0 drm/i915: Fix SKL+ 90/270 degree rotated plane coordinate computation
Pass the framebuffer size in .16 fixed point coordinates to
drm_rect_rotate() since that's what the source coordinates are as well
at this stage. We used to do this part of the computation in integer
coordinates, but that got changed when moving the computation to
happen in the check phase of the operation. Unfortunately I forgot
to shift up the fb width and height appropriately.

With the bogus size we ended up with some negative fb offset, which when
added to the vma offset caused out scanout to start at an offset earlier
than we inteded. Eg. when testing on my SKL I saw a row of incorrect
tiles at the top of my screen.

Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Cc: Sivakumar Thulasimani <sivakumar.thulasimani@intel.com>
Cc: drm-intel-fixes@lists.freedesktop.org
Fixes: b63a16f6cd ("drm/i915: Compute display surface offset in the plane check hook for SKL+")
Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/1477325584-23679-1-git-send-email-ville.syrjala@linux.intel.com
Tested-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
2016-10-26 19:32:26 +03:00
Arkadiusz Hiler
aafee2eb8c drm/i915: fix comment on I915_{READ, WRITE}_FW
Comment mentioned use of intel_uncore_forcewake_irq{unlock, lock}
functions which are nonexistent (and never were).

The description was also incomplete and could cause confusion. Updated
comment is more elaborate on usage and caveats.

v2: mention __locked variant of intel_uncore_forcewake_{get,put} instead
    of plain ones

Cc: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Matthew Auld <matthew.auld@intel.com>
Cc: Mika Kuoppala <mika.kuoppala@linux.intel.com>
Signed-off-by: Arkadiusz Hiler <arkadiusz.hiler@intel.com>
Reviewed-by: Matthew Auld <matthew.auld@intel.com>
Reviewed-by: Chris Wilson <chris@chris-wilsono.c.uk>
[Mika: removed two superfluous lines on comment noted by Chris]
Signed-off-by: Mika Kuoppala <mika.kuoppala@intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/1477399682-3133-1-git-send-email-arkadiusz.hiler@intel.com
2016-10-26 14:45:33 +03:00
Imre Deak
489375c866 drm/i915/lspcon: Add workaround for resuming in PCON mode
On my APL the LSPCON firmware resumes in PCON mode as opposed to the
expected LS mode. It also appears to be in a state where AUX DPCD reads
will succeed but return garbage recovering only after a few hundreds of
milliseconds. After the recovery time DPCD reads will result in the
correct values and things will continue to work. If I2C over AUX is
attempted during this recovery time (implying an AUX write transaction)
the firmware won't recover and will stay in this broken state.

As a workaround check if the firmware is in PCON state after resume and
if so wait until the correct DPCD values are returned. For this we
compare the branch descriptor with the one we cached during init time.
If the firmware was in the LS state, we skip the w/a and continue as
before.

v2:
- Use the DP descriptor value cached in intel_dp. (Jani)
- Get to intel_dp using container_of(), instead of a cached ptr.
  (Shashank)
- Use usleep_range() instead of msleep().

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=98353
Cc: Shashank Sharma <shashank.sharma@intel.com>
Cc: Ville Syrjälä <ville.syrjala@linux.intel.com>
Cc: Jani Nikula <jani.nikula@intel.com>
Signed-off-by: Imre Deak <imre.deak@intel.com>
Reviewed-by: Jani Nikula <jani.nikula@intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/1477326811-30431-9-git-send-email-imre.deak@intel.com
2016-10-26 12:41:01 +03:00
Imre Deak
a5d94b83ec drm/i915/lspcon: Get DDC adapter via container_of() instead of cached ptr
We can use the container_of() magic to get to the DDC adapter, so no
need for caching a pointer to it. We'll also need to get at the intel_dp
ptr in the following patch, so add a helper that can be used for both
purposes.

Cc: Shashank Sharma <shashank.sharma@intel.com>
Signed-off-by: Imre Deak <imre.deak@intel.com>
Reviewed-by: Jani Nikula <jani.nikula@intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/1477326811-30431-8-git-send-email-imre.deak@intel.com
2016-10-26 12:41:01 +03:00
Imre Deak
12a47a4228 drm/i915/dp: Read DP descriptor for eDP and LSPCON too
As for external DP sink and branch devices read and print the DP
descriptor for eDP and LSPCON devices as well to aid debugging.

v2:
- Split out this change to a separate patch. (Jani)

Cc: Jani Nikula <jani.nikula@intel.com>
Signed-off-by: Imre Deak <imre.deak@intel.com>
Reviewed-by: Jani Nikula <jani.nikula@intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/1477326811-30431-7-git-send-email-imre.deak@intel.com
2016-10-26 12:41:01 +03:00
Imre Deak
24e807e79f drm/i915/lspcon: Fail LSPCON probe if the start of DPCD can't be read
All types of DP devices (eDP, DP sink, DP branch) will fail their probe
if the start of DPCD can't be read. The LSPCON PCON functionality also
depends on accessing this area, so fail the probe if the read fails.

Cc: Shashank Sharma <shashank.sharma@intel.com>
Signed-off-by: Imre Deak <imre.deak@intel.com>
Reviewed-by: Jani Nikula <jani.nikula@intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/1477326811-30431-6-git-send-email-imre.deak@intel.com
2016-10-26 12:41:00 +03:00
Imre Deak
7b3fc170d6 drm/i915/dp: Print full branch/sink descriptor
Extend the branch/sink descriptor info with the missing device ID
field. While at it also read out all the descriptor registers in one
transfer and make the debug print more compact.

v2: (Jani)
- Cache the descriptor in intel_dp.
- Split out this change into a separate patch.
v3: (Jani)
- Fix return value check of __intel_dp_read_desc().
- Use %pE instead of %s to print the device ID.

Cc: Jani Nikula <jani.nikula@intel.com>
Signed-off-by: Imre Deak <imre.deak@intel.com>
Reviewed-by: Jani Nikula <jani.nikula@intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/1477401159-15098-1-git-send-email-imre.deak@intel.com
2016-10-26 12:41:00 +03:00
Imre Deak
5e89667742 drm/i915/dp: Print only sink or branch specific OUI based on dev type
There are two separate sets of DPCD registers for the DP OUI - as well as
for the device ID and HW/SW revision - based on whether the given DP
device is a branch or a sink. Currently we print both branch and sink
OUIs, for consistency print only the one that corresponds to the
probed device.

v2:
- Split out this change into a separate patch. (Jani)

Cc: Jani Nikula <jani.nikula@intel.com>
Signed-off-by: Imre Deak <imre.deak@intel.com>
Reviewed-by: Jani Nikula <jani.nikula@intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/1477326811-30431-4-git-send-email-imre.deak@intel.com
2016-10-26 12:40:59 +03:00
Imre Deak
6f172a43a6 drm/i915/dp: Remove debug dependency of DPCD SW/HW revision read
Performing DPCD AUX reads based on debug settings may introduce obscure
bugs in other places that depend on the read being done (or being not
done). To reduce the uncertainty perform the reads unconditionally.

Cc: Mika Kahola <mika.kahola@intel.com>
Suggested-by: Jani Nikula <jani.nikula@intel.com>
Signed-off-by: Imre Deak <imre.deak@intel.com>
Reviewed-by: Jani Nikula <jani.nikula@intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/1477326811-30431-3-git-send-email-imre.deak@intel.com
2016-10-26 12:40:59 +03:00
Imre Deak
c726ad01d2 drm/dp: Factor out helper to distinguish between branch and sink devices
This check is open-coded in a few places, so it makes sense to simplify
things by having a helper for it similar to the rest of DPCD feature
helpers.

v2: (Jani)
- Move the helper to drm_dp_helper.h.
- Split out this change to a separate patch.

Cc: Jani Nikula <jani.nikula@intel.com>
Cc: dri-devel@lists.freedesktop.org
Signed-off-by: Imre Deak <imre.deak@intel.com>
Reviewed-by: Jani Nikula <jani.nikula@intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/1477326811-30431-2-git-send-email-imre.deak@intel.com
2016-10-26 12:40:59 +03:00
Libin Yang
6014ac122e drm/i915/audio: set proper N/M in modeset
When modeset occurs and the LS_CLK is set to some special values in DP
mode, the N/M need to be set manually if audio is playing. Otherwise the
first several seconds may be silent in audio playback.

The relationship of Maud and Naud is expressed in the following
equation:

Maud/Naud = 512 * fs / f_LS_Clk

Please refer VESA DisplayPort Standard spec for details.

v2 by Jani:
- organize Maud/Naud table according to DP 1.4 spec
- add 64k and 128k audio rates
- update HSW_AUD_M_CTS_ENABLE register when Maud not found
- remove extra checks for port clock
- simplify Maud/Naud lookup
- reset patch author back to Libin

Cc: "Zhang, Keqiao" <keqiao.zhang@intel.com>
Cc: Ville Syrjälä <ville.syrjala@linux.intel.com>
Cc: "Lin, Mengdong" <mengdong.lin@intel.com>
Reviewed-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Signed-off-by: Libin Yang <libin.yang@linux.intel.com>
Signed-off-by: Jani Nikula <jani.nikula@intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/1477407258-30599-3-git-send-email-jani.nikula@intel.com
2016-10-26 12:36:30 +03:00
Jani Nikula
9ca89c443d drm/i915/audio: drop extra crtc clock check from HDMI audio N lookup
The array contains the crtc clock, rely on that. While at it, debug log
the HDMI N value or automatic mode.

Cc: Ville Syrjälä <ville.syrjala@linux.intel.com>
Cc: "Lin, Mengdong" <mengdong.lin@intel.com>
Cc: Libin Yang <libin.yang@linux.intel.com>
Reviewed-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Signed-off-by: Jani Nikula <jani.nikula@intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/1477407258-30599-2-git-send-email-jani.nikula@intel.com
2016-10-26 12:24:26 +03:00
Ville Syrjälä
1aab956c7b drm/i915: Refresh that status of MST capable connectors in ->detect()
Once we've determined that the sink is MST capable we never end up
running through the full detect cycle again, despite getting HPDs.
Fix tht by ripping out the incorrect piece of code responsible.

This got broken when I moved the long HPD handling to the ->detect()
hook, but failed to remove the leftover code.

Cc: Ander Conselvan de Oliveira <conselvan2@gmail.com>
Cc: drm-intel-fixes@lists.freedesktop.org
Cc: Rui Tiago Matos <tiagomatos@gmail.com>
Tested-by: Rui Tiago Matos <tiagomatos@gmail.com>
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=98323
Cc: Kirill A. Shutemov <kirill@shutemov.name>
Tested-by: Kirill A. Shutemov <kirill@shutemov.name>
References: https://bugs.freedesktop.org/show_bug.cgi?id=98306
Fixes: 27d4efc559 ("drm/i915: Move long hpd handling into the hotplug work")
Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/1477057478-29328-1-git-send-email-ville.syrjala@linux.intel.com
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
2016-10-26 11:16:34 +03:00
Tvrtko Ursulin
3299e7e434 drm/i915: Remove two invalid warns
Objects can have multiple VMAs used for display in which
case assertion that objects must not be pinned for display
more times than the current VMA is incorrect.

v2: Commit message update. (Chris Wilson)

Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Fixes: 058d88c433 ("drm/i915: Track pinned VMA")
Cc: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Cc: Ville Syrjälä <ville.syrjala@linux.intel.com>
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
Link: http://patchwork.freedesktop.org/patch/msgid/1477413635-3876-1-git-send-email-tvrtko.ursulin@linux.intel.com
2016-10-26 09:04:56 +01:00
Tvrtko Ursulin
07ee2bce6a drm/i915: Rotated view does not need a fence
We do not need to set up a fence for the rotated view.

Display does not need it and no one can access it.

v2: Move code to __i915_vma_set_map_and_fenceable. (Chris Wilson)

Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Fixes: 05a20d098d ("drm/i915: Move map-and-fenceable tracking to the VMA")
Cc: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
2016-10-26 09:04:55 +01:00
Andrzej Hajda
ce6e153f41 drm/bridge: add Silicon Image SiI8620 driver
SiI8620 transmitter converts eTMDS/HDMI signal to MHL 3.0.
It is controlled via I2C bus. Its interaction with other
devices in video pipeline is performed mainly on HW level.
The only interaction it does on device driver level is
filtering-out unsupported video modes, it exposes drm_bridge
interface to perform this operation.

Signed-off-by: Andrzej Hajda <a.hajda@samsung.com>
Signed-off-by: Archit Taneja <architt@codeaurora.org>
Link: http://patchwork.freedesktop.org/patch/msgid/1476085157-5266-1-git-send-email-a.hajda@samsung.com
2016-10-26 11:19:12 +05:30
Ping Gao
0a8b66e3ad drm/i915/gvt: correct the reset logic
The current_vgpu will set to NULL after stopping the scheduler when
the reset is triggered by current vgpu, so here need change the
judgement condition for current vgpu detection.

Signed-off-by: Ping Gao <ping.a.gao@intel.com>
Signed-off-by: Zhenyu Wang <zhenyuw@linux.intel.com>
2016-10-26 13:36:41 +08:00
Ping Gao
40d2428b3a drm/i915/gvt: add vreg write for GDRST handler
The emulation handler for MMIO GDRST miss vreg write in it, as result
the vreg cannot update correspondingly.

Signed-off-by: Ping Gao <ping.a.gao@intel.com>
Signed-off-by: Zhenyu Wang <zhenyuw@linux.intel.com>
2016-10-26 13:33:05 +08:00
Xiaoguang Chen
b0122f3114 drm/i915/gvt: fix detect_host calling logic
Like other routines, intel_gvt_hypervisor_detect_host returns 0
for success.

Signed-off-by: Xiaoguang Chen <xiaoguang.chen@intel.com>
Signed-off-by: Jike Song <jike.song@intel.com>
Signed-off-by: Zhenyu Wang <zhenyuw@linux.intel.com>
2016-10-26 13:23:04 +08:00
Min He
64fafcf5a2 drm/i915/gvt: fix an typo in skl_decode_mi_display_flip
Fix type to set correct pipe number.

Signed-off-by: Min He <min.he@intel.com>
Signed-off-by: Zhenyu Wang <zhenyuw@linux.intel.com>
2016-10-26 13:12:03 +08:00
Akash Goel
ed4596ea99 drm/i915/guc: WA to address the Ringbuffer coherency issue
Driver accesses the ringbuffer pages, via GMADR BAR, if the pages are
pinned in mappable aperture portion of GGTT and for ringbuffer pages
allocated from Stolen memory, access can only be done through GMADR BAR.
In case of GuC based submission, updates done in ringbuffer via GMADR
may not get committed to memory by the time the Command streamer starts
reading them, resulting in fetching of stale data.

For Host based submission, such problem is not there as the write to Ring
Tail or ELSP register happens from the Host side prior to submission.
Access to any GFX register from CPU side goes to GTTMMADR BAR and Hw already
enforces the ordering between outstanding GMADR writes & new GTTMADR access.
MMIO writes from GuC side do not go to GTTMMADR BAR as GuC communication to
registers within GT is contained within GT, so ordering is not enforced
resulting in a race, which can manifest in form of a hang.

To ensure the flush of in-flight GMADR writes, a POSTING READ is done to
GuC register prior to doorbell ring.
There is already a similar WA in i915_gem_object_flush_gtt_write_domain(),
which takes care of GMADR writes from User space to GEM buffers, but not the
ringbuffer writes from KMD.
This WA is needed on all recent HW.

v2:
- Use POSTING_READ_FW instead of POSTING_READ as GuC register do not lie
  in any forcewake domain range and so the overhead of spinlock & search
  in the forcewake table is avoidable. (Chris)

Cc: Chris Wilson <chris@chris-wilson.co.uk>
Signed-off-by: Sagar Arun Kamble <sagar.a.kamble@intel.com>
Signed-off-by: Akash Goel <akash.goel@intel.com>
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Link: http://patchwork.freedesktop.org/patch/msgid/1477413323-1880-1-git-send-email-akash.goel@intel.com
2016-10-25 21:00:29 +01:00
Christian König
9982ca681e drm/amdgpu: add amdgpu_ttm_bo_eviction_valuable callback
This way we can correctly check split VRAM buffers as well.

Signed-off-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2016-10-25 14:44:04 -04:00
Christian König
a2ab19fed9 drm/ttm: make eviction decision a driver callback v2
This way the driver can decide if it is valuable to evict a BO or not.

The current implementation is added as default to all existing drivers.

v2: fix some typos found during internal testing

Signed-off-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2016-10-25 14:44:04 -04:00
Lucas Stach
1c331f75aa drm/radeon/pm: autoswitch power state when in balanced mode
The current default of always using the performance power state leads
to increased power consumption of mobile devices, which have a dedicated
battery power state. Switch between the performance and battery power
state automatically, dpending on the current AC power status, when the
user asked for the balanced power state.

The user can still override this logic by asking for the performance
or battery power state explicitly.

Tested-by: Michel Dänzer <michel.daenzer@amd.com>
Acked-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Lucas Stach <dev@lynxeye.de>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2016-10-25 14:44:03 -04:00
Colin Ian King
fad742f89d drm/amd/powerplay: fix spelling mistake and add KERN_WARNING to printks
Fix trivial spelling mistake cant't -> can't and add KERN_WARNING to
printk messages.  Remove redundant spaces before \n too (thanks to
Joe Perches for spotting those).

Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Colin Ian King <colin.king@canonical.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2016-10-25 14:44:02 -04:00
Monk Liu
aafcafa0fa drm/amdgpu:new ids flag for preempt
Reviewed-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Monk Liu <Monk.Liu@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2016-10-25 14:38:59 -04:00
Baoyou Xie
d1936cc2fc drm/amdgpu: mark symbols static where possible
We get 2 warnings when building kernel with W=1:
drivers/gpu/drm/amd/amdgpu/si.c:908:5: warning: no previous prototype for 'si_pciep_rreg' [-Wmissing-prototypes]
drivers/gpu/drm/amd/amdgpu/si.c:921:6: warning: no previous prototype for 'si_pciep_wreg' [-Wmissing-prototypes]

In fact, both functions are only used in the file in which they are
declared and don't need a declaration, but can be made static.
So this patch marks these functions with 'static'.

Acked-by: Edward O'Callaghan <funfunctor@folklore1984.net>
Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Baoyou Xie <baoyou.xie@linaro.org>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2016-10-25 14:38:58 -04:00
Baoyou Xie
356aee305a drm/amdgpu: change function declarations and add missing header dependencies
We get a few warnings when building kernel with W=1:
drivers/gpu/drm/amd/amdgpu/atombios_crtc.c:38:6: warning: no previous prototype for 'amdgpu_atombios_crtc_overscan_setup' [-Wmissing-prototypes]
drivers/gpu/drm/amd/amdgpu/dce_v8_0.c:661:6: warning: no previous prototype for 'dce_v8_0_disable_dce' [-Wmissing-prototypes]
drivers/gpu/drm/amd/amdgpu/amdgpu_gfx.c:40:5: warning: no previous prototype for 'amdgpu_gfx_scratch_get' [-Wmissing-prototypes]
drivers/gpu/drm/amd/amdgpu/amdgpu_gfx.c:62:6: warning: no previous prototype for 'amdgpu_gfx_scratch_free' [-Wmissing-prototypes]
....

In fact, these functions are declared in
drivers/gpu/drm/amd/amdgpu/atombios_crtc.h
drivers/gpu/drm/amd/amdgpu/amdgpu_gfx.h
drivers/gpu/drm/amd/amdgpu/dce_v8_0.h
drivers/gpu/drm/amd/amdgpu/dce_v10_0.h
drivers/gpu/drm/amd/amdgpu/dce_v11_0.h
drivers/gpu/drm/amd/powerplay/inc/pp_acpi.h.
So this patch adds missing header dependencies.

By the way, this patch changes declaration of amdgpu_gfx_parse_disable_cu()
to subject to its implement, and clean three function declarations
in pp_acpi.h up.

Acked-by: Edward O'Callaghan <funfunctor@folklore1984.net>
Acked-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Baoyou Xie <baoyou.xie@linaro.org>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2016-10-25 14:38:58 -04:00
Alex Deucher
f93932bcdc drm/amdgpu: s/amdgpuCrtc/amdgpu_crtc/ in pageflip code
Fix random CamelCase that has annoyed me for a while.

Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2016-10-25 14:38:57 -04:00
Alex Deucher
f1e68a7cf5 drm/amdgpu/atom: remove a bunch of unused functions
Leftovers from the radeon.

Reviewed-by: Edward O'Callaghan <funfunctor@folklore1984.net>
Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2016-10-25 14:38:56 -04:00
Alex Deucher
72a57438d1 drm/amdgpu: consolidate atom scratch reg handling for hangs
Move from asic specific code to common atom code.

Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2016-10-25 14:38:56 -04:00
Alex Deucher
a76ed485c5 drm/amdgpu: use amdgpu_bo_[create|free]_kernel for wb
Rather than open coding it.

Acked-by: Edward O'Callaghan <funfunctor@folklore1984.net>
Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2016-10-25 14:38:55 -04:00
Christian König
9861470181 drm/amdgpu: add VCE VM session tracking
Fix the problems with killing VCE sessions in VM mode.

Signed-off-by: Christian König <christian.koenig@amd.com>
Reviewed-and-Tested by: Leo Liu <leo.liu@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2016-10-25 14:38:55 -04:00
Christian König
45088efc85 drm/amdgpu: improve parse_cs handling a bit
This way we can use parse_cs and still keep VM mode enabled.

Signed-off-by: Christian König <christian.koenig@amd.com>
Reviewed-and-Tested by: Leo Liu <leo.liu@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2016-10-25 14:38:54 -04:00
Rex Zhu
5e876c62d8 drm/amdgpu: refine set power state logic for dpm.
Signed-off-by: Rex Zhu <Rex.Zhu@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2016-10-25 14:38:53 -04:00
Rex Zhu
8c8e2c30d2 drm/amdgpu: update current ps/requeset ps in adev with real ps.
Signed-off-by: Rex Zhu <Rex.Zhu@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2016-10-25 14:38:53 -04:00
Alex Deucher
6061789a45 drm/amdgpu: add an implement for check_power_state equal for KV
KV/KB/ML was missed these was implemented for other asics.

Acked-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2016-10-25 14:38:52 -04:00
Rex Zhu
3411717501 drm/amdgpu: add an implement for check_power_state equal for Si.
Signed-off-by: Rex Zhu <Rex.Zhu@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2016-10-25 14:38:52 -04:00
Rex Zhu
73909a746a drm/amdgpu: add an implement for check_power_state equal for Cz.
Signed-off-by: Rex Zhu <Rex.Zhu@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2016-10-25 14:38:51 -04:00
Rex Zhu
1d516c41d9 drm/amdgpu: add an implement for check_power_state equal for CI
Signed-off-by: Rex Zhu <Rex.Zhu@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2016-10-25 14:38:50 -04:00
Rex Zhu
fbebf2c6bc drm/amdgpu: add new callback to check power state info
Signed-off-by: Rex Zhu <Rex.Zhu@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2016-10-25 14:38:50 -04:00
Rex Zhu
db82b67c57 drm/amdgpu: check min clock set by DAL before set ps.
Signed-off-by: Rex Zhu <Rex.Zhu@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2016-10-25 14:38:49 -04:00
Tom St Denis
74f3ce31e9 drm/amd/amdgpu: Put in rest of wave fields
Add the rest of the basic SQ WAVE fields to
finish off the implementation.  Eventually,
a separate interface will be needed for GPRs.

Signed-off-by: Tom St Denis <tom.stdenis@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2016-10-25 14:38:49 -04:00
Tom St Denis
472259f026 drm/amd/amdgpu: re-factor debugfs wave reader
Move IP version specific code into a callback.

Also add support for gfx7 devices.

Signed-off-by: Tom St Denis <tom.stdenis@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2016-10-25 14:38:48 -04:00
Tom St Denis
394fdde256 drm/amd/amdgpu: Make debugfs write compliment read
Add PG lock support as well as bank selection to
the MMIO write function.

Signed-off-by: Tom St Denis <tom.stdenis@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2016-10-25 14:38:47 -04:00
Tom St Denis
32977f93b4 drm/amd/amdgpu: Allow broadcast on debugfs read (v2)
Allow any of the se/sh/instance fields to be
specified as a broadcast by submitting 0x3FF.

(v2) Fix broadcast range checking

Signed-off-by: Tom St Denis <tom.stdenis@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2016-10-25 14:38:47 -04:00
Tom St Denis
5ecfb3b8fc drm/amd/amdgpu: Fix debugfs wave reader
On non VI/CZ platforms it would not free
the grbm index lock.

Reviewed-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Tom St Denis <tom.stdenis@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2016-10-25 14:38:46 -04:00
Tom St Denis
273d7aa13c drm/amd/amdgpu: Add wave reader to debugfs
Currently supports CZ/VI.  Allows nearly atomic read
of wave data from GPU.

Signed-off-by: Tom St Denis <tom.stdenis@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2016-10-25 14:38:46 -04:00
Alex Deucher
a125510721 drm/amdgpu: rework IP block registration (v2)
This makes it easier to replace specific IP blocks on
asics for handling virtual_dce, DAL, etc. and for building
IP lists for hw or tables.  This also stored the status
information in the same structure.

v2: split out spelling fix into a separate patch
    add a function to add IPs to the list

Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2016-10-25 14:38:45 -04:00
Alex Deucher
cf35c7ca3d drm/amdgpu/powerplay: fix spelling in amdgpu_powerplay.h
and update a comment as well.

Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2016-10-25 14:38:44 -04:00
Alex Deucher
623fea1868 drm/amdgpu/virtual_dce: move define into source file
It's not used outside the file.

Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2016-10-25 14:38:44 -04:00
Alex Deucher
2120df475d drm/amdgpu: enable virtual dce on SI
Add the proper IP module when requested.

Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2016-10-25 14:38:43 -04:00
Alex Deucher
07fecde5d3 drm/amdgpu: fill in vce clock info ioctl query (v2)
Returns the vce clock table for the user mode driver.
The user mode driver can fill this data into vce clock
data packet for optimal VCE DPM.

v2: update to the new API

Reviewed-by: Rex Zhu <Rex.Zhu@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2016-10-25 14:38:43 -04:00
Alex Deucher
597be302f1 drm/amdgpu/powerplay: add an implementation for get_vce_clock_state (v3)
Used by the powerplay dpm code.

v2: update to the new API
v3: drop old include

Reviewed-by: Rex Zhu <Rex.Zhu@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2016-10-25 14:38:42 -04:00
Alex Deucher
825cc9974d drm/amdgpu/dpm: add an implementation for get_vce_clock_state (v2)
Used by the non-powerplay dpm code.

v2: update to the new API

Reviewed-by: Rex Zhu <Rex.Zhu@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2016-10-25 14:38:41 -04:00
Alex Deucher
230cf1ba72 drm/amdgpu/dpm: add new callback to fetch vce clock state (v2)
Will be used by the new info ioctl query.

v2: fetch a single state per request

Reviewed-by: Rex Zhu <Rex.Zhu@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2016-10-25 14:38:41 -04:00
Rex Zhu
66ba1afd85 drm/amdgpu: save number of vce states in dpm struct.
Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Rex Zhu <Rex.Zhu@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2016-10-25 14:38:40 -04:00
Rex Zhu
0d8de7ca0b drm/amdgpu: use same vce state definition in dpm and powerplay
Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Rex Zhu <Rex.Zhu@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2016-10-25 14:38:39 -04:00
Alex Deucher
cf0978819c drm/amdgpu: move dpm related definitions to amdgpu_dpm.h
No intended functional change.

Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2016-10-25 14:38:38 -04:00
Christian König
7988714237 drm/amdgpu: move align_mask and nop into ring funcs as well (v2)
They are constant as well.

v2: update uvd and vce phys ring structures as well

Signed-off-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2016-10-25 14:38:38 -04:00
Christian König
21cd942e5c drm/amdgpu: move the ring type into the funcs structure (v2)
It's constant, so it doesn't make to much sense to keep it
with the variable data.

v2: update vce and uvd phys mode ring structures as well

Signed-off-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2016-10-25 14:38:37 -04:00
Christian König
e12f3d7a23 drm/amdgpu: move IB and frame size directly into the engine description
I should have suggested that on the initial patchset. This saves us a
few CPU cycles during CS and a bunch of loc.

Signed-off-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2016-10-25 14:38:36 -04:00
Christian König
7bc6be825a drm/amdgpu: remove explicit NULL init for parse_cs
sed -i "/\.parse_cs = NULL,/d" drivers/gpu/drm/amd/amdgpu/*.c

That's just a leftover from radeon.

Signed-off-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2016-10-25 14:38:36 -04:00
Christian König
e08c90a774 drm/amdgpu: remove 128 NOP hack from vm_flush v2
With the padding raised to 256 DW that shouldn't be needed any more.

v2: reduce estimation as well

Signed-off-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2016-10-25 14:38:35 -04:00
Christian König
c81b07e6bc drm/amdgpu: remove ring type check for conditional execution
If a ring doesn't support that it shouldn't implement the function.

Signed-off-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2016-10-25 14:38:35 -04:00
Christian König
66f3b2d527 drm/amdgpu: pad gfx and compute rings to 256 dw
The same as on windows to avoid further problems with CE/DE
command submission overlaps.

Signed-off-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2016-10-25 14:38:34 -04:00
Alex Deucher
ec9aaaff66 drm/radeon: clarify why we evict vram twice on suspend
Update the comment to explain why we do this.

Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2016-10-25 14:38:33 -04:00
Alex Deucher
a0a71e49f5 drm/amdgpu: clarify why we evict vram twice on suspend
Update the comment to explain why we do this.

Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2016-10-25 14:38:33 -04:00
Alex Deucher
db9635cc14 drm/amdgpu: used cached gca values for vi_read_register (v2)
Using the cached values has less latency for bare metal
and SR-IOV, and prevents reading back bogus values if the
engine is powergated.

v2: fix typo in tile idx calculation

Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2016-10-25 14:38:32 -04:00
Alex Deucher
34817db6c7 drm/amdgpu/gfx8: use cached raster config values in csb setup
Simplify the code and properly set the csb for harvest values.

Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2016-10-25 14:38:32 -04:00
Alex Deucher
392f0c775c drm/amdgpu/gfx8: cache rb config values
Needed when for SR-IOV and when PG is enabled.

Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2016-10-25 14:38:31 -04:00
Alex Deucher
e3fa76306f drm/amdgpu: add additional cached gca config variables
We need to cache some additional values to handle SR-IOV
and PG.

Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2016-10-25 14:38:30 -04:00
Christian König
b88c8796d8 drm/amdgpu: use amdgpu_vm_get_pd_bo in the GEM code
Instead of messing with the PD directly.

Signed-off-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2016-10-25 14:38:30 -04:00
Christian König
073440d262 drm/amdgpu: move VM defines into amdgpu_vm.h
Only cleanup, no intended functional change.

Signed-off-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2016-10-25 14:38:29 -04:00
Christian König
7802301611 drm/amdgpu: move fence and ring defines into amdgpu_ring.h
Only cleanup, no intended functional change.

Signed-off-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2016-10-25 14:38:29 -04:00
Christian König
5611350499 drm/amdgpu: move sync handling into a separate header
Only cleanup, no intended functional change.

Signed-off-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2016-10-25 14:38:28 -04:00
Christian König
914b4dce4f drm/amdgpu: stop using a bo list entry for the VM PTs
Saves us a bit of memory.

Signed-off-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2016-10-25 14:38:28 -04:00
Christian König
f7da30d979 drm/amdgpu: move PT validation back into VM code v2
Saves a bunch of CPU cycles when swapping things back in and
allows us to split the VM headers into a separate file.

v2: rename parameters

Signed-off-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2016-10-25 14:38:27 -04:00
Christian König
a7d64de659 drm/amdgpu: remove adev pointer from struct amdgpu_bo v2
It's completely pointless to have two pointers to the
device in the same structure.

v2: rename function to amdgpu_ttm_adev, fix typos

Signed-off-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2016-10-25 14:38:26 -04:00
Tom St Denis
f3fd451263 drm/amd/amdgpu: Enable UVD PG on Tonga
Tested by reading tile/clk bits during load/idle.

Signed-off-by: Tom St Denis <tom.stdenis@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2016-10-25 14:38:26 -04:00
Tom St Denis
97f40ef049 drm/amd/powerplay: Enable UVD powergating for SMU7
This patch enables detecting VCE/UVD PG features and fixes the
UVD powergate function.

Tested on a Tonga (by reading UVD tile/clk bits during playback/idle).

Signed-off-by: Tom St Denis <tom.stdenis@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2016-10-25 14:38:25 -04:00
Christian König
f8991bab1a drm/amdgpu: update the shadow PD together with the real one v2
Far less CPU cycles needed for this approach.

v2: fix typo

Signed-off-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2016-10-25 14:38:25 -04:00
Frank Min
42e8cb5001 drm/amdgpu:wptr poll address of gfx8 is needed
for GFX8, gfx ring's wptr_addr is needed by SRIOV & CP for polling.

Signed-off-by: Frank Min <Frank.Min@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2016-10-25 14:38:24 -04:00
Monk Liu
4c2b2453ef drm/amdgpu:properly fix some JumpTable issues
we found some MEC ucode leads to IB test fail or even
ring test fail if Jump Table of it is not start in
FW bo with page aligned address, fixed by always make
JT address page aligned.

we don't need to patch JT2 for MEC2, because for VI,
MEC2 is a copy of MEC1, thus when converting fw_type
for MEC_JT2 we just return MEC1,hw can use the same
JT for both MEC1 & MEC2.

above two change fixed some ring/ib test failure issue
for some version of MEC ucode.

Signed-off-by: Frank Min <Frank.Min@amd.com>
Signed-off-by: Monk Liu <Monk.Liu@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2016-10-25 14:38:23 -04:00
Monk Liu
bed5712e1a drm/amdgpu:add MEC_STORAGE ucode id for sriov
for sriov, SMC need MEC_STORAGE reserved in fw bo.

Signed-off-by: Monk Liu <Monk.Liu@amd.com>
Signed-off-by: Frank Min <frank.min@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2016-10-25 14:38:23 -04:00
Frank Min
ac00bbf32b drm/amdgpu:add callback in cgs for sriov detect
Signed-off-by: Frank Min <Frank.Min@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2016-10-25 14:38:22 -04:00
Frank Min
f501a7e550 drm/amdgpu:fw bo should be in VRAM for SRIOV
for GTT memory SMC can only access it within PF space, which is not
used for SRIOV case, thus for SRIOV case, we let SMC use FB space for
ucode bo.

Signed-off-by: Frank Min <frank.min@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2016-10-25 14:38:22 -04:00
Frank Min
01ab960d49 drm/amdgpu:keep bo pinned in prefered domain
Signed-off-by: Frank Min <Frank.Min@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2016-10-25 14:38:21 -04:00
Monk Liu
4bc10d168a drm/amdgpu:use smc_index_11 for VI
for VI smc, index_0 to index_8 are all not safe,
they may used by BIOS/FW, and index_11 is reserved
only for driver.

Signed-off-by: Monk Liu <Monk.Liu@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2016-10-25 14:38:20 -04:00
Frank Min
e1d99217d0 drm/amdgpu:add one more fiji device id
Signed-off-by: Frank Min <Frank.Min@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2016-10-25 14:38:20 -04:00
Baoyou Xie
f8a4c11b0a drm/amd/powerplay: mark symbols static where possible
We get a few warnings when building kernel with W=1:
drivers/gpu/drm/amd/amdgpu/../powerplay/smumgr/fiji_smumgr.c:162:5: warning: no previous prototype for 'fiji_setup_pwr_virus' [-Wmissing-prototypes]
drivers/gpu/drm/amd/amdgpu/../powerplay/smumgr/fiji_smc.c:2052:5: warning: no previous prototype for 'fiji_program_mem_timing_parameters' [-Wmissing-prototypes]
drivers/gpu/drm/amd/amdgpu/../powerplay/smumgr/polaris10_smumgr.c:175:5: warning: no previous prototype for 'polaris10_avfs_event_mgr' [-Wmissing-prototypes]
drivers/gpu/drm/amd/amdgpu/../powerplay/hwmgr/cz_hwmgr.c:69:10: warning: no previous prototype for 'cz_get_eclk_level' [-Wmissing-prototypes]
drivers/gpu/drm/amd/amdgpu/../powerplay/hwmgr/smu7_hwmgr.c:92:26: warning: no previous prototype for 'cast_phw_smu7_power_state' [-Wmissing-prototypes]
....

In fact, these functions are only used in the file in which they are
declared and don't need a declaration, but can be made static.
So this patch marks these functions with 'static'.

Acked-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Baoyou Xie <baoyou.xie@linaro.org>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2016-10-25 14:38:19 -04:00
Baoyou Xie
22e5808eba drm/radeon: mark symbols static where possible
We get 4 warnings when building kernel with W=1:
drivers/gpu/drm/radeon/si.c:7850:5: warning: no previous prototype for 'si_vce_send_vcepll_ctlreq' [-Wmissing-prototypes]
drivers/gpu/drm/radeon/radeon_dp_mst.c:226:21: warning: no previous prototype for 'radeon_mst_best_encoder' [-Wmissing-prototypes]
drivers/gpu/drm/radeon/radeon_dp_mst.c:344:26: warning: no previous prototype for 'radeon_mst_find_connector' [-Wmissing-prototypes]
drivers/gpu/drm/radeon/radeon_dp_mst.c:600:6: warning: no previous prototype for 'radeon_dp_mst_encoder_destroy' [-Wmissing-prototypes]

In fact, these functions are only used in the file in which they are
declared and don't need a declaration, but can be made static.
So this patch marks these functions with 'static'.

Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Baoyou Xie <baoyou.xie@linaro.org>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2016-10-25 14:38:19 -04:00
Baoyou Xie
297b12862d drm/radeon: add missing header dependencies
We get a few warnings when building kernel with W=1:
drivers/gpu/drm/radeon/radeon_clocks.c:35:10: warning: no previous prototype for 'radeon_legacy_get_engine_clock' [-Wmissing-prototypes]
drivers/gpu/drm/radeon/atombios_encoders.c:75:1: warning: no previous prototype for 'atombios_get_backlight_level' [-Wmissing-prototypes]
drivers/gpu/drm/radeon/r600_cs.c:2268:5: warning: no previous prototype for 'r600_cs_parse' [-Wmissing-prototypes]
drivers/gpu/drm/radeon/evergreen_cs.c:2671:5: warning: no previous prototype for 'evergreen_cs_parse' [-Wmissing-prototypes]
....

In fact, these functions are declared
in drivers/gpu/drm/radeon/radeon_asic.h,
so this patch adds missing header dependencies.

Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Baoyou Xie <baoyou.xie@linaro.org>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2016-10-25 14:38:18 -04:00
Junwei Zhang
ef704318d3 drm/amd/amdgpu: bump version for memory query info
Signed-off-by: Junwei Zhang <Jerry.Zhang@amd.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2016-10-25 14:38:18 -04:00
Junwei Zhang
e0adf6c86c drm/amd/amdgpu: unify memory query info interface
Signed-off-by: Junwei Zhang <Jerry.Zhang@amd.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2016-10-25 14:38:17 -04:00
Christian König
6a7f76e70f drm/amdgpu: add VRAM manager v2
Split VRAM allocations into 4MB blocks.

v2: fix typo in comment, some suggested cleanups
v3: document how to disable the feature, fix rebase issue

Signed-off-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Edward O'Callaghan <funfunctor@folklore1984.net>
Tested-by: Mike Lothian <mike@fireburn.co.uk>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2016-10-25 14:38:16 -04:00
Christian König
8892f153c8 drm/amdgpu: enable amdgpu_move_blit to handle multiple MM nodes v2
This allows us to move scattered buffers around.

v2: fix a couple of typos, handle scattered to scattered moves as well.

Signed-off-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Edward O'Callaghan <funfunctor@folklore1984.net>
Tested-by: Mike Lothian <mike@fireburn.co.uk>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2016-10-25 14:38:16 -04:00
Christian König
63e0ba40e5 drm/amdgpu: handle multiple MM nodes in the VMs v2
This allows us to map scattered VRAM BOs to the VMs.

v2: fix offset handling, use pfn instead of offset,
    fix PAGE_SIZE != AMDGPU_GPU_PAGE_SIZE case

Signed-off-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Edward O'Callaghan <funfunctor@folklore1984.net>
Tested-by: Mike Lothian <mike@fireburn.co.uk>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2016-10-25 14:38:15 -04:00
Christian König
d2e938701a drm/amdgpu: set at least the node size in the gtt manager
Otherwise the new VM code becomes confused.

Signed-off-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Edward O'Callaghan <funfunctor@folklore1984.net>
Tested-by: Mike Lothian <mike@fireburn.co.uk>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2016-10-25 14:38:15 -04:00
Christian König
56de55a1a8 drm/amdgpu: use explicit limit for VRAM_CONTIGUOUS
Split VRAM won't have a valid offset, so just set an explicit limit
when the flag is given to trigger reallocation if necessary.

Signed-off-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Edward O'Callaghan <funfunctor@folklore1984.net>
Tested-by: Mike Lothian <mike@fireburn.co.uk>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2016-10-25 14:38:14 -04:00
Christian König
03f48dd5d2 drm/amdgpu: add AMDGPU_GEM_CREATE_VRAM_CONTIGUOUS flag v3
Add a flag noting that a BO must be created using linear VRAM
and set this flag on all in kernel users where appropriate.

Hopefully I haven't missed anything.

v2: add it in a few more places, fix CPU mapping.
v3: rename to VRAM_CONTIGUOUS, fix typo in CS code.

Signed-off-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Edward O'Callaghan <funfunctor@folklore1984.net>
Tested-by: Mike Lothian <mike@fireburn.co.uk>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2016-10-25 14:38:13 -04:00
Junwei Zhang
cfa32556e5 drm/amd/amdgpu: add info about vram and gtt max allocation size
Signed-off-by: Junwei Zhang <Jerry.Zhang@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2016-10-25 14:38:13 -04:00
Junwei Zhang
9f6163e7e3 drm/amd/amdgpu: add info about vram and gtt total size
Signed-off-by: Junwei Zhang <Jerry.Zhang@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2016-10-25 14:38:12 -04:00
Alex Deucher
46c9cc11a5 drm/amdgpu/dce6: don't enable HPD Rx interrupts
Not used currently.

Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2016-10-25 14:38:12 -04:00
Alex Deucher
079ea1901b drm/amdgpu/dce6: RMW hpd registers
No need to hard code the entire register to just
set/clear one bit.

Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2016-10-25 14:38:11 -04:00
Alex Deucher
34386043d9 drm/amdgpu/dce6: simplify hpd code
Use an address offset like other dce code.

Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2016-10-25 14:38:10 -04:00
Alex Deucher
d2486d25bd drm/amdgpu/dce11: simplify hpd code
use the hpd enum directly as an index

Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2016-10-25 14:38:10 -04:00
Alex Deucher
03ae23b93b drm/amdgpu/dce8: RMW hpd registers
No need to hard code the entire register to just
set/clear one bit.

Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2016-10-25 14:38:09 -04:00
Alex Deucher
6753ac2bf4 drm/amdgpu/dce10: simplify hpd code
use the hpd enum directly as an index

Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2016-10-25 14:38:09 -04:00
Alex Deucher
2285b91cd2 drm/amdgpu/dce8: simplify hpd code
Use an address offset like other dce code.

Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2016-10-25 14:38:08 -04:00
Emily Deng
0f66356d24 drm/amd/amdgpu: For virtual display, enable multi crtcs. (v3)
Enable multi crtcs for virtual display, user can set the number of crtcs
by amdgpu module parameter  virtual_display.

v2: make timers per crtc
v3: agd: simplify implementation

Signed-off-by: Emily Deng <Emily.Deng@amd.com>
Reviewed-By: Emily Deng <Emily.Deng@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2016-10-25 14:38:07 -04:00
Alex Deucher
483ef98588 drm/amdgpu: rename amdgpu_whether_enable_virtual_display
to match the other functions in that file.

Reviewed-By: Emily Deng <Emily.Deng@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2016-10-25 14:38:07 -04:00
Alex Deucher
87320cafbc Revert "drm/amdgpu: Add virtual connector and encoder macros."
This reverts commit 16925c92db.

This is no longer necessary.

Reviewed-By: Emily Deng <Emily.Deng@amd.com>
2016-10-25 14:38:06 -04:00
Alex Deucher
66264ba804 drm/amdgpu: simplify encoder and connector setup (v2)
No need to emulate all of the stuff for real hw.

v2: warning fix

Reviewed-By: Emily Deng <Emily.Deng@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2016-10-25 14:38:06 -04:00
Alex Deucher
9405e47dba drm/amdgpu/virtual_dce: clean up interrupt handling
We handle the virtual interrupts from a timer so no
need to try an look like we are handling IV ring events.

Reviewed-By: Emily Deng <Emily.Deng@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2016-10-25 14:38:05 -04:00
Alex Deucher
bf2335a54e drm/amdgpu/virtual_dce: no need to an irq process callback
Virtual crtcs interrupts do not show up in the IV ring,
so it will never be called.

Reviewed-By: Emily Deng <Emily.Deng@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2016-10-25 14:38:04 -04:00
Alex Deucher
82b9f81760 drm/amdgpu/virtual_dce: drop pageflip_irq funcs
Never used.

Reviewed-By: Emily Deng <Emily.Deng@amd.com>
Acked-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2016-10-25 14:38:04 -04:00
Alex Deucher
425f6d6033 drm/amdgpu/virtual_dce: drop empty function
No need to ack non-existent interrupts.

Reviewed-By: Emily Deng <Emily.Deng@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2016-10-25 14:38:03 -04:00
Alex Deucher
a1d37046d4 drm/amdgpu/virtual_dce: add dce6 support
disable the real dce hw if the asic supports dce.

Reviewed-By: Emily Deng <Emily.Deng@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2016-10-25 14:38:03 -04:00
Alex Deucher
1d160f4303 drm/amdgpu/dce6: add dce_v6_0_disable_dce
Needed for virtual dce support

Reviewed-By: Emily Deng <Emily.Deng@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2016-10-25 14:38:02 -04:00
Russell King
97ac0e47ae drm: convert DT component matching to component_match_add_release()
Convert DT component matching to use component_match_add_release().

Acked-by: Jyri Sarha <jsarha@ti.com>
Reviewed-by: Jyri Sarha <jsarha@ti.com>
Signed-off-by: Russell King <rmk+kernel@armlinux.org.uk>
Signed-off-by: Sean Paul <seanpaul@chromium.org>
Link: http://patchwork.freedesktop.org/patch/msgid/E1bwo6l-0005Io-Q1@rmk-PC.armlinux.org.uk
2016-10-25 11:52:38 -04:00
Chris Wilson
f54d186700 dma-buf: Rename struct fence to dma_fence
I plan to usurp the short name of struct fence for a core kernel struct,
and so I need to rename the specialised fence/timeline for DMA
operations to make room.

A consensus was reached in
https://lists.freedesktop.org/archives/dri-devel/2016-July/113083.html
that making clear this fence applies to DMA operations was a good thing.
Since then the patch has grown a bit as usage increases, so hopefully it
remains a good thing!

(v2...: rebase, rerun spatch)
v3: Compile on msm, spotted a manual fixup that I broke.
v4: Try again for msm, sorry Daniel

coccinelle script:
@@

@@
- struct fence
+ struct dma_fence
@@

@@
- struct fence_ops
+ struct dma_fence_ops
@@

@@
- struct fence_cb
+ struct dma_fence_cb
@@

@@
- struct fence_array
+ struct dma_fence_array
@@

@@
- enum fence_flag_bits
+ enum dma_fence_flag_bits
@@

@@
(
- fence_init
+ dma_fence_init
|
- fence_release
+ dma_fence_release
|
- fence_free
+ dma_fence_free
|
- fence_get
+ dma_fence_get
|
- fence_get_rcu
+ dma_fence_get_rcu
|
- fence_put
+ dma_fence_put
|
- fence_signal
+ dma_fence_signal
|
- fence_signal_locked
+ dma_fence_signal_locked
|
- fence_default_wait
+ dma_fence_default_wait
|
- fence_add_callback
+ dma_fence_add_callback
|
- fence_remove_callback
+ dma_fence_remove_callback
|
- fence_enable_sw_signaling
+ dma_fence_enable_sw_signaling
|
- fence_is_signaled_locked
+ dma_fence_is_signaled_locked
|
- fence_is_signaled
+ dma_fence_is_signaled
|
- fence_is_later
+ dma_fence_is_later
|
- fence_later
+ dma_fence_later
|
- fence_wait_timeout
+ dma_fence_wait_timeout
|
- fence_wait_any_timeout
+ dma_fence_wait_any_timeout
|
- fence_wait
+ dma_fence_wait
|
- fence_context_alloc
+ dma_fence_context_alloc
|
- fence_array_create
+ dma_fence_array_create
|
- to_fence_array
+ to_dma_fence_array
|
- fence_is_array
+ dma_fence_is_array
|
- trace_fence_emit
+ trace_dma_fence_emit
|
- FENCE_TRACE
+ DMA_FENCE_TRACE
|
- FENCE_WARN
+ DMA_FENCE_WARN
|
- FENCE_ERR
+ DMA_FENCE_ERR
)
 (
 ...
 )

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Gustavo Padovan <gustavo.padovan@collabora.co.uk>
Acked-by: Sumit Semwal <sumit.semwal@linaro.org>
Acked-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Link: http://patchwork.freedesktop.org/patch/msgid/20161025120045.28839-1-chris@chris-wilson.co.uk
2016-10-25 14:40:39 +02:00
Chris Wilson
de867c20b9 drm/i915: Include the kernel uptime in the error state
As well as knowing when the error occurred, it is more interesting to me
to know how long after booting the error occurred, and for good measure
record the time since last hw initialisation.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Matthew Auld <matthew.auld@intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/20161025121602.1457-1-chris@chris-wilson.co.uk
2016-10-25 13:22:43 +01:00
Akash Goel
7ef54de7fd drm/i915: Mark the GuC log buffer flush interrupts handling WQ as freezable
The GuC log buffer flush work item has to do a register access to send the
ack to GuC and this work item, if not synced before suspend, can potentially
get executed after the GFX device is suspended. This work item function uses
rpm get/put calls around the Hw access, which covers the rpm suspend case
but for system suspend a sync would be required as kernel can potentially
schedule the work items even after some devices, including GFX, have been
put to suspend. But sync has to be done only for the system suspend case,
as sync along with rpm get/put can cause a deadlock for rpm suspend path.
To have the sync, but like a NOOP, for rpm suspend path also this work
item could have been queued from the irq handler only when the device is
runtime active & kept active while that work item is pending or getting
executed but an interrupt can come even after the device is out of use and
so can potentially lead to missing of this work item.

By marking the workqueue, dedicated for handling GuC log buffer flush
interrupts, as freezable we don't have to bother about flushing of this
work item from the suspend hooks, the pending work item if any will be
either executed before the suspend or scheduled later on resume. This way
the handling of log buffer flush work item can be kept same between system
suspend & rpm suspend.

Suggested-by: Imre Deak <imre.deak@intel.com>
Cc: Imre Deak <imre.deak@intel.com>
Signed-off-by: Akash Goel <akash.goel@intel.com>
Reviewed-by: Imre Deak <imre.deak@intel.com>
Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
2016-10-25 09:34:23 +01:00
Akash Goel
1e6b8b0dc8 drm/i915: Early creation of relay channel for capturing boot time logs
As per the current i915 Driver load sequence, debugfs registration is done
at the end and so the relay channel debugfs file is also created after that
but the GuC firmware is loaded much earlier in the sequence.
As a result Driver could miss capturing the boot-time logs of GuC firmware
if there are flush interrupts from the GuC side.
Relay has a provision to support early logging where initially only relay
channel can be created, to have buffers for storing logs, and later on
channel can be associated with a debugfs file at appropriate time.
Have availed that, which allows Driver to capture boot time logs also,
which can be collected once Userspace comes up.

v2:
- Remove the couple of FIXMEs, as now the relay channel will be created
  early before enabling the flush interrupts, so no possibility of relay
  channel pointer being modified & read at the same time from 2 different
  execution contexts.
- Rebase.

v3:
- Add a comment to justiy setting 'is_global' before the NULL check on the
  parent directory dentry pointer.

Suggested-by: Chris Wilson <chris@chris-wilson.co.uk>
Signed-off-by: Akash Goel <akash.goel@intel.com>
Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
2016-10-25 09:34:23 +01:00
Akash Goel
717065907f drm/i915: Use SSE4.1 movntdqa based memcpy for sampling GuC log buffer
To ensure that we always get the up-to-date data from log buffer, its
better to access the buffer through an uncached CPU mapping. Also the way
buffer is accessed from GuC & Host side, manually doing cache flush may
not be effective always if cached CPU mapping is used. In order to avoid
any performance drop & have fast reads from the GuC log buffer, used SSE4.1
movntdqa based memcpy function i915_memcpy_from_wc, as copying using
movntqda from WC type memory is almost as fast as reading from WB memory.
This way log buffer sampling time will not get increased and so would be
able to deal with the flush interrupt storm when GuC is generating logs at
a very high rate.
Ideally SSE 4.1 should be present on all chipsets supporting GuC based
submisssions, but if not then logging will not be enabled.

v2: Rebase.

v3: Squash the WC type vmalloc mapping patch with this patch. (Chris)

Suggested-by: Chris Wilson <chris@chris-wilson.co.uk>
Signed-off-by: Akash Goel <akash.goel@intel.com>
Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
2016-10-25 09:34:23 +01:00
Sagar Arun Kamble
685534ef4c drm/i915: Debugfs support for GuC logging control
This patch provides debugfs interface i915_guc_output_control for
on the fly enabling/disabling of logging in GuC firmware and controlling
the verbosity level of logs.
The value written to the file, should have bit 0 set to enable logging and
bits 4-7 should contain the verbosity info.

v2: Add a forceful flush, to collect left over logs, on disabling logging.
    Useful for Validation.

v3: Besides minor cleanup, implement read method for the debugfs file and
    set the guc_log_level to -1 when logging is disabled. (Tvrtko)

v4: Minor cleanup & rebase. (Tvrtko)

v5:
- Lock struct_mutex after the NULL check for guc log buffer vma. (Chris)
- Rebase.

Signed-off-by: Sagar Arun Kamble <sagar.a.kamble@intel.com>
Signed-off-by: Akash Goel <akash.goel@intel.com>
Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
2016-10-25 09:34:23 +01:00
Sagar Arun Kamble
896a0cb0fe drm/i915: Support for forceful flush of GuC log buffer
GuC firmware sends a flush interrupt to Host when the log buffer is half
full and at that time only it updates the log buffer state.
But in certain cases, as described below, it could be useful to have all
that even when log buffer is only partially full. For that there is a force
log buffer flush Host2GuC action supported by GuC firmware.

For Validation requirements, a forceful flush is needed to collect the
left over logs on disabling logging. The same can be done before proceeding
with GPU/GuC reset as there could be some data in log buffer which is yet
to be captured and those logs would be particularly useful to understand
that why the reset was initiated.

Signed-off-by: Sagar Arun Kamble <sagar.a.kamble@intel.com>
Signed-off-by: Akash Goel <akash.goel@intel.com>
Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
2016-10-25 09:34:23 +01:00
Akash Goel
27b85beae0 drm/i915: Augment i915 error state to include the dump of GuC log buffer
Added the dump of GuC log buffer to i915 error state, as the contents of
GuC log buffer would also be useful to determine that why the GPU reset
was triggered.

v2:
- For uniformity use existing helper function print_error_obj() to
  dump out contents of GuC log buffer, pretty printing is better left
  to userspace. (Chris)
- Skip the dumping of GuC log buffer when logging is disabled as it
  won't be of any use.
- Rebase.

v3: Rebase.

Suggested-by: Chris Wilson <chris@chris-wilson.co.uk>
Signed-off-by: Akash Goel <akash.goel@intel.com>
Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
2016-10-25 09:34:23 +01:00
Akash Goel
72c0bc66ca drm/i915: Increase GuC log buffer size to reduce flush interrupts
In cases where GuC generate logs at a very high rate, correspondingly
the rate of flush interrupts is also very high.
So far total 8 pages were allocated for storing both ISR & DPC logs.
As per the half-full draining protocol followed by GuC, by doubling
the number of pages, the frequency of flush interrupts can be cut down
to almost half, which then helps in reducing the logging overhead.
So now allocating 8 pages apiece for ISR & DPC logs.
This also helps in reducing the output log file size, apart from
reducing the flush interrupt count. With the original settings,
44 KB was needed for one snapshot. With modified settings, 76 KB is
needed for a snapshot which will be equivalent to 2 snapshots of the
original setting. So 12KB saving, every 88 KB, over the original setting.

Suggested-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Signed-off-by: Akash Goel <akash.goel@intel.com>
Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
2016-10-25 09:34:23 +01:00
Akash Goel
6941f3c9ce drm/i915: Optimization to reduce the sampling time of GuC log buffer
GuC firmware sends an interrupt to flush the log buffer when it becomes
half full, so Driver doesn't really need to sample the complete buffer
and can just copy only the newly written data by GuC into the local
buffer, i.e. as per the read & write pointer values.
Moreover the flush interrupt would generally come for one type of log
buffer, when it becomes half full, so at that time the other 2 types of
log buffer would comparatively have much lesser unread data in them.
In case of overflow reported by GuC, Driver do need to copy the entire
buffer as the whole buffer would contain the unread data.

v2: Rebase.

v3: Fix the blooper of doing the copy twice. (Tvrtko)

v4: Add curlies for 'else' case also, matching the 'if'. (Tvrtko)

Signed-off-by: Akash Goel <akash.goel@intel.com>
Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
2016-10-25 09:34:23 +01:00
Akash Goel
5aa1ee4b12 drm/i915: Add stats for GuC log buffer flush interrupts
GuC firmware sends an interrupt to flush the log buffer when it
becomes half full. GuC firmware also tracks how many times the
buffer overflowed.
It would be useful to maintain a statistics of how many flush
interrupts were received and for which type of log buffer,
along with the overflow count of each buffer type.
Augmented i915_log_info debugfs to report back these statistics.

v2:
- Update the logic to detect multiple overflows between the 2
  flush interrupts and also log a message for overflow (Tvrtko)
- Track the number of times there was no free sub buffer to capture
  the GuC log buffer. (Tvrtko)

v3:
- Fix the printf field width for overflow counter, set it to 10 as per the
  max value of u32, which takes 10 digits in decimal form. (Tvrtko)

v4:
- Move the log buffer overflow handling to a new function for better
  readability. (Tvrtko)

Signed-off-by: Akash Goel <akash.goel@intel.com>
Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
2016-10-25 09:34:23 +01:00
Akash Goel
5dd7989bbd drm/i915: New lock to serialize the Host2GuC actions
With the addition of new Host2GuC actions related to GuC logging, there
is a need of a lock to serialize them, as they can execute concurrently
with each other and also with other existing actions.

v2: Use mutex in place of spinlock to serialize, as sleep can happen
    while waiting for the action's response from GuC. (Tvrtko)

v3: To conform to the general rules, acquire mutex before taking the
    forcewake. (Tvrtko)

Signed-off-by: Akash Goel <akash.goel@intel.com>
Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
2016-10-25 09:34:23 +01:00
Akash Goel
f824083559 drm/i915: Add a relay backed debugfs interface for capturing GuC logs
Added a new debugfs interface '/sys/kernel/debug/dri/guc_log' for the
User to capture GuC firmware logs. Availed relay framework to implement
the interface, where Driver will have to just use a relay API to store
snapshots of the GuC log buffer in the buffer managed by relay.
The snapshot will be taken when GuC firmware sends a log buffer flush
interrupt and up to four snapshots could be stored in the relay buffer.
The relay buffer will be operated in a mode where it will overwrite the
data not yet collected by User.
Besides mmap method, through which User can directly access the relay
buffer contents, relay also supports the 'poll' method. Through the 'poll'
call on log file, User can come to know whenever a new snapshot of the
log buffer is taken by Driver, so can run in tandem with the Driver and
capture the logs in a sustained/streaming manner, without any loss of data.

v2: Defer the creation of relay channel & associated debugfs file, as
    debugfs setup is now done at the end of i915 Driver load. (Chris)

v3:
- Switch to no-overwrite mode for relay.
- Fix the relay sub buffer switching sequence.

v4:
- Update i915 Kconfig to select RELAY config. (TvrtKo)
- Log a message when there is no sub buffer available to capture
  the GuC log buffer. (Tvrtko)
- Increase the number of relay sub buffers to 8 from 4, to have
  sufficient buffering for boot time logs

v5:
- Fix the alignment, indentation issues and some minor cleanup. (Tvrtko)
- Update the comment to elaborate on why a relay channel has to be
  associated with the debugfs file. (Tvrtko)

v6:
- Move the write to 'is_global' after the NULL check on parent directory
  dentry pointer. (Tvrtko)

v7: Add a BUG_ON to validate relay buffer allocation size. (Chris)

Testcase: igt/tools/intel_guc_logger

Suggested-by: Chris Wilson <chris@chris-wilson.co.uk>
Signed-off-by: Sourab Gupta <sourab.gupta@intel.com>
Signed-off-by: Akash Goel <akash.goel@intel.com>
Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
2016-10-25 09:34:23 +01:00
Sagar Arun Kamble
4100b2ab3e drm/i915: Handle log buffer flush interrupt event from GuC
GuC ukernel sends an interrupt to Host to flush the log buffer
and expects Host to correspondingly update the read pointer
information in the state structure, once it has consumed the
log buffer contents by copying them to a file or buffer.
Even if Host couldn't copy the contents, it can still update the
read pointer so that logging state is not disturbed on GuC side.

v2:
- Use a dedicated workqueue for handling flush interrupt. (Tvrtko)
- Reduce the overall log buffer copying time by skipping the copy of
  crash buffer area for regular cases and copying only the state
  structure data in first page.

v3:
 - Create a vmalloc mapping of log buffer. (Chris)
 - Cover the flush acknowledgment under rpm get & put.(Chris)
 - Revert the change of skipping the copy of crash dump area, as
   not really needed, will be covered by subsequent patch.

v4:
 - Destroy the wq under the same condition in which it was created,
   pass dev_piv pointer instead of dev to newly added GuC function,
   add more comments & rename variable for clarity. (Tvrtko)

v5:
- Allocate & destroy the dedicated wq, for handling flush interrupt,
  from the setup/teardown routines of GuC logging. (Chris)
- Validate the log buffer size value retrieved from state structure
  and do some minor cleanup. (Tvrtko)
- Fix error/warnings reported by checkpatch. (Tvrtko)
- Rebase.

v6:
 - Remove the interrupts_enabled check from guc_capture_logs_work, need
   to process that last work item also, queued just before disabling the
   interrupt as log buffer flush interrupt handling is a bit different
   case where GuC is actually expecting an ACK from host, which should be
   provided to keep the logging going.
   Sync against the work will be done by caller disabling the interrupt.
 - Don't sample the log buffer size value from state structure, directly
   use the expected value to move the pointer & do the copy and that cannot
   go wrong (out of bounds) as Driver only allocated the log buffer and the
   relay buffers. Driver should refrain from interpreting the log packet,
   as much possible and let Userspace parser detect the anomaly. (Chris)

v7:
- Use switch statement instead of 'if else' for retrieving the GuC log
  buffer size. (Tvrtko)
- Refactored the log buffer copying function and shortended the name of
  couple of variables for better readability. (Tvrtko)

v8:
- Make the dedicated wq as a high priority one to further reduce the
  turnaround time of handing log buffer flush event from GuC.

Signed-off-by: Sagar Arun Kamble <sagar.a.kamble@intel.com>
Signed-off-by: Akash Goel <akash.goel@intel.com>
Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
2016-10-25 09:34:06 +01:00
Sagar Arun Kamble
26705e2075 drm/i915: Support for GuC interrupts
There are certain types of interrupts which Host can receive from GuC.
GuC ukernel sends an interrupt to Host for certain events, like for
example retrieve/consume the logs generated by ukernel.
This patch adds support to receive interrupts from GuC but currently
enables & partially handles only the interrupt sent by GuC ukernel.
Future patches will add support for handling other interrupt types.

v2:
- Use common low level routines for PM IER/IIR programming (Chris)
- Rename interrupt functions to gen9_xxx from gen8_xxx (Chris)
- Replace disabling of wake ref asserts with rpm get/put (Chris)

v3:
- Update comments for more clarity. (Tvrtko)
- Remove the masking of GuC interrupt, which was kept masked till the
  start of bottom half, its not really needed as there is only a
  single instance of work item & wq is ordered. (Tvrtko)

v4:
- Rebase.
- Rename guc_events to pm_guc_events so as to be indicative of the
  register/control block it is associated with. (Chris)
- Add handling for back to back log buffer flush interrupts.

v5:
- Move the read & clearing of register, containing Guc2Host message
  bits, outside the irq spinlock. (Tvrtko)

v6:
- Move the log buffer flush interrupt related stuff to the following
  patch so as to do only generic bits in this patch. (Tvrtko)
- Rebase.

v7:
- Remove the interrupts_enabled check from gen9_guc_irq_handler, want to
  process that last interrupt also before disabling the interrupt, sync
  against the work queued by irq handler will be done by caller disabling
  the interrupt.

Signed-off-by: Sagar Arun Kamble <sagar.a.kamble@intel.com>
Signed-off-by: Akash Goel <akash.goel@intel.com>
Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
2016-10-25 09:34:06 +01:00
Akash Goel
f4e9af4f5a drm/i915: Add low level set of routines for programming PM IER/IIR/IMR register set
So far PM IER/IIR/IMR registers were being used only for Turbo related
interrupts. But interrupts coming from GuC also use the same set.
As a precursor to supporting GuC interrupts, added new low level routines
so as to allow sharing the programming of PM IER/IIR/IMR registers between
Turbo & GuC.
Also similar to PM IMR, maintaining a bitmask for PM IER register, to allow
easy sharing of it between Turbo & GuC without involving a rmw operation.

v2:
- For appropriateness & avoid any ambiguity, rename old functions
  enable/disable pm_irq to mask/unmask pm_irq and rename new functions
  enable/disable pm_interrupts to enable/disable pm_irq. (Tvrtko)
- Use u32 in place of uint32_t. (Tvrtko)

v3:
- Rename the fields pm_irq_mask & pm_ier_mask and do some cleanup. (Chris)
- Rebase.

v4: Fix the inadvertent disabling of User interrupt for VECS ring causing
    failure for certain IGTs.

v5: Use dev_priv with HAS_VEBOX macro. (Tvrtko)

Suggested-by: Chris Wilson <chris@chris-wilson.co.uk>
Signed-off-by: Akash Goel <akash.goel@intel.com>
Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
2016-10-25 09:34:06 +01:00
Akash Goel
d6b40b4b17 drm/i915: New structure to contain GuC logging related fields
So far there were 2 fields related to GuC logs in 'intel_guc' structure.
For the support of capturing GuC logs & storing them in a local buffer,
multiple new fields would have to be added. This warrants a separate
structure to contain the fields related to GuC logging state.
Added a new structure 'intel_guc_log' and instance of it inside
'intel_guc' structure.

v2: Rebase.

Signed-off-by: Akash Goel <akash.goel@intel.com>
Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
2016-10-25 09:34:06 +01:00
Sagar Arun Kamble
5d34e85a9e drm/i915: Add GuC ukernel logging related fields to fw interface file
The first page of the GuC log buffer contains state info or meta data
which is required to parse the logs contained in the subsequent pages.
The structure representing the state info is added to interface file
as Driver would need to handle log buffer flush interrupts from GuC.
Added an enum for the different message/event types that can be send
by the GuC ukernel to Host.
Also added 2 new Host to GuC action types to inform GuC when Host has
flushed the log buffer and forcefuly cause the GuC to send a new
log buffer flush interrupt.

v2:
- Make documentation of log buffer state structure more elaborate &
  rename LOGBUFFERFLUSH action to LOG_BUFFER_FLUSH for consistency.(Tvrtko)

v3: Add GuC log buffer layout diagram for more clarity.

Signed-off-by: Sagar Arun Kamble <sagar.a.kamble@intel.com>
Signed-off-by: Akash Goel <akash.goel@intel.com>
Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
2016-10-25 09:34:06 +01:00