Commit Graph

11 Commits

Author SHA1 Message Date
Boris Brezillon
65101d8c91 drm/vc4: Expose performance counters to userspace
The V3D engine has various hardware counters which might be interesting
to userspace performance analysis tools.

Expose new ioctls to create/destroy a performance monitor object and
query the counter values of this perfmance monitor.

Note that a perfomance monitor is given an ID that is only valid on the
file descriptor it has been allocated from. A performance monitor can be
attached to a CL submission and the driver will enable HW counters for
this request and update the performance monitor values at the end of the
job.

Signed-off-by: Boris Brezillon <boris.brezillon@free-electrons.com>
Reviewed-by: Eric Anholt <eric@anholt.net>
Signed-off-by: Eric Anholt <eric@anholt.net>
Link: https://patchwork.freedesktop.org/patch/msgid/20180112090926.12538-1-boris.brezillon@free-electrons.com
2018-02-10 22:23:26 +00:00
Stefan Schake
ce9caf2f79 drm/vc4: Move IRQ enable to PM path
We were calling enable_irq on bind, where it was already enabled previously
by the IRQ helper. Additionally, dev->irq is not set correctly until after
postinstall and so was always zero here, triggering a warning in 4.15.
Fix both by moving the enable to the power management resume path, where we
know there was a previous disable invocation during suspend.

Fixes: 253696ccd6 ("drm/vc4: Account for interrupts in flight")
Signed-off-by: Stefan Schake <stschake@gmail.com>
Signed-off-by: Eric Anholt <eric@anholt.net>
Link: https://patchwork.freedesktop.org/patch/msgid/1514563543-32511-1-git-send-email-stschake@gmail.com
Tested-by: Stefan Wahren <stefan.wahren@i2se.com>
Reviewed-by: Eric Anholt <eric@anholt.net>
2018-01-03 15:56:03 -08:00
Stefan Schake
babc811005 drm/vc4: Release fence after signalling
We were never releasing the initial fence reference that is obtained
through dma_fence_init.

Link: https://github.com/anholt/linux/issues/122
Fixes: cdec4d3613 ("drm/vc4: Expose dma-buf fences for V3D rendering.")
Signed-off-by: Stefan Schake <stschake@gmail.com>
Signed-off-by: Eric Anholt <eric@anholt.net>
Reviewed-by: Eric Anholt <eric@anholt.net>
Link: https://patchwork.freedesktop.org/patch/msgid/1512236444-301-1-git-send-email-stschake@gmail.com
2017-12-08 13:02:22 -08:00
Stefan Schake
253696ccd6 drm/vc4: Account for interrupts in flight
Synchronously disable the IRQ to make the following cancel_work_sync
invocation effective.

An interrupt in flight could enqueue further overflow mem work. As we
free the binner BO immediately following vc4_irq_uninstall this caused
a NULL pointer dereference in the work callback vc4_overflow_mem_work.

Link: https://github.com/anholt/linux/issues/114
Signed-off-by: Stefan Schake <stschake@gmail.com>
Fixes: d5b1a78a77 ("drm/vc4: Add support for drawing 3D frames.")
Signed-off-by: Eric Anholt <eric@anholt.net>
Reviewed-by: Eric Anholt <eric@anholt.net>
Link: https://patchwork.freedesktop.org/patch/msgid/1510275907-993-2-git-send-email-stschake@gmail.com
2017-11-13 16:40:15 -08:00
Eric Anholt
553c942f8b drm/vc4: Allow using more than 256MB of CMA memory.
Until now, we've had to limit Raspberry Pi to 256MB of CMA memory to
keep from triggering the hardware addressing bug between the tile
binner and the tile alloc memory (where the top 4 bits come from the
tile state data array's address).

To work around that and allow more memory to be reserved for graphics,
allocate a single BO to store tile state data arrays and tile
alloc/overflow memory while the GPU is active, and make sure that that
one BO doesn't happen to cross a 256MB boundary.  With that in place,
we can allocate textures and shaders anywhere in system memory (still
contiguous, of course).

Signed-off-by: Eric Anholt <eric@anholt.net>
Link: http://patchwork.freedesktop.org/patch/msgid/20170327231025.19391-1-eric@anholt.net
Reviewed-by: Boris Brezillon <boris.brezillon@free-electrons.com>
2017-04-18 14:32:20 -07:00
Eric Anholt
cdec4d3613 drm/vc4: Expose dma-buf fences for V3D rendering.
This is needed for proper synchronization with display on another DRM
device (pl111 or tinydrm) with buffers produced by vc4 V3D.  Fixes the
new igt vc4_dmabuf_poll testcase, and rendering of one of the glmark2
desktop tests on pl111+vc4.

This doesn't yet introduce waits on another device's fences before
vc4's rendering/display, because I don't have testcases for them.

v2: Reuse dma_fence_free(), retitle commit message to clarify that
    it's not a full dma-buf fencing implementation yet.

Signed-off-by: Eric Anholt <eric@anholt.net>
Link: http://patchwork.freedesktop.org/patch/msgid/20170412191202.22740-6-eric@anholt.net
Acked-by: Daniel Vetter <daniel.vetter@ffwll.ch>
2017-04-13 11:00:28 -07:00
Eric Anholt
72f793f14a drm/vc4: Convert existing documentation to actual kerneldoc.
I'm going to hook vc4 up to the sphinx build, so clean up its comments
to not generate warnings when we do.

Signed-off-by: Eric Anholt <eric@anholt.net>
Acked-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Link: http://patchwork.freedesktop.org/patch/msgid/20170227201144.10970-2-eric@anholt.net
2017-02-28 12:51:48 -08:00
Eric Anholt
9326e6f255 drm/vc4: Fix overflow mem unreferencing when the binner runs dry.
Overflow memory handling is tricky: While it's still referenced by the
BPO registers, we want to keep it from being freed.  When we are
putting a new set of overflow memory in the registers, we need to
assign the old one to the last rendering job using it.

We were looking at "what's currently running in the binner", but since
the bin/render submission split, we may end up with the binner
completing and having no new job while the renderer is still
processing.  So, if we don't find a bin job at all, look at the
highest-seqno (last) render job to attach our overflow to.

Signed-off-by: Eric Anholt <eric@anholt.net>
Fixes: ca26d28bba ("drm/vc4: improve throughput by pipelining binning and rendering jobs")
Cc: stable@vger.kernel.org
2016-08-19 19:17:34 -07:00
Varad Gautam
ca26d28bba drm/vc4: improve throughput by pipelining binning and rendering jobs
The hardware provides us with separate threads for binning and
rendering, and the existing model waits for them both to complete
before submitting the next job.

Splitting the binning and rendering submissions reduces idle time and
gives us approx 20-30% speedup with some x11perf tests such as -line10
and -tilerect1.  Improves openarena performance by 1.01897% +/-
0.247857% (n=16).

Thanks to anholt for suggesting this.

v2: Rebase on the spurious resets fix (change by anholt).

Signed-off-by: Varad Gautam <varadgautam@gmail.com>
Reviewed-by: Eric Anholt <eric@anholt.net>
Signed-off-by: Eric Anholt <eric@anholt.net>
2016-03-13 17:05:05 -07:00
Eric Anholt
2c68f1fcfb drm/vc4: Return an ERR_PTR from BO creation instead of NULL.
Fixes igt vc4_create_bo/create-bo-0 by returning -EINVAL from the
ioctl instead of -ENOMEM.

Signed-off-by: Eric Anholt <eric@anholt.net>
2016-02-16 12:20:58 -08:00
Eric Anholt
d5b1a78a77 drm/vc4: Add support for drawing 3D frames.
The user submission is basically a pointer to a command list and a
pointer to uniforms.  We copy those in to the kernel, validate and
relocate them, and store the result in a GPU BO which we queue for
execution.

v2: Drop support for NV shader recs (not necessary for GL), simplify
    vc4_use_bo(), improve bin flush/semaphore checks, use __u32 style
    types.

Signed-off-by: Eric Anholt <eric@anholt.net>
2015-12-07 20:05:10 -08:00