Commit Graph

7 Commits

Author SHA1 Message Date
Eric Anholt
553c942f8b drm/vc4: Allow using more than 256MB of CMA memory.
Until now, we've had to limit Raspberry Pi to 256MB of CMA memory to
keep from triggering the hardware addressing bug between the tile
binner and the tile alloc memory (where the top 4 bits come from the
tile state data array's address).

To work around that and allow more memory to be reserved for graphics,
allocate a single BO to store tile state data arrays and tile
alloc/overflow memory while the GPU is active, and make sure that that
one BO doesn't happen to cross a 256MB boundary.  With that in place,
we can allocate textures and shaders anywhere in system memory (still
contiguous, of course).

Signed-off-by: Eric Anholt <eric@anholt.net>
Link: http://patchwork.freedesktop.org/patch/msgid/20170327231025.19391-1-eric@anholt.net
Reviewed-by: Boris Brezillon <boris.brezillon@free-electrons.com>
2017-04-18 14:32:20 -07:00
Eric Anholt
cdec4d3613 drm/vc4: Expose dma-buf fences for V3D rendering.
This is needed for proper synchronization with display on another DRM
device (pl111 or tinydrm) with buffers produced by vc4 V3D.  Fixes the
new igt vc4_dmabuf_poll testcase, and rendering of one of the glmark2
desktop tests on pl111+vc4.

This doesn't yet introduce waits on another device's fences before
vc4's rendering/display, because I don't have testcases for them.

v2: Reuse dma_fence_free(), retitle commit message to clarify that
    it's not a full dma-buf fencing implementation yet.

Signed-off-by: Eric Anholt <eric@anholt.net>
Link: http://patchwork.freedesktop.org/patch/msgid/20170412191202.22740-6-eric@anholt.net
Acked-by: Daniel Vetter <daniel.vetter@ffwll.ch>
2017-04-13 11:00:28 -07:00
Eric Anholt
72f793f14a drm/vc4: Convert existing documentation to actual kerneldoc.
I'm going to hook vc4 up to the sphinx build, so clean up its comments
to not generate warnings when we do.

Signed-off-by: Eric Anholt <eric@anholt.net>
Acked-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Link: http://patchwork.freedesktop.org/patch/msgid/20170227201144.10970-2-eric@anholt.net
2017-02-28 12:51:48 -08:00
Eric Anholt
9326e6f255 drm/vc4: Fix overflow mem unreferencing when the binner runs dry.
Overflow memory handling is tricky: While it's still referenced by the
BPO registers, we want to keep it from being freed.  When we are
putting a new set of overflow memory in the registers, we need to
assign the old one to the last rendering job using it.

We were looking at "what's currently running in the binner", but since
the bin/render submission split, we may end up with the binner
completing and having no new job while the renderer is still
processing.  So, if we don't find a bin job at all, look at the
highest-seqno (last) render job to attach our overflow to.

Signed-off-by: Eric Anholt <eric@anholt.net>
Fixes: ca26d28bba ("drm/vc4: improve throughput by pipelining binning and rendering jobs")
Cc: stable@vger.kernel.org
2016-08-19 19:17:34 -07:00
Varad Gautam
ca26d28bba drm/vc4: improve throughput by pipelining binning and rendering jobs
The hardware provides us with separate threads for binning and
rendering, and the existing model waits for them both to complete
before submitting the next job.

Splitting the binning and rendering submissions reduces idle time and
gives us approx 20-30% speedup with some x11perf tests such as -line10
and -tilerect1.  Improves openarena performance by 1.01897% +/-
0.247857% (n=16).

Thanks to anholt for suggesting this.

v2: Rebase on the spurious resets fix (change by anholt).

Signed-off-by: Varad Gautam <varadgautam@gmail.com>
Reviewed-by: Eric Anholt <eric@anholt.net>
Signed-off-by: Eric Anholt <eric@anholt.net>
2016-03-13 17:05:05 -07:00
Eric Anholt
2c68f1fcfb drm/vc4: Return an ERR_PTR from BO creation instead of NULL.
Fixes igt vc4_create_bo/create-bo-0 by returning -EINVAL from the
ioctl instead of -ENOMEM.

Signed-off-by: Eric Anholt <eric@anholt.net>
2016-02-16 12:20:58 -08:00
Eric Anholt
d5b1a78a77 drm/vc4: Add support for drawing 3D frames.
The user submission is basically a pointer to a command list and a
pointer to uniforms.  We copy those in to the kernel, validate and
relocate them, and store the result in a GPU BO which we queue for
execution.

v2: Drop support for NV shader recs (not necessary for GL), simplify
    vc4_use_bo(), improve bin flush/semaphore checks, use __u32 style
    types.

Signed-off-by: Eric Anholt <eric@anholt.net>
2015-12-07 20:05:10 -08:00