The hardware provides us with separate threads for binning and
rendering, and the existing model waits for them both to complete
before submitting the next job.
Splitting the binning and rendering submissions reduces idle time and
gives us approx 20-30% speedup with some x11perf tests such as -line10
and -tilerect1. Improves openarena performance by 1.01897% +/-
0.247857% (n=16).
Thanks to anholt for suggesting this.
v2: Rebase on the spurious resets fix (change by anholt).
Signed-off-by: Varad Gautam <varadgautam@gmail.com>
Reviewed-by: Eric Anholt <eric@anholt.net>
Signed-off-by: Eric Anholt <eric@anholt.net>
The user submission is basically a pointer to a command list and a
pointer to uniforms. We copy those in to the kernel, validate and
relocate them, and store the result in a GPU BO which we queue for
execution.
v2: Drop support for NV shader recs (not necessary for GL), simplify
vc4_use_bo(), improve bin flush/semaphore checks, use __u32 style
types.
Signed-off-by: Eric Anholt <eric@anholt.net>