Commit Graph

20 Commits

Author SHA1 Message Date
Eric Anholt
98830d91da drm/vc4: Add T-format scanout support.
The T tiling format is what V3D uses for textures, with no raster
support at all until later revisions of the hardware (and always at a
large 3D performance penalty).  If we can't scan out V3D's format,
then we often need to do a relayout at some stage of the pipeline,
either right before texturing from the scanout buffer (common in X11
without a compositor) or between a tiled screen buffer right before
scanout (an option I've considered in trying to resolve this
inconsistency, but which means needing to use the dirty fb ioctl and
having some update policy).

T-format scanout lets us avoid either of those shadow copies, for a
massive, obvious performance improvement to X11 window dragging
without a compositor.  Unfortunately, enabling a compositor to work
around the discrepancy has turned out to be too costly in memory
consumption for the Raspbian distribution.

Because the HVS operates a scanline at a time, compositing from T does
increase the memory bandwidth cost of scanout.  On my 1920x1080@32bpp
display on a RPi3, we go from about 15% of system memory bandwidth
with linear to about 20% with tiled.  However, for X11 this still ends
up being a huge performance win in active usage.

This patch doesn't yet handle src_x/src_y offsetting within the tiled
buffer.  However, we fail to do so for untiled buffers already.

Signed-off-by: Eric Anholt <eric@anholt.net>
Link: http://patchwork.freedesktop.org/patch/msgid/20170608001336.12842-1-eric@anholt.net
Reviewed-by: Boris Brezillon <boris.brezillon@free-electrons.com>
2017-06-15 16:02:45 -07:00
Eric Anholt
bb7d785688 drm/vc4: Add HDMI audio support
The HDMI encoder IP embeds all needed blocks to output audio, with a
custom DAI called MAI moving audio between the two parts of the HDMI
core.  This driver now exposes a sound card to let users stream audio
to their display.

Using the hdmi-codec driver has been considered here, but MAI meant
having to significantly rework hdmi-codec, and it would have left
little shared code with the I2S mode anyway.

The encoder requires that the audio be SPDIF-formatted frames only,
which alsalib will format-convert for us.

This patch is the combined work of Eric Anholt (initial register setup
with a separate dmaengine driver and using simple-audio-card) and
Boris Brezillon (moving it all into HDMI, massive debug to get it
actually working), and which Eric has the permission to release.

v2: Drop "-audio" from sound card name, since that's already implied
    (suggestion by Boris)

Signed-off-by: Eric Anholt <eric@anholt.net>
Acked-by: Boris Brezillon <boris.brezillon@free-electrons.com>
Link: http://patchwork.freedesktop.org/patch/msgid/20170227202803.12855-2-eric@anholt.net
2017-03-16 10:33:30 -07:00
Eric Anholt
a86773d120 drm/vc4: Add support for feeding DSI encoders from the pixel valve.
We have to set a different pixel format, which tells the hardware to
use the pix_width field that's fed in sideband from the DSI encoder to
divide the "pixel" clock.

Signed-off-by: Eric Anholt <eric@anholt.net>
Acked-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Link: http://patchwork.freedesktop.org/patch/msgid/20161214194621.16499-6-eric@anholt.net
2017-02-01 12:51:22 -08:00
Eric Anholt
d17a1bb9b8 drm/vc4: Set up SCALER_DISPCTRL at boot.
We want the HVS on, obviously, and we also want DSP3 (PV1's source) to
be muxed from HVS channel 2 like we expect in vc4_crtc.c.  The
firmware wasn't setting the DSP3 mux up when both the LCD and HDMI
were disabled.

Signed-off-by: Eric Anholt <eric@anholt.net>
Acked-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Link: http://patchwork.freedesktop.org/patch/msgid/20161214194621.16499-5-eric@anholt.net
2017-02-01 12:51:22 -08:00
Boris Brezillon
ab8df60e3a drm/vc4: Fix ->clock_select setting for the VEC encoder
PV_CONTROL_CLK_SELECT_VEC is actually 2 and not 0. Fix the definition and
rework the vc4_set_crtc_possible_masks() to cover the full range of the
PV_CONTROL_CLK_SELECT field.

Signed-off-by: Boris Brezillon <boris.brezillon@free-electrons.com>
Signed-off-by: Eric Anholt <eric@anholt.net>
2016-12-09 15:26:29 -08:00
Eric Anholt
dfccd937de drm/vc4: Add support for double-clocked modes.
Now that we have infoframes to report the pixel repeat flag, we can
start using it.  Fixes locking the 720x480i and 720x576i modes on my
Dell 2408WFP.  Like the 1920x1080i case, they don't fit properly on
the screen, though.

Signed-off-by: Eric Anholt <eric@anholt.net>
2016-10-06 11:58:28 -07:00
Eric Anholt
21317b3fba drm/vc4: Set up the AVI and SPD infoframes.
Fixes a purple bar on the left side of the screen with my Dell
2408WFP.  It will also be required for supporting the double-clocked
video modes.

Signed-off-by: Eric Anholt <eric@anholt.net>
2016-10-06 11:58:28 -07:00
Eric Anholt
682e62c454 drm/vc4: Fix support for interlaced modes on HDMI.
We really do need to be using the halved V fields.  I had been
confused by the code I was using as a reference because it stored
halved vsync fields but not halved vdisplay, so it looked like I only
needed to divide vdisplay by 2.

This reverts part of Mario's timestamping fixes that prevented
CRTC_HALVE_V from applying, and instead adjusts the timestamping code
to not use the crtc field in that case.

Fixes locking of 1920x1080x60i on my Dell 2408WFP.  There are black
bars on the top and bottom, but I suspect that might be an
under/overscan flags problem as opposed to video timings.

Signed-off-by: Eric Anholt <eric@anholt.net>
2016-10-06 11:58:27 -07:00
Eric Anholt
6e1cbbad67 drm/vc4: Enable limited range RGB output on HDMI with CEA modes.
Fixes broken grayscale ramps on many HDMI monitors, where large areas
at the ends of the ramp would all appear as black or white.

Signed-off-by: Eric Anholt <eric@anholt.net>
2016-10-06 11:58:26 -07:00
Mario Kleiner
1bf59f1dcb drm/vc4: Implement precise vblank timestamping.
Precise vblank timestamping is implemented via the
usual scanout position based method. On VC4 the
pixelvalves PV do not have a scanout position
register. Only the hardware video scaler HVS has a
similar register which describes which scanline for
the output is currently composited and stored in the
HVS fifo for later consumption by the PV.

This causes a problem in that the HVS runs at a much
faster clock (system clock / audio gate) than the PV
which runs at video mode dot clock, so the unless the
fifo between HVS and PV is full, the HVS will progress
faster in its observable read line position than video
scan rate, so the HVS position reading can't be directly
translated into a scanout position for timestamp correction.

Additionally when the PV is in vblank, it doesn't consume
from the fifo, so the fifo gets full very quickly and then
the HVS stops compositing until the PV enters active scanout
and starts consuming scanlines from the fifo again, making
new space for the HVS to composite.

Therefore a simple translation of HVS read position into
elapsed time since (or to) start of active scanout does
not work, but for the most interesting cases we can still
get useful and sufficiently accurate results:

1. The PV enters active scanout of a new frame with the
   fifo of the HVS completely full, and the HVS can refill
   any fifo line which gets consumed and thereby freed up by
   the PV during active scanout very quickly. Therefore the
   PV and HVS work effectively in lock-step during active
   scanout with the fifo never having more than 1 scanline
   freed up by the PV before it gets refilled. The PV's
   real scanout position is therefore trailing the HVS
   compositing position as scanoutpos = hvspos - fifosize
   and we can get the true scanoutpos as HVS readpos minus
   fifo size, so precise timestamping works while in active
   scanout, except for the last few scanlines of the frame,
   when the HVS reaches end of frame, stops compositing and
   the PV catches up and drains the fifo. This special case
   would only introduce minor errors though.

2. If we are in vblank, then we can only guess something
   reasonable. If called from vblank irq, we assume the irq is
   usually dispatched with minimum delay, so we can take a
   timestamp taken at entry into the vblank irq handler as a
   baseline and then add a full vblank duration until the
   guessed start of active scanout. As irq dispatch is usually
   pretty low latency this works with relatively low jitter and
   good results.

   If we aren't called from vblank then we could be anywhere
   within the vblank interval, so we return a neutral result,
   simply the current system timestamp, and hope for the best.

Measurement shows the generated timestamps to be rather precise,
and at least never off more than 1 vblank duration worst-case.

Limitations: Doesn't work well yet for interlaced video modes,
             therefore disabled in interlaced mode for now.

v2: Use the DISPBASE registers to determine the FIFO size (changes
    by anholt)

Signed-off-by: Mario Kleiner <mario.kleiner.de@gmail.com>
Signed-off-by: Eric Anholt <eric@anholt.net>
Reviewed-and-tested-by: Mario Kleiner <mario.kleiner.de@gmail.com> (v2)
2016-07-11 17:17:34 -07:00
Mario Kleiner
56d1fe0979 drm/vc4: Make pageflip completion handling more robust.
Protect both the setup of the pageflip event and the
latching of the new requested displaylist head pointer
by the event lock, so we can't get into a situation
where vc4_atomic_flush latches the new display list via
HVS_WRITE, then immediately gets preempted before queueing
the pageflip event, then the page-flip completes in hw and
the vc4_crtc_handle_page_flip() runs and no-ops due to
lack of a pending pageflip event, then vc4_atomic_flush
continues and only then queues the pageflip event - after
the page flip handling already no-oped. This would cause
flip completion handling only at the next vblank - one
frame too late.

In vc4_crtc_handle_page_flip() check the actual DL head
pointer in SCALER_DISPLACTX against the requested pointer
for page flip to make sure that the flip actually really
completed in the current vblank and doesn't get deferred
to the next one because the DL head pointer was written
a bit too late into SCALER_DISPLISTX, after start of
vblank, and missed the boat. This avoids handling a
pageflip completion too early - one frame too early.

According to Eric, DL head pointer updates which were
written into the HVS DISPLISTX reg get committed to hardware
at the last pixel of active scanout. Our vblank interrupt
handler, as triggered by PV_INT_VFP_START irq, gets to run
earliest at the first pixel of HBLANK at the end of the
last scanline of active scanout, ie. vblank irq handling
runs at least 1 pixel duration after a potential pageflip
completion happened in hardware.

This ordering of events in the hardware, together with the
lock protection and SCALER_DISPLACTX sampling of this patch,
guarantees that pageflip completion handling only runs at
exactly the vblank irq of actual pageflip completion in all
cases.

Background info from Eric about the relative timing of
HVS, PV's and trigger points for interrupts, DL updates:

https://lists.freedesktop.org/archives/dri-devel/2016-May/107510.html

Tested on RPi 2B with hardware timing measurement equipment
and shown to no longer complete flips too early or too late.

Signed-off-by: Mario Kleiner <mario.kleiner.de@gmail.com>
Reviewed-by: Eric Anholt <eric@anholt.net>
2016-06-06 13:00:40 -07:00
Eric Anholt
e582b6c7e7 drm/vc4: Add support for gamma ramps.
We could possibly save a bit of power by not requesting gamma
conversion when the ramp happens to be 1:1, but at least if all the
CRTCs are off the SRAM will be disabled.

This should fix brightness sliders in a lot of fullscreen games.

Signed-off-by: Eric Anholt <eric@anholt.net>
2016-05-02 16:17:10 -07:00
Dave Airlie
67d1c0a25c Merge tag 'drm-vc4-fixes-2016-03-03' of github.com:anholt/linux into drm-next
This pull request fixes the major VC4 HDMI modesetting bugs found when
the first wave of users showed up in Raspbian.

* tag 'drm-vc4-fixes-2016-03-03' of github.com:anholt/linux:
  drm/vc4: Initialize scaler DISPBKGND on modeset.
  drm/vc4: Fix setting of vertical timings in the CRTC.
  drm/vc4: Fix the name of the VSYNCD_EVEN register.
  drm/vc4: Add another reg to HDMI debug dumping.
  drm/vc4: Bring HDMI up from power off if necessary.
  drm/vc4: Fix a framebuffer reference leak on async flip interrupt.
2016-03-14 09:48:04 +10:00
Eric Anholt
6a60920986 drm/vc4: Initialize scaler DISPBKGND on modeset.
We weren't updating the interlaced bit, so we'd scan out incorrectly
if the firmware had brought up the TV encoder and we were switching to
HDMI.

Signed-off-by: Eric Anholt <eric@anholt.net>
2016-02-26 17:42:48 -08:00
Eric Anholt
c31806fbdd drm/vc4: Fix the name of the VSYNCD_EVEN register.
It's used for delaying vsync in interlaced mode.

Signed-off-by: Eric Anholt <eric@anholt.net>
2016-02-26 15:51:29 -08:00
Eric Anholt
851479ad59 drm/vc4: Bring HDMI up from power off if necessary.
If the firmware hadn't brought up HDMI for us, we need to do its
power-on reset sequence (reset HD and and clear its STANDBY bits,
reset HDMI, and leave the PHY disabled).

Signed-off-by: Eric Anholt <eric@anholt.net>
2016-02-26 15:51:10 -08:00
Eric Anholt
fc04023faf drm/vc4: Add support for YUV planes.
This supports 420 and 422 subsampling with 2 or 3 planes, tested with
modetest.  It doesn't set up chroma subsampling position (which it
appears KMS doesn't deal with yet).

The LBM memory is overallocated in many cases, but apparently the docs
aren't quite correct and I'll probably need to look at the hardware
source to really figure it out.

Signed-off-by: Eric Anholt <eric@anholt.net>
2016-02-16 11:24:08 -08:00
Eric Anholt
21af94cf1a drm/vc4: Add support for scaling of display planes.
This implements a simple policy for choosing scaling modes
(trapezoidal for decimation, PPF for magnification), and a single PPF
filter (Mitchell/Netravali's recommendation).

Signed-off-by: Eric Anholt <eric@anholt.net>
2016-02-16 11:24:08 -08:00
Eric Anholt
1fa81589bb drm/vc4: Fix a typo in a V3D debug register.
Signed-off-by: Eric Anholt <eric@anholt.net>
2015-12-07 20:05:09 -08:00
Eric Anholt
c8b75bca92 drm/vc4: Add KMS support for Raspberry Pi.
This is enough for fbcon and bringing up X using
xf86-video-modesetting.  It doesn't support the 3D accelerator or
power management yet.

v2: Drop FB_HELPER select thanks to Archit's patches.  Do manual init
    ordering instead of using the .load hook.  Structure registration
    more like tegra's, but still using the typical "component" code.
    Drop no-op hooks for atomic_begin and mode_fixup() now that
    they're optional.  Drop sentinel in Makefile.  Fix minor style
    nits I noticed on another reread.

v3: Use the new bcm2835 clk driver to manage pixel/HSM clocks instead
    of having a fixed video mode.  Use exynos-style component driver
    matching instead of devicetree nodes to list the component driver
    instances.  Rename compatibility strings to say bcm2835, and
    distinguish pv0/1/2.  Clean up some h/vsync code, and add in
    interlaced mode setup.  Fix up probe/bind error paths.  Use
    bitops.h macros for vc4_regs.h

v4: Include i2c.h, allow building under COMPILE_TEST, drop msleep now
    that other bugs have been fixed, add timeouts to cpu_relax()
    loops, rename hpd-gpio to hpd-gpios.

Signed-off-by: Eric Anholt <eric@anholt.net>
Acked-by: Daniel Vetter <daniel.vetter@ffwll.ch>
2015-10-21 10:33:12 +01:00