linux_dsm_epyc7002

mirror of https://github.com/AuxXxilium/linux_dsm_epyc7002.git synced 2024-12-21 07:09:15 +07:00

Author	SHA1	Message	Date
Boris Brezillon	65101d8c91	drm/vc4: Expose performance counters to userspace The V3D engine has various hardware counters which might be interesting to userspace performance analysis tools. Expose new ioctls to create/destroy a performance monitor object and query the counter values of this performance monitor. Note that a performance monitor is given an ID that is only valid on the file descriptor it has been allocated from. A performance monitor can be attached to a CL submission and the driver will enable HW counters for this request and update the performance monitor values at the end of the job. Signed-off-by: Boris Brezillon <boris.brezillon@free-electrons.com> Reviewed-by: Eric Anholt <eric@anholt.net> Signed-off-by: Eric Anholt <eric@anholt.net> Link: https://patchwork.freedesktop.org/patch/msgid/20180112090926.12538-1-boris.brezillon@free-electrons.com	2018-02-10 22:23:26 +00:00
Stefan Schake	ce9caf2f79	drm/vc4: Move IRQ enable to PM path We were calling enable_irq on bind, where it was already enabled previously by the IRQ helper. Additionally, dev->irq is not set correctly until after postinstall and so was always zero here, triggering a warning in 4.15. Fix both by moving the enable to the power management resume path, where we know there was a previous disable invocation during suspend. Fixes: `253696ccd6` ("drm/vc4: Account for interrupts in flight") Signed-off-by: Stefan Schake <stschake@gmail.com> Signed-off-by: Eric Anholt <eric@anholt.net> Link: https://patchwork.freedesktop.org/patch/msgid/1514563543-32511-1-git-send-email-stschake@gmail.com Tested-by: Stefan Wahren <stefan.wahren@i2se.com> Reviewed-by: Eric Anholt <eric@anholt.net>	2018-01-03 15:56:03 -08:00
Stefan Schake	babc811005	drm/vc4: Release fence after signalling We were never releasing the initial fence reference that is obtained through dma_fence_init. Link: https://github.com/anholt/linux/issues/122 Fixes: `cdec4d3613` ("drm/vc4: Expose dma-buf fences for V3D rendering.") Signed-off-by: Stefan Schake <stschake@gmail.com> Signed-off-by: Eric Anholt <eric@anholt.net> Reviewed-by: Eric Anholt <eric@anholt.net> Link: https://patchwork.freedesktop.org/patch/msgid/1512236444-301-1-git-send-email-stschake@gmail.com	2017-12-08 13:02:22 -08:00
Stefan Schake	253696ccd6	drm/vc4: Account for interrupts in flight Synchronously disable the IRQ to make the following cancel_work_sync invocation effective. An interrupt in flight could enqueue further overflow mem work. As we free the binner BO immediately following vc4_irq_uninstall this caused a NULL pointer dereference in the work callback vc4_overflow_mem_work. Link: https://github.com/anholt/linux/issues/114 Signed-off-by: Stefan Schake <stschake@gmail.com> Fixes: `d5b1a78a77` ("drm/vc4: Add support for drawing 3D frames.") Signed-off-by: Eric Anholt <eric@anholt.net> Reviewed-by: Eric Anholt <eric@anholt.net> Link: https://patchwork.freedesktop.org/patch/msgid/1510275907-993-2-git-send-email-stschake@gmail.com	2017-11-13 16:40:15 -08:00
Eric Anholt	553c942f8b	drm/vc4: Allow using more than 256MB of CMA memory. Until now, we've had to limit Raspberry Pi to 256MB of CMA memory to keep from triggering the hardware addressing bug between the tile binner and the tile alloc memory (where the top 4 bits come from the tile state data array's address). To work around that and allow more memory to be reserved for graphics, allocate a single BO to store tile state data arrays and tile alloc/overflow memory while the GPU is active, and make sure that that one BO doesn't happen to cross a 256MB boundary. With that in place, we can allocate textures and shaders anywhere in system memory (still contiguous, of course). Signed-off-by: Eric Anholt <eric@anholt.net> Link: http://patchwork.freedesktop.org/patch/msgid/20170327231025.19391-1-eric@anholt.net Reviewed-by: Boris Brezillon <boris.brezillon@free-electrons.com>	2017-04-18 14:32:20 -07:00
Eric Anholt	cdec4d3613	drm/vc4: Expose dma-buf fences for V3D rendering. This is needed for proper synchronization with display on another DRM device (pl111 or tinydrm) with buffers produced by vc4 V3D. Fixes the new igt vc4_dmabuf_poll testcase, and rendering of one of the glmark2 desktop tests on pl111+vc4. This doesn't yet introduce waits on another device's fences before vc4's rendering/display, because I don't have testcases for them. v2: Reuse dma_fence_free(), retitle commit message to clarify that it's not a full dma-buf fencing implementation yet. Signed-off-by: Eric Anholt <eric@anholt.net> Link: http://patchwork.freedesktop.org/patch/msgid/20170412191202.22740-6-eric@anholt.net Acked-by: Daniel Vetter <daniel.vetter@ffwll.ch>	2017-04-13 11:00:28 -07:00
Eric Anholt	72f793f14a	drm/vc4: Convert existing documentation to actual kerneldoc. I'm going to hook vc4 up to the sphinx build, so clean up its comments to not generate warnings when we do. Signed-off-by: Eric Anholt <eric@anholt.net> Acked-by: Daniel Vetter <daniel.vetter@ffwll.ch> Link: http://patchwork.freedesktop.org/patch/msgid/20170227201144.10970-2-eric@anholt.net	2017-02-28 12:51:48 -08:00
Eric Anholt	9326e6f255	drm/vc4: Fix overflow mem unreferencing when the binner runs dry. Overflow memory handling is tricky: While it's still referenced by the BPO registers, we want to keep it from being freed. When we are putting a new set of overflow memory in the registers, we need to assign the old one to the last rendering job using it. We were looking at "what's currently running in the binner", but since the bin/render submission split, we may end up with the binner completing and having no new job while the renderer is still processing. So, if we don't find a bin job at all, look at the highest-seqno (last) render job to attach our overflow to. Signed-off-by: Eric Anholt <eric@anholt.net> Fixes: `ca26d28bba` ("drm/vc4: improve throughput by pipelining binning and rendering jobs") Cc: stable@vger.kernel.org	2016-08-19 19:17:34 -07:00
Varad Gautam	ca26d28bba	drm/vc4: improve throughput by pipelining binning and rendering jobs The hardware provides us with separate threads for binning and rendering, and the existing model waits for them both to complete before submitting the next job. Splitting the binning and rendering submissions reduces idle time and gives us approx 20-30% speedup with some x11perf tests such as -line10 and -tilerect1. Improves openarena performance by 1.01897% +/- 0.247857% (n=16). Thanks to anholt for suggesting this. v2: Rebase on the spurious resets fix (change by anholt). Signed-off-by: Varad Gautam <varadgautam@gmail.com> Reviewed-by: Eric Anholt <eric@anholt.net> Signed-off-by: Eric Anholt <eric@anholt.net>	2016-03-13 17:05:05 -07:00
Eric Anholt	2c68f1fcfb	drm/vc4: Return an ERR_PTR from BO creation instead of NULL. Fixes igt vc4_create_bo/create-bo-0 by returning -EINVAL from the ioctl instead of -ENOMEM. Signed-off-by: Eric Anholt <eric@anholt.net>	2016-02-16 12:20:58 -08:00
Eric Anholt	d5b1a78a77	drm/vc4: Add support for drawing 3D frames. The user submission is basically a pointer to a command list and a pointer to uniforms. We copy those in to the kernel, validate and relocate them, and store the result in a GPU BO which we queue for execution. v2: Drop support for NV shader recs (not necessary for GL), simplify vc4_use_bo(), improve bin flush/semaphore checks, use __u32 style types. Signed-off-by: Eric Anholt <eric@anholt.net>	2015-12-07 20:05:10 -08:00

11 Commits
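
Commit 553c942f8b describes keeping one BO from crossing a 256MB boundary, because the top 4 address bits are shared between the tile binner and the tile alloc memory. A minimal sketch of that boundary condition follows. The helper name `fits_in_one_256mb_window` and its 32-bit address type are assumptions made for illustration; this is not the driver's own code.

```c
#include <stdbool.h>
#include <stdint.h>

/* A 256MB window is addressed by the low 28 bits; the top 4 bits select it. */
#define WINDOW_SHIFT 28

/*
 * Hypothetical helper (not taken from the vc4 driver): returns true when
 * the buffer [paddr, paddr + size) lies entirely inside one 256MB-aligned
 * window, so every byte in it shares the same top 4 address bits.
 * Zero-sized buffers and buffers that run past 4GB are rejected.
 */
static bool fits_in_one_256mb_window(uint32_t paddr, uint32_t size)
{
    if (size == 0)
        return false;

    /* Use 64-bit math so paddr + size cannot wrap around. */
    uint64_t last = (uint64_t)paddr + size - 1;

    if (last > UINT32_MAX)
        return false;

    /* Compare the window index of the first and last byte. */
    return (paddr >> WINDOW_SHIFT) == (uint32_t)(last >> WINDOW_SHIFT);
}
```

An allocator built on this check could retry the allocation whenever the helper returns false, holding on to the rejected buffers until a suitable one is found. That retry strategy is likewise an assumption, not something the commit message states.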
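
Commit 2c68f1fcfb changes BO creation to return an ERR_PTR instead of NULL, so the ioctl can report -EINVAL rather than always -ENOMEM. The userspace sketch below imitates that pattern. Every `sketch_`-prefixed name is hypothetical, the macros only mimic the kernel's `ERR_PTR`/`IS_ERR`/`PTR_ERR` helpers, and treating a zero size as the invalid case is an assumption based on the test name `create-bo-0`.

```c
#include <errno.h>
#include <stdint.h>
#include <stdlib.h>

/* Error codes are encoded in the top 4095 values of the pointer range. */
#define SKETCH_MAX_ERRNO 4095

/* Userspace imitations of the kernel's ERR_PTR, PTR_ERR and IS_ERR. */
static inline void *sketch_err_ptr(long err)
{
    return (void *)(intptr_t)err;
}

static inline long sketch_ptr_err(const void *p)
{
    return (long)(intptr_t)p;
}

static inline int sketch_is_err(const void *p)
{
    return (uintptr_t)p >= (uintptr_t)-SKETCH_MAX_ERRNO;
}

/* Hypothetical buffer object; the real driver's struct is far larger. */
struct sketch_bo {
    size_t size;
};

/* Returns a valid pointer on success or an encoded errno on failure. */
static struct sketch_bo *sketch_bo_create(size_t size)
{
    struct sketch_bo *bo;

    if (size == 0)
        return sketch_err_ptr(-EINVAL); /* invalid request, not out of memory */

    bo = malloc(sizeof(*bo));
    if (!bo)
        return sketch_err_ptr(-ENOMEM);

    bo->size = size;
    return bo;
}

/* Caller forwards the specific error instead of guessing -ENOMEM on NULL. */
static int sketch_create_ioctl(size_t size)
{
    struct sketch_bo *bo = sketch_bo_create(size);

    if (sketch_is_err(bo))
        return (int)sketch_ptr_err(bo);

    /* A real ioctl would hand back a handle; the sketch just frees the BO. */
    free(bo);
    return 0;
}
```

With a NULL return the caller cannot tell an invalid argument from an allocation failure. Encoding the errno in the pointer keeps the distinction without adding an output parameter.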