Most, but not all, paths where calling the with struct_mutex held. The
fast-path in msm_gem_get_iova() (plus some sub-code-paths that only run
the first time) was masking this issue.
So lets just always hold struct_mutex for hw_init(). And sprinkle some
WARN_ON()'s and might_lock() to avoid this sort of problem in the
future.
Signed-off-by: Rob Clark <robdclark@gmail.com>
memptrs->wptr seems to be unused. Remove it to avoid
confusing the upcoming preemption code.
Signed-off-by: Jordan Crouse <jcrouse@codeaurora.org>
Signed-off-by: Rob Clark <robdclark@gmail.com>
The amount of information that we need to pass into msm_gpu_init()
is steadily increasing, so add a new struct to stabilize the function
call and make it easier to add new configuration down the line.
Signed-off-by: Jordan Crouse <jcrouse@codeaurora.org>
Signed-off-by: Rob Clark <robdclark@gmail.com>
There isn't any generic code that uses ->idle so remove it.
Signed-off-by: Jordan Crouse <jcrouse@codeaurora.org>
Signed-off-by: Rob Clark <robdclark@gmail.com>
The A5XX GPU powers on in "secure" mode. In secure mode the GPU can
only render to buffers that are marked as secure and inaccessible
to the kernel and user through a series of hardware protections. In
practice secure mode is used to draw things like a UI on a secure
video frame.
In order to switch out of secure mode the GPU executes a special
shader that clears out the GMEM and other sensitve registers and
then writes a register. Because the kernel can't be trusted the
shader binary is signed and verified and programmed by the
secure world. To do this we need to read the MDT header and the
segments from the firmware location and put them in memory and
present them for approval.
For targets without secure support there is an out: if the
secure world doesn't support secure then there are no hardware
protections and we can freely write the SECVID_TRUST register from
the CPU. We don't have 100% confidence that we can query the
secure capabilities at run time but we have enough calls that
need to go right to give us some confidence that we're at least doing
something useful.
Of course if we guess wrong you trigger a permissions violation
which usually ends up in a system crash but thats a problem
that shows up immediately.
[v2: use child device per Bjorn]
[v3: use generic MDT loader per Bjorn]
[v4: use managed dma functions and ifdefs for the MDT loader]
[v5: Add depends for QCOM_MDT_LOADER]
Signed-off-by: Jordan Crouse <jcrouse@codeaurora.org>
Acked-by: Bjorn Andersson <bjorn.andersson@linaro.org>
[robclark: fix Kconfig to use select instead of depends + #if IS_ENABLED()]
Signed-off-by: Rob Clark <robdclark@gmail.com>
If a OPP table is defined for the GPU device in the device tree use
that in lieu of the downstream style GPU frequency table. If we do
use the downstream table convert it to a OPP table so that we can
take advantage of the OPP lookup facilities later.
Signed-off-by: Jordan Crouse <jcrouse@codeaurora.org>
Signed-off-by: Rob Clark <robdclark@gmail.com>
Some A3XX and A4XX GPU targets required that the GPU clock be
programmed to a non zero value when it was disabled so
27Mhz was chosen as the "invalid" frequency.
Even though newer targets do not have the same clock restrictions
we still write 27Mhz on clock disable and expect the clock subsystem
to round down to zero.
For unknown reasons even though the slow clock speed is always
27Mhz and it isn't actually a functional level the legacy device tree
frequency tables always defined it and then did gymnastics to work
around it.
Instead of playing the same silly games just hard code the "slow" clock
speed in the code as 27MHz and save ourselves a bit of infrastructure.
Signed-off-by: Jordan Crouse <jcrouse@codeaurora.org>
Signed-off-by: Rob Clark <robdclark@gmail.com>
User space needs to know where the GMEM whole starts so that they
can set up the addressing correctly.
Signed-off-by: Jordan Crouse <jcrouse@codeaurora.org>
Signed-off-by: Rob Clark <robdclark@gmail.com>
There are reasons for a memory object to outlive the file descriptor
that created it and so the address space that a buffer object is
attached to must also outlive the file descriptor. Reference count
the address space so that it can remain viable until all the objects
have released their addresses.
Signed-off-by: Jordan Crouse <jcrouse@codeaurora.org>
Signed-off-by: Rob Clark <robdclark@gmail.com>
We should be detaching the MMU before destroying the address
space. To do this cleanly, the detach has to happen in
adreno_gpu_cleanup() because it needs access to structs
in adreno_gpu.c. Plus it is better symmetry to have
the attach and detach at the same code level.
Signed-off-by: Jordan Crouse <jcrouse@codeaurora.org>
Signed-off-by: Rob Clark <robdclark@gmail.com>
The interrupt status was being cleared before processing the handlers.
a5xx_rbbm_err_irq() was checking the interrupt status again, which would
likely turn out bad because the interrupt status would be 0 (or at least
different). Pass the original status to the function instead.
Also, skip clearing RBBM_AHB_ERROR from the interrupt status. The interrupt
will keep firing until the error source is cleared. Skip the clear to
avoid a storm until the error is cleared in a5xx_rbbm_err_irq().
Signed-off-by: Jordan Crouse <jcrouse@codeaurora.org>
Signed-off-by: Rob Clark <robdclark@gmail.com>
Instead of checking for a5xx_gpu->gpmu_iova during destroy we
accidently check a5xx_gpu->gpmu_bo.
Signed-off-by: Jordan Crouse <jcrouse@codeaurora.org>
Signed-off-by: Rob Clark <robdclark@gmail.com>
The newly added a5xx support fails to build when debugfs is diabled:
drivers/gpu/drm/msm/adreno/a5xx_gpu.c:849:4: error: 'struct msm_gpu_funcs' has no member named 'show'
drivers/gpu/drm/msm/adreno/a5xx_gpu.c:849:11: error: 'a5xx_show' undeclared here (not in a function); did you mean 'a5xx_irq'?
This adds a missing #ifdef.
Fixes: b5f103ab98 ("drm/msm: gpu: Add A5XX target support")
Cc: stable@vger.kernel.org
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Signed-off-by: Rob Clark <robdclark@gmail.com>
Each of the per-generation callbacks was doing this. Lets just simplify
and move it into toplevel show() fxn.
Signed-off-by: Rob Clark <robdclark@gmail.com>
This was never documented or used in upstream dtb. It is used by
downstream bindings from android device kernels. But the quirks are
a property of the gpu revision, and as such are redundant to be listed
separately in dt. Instead, move the quirks to the device table.
Signed-off-by: Rob Clark <robdclark@gmail.com>
Reviewed-by: Eric Anholt <eric@anholt.net>
The original way we determined the gpu version was based on downstream
bindings from android kernel. A cleaner way is to get the version from
the compatible string.
Note that no upstream dtb uses these bindings. But the code still
supports falling back to the legacy bindings (with a warning), so that
we are still compatible with the gpu dt node from android device
kernels.
Signed-off-by: Rob Clark <robdclark@gmail.com>
Reviewed-by: Eric Anholt <eric@anholt.net>
Acked-by: Rob Herring <robh@kernel.org>
The plan is to use the OPP bindings. For now, remove the documentation
for qcom,gpu-pwrlevels, and make the driver fall back to a safe low
clock if the node is not present.
Note that no upstream dtb use this node. For now we keep compatibility
with this node to avoid breaking compatibility with downstream android
dt files.
Signed-off-by: Rob Clark <robdclark@gmail.com>
Reviewed-by: Eric Anholt <eric@anholt.net>
Acked-by: Rob Herring <robh@kernel.org>
Fixes: 9cb07b099fb ("drm/msm: support multiple address spaces")
Reported-by: Riku Voipio <riku.voipio@linaro.org>
Signed-off-by: Rob Clark <robdclark@gmail.com>
Currently the value written to CP_RB_WPTR is calculated on the fly as
(rb->next - rb->start). But as the code is designed rb->next is wrapped
before writing the commands so if a series of commands happened to
fit perfectly in the ringbuffer, rb->next would end up being equal to
rb->size / 4 and thus result in an out of bounds address to CP_RB_WPTR.
The easiest way to fix this is to mask WPTR when writing it to the
hardware; it makes the hardware happy and the rest of the ringbuffer
math appears to work and there isn't any point in upsetting anything.
Signed-off-by: Jordan Crouse <jcrouse@codeaurora.org>
[squash in is_power_of_2() check]
Signed-off-by: Rob Clark <robdclark@gmail.com>
Most 5XX targets have GPMU (Graphics Power Management Unit) that
handles a lot of the heavy lifting for power management including
thermal and limits management and dynamic power collapse. While
the GPMU itself is optional, it is usually nessesary to hit
aggressive power targets.
The GPMU firmware needs to be loaded into the GPMU at init time via a
shared hardware block of registers. Using the GPU to write the microcode
is more efficient than using the CPU so at first load create an indirect
buffer that can be executed during subsequent initalization sequences.
After loading the GPMU gets initalized through a shared register
interface and then we mostly get out of its way and let it do
its thing.
Signed-off-by: Jordan Crouse <jcrouse@codeaurora.org>
Signed-off-by: Rob Clark <robdclark@gmail.com>
Disable the interrupt during the init sequence to avoid having
interrupts fired for errors and other things that we are not
ready to handle while initializing.
Signed-off-by: Jordan Crouse <jcrouse@codeaurora.org>
Signed-off-by: Rob Clark <robdclark@gmail.com>
Add helper functions for TYPE4 and TYPE7 ME opcodes that replace
TYPE0 and TYPE3 starting with the A5XX targets.
Signed-off-by: Jordan Crouse <jcrouse@codeaurora.org>
Signed-off-by: Rob Clark <robdclark@gmail.com>
Add a new generic function to write a "64" bit value. This isn't
actually a 64 bit operation, it just writes the upper and lower
32 bit of a 64 bit value to a specified LO and HI register. If
a particular target doesn't support one of the registers it can
mark that register as SKIP and writes/reads from that register
will be quietly dropped.
This can be immediately put in place for the ringbuffer base and
the RPTR address. Both writes are converted to use
adreno_gpu_write64() with their respective high and low registers
and the high register appropriately marked as SKIP for both 32 bit
targets (a3xx and a4xx). When a5xx comes it will define valid target
registers for the 'hi' option and everything else will just work.
Signed-off-by: Jordan Crouse <jcrouse@codeaurora.org>
Signed-off-by: Rob Clark <robdclark@gmail.com>
Add some new functions to manipulate GPU registers. gpu_read64 and
gpu_write64 can read/write a 64 bit value to two 32 bit registers.
For 4XX and older these are normally perfcounter registers, but
future targets will use 64 bit addressing so there will be many
more spots where a 64 bit read and write are needed.
gpu_rmw() does a read/modify/write on a 32 bit register given a mask
and bits to OR in.
Signed-off-by: Jordan Crouse <jcrouse@codeaurora.org>
Signed-off-by: Rob Clark <robdclark@gmail.com>
When the GPU hardware init function fails (like say, ME_INIT timed
out) return error instead of blindly continuing on. This gives us
a small chance of saving the system before it goes boom.
Signed-off-by: Jordan Crouse <jcrouse@codeaurora.org>
Signed-off-by: Rob Clark <robdclark@gmail.com>
There are very few register accesses in the common code. Cut down
the list of common registers to just those that are used. This
saves const space and saves us the effort of maintaining registers
for A3XX and A4XX that don't exist or are unused.
Signed-off-by: Jordan Crouse <jcrouse@codeaurora.org>
Signed-off-by: Rob Clark <robdclark@gmail.com>
For a5xx the gpu is 64b so we need to change iova to 64b everywhere. On
the display side, iova is still 32b so it can ignore the upper bits.
(Although all the armv8 devices have an iommu that can map 64b pa to 32b
iova.)
Signed-off-by: Rob Clark <robdclark@gmail.com>
We can have various combinations of 64b and 32b address space, ie. 64b
CPU but 32b display and gpu, or 64b CPU and GPU but 32b display. So
best to decouple the device iova's from mmap offset.
Signed-off-by: Rob Clark <robdclark@gmail.com>
We get 2 warnings when building kernel with W=1:
drivers/gpu/drm/msm/adreno/a3xx_gpu.c:535:17: warning: no previous prototype for 'a3xx_gpu_init' [-Wmissing-prototypes]
drivers/gpu/drm/msm/adreno/a4xx_gpu.c:624:17: warning: no previous prototype for 'a4xx_gpu_init' [-Wmissing-prototypes]
In fact, both functions are declared in
drivers/gpu/drm/msm/adreno/adreno_device.c, but should be declared
in a header file. So this patch moves both function declarations to
drivers/gpu/drm/msm/adreno/adreno_gpu.h.
Signed-off-by: Baoyou Xie <baoyou.xie@linaro.org>
Reviewed-by: Arnd Bergmann <arnd@arndb.de>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Link: http://patchwork.freedesktop.org/patch/msgid/1477127865-9381-1-git-send-email-baoyou.xie@linaro.org
For some optimizations coming on the userspace side, splitting larger
draw or gmem cmds into multiple cmdstream buffers, we need to support
much more than the previous small/arbitrary limit.
Signed-off-by: Rob Clark <robdclark@gmail.com>
Before we can add vmap shrinking, we really need to know which vmap'ings
are currently being used. So switch to get/put interface. Stubbed put
fxns for now.
Signed-off-by: Rob Clark <robdclark@gmail.com>
Some, but not all, callers of obj->vmap() would check if return
IS_ERR(). So let's actually return an error if vmap() fails. And fixup
the call-sites that were not handling this properly.
Signed-off-by: Rob Clark <robdclark@gmail.com>
At this point, there is nothing left to fail. And submit already has a
fence assigned and is added to the submit_list. Any problems from here
on out are asynchronous (ie. hangcheck/recovery).
Signed-off-by: Rob Clark <robdclark@gmail.com>
It is no longer true that we discard all in-flight submits on recover
(these days we only discard the first one that hung). After the first
re-submitted batch completes it would overwrite the fence with a correct
value, but there would be a window of time which showed all re-submitted
batches as already complete.
Signed-off-by: Rob Clark <robdclark@gmail.com>
Better encapsulate the per-timeline stuff into fence-context. For now
there is just a single fence-context, but eventually we'll also have one
per-CRTC to enable fully explicit fencing.
Signed-off-by: Rob Clark <robdclark@gmail.com>
Not sure where it came from, but seem unintentional. And also not
needed on a420, so let's just drop it.
Signed-off-by: Rob Clark <robdclark@gmail.com>
We need this for GL_TIMESTAMP queries.
Note: currently only supported on a4xx.. a3xx doesn't have this
always-on counter. I think we could emulate it with the one CP
counter that is available, but for now it is of limited usefulness
on a3xx (since we can't seem to do time-elapsed queries in any sane
way with the existing firmware on a3xx, and if you are trying to do
profiling on a tiler you want time-elapsed). We can add that later
if it becomes useful.
Signed-off-by: Rob Clark <robdclark@gmail.com>
As described in the downstream/kgsl driver:
Sometimes the RPTR shadow memory is unreliable causing timeouts
in adreno_idle(). Read it directly from the register instead.
Signed-off-by: Craig Stout <cstout@chromium.org>
Signed-off-by: Rob Clark <robdclark@gmail.com>
We need this in userspace for interpreting some of the perf ctrs.
Note possibly not quite sufficient if we had some frequency mgmt
approach other than race-to-idle. Not really sure what the best
thing to do if we did. Although displaying results as a percentage
of max frequence seems sensible(ish) if we did.
Signed-off-by: Rob Clark <robdclark@gmail.com>
Signed-off-by: Dave Airlie <airlied@redhat.com>
Remove CONFIG_OF checks in adreno_device.c. The downstream bus scaling
stuff is included only when CONFIG_OF is not set. So, remove that too.
Signed-off-by: Archit Taneja <architt@codeaurora.org>
Signed-off-by: Rob Clark <robdclark@gmail.com>
As found in apq8016 (used in DragonBoard 410c) and msm8916.
Note that numerically a306 is actually 307 (since a305c already claimed
306). Nice and confusing.
Signed-off-by: Rob Clark <robdclark@gmail.com>
A few spots in the driver have support for downstream android
CONFIG_MSM_BUS_SCALING. This is mainly to simplify backporting the
driver for various devices which do not have sufficient upstream
kernel support. But the intentionally dead code seems to cause
some confusion. Rename the #define to make this more clear.
Signed-off-by: Rob Clark <robdclark@gmail.com>
Dump a bit more info when the GPU hangs, without having hang_debug
enabled (which dumps a *lot* of registers). Also dump the scratch
registers, as they are useful for determining where in the cmdstream
the GPU hung (and they seem always safe to read when GPU has hung).
Note that the freedreno gallium driver emits increasing counter values
to SCRATCH6 (to identify tile #) and SCRATCH7 (to identify draw #), so
these two in particular can be used to "triangulate" where in the
cmdstream the GPU hung.
Signed-off-by: Rob Clark <robdclark@gmail.com>
In error paths, this was being called without struct_mutex held.
Leading to panics like:
msm 1a00000.qcom,mdss_mdp: No memory protection without IOMMU
Kernel panic - not syncing: BUG!
CPU: 0 PID: 1409 Comm: cat Not tainted 4.0.0-dirty #4
Hardware name: Qualcomm Technologies, Inc. APQ 8016 SBC (DT)
Call trace:
[<ffffffc000089c78>] dump_backtrace+0x0/0x118
[<ffffffc000089da0>] show_stack+0x10/0x20
[<ffffffc0006686d4>] dump_stack+0x84/0xc4
[<ffffffc0006678b4>] panic+0xd0/0x210
[<ffffffc0003e1ce4>] drm_gem_object_free+0x5c/0x60
[<ffffffc000402870>] adreno_gpu_cleanup+0x60/0x80
[<ffffffc0004035a0>] a3xx_destroy+0x20/0x70
[<ffffffc0004036f4>] a3xx_gpu_init+0x84/0x108
[<ffffffc0004018b8>] adreno_load_gpu+0x58/0x190
[<ffffffc000419dac>] msm_open+0x74/0x88
[<ffffffc0003e0a48>] drm_open+0x168/0x400
[<ffffffc0003e7210>] drm_stub_open+0xa8/0x118
[<ffffffc0001a0e84>] chrdev_open+0x94/0x198
[<ffffffc000199f88>] do_dentry_open+0x208/0x310
[<ffffffc00019a4c4>] vfs_open+0x44/0x50
[<ffffffc0001aa26c>] do_last.isra.14+0x2c4/0xc10
[<ffffffc0001aac38>] path_openat+0x80/0x5e8
[<ffffffc0001ac354>] do_filp_open+0x2c/0x98
[<ffffffc00019b60c>] do_sys_open+0x13c/0x228
[<ffffffc00019b72c>] SyS_openat+0xc/0x18
CPU1: stopping
But there isn't any particularly good reason to hold struct_mutex for
teardown, so just standardize on calling it without the mutex held and
use the _unlocked() versions for GEM obj unref'ing
Signed-off-by: Rob Clark <robdclark@gmail.com>
Resync from rnndb database, to pull in register defines for:
* eDP
* HDMI/HDCP
* mdp4/mdp5 YUV support
* mdp5 hw cursor support
Signed-off-by: Rob Clark <robdclark@gmail.com>
The release_firmware() function tests whether its argument is NULL
and then returns immediately. Thus the test around the call is not needed.
This issue was detected by using the Coccinelle software.
Signed-off-by: Markus Elfring <elfring@users.sourceforge.net>
Reviewed-by: Thierry Reding <treding@nvidia.com>
Signed-off-by: Rob Clark <robdclark@gmail.com>
Register offsets have changed between a3xx and a4xx GPUs.
To be able access these registers in common code, we create
a lookup table, and set of read-write APIs to access the
register through the lookup table.
Signed-off-by: Aravind Ganesan <aravindg@codeaurora.org>
[robclark: remove REG_ADRENO_UNDEFINED, just use zero, and minor
tweaks for latest generated headers]
Signed-off-by: Rob Clark <robdclark@gmail.com>
Move anything that can fail after call to base class msm_gpu_init().
This way, if we fail, active_list has already been initialized so we
don't trip 'WARN_ON(!list_empty(&gpu->active_list))' in
msm_gpu_cleanup().
Signed-off-by: Rob Clark <robdclark@gmail.com>
Add ptr to list of interesting registers to 'struct adreno_gpu' and use
that to move most of the debugfs show and register dump bits down into
adreno_gpu. This will avoid duplication as support for additional
adreno generations is added.
Signed-off-by: Rob Clark <robdclark@gmail.com>
Push a few bits down into adreno_gpu so they won't have to be duplicated
as support for additional adreno generations is added.
Signed-off-by: Rob Clark <robdclark@gmail.com>
Somewhere along the way, the firmware loader sprouted another lock
dependency, resulting in possible deadlock scenario:
&dev->struct_mutex --> &sb->s_type->i_mutex_key#2 --> &mm->mmap_sem
which is problematic vs things like gem mmap.
So introduce a separate mutex to synchronize gpu init.
Signed-off-by: Rob Clark <robdclark@gmail.com>
Downstream kernel IOMMU had a non-standard way of dealing with multiple
devices and multiple ports/contexts. We don't need that on upstream
kernel, so rip out the crazy.
Note that we have to move the pinning of the ringbuffer to after the
IOMMU is attached. No idea how that managed to work properly on the
downstream kernel.
For now, I am leaving the IOMMU port name stuff in place, to simplify
things for folks trying to backport latest drm/msm to device kernels.
Once we no longer have to care about pre-DT kernels, we can drop this
and instead backport upstream IOMMU driver.
Signed-off-by: Rob Clark <robdclark@gmail.com>
Now that we (almost) have enough dependencies in place (MMCC, RPM, etc),
add necessary DT support so that we can use drm/msm on upstream kernel.
v2: update for review comments
v3: rebase on component helper changes
Signed-off-by: Rob Clark <robdclark@gmail.com>
Some of the w/a or different behavior of userspace blob driver seem to
be keyed to gpu patch revision, rather than gpu-id. So expose the full
chip-id to userspace so it can DTRT.
Signed-off-by: Rob Clark <robdclark@gmail.com>
Shut down the clks when the gpu has nothing to do. A short inactivity
timer is used to provide a low pass filter for power transitions.
Signed-off-by: Rob Clark <robdclark@gmail.com>
Add support for adreno 330. Not too much different, just a few
differences in initial configuration plus setting OCMEM base.
Userspace support is already in upstream mesa.
Note that the existing DT code is simply using the bindings from
downstream android kernel, to simplify porting of this driver to
existing devices. These do not constitute any committed/stable
DT ABI. The addition of proper DT bindings will be a subsequent
patch, at which point (as best as possible) I will try to support
either upstream bindings or what is found in downstream android
kernel, so that existing device DT files can be used.
Signed-off-by: Rob Clark <robdclark@gmail.com>
This adds the necessary configuration for the APQ8060A SoC (dual-core
krait + a320 gpu) as found on the bstem board.
Signed-off-by: Rob Clark <robdclark@gmail.com>
Add a VRAM carveout that is used for systems which do not have an IOMMU.
The VRAM carveout uses CMA. The arch code must setup a CMA pool for the
device (preferrably in highmem.. a 256m-512m VRAM pool in lowmem is not
cool). The user can configure the VRAM pool size using msm.vram module
param.
Technically, the abstraction of IOMMU behind msm_mmu is not strictly
needed, but it simplifies the GEM code a bit, and will be useful later
when I add support for a2xx devices with GPUMMU, so I decided to keep
this part.
It appears to be possible to configure the GPU to restrict access to
addresses within the VRAM pool, but this is not done yet. So for now
the GPU will refuse to load if there is no sort of mmu. Once address
based limits are supported and tested to confirm that we aren't giving
the GPU access to arbitrary memory, this restriction can be lifted
Signed-off-by: Rob Clark <robdclark@gmail.com>
This got a bit broken with original patches when re-arranging things to
move dependencies on mach-msm inside #ifndef OF.
Signed-off-by: Rob Clark <robdclark@gmail.com>
If gpu locks up with the rptr shortly beyond the wrap-around point in
the ringbuffer, because the rptr was not reset (but wptr is, by virtue
of resetting rb->cur), we could end up in a scenario where we think
there is not enough space in the ringbuffer for the next cmds. And
since the CP won't reset rptr until after processing an IB, this leaves
things in a sort of deadlock.
So reset rptr too. And a bit more spiffing up of hangcheck to make
things easier to debug.
Signed-off-by: Rob Clark <robdclark@gmail.com>
A basic, no-frills recovery mechanism in case the gpu gets wedged. We
could try to be a bit more fancy and restart the next submit after the
one that got wedged, but for now keep it simple. This is enough to
recover things if, for example, the gpu hangs mid way through a piglit
run.
Signed-off-by: Rob Clark <robdclark@gmail.com>
Add initial support for a3xx 3d core.
So far, with hardware that I've seen to date, we can have:
+ zero, one, or two z180 2d cores
+ a3xx or a2xx 3d core, which share a common CP (the firmware
for the CP seems to implement some different PM4 packet types
but the basics of cmdstream submission are the same)
Which means that the eventual complete "class" hierarchy, once
support for all past and present hw is in place, becomes:
+ msm_gpu
+ adreno_gpu
+ a3xx_gpu
+ a2xx_gpu
+ z180_gpu
This commit splits out the parts that will eventually be common
between a2xx/a3xx into adreno_gpu, and the parts that are even
common to z180 into msm_gpu.
Note that there is no cmdstream validation required. All memory access
from the GPU is via IOMMU/MMU. So as long as you don't map silly things
to the GPU, there isn't much damage that the GPU can do.
Signed-off-by: Rob Clark <robdclark@gmail.com>
Generated from rnndb files in:
https://github.com/freedreno/envytools
Keep this split out as a separate commit to make it easier to review the
actual driver.
Signed-off-by: Rob Clark <robdclark@gmail.com>