Instead of taking the first pipe and giving the rest to kfd, take the
first 2 queues of each pipe.
Effectively, amdgpu and amdkfd own the same number of queues. But
because the queues are spread over multiple pipes the hardware will be
able to better handle concurrent compute workloads.
amdgpu goes from 1 pipe to 4 pipes, i.e. from 1 compute threads to 4
amdkfd goes from 3 pipe to 4 pipes, i.e. from 3 compute threads to 4
v2: fix policy comment
Reviewed-by: Edward O'Callaghan <funfunctor@folklore1984.net>
Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com>
Acked-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Andres Rodriguez <andresx7@gmail.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Instead of picking an arbitrary queue for KIQ, search for one according
to policy. The queue must be unused.
Also report the KIQ as an unavailable resource to KFD.
In testing I ran into KCQ initialization issues when using pipes 2/3 of
MEC2 for the KIQ. Therefore the policy disallows grabbing one of these.
v2: fix (ring.me + 1) to (ring.me -1) in amdgpu_amdkfd_device_init
Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com>
Signed-off-by: Andres Rodriguez <andresx7@gmail.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
The assumption that we are only using the first pipe no longer holds.
Instead, calculate the queue_mask from the queue_bitmap.
Acked-by: Felix Kuehling <Felix.Kuehling@amd.com>
Signed-off-by: Andres Rodriguez <andresx7@gmail.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Pipes provide better concurrency than queues, therefore we want to make
sure that apps use queues from different pipes whenever possible.
Optimize for the trivial case where an app will consume rings in order,
therefore we don't want adjacent rings to belong to the same pipe.
Reviewed-by: Edward O'Callaghan <funfunctor@folklore1984.net>
Acked-by: Felix Kuehling <Felix.Kuehling@amd.com>
Acked-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Andres Rodriguez <andresx7@gmail.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
This information is already available in adev.
Reviewed-by: Edward O'Callaghan <funfunctor@folklore1984.net>
Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com>
Acked-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Andres Rodriguez <andresx7@gmail.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Update the KGD to KFD interface to allow sharing pipes with queue
granularity instead of pipe granularity.
This allows for more interesting pipe/queue splits.
v2: fix overflow check for res.queue_mask
v3: fix shift overflow when setting res.queue_mask
v4: fix comment in is_pipeline_enabled()
v5: clamp res.queue_mask to the first MEC only
Reviewed-by: Edward O'Callaghan <funfunctor@folklore1984.net>
Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com>
Acked-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Andres Rodriguez <andresx7@gmail.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
The current implementation is hardcoded to enable ME1/PIPE0 interrupts
only.
This patch allows amdgpu to enable interrupts for any pipe of ME1.
v2: added gfx9 support
v3: use soc15_grbm_select for gfx9
Acked-by: Felix Kuehling <Felix.Kuehling@amd.com>
Signed-off-by: Andres Rodriguez <andresx7@gmail.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Previously the queue/pipe split with kfd operated with pipe
granularity. This patch allows amdgpu to take ownership of an arbitrary
set of queues.
It also consolidates the last few magic numbers in the compute
initialization process into mec_init.
v2: support for gfx9
v3: renamed AMDGPU_MAX_QUEUES to AMDGPU_MAX_COMPUTE_QUEUES
v4: fix off-by-one in num_mec checks in *_compute_queue_acquire
Reviewed-by: Edward O'Callaghan <funfunctor@folklore1984.net>
Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com>
Acked-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Andres Rodriguez <andresx7@gmail.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Make amdgpu the owner of all per-pipe state of the HQDs.
This change will allow us to split the queues between kfd and amdgpu
with a queue granularity instead of pipe granularity.
This patch fixes kfd allocating an HDP_EOP region for its 3 pipes which
goes unused.
v2: support for gfx9
v3: fix gfx7 HPD intitialization
Reviewed-by: Edward O'Callaghan <funfunctor@folklore1984.net>
Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com>
Acked-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Andres Rodriguez <andresx7@gmail.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Rename straggler instances of r(adeon)dev to a(mdgpu)dev
Reviewed-by: Edward O'Callaghan <funfunctor@folklore1984.net>
Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com>
Acked-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Andres Rodriguez <andresx7@gmail.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
The return value from copy_form_user is 0 for the success case.
Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com>
Signed-off-by: Andres Rodriguez <andresx7@gmail.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Use the same gfx_*_mqd_commit function for kfd and amdgpu codepaths.
This removes the last duplicates of this programming sequence.
v2: fix cp_hqd_pq_wptr value
Reviewed-by: Edward O'Callaghan <funfunctor@folklore1984.net>
Acked-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com>
Signed-off-by: Andres Rodriguez <andresx7@gmail.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
The gfxv7 contains a slightly different version of cik_mqd called
bonaire_mqd. This can introduce subtle bugs if fixes are not applied in
both places.
Reviewed-by: Edward O'Callaghan <funfunctor@folklore1984.net>
Acked-by: Christian König <christian.koenig@amd.com>
Acked-by: Felix Kuehling <Felix.Kuehling@amd.com>
Signed-off-by: Andres Rodriguez <andresx7@gmail.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Handle HQD deactivation timeouts instead of ignoring them.
Reviewed-by: Edward O'Callaghan <funfunctor@folklore1984.net>
Acked-by: Christian König <christian.koenig@amd.com>
Acked-by: Felix Kuehling <Felix.Kuehling@amd.com>
Signed-off-by: Andres Rodriguez <andresx7@gmail.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
The MQD programming sequence currently exists in 3 different places.
Refactor it to absorb all the duplicates.
The success path remains mostly identical except for a slightly
different order in the non-kiq case. This shouldn't matter if the HQD
is disabled.
The error handling paths have been updated to deal with the new code
structure.
v2: the non-kiq path for gfxv8 was dropped in the rebase
v3: split MEC_HPD_SIZE rename, dropped doorbell changes
Reviewed-by: Edward O'Callaghan <funfunctor@folklore1984.net>
Acked-by: Christian König <christian.koenig@amd.com>
Acked-by: Felix Kuehling <Felix.Kuehling@amd.com>
Signed-off-by: Andres Rodriguez <andresx7@gmail.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Rename MEC_HPD_SIZE to GFXN_MEC_HPD_SIZE to clarify it is specific to a
gfx generation.
Signed-off-by: Andres Rodriguez <andresx7@gmail.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
This reverts commit f8fdaa0e7b81698ba2ad8c2d20c7f9a44c75e0c6.
firmware add support for this feature, so still ctrl by vbios.
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
This isn't beneficial any more since VRAM allocations are now split
so that they fits into a single page table.
Signed-off-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Junwei Zhang <Jerry.Zhang@amd.com>
Reviewed-by: Chunming Zhou <david1.zhou@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Makes it easier to update the PDE with huge pages.
Signed-off-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Junwei Zhang <Jerry.Zhang@amd.com>
Reviewed-by: Chunming Zhou <david1.zhou@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Make the code easier to understand.
Signed-off-by: Alex Xie <AlexBin.Xie@amd.com>
Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>
Reviewed-by: Chunming Zhou <david1.zhou@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Move several if statements and a loop statment from
run time to initialization time.
Signed-off-by: Alex Xie <AlexBin.Xie@amd.com>
Reviewed-by: Chunming Zhou <david1.zhou@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
We need program ring buffer on instance 1 register space domain,
when only if instance 1 available, with two instances or instance 0,
and we need only program instance 0 regsiter space domain for ring.
Signed-off-by: Leo Liu <leo.liu@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Cc: stable@vger.kernel.org
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
This change is also useful for the upcoming changes where page tables
can be updated by CPU.
Signed-off-by: Harish Kasiviswanathan <Harish.Kasiviswanathan@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Junwei Zhang <Jerry.Zhang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
If updating the PDs fails we now invalidate all entries to try again later.
Signed-off-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Chunming Zhou <david1.zhou@amd.com>
Reviewed-by: Junwei Zhang <Jerry.Zhang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Rename adjust_mc_addr to get_vm_pde and check the address bits in one place.
v2: handle vcn as well, keep setting the valid bit manually,
add a BUG_ON() for GMC v6, v7 and v8 as well.
v3: handle vcn_v1_0_enc_ring_emit_vm_flush as well.
v4: fix the BUG_ON mask for GFX6-8
Signed-off-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Load Balancing Per Watt (LBPW) allows dynamically disable CUs
when they are idle
Signed-off-by: Hawking Zhang <Hawking.Zhang@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
remnants from bring-up.
Signed-off-by: Hawking Zhang <Hawking.Zhang@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Tools fb address was failed to send to smu when smu
was not running. Changing sequence will fix it.
Signed-off-by: Eric Huang <JinHuiEric.Huang@amd.com>
Acked-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
It is to fix bug of sysfs entry pp_table which had size 0 of output before.
Signed-off-by: Eric Huang <JinHuiEric.Huang@amd.com>
Acked-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
amdgpu_device_resume() & amdgpu_device_init() have a high
time consuming call of amdgpu_late_init() which sets the
clock_gating state of all IP blocks and is blocking.
This patch defers only this setting of clock gating state
operation to post resume of amdgpu driver but ideally before
the UI comes up or in some cases post ui as well.
With this change the resume time of amdgpu_device comes down
from 1.299s to 0.199s which further helps in reducing the overall
system resume time.
V1: made the optimization applicable during driver load as well.
TEST:(For ChromiumOS on STONEY only)
* UI comes up
* amdgpu_late_init() call gets called consistently and no errors reported.
Signed-off-by: Shirish S <shirish.s@amd.com>
Reviewed-by: Huang Rui <ray.huang@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
We need program ring buffer on instance 1 register space domain,
when only if instance 1 available, with two instances or instance 0,
and we need only program instance 0 regsiter space domain for ring.
Signed-off-by: Leo Liu <leo.liu@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Cc: stable@vger.kernel.org
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
-----BEGIN PGP SIGNATURE-----
iQEcBAABAgAGBQJZK2lrAAoJEHm+PkMAQRiGm3AH/13F1DlIk05aSXHoDr/idIpR
GMHmk3YF+EuFjsL463Sh6s/SSWmz0Lda8euaoB4wCWvQFX2ZjTE+aOd79XlRiZJQ
OTtLkV9I41eXIJUpEOHia7xZiCsbw+usqcHrm1aBoSh5KKV2iQmEOrnJdibqJVOF
eXUMphNK/zFtAd2bKtQSxkaBnOOqsQUgVQSkr2K9rSg25l0KokFC6c5K5IjLn4x9
QgDY4wmMvHrDz0CtpoqlNM4XqbsDJVrFeZGfg6hlMqSRDeXeg4h3Ol0VfIT496RP
QBdrDb6hWO+HKt9B0M+7Q+8a/Fsw+5dtpqv1W/Wlr0i4CS6euU8NChAmrpkrqGo=
=m5ba
-----END PGP SIGNATURE-----
Backmerge tag 'v4.12-rc3' into drm-next
Linux 4.12-rc3
Daniel has requested this for some drm-intel-next work.
The randstruct plugin requires designated initializers for structures
that are entirely function pointers.
Cc: Christian König <christian.koenig@amd.com>
Cc: Eric Huang <JinHuiEric.Huang@amd.com>
Cc: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Kees Cook <keescook@chromium.org>
The randstruct plugin requires structures that are entirely function
pointers be initialized using designated initializers.
Cc: Alex Deucher <alexander.deucher@amd.com>
Cc: Christian König <christian.koenig@amd.com>
Signed-off-by: Kees Cook <keescook@chromium.org>
KIQ is the Kernel Interface Queue for managing the MEC. Rather than setting
up rings via direct MMIO of ring registers, the rings are configured via
special packets sent to the KIQ. The allows the MEC to better manage shared
resources and certain power events. It also reduces the code paths in the
driver to support and is required for MEC powergating.
v2: drop gfx_v9_0_cp_compute_fini() as well
v3: rebase on latest changes derived from gfx8, add unmap queues on
hw_fini
v4: fix copy/paste typo in error message (Rex)
Acked-by: Tom St Denis <tom.stdenis@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
No need to reset the wptr and clear the rings. The UNMAP_QUEUES
packet writes the current MQD state back the MQD on suspend,
so there is no need to reset it as well.
v2: fix from gfx8 (Rex)
Acked-by: Tom St Denis <tom.stdenis@amd.com>
Acked-by: monk liu <monk.liu@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
As the KCQ setup. This way we only have to wait once for the
entire MEC.
Acked-by: Tom St Denis <tom.stdenis@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Rather than waiting for each queue.
Acked-by: Tom St Denis <tom.stdenis@amd.com>
Reviewed-by: monk liu <monk.liu@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
One for KIQ and one for the KCQ. This simplifies the logic and
allows for future optimizations.
Acked-by: Tom St Denis <tom.stdenis@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
It's stored in LE format.
Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Sets floors for various clocks depending on current
requirements.
Signed-off-by: Rex Zhu <Rex.Zhu@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Add the new clock type to the gfx arbitor so we can determine
the proper clock floors for it.
Signed-off-by: Rex Zhu <Rex.Zhu@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
New clock type on RV.
Signed-off-by: Rex Zhu <Rex.Zhu@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Print the failed msg when it fails.
Signed-off-by: Rex Zhu <Rex.Zhu@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
v2: bump the DRM version
Signed-off-by: Marek Olšák <marek.olsak@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
If you initiate a read that is out of the VRAM address space return
ENXIO instead of 0.
Reads that begin below that point will read upto the VRAM limit as
before.
Cc: stable@vger.kernel.org
Signed-off-by: Tom St Denis <tom.stdenis@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Tom St Denis <tom.stdenis@amd.com>
Acked-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Tom St Denis <tom.stdenis@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Drop the function gmc_v6_0_init_compute_vmid() since it wasn't
implemented and commented out.
Signed-off-by: Tom St Denis <tom.stdenis@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Tom St Denis <tom.stdenis@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Clean up coding style in gfx_v6_0_write_harvested_raster_configs()
Signed-off-by: Tom St Denis <tom.stdenis@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Tom St Denis <tom.stdenis@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
That line got missed during the merge.
v2: fix vcn_v1_0_enc_ring_emit_vm_flush as well
Signed-off-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
That GFX9 needs a PDE in the registers is entirely GFX9 specific.
Signed-off-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Chunming Zhou <david1.zhou@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
512 is enough for one PD entry on Vega10.
Signed-off-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Chunming Zhou <david1.zhou@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
If doorbell is used for wptr update, we also need to use it
to initialize wptr to 0.
Signed-off-by: Yintian Tao <yttao@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
VM is mandatory for all hw amdgpu supports. So remove the leftovers
to make it optionally.
Signed-off-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
The fence in dep_sync cannot be optimized.
Signed-off-by: Chunming Zhou <David1.Zhou@amd.com>
Tested and Reviewed-by: Roger.He <Hongbo.He@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Update gfx9 golden settings.
Signed-off-by: Ken Wang <Qingqing.Wang@amd.com>
Acked-by: Chunming Zhou <david1.zhou@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
If the vm is guilty of a GPU reset, skips all its jobs.
Signed-off-by: Chunming Zhou <David1.Zhou@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
below ioctl will return -ENODEV:
amdgpu_cs_ioctl
amdgpu_cs_wait_ioctl
amdgpu_cs_wait_fences_ioctl
amdgpu_gem_va_ioctl
amdgpu_info_ioctl
v2: only for map and replace cases in amdgpu_gem_va_ioctl
Signed-off-by: Chunming Zhou <David1.Zhou@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
backup first 64 byte of gart table as reset magic, check if magic is same
after gpu hw reset.
v2: use memcmp instead of manual innovation.
Signed-off-by: Chunming Zhou <David1.Zhou@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Clock index 0 is a valid index that is needed to restore the default
graphics power profile. Use ~0 to indicate a failure to find a clock
index. This fixes the clocks getting stuck in the compute power
profile after running a compute application on Vega10.
Signed-off-by: Felix Kuehling <Felix.Kuehling@amd.com>
Signed-off-by: Eric Huang <JinHuiEric.Huang@amd.com>
Reviewed-by: Tom St Denis <tom.stdenis@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Add support for parsing the gpu info table on raven.
This is required to get the gpu config data for raven.
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Tom St Denis <tom.stdenis@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
only mmhub will be invalidated during vcn dec/enc vm flush
Signed-off-by: Hawking Zhang <Hawking.Zhang@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
This got missed due to differences in the trees
when raven support was merged.
Signed-off-by: Hawking Zhang <Hawking.Zhang@amd.com>
Reviewed-by: ken wang <Qingqing.Wang@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
drivers/gpu/drm/amd/amdgpu/../powerplay/hwmgr/rv_hwmgr.c:75:42-43: WARNING: Use ARRAY_SIZE
drivers/gpu/drm/amd/amdgpu/../powerplay/hwmgr/rv_hwmgr.c:466:22-23: WARNING: Use ARRAY_SIZE
drivers/gpu/drm/amd/amdgpu/../powerplay/hwmgr/rv_hwmgr.c:468:22-23: WARNING: Use ARRAY_SIZE
drivers/gpu/drm/amd/amdgpu/../powerplay/hwmgr/rv_hwmgr.c:470:20-21: WARNING: Use ARRAY_SIZE
drivers/gpu/drm/amd/amdgpu/../powerplay/hwmgr/rv_hwmgr.c:473:22-23: WARNING: Use ARRAY_SIZE
drivers/gpu/drm/amd/amdgpu/../powerplay/hwmgr/rv_hwmgr.c:475:21-22: WARNING: Use ARRAY_SIZE
drivers/gpu/drm/amd/amdgpu/../powerplay/hwmgr/rv_hwmgr.c:477:21-22: WARNING: Use ARRAY_SIZE
Use ARRAY_SIZE instead of dividing sizeof array with sizeof an element
Semantic patch information:
This makes an effort to find cases where ARRAY_SIZE can be used such as
where there is a division of sizeof the array by the sizeof its first
element or by any indexed element or the element type. It replaces the
division of the two sizeofs by ARRAY_SIZE.
Generated by: scripts/coccinelle/misc/array_size.cocci
CC: Rex Zhu <Rex.Zhu@amd.com>
Signed-off-by: Fengguang Wu <fengguang.wu@intel.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Add the RAVEN pci id.
v2: add exp flag for now (Alex)
Signed-off-by: Chunming Zhou <David1.Zhou@amd.com>
Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
v2: squash in some updates (Alex)
Signed-off-by: Andrey Grodzovsky <Andrey.Grodzovsky@amd.com>
Acked-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Add the ip block and enable powerplay on raven.
Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com>
Signed-off-by: Chunming Zhou <David1.Zhou@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
By default VCN is powered down like SDMA, power it up/down
on driver load/unload.
[Rui: Fix to add the parameter 0 to un-gate VCN] v2
Acked-by: Huang Rui <ray.huang@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Huang Rui <ray.huang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
sdma is powered down by default in vbios,
need to power up in driver init. Power it down
again on driver tear down.
Signed-off-by: Chunming Zhou <David1.Zhou@amd.com>
Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
hwmgr handles the GPU power state management.
v2: squash in updates (Alex)
Signed-off-by: Rex Zhu <Rex.Zhu@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
smumgr provides the interface for interacting with the
smu firmware which handles power management.
v2: squash in updates (Alex)
Signed-off-by: Rex Zhu <Rex.Zhu@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Headers define the driver/fw interface for smu10.
v2: squash in updates (Alex)
Signed-off-by: Rex Zhu <Rex.Zhu@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Based on new vcn firmware interface changes
Signed-off-by: Leo Liu <leo.liu@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
New firmware add psp header.
Signed-off-by: Leo Liu <leo.liu@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Leo Liu <leo.liu@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Leo Liu <leo.liu@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Update and enable the vcn encode IB test.
Signed-off-by: Leo Liu <leo.liu@amd.com>
Acked-by: Christian König <christian.koenig@amd.com>
Acked-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Wire up the callback and enable them.
Signed-off-by: Leo Liu <leo.liu@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Leo Liu <leo.liu@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Add the ring function callbacks for the encode rings.
Signed-off-by: Leo Liu <leo.liu@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Leo Liu <leo.liu@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Leo Liu <leo.liu@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Not required on raven.
Signed-off-by: Leo Liu <leo.liu@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Needed for the proper command sequence for VCN.
Signed-off-by: Leo Liu <leo.liu@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Hope it will be generic for vcn later
Signed-off-by: Leo Liu <leo.liu@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Leo Liu <leo.liu@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Add the ring function callbacks for the decode ring.
Signed-off-by: Leo Liu <leo.liu@amd.com>
Acked-by: Chunming Zhou <david1.zhou@amd.com>
Acked-by: Hawking Zhang <Hawking.Zhang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Add the decode ring init.
Signed-off-by: Leo Liu <leo.liu@amd.com>
Acked-by: Chunming Zhou <david1.zhou@amd.com>
Acked-by: Hawking Zhang <Hawking.Zhang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Fill in the core VCN 1.0 setup functionality.
v2: squash in fixup (Alex)
Signed-off-by: Leo Liu <leo.liu@amd.com>
Acked-by: Chunming Zhou <david1.zhou@amd.com>
Acked-by: Hawking Zhang <Hawking.Zhang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Add encode ring and ib tests.
Signed-off-by: Leo Liu <leo.liu@amd.com>
Acked-by: Alex Deucher <alexander.deucher@amd.com>
Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
VCN is the new media block on Raven. Add core support
and the ring and ib tests for decode.
Signed-off-by: Leo Liu <leo.liu@amd.com>
Acked-by: Alex Deucher <alexander.deucher@amd.com>
Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Add the ip block version structure for psp 10.0.
Signed-off-by: Huang Rui <ray.huang@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Acked-by: Chunming Zhou <david1.zhou@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
PSP is the security processor. These are the support
functions.
Signed-off-by: Huang Rui <ray.huang@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Acked-by: Chunming Zhou <david1.zhou@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
nbio handles misc bus io operations. Handle
differences between different nbio bus versions.
v2: switch checks from RAVEN to APU (Alex)
squash in raven rev id fetch
squash in fix uninitalized hdp flush reg index for raven
v3: add some missed RAVEN to APU checks (Alex)
Signed-off-by: Chunming Zhou <David1.Zhou@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
NBIO handles misc bus io functions on the chip. This
helper lib has the apppropriate functions for NBIO 7.0.
Signed-off-by: Chunming Zhou <David1.Zhou@amd.com>
Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
APU fb offset is set by sbios, which is different with DGPU.
Signed-off-by: Chunming Zhou <David1.Zhou@amd.com>
Reviewed-by: Ken Wang <Qingqing.Wang@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Wire up the functions to control medium grained
powergating.
Signed-off-by: Hawking Zhang <Hawking.Zhang@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Wire up the enable functions to enable coarse
grained powegating.
Signed-off-by: Hawking Zhang <Hawking.Zhang@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
More stuff for gfx pg.
Signed-off-by: Hawking Zhang <Hawking.Zhang@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
setup the save and restore buffers used for gfx
powergating.
Signed-off-by: Hawking Zhang <Hawking.Zhang@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Reviewed-by: Huang Rui <ray.huang@amd.com>
Reviewed-by: Chunming Zhou <david1.zhou@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Required for proper handshaking between the GFX and RLC.
Signed-off-by: Hawking Zhang <Hawking.Zhang@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Reviewed-by: Chunming Zhou <david1.zhou@amd.com>
Reviewed-by: Huang Rui <ray.huang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Fetch correct firmware for raven for gfx and sdma.
Signed-off-by: Chunming Zhou <David1.Zhou@amd.com>
Reviewed-by: Ken Wang <Qingqing.Wang@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Set the appropriate ucode loading mechanism. Set to
direct for now.
Signed-off-by: Chunming Zhou <David1.Zhou@amd.com>
Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Add the common golden settings for Raven.
Signed-off-by: Chunming Zhou <David1.Zhou@amd.com>
Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Add the IP blocks for RAVEN.
v2: drop DC for upstream (Alex)
Signed-off-by: Chunming Zhou <David1.Zhou@amd.com>
Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
RAVEN is a new APU.
Signed-off-by: Chunming Zhou <David1.Zhou@amd.com>
Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
because if the fence is really signaled, it could already
released so the fence pointer is a wild pointer, but if
we use job->base.node we are safe because job will not
be released untill amdgpu_job_timedout finished.
Signed-off-by: Monk Liu <Monk.Liu@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
1,TDR will kickout guilty job if it hang exceed the threshold
of the given one from kernel paramter "job_hang_limit", that
way a bad command stream will not infinitly cause GPU hang.
by default this threshold is 1 so a job will be kicked out
after it hang.
2,if a job timeout TDR routine will not reset all sched/ring,
instead if will only reset on the givn one which is indicated
by @job of amdgpu_sriov_gpu_reset, that way we don't need to
reset and recover each sched/ring if we already know which job
cause GPU hang.
3,unblock sriov_gpu_reset for AI family.
V2:
1:put kickout guilty job after sched parked.
2:since parking scheduler prior to kickout already occupies a
while, we can do last check on the in question job before
doing hw_reset.
TODO:
1:when a job is considered as guilty, we should mark some flag
in its fence status flag, and let UMD side aware that this
fence signaling is not due to job complete but job hang.
2:if gpu reset cause all video memory lost, we need introduce
a new policy to implement TDR, like drop all jobs not yet
signaled, and all IOCTL on this device will return ERROR
DEVICE_LOST.
this will be implemented later.
Signed-off-by: Monk Liu <Monk.Liu@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
We don't need a scheduler for KIQ.
Signed-off-by: Monk Liu <Monk.Liu@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Monk Liu <Monk.Liu@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
that way we can know which job cause hang and
can do per sched reset/recovery instead of all
sched.
Signed-off-by: Monk Liu <Monk.Liu@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
because we don't want to do sriov-gpu-reset under certain
cases, so just split those two funtion and don't invoke
sr-iov one from bare-metal one.
V2:
remove debugfs_gpu_reset routine on SRIOV case.
Signed-off-by: Monk Liu <Monk.Liu@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
before that, we have function to check if reset happens by using reset count.
v2: always update reset count after vm flush
Signed-off-by: Chunming Zhou <David1.Zhou@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
v2: directly return for 'if' case.
Signed-off-by: Chunming Zhou <David1.Zhou@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
this is an improvement for previous patch, the sched_sync is to store fence
that could be skipped as scheduled, when job is executed, we didn't need
pipeline_sync if all fences in sched_sync are signalled, otherwise insert
pipeline_sync still.
v2: handle error when adding fence to sync failed.
Signed-off-by: Chunming Zhou <David1.Zhou@amd.com>
Reviewed-by: Junwei Zhang <Jerry.Zhang@amd.com> (v1)
Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
This kind of reset handling was removed a long time ago.
v2: fix warning (Alex)
Signed-off-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Chunming Zhou <David1.Zhou@amd.com>
Reviewed-by: Roger.He <Hongbo.He@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
the root cause is vram content is lost completely after pci reset.
Signed-off-by: Chunming Zhou <David1.Zhou@amd.com>
Reviewed-by: Roger.He <Hongbo.He@amd.com>
Acked-by: Christian König <christian.koenig@amd.com>
Acked-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
to cover below case:
1. A task gart bind/unbind but not add to adev->gtt_list yet
2. at this time gpu reset, gtt only recover those gtt in adev->gtt_list
Reviewed-by: Chunming Zhou <david1.zhou@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Roger.He <Hongbo.He@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Xie <AlexBin.Xie@amd.com>
Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Xie <AlexBin.Xie@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
rather than defining it locally.
Reviewed-by: Junwei Zhang <Jerry.Zhang@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
else branch is pointless if it's right at the end of function and use
unlikely() on err path.
Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Nikola Pajkovsky <npajkovsky@suse.cz>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Monk Liu <Monk.Liu@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
AI affected:
CP/HW team requires KMD insert FRAME_CONTROL(end) after
the last IB and before the fence of this DMAframe.
this is to make sure the cache are flushed, and it's a must
change no matter MCBP/SR-IOV or bare-metal case because new
CP hw won't do the cache flush for each IB anymore, it just
leaves it to KMD now.
with this patch, certain MCBP hang issue when rendering
vulkan/chained-ib are resolved.
v2: drop gfx8 changes. gfx8 is not affected (Alex)
Signed-off-by: Monk Liu <Monk.Liu@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
TMZ package will be used for VULKAN/CHAINED-IB MCBP
Signed-off-by: Monk Liu <Monk.Liu@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
According to CP/hw team requirment, to support PAL/CHAINED-IB
MCBP, kernel driver must guarantee DE_META must be inserted
right prior to the work_load DE IB (with PREEMPT flag), there
cannot be any non-work_load DE IB between-in DE_META and
work_load DE IB.
Signed-off-by: Monk Liu <Monk.Liu@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Monk Liu <Monk.Liu@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Monk Liu <Monk.Liu@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
According to HW design, need to clean doorbell after setup MMSCH
table.
Signed-off-by: Frank Min <Frank.Min@amd.com>
Signed-off-by: Xiangliang Yu <Xiangliang.Yu@amd.com>
Reviewed-by: Monk Liu <Monk.Liu@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Change message to debug level as VI does.
Signed-off-by: Xiangliang Yu <Xiangliang.Yu@amd.com>
Reviewed-by: Monk Liu <Monk.Liu@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
If psp version doesn't match asd version, asd loading will be
failed. Add workaround to bypass it for sriov.
Signed-off-by: Daniel Wang <Daniel.Wang2@amd.com>
Signed-off-by: Xiangliang Yu <Xiangliang.Yu@amd.com>
Reviewed-by: Monk Liu <Monk.Liu@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
On vega10, some hw finish operations should not be applied in SR-IOV
case. This works as workaround to fix multi-VFs reboot/shutdown
issues.
Signed-off-by: Trigger Huang <trigger.huang@amd.com>
Reviewed-by: Xiangliang Yu <Xiangliang.Yu@amd.com>
Reviewed-by: Monk Liu <monk.liu@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
1,this way we make those routines compatible with the sequence
requirment for both Tonga and Vega10
2,ignore PSP hw init when doing TDR, because for SR-IOV device
the ucode won't get lost after VF FLR, so no need to invoke PSP
doing the ucode reloading again.
v2: squash in ARRAY_SIZE fix
Signed-off-by: Monk Liu <Monk.Liu@amd.com>
Reviewed-by: Xiangliang Yu <Xiangliang.Yu@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
if sriov gpu reset is invoked by job timeout, it is run
in a global work-queue which is very slow and better not call
msleep ortherwise it takes long time to get back CPU.
so make below changes:
1: Change msleep 1 to mdelay 5
2: Ignore the ack fail from pf after time out,
because VF FLR will clear ack, sometime VF FLR is done
prior to the beginning of poll_ack so we can ignore this ack
TODO:
Put job_timedout (and the following gpu reset) in a driver thread,
instead of the global work_struct.
Signed-off-by: Monk Liu <Monk.Liu@amd.com>
Reviewed-by: Xiangliang Yu <Xiangliang.Yu@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
this is to prevent fence forever waiting if FLR occured
during register accessing.
v2:
use define instead of hardcode for the timeout msec
Signed-off-by: Monk Liu <Monk.Liu@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
We need to make sure the various init sequences submitted
to KIQ complete before testing the rings.
Reviewed-by: monk liu <monk.liu@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Will be used in subsequent commits rather rather than
magic numbers.
Reviewed-by: monk liu <monk.liu@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
And properly synchronize them with the master during
queue init.
Reviewed-by: monk liu <monk.liu@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
The usage of kiq should not depend on the virtualization.
Signed-off-by: Shaoyun Liu <Shaoyun.Liu@amd.com>
Reviewed-by:Andres Rodriquez <andresx7@gmail.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Interface to reserve a vmid for a specific process to
add in shader debugging that requries a fixed vmid.
Signed-off-by: Chunming Zhou <David1.Zhou@amd.com>
Reviewed-by: Junwei Zhang <Jerry.Zhang@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Implement the vmid reservation.
v2: move sync waiting only when flush needs
v3: fix racy
v4: peek fence instead of get fence, and fix potential context starved.
Signed-off-by: Chunming Zhou <David1.Zhou@amd.com>
Reviewed-by: Junwei Zhang <Jerry.Zhang@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Limit reserved vmids to 1 to avoid taking too many
out of commission and starving the system.
v2: move #define to amdgpu_vm.h
v3: move reserved vmid counter to id_manager,
and increase counter before allocating vmid
v4: rename to reserved_vmid_num
Signed-off-by: Chunming Zhou <David1.Zhou@amd.com>
Reviewed-by: Junwei Zhang <Jerry.Zhang@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
add reserve/unreserve vmid funtions. Used to reserve
vmids for certain shader debugging functionality that
required a fixed vmid for the life of the debug.
v3:
only reserve vmid from gfxhub
v4:
fix racy condition
Signed-off-by: Chunming Zhou <David1.Zhou@amd.com>
Reviewed-by: Junwei Zhang <Jerry.Zhang@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
It will be used for reserving vmid for shader debugging
that requires a fixed vmid.
v2: fix warning (Alex)
Signed-off-by: Chunming Zhou <David1.Zhou@amd.com>
Reviewed-by: Junwei Zhang <Jerry.Zhang@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Support for MCBP/Virtualization in combination with chained IBs is
formal released on firmware feature version #46. So enable it
according to firmware feature version, otherwise, world switch will
hang.
Signed-off-by: Trigger Huang <trigger.huang@amd.com>
Reviewed-by: Monk Liu <monk.liu@amd.com>
Reviewed-by: Xiangliang Yu <Xiangliang.Yu@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Bug: SWDEV-117987: Always on CU mask broken for gfx7+
Signed-off-by: Rex Zhu <Rex.Zhu@amd.com>
Acked-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Change place of virt_init_setting function so that can cover the
cg and pg flags configuration.
Signed-off-by: Xiangliang Yu <Xiangliang.Yu@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
GPU hypervisor cover all settings of CG and PG, so guest doesn't
need to do anything. Bypass it.
Signed-off-by: Frank Min <Frank.Min@amd.com>
Signed-off-by: Xiangliang Yu <Xiangliang.Yu@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Only per family registers are still used.
Signed-off-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
I couldn't figure out what this was original good for, but we
don't use it any more.
Signed-off-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Xiaojie Yuan <Xiaojie.Yuan@amd.com>
Reviewed-by: Edward O'Callaghan <funfunctor@folklore1984.net>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Acked-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
I'm not sure if the order matters, but it seems like it makes
more sense to set this after the range is programmed.
v2: rebase (Alex)
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
add set_doorbell functions for mec and cpg.
Signed-off-by: Rex Zhu <Rex.Zhu@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
This could be used in Andres' priority scheduling patch
as well.
Reviewed-by: Andres Rodriguez <andresx7@gmail.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Even if we disable clockgating, we still need to make sure the
cp/rlc interrupts are enabled for powergating which might still
be enabled.
Acked-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Even if we disable clockgating, we still need to make sure the
cp/rlc interrupts are enabled for powergating which might still
be enabled.
Acked-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
It's global, not queue specific, so move it out of the
kiq register init function.
Tested-and-Reviewed-by: Rex Zhu <Rex.Zhu@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
No need to reset the wptr and clear the rings. The UNMAP_QUEUES
packet writes the current MQD state back the MQD on suspend,
so there is no need to reset it as well.
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Use the UNMAP_QUEUES packet to have the KIQ properly
disable them.
Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
As the KCQ setup. This way we only have to wait once for the
entire MEC.
Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
One for KIQ and one for the KCQ. This simplifies the logic and
allows for future optimizations.
Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
We need to make sure the various init sequences submitted
to KIQ complete before testing the rings.
Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Re-enable GFX PG. It's working properly with MEC now that KIQ is
enabled.
Reviewed-by: Samuel Li <samuel.li@amd.com>
This reverts commit e9ef19aa1bdeac380662a112f1d03a7c3477527f.
KIQ is the Kernel Interface Queue for managing the MEC. Rather than setting
up rings via direct MMIO of ring registers, the rings are configured via
special packets sent to the KIQ. The allows the MEC to better manage shared
resources and certain power events.
v2: squash in s3/s4 fix from Rex
v3: further fixes from Rex
Signed-off-by: David Panariti <David.Panariti@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Acked-by: Tom St Denis <tom.stdenis@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Add missing chips to the doorbell range setup. These
were missed in the KIQ code. Fixes power and performance
regressions with KIQ. Spotted by Rex.
Tested-and-Reviewed-by: Rex Zhu <Rex.Zhu@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Need to properly set the MTYPE and ROQ space setting.
This should fix performance regressions with KIQ enabled.
Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
This resolves pcie low speed problem.
Signed-off-by: Eric Huang <JinHuiEric.Huang@amd.com>
Acked-by: Felix Kuehling <Felix.Kuehling@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
need update smu firmware to version 0x1c20.
Signed-off-by: Rex Zhu <Rex.Zhu@amd.com>
Reviewws-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>