Commit Graph

37 Commits

Author SHA1 Message Date
Trigger Huang
470b425019 drm/amdgpu: call psp to program ih cntl in SR-IOV
call psp to program ih cntl in SR-IOV if supported

Signed-off-by: Trigger Huang <Trigger.Huang@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-05-24 12:20:51 -05:00
Trigger Huang
74dcfe74b4 drm/amdgpu: Rearm IRQ in Vega10 SR-IOV if IRQ lost
In Multi-VFs stress test, sometimes we see IRQ lost when running
benchmark, just rearm it.

Signed-off-by: Trigger Huang <Trigger.Huang@amd.com>
Acked-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-05-06 09:36:34 -05:00
Christian König
b51cd19e48 drm/amdgpu: enable IH ring 1&2 for Vega20 as well
That doesn't seem to have any negative effects.

Signed-off-by: Christian König <christian.koenig@amd.com>
Acked-by: Chunming Zhou <david1.zhou@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-03-19 15:36:58 -05:00
Christian König
1ae64cec8a drm/amdgpu: enable IH doorbell for ring 1&2 on Vega
The doorbells should already be reserved, just enable them.

Signed-off-by: Christian König <christian.koenig@amd.com>
Acked-by: Chunming Zhou <david1.zhou@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-03-19 15:36:58 -05:00
Christian König
0133690e0d drm/amdgpu: change Vega IH ring 1 config
Disable overflow and enable full drain. This makes fault handling on ring 1
much more reliable since we don't generate back pressure any more.

Signed-off-by: Christian König <christian.koenig@amd.com>
Acked-by: Chunming Zhou <david1.zhou@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-03-19 15:36:58 -05:00
Christian König
cf67950e22 drm/amdgpu: add support for self irq on Vega10 v2
This finally enables processing of ring 1 & 2.

v2: fix copy&paste error

Signed-off-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-01-25 16:15:35 -05:00
Christian König
ad710812b5 drm/amdgpu: enable IH ring 1 and ring 2 v4
The entries are ignored for now, but it at least stops crashing the
hardware when somebody tries to push something to the other IH rings.

v2: limit ring size, add TODO comment
v3: only program rings if they are actually allocated
v4: limit the ring init to Vega10

Signed-off-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-01-25 16:15:35 -05:00
Oak Zeng
7c94bc828e drm/amdgpu: Setting doorbell range registers earlier
HW doorbell writing routing policy: writing to doorbell
not in SDMA/IH/MM/ACV doorbell range will be routed to CP.
So CP doorbell routing depends on doorbell range setting
of above blocks. Setting doorbell range of above blocks
earlier (soc15_common_hw_init) to make sure CP doorbell
writing be routed to CP block.

Signed-off-by: Oak Zeng <Oak.Zeng@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-01-25 16:15:34 -05:00
Christian König
b821757501 drm/amdgpu: fix IH overflow on Vega10 v2
When an ring buffer overflow happens the appropriate bit is set in the WPTR
register which is also written back to memory. But clearing the bit in the
WPTR doesn't trigger another memory writeback.

So what can happen is that we end up processing the buffer overflow over and
over again because the bit is never cleared. Resulting in a random system
lockup because of an infinite loop in an interrupt handler.

This is 100% reproducible on Vega10, but it's most likely an issue we have
in the driver over all generations all the way back to radeon.

v2: rebase

Signed-off-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-01-14 15:04:47 -05:00
Christian König
d81f78b440 drm/amdgpu: simplify IH programming
Calculate all the addresses and pointers in amdgpu_ih.c

Signed-off-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-01-14 15:04:47 -05:00
Christian König
8bb9eb480d drm/amdgpu: add IH ring to ih_get_wptr/ih_set_rptr v2
Let's start to support multiple rings.

v2: decode IV is needed as well

Signed-off-by: Christian König <christian.koenig@amd.com>
Acked-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-01-14 15:04:47 -05:00
Christian König
22666cc148 drm/amdgpu: move IV prescreening into the GMC code
The GMC/VM subsystem is causing the faults, so move the handling here as
well.

Signed-off-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com>
Acked-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2018-12-07 18:14:26 -05:00
Christian König
a655dad4b2 drm/amdgpu: remove VM fault_credit handling
printk_ratelimit() is much better suited to limit the number of reported
VM faults.

Signed-off-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com>
Acked-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2018-12-07 18:14:20 -05:00
Oak Zeng
9564f1928e drm/amdgpu: Use asic specific doorbell index instead of macro definition
ASIC specific doorbell layout is used instead of enum definition

Signed-off-by: Oak Zeng <ozeng@amd.com>
Suggested-by: Felix Kuehling <Felix.Kuehling@amd.com>
Suggested-by: Alex Deucher <alexander.deucher@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2018-11-28 15:55:33 -05:00
Philip Yang
c837243ff4 drm/amdgpu: fix bug with IH ring setup
The bug limits the IH ring wptr address to 40bit. When the system memory
is bigger than 1TB, the bus address is more than 40bit, this causes the
interrupt cannot be handled and cleared correctly.

Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Philip Yang <Philip.Yang@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2018-11-13 09:38:28 -05:00
Christian König
425c31437f drm/amdgpu: cleanup amdgpu_ih.c
Cleanup amdgpu_ih.c to be able to handle multiple interrupt rings.

Signed-off-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Huang Rui <ray.huang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2018-09-26 21:09:21 -05:00
Christian König
f54b30d70b drm/amdgpu: make function pointers mandatory
We always want those to be setup correctly.

Signed-off-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Huang Rui <ray.huang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2018-09-26 21:09:21 -05:00
Oak Zeng
240cd9a642 drm/amdgpu: Move fault hash table to amdgpu vm
In stead of share one fault hash table per device, make it
per vm. This can avoid inter-process lock issue when fault
hash table is full.

Change-Id: I5d1281b7c41eddc8e26113e010516557588d3708
Signed-off-by: Oak Zeng <Oak.Zeng@amd.com>
Suggested-by: Christian Konig <Christian.Koenig@amd.com>
Suggested-by: Felix Kuehling <Felix.Kuehling@amd.com>
Reviewed-by: Christian Konig <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2018-09-12 16:28:53 -05:00
Oak Zeng
3760f76cbe drm/amdgpu: Move IH clientid defs to separate file
This is preparation for sharing client ID definitions
between amdgpu and amdkfd

Signed-off-by: Oak Zeng <Oak.Zeng@amd.com>
Reviewed-by: Chunming Zhou <david1.zhou@amd.com>
Acked-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2018-03-14 15:16:35 -05:00
Christian König
3816e42f5f drm/amdgpu: rename pas_id to pasid
sed -i "s/pas_id/pasid/g" drivers/gpu/drm/amd/amdgpu/*.c
sed -i "s/pas_id/pasid/g" drivers/gpu/drm/amd/amdgpu/*.h

Signed-off-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Chunming Zhou <david1.zhou@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2018-02-19 14:17:41 -05:00
Christian König
147f255884 drm/amdgpu: remove WARN_ON when VM isn't found v2
It can easily be that the VM is already destroyed when this runs.

v2: fix test inversion

Signed-off-by: Christian König <christian.koenig@amd.com>
Acked-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2018-01-29 23:17:20 -05:00
Christian König
153b9e1b75 drm/amdgpu: fix locking in vega10_ih_prescreen_iv
The vm pointer can become invalid as soon as the lock is released.

Signed-off-by: Christian König <christian.koenig@amd.com>
Acked-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2018-01-29 23:16:20 -05:00
Christian König
c4f46f22c4 drm/amdgpu: rename vm_id to vmid
sed -i "s/vm_id/vmid/g" drivers/gpu/drm/amd/amdgpu/*.c
sed -i "s/vm_id/vmid/g" drivers/gpu/drm/amd/amdgpu/*.h

Signed-off-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Chunming Zhou <david1.zhou@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2017-12-27 11:34:02 -05:00
Alex Deucher
bf383fb64e drm/amdgpu: convert nbio to use callbacks (v2)
Cleans up and consolidates all of the per-asic logic.

v2: squash in "drm/amdgpu: fix NULL err for sriov detect" (Chunming)

Acked-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2017-12-13 17:28:07 -05:00
Shaoyun Liu
4fd09a19a6 drm/admgpu: Reduce the usage of soc15ip.h
Remove the header where it's not used.

Acked-by: Christian Konig <christian.koenig@amd.com>
Signed-off-by: Shaoyun Liu <Shaoyun.Liu@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2017-12-08 11:35:19 -05:00
Feifei Xu
fb960bd283 drm/amd/include:cleanup vega10 header files.
Remove asic_reg/vega10 folder.

Signed-off-by: Feifei Xu <Feifei.Xu@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2017-12-06 12:48:22 -05:00
Feifei Xu
8af7454e7c drm/amd/include:cleanup vega10 osssys header files.
Cleanup asic_reg/vega10/OSSSYS folder.

Signed-off-by: Feifei Xu <Feifei.Xu@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2017-12-06 12:48:22 -05:00
Hawking Zhang
b2b7e457ba drm/amdgpu: switch to use new SOC15 reg read/write macros for soc15 ih
Signed-off-by: Hawking Zhang <Hawking.Zhang@amd.com>
Acked-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2017-12-04 16:41:43 -05:00
Felix Kuehling
c98171ccf6 drm/amdgpu: Handle GPUVM fault storms
When many wavefronts cause VM faults at the same time, it can
overwhelm the interrupt handler and cause IH ring overflows before
the driver can notify or kill the faulting application.

As a workaround I'm introducing limited per-VM fault credit. After
that number of VM faults have occurred, further VM faults are
filtered out at the prescreen stage of processing.

This depends on the PASID in the interrupt packet, so it currently
only works for KFD contexts.

Signed-off-by: Felix Kuehling <Felix.Kuehling@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2017-09-28 16:03:30 -04:00
Monk Liu
7c3f2167b4 drm/amdgpu:no kiq in IH
Signed-off-by: Monk Liu <Monk.Liu@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2017-09-26 15:14:07 -04:00
Felix Kuehling
a2f14820e3 drm/amdgpu: Track pending retry faults in IH and VM (v2)
IH tracks pending retry faults in a hash table for fast lookup in
interrupt context. Each VM has a short FIFO of pending VM faults for
processing in a bottom half.

The IH prescreening stage adds retry faults and filters out repeated
retry interrupts to minimize the impact of interrupt storms.

It's the VM's responsibility remove pending faults once they are
handled. For now this is only done when the VM is destroyed.

v2:
- Made the hash table smaller and the FIFO longer. I never want the
  FIFO to fill up, because that would make prescreen take longer.
  128 pending page faults should be enough to keep migrations busy.

Signed-off-by: Felix Kuehling <Felix.Kuehling@amd.com>
Acked-by: Christian König <christian.koenig@amd.com> (v1)
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2017-09-26 14:53:20 -04:00
Felix Kuehling
00ecd8a27c drm/amdgpu: Add prescreening stage in IH processing (v2)
To filter out high-frequency interrupts that can be safely ignored.

v2: squash in trivial typo fix for si (Alex)

Signed-off-by: Felix Kuehling <Felix.Kuehling@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2017-09-26 13:07:04 -04:00
Dave Airlie
04d4fb5fa6 Merge branch 'drm-next-4.13' of git://people.freedesktop.org/~agd5f/linux into drm-next
New radeon and amdgpu features for 4.13:
- Lots of Vega10 bug fixes
- Preliminary Raven support
- KIQ support for compute rings
- MEC queue management rework from Andres
- Audio support for DCE6
- SR-IOV improvements
- Improved module parameters for controlling radeon vs amdgpu support
  for SI and CIK
- Bug fixes
- General code cleanups

[airlied: dropped drmP.h header from one file was needed and build broke]

* 'drm-next-4.13' of git://people.freedesktop.org/~agd5f/linux: (362 commits)
  drm/amdgpu: Fix compiler warnings
  drm/amdgpu: vm_update_ptes remove code duplication
  drm/amd/amdgpu: Port VCN over to new SOC15 macros
  drm/amd/amdgpu: Port PSP v10.0 over to new SOC15 macros
  drm/amd/amdgpu: Port PSP v3.1 over to new SOC15 macros
  drm/amd/amdgpu: Port NBIO v7.0 driver over to new SOC15 macros
  drm/amd/amdgpu: Port NBIO v6.1 driver over to new SOC15 macros
  drm/amd/amdgpu: Port UVD 7.0 over to new SOC15 macros
  drm/amd/amdgpu: Port MMHUB over to new SOC15 macros
  drm/amd/amdgpu: Cleanup gfxhub read-modify-write patterns
  drm/amd/amdgpu: Port GFXHUB over to new SOC15 macros
  drm/amd/amdgpu: Add offset variant to SOC15 macros
  drm/amd/powerplay: add avfs control for Vega10
  drm/amdgpu: add virtual display support for raven
  drm/amdgpu/gfx9: fix compute ring doorbell index
  drm/amd/amdgpu: Rename KIQ ring to avoid spaces
  drm/amd/amdgpu: gfx9 tidy ups (v2)
  drm/amdgpu: add contiguous flag in ucode bo create
  drm/amdgpu: fix missed gpu info firmware when cache firmware during S3
  drm/amdgpu: export test ib debugfs interface
  ...
2017-06-16 09:56:53 +10:00
Chunming Zhou
aecbe64f2b drm/amdgpu: apply nbio7 for Raven (v3)
nbio handles misc bus io operations. Handle
differences between different nbio bus versions.

v2: switch checks from RAVEN to APU (Alex)
    squash in raven rev id fetch
    squash in fix uninitalized hdp flush reg index for raven
v3: add some missed RAVEN to APU checks (Alex)

Signed-off-by: Chunming Zhou <David1.Zhou@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2017-05-24 17:41:17 -04:00
Masahiro Yamada
248a1d6f1a drm/amd: fix include notation and remove -Iinclude/drm flag
Include <drm/*.h> instead of relative path from include/drm, then
remove the -Iinclude/drm compiler flag.

Signed-off-by: Masahiro Yamada <yamada.masahiro@socionext.com>
Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Link: http://patchwork.freedesktop.org/patch/msgid/1493009447-31524-4-git-send-email-yamada.masahiro@socionext.com
2017-05-16 17:17:41 +02:00
Alex Xie
60508d3df2 drm/amdgpu: Fix 32bit x86 compilation warning
drivers/gpu/drm/amd/amdgpu/mmhub_v1_0.c:187:2: warning: right shift count >= width of type [enabled by default]
drivers/gpu/drm/amd/amdgpu/gfxhub_v1_0.c:173:2: warning: right shift count >= width of type [enabled by default]
drivers/gpu/drm/amd/amdgpu/vega10_ih.c:106:3: warning: right shift count >= width of type [enabled by default]

v2: Add a space between "&" and "0xff"

Reported by: kbuild-all@01.org

Signed-off-by: Alex Xie <AlexBin.Xie@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2017-03-30 15:16:00 -04:00
Ken Wang
282aae555e drm/amdgpu: add vega10 interrupt handler
Acked-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Ken Wang <Qingqing.Wang@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2017-03-29 23:54:46 -04:00