Commit Graph

68534 Commits

Author SHA1 Message Date
Josip Pavic
ece11e7b4a drm/amd/display: remove dc context from transfer function
[Why]
The ctx field of dc_transfer_func is not always populated and therefore
isn't reliable.

[How]
Remove dc context from dc_transfer_func

Signed-off-by: Josip Pavic <Josip.Pavic@amd.com>
Acked-by: Aurabindo Pillai <aurabindo.pillai@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2020-09-15 17:52:42 -04:00
Eryk Brol
c44a22b312 drm/amd/display: Add connector to the state if DSC debugfs is set
[why]
We want to trigger atomic check on connector, which DSC debugfs
properties have changed.

[how]
Add a helper function that iterates through all active connectors
and add them to the state if DSC debugfs parameters have changed.

Signed-off-by: Eryk Brol <eryk.brol@amd.com>
Signed-off-by: Mikita Lipski <mikita.lipski@amd.com>
Acked-by: Aurabindo Pillai <aurabindo.pillai@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2020-09-15 17:52:42 -04:00
Eryk Brol
28b2f656d3 drm/amd/display: Calculate DSC number of slices in debugfs when forced
[why]
When comparing current DSC timing settings with enforced through
debugfs we have to calculate number of both vertical and horisontal
slices. So instead of doing that every time we should just
use number of slices rather than setting its dimensions.

[how]
In connector's dsc preferred settings structure change slice height
and slice width parameters to number of slices vertical and horisontal.
Also calculate number of slices in debugfs rather in create_stream_for_sink.

Signed-off-by: Eryk Brol <eryk.brol@amd.com>
Signed-off-by: Mikita Lipski <mikita.lipski@amd.com>
Acked-by: Aurabindo Pillai <aurabindo.pillai@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2020-09-15 17:52:42 -04:00
Eryk Brol
918698d5c2 drm/amd/display: Return the number of bytes parsed than allocated
[why & how]
Previously we were returning the number of bytes allocated
for a write buffer from debugfs and when manually used it wouldn't
rise any errors, but it wouldn't match the size of the parameters
passed from userspace.

In successful case return the size passed by usermode otherwise
the error code is returned. That simplifies the parser helper
and removes a potential error of returning mismatched input size.

Signed-off-by: Eryk Brol <eryk.brol@amd.com>
Signed-off-by: Mikita Lipski <mikita.lipski@amd.com>
Acked-by: Aurabindo Pillai <aurabindo.pillai@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2020-09-15 17:52:42 -04:00
Joshua Aberback
4b675aad96 drm/amd/display: Update idle optimization handling
[How]
 - use dc interface instead of hwss interface in cursor functions, to keep
dc->idle_optimizations_allowed updated
 - add dc interface to check if idle optimizations might apply to a plane

Signed-off-by: Joshua Aberback <joshua.aberback@amd.com>
Acked-by: Aurabindo Pillai <aurabindo.pillai@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2020-09-15 17:52:41 -04:00
Eric Yang
0825d9658b drm/amd/display: implement notify stream mask
[Why]
Send stream active state info to DMUB

[How]
Implement GPINT to notify stream mask

Signed-off-by: Eric Yang <Eric.Yang2@amd.com>
Acked-by: Aurabindo Pillai <aurabindo.pillai@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2020-09-15 17:52:41 -04:00
Aric Cyr
a4832640e2 drm/amd/display: 3.2.102
Signed-off-by: Aric Cyr <aric.cyr@amd.com>
Acked-by: Aurabindo Pillai <aurabindo.pillai@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2020-09-15 17:52:41 -04:00
Anthony Koo
8b3f6b9857 drm/amd/display: [FW Promotion] Release 0.0.32
| [Header Changes]
|       - Add debug flag to log line numbers for PSR debug

Signed-off-by: Anthony Koo <Anthony.Koo@amd.com>
Acked-by: Aurabindo Pillai <aurabindo.pillai@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2020-09-15 17:52:41 -04:00
Martin Leung
6b85151f6b drm/amd/display: adding pathway to retrieve stutter period
why:
some functions may need be dependent on stutter period in the future

how:
Extract from stutter calculations and place into perf_params structure

Signed-off-by: Martin Leung <martin.leung@amd.com>
Acked-by: Aurabindo Pillai <aurabindo.pillai@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2020-09-15 17:52:41 -04:00
Eryk Brol
6b29bb3737 drm/amd/display: Add trigger connector unplug
[why]
We need a virtual tool that would emulate a physical
connector unplug to usermode, while connector is
still physically plugged in.

[how]
Added a new option to debugfs entry "trigger_hotplug".
It emulates hotplug irq handling scenario by clearing
DC and DM connector states.
It can be triggered with the following command:

	echo 0 > /sys/kernel/debug/dri/0/DP-X/trigger_hotplug

Signed-off-by: Eryk Brol <eryk.brol@amd.com>
Signed-off-by: Mikita Lipski <mikita.lipski@amd.com>
Acked-by: Aurabindo Pillai <aurabindo.pillai@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2020-09-15 17:52:41 -04:00
Eryk Brol
0749ddeb7d drm/amd/display: Add DSC force disable to dsc_clock_en debugfs entry
[why]
For debug purposes we want not to enable DSC on certain connectors
even if algorithm deesires to do so, instead it should enable DSC
on other capable connectors or fail the atomic check.

[how]
Adding the third option to connector's debugfs entry dsc_clock_en.

Accepted inputs:
     0x0 - connector is using default DSC enablement policy
     0x1 - force enable DSC on the connector, if it supports DSC
     0x2 - force disable DSC on the connector, if DSC is supported

Ex. # echo 0x2 > /sys/kernel/debug/dri/0/DP-1/dsc_clock_en

Signed-off-by: Eryk Brol <eryk.brol@amd.com>
Signed-off-by: Mikita Lipski <mikita.lipski@amd.com>
Acked-by: Aurabindo Pillai <aurabindo.pillai@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2020-09-15 17:52:41 -04:00
Dmytro Laktyushkin
20cc44c9e8 drm/amd/display: make dcn20 stream_gating use a pointer for dsc_pg_control
This allows us to reuse these on different asics.

Signed-off-by: Dmytro Laktyushkin <Dmytro.Laktyushkin@amd.com>
Acked-by: Aurabindo Pillai <aurabindo.pillai@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2020-09-15 17:52:41 -04:00
Aric Cyr
e4863f118a drm/amd/display: Multi display cause system lag on mode change
[Why]
DCValidator is created/destroyed repeatedly for cofunctional validation
which causes a lot of memory thrashing, particularly when Driver Verifer
is enabled.

[How]
Implement a basic caching algorithm that will cache DCValidator with a
matching topology.  When a match is found, the DCValidator can be
reused.  If there is no match, a new one will be created and inserted
into the cache if there is space or an unreference entry can be evicted.

Signed-off-by: Aric Cyr <aric.cyr@amd.com>
Acked-by: Aurabindo Pillai <aurabindo.pillai@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2020-09-15 17:52:41 -04:00
Zhan Liu
48e48e5984 drm/amd/display: Disable idle optimization when PSR is enabled
[Why]
Idle optimization and PSR conflict each other. If both enabled
at the same time, display flickering will be observed.

[How]
Disable idle optimization when PSR is enabled.

Signed-off-by: Zhan Liu <zhan.liu@amd.com>
Acked-by: Aurabindo Pillai <aurabindo.pillai@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2020-09-15 17:52:41 -04:00
Bhawanpreet Lakha
bd80d584cd drm/amd/display: Don't use DRM_ERROR() for DTM add topology
[Why]
Previously we were only calling add_topology when hdcp was being enabled.
Now we call add_topology by default so the ERROR messages are printed if
the firmware is not loaded.

This error message is not relevant for normal display functionality so
no need to print a ERROR message.

[How]
Change DRM_ERROR to DRM_INFO

Signed-off-by: Bhawanpreet Lakha <Bhawanpreet.Lakha@amd.com>
Acked-by: Aurabindo Pillai <aurabindo.pillai@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2020-09-15 17:52:41 -04:00
Alvin Lee
6cd7923724 drm/amd/display: Compare plane when looking for pipe split being lost
[Why]
There are situations where we go from 2 pipe to 1 pipe in MPO, but this
isn't a pipe split being lost -- it's a plane disappearing in (i.e. video overlay
goes away) so we lose one pipe. In these situations we don't want to
disable the pipe in a separate operation from the rest of the pipe
programming sequence. We only want to disable a pipe in a
separate operation when we're actually disabling pipe split.

[How]
Make sure the pipe being lost has the same stream AND plane
as the old top pipe to ensure.

Signed-off-by: Alvin Lee <alvin.lee2@amd.com>
Acked-by: Aurabindo Pillai <aurabindo.pillai@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2020-09-15 17:52:41 -04:00
Zhan Liu
5fd35f1291 drm/amd/display: Enabling PSR on DCN30 on driver side
[Why]
PSR needs to be enabled on DCN30. This is the driver part of PSR
enablement.

Also disabled retired DMCU on driver side, since DMCU is
not supported on DCN30 anymore.

[How]
Add necessary changes to enable PSR on DCN30.

Signed-off-by: Zhan Liu <zhan.liu@amd.com>
Acked-by: Aurabindo Pillai <aurabindo.pillai@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2020-09-15 17:52:41 -04:00
Roman Li
f9663cbd46 drm/amd/display: remove early return from dm_late_init
[Why]
ABM feature initialization was not executed due to early return.

dm_late_init() had an early return in case if DMCU is not used.
With the implementation of ABM on DMUB, DMCU can be disabled
but ABM still needs to be initialized.

[How]
Remove verification for DMCU from the top of the function.
The existing logic will handle the case when DMCU is not used.

Signed-off-by: Roman Li <roman.li@amd.com>
Acked-by: Aurabindo Pillai <aurabindo.pillai@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2020-09-15 17:52:41 -04:00
George Shen
6c95320d01 drm/amd/display: Rename set_mst_bandwidth to align with DP spec
[Why]
The function set_mst_bandwidth is poorly name since it isn't clear what
it does, and it also does not reflect any part of the allocation sequence
described in the DP spec.

[How]
Rename the function set_mst_bandwidth to set_throttled_vcp_size.

Signed-off-by: George Shen <george.shen@amd.com>
Acked-by: Aurabindo Pillai <aurabindo.pillai@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2020-09-15 17:52:41 -04:00
Aric Cyr
e8cb7a4dd9 drm/amd/display: Flip pending check timeout due to disabled hubp
[Why]
When pipe locks are being taken we wait for flip pending to clear first.
In some cases the pipe mapping is changed and the pending we're checking
for will never clear.

[How]
Don't check disabled pipes for flip pending.

Signed-off-by: Aric Cyr <aric.cyr@amd.com>
Acked-by: Aurabindo Pillai <aurabindo.pillai@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2020-09-15 17:52:41 -04:00
Aidan Gratton
123ecf6836 drm/amd/display: Increase Max EDID Size Constant
[HOW & WHY]
Change max EDID size constant to 1280 to support
10-block EDIDs.

Signed-off-by: Aidan Gratton <Aidan.Gratton@amd.com>
Acked-by: Aurabindo Pillai <aurabindo.pillai@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2020-09-15 17:52:41 -04:00
Ashley Thomas
172c9b7781 drm/amd/display: Power eDP panel back ON before link training retry
[why]
When link training failures occur for eDP, dp_disable_link_phy
is called which powers OFF eDP panel. After link training retry
delay, the next retry begins by calling dp_enable_link_phy
which does not issue a correspnding eDP panel power ON, leaving
panel powered OFF which leads to display OFF/dark.

[how]
Power ON eDP before next link training retry.

Signed-off-by: Ashley Thomas <Ashley.Thomas2@amd.com>
Acked-by: Aurabindo Pillai <aurabindo.pillai@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2020-09-15 17:52:40 -04:00
Harmanprit Tatla
5cd04c4846 drm/amd/display: Fix CP_IRQ clear bit and logic
[Why]
Currently clearing the wrong bit for CP_IRQ, and logic on when to
clear needs to be fixed.

[How]
Corrected bit to clear and improved logic for decision to clear.

Signed-off-by: Harmanprit Tatla <harmanprit.tatla@amd.com>
Acked-by: Aurabindo Pillai <aurabindo.pillai@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2020-09-15 17:52:40 -04:00
Wesley Chalmers
05e3d830fa drm/amd/display: Only use offset for first ODM pipe
[WHY]
Only the first pipe in ODM combine group should have nonzero recout
offset. All other pipes should have recout offset 0;
otherwise there will be gaps in the image.

[HOW]
Set recout.x to 0 if the pipe is not the leftmost ODM pipe.

When computing viewports, calculate the horizontal offset of a pipe's src
based on the current pipe's position in the ODM group, plus whatever offset the
leftmost ODM pipe has; otherwise there will be discontinuity in the image.

Since ODM combine can only combine pipes horizontally, nothing needs to
be done for recout.y.

Signed-off-by: Wesley Chalmers <Wesley.Chalmers@amd.com>
Acked-by: Aurabindo Pillai <aurabindo.pillai@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2020-09-15 17:52:40 -04:00
Wenjing Liu
3fb068c3ec drm/amd/display: always use 100us for cr aux rd interval
[why]
The cr training aux rd interval is
modified without following specs requirements.
According to the commit message the change was not intended to modify the value.
Therefore it looks like it is caused by a typo in the change.

Signed-off-by: Wenjing Liu <wenjing.liu@amd.com>
Acked-by: Aurabindo Pillai <aurabindo.pillai@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2020-09-15 17:52:40 -04:00
Aric Cyr
64fbb86d6b drm/amd/display: 3.2.101
Signed-off-by: Aric Cyr <aric.cyr@amd.com>
Acked-by: Aurabindo Pillai <aurabindo.pillai@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2020-09-15 17:52:40 -04:00
Anthony Koo
81ac89cab0 drm/amd/display: [FW Promotion] Release 0.0.31
Signed-off-by: Anthony Koo <Anthony.Koo@amd.com>
Acked-by: Aurabindo Pillai <aurabindo.pillai@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2020-09-15 17:52:40 -04:00
Rodrigo Siqueira
4b4f21ff7f drm/amd/display: Check clock table return
During the load processes for Renoir, our display code needs to retrieve
the SMU clock and voltage table, however, this operation can fail which
means that we have to check this scenario. Currently, we are not
handling this case properly and as a result, we have seen the following
dmesg log during the boot:

RIP: 0010:rn_clk_mgr_construct+0x129/0x3d0 [amdgpu]
...
Call Trace:
 dc_clk_mgr_create+0x16a/0x1b0 [amdgpu]
 dc_create+0x231/0x760 [amdgpu]

This commit fixes this issue by checking the return status retrieved
from the clock table before try to populate any bandwidth.

Signed-off-by: Rodrigo Siqueira <Rodrigo.Siqueira@amd.com>
Acked-by: Aurabindo Pillai <aurabindo.pillai@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2020-09-15 17:52:40 -04:00
Aric Cyr
091018a51c drm/amd/display: Triplebuffering should not be used by default
Disable triplebuffering by default.

Signed-off-by: Aric Cyr <aric.cyr@amd.com>
Acked-by: Aurabindo Pillai <aurabindo.pillai@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2020-09-15 17:52:40 -04:00
Wenjing Liu
ce17ce17af drm/amd/display: add option to override cr training pattern
Signed-off-by: Wenjing Liu <wenjing.liu@amd.com>
Acked-by: Aurabindo Pillai <aurabindo.pillai@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2020-09-15 17:52:40 -04:00
Joshua Aberback
0b02e1fda5 drm/amd/display: Compare mpcc_inst to mpcc_count instead of a constant
[Why]
This assert triggers a false negative because there are more than 4 MPCCs
on many asics.

[How]
 - change assert comparisson
 - remove unused variable

Signed-off-by: Joshua Aberback <joshua.aberback@amd.com>
Acked-by: Aurabindo Pillai <aurabindo.pillai@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2020-09-15 17:52:40 -04:00
Harmanprit Tatla
958000cb24 drm/amd/display: Add CP_IRQ clear capability
[Why]
Currently we do not clear the CP_IRQ bit upon receiving it.

[How]
Added a function to clear CP_IRQ bit.

Signed-off-by: Harmanprit Tatla <harmanprit.tatla@amd.com>
Acked-by: Aurabindo Pillai <aurabindo.pillai@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2020-09-15 17:52:40 -04:00
JinZe.Xu
498563cf9c drm/amd/display: Detect plane change when detect pipe change.
[Why]
If plane has changed, dcn20_detect_pipe_changes doesn't update dc_plane_state->update_flags, and the following dcn20_program_pipe can't reprogram hubp correctly.

[How]
Add a new flags bit "plane_changed" in pipe_ctx->update_flags.If old plane isn’t identical to new plane, this bit will be set and guide “dcn20_program_pipe” to programing HUBP correctly.

Signed-off-by: JinZe.Xu <JinZe.Xu@amd.com>
Acked-by: Aurabindo Pillai <aurabindo.pillai@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2020-09-15 17:52:40 -04:00
Naveed Ashfaq
a861736dae drm/amd/display: Fixed Intermittent blue screen on OLED panel
[why]
Changing to smaller modes on OLED panel caused a blue screen crash
as driver reported dram change during vactive when it shouldn't

[how]
Added an extra condition to prevent incorrect dram change timing

Signed-off-by: Naveed Ashfaq <Naveed.Ashfaq@amd.com>
Acked-by: Aurabindo Pillai <aurabindo.pillai@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2020-09-15 17:52:40 -04:00
Aric Cyr
14ae69026f drm/amd/display: 3.2.100
Signed-off-by: Aric Cyr <aric.cyr@amd.com>
Acked-by: Aurabindo Pillai <aurabindo.pillai@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2020-09-15 17:52:40 -04:00
Anthony Koo
8e8e9463a8 drm/amd/display: [FW Promotion] Release 0.0.30
Signed-off-by: Anthony Koo <Anthony.Koo@amd.com>
Acked-by: Aurabindo Pillai <aurabindo.pillai@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2020-09-15 17:52:40 -04:00
Jiansong Chen
39767222bf drm/amd/pm: support runtime pptable update for sienna_cichlid etc.
This avoids smu issue when enabling runtime pptable update for
sienna_cichlid and so on. Runtime pptable udpate is needed for test
and debug purpose.

Signed-off-by: Jiansong Chen <Jiansong.Chen@amd.com>
Reviewed-by: Kenneth Feng <kenneth.feng@amd.com>
Reviewed-by: Evan Quan <evan.quan@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2020-09-15 17:52:40 -04:00
Flora Cui
26652cd8de drm/amdgpu: drop BOOLEAN define in display part
use bool directly

Signed-off-by: Flora Cui <flora.cui@amd.com>
Reviewed-by: Guchun Chen <guchun.chen@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2020-09-15 17:52:40 -04:00
Mukul Joshi
62f6b1162e drm/amdgpu: Enable SDMA utilization for Arcturus
SDMA utilization calculations are enabled/disabled by
writing to SDMAx_PUB_DUMMY_REG2 register. Currently,
enable this only for Arcturus.

Signed-off-by: Mukul Joshi <mukul.joshi@amd.com>
Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2020-09-15 17:52:39 -04:00
Aurabindo Pillai
6d90a208cf drm/amd/display: Move disable interrupt into commit tail
[Why&How]
Since there is no need for accessing crtc state in the interrupt
handler, interrupts need not be disabled well in advance, and
can be moved to commit_tail where it should be.

Reviewed-by: Nicholas Kazlauskas <nicholas.kazlauskas@amd.com>
Signed-off-by: Aurabindo Pillai <aurabindo.pillai@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2020-09-15 17:52:39 -04:00
Aurabindo Pillai
585d450c76 drm/amd/display: Refactor to prevent crtc state access in DM IRQ handler
[Why&How]
Currently commit_tail holds global locks and wait for dependencies which is
against the DRM API contracts. Inorder to fix this, IRQ handler should be able
to run without having to access crtc state. Required parameters are copied over
so that they can be directly accessed from the interrupt handler

Reviewed-by: Nicholas Kazlauskas <nicholas.kazlauskas@amd.com>
Signed-off-by: Aurabindo Pillai <aurabindo.pillai@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2020-09-15 17:52:39 -04:00
Aurabindo Pillai
5d1c59c479 drm/amdgpu: Move existing pflip fields into separate struct
[Why&How]
To refactor DM IRQ management, all fields used by IRQ is best moved
to a separate struct so that main amdgpu_crtc struct need not be changed
Location of the new struct shall be in DM

Reviewed-by: Nicholas Kazlauskas <nicholas.kazlauskas@amd.com>
Signed-off-by: Aurabindo Pillai <aurabindo.pillai@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2020-09-15 17:52:39 -04:00
John Clements
9c7e2ceb1d drm/amdgpu: Update RAS init handling
Output RAS init status

If RAS init fails, teardown RAS context

Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com>
Signed-off-by: John Clements <john.clements@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2020-09-15 17:52:39 -04:00
YueHaibing
2b3bbf2354 drm/amdkfd: Fix -Wunused-const-variable warning
If KFD_SUPPORT_IOMMU_V2 is not set, gcc warns:

drivers/gpu/drm/amd/amdgpu/../amdkfd/kfd_device.c:121:37: warning: ‘raven_device_info’ defined but not used [-Wunused-const-variable=]
 static const struct kfd_device_info raven_device_info = {
                                     ^~~~~~~~~~~~~~~~~

As Huang Rui suggested, Raven already has the fallback path,
so it should be out of IOMMU v2 flag.

Suggested-by: Huang Rui <ray.huang@amd.com>
Signed-off-by: YueHaibing <yuehaibing@huawei.com>
Acked-by: Huang Rui <ray.huang@amd.com>
Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com>
Signed-off-by: Felix Kuehling <Felix.Kuehling@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2020-09-15 17:52:39 -04:00
Changfeng
f399d4de2d drm/amdgpu: add ta DTM/HDCP print in amdgpu_firmware_info for apu
It needs to add ta DTM/HDCP print to get HDCP/DTM version info when cat
amdgpu_firmware_info

Signed-off-by: Changfeng <Changfeng.Zhu@amd.com>
Reviewed-by: Huang Rui <ray.huang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2020-09-15 17:52:39 -04:00
Jiansong Chen
9c1615be19 drm/amd/pm: update driver if version for navy_flounder
It's in accordance with pmfw 65.8.0 for navy_flounder.

Signed-off-by: Jiansong Chen <Jiansong.Chen@amd.com>
Reviewed-by: Tao Zhou <tao.zhou1@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2020-09-15 17:52:39 -04:00
Likun Gao
0e4b291bb7 drm/amd/pm: update driver if file for sienna cichlid
Update drive if file for sienna_cichlid.

Signed-off-by: Likun Gao <Likun.Gao@amd.com>
Reviewed-by: Jiansong Chen <Jiansong.Chen@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2020-09-15 17:52:39 -04:00
Andrey Grodzovsky
7cbbc745dc drm/amdgpu: Minor checkpatch fix
Signed-off-by: Andrey Grodzovsky <andrey.grodzovsky@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2020-09-15 17:25:29 -04:00
Andrey Grodzovsky
6894305c97 drm/amdgpu: Disable DPC for XGMI for now.
XGMI support is more complicated than single device support as
questions of synchronization between the device recovering from
PCI error and other members of the hive are required.
Leaving this for next round.

Signed-off-by: Andrey Grodzovsky <andrey.grodzovsky@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2020-09-15 17:25:22 -04:00
Andrey Grodzovsky
7ac71382e9 drm/amdgpu: Trim amdgpu_pci_slot_reset by reusing code.
Reuse exsisting functions from GPU recovery to avoid code
duplications.

Signed-off-by: Andrey Grodzovsky <andrey.grodzovsky@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2020-09-15 17:25:16 -04:00
Andrey Grodzovsky
c1dd4aa624 drm/amdgpu: Fix consecutive DPC recovery failures.
Cache the PCI state on boot and before each case where we might
loose it.

v2: Add pci_restore_state while caching the PCI state to avoid
breaking PCI core logic for stuff like suspend/resume.

v3: Extract pci_restore_state from amdgpu_device_cache_pci_state
to avoid superflous restores during GPU resets and suspend/resumes.

v4: Style fixes.

Signed-off-by: Andrey Grodzovsky <andrey.grodzovsky@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2020-09-15 17:25:04 -04:00
Andrey Grodzovsky
362c7b91c1 drm/amdgpu: Fix SMU error failure
Wait for HW/PSP initiated ASIC reset to complete before
starting the recovery operations.

v2: Remove typo

Signed-off-by: Andrey Grodzovsky <andrey.grodzovsky@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2020-09-15 17:24:55 -04:00
Andrey Grodzovsky
acd89fca67 drm/amdgpu: Block all job scheduling activity during DPC recovery
DPC recovery involves ASIC reset just as normal GPU recovery so block
SW GPU schedulers and wait on all concurrent GPU resets.

Signed-off-by: Andrey Grodzovsky <andrey.grodzovsky@amd.com>
Acked-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2020-09-15 17:24:48 -04:00
Andrey Grodzovsky
bf36b52e78 drm/amdgpu: Avoid accessing HW when suspending SW state
At this point the ASIC is already post reset by the HW/PSP
so the HW not in proper state to be configured for suspension,
some blocks might be even gated and so best is to avoid touching it.

v2: Rename in_dpc to more meaningful name

Signed-off-by: Andrey Grodzovsky <andrey.grodzovsky@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2020-09-15 17:24:39 -04:00
Andrey Grodzovsky
c9a6b82f45 drm/amdgpu: Implement DPC recovery
Add PCI Downstream Port Containment (DPC) with
basic recovery functionality

v2: remove pci_save_state to avoid breaking suspend/resume
v3: Fix style comments
v4: Improve description.

Signed-off-by: Andrey Grodzovsky <andrey.grodzovsky@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2020-09-15 17:24:32 -04:00
Liu ChengZhe
2a9787dcf5 drm/amdgpu: Do gpu recovery when no job is running
In function flr_work, we should do gpu recovery when no job
is running. Fix the logic by inverting it.

v2: modify the description

Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Liu ChengZhe <ChengZhe.Liu@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2020-09-15 17:24:18 -04:00
Dennis Li
edb084f487 drm/amdkfd: fix a memory leak issue
In the resume stage of GPU recovery, start_cpsch will call pm_init
which set pm->allocated as false, cause the next pm_release_ib has
no chance to release ib memory.

Add pm_release_ib in stop_cpsch which will be called in the suspend
stage of GPU recovery.

Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com>
Signed-off-by: Dennis Li <Dennis.Li@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2020-09-15 17:24:06 -04:00
Dennis Li
a9a83a92d0 drm/kfd: fix a system crash issue during GPU recovery
The crash log as the below:

[Thu Aug 20 23:18:14 2020] general protection fault: 0000 [#1] SMP NOPTI
[Thu Aug 20 23:18:14 2020] CPU: 152 PID: 1837 Comm: kworker/152:1 Tainted: G           OE     5.4.0-42-generic #46~18.04.1-Ubuntu
[Thu Aug 20 23:18:14 2020] Hardware name: GIGABYTE G482-Z53-YF/MZ52-G40-00, BIOS R12 05/13/2020
[Thu Aug 20 23:18:14 2020] Workqueue: events amdgpu_ras_do_recovery [amdgpu]
[Thu Aug 20 23:18:14 2020] RIP: 0010:evict_process_queues_cpsch+0xc9/0x130 [amdgpu]
[Thu Aug 20 23:18:14 2020] Code: 49 8d 4d 10 48 39 c8 75 21 eb 44 83 fa 03 74 36 80 78 72 00 74 0c 83 ab 68 01 00 00 01 41 c6 45 41 00 48 8b 00 48 39 c8 74 25 <80> 78 70 00 c6 40 6d 01 74 ee 8b 50 28 c6 40 70 00 83 ab 60 01 00
[Thu Aug 20 23:18:14 2020] RSP: 0018:ffffb29b52f6fc90 EFLAGS: 00010213
[Thu Aug 20 23:18:14 2020] RAX: 1c884edb0a118914 RBX: ffff8a0d45ff3c00 RCX: ffff8a2d83e41038
[Thu Aug 20 23:18:14 2020] RDX: 0000000000000000 RSI: 0000000000000082 RDI: ffff8a0e2e4178c0
[Thu Aug 20 23:18:14 2020] RBP: ffffb29b52f6fcb0 R08: 0000000000001b64 R09: 0000000000000004
[Thu Aug 20 23:18:14 2020] R10: ffffb29b52f6fb78 R11: 0000000000000001 R12: ffff8a0d45ff3d28
[Thu Aug 20 23:18:14 2020] R13: ffff8a2d83e41028 R14: 0000000000000000 R15: 0000000000000000
[Thu Aug 20 23:18:14 2020] FS:  0000000000000000(0000) GS:ffff8a0e2e400000(0000) knlGS:0000000000000000
[Thu Aug 20 23:18:14 2020] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[Thu Aug 20 23:18:14 2020] CR2: 000055c783c0e6a8 CR3: 00000034a1284000 CR4: 0000000000340ee0
[Thu Aug 20 23:18:14 2020] Call Trace:
[Thu Aug 20 23:18:14 2020]  kfd_process_evict_queues+0x43/0xd0 [amdgpu]
[Thu Aug 20 23:18:14 2020]  kfd_suspend_all_processes+0x60/0xf0 [amdgpu]
[Thu Aug 20 23:18:14 2020]  kgd2kfd_suspend.part.7+0x43/0x50 [amdgpu]
[Thu Aug 20 23:18:14 2020]  kgd2kfd_pre_reset+0x46/0x60 [amdgpu]
[Thu Aug 20 23:18:14 2020]  amdgpu_amdkfd_pre_reset+0x1a/0x20 [amdgpu]
[Thu Aug 20 23:18:14 2020]  amdgpu_device_gpu_recover+0x377/0xf90 [amdgpu]
[Thu Aug 20 23:18:14 2020]  ? amdgpu_ras_error_query+0x1b8/0x2a0 [amdgpu]
[Thu Aug 20 23:18:14 2020]  amdgpu_ras_do_recovery+0x159/0x190 [amdgpu]
[Thu Aug 20 23:18:14 2020]  process_one_work+0x20f/0x400
[Thu Aug 20 23:18:14 2020]  worker_thread+0x34/0x410

When GPU hang, user process will fail to create a compute queue whose
struct object will be freed later, but driver wrongly add this queue to
queue list of the proccess. And then kfd_process_evict_queues will
access a freed memory, which cause a system crash.

v2:
The failure to execute_queues should probably not be reported to
the caller of create_queue, because the queue was already created.
Therefore change to ignore the return value from execute_queues.

Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com>
Signed-off-by: Dennis Li <Dennis.Li@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2020-09-15 17:23:18 -04:00
Daniel Vetter
818280d5ad Linux 5.9-rc5
-----BEGIN PGP SIGNATURE-----
 
 iQFSBAABCAA8FiEEq68RxlopcLEwq+PEeb4+QwBBGIYFAl9epdgeHHRvcnZhbGRz
 QGxpbnV4LWZvdW5kYXRpb24ub3JnAAoJEHm+PkMAQRiG9IMH/jHCRSbcsIXHuQHn
 xcRLlhrDHfXoBza7auHfPWx2+9DZsmaSJs/SEiTGNag0Bi7jBcWcwBpsep7iVG/+
 WiftD5uOMhZigyuvfMFrt0mjr2Kr3wg5p58lwMBeBdm8iL5uKV8ehKsh05/Fral2
 6hu3jP8L0PCZMpF+sZ7s2jlhfVUMmjA8VzXZCvgQtmhoraHiF3mzfkcSMxnHwBPO
 HLo+TDDm49u+LbVsJT7+cSTiWxuUJCbix9Q4PCTx/BGg4ezYsjc6v0BnYRaYtrrA
 1uYiT6PVBEUkYYBHKQlD3N2KnUmbKx7dGUF4t+peTg5/JiocAJMNi1N9Qzvv7N6Q
 CqTiuio=
 =q+kJ
 -----END PGP SIGNATURE-----

Merge v5.9-rc5 into drm-next

Paul needs 1a21e5b930 ("drm/ingenic: Fix leak of device_node
pointer") and 3b5b005ef7 ("drm/ingenic: Fix driver not probing when
IPU port is missing") from -fixes to be able to merge further ingenic
patches into -next.

Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
2020-09-14 17:19:11 +02:00
Dave Airlie
7f7a47952c drm-misc-fixes for v5.9-rc5:
- Fix double free in virtio.
 - Add missing put_device in sun4i, and other fixes.
 - Small ingenic fixes.
 - Handle sun4i alpha on lowest plane correctly.
 - Remove output->enabled from virtio, as it should use crtc_state.
 - Fix tve200 enable/disable.
 - Documentation fix.
 - Fix virtio unblank.
 -----BEGIN PGP SIGNATURE-----
 
 iQIzBAABCgAdFiEEuXvWqAysSYEJGuVH/lWMcqZwE8MFAl9Y6koACgkQ/lWMcqZw
 E8P5CxAAvH5B60s23LQdi7JWgWY4r560jneCtBtnesOKMan3ckVHZSX1XEsvvodQ
 Ihvm8EriajcYDTOSCgXg5CitPpFDD3WaxUPWZ5VNiLnQInncbP0+Scz147/bVyHo
 1XoPyLqqjCrUcHEQSxYGZfFtJPEpm4CoRgqHDQk0QAdYtopgLf1ff6Z2gYGUC7kD
 8nIOVakRdDlRlhQYIy9AzznUTPJvN4vm0cLmJS4xDlVpVJcLI7X9qiMEF1OsZvhw
 x6sH+3uttbQRzMJqARyANbMc0VD3osqwzTnHHehu6xGI03MLZW2NC6+LpQEe2zK1
 AST/H8si7OpemXCyztevkqeDXj6jOBNgnIlr5gv2rUSK5Iva7ZKj4P/2xxr6qm4E
 7Ip1Qa1WXMMJJiFGddcYZ2fjEpWUPGAmigtZtwVebugmqSgWigphengqhDZb3gMM
 IsVABfbAM0DZcV8iVeHYDq5OJmdOD+/qzAKOayaj4Sgq5G7LokkkIuYhtU7i28rQ
 Bl8vCR0VprxdyfKweS98PLBpbgo4oSh20zRTqvLtiNbVWZoVhTdladd9hlL2IChg
 cx5bOf7fkboX1hTy7vdU16BIuGzEFc38XgexXDYJuvvy/T+GvYDUtw4HL1vJwT79
 /Ef6psjt6EqVE4JfTYvwv54Hbz4PrvLRUngZK/vzkzefDygw9Sc=
 =pqMB
 -----END PGP SIGNATURE-----

Merge tag 'drm-misc-fixes-2020-09-09' of git://anongit.freedesktop.org/drm/drm-misc into drm-fixes

drm-misc-fixes for v5.9-rc5:
- Fix double free in virtio.
- Add missing put_device in sun4i, and other fixes.
- Small ingenic fixes.
- Handle sun4i alpha on lowest plane correctly.
- Remove output->enabled from virtio, as it should use crtc_state.
- Fix tve200 enable/disable.
- Documentation fix.
- Fix virtio unblank.

Signed-off-by: Dave Airlie <airlied@redhat.com>

From: Maarten Lankhorst <maarten.lankhorst@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/478b49d1-b1b3-c983-7056-8a89249be435@mblankhorst.nl
2020-09-11 09:49:23 +10:00
Dave Airlie
7bf23bfb0d Merge tag 'drm-intel-fixes-2020-09-10' of git://anongit.freedesktop.org/drm/drm-intel into drm-fixes
drm/i915 fixes for v5.9-rc5:
- Fix regression leading to audio probe failure

Signed-off-by: Dave Airlie <airlied@redhat.com>
From: Jani Nikula <jani.nikula@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/875z8m2hss.fsf@intel.com
2020-09-11 09:45:51 +10:00
Maarten Lankhorst
166774a2c2 drm/i915: Fix slightly botched merge in __reloc_entry_gpu
This function should be an int, not a bool.

Presumably because we had the same 2 reverts in a slightly different
way, git got confused.

Thanks to Dan for reporting. :)

The conflict is between the 3 reverts in drm-fixes:

4993a8a378 ("Revert "drm/i915: Remove i915_gem_object_get_dirty_page()"")
ad5d95e4d5 ("Revert "drm/i915/gem: Async GPU relocations only"")
20561da3a2 ("Revert "drm/i915/gem: Delete unused code"")

And the slightly different combined revert in drm-intel-gt-next, but
with the same goal:

102a0a9051 ("Revert "drm/i915/gem: Async GPU relocations only"")

In the merge commit 1f4b2aca79 ("Merge tag
'drm-intel-gt-next-2020-09-07' of git://anongit.freedesktop.org/drm/drm-intel into drm-next") things
went wrong, but the merge commit view now doesn't show any conflict
anymore (as git tends to do when the resolution picks one or the other
branch).

The need to handle other than just true/false error codes in
__reloc_entry_gpu was added in the dma_resv locking changes in
c43ce12328 ("drm/i915: Use per object locking in execbuf, v12.")

Signed-off-by: Maarten Lankhorst <maarten.lankhorst@linux.intel.com>
Reported-by: Dan Carpenter <dan.carpenter@oracle.com>
Cc: Dave Airlie <airlied@redhat.com>
[danvet: Explain this entire saga a lot better, adding tons of commit
references. Also note that this was merged before full intel-gfx-CI
results, only after BAT, since the breakage at the BAT run is already
severe enough to block all pre-merge testing.]
Fixes: 1f4b2aca79 ("Merge tag 'drm-intel-gt-next-2020-09-07' of git://anongit.freedesktop.org/drm/drm-intel into drm-next")
Acked-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Link: https://patchwork.freedesktop.org/patch/msgid/20200910111225.2184193-1-maarten.lankhorst@linux.intel.com
2020-09-10 15:19:10 +02:00
Dave Airlie
877d8c0743 Merge tag 'topic/nouveau-i915-dp-helpers-and-cleanup-2020-08-31-1' of git://anongit.freedesktop.org/drm/drm-misc into drm-next
UAPI Changes:

None

Cross-subsystem Changes:

* Moves a bunch of miscellaneous DP code from the i915 driver into a set
  of shared DRM DP helpers

Core Changes:

* New DRM DP helpers (see above)

Driver Changes:

* Implements usage of the aforementioned DP helpers in the nouveau
  driver, along with some other various HPD related cleanup for nouveau

Signed-off-by: Dave Airlie <airlied@redhat.com>
From: Lyude Paul <lyude@redhat.com>
Link: https://patchwork.freedesktop.org/patch/msgid/11e59ebdea7ee4f46803a21fe9b21443d2b9c401.camel@redhat.com
2020-09-09 12:27:13 +10:00
Dave Airlie
f9c88aa50b Kconfig fixes for DRM_ZYNQMP_DPSUB DMA engine dependency
-----BEGIN PGP SIGNATURE-----
 
 iQJWBAABCgBAFiEEvZRkio5H7O2/GZsYYiVdKZ4oCyQFAl9TwoAiHGxhdXJlbnQu
 cGluY2hhcnRAaWRlYXNvbmJvYXJkLmNvbQAKCRBiJV0pnigLJM1MEACQUyzkNg7u
 tGy3/EmklJ4IPgtckS/bHni9uCxYs9UEOvFg/sf2XsJQsqmgd5pntvlyhFqLcKxY
 c2kJPNNYXo4fB/qL2MK7iPyPXhvRXK25Fm2MbnKZe0YCzBzgXOqliLjL9aWeeMT5
 v2MDZByJDLc0y524HYAAiTPh/0hlOtin9kNIu8T10Pi2EuvwhqJ6l6IhZH3K1Dch
 lzz/4BMqRfU9hOKrSS+qUO0t9ri/N8WmNXAuj0yGQE5e+oXQfbNqLn8M3bIcmN76
 /dfliU4MNZQRfrSlXPr9sx26ynTGMKE/65PdYPsxQJ5qH3MrxZA37nJjPnSKhTKq
 8+GR/9uPMeI5hm8JE6zweDLOsGBO1FG0GRtkjawYaPl/4jmMof84YffFZwE8AcLX
 BvBfmTkgrSccrABqmUvV/gHG8Z7i/WdZ+XJ8cVuExYiO21lSq2LlYGm60+ej5hrJ
 w3Vcv5vaPSXNlBWpIiTlMx6mgEQHDdXMOqZt3W3TQCea6KzGWbSud+XX1CIA/Uo9
 ILvZd9W7N0kzPtECvtHRhA+RF/uGliQM5e/P7RlNEnucDa9duWfNhzXdWyAGuAAt
 o0NExhPFRs31rlV5yH1Qjnd71fD0CazBsWio9/DL9faBsxK8JNbk/TzZMUfZvupw
 U2Vxo6TH1RgoIigvE8OahoqcQ2AwEs9dIg==
 =lJxB
 -----END PGP SIGNATURE-----

Merge tag 'drm-xlnx-dpsub-fixes-20200905' of git://linuxtv.org/pinchartl/media into drm-fixes

Kconfig fixes for DRM_ZYNQMP_DPSUB DMA engine dependency

Signed-off-by: Dave Airlie <airlied@redhat.com>

From: Laurent Pinchart <laurent.pinchart@ideasonboard.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20200905172751.GC6319@pendragon.ideasonboard.com
2020-09-09 11:31:31 +10:00
Dave Airlie
1f4b2aca79 Merge tag 'drm-intel-gt-next-2020-09-07' of git://anongit.freedesktop.org/drm/drm-intel into drm-next
(Same content as drm-intel-gt-next-2020-09-04-3, S-o-b's added)

UAPI Changes:
(- Potential implicit changes from WW locking refactoring)

Cross-subsystem Changes:
(- WW locking changes should align the i915 locking more with others)

Driver Changes:

- MAJOR: Apply WW locking across the driver (Maarten)

- Reverts for 5 commits to make applying WW locking faster (Maarten)
- Disable preparser around invalidations on Tigerlake for non-RCS engines (Chris)
- Add missing dma_fence_put() for error case of syncobj timeline (Chris)
- Parse command buffer earlier in eb_relocate(slow) to facilitate backoff (Maarten)
- Pin engine before pinning all objects (Maarten)
- Rework intel_context pinning to do everything outside of pin_mutex (Maarten)

- Avoid tracking GEM context until registered (Cc: stable, Chris)
- Provide a fastpath for waiting on vma bindings (Chris)
- Fixes to preempt-to-busy mechanism (Chris)
- Distinguish the virtual breadcrumbs from the irq breadcrumbs (Chris)
- Switch to object allocations for page directories (Chris)
- Hold context/request reference while breadcrumbs are active (Chris)
- Make sure execbuffer always passes ww state to i915_vma_pin (Maarten)

- Code refactoring to facilitate use of WW locking (Maarten)
- Locking refactoring to use more granular locking (Maarten, Chris)
- Support for multiple pinned timelines per engine (Chris)
- Move complication of I915_GEM_THROTTLE to the ioctl from general code (Chris)
- Make active tracking/vma page-directory stash work preallocated (Chris)
- Avoid flushing submission tasklet too often (Chris)
- Reduce context termination list iteration guard to RCU (Chris)
- Reductions to locking contention (Chris)
- Fixes for issues found by CI (Chris)

Signed-off-by: Dave Airlie <airlied@redhat.com>

From: Joonas Lahtinen <jlahtine@jlahtine-mobl.ger.corp.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20200907130039.GA27766@jlahtine-mobl.ger.corp.intel.com
2020-09-09 07:55:22 +10:00
Dave Airlie
61d98185b4 Backmerge drm-fixes merge into drm-next
Commit '6f6a73c8b715d595977774d48450a734297ab21f' from Linus' tree

The fixes reverts cause a bit of a conflict pain with intel next,
start fixing it up here.

Signed-off-by: Dave Airlie <airlied@redhat.com>
2020-09-09 07:46:32 +10:00
Kai Vehmanen
0c4c801b31 drm/i915: fix regression leading to display audio probe failure on GLK
In commit 4f0b4352bd ("drm/i915: Extract cdclk requirements checking
to separate function") the order of force_min_cdclk_changed check and
intel_modeset_checks(), was reversed. This broke the mechanism to
immediately force a new CDCLK minimum, and lead to driver probe
errors for display audio on GLK platform with 5.9-rc1 kernel. Fix
the issue by moving intel_modeset_checks() call later.

[vsyrjala: It also broke the ability of planes to bump up the cdclk
and thus could lead to underruns when eg. flipping from 32bpp to
64bpp framebuffer. To be clear, we still compute the new cdclk
correctly but fail to actually program it to the hardware due to
intel_set_cdclk_{pre,post}_plane_update() not getting called on
account of state->modeset==false.]

Fixes: 4f0b4352bd ("drm/i915: Extract cdclk requirements checking to separate function")
BugLink: https://github.com/thesofproject/linux/issues/2410
Signed-off-by: Kai Vehmanen <kai.vehmanen@linux.intel.com>
Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20200901151036.1312357-1-kai.vehmanen@linux.intel.com
(cherry picked from commit cf696856bc)
Signed-off-by: Jani Nikula <jani.nikula@intel.com>
2020-09-08 14:38:46 +03:00
Dave Airlie
0c8d22fcae Merge tag 'amd-drm-next-5.10-2020-09-03' of git://people.freedesktop.org/~agd5f/linux into drm-next
amd-drm-next-5.10-2020-09-03:

amdgpu:
- RAS fixes
- Sienna Cichlid updates
- Navy Flounder updates
- DCE6 (SI) support in DC
- Enable plane rotation
- Rework pre-OS vram reservation handling during driver init
- Add standard interface to dump GPU metrics table from SMU
- Rework tiling and tmz state handling in atomic commits
- Pstate fixes
- Add voltage and power hwmon interfaces for renoir
- SW CTF fixes
- S/G display fix for Raven
- Print client strings for vmfaults for vega and newer
- Manual fan control fixes
- Display updates
- Reorg power management directory structure
- Misc bug fixes
- Misc code cleanups

amdkfd:
- Topology fixes
- Add SMI events for thermal throttling and GPU resets

radeon:
- switch from pci_* to dma_* for dma allocations
- PLL fix

Scheduler:
- Clean up priority levels

UAPI:
- amdgpu INFO IOCTL query update for TMZ state
  https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/6049
- amdkfd SMI event interface updates
  https://github.com/RadeonOpenCompute/rocm_smi_lib/tree/therm_thrott

From: Alex Deucher <alexdeucher@gmail.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20200903222921.4152-1-alexander.deucher@amd.com
2020-09-08 16:40:13 +10:00
Dave Airlie
20561da3a2 Revert "drm/i915/gem: Delete unused code"
These commits caused a regression on Lenovo t520 sandybridge
machine belonging to reporter. We are reverting them for 5.10
for other reasons, so just do it for 5.9 as well.

This reverts commit 7ac2d2536d.

Reported-by: Harald Arnesen <harald@skogtun.org>
Signed-off-by: Dave Airlie <airlied@redhat.com>
2020-09-08 15:45:27 +10:00
Dave Airlie
ad5d95e4d5 Revert "drm/i915/gem: Async GPU relocations only"
These commits caused a regression on Lenovo t520 sandybridge
machine belonging to reporter. We are reverting them for 5.10
for other reasons, so just do it for 5.9 as well.

This reverts commit 9e0f9464e2.

Reported-by: Harald Arnesen <harald@skogtun.org>
Signed-off-by: Dave Airlie <airlied@redhat.com>
2020-09-08 15:45:17 +10:00
Dave Airlie
4993a8a378 Revert "drm/i915: Remove i915_gem_object_get_dirty_page()"
These commits caused a regression on Lenovo t520 sandybridge
machine belonging to reporter. We are reverting them for 5.10
for other reasons, so just do it for 5.9 as well.

This reverts commit 763fedd6a2.

Reported-by: Harald Arnesen <harald@skogtun.org>
Signed-off-by: Dave Airlie <airied@redhat.com>
2020-09-08 15:44:07 +10:00
Dave Airlie
8052ff431a Merge tag 'drm-msm-fixes-2020-09-04' of https://gitlab.freedesktop.org/drm/msm into drm-fixes
A few fixes for a potential RPTR corruption issue.

Signed-off-by: Dave Airlie <airlied@redhat.com>
From: Rob Clark <robdclark@gmail.com>
Link: https://patchwork.freedesktop.org/patch/msgid/ <CAF6AEGvnr6Nhz2J0sjv2G+j7iceVtaDiJDT8T88uW6jiBfOGKQ@mail.gmail.com
2020-09-08 14:51:16 +10:00
Dave Airlie
ce5c207c6b Linux 5.9-rc4
-----BEGIN PGP SIGNATURE-----
 
 iQFSBAABCAA8FiEEq68RxlopcLEwq+PEeb4+QwBBGIYFAl9VerweHHRvcnZhbGRz
 QGxpbnV4LWZvdW5kYXRpb24ub3JnAAoJEHm+PkMAQRiGhc4H/iHD6qLdB36gZB6K
 oc2nJyrqyWitv4ti2Mnt5PA7o4wX4l6nnr1QvoaJ4BRs5Ja1czRvb2XDmdzqAoIA
 xITGoafqaAeDfxQ91bWrJsVN0pCRKiGwddXlU7TWmqw/riAkfOqi6GYKvav4biJH
 +n1mUPQb1M2IbRFsqkAS+ebKHq3CWaRvzKOEneS88nGlL5u31S9NAru8Ru/fkxRn
 6CwGcs1XRaBPYaZAhdfIb0NuatUlpkhPC9yhNS9up6SqrWmK3m65vmFVng6H0eCF
 fwn1jVztboY/XcNAi5sM9ExpQCql6WLQEEktVikqRDojC8fVtSx6W55tPt7qeaoO
 Z6t4/DA=
 =bcA4
 -----END PGP SIGNATURE-----

Merge tag 'v5.9-rc4' into drm-next

Backmerge 5.9-rc4 as there is a nasty qxl conflict
that needs to be resolved.

Signed-off-by: Dave Airlie <airlied@redhat.com>
2020-09-08 14:41:40 +10:00
Thomas Hellström
e0ee152fce drm/i915: Unlock the shared hwsp_gtt object after pinning
The hwsp_gtt object is used for sub-allocation and could therefore
be shared by many contexts causing unnecessary contention during
concurrent context pinning.
However since we're currently locking it only for pinning, it remains
resident until we unpin it, and therefore it's safe to drop the
lock early, allowing for concurrent thread access.

Signed-off-by: Thomas Hellström <thomas.hellstrom@intel.com>
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
Signed-off-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
2020-09-07 15:08:11 +03:00
Chris Wilson
f4b3c39554 drm/i915: Filter wake_flags passed to default_wake_function
(NOTE: This is the minimal backportable fix, a full fix is being
developed at https://patchwork.freedesktop.org/patch/388048/)

The flags passed to the wait_entry.func are passed onwards to
try_to_wake_up(), which has a very particular interpretation for its
wake_flags. In particular, beyond the published WF_SYNC, it has a few
internal flags as well. Since we passed the fence->error down the chain
via the flags argument, these ended up in the default_wake_function
confusing the kernel/sched.

Closes: https://gitlab.freedesktop.org/drm/intel/-/issues/2110
Fixes: ef46884975 ("drm/i915: Propagate fence errors")
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Matthew Auld <matthew.auld@intel.com>
Cc: <stable@vger.kernel.org> # v5.4+
Reviewed-by: Matthew Auld <matthew.auld@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20200728152144.1100-1-chris@chris-wilson.co.uk
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
[Joonas: Rebased and reordered into drm-intel-gt-next branch]
[Joonas: Added a note and link about more complete fix]
Signed-off-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
2020-09-07 15:08:08 +03:00
Chris Wilson
2e4c6c1a9d drm/i915: Remove i915_request.lock requirement for execution callbacks
To implement preempt-to-busy (and so efficient timeslicing and best utilization
of the hardware submission ports) we let the GPU run asynchronously in respect
to the ELSP submission queue. This created challenges in keeping and accessing
the driver state mirroring the asynchronous GPU execution.

Previous fix 1d9221e9d3 ("drm/i915: Skip signaling a signaled request")
however did not correctly serialize request retirement with the execution
callbacks.

We were using the i915_request.lock to serialise adding an execution callback
with __i915_request_submit. However, if we use an atomic llist_add to serialise
multiple waiters and then check to see if the request is already executing, we
can remove the irq-spinlock and fix serialization between retirement and
execution callbacks in one go.

v2: Avoid using the irq_work when outside of the irq-spinlocks, where we
can execute the callbacks immediately.
v3: Pay close attention to the order of setting ACTIVE on retirement, we
need to ensure the request is signaled and breadcrumbs detached before
we finish removing the request from the engine.
v4: Expanded commit message.

Fixes: 1d9221e9d3 ("drm/i915: Skip signaling a signaled request")
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20200716142207.13003-2-chris@chris-wilson.co.uk
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
[Joonas: Rebased and reordered into drm-intel-gt-next branch]
[Joonas: Added expanded commit message from Tvrtko and Chris]
Signed-off-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
2020-09-07 15:08:06 +03:00
Chris Wilson
b4d9145b01 drm/i915: Be wary of data races when reading the active execlists
To implement preempt-to-busy (and so efficient timeslicing and best utilization
of the hardware submission ports) we let the GPU run asynchronously in respect
to the ELSP submission queue. This created challenges in keeping and accessing
the driver state mirroring the asynchronous GPU execution.

The latest occurence of this was spotted by KCSAN:

[ 1413.563200] BUG: KCSAN: data-race in __await_execution+0x217/0x370 [i915]
[ 1413.563221]
[ 1413.563236] race at unknown origin, with read to 0xffff88885bb6c478 of 8 bytes by task 9654 on cpu 1:
[ 1413.563548]  __await_execution+0x217/0x370 [i915]
[ 1413.563891]  i915_request_await_dma_fence+0x4eb/0x6a0 [i915]
[ 1413.564235]  i915_request_await_object+0x421/0x490 [i915]
[ 1413.564577]  i915_gem_do_execbuffer+0x29b7/0x3c40 [i915]
[ 1413.564967]  i915_gem_execbuffer2_ioctl+0x22f/0x5c0 [i915]
[ 1413.564998]  drm_ioctl_kernel+0x156/0x1b0
[ 1413.565022]  drm_ioctl+0x2ff/0x480
[ 1413.565046]  __x64_sys_ioctl+0x87/0xd0
[ 1413.565069]  do_syscall_64+0x4d/0x80
[ 1413.565094]  entry_SYSCALL_64_after_hwframe+0x44/0xa9

To complicate matters, we have to both avoid the read tearing of *active and
avoid any write tearing as perform the pending[] -> inflight[] promotion of the
execlists.

This is because we cannot rely on the memcpy doing u64 aligned copies on all
kernels/platforms and so we opt to open-code it with explicit WRITE_ONCE
annotations to satisfy KCSAN.

v2: When in doubt, write the same comment again.
v3: Expanded commit message.

Fixes: b55230e5e8 ("drm/i915: Check for awaits on still currently executing requests")
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20200716142207.13003-1-chris@chris-wilson.co.uk
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
[Joonas: Rebased and reordered into drm-intel-gt-next branch]
[Joonas: Added expanded commit message from Tvrtko and Chris]
Signed-off-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
2020-09-07 15:07:49 +03:00
Maarten Lankhorst
c1793ba86a drm/i915: Add ww locking to pin_to_display_plane, v2.
Use ww locking for pin_to_display_plane for all the pinning and locking.
With the locking removed from set_cache_level, we need to fix
i915_gem_set_caching_ioctl to take the object reservation lock.

As this is a single lock, we don't need to use the ww dance.

Changes since v1:
- Do not use ww locking in i915_gem_set_caching_ioctl (Thomas).

Signed-off-by: Maarten Lankhorst <maarten.lankhorst@linux.intel.com>
Reviewed-by: Thomas Hellström <thomas.hellstrom@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20200819140904.1708856-24-maarten.lankhorst@linux.intel.com
Signed-off-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
2020-09-07 14:32:22 +03:00
Maarten Lankhorst
3c0ffa277e drm/i915: Add ww locking to vm_fault_gtt
We want to start requiring the reservation_lock instead of obj->mm.lock
for pinning objects, take the ww lock inside vm_fault_gtt as a first step
towards the legacy lock removal.

Signed-off-by: Maarten Lankhorst <maarten.lankhorst@linux.intel.com>
Reviewed-by: Thomas Hellström <thomas.hellstrom@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20200819140904.1708856-23-maarten.lankhorst@linux.intel.com
Signed-off-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
2020-09-07 14:32:16 +03:00
Maarten Lankhorst
15b6c92498 drm/i915: Move i915_vma_lock in the selftests to avoid lock inversion, v3.
Make sure vma_lock is not used as inner lock when kernel context is used,
and add ww handling where appropriate.

Ensure that execbuf selftests keep passing by using ww handling.

Changes since v2:
- Fix i915_gem_context finally.

Signed-off-by: Maarten Lankhorst <maarten.lankhorst@linux.intel.com>
Reviewed-by: Thomas Hellström <thomas.hellstrom@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20200819140904.1708856-22-maarten.lankhorst@linux.intel.com
Signed-off-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
2020-09-07 14:32:06 +03:00
Maarten Lankhorst
8a929c9eb1 drm/i915: Use ww pinning for intel_context_create_request()
We want to get rid of intel_context_pin(), convert
intel_context_create_request() first. :)

Signed-off-by: Maarten Lankhorst <maarten.lankhorst@linux.intel.com>
Reviewed-by: Thomas Hellström <thomas.hellstrom@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20200819140904.1708856-21-maarten.lankhorst@linux.intel.com
Signed-off-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
2020-09-07 14:31:59 +03:00
Maarten Lankhorst
052e04f170 drm/i915/selftests: Fix locking inversion in lrc selftest.
This function does not use intel_context_create_request, so it has
to use the same locking order as normal code. This is required to
shut up lockdep in selftests.

Signed-off-by: Maarten Lankhorst <maarten.lankhorst@linux.intel.com>
Reviewed-by: Thomas Hellström <thomas.hellstrom@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20200819140904.1708856-20-maarten.lankhorst@linux.intel.com
Signed-off-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
2020-09-07 14:31:50 +03:00
Maarten Lankhorst
dd878c0cec drm/i915: Dirty hack to fix selftests locking inversion
Some i915 selftests still use i915_vma_lock() as inner lock, and
intel_context_create_request() intel_timeline->mutex as outer lock.
Fortunately for selftests this is not an issue, they should be fixed
but we can move ahead and cleanify lockdep now.

Signed-off-by: Maarten Lankhorst <maarten.lankhorst@linux.intel.com>
Reviewed-by: Thomas Hellström <thomas.hellstrom@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20200819140904.1708856-19-maarten.lankhorst@linux.intel.com
Signed-off-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
2020-09-07 14:31:43 +03:00
Maarten Lankhorst
f00ecc2ef5 drm/i915: Convert i915_perf to ww locking as well
We have the ordering of timeline->mutex vs resv_lock wrong,
convert the i915_pin_vma and intel_context_pin as well to
future-proof this.

We may need to do future changes to do this more transaction-like,
and only get down to a single i915_gem_ww_ctx, but for now this
should work.

Signed-off-by: Maarten Lankhorst <maarten.lankhorst@linux.intel.com>
Reviewed-by: Thomas Hellström <thomas.hellstrom@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20200819140904.1708856-18-maarten.lankhorst@linux.intel.com
Signed-off-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
2020-09-07 14:31:36 +03:00
Maarten Lankhorst
c8d225946a drm/i915: Kill last user of intel_context_create_request outside of selftests
Instead of using intel_context_create_request(), use intel_context_pin()
and i915_create_request directly.

Now all those calls are gone outside of selftests. :)

Signed-off-by: Maarten Lankhorst <maarten.lankhorst@linux.intel.com>
Reviewed-by: Thomas Hellström <thomas.hellstrom@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20200819140904.1708856-17-maarten.lankhorst@linux.intel.com
Signed-off-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
2020-09-07 14:31:29 +03:00
Maarten Lankhorst
6b05030496 drm/i915: Convert i915_gem_object/client_blt.c to use ww locking as well, v2.
This is the last part outside of selftests that still don't use the
correct lock ordering of timeline->mutex vs resv_lock.

With gem fixed, there are a few places that still get locking wrong:
- gvt/scheduler.c
- i915_perf.c
- Most if not all selftests.

Changes since v1:
- Add intel_engine_pm_get/put() calls to fix use-after-free when using
  intel_engine_get_pool().

Signed-off-by: Maarten Lankhorst <maarten.lankhorst@linux.intel.com>
Reviewed-by: Thomas Hellström <thomas.hellstrom@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20200819140904.1708856-16-maarten.lankhorst@linux.intel.com
Signed-off-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
2020-09-07 14:31:22 +03:00
Maarten Lankhorst
47b086934f drm/i915: Make sure execbuffer always passes ww state to i915_vma_pin.
As a preparation step for full object locking and wait/wound handling
during pin and object mapping, ensure that we always pass the ww context
in i915_gem_execbuffer.c to i915_vma_pin, use lockdep to ensure this
happens.

This also requires changing the order of eb_parse slightly, to ensure
we pass ww at a point where we could still handle -EDEADLK safely.

Signed-off-by: Maarten Lankhorst <maarten.lankhorst@linux.intel.com>
Reviewed-by: Thomas Hellström <thomas.hellstrom@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20200819140904.1708856-15-maarten.lankhorst@linux.intel.com
Signed-off-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
2020-09-07 14:31:13 +03:00
Maarten Lankhorst
3999a70879 drm/i915: Rework intel_context pinning to do everything outside of pin_mutex
Instead of doing everything inside of pin_mutex, we move all pinning
outside. Because i915_active has its own reference counting and
pinning is also having the same issues vs mutexes, we make sure
everything is pinned first, so the pinning in i915_active only needs
to bump refcounts. This allows us to take pin refcounts correctly
all the time.

Signed-off-by: Maarten Lankhorst <maarten.lankhorst@linux.intel.com>
Reviewed-by: Thomas Hellström <thomas.hellstrom@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20200819140904.1708856-14-maarten.lankhorst@linux.intel.com
Signed-off-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
2020-09-07 14:31:02 +03:00
Maarten Lankhorst
2bf541ff6d drm/i915: Pin engine before pinning all objects, v5.
We want to lock all gem objects, including the engine context objects,
rework the throttling to ensure that we can do this. Now we only throttle
once, but can take eb_pin_engine while acquiring objects. This means we
will have to drop the lock to wait. If we don't have to throttle we can
still take the fastpath, if not we will take the slowpath and wait for
the throttle request while unlocked.

The engine has to be pinned as first step, otherwise gpu relocations
won't work.

Changes since v1:
- Only need to get a throttled request in the fastpath, no need for
  a global flag any more.
- Always free the waited request correctly.
Changes since v2:
- Use intel_engine_pm_get()/put() to keeep engine pool alive during
  EDEADLK handling.
Changes since v3:
- Fix small rq leak.
Changes since v4:
- Use a single reloc_context, for intel_context_pin_ww().

Signed-off-by: Maarten Lankhorst <maarten.lankhorst@linux.intel.com>
Reviewed-by: Thomas Hellström <thomas.hellstrom@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20200819140904.1708856-13-maarten.lankhorst@linux.intel.com
Signed-off-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
2020-09-07 14:30:52 +03:00
Maarten Lankhorst
b49a7d51c3 drm/i915: Nuke arguments to eb_pin_engine
Those arguments are already set as eb.file and eb.args, so kill off
the extra arguments. This will allow us to move eb_pin_engine() to
after we reserved all BO's.

Signed-off-by: Maarten Lankhorst <maarten.lankhorst@linux.intel.com>
Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Reviewed-by: Thomas Hellström <thomas.hellstrom@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20200819140904.1708856-12-maarten.lankhorst@linux.intel.com
Signed-off-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
2020-09-07 14:30:45 +03:00
Maarten Lankhorst
99f08d674e drm/i915: Add ww context handling to context_barrier_task
This is required if we want to pass a ww context in intel_context_pin
and gen6_ppgtt_pin().

Signed-off-by: Maarten Lankhorst <maarten.lankhorst@linux.intel.com>
Reviewed-by: Thomas Hellström <thomas.hellstrom@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20200819140904.1708856-11-maarten.lankhorst@linux.intel.com
Signed-off-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
2020-09-07 14:30:38 +03:00
Maarten Lankhorst
bfdf8b1d38 drm/i915: Use ww locking in intel_renderstate.
We want to start using ww locking in intel_context_pin, for this
we need to lock multiple objects, and the single i915_gem_object_lock
is not enough.

Convert to using ww-waiting, and make sure we always pin intel_context_state,
even if we don't have a renderstate object.

Signed-off-by: Maarten Lankhorst <maarten.lankhorst@linux.intel.com>
Reviewed-by: Thomas Hellström <thomas.hellstrom@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20200819140904.1708856-10-maarten.lankhorst@linux.intel.com
Signed-off-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
2020-09-07 14:30:31 +03:00
Maarten Lankhorst
c43ce12328 drm/i915: Use per object locking in execbuf, v12.
Now that we changed execbuf submission slightly to allow us to do all
pinning in one place, we can now simply add ww versions on top of
struct_mutex. All we have to do is a separate path for -EDEADLK
handling, which needs to unpin all gem bo's before dropping the lock,
then starting over.

This finally allows us to do parallel submission, but because not
all of the pinning code uses the ww ctx yet, we cannot completely
drop struct_mutex yet.

Changes since v1:
- Keep struct_mutex for now. :(
Changes since v2:
- Make sure we always lock the ww context in slowpath.
Changes since v3:
- Don't call __eb_unreserve_vma in eb_move_to_gpu now; this can be
  done on normal unlock path.
- Unconditionally release vmas and context.
Changes since v4:
- Rebased on top of struct_mutex reduction.
Changes since v5:
- Remove training wheels.
Changes since v6:
- Fix accidentally broken -ENOSPC handling.
Changes since v7:
- Handle gt buffer pool better.
Changes since v8:
- Properly clear variables, to make -EDEADLK handling not BUG.
Change since v9:
- Fix unpinning fence on pnv and below.
Changes since v10:
- Make relocation gpu chaining working again.
Changes since v11:
- Remove relocation chaining, pain to make it work.

Signed-off-by: Maarten Lankhorst <maarten.lankhorst@linux.intel.com>
Reviewed-by: Thomas Hellström <thomas.hellstrom@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20200819140904.1708856-9-maarten.lankhorst@linux.intel.com
Signed-off-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
2020-09-07 14:30:07 +03:00
Maarten Lankhorst
8e4ba491b0 drm/i915: Parse command buffer earlier in eb_relocate(slow)
We want to introduce backoff logic, but we need to lock the
pool object as well for command parsing. Because of this, we
will need backoff logic for the engine pool obj, move the batch
validation up slightly to eb_lookup_vmas, and the actual command
parsing in a separate function which can get called from execbuf
relocation fast and slowpath.

Signed-off-by: Maarten Lankhorst <maarten.lankhorst@linux.intel.com>
Reviewed-by: Thomas Hellström <thomas.hellstrom@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20200819140904.1708856-8-maarten.lankhorst@linux.intel.com
Signed-off-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
2020-09-07 14:30:01 +03:00
Maarten Lankhorst
1af343cdc1 drm/i915: Remove locking from i915_gem_object_prepare_read/write
Execbuffer submission will perform its own WW locking, and we
cannot rely on the implicit lock there.

This also makes it clear that the GVT code will get a lockdep splat when
multiple batchbuffer shadows need to be performed in the same instance,
fix that up.

Signed-off-by: Maarten Lankhorst <maarten.lankhorst@linux.intel.com>
Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20200819140904.1708856-7-maarten.lankhorst@linux.intel.com
Signed-off-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
2020-09-07 14:29:53 +03:00
Maarten Lankhorst
80f0b679d6 drm/i915: Add an implementation for i915_gem_ww_ctx locking, v2.
i915_gem_ww_ctx is used to lock all gem bo's for pinning and memory
eviction. We don't use it yet, but lets start adding the definition
first.

To use it, we have to pass a non-NULL ww to gem_object_lock, and don't
unlock directly. It is done in i915_gem_ww_ctx_fini.

Changes since v1:
- Change ww_ctx and obj order in locking functions (Jonas Lahtinen)

Signed-off-by: Maarten Lankhorst <maarten.lankhorst@linux.intel.com>
Reviewed-by: Thomas Hellström <thomas.hellstrom@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20200819140904.1708856-6-maarten.lankhorst@linux.intel.com
Signed-off-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
2020-09-07 14:29:44 +03:00
Maarten Lankhorst
8ae275c288 Revert "drm/i915/gem: Split eb_vma into its own allocation"
This reverts commit 0f1dd02295 ("drm/i915/gem: Split eb_vma into
its own allocation") and also moves all unreserving to a single
place at the end, which is a minor simplification.

With the WW locking, we will drop all references only at the
end when unlocking, so refcounting can now be removed.

Signed-off-by: Maarten Lankhorst <maarten.lankhorst@linux.intel.com>
Reviewed-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Link: https://patchwork.freedesktop.org/patch/msgid/20200819140904.1708856-5-maarten.lankhorst@linux.intel.com
Signed-off-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
2020-09-07 14:29:35 +03:00
Maarten Lankhorst
fd1500fcd4 Revert "drm/i915/gem: Drop relocation slowpath".
This reverts commit 7dc8f11437 ("drm/i915/gem: Drop relocation
slowpath"). We need the slowpath relocation for taking ww-mutex
inside the page fault handler, and we will take this mutex when
pinning all objects.

We also functionally revert ef398881d2 ("drm/i915/gem: Limit
struct_mutex to eb_reserve"), as we need the struct_mutex in
the slowpath as well, and a tiny part of 003d8b9143 ("drm/i915/gem:
Only call eb_lookup_vma once during execbuf ioctl"). Specifically,
we make the -EAGAIN handling part of fallback to slowpath again.

With this, we have a proper working slowpath again, which
will allow us to do fault handling with WW locks held.

[mlankhorst: Adjusted for reloc_gpu_flush() changes]

Cc: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Matthew Auld <matthew.auld@intel.com>
Signed-off-by: Maarten Lankhorst <maarten.lankhorst@linux.intel.com>
[mlankhorst: Removed extra reloc_gpu_flush()]
Reviewed-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Link: https://patchwork.freedesktop.org/patch/msgid/20200819140904.1708856-4-maarten.lankhorst@linux.intel.com
Signed-off-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
2020-09-07 14:29:16 +03:00
Maarten Lankhorst
50ae6c61a1 drm/i915: Revert relocation chaining commits.
This reverts commit 964a9b0f61 ("drm/i915/gem: Use chained reloc batches")
and commit 0e97fbb080 ("drm/i915/gem: Use a single chained reloc batches
for a single execbuf").

When adding ww locking to execbuf, it's hard enough to deal with a
single BO that is part of relocation execution. Chaining is hard to
get right, and with GPU relocation deprecated, it's best to drop this
altogether, instead of trying to fix something we will remove.

This is not a completely 1:1 revert, we reset rq_size to 0 in
reloc_cache_init, this was from e3d291301f ("drm/i915/gem: Implement legacy
MI_STORE_DATA_IMM"), because we don't want to break the selftests. (Daniel)

Signed-off-by: Maarten Lankhorst <maarten.lankhorst@linux.intel.com>
Reviewed-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Link: https://patchwork.freedesktop.org/patch/msgid/20200819140904.1708856-3-maarten.lankhorst@linux.intel.com
Signed-off-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
2020-09-07 14:29:07 +03:00
Maarten Lankhorst
102a0a9051 Revert "drm/i915/gem: Async GPU relocations only"
This reverts commit 9e0f9464e2 ("drm/i915/gem: Async GPU relocations only"),
and related commit 7ac2d2536d ("drm/i915/gem: Delete unused code").

Async GPU relocations are not the path forward, we want to remove
GPU accelerated relocation support eventually when userspace is fixed
to use VM_BIND, and this is the first step towards that. We will keep
async gpu relocations around for now, until userspace is fixed.

Relocation support will be disabled completely on platforms where there
was never any userspace that depends on it, as the hardware doesn't
require it from at least gen9+ onward. For older platforms, the plan
is to use cpu relocations only.

The igt side is fixed in igt commit 39e9aa1032a4e ("tests/i915: Remove
subtests that rely on async relocation behavior").

Signed-off-by: Maarten Lankhorst <maarten.lankhorst@linux.intel.com>
Reviewed-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Link: https://patchwork.freedesktop.org/patch/msgid/20200819140904.1708856-2-maarten.lankhorst@linux.intel.com
Signed-off-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
2020-09-07 14:28:46 +03:00
Chris Wilson
da1ea128a6 drm/i915/gem: Free the fence after a fence-chain lookup failure
If dma_fence_chain_find_seqno() reports an error, it does so in its
preamble before it disposes of the input fence. On handling the
error, we need to drop the reference to the fence.

Closes: https://gitlab.freedesktop.org/drm/intel/-/issues/2292
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Fixes: 13149e8baf ("drm/i915: add syncobj timeline support")
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20200806161056.17593-1-chris@chris-wilson.co.uk
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
Signed-off-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
2020-09-07 14:28:21 +03:00
Chris Wilson
736e785f9b drm/i915/gem: Reduce context termination list iteration guard to RCU
As we now protect the timeline list using RCU, we can drop the
timeline->mutex for guarding the list iteration during context close, as
we are searching for an inflight request. Any new request will see the
context is banned and not be submitted. In doing so, pull the checks for
a concurrent submission of the request (notably the
i915_request_completed()) under the engine spinlock, to fully serialise
with __i915_request_submit()). That is in the case of preempt-to-busy
where the request may be completed during the __i915_request_submit(),
we need to be careful that we sample the request status after
serialising so that we don't miss the request the engine is actually
submitting.

Fixes: 4a31741521 ("drm/i915/gem: Refine occupancy test in kill_context()")
References: d22d2d073e ("drm/i915: Protect i915_request_await_start from early waits") # rcu protection of timeline->requests
References: https://gitlab.freedesktop.org/drm/intel/-/issues/1622
References: https://gitlab.freedesktop.org/drm/intel/-/issues/2158
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20200806105954.7766-1-chris@chris-wilson.co.uk
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
Signed-off-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
2020-09-07 14:28:14 +03:00
Chris Wilson
dd5e024956 drm/i915/selftests: Prevent selecting 0 for our random width/align
When igt_random_offset() is a given a range of [0, PAGE_SIZE], it is
allowed to return 0. However, attempting to use a size of 0 for the
igt_lmem_write_cpu() byte poking, leads to call igt_random_offset() with
a range of [offset, offset + 0] and ask it to find a length of 4 within
it. This triggers the bug on that the requested length should fit within
the range!

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Matthew Auld <matthew.auld@intel.com>
Reviewed-by: Matthew Auld <matthew.auld@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20200806145728.16495-1-chris@chris-wilson.co.uk
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
Signed-off-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
2020-09-07 14:28:08 +03:00
Chris Wilson
e23005604b drm/i915/gt: Hold context/request reference while breadcrumbs are active
Currently we hold no actual reference to the request nor context while
they are attached to a breadcrumb. To avoid freeing the request/context
too early, we serialise with cancel-breadcrumbs by taking the irq
spinlock in i915_request_retire(). The alternative is to take a
reference for a new breadcrumb and release it upon signaling; removing
the more frequently hit contention point in i915_request_retire().

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20200801160225.6814-2-chris@chris-wilson.co.uk
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
[Joonas: Rebased and reordered into drm-intel-gt-next branch]
Signed-off-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
2020-09-07 14:24:29 +03:00
Chris Wilson
3f7dc10716 drm/i915/gt: Move intel_breadcrumbs_arm_irq earlier
Move the __intel_breadcrumbs_arm_irq earlier, next to the disarm_irq, so
that we can make use of it in the following patch.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20200801160225.6814-1-chris@chris-wilson.co.uk
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
Signed-off-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
2020-09-07 14:24:25 +03:00
Chris Wilson
82adf90113 drm/i915/gt: Shrink i915_page_directory's slab bucket
kmalloc uses power-of-two slab buckets for small allocations (up to a
few pages). Since i915_page_directory is a page of pointers, plus a
couple more, this is rounded up to 8K, and we waste nearly 50% of that
allocation. Long terms this leads to poor memory utilisation, bloating
the kernel footprint, but the problem is exacerbated by our conservative
preallocation scheme for binding VMA. As we are required to allocate all
levels for each vma just in case we need to insert them upon binding,
this leads to a large multiplication factor for a single page vma. By
halving the allocation we need for the page directory structure, we
halve the impact of that factor, bringing workloads that once fitted into
memory, hopefully back to fitting into memory.

We maintain the split between i915_page_directory and i915_page_table as
we only need half the allocation for the lowest, most populous, level.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Mika Kuoppala <mika.kuoppala@linux.intel.com>
Cc: Matthew Auld <matthew.auld@intel.com>
Reviewed-by: Matthew Auld <matthew.auld@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20200729164219.5737-3-chris@chris-wilson.co.uk
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
Signed-off-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
2020-09-07 14:24:23 +03:00
Chris Wilson
89351925a4 drm/i915/gt: Switch to object allocations for page directories
The GEM object is grossly overweight for the practicality of tracking
large numbers of individual pages, yet it is currently our only
abstraction for tracking DMA allocations. Since those allocations need
to be reserved upfront before an operation, and that we need to break
away from simple system memory, we need to ditch using plain struct page
wrappers.

In the process, we drop the WC mapping as we ended up clflushing
everything anyway due to various issues across a wider range of
platforms. Though in a future step, we need to drop the kmap_atomic
approach which suggests we need to pre-map all the pages and keep them
mapped.

v2: Verify our large scratch page is suitably DMA aligned; and manually
clear the scratch since we are allocating plain struct pages full of
prior content.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Matthew Auld <matthew.auld@intel.com>
Reviewed-by: Matthew Auld <matthew.auld@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20200729164219.5737-2-chris@chris-wilson.co.uk
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
Signed-off-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
2020-09-07 14:24:08 +03:00
Chris Wilson
cd0452aa2a drm/i915: Preallocate stashes for vma page-directories
We need to make the DMA allocations used for page directories to be
performed up front so that we can include those allocations in our
memory reservation pass. The downside is that we have to assume the
worst case, even before we know the final layout, and always allocate
enough page directories for this object, even when there will be overlap.
This unfortunately can be quite expensive, especially as we have to
clear/reset the page directories and DMA pages, but it should only be
required during early phases of a workload when new objects are being
discovered, or after memory/eviction pressure when we need to rebind.
Once we reach steady state, the objects should not be moved and we no
longer need to preallocating the pages tables.

It should be noted that the lifetime for the page directories DMA is
more or less decoupled from individual fences as they will be shared
across objects across timelines.

v2: Only allocate enough PD space for the PTE we may use, we do not need
to allocate PD that will be left as scratch.
v3: Store the shift unto the first PD level to encapsulate the different
PTE counts for gen6/gen8.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Matthew Auld <matthew.auld@intel.com>
Reviewed-by: Matthew Auld <matthew.auld@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20200729164219.5737-1-chris@chris-wilson.co.uk
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
Signed-off-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
2020-09-07 14:24:05 +03:00
Chris Wilson
b3786b2937 drm/i915/gt: Distinguish the virtual breadcrumbs from the irq breadcrumbs
On the virtual engines, we only use the intel_breadcrumbs for tracking
signaling of stale breadcrumbs from the irq_workers. They do not have
any associated interrupt handling, active requests are passed to a
physical engine and associated breadcrumb interrupt handler. This causes
issues for us as we need to ensure that we do not actually try and
enable interrupts and the powermanagement required for them on the
virtual engine, as they will never be disabled. Instead, let's
specify the physical engine used for interrupt handler on a particular
breadcrumb.

v2: Drop b->irq_armed = true mocking for no interrupt HW

Fixes: 4fe6abb8f5 ("drm/i915/gt: Ignore irq enabling on the virtual engines")
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20200731154834.8378-4-chris@chris-wilson.co.uk
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
Signed-off-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
2020-09-07 14:23:55 +03:00
Chris Wilson
56f581bad4 drm/i915/gt: Only transfer the virtual context to the new engine if active
One more complication of preempt-to-busy with respect to the virtual
engine is that we may have retired the last request along the virtual
engine at the same time as preparing to submit the completed request to
a new engine. That submit will be shortcircuited, but not before we have
updated the context with the new register offsets and marked the virtual
engine as bound to the new engine (by calling swap on ve->siblings[]).
As we may have just retired the completed request, we may also be in the
middle of calling virtual_context_exit() to turn off the power management
associated with the virtual engine, and that in turn walks the
ve->siblings[]. If we happen to call swap() on the array as we walk, we
will call intel_engine_pm_put() twice on the same engine.

In this patch, we prevent this by only updating the bound engine after a
successful submission which weeds out the already completed requests.

Alternatively, we could walk a non-volatile array for the pm, such as
using the engine->mask. The small advantage to performing the update
after the submit is that we then only have to do a swap for active
requests.

Fixes: 22b7a426bb ("drm/i915/execlists: Preempt-to-busy")
References: 6d06779e86 ("drm/i915: Load balancing across a virtual engine"
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Cc: "Nayana, Venkata Ramana" <venkata.ramana.nayana@intel.com>
Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20200731154834.8378-3-chris@chris-wilson.co.uk
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
Signed-off-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
2020-09-07 14:23:50 +03:00
Chris Wilson
2854d86632 drm/i915/gt: Replace intel_engine_transfer_stale_breadcrumbs
After staring at the breadcrumb enabling/cancellation and coming to the
conclusion that the cause of the mysterious stale breadcrumbs must the
act of submitting a completed requests, we can then redirect those
completed requests onto a dedicated signaled_list at the time of
construction and so eliminate intel_engine_transfer_stale_breadcrumbs().

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20200731154834.8378-2-chris@chris-wilson.co.uk
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
Signed-off-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
2020-09-07 14:23:14 +03:00
Chris Wilson
c18636f763 drm/i915: Remove requirement for holding i915_request.lock for breadcrumbs
Since the breadcrumb enabling/cancelling itself is serialised by the
breadcrumbs.irq_lock, with a bit of care we can remove the outer
serialisation with i915_request.lock for concurrent
dma_fence_enable_signaling(). This has the important side-effect of
eliminating the nested i915_request.lock within request submission.

The challenge in serialisation is around the unsubmission where we take
an active request that wants a breadcrumb on the signaling engine and
put it to sleep. We do not want a concurrent
dma_fence_enable_signaling() to attach a breadcrumb as we unsubmit, so
we must mark the request as no longer active before serialising with the
concurrent enable-signaling.

On retire, we serialise with the concurrent enable-signaling, but
instead of clearing ACTIVE, we mark it as SIGNALED.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20200731154834.8378-1-chris@chris-wilson.co.uk
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
[Joonas: Rebased and reordered into drm-intel-gt-next branch]
Signed-off-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
2020-09-07 13:29:32 +03:00
Chris Wilson
af5c6fcf40 drm/i915: Provide a fastpath for waiting on vma bindings
Before we can execute a request, we must wait for all of its vma to be
bound. This is a frequent operation for which we can optimise away a
few atomic operations (notably a cmpxchg) in lieu of the RCU protection.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Thomas Hellström <thomas.hellstrom@intel.com>
Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20200731085015.32368-7-chris@chris-wilson.co.uk
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
Signed-off-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
2020-09-07 13:29:19 +03:00
Chris Wilson
9ff33bbcda drm/i915: Reduce locking around i915_active_acquire_preallocate_barrier()
As the conversion between idle-barrier and full i915_active_fence is
already serialised by explicit memory barriers, we can reduce the
spinlock in i915_active_acquire_preallocate_barrier() for finding an
idle-barrier to reuse to an RCU read lock to ensure the fence remains
valid, only taking the spinlock for the update of the rbtree itself.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Thomas Hellström <thomas.hellstrom@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20200731085015.32368-6-chris@chris-wilson.co.uk
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
Signed-off-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
2020-09-07 13:19:11 +03:00
Chris Wilson
e28860ae21 drm/i915: Make the stale cached active node available for any timeline
Rather than require the next timeline after idling to match the MRU
before idling, reset the index on the node and allow it to match the
first request. However, this requires cmpxchg(u64) and so is not trivial
on 32b, so for compatibility we just fallback to keeping the cached node
pointing to the MRU timeline.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Thomas Hellström <thomas.hellstrom@intel.com>
Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20200731085015.32368-5-chris@chris-wilson.co.uk
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
Signed-off-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
2020-09-07 13:18:55 +03:00
Chris Wilson
99a7f4dae7 drm/i915: Keep the most recently used active-fence upon discard
Whenever an i915_active idles, we prune its tree of old fence slots to
prevent a gradual leak should it be used to track many, many timelines.
The downside is that we then have to frequently reallocate the rbtree.
A compromise is that we keep the most recently used fence slot, and
reuse that for the next active reference as that is the most likely
timeline to be reused.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Thomas Hellström <thomas.hellstrom@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20200731085015.32368-4-chris@chris-wilson.co.uk
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
Signed-off-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
2020-09-07 13:18:36 +03:00
Chris Wilson
5d9341370f drm/i915: Export a preallocate variant of i915_active_acquire()
Sometimes we have to be very careful not to allocate underneath a mutex
(or spinlock) and yet still want to track activity. Enter
i915_active_acquire_for_context(). This raises the activity counter on
i915_active prior to use and ensures that the fence-tree contains a slot
for the context.

v2: Refactor active_lookup() so it can be called again before/after
locking to resolve contention. Since we protect the rbtree until we
idle, we can do a lockfree lookup, with the caveat that if another
thread performs a concurrent insertion, the rotations from the insert
may cause us to not find our target. A second pass holding the treelock
will find the target if it exists, or the place to perform our
insertion.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Thomas Hellström <thomas.hellstrom@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20200731085015.32368-3-chris@chris-wilson.co.uk
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
Signed-off-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
2020-09-07 13:18:17 +03:00
Chris Wilson
04240e30ed drm/i915: Skip taking acquire mutex for no ref->active callback
If no active callback is defined for i915_active, we do not need to
serialise its enabling with the mutex. We still do only want to call the
debug activate once, and must still serialise with a concurrent retire.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Reviewed-by: Thomas Hellström <thomas.hellstrom@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20200731085015.32368-2-chris@chris-wilson.co.uk
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
Signed-off-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
2020-09-07 13:18:01 +03:00
Chris Wilson
bde246d893 drm/i915/selftests: Drop stale timeline constructor assert
Since we pass around encoded parameters to the kernel context
constructor using the ce->timeline pointer, we can no longer assert that
it should be zero for mock timeline construction.

Fixes: d1bf5dd8f6 ("drm/i915/gt: Support multiple pinned timelines")
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Matthew Auld <matthew.auld@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20200731102206.6793-1-chris@chris-wilson.co.uk
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
[Joonas: Updated Fixes: link after rebasing and reordering into drm-intel-gt-next branch]
Signed-off-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
2020-09-07 13:17:13 +03:00
Chris Wilson
13106019f7 drm/i915/gt: Pull release of node->age under the spinlock
We need to ensure that the list is valid prior to marking the node as
retrievable, otherwise we may see two threads compete over the same node
in intel_gt_get_buffer_pool(). If the first thread acquires and releases
the node in the same jiffie, the second thread may then acquire it (as
the jiffie now again matches the expected value) and claim the node
before it is put back into the list.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Matthew Auld <matthew.auld@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20200730134049.8822-1-chris@chris-wilson.co.uk
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
Signed-off-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
2020-09-07 13:16:58 +03:00
Chris Wilson
d1bf5dd8f6 drm/i915/gt: Support multiple pinned timelines
We may need to allocate more than one pinned context/timeline for each
engine which can utilise the per-engine HWSP, so we need to give each
a different offset within it.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Matthew Auld <matthew.auld@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20200730183906.25422-1-chris@chris-wilson.co.uk
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
Signed-off-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
2020-09-07 13:16:43 +03:00
Chris Wilson
eb4dedae92 drm/i915/gem: Delay tracking the GEM context until it is registered
Avoid exposing a partially constructed context by deferring the
list_add() from the initial construction to the end of registration.
Otherwise, if we peek into the list of contexts from inside debugfs, we
may see the partially constructed context and chase down some dangling
incomplete pointers.

Reported-by: CQ Tang <cq.tang@intel.com>
Fixes: 3aa9945a52 ("drm/i915: Separate GEM context construction and registration to userspace")
References: f6e8aa3871 ("drm/i915: Report the number of closed vma held by each context in debugfs")
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Cc: CQ Tang <cq.tang@intel.com>
Cc: <stable@vger.kernel.org> # v5.2+
Reviewed-by: Mika Kuoppala <mika.kuoppala@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20200730092856.23615-1-chris@chris-wilson.co.uk
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
Signed-off-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
2020-09-07 13:15:25 +03:00
Chris Wilson
a30e4ec176 drm/i915/gt: Fix termination condition for freeing all buffer objects
A last minute change, that unfortunately broke CI so badly it declared
SUCCESS, was to refactor the debug free all buffer pool code to reuse
the normal worker, inverted the termination condition so that it instead
of discarding the nodes, they were all declared young enough and
eligible for reuse.

Fixes: 06b73c2d0b ("drm/i915/gt: Delay taking the spinlock for grabbing from the buffer pool")
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Matthew Auld <matthew.auld@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20200729110756.2344-1-chris@chris-wilson.co.uk
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
[Joonas: Updating Fixes: link after rebasing and reordering into drm-intel-gt-next]
Signed-off-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
2020-09-07 13:14:22 +03:00
Chris Wilson
62b1522cc3 drm/i915/selftests: Flush the active barriers before asserting
Before we peek at the barrier status for an assert, first serialise with
its callbacks so that we see a stable value.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Thomas Hellström <thomas.hellstrom@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20200728153325.28351-1-chris@chris-wilson.co.uk
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
Signed-off-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
2020-09-07 13:14:15 +03:00
Chris Wilson
06b73c2d0b drm/i915/gt: Delay taking the spinlock for grabbing from the buffer pool
Some very low hanging fruit, but contention on the pool->lock is
noticeable between intel_gt_get_buffer_pool() and pool_retire(), with
the majority of the hold time due to the locked list iteration. If we
make the node itself RCU protected, we can perform the search for an
suitable node just under RCU, reserving taking the lock itself for
claiming the node and manipulating the list.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Matthew Auld <matthew.auld@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20200729080245.8070-1-chris@chris-wilson.co.uk
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
Signed-off-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
2020-09-07 13:14:07 +03:00
Chris Wilson
a817c891c1 drm/i915/gt: Disable preparser around xcs invalidations on tgl
Unlike rcs where we have conclusive evidence from our selftesting that
disabling the preparser before performing the TLB invalidate and
relocations does impact upon the GPU execution, the evidence for the
same requirement on xcs is much more circumstantial. Let's apply the
preparser disable between batches as we invalidate the TLB as a dose of
healthy paranoia, just in case.

References: https://gitlab.freedesktop.org/drm/intel/-/issues/2169
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Mika Kuoppala <mika.kuoppala@linux.intel.com>
Reviewed-by: Mika Kuoppala <mika.kuoppala@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20200728152110.830-1-chris@chris-wilson.co.uk
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
Signed-off-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
2020-09-07 13:13:59 +03:00
Chris Wilson
27a5dcfe73 drm/i915/gem: Remove disordered per-file request list for throttling
I915_GEM_THROTTLE dates back to the time before contexts where there was
just a single engine, and therefore a single timeline and request list
globally. That request list was in execution/retirement order, and so
walking it to find a particular aged request made sense and could be
split per file.

That is no more. We now have many timelines with a file, as many as the
user wants to construct (essentially per-engine, per-context). Each of
those run independently and so make the single list futile. Remove the
disordered list, and iterate over all the timelines to find a request to
wait on in each to satisfy the criteria that the CPU is no more than 20ms
ahead of its oldest request.

It should go without saying that the I915_GEM_THROTTLE ioctl is no
longer used as the primary means of throttling, so it makes sense to push
the complication into the ioctl where it only impacts upon its few
irregular users, rather than the execbuf/retire where everybody has to
pay the cost. Fortunately, the few users do not create vast amount of
contexts, so the loops over contexts/engines should be concise.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Cc: Mika Kuoppala <mika.kuoppala@linux.intel.com>
Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20200728152010.30701-1-chris@chris-wilson.co.uk
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
Signed-off-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
2020-09-07 13:13:50 +03:00
Chris Wilson
3adee4ac29 drm/i915: Soften the tasklet flush frequency before waits
We include a tasklet flush before waiting on a request as a precaution
against the HW being lax in event signaling. We now have a precautionary
flush in the engine's heartbeat and so do not need to be quite so
zealous on every request wait. If we focus on the request, the only
tasklet flush that matters is if there is a delay in submitting this
request to HW, so if the request is not ready to be executed, no
advantage in reducing this wait can be gained by running the tasklet.
And there is little point in doing busy work for no result.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Mika Kuoppala <mika.kuoppala@linux.intel.com>
Reviewed-by: Mika Kuoppala <mika.kuoppala@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20200715115147.11866-10-chris@chris-wilson.co.uk
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
Signed-off-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
2020-09-07 13:13:41 +03:00
Chris Wilson
e3d0e21396 drm/i915/selftests: Mock the status_page.vma for the kernel_context
Since we assert that the kernel_context is using the perma-pinned HWSP,
make it so.

Closes: https://gitlab.freedesktop.org/drm/intel/-/issues/2179
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Mika Kuoppala <mika.kuoppala@linux.intel.com>
Reviewed-by: Mika Kuoppala <mika.kuoppala@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20200715155858.16410-1-chris@chris-wilson.co.uk
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
Signed-off-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
2020-09-07 13:13:29 +03:00
Chris Wilson
3f6a6f343c drm/i915: Reduce i915_request.lock contention for i915_request_wait
Currently, we use i915_request_completed() directly in
i915_request_wait() and follow up with a manual invocation of
dma_fence_signal(). This appears to cause a large number of contentions
on i915_request.lock as when the process is woken up after the fence is
signaled by an interrupt, we will then try and call dma_fence_signal()
ourselves while the signaler is still holding the lock.
dma_fence_is_signaled() has the benefit of checking the
DMA_FENCE_FLAG_SIGNALED_BIT prior to calling dma_fence_signal() and so
avoids most of that contention.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Matthew Auld <matthew.auld@intel.com>
Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20200716100754.5670-1-chris@chris-wilson.co.uk
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
Signed-off-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
2020-09-07 13:13:06 +03:00
Linus Torvalds
68beef5710 xen: branch for v5.9-rc4
-----BEGIN PGP SIGNATURE-----
 
 iHUEABYIAB0WIQRTLbB6QfY48x44uB6AXGG7T9hjvgUCX1Rn1wAKCRCAXGG7T9hj
 vlEjAQC/KGC3wYw5TweWcY48xVzgvued3JLAQ6pcDlOe6osd6AEAzZcZKgL948cx
 oY0T98dxb/U+lUhbIzhpBr/30g8JbAQ=
 =Xcxp
 -----END PGP SIGNATURE-----

Merge tag 'for-linus-5.9-rc4-tag' of git://git.kernel.org/pub/scm/linux/kernel/git/xen/tip

Pull xen updates from Juergen Gross:
 "A small series for fixing a problem with Xen PVH guests when running
  as backends (e.g. as dom0).

  Mapping other guests' memory is now working via ZONE_DEVICE, thus not
  requiring to abuse the memory hotplug functionality for that purpose"

* tag 'for-linus-5.9-rc4-tag' of git://git.kernel.org/pub/scm/linux/kernel/git/xen/tip:
  xen: add helpers to allocate unpopulated memory
  memremap: rename MEMORY_DEVICE_DEVDAX to MEMORY_DEVICE_GENERIC
  xen/balloon: add header guard
2020-09-06 09:59:27 -07:00
Laurent Pinchart
3e8b240354 drm: xlnx: dpsub: Fix DMADEVICES Kconfig dependency
The dpsub driver uses the DMA engine API, and thus selects DMA_ENGINE to
provide that API. DMA_ENGINE depends on DMADEVICES, which can be
deselected by the user, creating a possibly unmet indirect dependency:

WARNING: unmet direct dependencies detected for DMA_ENGINE
  Depends on [n]: DMADEVICES [=n]
  Selected by [m]:
  - DRM_ZYNQMP_DPSUB [=m] && HAS_IOMEM [=y] && (ARCH_ZYNQMP || COMPILE_TEST [=y]) && COMMON_CLK [=y] && DRM [=m] && OF [=y]

Add a dependency on DMADEVICES to fix this.

Reported-by: Randy Dunlap <rdunlap@infradead.org>
Signed-off-by: Laurent Pinchart <laurent.pinchart@ideasonboard.com>
Acked-by: Randy Dunlap <rdunlap@infradead.org>
2020-09-05 19:52:54 +03:00
Jordan Crouse
f6828e0c40 drm/msm: Disable the RPTR shadow
Disable the RPTR shadow across all targets. It will be selectively
re-enabled later for targets that need it.

Cc: stable@vger.kernel.org
Signed-off-by: Jordan Crouse <jcrouse@codeaurora.org>
Signed-off-by: Rob Clark <robdclark@chromium.org>
2020-09-04 12:14:15 -07:00
Jordan Crouse
7b3f3948c8 drm/msm: Disable preemption on all 5xx targets
Temporarily disable preemption on a5xx targets pending some improvements
to protect the RPTR shadow from being corrupted.

Cc: stable@vger.kernel.org
Signed-off-by: Jordan Crouse <jcrouse@codeaurora.org>
Signed-off-by: Rob Clark <robdclark@chromium.org>
2020-09-04 12:14:15 -07:00
Jordan Crouse
604234f336 drm/msm: Enable expanded apriv support for a650
a650 supports expanded apriv support that allows us to map critical buffers
(ringbuffer and memstore) as as privileged to protect them from corruption.

Cc: stable@vger.kernel.org
Signed-off-by: Jordan Crouse <jcrouse@codeaurora.org>
Signed-off-by: Rob Clark <robdclark@chromium.org>
2020-09-04 12:14:07 -07:00
Jordan Crouse
34221545d2 drm/msm: Split the a5xx preemption record
The main a5xx preemption record can be marked as privileged to
protect it from user access but the counters storage needs to be
remain unprivileged. Split the buffers and mark the critical memory
as privileged.

Cc: stable@vger.kernel.org
Signed-off-by: Jordan Crouse <jcrouse@codeaurora.org>
Signed-off-by: Rob Clark <robdclark@chromium.org>
2020-09-04 12:12:56 -07:00
Linus Torvalds
cf85f5de83 drm fixes for 5.9-rc4
amdgpu:
 - Fix for 32bit systems
 - SW CTF fix
 - Update for Sienna Cichlid
 - CIK bug fixes
 
 radeon:
 - PLL fix
 
 i915:
 - Clang build warning fix
 - HDCP fixes
 
 nouveau:
 - display fixes
 -----BEGIN PGP SIGNATURE-----
 
 iQIcBAABAgAGBQJfUbg9AAoJEAx081l5xIa+O/cP/RHOAmBdFwPJSkzj93hy3LGZ
 uOCbB7gIhnVl9DObPQncKe8ZYd6XmMhCeFmOTcAXJEdcJkm4cDCe+xrM8Jcvr7pZ
 gHesqBchXmlTsunK44bP+ljh6y8J0wv06KRDpxhJv78lk0k3jg39ivT+5znvR1NU
 Wl5R4mkoPZknS92hGV/saH+5wbgsGJCtsOed2/sTE2mfL72Nw5Ym4ZEFGiaxSpUC
 wS83iV0sgOFLjj2jhpkXA3YJ+rTWx1Gg9VqD0Zn5lUVTPCrnevVItztXjQ7FtAC6
 ADziGhIxFkyHnZBQNTmItzNSPTsWDwX60Kk9obU44s/0QOWmf5znNocsVk/Lhv6N
 qREzQVqPjUFmFgWSBQ2bFlXdnrUhb2LHngnyScdk2QTGjfIaSXOUE5KV14LkS/C8
 vKtKlIrGsQSC02eWhNqih0NIO4EFsyNtx/Mw7FlID7D9rZeUCgFpuaknlS14aNDR
 a7luJeNBhwnmpgi8ejWTAhTwMXgSa9Vx33El26bUH6jCDVYk94+4S5Z6AUkco1pZ
 egP/8k49OH4pfPxv/M9ZiPdEM4DFWTsp/hWLKonZdaQ0pciTi/GC1Ett4MRa+j+V
 Mofv7pT42ZoAui2VcKXkQzZpgFff5Ca+PYjGE8O+FbH+pr+zJzUGNhJ/00Or1L11
 tT1BQ3ae++9lyqAX7Re2
 =eBDY
 -----END PGP SIGNATURE-----

Merge tag 'drm-fixes-2020-09-04' of git://anongit.freedesktop.org/drm/drm

Pull drm fixes from Dave Airlie:
 "Not much going on this week, nouveau has a display hw bug workaround,
  amdgpu has some PM fixes and CIK regression fixes, one single radeon
  PLL fix, and a couple of i915 display fixes.

  amdgpu:
   - Fix for 32bit systems
   - SW CTF fix
   - Update for Sienna Cichlid
   - CIK bug fixes

  radeon:
   - PLL fix

  i915:
   - Clang build warning fix
   - HDCP fixes

  nouveau:
   - display fixes"

* tag 'drm-fixes-2020-09-04' of git://anongit.freedesktop.org/drm/drm:
  drm/nouveau/kms/nv50-gp1xx: add WAR for EVO push buffer HW bug
  drm/nouveau/kms/nv50-gp1xx: disable notifies again after core update
  drm/nouveau/kms/nv50-: add some whitespace before debug message
  drm/nouveau/kms/gv100-: Include correct push header in crcc37d.c
  drm/radeon: Prefer lower feedback dividers
  drm/amdgpu: Fix bug in reporting voltage for CIK
  drm/amdgpu: Specify get_argument function for ci_smu_funcs
  drm/amd/pm: enable MP0 DPM for sienna_cichlid
  drm/amd/pm: avoid false alarm due to confusing softwareshutdowntemp setting
  drm/amd/pm: fix is_dpm_running() run error on 32bit system
  drm/i915: Clear the repeater bit on HDCP disable
  drm/i915: Fix sha_text population code
  drm/i915/display: Ensure that ret is always initialized in icl_combo_phy_verify_state
2020-09-04 11:59:44 -07:00
Linus Torvalds
b25d1dc947 Merge branch 'simplify-do_wp_page'
Merge emailed patches from Peter Xu:
 "This is a small series that I picked up from Linus's suggestion to
  simplify cow handling (and also make it more strict) by checking
  against page refcounts rather than mapcounts.

  This makes uffd-wp work again (verified by running upmapsort)"

Note: this is horrendously bad timing, and making this kind of
fundamental vm change after -rc3 is not at all how things should work.
The saving grace is that it really is a a nice simplification:

 8 files changed, 29 insertions(+), 120 deletions(-)

The reason for the bad timing is that it turns out that commit
17839856fd ("gup: document and work around 'COW can break either way'
issue" broke not just UFFD functionality (as Peter noticed), but Mikulas
Patocka also reports that it caused issues for strace when running in a
DAX environment with ext4 on a persistent memory setup.

And we can't just revert that commit without re-introducing the original
issue that is a potential security hole, so making COW stricter (and in
the process much simpler) is a step to then undoing the forced COW that
broke other uses.

Link: https://lore.kernel.org/lkml/alpine.LRH.2.02.2009031328040.6929@file01.intranet.prod.int.rdu2.redhat.com/

* emailed patches from Peter Xu <peterx@redhat.com>:
  mm: Add PGREUSE counter
  mm/gup: Remove enfornced COW mechanism
  mm/ksm: Remove reuse_ksm_page()
  mm: do_wp_page() simplification
2020-09-04 09:31:54 -07:00
Peter Xu
a308c71bf1 mm/gup: Remove enfornced COW mechanism
With the more strict (but greatly simplified) page reuse logic in
do_wp_page(), we can safely go back to the world where cow is not
enforced with writes.

This essentially reverts commit 17839856fd ("gup: document and work
around 'COW can break either way' issue").  There are some context
differences due to some changes later on around it:

  2170ecfa76 ("drm/i915: convert get_user_pages() --> pin_user_pages()", 2020-06-03)
  376a34efa4 ("mm/gup: refactor and de-duplicate gup_fast() code", 2020-06-03)

Some lines moved back and forth with those, but this revert patch should
have striped out and covered all the enforced cow bits anyways.

Suggested-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Peter Xu <peterx@redhat.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2020-09-04 09:25:20 -07:00
Gerd Hoffmann
fc7f148feb drm/virtio: drop virtio_gpu_output->enabled
Not needed, already tracked by drm_crtc_state->active.

Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
Acked-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Link: http://patchwork.freedesktop.org/patch/msgid/20200818072511.6745-3-kraxel@redhat.com
(cherry picked from commit 1174c8a0f3)
2020-09-04 13:11:32 +02:00
Maxime Ripard
5e2e2600a3
drm/sun4i: backend: Disable alpha on the lowest plane on the A20
Unlike we previously thought, the per-pixel alpha is just as broken on the
A20 as it is on the A10. Remove the quirk that says we can use it.

Fixes: dcf496a6a6 ("drm/sun4i: sun4i: Introduce a quirk for lowest plane alpha support")
Signed-off-by: Maxime Ripard <maxime@cerno.tech>
Reviewed-by: Chen-Yu Tsai <wens@csie.org>
Cc: Paul Kocialkowski <paul.kocialkowski@bootlin.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20200728134810.883457-2-maxime@cerno.tech
2020-09-04 11:13:00 +02:00
Maxime Ripard
e359c70462
drm/sun4i: backend: Support alpha property on lowest plane
Unlike what we previously thought, only the per-pixel alpha is broken on
the lowest plane and the per-plane alpha isn't. Remove the check on the
alpha property being set on the lowest plane to reject a mode.

Fixes: dcf496a6a6 ("drm/sun4i: sun4i: Introduce a quirk for lowest plane alpha support")
Signed-off-by: Maxime Ripard <maxime@cerno.tech>
Reviewed-by: Chen-Yu Tsai <wens@csie.org>
Cc: Paul Kocialkowski <paul.kocialkowski@bootlin.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20200728134810.883457-1-maxime@cerno.tech
2020-09-04 11:12:54 +02:00
Jernej Skrabec
0ee9f600e6
drm/sun4i: Fix DE2 YVU handling
Function sun8i_vi_layer_get_csc_mode() is supposed to return CSC mode
but due to inproper return type (bool instead of u32) it returns just 0
or 1. Colors are wrong for YVU formats because of that.

Fixes: daab3d0e8e ("drm/sun4i: de2: csc_mode in de2 format struct is mostly redundant")
Reported-by: Roman Stratiienko <r.stratiienko@gmail.com>
Signed-off-by: Jernej Skrabec <jernej.skrabec@siol.net>
Tested-by: Roman Stratiienko <r.stratiienko@gmail.com>
Signed-off-by: Maxime Ripard <maxime@cerno.tech>
Link: https://patchwork.freedesktop.org/patch/msgid/20200901220305.6809-1-jernej.skrabec@siol.net
2020-09-04 11:05:58 +02:00
Roger Pau Monne
9e2369c06c xen: add helpers to allocate unpopulated memory
To be used in order to create foreign mappings. This is based on the
ZONE_DEVICE facility which is used by persistent memory devices in
order to create struct pages and kernel virtual mappings for the IOMEM
areas of such devices. Note that on kernels without support for
ZONE_DEVICE Xen will fallback to use ballooned pages in order to
create foreign mappings.

The newly added helpers use the same parameters as the existing
{alloc/free}_xenballooned_pages functions, which allows for in-place
replacement of the callers. Once a memory region has been added to be
used as scratch mapping space it will no longer be released, and pages
returned are kept in a linked list. This allows to have a buffer of
pages and prevents resorting to frequent additions and removals of
regions.

If enabled (because ZONE_DEVICE is supported) the usage of the new
functionality untangles Xen balloon and RAM hotplug from the usage of
unpopulated physical memory ranges to map foreign pages, which is the
correct thing to do in order to avoid mappings of foreign pages depend
on memory hotplug.

Note the driver is currently not enabled on Arm platforms because it
would interfere with the identity mapping required on some platforms.

Signed-off-by: Roger Pau Monné <roger.pau@citrix.com>
Reviewed-by: Juergen Gross <jgross@suse.com>
Link: https://lore.kernel.org/r/20200901083326.21264-4-roger.pau@citrix.com
Signed-off-by: Juergen Gross <jgross@suse.com>
2020-09-04 10:00:01 +02:00
Dave Airlie
d37d569200 Merge branch 'linux-5.9' of git://github.com/skeggsb/linux into drm-fixes
A couple of minor fixes to the display changes that went in for 5.9.
The most important of which is a workaround for a HW bug that was
exposed by better push buffer space management, leading to
random(ish...) display engine hangs.

Signed-off-by: Dave Airlie <airlied@redhat.com>
From: Ben Skeggs <skeggsb@gmail.com>
Link: https://patchwork.freedesktop.org/patch/msgid/ <CACAvsv5QDxyMihrxbPk+-sORnaYtjR6_dbM68gEhb2wxht_G1w@mail.gmail.com
2020-09-04 11:14:49 +10:00
Dave Airlie
0f8aeef1a5 Merge tag 'drm-intel-fixes-2020-09-03' of git://anongit.freedesktop.org/drm/drm-intel into drm-fixes
drm/i915 fixes for v5.9-rc4:
- Clang build warning fix
- HDCP fixes

Signed-off-by: Dave Airlie <airlied@redhat.com>
From: Jani Nikula <jani.nikula@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/87sgbz2pnx.fsf@intel.com
2020-09-04 11:00:48 +10:00
Linus Walleij
f71800228d drm/tve200: Stabilize enable/disable
The TVE200 will occasionally print a bunch of lost interrupts
and similar dmesg messages, sometimes during boot and sometimes
after disabling and coming back to enablement. This is probably
because the hardware is left in an unknown state by the boot
loader that displays a logo.

This can be fixed by bringing the controller into a known state
by resetting the controller while enabling it. We retry reset 5
times like the vendor driver does. We also put the controller
into reset before de-clocking it and clear all interrupts before
enabling the vblank IRQ.

This makes the video enable/disable/enable cycle rock solid
on the D-Link DIR-685. Tested extensively.

Signed-off-by: Linus Walleij <linus.walleij@linaro.org>
Acked-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Cc: stable@vger.kernel.org
Link: https://patchwork.freedesktop.org/patch/msgid/20200820203144.271081-1-linus.walleij@linaro.org
2020-09-03 21:14:09 +02:00
Alex Deucher
11bc98bd71 drm/amdgpu/mmhub2.0: print client id string for mmhub
Print the name of the client rather than the number.  This
makes it easier to debug what block is causing the fault.

Reviewed-by: Huang Rui <ray.huang@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2020-09-03 14:48:34 -04:00
Alex Deucher
02f23f5f7c drm/amdgpu/gmc9: print client id string for mmhub
Print the name of the client rather than the number.  This
makes it easier to debug what block is causing the fault.

Reviewed-by: Huang Rui <ray.huang@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2020-09-03 14:48:29 -04:00
Alex Deucher
93fabd84c9 drm/amdgpu/gmc10: print client id string for gfxhub
Print the name of the client rather than the number.  This
makes it easier to debug what block is causing the fault.

Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2020-09-03 14:48:27 -04:00