When looking at SDEISR to determine the connection status of combo
outputs, we should use the phy index rather than the port index.
Although they're usually the same thing, EHL's DDI-D (port D) is
attached to PHY-A and SDEISR doesn't even have bits for a "D" output.
It's also possible that future platforms may map DDIs (the internal
display engine programming units) to PHYs (the output handling on the IO
side) in ways where port!=phy, so let's look at the PHY index by
default.
v2: Rename to intel_combo_phy_connected. (Lucas)
Fixes: 719d240026 ("drm/i915/ehl: Enable DDI-D")
Cc: José Roberto de Souza <jose.souza@intel.com>
Cc: Lucas De Marchi <lucas.demarchi@intel.com>
Signed-off-by: Matt Roper <matthew.d.roper@intel.com>
Reviewed-by: Lucas De Marchi <lucas.demarchi@intel.com>
Reviewed-by: José Roberto de Souza <jose.souza@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20191127221314.575575-2-matthew.d.roper@intel.com
(cherry picked from commit 3d1e388d40)
Signed-off-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
- remove unneeded asm headers from hexagon, ia64
- add 'dir-pkg' target, which works like 'tar-pkg' but skips archiving
- add 'helpnewconfig' target, which shows help for new CONFIG options
- support 'make nsdeps' for external modules
- make rebuilds faster by deleting $(wildcard $^) checks
- remove compile tests for kernel-space headers
- refactor modpost to simplify modversion handling
- make single target builds faster
- optimize and clean up scripts/kallsyms.c
- refactor various Makefiles and scripts
-----BEGIN PGP SIGNATURE-----
iQJJBAABCgAzFiEEbmPs18K1szRHjPqEPYsBB53g2wYFAl3lKCUVHG1hc2FoaXJv
eUBrZXJuZWwub3JnAAoJED2LAQed4NsGu9sP/iTW/RjDxbAsu0aP8jFqzLK/xKB/
NQn/+dD76TjEmjgew9AXszf2rJL+ixKVymGM08FV59Bbguvi8XmAB/QXK21Sjb5j
rVl3N97TWNkvXM+QJyly23G2UtbubRSPo3g+e70BZrw3lcmrsK+sAmTOL5KtIrNX
9BHM803JwqsMJyvBwTBBw3UFeeBqb38Qx6gmigfSihuDf6pvjoVDKskpsDno3wX7
rdiXYxAsKQLQ/P2ym/bV/Oqe90RqRtV/2/WCpLshlwHkiM9huflv6GjgCkkbAx5H
N3TSptlS7l/2B/XKHgA5ALjHjUlxTGBzLLoevarCd8loKcQXFlgx+vd3nM/WJlHJ
x9UpTklDwGP9eUBsa9W980tEyUVsFGMAC8EcTdW6NN2IRtuCOSA5N2FYYt8/SDd0
2b3PhElTJIp4pTWSYN6JZxB1R8n/YBgxLqOJ6N2U6B9CdKFUCHlwGH23QfN89km/
WEMP85bsaab/dnyxbwelkoYYYyPgUHsC13AbpkHdrDxMbAGO+G1PwpHxC6ErF2en
wRGrcUxWTfHRykO5aJIQtCB9b1fv73134mTzB5fTYd6GtjepGBSBCO9xb2Iy4sc9
Y+nHVVDUrihvSOpJgqh677PcLDutOZR8fFCoc1ZMDAbBsDvrb0Qsee6oEidj98xc
5kXp9YZh/tdh/tdo
=zUaB
-----END PGP SIGNATURE-----
Merge tag 'kbuild-v5.5' of git://git.kernel.org/pub/scm/linux/kernel/git/masahiroy/linux-kbuild
Pull Kbuild updates from Masahiro Yamada:
- remove unneeded asm headers from hexagon, ia64
- add 'dir-pkg' target, which works like 'tar-pkg' but skips archiving
- add 'helpnewconfig' target, which shows help for new CONFIG options
- support 'make nsdeps' for external modules
- make rebuilds faster by deleting $(wildcard $^) checks
- remove compile tests for kernel-space headers
- refactor modpost to simplify modversion handling
- make single target builds faster
- optimize and clean up scripts/kallsyms.c
- refactor various Makefiles and scripts
* tag 'kbuild-v5.5' of git://git.kernel.org/pub/scm/linux/kernel/git/masahiroy/linux-kbuild: (59 commits)
MAINTAINERS: update Kbuild/Kconfig maintainer's email address
scripts/kallsyms: remove redundant initializers
scripts/kallsyms: put check_symbol_range() calls close together
scripts/kallsyms: make check_symbol_range() void function
scripts/kallsyms: move ignored symbol types to is_ignored_symbol()
scripts/kallsyms: move more patterns to the ignored_prefixes array
scripts/kallsyms: skip ignored symbols very early
scripts/kallsyms: add const qualifiers where possible
scripts/kallsyms: make find_token() return (unsigned char *)
scripts/kallsyms: replace prefix_underscores_count() with strspn()
scripts/kallsyms: add sym_name() to mitigate cast ugliness
scripts/kallsyms: remove unneeded length check for prefix matching
scripts/kallsyms: remove redundant is_arm_mapping_symbol()
scripts/kallsyms: set relative_base more effectively
scripts/kallsyms: shrink table before sorting it
scripts/kallsyms: fix definitely-lost memory leak
scripts/kallsyms: remove unneeded #ifndef ARRAY_SIZE
kbuild: make single target builds even faster
modpost: respect the previous export when 'exported twice' is warned
modpost: do not set ->preloaded for symbols from Module.symvers
...
Similar to for i915_active.mutex, we require each class of i915_active
to have distinct lockdep chains as some, but by no means all,
i915_active are used within the shrinker and so have much more severe
usage constraints. By using a lockclass local to i915_active_init() all
i915_active workers have the same lock class, and we may generate false
positives when waiting for the i915_active. If we push the lockclass
into the caller, each class of i915_active will have distinct lockdep
chains.
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Acked-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20191202140133.2444217-1-chris@chris-wilson.co.uk
The unaligned ioread32() will make us read byte by byte looking for the
vbt. We could just as well have done a ioread8() + a shift and avoid the
extra confusion on how we are looking for "$VBT".
However when using ACPI it's guaranteed the VBT is 4-byte aligned
per spec, so we can probably assume it here as well.
v2: do not try to simplify the loop by eliminating the auxiliary counter
(Jani and Ville)
Signed-off-by: Lucas De Marchi <lucas.demarchi@intel.com>
Reviewed-by: Jani Nikula <jani.nikula@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20191126225110.8127-4-lucas.demarchi@intel.com
We don't need to keep the pci rom mapped during the entire
intel_bios_init() anymore. Move it to the previous copy_vbt() function
and rename it to oprom_get_vbt() since now it's responsible to to all
operations related to get the vbt from the oprom.
v2: fix double __iomem attribute detected by sparse
v3: fix missing unmap on success (Ville)
Signed-off-by: Lucas De Marchi <lucas.demarchi@intel.com>
Reviewed-by: Jani Nikula <jani.nikula@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20191126225110.8127-3-lucas.demarchi@intel.com
When we map the VBT through pci_map_rom() we may not be allowed
to simply discard the address space and go on reading the memory.
That doesn't work on my test system, but by dumping the rom via
sysfs I can can get the correct vbt. So change our find_vbt() to do
the same as done by pci_read_rom(), i.e. use memcpy_fromio().
v2: the just the minimal changes by not bothering with the unaligned io
reads: this can be done on top (from Ville and Jani)
v3: drop const in function return since now we are copying the vbt,
rather than just finding it
Signed-off-by: Lucas De Marchi <lucas.demarchi@intel.com>
Reviewed-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20191126225110.8127-2-lucas.demarchi@intel.com
From VBT 228+ this is block that PSR and other power saving
features configuration should be read from.
v3:
Using DRRS from this new block
v4:
Using BIT()
Fixing DRRS comment in parse_power_conservation_features()
Cc: Matt Roper <matthew.d.roper@intel.com>
Cc: Gwan-gyeong Mun <gwan-gyeong.mun@intel.com>
Reviewed-by: Matt Roper <matthew.d.roper@intel.com>
Signed-off-by: José Roberto de Souza <jose.souza@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20191128014852.214135-5-jose.souza@intel.com
eDP specification states that sink can have its PSR capability
changed, I have never found any panel doing that but lets add that
for completeness.
For now it is not reading back the PSR capabilities and if possible
re-enabling PSR, this will be added if a panel is found using this
feature.
v4:
Cleaning DP_PSR_CAPS_CHANGE
Reviewed-by: Matt Roper <matthew.d.roper@intel.com>
Cc: Gwan-gyeong Mun <gwan-gyeong.mun@intel.com>
Signed-off-by: José Roberto de Souza <jose.souza@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20191128014852.214135-4-jose.souza@intel.com
When this error happens sink link is not stable after the required
FW_EXIT_LATENCY period so it will miss the selective update.
As the other PSR errors, for now we are not trying to recover from
it.
Cc: Gwan-gyeong Mun <gwan-gyeong.mun@intel.com>
Reviewed-by: Matt Roper <matthew.d.roper@intel.com>
Signed-off-by: José Roberto de Souza <jose.souza@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20191128014852.214135-3-jose.souza@intel.com
eDP spec states that when sink enconters a problem that prevents it
to keep PSR running it should set PSR status to internal error and
set the reason why it happen to PSR_ERROR_STATUS but it is not how it
was implemented.
But also I don't want to change this behavior, who knows if there is
a panel out there that only set the PSR_ERROR_STATUS.
So here refactoring the code a bit to make more easy to read what was
state above as more checks will be added to this function.
v2:
returning a int instead of a bool in psr_get_status_and_error_status()
Cc: Gwan-gyeong Mun <gwan-gyeong.mun@intel.com>
Cc: Matt Roper <matthew.d.roper@intel.com>
Reviewed-by: Matt Roper <matthew.d.roper@intel.com>
Signed-off-by: José Roberto de Souza <jose.souza@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20191128014852.214135-2-jose.souza@intel.com
PSR2 HW only support a limited number of bits per pixel, if mode has
more than supported PSR2 should not be enabled.
BSpec: 50422
BSpec: 7713
Cc: Gwan-gyeong Mun <gwan-gyeong.mun@intel.com>
Cc: Matt Roper <matthew.d.roper@intel.com>
Reviewed-by: Lucas De Marchi <lucas.demarchi@intel.com>
Reviewed-by: Matt Roper <matthew.d.roper@intel.com>
Signed-off-by: José Roberto de Souza <jose.souza@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20191128014852.214135-1-jose.souza@intel.com
The "err" label is not really "err", but rather "out" since the return
path is shared between error condition and normal path. This broke when
commit 03cea61076 ("drm/i915/dsb: fix extra warning on error path
handling") added a "dsb->cmd_buf = NULL;" there, making DSB to stop
working since now all writes would pass-through via mmio.
Remove the set to NULL since it's actually not needed: we only set it if
all steps are successful. While at it, rename the label so this confusion
doesn't happen again.
Fixes: 03cea61076 ("drm/i915/dsb: fix extra warning on error path handling")
Resolves: https://gitlab.freedesktop.org/drm/intel/issues/8
Signed-off-by: Lucas De Marchi <lucas.demarchi@intel.com>
Reviewed-by: Animesh Manna <animesh.manna@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20191127221119.384754-1-lucas.demarchi@intel.com
Use the canonical "[CRTC:%d:%s]" format for the obj id/name
in the debugfs display_info dump. Everyone should already be
familiar with the format since it's used in the debug logs
extensively.
Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20191129185434.25549-8-ville.syrjala@linux.intel.com
Reviewed-by: Ramalingam C <ramalingam.c@intel.com>
No point in repeating the crtc mode for each cloned encoder.
Just print it once, and avoid using multiple lines for it.
And while at let's polish the fixed mode print to fit on
one line as well.
Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20191129185434.25549-6-ville.syrjala@linux.intel.com
Reviewed-by: Ramalingam C <ramalingam.c@intel.com>
The bspec tells us 'Program SHPD_FILTER_CNT with the "500 microseconds
adjusted" value before enabling hotplug detection' on CNP+. We haven't
been touching this register at all thus far, but we should probably
follow the bspec's guidance.
The register also exists on LPT and SPT, but there isn't any specific
guidance I can find on how we should be programming it there so let's
leave it be for now.
Bspec: 4342
Bspec: 31297
Bspec: 8407
Bspec: 49305
Bspec: 50473
Signed-off-by: Matt Roper <matthew.d.roper@intel.com>
Reviewed-by: José Roberto de Souza <jose.souza@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20191127221314.575575-3-matthew.d.roper@intel.com
When looking at SDEISR to determine the connection status of combo
outputs, we should use the phy index rather than the port index.
Although they're usually the same thing, EHL's DDI-D (port D) is
attached to PHY-A and SDEISR doesn't even have bits for a "D" output.
It's also possible that future platforms may map DDIs (the internal
display engine programming units) to PHYs (the output handling on the IO
side) in ways where port!=phy, so let's look at the PHY index by
default.
v2: Rename to intel_combo_phy_connected. (Lucas)
Fixes: 719d240026 ("drm/i915/ehl: Enable DDI-D")
Cc: José Roberto de Souza <jose.souza@intel.com>
Cc: Lucas De Marchi <lucas.demarchi@intel.com>
Signed-off-by: Matt Roper <matthew.d.roper@intel.com>
Reviewed-by: Lucas De Marchi <lucas.demarchi@intel.com>
Reviewed-by: José Roberto de Souza <jose.souza@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20191127221314.575575-2-matthew.d.roper@intel.com
The South Display is part of the PCH so we should technically be basing
our port detection logic off the PCH in use rather than the platform
generation.
Cc: Lucas De Marchi <lucas.demarchi@intel.com>
Signed-off-by: Matt Roper <matthew.d.roper@intel.com>
Reviewed-by: José Roberto de Souza <jose.souza@intel.com>
Reviewed-by: Lucas De Marchi <lucas.demarchi@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20191127221314.575575-1-matthew.d.roper@intel.com
LPT/WPT only have PCH transcoder A. Make sure we poke at its
chicken register instead of some non-existent register when
FDI is being driven by pipe B or C.
Cc: Uma Shankar <uma.shankar@intel.com>
Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20191128182358.14477-1-ville.syrjala@linux.intel.com
Reviewed-by: Uma Shankar <uma.shankar@intel.com>
Though the context is closed and so no more requests can be added to the
timeline, retirement can still be removing requests. It can even be
removing the very request we are inspecting and so cause us to wander
into dead links.
Serialise with the retirement by taking the timeline->mutex used for
guarding the timeline->requests list.
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=112404
Fixes: 4a31741521 ("drm/i915/gem: Refine occupancy test in kill_context()")
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Cc: Mika Kuoppala <mika.kuoppala@linux.intel.com>
Cc: Matthew Auld <matthew.auld@intel.com>
Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20191129151845.1092933-1-chris@chris-wilson.co.uk
(cherry picked from commit 7ce596a803)
Signed-off-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Correct valid command length check for MI_ATOMIC, need to check inline
data available field instead of operand data length for whole command.
Fixes: 00a33be406 ("drm/i915/gvt: Add valid length check for MI variable commands")
Reported-by: Alex Williamson <alex.williamson@redhat.com>
Acked-by: Gao Fred <fred.gao@intel.com>
Cc: stable@vger.kernel.org
Signed-off-by: Zhenyu Wang <zhenyuw@linux.intel.com>
As the gen6 page directory is written on binding and after every update,
the code ended up duplicated. Refactor the code into a single routine to
share the locking and serialisation.
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Acked-by: Mika Kuoppala <mika.kuoppala@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20191201140916.2128905-1-chris@chris-wilson.co.uk
Now that many threads may try to use the same mmio to flush the global
buffers after updating the PTE, serialise access to the mmio to prevent
concurrent access on gen7.
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Acked-by: Mika Kuoppala <mika.kuoppala@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20191130162320.1683424-1-chris@chris-wilson.co.uk
After much hair pulling, resort to preallocating the ppGTT entries on
init to circumvent the apparent lack of PD invalidate following the
write to PP_DCLV upon switching mm between contexts (and here the same
context after binding new objects). However, the details of that PP_DCLV
invalidate are still unknown, and it appears we need to reload the mm
twice to cover over a timing issue. Worrying.
Fixes: 3dc007fe9b ("drm/i915/gtt: Downgrade gen7 (ivb, byt, hsw) back to aliasing-ppgtt")
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Acked-by: Mika Kuoppala <mika.kuoppala@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20191129201328.1398583-1-chris@chris-wilson.co.uk
Keep the engine awake and so avoid frequent cycling in and out of
powersaving mode to eliminate the unnecessary overhead and speed up the
testing.
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Matthew Auld <matthew.auld@intel.com>
Reviewed-by: Matthew Auld <matthew.auld@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20191129222702.1456292-1-chris@chris-wilson.co.uk
As we only cancel the timers asynchronously, they may
still be running on another CPU as we shutdown, raising one last
softirq. So be safe and make sure the tasklet is flushed before
destroying the engine's memory.
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Acked-by: Mika Kuoppala <mika.kuoppala@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20191129172542.1222810-1-chris@chris-wilson.co.uk
Though the context is closed and so no more requests can be added to the
timeline, retirement can still be removing requests. It can even be
removing the very request we are inspecting and so cause us to wander
into dead links.
Serialise with the retirement by taking the timeline->mutex used for
guarding the timeline->requests list.
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=112404
Fixes: 4a31741521 ("drm/i915/gem: Refine occupancy test in kill_context()")
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Cc: Mika Kuoppala <mika.kuoppala@linux.intel.com>
Cc: Matthew Auld <matthew.auld@intel.com>
Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20191129151845.1092933-1-chris@chris-wilson.co.uk
skl_commit_modeset_enables() straight up compares dirty_pipes
with a bitmask of already committed pipes. If we set bits in
dirty_pipes for non-existent pipes that comparison will never
work right. So let's limit ourselves to bits that exist.
And we'll do the same for the active_pipes_changed bitmask.
Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20191011200949.7839-5-ville.syrjala@linux.intel.com
Reviewed-by: Stanislav Lisovskiy <stanislav.lisovskiy@intel.com>
Since commit c45e788d95 ("drm/i915/tgl: Suspend pre-parser across GTT
invalidations"), we now disable the advanced preparser on Tigerlake for the
invalidation phase at the start of the batch, we no longer need to emit
the GPU relocations from a second context as they are now flushed inlined.
References: 8a9a982767 ("drm/i915: use a separate context for gpu relocs")
References: c45e788d95 ("drm/i915/tgl: Suspend pre-parser across GTT invalidations")
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Mika Kuoppala <mika.kuoppala@linux.intel.com>
Cc: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com>
Reviewed-by: Mika Kuoppala <mika.kuoppala@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20191129124846.949100-1-chris@chris-wilson.co.uk
Implement Wa_1604555607 (set the DS pairing timer to 128 cycles).
FF_MODE2 is part of the register state context, that's why it is
implemented here.
At TGL A0 stepping, FF_MODE2 register read back is broken, hence
disabling the WA verification.
v2: Rebased on top of the WA refactoring (Oscar)
v3: Correctly add to ctx_workarounds_init (Michel)
v4:
uncore read is used [Tvrtko]
Macros as used for MASK definition [Chris]
v5:
Skip the Wa_1604555607 verification [Ram]
i915 ptr retrieved from engine. [Tvrtko]
v6:
Added wa_add as a wrapper for __wa_add [Chris]
wa_add is directly called instead of new wrapper [tvrtko]
BSpec: 19363
HSDES: 1604555607
Signed-off-by: Michel Thierry <michel.thierry@intel.com>
Signed-off-by: Ramalingam C <ramlingam.c@intel.com>
Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com> [v5]
Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20191128021005.3350-1-ramalingam.c@intel.com
One does not lightly add a new hidden struct_mutex dependency deep within
the execbuf bowels! The immediate suspicion in seeing the whitelist
cached on the context, is that it is intended to be preserved between
batches, as the kernel is quite adept at caching small allocations
itself. But no, it's sole purpose is to serialise command submission in
order to save a kmalloc on a slow, slow path!
By removing the whitelist dependency from the context, our freedom to
chop the big struct_mutex is greatly augmented.
v2: s/set_bit/__set_bit/ as the whitelist shall never be accessed
concurrently.
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Cc: Mika Kuoppala <mika.kuoppala@linux.intel.com>
Reviewed-by: Mika Kuoppala <mika.kuoppala@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20191128113424.3885958-1-chris@chris-wilson.co.uk
-----BEGIN PGP SIGNATURE-----
iQIcBAABAgAGBQJd3YntAAoJEAx081l5xIa+dcQP/ikABkpm+q23FLKteRpL1rtX
xqlg5+KHW+YVCDls2BrINF6vYzyisoa8fNPlKMmOHse/IgMhFe9vBbCj1KQQOUR1
apNycI1wrcw/mn2WDikoIcF6C5cjqK9YVknnYoM6HnF1VmpGd1ecSGrOHrunEkrK
cMAWYIeqWGU8Gj/HUOitAFpLWFUMNle0BJuRoGLcoMusgS8yuCIEcpNzRhgL8fvJ
bW4imuyv24OjPoQzbKD0oQ0VIP86H0eM4LIeGZ2uyK/BSPKmMDqI4z4isUheS7RL
w4a6BdobMIdhew5dBXS0LsUJ3JniVJdHy123q9KgpmQAhGpiNoLT6BujfoUTUeWx
Mu0vM8Xmv9n4npdBYC+fLEFQXYJlu9uBA490jP84Kz6Fg1c6GyBebDY7/c2O4Zmg
7pvygmUF6boD6v2sIC/3161crgwU4g8zoxm2V4i9naxes2QB13LiEuJWlaI/FdxY
fd3zpglFGdoF1ThNne4QDh6gMKpXvjITyu/QxZeZ67Dt6i0Aqw9cRGHSpiVhYyDc
cx2hAp+rDvUi5SHkJKFpVImjB2DDn2xUG2uFMHz0cy9wNg203L3fRDi0hVtnM1+W
VpCxyLs2Upz6kEjDRVsfMZ9chCcWAWpVuKhtuuMUDw/IKnbP3uV8kzgJpVpaRVkD
76s5uYWHHBlk1IVlkOUP
=Hj7G
-----END PGP SIGNATURE-----
Merge tag 'drm-next-2019-11-27' of git://anongit.freedesktop.org/drm/drm
Pull drm updates from Dave Airlie:
"Lots of stuff in here, though it hasn't been too insane this merge
apart from dealing with the security fun.
uapi:
- export different colorspace properties on DP vs HDMI
- new fourcc for ARM 16x16 block format
- syncobj: allow querying last submitted timeline value
- DRM_FORMAT_BIG_ENDIAN defined as unsigned
core:
- allow using gem vma manager in ttm
- connector/encoder/bridge doc fixes
- allow more than 3 encoders for a connector
- displayport mst suspend/resume reprobing support
- vram lazy unmapping, uniform vram mm and gem vram
- edid cleanups + AVI informframe bar info
- displayport helpers - dpcd parser added
dp_cec:
- Allow a connector to be associated with a cec device
ttm:
- pipelining with no_gpu_wait fix
- always keep BOs on the LRU
sched:
- allow free_job routine to sleep
i915:
- Block userptr from mappable GTT
- i915 perf uapi versioning
- OA stream dynamic reconfiguration
- make context persistence optional
- introduce DRM_I915_UNSTABLE Kconfig
- add fake lmem testing under unstable
- BT.2020 support for DP MSA
- struct mutex elimination
- Tigerlake display/PLL/power management improvements
- Jasper Lake PCH support
- refactor PMU for multiple GPUs
- Icelake firmware update
- Split out vga + switcheroo code
amdgpu:
- implement dma-buf import/export without helpers
- vega20 RAS enablement
- DC i2c over aux fixes
- renoir GPU reset
- DC HDCP support
- BACO support for CI/VI asics
- MSI-X support
- Arcturus EEPROM support
- Arcturus VCN encode support
- VCN dynamic powergating on RV/RV2
amdkfd:
- add navi12/14/renoir support to kfd
radeon:
- SI dpm fix ported from amdgpu
- fix bad DMA on ppc platforms
gma500:
- memory leak fixes
qxl:
- convert to new gem mmap
exynos:
- build warning fix
komeda:
- add aclk sysfs attribute
v3d:
- userspace cleanup uapi change
i810:
- fix for underflow in dispatch ioctls
ast:
- refactor show_cursor
mgag200:
- refactor show_cursor
arcgpu:
- encoder finding improvements
mediatek:
- mipi_tx, dsi and partial crtc support for MT8183 SoC
- rotation support
meson:
- add suspend/resume support
omap:
- misc refactors
tegra:
- DisplayPort support for Tegra 210, 186 and 194.
- IOMMU-backed DMA API fixes
panfrost:
- fix lockdep issue
- simplify devfreq integration
rcar-du:
- R8A774B1 SoC support
- fixes for H2 ES2.0
sun4i:
- vcc-dsi regulator support
virtio-gpu:
- vmexit vs spinlock fix
- move to gem shmem helpers
- handle large command buffers with cma"
* tag 'drm-next-2019-11-27' of git://anongit.freedesktop.org/drm/drm: (1855 commits)
drm/amdgpu: invalidate mmhub semaphore workaround in gmc9/gmc10
drm/amdgpu: initialize vm_inv_eng0_sem for gfxhub and mmhub
drm/amd/amdgpu/sriov skip RLCG s/r list for arcturus VF.
drm/amd/amdgpu/sriov temporarily skip ras,dtm,hdcp for arcturus VF
drm/amdgpu/gfx10: re-init clear state buffer after gpu reset
merge fix for "ftrace: Rework event_create_dir()"
drm/amdgpu: Update Arcturus golden registers
drm/amdgpu/gfx10: fix out-of-bound mqd_backup array access
drm/amdgpu/gfx10: explicitly wait for cp idle after halt/unhalt
Revert "drm/amd/display: enable S/G for RAVEN chip"
drm/amdgpu: disable gfxoff on original raven
drm/amdgpu: remove experimental flag for Navi14
drm/amdgpu: disable gfxoff when using register read interface
drm/amdgpu/powerplay: properly set PP_GFXOFF_MASK (v2)
drm/amdgpu: fix bad DMA from INTERRUPT_CNTL2
drm/radeon: fix bad DMA from INTERRUPT_CNTL2
drm/amd/display: Fix debugfs on MST connectors
drm/amdgpu/nv: add asic func for fetching vbios from rom directly
drm/amdgpu: put flush_delayed_work at first
drm/amdgpu/vcn2.5: fix the enc loop with hw fini
...
The design of our interrupt handlers is that we ack the receipt of the
interrupt first, inside the critical section where the master interrupt
control is off and other cpus cannot start processing the next
interrupt; and then process the interrupt events afterwards. However,
Icelake introduced a whole new set of banked GT_IIR that are inherently
serialised and slow to retrieve the IIR and must be processed within the
critical section. We can still push our breadcrumbs out of this critical
section by using our irq_worker. On bdw+, this should not make too much
of a difference as we only slightly defer the breadcrumbs, but on icl+
this should make a big difference to our throughput of interrupts from
concurrently executing engines.
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20191127115813.3345823-1-chris@chris-wilson.co.uk
The expected downside to commit 58b4c1a07a ("drm/i915: Reduce nested
prepare_remote_context() to a trylock") was that it would need to return
-EAGAIN to userspace in order to resolve potential mutex inversion. Such
an unsightly round trip is unnecessary if we could atomically insert a
barrier into the i915_active_fence, so make it happen.
Currently, we use the timeline->mutex (or some other named outer lock)
to order insertion into the i915_active_fence (and so individual nodes
of i915_active). Inside __i915_active_fence_set, we only need then
serialise with the interrupt handler in order to claim the timeline for
ourselves.
However, if we remove the outer lock, we need to ensure the order is
intact between not only multiple threads trying to insert themselves
into the timeline, but also with the interrupt handler completing the
previous occupant. We use xchg() on insert so that we have an ordered
sequence of insertions (and each caller knows the previous fence on
which to wait, preserving the chain of all fences in the timeline), but
we then have to cmpxchg() in the interrupt handler to avoid overwriting
the new occupant. The only nasty side-effect is having to temporarily
strip off the RCU-annotations to apply the atomic operations, otherwise
the rules are much more conventional!
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=112402
Fixes: 58b4c1a07a ("drm/i915: Reduce nested prepare_remote_context() to a trylock")
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20191127134527.3438410-1-chris@chris-wilson.co.uk
Now that we rapidly park the GT when the GPU idles, we often find
ourselves idling faster than the RC6 promotion timer. Thus if we tell
the GPU to enter RC6 manually as we park, we can do so quicker (by
around 50ms, half an EI on average) and marginally increase our
powersaving across all execlists platforms.
v2: Now with a selftest to check we can enter RC6 manually
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Andi Shyti <andi.shyti@intel.com>
Cc: Mika Kuoppala <mika.kuoppala@linux.intel.com>
Cc: Imre Deak <imre.deak@intel.com>
Acked-by: Imre Deak <imre.deak@intel.com>
Reviewed-by: Andi Shyti <andi.shyti@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20191127095657.3209854-1-chris@chris-wilson.co.uk
On context retiring, we may invoke the kernel_context to unpin this
context. Elsewhere, we may use the kernel_context to modify this
context. This currently leads to an AB-BA lock inversion, so we need to
back-off from the contended lock, and repeat.
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=111732
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Fixes: a9877da2d6 ("drm/i915/oa: Reconfigure contexts on the fly")
Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20191126065521.2331017-1-chris@chris-wilson.co.uk
(cherry picked from commit 58b4c1a07a)
Signed-off-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Based on a sampling of a number of benchmarks across platforms, by
default opt for a much more lenient timeout so that we should not
adversely affect existing "good" clients.
640ms ought to be enough for anyone.
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=112169
Fixes: 3a7a92aba8 ("drm/i915/execlists: Force preemption")
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Cc: Eero Tamminen <eero.t.tamminen@intel.com>
Cc: Dmitry Rogozhkin <dmitry.v.rogozhkin@intel.com>
Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20191125162737.2161069-1-chris@chris-wilson.co.uk
(cherry picked from commit 5766a5ffc6)
Signed-off-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
The BitField of non privilege register address is only from bit 2 to 25.
v2: use REG_GENMASK instead. (Zhenyu)
Signed-off-by: Gao, Fred <fred.gao@intel.com>
Reviewed-by: Zhenyu Wang <zhenyuw@linux.intel.com>
Signed-off-by: Zhenyu Wang <zhenyuw@linux.intel.com>
During the Display Interrupt Service routine the Display Interrupt
Enable bit must be disabled, The interrupts handled, then the
Display Interrupt Enable bit must be set to prevent possible missed
interrupts.
Bspec: 49212
V2: Change Title to remove SDE reference.
V3: Fix TAB spacing.
Cc: Lucas De Marchi <lucas.demarchi@intel.com>
Cc: Aditya Swarup <aditya.swarup@intel.com>
Reviewed-by: Matt Roper <matthew.d.roper@intel.com>
Signed-off-by: Clint Taylor <clinton.a.taylor@intel.com>
Signed-off-by: Lucas De Marchi <lucas.demarchi@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20191121201455.2558-1-clinton.a.taylor@intel.com
Starting with gen12, PORT_A can be connected to a transcoder
with audio support. Modify the existing logic that disabled
audio on PORT_A unconditionally.
Signed-off-by: Kai Vehmanen <kai.vehmanen@linux.intel.com>
Reviewed-by: Matt Roper <matthew.d.roper@intel.com>
Signed-off-by: Matt Roper <matthew.d.roper@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20191125125313.17584-1-kai.vehmanen@linux.intel.com
Pull locking updates from Ingo Molnar:
"The main changes in this cycle were:
- A comprehensive rewrite of the robust/PI futex code's exit handling
to fix various exit races. (Thomas Gleixner et al)
- Rework the generic REFCOUNT_FULL implementation using
atomic_fetch_* operations so that the performance impact of the
cmpxchg() loops is mitigated for common refcount operations.
With these performance improvements the generic implementation of
refcount_t should be good enough for everybody - and this got
confirmed by performance testing, so remove ARCH_HAS_REFCOUNT and
REFCOUNT_FULL entirely, leaving the generic implementation enabled
unconditionally. (Will Deacon)
- Other misc changes, fixes, cleanups"
* 'locking-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (27 commits)
lkdtm: Remove references to CONFIG_REFCOUNT_FULL
locking/refcount: Remove unused 'refcount_error_report()' function
locking/refcount: Consolidate implementations of refcount_t
locking/refcount: Consolidate REFCOUNT_{MAX,SATURATED} definitions
locking/refcount: Move saturation warnings out of line
locking/refcount: Improve performance of generic REFCOUNT_FULL code
locking/refcount: Move the bulk of the REFCOUNT_FULL implementation into the <linux/refcount.h> header
locking/refcount: Remove unused refcount_*_checked() variants
locking/refcount: Ensure integer operands are treated as signed
locking/refcount: Define constants for saturation and max refcount values
futex: Prevent exit livelock
futex: Provide distinct return value when owner is exiting
futex: Add mutex around futex exit
futex: Provide state handling for exec() as well
futex: Sanitize exit state handling
futex: Mark the begin of futex exit explicitly
futex: Set task::futex_state to DEAD right after handling futex exit
futex: Split futex_mm_release() for exit/exec
exit/exec: Seperate mm_release()
futex: Replace PF_EXITPIDONE with a state
...
Pull RCU updates from Ingo Molnar:
"The main changes in this cycle were:
- Dynamic tick (nohz) updates, perhaps most notably changes to force
the tick on when needed due to lengthy in-kernel execution on CPUs
on which RCU is waiting.
- Linux-kernel memory consistency model updates.
- Replace rcu_swap_protected() with rcu_prepace_pointer().
- Torture-test updates.
- Documentation updates.
- Miscellaneous fixes"
* 'core-rcu-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (51 commits)
security/safesetid: Replace rcu_swap_protected() with rcu_replace_pointer()
net/sched: Replace rcu_swap_protected() with rcu_replace_pointer()
net/netfilter: Replace rcu_swap_protected() with rcu_replace_pointer()
net/core: Replace rcu_swap_protected() with rcu_replace_pointer()
bpf/cgroup: Replace rcu_swap_protected() with rcu_replace_pointer()
fs/afs: Replace rcu_swap_protected() with rcu_replace_pointer()
drivers/scsi: Replace rcu_swap_protected() with rcu_replace_pointer()
drm/i915: Replace rcu_swap_protected() with rcu_replace_pointer()
x86/kvm/pmu: Replace rcu_swap_protected() with rcu_replace_pointer()
rcu: Upgrade rcu_swap_protected() to rcu_replace_pointer()
rcu: Suppress levelspread uninitialized messages
rcu: Fix uninitialized variable in nocb_gp_wait()
rcu: Update descriptions for rcu_future_grace_period tracepoint
rcu: Update descriptions for rcu_nocb_wake tracepoint
rcu: Remove obsolete descriptions for rcu_barrier tracepoint
rcu: Ensure that ->rcu_urgent_qs is set before resched IPI
workqueue: Convert for_each_wq to use built-in list check
rcu: Several rcu_segcblist functions can be static
rcu: Remove unused function hlist_bl_del_init_rcu()
Documentation: Rename rcu_node_context_switch() to rcu_note_context_switch()
...
According to BSpec 53998, there is a mask of
max 8 SAGV/QGV points we need to support.
Bumping this up to keep the CI happy(currently
preventing tests to run), until all SAGV
changes land.
v2: Fix second plane where QGV points were
hardcoded as well.
v3: Change the naming of I915_NUM_SAGV_POINTS
to be I915_NUM_QGV_POINTS, as more meaningful
(Ville Syrjälä)
Reviewed-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=112189
Signed-off-by: Stanislav Lisovskiy <stanislav.lisovskiy@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20191125160800.14740-1-stanislav.lisovskiy@intel.com
[vsyrjala: Add missing braces around else (checkpatch), fix Bugzilla tag]
Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Based on a sampling of a number of benchmarks across platforms, by
default opt for a much more lenient timeout so that we should not
adversely affect existing "good" clients.
640ms ought to be enough for anyone.
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=112169
Fixes: 3a7a92aba8 ("drm/i915/execlists: Force preemption")
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Cc: Eero Tamminen <eero.t.tamminen@intel.com>
Cc: Dmitry Rogozhkin <dmitry.v.rogozhkin@intel.com>
Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20191125162737.2161069-1-chris@chris-wilson.co.uk
In order to avoid some nasty mutex inversions, commit 09c5ab384f
("drm/i915: Keep rings pinned while the context is active") allowed the
intel_ring unpinning to be run concurrently with the next context
pinning it. Thus each step in intel_ring_unpin() needed to be atomic and
ordered in a nice onion with intel_ring_pin() so that the lifetimes
overlapped and were always safe.
Sadly, a few steps in intel_ring_unpin() were overlooked, such as
closing the read/write pointers of the ring and discarding the
intel_ring.vaddr, as these steps were not serialised with
intel_ring_pin() and so could leave the ring in disarray.
Fixes: 09c5ab384f ("drm/i915: Keep rings pinned while the context is active")
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Mika Kuoppala <mika.kuoppala@linux.intel.com>
Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20191118230254.2615942-6-chris@chris-wilson.co.uk
(cherry picked from commit a266bf4200)
Signed-off-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
The major drawback of commit 7e34f4e4aa ("drm/i915/gen8+: Add RC6 CTX
corruption WA") is that it disables RC6 while Skylake (and friends) is
active, and we do not consider the GPU idle until all outstanding
requests have been retired and the engine switched over to the kernel
context. If userspace is idle, this task falls onto our background idle
worker, which only runs roughly once a second, meaning that userspace has
to have been idle for a couple of seconds before we enable RC6 again.
Naturally, this causes us to consume considerably more energy than
before as powersaving is effectively disabled while a display server
(here's looking at you Xorg) is running.
As execlists will get a completion event as each context is completed,
we can use this interrupt to queue a retire worker bound to this engine
to cleanup idle timelines. We will then immediately notice the idle
engine (without userspace intervention or the aid of the background
retire worker) and start parking the GPU. Thus during light workloads,
we will do much more work to idle the GPU faster... Hopefully with
commensurate power saving!
v2: Watch context completions and only look at those local to the engine
when retiring to reduce the amount of excess work we perform.
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=112315
References: 7e34f4e4aa ("drm/i915/gen8+: Add RC6 CTX corruption WA")
References: 2248a28384 ("drm/i915/gen8+: Add RC6 CTX corruption WA")
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20191125105858.1718307-3-chris@chris-wilson.co.uk
(cherry picked from commit 4f88f8747f)
Signed-off-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
In the next patch, we will introduce a new asynchronous retirement
worker, fed by execlists CS events. Here we may queue a retirement as
soon as a request is submitted to HW (and completes instantly), and we
also want to process that retirement as early as possible and cannot
afford to postpone (as there may not be another opportunity to retire it
for a few seconds). To allow the new async retirer to run in parallel
with our submission, pull the __i915_request_queue (that passes the
request to HW) inside the timelines spinlock so that the retirement
cannot release the timeline before we have completed the submission.
v2: Actually to play nicely with engine_retire, we have to raise the
timeline.active_lock before releasing the HW. intel_gt_retire_requsts()
is still serialised by the outer lock so they cannot see this
intermediate state, and engine_retire is serialised by HW submission.
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20191125105858.1718307-2-chris@chris-wilson.co.uk
(cherry picked from commit 88a4655e75)
Signed-off-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
I rushed a last minute correction to cancel_port_requests() to prevent
the snooping of *execlists->active as the inflight array was being
updated, without noticing we iterated the inflight array starting from
active! Oops.
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=112387
Fixes: 97f9af78f3 ("drm/i915/gt: Mark the execlists->active as the primary volatile access")
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Mika Kuoppala <mika.kuoppala@linux.intel.com>
Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Reviewed-by: Andi Shyti <andi.shyti@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20191125112520.1760492-1-chris@chris-wilson.co.uk
(cherry picked from commit da0ef77e1e)
[Joonas: Fixed Fixes: tag to match drm-intel-next-fixes]
Signed-off-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Since we want to do a lockless read of the current active request, and
that request is written to by process_csb also without serialisation, we
need to instruct gcc to take care in reading the pointer itself.
Otherwise, we have observed execlists_active() to report 0x40.
[ 2400.760381] igt/para-4098 1..s. 2376479300us : process_csb: rcs0 cs-irq head=3, tail=4
[ 2400.760826] igt/para-4098 1..s. 2376479303us : process_csb: rcs0 csb[4]: status=0x00000001:0x00000000
[ 2400.761271] igt/para-4098 1..s. 2376479306us : trace_ports: rcs0: promote { b9c59:2622, b9c55:2624 }
[ 2400.761726] igt/para-4097 0d... 2376479311us : __i915_schedule: rcs0: -2147483648->3, inflight:0000000000000040, rq:ffff888208c1e940
which is impossible!
The answer is that as we keep the existing execlists->active pointing
into the array as we copy over that array, the unserialised read may see
a partial pointer value.
Fixes: df40306902 ("drm/i915/execlists: Lift process_csb() out of the irq-off spinlock")
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Mika Kuoppala <mika.kuoppala@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20191125094318.1630806-1-chris@chris-wilson.co.uk
(cherry picked from commit 331bf90591)
Signed-off-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
In commit a79ca656b6 ("drm/i915: Push the wakeref->count deferral to
the backend"), I erroneously concluded that we last modify the engine
inside __i915_request_commit() meaning that we could enable concurrent
submission for userspace as we enqueued this request. However, this
falls into a trap with other users of the engine->kernel_context waking
up and submitting their request before the idle-switch is queued, with
the result that the kernel_context is executed out-of-sequence most
likely upsetting the GPU and certainly ourselves when we try to retire
the out-of-sequence requests.
As such we need to hold onto the effective engine->kernel_context mutex
lock (via the engine pm mutex proxy) until we have finish queuing the
request to the engine.
v2: Serialise against concurrent intel_gt_retire_requests()
v3: Describe the hairy locking scheme with intel_gt_retire_requests()
for future reference.
v4: Combine timeline->lock and engine pm release; it's hairy.
Fixes: a79ca656b6 ("drm/i915: Push the wakeref->count deferral to the backend")
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Mika Kuoppala <mika.kuoppala@linux.intel.com>
Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20191120165514.3955081-2-chris@chris-wilson.co.uk
(cherry picked from commit 5cba288466)
Signed-off-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
The general concept was that intel_timeline.active_count was locked by
the intel_timeline.mutex. The exception was for power management, where
the engine->kernel_context->timeline could be manipulated under the
global wakeref.mutex.
This was quite solid, as we always manipulated the timeline only while
we held an engine wakeref.
And then we started retiring requests outside of struct_mutex, only
using the timelines.active_list and the timeline->mutex. There we
started manipulating intel_timeline.active_count outside of an engine
wakeref, and so introduced a race between __engine_park() and
intel_gt_retire_requests(), a race that could result in the
engine->kernel_context not being added to the active timelines and so
losing requests, which caused us to keep the system permanently powered
up [and unloadable].
The race would be easy to close if we could take the engine wakeref for
the timeline before we retire -- except timelines are not bound to any
engine and so we would need to keep all active engines awake. The
alternative is to guard intel_timeline_enter/intel_timeline_exit for use
outside of the timeline->mutex.
Fixes: e5dadff4b0 ("drm/i915: Protect request retirement with timeline->mutex")
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Matthew Auld <matthew.auld@intel.com>
Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20191120165514.3955081-1-chris@chris-wilson.co.uk
(cherry picked from commit a6edbca74b)
Signed-off-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Previously, we assumed we could use mutex_trylock() within an atomic
context, falling back to a worker if contended. However, such trickery
is illegal inside interrupt context, and so we need to always use a
worker under such circumstances. As we normally are in process context,
we can typically use a plain mutex, and only defer to a work when we
know we are being called from an interrupt path.
Fixes: 51fbd8de87 ("drm/i915/pmu: Atomically acquire the gt_pm wakeref")
References: a0855d24fc ("locking/mutex: Complain upon mutex API misuse in IRQ contexts")
References: https://bugs.freedesktop.org/show_bug.cgi?id=111626
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20191120125433.3767149-1-chris@chris-wilson.co.uk
(cherry picked from commit 07779a76ee)
Signed-off-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
When waiting for idle, serialise with any ongoing callback so that it
will have completed before completing the wait.
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20191118230254.2615942-12-chris@chris-wilson.co.uk
(cherry picked from commit f4ba0707c8)
Signed-off-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
pm_suspend_target_state is declared under CONFIG_PM_SLEEP but only
defined under CONFIG_SUSPEND. Play safe and only use the symbol if it is
both declared and defined.
Reported-by: kbuild-all@lists.01.org
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Fixes: a70a9e998e ("drm/i915: Defer rc6 shutdown to suspend_late")
Cc: Andi Shyti <andi.shyti@intel.com>
Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Reviewed-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20191120182209.3967833-1-chris@chris-wilson.co.uk
Signed-off-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
The major drawback of commit 7e34f4e4aa ("drm/i915/gen8+: Add RC6 CTX
corruption WA") is that it disables RC6 while Skylake (and friends) is
active, and we do not consider the GPU idle until all outstanding
requests have been retired and the engine switched over to the kernel
context. If userspace is idle, this task falls onto our background idle
worker, which only runs roughly once a second, meaning that userspace has
to have been idle for a couple of seconds before we enable RC6 again.
Naturally, this causes us to consume considerably more energy than
before as powersaving is effectively disabled while a display server
(here's looking at you Xorg) is running.
As execlists will get a completion event as each context is completed,
we can use this interrupt to queue a retire worker bound to this engine
to cleanup idle timelines. We will then immediately notice the idle
engine (without userspace intervention or the aid of the background
retire worker) and start parking the GPU. Thus during light workloads,
we will do much more work to idle the GPU faster... Hopefully with
commensurate power saving!
v2: Watch context completions and only look at those local to the engine
when retiring to reduce the amount of excess work we perform.
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=112315
References: 7e34f4e4aa ("drm/i915/gen8+: Add RC6 CTX corruption WA")
References: 2248a28384 ("drm/i915/gen8+: Add RC6 CTX corruption WA")
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20191125105858.1718307-3-chris@chris-wilson.co.uk
In the next patch, we will introduce a new asynchronous retirement
worker, fed by execlists CS events. Here we may queue a retirement as
soon as a request is submitted to HW (and completes instantly), and we
also want to process that retirement as early as possible and cannot
afford to postpone (as there may not be another opportunity to retire it
for a few seconds). To allow the new async retirer to run in parallel
with our submission, pull the __i915_request_queue (that passes the
request to HW) inside the timelines spinlock so that the retirement
cannot release the timeline before we have completed the submission.
v2: Actually to play nicely with engine_retire, we have to raise the
timeline.active_lock before releasing the HW. intel_gt_retire_requsts()
is still serialised by the outer lock so they cannot see this
intermediate state, and engine_retire is serialised by HW submission.
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20191125105858.1718307-2-chris@chris-wilson.co.uk
As the engine->kernel_context is used within the engine-pm barrier, we
have to be careful when emitting requests outside of the barrier, as the
strict timeline locking rules do not apply. Instead, we must ensure the
engine_park() cannot be entered as we build the request, which is
simplest by taking an explicit engine-pm wakeref around the request
construction.
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20191125105858.1718307-1-chris@chris-wilson.co.uk
I rushed a last minute correction to cancel_port_requests() to prevent
the snooping of *execlists->active as the inflight array was being
updated, without noticing we iterated the inflight array starting from
active! Oops.
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=112387
Fixes: 331bf90591 ("drm/i915/gt: Mark the execlists->active as the primary volatile access")
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Mika Kuoppala <mika.kuoppala@linux.intel.com>
Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Reviewed-by: Andi Shyti <andi.shyti@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20191125112520.1760492-1-chris@chris-wilson.co.uk
Commit 750e76b4f9 ("drm/i915/gt: Move the [class][inst] lookup for
engines onto the GT") changed the engine query to iterate over uabi
engines but left the buffer size calculation look at the physical engine
count. Difference has no practical consequence but it is nicer to align
both queries.
Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Fixes: 750e76b4f9 ("drm/i915/gt: Move the [class][inst] lookup for engines onto the GT")
Cc: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com>
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Link: https://patchwork.freedesktop.org/patch/msgid/20191122104115.29610-1-tvrtko.ursulin@linux.intel.com
(cherry picked from commit 9acc99d8f2)
Signed-off-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
The bspec initially provided a single DKL PHY vswing table for both HDMI
and DP, but was recently updated to include an independent table for
HDMI.
Bspec: 49292
Fixes: 978c3e539b ("drm/i915/tgl: Add dkl phy programming sequences")
Cc: Clinton A Taylor <clinton.a.taylor@intel.com>
Cc: Lucas De Marchi <lucas.demarchi@intel.com>
Signed-off-by: Matt Roper <matthew.d.roper@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20191118180219.9309-1-matthew.d.roper@intel.com
Reviewed-by: José Roberto de Souza <jose.souza@intel.com>
(cherry picked from commit 362bfb995b)
Signed-off-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
The bspec was recently updated with new cdclk -> voltage level tables to
accommodate the new 324/326.4 cdclk values.
Bspec: 21809
Fixes: 63c9dae71d ("drm/i915/ehl: Add voltage level requirement table")
Cc: José Roberto de Souza <jose.souza@intel.com>
Cc: Vivek Kasireddy <vivek.kasireddy@intel.com>
Cc: Bob Paauwe <bob.j.paauwe@intel.com>
Signed-off-by: Matt Roper <matthew.d.roper@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20191118164412.26216-1-matthew.d.roper@intel.com
Reviewed-by: José Roberto de Souza <jose.souza@intel.com>
(cherry picked from commit d147483884)
Signed-off-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Since we want to do a lockless read of the current active request, and
that request is written to by process_csb also without serialisation, we
need to instruct gcc to take care in reading the pointer itself.
Otherwise, we have observed execlists_active() to report 0x40.
[ 2400.760381] igt/para-4098 1..s. 2376479300us : process_csb: rcs0 cs-irq head=3, tail=4
[ 2400.760826] igt/para-4098 1..s. 2376479303us : process_csb: rcs0 csb[4]: status=0x00000001:0x00000000
[ 2400.761271] igt/para-4098 1..s. 2376479306us : trace_ports: rcs0: promote { b9c59:2622, b9c55:2624 }
[ 2400.761726] igt/para-4097 0d... 2376479311us : __i915_schedule: rcs0: -2147483648->3, inflight:0000000000000040, rq:ffff888208c1e940
which is impossible!
The answer is that as we keep the existing execlists->active pointing
into the array as we copy over that array, the unserialised read may see
a partial pointer value.
Fixes: df40306902 ("drm/i915/execlists: Lift process_csb() out of the irq-off spinlock")
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Mika Kuoppala <mika.kuoppala@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20191125094318.1630806-1-chris@chris-wilson.co.uk
The generic implementation of refcount_t should be good enough for
everybody, so remove ARCH_HAS_REFCOUNT and REFCOUNT_FULL entirely,
leaving the generic implementation enabled unconditionally.
Signed-off-by: Will Deacon <will@kernel.org>
Reviewed-by: Ard Biesheuvel <ardb@kernel.org>
Acked-by: Kees Cook <keescook@chromium.org>
Tested-by: Hanjun Guo <guohanjun@huawei.com>
Cc: Ard Biesheuvel <ard.biesheuvel@linaro.org>
Cc: Elena Reshetova <elena.reshetova@intel.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Link: https://lkml.kernel.org/r/20191121115902.2551-9-will@kernel.org
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Commit 750e76b4f9 ("drm/i915/gt: Move the [class][inst] lookup for
engines onto the GT") changed the engine query to iterate over uabi
engines but left the buffer size calculation look at the physical engine
count. Difference has no practical consequence but it is nicer to align
both queries.
Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Fixes: 750e76b4f9 ("drm/i915/gt: Move the [class][inst] lookup for engines onto the GT")
Cc: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com>
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Link: https://patchwork.freedesktop.org/patch/msgid/20191122104115.29610-1-tvrtko.ursulin@linux.intel.com
Before checking the current i915_active state for the asynchronous work
we submitted, flush any ongoing callback. This ensures that our sampling
is robust and does not sporadically fail due to bad timing as the work
is running on another cpu.
v2: Drop the fence callback sync, retiring under the lock should be good
enough to synchronize with engine_retire() and the
intel_gt_retire_requests() background worker.
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20191122132404.690440-1-chris@chris-wilson.co.uk
From inside an active timeline in the execbuf ioctl, we may try to
reclaim some space in the GGTT. We need GGTT space for all objects on
!full-ppgtt platforms, and for context images everywhere. However, to
free up space in the GGTT we may need to remove some pinned objects
(e.g. context images) that require flushing the idle barriers to remove.
For this we use the big hammer of intel_gt_wait_for_idle()
However, commit 7936a22dd4 ("drm/i915/gt: Wait for new requests in
intel_gt_retire_requests()") will continue spinning on the wait if a
timeline is active but lacks requests, as is the case during execbuf
reservation. Spinning forever is quite time consuming, so revert that
commit and start again.
In practice, the effect commit 7936a22dd4 was trying to achieve is
accomplished by commit 1683d24c14 ("drm/i915/gt: Move new timelines
to the end of active_list"), so there is no immediate rush to replace
the looping.
Testcase: igt/gem_exec_reloc/basic-range
Fixes: a46bfdc83f ("drm/i915/gt: Wait for new requests in intel_gt_retire_requests()")
References: 1683d24c14 ("drm/i915/gt: Move new timelines to the end of active_list")
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Reviewed-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20191121071044.97798-1-chris@chris-wilson.co.uk
(cherry picked from commit 689122dcc3)
[Joonas: Corrected Fixes: tag ref to match drm-intel-next-fixes]
Signed-off-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Bonded request submission is designed to allow requests to execute in
parallel as laid out by the user. If the master request is already
finished before its bonded pair is submitted, the pair were not destined
to run in parallel and we lose the information about the master engine
to dictate selection of the secondary. If the second request was
required to be run on a particular engine in a virtual set, that should
have been specified, rather than left to the whims of a random
unconnected requests!
In the selftest, I made the mistake of not ensuring the master would
overlap with its bonded pairs, meaning that it could indeed complete
before we submitted the bonds. Those bonds were then free to select any
available engine in their virtual set, and not the one expected by the
test.
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20191122112152.660743-1-chris@chris-wilson.co.uk
As we start peeking into requests for longer and longer, e.g.
incorporating use of spinlocks when only protected by an
rcu_read_lock(), we need to be careful in how we reset the request when
recycling and need to preserve any barriers that may still be in use as
the request is reset for reuse.
Quoting Linus Torvalds:
> If there is refcounting going on then why use SLAB_TYPESAFE_BY_RCU?
.. because the object can be accessed (by RCU) after the refcount has
gone down to zero, and the thing has been released.
That's the whole and only point of SLAB_TYPESAFE_BY_RCU.
That flag basically says:
"I may end up accessing this object *after* it has been free'd,
because there may be RCU lookups in flight"
This has nothing to do with constructors. It's ok if the object gets
reused as an object of the same type and does *not* get
re-initialized, because we're perfectly fine seeing old stale data.
What it guarantees is that the slab isn't shared with any other kind
of object, _and_ that the underlying pages are free'd after an RCU
quiescent period (so the pages aren't shared with another kind of
object either during an RCU walk).
And it doesn't necessarily have to have a constructor, because the
thing that a RCU walk will care about is
(a) guaranteed to be an object that *has* been on some RCU list (so
it's not a "new" object)
(b) the RCU walk needs to have logic to verify that it's still the
*same* object and hasn't been re-used as something else.
In contrast, a SLAB_TYPESAFE_BY_RCU memory gets free'd and re-used
immediately, but because it gets reused as the same kind of object,
the RCU walker can "know" what parts have meaning for re-use, in a way
it couidn't if the re-use was random.
That said, it *is* subtle, and people should be careful.
> So the re-use might initialize the fields lazily, not necessarily using a ctor.
If you have a well-defined refcount, and use "atomic_inc_not_zero()"
to guard the speculative RCU access section, and use
"atomic_dec_and_test()" in the freeing section, then you should be
safe wrt new allocations.
If you have a completely new allocation that has "random stale
content", you know that it cannot be on the RCU list, so there is no
speculative access that can ever see that random content.
So the only case you need to worry about is a re-use allocation, and
you know that the refcount will start out as zero even if you don't
have a constructor.
So you can think of the refcount itself as always having a zero
constructor, *BUT* you need to be careful with ordering.
In particular, whoever does the allocation needs to then set the
refcount to a non-zero value *after* it has initialized all the other
fields. And in particular, it needs to make sure that it uses the
proper memory ordering to do so.
NOTE! One thing to be very worried about is that re-initializing
whatever RCU lists means that now the RCU walker may be walking on the
wrong list so the walker may do the right thing for this particular
entry, but it may miss walking *other* entries. So then you can get
spurious lookup failures, because the RCU walker never walked all the
way to the end of the right list. That ends up being a much more
subtle bug.
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20191122094924.629690-1-chris@chris-wilson.co.uk
Use our more regular igt_flush_test() to bind the wait-for-idle and
error out instead of waiting around forever on critical failure.
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Reviewed-by: Stuart Summers <stuart.summers@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20191121233021.507400-1-chris@chris-wilson.co.uk
Assume that intel_wakeref_get() may take the mutex, and perform other
sleeping actions in the course of its callbacks and so use might_sleep()
to ensure that all callers abide. Anything that cannot sleep has to use
e.g. intel_wakeref_get_if_active() to guarantee its avoidance of the
non-atomic paths.
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Mika Kuoppala <mika.kuoppala@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20191121130528.309474-1-chris@chris-wilson.co.uk
As we wait upon a request, we must be holding a reference to it, and be
wary that i915_request_add() consumes the passed in reference.
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20191121093326.134774-1-chris@chris-wilson.co.uk
Since retirement may be running in a worker on another CPU, it may be
skipped in the local intel_gt_wait_for_idle(). To ensure the state is
consistent for our sanity checks upon load, serialise with the remote
retirer by waiting on the timeline->mutex.
Outside of this use case, e.g. on suspend or module unload, we expect the
slack to be picked up by intel_gt_pm_wait_for_idle() and so prefer to
put the special case serialisation with retirement in its single user,
for now at least.
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20191121071044.97798-2-chris@chris-wilson.co.uk
fbdev uses the physical address of our framebuffer for its fb_mmap()
routine. While we need to adapt this address for the new io BAR, we have
to fix v5.4 first! The simplest fix is to restore the smem back to v5.3
and we will then probably have to implement our fbops->fb_mmap() callback
to handle local memory.
Reported-by: Neil MacLeod <freedesktop@nmacleod.com>
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=112256
Fixes: 5f889b9a61 ("drm/i915: Disregard drm_mode_config.fb_base")
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Daniel Vetter <daniel.vetter@ffwll.ch>
Cc: Maarten Lankhorst <maarten.lankhorst@linux.intel.com>
Tested-by: Neil MacLeod <freedesktop@nmacleod.com>
Reviewed-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20191113180633.3947-1-chris@chris-wilson.co.uk
(cherry picked from commit abc5520704)
Signed-off-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
(cherry picked from commit 9faf5fa4d3)
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
From inside an active timeline in the execbuf ioctl, we may try to
reclaim some space in the GGTT. We need GGTT space for all objects on
!full-ppgtt platforms, and for context images everywhere. However, to
free up space in the GGTT we may need to remove some pinned objects
(e.g. context images) that require flushing the idle barriers to remove.
For this we use the big hammer of intel_gt_wait_for_idle()
However, commit 7936a22dd4 ("drm/i915/gt: Wait for new requests in
intel_gt_retire_requests()") will continue spinning on the wait if a
timeline is active but lacks requests, as is the case during execbuf
reservation. Spinning forever is quite time consuming, so revert that
commit and start again.
In practice, the effect commit 7936a22dd4 was trying to achieve is
accomplished by commit 1683d24c14 ("drm/i915/gt: Move new timelines
to the end of active_list"), so there is no immediate rush to replace
the looping.
Testcase: igt/gem_exec_reloc/basic-range
Fixes: 7936a22dd4 ("drm/i915/gt: Wait for new requests in intel_gt_retire_requests()")
References: 1683d24c14 ("drm/i915/gt: Move new timelines to the end of active_list")
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Reviewed-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20191121071044.97798-1-chris@chris-wilson.co.uk
GuC submission path can be called from an interrupt context
and so should use a worker to avoid holding a mutex.
References: 07779a76ee ("drm/i915: Mark up the calling context for intel_wakeref_put()")
Signed-off-by: Stuart Summers <stuart.summers@intel.com>
Cc: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Link: https://patchwork.freedesktop.org/patch/msgid/20191120211321.88021-1-stuart.summers@intel.com
pm_suspend_target_state is declared under CONFIG_PM_SLEEP but only
defined under CONFIG_SUSPEND. Play safe and only use the symbol if it is
both declared and defined.
Reported-by: kbuild-all@lists.01.org
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Fixes: a70a9e998e ("drm/i915: Defer rc6 shutdown to suspend_late")
Cc: Andi Shyti <andi.shyti@intel.com>
Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Reviewed-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20191120182209.3967833-1-chris@chris-wilson.co.uk
In commit a79ca656b6 ("drm/i915: Push the wakeref->count deferral to
the backend"), I erroneously concluded that we last modify the engine
inside __i915_request_commit() meaning that we could enable concurrent
submission for userspace as we enqueued this request. However, this
falls into a trap with other users of the engine->kernel_context waking
up and submitting their request before the idle-switch is queued, with
the result that the kernel_context is executed out-of-sequence most
likely upsetting the GPU and certainly ourselves when we try to retire
the out-of-sequence requests.
As such we need to hold onto the effective engine->kernel_context mutex
lock (via the engine pm mutex proxy) until we have finish queuing the
request to the engine.
v2: Serialise against concurrent intel_gt_retire_requests()
v3: Describe the hairy locking scheme with intel_gt_retire_requests()
for future reference.
v4: Combine timeline->lock and engine pm release; it's hairy.
Fixes: a79ca656b6 ("drm/i915: Push the wakeref->count deferral to the backend")
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Mika Kuoppala <mika.kuoppala@linux.intel.com>
Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20191120165514.3955081-2-chris@chris-wilson.co.uk
The general concept was that intel_timeline.active_count was locked by
the intel_timeline.mutex. The exception was for power management, where
the engine->kernel_context->timeline could be manipulated under the
global wakeref.mutex.
This was quite solid, as we always manipulated the timeline only while
we held an engine wakeref.
And then we started retiring requests outside of struct_mutex, only
using the timelines.active_list and the timeline->mutex. There we
started manipulating intel_timeline.active_count outside of an engine
wakeref, and so introduced a race between __engine_park() and
intel_gt_retire_requests(), a race that could result in the
engine->kernel_context not being added to the active timelines and so
losing requests, which caused us to keep the system permanently powered
up [and unloadable].
The race would be easy to close if we could take the engine wakeref for
the timeline before we retire -- except timelines are not bound to any
engine and so we would need to keep all active engines awake. The
alternative is to guard intel_timeline_enter/intel_timeline_exit for use
outside of the timeline->mutex.
Fixes: e5dadff4b0 ("drm/i915: Protect request retirement with timeline->mutex")
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Matthew Auld <matthew.auld@intel.com>
Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20191120165514.3955081-1-chris@chris-wilson.co.uk
Previously, we assumed we could use mutex_trylock() within an atomic
context, falling back to a worker if contended. However, such trickery
is illegal inside interrupt context, and so we need to always use a
worker under such circumstances. As we normally are in process context,
we can typically use a plain mutex, and only defer to a work when we
know we are being called from an interrupt path.
Fixes: 51fbd8de87 ("drm/i915/pmu: Atomically acquire the gt_pm wakeref")
References: a0855d24fc ("locking/mutex: Complain upon mutex API misuse in IRQ contexts")
References: https://bugs.freedesktop.org/show_bug.cgi?id=111626
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20191120125433.3767149-1-chris@chris-wilson.co.uk
Move the assert_vblank_disabled() into intel_crtc_vblank_on()
so that we don't have to inline it all over.
This does mean we now assert_vblank_disabled() during readout as well
but that is totally fine as it happens after drm_crtc_vblank_reset().
One can even argue it's what we want to do anyway to make sure
the reset actually happened.
Reviewed-by: Manasi Navare <manasi.d.navare@intel.com>
Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20191118164430.27265-4-ville.syrjala@linux.intel.com
Just pass the atomic state and the crtc to intel_encoders_enable() & co.
Make life simpler when you don't have to think which state (old vs. new)
you have to pass in. Also constify the states while at it.
Reviewed-by: Manasi Navare <manasi.d.navare@intel.com>
Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20191118164430.27265-2-ville.syrjala@linux.intel.com
i915_request_add() consumes the passed in reference to the i915_request,
so if the selftest caller wishes to wait upon it afterwards, it needs to
take a reference for itself.
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Matthew Auld <matthew.auld@intel.com>
Reviewed-by: Matthew Auld <matthew.auld@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20191120102741.3734346-1-chris@chris-wilson.co.uk
When setting up a full GGTT, we expect the next insert to fail with
-ENOSPC. Simplify the use of ERR_PTR to not confuse either the reader or
smatch.
Reported-by: Dan Carpenter <dan.carpenter@oracle.com>
References: f40a7b7558 ("drm/i915: Initial selftests for exercising eviction")
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Matthew Auld <matthew.auld@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20191120093302.3723715-8-chris@chris-wilson.co.uk
As we want to be able to run inside atomic context for retiring the
i915_active, and we are no longer allowed to abuse mutex_trylock, split
the tree management portion of i915_active.mutex into an irq-safe
spinlock.
References: a0855d24fc ("locking/mutex: Complain upon mutex API misuse in IRQ contexts")
References: https://bugs.freedesktop.org/show_bug.cgi?id=111626
Fixes: 274cbf20fd ("drm/i915: Push the i915_active.retire into a worker")
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Tvrtko Ursulin <tvrtko.ursulin@linux.intel.com>
Cc: Matthew Auld <matthew.auld@intel.com>
Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20191114172535.1116-1-chris@chris-wilson.co.uk
(cherry picked from commit c9ad602fea)
Signed-off-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
For our current users we don't expect pool objects to be writable from
the gpu.
Signed-off-by: Matthew Auld <matthew.auld@intel.com>
Cc: Chris Wilson <chris@chris-wilson.co.uk>
Fixes: 4f7af1948a ("drm/i915: Support ro ppgtt mapped cmdparser shadow buffers")
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Link: https://patchwork.freedesktop.org/patch/msgid/20191119150154.18249-1-matthew.auld@intel.com
(cherry picked from commit d18580b08b)
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
Reading from CTX_INFO upsets rc6, requiring us to detect and prevent
possible rc6 context corruption. Poke at the bear!
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Imre Deak <imre.deak@intel.com>
Cc: Mika Kuoppala <mika.kuoppala@linux.intel.com>
Reviewed-by: Andi Shyti <andi.shyti@intel.com>
Tested-by: Andi Shyti <andi.shyti@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20191119154723.3311814-1-chris@chris-wilson.co.uk
As we may park the gt during request retirement, we may cancel the
retirement worker only to then program the delayed worker once more.
If we schedule the next delayed retirement worker first, if we then park
the gt, the work will remain cancelled.
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20191119162559.3313003-2-chris@chris-wilson.co.uk
When adding a new active timeline, place it at the end of the list. This
allows for intel_gt_retire_requests() to pick up the newcomer more
quickly and hopefully complete the retirement sooner. A miniscule
optimisation.
References: 7936a22dd4 ("drm/i915/gt: Wait for new requests in intel_gt_retire_requests()")
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20191119162559.3313003-1-chris@chris-wilson.co.uk
For our current users we don't expect pool objects to be writable from
the gpu.
Signed-off-by: Matthew Auld <matthew.auld@intel.com>
Cc: Chris Wilson <chris@chris-wilson.co.uk>
Fixes: 4f7af1948a ("drm/i915: Support ro ppgtt mapped cmdparser shadow buffers")
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Link: https://patchwork.freedesktop.org/patch/msgid/20191119150154.18249-1-matthew.auld@intel.com
The bspec initially provided a single DKL PHY vswing table for both HDMI
and DP, but was recently updated to include an independent table for
HDMI.
Bspec: 49292
Fixes: 978c3e539b ("drm/i915/tgl: Add dkl phy programming sequences")
Cc: Clinton A Taylor <clinton.a.taylor@intel.com>
Cc: Lucas De Marchi <lucas.demarchi@intel.com>
Signed-off-by: Matt Roper <matthew.d.roper@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20191118180219.9309-1-matthew.d.roper@intel.com
Reviewed-by: José Roberto de Souza <jose.souza@intel.com>
When userspace writes into the GTT itself, it is supposed to call
set-domain to let the kernel keep track and so manage the CPU/GPU
caches. As we track writes on the individual i915_vma, we should also be
sure to mark them as dirty.
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Mika Kuoppala <mika.kuoppala@linux.intel.com>
Reviewed-by: Mika Kuoppala <mika.kuoppala@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20191119112515.2766748-1-chris@chris-wilson.co.uk
In order to avoid some nasty mutex inversions, commit 09c5ab384f
("drm/i915: Keep rings pinned while the context is active") allowed the
intel_ring unpinning to be run concurrently with the next context
pinning it. Thus each step in intel_ring_unpin() needed to be atomic and
ordered in a nice onion with intel_ring_pin() so that the lifetimes
overlapped and were always safe.
Sadly, a few steps in intel_ring_unpin() were overlooked, such as
closing the read/write pointers of the ring and discarding the
intel_ring.vaddr, as these steps were not serialised with
intel_ring_pin() and so could leave the ring in disarray.
Fixes: 09c5ab384f ("drm/i915: Keep rings pinned while the context is active")
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Mika Kuoppala <mika.kuoppala@linux.intel.com>
Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20191118230254.2615942-6-chris@chris-wilson.co.uk
Only serialise with the chipset using an mmio if the chipset is
currently active. We expect that any writes into the chipset range will
simply be forgotten until it wakes up.
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Mika Kuoppala <mika.kuoppala@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20191118184943.2593048-8-chris@chris-wilson.co.uk
The bspec was recently updated with new cdclk -> voltage level tables to
accommodate the new 324/326.4 cdclk values.
Bspec: 21809
Fixes: 63c9dae71d ("drm/i915/ehl: Add voltage level requirement table")
Cc: José Roberto de Souza <jose.souza@intel.com>
Cc: Vivek Kasireddy <vivek.kasireddy@intel.com>
Cc: Bob Paauwe <bob.j.paauwe@intel.com>
Signed-off-by: Matt Roper <matthew.d.roper@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20191118164412.26216-1-matthew.d.roper@intel.com
Reviewed-by: José Roberto de Souza <jose.souza@intel.com>
When we call intel_dsb_get(), the dsb initialization may fail for
various reasons. We already log the error message in that path, making
it unnecessary to trigger a warning that refcount == 0 when calling
intel_dsb_put().
So here we simplify the logic and do lazy shutdown: leaving the extra
refcount alive so when we call intel_dsb_put() we end up calling
i915_vma_unpin_and_release().
Signed-off-by: Lucas De Marchi <lucas.demarchi@intel.com>
Reviewed-by: Matt Roper <matthew.d.roper@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20191111205024.22853-3-lucas.demarchi@intel.com
The current dsb API is not really prepared to handle multithread access.
I was debugging an issue that ended up fixed by commit a096883dda
("drm/i915/dsb: Remove PIN_MAPPABLE from the DSB object VMA") and was
puzzled how these atomic operations were guaranteeing atomicity.
if (atomic_add_return(1, &dsb->refcount) != 1)
return dsb;
Thread A could still be initializing dsb struct (and even fail in the
middle) while thread B would take a reference and use it (even
derefencing a NULL cmd_buf).
I don't think the atomic operations here will help much if this were
to support multithreaded scenario in future, so just remove them to
avoid confusion.
v2: Use refcount++ != 0 instead of ++refcount != 1 (from Ville)
Signed-off-by: Lucas De Marchi <lucas.demarchi@intel.com>
Reviewed-by: Matt Roper <matthew.d.roper@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20191111205024.22853-2-lucas.demarchi@intel.com
Link: https://patchwork.freedesktop.org/patch/msgid/20191116011539.18230-1-lucas.demarchi@intel.com
Since the execlists_active() is no longer protected by the
engine->active.lock, we need to protect the request pointer with RCU to
prevent it being freed as we evaluate whether or not we need to preempt.
Fixes: df40306902 ("drm/i915/execlists: Lift process_csb() out of the irq-off spinlock")
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Mika Kuoppala <mika.kuoppala@linux.intel.com>
Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Reviewed-by: Mika Kuoppala <mika.kuoppala@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20191104090158.2959-2-chris@chris-wilson.co.uk
(cherry picked from commit 7d14863525)
Signed-off-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
(cherry picked from commit 8eb4704b12)
(cherry picked from commit 7e27238e149ce4f00d9cd801fe3aa0ea55e986a2)
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
set_page_dirty says:
For pages with a mapping this should be done under the page lock
for the benefit of asynchronous memory errors who prefer a
consistent dirty state. This rule can be broken in some special
cases, but should be better not to.
Under those rules, it is only safe for us to use the plain set_page_dirty
calls for shmemfs/anonymous memory. Userptr may be used with real
mappings and so needs to use the locked version (set_page_dirty_lock).
However, following a try_to_unmap() we may want to remove the userptr and
so call put_pages(). However, try_to_unmap() acquires the page lock and
so we must avoid recursively locking the pages ourselves -- which means
that we cannot safely acquire the lock around set_page_dirty(). Since we
can't be sure of the lock, we have to risk skip dirtying the page, or
else risk calling set_page_dirty() without a lock and so risk fs
corruption.
Bugzilla: https://bugzilla.kernel.org/show_bug.cgi?id=203317
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=112012
Fixes: 5cc9ed4b9a ("drm/i915: Introduce mapping of user pages into video memory (userptr) ioctl")
References: cb6d7c7dc7 ("drm/i915/userptr: Acquire the page lock around set_page_dirty()")
References: 505a8ec7e1 ("Revert "drm/i915/userptr: Acquire the page lock around set_page_dirty()"")
References: 6dcc693bc5 ("ext4: warn when page is dirtied without buffers")
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Cc: stable@vger.kernel.org
Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20191111133205.11590-1-chris@chris-wilson.co.uk
(cherry picked from commit 0d4bbe3d40)
Signed-off-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
(cherry picked from commit cee7fb437e)
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
We report "frequencies" (actual-frequency, requested-frequency) as the
number of accumulated cycles so that the average frequency over that
period may be determined by the user. This means the units we report to
the user are Mcycles (or just M), not MHz.
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Cc: stable@vger.kernel.org
Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20191109105356.5273-1-chris@chris-wilson.co.uk
(cherry picked from commit e88866ef02)
Signed-off-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
(cherry picked from commit a7d87b70d6)
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
The LUTs are single buffered so in order to program them without
tearing we'd have to do it during vblank (actually to be 100%
effective it has to happen between start of vblank and frame start).
We have no proper mechanism for that at the moment so we just
defer loading them after the vblank waits have happened. That
is not quite sufficient (especially when committing multiple pipes
whose vblanks don't line up) so the LUT load will often leak into
the following frame causing tearing.
However in case the hardware wasn't previously using the LUT we
can preload it before setting the enable bit (which is double
buffered so won't tear). Let's determine if we can do such
preloading and make it happen. Slight variation between the
hardware requires some platforms specifics in the checks.
Hans is seeing ugly colored flash on VLV/CHV macchines (GPD win
and Asus T100HA) when the gamma LUT gets loaded for the first
time as the BIOS has left some junk in the LUT memory.
v2: Deal with uapi vs. hw crtc state split
s/GCM/CGM/ typo fix
Cc: Hans de Goede <hdegoede@redhat.com>
Fixes: 051a6d8d3c ("drm/i915: Move LUT programming to happen after vblank waits")
Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20191030190815.7359-1-ville.syrjala@linux.intel.com
Tested-by: Hans de Goede <hdegoede@redhat.com>
Reviewed-by: Hans de Goede <hdegoede@redhat.com>
(cherry picked from commit 0ccc42a2fd)
Signed-off-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
(cherry picked from commit f77021372e)
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
Make sure we have a crtc before probing its primary plane's
max stride. Initially I thought we can't get this far without
crtcs, but looks like we can via the dumb_create ioctl.
Not sure if we shouldn't disable dumb buffer support entirely
when we have no crtcs, but that would require some amount of work
as the only thing currently being checked is dev->driver->dumb_create
which we'd have to convert to some device specific dynamic thing.
Cc: stable@vger.kernel.org
Reported-by: Mika Kuoppala <mika.kuoppala@linux.intel.com>
Fixes: aa5ca8b742 ("drm/i915: Align dumb buffer stride to 4k to allow for gtt remapping")
Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20191106172349.11987-1-ville.syrjala@linux.intel.com
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
(cherry picked from commit baea9ffe64)
Signed-off-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
(cherry picked from commit aeec766133)
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
VBT revision 229 adds a new "Generic DTD" block 58 and deprecates the
old LFP panel mode data in block 42. Let's start parsing this block to
fill in the panel fixed mode on devices with a >=229 VBT.
v2:
* Update according to the recent updates:
- DTD size is now 16 bits instead of 24
- polarity is now just a single bit for hsync and vsync and is
properly documented
* Minor checkpatch fix
v3:
* Now that panel options are parsed separately from the previous patch,
move generic DTD parsing into a function parallel to
parse_lfp_panel_dtd. We'll still fall back to looking at the legacy
LVDS timing block if the generic DTD fails. (Jani)
* Don't forget to actually set lfp_lvds_vbt_mode! (Jani)
* Drop "bdb_" prefix from dtd entry structure. (Jani)
* Follow C99 standard for structure's flexible array member. (Jani)
v4:
* Add "positive" to polarity field names for clarity. (Jani)
* Move VBT version check and fallback to legacy DTD parsing logic to a
helper to keep top-level VBT parsing uncluttered. (Jani)
* Restructure reserved bit packing at end of generic_dtd_entry from
"u32 rsvd:24" to "u8 rsvd[3]" to prevent copy/paste mistakes in the
future. (Jani)
Bspec: 54751
Bspec: 20148
Cc: Jani Nikula <jani.nikula@intel.com>
Signed-off-by: Matt Roper <matthew.d.roper@intel.com>
Reviewed-by: Jani Nikula <jani.nikula@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20191115165132.9472-3-matthew.d.roper@intel.com
Newer VBT versions will add an alternate way to read panel DTD
information, so let's split parsing of the general panel information
from the timing data in preparation.
Cc: Jani Nikula <jani.nikula@intel.com>
Signed-off-by: Matt Roper <matthew.d.roper@intel.com>
Reviewed-by: Jesse Barnes <jsbarnes@google.com>
Reviewed-by: Jani Nikula <jani.nikula@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20191115165132.9472-2-matthew.d.roper@intel.com
Having called intel_gt_init_early() to setup the mock intel_gt, we need
to call the corresponding intel_gt_driver_late_release() to clean up.
References: dea397e818 ("drm/i915/gt: Flush retire.work timer object on unload")
References: 24635c5152 ("drm/i915: Move intel_gt initialization to a separate file")
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20191118094342.2193485-1-chris@chris-wilson.co.uk
It's supposed to be just a const pointer.
Fixes: 074c77e3ec ("drm/i915/tgl: Gen-12 display loses Yf tiling and legacy CCS support")
Cc: Ville Syrjälä <ville.syrjala@linux.intel.com>
Cc: Dhinakaran Pandiyan <dhinakaran.pandiyan@intel.com>
Cc: Lucas De Marchi <lucas.demarchi@intel.com>
Reviewed-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Signed-off-by: Jani Nikula <jani.nikula@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20191115120440.17883-1-jani.nikula@intel.com
(cherry picked from commit 48ea97fabe)
Signed-off-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
On some platforms (e.g. KBL) that do not support GuC submission, but
the user enabled the GuC communication (e.g for HuC authentication)
calling the GuC EXIT_S_STATE action results in lose of ability to
enter RC6. We can remove the GuC suspend/resume entirely as we do
not need to save the GuC submission status.
Add intel_guc_submission_is_enabled() function to determine if
GuC submission is active.
v2: Do not suspend/resume the GuC on platforms that do not support
Guc Submission.
v3: Fix typo, move suspend logic to remove goto.
v4: Use intel_guc_submission_is_enabled() to check GuC submission
status.
v5: No need to look at engine to determine if submission is enabled.
Squash fix + intel_guc_submission_is_enabled() patch into one.
v6: Move resume check into intel_guc_resume() for symmetry.
Fix commit Fixes tag.
Reported-by: KiteStramuort <kitestramuort@autistici.org>
Reported-by: S. Zharkoff <s.zharkoff@gmail.com>
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=111594
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=111623
Fixes: ffd5ce22fa ("drm/i915/guc: Updates for GuC 32.0.3 firmware")
Cc: Michal Wajdeczko <michal.wajdeczko@intel.com>
Cc: Daniele Ceralo Spurio <daniele.ceraolospurio@intel.com>
Cc: Stuart Summers <stuart.summers@intel.com>
Cc: Chris Wilson <chris@chris-wilson.co.uk>
Tested-by: Tomas Janousek <tomi@nomi.cz>
Signed-off-by: Don Hiatt <don.hiatt@intel.com>
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Link: https://patchwork.freedesktop.org/patch/msgid/20191115231538.1249-1-don.hiatt@intel.com
(cherry picked from commit 82e0c5bbd6)
Signed-off-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Our callers fall into two categories, those passing timeout=0 who just
want to flush request retirements and those passing a timeout that need
to wait for submission completion (e.g. intel_gt_wait_for_idle()).
Currently, we only wait for a snapshot of timelines at the start of the
wait (but there was an expectation that new requests would cause timelines
to appear at the end). However, our callers, such as
intel_gt_wait_for_idle() before suspend, do require us to wait for the
power management requests emitted by retirement as well. If we don't,
then it takes an extra second or two for the background worker to flush
the queue and mark the GT as idle.
Fixes: 7e80576266 ("drm/i915: Drop struct_mutex from around i915_retire_requests()")
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20191114225736.616885-1-chris@chris-wilson.co.uk
(cherry picked from commit 7936a22dd4)
Signed-off-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
The workaround to disable coarse power gating is still needed on SKL
GT3/GT4 machines and since the RC6 context corruption was discovered by
the hardware team also on all GEN9 machines. Restore applying the
workaround.
Fixes: c113236718 ("drm/i915: Extract GT render sleep (rc6) management")
Testcase: igt/intel_gt_pm_late_selftests/live_rc6_ctx
Cc: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Andi Shyti <andi.shyti@intel.com>
Signed-off-by: Imre Deak <imre.deak@intel.com>
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Link: https://patchwork.freedesktop.org/patch/msgid/20191114152621.7235-1-imre.deak@intel.com
(cherry picked from commit 980f87a2ed)
Signed-off-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
fbdev uses the physical address of our framebuffer for its fb_mmap()
routine. While we need to adapt this address for the new io BAR, we have
to fix v5.4 first! The simplest fix is to restore the smem back to v5.3
and we will then probably have to implement our fbops->fb_mmap() callback
to handle local memory.
Reported-by: Neil MacLeod <freedesktop@nmacleod.com>
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=112256
Fixes: 5f889b9a61 ("drm/i915: Disregard drm_mode_config.fb_base")
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Daniel Vetter <daniel.vetter@ffwll.ch>
Cc: Maarten Lankhorst <maarten.lankhorst@linux.intel.com>
Tested-by: Neil MacLeod <freedesktop@nmacleod.com>
Reviewed-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20191113180633.3947-1-chris@chris-wilson.co.uk
(cherry picked from commit abc5520704)
Signed-off-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
I'm observing incoherence metric values, changing from run to run.
It appears the patches introducing noa wait & reconfiguration from
command stream switched places in the series multiple times during the
review. This lead to the dependency of one onto the order to go
missing...
Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Fixes: 15d0ace1f8 ("drm/i915/perf: execute OA configuration from command stream")
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Link: https://patchwork.freedesktop.org/patch/msgid/20191113154639.27144-1-lionel.g.landwerlin@intel.com
(cherry picked from commit 93937659dc)
Signed-off-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
io_mapping_map_atomic/kmap_atomic are occasionally taken in error capture
(if there is no aperture preallocated for the use of error capture), but
the error capture and compression routines are now run in normal
context:
<3> [113.316247] BUG: sleeping function called from invalid context at mm/page_alloc.c:4653
<3> [113.318190] in_atomic(): 1, irqs_disabled(): 0, pid: 678, name: debugfs_test
<4> [113.319900] no locks held by debugfs_test/678.
<3> [113.321002] Preemption disabled at:
<4> [113.321130] [<ffffffffa02506d4>] i915_error_object_create+0x494/0x610 [i915]
<4> [113.327259] Call Trace:
<4> [113.327871] dump_stack+0x67/0x9b
<4> [113.328683] ___might_sleep+0x167/0x250
<4> [113.329618] __alloc_pages_nodemask+0x26b/0x1110
<4> [113.334614] pool_alloc.constprop.19+0x14/0x60 [i915]
<4> [113.335951] compress_page+0x7c/0x100 [i915]
<4> [113.337110] i915_error_object_create+0x4bd/0x610 [i915]
<4> [113.338515] i915_capture_gpu_state+0x384/0x1680 [i915]
However, it is not a good idea to run the slow compression inside atomic
context, so we choose not to.
Fixes: 895d8ebeaa ("drm/i915: error capture with no ggtt slot")
Signed-off-by: Bruce Chang <yu.bruce.chang@intel.com>
Reviewed-by: Brian Welty <brian.welty@intel.com>
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Link: https://patchwork.freedesktop.org/patch/msgid/20191113231104.24208-1-yu.bruce.chang@intel.com
(cherry picked from commit 48715f7001)
Signed-off-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
TRANS_DDI_MST_TRANSPORT_SELECT is 2 bits wide not 3, it was taking
one bit from EDP/DSI Input Select.
Fixes: b3545e0868 ("drm/i915/tgl: add support to one DP-MST stream")
Cc: Lucas De Marchi <lucas.demarchi@intel.com>
Signed-off-by: José Roberto de Souza <jose.souza@intel.com>
Reviewed-by: Lucas De Marchi <lucas.demarchi@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20191107214559.77087-1-jose.souza@intel.com
(cherry picked from commit bb747fa5a9)
Signed-off-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
According to internal documents I found for CMP PCHs the PCI ID 0xA3C1
belongs to a CMP-V chipset. Based on the same docs the programming of
the PCH is compatible with that of KBP. Fix up my previous wrong
assumption accordingly using the SPT programming which in turn is the
basis for KBP.
The original bug reporter verified that this is the correct PCH
identification (the only way we'll program valid DDC pin-pair values to
the GMBUS register) and the Windows team uses the same identification
(that is using the KBP programming model for this PCH).
I filed the necessary Bspec update requests (BSpec/33734).
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=112051
Fixes: 37c92dc303 ("drm/i915: Add new CNL PCH ID seen on a CML platform")
Reported-and-tested-by: Cyrus <cyrus.lien@canonical.com>
Cc: Cyrus <cyrus.lien@canonical.com>
Cc: Timo Aaltonen <tjaalton@ubuntu.com>
Cc: José Roberto de Souza <jose.souza@intel.com>
Signed-off-by: Imre Deak <imre.deak@intel.com>
Reviewed-by: José Roberto de Souza <jose.souza@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20191112104608.24587-1-imre.deak@intel.com
(cherry picked from commit 50a5065f44)
Signed-off-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
It's supposed to be just a const pointer.
Fixes: 074c77e3ec ("drm/i915/tgl: Gen-12 display loses Yf tiling and legacy CCS support")
Cc: Ville Syrjälä <ville.syrjala@linux.intel.com>
Cc: Dhinakaran Pandiyan <dhinakaran.pandiyan@intel.com>
Cc: Lucas De Marchi <lucas.demarchi@intel.com>
Reviewed-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Signed-off-by: Jani Nikula <jani.nikula@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20191115120440.17883-1-jani.nikula@intel.com
On some platforms (e.g. KBL) that do not support GuC submission, but
the user enabled the GuC communication (e.g for HuC authentication)
calling the GuC EXIT_S_STATE action results in lose of ability to
enter RC6. We can remove the GuC suspend/resume entirely as we do
not need to save the GuC submission status.
Add intel_guc_submission_is_enabled() function to determine if
GuC submission is active.
v2: Do not suspend/resume the GuC on platforms that do not support
Guc Submission.
v3: Fix typo, move suspend logic to remove goto.
v4: Use intel_guc_submission_is_enabled() to check GuC submission
status.
v5: No need to look at engine to determine if submission is enabled.
Squash fix + intel_guc_submission_is_enabled() patch into one.
v6: Move resume check into intel_guc_resume() for symmetry.
Fix commit Fixes tag.
Reported-by: KiteStramuort <kitestramuort@autistici.org>
Reported-by: S. Zharkoff <s.zharkoff@gmail.com>
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=111594
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=111623
Fixes: ffd5ce22fa ("drm/i915/guc: Updates for GuC 32.0.3 firmware")
Cc: Michal Wajdeczko <michal.wajdeczko@intel.com>
Cc: Daniele Ceralo Spurio <daniele.ceraolospurio@intel.com>
Cc: Stuart Summers <stuart.summers@intel.com>
Cc: Chris Wilson <chris@chris-wilson.co.uk>
Tested-by: Tomas Janousek <tomi@nomi.cz>
Signed-off-by: Don Hiatt <don.hiatt@intel.com>
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Link: https://patchwork.freedesktop.org/patch/msgid/20191115231538.1249-1-don.hiatt@intel.com
When telling the user that device power management is disabled, it is
helpful to say which device that was.
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Cc: Imre Deak <imre.deak@intel.com>
Reviewed-by: Imre Deak <imre.deak@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20191115122343.821331-1-chris@chris-wilson.co.uk
Currently we're blindly poking at the frame start delay bits
in PIPECONF when trying to sanitize the hardware state. Those
bits decided to move elsewhere on HSW, so on many platforms
we're not doing anything at all here. Also we're forgetting
about the PCH transcoder entirely.
Add all the bit definitions for the various homes these bits
have had throughout the years, and reset them all to zero.
However I'm not entirely sure this is a safe thing to do. If
not I guess we'd want full readout+statecheck for this stuff.
For now let's stick to the current logic and hope for the
best.
Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20191024122138.25065-3-ville.syrjala@linux.intel.com
Reviewed-by: Uma Shankar <uma.shankar@intel.com>
As the heartbeat has the effect of flushing context barriers, this
interferes with the context barrier tests that are trying to observe
them directly. Disable the heartbeat so that the barriers are as
predictable as the test demands.
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20191115150841.880349-2-chris@chris-wilson.co.uk
i915:
- MOCS table fixes for EHL and TGL
- Update Display's rawclock on resume
- GVT's dmabuf reference drop fix
amdgpu:
- Fix a potential crash in firmware parsing
sun4i:
- One fix to the dotclock dividers range for sun4i
-----BEGIN PGP SIGNATURE-----
iQIcBAABAgAGBQJdzfwDAAoJEAx081l5xIa+8PgP/3OQ/P1yfZq+trWnNpyMXD3N
z+/H6mliUgsrNo6Mmd6SRZKIsq6Ood7+WVqLSmITZOsiL0UY7P+h40lPpv+nLO5r
6KeGeRQ/dO3NlsB7Ab9cdl2bERTVCNrQpiUTrauS/veB/l729i3fgNY309Cabhhm
I03eVUFW2UhkWSVpnFBIOP/u50zbBhNOvEmZa0xeCpmoVdu5AfXyg3gCR4MePh4S
/TWmmjHo1xKar+fZB7SMPF2Dg648KY2AqB1M7AdIA5qH+7ua2LNRP9Y6st5PEm0C
/nHlQKbyYKrIPzwdb3O8lzuLRQpeo/HOw8hz0SzHJRRlx1nmNUURUVLxpzx5z+22
fFCb/X0ZUckPT/CphLt749FHpCd1X/hWljKGTzhlq3tL0QlWAvheMcnXFe01HDJ9
UC3/ZCJx9XFYvd7w6mAPtB9OtmQ9AnxFdIBHBYtKx9Vz2+E7qH761u8vIe7pEXmE
kNbkmUxnr3LdH0sCBxWrx+tmZG1OpGJvd40zX9XltyBLSMxRp6sCg/RWI+LtR4f1
BZ3OKdqvR/oIWNDp0UcK1+rY4pPNbIgei2r3ysJJUwzy1T3pRlCC3JUSQVyxjpc2
HfP63PzHVElrfBzeCau12KxfmPTV5YKHJYZw0sJkEKpeHI2mQde6lbiDQH5yQjPJ
M/0zy9UMFg1zPLzdvzbY
=RKpL
-----END PGP SIGNATURE-----
Merge tag 'drm-fixes-2019-11-15' of git://anongit.freedesktop.org/drm/drm
Pull drm fixes from Dave Airlie:
"Here is this weeks non-intel hw vuln fixes pull. Three drivers, all
small fixes.
i915:
- MOCS table fixes for EHL and TGL
- Update Display's rawclock on resume
- GVT's dmabuf reference drop fix
amdgpu:
- Fix a potential crash in firmware parsing
sun4i:
- One fix to the dotclock dividers range for sun4i"
* tag 'drm-fixes-2019-11-15' of git://anongit.freedesktop.org/drm/drm:
drm/amdgpu: fix null pointer deref in firmware header printing
drm/i915/tgl: MOCS table update
Revert "drm/i915/ehl: Update MOCS table for EHL"
drm/sun4i: tcon: Set min division of TCON0_DCLK to 1.
drm/i915: update rawclk also on resume
drm/i915/gvt: fix dropping obj reference twice
Verify that we can execute a long chain of dependent requests from
userspace, each one slightly more important than the last.
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20191114225736.616885-4-chris@chris-wilson.co.uk
While we're waiting for the OA configuration to apply, let's give a
chance to other contexts that might need to run other workloads.
Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Suggested-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Link: https://patchwork.freedesktop.org/patch/msgid/20191114140224.21818-1-lionel.g.landwerlin@intel.com
RC6 is tracked underneath the intel_gt, so use our local pointers.
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Tvrtko Ursulin <tvrtko.ursulin@linux.intel.com>
Cc: Andi Shyti <andi.shyti@intel.com>
Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Acked-by: Andi Shyti <andi.shyti@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20191115114800.725061-1-chris@chris-wilson.co.uk
It applies to all gen9 and gen10 now, so we can use a single test
against the gen bitmask.
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Imre Deak <imre.deak@intel.com>
Cc: Tvrtko Ursulin <tvrtko.ursulin@linux.intel.com>
Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20191115122755.830355-1-chris@chris-wilson.co.uk
Inside the constructor, while cloning, we need to replace the
dst->engines. Having forgotten that dst->engines is marked as RCU
protected, we need to add the appropriate annotations to make sparse
happy.
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20191114225736.616885-5-chris@chris-wilson.co.uk
Our callers fall into two categories, those passing timeout=0 who just
want to flush request retirements and those passing a timeout that need
to wait for submission completion (e.g. intel_gt_wait_for_idle()).
Currently, we only wait for a snapshot of timelines at the start of the
wait (but there was an expectation that new requests would cause timelines
to appear at the end). However, our callers, such as
intel_gt_wait_for_idle() before suspend, do require us to wait for the
power management requests emitted by retirement as well. If we don't,
then it takes an extra second or two for the background worker to flush
the queue and mark the GT as idle.
Fixes: 7e80576266 ("drm/i915: Drop struct_mutex from around i915_retire_requests()")
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20191114225736.616885-1-chris@chris-wilson.co.uk
Backmerge to get dfce90259d ("Backmerge i915 security patches from
commit 'ea0b163b13ff' into drm-next") and thus 100d46bd72 ("Merge
Intel Gen8/Gen9 graphics fixes from Jon Bloomfield.").
Signed-off-by: Jani Nikula <jani.nikula@intel.com>
- PMU "Frequency" is reported as accumulated cycles
- Avoid OOPS in dumb_create IOCTL when no CRTCs
- Mitigation for userptr put_pages deadlock with trylock_page
- Fix to avoid freeing heartbeat request too early
- Fix LRC coherency issue
- Fix Bugzilla #112212: Avoid screen corruption on MST
- Error path fix to unlock context on failed context VM SETPARAM
- Always consider holding preemption a privileged op in perf/OA
- Preload LUTs if the hw isn't currently using them to avoid color flash on VLV/CHV
- Protect context while grabbing its name for the request
- Don't resize aliasing ppGTT size
- Smaller fixes picked by tooling
Signed-off-by: Dave Airlie <airlied@redhat.com>
From: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20191114085213.GA6440@jlahtine-desk.ger.corp.intel.com
With the new interrupt re-partitioning in Gen11, GuC controls by itself
the interrupts it receives, so steering bits and registers have been
defeatured. Being this the case, when the GuC is in control of
submissions we won't know what to do with the ctx switch interrupt
in the driver, so disable it.
v2 (Daniele): replace the gen9 paths instead of keeping gen9 and gen11
functions since we won't support guc submission on any pre-gen11 platform.
Signed-off-by: Oscar Mateo <oscar.mateo@intel.com>
Signed-off-by: Michal Wajdeczko <michal.wajdeczko@intel.com>
Signed-off-by: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com>
Cc: John Harrison <John.C.Harrison@Intel.com>
Cc: Matthew Brost <matthew.brost@intel.com>
Reviewed-by: John Harrison <John.C.Harrison@Intel.com>
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Link: https://patchwork.freedesktop.org/patch/msgid/20191105225321.26642-1-daniele.ceraolospurio@intel.com
The workaround to disable coarse power gating is still needed on SKL
GT3/GT4 machines and since the RC6 context corruption was discovered by
the hardware team also on all GEN9 machines. Restore applying the
workaround.
Fixes: c113236718 ("drm/i915: Extract GT render sleep (rc6) management")
Testcase: igt/intel_gt_pm_late_selftests/live_rc6_ctx
Cc: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Andi Shyti <andi.shyti@intel.com>
Signed-off-by: Imre Deak <imre.deak@intel.com>
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Link: https://patchwork.freedesktop.org/patch/msgid/20191114152621.7235-1-imre.deak@intel.com
HDMI_PICTURE_ASPECT_NONE is zero and the connector state is kzalloc()'d
so no need to initialize conn_state->picture_aspect_ratio with it.
Cc: Ilia Mirkin <imirkin@alum.mit.edu>
Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20190620142639.17518-6-ville.syrjala@linux.intel.com
Reviewed-by: Uma Shankar <uma.shankar@intel.com>
HDMI_PICTURE_ASPECT_NONE means "Automatic" so when the user has that
selected we should keep whatever aspect ratio the mode already has.
Also no point in checking for connector->is_hdmi in the SDVO code
since we only attach the property to HDMI connectors.
Cc: Ilia Mirkin <imirkin@alum.mit.edu>
Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20190620142639.17518-5-ville.syrjala@linux.intel.com
Reviewed-by: Uma Shankar <uma.shankar@intel.com>
As we want to be able to run inside atomic context for retiring the
i915_active, and we are no longer allowed to abuse mutex_trylock, split
the tree management portion of i915_active.mutex into an irq-safe
spinlock.
References: a0855d24fc ("locking/mutex: Complain upon mutex API misuse in IRQ contexts")
References: https://bugs.freedesktop.org/show_bug.cgi?id=111626
Fixes: 274cbf20fd ("drm/i915: Push the i915_active.retire into a worker")
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Tvrtko Ursulin <tvrtko.ursulin@linux.intel.com>
Cc: Matthew Auld <matthew.auld@intel.com>
Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20191114172535.1116-1-chris@chris-wilson.co.uk
Probe the mocs registers for new contexts and across GPU resets. Similar
to intel_workarounds, we have tables of what register values we expect
to see, so verify that user contexts are affected by them. In the
future, we should add tests similar to intel_sseu to cover dynamic
reconfigurations.
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Prathap Kumar Valsan <prathap.kumar.valsan@intel.com>
Cc: Mika Kuoppala <mika.kuoppala@linux.intel.com>
Reviewed-by: Prathap Kumar Valsan <prathap.kumar.valsan@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20191112223600.30993-4-chris@chris-wilson.co.uk
We repeatedly (and more so in future) use the same looping construct
over the mocs definition table to setup the register state. Refactor the
loop construct into a reusable macro.
add/remove: 2/1 grow/shrink: 1/2 up/down: 113/-330 (-217)
Function old new delta
intel_mocs_init_engine.cold - 71 +71
offset - 28 +28
__func__ 17273 17287 +14
intel_mocs_init 143 113 -30
mocs_register.isra 91 - -91
intel_mocs_init_engine 503 294 -209
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Mika Kuoppala <mika.kuoppala@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20191112223600.30993-3-chris@chris-wilson.co.uk
Be consistent in our mocs setup on Tigerlake and set the unused control
value to follow the PTE entry as we previously have done. The unused
values are beyond the defines of the ABI, the consistency simplifies our
checking.
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Mika Kuoppala <mika.kuoppala@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20191112223600.30993-1-chris@chris-wilson.co.uk
There are both positive and negative options about this feature.
At first, I thought it was a good idea, but actually Linus stated a
negative opinion (https://lkml.org/lkml/2019/9/29/227). I admit it
is ugly and annoying.
The baseline I'd like to keep is the compile-test of uapi headers.
(Otherwise, kernel developers have no way to ensure the correctness
of the exported headers.)
I will maintain a small build rule in usr/include/Makefile.
Remove the other header test functionality.
Signed-off-by: Masahiro Yamada <yamada.masahiro@socionext.com>
fbdev uses the physical address of our framebuffer for its fb_mmap()
routine. While we need to adapt this address for the new io BAR, we have
to fix v5.4 first! The simplest fix is to restore the smem back to v5.3
and we will then probably have to implement our fbops->fb_mmap() callback
to handle local memory.
Reported-by: Neil MacLeod <freedesktop@nmacleod.com>
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=112256
Fixes: 5f889b9a61 ("drm/i915: Disregard drm_mode_config.fb_base")
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Daniel Vetter <daniel.vetter@ffwll.ch>
Cc: Maarten Lankhorst <maarten.lankhorst@linux.intel.com>
Tested-by: Neil MacLeod <freedesktop@nmacleod.com>
Reviewed-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20191113180633.3947-1-chris@chris-wilson.co.uk
I'm observing incoherence metric values, changing from run to run.
It appears the patches introducing noa wait & reconfiguration from
command stream switched places in the series multiple times during the
review. This lead to the dependency of one onto the order to go
missing...
Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Fixes: 15d0ace1f8 ("drm/i915/perf: execute OA configuration from command stream")
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Link: https://patchwork.freedesktop.org/patch/msgid/20191113154639.27144-1-lionel.g.landwerlin@intel.com
This backmerges the branch that ended up in Linus' tree. It removes
all the changes for the rc6 patches from Linus' tree in favour of
a patch that is based on a large refactor that occured.
Otherwise it all looks good.
Signed-off-by: Dave Airlie <airlied@redhat.com>
In some circumstances the RC6 context can get corrupted. We can detect
this and take the required action, that is disable RC6 and runtime PM.
The HW recovers from the corrupted state after a system suspend/resume
cycle, so detect the recovery and re-enable RC6 and runtime PM.
v2: rebase (Mika)
v3:
- Move intel_suspend_gt_powersave() to the end of the GEM suspend
sequence.
- Add commit message.
v4:
- Rebased on intel_uncore_forcewake_put(i915->uncore, ...) API
change.
v5:
- Rebased on latest upstream gt_pm refactoring.
v6:
- s/i915_rc6_/intel_rc6_/
- Don't return a value from i915_rc6_ctx_wa_check().
v7:
- Rebased on latest gt rc6 refactoring.
Signed-off-by: Imre Deak <imre.deak@intel.com>
Signed-off-by: Mika Kuoppala <mika.kuoppala@linux.intel.com>
[airlied: pull this later version of this patch into drm-next
to make resolving the conflict mess easier.]
Signed-off-by: Dave Airlie <airlied@redhat.com>
Extend disabling SAMPLER_STATE prefetch workaround to gen12.
v2: Limit the WA to TGL A0 and update the WA no(Chris)
BSpec: 52890
Cc: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Mika Kuoppala <mika.kuoppala@linux.intel.com>
Signed-off-by: Radhakrishna Sripada <radhakrishna.sripada@intel.com>
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Link: https://patchwork.freedesktop.org/patch/msgid/20191113231953.24853-1-radhakrishna.sripada@intel.com
io_mapping_map_atomic/kmap_atomic are occasionally taken in error capture
(if there is no aperture preallocated for the use of error capture), but
the error capture and compression routines are now run in normal
context:
<3> [113.316247] BUG: sleeping function called from invalid context at mm/page_alloc.c:4653
<3> [113.318190] in_atomic(): 1, irqs_disabled(): 0, pid: 678, name: debugfs_test
<4> [113.319900] no locks held by debugfs_test/678.
<3> [113.321002] Preemption disabled at:
<4> [113.321130] [<ffffffffa02506d4>] i915_error_object_create+0x494/0x610 [i915]
<4> [113.327259] Call Trace:
<4> [113.327871] dump_stack+0x67/0x9b
<4> [113.328683] ___might_sleep+0x167/0x250
<4> [113.329618] __alloc_pages_nodemask+0x26b/0x1110
<4> [113.334614] pool_alloc.constprop.19+0x14/0x60 [i915]
<4> [113.335951] compress_page+0x7c/0x100 [i915]
<4> [113.337110] i915_error_object_create+0x4bd/0x610 [i915]
<4> [113.338515] i915_capture_gpu_state+0x384/0x1680 [i915]
However, it is not a good idea to run the slow compression inside atomic
context, so we choose not to.
Fixes: 895d8ebeaa ("drm/i915: error capture with no ggtt slot")
Signed-off-by: Bruce Chang <yu.bruce.chang@intel.com>
Reviewed-by: Brian Welty <brian.welty@intel.com>
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Link: https://patchwork.freedesktop.org/patch/msgid/20191113231104.24208-1-yu.bruce.chang@intel.com
The bspec was just updated with a minor correction to entry 61 (it
shouldn't have had the SCF bit set).
v2:
- Add a MOCS_ENTRY_UNUSED() and use it to declare the
explicitly-reserved MOCS entries. (Lucas)
- Move the warning suppression from the Makefile to a #pragma that only
affects the TGL table. (Lucas)
v3:
- Entries 16 and 17 are identical to ICL now, so no need to explicitly
adjust them (or mess with compiler warning overrides).
Bspec: 45101
Fixes: 2ddf992179 ("drm/i915/tgl: Define MOCS entries for Tigerlake")
Cc: Tomasz Lis <tomasz.lis@intel.com>
Cc: Lucas De Marchi <lucas.demarchi@intel.com>
Cc: Francisco Jerez <francisco.jerez.plata@intel.com>
Cc: Jon Bloomfield <jon.bloomfield@intel.com>
Signed-off-by: Matt Roper <matthew.d.roper@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20191112224757.25116-2-matthew.d.roper@intel.com
Reviewed-by: Francisco Jerez <currojerez@riseup.net>
Reviewed-by: Lucas De Marchi <lucas.demarchi@intel.com>
Reviewed-by: Tomasz Lis <tomasz.lis@intel.com>
(cherry picked from commit bfb0e8e63d)
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
This reverts commit f4071997f1.
These extra EHL entries won't behave as expected without a bit more work
on the kernel side so let's drop them until that kernel work has had a
chance to land. Userspace trying to use these new entries won't get the
advantage of the new functionality these entries are meant to provide,
but at least it won't misbehave.
When we do add these back in the future, we'll probably want to
explicitly use separate tables for ICL and EHL so that userspace
software that mistakenly uses these entries (which are undefined on ICL)
sees the same behavior it sees with all the other undefined entries.
Cc: Francisco Jerez <francisco.jerez.plata@intel.com>
Cc: Jon Bloomfield <jon.bloomfield@intel.com>
Cc: Lucas De Marchi <lucas.demarchi@intel.com>
Cc: <stable@vger.kernel.org> # v5.3+
Fixes: f4071997f1 ("drm/i915/ehl: Update MOCS table for EHL")
Signed-off-by: Matt Roper <matthew.d.roper@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20191112224757.25116-1-matthew.d.roper@intel.com
Reviewed-by: Francisco Jerez <currojerez@riseup.net>
(cherry picked from commit 046091758b)
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
Tiger Lake supports HDMI on port A. For other platforms we ignore what
the VBT says regarding HDMI to workaround broken VBTs, see
commit 2ba7d7e043 ("drm/i915/bios: ignore HDMI on port A"). Make this
apply gen12+ so they inherit the TGL behavior.
Signed-off-by: Lucas De Marchi <lucas.demarchi@intel.com>
Reviewed-by: Matt Roper <matthew.d.roper@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20191113021935.41547-1-lucas.demarchi@intel.com
This register was being enabled after enable TRANS_DDI_FUNC_CTL and
PIPECONF/TRANS_CONF while BSpec states that it should be set when
enabling TRANS_DDI_FUNC_CTL.
BSpec: 49190
BSpec: 22243
Reviewed-by: Lucas De Marchi <lucas.demarchi@intel.com>
Cc: Lucas De Marchi <lucas.demarchi@intel.com>
Signed-off-by: José Roberto de Souza <jose.souza@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20191107214559.77087-3-jose.souza@intel.com
Adding pipe D support to DSI transcoder.
Not adding it for EDP transcoder code paths as only TGL has 4 pipes
and it do not have a EDP transcoder.
Reviewed-by: Lucas De Marchi <lucas.demarchi@intel.com>
Cc: Lucas De Marchi <lucas.demarchi@intel.com>
Signed-off-by: José Roberto de Souza <jose.souza@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20191107214559.77087-2-jose.souza@intel.com
TRANS_DDI_MST_TRANSPORT_SELECT is 2 bits wide not 3, it was taking
one bit from EDP/DSI Input Select.
Fixes: b3545e0868 ("drm/i915/tgl: add support to one DP-MST stream")
Cc: Lucas De Marchi <lucas.demarchi@intel.com>
Signed-off-by: José Roberto de Souza <jose.souza@intel.com>
Reviewed-by: Lucas De Marchi <lucas.demarchi@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20191107214559.77087-1-jose.souza@intel.com
When we call intel_bios_is_valid_vbt(), size may not actually be the
size of the VBT, but rather the size of the blob the VBT is contained
in. For example, when mapping the PCI oprom, size will be the entire
oprom size. We don't want to read beyond what is reported to be the
VBT. So make sure we vbt->vbt_size makes sense and use that for
the latter checks.
v2: check for vbt_size after checking for vbt signature and give it a
more meaningful error message (from Jani)
Signed-off-by: Lucas De Marchi <lucas.demarchi@intel.com>
Reviewed-by: Jani Nikula <jani.nikula@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20191108003602.33526-3-lucas.demarchi@intel.com
Still the saga of the hsw live_blt incoherency continues. While it did
seem that the invalidate before the breadcrumb had improved the mtbf,
nevertheless live_blt still failed. Mika's next idea was to pull the
invalidate-stall into the breadcrumb write itself.
References: 860afa0868 ("drm/i915/gt: Flush gen7 even harder")
References: https://bugs.freedesktop.org/show_bug.cgi?id=112147
Testcase: igt/i915_selftest/live_blt
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Mika Kuoppala <mika.kuoppala@linux.intel.com>
Acked-by: Mika Kuoppala <mika.kuoppala@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20191113151956.32242-1-chris@chris-wilson.co.uk
The bspec was just updated with a minor correction to entry 61 (it
shouldn't have had the SCF bit set).
v2:
- Add a MOCS_ENTRY_UNUSED() and use it to declare the
explicitly-reserved MOCS entries. (Lucas)
- Move the warning suppression from the Makefile to a #pragma that only
affects the TGL table. (Lucas)
v3:
- Entries 16 and 17 are identical to ICL now, so no need to explicitly
adjust them (or mess with compiler warning overrides).
Bspec: 45101
Fixes: 2ddf992179 ("drm/i915/tgl: Define MOCS entries for Tigerlake")
Cc: Tomasz Lis <tomasz.lis@intel.com>
Cc: Lucas De Marchi <lucas.demarchi@intel.com>
Cc: Francisco Jerez <francisco.jerez.plata@intel.com>
Cc: Jon Bloomfield <jon.bloomfield@intel.com>
Signed-off-by: Matt Roper <matthew.d.roper@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20191112224757.25116-2-matthew.d.roper@intel.com
Reviewed-by: Francisco Jerez <currojerez@riseup.net>
Reviewed-by: Lucas De Marchi <lucas.demarchi@intel.com>
Reviewed-by: Tomasz Lis <tomasz.lis@intel.com>
This reverts commit f4071997f1.
These extra EHL entries won't behave as expected without a bit more work
on the kernel side so let's drop them until that kernel work has had a
chance to land. Userspace trying to use these new entries won't get the
advantage of the new functionality these entries are meant to provide,
but at least it won't misbehave.
When we do add these back in the future, we'll probably want to
explicitly use separate tables for ICL and EHL so that userspace
software that mistakenly uses these entries (which are undefined on ICL)
sees the same behavior it sees with all the other undefined entries.
Cc: Francisco Jerez <francisco.jerez.plata@intel.com>
Cc: Jon Bloomfield <jon.bloomfield@intel.com>
Cc: Lucas De Marchi <lucas.demarchi@intel.com>
Cc: <stable@vger.kernel.org> # v5.3+
Fixes: f4071997f1 ("drm/i915/ehl: Update MOCS table for EHL")
Signed-off-by: Matt Roper <matthew.d.roper@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20191112224757.25116-1-matthew.d.roper@intel.com
Reviewed-by: Francisco Jerez <currojerez@riseup.net>
The setting of MSA is done by the DDI .pre_enable() hook. And when we are
using MST, the MSA is only set to first mst stream by calling of
DDI .pre_eanble() hook. It raies issues to non-first mst streams.
Wrong MSA or missed MSA packets might show scrambled screen or wrong
screen.
This splits a setting of MSA to MST and SST cases. And In the MST case it
will call a setting of MSA after an allocating of Virtual Channel from
MST encoder pre_enable callback.
Cc: Ville Syrjälä <ville.syrjala@linux.intel.com>
Cc: Stanislav Lisovskiy <stanislav.lisovskiy@intel.com>
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=112212
Fixes: 0c06fa1560 ("drm/i915/dp: Add support of BT.2020 Colorimetry to DP MSA")
Fixes: d4a415dcda ("drm/i915: Fix MST oops due to MSA changes")
Signed-off-by: Gwan-gyeong Mun <gwan-gyeong.mun@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20191106212636.502471-1-gwan-gyeong.mun@intel.com
Reviewed-by: Lucas De Marchi <lucas.demarchi@intel.com>
[vsyrjala: nuke spurious newline]
Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
(cherry picked from commit bd8c9cca88)
Signed-off-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20191113125241.20547-1-ville.syrjala@linux.intel.com
According to internal documents I found for CMP PCHs the PCI ID 0xA3C1
belongs to a CMP-V chipset. Based on the same docs the programming of
the PCH is compatible with that of KBP. Fix up my previous wrong
assumption accordingly using the SPT programming which in turn is the
basis for KBP.
The original bug reporter verified that this is the correct PCH
identification (the only way we'll program valid DDC pin-pair values to
the GMBUS register) and the Windows team uses the same identification
(that is using the KBP programming model for this PCH).
I filed the necessary Bspec update requests (BSpec/33734).
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=112051
Fixes: 37c92dc303 ("drm/i915: Add new CNL PCH ID seen on a CML platform")
Reported-and-tested-by: Cyrus <cyrus.lien@canonical.com>
Cc: Cyrus <cyrus.lien@canonical.com>
Cc: Timo Aaltonen <tjaalton@ubuntu.com>
Cc: José Roberto de Souza <jose.souza@intel.com>
Signed-off-by: Imre Deak <imre.deak@intel.com>
Reviewed-by: José Roberto de Souza <jose.souza@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20191112104608.24587-1-imre.deak@intel.com
The gem_ctx_persistence/smoketest was detecting an odd coherency issue
inside the LRC context image; that the address of the ring buffer did
not match our associated struct intel_ring. As we set the address into
the context image when we pin the ring buffer into place before the
context is active, that leaves the question of where did it get
overwritten. Either the HW context save occurred after our pin which
would imply that our idle barriers are broken, or we overwrote the
context image ourselves. It is only in reset_active() where we dabble
inside the context image outside of a serialised path from schedule-out;
but we could equally perform the operation inside schedule-in which is
then fully serialised with the context pin -- and remains serialised by
the engine pulse with kill_context(). (The only downside, aside from
doing more work inside the engine->active.lock, was the plan to merge
all the reset paths into doing their context scrubbing on schedule-out
needs more thought.)
Fixes: d12acee84f ("drm/i915/execlists: Cancel banned contexts on schedule-out")
Testcase: igt/gem_ctx_persistence/smoketest
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Tvrtko Ursulin <tvrtko.ursulin@linux.intel.com>
Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20191111133205.11590-3-chris@chris-wilson.co.uk
(cherry picked from commit 31b61f0ef9)
Signed-off-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
The warning should be just a warning. Where it is currently is wrong
since we already registered the connector on drm, meaning it dies later
on a NULL pointer deref if the VBT-overriding we have is removed. Move
the warning up.
Signed-off-by: Lucas De Marchi <lucas.demarchi@intel.com>
Reviewed-by: Matt Roper <matthew.d.roper@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20191108214251.79305-1-lucas.demarchi@intel.com
Since CNP it's possible for rawclk to have two different values, 19.2
and 24 MHz. If the value indicated by SFUSE_STRAP register is different
from the power on default for PCH_RAWCLK_FREQ, we'll end up having a
mismatch between the rawclk hardware and software states after
suspend/resume. On previous platforms this used to work by accident,
because the power on defaults worked just fine.
Update the rawclk also on resume. The natural place to do this would be
intel_modeset_init_hw(), however VLV/CHV need it done before
intel_power_domains_init_hw(). Thus put it there even if it feels
slightly out of place.
v2: Call intel_update_rawclck() in intel_power_domains_init_hw() for all
platforms (Ville).
Reported-by: Shawn Lee <shawn.c.lee@intel.com>
Cc: Shawn Lee <shawn.c.lee@intel.com>
Cc: Ville Syrjala <ville.syrjala@linux.intel.com>
Reviewed-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Tested-by: Shawn Lee <shawn.c.lee@intel.com>
Signed-off-by: Jani Nikula <jani.nikula@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20191101142024.13877-1-jani.nikula@intel.com
(cherry picked from commit 59ed05ccdd)
Cc: <stable@vger.kernel.org> # v4.15+
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
live_blt is still failing on hsw, showing the hallmark of incoherency.
Since we are fairly certain that the interrupt is after the seqno is
visible, the other possibility is that the seqno is before the writes to
memory are flushed. Throw in an TLB invalidate before the breadcrumb as
we are reasonably confident that forces a CS stall.
References: f9228f7658 ("drm/i915/gt: Try an extra flush on the Haswell blitter")
References: https://bugs.freedesktop.org/show_bug.cgi?id=112147
Testcase: igt/i915_selftest/live_blt
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Mika Kuoppala <mika.kuoppala@linux.intel.com>
Acked-by: Mika Kuoppala <mika.kuoppala@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20191112160941.23969-1-chris@chris-wilson.co.uk
drivers/gpu/drm/i915/gem/selftests/i915_gem_object_blt.c:453 igt_threaded_blt() error: uninitialized symbol 'file'.
Fixes: 34485832cb ("drm/i915/selftests: Exercise parallel blit operations on a single ctx")
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Matthew Auld <matthew.auld@intel.com>
Reviewed-by: Matthew Auld <matthew.auld@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20191112163643.3527-1-chris@chris-wilson.co.uk
Using the array is getting clumsy. Make things a bit more dynamic.
Remove early returns on not having child devices when the end result
after "iterating" the empty list would be the same.
v3:
- use list_add_tail to not reverse the child device list (Ville)
v2:
- stick to previous naming of child devices (Ville)
- use kzalloc, handle failure
- initialize list head earlier to keep intel_bios_driver_remove() safe
Cc: Ville Syrjala <ville.syrjala@linux.intel.com>
Reviewed-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Signed-off-by: Jani Nikula <jani.nikula@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/3e72da0b412354ed8be6719df55b0e0cc4caa61a.1573227240.git.jani.nikula@intel.com
On gen7, including Haswell, the MI_FLUSH_DW command is not synchronous
with the command streamer nor is there an option to make it so. To hide
this we add a large delay on the CS so that the breadcrumb should always
be visible before the interrupt. However, that does not seem to be
enough to ensure the memory is actually coherent with the read of the
breadcrumb. The breadcrumb update is a post-sync op... Throw in a
preliminary MI_FLUSH_DW before the breadcrumb update in the hope that
helps.
References: https://bugs.freedesktop.org/show_bug.cgi?id=112147
Testcase: igt/i915_selftest/live_blt
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Acked-by: Mika Kuoppala <mika.kuoppala@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20191111120957.17732-1-chris@chris-wilson.co.uk
Since we removed the pm hookup from the GT, the hook in
drm_i915_private.gem is unused. Remove it.
References: 18f3b2727f ("drm/i915: Remove pm park/unpark notifications")
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20191112113434.31088-1-chris@chris-wilson.co.uk
set_page_dirty says:
For pages with a mapping this should be done under the page lock
for the benefit of asynchronous memory errors who prefer a
consistent dirty state. This rule can be broken in some special
cases, but should be better not to.
Under those rules, it is only safe for us to use the plain set_page_dirty
calls for shmemfs/anonymous memory. Userptr may be used with real
mappings and so needs to use the locked version (set_page_dirty_lock).
However, following a try_to_unmap() we may want to remove the userptr and
so call put_pages(). However, try_to_unmap() acquires the page lock and
so we must avoid recursively locking the pages ourselves -- which means
that we cannot safely acquire the lock around set_page_dirty(). Since we
can't be sure of the lock, we have to risk skip dirtying the page, or
else risk calling set_page_dirty() without a lock and so risk fs
corruption.
Bugzilla: https://bugzilla.kernel.org/show_bug.cgi?id=203317
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=112012
Fixes: 5cc9ed4b9a ("drm/i915: Introduce mapping of user pages into video memory (userptr) ioctl")
References: cb6d7c7dc7 ("drm/i915/userptr: Acquire the page lock around set_page_dirty()")
References: 505a8ec7e1 ("Revert "drm/i915/userptr: Acquire the page lock around set_page_dirty()"")
References: 6dcc693bc5 ("ext4: warn when page is dirtied without buffers")
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Cc: stable@vger.kernel.org
Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20191111133205.11590-1-chris@chris-wilson.co.uk
(cherry picked from commit 0d4bbe3d40)
Signed-off-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
We report "frequencies" (actual-frequency, requested-frequency) as the
number of accumulated cycles so that the average frequency over that
period may be determined by the user. This means the units we report to
the user are Mcycles (or just M), not MHz.
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Cc: stable@vger.kernel.org
Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20191109105356.5273-1-chris@chris-wilson.co.uk
(cherry picked from commit e88866ef02)
Signed-off-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Inside print_request(), we query the context/timeline name. Nothing
immediately protects the context from being freed if the request is
complete -- we rely on serialisation by the caller to keep the name
valid until they finish using it. Inside intel_engine_dump(), we
generally only print the requests in the execution queue protected by the
engine->active.lock, but we also show the pending execlists ports which
are not protected and so require a rcu_read_lock to keep the pointer
valid.
[ 1695.700883] BUG: KASAN: use-after-free in i915_fence_get_timeline_name+0x53/0x90 [i915]
[ 1695.700981] Read of size 8 at addr ffff8887344f4d50 by task gem_ctx_persist/2968
[ 1695.701068]
[ 1695.701156] CPU: 1 PID: 2968 Comm: gem_ctx_persist Tainted: G U 5.4.0-rc6+ #331
[ 1695.701246] Hardware name: Intel Corporation NUC7i5BNK/NUC7i5BNB, BIOS BNKBL357.86A.0052.2017.0918.1346 09/18/2017
[ 1695.701334] Call Trace:
[ 1695.701424] dump_stack+0x5b/0x90
[ 1695.701870] ? i915_fence_get_timeline_name+0x53/0x90 [i915]
[ 1695.701964] print_address_description.constprop.7+0x36/0x50
[ 1695.702408] ? i915_fence_get_timeline_name+0x53/0x90 [i915]
[ 1695.702856] ? i915_fence_get_timeline_name+0x53/0x90 [i915]
[ 1695.702947] __kasan_report.cold.10+0x1a/0x3a
[ 1695.703390] ? i915_fence_get_timeline_name+0x53/0x90 [i915]
[ 1695.703836] i915_fence_get_timeline_name+0x53/0x90 [i915]
[ 1695.704241] print_request+0x82/0x2e0 [i915]
[ 1695.704638] ? fwtable_read32+0x133/0x360 [i915]
[ 1695.705042] ? write_timestamp+0x110/0x110 [i915]
[ 1695.705133] ? _raw_spin_lock_irqsave+0x79/0xc0
[ 1695.705221] ? refcount_inc_not_zero_checked+0x91/0x110
[ 1695.705306] ? refcount_dec_and_mutex_lock+0x50/0x50
[ 1695.705709] ? intel_engine_find_active_request+0x202/0x230 [i915]
[ 1695.706115] intel_engine_dump+0x2c9/0x900 [i915]
Fixes: c36eebd9ba ("drm/i915/gt: execlists->active is serialised by the tasklet")
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Mika Kuoppala <mika.kuoppala@linux.intel.com>
Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Reviewed-by: Mika Kuoppala <mika.kuoppala@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20191111114323.5833-1-chris@chris-wilson.co.uk
(cherry picked from commit fecffa4668)
Signed-off-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
The ordering of the checks in the existing code can lead to holding
preemption not being considered as privileged op.
Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Fixes: 9cd20ef780 ("drm/i915/perf: allow holding preemption on filtered ctx")
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
Link: https://patchwork.freedesktop.org/patch/msgid/20191111095308.2550-1-lionel.g.landwerlin@intel.com
(cherry picked from commit 0b0120d4c7)
Signed-off-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
This fixes two different classes of bugs in the Intel graphics hardware:
MMIO register read hang:
"On Intels Gen8 and Gen9 Graphics hardware, a read of specific graphics
MMIO registers when the product is in certain low power states causes
a system hang.
There are two potential triggers for DoS:
a) H/W corruption of the RC6 save/restore vector
b) Hard hang within the MIPI hardware
This prevents the DoS in two areas of the hardware:
1) Detect corruption of RC6 address on exit from low-power state,
and if we find it corrupted, disable RC6 and RPM
2) Permanently lower the MIPI MMIO timeout"
Blitter command streamer unrestricted memory accesses:
"On Intels Gen9 Graphics hardware the Blitter Command Streamer (BCS)
allows writing to Memory Mapped Input Output (MMIO) that should be
blocked. With modifications of page tables, this can lead to privilege
escalation. This exposure is limited to the Guest Physical Address
space and does not allow for access outside of the graphics virtual
machine.
This series establishes a software parser into the Blitter command
stream to scan for, and prevent, reads or writes to MMIO's that should
not be accessible to non-privileged contexts.
Much of the command parser infrastructure has existed for some time,
and is used on Ivybridge/Haswell/Valleyview derived products to allow
the use of features normally blocked by hardware. In this legacy
context, the command parser is employed to allow normally unprivileged
submissions to be run with elevated privileges in order to grant
access to a limited set of extra capabilities. In this mode the parser
is optional; In the event that the parser finds any construct that it
cannot properly validate (e.g. nested command buffers), it simply
aborts the scan and submits the buffer in non-privileged mode.
For Gen9 Graphics, this series makes the parser mandatory for all
Blitter submissions. The incoming user buffer is first copied to a
kernel owned buffer, and parsed. If all checks are successful the
kernel owned buffer is mapped READ-ONLY and submitted on behalf of the
user. If any checks fail, or the parser is unable to complete the scan
(nested buffers), it is forcibly rejected. The successfully scanned
buffer is executed with NORMAL user privileges (key difference from
legacy usage).
Modern usermode does not use the Blitter on later hardware, having
switched over to using the 3D engine instead for performance reasons.
There are however some legacy usermode apps that rely on Blitter,
notably the SNA X-Server. There are no known usermode applications
that require nested command buffers on the Blitter, so the forcible
rejection of such buffers in this patch series is considered an
acceptable limitation"
* Intel graphics fixes in emailed bundle from Jon Bloomfield <jon.bloomfield@intel.com>:
drm/i915/cmdparser: Fix jump whitelist clearing
drm/i915/gen8+: Add RC6 CTX corruption WA
drm/i915: Lower RM timeout to avoid DSI hard hangs
drm/i915/cmdparser: Ignore Length operands during command matching
drm/i915/cmdparser: Add support for backward jumps
drm/i915/cmdparser: Use explicit goto for error paths
drm/i915: Add gen9 BCS cmdparsing
drm/i915: Allow parsing of unsized batches
drm/i915: Support ro ppgtt mapped cmdparser shadow buffers
drm/i915: Add support for mandatory cmdparsing
drm/i915: Remove Master tables from cmdparser
drm/i915: Disable Secure Batches for gen6+
drm/i915: Rename gen7 cmdparser tables
Pass around the intended intel_uncore for mmio access during stolen
setup, and avoid relying on the implicit magic I915_READ() macros.
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Matthew Auld <matthew.auld@intel.com>
Reviewed-by: Matthew Auld <matthew.auld@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20191111182143.23479-1-chris@chris-wilson.co.uk
Some basic information that is useful to know, such as how many cycles
is a MI_NOOP.
v2: Keep volatile pages pinned at all times! (Matthew)
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Anna Karas <anna.karas@intel.com>
Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Reviewed-by: Matthew Auld <matthew.auld@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20191111172716.23733-1-chris@chris-wilson.co.uk
The gem_ctx_persistence/smoketest was detecting an odd coherency issue
inside the LRC context image; that the address of the ring buffer did
not match our associated struct intel_ring. As we set the address into
the context image when we pin the ring buffer into place before the
context is active, that leaves the question of where did it get
overwritten. Either the HW context save occurred after our pin which
would imply that our idle barriers are broken, or we overwrote the
context image ourselves. It is only in reset_active() where we dabble
inside the context image outside of a serialised path from schedule-out;
but we could equally perform the operation inside schedule-in which is
then fully serialised with the context pin -- and remains serialised by
the engine pulse with kill_context(). (The only downside, aside from
doing more work inside the engine->active.lock, was the plan to merge
all the reset paths into doing their context scrubbing on schedule-out
needs more thought.)
Fixes: d12acee84f ("drm/i915/execlists: Cancel banned contexts on schedule-out")
Testcase: igt/gem_ctx_persistence/smoketest
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Tvrtko Ursulin <tvrtko.ursulin@linux.intel.com>
Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20191111133205.11590-3-chris@chris-wilson.co.uk
When a jump_whitelist bitmap is reused, it needs to be cleared.
Currently this is done with memset() and the size calculation assumes
bitmaps are made of 32-bit words, not longs. So on 64-bit
architectures, only the first half of the bitmap is cleared.
If some whitelist bits are carried over between successive batches
submitted on the same context, this will presumably allow embedding
the rogue instructions that we're trying to reject.
Use bitmap_zero() instead, which gets the calculation right.
Fixes: f8c08d8fae ("drm/i915/cmdparser: Add support for backward jumps")
Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
Signed-off-by: Jon Bloomfield <jon.bloomfield@intel.com>
Enable gup to retry and fault the pages outside of the mmap_sem lock in
our worker. As we are inside our worker, outside of any critical path,
we can allow the mmap_sem lock to be dropped in order to service a page
fault; this in turn allows the mm to populate the page using a slow
fault handler.
References: 5b56d49fc3 ("mm: add locked parameter to get_user_pages_remote()")
Testcase: igt/gem_userptr/userfault
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20191111133205.11590-2-chris@chris-wilson.co.uk
set_page_dirty says:
For pages with a mapping this should be done under the page lock
for the benefit of asynchronous memory errors who prefer a
consistent dirty state. This rule can be broken in some special
cases, but should be better not to.
Under those rules, it is only safe for us to use the plain set_page_dirty
calls for shmemfs/anonymous memory. Userptr may be used with real
mappings and so needs to use the locked version (set_page_dirty_lock).
However, following a try_to_unmap() we may want to remove the userptr and
so call put_pages(). However, try_to_unmap() acquires the page lock and
so we must avoid recursively locking the pages ourselves -- which means
that we cannot safely acquire the lock around set_page_dirty(). Since we
can't be sure of the lock, we have to risk skip dirtying the page, or
else risk calling set_page_dirty() without a lock and so risk fs
corruption.
Bugzilla: https://bugzilla.kernel.org/show_bug.cgi?id=203317
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=112012
Fixes: 5cc9ed4b9a ("drm/i915: Introduce mapping of user pages into video memory (userptr) ioctl")
References: cb6d7c7dc7 ("drm/i915/userptr: Acquire the page lock around set_page_dirty()")
References: 505a8ec7e1 ("Revert "drm/i915/userptr: Acquire the page lock around set_page_dirty()"")
References: 6dcc693bc5 ("ext4: warn when page is dirtied without buffers")
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Cc: stable@vger.kernel.org
Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20191111133205.11590-1-chris@chris-wilson.co.uk
The setting of MSA is done by the DDI .pre_enable() hook. And when we are
using MST, the MSA is only set to first mst stream by calling of
DDI .pre_eanble() hook. It raies issues to non-first mst streams.
Wrong MSA or missed MSA packets might show scrambled screen or wrong
screen.
This splits a setting of MSA to MST and SST cases. And In the MST case it
will call a setting of MSA after an allocating of Virtual Channel from
MST encoder pre_enable callback.
Cc: Ville Syrjälä <ville.syrjala@linux.intel.com>
Cc: Stanislav Lisovskiy <stanislav.lisovskiy@intel.com>
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=112212
Fixes: 0c06fa1560 ("drm/i915/dp: Add support of BT.2020 Colorimetry to DP MSA")
Fixes: d4a415dcda ("drm/i915: Fix MST oops due to MSA changes")
Signed-off-by: Gwan-gyeong Mun <gwan-gyeong.mun@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20191106212636.502471-1-gwan-gyeong.mun@intel.com
Reviewed-by: Lucas De Marchi <lucas.demarchi@intel.com>
[vsyrjala: nuke spurious newline]
Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Having been forced to reduce Braswell back to using the aliasing ppgtt,
the coherency issue we previously observed cannot impact us. Reduce the
performance penalty imposed on all platforms from using the mfence to a
mere sfence.
References: cf66b8a0ba ("drm/i915/execlists: Apply a full mb before execution for Braswell")
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Mika Kuoppala <mika.kuoppala@linux.intel.com>
Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Reviewed-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20191110185806.17413-10-chris@chris-wilson.co.uk
As the ftrace buffer is single shot, once dumped it will not update. As
such, it only provides information for the first bug and all subsequent
bugs are noise. The goal of CI is to have zero bugs, so taint the kernel
causing CI to reboot the machine; fix the bug and move on.
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Mika Kuoppala <mika.kuoppala@linux.intel.com>
Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Reviewed-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20191110185806.17413-13-chris@chris-wilson.co.uk
To test mmap_offset_exhaustion, we first have to fill the entire vma
manager leaving a single page. Don't assume that the vma manager is not
already fragment, and fill all the holes.
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Matthew Auld <matthew.auld@intel.com>
Reviewed-by: Matthew Auld <matthew.auld@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20191111122706.28292-2-chris@chris-wilson.co.uk
We report "frequencies" (actual-frequency, requested-frequency) as the
number of accumulated cycles so that the average frequency over that
period may be determined by the user. This means the units we report to
the user are Mcycles (or just M), not MHz.
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Cc: stable@vger.kernel.org
Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20191109105356.5273-1-chris@chris-wilson.co.uk
If we detect a hang in a closed context, just flush all of its requests
and cancel any remaining execution along the context. Note that after
closing the context, the last reference to the context may be dropped,
leaving it only valid under RCU.
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Mika Kuoppala <mika.kuoppala@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20191111114323.5833-5-chris@chris-wilson.co.uk
We mention that we are resetting the GPU, and dump the device state for
post mortem debugging. However, while that dump contains the active
processes and the one flagged as causing the error, we do not always
include that information in dmesg. Include the name of the guilty
process in dmesg for reference.
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Mika Kuoppala <mika.kuoppala@linux.intel.com>
Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Reviewed-by: Mika Kuoppala <mika.kuoppala@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20191111114323.5833-4-chris@chris-wilson.co.uk
Use a small char buffer inside the i915_gem_context to store the user
friendly name so that ctx->name has the same lifetime as the RCU
protected GEM context. That is, e.g. when using print_request() that
prints the timeline name (ctx->name), the name will not be prematurely
freed upon the context being closed and the last reference dropped.
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Cc: Matthew Auld <matthew.auld@intel.com>
Reviewed-by: Mika Kuoppala <mika.kuoppala@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20191111114323.5833-2-chris@chris-wilson.co.uk
Inside print_request(), we query the context/timeline name. Nothing
immediately protects the context from being freed if the request is
complete -- we rely on serialisation by the caller to keep the name
valid until they finish using it. Inside intel_engine_dump(), we
generally only print the requests in the execution queue protected by the
engine->active.lock, but we also show the pending execlists ports which
are not protected and so require a rcu_read_lock to keep the pointer
valid.
[ 1695.700883] BUG: KASAN: use-after-free in i915_fence_get_timeline_name+0x53/0x90 [i915]
[ 1695.700981] Read of size 8 at addr ffff8887344f4d50 by task gem_ctx_persist/2968
[ 1695.701068]
[ 1695.701156] CPU: 1 PID: 2968 Comm: gem_ctx_persist Tainted: G U 5.4.0-rc6+ #331
[ 1695.701246] Hardware name: Intel Corporation NUC7i5BNK/NUC7i5BNB, BIOS BNKBL357.86A.0052.2017.0918.1346 09/18/2017
[ 1695.701334] Call Trace:
[ 1695.701424] dump_stack+0x5b/0x90
[ 1695.701870] ? i915_fence_get_timeline_name+0x53/0x90 [i915]
[ 1695.701964] print_address_description.constprop.7+0x36/0x50
[ 1695.702408] ? i915_fence_get_timeline_name+0x53/0x90 [i915]
[ 1695.702856] ? i915_fence_get_timeline_name+0x53/0x90 [i915]
[ 1695.702947] __kasan_report.cold.10+0x1a/0x3a
[ 1695.703390] ? i915_fence_get_timeline_name+0x53/0x90 [i915]
[ 1695.703836] i915_fence_get_timeline_name+0x53/0x90 [i915]
[ 1695.704241] print_request+0x82/0x2e0 [i915]
[ 1695.704638] ? fwtable_read32+0x133/0x360 [i915]
[ 1695.705042] ? write_timestamp+0x110/0x110 [i915]
[ 1695.705133] ? _raw_spin_lock_irqsave+0x79/0xc0
[ 1695.705221] ? refcount_inc_not_zero_checked+0x91/0x110
[ 1695.705306] ? refcount_dec_and_mutex_lock+0x50/0x50
[ 1695.705709] ? intel_engine_find_active_request+0x202/0x230 [i915]
[ 1695.706115] intel_engine_dump+0x2c9/0x900 [i915]
Fixes: c36eebd9ba ("drm/i915/gt: execlists->active is serialised by the tasklet")
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Mika Kuoppala <mika.kuoppala@linux.intel.com>
Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Reviewed-by: Mika Kuoppala <mika.kuoppala@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20191111114323.5833-1-chris@chris-wilson.co.uk
After doing some measuring, Icelake behaves on a par with Broadwell, and
without having to compromise for low power cores with long latencies, we
can reduce the powergating hysteresis so that the powersaving is enabled
faster. No impact observed on client side throughput measures (so
negligible increase in extra switching), and inspection from high
frequency polling using igt/gem_exec_nop/sequential, provided an estimate
for the upper bound before we can measure a substantial impact on
latency.
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Mika Kuoppala <mika.kuoppala@linux.intel.com>
Reviewed-by: Mika Kuoppala <mika.kuoppala@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20191110185806.17413-9-chris@chris-wilson.co.uk
The LUTs are single buffered so in order to program them without
tearing we'd have to do it during vblank (actually to be 100%
effective it has to happen between start of vblank and frame start).
We have no proper mechanism for that at the moment so we just
defer loading them after the vblank waits have happened. That
is not quite sufficient (especially when committing multiple pipes
whose vblanks don't line up) so the LUT load will often leak into
the following frame causing tearing.
However in case the hardware wasn't previously using the LUT we
can preload it before setting the enable bit (which is double
buffered so won't tear). Let's determine if we can do such
preloading and make it happen. Slight variation between the
hardware requires some platforms specifics in the checks.
Hans is seeing ugly colored flash on VLV/CHV macchines (GPD win
and Asus T100HA) when the gamma LUT gets loaded for the first
time as the BIOS has left some junk in the LUT memory.
v2: Deal with uapi vs. hw crtc state split
s/GCM/CGM/ typo fix
Cc: Hans de Goede <hdegoede@redhat.com>
Fixes: 051a6d8d3c ("drm/i915: Move LUT programming to happen after vblank waits")
Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20191030190815.7359-1-ville.syrjala@linux.intel.com
Tested-by: Hans de Goede <hdegoede@redhat.com>
Reviewed-by: Hans de Goede <hdegoede@redhat.com>
(cherry picked from commit 0ccc42a2fd)
Signed-off-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Make sure we have a crtc before probing its primary plane's
max stride. Initially I thought we can't get this far without
crtcs, but looks like we can via the dumb_create ioctl.
Not sure if we shouldn't disable dumb buffer support entirely
when we have no crtcs, but that would require some amount of work
as the only thing currently being checked is dev->driver->dumb_create
which we'd have to convert to some device specific dynamic thing.
Cc: stable@vger.kernel.org
Reported-by: Mika Kuoppala <mika.kuoppala@linux.intel.com>
Fixes: aa5ca8b742 ("drm/i915: Align dumb buffer stride to 4k to allow for gtt remapping")
Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20191106172349.11987-1-ville.syrjala@linux.intel.com
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
(cherry picked from commit baea9ffe64)
Signed-off-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
The intel_dp_link_training.h include has no need or place in
intel_display.h. Include it in intel_display.c instead.
Cc: Manasi Navare <manasi.d.navare@intel.com>
Fixes: eadf6f9170 ("drm/i915/display/icl: Enable master-slaves in trans port sync")
Reviewed-by: Manasi Navare <manasi.d.navare@intel.com>
Signed-off-by: Jani Nikula <jani.nikula@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20191029103947.7535-1-jani.nikula@intel.com
(cherry picked from commit 3c954c418e)
Signed-off-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
When inside the lock, remember to unlock even if you want to leave
early.
Reported-by: Dan Carpenter <dan.carpenter@oracle.com>
Fixes: a4e7ccdac3 ("drm/i915: Move context management under GEM")
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20191106144155.25727-1-chris@chris-wilson.co.uk
(cherry picked from commit feba2b8146)
Signed-off-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Mika spotted that only using cancel_delayed_work() could mean that we
attempted to clear the heartbeat.systole while the worker was still
running. Rectify the situation by only touching the systole from outside
the worker if we suceeded in cancelling the worker before it could run.
The worker is expected to clean up by itself upon idling.
Reported-by: Mika Kuoppala <mika.kuoppala@linux.intel.com>
Fixes: 058179e72e ("drm/i915/gt: Replace hangcheck by heartbeats")
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Mika Kuoppala <mika.kuoppala@linux.intel.com>
Reviewed-by: Mika Kuoppala <mika.kuoppala@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20191106133129.17732-1-chris@chris-wilson.co.uk
(cherry picked from commit 841e867286)
Signed-off-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Make sure we have a crtc before probing its primary plane's
max stride. Initially I thought we can't get this far without
crtcs, but looks like we can via the dumb_create ioctl.
Not sure if we shouldn't disable dumb buffer support entirely
when we have no crtcs, but that would require some amount of work
as the only thing currently being checked is dev->driver->dumb_create
which we'd have to convert to some device specific dynamic thing.
Cc: stable@vger.kernel.org
Reported-by: Mika Kuoppala <mika.kuoppala@linux.intel.com>
Fixes: aa5ca8b742 ("drm/i915: Align dumb buffer stride to 4k to allow for gtt remapping")
Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20191106172349.11987-1-ville.syrjala@linux.intel.com
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
On gen7, we have to avoid concurrent access to the same mmio cacheline,
and so coordinate all mmio access with the uncore->lock. However, for
pmu, we want to avoid perturbing the system and disabling interrupts
unnecessarily, so restrict the w/a to gen7 where it is requied to
prevent machine hangs.
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20191108103511.20951-2-chris@chris-wilson.co.uk
We want to avoid taking forcewake when querying the performance stats,
as we wish to avoid perturbing the system under observation. (And with
the forcewake being kept alive for 1ms after use, sampling the frequency
from a 200Hz timer keeps forcewake 40% active.)
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20191108103511.20951-1-chris@chris-wilson.co.uk
In the selftests, where we are accessing a private ctx from within the
confines of a single test, we know that the ctx->vm pointer is static
and bounded by the lifetime of the test. We can use a simple helper to
provide the RCU annotations to keep sparse happy.
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20191107221201.30497-1-chris@chris-wilson.co.uk
Since drm provided us with a real struct file we can use for our
anonymous internal clients (mock_file), complete our transition to using
that as the primary interface (and not the mocked up struct drm_file we
previous were using).
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Matthew Auld <matthew.auld@intel.com>
Reviewed-by: Matthew Auld <matthew.auld@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20191107213929.23286-1-chris@chris-wilson.co.uk
The headers in the gem/selftests/, gt/selftests, gvt/, selftests/
directories have never been compile-tested, but it would be possible
to make them self-contained.
This commit only addresses missing <linux/types.h> and forward
struct declarations.
Signed-off-by: Masahiro Yamada <yamada.masahiro@socionext.com>
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Link: https://patchwork.freedesktop.org/patch/msgid/20191108094142.25942-1-yamada.masahiro@socionext.com
The region of pvinfo is reserved for communication between a VMM and
the GPU driver executing on a virtual machine. HW doesn't have any
backing mmio store support for the pvinfo region, thus accessing to
this range through MMIO read/write from host side is forbidden which
is regarded as unclaimed register access.
This patch leaves pvinfo range be initialized with zero.
Signed-off-by: Tina Zhang <tina.zhang@intel.com>
Reviewed-by: Zhenyu Wang <zhenyuw@linux.intel.com>
Signed-off-by: Zhenyu Wang <zhenyuw@linux.intel.com>
The reference count of obj will be decremented twice if error occurs
in dma_buf_fd(). Additionally, attempting to read the reference count of
obj after dropping reference may lead to a use after free bug. Here, we
drop obj's reference until it is not used.
Fixes: e546e281d3 ("drm/i915/gvt: Dmabuf support for GVT-g")
Signed-off-by: Pan Bian <bianpan2016@163.com>
Reviewed-by: Zhenyu Wang <zhenyuw@linux.intel.com>
Signed-off-by: Zhenyu Wang <zhenyuw@linux.intel.com>
Rather than just specifying the bullet numbers from the bspec (e.g.,
"4.b") actually include the description of what the bspec wants us to
do. Steps can be renumbered or moved so including the description will
help us match the code up to the spec. Plus if we add support for new
platforms, some of the steps may be added/removed so more descriptive
comments will be useful for ensuring all of the bspec requirements are
met.
Cc: José Roberto de Souza <jose.souza@intel.com>
Cc: Lucas De Marchi <lucas.demarchi@intel.com>
Signed-off-by: Matt Roper <matthew.d.roper@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20191107174527.11165-1-matthew.d.roper@intel.com
Reviewed-by: Lucas De Marchi <lucas.demarchi@intel.com>
Reviewed-by: José Roberto de Souza <jose.souza@intel.com>
Whenever, we unbind (or change fence registers) on an object, we must
revoke any and all mmap_gtt using the previous bindings. Those user PTEs
point at the GGTT which know points into a new object, the wrong object.
Ergo, those PTEs must be cleared so that any user access provokes a new
page fault.
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Abdiel Janulgue <abdiel.janulgue@linux.intel.com>
Reviewed-by: Abdiel Janulgue <abdiel.janulgue@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20191107180601.30815-5-chris@chris-wilson.co.uk
Provide a utility function to create a vma corresponding to an mmap() of
our device. And use it to exercise the equivalent of userspace
performing a GTT mmap of our objects.
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Abdiel Janulgue <abdiel.janulgue@linux.intel.com>
Reviewed-by: Abdiel Janulgue <abdiel.janulgue@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20191107180601.30815-4-chris@chris-wilson.co.uk
As drm now exports a method to create an anonymous struct file around a
drm_device for internal use, make use of it to avoid our horrible hacks.
Danial suggested that the mock_file_put() wrapper was suitable for
drm-core, along with the mock_drm_getfile() [and that the vestigal
mock_drm_file() in this patch should perhaps be the drm interface
itself]. However, the eventual goal is to remove the mock_drm_file() and
use the struct file and fput() directly, in this patch we take a simple
transition in that direction.
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Matthew Auld <matthew.auld@intel.com>
Reviewed-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Link: https://patchwork.freedesktop.org/patch/msgid/20191107180601.30815-3-chris@chris-wilson.co.uk
As we read the ctx->vm unlocked before cloning/exporting, we should
validate our reference is correct before returning it. We already do for
clone_vm() but were not so strict around get_ppgtt().
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Niranjana Vishwanathapura <niranjana.vishwanathapura@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20191106091312.12921-1-chris@chris-wilson.co.uk
Only add the engine to the available set of uabi engines once it has
been fully initialised and we know we want it in the public set.
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Cc: Michał Wajdeczko <michal.wajdeczko@intel.com>
Cc: Andi Shyti <andi.shyti@intel.com>
Acked-by: Andi Shyti <andi.shyti@intel.com>
Reviewed-by: Mika Kuoppala <mika.kuoppala@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20191107081252.10542-17-chris@chris-wilson.co.uk
The LUTs are single buffered so in order to program them without
tearing we'd have to do it during vblank (actually to be 100%
effective it has to happen between start of vblank and frame start).
We have no proper mechanism for that at the moment so we just
defer loading them after the vblank waits have happened. That
is not quite sufficient (especially when committing multiple pipes
whose vblanks don't line up) so the LUT load will often leak into
the following frame causing tearing.
However in case the hardware wasn't previously using the LUT we
can preload it before setting the enable bit (which is double
buffered so won't tear). Let's determine if we can do such
preloading and make it happen. Slight variation between the
hardware requires some platforms specifics in the checks.
Hans is seeing ugly colored flash on VLV/CHV macchines (GPD win
and Asus T100HA) when the gamma LUT gets loaded for the first
time as the BIOS has left some junk in the LUT memory.
v2: Deal with uapi vs. hw crtc state split
s/GCM/CGM/ typo fix
Cc: Hans de Goede <hdegoede@redhat.com>
Fixes: 051a6d8d3c ("drm/i915: Move LUT programming to happen after vblank waits")
Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20191030190815.7359-1-ville.syrjala@linux.intel.com
Tested-by: Hans de Goede <hdegoede@redhat.com>
Reviewed-by: Hans de Goede <hdegoede@redhat.com>
If Local memory is supported by hardware, we want framebuffer backing
gem objects from local memory.
if the backing obj is not from LMEM, pin_to_display is failed.
v2:
memory regions are correctly assigned to obj->memory_regions [tvrtko]
migration failure is reported as debug log [Tvrtko]
v3:
Migration is dropped. only error is reported [Daniel]
mem region check is move to pin_to_display [Chris]
v4:
s/dev_priv/i915 [chris]
v5:
i915_gem_object_is_lmem is used for detecting the obj mem type. [Matt]
cc: Matthew Auld <matthew.auld@intel.com>
Signed-off-by: Ramalingam C <ramalingam.c@intel.com>
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Link: https://patchwork.freedesktop.org/patch/msgid/20191105144414.30470-1-ramalingam.c@intel.com
The intel_dp_link_training.h include has no need or place in
intel_display.h. Include it in intel_display.c instead.
Cc: Manasi Navare <manasi.d.navare@intel.com>
Fixes: eadf6f9170 ("drm/i915/display/icl: Enable master-slaves in trans port sync")
Reviewed-by: Manasi Navare <manasi.d.navare@intel.com>
Signed-off-by: Jani Nikula <jani.nikula@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20191029103947.7535-1-jani.nikula@intel.com
So strictly speaking the existing annotation is also ok, because we
have a chain of
obj->mm.lock#I915_MM_GET_PAGES -> fs_reclaim -> obj->mm.lock
(the shrinker cannot get at an object while we're in get_pages, hence
this is safe). But it's confusing, so try to take the right subclass
of the lock.
This does a bit reduce our lockdep based checking, but then it's also
less fragile, in case we ever change the nesting around.
Signed-off-by: Daniel Vetter <daniel.vetter@intel.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Will Deacon <will@kernel.org>
Cc: linux-kernel@vger.kernel.org
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Signed-off-by: Maarten Lankhorst <maarten.lankhorst@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20191104173720.2696-3-daniel.vetter@ffwll.ch
The trouble with having a plain nesting flag for locks which do not
naturally nest (unlike block devices and their partitions, which is
the original motivation for nesting levels) is that lockdep will
never spot a true deadlock if you screw up.
This patch is an attempt at trying better, by highlighting a bit more
of the actual nature of the nesting that's going on. Essentially we
have two kinds of objects:
- objects without pages allocated, which cannot be on any lru and are
hence inaccessible to the shrinker.
- objects which have pages allocated, which are on an lru, and which
the shrinker can decide to throw out.
For the former type of object, memory allocations while holding
obj->mm.lock are permissible. For the latter they are not. And
get/put_pages transitions between the two types of objects.
This is still not entirely fool-proof since the rules might change.
But as long as we run such a code ever at runtime lockdep should be
able to observe the inconsistency and complain (like with any other
lockdep class that we've split up in multiple classes). But there are
a few clear benefits:
- We can drop the nesting flag parameter from
__i915_gem_object_put_pages, because that function by definition is
never going allocate memory, and calling it on an object which
doesn't have its pages allocated would be a bug.
- We strictly catch more bugs, since there's not only one place in the
entire tree which is annotated with the special class. All the
other places that had explicit lockdep nesting annotations we're now
going to leave up to lockdep again.
- Specifically this catches stuff like calling get_pages from
put_pages (which isn't really a good idea, if we can call get_pages
so could the shrinker). I've seen patches do exactly that.
Of course I fully expect CI will show me for the fool I am with this
one here :-)
v2: There can only be one (lockdep only has a cache for the first
subclass, not for deeper ones, and we don't want to make these locks
even slower). Still separate enums for better documentation.
Real fix: don't forget about phys objs and pin_map(), and fix the
shrinker to have the right annotations ... silly me.
v3: Forgot usertptr too ...
v4: Improve comment for pages_pin_count, drop the IMPORTANT comment
and instead prime lockdep (Chris).
v5: Appease checkpatch, no double empty lines (Chris)
v6: More rebasing over selftest changes. Also somehow I forgot to
push this patch :-/
Also format comments consistently while at it.
v7: Fix typo in commit message (Joonas)
Also drop the priming, with the lmem merge we now have allocations
while holding the lmem lock, which wreaks the generic priming I've
done in earlier patches. Should probably be resurrected when lmem is
fixed. See
commit 232a6ebae4
Author: Matthew Auld <matthew.auld@intel.com>
Date: Tue Oct 8 17:01:14 2019 +0100
drm/i915: introduce intel_memory_region
I'm keeping the priming patch locally so it wont get lost.
Cc: Matthew Auld <matthew.auld@intel.com>
Cc: Chris Wilson <chris@chris-wilson.co.uk>
Cc: "Tang, CQ" <cq.tang@intel.com>
Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk> (v5)
Reviewed-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com> (v6)
Signed-off-by: Daniel Vetter <daniel.vetter@intel.com>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Signed-off-by: Maarten Lankhorst <maarten.lankhorst@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20191105090148.30269-1-daniel.vetter@ffwll.ch
[mlankhorst: Fix commit typos pointed out by Michael Ruhl]
Before we grab the engine wakeref, tidy up the previous heartbeat
request. If we then abort because the engine powerwell is off, we ensure
the request is freed as we know we will not have freed it when
cancelling the work (as the work is running!).
Fixes: 841e867286 ("drm/i915/gt: Only drop heartbeat.systole if the sole owner")
References: 058179e72e ("drm/i915/gt: Replace hangcheck by heartbeats")
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Mika Kuoppala <mika.kuoppala@linux.intel.com>
Reviewed-by: Mika Kuoppala <mika.kuoppala@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20191106223410.30334-1-chris@chris-wilson.co.uk
Prefer using intel_encoder and pass the base where needed rather than
keeping both encoder and intel_encoder variables around.
v2: actually add all changes to the patch
Signed-off-by: Lucas De Marchi <lucas.demarchi@intel.com>
Reviewed-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20191106071715.10613-1-lucas.demarchi@intel.com
Another TGP ID has shown up, so let's add it to avoid South Display
breakage on systems that have this ID.
Cc: Lucas De Marchi <lucas.demarchi@intel.com>
Cc: José Roberto de Souza <jose.souza@intel.com>
Signed-off-by: James Ausmus <james.ausmus@intel.com>
Reviewed-by: José Roberto de Souza <jose.souza@intel.com>
Signed-off-by: Lucas De Marchi <lucas.demarchi@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20191106005526.1500-1-james.ausmus@intel.com
When inside the lock, remember to unlock even if you want to leave
early.
Reported-by: Dan Carpenter <dan.carpenter@oracle.com>
Fixes: a4e7ccdac3 ("drm/i915: Move context management under GEM")
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20191106144155.25727-1-chris@chris-wilson.co.uk
Mika spotted that only using cancel_delayed_work() could mean that we
attempted to clear the heartbeat.systole while the worker was still
running. Rectify the situation by only touching the systole from outside
the worker if we suceeded in cancelling the worker before it could run.
The worker is expected to clean up by itself upon idling.
Reported-by: Mika Kuoppala <mika.kuoppala@linux.intel.com>
Fixes: 058179e72e ("drm/i915/gt: Replace hangcheck by heartbeats")
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Mika Kuoppala <mika.kuoppala@linux.intel.com>
Reviewed-by: Mika Kuoppala <mika.kuoppala@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20191106133129.17732-1-chris@chris-wilson.co.uk
We should not be unconditionally calling release_fake_lmem_bar.
Signed-off-by: Matthew Auld <matthew.auld@intel.com>
Cc: Lucas De Marchi <lucas.demarchi@intel.com>
Cc: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Link: https://patchwork.freedesktop.org/patch/msgid/20191106123135.12441-1-matthew.auld@intel.com
The uapi vs. hw state split introduced a bug in
intel_crtc_disable_noatomic() where it's now frobbing an already
freed temp crtc state instead of adjusting the crtc state we
are really left with. Fix that by making a cleaner separation
beteen the two.
This causes explosions on any machine that boots up with pipes
already running but not hooked up to any encoder (typical
behaviour for gen2-4 VBIOS).
Cc: Maarten Lankhorst <maarten.lankhorst@linux.intel.com>
Fixes: 58d124ea27 ("drm/i915: Complete crtc hw/uapi split, v6.")
Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20191105171447.22111-1-ville.syrjala@linux.intel.com
Reviewed-by: Maarten Lankhorst <maarten.lankhorst@linux.intel.com>
If the device does not have an aperture through which we can indirectly
access and detile the buffers, simply reject the ioctl. Later we can
extend the ioctl to support different modes, but as an extension the
user must opt in and explicitly control the mmap type (viz
MMAP_OFFSET_IOCTL).
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Abdiel Janulgue <abdiel.janulgue@linux.intel.com>
Cc: Matthew Auld <matthew.auld@intel.com>
Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Reviewed-by: Abdiel Janulgue <abdiel.janulgue@linux.intel.com>
Reviewed-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20191105145305.14314-1-chris@chris-wilson.co.uk
In some circumstances the RC6 context can get corrupted. We can detect
this and take the required action, that is disable RC6 and runtime PM.
The HW recovers from the corrupted state after a system suspend/resume
cycle, so detect the recovery and re-enable RC6 and runtime PM.
v2: rebase (Mika)
v3:
- Move intel_suspend_gt_powersave() to the end of the GEM suspend
sequence.
- Add commit message.
v4:
- Rebased on intel_uncore_forcewake_put(i915->uncore, ...) API
change.
v5: rebased on gem/gt split (Mika)
Signed-off-by: Imre Deak <imre.deak@intel.com>
Signed-off-by: Mika Kuoppala <mika.kuoppala@linux.intel.com>
In BXT/APL, device 2 MMIO reads from MIPI controller requires its PLL
to be turned ON. When MIPI PLL is turned off (MIPI Display is not
active or connected), and someone (host or GT engine) tries to read
MIPI registers, it causes hard hang. This is a hardware restriction
or limitation.
Driver by itself doesn't read MIPI registers when MIPI display is off.
But any userspace application can submit unprivileged batch buffer for
execution. In that batch buffer there can be mmio reads. And these
reads are allowed even for unprivileged applications. If these
register reads are for MIPI DSI controller and MIPI display is not
active during that time, then the MMIO read operation causes system
hard hang and only way to recover is hard reboot. A genuine
process/application won't submit batch buffer like this and doesn't
cause any issue. But on a compromised system, a malign userspace
process/app can generate such batch buffer and can trigger system
hard hang (denial of service attack).
The fix is to lower the internal MMIO timeout value to an optimum
value of 950us as recommended by hardware team. If the timeout is
beyond 1ms (which will hit for any value we choose if MMIO READ on a
DSI specific register is performed without PLL ON), it causes the
system hang. But if the timeout value is lower than it will be below
the threshold (even if timeout happens) and system will not get into
a hung state. This will avoid a system hang without losing any
programming or GT interrupts, taking the worst case of lowest CDCLK
frequency and early DC5 abort into account.
Signed-off-by: Uma Shankar <uma.shankar@intel.com>
Reviewed-by: Jon Bloomfield <jon.bloomfield@intel.com>
Some of the gen instruction macros (e.g. MI_DISPLAY_FLIP) have the
length directly encoded in them. Since these are used directly in
the tables, the Length becomes part of the comparison used for
matching during parsing. Thus, if the cmd being parsed has a
different length to that in the table, it is not matched and the
cmd is accepted via the default variable length path.
Fix by masking out everything except the Opcode in the cmd tables
Cc: Tony Luck <tony.luck@intel.com>
Cc: Dave Airlie <airlied@redhat.com>
Cc: Takashi Iwai <tiwai@suse.de>
Cc: Tyler Hicks <tyhicks@canonical.com>
Signed-off-by: Jon Bloomfield <jon.bloomfield@intel.com>
Reviewed-by: Chris Wilson <chris.p.wilson@intel.com>
To keep things manageable, the pre-gen9 cmdparser does not
attempt to track any form of nested BB_START's. This did not
prevent usermode from using nested starts, or even chained
batches because the cmdparser is not strictly enforced pre gen9.
Instead, the existence of a nested BB_START would cause the batch
to be emitted in insecure mode, and any privileged capabilities
would not be available.
For Gen9, the cmdparser becomes mandatory (for BCS at least), and
so not providing any form of nested BB_START support becomes
overly restrictive. Any such batch will simply not run.
We make heavy use of backward jumps in igt, and it is much easier
to add support for this restricted subset of nested jumps, than to
rewrite the whole of our test suite to avoid them.
Add the required logic to support limited backward jumps, to
instructions that have already been validated by the parser.
Note that it's not sufficient to simply approve any BB_START
that jumps backwards in the buffer because this would allow an
attacker to embed a rogue instruction sequence within the
operand words of a harmless instruction (say LRI) and jump to
that.
We introduce a bit array to track every instr offset successfully
validated, and test the target of BB_START against this. If the
target offset hits, it is re-written to the same offset in the
shadow buffer and the BB_START cmd is allowed.
Note: This patch deliberately ignores checkpatch issues in the
cmdtables, in order to match the style of the surrounding code.
We'll correct the entire file in one go in a later patch.
v2: set dispatch secure late (Mika)
v3: rebase (Mika)
v4: Clear whitelist on each parse
Minor review updates (Chris)
v5: Correct backward jump batching
v6: fix compilation error due to struct eb shuffle (Mika)
Cc: Tony Luck <tony.luck@intel.com>
Cc: Dave Airlie <airlied@redhat.com>
Cc: Takashi Iwai <tiwai@suse.de>
Cc: Tyler Hicks <tyhicks@canonical.com>
Signed-off-by: Jon Bloomfield <jon.bloomfield@intel.com>
Signed-off-by: Mika Kuoppala <mika.kuoppala@linux.intel.com>
Reviewed-by: Chris Wilson <chris.p.wilson@intel.com>
In the next patch we will be adding a second valid
termination condition which will require a small
amount of refactoring to share logic with the BB_END
case.
Refactor all error conditions to jump to a dedicated
exit path, with 'break' reserved only for a successful
parse.
Cc: Tony Luck <tony.luck@intel.com>
Cc: Dave Airlie <airlied@redhat.com>
Cc: Takashi Iwai <tiwai@suse.de>
Cc: Tyler Hicks <tyhicks@canonical.com>
Signed-off-by: Jon Bloomfield <jon.bloomfield@intel.com>
Reviewed-by: Chris Wilson <chris.p.wilson@intel.com>
For gen9 we enable cmdparsing on the BCS ring, specifically
to catch inadvertent accesses to sensitive registers
Unlike gen7/hsw, we use the parser only to block certain
registers. We can rely on h/w to block restricted commands,
so the command tables only provide enough info to allow the
parser to delineate each command, and identify commands that
access registers.
Note: This patch deliberately ignores checkpatch issues in
favour of matching the style of the surrounding code. We'll
correct the entire file in one go in a later patch.
v3: rebase (Mika)
v4: Add RING_TIMESTAMP registers to whitelist (Jon)
Signed-off-by: Jon Bloomfield <jon.bloomfield@intel.com>
Cc: Tony Luck <tony.luck@intel.com>
Cc: Dave Airlie <airlied@redhat.com>
Cc: Takashi Iwai <tiwai@suse.de>
Cc: Tyler Hicks <tyhicks@canonical.com>
Signed-off-by: Mika Kuoppala <mika.kuoppala@linux.intel.com>
Reviewed-by: Chris Wilson <chris.p.wilson@intel.com>
In "drm/i915: Add support for mandatory cmdparsing" we introduced the
concept of mandatory parsing. This allows the cmdparser to be invoked
even when user passes batch_len=0 to the execbuf ioctl's.
However, the cmdparser needs to know the extents of the buffer being
scanned. Refactor the code to ensure the cmdparser uses the actual
object size, instead of the incoming length, if user passes 0.
Signed-off-by: Jon Bloomfield <jon.bloomfield@intel.com>
Cc: Tony Luck <tony.luck@intel.com>
Cc: Dave Airlie <airlied@redhat.com>
Cc: Takashi Iwai <tiwai@suse.de>
Cc: Tyler Hicks <tyhicks@canonical.com>
Reviewed-by: Chris Wilson <chris.p.wilson@intel.com>
For Gen7, the original cmdparser motive was to permit limited
use of register read/write instructions in unprivileged BB's.
This worked by copying the user supplied bb to a kmd owned
bb, and running it in secure mode, from the ggtt, only if
the scanner finds no unsafe commands or registers.
For Gen8+ we can't use this same technique because running bb's
from the ggtt also disables access to ppgtt space. But we also
do not actually require 'secure' execution since we are only
trying to reduce the available command/register set. Instead we
will copy the user buffer to a kmd owned read-only bb in ppgtt,
and run in the usual non-secure mode.
Note that ro pages are only supported by ppgtt (not ggtt), but
luckily that's exactly what we need.
Add the required paths to map the shadow buffer to ppgtt ro for Gen8+
v2: IS_GEN7/IS_GEN (Mika)
v3: rebase
v4: rebase
v5: rebase
Signed-off-by: Jon Bloomfield <jon.bloomfield@intel.com>
Cc: Tony Luck <tony.luck@intel.com>
Cc: Dave Airlie <airlied@redhat.com>
Cc: Takashi Iwai <tiwai@suse.de>
Cc: Tyler Hicks <tyhicks@canonical.com>
Signed-off-by: Mika Kuoppala <mika.kuoppala@linux.intel.com>
Reviewed-by: Chris Wilson <chris.p.wilson@intel.com>
The existing cmdparser for gen7 can be bypassed by specifying
batch_len=0 in the execbuf call. This is safe because bypassing
simply reduces the cmd-set available.
In a later patch we will introduce cmdparsing for gen9, as a
security measure, which must be strictly enforced since without
it we are vulnerable to DoS attacks.
Introduce the concept of 'required' cmd parsing that cannot be
bypassed by submitting zero-length bb's.
v2: rebase (Mika)
v2: rebase (Mika)
v3: fix conflict on engine flags (Mika)
Signed-off-by: Jon Bloomfield <jon.bloomfield@intel.com>
Cc: Tony Luck <tony.luck@intel.com>
Cc: Dave Airlie <airlied@redhat.com>
Cc: Takashi Iwai <tiwai@suse.de>
Cc: Tyler Hicks <tyhicks@canonical.com>
Signed-off-by: Mika Kuoppala <mika.kuoppala@linux.intel.com>
Reviewed-by: Chris Wilson <chris.p.wilson@intel.com>
The previous patch has killed support for secure batches
on gen6+, and hence the cmdparsers master tables are
now dead code. Remove them.
Signed-off-by: Jon Bloomfield <jon.bloomfield@intel.com>
Cc: Tony Luck <tony.luck@intel.com>
Cc: Dave Airlie <airlied@redhat.com>
Cc: Takashi Iwai <tiwai@suse.de>
Cc: Tyler Hicks <tyhicks@canonical.com>
Reviewed-by: Chris Wilson <chris.p.wilson@intel.com>
Retroactively stop reporting support for secure batches
through the api for gen6+ so that older binaries trigger
the fallback path instead.
Older binaries use secure batches pre gen6 to access resources
that are not available to normal usermode processes. However,
all known userspace explicitly checks for HAS_SECURE_BATCHES
before relying on the secure batch feature.
Since there are no known binaries relying on this for newer gens
we can kill secure batches from gen6, via I915_PARAM_HAS_SECURE_BATCHES.
v2: rebase (Mika)
v3: rebase (Mika)
Signed-off-by: Jon Bloomfield <jon.bloomfield@intel.com>
Cc: Tony Luck <tony.luck@intel.com>
Cc: Dave Airlie <airlied@redhat.com>
Cc: Takashi Iwai <tiwai@suse.de>
Cc: Tyler Hicks <tyhicks@canonical.com>
Signed-off-by: Mika Kuoppala <mika.kuoppala@linux.intel.com>
Reviewed-by: Chris Wilson <chris.p.wilson@intel.com>
We're about to introduce some new tables for later gens, and the
current naming for the gen7 tables will no longer make sense.
v2: rebase
Signed-off-by: Jon Bloomfield <jon.bloomfield@intel.com>
Cc: Tony Luck <tony.luck@intel.com>
Cc: Dave Airlie <airlied@redhat.com>
Cc: Takashi Iwai <tiwai@suse.de>
Cc: Tyler Hicks <tyhicks@canonical.com>
Signed-off-by: Mika Kuoppala <mika.kuoppala@linux.intel.com>
Reviewed-by: Chris Wilson <chris.p.wilson@intel.com>
The counter is removed from the pm wakeref count, but it remains intact
so that we can restore it upon resume. Ergo inside suspend, it may have
a value.
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Acked-by: Andi Shyti <andi.shyti@intel.com>
Reviewed-by: Mika Kuoppala <mika.kuoppala@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20191104090158.2959-1-chris@chris-wilson.co.uk
(cherry picked from commit 83c55ee82f)
Signed-off-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Currently we shutdown rc6 during i915_gem_resume() but this is called
during the preparation phase (i915_drm_prepare) for all suspend paths,
but we only want to shutdown rc6 for S3+. Move the actual shutdown to
i915_gem_suspend_late().
We then need to differentiate between suspend targets, to distinguish S0
(s2idle) where the device is kept awake but needs to be in a low power
mode (the same as runtime suspend) from the device suspend levels where
we lose control of HW and so must disable any HW access to dangling
memory.
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=111909
Fixes: c113236718 ("drm/i915: Extract GT render sleep (rc6) management")
Testcase: igt/gem_exec_suspend/power-S0
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Andi Shyti <andi.shyti@intel.com>
Acked-by: Andi Shyti <andi.shyti@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20191101141009.15581-4-chris@chris-wilson.co.uk
(cherry picked from commit c601cb2135)
Signed-off-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
We already track the debugfs user_forcewake on the GT, so it is natural
to pull the suspend/resume handling under gt/ as well.
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Andi Shyti <andi.shyti@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20191101141009.15581-3-chris@chris-wilson.co.uk
(cherry picked from commit 9ab3fe2d7d)
Signed-off-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
As we already do reload the kernel context in intel_gt_resume, repeating
that action inside i915_gem_resume() as well is redundant.
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Andi Shyti <andi.shyti@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20191101141009.15581-2-chris@chris-wilson.co.uk
(cherry picked from commit c8f6cfc56f)
Signed-off-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>