VSC SDP Payload for PSR is one of data block type of SDP (Secondaray Data
Packet). In order to generalize SDP packet structure name, it renames
struct edp_vsc_psr to struct dp_sdp. And each SDP data blocks have
different usages, each SDP type has different reserved data blocks and
Video_Stream_Configuration Extension VESA SDP might use all of Data Blocks
as Extended INFORFRAME Data Byte. so it makes Data Block variables as
array type. And it adds comments of details of DB of VSC SDP Payload
for Pixel Encoding/Colorimetry Format. This comments follows DP 1.4a spec,
section 2.2.5.7.5, chapter "VSC SDP Payload for Pixel Encoding/Colorimetry
Format".
v7: Addressed review comments from Ville.
v9: Rename a member value name DB to db on struct dp_sdp [Laurent]
Cc: Ville Syrjälä <ville.syrjala@linux.intel.com>
Signed-off-by: Gwan-gyeong Mun <gwan-gyeong.mun@intel.com>
Reviewed-by: Maarten Lankhorst <maarten.lankhorst@linux.intel.com>
Acked-by: Laurent Pinchart <laurent.pinchart@ideasonboard.com>
Acked-by: Maarten Lankhorst <maarten.lankhorst@linux.intel.com>
Signed-off-by: Jani Nikula <jani.nikula@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20190521121721.32010-3-gwan-gyeong.mun@intel.com
This patch checks a support of YCBCR420 outputs on an encoder level.
If the input mode is YCBCR420-only mode then it prepares DP as an YCBCR420
output, else it continues with RGB output mode.
It set output_format to INTEL_OUTPUT_FORMAT_YCBCR420 in order to using
a pipe scaler as RGB to YCbCr 4:4:4.
v2:
Addressed review comments from Ville.
Style fixed with few naming.
%s/config/crtc_state/
%s/intel_crtc/crtc/
If lscon is active, it makes not to call intel_dp_ycbcr420_config()
to avoid to clobber of lspcon_ycbcr420_config() routine.
And it move the 420_only check into the intel_dp_ycbcr420_config().
v3: Fix uninitialized return value and it is reported by Dan Carpenter.
v4:
Addressed review comments from Ville.
In order to avoid the extra indentation, it inverts if-clause on
intel_dp_ycbcr420_config().
Remove the error print where no errors print are allowed.
v6: Rebase
v7:
Move intel_dp_get_colorimetry_status() to intel_dp from intel_psr.
intel_dp_get_colorimetry_status() checks
VSC_SDP_EXTENSION_FOR_COLORIMETRY_SUPPORTED bit in the
DPRX_FEATURE_ENUMERATION_LIST register.
And intel_dp_ycbcr420_config() uses intel_dp_get_colorimetry_status().
Cc: Ville Syrjälä <ville.syrjala@linux.intel.com>
Signed-off-by: Gwan-gyeong Mun <gwan-gyeong.mun@intel.com>
Reviewed-by: Maarten Lankhorst <maarten.lankhorst@linux.intel.com>
Signed-off-by: Jani Nikula <jani.nikula@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20190521121721.32010-2-gwan-gyeong.mun@intel.com
Fixes for 5.2:
- Fix for DMCU firmware issues for stable
- Add missing polaris10 pci id to kfd
- Screen corruption fix on picasso
- Fix for driver reload on vega10
- SR-IOV fixes
- Locking fix in new SMU code
- Compute profile switching fix for KFD
Signed-off-by: Dave Airlie <airlied@redhat.com>
From: Alex Deucher <alexdeucher@gmail.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20190522205425.3657-1-alexander.deucher@amd.com
- gma500 fix to make lvds detection more reliable
- select devfreq for panfrost since it can't probe without it
-----BEGIN PGP SIGNATURE-----
iQEzBAABCgAdFiEEfxcpfMSgdnQMs+QqlvcN/ahKBwoFAlzlpl0ACgkQlvcN/ahK
Bwp5PAgAnL7n2k7f6ifBhF4h0SLWAGDLNnO1Pmyi5lYQTHSPf66N2xpgatTicMY6
EKVvzsFO3dTFDFbFY3kUC+nuVTaz7NDbPXnVRxiEsIlqoEIII70XHLZJKYdEjkEQ
b2Tueo7lPj6EZjdcOHuErpdd9+Iy4FqPpsNuXz4giBlDU5yVYMgfY/j6EEq7QiSh
N9onTxlSJug/RWtqFx3tGSBxQvLuot+2w3h3i9wIqu/BqjilR1mbmYH9iPULm4wa
CDatEDP3hNSDfQfQTinFpmZR6x29GMGNr4ATBCRCxCupnEofv8eHO1u2cB28sB6s
HQXkBYDaeBIDM3Aozs2A5n89LheG9Q==
=FgtF
-----END PGP SIGNATURE-----
Merge tag 'drm-misc-fixes-2019-05-22' of git://anongit.freedesktop.org/drm/drm-misc into drm-fixes
- sun4i fixes to hdmi phy as well as u16 overflow in dsi (left from -next-fixes)
- gma500 fix to make lvds detection more reliable
- select devfreq for panfrost since it can't probe without it
Signed-off-by: Dave Airlie <airlied@redhat.com>
From: Sean Paul <sean@poorly.run>
Link: https://patchwork.freedesktop.org/patch/msgid/20190522194440.GA22359@art_vandelay
ADD HLG EOTF to the list of EOTF transfer functions supported.
Hybrid Log-Gamma (HLG) is a high dynamic range (HDR) standard.
HLG defines a nonlinear transfer function in which the lower
half of the signal values use a gamma curve and the upper half
of the signal values use a logarithmic curve.
v2: Rebase
v3: Fixed a warning message
v4: Addressed Shashank's review comments
v5: Addressed Jonas Karlman's review comment and dropped the i915
tag from header.
Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Signed-off-by: Uma Shankar <uma.shankar@intel.com>
Reviewed-by: Shashank Sharma <shashank.sharma@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/1558015817-12025-8-git-send-email-uma.shankar@intel.com
Enable Dynamic Range and Mastering Infoframe for HDR
content, which is defined in CEA 861.3 spec.
The metadata will be computed based on blending
policy in userspace compositors and passed as a connector
property blob to driver. The same will be sent as infoframe
to panel which support HDR.
Added the const version of infoframe for DRM metadata
for HDR.
v2: Rebase and added Ville's POC changes.
v3: No Change
v4: Addressed Shashank's review comments and merged the
patch making drm infoframe function arguments as constant.
v5: Rebase
v6: Fixed checkpatch warnings with --strict option. Addressed
Shashank's review comments and added his RB.
v7: Addressed Brian Starkey's review comments. Merged 2 patches
into one.
v8: Addressed Jonas Karlman review comments.
v9: Addressed Jonas Karlman review comments.
v10: Addressed Ville's review comments.
v11: Added BUILD_BUG_ON and sizeof instead of magic numbers as
per Ville's comments.
Signed-off-by: Uma Shankar <uma.shankar@intel.com>
Reviewed-by: Shashank Sharma <shashank.sharma@intel.com>
Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/1558015817-12025-5-git-send-email-uma.shankar@intel.com
HDR metadata block is introduced in CEA-861.3 spec.
Parsing the same to get the panel's HDR metadata.
v2: Rebase and added Ville's POC changes to the patch.
v3: No Change
v4: Addressed Shashank's review comments
v5: Addressed Shashank's comment and added his RB.
v6: Addressed Jonas Karlman review comments.
v7: Adressed Ville's review comments and fixed the issue
with length handling.
v8: Put the length check as per the convention followed in
existing code, as suggested by Ville.
Signed-off-by: Uma Shankar <uma.shankar@intel.com>
Reviewed-by: Shashank Sharma <shashank.sharma@intel.com>
Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/1558015817-12025-4-git-send-email-uma.shankar@intel.com
This adds reference count for HDR metadata blob,
handled as part of duplicate and destroy connector
state functions.
v2: Removed the hdr_metadata_changed initialization as
the variable is dropped and not required.
Signed-off-by: Jonas Karlman <jonas@kwiboo.se>
Signed-off-by: Uma Shankar <uma.shankar@intel.com>
Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/1558015817-12025-3-git-send-email-uma.shankar@intel.com
This patch adds a blob property to get HDR metadata
information from userspace. This will be send as part
of AVI Infoframe to panel.
It also implements get() and set() functions for HDR output
metadata property.The blob data is received from userspace and
saved in connector state, the same is returned as blob in get
property call to userspace.
v2: Rebase and modified the metadata structure elements
as per Ville's POC changes.
v3: No Change
v4: Addressed Shashank's review comments
v5: Rebase.
v6: Addressed Brian Starkey's review comments, defined
new structure with header for dynamic metadata scalability.
Merge get/set property functions for metadata in this patch.
v7: Addressed Jonas Karlman review comments and defined separate
structure for infoframe to better align with CTA 861.G spec. Added
Shashank's RB.
v8: Addressed Ville's review comments. Moved sink metadata structure
out of uapi headers as suggested by Jonas Karlman.
v9: Rebase and addressed Jonas Karlman review comments.
v10: Addressed Ville's review comments, dropped the metdata_changed
state variable as its not needed anymore.
Signed-off-by: Uma Shankar <uma.shankar@intel.com>
Reviewed-by: Shashank Sharma <shashank.sharma@intel.com>
Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/1558015817-12025-2-git-send-email-uma.shankar@intel.com
Currently, there is some logic for the driver to work without devfreq.
However, the driver actually fails to probe if !CONFIG_PM_DEVFREQ.
Fix this by selecting devfreq, and drop the additional checks
for devfreq.
Signed-off-by: Ezequiel Garcia <ezequiel@collabora.com>
Signed-off-by: Rob Herring <robh@kernel.org>
Link: https://patchwork.freedesktop.org/patch/msgid/20190517150042.776-1-ezequiel@collabora.com
That is now done by the DMA-buf helpers instead.
Signed-off-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Link: https://patchwork.kernel.org/patch/10943055/
Engine discovery query allows userspace to enumerate engines, probe their
configuration features, all without needing to maintain the internal PCI
ID based database.
A new query for the generic i915 query ioctl is added named
DRM_I915_QUERY_ENGINE_INFO, together with accompanying structure
drm_i915_query_engine_info. The address of latter should be passed to the
kernel in the query.data_ptr field, and should be large enough for the
kernel to fill out all known engines as struct drm_i915_engine_info
elements trailing the query.
As with other queries, setting the item query length to zero allows
userspace to query minimum required buffer size.
Enumerated engines have common type mask which can be used to query all
hardware engines, versus engines userspace can submit to using the execbuf
uAPI.
Engines also have capabilities which are per engine class namespace of
bits describing features not present on all engine instances.
v2:
* Fixed HEVC assignment.
* Reorder some fields, rename type to flags, increase width. (Lionel)
* No need to allocate temporary storage if we do it engine by engine.
(Lionel)
v3:
* Describe engine flags and mark mbz fields. (Lionel)
* HEVC only applies to VCS.
v4:
* Squash SFC flag into main patch.
* Tidy some comments.
v5:
* Add uabi_ prefix to engine capabilities. (Chris Wilson)
* Report exact size of engine info array. (Chris Wilson)
* Drop the engine flags. (Joonas Lahtinen)
* Added some more reserved fields.
* Move flags after class/instance.
v6:
* Do not check engine info array was zeroed by userspace but zero the
unused fields for them instead.
v7:
* Simplify length calculation loop. (Lionel Landwerlin)
v8:
* Remove MBZ comments where not applicable.
* Rename ABI flags to match engine class define naming.
* Rename SFC ABI flag to reflect it applies to VCS and VECS.
* SFC is wired to even _logical_ engine instances.
* SFC applies to VCS and VECS.
* HEVC is present on all instances on Gen11. (Tony)
* Simplify length calculation even more. (Chris Wilson)
* Move info_ptr assigment closer to loop for clarity. (Chris Wilson)
* Use vdbox_sfc_access from runtime info.
* Rebase for RUNTIME_INFO.
* Refactor for lower indentation.
* Rename uAPI class/instance to engine_class/instance to avoid C++
keyword.
v9:
* Rebase for s/num_rings/num_engines/ in RUNTIME_INFO.
v10:
* Use new copy_query_item.
v11:
* Consolidate with struct i915_engine_class_instnace.
Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Cc: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Jon Bloomfield <jon.bloomfield@intel.com>
Cc: Dmitry Rogozhkin <dmitry.v.rogozhkin@intel.com>
Cc: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Cc: Tony Ye <tony.ye@intel.com>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> # v7
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
Link: https://patchwork.freedesktop.org/patch/msgid/20190522090054.6007-1-tvrtko.ursulin@linux.intel.com
Drop remaining uses of the deprecated drmP.h in gma500
Replaced drmp.h with forward declarations or include files
as relevant.
Moved all include files to blocks in following order:
\#include <linux/*>
\#include <asm/*>
\#include <drm/*>
\#include ""
And within each block sort the include files alphabetically.
Signed-off-by: Sam Ravnborg <sam@ravnborg.org>
Cc: Patrik Jakobsson <patrik.r.jakobsson@gmail.com>
Cc: David Airlie <airlied@linux.ie>
Cc: Daniel Vetter <daniel@ffwll.ch>
Acked-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Signed-off-by: Patrik Jakobsson <patrik.r.jakobsson@gmail.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20190519195526.3422-6-sam@ravnborg.org
The DRM_UDELAY wrapper from drm_os_linux.h is used in a few places,
all other places calls udelay() with no wrapper.
There is no reason to continue to use this wrapper - so drop it
and direct call udelay().
Signed-off-by: Sam Ravnborg <sam@ravnborg.org>
Cc: Patrik Jakobsson <patrik.r.jakobsson@gmail.com>
Cc: David Airlie <airlied@linux.ie>
Cc: Daniel Vetter <daniel@ffwll.ch>
Acked-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Signed-off-by: Patrik Jakobsson <patrik.r.jakobsson@gmail.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20190519195526.3422-5-sam@ravnborg.org
Add proper forward declarations to minimize dependencies on
other header files.
Just add enough that we can safely include all header files in
alphabetically order in relevant files.
Signed-off-by: Sam Ravnborg <sam@ravnborg.org>
Cc: Patrik Jakobsson <patrik.r.jakobsson@gmail.com>
Cc: David Airlie <airlied@linux.ie>
Cc: Daniel Vetter <daniel@ffwll.ch>
Acked-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Signed-off-by: Patrik Jakobsson <patrik.r.jakobsson@gmail.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20190519195526.3422-4-sam@ravnborg.org
Drop use of drmp.h from all header files in drm/gma500.
Fix fallout in all files.
In some cases moved include lines and sorted them too.
With drmP.h removed from all header files it can now be removed from
each .c file without any further dependencies
Signed-off-by: Sam Ravnborg <sam@ravnborg.org>
Cc: Patrik Jakobsson <patrik.r.jakobsson@gmail.com>
Cc: David Airlie <airlied@linux.ie>
Cc: Daniel Vetter <daniel@ffwll.ch>
Acked-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Signed-off-by: Patrik Jakobsson <patrik.r.jakobsson@gmail.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20190519195526.3422-3-sam@ravnborg.org
The header file gma_drm.h is empty so remove it and
drop all uses of the file.
Signed-off-by: Sam Ravnborg <sam@ravnborg.org>
Cc: Patrik Jakobsson <patrik.r.jakobsson@gmail.com>
Cc: David Airlie <airlied@linux.ie>
Cc: Daniel Vetter <daniel@ffwll.ch>
Acked-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Signed-off-by: Patrik Jakobsson <patrik.r.jakobsson@gmail.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20190519195526.3422-2-sam@ravnborg.org
To align with the rest of DRM terminology, the GEM VRAM helpers now use
lock and unlock in places where reserve and unreserve where used before.
All callers have been adapted.
Signed-off-by: Thomas Zimmermann <tzimmermann@suse.de>
Acked-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Link: http://patchwork.freedesktop.org/patch/msgid/20190521110831.20200-3-tzimmermann@suse.de
Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
The push-to-system function forces a buffer out of video RAM. This decision
should rather be made by the memory manager. By replacing the function with
calls to the kunmap and unpin functions, the buffer's memory becomes available,
but the buffer remains in VRAM until it's evicted by a pin operation.
This patch replaces the remaining instances of drm_gem_vram_push_to_system()
in ast and mgag200, and removes the function from DRM.
Signed-off-by: Thomas Zimmermann <tzimmermann@suse.de>
Acked-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Link: http://patchwork.freedesktop.org/patch/msgid/20190521110831.20200-2-tzimmermann@suse.de
Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
Test context workarounds have been correctly applied in newly created
contexts.
To accomplish this the existing engine_wa_list_verify helper is extended
to take in a context from which reading of the workaround list will be
done.
Context workaround verification is done from the existing subtests, which
have been renamed to reflect they are no longer only about GT and engine
workarounds.
v2:
* Test after resets and refactor to use intel_context more. (Chris)
v3:
* Use ce->engine->i915 instead of ce->gem_context->i915. (Chris)
* gem_engine_iter.idx is engine->id + 1. (Chris)
v4:
* Make local function static.
Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
Link: https://patchwork.freedesktop.org/patch/msgid/20190520142546.12493-1-tvrtko.ursulin@linux.intel.com
There is a desire to split a task onto two engines and have them run at
the same time, e.g. scanline interleaving to spread the workload evenly.
Through the use of the out-fence from the first execbuf, we can
coordinate secondary execbuf to only become ready simultaneously with
the first, so that with all things idle the second execbufs are executed
in parallel with the first. The key difference here between the new
EXEC_FENCE_SUBMIT and the existing EXEC_FENCE_IN is that the in-fence
waits for the completion of the first request (so that all of its
rendering results are visible to the second execbuf, the more common
userspace fence requirement).
Since we only have a single input fence slot, userspace cannot mix an
in-fence and a submit-fence. It has to use one or the other! This is not
such a harsh requirement, since by virtue of the submit-fence, the
secondary execbuf inherit all of the dependencies from the first
request, and for the application the dependencies should be common
between the primary and secondary execbuf.
Suggested-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Testcase: igt/gem_exec_fence/parallel
Link: https://github.com/intel/media-driver/pull/546
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20190521211134.16117-10-chris@chris-wilson.co.uk
Some users require that when a master batch is executed on one particular
engine, a companion batch is run simultaneously on a specific slave
engine. For this purpose, we introduce virtual engine bonding, allowing
maps of master:slaves to be constructed to constrain which physical
engines a virtual engine may select given a fence on a master engine.
For the moment, we continue to ignore the issue of preemption deferring
the master request for later. Ideally, we would like to then also remove
the slave and run something else rather than have it stall the pipeline.
With load balancing, we should be able to move workload around it, but
there is a similar stall on the master pipeline while it may wait for
the slave to be executed. At the cost of more latency for the bonded
request, it may be interesting to launch both on their engines in
lockstep. (Bubbles abound.)
Opens: Also what about bonding an engine as its own master? It doesn't
break anything internally, so allow the silliness.
v2: Emancipate the bonds
v3: Couple in delayed scheduling for the selftests
v4: Handle invalid mutually exclusive bonding
v5: Mention what the uapi does
v6: s/nbond/num_bonds/
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20190521211134.16117-9-chris@chris-wilson.co.uk
In the next patch, we will want to configure the slave request
depending on which physical engine the master request is executed on.
For this, we introduce a callback from the execute fence to convey this
information.
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20190521211134.16117-8-chris@chris-wilson.co.uk
Allow the user to direct which physical engines of the virtual engine
they wish to execute one, as sometimes it is necessary to override the
load balancing algorithm.
v2: Only kick the virtual engines on context-out if required
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20190521211134.16117-7-chris@chris-wilson.co.uk
Having allowed the user to define a set of engines that they will want
to only use, we go one step further and allow them to bind those engines
into a single virtual instance. Submitting a batch to the virtual engine
will then forward it to any one of the set in a manner as best to
distribute load. The virtual engine has a single timeline across all
engines (it operates as a single queue), so it is not able to concurrently
run batches across multiple engines by itself; that is left up to the user
to submit multiple concurrent batches to multiple queues. Multiple users
will be load balanced across the system.
The mechanism used for load balancing in this patch is a late greedy
balancer. When a request is ready for execution, it is added to each
engine's queue, and when an engine is ready for its next request it
claims it from the virtual engine. The first engine to do so, wins, i.e.
the request is executed at the earliest opportunity (idle moment) in the
system.
As not all HW is created equal, the user is still able to skip the
virtual engine and execute the batch on a specific engine, all within the
same queue. It will then be executed in order on the correct engine,
with execution on other virtual engines being moved away due to the load
detection.
A couple of areas for potential improvement left!
- The virtual engine always take priority over equal-priority tasks.
Mostly broken up by applying FQ_CODEL rules for prioritising new clients,
and hopefully the virtual and real engines are not then congested (i.e.
all work is via virtual engines, or all work is to the real engine).
- We require the breadcrumb irq around every virtual engine request. For
normal engines, we eliminate the need for the slow round trip via
interrupt by using the submit fence and queueing in order. For virtual
engines, we have to allow any job to transfer to a new ring, and cannot
coalesce the submissions, so require the completion fence instead,
forcing the persistent use of interrupts.
- We only drip feed single requests through each virtual engine and onto
the physical engines, even if there was enough work to fill all ELSP,
leaving small stalls with an idle CS event at the end of every request.
Could we be greedy and fill both slots? Being lazy is virtuous for load
distribution on less-than-full workloads though.
Other areas of improvement are more general, such as reducing lock
contention, reducing dispatch overhead, looking at direct submission
rather than bouncing around tasklets etc.
sseu: Lift the restriction to allow sseu to be reconfigured on virtual
engines composed of RENDER_CLASS (rcs).
v2: macroize check_user_mbz()
v3: Cancel virtual engines on wedging
v4: Commence commenting
v5: Replace 64b sibling_mask with a list of class:instance
v6: Drop the one-element array in the uabi
v7: Assert it is an virtual engine in to_virtual_engine()
v8: Skip over holes in [class][inst] so we can selftest with (vcs0, vcs2)
Link: https://github.com/intel/media-driver/pull/283
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20190521211134.16117-6-chris@chris-wilson.co.uk
A usecase arose out of handling context recovery in mesa, whereby they
wish to recreate a context with fresh logical state but preserving all
other details of the original. Currently, they create a new context and
iterate over which bits they want to copy across, but it would much more
convenient if they were able to just pass in a target context to clone
during creation. This essentially extends the setparam during creation
to pull the details from a target context instead of the user supplied
parameters.
The ideal here is that we don't expose control over anything more than
can be obtained via CONTEXT_PARAM. That is userspace retains explicit
control over all features, and this api is just convenience.
For example, you could replace
struct context_param p = { .param = CONTEXT_PARAM_VM };
param.ctx_id = old_id;
gem_context_get_param(&p.param);
new_id = gem_context_create();
param.ctx_id = new_id;
gem_context_set_param(&p.param);
gem_vm_destroy(param.value); /* drop the ref to VM_ID handle */
with
struct create_ext_param p = {
{ .name = CONTEXT_CREATE_CLONE },
.clone_id = old_id,
.flags = CLONE_FLAGS_VM
}
new_id = gem_context_create_ext(&p);
and not have to worry about stray namespace pollution etc.
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20190521211134.16117-5-chris@chris-wilson.co.uk
The SINGLE_TIMELINE flag can be used to create a context such that all
engine instances within that context share a common timeline. This can
be useful for mixing operations between real and virtual engines, or
when using a composite context for a single client API context.
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20190521211134.16117-4-chris@chris-wilson.co.uk
Allow the user to specify a local engine index (as opposed to
class:index) that they can use to refer to a preset engine inside the
ctx->engine[] array defined by an earlier I915_CONTEXT_PARAM_ENGINES.
This will be useful for setting SSEU parameters on virtual engines that
are local to the context and do not have a valid global class:instance
lookup.
Note that due to the ambiguity in using class:instance with
ctx->engines[], if a user supplied engine map is active the user must
specify the engine to alter by its index into the ctx->engines[].
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20190521211134.16117-3-chris@chris-wilson.co.uk
Over the last few years, we have debated how to extend the user API to
support an increase in the number of engines, that may be sparse and
even be heterogeneous within a class (not all video decoders created
equal). We settled on using (class, instance) tuples to identify a
specific engine, with an API for the user to construct a map of engines
to capabilities. Into this picture, we then add a challenge of virtual
engines; one user engine that maps behind the scenes to any number of
physical engines. To keep it general, we want the user to have full
control over that mapping. To that end, we allow the user to constrain a
context to define the set of engines that it can access, order fully
controlled by the user via (class, instance). With such precise control
in context setup, we can continue to use the existing execbuf uABI of
specifying a single index; only now it doesn't automagically map onto
the engines, it uses the user defined engine map from the context.
v2: Fixup freeing of local on success of get_engines()
v3: Allow empty engines[]
v4: s/nengine/num_engines/
v5: Replace 64 limit on num_engines with a note that execbuf is
currently limited to only using the first 64 engines.
v6: Actually use the engines_mutex to guard the ctx->engines.
Testcase: igt/gem_ctx_engines
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20190521211134.16117-2-chris@chris-wilson.co.uk
Having hid the partially exposed new ABI from the PR, put it back again
for completion of context recovery. A significant part of context
recovery is the ability to reuse as much of the old context as is
feasible (to avoid expensive reconstruction). The biggest chunk kept
hidden at the moment is fine-control over the ctx->ppgtt (the GPU page
tables and associated translation tables and kernel maps), so make
control over the ctx->ppgtt explicit.
This allows userspace to create and share virtual memory address spaces
(within the limits of a single fd) between contexts they own, along with
the ability to query the contexts for the vm state.
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20190521211134.16117-1-chris@chris-wilson.co.uk
After "5918045c4ed4 drm/scheduler: rework job destruction", jobs are
only deleted when the timeout handler is able to be cancelled
successfully.
In case no timeout handler is running (timeout == MAX_SCHEDULE_TIMEOUT),
job cleanup would be skipped which may result in memory leaks.
Add the handling for the (timeout == MAX_SCHEDULE_TIMEOUT) case in
drm_sched_cleanup_jobs.
Signed-off-by: Erico Nunes <nunes.erico@gmail.com>
Signed-off-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
Link: https://patchwork.freedesktop.org/patch/306025/?series=60878&rev=2
After "5918045c4ed4 drm/scheduler: rework job destruction", lima started
to leak memory due to buffers not being destroyed after job execution in
the drm scheduler.
This started happening because the drm scheduler only destroyed buffers
after cancelling the job timeout handler, and for lima this handler was
never started as lima specified a MAX_SCHEDULE_TIMEOUT timeout.
Lima seems to run well in its current state with a real timeout, so to
make it more aligned with the other drivers from now on, let's use a
real default timeout.
This also fixes the observed memory leaks.
The 500ms value was chosen as it is the current value for all other
embedded gpu drivers using drm sched.
Signed-off-by: Erico Nunes <nunes.erico@gmail.com>
Signed-off-by: Qiang Yu <yuq825@gmail.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20190520224229.21111-1-nunes.erico@gmail.com
Based on 2 normalized pattern(s):
this program is free software you can redistribute it and or modify
it under the terms of the gnu general public license as published by
the free software foundation either version 2 of the license or at
your option any later version this program is distributed in the
hope that it will be useful but without any warranty without even
the implied warranty of merchantability or fitness for a particular
purpose see the gnu general public license for more details you
should have received a copy of the gnu general public license along
with this program if not see http www gnu org licenses
this program is free software you can redistribute it and or modify
it under the terms of the gnu general public license as published by
the free software foundation either version 2 of the license or at
your option any later version this program is distributed in the
hope that it will be useful but without any warranty without even
the implied warranty of merchantability or fitness for a particular
purpose see the gnu general public license for more details [based]
[from] [clk] [highbank] [c] you should have received a copy of the
gnu general public license along with this program if not see http
www gnu org licenses
extracted by the scancode license scanner the SPDX license identifier
GPL-2.0-or-later
has been chosen to replace the boilerplate/reference in 355 file(s).
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Reviewed-by: Kate Stewart <kstewart@linuxfoundation.org>
Reviewed-by: Jilayne Lovejoy <opensource@jilayne.com>
Reviewed-by: Steve Winslow <swinslow@gmail.com>
Reviewed-by: Allison Randal <allison@lohutok.net>
Cc: linux-spdx@vger.kernel.org
Link: https://lkml.kernel.org/r/20190519154041.837383322@linutronix.de
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Add SPDX license identifiers to all Make/Kconfig files which:
- Have no license information of any form
These files fall under the project license, GPL v2 only. The resulting SPDX
license identifier is:
GPL-2.0-only
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Add SPDX license identifiers to all files which:
- Have no license information of any form
- Have MODULE_LICENCE("GPL*") inside which was used in the initial
scan/conversion to ignore the file
These files fall under the project license, GPL v2 only. The resulting SPDX
license identifier is:
GPL-2.0-only
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Add SPDX license identifiers to all files which:
- Have no license information of any form
- Have EXPORT_.*_SYMBOL_GPL inside which was used in the
initial scan/conversion to ignore the file
These files fall under the project license, GPL v2 only. The resulting SPDX
license identifier is:
GPL-2.0-only
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
If SVGA_3D_CMD_DX_SET_SHADER is called with a shader ID
of SVGA3D_INVALID_ID, and a shader type of
SVGA3D_SHADERTYPE_INVALID, the calculated binding.shader_slot
will be 4294967295, leading to an out-of-bounds read in vmw_binding_loc()
when the offset is calculated.
Cc: <stable@vger.kernel.org>
Fixes: d80efd5cb3 ("drm/vmwgfx: Initial DX support")
Signed-off-by: Murray McAllister <murray.mcallister@gmail.com>
Reviewed-by: Thomas Hellstrom <thellstrom@vmware.com>
Signed-off-by: Thomas Hellstrom <thellstrom@vmware.com>
If SVGA_3D_CMD_DX_DEFINE_RENDERTARGET_VIEW is called with a surface
ID of SVGA3D_INVALID_ID, the srf struct will remain NULL after
vmw_cmd_res_check(), leading to a null pointer dereference in
vmw_view_add().
Cc: <stable@vger.kernel.org>
Fixes: d80efd5cb3 ("drm/vmwgfx: Initial DX support")
Signed-off-by: Murray McAllister <murray.mcallister@gmail.com>
Reviewed-by: Thomas Hellstrom <thellstrom@vmware.com>
Signed-off-by: Thomas Hellstrom <thellstrom@vmware.com>
Use struct sg_dma_page_iter in favour struct of sg_page_iter, which fairly
recently was declared useless for obtaining dma addresses.
With a struct sg_dma_page_iter we can't call sg_page_iter_page() so
when the page is needed, use the same page lookup mechanism as for the
non-sg dma modes instead of calling sg_dma_page_iter.
Note, the fixes tag doesn't really point to a commit introducing a
failure / regression, but rather to a commit that implemented a simple
workaround for this problem.
Cc: Jason Gunthorpe <jgg@mellanox.com>
Fixes: d901b2760d ("lib/scatterlist: Provide a DMA page iterator")
Signed-off-by: Thomas Hellstrom <thellstrom@vmware.com>
Reviewed-by: Jason Gunthorpe <jgg@mellanox.com>
In compat mode, we allowed host-backed user-space with guest-backed
kernel / device. In this mode, set shader commands was broken since
no relocations were emitted. Fix this.
Cc: <stable@vger.kernel.org>
Fixes: e8c66efbfe ("drm/vmwgfx: Make user resource lookups reference-free during validation")
Signed-off-by: Thomas Hellstrom <thellstrom@vmware.com>
Reviewed-by: Brian Paul <brianp@vmware.com>
User-space handles equal to zero are interpreted as uninitialized or
illegal by some drm systems (most notably kms). This means that a
dumb buffer or surface with a zero user-space handle can never be
used as a kms frame-buffer.
Cc: <stable@vger.kernel.org>
Fixes: c7eae62666 ("drm/vmwgfx: Make the object handles idr-generated")
Signed-off-by: Thomas Hellstrom <thellstrom@vmware.com>
Reviewed-by: Deepak Rawat <drawat@vmware.com>
This may confuse user-space clients like plymouth that opens a drm
file descriptor as a result of a hotplug event and then generates a
new event...
Cc: <stable@vger.kernel.org>
Fixes: 5ea1734827 ("drm/vmwgfx: Send a hotplug event at master_set")
Signed-off-by: Thomas Hellstrom <thellstrom@vmware.com>
Reviewed-by: Deepak Rawat <drawat@vmware.com>
When unloading the bochs-drm driver, a warning message is printed by
drm_mode_config_cleanup() because a reference is still held to one of
the drm_connector structs.
Correct this by calling drm_atomic_helper_shutdown() in
bochs_pci_remove().
Fixes: 6579c39594 ("drm/bochs: atomic: switch planes to atomic, wire up helpers.")
Signed-off-by: Sam Bobroff <sbobroff@linux.ibm.com>
Link: http://patchwork.freedesktop.org/patch/msgid/93b363ad62f4938d9ddf3e05b2a61e3f66b2dcd3.1558416473.git.sbobroff@linux.ibm.com
Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
"ret" is uninitialized on this path but it should be -EINVAL.
Fixes: 930c8dfea4 ("drm/i915/gvt: Check if get_next_pt_type() always returns a valid value")
Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com>
Signed-off-by: Zhenyu Wang <zhenyuw@linux.intel.com>
the vGPU write on TRTTE and 0x4dfc is now write to vreg first. their
values all be restored hardware when context switching.
Fixes: e39c5add32 ("drm/i915/gvt: vGPU MMIO virtualization")
Acked-by: Zhenyu Wang <zhenyuw@linux.intel.com>
Signed-off-by: Yan Zhao <yan.y.zhao@intel.com>
Signed-off-by: Zhenyu Wang <zhenyuw@linux.intel.com>
0x4dfc is in-context mmio for gen9+, but each vm have different settings
need to add it to save-restore list along with other trtt registers
Fixes: 1786571393 ("drm/i915/gvt: vGPU context switch")
Acked-by: Zhenyu Wang <zhenyuw@linux.intel.com>
Signed-off-by: Yan Zhao <yan.y.zhao@intel.com>
Signed-off-by: Zhenyu Wang <zhenyuw@linux.intel.com>
for restore-inhibit context, hardware will not load in-context mmios
(engine context part) to hardware, but hardware will save the mmio
values in hardware back to context image. So, in order to save correct
values of vGPU back to context image, values of vGPU mmios have to be
loaded into hardware first for restore-inhibit context.
In this patch, the mechanism is applied to all gen9 platform.
The reason excluding gen8 platforms is only because of lacking of testing
on those platforms.
v3: for mocs registers, goto in-context mmios save-restore path for skl
platform as well (weinan li)
v2: update vreg when scanning indirect context for inhibit context for
gen9
Cc: Weinan Li <weinan.z.li@intel.com>
Acked-by: Weinan Li <weinan.z.li@intel.com>
Signed-off-by: Yan Zhao <yan.y.zhao@intel.com>
Signed-off-by: Zhenyu Wang <zhenyuw@linux.intel.com>
"To track whether a request has started on HW, we can emit a breadcrumb at
the beginning of the request and check its timeline's HWSP to see if the
breadcrumb has advanced past the start of this request." It means all the
request which timeline's has_init_breadcrumb is true, then the
emit_init_breadcrumb process must have before emitting the real commands,
otherwise, the scheduler might get a wrong state of this request during
reset. If the request is exactly the guilty one, the scheduler won't
terminate it with the wrong state. To avoid this, do emit_init_breadcrumb
for all the requests from gvt.
v2: cc to stable kernel
Fixes: 8547444137 ("drm/i915: Identify active requests")
Cc: stable@vger.kernel.org
Acked-by: Zhenyu Wang <zhenyuw@linux.intel.com>
Signed-off-by: Weinan <weinan.z.li@intel.com>
Signed-off-by: Zhenyu Wang <zhenyuw@linux.intel.com>
Fix compute profile switching on process termination.
Add a dedicated reference counter to keep track of entry/exit to/from
compute profile. This enables switching compute profiles for other
reasons than process creation or termination.
Signed-off-by: Harish Kasiviswanathan <Harish.Kasiviswanathan@amd.com>
Signed-off-by: Eric Huang <JinhuiEric.Huang@amd.com>
Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com>
Signed-off-by: Felix Kuehling <Felix.Kuehling@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
PSP fw primary buffer is not used under SRIOV.
Under SRIOV, VBIOS or hypervisor driver will load psp
sos and psp sysdrv. Therefore, we don't need to
allocate memory for it.
v2: remove superfluous check for amdgpu_bo_free_kernel().
Signed-off-by: Yintian Tao <yttao@amd.com>
Signed-off-by: Monk Liu <Monk.Liu@amd.com>
Acked-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
There is a typo so the code unlocks twice instead of taking the lock and
then releasing it.
Fixes: f14a323db5 ("drm/amd/powerplay: implement update enabled feature state to smc for smu11")
Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com>
Reviewed-by: Huang Rui <ray.huang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
For SR-IOV, vram_width can't be read from ATOM as
RAVEN, and DF related registers is not readable, so hardcord
is the only way to set the correct vram_width.
Reviewed-by: Yintian Tao <yttao@amd.com>
Signed-off-by: Trigger Huang <Trigger.Huang@amd.com>
Signed-off-by: Yintian Tao <yttao@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Not necessary on soc15 and breaks driver reload on server cards.
Acked-by: Amber Lin <Amber.Lin@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Cc: stable@vger.kernel.org
This was added to amdgpu but was missed in amdkfd
Signed-off-by: Kent Russell <kent.russell@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Cc: stable@vger.kernel.rg
[WHY]
Some early Raven boards had a bad SBIOS that doesn't play nicely with
the DMCU FW. We thought the issues were fixed by ignoring errors on DMCU
load but that doesn't seem to be the case. We've still seen reports of
users unable to boot their systems at all.
[HOW]
Disable DMCU load on Raven 1. Only load it for Raven 2 and Picasso.
Signed-off-by: Harry Wentland <harry.wentland@amd.com>
Reviewed-by: Nicholas Kazlauskas <nicholas.kazlauskas@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Cc: stable@vger.kernel.org
[WHY]
We only want to load DMCU FW on Picasso and Raven 2, not on Raven 1.
Signed-off-by: Harry Wentland <harry.wentland@amd.com>
Reviewed-by: Nicholas Kazlauskas <nicholas.kazlauskas@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Cc: stable@vger.kernel.org
In commit b7404c7ecb ("drm/i915: Bump ready tasks ahead of
busywaits"), I tried cutting a corner in order to not install a signal
for each of our dependencies, and only listened to requests on which we
were intending to busywait. The compromise that was made was that
instead of then being able to promote the request with a full
NOSEMAPHORE like its non-busywaiting brethren, as we had not ensured we
had cleared the semaphore chain, we settled for only using the NEWCLIENT
boost. With an over saturated system with multiple NEWCLIENTS in flight
at any time, this was found to be an inadequate promotion and left us
with a much poorer scheduling order than prior to using semaphores.
The outcome of this patch, is that all requests have NOSEMAPHORE
priority when they have no dependencies and are ready to run and not
busywait, restoring the pre-semaphore ordering on saturated systems.
We can demonstrate the effect of poor scheduling order by oversaturating
the system using gem_wsim on a system with multiple vcs engines
(i.e running the same workloads across more clients than required for
peak throughput, e.g. media_load_balance_17i7.wsim -c4 -b context):
x v5.1 (normalized)
+ tip
* fix
+------------------------------------------------------------------------+
| x |
| x |
| x |
| x |
| %x |
| %%x |
| %%x |
| %%x |
| %%x |
| %%x |
| %%x |
| %%x |
| %%x |
| %%x |
| %%x |
| %#x |
| %#x |
| %#x |
| %#x |
| %#x |
| + %#xx |
| + %#xx |
| + %%#xx |
| + %%#xx |
| + %%#xx |
| + %%#xx |
| + %%##x |
| +++ %%##x |
| +++ %%##x |
| +++ %%##x |
| ++++ %%##x |
| ++++ %%##x |
| ++++ %%##xx |
| ++++ %###xx |
| ++++ %###xx |
| ++++ %###xx |
| ++++ %###xx |
| ++++ + %#O#xx |
| ++++ + %#O#xx |
| ++++++ + %#O#xx |
| ++++++++++ %OOOxxx|
| ++++++++++ + %#OOO#xx|
| + ++++++++++++ ++ +++++ + ++ @@OOOO#xx|
| |A_| |
||__________M_______A____________________| |
| |A_| |
+------------------------------------------------------------------------+
N Min Max Median Avg Stddev
x 120 0.99456 1.00628 0.999985 1.0001545 0.0024387139
+ 120 0.873021 1.00037 0.884134 0.90148752 0.039190862
Difference at 99.5% confidence
-0.098667 +/- 0.0110762
-9.86517% +/- 1.10745%
(Student's t, pooled s = 0.0277657)
% 120 0.990207 1.00165 0.9970265 0.99699748 0.0021024
Difference at 99.5% confidence
-0.003157 +/- 0.000908245
-0.315651% +/- 0.0908105%
(Student's t, pooled s = 0.00227678)
Fixes: b7404c7ecb ("drm/i915: Bump ready tasks ahead of busywaits")
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Cc: Dmitry Rogozhkin <dmitry.v.rogozhkin@intel.com>
Cc: Dmitry Ermilov <dmitry.ermilov@intel.com>
Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20190515130052.4475-2-chris@chris-wilson.co.uk
(cherry picked from commit 17db337f50)
Signed-off-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Commit 1413b2bc07 ("drm/i915: Trim NEWCLIENT boosting") had the
intended consequence of not allowing a sequence of work that merely
crossed into a new engine the privilege to be promoted to NEWCLIENT
status. It also had the unintended consequence of actually making
NEWCLIENT effective on heavily oversubscribed transcode machines and
impacting upon their throughput.
If we consider a client packet composed of (rcsA, rcsB, vcs) and 30 of
those clients, using the NEWCLIENT boost that will be scheduled as
rcsA x 30, (rcsB, vcs) x 30
where as before it would have been
(rcsA, rcsB, vcs) x 30
That is with NEWCLIENT only boosting the first request of each client,
we would execute all rcsA requests prior to running on the vcs engines;
acruing a lot of dead time as compared to the previous case where the
vcs engine would be started in parallel to processing the second client.
The previous patch has the effect of delaying submission until it is
required by a third party (either the user with an explicit wait, or by
another client/engine). We reduce the NEWCLIENT bump to a mere WAIT,
which has the effect of removing its preemptive grant and reducing it to
the same level as any other user interaction -- that it will not be
promoted above the interengine dependencies, and so preventing NEWCLIENTS
from starving other engines. This a large nerf to the rrul properties of
the current NEWCLIENT, but it still does give prioritised submission to
new requests from light workloads.
References: b16c765122 ("drm/i915: Priority boost for new clients")
Fixes: 1413b2bc07 ("drm/i915: Trim NEWCLIENT boosting") # customer impact
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Cc: Dmitry Rogozhkin <dmitry.v.rogozhkin@intel.com>
Cc: Dmitry Ermilov <dmitry.ermilov@intel.com>
Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20190515130052.4475-4-chris@chris-wilson.co.uk
(cherry picked from commit 68fc728b01)
Signed-off-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
The handling of the no-preemption priority level imposes the restriction
that we need to maintain the implied ordering even though preemption is
disabled. Otherwise we may end up with an AB-BA deadlock across multiple
engine due to a real preemption event reordering the no-preemption
WAITs. To resolve this issue we currently promote all requests to WAIT
on unsubmission, however this interferes with the timeslicing
requirement that we do not apply any implicit promotion that will defeat
the round-robin timeslice list. (If we automatically promote the active
request it will go back to the head of the queue and not the tail!)
So we need implicit promotion to prevent reordering around semaphores
where we are not allowed to preempt, and we must avoid implicit
promotion on unsubmission. So instead of at unsubmit, if we apply that
implicit promotion on adding the dependency, we avoid the semaphore
deadlock and we also reduce the gains made by the promotion for user
space waiting. Furthermore, by keeping the earlier dependencies at a
higher level, we reduce the search space for timeslicing without
altering runtime scheduling too badly (no dependencies at all will be
assigned a higher priority for rrul).
v2: Limit the bump to external edges (as originally intended) i.e.
between contexts and out to the user.
Testcase: igt/gem_concurrent_blit
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20190515130052.4475-3-chris@chris-wilson.co.uk
(cherry picked from commit 6e7eb7a807)
Signed-off-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
To simplify the next patch, update bump_priority and schedule to accept
the internal i915_sched_ndoe directly and not expect a request pointer.
add/remove: 0/0 grow/shrink: 2/1 up/down: 8/-15 (-7)
Function old new delta
i915_schedule_bump_priority 109 113 +4
i915_schedule 50 54 +4
__i915_schedule 922 907 -15
v2: Adopt node for the old rq local, since it no longer is a request but
the origin node.
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20190513120102.29660-2-chris@chris-wilson.co.uk
(cherry picked from commit 52c76fb18a)
Signed-off-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
To avoid pulling in a forward declaration in the next patch, move the
i915_sched_node handling to after the main dfs of the scheduler.
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20190513120102.29660-1-chris@chris-wilson.co.uk
(cherry picked from commit 5ae87063c1)
Signed-off-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
With gtt remapping in place we can use arbitrarily large
framebuffers. Let's bump the limits to 16kx16k on gen7+.
The limit was chosen to match the maximum 2D surface size
of the 3D engine.
With the remapping we could easily go higher than that for the
display engine. However the modesetting ddx will blindly assume
it can handle whatever is reported via kms. The oversized
buffer dimensions are not caught by glamor nor Mesa until
finally an assert will trip when genxml attempts to pack the
SURFACE_STATE. So we pick a safe limit to avoid the X server
from crashing (or potentially misbehaving if the genxml asserts
are compiled out).
Reviewed-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=110187
Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20190509122159.24376-9-ville.syrjala@linux.intel.com
Reviewed-by: Maarten Lankhorst <maarten.lankhorst@linux.intel.com>
With gtt remapping plugged in we can simply raise the stride
limit on gen4+. Let's just pick the limit to match the render
engine max stride (256KiB on gen7+, 128KiB on gen4+).
No remapping CCS because the virtual address of each page actually
matters due to the new hash mode
(WaCompressedResourceDisplayNewHashMode:skl,kbl etc.), and no
remapping on gen2/3 due extra complications from fence alignment
and gen2 2KiB GTT tile size. Also no real benefit since the
display engine limits already match the other limits.
v2: Rebase due to is_ccs_modifier()
v3: Tweak the comment and commit msg
v4: Fix gen4+ stride limit to be 128KiB
Reviewed-by: Daniel Vetter <daniel.vetter@ffwll.ch> #v3
Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20190509122159.24376-8-ville.syrjala@linux.intel.com
Reviewed-by: Maarten Lankhorst <maarten.lankhorst@linux.intel.com>
Align dumb buffer stride to 4k if the fb will be big enough to
require gtt remapping.
v2: Leave the stride alone for buffers that look to be for the cursor
v3: Make it not a hack (Daniel)
Cc: Daniel Vetter <daniel.vetter@ffwll.ch>
Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20190509122159.24376-7-ville.syrjala@linux.intel.com
Reviewed-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Reviewed-by: Maarten Lankhorst <maarten.lankhorst@linux.intel.com>
The display engine stride limits are getting in our way. On SKL+
we are limited to 8k pixels, which is easily exceeded with three
4k displays. To overcome this limitation we can remap the pages
in the GTT to provide the display engine with a view of memory
with a smaller stride.
The code is mostly already there as We already play tricks with
the plane surface address and x/y offsets.
A few caveats apply:
* linear buffers need the fb stride to be page aligned, as
otherwise the remapped lines wouldn't start at the same
spot
* compressed buffers can't be remapped due to the new
ccs hash mode causing the virtual address of the pages
to affect the interpretation of the compressed data. IIRC
the old hash was limited to the low 12 bits so if we were
using that mode we could remap. As it stands we just refuse
to remapp with compressed fbs.
* no remapping gen2/3 as we'd need a fence for the remapped
vma, which we currently don't have. Need to deal with the
fence POT requirements, and do something about the gen2
gtt page size vs tile size difference
v2: Rebase due to is_ccs_modifier()
Fix up the skl+ stride_mult mess
memset() the gtt_view because otherwise we could leave
junk in plane[1] when going from 2 plane to 1 plane format
v3: intel_check_plane_stride() was split out
v4: Drop the aligned viewport stuff, it was meant for ccs which
can't be remapped anyway
v5: Introduce intel_plane_can_remap()
Reorder the code so that plane_state->view gets filled
even for invisible planes, otherwise we'd keep using
stale values and could explode during remapping. The new
logic never remaps invisible planes since we don't have
a viewport, and instead pins the full fb instead
v6: Fix plane src coord checks after remapping by moving
plane_state->base.src to the final plane x/y offsets.
Allow intel_plane_check_stride() to fail even with
remapping (can happen at least with a linear 64bpp
fb with a 4k plane and a suitably inconvenient src
coordinates).
Improve aux plane FIXME (Daniel)
Move some code shuffling into a separate patch (Daniel)
Testcase: igt/kms_big_fb
Cc: Daniel Vetter <daniel@ffwll.ch>
Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20190509122159.24376-6-ville.syrjala@linux.intel.com
Reviewed-by: Maarten Lankhorst <maarten.lankhorst@linux.intel.com>
Reorganize some fb stride checking code a bit to prepare for
gtt remapping. And do a bit of s/pitch/stride/ renaming in the
process for a bit more uniformity (apart from the whole
fb->pitches[] thing).
Cc: Daniel Vetter <daniel.vetter@ffwll.ch>
Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20190509122159.24376-5-ville.syrjala@linux.intel.com
Reviewed-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Reviewed-by: Maarten Lankhorst <maarten.lankhorst@linux.intel.com>
Add a live selftest to excercise rotated/remapped vmas. We simply
write through the rotated/remapped vma, and confirm that the data
appears in the right page when read through the normal vma.
Not sure what the fallout of making all rotated/remapped vmas
mappable/fenceable would be, hence I just hacked it in the test.
v2: Grab rpm reference (Chris)
GEM_BUG_ON(view.type not as expected) (Chris)
Allow CAN_FENCE for rotated/remapped vmas (Chris)
Update intel_plane_uses_fence() to ask for a fence
only for normal vmas on gen4+
v3: Deal with intel_wakeref_t
v4: Rebase
Cc: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20190509122159.24376-4-ville.syrjala@linux.intel.com
Reviewed-by: Maarten Lankhorst <maarten.lankhorst@linux.intel.com>
Extend the rotated vma mock selftest to cover remapped vmas as
well.
TODO: reindent the loops I guess? Left like this for now to
ease review
v2: Include the vma type in the error message (Chris)
v3: Deal with trimmed sg
v4: Drop leftover debugs
Cc: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20190509122159.24376-3-ville.syrjala@linux.intel.com
Reviewed-by: Maarten Lankhorst <maarten.lankhorst@linux.intel.com>
To overcome display engine stride limits we'll want to remap the
pages in the GTT. To that end we need a new gtt_view type which
is just like the "rotated" type except not rotated.
v2: Use intel_remapped_plane_info base type
s/unused/unused_mbz/ (Chris)
Separate BUILD_BUG_ON()s (Chris)
Use I915_GTT_PAGE_SIZE (Chris)
v3: Use i915_gem_object_get_dma_address() (Chris)
Trim the sg (Tvrtko)
v4: Actually trim this time. Limit the max length
to one row of pages to keep things simple
Cc: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Tvrtko Ursulin <tvrtko.ursulin@linux.intel.com>
Reviewed-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20190509122159.24376-2-ville.syrjala@linux.intel.com
Reviewed-by: Maarten Lankhorst <maarten.lankhorst@linux.intel.com>
patches so we can merge them into -fixes.
Signed-off-by: Sean Paul <seanpaul@chromium.org>
-----BEGIN PGP SIGNATURE-----
iQEzBAABCgAdFiEEfxcpfMSgdnQMs+QqlvcN/ahKBwoFAlziwSIACgkQlvcN/ahK
BwrfMgf/egFs+Z6oNAlKgadIiy+qd8Yp4DrQHYssdV2ToxCBDiOpTrxeC92gFcqD
TNVUpLRg0XoDcVm81HriOMfmK9uUsOreFY2WoER1hKgeiv1QNpO8AO6rzF4aykY6
nn8PP0N4QS3En/bwqxDr8S9mnXzxkBjYk6j/Kn7kEXfsH2nI9RxIl9zBhLirsYBq
GOj01tsdbTPZBaDg0p0A0ZUzlqoyZInMPkri+pCiAgF2w6mo1n3Z/y45uGpboMFn
daGJjdZCoRtHaBo5naqLQ4Osx075oqZaguGaAT+BNb3QaulBDeKCA0Bu201iqWwD
INBD+Bh/6Q9MdV8vzc3VLwLJFlDj0A==
=ZyUW
-----END PGP SIGNATURE-----
Merge drm-misc-next-fixes-2019-05-20 into drm-misc-fixes
Picking up 3 sun4i patches that missed the last drm-misc-next-fixes pull
request for 5.2
Signed-off-by: Sean Paul <seanpaul@chromium.org>
drivers/gpu/drm/bochs/bochs_mm.c:19:1-3: WARNING: PTR_ERR_OR_ZERO can be used
Use PTR_ERR_OR_ZERO rather than if(IS_ERR(...)) + PTR_ERR
Generated by: scripts/coccinelle/api/ptr_ret.cocci
Fixes: b3a25b9af8 ("drm/bochs: Convert bochs driver to VRAM MM")
CC: Thomas Zimmermann <tzimmermann@suse.de>
Signed-off-by: kbuild test robot <lkp@intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/20190520122314.GA155389@lkp-kbuild22
Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
drm_get_format_info directly calls into drm_format_info, but takes directly
a struct drm_mode_fb_cmd2 pointer, instead of the fourcc directly. It's
shorter to not dereference it, and we can customise the behaviour at the
driver level if we want to, so let's switch to it where it makes sense.
Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
Reviewed-by: Paul Kocialkowski <paul.kocialkowski@bootlin.com>
Signed-off-by: Maxime Ripard <maxime.ripard@bootlin.com>
Link: https://patchwork.freedesktop.org/patch/msgid/5859d68664b8f0804a56e7386937f6db986b9e0f.1558002671.git-series.maxime.ripard@bootlin.com
So far, the drm_format_plane_height/width functions were operating on the
format's fourcc and was doing a lookup to retrieve the drm_format_info
structure and return the cpp.
However, this is inefficient since in most cases, we will have the
drm_format_info pointer already available so we shouldn't have to perform a
new lookup. Some drm_fourcc functions also already operate on the
drm_format_info pointer for that reason, so the API is quite inconsistent
there.
Let's follow the latter pattern and remove the extra lookup while being a
bit more consistent.
In order to be extra consistent, also rename that function to
drm_format_info_plane_cpp and to a static function in the header to match
the current policy. The parameters order have also be changed to match the
other functions prototype.
Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
Reviewed-by: Paul Kocialkowski <paul.kocialkowski@bootlin.com>
Signed-off-by: Maxime Ripard <maxime.ripard@bootlin.com>
Link: https://patchwork.freedesktop.org/patch/msgid/514af1d489d80b8b1767e3716b663ce5103da6eb.1558002671.git-series.maxime.ripard@bootlin.com
So far, the drm_format_plane_cpp function was operating on the format's
fourcc and was doing a lookup to retrieve the drm_format_info structure and
return the cpp.
However, this is inefficient since in most cases, we will have the
drm_format_info pointer already available so we shouldn't have to perform a
new lookup. Some drm_fourcc functions also already operate on the
drm_format_info pointer for that reason, so the API is quite inconsistent
there.
Let's follow the latter pattern and remove the extra lookup while being a
bit more consistent. In order to be extra consistent, also rename that
function to drm_format_info_plane_cpp and to a static function in the
header to match the current policy.
Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
Reviewed-by: Paul Kocialkowski <paul.kocialkowski@bootlin.com>
Signed-off-by: Maxime Ripard <maxime.ripard@bootlin.com>
Link: https://patchwork.freedesktop.org/patch/msgid/32aa13e53dbc98a90207fd290aa8e79f785fb11e.1558002671.git-series.maxime.ripard@bootlin.com
drm_format_horz_chroma_subsampling and drm_format_vert_chroma_subsampling
are basically a lookup in the drm_format_info table plus an access to the
hsub and vsub fields of the appropriate entry.
Most drivers are using this function while having access to the entry
already, which means that we will perform an unnecessary lookup. Removing
the call to these functions is therefore more efficient.
Some drivers will not have access to that entry in the function, but in
this case the overhead is minimal (we just have to call drm_format_info()
to perform the lookup) and we can even avoid multiple, inefficient lookups
in some places that need multiple fields from the drm_format_info
structure.
This is amplified by the fact that most of the time the callers will have
to retrieve both the vsub and hsub fields, meaning that they would perform
twice the lookup.
Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
Reviewed-by: Paul Kocialkowski <paul.kocialkowski@bootlin.com>
Reviewed-by: Philipp Zabel <p.zabel@pengutronix.de>
Signed-off-by: Maxime Ripard <maxime.ripard@bootlin.com>
Link: https://patchwork.freedesktop.org/patch/msgid/6b3cceb8161e2c1d40c2681de99202328b0a8abc.1558002671.git-series.maxime.ripard@bootlin.com
drm_format_num_planes() is basically a lookup in the drm_format_info table
plus an access to the num_planes field of the appropriate entry.
Most drivers are using this function while having access to the entry
already, which means that we will perform an unnecessary lookup. Removing
the call to drm_format_num_planes is therefore more efficient.
Some drivers will not have access to that entry in the function, but in
this case the overhead is minimal (we just have to call drm_format_info()
to perform the lookup) and we can even avoid multiple, inefficient lookups
in some places that need multiple fields from the drm_format_info
structure.
Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
Reviewed-by: Paul Kocialkowski <paul.kocialkowski@bootlin.com>
Signed-off-by: Maxime Ripard <maxime.ripard@bootlin.com>
Link: https://patchwork.freedesktop.org/patch/msgid/5ffcec9d14a50ed538e37d565f546802452ee672.1558002671.git-series.maxime.ripard@bootlin.com
The Rockchip VOP driver has a function, scl_vop_cal_scl_fac, that will
lookup the drm_format_info structure from the fourcc passed to it by its
caller.
However, its only caller already derefences the drm_format_info structure
it has access to to retrieve that fourcc. Change the prototype of that
function to pass the drm_format_info structure directly, removing the need
for an extra lookup.
Reviewed-by: Paul Kocialkowski <paul.kocialkowski@bootlin.com>
Reviewed-by: Sean Paul <sean@poorly.run>
Suggested-by: Philipp Zabel <p.zabel@pengutronix.de>
Signed-off-by: Maxime Ripard <maxime.ripard@bootlin.com>
Link: https://patchwork.freedesktop.org/patch/msgid/27b0041c7977402df4a087c78d2849ffe51c9f1c.1558002671.git-series.maxime.ripard@bootlin.com
With the disappearance of NEWCLIENT, we no longer need to provide the
priority boost on preemption in order to prevent repeated gazumping,
and we can remove the dead code.
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20190515130052.4475-5-chris@chris-wilson.co.uk
Commit 1413b2bc07 ("drm/i915: Trim NEWCLIENT boosting") had the
intended consequence of not allowing a sequence of work that merely
crossed into a new engine the privilege to be promoted to NEWCLIENT
status. It also had the unintended consequence of actually making
NEWCLIENT effective on heavily oversubscribed transcode machines and
impacting upon their throughput.
If we consider a client packet composed of (rcsA, rcsB, vcs) and 30 of
those clients, using the NEWCLIENT boost that will be scheduled as
rcsA x 30, (rcsB, vcs) x 30
where as before it would have been
(rcsA, rcsB, vcs) x 30
That is with NEWCLIENT only boosting the first request of each client,
we would execute all rcsA requests prior to running on the vcs engines;
acruing a lot of dead time as compared to the previous case where the
vcs engine would be started in parallel to processing the second client.
The previous patch has the effect of delaying submission until it is
required by a third party (either the user with an explicit wait, or by
another client/engine). We reduce the NEWCLIENT bump to a mere WAIT,
which has the effect of removing its preemptive grant and reducing it to
the same level as any other user interaction -- that it will not be
promoted above the interengine dependencies, and so preventing NEWCLIENTS
from starving other engines. This a large nerf to the rrul properties of
the current NEWCLIENT, but it still does give prioritised submission to
new requests from light workloads.
References: b16c765122 ("drm/i915: Priority boost for new clients")
Fixes: 1413b2bc07 ("drm/i915: Trim NEWCLIENT boosting") # customer impact
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Cc: Dmitry Rogozhkin <dmitry.v.rogozhkin@intel.com>
Cc: Dmitry Ermilov <dmitry.ermilov@intel.com>
Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20190515130052.4475-4-chris@chris-wilson.co.uk
The handling of the no-preemption priority level imposes the restriction
that we need to maintain the implied ordering even though preemption is
disabled. Otherwise we may end up with an AB-BA deadlock across multiple
engine due to a real preemption event reordering the no-preemption
WAITs. To resolve this issue we currently promote all requests to WAIT
on unsubmission, however this interferes with the timeslicing
requirement that we do not apply any implicit promotion that will defeat
the round-robin timeslice list. (If we automatically promote the active
request it will go back to the head of the queue and not the tail!)
So we need implicit promotion to prevent reordering around semaphores
where we are not allowed to preempt, and we must avoid implicit
promotion on unsubmission. So instead of at unsubmit, if we apply that
implicit promotion on adding the dependency, we avoid the semaphore
deadlock and we also reduce the gains made by the promotion for user
space waiting. Furthermore, by keeping the earlier dependencies at a
higher level, we reduce the search space for timeslicing without
altering runtime scheduling too badly (no dependencies at all will be
assigned a higher priority for rrul).
v2: Limit the bump to external edges (as originally intended) i.e.
between contexts and out to the user.
Testcase: igt/gem_concurrent_blit
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20190515130052.4475-3-chris@chris-wilson.co.uk
Smatch spotted:
drivers/gpu/drm/i915//intel_hdcp.c:1406 hdcp2_authenticate_repeater_topology() warn: should this be a bitwise op?
and indeed looks to be suspect that we do need to use a bitwise or to
combine the two register fields into one counter.
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Ramalingam C <ramalingam.c@intel.com>
Reviewed-by: Ramalingam C <ramalingam.c@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20190517102225.3069-3-chris@chris-wilson.co.uk
Just to squelch an smatch warning that doesn't see the with_() being
taken unconditionally:
drivers/gpu/drm/i915//intel_dp.c:230 intel_dp_get_fia_supported_lane_count() error: uninitialized symbol 'lane_info'.
drivers/gpu/drm/i915//intel_dp.c:5338 intel_digital_port_connected() error: uninitialized symbol 'is_connected'.
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Imre Deak <imre.deak@intel.com>
Reviewed-by: Imre Deak <imre.deak@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20190517102225.3069-2-chris@chris-wilson.co.uk
In commit b7404c7ecb ("drm/i915: Bump ready tasks ahead of
busywaits"), I tried cutting a corner in order to not install a signal
for each of our dependencies, and only listened to requests on which we
were intending to busywait. The compromise that was made was that
instead of then being able to promote the request with a full
NOSEMAPHORE like its non-busywaiting brethren, as we had not ensured we
had cleared the semaphore chain, we settled for only using the NEWCLIENT
boost. With an over saturated system with multiple NEWCLIENTS in flight
at any time, this was found to be an inadequate promotion and left us
with a much poorer scheduling order than prior to using semaphores.
The outcome of this patch, is that all requests have NOSEMAPHORE
priority when they have no dependencies and are ready to run and not
busywait, restoring the pre-semaphore ordering on saturated systems.
We can demonstrate the effect of poor scheduling order by oversaturating
the system using gem_wsim on a system with multiple vcs engines
(i.e running the same workloads across more clients than required for
peak throughput, e.g. media_load_balance_17i7.wsim -c4 -b context):
x v5.1 (normalized)
+ tip
* fix
+------------------------------------------------------------------------+
| x |
| x |
| x |
| x |
| %x |
| %%x |
| %%x |
| %%x |
| %%x |
| %%x |
| %%x |
| %%x |
| %%x |
| %%x |
| %%x |
| %#x |
| %#x |
| %#x |
| %#x |
| %#x |
| + %#xx |
| + %#xx |
| + %%#xx |
| + %%#xx |
| + %%#xx |
| + %%#xx |
| + %%##x |
| +++ %%##x |
| +++ %%##x |
| +++ %%##x |
| ++++ %%##x |
| ++++ %%##x |
| ++++ %%##xx |
| ++++ %###xx |
| ++++ %###xx |
| ++++ %###xx |
| ++++ %###xx |
| ++++ + %#O#xx |
| ++++ + %#O#xx |
| ++++++ + %#O#xx |
| ++++++++++ %OOOxxx|
| ++++++++++ + %#OOO#xx|
| + ++++++++++++ ++ +++++ + ++ @@OOOO#xx|
| |A_| |
||__________M_______A____________________| |
| |A_| |
+------------------------------------------------------------------------+
N Min Max Median Avg Stddev
x 120 0.99456 1.00628 0.999985 1.0001545 0.0024387139
+ 120 0.873021 1.00037 0.884134 0.90148752 0.039190862
Difference at 99.5% confidence
-0.098667 +/- 0.0110762
-9.86517% +/- 1.10745%
(Student's t, pooled s = 0.0277657)
% 120 0.990207 1.00165 0.9970265 0.99699748 0.0021024
Difference at 99.5% confidence
-0.003157 +/- 0.000908245
-0.315651% +/- 0.0908105%
(Student's t, pooled s = 0.00227678)
Fixes: b7404c7ecb ("drm/i915: Bump ready tasks ahead of busywaits")
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Cc: Dmitry Rogozhkin <dmitry.v.rogozhkin@intel.com>
Cc: Dmitry Ermilov <dmitry.ermilov@intel.com>
Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20190515130052.4475-2-chris@chris-wilson.co.uk
Avoid charging us for the presumed busywait if the request was preempted
after successfully using semaphores to reduce inter-engine latency.
v2: Bump the priority to reflect the lack of semaphores now required.
References: ca6e56f654 ("drm/i915: Disable semaphore busywaits on saturated systems")
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20190515130052.4475-1-chris@chris-wilson.co.uk
The original bochs and vbox implementations of pin and unpin functions
automatically reserved BOs during validation. This functionality got lost
while converting the code to a generic implementation. This may result
in validating unlocked TTM BOs.
Adding the reserve and unreserve operations to GEM VRAM's pin and unpin
functions fixes the bochs and vbox drivers. Additionally the patch changes
the mgag200, ast and hibmc drivers to not reserve BOs by themselves.
Signed-off-by: Thomas Zimmermann <tzimmermann@suse.de>
Reported-by: kernel test robot <lkp@intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/20190516162746.11636-3-tzimmermann@suse.de
Fixes: a3232987fd ("drm/bochs: Convert bochs driver to |struct drm_gem_vram_object|")
Reported-by: kernel test robot <lkp@intel.com>
Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
The new interfaces drm_gem_vram_{pin/unpin}_reserved() are variants of the
GEM VRAM pin/unpin functions that do not reserve the BO during validation.
The mgag200 driver requires this behavior for its cursor handling. The
patch also converts the driver to use the new interfaces.
Signed-off-by: Thomas Zimmermann <tzimmermann@suse.de>
Link: http://patchwork.freedesktop.org/patch/msgid/20190516162746.11636-2-tzimmermann@suse.de
Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
include when we had clk_readl() and clk_writel(), but those are gone now
so this patch pushes the dependency out to the users of clk-provider.h.
-----BEGIN PGP SIGNATURE-----
iQJFBAABCAAvFiEE9L57QeeUxqYDyoaDrQKIl8bklSUFAlzdx/ERHHNib3lkQGtl
cm5lbC5vcmcACgkQrQKIl8bklSUeJw/+NfQU+GwmfM1mXUnClTuJthKZRlXQTt4o
KzD4VtdqWOPSPWF2QNUM3oG5+FxbmzxZBMMyAfWKO4MS/hYvD3wZOdbP56KvoUe6
I75FHSGYlXFMvohm6vjPvfx30IcBn0QZcP9bhP5B5h0UbIG9annbVWWNR2qBg+/O
4p3o33CPSIO5W3IblSWrFzuEOBXNlkJKTIZW2BcV33aUCbAD3wrvqoP5l7xBbDJN
U+QC+4LoZtA1RSM03qOzHleXrXNhBjWNtxRqXCIu0hkmyVdPAHDg0tb745HdLUc+
PTRCCguU21ANJMf2hD0dYiRi5fSPSLzIqQ2uZW8O6/ChSIMsOrZ43tW1TsQ7E7ZD
gGEu2aj5euPyTVh0HmWKXyqEEUF/fqywJtwNQSyNTzDvQd807Pabb1YoIzZz9w2S
V+/PoDVYF90IN1DsuOnbTCQ/BK0bqUb+7BtkrCzJ1ip3FpdB3017zT1b5wIzLjfI
1NO3ub5iHGAiS1qzChGa3Va56CDjspx66atMomDaeOQsBC983GdWOerBunKxL3UM
US7rhr9DgPz8p9DEFPeXQXABgZUV4ToBb8nD8b2U1eFiOZthg7CO5mKSuwPGXcBQ
RsWwmxc87DZJJJno2abacK/h0ii/r8f+3+C9x98vtYewJEC5RHbJygYHcr3YNjo1
LOdCMassRT4=
=JjQo
-----END PGP SIGNATURE-----
Merge tag 'clk-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/clk/linux
Pull more clk framework updates from Stephen Boyd:
"One more patch to remove io.h from clk-provider.h.
We used to need this include when we had clk_readl() and clk_writel(),
but those are gone now so this patch pushes the dependency out to the
users of clk-provider.h"
* tag 'clk-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/clk/linux:
clk: Remove io.h from clk-provider.h
We were setting the wrong flags to enable PTI errors, so we were
seeing reads to invalid PTEs show up as write errors. Also, we
weren't turning on the interrupts. The AXI IDs we were dumping
included the outstanding write number and so they looked basically
random. And the VIO_ADDR decoding was based on the MMU VA_WIDTH for
the first platform I worked on and was wrong on others. In short,
this was a thorough mess from early HW enabling.
Tested on V3D 4.1 and 4.2 with intentional L2T, CLE, PTB, and TLB
faults.
Signed-off-by: Eric Anholt <eric@anholt.net>
Link: https://patchwork.freedesktop.org/patch/msgid/20190419001014.23579-4-eric@anholt.net
Reviewed-by: Paul Kocialkowski <paul.kocialkowski@bootlin.com>
Looking at a hang recently, I noticed these registers that might tell
me if something obvious was wrong. They didn't help in this case, but
keep it around for the future.
Signed-off-by: Eric Anholt <eric@anholt.net>
Link: https://patchwork.freedesktop.org/patch/msgid/20190419001014.23579-3-eric@anholt.net
Reviewed-by: Paul Kocialkowski <paul.kocialkowski@bootlin.com>
-----BEGIN PGP SIGNATURE-----
iQIcBAABAgAGBQJc3MkNAAoJEAx081l5xIa+DpkP/2VhwTcnyumBi/cxG9/D/vRN
W0tZDaiR3jvdz5J0lbg03fJnzgbBIFREjS9/W3PmVbsAvzMfUTok9trlw1rht82a
NLbfK/Ge8e+OqmSRJ5kqMblixoWyqmM03/3/ms/nSfd86qywYpLy3ldZHK+WKdJq
BV3topSeuP6dUj35v4L7y69DZ0Q6ignOe8udjkJgUyq9HCU7PxXxiV1K/9wUeFq1
EL736LTnvm7djzseS0kOSIB2oguRnP+6czYkVvMwP2bxuDayklOjAL6s489pvX6e
36FkFXFksj7vvi3s9JfKQ6gmlE4GbWvFevI0cXzB+G9mWr53nrV+iinRvx9neA3C
YG263vD14Brm6Nfr4PCMAI7PgNLas9y0cpzc6LwjpJdWTmPiY/UhHmYbk81ZGEC5
s6Du8vt2PHvkbsyWdNglpPztZzGTKihCIoAtoDzH0TQTLVvoUsr8CKEk/w7wUtuQ
NZSoDnlxlRVVRx07H8/aSYzwDjY9R0J3zCPRCrkDhoPy2tGFJd4xv/d2c1uVV19v
VVqUZLHNhXdSJiu0RNRmyJktgQgnMsONRqaY6qsJq9O8sQrMVU5V+BRk9iBm/m21
xSw1nXJZsPYIqvxgF6+mmr98bIktu47htguV+Zp4rCUzAfO0YxBO5++vqBrLRPax
T97d8gQopoX5nnBI4pzN
=EfK4
-----END PGP SIGNATURE-----
Merge tag 'drm-next-2019-05-16' of git://anongit.freedesktop.org/drm/drm
Pull drm fixes from Dave Airlie:
"A bunch of fixes for the merge window closure, doesn't seem to be
anything too major or serious in there.
It does add TU117 turing modesetting to nouveau but it's just an
enable for preexisting code.
amdgpu:
- gpu reset at load crash fix
- ATPX hotplug fix for when dGPU is off
- SR-IOV fixes
radeon:
- r5xx pll fixes
i915:
- GVT (MCHBAR, buffer alignment, misc warnings fixes)
- Fixes for newly enabled semaphore code
- Geminilake disable framebuffer compression
- HSW edp fast modeset fix
- IRQ vs RCU race fix
nouveau:
- Turing modesetting fixes
- TU117 support
msm:
- SDM845 bringup fixes
panfrost:
- static checker fixes
pl111:
- spinlock init fix.
bridge:
- refresh rate register fix for adv7511"
* tag 'drm-next-2019-05-16' of git://anongit.freedesktop.org/drm/drm: (36 commits)
drm/msm: Upgrade gxpd checks to IS_ERR_OR_NULL
drm/msm/dpu: Remove duplicate header
drm/pl111: Initialize clock spinlock early
drm/msm: correct attempted NULL pointer dereference in debugfs
drm/msm: remove resv fields from msm_gem_object struct
drm/nouveau: fix duplication of nv50_head_atom struct
drm/nouveau/disp/dp: respect sink limits when selecting failsafe link configuration
drm/nouveau/core: initial support for boards with TU117 chipset
drm/nouveau/core: allow detected chipset to be overridden
drm/nouveau/kms/gf119-gp10x: push HeadSetControlOutputResource() mthd when encoders change
drm/nouveau/kms/nv50-: fix bug preventing non-vsync'd page flips
drm/nouveau/kms/gv100-: fix spurious window immediate interlocks
drm/bridge: adv7511: Fix low refresh rate selection
drm/panfrost: Add missing _fini() calls in panfrost_device_fini()
drm/panfrost: Only put sync_out if non-NULL
drm/i915: Seal races between async GPU cancellation, retirement and signaling
drm/i915: Fix fastset vs. pfit on/off on HSW EDP transcoder
drm/i915/fbc: disable framebuffer compression on GeminiLake
drm/amdgpu/psp: move psp version specific function pointers to early_init
drm/radeon: prefer lower reference dividers
...
drm_fb_helper_hotplug_event() should tolerate the fb_helper argument being
NULL. Commit 03a9606e7f ("drm/fb-helper: Avoid race with DRM userspace")
introduced a fb_helper dereference before the NULL check.
Fixup by moving the dereference after the NULL check.
Fixes: 03a9606e7f ("drm/fb-helper: Avoid race with DRM userspace")
Reported-by: kbuild test robot <lkp@intel.com>
Reported-by: Dan Carpenter <dan.carpenter@oracle.com>
Signed-off-by: Noralf Trønnes <noralf@tronnes.org>
Reviewed-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Link: https://patchwork.freedesktop.org/patch/msgid/20190515132925.48867-1-noralf@tronnes.org
Some DSI panels do use GENERIC_SHORT_WRITE_2 transfer protocol to host
DSI driver and which is similar to GENERIC_SHORT_WRITE.
Add support for the same transfer, so-that so-that the panels which are
requesting similar transfer type will process properly.
Signed-off-by: Jagan Teki <jagan@amarulasolutions.com>
Tested-by: Merlijn Wajer <merlijn@wizzup.org>
Signed-off-by: Maxime Ripard <maxime.ripard@bootlin.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20190512184128.13720-3-jagan@amarulasolutions.com
Loop N1 instruction delay for burst mode devices are computed
based on horizontal sync and porch timing values.
The current driver is using u16 type for computing this hsync_porch
value, which would failed to fit within the u16 type for large sync
and porch timings devices. This would result in hsync_porch overflow
and eventually computed wrong instruction delay value.
Example, timings, where it produces the overflow
{
.hdisplay = 1080,
.hsync_start = 1080 + 408,
.hsync_end = 1080 + 408 + 4,
.htotal = 1080 + 408 + 4 + 38,
}
It reproduces the desired delay value 65487 but the correct working
value should be 7.
So, Fix it by computing hsync_porch value separately with u32 type.
Fixes: 1c1a7aa366 ("drm/sun4i: dsi: Add burst support")
Signed-off-by: Jagan Teki <jagan@amarulasolutions.com>
Signed-off-by: Maxime Ripard <maxime.ripard@bootlin.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20190512184128.13720-2-jagan@amarulasolutions.com
Current code initializes HDMI PHY clock driver before reset line is
deasserted and clocks enabled. Because of that, initial readout of
clock divider is incorrect (0 instead of 2). This causes any clock
rate with divider 1 (register value 0) to be set incorrectly.
Fix this by moving initialization of HDMI PHY clock driver after reset
line is deasserted and clocks enabled.
Cc: stable@vger.kernel.org # 4.17+
Fixes: 4f86e81748 ("drm/sun4i: Add support for H3 HDMI PHY variant")
Signed-off-by: Jernej Skrabec <jernej.skrabec@siol.net>
Signed-off-by: Maxime Ripard <maxime.ripard@bootlin.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20190514204337.11068-2-jernej.skrabec@siol.net
- Fix the low refresh rate register in adv7511
- A handful of msm fixes that fell out of 5.1 bringup on SDM845
- Fix spinlock initialization in pl111
-----BEGIN PGP SIGNATURE-----
iQEzBAABCgAdFiEEfxcpfMSgdnQMs+QqlvcN/ahKBwoFAlzccqQACgkQlvcN/ahK
Bwojowf/fzcPku4y1L8IzPIoQ75iYr7VC4jhHiTN1DxHHVQUCCMpdPk9EPMJUPY3
OeMmOqirtOL4+QK+5z/3Zp8ERJ+NYz93Hgkc3GNlNGwt12odr1mavRWNmhPRNx0t
Ghj3nFDRI9MuK6rK3Hzc+Kap5vTFwD3bAEaCt3eRGlnf0yTzqwer969GsaKCeMeo
XqSI3ZtJq2T6bxpvwgKbRNHWxnEGtW/wI7newfz0I3BIVIbciTPz564kNmXiba8R
mC8CG2WB860hxT4AZkFoKTP8IalfskL9XPrLKOmpYocPxPDSwBz6lHQzOJlvk52d
OjQrcmD3URs5j3xSVmMvoPrT05fTYQ==
=dZYN
-----END PGP SIGNATURE-----
Merge tag 'drm-misc-next-fixes-2019-05-15' of git://anongit.freedesktop.org/drm/drm-misc into drm-next
- A couple new panfrost fixes
- Fix the low refresh rate register in adv7511
- A handful of msm fixes that fell out of 5.1 bringup on SDM845
- Fix spinlock initialization in pl111
Signed-off-by: Dave Airlie <airlied@redhat.com>
From: Sean Paul <sean@poorly.run>
Link: https://patchwork.freedesktop.org/patch/msgid/20190515201729.GA89093@art_vandelay
dev_pm_domain_attach_by_name() can return NULL, so we should check for
that case when we're about to dereference gxpd.
Fixes: 9325d4266a ("drm/msm/gpu: Attach to the GPU GX power domain")
Cc: Jordan Crouse <jcrouse@codeaurora.org>
Cc: Rob Clark <robdclark@chromium.org>
Reviewed-by: Jordan Crouse <jcrouse@codeauorora.org>
Signed-off-by: Sean Paul <seanpaul@chromium.org>
Link: https://patchwork.freedesktop.org/patch/msgid/20190515170104.155525-1-sean@poorly.run
The data structure |struct drm_vram_mm| and its helpers replace hibmc's
TTM-based memory manager. It's the same implementation; except for the
type names.
v5:
* set .llseek via DRM_VRAM_MM_FILE_OPERATIONS
v4:
* don't select DRM_TTM or DRM_VRAM_MM_HELPER
v3:
* use drm_gem_vram_mm_funcs
* convert driver to drm_device-based instance
v2:
* implement hibmc_mmap() with drm_vram_mm_mmap()
Signed-off-by: Thomas Zimmermann <tzimmermann@suse.de>
Link: http://patchwork.freedesktop.org/patch/msgid/20190508082630.15116-21-tzimmermann@suse.de
Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
The data structure |struct drm_gem_vram_object| and its helpers replace
|struct hibmc_bo|. It's the same implementation; except for the type
names.
v4:
* select config option DRM_VRAM_HELPER
Signed-off-by: Thomas Zimmermann <tzimmermann@suse.de>
Link: http://patchwork.freedesktop.org/patch/msgid/20190508082630.15116-20-tzimmermann@suse.de
Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
The data structure |struct drm_vram_mm| and its helpers replace vboxvideo's
TTM-based memory manager. It's the same implementation; except for the type
names.
v4:
* don't select DRM_TTM or DRM_VRAM_MM_HELPER
v3:
* use drm_gem_vram_mm_funcs
* convert driver to drm_device-based instance
v2:
* implement vbox_mmap() with drm_vram_mm_mmap()
Signed-off-by: Thomas Zimmermann <tzimmermann@suse.de>
Reviewed-by: Hans de Goede <hdegoede@redhat.com>
Link: http://patchwork.freedesktop.org/patch/msgid/20190508082630.15116-19-tzimmermann@suse.de
Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
This patch replaces |struct vbox_bo| and its helpers with the generic
implementation of |struct drm_gem_vram_object|. The only change in
semantics is that &ttm_bo_driver.verify_access() now does the actual
verification.
v4:
* select config option DRM_VRAM_HELPER
v3:
* remove forward declaration of struct vbox_gem_object
v2:
nothing
Signed-off-by: Thomas Zimmermann <tzimmermann@suse.de>
Reviewed-by: Hans de Goede <hdegoede@redhat.com>
Link: http://patchwork.freedesktop.org/patch/msgid/20190508082630.15116-18-tzimmermann@suse.de
Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
The mgag200 driver establishes several memory mappings for frame buffers
and cursors. This patch converts the driver to use the equivalent
drm_gem_vram_kmap() functions. It removes the dependencies on TTM
and cleans up the code.
v4:
* cleanups from checkpatch.pl
Signed-off-by: Thomas Zimmermann <tzimmermann@suse.de>
Link: http://patchwork.freedesktop.org/patch/msgid/20190508082630.15116-17-tzimmermann@suse.de
Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
The data structure |struct drm_vram_mm| and its helpers replace mgag200's
TTM-based memory manager. It's the same implementation; except for the type
names.
v4:
* don't select DRM_TTM or DRM_VRAM_MM_HELPER
v3:
* use drm_gem_vram_mm_funcs
* convert driver to drm_device-based instance
v2:
* implement mgag200_mmap() with drm_vram_mm_mmap()
Signed-off-by: Thomas Zimmermann <tzimmermann@suse.de>
Link: http://patchwork.freedesktop.org/patch/msgid/20190508082630.15116-16-tzimmermann@suse.de
Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
The data structure |struct drm_gem_vram_object| and its helpers replace
|struct mgag200_bo|. It's the same implementation; except for the type
names.
v4:
* cleanups from checkpatch.pl
* select config option DRM_VRAM_HELPER
Signed-off-by: Thomas Zimmermann <tzimmermann@suse.de>
Link: http://patchwork.freedesktop.org/patch/msgid/20190508082630.15116-15-tzimmermann@suse.de
Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
The data structure |struct drm_vram_mm| and its helpers replace bochs'
TTM-based memory manager. It's the same implementation; except for the
type names.
v5:
* set .llseek via DRM_VRAM_MM_FILE_OPERATIONS
v4:
* don't select DRM_TTM or DRM_VRAM_MM_HELPER
v3:
* use drm_gem_vram_mm_funcs
* convert driver to drm_device-based instance
v2:
* implement bochs_mmap() with drm_vram_mm_mmap()
Signed-off-by: Thomas Zimmermann <tzimmermann@suse.de>
Link: http://patchwork.freedesktop.org/patch/msgid/20190508082630.15116-14-tzimmermann@suse.de
Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
The data structure |struct drm_gem_vram_object| and its helpers replace
|struct bochs_bo|. It's the same implementation; except for the type
names.
v5:
* use PRIME helpers from GEM VRAM
v4:
* cleanups from checkpatch.pl
* select config option DRM_VRAM_HELPER
Signed-off-by: Thomas Zimmermann <tzimmermann@suse.de>
Link: http://patchwork.freedesktop.org/patch/msgid/20190508082630.15116-13-tzimmermann@suse.de
Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
The AST driver establishes several memory mappings for frame buffers
and cursors. This patch converts the driver to use the equivalent
drm_gem_vram_kmap() functions. It removes the dependencies on TTM
and cleans up the code.
Signed-off-by: Thomas Zimmermann <tzimmermann@suse.de>
Link: http://patchwork.freedesktop.org/patch/msgid/20190508082630.15116-12-tzimmermann@suse.de
Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
The data structure |struct drm_vram_mm| and its helpers replace ast's
TTM-based memory manager. It's the same implementation; except for the
type names.
v4:
* don't select DRM_TTM or DRM_VRAM_MM_HELPER
v3:
* use drm_gem_vram_mm_funcs
* convert driver to drm_device-based instance
v2:
* implement ast_mmap() with drm_vram_mm_mmap()
Signed-off-by: Thomas Zimmermann <tzimmermann@suse.de>
Link: http://patchwork.freedesktop.org/patch/msgid/20190508082630.15116-11-tzimmermann@suse.de
Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
The data structure |struct drm_gem_vram_object| and its helpers replace
|struct ast_bo|. It's the same implementation; except for the type names.
v4:
* cleanups from checkpatch.pl
Signed-off-by: Thomas Zimmermann <tzimmermann@suse.de>
Link: http://patchwork.freedesktop.org/patch/msgid/20190508082630.15116-10-tzimmermann@suse.de
Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
There's now a pointer to struct drm_vram_mm stored in struct drm_device.
DRM drivers that use VRAM MM should use this field to refer to their
instance of the data structure. Appropriate helpers are now provided as
well.
Adding struct drm_vram_mm to struct drm_device further avoids wrappers
and boilerplate code in drivers. This patch implements default functions
for callbacks in struct drm_driver and struct file_operations that use
the struct drm_vram_mm stored in struct drm_device. Drivers that need to
provide their own implementations can still do so.
The patch also adds documentation for the VRAM helper library in general.
v5:
* set .llseek to no_llseek() from DRM_VRAM_MM_FILE_OPERATIONS
v4:
* cleanups from checkpatch.pl
* document VRAM helper library
Signed-off-by: Thomas Zimmermann <tzimmermann@suse.de>
Link: http://patchwork.freedesktop.org/patch/msgid/20190508082630.15116-9-tzimmermann@suse.de
Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
The VRAM MM memory manager is a helper library that manages dedicated video
memory of simple framebuffer devices. It is supported to be used with
struct drm_gem_vram_object, but does not depend on it.
The implementation is based on the respective code from ast, bochs, and
mgag200. These drivers share the exact same implementation except for type
names. The helpers are currently build with TTM. This may change in future
revisions.
v4:
* cleanups from checkpatch.pl
v2:
* renamed to struct drm_vram_mm
* add drm_vram_mm_mmap() helper
* documentation fixes
Signed-off-by: Thomas Zimmermann <tzimmermann@suse.de>
Link: http://patchwork.freedesktop.org/patch/msgid/20190508082630.15116-7-tzimmermann@suse.de
Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
These basic helper functions for GEM VRAM allow for pinning and mapping
GEM VRAM objects via the PRIME interfaces. It's not a full implementation,
but complete enough for generic fbcon.
Signed-off-by: Thomas Zimmermann <tzimmermann@suse.de>
Link: http://patchwork.freedesktop.org/patch/msgid/20190508082630.15116-6-tzimmermann@suse.de
Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
The helper function drm_gem_vram_fill_create_dumb() implements most of
struct drm_driver.dumb_create() for GEM-VRAM buffer objects. It's not a
full implementation of the callback, as several driver-specific parameters
are still required.
v4:
* cleanups from checkpatch.pl
v2:
* documentation fixes
Signed-off-by: Thomas Zimmermann <tzimmermann@suse.de>
Link: http://patchwork.freedesktop.org/patch/msgid/20190508082630.15116-5-tzimmermann@suse.de
Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
The provided helpers can be used for the respective callback functions
in |struct drm_driver|.
v4:
* cleanups from checkpatch.pl
v2:
* documentation fixes
Signed-off-by: Thomas Zimmermann <tzimmermann@suse.de>
Link: http://patchwork.freedesktop.org/patch/msgid/20190508082630.15116-4-tzimmermann@suse.de
Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
The provided helpers can be used for the respective callback functions
in |struct ttm_bo_driver|.
v2:
* drm_is_gem_vram() is now a private function
* documentation fixes
Signed-off-by: Thomas Zimmermann <tzimmermann@suse.de>
Link: http://patchwork.freedesktop.org/patch/msgid/20190508082630.15116-3-tzimmermann@suse.de
Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
The type |struct drm_gem_vram_object| implements a GEM object for simple
framebuffer devices with dedicated video memory. The BO is either located
in VRAM or system memory.
The implementation has been created from the respective code in ast,
bochs and mgag200. These drivers copy their implementation from each
other; except for the names of several data types. The helpers are
currently build with TTM, but this is considered an implementation
detail and may change in future updates.
v5:
* do WARN_ON_ONCE for pin-count mismatches
* allocate only 2 entries in placements array
v4:
* cleanups from checkpatch.pl
* removed several fixed-size types from interfaces
* DRM_VRAM_HELPER now selects DRM_TTM
* remove separate config option for GEM VRAM
v2:
* rename to |struct drm_gem_vram_object|
* move drm_is_gem_ttm() to a later patch in the series
* add drm_gem_vram_kmap_at()
* return is_iomem from kmap functions
* redefine TTM placement flags for public interface
* documentation fixes
Signed-off-by: Thomas Zimmermann <tzimmermann@suse.de>
Link: http://patchwork.freedesktop.org/patch/msgid/20190508082630.15116-2-tzimmermann@suse.de
Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
Since commit dccd2304cc ("ARM: 7430/1: sizes.h: move from asm-generic
to <linux/sizes.h>"), <asm/sizes.h> and <asm-generic/sizes.h> are just
wrappers of <linux/sizes.h>.
This commit replaces all <asm/sizes.h> and <asm-generic/sizes.h> to
prepare for the removal.
Link: http://lkml.kernel.org/r/1553267665-27228-1-git-send-email-yamada.masahiro@socionext.com
Signed-off-by: Masahiro Yamada <yamada.masahiro@socionext.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
The following warning is seen on systems with broken clock divider.
INFO: trying to register non-static key.
the code is fine but needs lockdep annotation.
turning off the locking correctness validator.
CPU: 0 PID: 1 Comm: swapper Not tainted 5.1.0-09698-g1fb3b52 #1
Hardware name: ARM Integrator/CP (Device Tree)
[<c0011be8>] (unwind_backtrace) from [<c000ebb8>] (show_stack+0x10/0x18)
[<c000ebb8>] (show_stack) from [<c07d3fd0>] (dump_stack+0x18/0x24)
[<c07d3fd0>] (dump_stack) from [<c0060d48>] (register_lock_class+0x674/0x6f8)
[<c0060d48>] (register_lock_class) from [<c005de2c>]
(__lock_acquire+0x68/0x2128)
[<c005de2c>] (__lock_acquire) from [<c0060408>] (lock_acquire+0x110/0x21c)
[<c0060408>] (lock_acquire) from [<c07f755c>] (_raw_spin_lock+0x34/0x48)
[<c07f755c>] (_raw_spin_lock) from [<c0536c8c>]
(pl111_display_enable+0xf8/0x5fc)
[<c0536c8c>] (pl111_display_enable) from [<c0502f54>]
(drm_atomic_helper_commit_modeset_enables+0x1ec/0x244)
Since commit eedd6033b4 ("drm/pl111: Support variants with broken clock
divider"), the spinlock is not initialized if the clock divider is broken.
Initialize it earlier to fix the problem.
Fixes: eedd6033b4 ("drm/pl111: Support variants with broken clock divider")
Cc: Linus Walleij <linus.walleij@linaro.org>
Signed-off-by: Guenter Roeck <linux@roeck-us.net>
Signed-off-by: Linus Walleij <linus.walleij@linaro.org>
Link: https://patchwork.freedesktop.org/patch/msgid/1557758781-23586-1-git-send-email-linux@roeck-us.net
msm_gem_describe() would attempt to dereference a NULL pointer via the
address space pointer when no IOMMU is present. Correct this by adding
the appropriate check.
Signed-off-by: Brian Masney <masneyb@onstation.org>
Fixes: 575f048550 ("drm/msm: Clean up and enhance the output of the 'gem' debugfs node")
Signed-off-by: Sean Paul <seanpaul@chromium.org>
Link: https://patchwork.freedesktop.org/patch/msgid/20190513234105.7531-2-masneyb@onstation.org
Convert to use vm_map_pages() to map range of kernel memory to user vma.
Tested on Rockchip hardware and display is working, including talking to
Lima via prime.
Link: http://lkml.kernel.org/r/7ba359eb1aceac388d05983c1f29b915bdf291f9.1552921225.git.jrdr.linux@gmail.com
Signed-off-by: Souptick Joarder <jrdr.linux@gmail.com>
Tested-by: Heiko Stuebner <heiko@sntech.de>
Cc: Boris Ostrovsky <boris.ostrovsky@oracle.com>
Cc: David Airlie <airlied@linux.ie>
Cc: Joerg Roedel <joro@8bytes.org>
Cc: Joonsoo Kim <iamjoonsoo.kim@lge.com>
Cc: Juergen Gross <jgross@suse.com>
Cc: Kees Cook <keescook@chromium.org>
Cc: "Kirill A. Shutemov" <kirill.shutemov@linux.intel.com>
Cc: Kyungmin Park <kyungmin.park@samsung.com>
Cc: Marek Szyprowski <m.szyprowski@samsung.com>
Cc: Matthew Wilcox <willy@infradead.org>
Cc: Mauro Carvalho Chehab <mchehab@infradead.org>
Cc: Michal Hocko <mhocko@suse.com>
Cc: Mike Rapoport <rppt@linux.ibm.com>
Cc: Oleksandr Andrushchenko <oleksandr_andrushchenko@epam.com>
Cc: Pawel Osciak <pawel@osciak.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Rik van Riel <riel@surriel.com>
Cc: Robin Murphy <robin.murphy@arm.com>
Cc: Russell King <linux@armlinux.org.uk>
Cc: Sandy Huang <hjc@rock-chips.com>
Cc: Stefan Richter <stefanr@s5r6.in-berlin.de>
Cc: Stephen Rothwell <sfr@canb.auug.org.au>
Cc: Thierry Reding <treding@nvidia.com>
Cc: Vlastimil Babka <vbabka@suse.cz>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Use the mmu_notifier_range_blockable() helper function instead of directly
dereferencing the range->blockable field. This is done to make it easier
to change the mmu_notifier range field.
This patch is the outcome of the following coccinelle patch:
%<-------------------------------------------------------------------
@@
identifier I1, FN;
@@
FN(..., struct mmu_notifier_range *I1, ...) {
<...
-I1->blockable
+mmu_notifier_range_blockable(I1)
...>
}
------------------------------------------------------------------->%
spatch --in-place --sp-file blockable.spatch --dir .
Link: http://lkml.kernel.org/r/20190326164747.24405-3-jglisse@redhat.com
Signed-off-by: Jérôme Glisse <jglisse@redhat.com>
Reviewed-by: Ralph Campbell <rcampbell@nvidia.com>
Reviewed-by: Ira Weiny <ira.weiny@intel.com>
Cc: Christian König <christian.koenig@amd.com>
Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Cc: Jani Nikula <jani.nikula@linux.intel.com>
Cc: Rodrigo Vivi <rodrigo.vivi@intel.com>
Cc: Jan Kara <jack@suse.cz>
Cc: Andrea Arcangeli <aarcange@redhat.com>
Cc: Peter Xu <peterx@redhat.com>
Cc: Felix Kuehling <Felix.Kuehling@amd.com>
Cc: Jason Gunthorpe <jgg@mellanox.com>
Cc: Ross Zwisler <zwisler@kernel.org>
Cc: Dan Williams <dan.j.williams@intel.com>
Cc: Paolo Bonzini <pbonzini@redhat.com>
Cc: Radim Krcmar <rkrcmar@redhat.com>
Cc: Michal Hocko <mhocko@kernel.org>
Cc: Christian Koenig <christian.koenig@amd.com>
Cc: John Hubbard <jhubbard@nvidia.com>
Cc: Arnd Bergmann <arnd@arndb.de>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
To facilitate additional options to get_user_pages_fast() change the
singular write parameter to be gup_flags.
This patch does not change any functionality. New functionality will
follow in subsequent patches.
Some of the get_user_pages_fast() call sites were unchanged because they
already passed FOLL_WRITE or 0 for the write parameter.
NOTE: It was suggested to change the ordering of the get_user_pages_fast()
arguments to ensure that callers were converted. This breaks the current
GUP call site convention of having the returned pages be the final
parameter. So the suggestion was rejected.
Link: http://lkml.kernel.org/r/20190328084422.29911-4-ira.weiny@intel.com
Link: http://lkml.kernel.org/r/20190317183438.2057-4-ira.weiny@intel.com
Signed-off-by: Ira Weiny <ira.weiny@intel.com>
Reviewed-by: Mike Marshall <hubcap@omnibond.com>
Cc: Aneesh Kumar K.V <aneesh.kumar@linux.ibm.com>
Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Dan Williams <dan.j.williams@intel.com>
Cc: "David S. Miller" <davem@davemloft.net>
Cc: Heiko Carstens <heiko.carstens@de.ibm.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: James Hogan <jhogan@kernel.org>
Cc: Jason Gunthorpe <jgg@ziepe.ca>
Cc: John Hubbard <jhubbard@nvidia.com>
Cc: "Kirill A. Shutemov" <kirill.shutemov@linux.intel.com>
Cc: Martin Schwidefsky <schwidefsky@de.ibm.com>
Cc: Michal Hocko <mhocko@kernel.org>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Ralf Baechle <ralf@linux-mips.org>
Cc: Rich Felker <dalias@libc.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Yoshinori Sato <ysato@users.sourceforge.jp>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
The msm_gem_object structure contains resv and _resv fields that are
no longer needed since the reservation object is now stored on
drm_gem_object. msm_atomic_prepare_fb() and msm_atomic_prepare_fb()
both referenced the wrong reservation object, and would lead to an
attempt to dereference a NULL pointer. Correct those two cases to
point to the correct reservation object.
Fixes: dd55cf6929 ("drm: msm: Switch to use drm_gem_object reservation_object")
Cc: David Airlie <airlied@linux.ie>
Cc: linux-arm-msm@vger.kernel.org
Cc: dri-devel@lists.freedesktop.org
Cc: freedreno@lists.freedesktop.org
Cc: Rob Herring <robh@kernel.org>
Cc: Daniel Vetter <daniel.vetter@ffwll.ch>
Cc: Rob Clark <robdclark@gmail.com>
Cc: Maxime Ripard <maxime.ripard@bootlin.com>
Cc: Sean Paul <sean@poorly.run>
Acked-by: Rob Clark <robdclark@chromium.org>
Reviewed-by: Bjorn Andersson <bjorn.andersson@linaro.org>
Tested-by: Bjorn Andersson <bjorn.andersson@linaro.org>
Signed-off-by: Brian Masney <masneyb@onstation.org>
Signed-off-by: Sean Paul <seanpaul@chromium.org>
Link: https://patchwork.freedesktop.org/patch/msgid/20190513234105.7531-1-masneyb@onstation.org
The values are already present in the modeset.
This is done in preparation for the removal of struct drm_fb_helper_crtc.
Signed-off-by: Noralf Trønnes <noralf@tronnes.org>
Reviewed-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Reviewed-by: Maxime Ripard <maxime.ripard@bootlin.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20190506180139.6913-5-noralf@tronnes.org
Getting rotation info is cheap so we can do it on demand.
This is done in preparation for the removal of struct drm_fb_helper_crtc.
Cc: Hans de Goede <hdegoede@redhat.com>
Signed-off-by: Noralf Trønnes <noralf@tronnes.org>
Acked-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Reviewed-by: Maxime Ripard <maxime.ripard@bootlin.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20190506180139.6913-4-noralf@tronnes.org
drm_fb_helper_is_bound() is used to check if DRM userspace is in control.
This is done by looking at the fb on the primary plane. By the time
fb-helper gets around to committing, it's possible that the facts have
changed.
Avoid this race by holding the drm_device->master_mutex lock while
committing. When DRM userspace does its first open, it will now wait
until fb-helper is done. The helper will stay away if there's a master.
Two igt tests fail with the new 'bail out if master' rule. Work around
this by relaxing this rule for drm_fb_helper_restore_fbdev_mode_unlocked()
until the tests have been fixed. Add todo entry for this.
Locking rule: Always take the fb-helper lock first.
v5: drm_fb_helper_restore_fbdev_mode_unlocked(): Use
restore_fbdev_mode_force()
v2:
- Remove drm_fb_helper_is_bound() (Daniel Vetter)
- No need to check fb_helper->dev->master in
drm_fb_helper_single_fb_probe(), restore_fbdev_mode() has the check.
Suggested-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Signed-off-by: Noralf Trønnes <noralf@tronnes.org>
Reviewed-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Link: https://patchwork.freedesktop.org/patch/msgid/20190506180139.6913-3-noralf@tronnes.org
Add an assert that we don't use TypeC ports for eDP. That may in theory
be possible on TypeC legacy ports, but I'm not sure if that's a
practical scenario, so let's deal with that only if there's a use case.
Adding support for that wouldn't be too difficult, since TypeC mode
switching is not possible on TypeC legacy ports.
Cc: Ville Syrjala <ville.syrjala@linux.intel.com>
Signed-off-by: Imre Deak <imre.deak@intel.com>
Reviewed-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20190509173446.31095-12-imre.deak@intel.com
On ICL we have to make sure that we enable the AUX power domain in a
controlled way (corresponding to the port's actual TypeC mode). Since
the PPS lock - which takes an AUX power ref - is only needed on
eDP on all platforms and eDP/DP on VLV/CHV avoid taking it in all
other cases.
v2:
- Clarify commit log about the condition for taking the PPS lock.
(Ville)
Cc: Ville Syrjala <ville.syrjala@linux.intel.com>
Signed-off-by: Imre Deak <imre.deak@intel.com>
Reviewed-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20190509173446.31095-11-imre.deak@intel.com
There isn't a separate power domain specific to PLLs. When programming
them we require the same power domain to be enabled which is needed when
accessing other display core parts (not specific to any
pipe/port/transcoder). This corresponds to the DISPLAY_CORE domain added
previously in this patchset, so use that instead to save bits in the
power domain mask.
Cc: Ville Syrjala <ville.syrjala@linux.intel.com>
Signed-off-by: Imre Deak <imre.deak@intel.com>
Reviewed-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20190509173446.31095-10-imre.deak@intel.com
The power get/put was added in
commit 1c767b339b ("drm/i915: take display port power domain in DP HPD handler")
Author: Imre Deak <imre.deak@intel.com>
Date: Mon Aug 18 14:42:42 2014 +0300
to account for the HW access in ibx_digital_port_connected(). This
latter call was in turn removed in
commit 7d23e3c37b ("drm/i915: Cleaning up intel_dp_hpd_pulse")
Author: Shubhangi Shrivastava <shubhangi.shrivastava@intel.com>
Date: Wed Mar 30 18:05:23 2016 +0530
after which we didn't actually need the power reference.
One way we are accessing the HW during HPD pulse handling is via DP AUX
transfers, but the transfer function takes its own reference, so doesn't
need the reference in intel_dp_hpd_pulse().
The other spot is in
intel_psr_short_pulse()->intel_psr_disable_locked()
but that can only happen when the panel is enabled with the
corresponding modeset already holding the required power reference.
v2:
- Remove the unneeded power get/put from intel_psr_disable_locked().
(Ville)
- Checkpatch commit quoting format fix in the commit log.
Cc: Ville Syrjala <ville.syrjala@linux.intel.com>
Cc: Rodrigo Vivi <rodrigo.vivi@intel.com>
Cc: José Roberto de Souza <jose.souza@intel.com>
Signed-off-by: Imre Deak <imre.deak@intel.com>
Reviewed-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20190509173446.31095-9-imre.deak@intel.com