linux_dsm_epyc7002

mirror of https://github.com/AuxXxilium/linux_dsm_epyc7002.git synced 2024-12-25 08:47:24 +07:00

Author	SHA1	Message	Date
Chris Wilson	b27e35ae5b	drm/i915: Keep user GGTT alive for a minimum of 250ms Do not allow runtime pm autosuspend to remove userspace GGTT mmaps too quickly. For example, igt sets the autosuspend delay to 0, and so we immediately attempt to perform runtime suspend upon releasing the wakeref. Unfortunately, that involves tearing down GGTT mmaps as they require an active device. Override the autosuspend for GGTT mmaps, by keeping the wakeref around for 250ms after populating the PTE for a fresh mmap. v2: Prefer refcount_t for its under/overflow error detection v3: Flush the user runtime autosuspend prior to system system. Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Cc: Mika Kuoppala <mika.kuoppala@linux.intel.com> Reviewed-by: Mika Kuoppala <mika.kuoppala@linux.intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20190527115114.13448-1-chris@chris-wilson.co.uk	2019-05-28 08:23:09 +01:00
Ville Syrjälä	c457d9cf25	drm/i915: Make sure we have enough memory bandwidth on ICL ICL has so many planes that it can easily exceed the maximum effective memory bandwidth of the system. We must therefore check that we don't exceed that limit. The algorithm is very magic number heavy and lacks sufficient explanation for now. We also have no sane way to query the memory clock and timings, so we must rely on a combination of raw readout from the memory controller and hardcoded assumptions. The memory controller values obviously change as the system jumps between the different SAGV points, so we try to stabilize it first by disabling SAGV for the duration of the readout. The utilized bandwidth is tracked via a device wide atomic private object. That is actually not robust because we can't afford to enforce strict global ordering between the pipes. Thus I think I'll need to change this to simply chop up the available bandwidth between all the active pipes. Each pipe can then do whatever it wants as long as it doesn't exceed its budget. That scheme will also require that we assume that any number of planes could be active at any time. TODO: make it robust and deal with all the open questions v2: Sleep longer after disabling SAGV v3: Poll for the dclk to get raised (seen it take 250ms!) If the system has 2133MT/s memory then we pointlessly wait one full second :( v4: Use the new pcode interface to get the qgv points rather that using hardcoded numbers v5: Move the pcode stuff into intel_bw.c (Matt) s/intel_sagv_info/intel_qgv_info/ Do the NV12/P010 as per spec for now (Matt) s/IS_ICELAKE/IS_GEN11/ v6: Ignore bandwidth limits if the pcode query fails Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com> Reviewed-by: Matt Roper <matthew.d.roper@intel.com> Acked-by: Clint Taylor <Clinton.A.Taylor@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20190524153614.32410-1-ville.syrjala@linux.intel.com	2019-05-27 20:51:48 +03:00
Ville Syrjälä	d284d5145e	drm/i915: Make sandybridge_pcode_read() deal with the second data register The pcode mailbox has two data registers. So far we've only ever used the one, but that's about to change. Expose the second data register to the callers of sandybridge_pcode_read(). Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com> Reviewed-by: Clint Taylor <Clinton.A.Taylor@intel.com> Reviewed-by: Radhakrishna Sripada <radhakrishna.sripada@intel.com> Reviewed-by: Matt Roper <matthew.d.roper@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20190521164025.30225-1-ville.syrjala@linux.intel.com	2019-05-27 20:51:48 +03:00
Imre Deak	4361ccac28	drm/i915/icl: Fix AUX-B HW not done issue w/o AUX-A Atm AUX-B transfers can fail with the following error if AUX-A is not enabled: [ 594.594108] [drm:intel_dp_aux_xfer [i915]] dp_aux_ch timeout status 0x7c2003ff [ 594.615854] [drm:intel_dp_aux_xfer [i915]] ERROR dp aux hw did not signal timeout! [ 594.632851] [drm:intel_dp_aux_xfer [i915]] ERROR dp aux hw did not signal timeout! [ 594.632915] [drm:intel_dp_aux_xfer [i915]] ERROR dp_aux_ch not done status 0xac2003ff [ 594.641786] ------------[ cut here ]------------ [ 594.641790] dp_aux_ch not started status 0xac2003ff [ 594.641874] WARNING: CPU: 4 PID: 1366 at drivers/gpu/drm/i915/intel_dp.c:1268 intel_dp_aux_xfer+0x232/0x890 [i915] Ville noticed this issue already earlier and managed to work around it by keeping AUX-A always powered whenever AUX-B was used. He also reported the issue to HW folks and they have now root caused the problem and updated BSpec with a fix (see internal BSpec/Index/21257, HSD/1607152412). I noticed the same error - even with the WA being applied - while doing AUX transfers with Chamelium being connected with a DP cable to the source but letting Chamelium imitate an unplug. This is probably some unstandard way on Chamelium's behalf of disconnecting itself from the AUX pins. For instance it could still pull on the AUX pins which would prevent the source from detecting AUX timeouts in the proper way, leading to the ERRORs or WARNs seen in the logs in the Reference: bug below. In case I disconnect the sink properly (the cable itself, not via the Chamelium unplug xmlrpc command) then the AUX timeout signaling works properly and so there won't be any ERRORs/WARNs emitted. Reference: https://bugs.freedesktop.org/show_bug.cgi?id=110718 Cc: Ville Syrjälä <ville.syrjala@linux.intel.com> Reported-by: Ville Syrjälä <ville.syrjala@linux.intel.com> Signed-off-by: Imre Deak <imre.deak@intel.com> Reviewed-by: Ville Syrjälä <ville.syrjala@linux.intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20190524173532.6444-1-imre.deak@intel.com	2019-05-27 17:40:01 +03:00
Jani Nikula	591d4dc472	drm/i915: make REG_BIT() and REG_GENMASK() work with variables REG_BIT() and REG_GENMASK() were intended to work with both constant expressions and otherwise, with the former having extra compile time checks for the bit ranges. Incredibly, the result of __builtin_constant_p() is not an integer constant expression when given a non-constant expression, leading to errors in BUILD_BUG_ON_ZERO(). Replace __builtin_constant_p() with the __is_constexpr() magic spell. Reported-by: Ville Syrjala <ville.syrjala@linux.intel.com> Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk> Signed-off-by: Jani Nikula <jani.nikula@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20190524185253.1088-1-jani.nikula@intel.com	2019-05-27 13:08:37 +03:00
Colin Ian King	c2df2201b6	drm/i915/gtt: set err to -ENOMEM on memory allocation failure Currently when the allocation of ppgtt->work fails the error return path via err_free returns an uninitialized value in err. Fix this by setting err to the appropriate error return of -ENOMEM. Addresses-Coverity: ("Uninitialized scalar variable") Fixes: `d3622099c7` ("drm/i915/gtt: Always acquire struct_mutex for gen6_ppgtt_cleanup") Signed-off-by: Colin Ian King <colin.king@canonical.com> Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk> Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Link: https://patchwork.freedesktop.org/patch/msgid/20190524212627.24256-1-colin.king@canonical.com	2019-05-27 10:27:34 +01:00
Hans de Goede	5c27de1df8	drm/i915/dsi: Call drm_connector_cleanup on vlv_dsi_init error exit path If we exit vlv_dsi_init() because we failed to find a fixed_mode, then we've already called drm_connector_init() and we should call drm_connector_cleanup() to unregister the connector object. Reviewed-by: Ville Syrjälä <ville.syrjala@linux.intel.com> Signed-off-by: Hans de Goede <hdegoede@redhat.com> Link: https://patchwork.freedesktop.org/patch/msgid/20190524163518.17545-1-hdegoede@redhat.com	2019-05-27 10:55:33 +02:00
Jani Nikula	c0a74c7325	drm/i915: Update DRIVER_DATE to 20190524 Signed-off-by: Jani Nikula <jani.nikula@intel.com>	2019-05-24 20:35:22 +03:00
Dongwon Kim	397049a030	drm/i915/gen11: enable support for headerless msgs Setting bit5 (headerless msg for preemptible GPGPU context) of SAMPLER_MODE register to enable support for the headless msgs on gen11. None of existing use cases will be affected by this as this change makes both types of message - headerless and w/ header supported at the same time. It also complies with the new recommendation for the default bit value for the next gen. v2: rewrote commit message to include more information v3: setting the bit in icl_ctx_workarounds_init() Signed-off-by: Dongwon Kim <dongwon.kim@intel.com> Reviewed-by: Mika Kuoppala <mika.kuoppala@linux.intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20190425055005.21790-1-chris@chris-wilson.co.uk Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>	2019-05-24 10:06:26 +01:00
Chris Wilson	63e8dcdb4f	drm/i915/gtt: Neuter the deferred unbind callback from gen6_ppgtt_cleanup Having deferred the vma destruction to a worker where we can acquire the struct_mutex, we have to avoid chasing back into the now destroyed ppgtt. The pd_vma is special in having a custom unbind function to scan for unused pages despite the VMA itself being notionally part of the GGTT. As such, we need to disable that callback to avoid a use-after-free. This unfortunately blew up so early during boot that CI declared the machine unreachable as opposed to being the major failure it was. Oops. Fixes: `d3622099c7` ("drm/i915/gtt: Always acquire struct_mutex for gen6_ppgtt_cleanup") Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com> Cc: Tomi Sarvela <tomi.p.sarvela@intel.com> Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20190524064529.20514-1-chris@chris-wilson.co.uk	2019-05-24 10:02:01 +01:00
Dongli Zhang	b3ca0d4491	drm/i915: remove unused IO_TLB_SEGPAGES which should be defined by swiotlb This patch removes IO_TLB_SEGPAGES which is no longer used since commit `5584f1b1d7` ("drm/i915: fix i915 running as dom0 under Xen"). As the define of both IO_TLB_SEGSIZE and IO_TLB_SHIFT are from swiotlb, IO_TLB_SEGPAGES should be defined on swiotlb side if it is required in the future. Signed-off-by: Dongli Zhang <dongli.zhang@oracle.com> Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Link: https://patchwork.freedesktop.org/patch/msgid/1558413639-22568-1-git-send-email-dongli.zhang@oracle.com	2019-05-23 22:18:24 +01:00
Michal Wajdeczko	eaf20e6933	drm/i915/uc: Skip reset preparation if GuC is already dead We may skip reset preparation steps if GuC is already sanitized. v2: replace USES_GUC with guc_is_loaded Signed-off-by: Michal Wajdeczko <michal.wajdeczko@intel.com> Cc: Chris Wilson <chris@chris-wilson.co.uk> Cc: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com> Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk> Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Link: https://patchwork.freedesktop.org/patch/msgid/20190522193203.23932-10-michal.wajdeczko@intel.com	2019-05-23 21:58:36 +01:00
Michal Wajdeczko	a2ce231473	drm/i915/uc: Stop talking with GuC when resetting Knowing that GuC will be reset soon, we may stop all communication immediately without doing graceful cleanup as it is not needed. This patch will also help us capture any unwanted/unexpected attempts to talk with GuC after we decided to reset it. And we need to keep 'disable' part as current and upcoming firmware still expect graceful cleanup. v2: update commit msg Signed-off-by: Michal Wajdeczko <michal.wajdeczko@intel.com> Cc: Chris Wilson <chris@chris-wilson.co.uk> Cc: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com> Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk> Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Link: https://patchwork.freedesktop.org/patch/msgid/20190523172555.2780-1-michal.wajdeczko@intel.com	2019-05-23 21:58:36 +01:00
Michal Wajdeczko	0922f3459f	drm/i915/uc: Skip GuC HW unwinding if GuC is already dead We should not attempt to unwind GuC hardware/firmware setup if we already have sanitized GuC. v2: replace USES_GUC with guc_is_loaded Signed-off-by: Michal Wajdeczko <michal.wajdeczko@intel.com> Cc: Chris Wilson <chris@chris-wilson.co.uk> Cc: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com> Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk> Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Link: https://patchwork.freedesktop.org/patch/msgid/20190522193203.23932-8-michal.wajdeczko@intel.com	2019-05-23 21:58:36 +01:00
Michal Wajdeczko	f1e6b336ba	drm/i915/uc: Use GuC firmware status helper We already have helper function for checking GuC firmware load status. Replace existing open-coded checks. v2: drop redundant USES_GUC check Signed-off-by: Michal Wajdeczko <michal.wajdeczko@intel.com> Cc: Chris Wilson <chris@chris-wilson.co.uk> Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk> Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Link: https://patchwork.freedesktop.org/patch/msgid/20190522193203.23932-7-michal.wajdeczko@intel.com	2019-05-23 21:58:36 +01:00
Michal Wajdeczko	89195bab5d	drm/i915/uc: Explicitly sanitize GuC/HuC on failure and finish Explicitly sanitize GuC/HuC on load failure and when we finish using them to make sure our fw state tracking is always correct. While around, use new helper in uc_reset_prepare. Signed-off-by: Michal Wajdeczko <michal.wajdeczko@intel.com> Cc: Chris Wilson <chris@chris-wilson.co.uk> Cc: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com> Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk> Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Link: https://patchwork.freedesktop.org/patch/msgid/20190522193203.23932-6-michal.wajdeczko@intel.com	2019-05-23 21:58:36 +01:00
Michal Wajdeczko	78577e294b	drm/i915/guc: Rename intel_guc_is_alive to intel_guc_is_loaded This function just check our software flag, while 'is_alive' may suggest that we are checking runtime firmware status. Signed-off-by: Michal Wajdeczko <michal.wajdeczko@intel.com> Cc: Chris Wilson <chris@chris-wilson.co.uk> Cc: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com> Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk> Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Link: https://patchwork.freedesktop.org/patch/msgid/20190522193203.23932-5-michal.wajdeczko@intel.com	2019-05-23 21:58:36 +01:00
Michal Wajdeczko	beca36ffbd	drm/i915/selftests: Use prepare/finish during atomic reset test We were testing full GPU reset in atomic context without correctly wrapping it by prepare/finish steps. This could confuse our GuC reset handling code. Signed-off-by: Michal Wajdeczko <michal.wajdeczko@intel.com> Cc: Chris Wilson <chris@chris-wilson.co.uk> Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk> Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Link: https://patchwork.freedesktop.org/patch/msgid/20190522193203.23932-4-michal.wajdeczko@intel.com	2019-05-23 21:58:36 +01:00
Michal Wajdeczko	f6470c9bcc	drm/i915/selftests: Split igt_atomic_reset testcase Split igt_atomic_reset selftests into separate full & engines parts, so we can move former to the dedicated reset selftests file. While here change engines test to loop first over atomic phases and then loop over available engines. Signed-off-by: Michal Wajdeczko <michal.wajdeczko@intel.com> Cc: Chris Wilson <chris@chris-wilson.co.uk> Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk> Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Link: https://patchwork.freedesktop.org/patch/msgid/20190522193203.23932-3-michal.wajdeczko@intel.com	2019-05-23 21:53:26 +01:00
Michal Wajdeczko	932309fb03	drm/i915/selftests: Move some reset testcases to separate file igt_global_reset and igt_wedged_reset testcases are first candidates. Suggested-by: Chris Wilson <chris@chris-wilson.co.uk> Signed-off-by: Michal Wajdeczko <michal.wajdeczko@intel.com> Cc: Chris Wilson <chris@chris-wilson.co.uk> Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk> Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Link: https://patchwork.freedesktop.org/patch/msgid/20190522193203.23932-2-michal.wajdeczko@intel.com	2019-05-23 21:52:26 +01:00
Chris Wilson	d3622099c7	drm/i915/gtt: Always acquire struct_mutex for gen6_ppgtt_cleanup We rearranged the vm_destroy_ioctl to avoid taking struct_mutex, little realising that buried underneath the gen6 ppgtt release path was a struct_mutex requirement (to remove its GGTT vma). Until that struct_mutex is vanquished, take a detour in gen6_ppgtt_cleanup to do the i915_vma_destroy from inside a worker under the struct_mutex. <4> [257.740160] WARN_ON(debug_locks && !lock_is_held(&(&vma->vm->i915->drm.struct_mutex)->dep_map)) <4> [257.740213] WARNING: CPU: 3 PID: 1507 at drivers/gpu/drm/i915/i915_vma.c:841 i915_vma_destroy+0x1ae/0x3a0 [i915] <4> [257.740214] Modules linked in: snd_hda_codec_hdmi i915 x86_pkg_temp_thermal mei_hdcp coretemp crct10dif_pclmul crc32_pclmul ghash_clmulni_intel snd_hda_codec_realtek snd_hda_codec_generic snd_hda_intel snd_hda_codec snd_hwdep snd_hda_core r8169 realtek snd_pcm mei_me mei prime_numbers lpc_ich <4> [257.740224] CPU: 3 PID: 1507 Comm: gem_vm_create Tainted: G U 5.2.0-rc1-CI-CI_DRM_6118+ #1 <4> [257.740225] Hardware name: MSI MS-7924/Z97M-G43(MS-7924), BIOS V1.12 02/15/2016 <4> [257.740249] RIP: 0010:i915_vma_destroy+0x1ae/0x3a0 [i915] <4> [257.740250] Code: 00 00 00 48 81 c7 c8 00 00 00 e8 ed 08 f0 e0 85 c0 0f 85 78 fe ff ff 48 c7 c6 e8 ec 30 a0 48 c7 c7 da 55 33 a0 e8 42 8c e9 e0 <0f> 0b 8b 83 40 01 00 00 85 c0 0f 84 63 fe ff ff 48 c7 c1 c1 58 33 <4> [257.740251] RSP: 0018:ffffc90000aafc68 EFLAGS: 00010282 <4> [257.740252] RAX: 0000000000000000 RBX: ffff8883f7957840 RCX: 0000000000000003 <4> [257.740253] RDX: 0000000000000046 RSI: 0000000000000006 RDI: ffffffff8212d1b9 <4> [257.740254] RBP: ffffc90000aafcc8 R08: 0000000000000000 R09: 0000000000000000 <4> [257.740255] R10: 0000000000000000 R11: 0000000000000000 R12: ffff8883f4d5c2a8 <4> [257.740256] R13: ffff8883f4d5d680 R14: ffff8883f4d5c668 R15: ffff8883f4d5c2f0 <4> [257.740257] FS: 00007f777fa8fe40(0000) GS:ffff88840f780000(0000) knlGS:0000000000000000 <4> [257.740258] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 <4> [257.740259] CR2: 00007f777f6522b0 CR3: 00000003c612a006 CR4: 00000000001606e0 <4> [257.740260] Call Trace: <4> [257.740283] gen6_ppgtt_cleanup+0x25/0x60 [i915] <4> [257.740306] i915_ppgtt_release+0x102/0x290 [i915] <4> [257.740330] i915_gem_vm_destroy_ioctl+0x7c/0xa0 [i915] <4> [257.740376] ? i915_gem_vm_create_ioctl+0x160/0x160 [i915] <4> [257.740379] drm_ioctl_kernel+0x83/0xf0 <4> [257.740382] drm_ioctl+0x2f3/0x3b0 <4> [257.740422] ? i915_gem_vm_create_ioctl+0x160/0x160 [i915] <4> [257.740426] ? _raw_spin_unlock_irqrestore+0x39/0x60 <4> [257.740430] do_vfs_ioctl+0xa0/0x6e0 <4> [257.740433] ? lock_acquire+0xa6/0x1c0 <4> [257.740436] ? __task_pid_nr_ns+0xb9/0x1f0 <4> [257.740439] ksys_ioctl+0x35/0x60 <4> [257.740441] __x64_sys_ioctl+0x11/0x20 <4> [257.740443] do_syscall_64+0x55/0x1c0 <4> [257.740445] entry_SYSCALL_64_after_hwframe+0x49/0xbe References: `e0695db729` ("drm/i915: Create/destroy VM (ppGTT) for use with contexts") Fixes: `7f3f317a66` ("drm/i915: Restore control over ppgtt for context creation ABI") Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com> Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20190523064933.23604-1-chris@chris-wilson.co.uk	2019-05-23 21:12:12 +01:00
Jani Nikula	09a93ef3d6	drm/i915: remove duplicate typedef for intel_wakeref_t Fix the duplicate typedef for intel_wakeref_t leading to Clang build issues. While at it, actually make the intel_runtime_pm.h header self-contained, which was claimed in the commit being fixed. Reported-by: Nathan Chancellor <natechancellor@gmail.com> Cc: Nathan Chancellor <natechancellor@gmail.com> Cc: Chris Wilson <chris@chris-wilson.co.uk> References: http://mid.mail-archive.com/20190521183850.GA9157@archlinux-epyc References: https://travis-ci.com/ClangBuiltLinux/continuous-integration/jobs/201754420#L2435 Fixes: `0d5adc5f2f` ("drm/i915: extract intel_runtime_pm.h from intel_drv.h") Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk> Reviewed-by: Nathan Chancellor <natechancellor@gmail.com> Tested-by: Nathan Chancellor <natechancellor@gmail.com> Signed-off-by: Jani Nikula <jani.nikula@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20190522103505.2082-1-jani.nikula@intel.com	2019-05-23 15:46:42 +03:00
Jani Nikula	cfc0e7bbf4	drm/i915: Update DRIVER_DATE to 20190523 Signed-off-by: Jani Nikula <jani.nikula@intel.com>	2019-05-23 11:57:24 +03:00
Gwan-gyeong Mun	47d0ccecc9	drm/i915/dp: Support DP ports YUV 4:2:0 output to GEN11 Bspec describes that GEN10 only supports capability of YUV 4:2:0 output to HDMI port and GEN11 supports capability of YUV 4:2:0 output to both DP and HDMI ports. v2: Minor style fix. Signed-off-by: Gwan-gyeong Mun <gwan-gyeong.mun@intel.com> Reviewed-by: Maarten Lankhorst <maarten.lankhorst@linux.intel.com> Signed-off-by: Jani Nikula <jani.nikula@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20190521121721.32010-7-gwan-gyeong.mun@intel.com	2019-05-23 09:49:44 +03:00
Gwan-gyeong Mun	16668f486f	drm/i915/dp: Change a link bandwidth computation for DP Data M/N calculations were assumed a bpp as RGB format. But when we are using YCbCr 4:2:0 output format on DP, we should change bpp calculations as YCbCr 4:2:0 format. The pipe_bpp value was assumed RGB format, therefore, it was multiplied with 3. But YCbCr 4:2:0 requires a multiplier value to 1.5. Therefore we need to divide pipe_bpp to 2 while DP output uses YCbCr4:2:0 format. - RGB format bpp = bpc x 3 - YCbCr 4:2:0 format bpp = bpc x 1.5 But Link M/N values are calculated and applied based on the Full Clock for YCbCr 4:2:0. And DP YCbCr 4:2:0 does not need to pixel clock double for a dotclock caluation. Only for HDMI YCbCr 4:2:0 needs to pixel clock double for a dot clock calculation. It only affects dp and edp port which use YCbCr 4:2:0 output format. And for now, it does not consider a use case of DSC + YCbCr 4:2:0. v2: Addressed review comments from Ville. Remove a changing of pipe_bpp on intel_ddi_set_pipe_settings(). Because the pipe is running at the full bpp, keep pipe_bpp as RGB even though YCbCr 4:2:0 output format is used. Add a link bandwidth computation for YCbCr4:2:0 output format. v3: Addressed reivew comments from Ville. In order to make codes simple, it adds and uses intel_dp_output_bpp() function. v6: Link M/N values are calculated and applied based on the Full Clock for YCbCr420. The Bit per Pixel needs to be adjusted for YUV420 mode as it requires only half of the RGB case. - Link M/N values are calculated and applied based on the Full Clock - Data M/N values needs to be calculated considering the data is half due to subsampling Remove a doubling of pixel clock on a dot clock calculator for DP YCbCr 4:2:0. Rebase and remove a duplicate setting of vsc_sdp.DB17. Add a setting of dynamic range bit to vsc_sdp.DB17. Change Content Type bit to "Graphics" from "Not defined". Change a dividing of pipe_bpp to muliplying to constant values on a switch-case statement. v7: Addressed review comments from Ville. Move a setting of dynamic range bit and a setting of bpc which is based on pipe_bpp to a "drm/i915/dp: Program VSC Header and DB for Pixel Encoding/Colorimetry Format" commit. Change Content Type bit to "Not defined" from "Graphics". Cc: Ville Syrjälä <ville.syrjala@linux.intel.com> Signed-off-by: Gwan-gyeong Mun <gwan-gyeong.mun@intel.com> Reviewed-by: Maarten Lankhorst <maarten.lankhorst@linux.intel.com> Signed-off-by: Jani Nikula <jani.nikula@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20190521121721.32010-6-gwan-gyeong.mun@intel.com	2019-05-23 09:49:44 +03:00
Gwan-gyeong Mun	ec4401d389	drm/i915/dp: Add a support of YCBCR 4:2:0 to DP MSA When YCBCR 4:2:0 outputs is used for DP, we should program YCBCR 4:2:0 to MSA and VSC SDP. As per DP 1.4a spec section 2.2.4.3 [MSA Field for Indication of Color Encoding Format and Content Color Gamut] while sending YCBCR 420 signals we should program MSA MISC1 fields which indicate VSC SDP for the Pixel Encoding/Colorimetry Format. v2: Block comment style fix. v6: Fix an wrong setting of MSA MISC1 fields for Pixel Encoding/Colorimetry Format indication. As per DP 1.4a spec Table 2-96 [MSA MISC1 and MISC0 Fields for Pixel Encoding/Colorimetry Format Indication] When MISC1, bit 6, is Set to 1, a Source device uses a VSC SDP to indicate the Pixel Encoding/Colorimetry Format. On the wrong version it set a bit 5 of MISC1, now it set a bit 6 of MISC1. Signed-off-by: Gwan-gyeong Mun <gwan-gyeong.mun@intel.com> Reviewed-by: Maarten Lankhorst <maarten.lankhorst@linux.intel.com> Signed-off-by: Jani Nikula <jani.nikula@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20190521121721.32010-5-gwan-gyeong.mun@intel.com	2019-05-23 09:49:43 +03:00
Gwan-gyeong Mun	3c053a96ef	drm/i915/dp: Program VSC Header and DB for Pixel Encoding/Colorimetry Format Function intel_pixel_encoding_setup_vsc handles vsc header and data block setup for pixel encoding / colorimetry format. Setup VSC header and data block in function intel_pixel_encoding_setup_vsc for pixel encoding / colorimetry format as per dp 1.4a spec, section 2.2.5.7.1, table 2-119: VSC SDP Header Bytes, section 2.2.5.7.5, table 2-120:VSC SDP Payload for DB16 through DB18. v2: Minor style fix. [Maarten] Refer to commit ids instead of patchwork. [Maarten] v6: Rebase v7: Rebase and addressed review comments from Ville. Use a structure initializer instead of memset(). Fix non-standard comment format. Remove a referring to specific commit. Add a setting of dynamic range bit to vsc_sdp.DB17. Add a setting of bpc which is based on pipe_bpp. Remove duplicated checking of connector's ycbcr_420_allowed from intel_pixel_encoding_setup_vsc(). It is already checked from intel_dp_ycbcr420_config(). Remove comments for VSC_SDP_EXTENSION_FOR_COLORIMETRY_SUPPORTED. It is already implemented on intel_dp_get_colorimetry_status(). v8: A missing of setting bpc to VSC setup is the pretty fatal case, it replaces DRM_DEBUG_KMS() to MISSING_CASE(). [Maarten] v9: Use a changed member name of struct dp_sdp. it renamed to db from DB. Cc: Maarten Lankhorst <maarten.lankhorst@linux.intel.com> Cc: Ville Syrjälä <ville.syrjala@linux.intel.com> Signed-off-by: Gwan-gyeong Mun <gwan-gyeong.mun@intel.com> Reviewed-by: Maarten Lankhorst <maarten.lankhorst@linux.intel.com> Signed-off-by: Jani Nikula <jani.nikula@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20190521121721.32010-4-gwan-gyeong.mun@intel.com	2019-05-23 09:49:43 +03:00
Gwan-gyeong Mun	4d432f956d	drm: Rename struct edp_vsc_psr to struct dp_sdp VSC SDP Payload for PSR is one of data block type of SDP (Secondaray Data Packet). In order to generalize SDP packet structure name, it renames struct edp_vsc_psr to struct dp_sdp. And each SDP data blocks have different usages, each SDP type has different reserved data blocks and Video_Stream_Configuration Extension VESA SDP might use all of Data Blocks as Extended INFORFRAME Data Byte. so it makes Data Block variables as array type. And it adds comments of details of DB of VSC SDP Payload for Pixel Encoding/Colorimetry Format. This comments follows DP 1.4a spec, section 2.2.5.7.5, chapter "VSC SDP Payload for Pixel Encoding/Colorimetry Format". v7: Addressed review comments from Ville. v9: Rename a member value name DB to db on struct dp_sdp [Laurent] Cc: Ville Syrjälä <ville.syrjala@linux.intel.com> Signed-off-by: Gwan-gyeong Mun <gwan-gyeong.mun@intel.com> Reviewed-by: Maarten Lankhorst <maarten.lankhorst@linux.intel.com> Acked-by: Laurent Pinchart <laurent.pinchart@ideasonboard.com> Acked-by: Maarten Lankhorst <maarten.lankhorst@linux.intel.com> Signed-off-by: Jani Nikula <jani.nikula@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20190521121721.32010-3-gwan-gyeong.mun@intel.com	2019-05-23 09:49:32 +03:00
Gwan-gyeong Mun	8e9d645c68	drm/i915/dp: Add a config function for YCBCR420 outputs This patch checks a support of YCBCR420 outputs on an encoder level. If the input mode is YCBCR420-only mode then it prepares DP as an YCBCR420 output, else it continues with RGB output mode. It set output_format to INTEL_OUTPUT_FORMAT_YCBCR420 in order to using a pipe scaler as RGB to YCbCr 4:4:4. v2: Addressed review comments from Ville. Style fixed with few naming. %s/config/crtc_state/ %s/intel_crtc/crtc/ If lscon is active, it makes not to call intel_dp_ycbcr420_config() to avoid to clobber of lspcon_ycbcr420_config() routine. And it move the 420_only check into the intel_dp_ycbcr420_config(). v3: Fix uninitialized return value and it is reported by Dan Carpenter. v4: Addressed review comments from Ville. In order to avoid the extra indentation, it inverts if-clause on intel_dp_ycbcr420_config(). Remove the error print where no errors print are allowed. v6: Rebase v7: Move intel_dp_get_colorimetry_status() to intel_dp from intel_psr. intel_dp_get_colorimetry_status() checks VSC_SDP_EXTENSION_FOR_COLORIMETRY_SUPPORTED bit in the DPRX_FEATURE_ENUMERATION_LIST register. And intel_dp_ycbcr420_config() uses intel_dp_get_colorimetry_status(). Cc: Ville Syrjälä <ville.syrjala@linux.intel.com> Signed-off-by: Gwan-gyeong Mun <gwan-gyeong.mun@intel.com> Reviewed-by: Maarten Lankhorst <maarten.lankhorst@linux.intel.com> Signed-off-by: Jani Nikula <jani.nikula@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20190521121721.32010-2-gwan-gyeong.mun@intel.com	2019-05-23 09:48:59 +03:00
Tvrtko Ursulin	c5d3e39caa	drm/i915: Engine discovery query Engine discovery query allows userspace to enumerate engines, probe their configuration features, all without needing to maintain the internal PCI ID based database. A new query for the generic i915 query ioctl is added named DRM_I915_QUERY_ENGINE_INFO, together with accompanying structure drm_i915_query_engine_info. The address of latter should be passed to the kernel in the query.data_ptr field, and should be large enough for the kernel to fill out all known engines as struct drm_i915_engine_info elements trailing the query. As with other queries, setting the item query length to zero allows userspace to query minimum required buffer size. Enumerated engines have common type mask which can be used to query all hardware engines, versus engines userspace can submit to using the execbuf uAPI. Engines also have capabilities which are per engine class namespace of bits describing features not present on all engine instances. v2: * Fixed HEVC assignment. * Reorder some fields, rename type to flags, increase width. (Lionel) * No need to allocate temporary storage if we do it engine by engine. (Lionel) v3: * Describe engine flags and mark mbz fields. (Lionel) * HEVC only applies to VCS. v4: * Squash SFC flag into main patch. * Tidy some comments. v5: * Add uabi_ prefix to engine capabilities. (Chris Wilson) * Report exact size of engine info array. (Chris Wilson) * Drop the engine flags. (Joonas Lahtinen) * Added some more reserved fields. * Move flags after class/instance. v6: * Do not check engine info array was zeroed by userspace but zero the unused fields for them instead. v7: * Simplify length calculation loop. (Lionel Landwerlin) v8: * Remove MBZ comments where not applicable. * Rename ABI flags to match engine class define naming. * Rename SFC ABI flag to reflect it applies to VCS and VECS. * SFC is wired to even _logical_ engine instances. * SFC applies to VCS and VECS. * HEVC is present on all instances on Gen11. (Tony) * Simplify length calculation even more. (Chris Wilson) * Move info_ptr assigment closer to loop for clarity. (Chris Wilson) * Use vdbox_sfc_access from runtime info. * Rebase for RUNTIME_INFO. * Refactor for lower indentation. * Rename uAPI class/instance to engine_class/instance to avoid C++ keyword. v9: * Rebase for s/num_rings/num_engines/ in RUNTIME_INFO. v10: * Use new copy_query_item. v11: * Consolidate with struct i915_engine_class_instnace. Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com> Cc: Chris Wilson <chris@chris-wilson.co.uk> Cc: Jon Bloomfield <jon.bloomfield@intel.com> Cc: Dmitry Rogozhkin <dmitry.v.rogozhkin@intel.com> Cc: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com> Cc: Tony Ye <tony.ye@intel.com> Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> # v7 Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk> Link: https://patchwork.freedesktop.org/patch/msgid/20190522090054.6007-1-tvrtko.ursulin@linux.intel.com	2019-05-22 14:17:55 +01:00
Tvrtko Ursulin	cbe3e1d103	drm/i915/icl: Add WaDisableBankHangMode Disable GPU hang by default on unrecoverable ECC cache errors. v2: * Rebase. v3: * Use intel_uncore_read. (Chris) Fixes: `cc38cae7c4` ("drm/i915/icl: Introduce initial Icelake Workarounds") Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com> Acked-by: Chris Wilson <chris@chris-wilson.co.uk> Link: https://patchwork.freedesktop.org/patch/msgid/20190520110442.403-2-tvrtko.ursulin@linux.intel.com	2019-05-22 10:11:10 +01:00
Tvrtko Ursulin	fde938867b	drm/i915/selftests: Verify context workarounds Test context workarounds have been correctly applied in newly created contexts. To accomplish this the existing engine_wa_list_verify helper is extended to take in a context from which reading of the workaround list will be done. Context workaround verification is done from the existing subtests, which have been renamed to reflect they are no longer only about GT and engine workarounds. v2: * Test after resets and refactor to use intel_context more. (Chris) v3: * Use ce->engine->i915 instead of ce->gem_context->i915. (Chris) * gem_engine_iter.idx is engine->id + 1. (Chris) v4: * Make local function static. Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com> Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk> Link: https://patchwork.freedesktop.org/patch/msgid/20190520142546.12493-1-tvrtko.ursulin@linux.intel.com	2019-05-22 10:11:09 +01:00
Chris Wilson	a88b6e4cba	drm/i915: Allow specification of parallel execbuf There is a desire to split a task onto two engines and have them run at the same time, e.g. scanline interleaving to spread the workload evenly. Through the use of the out-fence from the first execbuf, we can coordinate secondary execbuf to only become ready simultaneously with the first, so that with all things idle the second execbufs are executed in parallel with the first. The key difference here between the new EXEC_FENCE_SUBMIT and the existing EXEC_FENCE_IN is that the in-fence waits for the completion of the first request (so that all of its rendering results are visible to the second execbuf, the more common userspace fence requirement). Since we only have a single input fence slot, userspace cannot mix an in-fence and a submit-fence. It has to use one or the other! This is not such a harsh requirement, since by virtue of the submit-fence, the secondary execbuf inherit all of the dependencies from the first request, and for the application the dependencies should be common between the primary and secondary execbuf. Suggested-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com> Testcase: igt/gem_exec_fence/parallel Link: https://github.com/intel/media-driver/pull/546 Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com> Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20190521211134.16117-10-chris@chris-wilson.co.uk	2019-05-22 08:40:50 +01:00
Chris Wilson	ee1136908e	drm/i915/execlists: Virtual engine bonding Some users require that when a master batch is executed on one particular engine, a companion batch is run simultaneously on a specific slave engine. For this purpose, we introduce virtual engine bonding, allowing maps of master:slaves to be constructed to constrain which physical engines a virtual engine may select given a fence on a master engine. For the moment, we continue to ignore the issue of preemption deferring the master request for later. Ideally, we would like to then also remove the slave and run something else rather than have it stall the pipeline. With load balancing, we should be able to move workload around it, but there is a similar stall on the master pipeline while it may wait for the slave to be executed. At the cost of more latency for the bonded request, it may be interesting to launch both on their engines in lockstep. (Bubbles abound.) Opens: Also what about bonding an engine as its own master? It doesn't break anything internally, so allow the silliness. v2: Emancipate the bonds v3: Couple in delayed scheduling for the selftests v4: Handle invalid mutually exclusive bonding v5: Mention what the uapi does v6: s/nbond/num_bonds/ Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com> Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20190521211134.16117-9-chris@chris-wilson.co.uk	2019-05-22 08:40:46 +01:00
Chris Wilson	f71e01a78b	drm/i915: Extend execution fence to support a callback In the next patch, we will want to configure the slave request depending on which physical engine the master request is executed on. For this, we introduce a callback from the execute fence to convey this information. Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20190521211134.16117-8-chris@chris-wilson.co.uk	2019-05-22 08:40:45 +01:00
Chris Wilson	78e41ddd21	drm/i915: Apply an execution_mask to the virtual_engine Allow the user to direct which physical engines of the virtual engine they wish to execute one, as sometimes it is necessary to override the load balancing algorithm. v2: Only kick the virtual engines on context-out if required Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com> Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20190521211134.16117-7-chris@chris-wilson.co.uk	2019-05-22 08:40:43 +01:00
Chris Wilson	6d06779e86	drm/i915: Load balancing across a virtual engine Having allowed the user to define a set of engines that they will want to only use, we go one step further and allow them to bind those engines into a single virtual instance. Submitting a batch to the virtual engine will then forward it to any one of the set in a manner as best to distribute load. The virtual engine has a single timeline across all engines (it operates as a single queue), so it is not able to concurrently run batches across multiple engines by itself; that is left up to the user to submit multiple concurrent batches to multiple queues. Multiple users will be load balanced across the system. The mechanism used for load balancing in this patch is a late greedy balancer. When a request is ready for execution, it is added to each engine's queue, and when an engine is ready for its next request it claims it from the virtual engine. The first engine to do so, wins, i.e. the request is executed at the earliest opportunity (idle moment) in the system. As not all HW is created equal, the user is still able to skip the virtual engine and execute the batch on a specific engine, all within the same queue. It will then be executed in order on the correct engine, with execution on other virtual engines being moved away due to the load detection. A couple of areas for potential improvement left! - The virtual engine always take priority over equal-priority tasks. Mostly broken up by applying FQ_CODEL rules for prioritising new clients, and hopefully the virtual and real engines are not then congested (i.e. all work is via virtual engines, or all work is to the real engine). - We require the breadcrumb irq around every virtual engine request. For normal engines, we eliminate the need for the slow round trip via interrupt by using the submit fence and queueing in order. For virtual engines, we have to allow any job to transfer to a new ring, and cannot coalesce the submissions, so require the completion fence instead, forcing the persistent use of interrupts. - We only drip feed single requests through each virtual engine and onto the physical engines, even if there was enough work to fill all ELSP, leaving small stalls with an idle CS event at the end of every request. Could we be greedy and fill both slots? Being lazy is virtuous for load distribution on less-than-full workloads though. Other areas of improvement are more general, such as reducing lock contention, reducing dispatch overhead, looking at direct submission rather than bouncing around tasklets etc. sseu: Lift the restriction to allow sseu to be reconfigured on virtual engines composed of RENDER_CLASS (rcs). v2: macroize check_user_mbz() v3: Cancel virtual engines on wedging v4: Commence commenting v5: Replace 64b sibling_mask with a list of class:instance v6: Drop the one-element array in the uabi v7: Assert it is an virtual engine in to_virtual_engine() v8: Skip over holes in [class][inst] so we can selftest with (vcs0, vcs2) Link: https://github.com/intel/media-driver/pull/283 Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com> Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20190521211134.16117-6-chris@chris-wilson.co.uk	2019-05-22 08:40:38 +01:00
Chris Wilson	b81dde7194	drm/i915: Allow userspace to clone contexts on creation A usecase arose out of handling context recovery in mesa, whereby they wish to recreate a context with fresh logical state but preserving all other details of the original. Currently, they create a new context and iterate over which bits they want to copy across, but it would much more convenient if they were able to just pass in a target context to clone during creation. This essentially extends the setparam during creation to pull the details from a target context instead of the user supplied parameters. The ideal here is that we don't expose control over anything more than can be obtained via CONTEXT_PARAM. That is userspace retains explicit control over all features, and this api is just convenience. For example, you could replace struct context_param p = { .param = CONTEXT_PARAM_VM }; param.ctx_id = old_id; gem_context_get_param(&p.param); new_id = gem_context_create(); param.ctx_id = new_id; gem_context_set_param(&p.param); gem_vm_destroy(param.value); /* drop the ref to VM_ID handle */ with struct create_ext_param p = { { .name = CONTEXT_CREATE_CLONE }, .clone_id = old_id, .flags = CLONE_FLAGS_VM } new_id = gem_context_create_ext(&p); and not have to worry about stray namespace pollution etc. Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20190521211134.16117-5-chris@chris-wilson.co.uk	2019-05-22 08:40:37 +01:00
Chris Wilson	8319f44c05	drm/i915: Re-expose SINGLE_TIMELINE flags for context creation The SINGLE_TIMELINE flag can be used to create a context such that all engine instances within that context share a common timeline. This can be useful for mixing operations between real and virtual engines, or when using a composite context for a single client API context. Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20190521211134.16117-4-chris@chris-wilson.co.uk	2019-05-22 08:40:35 +01:00
Chris Wilson	e620f7b3a2	drm/i915: Extend I915_CONTEXT_PARAM_SSEU to support local ctx->engine[] Allow the user to specify a local engine index (as opposed to class:index) that they can use to refer to a preset engine inside the ctx->engine[] array defined by an earlier I915_CONTEXT_PARAM_ENGINES. This will be useful for setting SSEU parameters on virtual engines that are local to the context and do not have a valid global class:instance lookup. Note that due to the ambiguity in using class:instance with ctx->engines[], if a user supplied engine map is active the user must specify the engine to alter by its index into the ctx->engines[]. Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com> Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20190521211134.16117-3-chris@chris-wilson.co.uk	2019-05-22 08:40:34 +01:00
Chris Wilson	976b55f0e1	drm/i915: Allow a context to define its set of engines Over the last few years, we have debated how to extend the user API to support an increase in the number of engines, that may be sparse and even be heterogeneous within a class (not all video decoders created equal). We settled on using (class, instance) tuples to identify a specific engine, with an API for the user to construct a map of engines to capabilities. Into this picture, we then add a challenge of virtual engines; one user engine that maps behind the scenes to any number of physical engines. To keep it general, we want the user to have full control over that mapping. To that end, we allow the user to constrain a context to define the set of engines that it can access, order fully controlled by the user via (class, instance). With such precise control in context setup, we can continue to use the existing execbuf uABI of specifying a single index; only now it doesn't automagically map onto the engines, it uses the user defined engine map from the context. v2: Fixup freeing of local on success of get_engines() v3: Allow empty engines[] v4: s/nengine/num_engines/ v5: Replace 64 limit on num_engines with a note that execbuf is currently limited to only using the first 64 engines. v6: Actually use the engines_mutex to guard the ctx->engines. Testcase: igt/gem_ctx_engines Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com> Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20190521211134.16117-2-chris@chris-wilson.co.uk	2019-05-22 08:40:31 +01:00
Chris Wilson	7f3f317a66	drm/i915: Restore control over ppgtt for context creation ABI Having hid the partially exposed new ABI from the PR, put it back again for completion of context recovery. A significant part of context recovery is the ability to reuse as much of the old context as is feasible (to avoid expensive reconstruction). The biggest chunk kept hidden at the moment is fine-control over the ctx->ppgtt (the GPU page tables and associated translation tables and kernel maps), so make control over the ctx->ppgtt explicit. This allows userspace to create and share virtual memory address spaces (within the limits of a single fd) between contexts they own, along with the ability to query the contexts for the vm state. Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20190521211134.16117-1-chris@chris-wilson.co.uk	2019-05-22 08:40:29 +01:00
Ville Syrjälä	5c000fb33b	drm/i915: Bump gen7+ fb size limits to 16kx16k With gtt remapping in place we can use arbitrarily large framebuffers. Let's bump the limits to 16kx16k on gen7+. The limit was chosen to match the maximum 2D surface size of the 3D engine. With the remapping we could easily go higher than that for the display engine. However the modesetting ddx will blindly assume it can handle whatever is reported via kms. The oversized buffer dimensions are not caught by glamor nor Mesa until finally an assert will trip when genxml attempts to pack the SURFACE_STATE. So we pick a safe limit to avoid the X server from crashing (or potentially misbehaving if the genxml asserts are compiled out). Reviewed-by: Daniel Vetter <daniel.vetter@ffwll.ch> Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=110187 Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20190509122159.24376-9-ville.syrjala@linux.intel.com Reviewed-by: Maarten Lankhorst <maarten.lankhorst@linux.intel.com>	2019-05-20 18:04:48 +03:00
Ville Syrjälä	2033012982	drm/i915: Bump fb stride limit to 128KiB for gen4+ and 256KiB for gen7+ With gtt remapping plugged in we can simply raise the stride limit on gen4+. Let's just pick the limit to match the render engine max stride (256KiB on gen7+, 128KiB on gen4+). No remapping CCS because the virtual address of each page actually matters due to the new hash mode (WaCompressedResourceDisplayNewHashMode:skl,kbl etc.), and no remapping on gen2/3 due extra complications from fence alignment and gen2 2KiB GTT tile size. Also no real benefit since the display engine limits already match the other limits. v2: Rebase due to is_ccs_modifier() v3: Tweak the comment and commit msg v4: Fix gen4+ stride limit to be 128KiB Reviewed-by: Daniel Vetter <daniel.vetter@ffwll.ch> #v3 Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20190509122159.24376-8-ville.syrjala@linux.intel.com Reviewed-by: Maarten Lankhorst <maarten.lankhorst@linux.intel.com>	2019-05-20 18:04:48 +03:00
Ville Syrjälä	aa5ca8b742	drm/i915: Align dumb buffer stride to 4k to allow for gtt remapping Align dumb buffer stride to 4k if the fb will be big enough to require gtt remapping. v2: Leave the stride alone for buffers that look to be for the cursor v3: Make it not a hack (Daniel) Cc: Daniel Vetter <daniel.vetter@ffwll.ch> Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20190509122159.24376-7-ville.syrjala@linux.intel.com Reviewed-by: Daniel Vetter <daniel.vetter@ffwll.ch> Reviewed-by: Maarten Lankhorst <maarten.lankhorst@linux.intel.com>	2019-05-20 18:04:48 +03:00
Ville Syrjälä	54d4d719fa	drm/i915: Overcome display engine stride limits via GTT remapping The display engine stride limits are getting in our way. On SKL+ we are limited to 8k pixels, which is easily exceeded with three 4k displays. To overcome this limitation we can remap the pages in the GTT to provide the display engine with a view of memory with a smaller stride. The code is mostly already there as We already play tricks with the plane surface address and x/y offsets. A few caveats apply: * linear buffers need the fb stride to be page aligned, as otherwise the remapped lines wouldn't start at the same spot * compressed buffers can't be remapped due to the new ccs hash mode causing the virtual address of the pages to affect the interpretation of the compressed data. IIRC the old hash was limited to the low 12 bits so if we were using that mode we could remap. As it stands we just refuse to remapp with compressed fbs. * no remapping gen2/3 as we'd need a fence for the remapped vma, which we currently don't have. Need to deal with the fence POT requirements, and do something about the gen2 gtt page size vs tile size difference v2: Rebase due to is_ccs_modifier() Fix up the skl+ stride_mult mess memset() the gtt_view because otherwise we could leave junk in plane[1] when going from 2 plane to 1 plane format v3: intel_check_plane_stride() was split out v4: Drop the aligned viewport stuff, it was meant for ccs which can't be remapped anyway v5: Introduce intel_plane_can_remap() Reorder the code so that plane_state->view gets filled even for invisible planes, otherwise we'd keep using stale values and could explode during remapping. The new logic never remaps invisible planes since we don't have a viewport, and instead pins the full fb instead v6: Fix plane src coord checks after remapping by moving plane_state->base.src to the final plane x/y offsets. Allow intel_plane_check_stride() to fail even with remapping (can happen at least with a linear 64bpp fb with a 4k plane and a suitably inconvenient src coordinates). Improve aux plane FIXME (Daniel) Move some code shuffling into a separate patch (Daniel) Testcase: igt/kms_big_fb Cc: Daniel Vetter <daniel@ffwll.ch> Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20190509122159.24376-6-ville.syrjala@linux.intel.com Reviewed-by: Maarten Lankhorst <maarten.lankhorst@linux.intel.com>	2019-05-20 18:04:47 +03:00
Ville Syrjälä	a88c40ebb8	drm/i915: Shuffle stride checking code around Reorganize some fb stride checking code a bit to prepare for gtt remapping. And do a bit of s/pitch/stride/ renaming in the process for a bit more uniformity (apart from the whole fb->pitches[] thing). Cc: Daniel Vetter <daniel.vetter@ffwll.ch> Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20190509122159.24376-5-ville.syrjala@linux.intel.com Reviewed-by: Daniel Vetter <daniel.vetter@ffwll.ch> Reviewed-by: Maarten Lankhorst <maarten.lankhorst@linux.intel.com>	2019-05-20 18:04:47 +03:00
Ville Syrjälä	bb211c3d0c	drm/i915/selftests: Add live vma selftest Add a live selftest to excercise rotated/remapped vmas. We simply write through the rotated/remapped vma, and confirm that the data appears in the right page when read through the normal vma. Not sure what the fallout of making all rotated/remapped vmas mappable/fenceable would be, hence I just hacked it in the test. v2: Grab rpm reference (Chris) GEM_BUG_ON(view.type not as expected) (Chris) Allow CAN_FENCE for rotated/remapped vmas (Chris) Update intel_plane_uses_fence() to ask for a fence only for normal vmas on gen4+ v3: Deal with intel_wakeref_t v4: Rebase Cc: Chris Wilson <chris@chris-wilson.co.uk> Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk> Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20190509122159.24376-4-ville.syrjala@linux.intel.com Reviewed-by: Maarten Lankhorst <maarten.lankhorst@linux.intel.com>	2019-05-20 18:04:47 +03:00
Ville Syrjälä	e2e394bffa	drm/i915/selftests: Add mock selftest for remapped vmas Extend the rotated vma mock selftest to cover remapped vmas as well. TODO: reindent the loops I guess? Left like this for now to ease review v2: Include the vma type in the error message (Chris) v3: Deal with trimmed sg v4: Drop leftover debugs Cc: Chris Wilson <chris@chris-wilson.co.uk> Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk> Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20190509122159.24376-3-ville.syrjala@linux.intel.com Reviewed-by: Maarten Lankhorst <maarten.lankhorst@linux.intel.com>	2019-05-20 18:04:47 +03:00
Ville Syrjälä	1a74fc0b3f	drm/i915: Add a new "remapped" gtt_view To overcome display engine stride limits we'll want to remap the pages in the GTT. To that end we need a new gtt_view type which is just like the "rotated" type except not rotated. v2: Use intel_remapped_plane_info base type s/unused/unused_mbz/ (Chris) Separate BUILD_BUG_ON()s (Chris) Use I915_GTT_PAGE_SIZE (Chris) v3: Use i915_gem_object_get_dma_address() (Chris) Trim the sg (Tvrtko) v4: Actually trim this time. Limit the max length to one row of pages to keep things simple Cc: Chris Wilson <chris@chris-wilson.co.uk> Cc: Tvrtko Ursulin <tvrtko.ursulin@linux.intel.com> Reviewed-by: Daniel Vetter <daniel.vetter@ffwll.ch> Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk> Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20190509122159.24376-2-ville.syrjala@linux.intel.com Reviewed-by: Maarten Lankhorst <maarten.lankhorst@linux.intel.com>	2019-05-20 18:04:47 +03:00

1 2 3 4 5 ...

827770 Commits