linux_dsm_epyc7002

mirror of https://github.com/AuxXxilium/linux_dsm_epyc7002.git synced 2024-12-17 06:48:18 +07:00

Author	SHA1	Message	Date
Emil Velikov	b1c1f5c400	drm/i915: add extern C guard for the UAPI header Cc: Jani Nikula <jani.nikula@linux.intel.com> Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com> Acked-by: Daniel Vetter <daniel.vetter@intel.com>	2016-05-13 14:05:53 +01:00
Tvrtko Ursulin	d9da6aa035	drm/i915: Fix VCS ring selection after uapi decoupling This got broken in: commit `de1add3605` Author: Tvrtko Ursulin <tvrtko.ursulin@intel.com> Date: Fri Jan 15 15:12:50 2016 +0000 drm/i915: Decouple execbuf uAPI from internal implementation BSD ring flags need to be shifted before they can be considered indices into the ring array. Reported by Zhipeng Gong. v2: Simplify the code. (Chris Wilson) Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com> Cc: Daniel Vetter <daniel.vetter@ffwll.ch> Cc: Chris Wilson <chris@chris-wilson.co.uk> Cc: Zhipeng Gong <zhipeng.gong@intel.com> Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk> Link: http://patchwork.freedesktop.org/patch/msgid/1453902069-31353-1-git-send-email-tvrtko.ursulin@linux.intel.com Testcase: igt/gem_exec_basic # bdw-gt3	2016-01-28 10:25:49 +00:00
Chris Wilson	426960bed3	drm/i915: Seal busy-ioctl uABI and prevent leaking of internal ids Tvrtko was looking through the execbuffer-ioctl and noticed that the uABI was tightly coupled to our internal engine identifiers. Close inspection also revealed that we leak those internal engine identifiers through the busy-ioctl, and those internal identifiers already do not match the user identifiers. Fortuitiously, there is only one user of the set of busy rings from the busy-ioctl, and they only wish to choose between the RENDER and the BLT engines. Let's fix the userspace ABI while we still can. v2: Update the uAPI documentation to explain the identifiers. Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Testcase: igt/gem_busy Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com> Cc: Daniel Vetter <daniel.vetter@ffwll.ch> Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com> Acked-by: Daniel Vetter <daniel.vetter@ffwll.ch> Link: http://patchwork.freedesktop.org/patch/msgid/1452876706-21620-1-git-send-email-chris@chris-wilson.co.uk	2016-01-21 11:00:35 +00:00
Dave Airlie	ade1ba7346	Merge tag 'drm-intel-next-2015-12-18' of git://anongit.freedesktop.org/drm-intel into drm-next - fix atomic watermark recomputation logic (Maarten) - modeset sequence fixes for LPT (Ville) - more kbl enabling&prep work (Rodrigo, Wayne) - first bits for mst audio - page dirty tracking fixes from Dave Gordon - new get_eld hook from Takashi, also included in the sound tree - fixup cursor handling when placed at address 0 (Ville) - refactor VBT parsing code (Jani) - rpm wakelock debug infrastructure ( Imre) - fbdev is pinned again (Chris) - tune the busywait logic to avoid wasting cpu cycles (Chris) * tag 'drm-intel-next-2015-12-18' of git://anongit.freedesktop.org/drm-intel: (81 commits) drm/i915: Update DRIVER_DATE to 20151218 drm/i915/skl: Default to noncoherent access up to F0 drm/i915: Only spin whilst waiting on the current request drm/i915: Limit the busy wait on requests to 5us not 10ms! drm/i915: Break busywaiting for requests on pending signals drm/i915: don't enable autosuspend on platforms without RPM support drm/i915/backlight: prefer dev_priv over dev pointer drm/i915: Disable primary plane if we fail to reconstruct BIOS fb (v2) drm/i915: Pin the ifbdev for the info->system_base GGTT mmapping drm/i915: Set the map-and-fenceable flag for preallocated objects drm/i915: mdelay(10) considered harmful drm/i915: check that we are in an RPM atomic section in GGTT PTE updaters drm/i915: add support for checking RPM atomic sections drm/i915: check that we hold an RPM wakelock ref before we put it drm/i915: add support for checking if we hold an RPM reference drm/i915: use assert_rpm_wakelock_held instead of opencoding it drm/i915: add assert_rpm_wakelock_held helper drm/i915: remove HAS_RUNTIME_PM check from RPM get/put/assert helpers drm/i915: get a permanent RPM reference on platforms w/o RPM support drm/i915: refactor RPM disabling due to RC6 being disabled ...	2015-12-23 14:22:09 +10:00
Dave Airlie	663a233eef	Merge branch 'drm-header-fixes' of https://github.com/GabrielL/linux into drm-next Fix all the problems with the header files and userspace builds off them. I really care so little about this, but hey who am I to stop progress. * 'drm-header-fixes' of https://github.com/GabrielL/linux: (30 commits) drm: fix inclusion of drm.h in via_drm.h drm: fix inclusion of drm.h in vmwgfx_drm.h drm: fix inclusion of drm.h in virtgpu_drm.h drm: fix inclusion of drm.h in tegra_drm.h drm: fix inclusion of drm.h in savage_drm.h drm: fix inclusion of drm.h in r128_drm.h drm: fix inclusion of drm.h in qxl_drm.h drm: fix inclusion of drm.h in omap_drm.h drm: fix inclusion of drm.h in msm_drm.h drm: fix inclusion of drm.h in mga_drm.h drm: fix inclusion of drm.h in exynos_sarea.h drm: fix inclusion of drm.h in i810_drm.h drm: fix inclusion of drm.h in exynos_sarea.h drm: fix inclusion of drm.h in drm_sarea.h drm: drm_mode.h fix includes drm: drm_fourcc.h fix includes drm: include drm.h in armada_drm.h include/uapi/drm/amdgpu_drm.h: use __u32 and __u64 from <linux/types.h> drm: Kbuild: add admgpu_drm.h to the installed headers drm: use __u{32,64} instead of uint{32,64}_t in virtgpu_drm.h ...	2015-12-11 13:46:05 +10:00
Gabriel Laskar	1049102ff7	drm: fix inclusion of drm.h in exynos_sarea.h Using `#include "drm.h"` instead of `#include <drm/drm.h>` allow drm headers to be moved in another directory without changes, like for the libdrm imports. Signed-off-by: Gabriel Laskar <gabriel@lse.epita.fr> Reviewed-by: Emil Velikov <emil.l.velikov@gmail.com> CC: Emil Velikov <emil.l.velikov@gmail.com> CC: Mikko Rapeli <mikko.rapeli@iki.fi>	2015-12-10 12:33:23 +01:00
Chris Wilson	506a8e87d8	drm/i915: Add soft-pinning API for execbuffer Userspace can pass in an offset that it presumes the object is located at. The kernel will then do its utmost to fit the object into that location. The assumption is that userspace is handling its own object locations (for example along with full-ppgtt) and that the kernel will rarely have to make space for the user's requests. Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> v2: Fixed incorrect eviction found by Michal Winiarski - fix suggested by Chris Wilson. Fixed incorrect error paths causing crash found by Michal Winiarski. (Not published externally) v3: Rebased because of trivial conflict in object_bind_to_vm. Fixed eviction to allow eviction of soft-pinned objects when another soft-pinned object used by a subsequent execbuffer overlaps reported by Michal Winiarski. (Not published externally) v4: Moved soft-pinned objects to the front of ordered_vmas so that they are pinned first after an address conflict happens to avoid repeated conflicts in rare cases (Suggested by Chris Wilson). Expanded comment on drm_i915_gem_exec_object2.offset to cover this new API. v5: Added I915_PARAM_HAS_EXEC_SOFTPIN parameter for detecting this capability (Kristian). Added check for multiple pinnings on eviction (Akash). Made sure buffers are not considered misplaced without the user specifying EXEC_OBJECT_SUPPORTS_48B_ADDRESS. User must assume responsibility for any addressing workarounds. Updated object2.offset field comment again to clarify NO_RELOC case (Chris). checkpatch cleanup. v6: Trivial rebase on latest drm-intel-nightly v7: Catch attempts to pin above the max virtual address size and return EINVAL (Tvrtko). Decouple EXEC_OBJECT_SUPPORTS_48B_ADDRESS and EXEC_OBJECT_PINNED flags, user must pass both flags in any attempt to pin something at an offset above 4GB (Chris, Daniel Vetter). Cc: Chris Wilson <chris@chris-wilson.co.uk> Cc: Akash Goel <akash.goel@intel.com> Cc: Vinay Belgaumkar <vinay.belgaumkar@intel.com> Cc: Michal Winiarski <michal.winiarski@intel.com> Cc: Zou Nanhai <nanhai.zou@intel.com> Cc: Kristian Høgsberg <hoegsberg@gmail.com> Cc: Tvrtko Ursulin <tvrtko.ursulin@linux.intel.com> Reviewed-by: Michel Thierry <michel.thierry@intel.com> Acked-by: PDT Signed-off-by: Thomas Daniel <thomas.daniel@intel.com> Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com> Link: http://patchwork.freedesktop.org/patch/msgid/1449575707-20933-1-git-send-email-thomas.daniel@intel.com	2015-12-09 10:20:17 +00:00
Ville Syrjälä	8697600b40	drm/i915: Make the high dword offset more explicit in i915_reg_read_ioctl Store the upper dword of the register offset in the whitelist as well. This would allow it to read register where the two halves aren't sitting right next to each other, and it'll make it easier to make register access type safe. While at it change the register offsets to u32 from u64. Our register space isn't quite that big, yet :) v2: Use ldw/udw as the suffixes, and add a note about 64bit wide split regs (Chris) Cc: Chris Wilson <chris@chris-wilson.co.uk> Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com> Link: http://patchwork.freedesktop.org/patch/msgid/1446839021-18599-1-git-send-email-ville.syrjala@linux.intel.com Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>	2015-11-18 14:35:16 +02:00
Chris Wilson	fa8848f278	drm/i915: Report context GTT size Since the beginning we have conflated the size of the global GTT with that of the per-process context sizes. In recent times (gen8+), those are no longer the same where the global GTT is limited to 2/4GiB but the per-process GTT may be anything up to 256TiB. Userspace knows nothing of this discrepancy and outside of one or two hacks, uses the getaperture ioctl to determine the maximum size it can use. Let's leave that as reporting the global GTT and use the context reporting method to describe the per-process value (which naturally fallsback to reporting the aliasing or global on older platforms, so userspace can always use this method where available). Testcase: igt/gem_userptr_blits/minor-normal-sync Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=90065 Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>	2015-10-19 12:16:46 +02:00
Michel Thierry	101b506a7f	drm/i915: Wa32bitGeneralStateOffset & Wa32bitInstructionBaseOffset There are some allocations that must be only referenced by 32-bit offsets. To limit the chances of having the first 4GB already full, objects not requiring this workaround use DRM_MM_SEARCH_BELOW/ DRM_MM_CREATE_TOP flags In specific, any resource used with flat/heapless (0x00000000-0xfffff000) General State Heap (GSH) or Instruction State Heap (ISH) must be in a 32-bit range, because the General State Offset and Instruction State Offset are limited to 32-bits. Objects must have EXEC_OBJECT_SUPPORTS_48B_ADDRESS flag to indicate if they can be allocated above the 32-bit address range. To limit the chances of having the first 4GB already full, objects will use DRM_MM_SEARCH_BELOW + DRM_MM_CREATE_TOP flags when possible. The libdrm user of the EXEC_OBJECT_SUPPORTS_48B_ADDRESS flag is here: http://lists.freedesktop.org/archives/intel-gfx/2015-September/075836.html v2: Changed flag logic from neeeds_32b, to supports_48b. v3: Moved 48-bit support flag back to exec_object. (Chris, Daniel) v4: Split pin flags into PIN_ZONE_4G and PIN_HIGH; update PIN_OFFSET_MASK to use last PIN_ defined instead of hard-coded value; use correct limit check in eb_vma_misplaced. (Chris) v5: Don't touch PIN_OFFSET_MASK and update workaround comment (Chris) v6: Apply pin-high for ggtt too (Chris) v7: Handle simultaneous pin-high and pin-mappable end correctly (Akash) Fix check for entries currently using +4GB addresses, use min_t and other polish in object_bind_to_vm (Chris) v8: Commit message updated to point to libdrm patch. v9: vmas are allocated in the correct ozone, so only check flag when the vma has not been allocated. (Chris) Cc: Chris Wilson <chris@chris-wilson.co.uk> Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk> (v4) Signed-off-by: Michel Thierry <michel.thierry@intel.com> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>	2015-10-01 18:12:17 +02:00
Artem Savkov	16f7249ddf	uapi/drm/i915_drm.h: fix userspace compilation. commit `346add7834` Author: Daniel Vetter <daniel.vetter@ffwll.ch> Date: Tue Jul 14 18:07:30 2015 +0200 drm/i915: Use expcitly fixed type in compat32 structs changed the type of param field in drm_i915_getparam from int to s32. This header is exported to userspace and needs to use userspace type __s32 instead. This fixes userspace compilation errors like the following: include/drm/i915_drm.h:361:2: error: unknown type name 's32' s32 param; Signed-off-by: Artem Savkov <asavkov@redhat.com> Reviewed-by: Daniel Vetter <daniel.vetter@ffwll.ch> Signed-off-by: Jani Nikula <jani.nikula@intel.com>	2015-09-02 16:26:58 +03:00
Dave Airlie	4eebf60b74	Linux 4.2-rc7 -----BEGIN PGP SIGNATURE----- Version: GnuPG v1 iQEcBAABAgAGBQJV0R4AAAoJEHm+PkMAQRiG8xIH/AmiRd+JDrs0qqEy46p6X8Gn 0lB5/KsGycvIGIBTiy2nZzcT0Ly6LeFUKUjzPytlOhIZPMrxMVMShDaQKCXXIMUr 1mN6hkvpkLNnUhvL2fR6mm0zkjbz3zZEazFY+Jic8wQrtSkHgfH0DXqSAo8le0f8 kNrd5BPPhIwvpHGaNGFdTpbgpPcalXyQk/fHyvDGidbyXzY/d7l05QfYJ6XCD4Zm IAy48iK5BFts2+z3aOYrOeuuCcm1qFX8YArqzE1rfPp+U/LQpfUfij4cmOqDLn/F qnv9E7bRRVovvrgKe4I3Trta8kT53VLJvqpdw2Usqo8zvhs4VyrYpHC+gEE6YUY= =9Rd4 -----END PGP SIGNATURE----- Merge tag 'v4.2-rc7' into drm-next Linux 4.2-rc7 Backmerge master for i915 fixes	2015-08-17 14:13:53 +10:00
Chris Wilson	648a9bc530	drm/i915: Use two 32bit reads for select 64bit REG_READ ioctls Since the hardware sometimes mysteriously totally flummoxes the 64bit read of a 64bit register when read using a single instruction, split the read into two instructions. Since the read here is of automatically incrementing timestamp counters, we also have to be very careful in order to make sure that it does not increment between the two instructions. However, since userspace tried to workaround this issue and so enshrined this ABI for a broken hardware read and in the process neglected that the read only fails in some environments, we have to introduce a new uABI flag for userspace to request the 2x32 bit accurate read of the timestamp. v2: Fix alignment check and include details of the workaround for userspace. Reported-by: Karol Herbst <freedesktop@karolherbst.de> Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=91317 Testcase: igt/gem_reg_read Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Cc: Michał Winiarski <michal.winiarski@intel.com> Cc: stable@vger.kernel.org Tested-by: Michał Winiarski <michal.winiarski@intel.com> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>	2015-07-21 11:49:13 +02:00
Daniel Vetter	346add7834	drm/i915: Use expcitly fixed type in compat32 structs I was confused shortly whether the compat was needed for the int, until I noticed the pointer in the original. Also remove typedef. v2: Review from Chris. - Add comments. - Also change the int param in the original structure. Cc: Chris Wilson <chris@chris-wilson.co.uk> Signed-off-by: Daniel Vetter <daniel.vetter@intel.com> Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>	2015-07-15 11:39:59 +02:00
Abdiel Janulgue	a9ed33ca07	drm/i915: Expose I915_EXEC_RESOURCE_STREAMER flag and getparam Ensures that the batch buffer is executed by the resource streamer. And will let userspace know whether Resource Streamer is supported in the kernel. v2: Don't skip 1<<15 for the exec flags (Jani Nikula) v3: Use HAS_RESOURCE_STREAMER macro for execbuf validation (Chris Wilson) (from getparam patch) v2: Update I915_PARAM_HAS_RESOURCE_STREAMER so it's after I915_PARAM_HAS_GPU_RESET. v3: Only advertise RS support for hardware that supports it. v4: Add HAS_RESOURCE_STREAMER() macro (Chris) Testcase: igt/gem_exec_params Cc: Jani Nikula <jani.nikula@intel.com> Cc: Kenneth Graunke <kenneth@whitecape.org> Cc: Chris Wilson <chris@chris-wilson.co.uk> Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk> Signed-off-by: Abdiel Janulgue <abdiel.janulgue@linux.intel.com> [danvet: squash in getparam patch since it'd break bisect, suggested by Chris.] Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>	2015-07-06 10:36:46 +02:00
Chris Wilson	49e4d842f0	drm/i915: Report to userspace if we have a (presumed) working GPU reset In igt, we want to test handling of GPU hangs, both for recovery purposes and for reporting. However, we don't want to inject a genuine GPU hang onto a machine that cannot recover and so be permenantly wedged. Rather than embed heuristics into igt, have the kernel report exactly when it expects the GPU reset to work. This can also be usefully extended in future to indicate different levels of fine-grained resets. Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Cc: Daniel Vetter <daniel.vetter@ffwll.ch> Cc: Tim Gore <tim.gore@intel.com> Cc: Tomas Elf <tomas.elf@intel.com> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>	2015-06-15 16:59:58 +02:00
David Weinehall	b1b38278e1	drm/i915: add a context parameter to {en, dis}able zero address mapping Export a new context parameter that can be set/queried through the context_{get,set}param ioctls. This parameter is passed as a context flag and decides whether or not a GPU address mapping is allowed to be made at address zero. The default is to allow such mappings. Signed-off-by: David Weinehall <david.weinehall@intel.com> Acked-by: "Zou, Nanhai" <nanhai.zou@intel.com> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>	2015-05-29 10:15:19 +02:00
Damien Lespiau	21631f10ea	drm/i915: Fix the confusing comment about the ioctl limits It was reported that this comment was confusing, and indeed it is. v2: (one year later!) Add the range for the DRM_I915_* iotcl defines (Daniel) Signed-off-by: Damien Lespiau <damien.lespiau@intel.com> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>	2015-05-26 17:20:30 +02:00
Chris Wilson	ea9da4e460	drm/i915: Allow disabling the destination colorkey for overlay Sometimes userspace wants a true overlay that is never clipped. In such cases, we need to disable the destination colorkey. However, it is currently unconditionally enabled in the overlay with no means of disabling. So rectify that by always default to on, and extending the UPDATE_ATTR ioctl to support explicit disabling of the colorkey. This is contrast to the spite code which requires explicit enabling of either the destination or source colorkey. Handling source colorkey is still todo for the overlay. (Of course it may be worth migrating overlay to sprite before then.) Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Reviewed-by: Ville Syrjälä <ville.syrjala@linux.intel.com> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>	2015-04-10 08:55:54 +02:00
Tommi Rantala	2c60fae148	drm/i915: fix definition of the DRM_IOCTL_I915_GET_SPRITE_COLORKEY ioctl Fix definition of the DRM_IOCTL_I915_GET_SPRITE_COLORKEY ioctl, so that it is different from the DRM_IOCTL_I915_SET_SPRITE_COLORKEY ioctl. Note that this is just for accuracy, the ioctl implementation itself is totally unused and already ripped out. Signed-off-by: Tommi Rantala <tt.rantala@gmail.com> [danvet: Add note that this is a dead ioctl.] Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>	2015-03-27 09:10:26 +01:00
Jeff McGee	a1559ffefb	drm/i915: Export total subslice and EU counts Setup new I915_GETPARAM ioctl entries for subslice total and EU total. Userspace drivers need these values when constructing GPGPU commands. This kernel query method is intended to replace the PCI ID-based tables that userspace drivers currently maintain. The kernel driver can employ fuse register reads as needed to ensure the most accurate determination of GT config attributes. This first became important with Cherryview in which the config could differ between devices with the same PCI ID. The kernel detection of these values is device-specific and not included in this patch. Because zero is not a valid value for any of these parameters, a value of zero is interpreted as unknown for the device. Userspace drivers should continue to maintain ID-based tables for older devices not supported by the new query method. v2: Increment our I915_GETPARAM indices to fit after REVISION which was merged ahead of us. For: VIZ-4636 Signed-off-by: Jeff McGee <jeff.mcgee@intel.com> Tested-by: Zhigang Gong <zhigang.gong@linux.intel.com> Acked-by: Zhigang Gong <zhigang.gong@linux.intel.com> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>	2015-03-17 22:30:31 +01:00
Neil Roberts	27cd44618b	drm/i915: Add I915_PARAM_REVISION Adds a parameter which can be used with DRM_I915_GETPARAM to query the GPU revision. The intention is to use this in Mesa to implement the WaDisableSIMD16On3SrcInstr workaround on Skylake but only for revision 2. Signed-off-by: Neil Roberts <neil@linux.intel.com> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>	2015-03-17 22:29:54 +01:00
Zhipeng Gong	08e16dc874	drm/i915: add I915_PARAM_HAS_BSD2 to i915_getparam This will let userland only try to use the new ring when the appropriate kernel is present v2: change the number to be consistent with upstream (Zhipeng) Signed-off-by: Zhipeng Gong <zhipeng.gong@intel.com> Reviewed--by: Rodrigo Vivi <rodrigo.vivi@intel.com> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>	2015-01-27 09:51:05 +01:00
Zhipeng Gong	8d360dffd6	drm/i915: Specify bsd rings through exec flag On Skylake GT3 we have 2 Video Command Streamers (VCS), which is asymmetrical. For example, HEVC GPU commands can be only dispatched to VCS1 ring. But userspace has no control when using VCS1 or VCS2. This patch introduces a mechanism to avoid the default ping-pong mode and use one specific ring through execution flag. This mechanism is usable for all the platforms with 2 VCS rings. The open source usage is from these two commits in vaapi/intel: commit 702050f04131a44ef8ac16651708ce8a8d98e4b8 Author: Zhao, Yakui <yakui.zhao@intel.com> Date: Mon Nov 17 12:44:19 2014 +0800 Allow the batchbuffer to be submitted with override flag commit a56efcdf27d11ad9b21664b4a2cda72d7f90f5a8 Author: Zhao Yakui <yakui.zhao@intel.com> Date: Mon Nov 17 12:44:22 2014 +0800 Add the override flag to assure that HEVC video command always uses BSD ring0 for SKL GT3 machine v2: fix whitespace (Rodrigo) v3: remove incorrect chunk that came on -collector rebase. (Rodrigo) v4: change the comment (Zhipeng) v5: address Daniel's comment (Zhipeng) Signed-off-by: Zhipeng Gong <zhipeng.gong@intel.com> Reviewed-by: Rodrigo Vivi <rodrigo.vivi@intel.com> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>	2015-01-27 09:51:05 +01:00
Chris Wilson	c9dc0f3598	drm/i915: Add ioctl to set per-context parameters Sometimes we wish to tweak how an individual context behaves. Since we always create a context for every filp, this means that individual processes can fine tune their behaviour even if they do not explicitly create a context. The first example parameter here is to enable multi-process GPU testing, but the interface should be able to cope with passing arbitrarily complex parameters. Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com> Reviewed-by: Mika Kuoppala <mika.kuoppala@intel.com> Testcase: igt/gem_reset_stats/ban-period-* Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>	2015-01-07 18:19:06 +01:00
Akash Goel	1816f92363	drm/i915: Support creation of unbound wc user mappings for objects This patch provides support to create write-combining virtual mappings of GEM object. It intends to provide the same funtionality of 'mmap_gtt' interface without the constraints and contention of a limited aperture space, but requires clients handles the linear to tile conversion on their own. This is for improving the CPU write operation performance, as with such mapping, writes and reads are almost 50% faster than with mmap_gtt. Similar to the GTT mmapping, unlike the regular CPU mmapping, it avoids the cache flush after update from CPU side, when object is passed onto GPU. This type of mapping is specially useful in case of sub-region update, i.e. when only a portion of the object is to be updated. Using a CPU mmap in such cases would normally incur a clflush of the whole object, and using a GTT mmapping would likely require eviction of an active object or fence and thus stall. The write-combining CPU mmap avoids both. To ensure the cache coherency, before using this mapping, the GTT domain has been reused here. This provides the required cache flush if the object is in CPU domain or synchronization against the concurrent rendering. Although the access through an uncached mmap should automatically invalidate the cache lines, this may not be true for non-temporal write instructions and also not all pages of the object may be updated at any given point of time through this mapping. Having a call to get_pages in set_to_gtt_domain function, as added in the earlier patch 'drm/i915: Broaden application of set-domain(GTT)', would guarantee the clflush and so there will be no cachelines holding the data for the object before it is accessed through this map. The drm_i915_gem_mmap structure (for the DRM_I915_GEM_MMAP_IOCTL) has been extended with a new flags field (defaulting to 0 for existent users). In order for userspace to detect the extended ioctl, a new parameter I915_PARAM_MMAP_VERSION has been added for versioning the ioctl interface. v2: Fix error handling, invalid flag detection, renaming (ickle) v3: Rebase to latest drm-intel-nightly codebase The new mmapping is exercised by igt/gem_mmap_wc, igt/gem_concurrent_blit and igt/gem_gtt_speed. Change-Id: Ie883942f9e689525f72fe9a8d3780c3a9faa769a Signed-off-by: Akash Goel <akash.goel@intel.com> Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Cc: Daniel Vetter <daniel.vetter@ffwll.ch> Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@linux.intel.com> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>	2015-01-06 09:08:00 +01:00
Chris Wilson	6a2c4232ec	drm/i915: Make the physical object coherent with GTT Currently objects for which the hardware needs a contiguous physical address are allocated a shadow backing storage to satisfy the contraint. This shadow buffer is not wired into the normal obj->pages and so the physical object is incoherent with accesses via the GPU, GTT and CPU. By setting up the appropriate scatter-gather table, we can allow userspace to access the physical object via either a GTT mmaping of or by rendering into the GEM bo. However, keeping the CPU mmap of the shmemfs backing storage coherent with the contiguous shadow is not yet possible. Fortuituously, CPU mmaps of objects requiring physical addresses are not expected to be coherent anyway. This allows the physical constraint of the GEM object to be transparent to userspace and allow it to efficiently render into or update them via the GTT and GPU. v2: Fix leak of pci handle spotted by Ville v3: Remove the now duplicate call to detach_phys_object during free. v4: Wait for rendering before pwrite. As this patch makes it possible to render into the phys object, we should make it correct as well! Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Cc: Ville Syrjälä <ville.syrjala@linux.intel.com> Reviewed-by: Ville Syrjälä <ville.syrjala@linux.intel.com> Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>	2014-11-14 10:29:18 +01:00
Chris Wilson	70f2f5c704	drm/i915: Report the actual swizzling back to userspace Userspace cares about whether or not swizzling depends on the page address for its direct access into bound objects. Extend the get_tiling ioctl to report the physical swizzling value in addition to the logical swizzling value so that userspace can accurately determine when it is possible for manual detiling. Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Cc: Akash Goel <akash.goel@intel.com> Cc: Daniel Vetter <daniel.vetter@ffwll.ch> Testcase: igt/gem_tiled_wc Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>	2014-11-07 18:42:01 +01:00
Chris Wilson	5cc9ed4b9a	drm/i915: Introduce mapping of user pages into video memory (userptr) ioctl By exporting the ability to map user address and inserting PTEs representing their backing pages into the GTT, we can exploit UMA in order to utilize normal application data as a texture source or even as a render target (depending upon the capabilities of the chipset). This has a number of uses, with zero-copy downloads to the GPU and efficient readback making the intermixed streaming of CPU and GPU operations fairly efficient. This ability has many widespread implications from faster rendering of client-side software rasterisers (chromium), mitigation of stalls due to read back (firefox) and to faster pipelining of texture data (such as pixel buffer objects in GL or data blobs in CL). v2: Compile with CONFIG_MMU_NOTIFIER v3: We can sleep while performing invalidate-range, which we can utilise to drop our page references prior to the kernel manipulating the vma (for either discard or cloning) and so protect normal users. v4: Only run the invalidate notifier if the range intercepts the bo. v5: Prevent userspace from attempting to GTT mmap non-page aligned buffers v6: Recheck after reacquire mutex for lost mmu. v7: Fix implicit padding of ioctl struct by rounding to next 64bit boundary. v8: Fix rebasing error after forwarding porting the back port. v9: Limit the userptr to page aligned entries. We now expect userspace to handle all the offset-in-page adjustments itself. v10: Prevent vma from being copied across fork to avoid issues with cow. v11: Drop vma behaviour changes -- locking is nigh on impossible. Use a worker to load user pages to avoid lock inversions. v12: Use get_task_mm()/mmput() for correct refcounting of mm. v13: Use a worker to release the mmu_notifier to avoid lock inversion v14: Decouple mmu_notifier from struct_mutex using a custom mmu_notifer with its own locking and tree of objects for each mm/mmu_notifier. v15: Prevent overlapping userptr objects, and invalidate all objects within the mmu_notifier range v16: Fix a typo for iterating over multiple objects in the range and rearrange error path to destroy the mmu_notifier locklessly. Also close a race between invalidate_range and the get_pages_worker. v17: Close a race between get_pages_worker/invalidate_range and fresh allocations of the same userptr range - and notice that struct_mutex was presumed to be held when during creation it wasn't. v18: Sigh. Fix the refactor of st_set_pages() to allocate enough memory for the struct sg_table and to clear it before reporting an error. v19: Always error out on read-only userptr requests as we don't have the hardware infrastructure to support them at the moment. v20: Refuse to implement read-only support until we have the required infrastructure - but reserve the bit in flags for future use. v21: use_mm() is not required for get_user_pages(). It is only meant to be used to fix up the kernel thread's current->mm for use with copy_user(). v22: Use sg_alloc_table_from_pages for that chunky feeling v23: Export a function for sanity checking dma-buf rather than encode userptr details elsewhere, and clean up comments based on suggestions by Bradley. Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Cc: Tvrtko Ursulin <tvrtko.ursulin@linux.intel.com> Cc: "Gong, Zhipeng" <zhipeng.gong@intel.com> Cc: Akash Goel <akash.goel@intel.com> Cc: "Volkin, Bradley D" <bradley.d.volkin@intel.com> Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@linux.intel.com> Reviewed-by: Brad Volkin <bradley.d.volkin@intel.com> [danvet: Frob ioctl allocation to pick the next one - will cause a bit of fuss with create2 apparently, but such are the rules.] [danvet2: oops, forgot to git add after manual patch application] [danvet3: Appease sparse.] Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>	2014-05-16 19:31:29 +02:00
Brad Volkin	d728c8ef8b	drm/i915: Add a CMD_PARSER_VERSION getparam So userspace can query the kernel for command parser support. v2: Add i915_cmd_parser_get_version(), history log, and kerneldoc OTC-Tracker: AXIA-4631 Change-Id: I58af650db9f6753c2dcac9c54ab432fd31db302f Signed-off-by: Brad Volkin <bradley.d.volkin@intel.com> Reviewed-by: Jani Nikula <jani.nikula@intel.com> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>	2014-04-01 22:58:15 +02:00
Geert Uytterhoeven	c3d19d3c3f	drm/i915: Spelling s/auxilliary/auxiliary/ Signed-off-by: Geert Uytterhoeven <geert+renesas@linux-m68k.org> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>	2014-01-22 09:58:24 +01:00
Mika Kuoppala	b6359918b8	drm/i915: add i915_get_reset_stats_ioctl This ioctl returns reset stats for specified context. The struct returned contains context loss counters. reset_count: all resets across all contexts batch_active: active batches lost on resets batch_pending: pending batches lost on resets v2: get rid of state tracking completely and deliver only counts. Idea from Chris Wilson. v3: fix commit message v4: default context handled inside i915_gem_context_get_hang_stats v5: reset_count only for priviledged process v6: ctx=0 needs CAP_SYS_ADMIN for batch_* counters (Chris Wilson) v7: context hang stats never returns NULL v8: rebased on top of reworked context hang stats DRM_RENDER_ALLOW for ioctl v9: use DEFAULT_CONTEXT_ID. Improve comments for ioctl struct members Signed-off-by: Mika Kuoppala <mika.kuoppala@intel.com> Cc: Ian Romanick <idr@freedesktop.org> Cc: Chris Wilson <chris@chris-wilson.co.uk> Cc: Daniel Vetter <daniel.vetter@ffwll.ch> Reviewed-by: Damien Lespiau <damien.lespiau@intel.com> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>	2013-11-12 14:15:48 +01:00
Ben Widawsky	35a85ac606	drm/i915: Add second slice l3 remapping Certain HSW SKUs have a second bank of L3. This L3 remapping has a separate register set, and interrupt from the first "slice". A slice is simply a term to define some subset of the GPU's l3 cache. This patch implements both the interrupt handler, and ability to communicate with userspace about this second slice. v2: Remove redundant check about non-existent slice. Change warning about interrupts of unknown slices to WARN_ON_ONCE Handle the case where we get 2 slice interrupts concurrently, and switch the tracking of interrupts to be non-destructive (all Ville) Don't enable/mask the second slice parity interrupt for ivb/vlv (even though all docs I can find claim it's rsvd) (Ville + Bryan) Keep BYT excluded from L3 parity v3: Fix the slice = ffs to be decremented by one (found by Ville). When I initially did my testing on the series, I was using 1-based slice counting, so this code was correct. Not sure why my simpler tests that I've been running since then didn't pick it up sooner. Reviewed-by: Ville Syrjälä <ville.syrjala@linux.intel.com> Signed-off-by: Ben Widawsky <ben@bwidawsk.net> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>	2013-09-19 20:37:04 +02:00
Chris Wilson	651d794fae	drm/i915: Use Write-Through cacheing for the display plane on Iris Haswell GT3e has the unique feature of supporting Write-Through cacheing of objects within the eLLC/LLC. The purpose of this is to enable the display plane to remain coherent whilst objects lie resident in the eLLC/LLC - so that we, in theory, get the best of both worlds, perfect display and fast access. However, we still need to be careful as the CPU does not see the WT when accessing the cache. In particular, this means that we need to flush the cache lines after writing to an object through the CPU, and on transitioning from a cached state to WT. v2: Actually do the clflush on transition to WT, nagging by Ville. v3: Flush the CPU cache after writes into WT objects. v4: Rease onto LLC updates and report WT as "uncached" for get_cache_level_ioctl to remain symmetric with set_cache_level_ioctl. Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Cc: Ville Syrjälä <ville.syrjala@linux.intel.com> Cc: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Ville Syrjälä <ville.syrjala@linux.intel.com> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>	2013-08-22 13:31:38 +02:00
Daniel Vetter	35c7ab421a	drm/i915: reserve I915_CACHING_DISPLAY and document cache modes Resolve the catch-22 of igt needing a stable number and patches first needing testcases by reserving the interface number up-front. v2: Improve the spelling a bit. v3: More spelling fail spotted by Chris. Requested-by: Chris Wilson <chris@chris-wilson.co.uk> Cc: Chris Wilson <chris@chris-wilson.co.uk> Cc: Ville Syrjälä <ville.syrjala@linux.intel.com> Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>	2013-08-22 13:31:34 +02:00
Ben Widawsky	cce723ed09	drm/i915: Make i915 events part of uapi Make the uevent strings part of the user API for people who wish to write their own listeners. v2: Make a space in the string concatenation. (Chad) Use the "UEVENT" suffix intead of "EVENT" (Chad) Make kernel-doc parseable Docbook comments (Daniel) v3: Undid reset change introduced in last submission (Daniel) Fixed up comments to address removal changes. Thanks to Daniel Vetter for a majority of the parity error comments. CC: Chad Versace <chad.versace@linux.intel.com> Signed-off-by: Ben Widawsky <ben@bwidawsk.net> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>	2013-07-19 18:26:57 +02:00
Xiang, Haihao	a1f2cc73c7	drm/i915: add I915_PARAM_HAS_VEBOX to i915_getparam This will let userland only try to use the new ring when the appropriate kernel is present Signed-off-by: Xiang, Haihao <haihao.xiang@intel.com> Reviewed-by: Damien Lespiau <damien.lespiau@intel.com> Signed-off-by: Ben Widawsky <ben@bwidawsk.net> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>	2013-05-31 20:54:22 +02:00
Xiang, Haihao	82f91b6e93	drm/i915: add I915_EXEC_VEBOX to i915_gem_do_execbuffer() A user can run batchbuffer via VEBOX ring. Signed-off-by: Xiang, Haihao <haihao.xiang@intel.com> Reviewed-by: Damien Lespiau <damien.lespiau@intel.com> Signed-off-by: Ben Widawsky <ben@bwidawsk.net> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>	2013-05-31 20:54:21 +02:00
Chris Wilson	eef90ccb8a	drm/i915: Use the reloc.handle as an index into the execbuffer array Using copywinwin10 as an example that is dependent upon emitting a lot of relocations (2 per operation), we see improvements of: c2d/gm45: 618000.0/sec to 623000.0/sec. i3-330m: 748000.0/sec to 789000.0/sec. (measured relative to a baseline with neither optimisations applied). Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Reviewed-by: Imre Deak <imre.deak@intel.com> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>	2013-01-17 22:23:47 +01:00
Daniel Vetter	ed5982e6ce	drm/i915: Allow userspace to hint that the relocations were known Userspace is able to hint to the kernel that its command stream and auxiliary state buffers already hold the correct presumed addresses and so the relocation process may be skipped if the kernel does not need to move any buffers in preparation for the execbuffer. Thus for the common case where the allotment of buffers is static between batches, we can avoid the overhead of individually checking the relocation entries. Note that this requires userspace to supply the domain tracking and requests for workarounds itself that would otherwise be computed based upon the relocation entries. Using copywinwin10 as an example that is dependent upon emitting a lot of relocations (2 per operation), we see improvements of: c2d/gm45: 618000.0/sec to 632000.0/sec. i3-330m: 748000.0/sec to 830000.0/sec. (measured relative to a baseline with neither optimisations applied). Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Reviewed-by: Imre Deak <imre.deak@intel.com> [danvet: Fixup merge conflict in userspace header due to different baseline trees.] Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>	2013-01-17 22:23:36 +01:00
Daniel Vetter	b45305fce5	drm/i915: Implement workaround for broken CS tlb on i830/845 Now that Chris Wilson demonstrated that the key for stability on early gen 2 is to simple _never_ exchange the physical backing storage of batch buffers I've tried a stab at a kernel solution. Doesn't look too nefarious imho, now that I don't try to be too clever for my own good any more. v2: After discussing the various techniques, we've decided to always blit batches on the suspect devices, but allow userspace to opt out of the kernel workaround assume full responsibility for providing coherent batches. The principal reason is that avoiding the blit does improve performance in a few key microbenchmarks and also in cairo-trace replays. Signed-Off-by: Daniel Vetter <daniel.vetter@ffwll.ch> Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> [danvet: - Drop the hunk which uses HAS_BROKEN_CS_TLB to implement the ring wrap w/a. Suggested by Chris Wilson. - Also add the ACTHD check from Chris Wilson for the error state dumping, so that we still catch batches when userspace opts out of the w/a.] Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>	2012-12-17 17:27:02 +01:00
Daniel Vetter	c2fb791692	Linux 3.7-rc2 -----BEGIN PGP SIGNATURE----- Version: GnuPG v1.4.12 (GNU/Linux) iQEcBAABAgAGBQJQgvdwAAoJEHm+PkMAQRiG+3AH/i2XsqqN3VctL0nnbWfvds+Q aKulfIdJTjKiVAsawPUtRqReZ8ijiebrgA/53lZLlrFOoPPQ5+LHmnSyQF6gErOY NuAE1lijXDRM1pwBlhvOBbAj26wUobGjqONFJ9OkKr758Ue8ds/Q7UdxyEgmYgmg tvVMzfRcICzryUV3PcqL+3cNPpCUdT6wGGRJ9DCv/jvGiWKExWhOle5oltrmxk+D NsqRcws5pEubfHE4J8BvNWr8lE1kHfYVhrJETiLJUiN2XAJcbI4Jy7rU/3EGteNS 0HMZdaPPjV874lohdM70X2225SbYrCVkAYB5hnZCTeC3tYyCawBBPMQoyAiOcmU= =+861 -----END PGP SIGNATURE----- Merge tag 'v3.7-rc2' into drm-intel-next-queued Linux 3.7-rc2 Backmerge to solve two ugly conflicts: - uapi. We've already added new ioctl definitions for -next. Do I need to say more? - wc support gtt ptes. We've had to revert this for snb+ for 3.7 and also fix a few other things in the code. Now we know how to make it work on snb+, but to avoid losing the other fixes do the backmerge first before re-enabling wc gtt ptes on snb+. And a few other minor things, among them git getting confused in intel_dp.c and seemingly causing a conflict out of nothing ... Conflicts: drivers/gpu/drm/i915/i915_reg.h drivers/gpu/drm/i915/intel_display.c drivers/gpu/drm/i915/intel_dp.c drivers/gpu/drm/i915/intel_modes.c include/drm/i915_drm.h Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>	2012-10-22 14:34:51 +02:00
David Howells	718dcedd7e	UAPI: (Scripted) Disintegrate include/drm Signed-off-by: David Howells <dhowells@redhat.com> Acked-by: Arnd Bergmann <arnd@arndb.de> Acked-by: Thomas Gleixner <tglx@linutronix.de> Acked-by: Michael Kerrisk <mtk.manpages@gmail.com> Acked-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com> Acked-by: Dave Jones <davej@redhat.com>	2012-10-04 18:21:50 +01:00

43 Commits