linux_dsm_epyc7002

mirror of https://github.com/AuxXxilium/linux_dsm_epyc7002.git synced 2024-12-22 23:33:22 +07:00

Author	SHA1	Message	Date
Chris Wilson	ac14fbd460	drm/i915/scheduler: Support user-defined priorities Use a priority stored in the context as the initial value when submitting a request. This allows us to change the default priority on a per-context basis, allowing different contexts to be favoured with GPU time at the expense of lower importance work. The user can adjust the context's priority via I915_CONTEXT_PARAM_PRIORITY, with more positive values being higher priority (they will be serviced earlier, after their dependencies have been resolved). Any prerequisite work for an execbuf will have its priority raised to match the new request as required. Normal users can specify any value in the range of -1023 to 0 [default], i.e. they can reduce the priority of their workloads (and temporarily boost it back to normal if so desired). Privileged users can specify any value in the range of -1023 to 1023, [default is 0], i.e. they can raise their priority above all overs and so potentially starve the system. Note that the existing schedulers are not fair, nor load balancing, the execution is strictly by priority on a first-come, first-served basis, and the driver may choose to boost some requests above the range available to users. This priority was originally based around nice(2), but evolved to allow clients to adjust their priority within a small range, and allow for a privileged high priority range. For example, this can be used to implement EGL_IMG_context_priority https://www.khronos.org/registry/egl/extensions/IMG/EGL_IMG_context_priority.txt EGL_CONTEXT_PRIORITY_LEVEL_IMG determines the priority level of the context to be created. This attribute is a hint, as an implementation may not support multiple contexts at some priority levels and system policy may limit access to high priority contexts to appropriate system privilege level. The default value for EGL_CONTEXT_PRIORITY_LEVEL_IMG is EGL_CONTEXT_PRIORITY_MEDIUM_IMG." so we can map PRIORITY_HIGH -> 1023 [privileged, will failback to 0] PRIORITY_MED -> 0 [default] PRIORITY_LOW -> -1023 They also map onto the priorities used by VkQueue (and a VkQueue is essentially a timeline, our i915_gem_context under full-ppgtt). v2: s/CAP_SYS_ADMIN/CAP_SYS_NICE/ v3: Report min/max user priorities as defines in the uapi, and rebase internal priorities on the exposed values. Testcase: igt/gem_exec_schedule Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com> Reviewed-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20171003203453.15692-9-chris@chris-wilson.co.uk	2017-10-04 17:52:46 +01:00
Chris Wilson	e7af311683	drm/i915: Introduce a preempt context Add another perma-pinned context for using for preemption at any time. We cannot just reuse the existing kernel context, as first and foremost we need to ensure that we can preempt the kernel context itself, so require a distinct context id. Similar to the kernel context, we may want to interrupt execution and switch to the preempt context at any time, and so it needs to be permanently pinned and available. To compensate for yet another permanent allocation, we shrink the existing context and the new context by reducing their ringbuffer to the minimum. v2: Assert that we never allocate a request from the preemption context. v3: Limit perma-pin to engines that may preempt. v4: Onion cleanup for early driver death v5: Onion ordering in main driver cleanup as well. Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Reviewed-by: Michał Winiarski <michal.winiarski@intel.com> Reviewed-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20171003203453.15692-4-chris@chris-wilson.co.uk	2017-10-04 17:52:45 +01:00
Michal Wajdeczko	4f044a88a8	drm/i915: Rename global i915 to i915_modparams Our global struct with params is named exactly the same way as new preferred name for the drm_i915_private function parameter. To avoid such name reuse lets use different name for the global. v5: pure rename v6: fix Credits-to: Coccinelle @@ identifier n; @@ ( - i915.n + i915_modparams.n ) Signed-off-by: Michal Wajdeczko <michal.wajdeczko@intel.com> Cc: Jani Nikula <jani.nikula@intel.com> Cc: Chris Wilson <chris@chris-wilson.co.uk> Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com> Cc: Ville Syrjala <ville.syrjala@intel.com> Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com> Acked-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com> Signed-off-by: Jani Nikula <jani.nikula@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20170919193846.38060-1-michal.wajdeczko@intel.com	2017-09-22 14:50:36 +03:00
Chris Wilson	d1b48c1e71	drm/i915: Replace execbuf vma ht with an idr This was the competing idea long ago, but it was only with the rewrite of the idr as an radixtree and using the radixtree directly ourselves, along with the realisation that we can store the vma directly in the radixtree and only need a list for the reverse mapping, that made the patch performant enough to displace using a hashtable. Though the vma ht is fast and doesn't require any extra allocation (as we can embed the node inside the vma), it does require a thread for resizing and serialization and will have the occasional slow lookup. That is hairy enough to investigate alternatives and favour them if equivalent in peak performance. One advantage of allocating an indirection entry is that we can support a single shared bo between many clients, something that was done on a first-come first-serve basis for shared GGTT vma previously. To offset the extra allocations, we create yet another kmem_cache for them. Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20170816085210.4199-5-chris@chris-wilson.co.uk	2017-08-18 11:59:02 +01:00
Chris Wilson	12124bea5b	drm/i915: Suppress switch_mm emission between the same aliasing_ppgtt When switching between contexts using the aliasing_ppgtt, the VM is shared. We don't need to reload the PD registers unless they are dirty. Martin Peres reported an issue that looks like corruption between Haswell context switches, bisecting to commit `f9326be5f1` ("drm/i915: Rearrange switch_context to load the aliasing ppgtt on first use"). Switching between the same mm (the aliasing_ppgtt is used for all contexts in this case) should be a nop, but appears to trigger some side-effects in the context switch. However, as we know the switch is redundant in this case, we can skip it and continue to ignore the issue until somebody feels strong enough to investigate full-ppgtt on gen7 again! Except.. Martin was using full-ppgtt which is not supported as it doesn't work correctly yet. So whilst the bisect did yield valuable information about the failures, the fix should not have any user impact under default settings, with the exception of a slightly lower throughput on xcs as the VM would always be reloaded. v2: Also remember to set the legacy_active_context following the switch on xcs (commit `e8a9c58fcd` ("drm/i915: Unify active context tracking between legacy/execlists/guc")) Fixes: `f9326be5f1` ("drm/i915: Rearrange switch_context to load the aliasing ppgtt on first use") Fixes: `e8a9c58fcd` ("drm/i915: Unify active context tracking between legacy/execlists/guc") Reported-by: Martin Peres <martin.peres@linux.intel.com> Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Cc: Martin Peres <martin.peres@linux.intel.com> Reviewed-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20170812152724.6883-1-chris@chris-wilson.co.uk	2017-08-12 16:50:12 +01:00
Chris Wilson	77b25a972b	drm/i915: Make i915_gem_context_mark_guilty() safe for unlocked updates Since we make call i915_gem_context_mark_guilty() concurrently when resetting different engines in parallel, we need to make sure that our updates are safe for the unlocked access. Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Cc: Michel Thierry <michel.thierry@intel.com> Cc: Mika Kuoppala <mika.kuoppala@linux.intel.com> Reviewed-by: Mika Kuoppala <mika.kuoppala@linux.intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20170721123238.16428-12-chris@chris-wilson.co.uk Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>	2017-07-27 09:38:47 +02:00
Chris Wilson	cb0aeaa818	drm/i915: Only free the oldest stale context before allocating Currently, we move all unreferenced contexts to an RCU free list and then onto a worker for eventual reaping. To compensate against this growing into a long list with frequent allocations starving the system of available memory, before we allocate a new context we reap all the stale contexts. This puts all the cost of destroying the context into the next allocator, which is presumably more sensitive to syscall latency and unfair. We can limit the number of contexts being freed by the new allocator to both keep the list trimmed and to allow the allocator to be reasonably fast. Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com> Link: http://patchwork.freedesktop.org/patch/msgid/20170705142634.18554-4-chris@chris-wilson.co.uk Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>	2017-07-06 11:51:52 +01:00
Chris Wilson	6b6573d114	drm/i915: Drop request retirement before reaping stale contexts Before we create a new context, we try and reap all the stale contexts (i.e. those that are freed but waiting for a worker to come and return their allocations to the system). Before we do this, we retire all requests so that we clear any inflight no longer used contexts (who are only being kept alived by those inflght requests). However, any context that is finally unreferenced by this retirement is put onto an RCU list and not available for immediately reaping, we stall for no immediate benefit. Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com> Link: http://patchwork.freedesktop.org/patch/msgid/20170705142634.18554-3-chris@chris-wilson.co.uk Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>	2017-07-06 11:51:18 +01:00
Chris Wilson	ddfc925851	drm/i915: Move stale context reaping to common i915_gem_context_create We need to reap the stale contexts for all new contexts, be they created by user in i915_gem_context_ioctl or from opening a new file in i915_gem_context_open. Both paths may be called very frequently accumulating many stale contexts before any worker has a chance to run and free their memory. Fixes: `1acfc104cd` ("drm/i915: Enable rcu-only context lookups") Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com> Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com> Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com> Link: http://patchwork.freedesktop.org/patch/msgid/20170705142634.18554-2-chris@chris-wilson.co.uk	2017-07-06 11:50:52 +01:00
Chris Wilson	e4d5dc218c	drm/i915: Check new context against kernel_context after reporting an error Avoid any pointer dereference in inspecting a potential PTR_ERR by checking for the error pointer before checking for an invalid context. Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com> Link: http://patchwork.freedesktop.org/patch/msgid/20170705142634.18554-1-chris@chris-wilson.co.uk	2017-07-06 11:50:47 +01:00
Chris Wilson	fad2083483	drm/i915: Fix use-after-free of context during free_contexts When iterating the list of contexts to free, we need to use a safe iterator as we are freeing the link as we go. Pass an extra thick brown paper bag. Fixes: `5f09a9c8ab` ("drm/i915: Allow contexts to be unreferenced locklessly") Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com> Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com> Cc: Matthew Auld <matthew.auld@intel.com> Link: http://patchwork.freedesktop.org/patch/msgid/20170630230517.1938-1-chris@chris-wilson.co.uk Reviewed-by: Matthew Auld <matthew.auld@intel.com>	2017-07-04 11:55:27 +01:00
Chris Wilson	1acfc104cd	drm/i915: Enable rcu-only context lookups Whilst the contents of the context is still protected by the big struct_mutex, this is not much of an improvement. It is just one tiny step towards reducing our BKL. Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Reviewed-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com> Link: http://patchwork.freedesktop.org/patch/msgid/20170620110547.15947-3-chris@chris-wilson.co.uk	2017-06-20 17:13:54 +01:00
Chris Wilson	5f09a9c8ab	drm/i915: Allow contexts to be unreferenced locklessly If we move the actual cleanup of the context to a worker, we can allow the final free to be called from any context and avoid undue latency in the caller. v2: Negotiate handling the delayed contexts free by flushing the workqueue before calling i915_gem_context_fini() and performing the final free of the kernel context directly v3: Flush deferred frees before new context allocations Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Reviewed-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com> Link: http://patchwork.freedesktop.org/patch/msgid/20170620110547.15947-2-chris@chris-wilson.co.uk	2017-06-20 17:13:47 +01:00
Chris Wilson	829a0af29f	drm/i915: Group all the global context information together Create a substruct to hold all the global context state under drm_i915_private. Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Reviewed-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com> Link: http://patchwork.freedesktop.org/patch/msgid/20170620110547.15947-1-chris@chris-wilson.co.uk	2017-06-20 17:13:40 +01:00
Chris Wilson	dade2a6165	drm/i915: Store a persistent reference for an object in the execbuffer cache If we take a reference to the object/vma when it is first used in an execbuf, we can keep that reference until the object's file-local handle is closed. Thereby saving a frequent ref/unref pair. Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Reviewed-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>	2017-06-16 16:54:05 +01:00
Chris Wilson	4ff4b44cbb	drm/i915: Store a direct lookup from object handle to vma The advent of full-ppgtt lead to an extra indirection between the object and its binding. That extra indirection has a noticeable impact on how fast we can convert from the user handles to our internal vma for execbuffer. In order to bypass the extra indirection, we use a resizable hashtable to jump from the object to the per-ctx vma. rhashtable was considered but we don't need the online resizing feature and the extra complexity proved to undermine its usefulness. Instead, we simply reallocate the hastable on demand in a background task and serialize it before iterating. In non-full-ppgtt modes, multiple files and multiple contexts can share the same vma. This leads to having multiple possible handle->vma links, so we only use the first to establish the fast path. The majority of buffers are not shared and so we should still be able to realise speedups with multiple clients. v2: Prettier names, more magic. v3: Many style tweaks, most notably hiding the misuse of execobj[].rsvd2 Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Reviewed-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>	2017-06-16 16:54:04 +01:00
Chris Wilson	4c9c0d0974	drm/i915: Fix retrieval of hangcheck stats The default context is always supported (as it contains the global hangcheck stats) and the contexts for hangcheck are not limited to any ring. This was dropped in 2013 because it was supposed to have been included with Ben's full-ppgtt patch set. It never landed and the bug remains. References: https://bugs.freedesktop.org/show_bug.cgi?id=65845 Link: http://patchwork.freedesktop.org/patch/msgid/1372175222-27622-1-git-send-email-mika.kuoppala@intel.com Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Reviewed-by: Mika Kuoppala <mika.kuoppala@intel.com> Link: http://patchwork.freedesktop.org/patch/msgid/20170616132849.29597-1-chris@chris-wilson.co.uk	2017-06-16 16:22:43 +01:00
Chris Wilson	e4f815f6bf	drm/i915: Use a define for the default priority [0] Explicitly assign the default priority, and give it a name. After much discussion, we have chosen to call it I915_PRIORITY_NORMAL! Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Reviewed-by: Mika Kuoppala <mika.kuoppala@intel.com> Link: http://patchwork.freedesktop.org/patch/msgid/20170517121007.27224-7-chris@chris-wilson.co.uk	2017-05-17 13:38:08 +01:00
Joonas Lahtinen	63ffbcdadc	drm/i915: Sanitize engine context sizes Pre-calculate engine context size based on engine class and device generation and store it in the engine instance. v2: - Squash and get rid of hw_context_size (Chris) v3: - Move after MMIO init for probing on Gen7 and 8 (Chris) - Retained rounding (Tvrtko) v4: - Rebase for deferred legacy context allocation Signed-off-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com> Cc: Paulo Zanoni <paulo.r.zanoni@intel.com> Cc: Rodrigo Vivi <rodrigo.vivi@intel.com> Cc: Chris Wilson <chris@chris-wilson.co.uk> Cc: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com> Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com> Cc: Oscar Mateo <oscar.mateo@intel.com> Cc: Zhenyu Wang <zhenyuw@linux.intel.com> Cc: intel-gvt-dev@lists.freedesktop.org Acked-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com> Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com> Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>	2017-04-28 12:11:59 +03:00
Chris Wilson	3204c343bb	drm/i915: Defer context state allocation for legacy ring submission Almost from the outset for execlists, we used deferred allocation of the logical context and rings. Then we ported the infrastructure for pinning contexts back to legacy, and so now we are able to also implement deferred allocation for context objects prior to first use on the legacy submission. v2: We still need to differentiate between legacy engines, Joonas is fixing that but I want this first ;) (Joonas) Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com> Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com> Cc: Mika Kuoppala <mika.kuoppala@linux.intel.com> Link: http://patchwork.freedesktop.org/patch/msgid/20170427104651.22394-1-chris@chris-wilson.co.uk Reviewed-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>	2017-04-27 12:22:13 +01:00
Chris Wilson	e02d9d76bf	drm/i915: Disable MI_SET_CONTEXT psmi w/a for bdw The current w/a for the gen7 psmi related hangs doesn't apply to bdw, so disable it if using bdw ringbuffer submission. Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Link: http://patchwork.freedesktop.org/patch/msgid/20170324151724.32640-1-chris@chris-wilson.co.uk Reviewed-by: Michał Winiarski <michal.winiarski@intel.com>	2017-03-24 17:10:10 +00:00
Chris Wilson	e555e32645	drm/i915: Remove superfluous hw_flags from mi_set_context() Why have both hw_flags and flags, when just one will do? Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Link: http://patchwork.freedesktop.org/patch/msgid/20170322210350.6208-1-chris@chris-wilson.co.uk Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>	2017-03-23 09:43:14 +00:00
Chris Wilson	e642c85b03	drm/i915: Remove superfluous i915_add_request_no_flush() helper The only time we need to emit a flush inside request emission is after an execbuffer, for which we can use the full __i915_add_request(). All other instances want the simpler i915_add_request() without flushing, so remove the useless helper. Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Reviewed-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com> Link: http://patchwork.freedesktop.org/patch/msgid/20170317114709.8388-1-chris@chris-wilson.co.uk	2017-03-17 13:03:25 +00:00
Changbin Du	3fc03069bc	drm/i915: make context status notifier head be per engine GVTg has introduced the context status notifier to schedule the GVTg workload. At that time, the notifier is bound to GVTg context only, so GVTg is not aware of host workloads. Now we are going to improve GVTg's guest workload scheduler policy, and add Guc emulation support for new Gen graphics. Both these two features require acknowledgment for all contexts running on hardware. (But will not alter host workload.) So here try to make some change. The change is simple: 1. Move the context status notifier head from i915_gem_context to intel_engine_cs. Which means there is a notifier head per engine instead of per context. Execlist driver still call notifier for each context sched-in/out events of current engine. 2. At GVTg side, it binds a notifier_block for each physical engine at GVTg initialization period. Then GVTg can hear all context status events. In this patch, GVTg do nothing for host context event, but later will add a function there. But in any case, the notifier callback is a noop if this is no active vGPU. Since intel_gvt_init() is called at early initialization stage and require the status notifier head has been initiated, I initiate it in intel_engine_setup(). v2: remove a redundant newline. (chris) Fixes: `3c7ba6359d` ("drm/i915: Introduce execlist context status change notification") Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=100232 Signed-off-by: Changbin Du <changbin.du@intel.com> Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com> Cc: Tvrtko Ursulin <tvrtko.ursulin@linux.intel.com> Cc: Zhi Wang <zhi.a.wang@intel.com> Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk> Link: http://patchwork.freedesktop.org/patch/msgid/20170313024711.28591-1-changbin.du@intel.com Acked-by: Zhenyu Wang <zhenyuw@linux.intel.com> Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>	2017-03-16 16:24:35 +00:00
Chris Wilson	afeddf5081	drm/i915: Reduce context alignment No hardware was ever shipped that needed more than 4096 byte alignment and future hardware will not use this legacy path. So reduce the alignment to make it easier and quicker to launch workloads. Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Reviewed-by: Mika Kuoppala <mika.kuoppala@linux.intel.com> Link: http://patchwork.freedesktop.org/patch/msgid/20170227135913.8056-3-chris@chris-wilson.co.uk	2017-02-27 16:01:47 +00:00
Chris Wilson	20fe17aa52	drm/i915: Remove redundant TLB invalidate on switching contexts We are required to reload the TLBs around context switches (MI_SET_CONTEXT specifically) and the recommendation is do that before the MI_SET_CONTEXT so that it is serialised with the switch and not forgotten: [DevSNB] If Flush TLB invalidation Mode is enabled it’s the driver’s responsibility to invalidate the TLBs at least once after the previous context switch after any GTT mappings changed (including new GTT entries). This can be done by a pipeline PIPE_CONTROL with TLB inv bit set immediately before MI_SET_CONTEXT. However, we already do an unconditional TLB invalidate before every batch so this condition is satifisfied. Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Reviewed-by: Mika Kuoppala <mika.kuoppala@linux.intel.com> Link: http://patchwork.freedesktop.org/patch/msgid/20170227135913.8056-1-chris@chris-wilson.co.uk	2017-02-27 16:01:45 +00:00
Chuanxiao Dong	718e884a01	drm/i915/gvt: set ring buffer size to default for guc submission When not using GuC submission, the ring buffer size for GVT context is 512KB which is the max size. When switching to GuC submission, the ring buffer size is required to be less than 16KB. So use the GVT context default ring buffer size if GuC submission is enabled. Signed-off-by: Chuanxiao Dong <chuanxiao.dong@intel.com> Link: http://patchwork.freedesktop.org/patch/msgid/20170216063639.GA17107@intel.com Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk> Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>	2017-02-22 10:17:56 +00:00
Tvrtko Ursulin	a937eaf824	drm/i915: Fix uninitialized return from mi_set_context For some reason my compiler (and CI as well) failed to spot the uninitialized ret in mi_set_context. Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com> Fixes: `73dec95e6b` ("drm/i915: Emit to ringbuffer directly") Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com> Cc: Chris Wilson <chris@chris-wilson.co.uk> Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com> Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk> Link: http://patchwork.freedesktop.org/patch/msgid/20170214152901.20361-1-tvrtko.ursulin@linux.intel.com	2017-02-14 19:34:54 +00:00
Tvrtko Ursulin	73dec95e6b	drm/i915: Emit to ringbuffer directly This removes the usage of intel_ring_emit in favour of directly writing to the ring buffer. intel_ring_emit was preventing the compiler for optimising fetch and increment of the current ring buffer pointer and therefore generating very verbose code for every write. It had no useful purpose since all ringbuffer operations are started and ended with intel_ring_begin and intel_ring_advance respectively, with no bail out in the middle possible, so it is fine to increment the tail in intel_ring_begin and let the code manage the pointer itself. Useless instruction removal amounts to approximately two and half kilobytes of saved text on my build. Not sure if this has any measurable performance implications but executing a ton of useless instructions on fast paths cannot be good. v2: * Change return from intel_ring_begin to error pointer by popular demand. * Move tail increment to intel_ring_advance to enable some error checking. v3: * Move tail advance back into intel_ring_begin. * Rebase and tidy. v4: * Complete rebase after a few months since v3. v5: * Remove unecessary cast and fix !debug compile. (Chris Wilson) v6: * Make intel_ring_offset take request as well. * Fix recording of request postfix plus a sprinkle of asserts. (Chris Wilson) v7: * Use intel_ring_offset to get the postfix. (Chris Wilson) * Convert GVT code as well. v8: * Rename out++ to cs++. v9: * Fix GVT out to cs conversion in GVT. v10: * Rebase for new intel_ring_begin in selftests. Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com> Cc: Chris Wilson <chris@chris-wilson.co.uk> Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com> Cc: Zhi Wang <zhi.a.wang@intel.com> Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk> Acked-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com> Link: http://patchwork.freedesktop.org/patch/msgid/20170214113242.29241-1-tvrtko.ursulin@linux.intel.com	2017-02-14 14:30:46 +00:00
Chris Wilson	791ff39ae3	drm/i915: Live testing for context execution Check we can create and execution within a context. v2: Write one set of dwords through each context/engine to exercise more contexts within the same time period. Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Reviewed-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com> Link: http://patchwork.freedesktop.org/patch/msgid/20170213171558.20942-38-chris@chris-wilson.co.uk	2017-02-13 20:46:44 +00:00
Chris Wilson	0daf0113cf	drm/i915: Mock infrastructure for request emission Create a fake engine that runs requests using a timer to simulate hw. v2: Prevent leaks of ctx->name along error paths Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com> Link: http://patchwork.freedesktop.org/patch/msgid/20170213171558.20942-8-chris@chris-wilson.co.uk	2017-02-13 20:45:31 +00:00
Chris Wilson	949e8ab3a9	drm/i915: Use the size/type of address space to make decisions Once the address space has been created (using 3 or 4 levels of page tables), we should use that to program the appropriate type into the contexts. This gives us the flexibility to handle different types of address spaces at runtime. Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Cc: Mika Kuoppala <mika.kuoppala@intel.com> Cc: Matthew Auld <matthew.auld@intel.com> Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com> Link: http://patchwork.freedesktop.org/patch/msgid/20170209144036.23664-1-chris@chris-wilson.co.uk Reviewed-by: Mika Kuoppala <mika.kuoppala@intel.com>	2017-02-09 17:09:27 +00:00
Joonas Lahtinen	6d1f9fb312	drm/i915: Add __destroy_hw_context __create_hw_context can use a good counterpart. Signed-off-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com> Cc: Chris Wilson <chris@chris-wilson.co.uk> Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk> Link: http://patchwork.freedesktop.org/patch/msgid/1486640065-13695-1-git-send-email-joonas.lahtinen@linux.intel.com	2017-02-09 15:33:36 +02:00
Mika Kuoppala	2355cf088d	drm/i915: Create context desc template when context is created Move the invariant parts of context desc setup from execlist init to context creation. This is advantageous when we need to create different templates based on the context parametrization, ie. for svm capable contexts. v2: s/create/default, remove engine->ctx_desc_template Cc: Chris Wilson <chris@chris-wilson.co.uk> Signed-off-by: Mika Kuoppala <mika.kuoppala@intel.com> Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk> Signed-off-by: Mika Kuoppala <mika.kuoppala@intel.com> Link: http://patchwork.freedesktop.org/patch/msgid/1485522189-31984-1-git-send-email-mika.kuoppala@intel.com	2017-01-30 16:36:20 +02:00
Chris Wilson	5d12fcef4e	drm/i915: Assert that the kernel_context is hw-id 0 For easy recognisability, we want the kernel context to have id 0 and all user contexts to have non-zero ids. Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Reviewed-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com> Link: http://patchwork.freedesktop.org/patch/msgid/20170123113132.18665-1-chris@chris-wilson.co.uk	2017-01-23 11:53:32 +00:00
Chris Wilson	a01cb37aff	drm/i915: Remove i915_vma_create from VMA API With the introduce of i915_vma_instance() for obtaining the VMA singleton for a (obj, vm, view) tuple, we can remove the i915_vma_create() in favour of a single entry point. We do incur a lookup onto an empty tree, but the i915_vma_create() were being called infrequently and during initialisation, so the small overhead is negligible. v2: Drop the i915_ prefix from the now static vma_create() function Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Reviewed-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com> Link: http://patchwork.freedesktop.org/patch/msgid/20170116152131.18089-4-chris@chris-wilson.co.uk	2017-01-19 10:17:39 +00:00
Chris Wilson	f131e3562e	drm/i915: Skip switch to kernel context if already done Some engines are never user or already sitting idle in the kernel context and for those we can skip flushing the current context for i915_gem_switch_to_kernel_context(). We used to perform this optimisation but that was removed for convenience of converting over to multiple timelines and handling the pending request queues. From the perspective of writing selftests, reducing the number of background operations on the engines makes defining assertions easier. Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Link: http://patchwork.freedesktop.org/patch/msgid/20170114162334.10271-2-chris@chris-wilson.co.uk Reviewed-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>	2017-01-16 12:21:20 +00:00
Chris Wilson	0c7eeda1af	drm/i915: Move i915_ppgtt_close() into i915_gem_gtt.c Move it alongside its ppgtt counterparts, in order to make it available for the ppgtt selftests. Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Link: http://patchwork.freedesktop.org/patch/msgid/20170111210937.29252-26-chris@chris-wilson.co.uk Reviewed-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>	2017-01-13 16:35:27 +00:00
Chris Wilson	f51455d442	drm/i915: Replace 4096 with PAGE_SIZE or I915_GTT_PAGE_SIZE Start converting over from the byte count to its semantic macro, either we want to allocate the size of a physical page in main memory or we want the size of a virtual page in the GTT. 4096 could mean either, but PAGE_SIZE and I915_GTT_PAGE_SIZE are explicit and should help improve code comprehension and future changes. In the future, we may want to use variable GTT page sizes and so have the challenge of knowing which hardcoded values were used to represent a physical page vs the virtual page. v2: Look for a few more 4096s to convert, discover IS_ALIGNED(). v3: 4096ul paranoia, make fence alignment a distinct value of 4096, keep bdw stolen w/a as 4096 until we know better. v4: Add asserts that i915_vma_insert() start/end are aligned to GTT page sizes. Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Link: http://patchwork.freedesktop.org/patch/msgid/20170110144734.26052-1-chris@chris-wilson.co.uk Reviewed-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>	2017-01-10 20:54:32 +00:00
Chris Wilson	984ff29f74	drm/i915: Simplify testing for am-I-the-kernel-context? The kernel context (dev_priv->kernel_context) is unique in that it is not associated with any user filp - it is the only one with ctx->file_priv == NULL. This is a simpler test than comparing it against dev_priv->kernel_context which involves some pointer dancing. In checking that this is true, we notice that the gvt context is allocating itself a i915_hw_ppgtt it doesn't use and not flagging that its file_priv should be invalid. Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com> Link: http://patchwork.freedesktop.org/patch/msgid/20170106152013.24684-5-chris@chris-wilson.co.uk	2017-01-06 16:02:12 +00:00
Chris Wilson	6095868a27	drm/i915: Complete kerneldoc for struct i915_gem_context The existing kerneldoc was outdated, so time for a refresh. v2: Use single line kdoc, mention functions for manipulation Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Reviewed-by: Daniel Vetter <daniel.vetter@ffwll.ch> Link: http://patchwork.freedesktop.org/patch/msgid/20161231112012.29263-3-chris@chris-wilson.co.uk	2016-12-31 11:41:47 +00:00
Daniele Ceraolo Spurio	d3ef1af6fd	drm/i915: request ring to be pinned above GUC_WOPCM_TOP GuC will validate the ring offset and fail if it is in the [0, GUC_WOPCM_TOP) range. The bias is conditionally applied only if GuC loading is enabled (we can't check for guc submission enabled as in other cases because HuC loading requires this fix). Note that the default context is processed before enable_guc_loading is sanitized, so we might still apply the bias to its ring even if it is not needed. v2: compute the value during ctx init and pass it to intel_ring_pin (Chris), updated commit message Signed-off-by: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com> Cc: Chris Wilson <chris@chris-wilson.co.uk> Cc: Michal Wajdeczko <michal.wajdeczko@intel.com> Cc: Arkadiusz Hiler <arkadiusz.hiler@intel.com> Cc: Anusha Srivatsa <anusha.srivatsa@intel.com> Cc: Michał Winiarski <michal.winiarski@intel.com> Link: http://patchwork.freedesktop.org/patch/msgid/1482537382-28584-1-git-send-email-daniele.ceraolospurio@intel.com Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk> Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>	2016-12-24 10:06:59 +00:00
Chris Wilson	70ffe9956c	drm/i915: Mark the shadow gvt context as closed As the shadow gvt is not user accessible and does not have an associated vm, we can mark it as closed during its construction. This saves leaking the internal knowledge of i915_gem_context into gvt/. Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com> Link: http://patchwork.freedesktop.org/patch/msgid/20161218153724.8439-5-chris@chris-wilson.co.uk	2016-12-18 16:18:54 +00:00
Chris Wilson	e8a9c58fcd	drm/i915: Unify active context tracking between legacy/execlists/guc The requests conversion introduced a nasty bug where we could generate a new request in the middle of constructing a request if we needed to idle the system in order to evict space for a context. The request to idle would be executed (and waited upon) before the current one, creating a minor havoc in the seqno accounting, as we will consider the current request to already be completed (prior to deferred seqno assignment) but ring->last_retired_head would have been updated and still could allow us to overwrite the current request before execution. We also employed two different mechanisms to track the active context until it was switched out. The legacy method allowed for waiting upon an active context (it could forcibly evict any vma, including context's), but the execlists method took a step backwards by pinning the vma for the entire active lifespan of the context (the only way to evict was to idle the entire GPU, not individual contexts). However, to circumvent the tricky issue of locking (i.e. we cannot take struct_mutex at the time of i915_gem_request_submit(), where we would want to move the previous context onto the active tracker and unpin it), we take the execlists approach and keep the contexts pinned until retirement. The benefit of the execlists approach, more important for execlists than legacy, was the reduction in work in pinning the context for each request - as the context was kept pinned until idle, it could short circuit the pinning for all active contexts. We introduce new engine vfuncs to pin and unpin the context respectively. The context is pinned at the start of the request, and only unpinned when the following request is retired (this ensures that the context is idle and coherent in main memory before we unpin it). We move the engine->last_context tracking into the retirement itself (rather than during request submission) in order to allow the submission to be reordered or unwound without undue difficultly. And finally an ulterior motive for unifying context handling was to prepare for mock requests. v2: Rename to last_retired_context, split out legacy_context tracking for MI_SET_CONTEXT. Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com> Link: http://patchwork.freedesktop.org/patch/msgid/20161218153724.8439-3-chris@chris-wilson.co.uk	2016-12-18 16:18:50 +00:00
Tvrtko Ursulin	cb15d9f8c3	drm/i915: More GEM init dev_priv cleanup Simplifies the code to pass the right parameter in. v2: Commit message. (Joonas Lahtinen) Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com> Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk> Reviewed-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>	2016-12-01 18:01:16 +00:00
Tvrtko Ursulin	bf9e8429ab	drm/i915: Make various init functions take dev_priv Like GEM init, GUC init, MOCS init and context creation. Enables them to lose dev_priv locals. Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com> Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk> Reviewed-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>	2016-12-01 18:01:15 +00:00
Tvrtko Ursulin	12d79d7828	drm/i915: Make GEM object create and create from data take dev_priv Makes all GEM object constructors consistent. v2: Fix compilation in GVT code. Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com> Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk> Reviewed-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com> (v1)	2016-12-01 18:01:08 +00:00
Tvrtko Ursulin	793b61ea9f	drm/i915: i915_gem_alloc_context_obj can be static It has only one call site from the same file. v2: Also rename it to alloc_context_obj. (Chris Wilson) Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com> Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk> Cc: Chris Wilson <chris@chris-wilson.co.uk> Link: http://patchwork.freedesktop.org/patch/msgid/1479898155-21014-1-git-send-email-tvrtko.ursulin@linux.intel.com	2016-11-23 14:23:00 +00:00
Mika Kuoppala	bc1d53c647	drm/i915: Wipe hang stats as an embedded struct Bannable property, banned status, guilty and active counts are properties of i915_gem_context. Make them so. v2: rebase Cc: Chris Wilson <chris@chris-wilson.co.uk> Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk> Signed-off-by: Mika Kuoppala <mika.kuoppala@intel.com> Link: http://patchwork.freedesktop.org/patch/msgid/1479309634-28574-1-git-send-email-mika.kuoppala@intel.com	2016-11-21 14:36:56 +02:00
Mika Kuoppala	b083a0870c	drm/i915: Add per client max context ban limit If we have a bad client submitting unfavourably across different contexts, creating new ones, the per context scoring of badness doesn't remove the root cause, the offending client. To counter, keep track of per client context bans. Deny access if client is responsible for more than 3 context bans in it's lifetime. v2: move ban check to context create ioctl (Chris) v3: add commentary about hangs needed to reach client ban (Chris) Cc: Chris Wilson <chris@chris-wilson.co.uk> Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk> Signed-off-by: Mika Kuoppala <mika.kuoppala@intel.com>	2016-11-21 14:36:40 +02:00

1 2 3 4 5 ...

302 Commits