linux_dsm_epyc7002

mirror of https://github.com/AuxXxilium/linux_dsm_epyc7002.git synced 2024-12-22 21:47:04 +07:00

Author	SHA1	Message	Date
Chris Wilson	3fef5cda97	drm/i915: Automatic i915_switch_context for legacy During request construction, after pinning the context we know whether or not we have to emit a context switch. So move this common operation from every caller into i915_gem_request_alloc() itself. v2: Always submit the request if we emitted some commands during request construction, as typically it also involves changes in global state. Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Reviewed-by: Mika Kuoppala <mika.kuoppala@linux.intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20171120102002.22254-2-chris@chris-wilson.co.uk	2017-11-20 15:56:16 +00:00
Chris Wilson	2113184c6f	drm/i915: Pull the unconditional GPU cache invalidation into request construction As the request will, in the following patch, implicitly invoke a context-switch on construction, we should precede that with a GPU TLB invalidation. Also, even before using GGTT, we always want to invalidate the TLBs for any updates (as well as the ppgtt invalidates that are unconditionally applied by execbuf). Since we almost always require the TLB invalidate, do it unconditionally on request allocation and so we can remove it from all other paths. Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Cc: Mika Kuoppala <mika.kuoppala@linux.intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20171120102002.22254-1-chris@chris-wilson.co.uk Reviewed-by: Mika Kuoppala <mika.kuoppala@linux.intel.com>	2017-11-20 15:56:16 +00:00
Chris Wilson	223c73a366	drm/i915/selftests: Report ENOMEM clearly for an allocation failure If we can not run the drunk_hole test because we couldn't allocate the memory for the permutation array (even after we tried trimming the size), report a clear ENOMEM. Similary, if we are asked to operate on a hole too small for ourselves, make it skip quietly. v2: Avoid malloc(0) since that returns ZERO_SIZE_PTR not NULL. v3: Fixup similar construction for lowlevel_hole v4: Use u64 >> 1 to avoid 64b div. Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Cc: Matthew Auld <matthew.auld@intel.com> Cc: Mika Kuoppala <mika.kuoppala@linux.intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20171117101732.4335-1-chris@chris-wilson.co.uk Reviewed-by: Matthew Auld <matthew.auld@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20171117162945.16390-1-chris@chris-wilson.co.uk	2017-11-17 18:20:36 +00:00
Michel Thierry	55bd6bd757	drm/i915/selftests: Add a GuC doorbells selftest The first test aims to check guc_init_doorbell_hw, changing the existing guc clients and doorbells state before calling it. The second test tries to create as many clients as it is currently possible (currently limited to max number of doorbells) and exercise the doorbell alloc/dealloc code. Since our usage mode require very few clients/doorbells, this code has been exercised very lightly and it's good to have a simple test for it. As reference, this test already helped identify the bug fixed by commit `7f1ea2ac30` ("drm/i915/guc: Fix doorbell id selection"). v2: Extend number of clients; check for client allocation failure when number of doorbells is exceeded; validate client properties; reuse guc_init_doorbell_hw (Chris). v3: guc_init_doorbell_hw test added per Chris suggestion. v4: Try to explain why guc_init_doorbell_hw exist and comment some details in the subtest. v5: Remove redundant pr_info at the beginning of each subtest (Chris); rebase (s/i915_guc_client/intel_guc_client/). Signed-off-by: Michel Thierry <michel.thierry@intel.com> Cc: Michal Wajdeczko <michal.wajdeczko@intel.com> Cc: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com> Cc: Chris Wilson <chris@chris-wilson.co.uk> Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk> Link: https://patchwork.freedesktop.org/patch/msgid/20171116220632.1909-1-michel.thierry@intel.com Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>	2017-11-17 10:02:39 +00:00
Chris Wilson	4fe95b042d	drm/i915/selftests: exercise_ggtt may have nothing to do When operating on the live_ggtt we have to find a usuable hole for our test. It is possible for there to be no hole we can use, so initialise the err to 0 for the early exit. Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Link: https://patchwork.freedesktop.org/patch/msgid/20171115152558.31252-1-chris@chris-wilson.co.uk Reviewed-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>	2017-11-16 15:47:17 +00:00
Chris Wilson	24fd018aae	drm/i915/selftests: Increase size for mock ringbuffer We don't actually emit any commands into the ringbuffer, so we set it very small. However, an upcoming change centralises the wait-for-space into i915_gem_request_alloc() and that imposes a minimum size upon all ringbuffers (mock or real) of MIN_SPACE_FOR_ADD_REQUEST. Grow the mock ringbuffer such that we allocate a single page for the struct+buffer, satisfying the new condition without wasting too much space. Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Reviewed-by: Mika Kuoppala <mika.kuoppala@linux.intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20171115151204.8105-1-chris@chris-wilson.co.uk	2017-11-15 17:12:49 +00:00
Chris Wilson	fb4e14860b	drm/i915/selftests: Markup __iomem for igt_gem_coherency Silence sparse warnings by using __iomem markup and io accessors. Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Reviewed-by: Mika Kuoppala <mika.kuoppala@linux.intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20171114191842.19063-1-chris@chris-wilson.co.uk	2017-11-15 17:12:49 +00:00
Chris Wilson	6e1281412a	drm/i915/selftests: Always initialise err smatch does not track initialised values as well as gcc, and this triggers many warnings by smatch not presented by gcc. Silence smatch by initialising the error values to -ENODEV, which we use to denote internal errors. (If we see a selftest fail with a silent -ENODEV, we know smatch was right!) v2: smatch was right about igt_create_vma(), it may unlikely fail on the first object allocation which we want to be loud about. Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Link: https://patchwork.freedesktop.org/patch/msgid/20171114223346.25958-1-chris@chris-wilson.co.uk Reviewed-by: Dhinakaran Pandiyan <dhinakaran.pandiyan@intel.com>	2017-11-14 23:50:49 +00:00
Chris Wilson	9c52d1c816	drm/i915/selftests: Yet another forgotten mock_i915->mm initialiser Move all of the i915->mm initialisation to a private function that can be reused by the mock i915 device to save forgetting any more steps. For example, <7>[ 1542.046332] [IGT] drv_selftest: starting subtest mock_objects <4>[ 1542.123924] Setting dangerous option mock_selftests - tainting kernel <6>[ 1542.167941] i915: Performing mock selftests with st_random_seed=0x246f5ab5 st_timeout=1000 <4>[ 1542.178012] INFO: trying to register non-static key. <4>[ 1542.178027] the code is fine but needs lockdep annotation. <4>[ 1542.178032] turning off the locking correctness validator. <4>[ 1542.178041] CPU: 3 PID: 6008 Comm: kworker/3:7 Tainted: G U 4.14.0-rc8-CI-CI_DRM_3332+ #1 <4>[ 1542.178049] Hardware name: /NUC6CAYB, BIOS AYAPLCEL.86A.0040.2017.0619.1722 06/19/2017 <4>[ 1542.178144] Workqueue: events __i915_gem_free_work [i915] <4>[ 1542.178152] Call Trace: <4>[ 1542.178163] dump_stack+0x68/0x9f <4>[ 1542.178170] register_lock_class+0x3fd/0x580 <4>[ 1542.178177] ? unwind_next_frame+0x14/0x20 <4>[ 1542.178184] ? __save_stack_trace+0x73/0xd0 <4>[ 1542.178191] __lock_acquire+0xa4/0x1b00 <4>[ 1542.178254] ? __i915_gem_free_work+0x28/0xa0 [i915] <4>[ 1542.178261] ? __lock_acquire+0x4ab/0x1b00 <4>[ 1542.178268] lock_acquire+0xb0/0x200 <4>[ 1542.178273] ? lock_acquire+0xb0/0x200 <4>[ 1542.178336] ? __i915_gem_free_work+0x28/0xa0 [i915] <4>[ 1542.178344] _raw_spin_lock+0x32/0x50 <4>[ 1542.178405] ? __i915_gem_free_work+0x28/0xa0 [i915] <4>[ 1542.178468] __i915_gem_free_work+0x28/0xa0 [i915] <4>[ 1542.178476] process_one_work+0x221/0x650 <4>[ 1542.178483] worker_thread+0x4e/0x3c0 <4>[ 1542.178489] kthread+0x114/0x150 <4>[ 1542.178494] ? process_one_work+0x650/0x650 <4>[ 1542.178499] ? kthread_create_on_node+0x40/0x40 <4>[ 1542.178506] ret_from_fork+0x27/0x40 v2: Fish out i915->mm.object_stat_lock which was being inited over in i915_drv.c (Matthew) Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Cc: Matthew Auld <matthew.auld@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20171110232447.21618-1-chris@chris-wilson.co.uk Reviewed-by: Matthew Auld <matthew.auld@intel.com>	2017-11-10 23:42:49 +00:00
Chris Wilson	9511ce1ce7	drm/i915/selftests: Initialise mock_i915->mm.obj_lock lockdep spotted that the mock tests were using the i915->mm.obj_lock without first initialiasing it: >[ 1303.217043] [IGT] drv_selftest: starting subtest mock_objects <4>[ 1303.240898] Setting dangerous option mock_selftests - tainting kernel <6>[ 1303.253665] i915: Performing mock selftests with st_random_seed=0xd87ea6c6 st_timeout=1000 <4>[ 1303.254812] INFO: trying to register non-static key. <4>[ 1303.254816] the code is fine but needs lockdep annotation. <4>[ 1303.254818] turning off the locking correctness validator. <4>[ 1303.254820] CPU: 4 PID: 13112 Comm: drv_selftest Tainted: G U W 4.14.0-rc8-CI-Patchwork_7058+ #1 <4>[ 1303.254823] Hardware name: MSI MS-7924/Z97M-G43(MS-7924), BIOS V1.12 02/15/2016 <4>[ 1303.254825] Call Trace: <4>[ 1303.254829] dump_stack+0x68/0x9f <4>[ 1303.254832] register_lock_class+0x3fd/0x580 <4>[ 1303.254835] ? ___slab_alloc.constprop.29+0x157/0x3d0 <4>[ 1303.254837] ? ___slab_alloc.constprop.29+0x157/0x3d0 <4>[ 1303.254840] ? sg_kmalloc+0x1e/0x50 <4>[ 1303.254842] ? debug_smp_processor_id+0x17/0x20 <4>[ 1303.254845] __lock_acquire+0xa4/0x1b00 <4>[ 1303.254884] ? __i915_gem_object_set_pages+0x116/0x1f0 [i915] <4>[ 1303.254887] ? __this_cpu_preempt_check+0x13/0x20 <4>[ 1303.254889] ? sg_kmalloc+0x1e/0x50 <4>[ 1303.254891] lock_acquire+0xb0/0x200 <4>[ 1303.254893] ? lock_acquire+0xb0/0x200 <4>[ 1303.254917] ? __i915_gem_object_set_pages+0x116/0x1f0 [i915] <4>[ 1303.254920] _raw_spin_lock+0x32/0x50 <4>[ 1303.254944] ? __i915_gem_object_set_pages+0x116/0x1f0 [i915] <4>[ 1303.254967] __i915_gem_object_set_pages+0x116/0x1f0 [i915] <4>[ 1303.254991] i915_gem_object_get_pages_phys+0x286/0x2b0 [i915] <4>[ 1303.255015] ____i915_gem_object_get_pages+0x20/0x60 [i915] <4>[ 1303.255039] i915_gem_object_attach_phys+0x137/0x1a0 [i915] <4>[ 1303.255063] igt_phys_object+0x45/0x120 [i915] <4>[ 1303.255094] __i915_subtests+0x40/0xd0 [i915] <4>[ 1303.255099] ? work_on_cpu_safe+0x60/0x60 <4>[ 1303.255131] i915_gem_object_mock_selftests+0x34/0x50 [i915] Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Cc: Matthew Auld <matthew.william.auld@gmail.com> Cc: Mika Kuoppala <mika.kuoppala@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20171110151919.18451-1-chris@chris-wilson.co.uk Reviewed-by: Matthew Auld <matthew.william.auld@gmail.com> Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>	2017-11-10 16:21:17 +00:00
Hans de Goede	a5266db4d3	drm/i915: Acquire PUNIT->PMIC bus for intel_uncore_forcewake_reset() intel_uncore_forcewake_reset() does forcewake puts and gets as such we need to make sure that no-one tries to access the PUNIT->PMIC bus (on systems where this bus is shared) while it runs, otherwise bad things happen. Normally this is taken care of by the i915_pmic_bus_access_notifier() which does an intel_uncore_forcewake_get(FORCEWAKE_ALL) when some other driver tries to access the PMIC bus, so that later forcewake gets are no-ops (for the duration of the bus access). But intel_uncore_forcewake_reset gets called in 3 cases: 1) Before registering the pmic_bus_access_notifier 2) After unregistering the pmic_bus_access_notifier 3) To reset forcewake state on a GPU reset In all 3 cases the i915_pmic_bus_access_notifier() protection is insufficient. This commit fixes this race by calling iosf_mbi_punit_acquire() before calling intel_uncore_forcewake_reset(). In the case where it is called directly after unregistering the pmic_bus_access_notifier, we need to hold the punit-lock over both calls to avoid a race where intel_uncore_fw_release_timer() may execute between the 2 calls. Reviewed-by: Imre Deak <imre.deak@intel.com> Acked-by: Ingo Molnar <mingo@kernel.org> Signed-off-by: Hans de Goede <hdegoede@redhat.com> Link: https://patchwork.freedesktop.org/patch/msgid/20171019111620.26761-3-hdegoede@redhat.com	2017-11-10 13:14:03 +01:00
Chris Wilson	693b1ccabe	drm/i915/selftests: Take rpm wakeref around partial tiling tests Since the partial tiling tests are poking into the GGTT to watch the fence registers in operation, it itself needs the device rpm wakeref in order for the GGTT to remain accessible. Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com> Cc: Mika Kuoppala <mika.kuoppala@linux.intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20171107115653.10716-1-chris@chris-wilson.co.uk Reviewed-by: Matthew Auld <matthew.auld@intel.com>	2017-11-07 17:50:34 +00:00
Chris Wilson	c29ccb9f42	drm/i915/selftests: Take rpm wakeref around GGTT lowlevel tests The vma routines are responsible for acquiring the device rpm wakeref before they poke the HW. However, some of the selftests bypass the higher level vma routines in order to poke directly at the lowlevel GGTT functions; these are then responsible for managing rpm themselves. Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com> Cc: Mika Kuoppala <mika.kuoppala@linux.intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20171107114051.10583-1-chris@chris-wilson.co.uk Reviewed-by: Matthew Auld <matthew.auld@intel.com>	2017-11-07 17:50:33 +00:00
Chris Wilson	f6d03042b9	drm/i915/selftests: Skip mixed page exhaustion if only small pages available If we only have 4k pages, we can't mix together different combinations of hugepages to see if the world burns. However, as the loops did nothing, we never set err to 0 and reported ENODEV aborting the test. Teach the test to skip instead. Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Cc: Matthew Auld <matthew.william.auld@gmail.com> Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20171107110559.6098-1-chris@chris-wilson.co.uk Reviewed-by: Matthew Auld <matthew.william.auld@gmail.com>	2017-11-07 17:50:33 +00:00
Chris Wilson	69ea47a5a9	drm/i915/selftests: Hide dangerous tests Some tests are designed to exercise the limits of the HW and may trigger unintended side-effects making the machine unusable. This should not be executed by default, but are still useful for early platform validation. References: https://bugs.freedesktop.org/show_bug.cgi?id=103453 Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20171025153207.9589-1-chris@chris-wilson.co.uk Reviewed-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>	2017-11-06 11:17:03 +00:00
Chris Wilson	02a1ca45cf	Revert "drm/i915/selftests: Convert timers to use timer_setup()" This reverts commit `6d0dbd3096`. timer_setup_on_stack() does not yet exist: In file included from drivers/gpu/drm/i915/i915_sw_fence.c:517:0: drivers/gpu/drm/i915/selftests/lib_sw_fence.c: In function ‘timed_fence_init’: drivers/gpu/drm/i915/selftests/lib_sw_fence.c:63:2: error: implicit declaration of function ‘timer_setup_on_stack’; did you mean ‘hrtimer_init_on_stack’? [-Werror=implicit-function-declaration] timer_setup_on_stack(&tf->timer, timed_fence_wake, 0); Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20171025131336.2584-1-chris@chris-wilson.co.uk Acked-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>	2017-10-25 14:16:16 +01:00
Kees Cook	6d0dbd3096	drm/i915/selftests: Convert timers to use timer_setup() In preparation for unconditionally passing the struct timer_list pointer to all timer callbacks, switch to using the new timer_setup() and from_timer() to pass the timer pointer explicitly. Cc: Jani Nikula <jani.nikula@linux.intel.com> Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com> Cc: Rodrigo Vivi <rodrigo.vivi@intel.com> Cc: David Airlie <airlied@linux.ie> Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com> Cc: Chris Wilson <chris@chris-wilson.co.uk> Cc: intel-gfx@lists.freedesktop.org Cc: dri-devel@lists.freedesktop.org Signed-off-by: Kees Cook <keescook@chromium.org> Link: https://patchwork.freedesktop.org/patch/msgid/20171024151344.GA104417@beast Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk> Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>	2017-10-25 12:13:10 +01:00
Chris Wilson	b1f9107e1b	drm/i915/selftests: Don't try to queue a request with zero delay Instead of trying to create a timer with zero delay (i.e. with expires set to the current jiffies and not the future, an already expired timer), execute that request immediately. v2: Refactor list_del_init+signal into its own little function. v3: Reorder testing so as not to immediately signal a delayed request. Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Link: https://patchwork.freedesktop.org/patch/msgid/20171024220855.30155-1-chris@chris-wilson.co.uk Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com> Reviewed-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>	2017-10-25 12:13:03 +01:00
Kees Cook	39cbf2aa41	drm/i915: Convert timers to use timer_setup() In preparation for unconditionally passing the struct timer_list pointer to all timer callbacks, switch to using the new timer_setup() and from_timer() to pass the timer pointer explicitly. Cc: Daniel Vetter <daniel.vetter@intel.com> Cc: Jani Nikula <jani.nikula@linux.intel.com> Cc: David Airlie <airlied@linux.ie> Cc: Chris Wilson <chris@chris-wilson.co.uk> Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com> Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com> Cc: Oscar Mateo <oscar.mateo@intel.com> Cc: intel-gfx@lists.freedesktop.org Cc: dri-devel@lists.freedesktop.org Signed-off-by: Kees Cook <keescook@chromium.org> Reviewed-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com> Signed-off-by: Jani Nikula <jani.nikula@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20171017065304.3358-1-joonas.lahtinen@linux.intel.com	2017-10-18 14:56:10 +03:00
Chris Wilson	134649ff35	drm/i915/selftests: Silence the compiler for impossible errors It should be impossible for these tests not to run due to an empty ppgtt, but if it should happen, let's report ENODEV (our typical internal error for impossible events). In file included from drivers/gpu/drm/i915/i915_gem.c:5415: drivers/gpu/drm/i915/selftests/huge_pages.c: In function 'igt_mock_ppgtt_huge_fill': >> drivers/gpu/drm/i915/selftests/huge_pages.c:612: error: 'err' may be used uninitialized in this function drivers/gpu/drm/i915/selftests/huge_pages.c: In function 'igt_ppgtt_exhaust_huge': drivers/gpu/drm/i915/selftests/huge_pages.c:1159: error: 'err' may be used uninitialized in this function Reported-by: kbuild-all@01.org Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Cc: Matthew Auld <matthew.auld@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20171017103723.6933-1-chris@chris-wilson.co.uk Reviewed-by: Matthew Auld <matthew.auld@intel.com>	2017-10-17 14:07:47 +01:00
Chris Wilson	f2123818ff	drm/i915: Move dev_priv->mm.[un]bound_list to its own lock Remove the struct_mutex requirement around dev_priv->mm.bound_list and dev_priv->mm.unbound_list by giving it its own spinlock. This reduces one more requirement for struct_mutex and in the process gives us slightly more accurate unbound_list tracking, which should improve the shrinker - but the drawback is that we drop the retirement before counting so i915_gem_object_is_active() may be stale and lead us to underestimate the number of objects that may be shrunk (see commit `bed50aea61` ("drm/i915/shrinker: Flush active on objects before counting")). v2: Crosslink the spinlock to the lists it protects, and btw this changes s/obj->global_link/obj->mm.link/ v3: Fix decoupling of old links in i915_gem_object_attach_phys() v3.1: Fix the fix, only unlink if it was linked v3.2: Use a local for to_i915(obj->base.dev)->mm.obj_lock Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Reviewed-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20171016114037.5556-1-chris@chris-wilson.co.uk	2017-10-16 20:44:19 +01:00
Chris Wilson	9c1477e83e	drm/i915/selftests: Exercise adding requests to a full GGTT A bug recently encountered involved the issue where are we were submitting requests to different ppGTT, each would pin a segment of the GGTT for its logical context and ring. However, this is invisible to eviction as we do not tie the context/ring VMA to a request and so do not automatically wait upon it them (instead they are marked as pinned, preventing eviction entirely). Instead the eviction code must flush those contexts by switching to the kernel context. This selftest tries to fill the GGTT with contexts to exercise a path where the switch-to-kernel-context failed to make forward progress and we fail with ENOSPC. v2: Make the hole in the filled GGTT explicit. v3: Swap out the arbitrary timeout for a private notification from i915_gem_evict_something() Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20171012125726.14736-3-chris@chris-wilson.co.uk Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>	2017-10-12 21:06:26 +01:00
Chris Wilson	214707fc2c	drm/i915/selftests: Wrap a timer into a i915_sw_fence For some selftests, we want to issue requests but delay them going to hardware. Furthermore, we don't want those requests to block indefinitely (or else we may hang the driver and block testing) so we want to employ a timeout. So naturally we want a fence that is automatically signaled by a timer. v2: Add kselftests. v3: Limit the API available to selftests; there isn't an overwhelming reason to export it universally. Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20171012125726.14736-2-chris@chris-wilson.co.uk Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>	2017-10-12 21:06:26 +01:00
Daniel Vetter	af7a8ffad9	drm/i915: Use rcu instead of stop_machine in set_wedged stop_machine is not really a locking primitive we should use, except when the hw folks tell us the hw is broken and that's the only way to work around it. This patch tries to address the locking abuse of stop_machine() from commit `20e4933c47` Author: Chris Wilson <chris@chris-wilson.co.uk> Date: Tue Nov 22 14:41:21 2016 +0000 drm/i915: Stop the machine as we install the wedged submit_request handler Chris said parts of the reasons for going with stop_machine() was that it's no overhead for the fast-path. But these callbacks use irqsave spinlocks and do a bunch of MMIO, and rcu_read_lock is _real_ fast. To stay as close as possible to the stop_machine semantics we first update all the submit function pointers to the nop handler, then call synchronize_rcu() to make sure no new requests can be submitted. This should give us exactly the huge barrier we want. I pondered whether we should annotate engine->submit_request as __rcu and use rcu_assign_pointer and rcu_dereference on it. But the reason behind those is to make sure the compiler/cpu barriers are there for when you have an actual data structure you point at, to make sure all the writes are seen correctly on the read side. But we just have a function pointer, and .text isn't changed, so no need for these barriers and hence no need for annotations. Unfortunately there's a complication with the call to intel_engine_init_global_seqno: - Without stop_machine we must hold the corresponding spinlock. - Without stop_machine we must ensure that all requests are marked as having failed with dma_fence_set_error() before we call it. That means we need to split the nop request submission into two phases, both synchronized with rcu: 1. Only stop submitting the requests to hw and mark them as failed. 2. After all pending requests in the scheduler/ring are suitably marked up as failed and we can force complete them all, also force complete by calling intel_engine_init_global_seqno(). This should fix the followwing lockdep splat: ====================================================== WARNING: possible circular locking dependency detected 4.14.0-rc3-CI-CI_DRM_3179+ #1 Tainted: G U ------------------------------------------------------ kworker/3:4/562 is trying to acquire lock: (cpu_hotplug_lock.rw_sem){++++}, at: [<ffffffff8113d4bc>] stop_machine+0x1c/0x40 but task is already holding lock: (&dev->struct_mutex){+.+.}, at: [<ffffffffa0136588>] i915_reset_device+0x1e8/0x260 [i915] which lock already depends on the new lock. the existing dependency chain (in reverse order) is: -> #6 (&dev->struct_mutex){+.+.}: __lock_acquire+0x1420/0x15e0 lock_acquire+0xb0/0x200 __mutex_lock+0x86/0x9b0 mutex_lock_interruptible_nested+0x1b/0x20 i915_mutex_lock_interruptible+0x51/0x130 [i915] i915_gem_fault+0x209/0x650 [i915] __do_fault+0x1e/0x80 __handle_mm_fault+0xa08/0xed0 handle_mm_fault+0x156/0x300 __do_page_fault+0x2c5/0x570 do_page_fault+0x28/0x250 page_fault+0x22/0x30 -> #5 (&mm->mmap_sem){++++}: __lock_acquire+0x1420/0x15e0 lock_acquire+0xb0/0x200 __might_fault+0x68/0x90 _copy_to_user+0x23/0x70 filldir+0xa5/0x120 dcache_readdir+0xf9/0x170 iterate_dir+0x69/0x1a0 SyS_getdents+0xa5/0x140 entry_SYSCALL_64_fastpath+0x1c/0xb1 -> #4 (&sb->s_type->i_mutex_key#5){++++}: down_write+0x3b/0x70 handle_create+0xcb/0x1e0 devtmpfsd+0x139/0x180 kthread+0x152/0x190 ret_from_fork+0x27/0x40 -> #3 ((complete)&req.done){+.+.}: __lock_acquire+0x1420/0x15e0 lock_acquire+0xb0/0x200 wait_for_common+0x58/0x210 wait_for_completion+0x1d/0x20 devtmpfs_create_node+0x13d/0x160 device_add+0x5eb/0x620 device_create_groups_vargs+0xe0/0xf0 device_create+0x3a/0x40 msr_device_create+0x2b/0x40 cpuhp_invoke_callback+0xc9/0xbf0 cpuhp_thread_fun+0x17b/0x240 smpboot_thread_fn+0x18a/0x280 kthread+0x152/0x190 ret_from_fork+0x27/0x40 -> #2 (cpuhp_state-up){+.+.}: __lock_acquire+0x1420/0x15e0 lock_acquire+0xb0/0x200 cpuhp_issue_call+0x133/0x1c0 __cpuhp_setup_state_cpuslocked+0x139/0x2a0 __cpuhp_setup_state+0x46/0x60 page_writeback_init+0x43/0x67 pagecache_init+0x3d/0x42 start_kernel+0x3a8/0x3fc x86_64_start_reservations+0x2a/0x2c x86_64_start_kernel+0x6d/0x70 verify_cpu+0x0/0xfb -> #1 (cpuhp_state_mutex){+.+.}: __lock_acquire+0x1420/0x15e0 lock_acquire+0xb0/0x200 __mutex_lock+0x86/0x9b0 mutex_lock_nested+0x1b/0x20 __cpuhp_setup_state_cpuslocked+0x53/0x2a0 __cpuhp_setup_state+0x46/0x60 page_alloc_init+0x28/0x30 start_kernel+0x145/0x3fc x86_64_start_reservations+0x2a/0x2c x86_64_start_kernel+0x6d/0x70 verify_cpu+0x0/0xfb -> #0 (cpu_hotplug_lock.rw_sem){++++}: check_prev_add+0x430/0x840 __lock_acquire+0x1420/0x15e0 lock_acquire+0xb0/0x200 cpus_read_lock+0x3d/0xb0 stop_machine+0x1c/0x40 i915_gem_set_wedged+0x1a/0x20 [i915] i915_reset+0xb9/0x230 [i915] i915_reset_device+0x1f6/0x260 [i915] i915_handle_error+0x2d8/0x430 [i915] hangcheck_declare_hang+0xd3/0xf0 [i915] i915_hangcheck_elapsed+0x262/0x2d0 [i915] process_one_work+0x233/0x660 worker_thread+0x4e/0x3b0 kthread+0x152/0x190 ret_from_fork+0x27/0x40 other info that might help us debug this: Chain exists of: cpu_hotplug_lock.rw_sem --> &mm->mmap_sem --> &dev->struct_mutex Possible unsafe locking scenario: CPU0 CPU1 ---- ---- lock(&dev->struct_mutex); lock(&mm->mmap_sem); lock(&dev->struct_mutex); lock(cpu_hotplug_lock.rw_sem); * DEADLOCK * 3 locks held by kworker/3:4/562: #0: ("events_long"){+.+.}, at: [<ffffffff8109c64a>] process_one_work+0x1aa/0x660 #1: ((&(&i915->gpu_error.hangcheck_work)->work)){+.+.}, at: [<ffffffff8109c64a>] process_one_work+0x1aa/0x660 #2: (&dev->struct_mutex){+.+.}, at: [<ffffffffa0136588>] i915_reset_device+0x1e8/0x260 [i915] stack backtrace: CPU: 3 PID: 562 Comm: kworker/3:4 Tainted: G U 4.14.0-rc3-CI-CI_DRM_3179+ #1 Hardware name: /NUC7i5BNB, BIOS BNKBL357.86A.0048.2017.0704.1415 07/04/2017 Workqueue: events_long i915_hangcheck_elapsed [i915] Call Trace: dump_stack+0x68/0x9f print_circular_bug+0x235/0x3c0 ? lockdep_init_map_crosslock+0x20/0x20 check_prev_add+0x430/0x840 ? irq_work_queue+0x86/0xe0 ? wake_up_klogd+0x53/0x70 __lock_acquire+0x1420/0x15e0 ? __lock_acquire+0x1420/0x15e0 ? lockdep_init_map_crosslock+0x20/0x20 lock_acquire+0xb0/0x200 ? stop_machine+0x1c/0x40 ? i915_gem_object_truncate+0x50/0x50 [i915] cpus_read_lock+0x3d/0xb0 ? stop_machine+0x1c/0x40 stop_machine+0x1c/0x40 i915_gem_set_wedged+0x1a/0x20 [i915] i915_reset+0xb9/0x230 [i915] i915_reset_device+0x1f6/0x260 [i915] ? gen8_gt_irq_ack+0x170/0x170 [i915] ? work_on_cpu_safe+0x60/0x60 i915_handle_error+0x2d8/0x430 [i915] ? vsnprintf+0xd1/0x4b0 ? scnprintf+0x3a/0x70 hangcheck_declare_hang+0xd3/0xf0 [i915] ? intel_runtime_pm_put+0x56/0xa0 [i915] i915_hangcheck_elapsed+0x262/0x2d0 [i915] process_one_work+0x233/0x660 worker_thread+0x4e/0x3b0 kthread+0x152/0x190 ? process_one_work+0x660/0x660 ? kthread_create_on_node+0x40/0x40 ret_from_fork+0x27/0x40 Setting dangerous option reset - tainting kernel i915 0000:00:02.0: Resetting chip after gpu hang Setting dangerous option reset - tainting kernel i915 0000:00:02.0: Resetting chip after gpu hang v2: Have 1 global synchronize_rcu() barrier across all engines, and improve commit message. v3: We need to protect the seqno update with the timeline spinlock (in set_wedged) to avoid racing with other updates of the seqno, like we already do in nop_submit_request (Chris). v4: Use two-phase sequence to plug the race Chris spotted where we can complete requests before they're marked up with -EIO. v5: Review from Chris: - simplify nop_submit_request. - Add comment to rcu_read_lock section. - Align comments with the new style. v6: Remove unused variable to appease CI. Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk> Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=102886 Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=103096 Cc: Chris Wilson <chris@chris-wilson.co.uk> Cc: Mika Kuoppala <mika.kuoppala@intel.com> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Marta Lofstedt <marta.lofstedt@intel.com> Signed-off-by: Daniel Vetter <daniel.vetter@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20171011091019.1425-1-daniel.vetter@ffwll.ch	2017-10-11 17:51:21 +02:00
Matthew Auld	617dc7610d	drm/i915/selftests: ditch the kernel context There's really no good reason to be using the kernel context for the huge-page livetests. Also with the introduction of commit `bef27bdb6c` ("drm/i915: Assert we do not try to expand VMA for hugepage inside GGTT") we start hitting the bug on in the selftests, since the kernel context will always return true for i915_vma_is_ggtt(), so now seems like the opportune time to instead create our own context. Fixes: `4049866f09` ("drm/i915/selftests: huge page tests") Fixes: `bef27bdb6c` ("drm/i915: Assert we do not try to expand VMA for hugepage inside GGTT") Signed-off-by: Matthew Auld <matthew.auld@intel.com> Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com> Cc: Chris Wilson <chris@chris-wilson.co.uk> Link: https://patchwork.freedesktop.org/patch/msgid/20171010133030.12112-1-matthew.auld@intel.com Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk> Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>	2017-10-10 21:30:24 +01:00
Matthew Auld	84e8978e62	drm/i915: s/sg_mask/sg_page_sizes/ It's a little unclear what the sg_mask actually is, so prefer the more meaningful name of sg_page_sizes. Suggested-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com> Signed-off-by: Matthew Auld <matthew.auld@intel.com> Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com> Cc: Chris Wilson <chris@chris-wilson.co.uk> Link: https://patchwork.freedesktop.org/patch/msgid/20171009110024.29114-1-matthew.auld@intel.com Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk> Reviewed-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com> Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>	2017-10-09 17:07:29 +01:00
Chris Wilson	b4563f595e	drm/i915: Pin fence for iomap Acquire the fence register for the iomap in i915_vma_pin_iomap() on behalf of the caller. We probably want for the caller to specify whether the fence should be pinned for their usage, but at the moment all callers do want the associated fence, or none, so take it on their behalf. Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Reviewed-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20171009084401.29090-1-chris@chris-wilson.co.uk	2017-10-09 17:07:29 +01:00
Chris Wilson	ff97d3ae69	drm/i915/selftests: Hold the rpm wakeref for the reset tests The lowlevel reset functions expect the caller to be holding the rpm wakeref for the device access across the reset. We were not explicitly doing this in the sefltest, so for simplicity acquire the wakeref for the duration of all subtests. Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Cc: Mika Kuoppala <mika.kuoppala@linux.intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20171009110301.21705-4-chris@chris-wilson.co.uk Reviewed-by: Mika Kuoppala <mika.kuoppala@linux.intel.com>	2017-10-09 17:07:28 +01:00
Chris Wilson	95a19ab4d7	drm/i915/selftests: Pretty print engine state when requests fail to start During hangcheck testing, we try to execute requests following the GPU reset, and in particular want to try and debug when those fail. Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Cc: Mika Kuoppala <mika.kuoppala@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20171009110301.21705-2-chris@chris-wilson.co.uk Reviewed-by: Mika Kuoppala <mika.kuoppala@intel.com>	2017-10-09 17:07:28 +01:00
Matthew Auld	a883241c39	drm/i915: enable platform support for 2M pages For gen8+ platforms which support the 48b PPGTT, enable platform level support for 2M pages. Also enable for mock testing. Signed-off-by: Matthew Auld <matthew.auld@intel.com> Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com> Cc: Chris Wilson <chris@chris-wilson.co.uk> Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk> Link: https://patchwork.freedesktop.org/patch/msgid/20171006145041.21673-22-matthew.auld@intel.com Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Link: https://patchwork.freedesktop.org/patch/msgid/20171006221833.32439-21-chris@chris-wilson.co.uk	2017-10-07 10:12:05 +01:00
Matthew Auld	f1f3f98272	drm/i915: enable platform support for 64K pages For gen9+ enable platform level support for 64K pages. Also enable for mock testing. Signed-off-by: Matthew Auld <matthew.auld@intel.com> Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com> Cc: Chris Wilson <chris@chris-wilson.co.uk> Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk> Link: https://patchwork.freedesktop.org/patch/msgid/20171006145041.21673-21-matthew.auld@intel.com Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Link: https://patchwork.freedesktop.org/patch/msgid/20171006221833.32439-20-chris@chris-wilson.co.uk	2017-10-07 10:12:04 +01:00
Matthew Auld	7924d9d4dc	drm/i915/selftests: mix huge pages Try to mix sg page sizes for 4K, 64K and 2M pages. v2: s/BIT(x) >> 12/BIT(x) >> PAGE_SHIFT/ Suggested-by: Chris Wilson <chris@chris-wilson.co.uk> Signed-off-by: Matthew Auld <matthew.auld@intel.com> Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com> Cc: Chris Wilson <chris@chris-wilson.co.uk> Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk> Link: https://patchwork.freedesktop.org/patch/msgid/20171006145041.21673-19-matthew.auld@intel.com Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Link: https://patchwork.freedesktop.org/patch/msgid/20171006221833.32439-18-chris@chris-wilson.co.uk	2017-10-07 10:12:03 +01:00
Matthew Auld	4049866f09	drm/i915/selftests: huge page tests v2: mock test page support configurations and add MI_STORE_DWORD test v3: run all mockable huge page tests on all platforms via the mock_device v4: add pin_update regression test various improvements suggested by Chris v5: fix issues reported by kbuild test single sg spanning multiple page sizes don't explode when running the live-tests through the appgtt v6: lots of improvements from Chris v7: run on each engine for igt_write_huge add simple tmpfs fallback test v8: size_t is bad don't break the i386 build Signed-off-by: Matthew Auld <matthew.auld@intel.com> Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com> Cc: Chris Wilson <chris@chris-wilson.co.uk> Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk> Link: https://patchwork.freedesktop.org/patch/msgid/20171006145041.21673-18-matthew.auld@intel.com Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Link: https://patchwork.freedesktop.org/patch/msgid/20171006221833.32439-17-chris@chris-wilson.co.uk	2017-10-07 10:12:00 +01:00
Matthew Auld	fa3f46afd3	drm/i915: introduce vm set_pages/clear_pages Move the setting/clearing of the vma->pages to a vm operation. Doing so neatens things up a little, but more importantly gives us a sane place to also set/clear the vma->pages_sizes, which we introduce later in preparation for supporting huge-pages. v2: remove redundant vma->pages check v3: GEM_BUG_ON(vma->pages) following i915_vma_remove Suggested-by: Chris Wilson <chris@chris-wilson.co.uk> Signed-off-by: Matthew Auld <matthew.auld@intel.com> Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com> Cc: Chris Wilson <chris@chris-wilson.co.uk> Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk> Reviewed-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20171006145041.21673-8-matthew.auld@intel.com Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Link: https://patchwork.freedesktop.org/patch/msgid/20171006221833.32439-7-chris@chris-wilson.co.uk	2017-10-07 10:11:50 +01:00
Matthew Auld	a5c0816626	drm/i915: introduce page_size members In preparation for supporting huge gtt pages for the ppgtt, we introduce page size members for gem objects. We fill in the page sizes by scanning the sg table. v2: pass the sg_mask to set_pages v3: calculate the sg_mask inline with populating the sg_table where possible, and pass to set_pages along with the pages. v4: bunch of improvements from Joonas v5: fix num_pages blunder introduce i915_sg_page_sizes helper v6: prefer GEM_BUG_ON(sizes == 0) Signed-off-by: Matthew Auld <matthew.auld@intel.com> Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com> Cc: Chris Wilson <chris@chris-wilson.co.uk> Cc: Daniel Vetter <daniel@ffwll.ch> Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk> Link: https://patchwork.freedesktop.org/patch/msgid/20171006145041.21673-7-matthew.auld@intel.com Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Link: https://patchwork.freedesktop.org/patch/msgid/20171006221833.32439-6-chris@chris-wilson.co.uk	2017-10-07 10:11:48 +01:00
Matthew Auld	b91b09eea7	drm/i915: push set_pages down to the callers Each backend is now responsible for calling __i915_gem_object_set_pages upon successfully gathering its backing storage. This eliminates the inconsistency between the async and sync paths, which stands out even more when we start throwing around an sg_mask in a later patch. Suggested-by: Chris Wilson <chris@chris-wilson.co.uk> Signed-off-by: Matthew Auld <matthew.auld@intel.com> Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com> Cc: Chris Wilson <chris@chris-wilson.co.uk> Reviewed-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com> Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk> Link: https://patchwork.freedesktop.org/patch/msgid/20171006145041.21673-6-matthew.auld@intel.com Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Link: https://patchwork.freedesktop.org/patch/msgid/20171006221833.32439-5-chris@chris-wilson.co.uk	2017-10-07 10:11:45 +01:00
Matthew Auld	2a9654b2cd	drm/i915: introduce page_sizes field to dev_info In preparation for huge gtt pages expose page_sizes as part of the device info, to indicate the page sizes supported by the HW. Currently only 4K is supported. v2: s/page_size_mask/page_sizes/ v3: introduce I915_GTT_MAX_PAGE_SIZE Signed-off-by: Matthew Auld <matthew.auld@intel.com> Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com> Cc: Mika Kuoppala <mika.kuoppala@linux.intel.com> Cc: Chris Wilson <chris@chris-wilson.co.uk> Reviewed-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com> Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk> Link: https://patchwork.freedesktop.org/patch/msgid/20171006145041.21673-5-matthew.auld@intel.com Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Link: https://patchwork.freedesktop.org/patch/msgid/20171006221833.32439-4-chris@chris-wilson.co.uk	2017-10-07 10:11:44 +01:00
Matthew Auld	465c403cb5	drm/i915: introduce simple gemfs Not a fully blown gemfs, just our very own tmpfs kernel mount. Doing so moves us away from the shmemfs shm_mnt, and gives us the much needed flexibility to do things like set our own mount options, namely huge= which should allow us to enable the use of transparent-huge-pages for our shmem backed objects. v2: various improvements suggested by Joonas v3: move gemfs instance to i915.mm and simplify now that we have file_setup_with_mnt v4: fallback to tmpfs shm_mnt upon failure to setup gemfs v5: make tmpfs fallback kinder v5: better gemfs failure message flags variable Signed-off-by: Matthew Auld <matthew.auld@intel.com> Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com> Cc: Chris Wilson <chris@chris-wilson.co.uk> Cc: Dave Hansen <dave.hansen@intel.com> Cc: Kirill A. Shutemov <kirill@shutemov.name> Cc: Hugh Dickins <hughd@google.com> Cc: linux-mm@kvack.org Reviewed-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20171006145041.21673-3-matthew.auld@intel.com Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Link: https://patchwork.freedesktop.org/patch/msgid/20171006221833.32439-2-chris@chris-wilson.co.uk	2017-10-07 10:11:41 +01:00
Arnd Bergmann	764d2997ec	drm/i915/selftests: fix check for intel IOMMU An earlier bugfix tried to work around this build failure: drivers/gpu/drm/i915/selftests/mock_gem_device.c: In function 'mock_gem_device': drivers/gpu/drm/i915/selftests/mock_gem_device.c:151:20: error: 'struct dev_archdata' has no member named 'iommu' Checking for CONFIG_IOMMU_API is not sufficient as a compile-time test since that may be enabled in configurations that have neither INTEL_IOMMU not AMD_IOMMU enabled. This changes the check to INTEL_IOMMU instead, as this is the only case we actually care about. Fixes: `f46f156ea7` ("drm/i915/selftests: Only touch archdata.iommu when it exists") Signed-off-by: Arnd Bergmann <arnd@arndb.de> Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch> Link: https://patchwork.freedesktop.org/patch/msgid/20171005120749.400818-1-arnd@arndb.de	2017-10-05 15:25:27 +02:00
Chris Wilson	e91ef99b95	drm/i915/selftests: Remember to create the fake preempt context For the fake device we have our own set of mock contexts that need to match the real contexts we normally create. Currently this requires us to manually instantiate them for the selftests, which I forgot. Reported-by: Matthew Auld <matthew.william.auld@gmail.com> Fixes: `e7af311683` ("drm/i915: Introduce a preempt context") Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Cc: Michał Winiarski <michal.winiarski@intel.com> Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com> Cc: Matthew Auld <matthew.william.auld@gmail.com> Link: https://patchwork.freedesktop.org/patch/msgid/20171005105927.22991-1-chris@chris-wilson.co.uk Reviewed-by: Matthew Auld <matthew.william.auld@gmail.com>	2017-10-05 13:21:16 +01:00
Chris Wilson	60456d5c2d	drm/i915/selftests: Replace wmb() with i915_gem_chipset_flush() Currently, we are being fairly lazy and only using a wmb() following an update to an active batch. Previously, we have found that to be insufficient to ensure that a write from the CPU reaches memory in a timely fashion, and in some caches we may need to flush a chipset cache. To that end, we have i915_gem_chipset_flush() so use it. Suggested-by: Mika Kuoppala <mika.kuoppala@linux.intel.com> Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Cc: Mika Kuoppala <mika.kuoppala@linux.intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20170926153409.7928-1-chris@chris-wilson.co.uk Reviewed-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>	2017-09-29 12:30:17 +01:00
Jani Nikula	32f35b8634	Merge drm-upstream/drm-next into drm-intel-next-queued Need MST sideband message transaction to power up/down nodes. Signed-off-by: Jani Nikula <jani.nikula@intel.com>	2017-09-28 15:56:49 +03:00
Chris Wilson	87dc03ad26	drm/i915/selftests: Try to recover from a wedged GPU during reset tests If we see the seqno stop progressing, we abandon the test for fear that the GPU died following the reset. However, during test teardown we still wait for the GPU to idle before continuing, but we have already confirmed that the GPU is dead. Furthermore, since we are inside a reset test, we have disabled the hangchecker, and so there is no safety net and we wait indefinitely. Detect the stuck GPU and declare it wedged as a state of emergency so we can escape. Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Cc: Jari Tahvanainen <jari.tahvanainen@intel.com> Cc: Mika Kuoppala <mika.kuoppala@linux.intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20170915130929.18892-1-chris@chris-wilson.co.uk Tested-by: Jari Tahvanainen <jari.tahvanainen@intel.com> Reviewed-by: Mika Kuoppala <mika.kuoppala@linux.intel.com>	2017-09-26 14:19:25 +01:00
Chris Wilson	f46f156ea7	drm/i915/selftests: Only touch archdata.iommu when it exists archdata.iommu only exists when CONFIG_IOMMU_API is enabled (and only applies to intel-iommu in our case) so conditionally compile it out when it doesn't exist. Fixes: `b5891fb520` ("drm/i915/selftests: Disable iommu for the mock device") Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Cc: Matthew Auld <matthew.auld@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20170918164652.14200-1-chris@chris-wilson.co.uk Reviewed-by: Matthew Auld <matthew.auld@intel.com>	2017-09-19 10:13:50 +01:00
Chris Wilson	b5891fb520	drm/i915/selftests: Disable iommu for the mock device On some machines, the iommu cannot allocate a domain for the mock device causing the dma_map_sg() to fail, and the selftest to fail with -ENOMEM. For the mock selftests, we are using a fake device and do not care about iommu; so convince intel_iommu to treat us as a dummy device with an identity mapping (and no iommu domain). Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=101080 Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Link: https://patchwork.freedesktop.org/patch/msgid/20170914162240.18310-1-chris@chris-wilson.co.uk Tested-by: Elizabeth De La Torre Mena <elizabethx.de.la.torre.mena@intel.com> Reviewed-by: Matthew Auld <matthew.auld@intel.com	2017-09-18 16:57:35 +01:00
Michal Hocko	0ee931c4e3	mm: treewide: remove GFP_TEMPORARY allocation flag GFP_TEMPORARY was introduced by commit `e12ba74d8f` ("Group short-lived and reclaimable kernel allocations") along with __GFP_RECLAIMABLE. It's primary motivation was to allow users to tell that an allocation is short lived and so the allocator can try to place such allocations close together and prevent long term fragmentation. As much as this sounds like a reasonable semantic it becomes much less clear when to use the highlevel GFP_TEMPORARY allocation flag. How long is temporary? Can the context holding that memory sleep? Can it take locks? It seems there is no good answer for those questions. The current implementation of GFP_TEMPORARY is basically GFP_KERNEL \| __GFP_RECLAIMABLE which in itself is tricky because basically none of the existing caller provide a way to reclaim the allocated memory. So this is rather misleading and hard to evaluate for any benefits. I have checked some random users and none of them has added the flag with a specific justification. I suspect most of them just copied from other existing users and others just thought it might be a good idea to use without any measuring. This suggests that GFP_TEMPORARY just motivates for cargo cult usage without any reasoning. I believe that our gfp flags are quite complex already and especially those with highlevel semantic should be clearly defined to prevent from confusion and abuse. Therefore I propose dropping GFP_TEMPORARY and replace all existing users to simply use GFP_KERNEL. Please note that SLAB users with shrinkers will still get __GFP_RECLAIMABLE heuristic and so they will be placed properly for memory fragmentation prevention. I can see reasons we might want some gfp flag to reflect shorterm allocations but I propose starting from a clear semantic definition and only then add users with proper justification. This was been brought up before LSF this year by Matthew [1] and it turned out that GFP_TEMPORARY really doesn't have a clear semantic. It seems to be a heuristic without any measured advantage for most (if not all) its current users. The follow up discussion has revealed that opinions on what might be temporary allocation differ a lot between developers. So rather than trying to tweak existing users into a semantic which they haven't expected I propose to simply remove the flag and start from scratch if we really need a semantic for short term allocations. [1] http://lkml.kernel.org/r/20170118054945.GD18349@bombadil.infradead.org [akpm@linux-foundation.org: fix typo] [akpm@linux-foundation.org: coding-style fixes] [sfr@canb.auug.org.au: drm/i915: fix up] Link: http://lkml.kernel.org/r/20170816144703.378d4f4d@canb.auug.org.au Link: http://lkml.kernel.org/r/20170728091904.14627-1-mhocko@kernel.org Signed-off-by: Michal Hocko <mhocko@suse.com> Signed-off-by: Stephen Rothwell <sfr@canb.auug.org.au> Acked-by: Mel Gorman <mgorman@suse.de> Acked-by: Vlastimil Babka <vbabka@suse.cz> Cc: Matthew Wilcox <willy@infradead.org> Cc: Neil Brown <neilb@suse.de> Cc: "Theodore Ts'o" <tytso@mit.edu> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2017-09-13 18:53:16 -07:00
Chris Wilson	7ce5b6850b	drm/i915/selftests: Use mul_u32_u32() for 32b x 32b -> 64b result As realised by commit `9e3d6223d2` ("math64, timers: Fix 32bit mul_u64_u32_shr() and friends"), GCC does not always generate ideal code for performing a 32b x 32b multiply returning a 64b result (i.e. where we idiomatically use u64 result = (u64)x * (u32)x). Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Link: https://patchwork.freedesktop.org/patch/msgid/20170913105154.2910-2-chris@chris-wilson.co.uk Reviewed-by: Ville Syrjälä <ville.syrjala@linux.intel.com>	2017-09-13 13:27:20 +01:00
Chris Wilson	d1b48c1e71	drm/i915: Replace execbuf vma ht with an idr This was the competing idea long ago, but it was only with the rewrite of the idr as an radixtree and using the radixtree directly ourselves, along with the realisation that we can store the vma directly in the radixtree and only need a list for the reverse mapping, that made the patch performant enough to displace using a hashtable. Though the vma ht is fast and doesn't require any extra allocation (as we can embed the node inside the vma), it does require a thread for resizing and serialization and will have the occasional slow lookup. That is hairy enough to investigate alternatives and favour them if equivalent in peak performance. One advantage of allocating an indirection entry is that we can support a single shared bo between many clients, something that was done on a first-come first-serve basis for shared GGTT vma previously. To offset the extra allocations, we create yet another kmem_cache for them. Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20170816085210.4199-5-chris@chris-wilson.co.uk	2017-08-18 11:59:02 +01:00
Chris Wilson	f2f5c0610f	drm/i915: Don't use MI_STORE_DWORD_IMM on Sandybridge/vcs MI_STORE_DWORD_IMM just doesn't work on the video decode engine under Sandybridge, so refrain from using it. Then switch the selftests over to using the now common test prior to using MI_STORE_DWORD_IMM. Fixes: `7dd4f6729f` ("drm/i915: Async GPU relocation processing") Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com> Cc: <drm-intel-fixes@lists.freedesktop.org> # v4.13-rc1+ Link: https://patchwork.freedesktop.org/patch/msgid/20170816085210.4199-1-chris@chris-wilson.co.uk Reviewed-by: Mika Kuoppala <mika.kuoppala@linux.intel.com>	2017-08-18 11:55:02 +01:00
Chris Wilson	b8f55be644	drm/i915: Split obj->cache_coherent to track r/w Another month, another story in the cache coherency saga. This time, we come to the realisation that i915_gem_object_is_coherent() has been reporting whether we can read from the target without requiring a cache invalidate; but we were using it in places for testing whether we could write into the object without requiring a cache flush. So split the tracking into two, one to decide before reads, one after writes. See commit `e27ab73d17` ("drm/i915: Mark CPU cache as dirty on every transition for CPU writes") for the previous entry in this saga. v2: Be verbose v3: Remove unused function (i915_gem_object_is_coherent) v4: Fix inverted coherency check prior to execbuf (from v2) v5: Add comment for nasty code where we are optimising on gcc's behalf. Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=101109 Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=101555 Testcase: igt/kms_mmap_write_crc Testcase: igt/kms_pwrite_crc Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Cc: Maarten Lankhorst <maarten.lankhorst@linux.intel.com> Cc: Dongwon Kim <dongwon.kim@intel.com> Cc: Matt Roper <matthew.d.roper@intel.com> Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com> Cc: Mika Kuoppala <mika.kuoppala@linux.intel.com> Tested-by: Maarten Lankhorst <maarten.lankhorst@linux.intel.com> Acked-by: Maarten Lankhorst <maarten.lankhorst@linux.intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20170811111116.10373-1-chris@chris-wilson.co.uk Reviewed-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>	2017-08-15 15:46:57 +01:00

1 2 3 4

165 Commits