Commit Graph

165 Commits

Author SHA1 Message Date
Chris Wilson
3fef5cda97 drm/i915: Automatic i915_switch_context for legacy
During request construction, after pinning the context we know whether
or not we have to emit a context switch. So move this common operation
from every caller into i915_gem_request_alloc() itself.

v2: Always submit the request if we emitted some commands during request
construction, as typically it also involves changes in global state.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Mika Kuoppala <mika.kuoppala@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20171120102002.22254-2-chris@chris-wilson.co.uk
2017-11-20 15:56:16 +00:00
Chris Wilson
2113184c6f drm/i915: Pull the unconditional GPU cache invalidation into request construction
As the request will, in the following patch, implicitly invoke a
context-switch on construction, we should precede that with a GPU TLB
invalidation. Also, even before using GGTT, we always want to invalidate
the TLBs for any updates (as well as the ppgtt invalidates that are
unconditionally applied by execbuf). Since we almost always require the
TLB invalidate, do it unconditionally on request allocation and so we can
remove it from all other paths.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Mika Kuoppala <mika.kuoppala@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20171120102002.22254-1-chris@chris-wilson.co.uk
Reviewed-by: Mika Kuoppala <mika.kuoppala@linux.intel.com>
2017-11-20 15:56:16 +00:00
Chris Wilson
223c73a366 drm/i915/selftests: Report ENOMEM clearly for an allocation failure
If we can not run the drunk_hole test because we couldn't allocate the
memory for the permutation array (even after we tried trimming the
size), report a clear ENOMEM. Similary, if we are asked to operate on a
hole too small for ourselves, make it skip quietly.

v2: Avoid malloc(0) since that returns ZERO_SIZE_PTR not NULL.
v3: Fixup similar construction for lowlevel_hole
v4: Use u64 >> 1 to avoid 64b div.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Matthew Auld <matthew.auld@intel.com>
Cc: Mika Kuoppala <mika.kuoppala@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20171117101732.4335-1-chris@chris-wilson.co.uk
Reviewed-by: Matthew Auld <matthew.auld@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20171117162945.16390-1-chris@chris-wilson.co.uk
2017-11-17 18:20:36 +00:00
Michel Thierry
55bd6bd757 drm/i915/selftests: Add a GuC doorbells selftest
The first test aims to check guc_init_doorbell_hw, changing the existing
guc clients and doorbells state before calling it.

The second test tries to create as many clients as it is currently possible
(currently limited to max number of doorbells) and exercise the doorbell
alloc/dealloc code.

Since our usage mode require very few clients/doorbells, this code has
been exercised very lightly and it's good to have a simple test for it.

As reference, this test already helped identify the bug fixed by
commit 7f1ea2ac30 ("drm/i915/guc: Fix doorbell id selection").

v2: Extend number of clients; check for client allocation failure when
number of doorbells is exceeded; validate client properties; reuse
guc_init_doorbell_hw (Chris).

v3: guc_init_doorbell_hw test added per Chris suggestion.

v4: Try to explain why guc_init_doorbell_hw exist and comment some
details in the subtest.

v5: Remove redundant pr_info at the beginning of each subtest (Chris);
rebase (s/i915_guc_client/intel_guc_client/).

Signed-off-by: Michel Thierry <michel.thierry@intel.com>
Cc: Michal Wajdeczko <michal.wajdeczko@intel.com>
Cc: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com>
Cc: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
Link: https://patchwork.freedesktop.org/patch/msgid/20171116220632.1909-1-michel.thierry@intel.com
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
2017-11-17 10:02:39 +00:00
Chris Wilson
4fe95b042d drm/i915/selftests: exercise_ggtt may have nothing to do
When operating on the live_ggtt we have to find a usuable hole for our
test. It is possible for there to be no hole we can use, so initialise
the err to 0 for the early exit.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Link: https://patchwork.freedesktop.org/patch/msgid/20171115152558.31252-1-chris@chris-wilson.co.uk
Reviewed-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
2017-11-16 15:47:17 +00:00
Chris Wilson
24fd018aae drm/i915/selftests: Increase size for mock ringbuffer
We don't actually emit any commands into the ringbuffer, so we set it
very small. However, an upcoming change centralises the wait-for-space
into i915_gem_request_alloc() and that imposes a minimum size upon all
ringbuffers (mock or real) of MIN_SPACE_FOR_ADD_REQUEST. Grow the
mock ringbuffer such that we allocate a single page for the struct+buffer,
satisfying the new condition without wasting too much space.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Mika Kuoppala <mika.kuoppala@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20171115151204.8105-1-chris@chris-wilson.co.uk
2017-11-15 17:12:49 +00:00
Chris Wilson
fb4e14860b drm/i915/selftests: Markup __iomem for igt_gem_coherency
Silence sparse warnings by using __iomem markup and io accessors.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Mika Kuoppala <mika.kuoppala@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20171114191842.19063-1-chris@chris-wilson.co.uk
2017-11-15 17:12:49 +00:00
Chris Wilson
6e1281412a drm/i915/selftests: Always initialise err
smatch does not track initialised values as well as gcc, and this
triggers many warnings by smatch not presented by gcc. Silence smatch by
initialising the error values to -ENODEV, which we use to denote
internal errors. (If we see a selftest fail with a silent -ENODEV, we
know smatch was right!)

v2: smatch was right about igt_create_vma(), it may unlikely fail on the
first object allocation which we want to be loud about.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Link: https://patchwork.freedesktop.org/patch/msgid/20171114223346.25958-1-chris@chris-wilson.co.uk
Reviewed-by: Dhinakaran Pandiyan <dhinakaran.pandiyan@intel.com>
2017-11-14 23:50:49 +00:00
Chris Wilson
9c52d1c816 drm/i915/selftests: Yet another forgotten mock_i915->mm initialiser
Move all of the i915->mm initialisation to a private function that can
be reused by the mock i915 device to save forgetting any more steps.

For example,
<7>[ 1542.046332] [IGT] drv_selftest: starting subtest mock_objects
<4>[ 1542.123924] Setting dangerous option mock_selftests - tainting kernel
<6>[ 1542.167941] i915: Performing mock selftests with st_random_seed=0x246f5ab5 st_timeout=1000
<4>[ 1542.178012] INFO: trying to register non-static key.
<4>[ 1542.178027] the code is fine but needs lockdep annotation.
<4>[ 1542.178032] turning off the locking correctness validator.
<4>[ 1542.178041] CPU: 3 PID: 6008 Comm: kworker/3:7 Tainted: G     U          4.14.0-rc8-CI-CI_DRM_3332+ #1
<4>[ 1542.178049] Hardware name:                  /NUC6CAYB, BIOS AYAPLCEL.86A.0040.2017.0619.1722 06/19/2017
<4>[ 1542.178144] Workqueue: events __i915_gem_free_work [i915]
<4>[ 1542.178152] Call Trace:
<4>[ 1542.178163]  dump_stack+0x68/0x9f
<4>[ 1542.178170]  register_lock_class+0x3fd/0x580
<4>[ 1542.178177]  ? unwind_next_frame+0x14/0x20
<4>[ 1542.178184]  ? __save_stack_trace+0x73/0xd0
<4>[ 1542.178191]  __lock_acquire+0xa4/0x1b00
<4>[ 1542.178254]  ? __i915_gem_free_work+0x28/0xa0 [i915]
<4>[ 1542.178261]  ? __lock_acquire+0x4ab/0x1b00
<4>[ 1542.178268]  lock_acquire+0xb0/0x200
<4>[ 1542.178273]  ? lock_acquire+0xb0/0x200
<4>[ 1542.178336]  ? __i915_gem_free_work+0x28/0xa0 [i915]
<4>[ 1542.178344]  _raw_spin_lock+0x32/0x50
<4>[ 1542.178405]  ? __i915_gem_free_work+0x28/0xa0 [i915]
<4>[ 1542.178468]  __i915_gem_free_work+0x28/0xa0 [i915]
<4>[ 1542.178476]  process_one_work+0x221/0x650
<4>[ 1542.178483]  worker_thread+0x4e/0x3c0
<4>[ 1542.178489]  kthread+0x114/0x150
<4>[ 1542.178494]  ? process_one_work+0x650/0x650
<4>[ 1542.178499]  ? kthread_create_on_node+0x40/0x40
<4>[ 1542.178506]  ret_from_fork+0x27/0x40

v2: Fish out i915->mm.object_stat_lock which was being inited over in
i915_drv.c (Matthew)

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Matthew Auld <matthew.auld@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20171110232447.21618-1-chris@chris-wilson.co.uk
Reviewed-by: Matthew Auld <matthew.auld@intel.com>
2017-11-10 23:42:49 +00:00
Chris Wilson
9511ce1ce7 drm/i915/selftests: Initialise mock_i915->mm.obj_lock
lockdep spotted that the mock tests were using the i915->mm.obj_lock
without first initialiasing it:

>[ 1303.217043] [IGT] drv_selftest: starting subtest mock_objects
<4>[ 1303.240898] Setting dangerous option mock_selftests - tainting kernel
<6>[ 1303.253665] i915: Performing mock selftests with st_random_seed=0xd87ea6c6 st_timeout=1000
<4>[ 1303.254812] INFO: trying to register non-static key.
<4>[ 1303.254816] the code is fine but needs lockdep annotation.
<4>[ 1303.254818] turning off the locking correctness validator.
<4>[ 1303.254820] CPU: 4 PID: 13112 Comm: drv_selftest Tainted: G     U  W       4.14.0-rc8-CI-Patchwork_7058+ #1
<4>[ 1303.254823] Hardware name: MSI MS-7924/Z97M-G43(MS-7924), BIOS V1.12 02/15/2016
<4>[ 1303.254825] Call Trace:
<4>[ 1303.254829]  dump_stack+0x68/0x9f
<4>[ 1303.254832]  register_lock_class+0x3fd/0x580
<4>[ 1303.254835]  ? ___slab_alloc.constprop.29+0x157/0x3d0
<4>[ 1303.254837]  ? ___slab_alloc.constprop.29+0x157/0x3d0
<4>[ 1303.254840]  ? sg_kmalloc+0x1e/0x50
<4>[ 1303.254842]  ? debug_smp_processor_id+0x17/0x20
<4>[ 1303.254845]  __lock_acquire+0xa4/0x1b00
<4>[ 1303.254884]  ? __i915_gem_object_set_pages+0x116/0x1f0 [i915]
<4>[ 1303.254887]  ? __this_cpu_preempt_check+0x13/0x20
<4>[ 1303.254889]  ? sg_kmalloc+0x1e/0x50
<4>[ 1303.254891]  lock_acquire+0xb0/0x200
<4>[ 1303.254893]  ? lock_acquire+0xb0/0x200
<4>[ 1303.254917]  ? __i915_gem_object_set_pages+0x116/0x1f0 [i915]
<4>[ 1303.254920]  _raw_spin_lock+0x32/0x50
<4>[ 1303.254944]  ? __i915_gem_object_set_pages+0x116/0x1f0 [i915]
<4>[ 1303.254967]  __i915_gem_object_set_pages+0x116/0x1f0 [i915]
<4>[ 1303.254991]  i915_gem_object_get_pages_phys+0x286/0x2b0 [i915]
<4>[ 1303.255015]  ____i915_gem_object_get_pages+0x20/0x60 [i915]
<4>[ 1303.255039]  i915_gem_object_attach_phys+0x137/0x1a0 [i915]
<4>[ 1303.255063]  igt_phys_object+0x45/0x120 [i915]
<4>[ 1303.255094]  __i915_subtests+0x40/0xd0 [i915]
<4>[ 1303.255099]  ? work_on_cpu_safe+0x60/0x60
<4>[ 1303.255131]  i915_gem_object_mock_selftests+0x34/0x50 [i915]

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Matthew Auld <matthew.william.auld@gmail.com>
Cc: Mika Kuoppala <mika.kuoppala@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20171110151919.18451-1-chris@chris-wilson.co.uk
Reviewed-by: Matthew Auld <matthew.william.auld@gmail.com>
Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
2017-11-10 16:21:17 +00:00
Hans de Goede
a5266db4d3 drm/i915: Acquire PUNIT->PMIC bus for intel_uncore_forcewake_reset()
intel_uncore_forcewake_reset() does forcewake puts and gets as such
we need to make sure that no-one tries to access the PUNIT->PMIC bus
(on systems where this bus is shared) while it runs, otherwise bad
things happen.

Normally this is taken care of by the i915_pmic_bus_access_notifier()
which does an intel_uncore_forcewake_get(FORCEWAKE_ALL) when some other
driver tries to access the PMIC bus, so that later forcewake gets are
no-ops (for the duration of the bus access).

But intel_uncore_forcewake_reset gets called in 3 cases:
1) Before registering the pmic_bus_access_notifier
2) After unregistering the pmic_bus_access_notifier
3) To reset forcewake state on a GPU reset

In all 3 cases the i915_pmic_bus_access_notifier() protection is
insufficient.

This commit fixes this race by calling iosf_mbi_punit_acquire() before
calling intel_uncore_forcewake_reset(). In the case where it is called
directly after unregistering the pmic_bus_access_notifier, we need to
hold the punit-lock over both calls to avoid a race where
intel_uncore_fw_release_timer() may execute between the 2 calls.

Reviewed-by: Imre Deak <imre.deak@intel.com>
Acked-by: Ingo Molnar <mingo@kernel.org>
Signed-off-by: Hans de Goede <hdegoede@redhat.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20171019111620.26761-3-hdegoede@redhat.com
2017-11-10 13:14:03 +01:00
Chris Wilson
693b1ccabe drm/i915/selftests: Take rpm wakeref around partial tiling tests
Since the partial tiling tests are poking into the GGTT to watch the
fence registers in operation, it itself needs the device rpm wakeref in
order for the GGTT to remain accessible.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Cc: Mika Kuoppala <mika.kuoppala@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20171107115653.10716-1-chris@chris-wilson.co.uk
Reviewed-by: Matthew Auld <matthew.auld@intel.com>
2017-11-07 17:50:34 +00:00
Chris Wilson
c29ccb9f42 drm/i915/selftests: Take rpm wakeref around GGTT lowlevel tests
The vma routines are responsible for acquiring the device rpm wakeref
before they poke the HW. However, some of the selftests bypass the
higher level vma routines in order to poke directly at the lowlevel GGTT
functions; these are then responsible for managing rpm themselves.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Cc: Mika Kuoppala <mika.kuoppala@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20171107114051.10583-1-chris@chris-wilson.co.uk
Reviewed-by: Matthew Auld <matthew.auld@intel.com>
2017-11-07 17:50:33 +00:00
Chris Wilson
f6d03042b9 drm/i915/selftests: Skip mixed page exhaustion if only small pages available
If we only have 4k pages, we can't mix together different combinations
of hugepages to see if the world burns. However, as the loops did
nothing, we never set err to 0 and reported ENODEV aborting the test.
Teach the test to skip instead.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Matthew Auld <matthew.william.auld@gmail.com>
Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20171107110559.6098-1-chris@chris-wilson.co.uk
Reviewed-by: Matthew Auld <matthew.william.auld@gmail.com>
2017-11-07 17:50:33 +00:00
Chris Wilson
69ea47a5a9 drm/i915/selftests: Hide dangerous tests
Some tests are designed to exercise the limits of the HW and may trigger
unintended side-effects making the machine unusable. This should not be
executed by default, but are still useful for early platform validation.

References: https://bugs.freedesktop.org/show_bug.cgi?id=103453
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20171025153207.9589-1-chris@chris-wilson.co.uk
Reviewed-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
2017-11-06 11:17:03 +00:00
Chris Wilson
02a1ca45cf Revert "drm/i915/selftests: Convert timers to use timer_setup()"
This reverts commit 6d0dbd3096.

timer_setup_on_stack() does not yet exist:

In file included from drivers/gpu/drm/i915/i915_sw_fence.c:517:0:
drivers/gpu/drm/i915/selftests/lib_sw_fence.c: In function ‘timed_fence_init’:
drivers/gpu/drm/i915/selftests/lib_sw_fence.c:63:2: error: implicit declaration of function ‘timer_setup_on_stack’; did you mean ‘hrtimer_init_on_stack’? [-Werror=implicit-function-declaration]
  timer_setup_on_stack(&tf->timer, timed_fence_wake, 0);

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20171025131336.2584-1-chris@chris-wilson.co.uk
Acked-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
2017-10-25 14:16:16 +01:00
Kees Cook
6d0dbd3096 drm/i915/selftests: Convert timers to use timer_setup()
In preparation for unconditionally passing the struct timer_list pointer to
all timer callbacks, switch to using the new timer_setup() and from_timer()
to pass the timer pointer explicitly.

Cc: Jani Nikula <jani.nikula@linux.intel.com>
Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Cc: Rodrigo Vivi <rodrigo.vivi@intel.com>
Cc: David Airlie <airlied@linux.ie>
Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Cc: Chris Wilson <chris@chris-wilson.co.uk>
Cc: intel-gfx@lists.freedesktop.org
Cc: dri-devel@lists.freedesktop.org
Signed-off-by: Kees Cook <keescook@chromium.org>
Link: https://patchwork.freedesktop.org/patch/msgid/20171024151344.GA104417@beast
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
2017-10-25 12:13:10 +01:00
Chris Wilson
b1f9107e1b drm/i915/selftests: Don't try to queue a request with zero delay
Instead of trying to create a timer with zero delay (i.e. with expires
set to the current jiffies and not the future, an already expired
timer), execute that request immediately.

v2: Refactor list_del_init+signal into its own little function.
v3: Reorder testing so as not to immediately signal a delayed request.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Link: https://patchwork.freedesktop.org/patch/msgid/20171024220855.30155-1-chris@chris-wilson.co.uk
Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Reviewed-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
2017-10-25 12:13:03 +01:00
Kees Cook
39cbf2aa41 drm/i915: Convert timers to use timer_setup()
In preparation for unconditionally passing the struct timer_list pointer to
all timer callbacks, switch to using the new timer_setup() and from_timer()
to pass the timer pointer explicitly.

Cc: Daniel Vetter <daniel.vetter@intel.com>
Cc: Jani Nikula <jani.nikula@linux.intel.com>
Cc: David Airlie <airlied@linux.ie>
Cc: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Cc: Oscar Mateo <oscar.mateo@intel.com>
Cc: intel-gfx@lists.freedesktop.org
Cc: dri-devel@lists.freedesktop.org
Signed-off-by: Kees Cook <keescook@chromium.org>
Reviewed-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Signed-off-by: Jani Nikula <jani.nikula@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20171017065304.3358-1-joonas.lahtinen@linux.intel.com
2017-10-18 14:56:10 +03:00
Chris Wilson
134649ff35 drm/i915/selftests: Silence the compiler for impossible errors
It should be impossible for these tests not to run due to an empty
ppgtt, but if it should happen, let's report ENODEV (our typical
internal error for impossible events).

In file included from drivers/gpu/drm/i915/i915_gem.c:5415:
   drivers/gpu/drm/i915/selftests/huge_pages.c: In function 'igt_mock_ppgtt_huge_fill':
>> drivers/gpu/drm/i915/selftests/huge_pages.c:612: error: 'err' may be used uninitialized in this function
   drivers/gpu/drm/i915/selftests/huge_pages.c: In function 'igt_ppgtt_exhaust_huge':
   drivers/gpu/drm/i915/selftests/huge_pages.c:1159: error: 'err' may be used uninitialized in this function

Reported-by: kbuild-all@01.org
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Matthew Auld <matthew.auld@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20171017103723.6933-1-chris@chris-wilson.co.uk
Reviewed-by: Matthew Auld <matthew.auld@intel.com>
2017-10-17 14:07:47 +01:00
Chris Wilson
f2123818ff drm/i915: Move dev_priv->mm.[un]bound_list to its own lock
Remove the struct_mutex requirement around dev_priv->mm.bound_list and
dev_priv->mm.unbound_list by giving it its own spinlock. This reduces
one more requirement for struct_mutex and in the process gives us
slightly more accurate unbound_list tracking, which should improve the
shrinker - but the drawback is that we drop the retirement before
counting so i915_gem_object_is_active() may be stale and lead us to
underestimate the number of objects that may be shrunk (see commit
bed50aea61 ("drm/i915/shrinker: Flush active on objects before
counting")).

v2: Crosslink the spinlock to the lists it protects, and btw this
changes s/obj->global_link/obj->mm.link/
v3: Fix decoupling of old links in i915_gem_object_attach_phys()
v3.1: Fix the fix, only unlink if it was linked
v3.2: Use a local for to_i915(obj->base.dev)->mm.obj_lock

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20171016114037.5556-1-chris@chris-wilson.co.uk
2017-10-16 20:44:19 +01:00
Chris Wilson
9c1477e83e drm/i915/selftests: Exercise adding requests to a full GGTT
A bug recently encountered involved the issue where are we were
submitting requests to different ppGTT, each would pin a segment of the
GGTT for its logical context and ring. However, this is invisible to
eviction as we do not tie the context/ring VMA to a request and so do
not automatically wait upon it them (instead they are marked as pinned,
preventing eviction entirely). Instead the eviction code must flush those
contexts by switching to the kernel context. This selftest tries to
fill the GGTT with contexts to exercise a path where the
switch-to-kernel-context failed to make forward progress and we fail
with ENOSPC.

v2: Make the hole in the filled GGTT explicit.
v3: Swap out the arbitrary timeout for a private notification from
i915_gem_evict_something()

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20171012125726.14736-3-chris@chris-wilson.co.uk
Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
2017-10-12 21:06:26 +01:00
Chris Wilson
214707fc2c drm/i915/selftests: Wrap a timer into a i915_sw_fence
For some selftests, we want to issue requests but delay them going to
hardware. Furthermore, we don't want those requests to block
indefinitely (or else we may hang the driver and block testing) so we
want to employ a timeout. So naturally we want a fence that is
automatically signaled by a timer.

v2: Add kselftests.
v3: Limit the API available to selftests; there isn't an overwhelming
reason to export it universally.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20171012125726.14736-2-chris@chris-wilson.co.uk
Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
2017-10-12 21:06:26 +01:00
Daniel Vetter
af7a8ffad9 drm/i915: Use rcu instead of stop_machine in set_wedged
stop_machine is not really a locking primitive we should use, except
when the hw folks tell us the hw is broken and that's the only way to
work around it.

This patch tries to address the locking abuse of stop_machine() from

commit 20e4933c47
Author: Chris Wilson <chris@chris-wilson.co.uk>
Date:   Tue Nov 22 14:41:21 2016 +0000

    drm/i915: Stop the machine as we install the wedged submit_request handler

Chris said parts of the reasons for going with stop_machine() was that
it's no overhead for the fast-path. But these callbacks use irqsave
spinlocks and do a bunch of MMIO, and rcu_read_lock is _real_ fast.

To stay as close as possible to the stop_machine semantics we first
update all the submit function pointers to the nop handler, then call
synchronize_rcu() to make sure no new requests can be submitted. This
should give us exactly the huge barrier we want.

I pondered whether we should annotate engine->submit_request as __rcu
and use rcu_assign_pointer and rcu_dereference on it. But the reason
behind those is to make sure the compiler/cpu barriers are there for
when you have an actual data structure you point at, to make sure all
the writes are seen correctly on the read side. But we just have a
function pointer, and .text isn't changed, so no need for these
barriers and hence no need for annotations.

Unfortunately there's a complication with the call to
intel_engine_init_global_seqno:

- Without stop_machine we must hold the corresponding spinlock.

- Without stop_machine we must ensure that all requests are marked as
  having failed with dma_fence_set_error() before we call it. That
  means we need to split the nop request submission into two phases,
  both synchronized with rcu:

  1. Only stop submitting the requests to hw and mark them as failed.

  2. After all pending requests in the scheduler/ring are suitably
  marked up as failed and we can force complete them all, also force
  complete by calling intel_engine_init_global_seqno().

This should fix the followwing lockdep splat:

======================================================
WARNING: possible circular locking dependency detected
4.14.0-rc3-CI-CI_DRM_3179+ #1 Tainted: G     U
------------------------------------------------------
kworker/3:4/562 is trying to acquire lock:
 (cpu_hotplug_lock.rw_sem){++++}, at: [<ffffffff8113d4bc>] stop_machine+0x1c/0x40

but task is already holding lock:
 (&dev->struct_mutex){+.+.}, at: [<ffffffffa0136588>] i915_reset_device+0x1e8/0x260 [i915]

which lock already depends on the new lock.

the existing dependency chain (in reverse order) is:

-> #6 (&dev->struct_mutex){+.+.}:
       __lock_acquire+0x1420/0x15e0
       lock_acquire+0xb0/0x200
       __mutex_lock+0x86/0x9b0
       mutex_lock_interruptible_nested+0x1b/0x20
       i915_mutex_lock_interruptible+0x51/0x130 [i915]
       i915_gem_fault+0x209/0x650 [i915]
       __do_fault+0x1e/0x80
       __handle_mm_fault+0xa08/0xed0
       handle_mm_fault+0x156/0x300
       __do_page_fault+0x2c5/0x570
       do_page_fault+0x28/0x250
       page_fault+0x22/0x30

-> #5 (&mm->mmap_sem){++++}:
       __lock_acquire+0x1420/0x15e0
       lock_acquire+0xb0/0x200
       __might_fault+0x68/0x90
       _copy_to_user+0x23/0x70
       filldir+0xa5/0x120
       dcache_readdir+0xf9/0x170
       iterate_dir+0x69/0x1a0
       SyS_getdents+0xa5/0x140
       entry_SYSCALL_64_fastpath+0x1c/0xb1

-> #4 (&sb->s_type->i_mutex_key#5){++++}:
       down_write+0x3b/0x70
       handle_create+0xcb/0x1e0
       devtmpfsd+0x139/0x180
       kthread+0x152/0x190
       ret_from_fork+0x27/0x40

-> #3 ((complete)&req.done){+.+.}:
       __lock_acquire+0x1420/0x15e0
       lock_acquire+0xb0/0x200
       wait_for_common+0x58/0x210
       wait_for_completion+0x1d/0x20
       devtmpfs_create_node+0x13d/0x160
       device_add+0x5eb/0x620
       device_create_groups_vargs+0xe0/0xf0
       device_create+0x3a/0x40
       msr_device_create+0x2b/0x40
       cpuhp_invoke_callback+0xc9/0xbf0
       cpuhp_thread_fun+0x17b/0x240
       smpboot_thread_fn+0x18a/0x280
       kthread+0x152/0x190
       ret_from_fork+0x27/0x40

-> #2 (cpuhp_state-up){+.+.}:
       __lock_acquire+0x1420/0x15e0
       lock_acquire+0xb0/0x200
       cpuhp_issue_call+0x133/0x1c0
       __cpuhp_setup_state_cpuslocked+0x139/0x2a0
       __cpuhp_setup_state+0x46/0x60
       page_writeback_init+0x43/0x67
       pagecache_init+0x3d/0x42
       start_kernel+0x3a8/0x3fc
       x86_64_start_reservations+0x2a/0x2c
       x86_64_start_kernel+0x6d/0x70
       verify_cpu+0x0/0xfb

-> #1 (cpuhp_state_mutex){+.+.}:
       __lock_acquire+0x1420/0x15e0
       lock_acquire+0xb0/0x200
       __mutex_lock+0x86/0x9b0
       mutex_lock_nested+0x1b/0x20
       __cpuhp_setup_state_cpuslocked+0x53/0x2a0
       __cpuhp_setup_state+0x46/0x60
       page_alloc_init+0x28/0x30
       start_kernel+0x145/0x3fc
       x86_64_start_reservations+0x2a/0x2c
       x86_64_start_kernel+0x6d/0x70
       verify_cpu+0x0/0xfb

-> #0 (cpu_hotplug_lock.rw_sem){++++}:
       check_prev_add+0x430/0x840
       __lock_acquire+0x1420/0x15e0
       lock_acquire+0xb0/0x200
       cpus_read_lock+0x3d/0xb0
       stop_machine+0x1c/0x40
       i915_gem_set_wedged+0x1a/0x20 [i915]
       i915_reset+0xb9/0x230 [i915]
       i915_reset_device+0x1f6/0x260 [i915]
       i915_handle_error+0x2d8/0x430 [i915]
       hangcheck_declare_hang+0xd3/0xf0 [i915]
       i915_hangcheck_elapsed+0x262/0x2d0 [i915]
       process_one_work+0x233/0x660
       worker_thread+0x4e/0x3b0
       kthread+0x152/0x190
       ret_from_fork+0x27/0x40

other info that might help us debug this:

Chain exists of:
  cpu_hotplug_lock.rw_sem --> &mm->mmap_sem --> &dev->struct_mutex

 Possible unsafe locking scenario:

       CPU0                    CPU1
       ----                    ----
  lock(&dev->struct_mutex);
                               lock(&mm->mmap_sem);
                               lock(&dev->struct_mutex);
  lock(cpu_hotplug_lock.rw_sem);

 *** DEADLOCK ***

3 locks held by kworker/3:4/562:
 #0:  ("events_long"){+.+.}, at: [<ffffffff8109c64a>] process_one_work+0x1aa/0x660
 #1:  ((&(&i915->gpu_error.hangcheck_work)->work)){+.+.}, at: [<ffffffff8109c64a>] process_one_work+0x1aa/0x660
 #2:  (&dev->struct_mutex){+.+.}, at: [<ffffffffa0136588>] i915_reset_device+0x1e8/0x260 [i915]

stack backtrace:
CPU: 3 PID: 562 Comm: kworker/3:4 Tainted: G     U          4.14.0-rc3-CI-CI_DRM_3179+ #1
Hardware name:                  /NUC7i5BNB, BIOS BNKBL357.86A.0048.2017.0704.1415 07/04/2017
Workqueue: events_long i915_hangcheck_elapsed [i915]
Call Trace:
 dump_stack+0x68/0x9f
 print_circular_bug+0x235/0x3c0
 ? lockdep_init_map_crosslock+0x20/0x20
 check_prev_add+0x430/0x840
 ? irq_work_queue+0x86/0xe0
 ? wake_up_klogd+0x53/0x70
 __lock_acquire+0x1420/0x15e0
 ? __lock_acquire+0x1420/0x15e0
 ? lockdep_init_map_crosslock+0x20/0x20
 lock_acquire+0xb0/0x200
 ? stop_machine+0x1c/0x40
 ? i915_gem_object_truncate+0x50/0x50 [i915]
 cpus_read_lock+0x3d/0xb0
 ? stop_machine+0x1c/0x40
 stop_machine+0x1c/0x40
 i915_gem_set_wedged+0x1a/0x20 [i915]
 i915_reset+0xb9/0x230 [i915]
 i915_reset_device+0x1f6/0x260 [i915]
 ? gen8_gt_irq_ack+0x170/0x170 [i915]
 ? work_on_cpu_safe+0x60/0x60
 i915_handle_error+0x2d8/0x430 [i915]
 ? vsnprintf+0xd1/0x4b0
 ? scnprintf+0x3a/0x70
 hangcheck_declare_hang+0xd3/0xf0 [i915]
 ? intel_runtime_pm_put+0x56/0xa0 [i915]
 i915_hangcheck_elapsed+0x262/0x2d0 [i915]
 process_one_work+0x233/0x660
 worker_thread+0x4e/0x3b0
 kthread+0x152/0x190
 ? process_one_work+0x660/0x660
 ? kthread_create_on_node+0x40/0x40
 ret_from_fork+0x27/0x40
Setting dangerous option reset - tainting kernel
i915 0000:00:02.0: Resetting chip after gpu hang
Setting dangerous option reset - tainting kernel
i915 0000:00:02.0: Resetting chip after gpu hang

v2: Have 1 global synchronize_rcu() barrier across all engines, and
improve commit message.

v3: We need to protect the seqno update with the timeline spinlock (in
set_wedged) to avoid racing with other updates of the seqno, like we
already do in nop_submit_request (Chris).

v4: Use two-phase sequence to plug the race Chris spotted where we can
complete requests before they're marked up with -EIO.

v5: Review from Chris:
- simplify nop_submit_request.
- Add comment to rcu_read_lock section.
- Align comments with the new style.

v6: Remove unused variable to appease CI.

Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=102886
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=103096
Cc: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Mika Kuoppala <mika.kuoppala@intel.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Marta Lofstedt <marta.lofstedt@intel.com>
Signed-off-by: Daniel Vetter <daniel.vetter@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20171011091019.1425-1-daniel.vetter@ffwll.ch
2017-10-11 17:51:21 +02:00
Matthew Auld
617dc7610d drm/i915/selftests: ditch the kernel context
There's really no good reason to be using the kernel context for the
huge-page livetests. Also with the introduction of commit bef27bdb6c
("drm/i915: Assert we do not try to expand VMA for hugepage inside GGTT")
we start hitting the bug on in the selftests, since the kernel context
will always return true for i915_vma_is_ggtt(), so now seems like the
opportune time to instead create our own context.

Fixes: 4049866f09 ("drm/i915/selftests: huge page tests")
Fixes: bef27bdb6c ("drm/i915: Assert we do not try to expand VMA for hugepage inside GGTT")
Signed-off-by: Matthew Auld <matthew.auld@intel.com>
Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Cc: Chris Wilson <chris@chris-wilson.co.uk>
Link: https://patchwork.freedesktop.org/patch/msgid/20171010133030.12112-1-matthew.auld@intel.com
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
2017-10-10 21:30:24 +01:00
Matthew Auld
84e8978e62 drm/i915: s/sg_mask/sg_page_sizes/
It's a little unclear what the sg_mask actually is, so prefer the more
meaningful name of sg_page_sizes.

Suggested-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Signed-off-by: Matthew Auld <matthew.auld@intel.com>
Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Cc: Chris Wilson <chris@chris-wilson.co.uk>
Link: https://patchwork.freedesktop.org/patch/msgid/20171009110024.29114-1-matthew.auld@intel.com
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
2017-10-09 17:07:29 +01:00
Chris Wilson
b4563f595e drm/i915: Pin fence for iomap
Acquire the fence register for the iomap in i915_vma_pin_iomap() on
behalf of the caller.

We probably want for the caller to specify whether the fence should be
pinned for their usage, but at the moment all callers do want the
associated fence, or none, so take it on their behalf.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20171009084401.29090-1-chris@chris-wilson.co.uk
2017-10-09 17:07:29 +01:00
Chris Wilson
ff97d3ae69 drm/i915/selftests: Hold the rpm wakeref for the reset tests
The lowlevel reset functions expect the caller to be holding the rpm
wakeref for the device access across the reset. We were not explicitly
doing this in the sefltest, so for simplicity acquire the wakeref for
the duration of all subtests.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Mika Kuoppala <mika.kuoppala@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20171009110301.21705-4-chris@chris-wilson.co.uk
Reviewed-by: Mika Kuoppala <mika.kuoppala@linux.intel.com>
2017-10-09 17:07:28 +01:00
Chris Wilson
95a19ab4d7 drm/i915/selftests: Pretty print engine state when requests fail to start
During hangcheck testing, we try to execute requests following the GPU
reset, and in particular want to try and debug when those fail.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Mika Kuoppala <mika.kuoppala@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20171009110301.21705-2-chris@chris-wilson.co.uk
Reviewed-by: Mika Kuoppala <mika.kuoppala@intel.com>
2017-10-09 17:07:28 +01:00
Matthew Auld
a883241c39 drm/i915: enable platform support for 2M pages
For gen8+ platforms which support the 48b PPGTT, enable platform level
support for 2M pages. Also enable for mock testing.

Signed-off-by: Matthew Auld <matthew.auld@intel.com>
Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Cc: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
Link: https://patchwork.freedesktop.org/patch/msgid/20171006145041.21673-22-matthew.auld@intel.com
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Link: https://patchwork.freedesktop.org/patch/msgid/20171006221833.32439-21-chris@chris-wilson.co.uk
2017-10-07 10:12:05 +01:00
Matthew Auld
f1f3f98272 drm/i915: enable platform support for 64K pages
For gen9+ enable platform level support for 64K pages. Also enable for
mock testing.

Signed-off-by: Matthew Auld <matthew.auld@intel.com>
Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Cc: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
Link: https://patchwork.freedesktop.org/patch/msgid/20171006145041.21673-21-matthew.auld@intel.com
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Link: https://patchwork.freedesktop.org/patch/msgid/20171006221833.32439-20-chris@chris-wilson.co.uk
2017-10-07 10:12:04 +01:00
Matthew Auld
7924d9d4dc drm/i915/selftests: mix huge pages
Try to mix sg page sizes for 4K, 64K and 2M pages.

v2: s/BIT(x) >> 12/BIT(x) >> PAGE_SHIFT/

Suggested-by: Chris Wilson <chris@chris-wilson.co.uk>
Signed-off-by: Matthew Auld <matthew.auld@intel.com>
Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Cc: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
Link: https://patchwork.freedesktop.org/patch/msgid/20171006145041.21673-19-matthew.auld@intel.com
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Link: https://patchwork.freedesktop.org/patch/msgid/20171006221833.32439-18-chris@chris-wilson.co.uk
2017-10-07 10:12:03 +01:00
Matthew Auld
4049866f09 drm/i915/selftests: huge page tests
v2: mock test page support configurations and add MI_STORE_DWORD test

v3: run all mockable huge page tests on all platforms via the mock_device

v4: add pin_update regression test
    various improvements suggested by Chris

v5: fix issues reported by kbuild
    test single sg spanning multiple page sizes
    don't explode when running the live-tests through the appgtt

v6: lots of improvements from Chris

v7: run on each engine for igt_write_huge
    add simple tmpfs fallback test

v8: size_t is bad
    don't break the i386 build

Signed-off-by: Matthew Auld <matthew.auld@intel.com>
Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Cc: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
Link: https://patchwork.freedesktop.org/patch/msgid/20171006145041.21673-18-matthew.auld@intel.com
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Link: https://patchwork.freedesktop.org/patch/msgid/20171006221833.32439-17-chris@chris-wilson.co.uk
2017-10-07 10:12:00 +01:00
Matthew Auld
fa3f46afd3 drm/i915: introduce vm set_pages/clear_pages
Move the setting/clearing of the vma->pages to a vm operation. Doing so
neatens things up a little, but more importantly gives us a sane place
to also set/clear the vma->pages_sizes, which we introduce later in
preparation for supporting huge-pages.

v2: remove redundant vma->pages check

v3: GEM_BUG_ON(vma->pages) following i915_vma_remove

Suggested-by: Chris Wilson <chris@chris-wilson.co.uk>
Signed-off-by: Matthew Auld <matthew.auld@intel.com>
Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Cc: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20171006145041.21673-8-matthew.auld@intel.com
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Link: https://patchwork.freedesktop.org/patch/msgid/20171006221833.32439-7-chris@chris-wilson.co.uk
2017-10-07 10:11:50 +01:00
Matthew Auld
a5c0816626 drm/i915: introduce page_size members
In preparation for supporting huge gtt pages for the ppgtt, we introduce
page size members for gem objects.  We fill in the page sizes by
scanning the sg table.

v2: pass the sg_mask to set_pages

v3: calculate the sg_mask inline with populating the sg_table where
possible, and pass to set_pages along with the pages.

v4: bunch of improvements from Joonas

v5: fix num_pages blunder
    introduce i915_sg_page_sizes helper

v6: prefer GEM_BUG_ON(sizes == 0)

Signed-off-by: Matthew Auld <matthew.auld@intel.com>
Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Cc: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Daniel Vetter <daniel@ffwll.ch>
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
Link: https://patchwork.freedesktop.org/patch/msgid/20171006145041.21673-7-matthew.auld@intel.com
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Link: https://patchwork.freedesktop.org/patch/msgid/20171006221833.32439-6-chris@chris-wilson.co.uk
2017-10-07 10:11:48 +01:00
Matthew Auld
b91b09eea7 drm/i915: push set_pages down to the callers
Each backend is now responsible for calling __i915_gem_object_set_pages
upon successfully gathering its backing storage. This eliminates the
inconsistency between the async and sync paths, which stands out even
more when we start throwing around an sg_mask in a later patch.

Suggested-by: Chris Wilson <chris@chris-wilson.co.uk>
Signed-off-by: Matthew Auld <matthew.auld@intel.com>
Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Cc: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
Link: https://patchwork.freedesktop.org/patch/msgid/20171006145041.21673-6-matthew.auld@intel.com
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Link: https://patchwork.freedesktop.org/patch/msgid/20171006221833.32439-5-chris@chris-wilson.co.uk
2017-10-07 10:11:45 +01:00
Matthew Auld
2a9654b2cd drm/i915: introduce page_sizes field to dev_info
In preparation for huge gtt pages expose page_sizes as part of the
device info, to indicate the page sizes supported by the HW.  Currently
only 4K is supported.

v2: s/page_size_mask/page_sizes/

v3: introduce I915_GTT_MAX_PAGE_SIZE

Signed-off-by: Matthew Auld <matthew.auld@intel.com>
Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Cc: Mika Kuoppala <mika.kuoppala@linux.intel.com>
Cc: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
Link: https://patchwork.freedesktop.org/patch/msgid/20171006145041.21673-5-matthew.auld@intel.com
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Link: https://patchwork.freedesktop.org/patch/msgid/20171006221833.32439-4-chris@chris-wilson.co.uk
2017-10-07 10:11:44 +01:00
Matthew Auld
465c403cb5 drm/i915: introduce simple gemfs
Not a fully blown gemfs, just our very own tmpfs kernel mount. Doing so
moves us away from the shmemfs shm_mnt, and gives us the much needed
flexibility to do things like set our own mount options, namely huge=
which should allow us to enable the use of transparent-huge-pages for
our shmem backed objects.

v2: various improvements suggested by Joonas

v3: move gemfs instance to i915.mm and simplify now that we have
file_setup_with_mnt

v4: fallback to tmpfs shm_mnt upon failure to setup gemfs

v5: make tmpfs fallback kinder

v5: better gemfs failure message
    flags variable

Signed-off-by: Matthew Auld <matthew.auld@intel.com>
Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Cc: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Dave Hansen <dave.hansen@intel.com>
Cc: Kirill A. Shutemov <kirill@shutemov.name>
Cc: Hugh Dickins <hughd@google.com>
Cc: linux-mm@kvack.org
Reviewed-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20171006145041.21673-3-matthew.auld@intel.com
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Link: https://patchwork.freedesktop.org/patch/msgid/20171006221833.32439-2-chris@chris-wilson.co.uk
2017-10-07 10:11:41 +01:00
Arnd Bergmann
764d2997ec drm/i915/selftests: fix check for intel IOMMU
An earlier bugfix tried to work around this build failure:

drivers/gpu/drm/i915/selftests/mock_gem_device.c: In function 'mock_gem_device':
drivers/gpu/drm/i915/selftests/mock_gem_device.c:151:20: error: 'struct dev_archdata' has no member named 'iommu'

Checking for CONFIG_IOMMU_API is not sufficient as a compile-time
test since that may be enabled in configurations that have neither
INTEL_IOMMU not AMD_IOMMU enabled. This changes the check to
INTEL_IOMMU instead, as this is the only case we actually care about.

Fixes: f46f156ea7 ("drm/i915/selftests: Only touch archdata.iommu when it exists")
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Link: https://patchwork.freedesktop.org/patch/msgid/20171005120749.400818-1-arnd@arndb.de
2017-10-05 15:25:27 +02:00
Chris Wilson
e91ef99b95 drm/i915/selftests: Remember to create the fake preempt context
For the fake device we have our own set of mock contexts that need to
match the real contexts we normally create. Currently this requires us
to manually instantiate them for the selftests, which I forgot.

Reported-by: Matthew Auld <matthew.william.auld@gmail.com>
Fixes: e7af311683 ("drm/i915: Introduce a preempt context")
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Michał Winiarski <michal.winiarski@intel.com>
Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Cc: Matthew Auld <matthew.william.auld@gmail.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20171005105927.22991-1-chris@chris-wilson.co.uk
Reviewed-by: Matthew Auld <matthew.william.auld@gmail.com>
2017-10-05 13:21:16 +01:00
Chris Wilson
60456d5c2d drm/i915/selftests: Replace wmb() with i915_gem_chipset_flush()
Currently, we are being fairly lazy and only using a wmb() following an
update to an active batch. Previously, we have found that to be
insufficient to ensure that a write from the CPU reaches memory in a
timely fashion, and in some caches we may need to flush a chipset cache.
To that end, we have i915_gem_chipset_flush() so use it.

Suggested-by: Mika Kuoppala <mika.kuoppala@linux.intel.com>
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Mika Kuoppala <mika.kuoppala@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20170926153409.7928-1-chris@chris-wilson.co.uk
Reviewed-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
2017-09-29 12:30:17 +01:00
Jani Nikula
32f35b8634 Merge drm-upstream/drm-next into drm-intel-next-queued
Need MST sideband message transaction to power up/down nodes.

Signed-off-by: Jani Nikula <jani.nikula@intel.com>
2017-09-28 15:56:49 +03:00
Chris Wilson
87dc03ad26 drm/i915/selftests: Try to recover from a wedged GPU during reset tests
If we see the seqno stop progressing, we abandon the test for fear that
the GPU died following the reset. However, during test teardown we still
wait for the GPU to idle before continuing, but we have already
confirmed that the GPU is dead. Furthermore, since we are inside a reset
test, we have disabled the hangchecker, and so there is no safety net and
we wait indefinitely. Detect the stuck GPU and declare it wedged as a
state of emergency so we can escape.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Jari Tahvanainen <jari.tahvanainen@intel.com>
Cc: Mika Kuoppala <mika.kuoppala@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20170915130929.18892-1-chris@chris-wilson.co.uk
Tested-by: Jari Tahvanainen <jari.tahvanainen@intel.com>
Reviewed-by: Mika Kuoppala <mika.kuoppala@linux.intel.com>
2017-09-26 14:19:25 +01:00
Chris Wilson
f46f156ea7 drm/i915/selftests: Only touch archdata.iommu when it exists
archdata.iommu only exists when CONFIG_IOMMU_API is enabled (and only
applies to intel-iommu in our case) so conditionally compile it out when
it doesn't exist.

Fixes: b5891fb520 ("drm/i915/selftests: Disable iommu for the mock device")
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Matthew Auld <matthew.auld@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20170918164652.14200-1-chris@chris-wilson.co.uk
Reviewed-by: Matthew Auld <matthew.auld@intel.com>
2017-09-19 10:13:50 +01:00
Chris Wilson
b5891fb520 drm/i915/selftests: Disable iommu for the mock device
On some machines, the iommu cannot allocate a domain for the mock device
causing the dma_map_sg() to fail, and the selftest to fail with -ENOMEM.
For the mock selftests, we are using a fake device and do not care about
iommu; so convince intel_iommu to treat us as a dummy device with an
identity mapping (and no iommu domain).

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=101080
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Link: https://patchwork.freedesktop.org/patch/msgid/20170914162240.18310-1-chris@chris-wilson.co.uk
Tested-by: Elizabeth De La Torre Mena <elizabethx.de.la.torre.mena@intel.com>
Reviewed-by: Matthew Auld <matthew.auld@intel.com
2017-09-18 16:57:35 +01:00
Michal Hocko
0ee931c4e3 mm: treewide: remove GFP_TEMPORARY allocation flag
GFP_TEMPORARY was introduced by commit e12ba74d8f ("Group short-lived
and reclaimable kernel allocations") along with __GFP_RECLAIMABLE.  It's
primary motivation was to allow users to tell that an allocation is
short lived and so the allocator can try to place such allocations close
together and prevent long term fragmentation.  As much as this sounds
like a reasonable semantic it becomes much less clear when to use the
highlevel GFP_TEMPORARY allocation flag.  How long is temporary? Can the
context holding that memory sleep? Can it take locks? It seems there is
no good answer for those questions.

The current implementation of GFP_TEMPORARY is basically GFP_KERNEL |
__GFP_RECLAIMABLE which in itself is tricky because basically none of
the existing caller provide a way to reclaim the allocated memory.  So
this is rather misleading and hard to evaluate for any benefits.

I have checked some random users and none of them has added the flag
with a specific justification.  I suspect most of them just copied from
other existing users and others just thought it might be a good idea to
use without any measuring.  This suggests that GFP_TEMPORARY just
motivates for cargo cult usage without any reasoning.

I believe that our gfp flags are quite complex already and especially
those with highlevel semantic should be clearly defined to prevent from
confusion and abuse.  Therefore I propose dropping GFP_TEMPORARY and
replace all existing users to simply use GFP_KERNEL.  Please note that
SLAB users with shrinkers will still get __GFP_RECLAIMABLE heuristic and
so they will be placed properly for memory fragmentation prevention.

I can see reasons we might want some gfp flag to reflect shorterm
allocations but I propose starting from a clear semantic definition and
only then add users with proper justification.

This was been brought up before LSF this year by Matthew [1] and it
turned out that GFP_TEMPORARY really doesn't have a clear semantic.  It
seems to be a heuristic without any measured advantage for most (if not
all) its current users.  The follow up discussion has revealed that
opinions on what might be temporary allocation differ a lot between
developers.  So rather than trying to tweak existing users into a
semantic which they haven't expected I propose to simply remove the flag
and start from scratch if we really need a semantic for short term
allocations.

[1] http://lkml.kernel.org/r/20170118054945.GD18349@bombadil.infradead.org

[akpm@linux-foundation.org: fix typo]
[akpm@linux-foundation.org: coding-style fixes]
[sfr@canb.auug.org.au: drm/i915: fix up]
  Link: http://lkml.kernel.org/r/20170816144703.378d4f4d@canb.auug.org.au
Link: http://lkml.kernel.org/r/20170728091904.14627-1-mhocko@kernel.org
Signed-off-by: Michal Hocko <mhocko@suse.com>
Signed-off-by: Stephen Rothwell <sfr@canb.auug.org.au>
Acked-by: Mel Gorman <mgorman@suse.de>
Acked-by: Vlastimil Babka <vbabka@suse.cz>
Cc: Matthew Wilcox <willy@infradead.org>
Cc: Neil Brown <neilb@suse.de>
Cc: "Theodore Ts'o" <tytso@mit.edu>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2017-09-13 18:53:16 -07:00
Chris Wilson
7ce5b6850b drm/i915/selftests: Use mul_u32_u32() for 32b x 32b -> 64b result
As realised by commit 9e3d6223d2 ("math64, timers: Fix 32bit
mul_u64_u32_shr() and friends"), GCC does not always generate ideal code
for performing a 32b x 32b multiply returning a 64b result (i.e. where
we idiomatically use u64 result = (u64)x * (u32)x).

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Link: https://patchwork.freedesktop.org/patch/msgid/20170913105154.2910-2-chris@chris-wilson.co.uk
Reviewed-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
2017-09-13 13:27:20 +01:00
Chris Wilson
d1b48c1e71 drm/i915: Replace execbuf vma ht with an idr
This was the competing idea long ago, but it was only with the rewrite
of the idr as an radixtree and using the radixtree directly ourselves,
along with the realisation that we can store the vma directly in the
radixtree and only need a list for the reverse mapping, that made the
patch performant enough to displace using a hashtable. Though the vma ht
is fast and doesn't require any extra allocation (as we can embed the node
inside the vma), it does require a thread for resizing and serialization
and will have the occasional slow lookup. That is hairy enough to
investigate alternatives and favour them if equivalent in peak performance.
One advantage of allocating an indirection entry is that we can support a
single shared bo between many clients, something that was done on a
first-come first-serve basis for shared GGTT vma previously. To offset
the extra allocations, we create yet another kmem_cache for them.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20170816085210.4199-5-chris@chris-wilson.co.uk
2017-08-18 11:59:02 +01:00
Chris Wilson
f2f5c0610f drm/i915: Don't use MI_STORE_DWORD_IMM on Sandybridge/vcs
MI_STORE_DWORD_IMM just doesn't work on the video decode engine under
Sandybridge, so refrain from using it. Then switch the selftests over to
using the now common test prior to using MI_STORE_DWORD_IMM.

Fixes: 7dd4f6729f ("drm/i915: Async GPU relocation processing")
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Cc: <drm-intel-fixes@lists.freedesktop.org> # v4.13-rc1+
Link: https://patchwork.freedesktop.org/patch/msgid/20170816085210.4199-1-chris@chris-wilson.co.uk
Reviewed-by: Mika Kuoppala <mika.kuoppala@linux.intel.com>
2017-08-18 11:55:02 +01:00
Chris Wilson
b8f55be644 drm/i915: Split obj->cache_coherent to track r/w
Another month, another story in the cache coherency saga. This time, we
come to the realisation that i915_gem_object_is_coherent() has been
reporting whether we can read from the target without requiring a cache
invalidate; but we were using it in places for testing whether we could
write into the object without requiring a cache flush. So split the
tracking into two, one to decide before reads, one after writes.

See commit e27ab73d17 ("drm/i915: Mark CPU cache as dirty on every
transition for CPU writes") for the previous entry in this saga.

v2: Be verbose
v3: Remove unused function (i915_gem_object_is_coherent)
v4: Fix inverted coherency check prior to execbuf (from v2)
v5: Add comment for nasty code where we are optimising on gcc's behalf.

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=101109
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=101555
Testcase: igt/kms_mmap_write_crc
Testcase: igt/kms_pwrite_crc
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Maarten Lankhorst <maarten.lankhorst@linux.intel.com>
Cc: Dongwon Kim <dongwon.kim@intel.com>
Cc: Matt Roper <matthew.d.roper@intel.com>
Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Cc: Mika Kuoppala <mika.kuoppala@linux.intel.com>
Tested-by: Maarten Lankhorst <maarten.lankhorst@linux.intel.com>
Acked-by: Maarten Lankhorst <maarten.lankhorst@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20170811111116.10373-1-chris@chris-wilson.co.uk
Reviewed-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
2017-08-15 15:46:57 +01:00