linux_dsm_epyc7002/drivers/gpu/drm/i915/gt
Chris Wilson 6cd34b10cd drm/i915/execlists: Backtrack along timeline
After a preempt-to-busy, we may find an active request that is caught
between execution states. Walk back along the timeline instead of the
execution list to be safe.
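
As a rough illustration of what walking back along the timeline means, here is a minimal sketch. It assumes a simplified, hypothetical request type (`struct sketch_request`, with a `prev` pointer to the older request on the same timeline and a `completed` flag). None of these names come from i915, and this is not the code from the patch.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical, simplified request; NOT the i915 structure. */
struct sketch_request {
	struct sketch_request *prev; /* older request on the same timeline */
	bool completed;              /* has the GPU finished this request? */
};

/*
 * Starting from rq, step back towards older requests on its timeline
 * and return the oldest one that has not completed yet. Returns NULL
 * if rq itself has already completed.
 */
static struct sketch_request *oldest_incomplete(struct sketch_request *rq)
{
	struct sketch_request *active = NULL;

	for (; rq; rq = rq->prev) {
		if (rq->completed)
			break; /* everything older is done as well */
		active = rq;
	}

	return active;
}
```

The walk only follows links owned by the request's own timeline, so in this sketch it does not depend on which scheduler list the request currently sits on.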

[  106.417541] i915 0000:00:02.0: Resetting rcs0 for preemption time out
[  106.417659] ==================================================================
[  106.418041] BUG: KASAN: slab-out-of-bounds in __execlists_reset+0x2f2/0x440 [i915]
[  106.418123] Read of size 8 at addr ffff888703506b30 by task swapper/1/0
[  106.418194]
[  106.418267] CPU: 1 PID: 0 Comm: swapper/1 Tainted: G     U            5.3.0-rc3+ #5
[  106.418344] Hardware name: Intel Corporation NUC7i5BNK/NUC7i5BNB, BIOS BNKBL357.86A.0052.2017.0918.1346 09/18/2017
[  106.418434] Call Trace:
[  106.418508]  <IRQ>
[  106.418585]  dump_stack+0x5b/0x90
[  106.418941]  ? __execlists_reset+0x2f2/0x440 [i915]
[  106.419022]  print_address_description+0x67/0x32d
[  106.419376]  ? __execlists_reset+0x2f2/0x440 [i915]
[  106.419731]  ? __execlists_reset+0x2f2/0x440 [i915]
[  106.419810]  __kasan_report.cold.6+0x1a/0x3c
[  106.419888]  ? __trace_bprintk+0xc0/0xd0
[  106.420239]  ? __execlists_reset+0x2f2/0x440 [i915]
[  106.420318]  check_memory_region+0x144/0x1c0
[  106.420671]  __execlists_reset+0x2f2/0x440 [i915]
[  106.421029]  execlists_reset+0x3d/0x50 [i915]
[  106.421387]  intel_engine_reset+0x203/0x3a0 [i915]
[  106.421744]  ? igt_reset_nop+0x2b0/0x2b0 [i915]
[  106.421825]  ? _raw_spin_trylock_bh+0xe0/0xe0
[  106.421901]  ? rcu_core+0x1b9/0x6a0
[  106.422251]  preempt_reset+0x9a/0xf0 [i915]
[  106.422333]  tasklet_action_common.isra.15+0xc0/0x1e0
[  106.422685]  ? execlists_submit_request+0x200/0x200 [i915]
[  106.422764]  __do_softirq+0x106/0x3cf
[  106.422840]  irq_exit+0xdc/0xf0
[  106.422914]  smp_apic_timer_interrupt+0x81/0x1c0
[  106.422988]  apic_timer_interrupt+0xf/0x20
[  106.423059]  </IRQ>
[  106.423144] RIP: 0010:cpuidle_enter_state+0xc3/0x620
[  106.423222] Code: 24 0f 1f 44 00 00 31 ff e8 da 87 9c ff 80 7c 24 10 00 74 12 9c 58 f6 c4 02 0f 85 33 05 00 00 31 ff e8 c1 77 a3 ff fb 45 85 e4 <0f> 89 bf 02 00 00 48 8d 7d 10 e8 4e 45 b9 ff c7 45 10 00 00 00 00
[  106.423311] RSP: 0018:ffff88881c30fda8 EFLAGS: 00000202 ORIG_RAX: ffffffffffffff13
[  106.423390] RAX: 0000000000000000 RBX: ffffffff825b4c80 RCX: ffffffff810c8a00
[  106.423465] RDX: dffffc0000000000 RSI: 0000000039f89620 RDI: ffff88881f6b00a8
[  106.423540] RBP: ffff88881f6b5bf8 R08: 0000000000000002 R09: 000000000002ed80
[  106.423616] R10: 0000003fdd956146 R11: ffff88881c2d1e47 R12: 0000000000000008
[  106.423691] R13: 0000000000000008 R14: ffffffff825b4f80 R15: ffffffff825b4fc0
[  106.423772]  ? sched_idle_set_state+0x20/0x30
[  106.423851]  ? cpuidle_enter_state+0xa6/0x620
[  106.423874]  ? tick_nohz_idle_stop_tick+0x1d1/0x3f0
[  106.423896]  cpuidle_enter+0x37/0x60
[  106.423919]  do_idle+0x246/0x280
[  106.423941]  ? arch_cpu_idle_exit+0x30/0x30
[  106.423964]  ? __wake_up_common+0x46/0x240
[  106.423986]  cpu_startup_entry+0x14/0x20
[  106.424009]  start_secondary+0x1b0/0x200
[  106.424031]  ? set_cpu_sibling_map+0x990/0x990
[  106.424054]  secondary_startup_64+0xa4/0xb0
[  106.424075]
[  106.424096] Allocated by task 626:
[  106.424119]  save_stack+0x19/0x80
[  106.424143]  __kasan_kmalloc.constprop.7+0xc1/0xd0
[  106.424165]  kmem_cache_alloc+0xb2/0x1d0
[  106.424277]  i915_sched_lookup_priolist+0x1ab/0x320 [i915]
[  106.424385]  execlists_submit_request+0x73/0x200 [i915]
[  106.424498]  submit_notify+0x59/0x60 [i915]
[  106.424600]  __i915_sw_fence_complete+0x9b/0x330 [i915]
[  106.424713]  __i915_request_commit+0x4bf/0x570 [i915]
[  106.424818]  intel_engine_pulse+0x213/0x310 [i915]
[  106.424925]  context_close+0x22f/0x470 [i915]
[  106.425033]  i915_gem_context_destroy_ioctl+0x7b/0xa0 [i915]
[  106.425058]  drm_ioctl_kernel+0x131/0x170
[  106.425081]  drm_ioctl+0x2d9/0x4f1
[  106.425104]  do_vfs_ioctl+0x115/0x890
[  106.425126]  ksys_ioctl+0x35/0x70
[  106.425147]  __x64_sys_ioctl+0x38/0x40
[  106.425169]  do_syscall_64+0x66/0x220
[  106.425191]  entry_SYSCALL_64_after_hwframe+0x44/0xa9
[  106.425213]
[  106.425234] Freed by task 0:
[  106.425255] (stack is not available)
[  106.425276]
[  106.425297] The buggy address belongs to the object at ffff888703506a40
[  106.425297]  which belongs to the cache i915_priolist of size 104
[  106.425321] The buggy address is located 136 bytes to the right of
[  106.425321]  104-byte region [ffff888703506a40, ffff888703506aa8)
[  106.425345] The buggy address belongs to the page:
[  106.425367] page:ffffea001c0d4180 refcount:1 mapcount:0 mapping:ffff88873e1cf740 index:0xffff888703506e40 compound_mapcount: 0
[  106.425391] flags: 0x8000000000010200(slab|head)
[  106.425415] raw: 8000000000010200 ffffea0020192b88 ffff8888174b5450 ffff88873e1cf740
[  106.425439] raw: ffff888703506e40 000000000010000e 00000001ffffffff 0000000000000000
[  106.425464] page dumped because: kasan: bad access detected
[  106.425486]
[  106.425506] Memory state around the buggy address:
[  106.425528]  ffff888703506a00: fc fc fc fc fc fc fc fc 00 00 00 00 00 00 00 00
[  106.425551]  ffff888703506a80: 00 00 00 00 00 fc fc fc fc fc fc fc fc fc fc fc
[  106.425573] >ffff888703506b00: fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc
[  106.425597]                                      ^
[  106.425619]  ffff888703506b80: fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc
[  106.425642]  ffff888703506c00: fc fc fc fc fc fc fc fc 00 00 00 00 00 00 00 00
[  106.425664] ==================================================================
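
The "136 bytes to the right" figure in the report can be checked by hand from the addresses it prints. The helper below is a hypothetical sketch (`bytes_past_end` is not a kernel function) that subtracts the end of the object from the faulting address.

```c
#include <stdint.h>

/*
 * Distance from the end of an object to a later address.
 * Assumes addr >= obj_start + obj_size.
 */
static uint64_t bytes_past_end(uint64_t addr, uint64_t obj_start,
			       uint64_t obj_size)
{
	return addr - (obj_start + obj_size);
}
```

With the values from the log, `bytes_past_end(0xffff888703506b30, 0xffff888703506a40, 104)` gives 136: the object ends at 0xffff888703506aa8, and the read lands 0x88 bytes beyond that.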

Fixes: 22b7a426bb ("drm/i915/execlists: Preempt-to-busy")
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Mika Kuoppala <mika.kuoppala@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20190809073723.6593-1-chris@chris-wilson.co.uk
2019-08-09 13:32:29 +01:00
selftests drm/i915: Rename i915_timeline to intel_timeline and move under gt 2019-06-21 13:48:53 +01:00
uc drm/i915: extract i915_memcpy.h from i915_drv.h 2019-08-09 12:03:25 +03:00
gen6_renderstate.c drm/i915: Move the renderstate setup under gt/ 2019-07-04 11:48:22 +01:00
gen7_renderstate.c drm/i915: Move the renderstate setup under gt/ 2019-07-04 11:48:22 +01:00
gen8_renderstate.c drm/i915: Move the renderstate setup under gt/ 2019-07-04 11:48:22 +01:00
gen9_renderstate.c drm/i915: Move the renderstate setup under gt/ 2019-07-04 11:48:22 +01:00
intel_breadcrumbs.c drm/i915: avoid including intel_drv.h via i915_drv.h->i915_trace.h 2019-08-07 12:43:14 +03:00
intel_context_types.h drm/i915/gt: Provide a local intel_context.vm 2019-07-30 16:09:35 +01:00
intel_context.c drm/i915: Hide unshrinkable context objects from the shrinker 2019-08-02 23:39:46 +01:00
intel_context.h drm/i915: Allow sharing the idle-barrier from other kernel requests 2019-08-02 11:53:04 +01:00
intel_engine_cs.c drm/i915: Defer final intel_wakeref_put to process context 2019-08-08 21:28:51 +01:00
intel_engine_pm.c drm/i915: Defer final intel_wakeref_put to process context 2019-08-08 21:28:51 +01:00
intel_engine_pm.h drm/i915: Defer final intel_wakeref_put to process context 2019-08-08 21:28:51 +01:00
intel_engine_pool_types.h drm/i915: Replace struct_mutex for batch pool serialisation 2019-08-04 14:31:18 +01:00
intel_engine_pool.c drm/i915: Replace struct_mutex for batch pool serialisation 2019-08-04 14:31:18 +01:00
intel_engine_pool.h drm/i915: Replace struct_mutex for batch pool serialisation 2019-08-04 14:31:18 +01:00
intel_engine_types.h drm/i915: Fix up the inverse mapping for default ctx->engines[] 2019-08-08 15:45:35 +01:00
intel_engine_user.c drm/i915: Fix up the inverse mapping for default ctx->engines[] 2019-08-08 15:45:35 +01:00
intel_engine_user.h drm/i915: Rename engines to match their user interface 2019-08-07 14:30:55 +01:00
intel_engine.h drm/i915/gt: Move the [class][inst] lookup for engines onto the GT 2019-08-06 15:00:43 +01:00
intel_gpu_commands.h drm/i915/selftests: Ensure we don't clamp a random offset to 32b 2019-07-11 10:06:37 +01:00
intel_gt_pm.c drm/i915: Defer final intel_wakeref_put to process context 2019-08-08 21:28:51 +01:00
intel_gt_pm.h drm/i915: Defer final intel_wakeref_put to process context 2019-08-08 21:28:51 +01:00
intel_gt_types.h drm/i915: Fix up the inverse mapping for default ctx->engines[] 2019-08-08 15:45:35 +01:00
intel_gt.c drm/i915: extract gem/i915_gem_stolen.h from i915_drv.h 2019-08-09 12:03:29 +03:00
intel_gt.h drm/i915/gt: Move gt_cleanup_early out of gem_cleanup_early 2019-08-01 17:58:50 +01:00
intel_hangcheck.c drm/i915/gt: Use intel_gt as the primary object for handling resets 2019-07-12 21:06:56 +01:00
intel_lrc_reg.h
intel_lrc.c drm/i915/execlists: Backtrack along timeline 2019-08-09 13:32:29 +01:00
intel_lrc.h
intel_mocs.c drm/i915/gt: Remove stale kerneldoc for internal MOCS functions 2019-08-05 18:27:17 +01:00
intel_mocs.h drm/i915: Move MOCS setup to intel_mocs.c 2019-07-31 07:40:35 -07:00
intel_renderstate.c drm/i915: Inline engine->init_context into its caller 2019-07-30 11:50:42 +01:00
intel_renderstate.h drm/i915: Move the renderstate setup under gt/ 2019-07-04 11:48:22 +01:00
intel_reset_types.h drm/i915/gt: Use intel_gt as the primary object for handling resets 2019-07-12 21:06:56 +01:00
intel_reset.c drm/i915: rename intel_drv.h to display/intel_display_types.h 2019-08-07 12:43:50 +03:00
intel_reset.h drm/i915/gt: Use intel_gt as the primary object for handling resets 2019-07-12 21:06:56 +01:00
intel_ringbuffer.c drm/i915: Hide unshrinkable context objects from the shrinker 2019-08-02 23:39:46 +01:00
intel_sseu.c drm/i915/perf: Refactor oa object to better manage resources 2019-08-07 20:34:39 +01:00
intel_sseu.h
intel_timeline_types.h drm/i915: Rename i915_timeline to intel_timeline and move under gt 2019-06-21 13:48:53 +01:00
intel_timeline.c drm/i915/gt: Always call kref_init for the timeline 2019-06-26 07:25:54 +01:00
intel_timeline.h drm/i915: Rename i915_timeline to intel_timeline and move under gt 2019-06-21 13:48:53 +01:00
intel_workarounds_types.h drm/i915: Add engine name to workaround debug print 2019-07-12 09:55:30 +01:00
intel_workarounds.c drm/i915/icl: Add Wa_1409178092 2019-07-19 15:35:21 +01:00
intel_workarounds.h drm/i915: Convert gt workarounds to intel_gt 2019-06-21 13:48:25 +01:00
Makefile drm/i915: use upstream version of header tests 2019-07-30 12:11:57 +03:00
mock_engine.c drm/i915/selftests: Fixup a missing legacy_idx 2019-08-08 20:53:31 +01:00
mock_engine.h
selftest_context.c drm/i915: Fix some NULL vs IS_ERR() conditions 2019-08-07 14:30:59 +01:00
selftest_engine_cs.c drm/i915: Rename engines to match their user interface 2019-08-07 14:30:55 +01:00
selftest_engine_pm.c drm/i915: Defer final intel_wakeref_put to process context 2019-08-08 21:28:51 +01:00
selftest_engine.c drm/i915: Defer final intel_wakeref_put to process context 2019-08-08 21:28:51 +01:00
selftest_engine.h drm/i915: Defer final intel_wakeref_put to process context 2019-08-08 21:28:51 +01:00
selftest_hangcheck.c drm/i915/selftests: Careful not to flush hang_fini on error setups 2019-07-29 11:00:18 +01:00
selftest_lrc.c drm/i915: Fix up the inverse mapping for default ctx->engines[] 2019-08-08 15:45:35 +01:00
selftest_reset.c drm/i915/gt: Use intel_gt as the primary object for handling resets 2019-07-12 21:06:56 +01:00
selftest_timeline.c drm/i915/gt: Use intel_gt as the primary object for handling resets 2019-07-12 21:06:56 +01:00
selftest_workarounds.c drm/i915: Fix up the inverse mapping for default ctx->engines[] 2019-08-08 15:45:35 +01:00