linux_dsm_epyc7002/drivers/gpu/drm/i915
Daniel Vetter acc240d41e drm/i915: Fix use-after-free in do_switch
So apparently under ridiculous amounts of memory pressure we can get
into trouble in do_switch when we try to move the old hw context
backing storage object onto the active lists.

With list debugging enabled that usually results in us chasing a
poisoned pointer - which means we've hit upon a vma that has been
removed from all lrus with list_del (and then deallocated, so it's a
real use-after free).

Ian Lister has done some great callchain chasing and noticed that we
can reenter do_switch:

i915_gem_do_execbuffer()

i915_switch_context()

do_switch()
   from = ring->last_context;
   i915_gem_object_pin()

      i915_gem_object_bind_to_gtt()
         ret = drm_mm_insert_node_in_range_generic();
         // If the above call fails then it will try i915_gem_evict_something()
         // If that fails it will call i915_gem_evict_everything() ...
	 i915_gem_evict_everything()
	    i915_gpu_idle()
	       i915_switch_context(DEFAULT_CONTEXT)

Like with everything else where the shrinker or eviction code can
invalidate pointers we need to reload relevant state.

Note that there's no need to recheck whether a context switch is still
required because:

- Doing a switch to the same context is harmless (besides wasting a
  bit of energy).

- This can only happen with the default context. But since that one's
  pinned we'll never call down into evict_everything under normal
  circumstances. Note that there's a little driver bringup fun
  involved namely that we could recourse into do_switch for the
  initial switch. Atm we're fine since we assign the context pointer
  only after the call to do_switch at driver load or resume time. And
  in the gpu reset case we skip the entire setup sequence (which might
  be a bug on its own, but definitely not this one here).

Cc'ing stable since apparently ChromeOS guys are seeing this in the
wild (and not just on artificial stress tests), see the reference.

Note that in upstream code doesn't calle evict_everything directly
from evict_something, that's an extension in this product branch. But
we can still hit upon this bug (and apparently we do, see the linked
backtraces). I've noticed this while trying to construct a testcase
for this bug and utterly failed to provoke it. It looks like we need
to driver the system squarly into the lowmem wall and provoke the
shrinker to evict the context object by doing the last-ditch
evict_everything call.

Aside: There's currently no means to get a badly-fragmenting hw
context object away from a bad spot in the upstream code. We should
fix this by at least adding some code to evict_something to handle hw
contexts.

References: https://code.google.com/p/chromium/issues/detail?id=248191
Reported-by: Ian Lister <ian.lister@intel.com>
Cc: Ian Lister <ian.lister@intel.com>
Cc: stable@vger.kernel.org
Cc: Ben Widawsky <benjamin.widawsky@intel.com>
Cc: Stéphane Marchesin <marcheu@chromium.org>
Cc: Bloomfield, Jon <jon.bloomfield@intel.com>
Tested-by: Rafael Barbalho <rafael.barbalho@intel.com>
Reviewed-by: Ian Lister <ian.lister@intel.com>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
2013-12-06 13:09:11 +01:00
..
dvo_ch7xxx.c drm/i915: dvo_ch7xxx: fix vsync polarity setting 2013-07-25 16:10:22 +02:00
dvo_ch7017.c
dvo_ivch.c
dvo_ns2501.c
dvo_sil164.c
dvo_tfp410.c
dvo.h drm/i915: Remove unused mode_fixup() vfunc of struct intel_dvo_dev_ops 2013-09-05 21:39:59 +02:00
i915_debugfs.c Merge tag 'bdw-stage1-2013-11-08-v2' of git://people.freedesktop.org/~danvet/drm-intel into drm-next 2013-11-10 18:35:33 +10:00
i915_dma.c drm/i915: fix pm init ordering 2013-12-06 13:08:15 +01:00
i915_drv.c drm/i915: Do not clobber config status after a forced restore of hw state 2013-12-03 23:15:46 +01:00
i915_drv.h drm/i915: fix pm init ordering 2013-12-06 13:08:15 +01:00
i915_gem_context.c drm/i915: Fix use-after-free in do_switch 2013-12-06 13:09:11 +01:00
i915_gem_debug.c drm/i915: Fix #endif comment 2013-08-09 10:45:52 +02:00
i915_gem_dmabuf.c drm/i915: Pin pages whilst allocating for dma-buf vmap() 2013-11-29 15:51:20 +01:00
i915_gem_evict.c drm/i915: trace vm eviction instead of everything 2013-10-01 07:45:20 +02:00
i915_gem_execbuffer.c drm/i915: Pin relocations for the duration of constructing the execbuffer 2013-11-27 09:04:36 +01:00
i915_gem_gtt.c drm/i915: Prefer setting PTE cache age to 3 2013-11-25 09:50:15 +01:00
i915_gem_stolen.c Linux 3.12-rc2 2013-09-24 09:32:53 +02:00
i915_gem_tiling.c drm/i915: prevent tiling changes on framebuffer backing storage 2013-10-16 22:04:52 +02:00
i915_gem.c drm/i915: MI_PREDICATE_RESULT_2 is HSW only 2013-11-29 15:00:03 +01:00
i915_gpu_error.c drm/i915/bdw: Update relevant error state 2013-11-08 18:09:43 +01:00
i915_ioc32.c
i915_irq.c Merge tag 'bdw-stage1-2013-11-08-v2' of git://people.freedesktop.org/~danvet/drm-intel into drm-next 2013-11-10 18:35:33 +10:00
i915_reg.h drm/i915: Make the DERRMR SRM target global GTT 2013-11-29 14:56:44 +01:00
i915_suspend.c drm/i915/vlv: use per-pipe backlight controls v2 2013-11-06 18:26:31 +01:00
i915_sysfs.c Merge tag 'drm-intel-next-2013-10-18' of git://people.freedesktop.org/~danvet/drm-intel into drm-next 2013-10-25 09:35:04 +01:00
i915_trace_points.c
i915_trace.h drm/i915: Add a tracepoint for using a semaphore 2013-10-01 07:45:24 +02:00
i915_ums.c drm/i915: scrap register address storage 2013-06-10 19:54:14 +02:00
intel_acpi.c i915: fix ACPI _DSM warning 2013-08-05 19:04:05 +02:00
intel_bios.c i915: Use 120MHz LVDS SSC clock for gen5/gen6/gen7 2013-11-15 00:38:44 +01:00
intel_bios.h drm/i915: Make intel_dp_is_edp() less specific 2013-11-05 07:59:40 +01:00
intel_crt.c drm/i915: Use hsw_crt_get_config on BDW 2013-11-08 18:10:09 +01:00
intel_ddi.c drm/i915: Check VBT for eDP ports on VLV 2013-11-28 13:42:12 +01:00
intel_display.c drm/i915: Skip clock checks on BDW 2013-12-03 23:15:47 +01:00
intel_dp.c drm/i915: Simplify DP vs. eDP detection 2013-11-28 13:42:25 +01:00
intel_drv.h drm/i915: fix pm init ordering 2013-12-06 13:08:15 +01:00
intel_dsi_cmd.c drm/i915/dsi: s/size_t/int/ 2013-09-04 17:34:51 +02:00
intel_dsi_cmd.h drm/i915/dsi: s/size_t/int/ 2013-09-04 17:34:51 +02:00
intel_dsi_pll.c drm/i915: Use adjusted_mode in DSI PLL calculations 2013-09-16 23:36:38 +02:00
intel_dsi.c drm/i915: Use pipe_name() instead of the pipe number 2013-10-16 19:42:52 +02:00
intel_dsi.h drm/i915: add VLV DSI PLL Calculations 2013-09-04 17:34:48 +02:00
intel_dvo.c drm/i915/dvo: call ->mode_set callback only when the port is running 2013-11-04 16:30:33 +01:00
intel_fbdev.c drm/i915: fix open-coded DIV_ROUND_UP 2013-10-21 10:04:03 +02:00
intel_hdmi.c drm/i915/bdw: Broadwell has a max port clock of 300Mhz on HDMI 2013-11-08 18:10:01 +01:00
intel_i2c.c drm/i915: Program GMBUS Frequency based on the CDCLK for VLV. 2013-10-01 07:45:41 +02:00
intel_lvds.c drm/i915: make backlight functions take a connector 2013-11-06 17:56:28 +01:00
intel_modes.c
intel_opregion.c drm/i915/opregion: fix build error on CONFIG_ACPI=n 2013-11-08 19:32:52 +01:00
intel_overlay.c drm/i915: use pointer = k[cmz...]alloc(sizeof(*pointer), ...) pattern 2013-10-01 07:45:01 +02:00
intel_panel.c Merge tag 'drm-intel-fixes-2013-11-07' of git://people.freedesktop.org/~danvet/drm-intel into drm-next 2013-11-08 16:34:39 +10:00
intel_pm.c drm/i915: fix pm init ordering 2013-12-06 13:08:15 +01:00
intel_ringbuffer.c drm/i915/bdw: Take render error interrupt out of the mask 2013-11-08 18:10:10 +01:00
intel_ringbuffer.h drm/i915: Write RING_TAIL once per-request 2013-09-10 15:35:58 +02:00
intel_sdvo_regs.h
intel_sdvo.c drm/i915: destroy connector sysfs files earlier 2013-10-01 07:45:48 +02:00
intel_sideband.c drm/i915/vlv: add doc names to sideband file 2013-10-11 23:33:44 +02:00
intel_sprite.c Merge tag 'bdw-stage1-2013-11-08-v2' of git://people.freedesktop.org/~danvet/drm-intel into drm-next 2013-11-10 18:35:33 +10:00
intel_tv.c drm/i915/tv: add ->get_config callback 2013-11-18 22:24:33 +01:00
intel_uncore.c drm/i915: restore the early forcewake cleanup 2013-11-17 20:39:51 +01:00
Kconfig drm/i915: Kconfig option to disable the legacy fbdev support 2013-10-11 23:37:23 +02:00
Makefile drm/i915: rename intel_fb.c to intel_fbdev.c 2013-10-11 23:37:33 +02:00