linux_dsm_epyc7002/drivers
Chris Wilson 7d5d59e527 drm/i915: Use the full hammer when shutting down the rcu tasks
To flush all call_rcu() tasks (here from i915_gem_free_object()) we need
to call rcu_barrier() (not synchronize_rcu()). If we don't then we may
still have objects being freed as we continue to tear down the driver -
in particular, the recently released rings may race with the memory
manager shutdown, resulting in sporadic warnings:

[  142.217186] WARNING: CPU: 7 PID: 6185 at drivers/gpu/drm/drm_mm.c:932 drm_mm_takedown+0x2e/0x40
[  142.217187] Memory manager not clean during takedown.
[  142.217187] Modules linked in: i915(-) x86_pkg_temp_thermal intel_powerclamp coretemp crct10dif_pclmul crc32_pclmul ghash_clmulni_intel lpc_ich snd_hda_codec_realtek snd_hda_codec_generic mei_me mei snd_hda_codec_hdmi snd_hda_codec snd_hwdep snd_hda_core snd_pcm e1000e ptp pps_core [last unloaded: snd_hda_intel]
[  142.217199] CPU: 7 PID: 6185 Comm: rmmod Not tainted 4.9.0-rc2-CI-Trybot_242+ #1
[  142.217199] Hardware name: LENOVO 10AGS00601/SHARKBAY, BIOS FBKT34AUS 04/24/2013
[  142.217200]  ffffc90002ecfce0 ffffffff8142dd65 ffffc90002ecfd30 0000000000000000
[  142.217202]  ffffc90002ecfd20 ffffffff8107e4e6 000003a40778c2a8 ffff880401355c48
[  142.217204]  ffff88040778c2a8 ffffffffa040f3c0 ffffffffa040f4a0 00005621fbf8b1f0
[  142.217206] Call Trace:
[  142.217209]  [<ffffffff8142dd65>] dump_stack+0x67/0x92
[  142.217211]  [<ffffffff8107e4e6>] __warn+0xc6/0xe0
[  142.217213]  [<ffffffff8107e54a>] warn_slowpath_fmt+0x4a/0x50
[  142.217214]  [<ffffffff81559e3e>] drm_mm_takedown+0x2e/0x40
[  142.217236]  [<ffffffffa035c02a>] i915_gem_cleanup_stolen+0x1a/0x20 [i915]
[  142.217246]  [<ffffffffa034c581>] i915_ggtt_cleanup_hw+0x31/0xb0 [i915]
[  142.217253]  [<ffffffffa0310311>] i915_driver_cleanup_hw+0x31/0x40 [i915]
[  142.217260]  [<ffffffffa0312001>] i915_driver_unload+0x141/0x1a0 [i915]
[  142.217268]  [<ffffffffa031c2c4>] i915_pci_remove+0x14/0x20 [i915]
[  142.217269]  [<ffffffff8147d214>] pci_device_remove+0x34/0xb0
[  142.217271]  [<ffffffff8157b14c>] __device_release_driver+0x9c/0x150
[  142.217272]  [<ffffffff8157bcc6>] driver_detach+0xb6/0xc0
[  142.217273]  [<ffffffff8157abe3>] bus_remove_driver+0x53/0xd0
[  142.217274]  [<ffffffff8157c787>] driver_unregister+0x27/0x50
[  142.217276]  [<ffffffff8147c265>] pci_unregister_driver+0x25/0x70
[  142.217287]  [<ffffffffa03d764c>] i915_exit+0x1a/0x71 [i915]
[  142.217289]  [<ffffffff811136b3>] SyS_delete_module+0x193/0x1e0
[  142.217291]  [<ffffffff818174ae>] entry_SYSCALL_64_fastpath+0x1c/0xb1
[  142.217292] ---[ end trace 6fd164859c154772 ]---
[  142.217505] [drm:show_leaks] *ERROR* node [6b6b6b6b6b6b6b6b + 6b6b6b6b6b6b6b6b]: inserted at
                [<ffffffff81559ff3>] save_stack.isra.1+0x53/0xa0
                [<ffffffff8155a98d>] drm_mm_insert_node_in_range_generic+0x2ad/0x360
                [<ffffffffa035bf23>] i915_gem_stolen_insert_node_in_range+0x93/0xe0 [i915]
                [<ffffffffa035c855>] i915_gem_object_create_stolen+0x75/0xb0 [i915]
                [<ffffffffa036a51a>] intel_engine_create_ring+0x9a/0x140 [i915]
                [<ffffffffa036a921>] intel_init_ring_buffer+0xf1/0x440 [i915]
                [<ffffffffa036be1b>] intel_init_render_ring_buffer+0xab/0x1b0 [i915]
                [<ffffffffa0363d08>] intel_engines_init+0xc8/0x210 [i915]
                [<ffffffffa0355d7c>] i915_gem_init+0xac/0xf0 [i915]
                [<ffffffffa0311454>] i915_driver_load+0x9c4/0x1430 [i915]
                [<ffffffffa031c2f8>] i915_pci_probe+0x28/0x40 [i915]
                [<ffffffff8147d315>] pci_device_probe+0x85/0xf0
                [<ffffffff8157b7ff>] driver_probe_device+0x21f/0x430
                [<ffffffff8157baee>] __driver_attach+0xde/0xe0

In particular, note that the node was being poisoned as we inspected the
list, a clear indication that the object is being freed as we make the
assertion.

v2: Don't loop, just assert that we do all the work required as that
will be better at detecting further errors.
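
The distinction between the two flush primitives can be illustrated with a toy userspace model. This is only a sketch under stated assumptions: every name below (model_call_rcu(), model_synchronize_rcu(), model_rcu_barrier(), model_unload(), live_nodes) is hypothetical and is not kernel or i915 code. It models only the property the message relies on: waiting for a grace period does not guarantee that queued callbacks have run, whereas a barrier returns only after they have.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Toy userspace model; none of these names are kernel APIs. */
struct deferred_cb {
	struct deferred_cb *next;
	void (*func)(void *arg);
	void *arg;
};

static struct deferred_cb *pending;	/* callbacks queued but not yet run */
static int live_nodes;			/* stand-in for nodes still held in the allocator */

/* Model of call_rcu(): queue a callback to run some time after a grace period. */
static void model_call_rcu(void (*func)(void *arg), void *arg)
{
	struct deferred_cb *cb = malloc(sizeof(*cb));

	if (!cb)
		abort();
	cb->func = func;
	cb->arg = arg;
	cb->next = pending;
	pending = cb;
}

/* Model of synchronize_rcu(): the grace period elapses, the queue is untouched. */
static void model_synchronize_rcu(void)
{
}

/* Model of rcu_barrier(): return only once every queued callback has run. */
static void model_rcu_barrier(void)
{
	while (pending) {
		struct deferred_cb *cb = pending;

		pending = cb->next;
		cb->func(cb->arg);
		free(cb);
	}
}

/* Deferred release of one node, as the free callback would do. */
static void release_node(void *arg)
{
	(void)arg;
	live_nodes--;
}

/* Returns how many nodes are still allocated at the point of "takedown". */
static int model_unload(int use_barrier)
{
	int leaked;

	assert(pending == NULL);
	live_nodes = 1;
	model_call_rcu(release_node, NULL);

	if (use_barrier)
		model_rcu_barrier();
	else
		model_synchronize_rcu();

	leaked = live_nodes;	/* what a takedown check would observe */
	model_rcu_barrier();	/* drain so the model can be reused */
	return leaked;
}
```

In this model, model_unload(0) reports one node still allocated at takedown, while model_unload(1) reports none. The real race is timing dependent, which is why the warning above is only sporadic.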

Fixes: fbbd37b36f ("drm/i915: Move object release to a freelist + worker")
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/20161101084843.3961-1-chris@chris-wilson.co.uk
2016-11-01 09:30:08 +00:00
accessibility
acpi More ACPI updates for v4.9-rc1 2016-10-14 12:50:05 -07:00
amba
android
ata ahci: qoriq: Revert "ahci: qoriq: Disable NCQ on ls2080a SoC" 2016-09-30 10:28:51 +02:00
atm
auxdisplay auxdisplay: img-ascii-lcd: driver for simple ASCII LCD displays 2016-10-06 17:03:41 +02:00
base dma-buf: Rename struct fence to dma_fence 2016-10-25 14:40:39 +02:00
bcma
block rbd: don't retry watch reregistration if header object is gone 2016-10-15 23:22:09 +02:00
bluetooth Bluetooth: btusb: Fix atheros firmware download error 2016-10-07 09:46:56 +02:00
bus ARM: SoC driver updates for v4.9 2016-10-07 21:23:40 -07:00
cdrom
char A small bug fix and a new driver for acting as an IPMI device. 2016-10-23 15:56:23 -07:00
clk ARM: SoC: late DT updates for v4.9 2016-10-07 21:34:49 -07:00
clocksource Revert "clocksource/drivers/timer_sun5i: Replace code by clocksource_mmio_init" 2016-10-20 21:58:58 +02:00
connector
cpufreq More power management updates for v4.9-rc1 2016-10-14 12:46:13 -07:00
cpuidle Merge branch 'upstream' of git://git.linux-mips.org/pub/scm/ralf/upstream-linus 2016-10-15 09:26:12 -07:00
crypto Merge branch 'linus' of git://git.kernel.org/pub/scm/linux/kernel/git/herbert/crypto-2.6 2016-10-10 14:04:16 -07:00
dax Merge branch 'for-4.9/dax' into libnvdimm-for-next 2016-10-07 16:46:30 -07:00
dca
devfreq PM / devfreq: Skip status update on uninitialized previous_freq 2016-10-11 00:01:20 +02:00
dio
dma dmaengine updates for 4.8-rc1 2016-10-06 17:13:54 -07:00
dma-buf dma-buf: Rename struct fence to dma_fence 2016-10-25 14:40:39 +02:00
edac * Altera Arria10 enablement of NAND, DMA, USB, QSPI and SD-MMC FIFO 2016-10-04 12:06:26 -07:00
eisa
extcon
firewire firewire: nosy: do not ignore errors in ioremap_nocache() 2016-10-09 11:38:11 +02:00
firmware efi/arm: Fix absolute relocation detection for older toolchains 2016-10-19 14:49:44 +02:00
fmc
fpga
gpio gpio: pca953x: add a comment explaining the need for a lockdep subclass 2016-10-11 23:17:08 +02:00
gpu drm/i915: Use the full hammer when shutting down the rcu tasks 2016-11-01 09:30:08 +00:00
hid HID: add quirk for Akai MIDImix. 2016-10-10 10:58:22 +02:00
hsi
hv Drivers: hv: get rid of id in struct vmbus_channel 2016-09-27 12:35:49 +02:00
hwmon hwmon: (max31790) potential ERR_PTR dereference 2016-10-17 10:16:20 -07:00
hwspinlock
hwtracing
i2c i2c: 'i2c-bus' node support for v4.8-rc1 2016-10-11 23:37:26 +02:00
ide
idle nmi_backtrace: generate one-line reports for idle cpus 2016-10-07 18:46:30 -07:00
iio Staging/IIO patches for 4.9-rc1 2016-10-05 14:50:51 -07:00
infiniband Merge branch 'gup_flag-cleanups' 2016-10-19 08:39:47 -07:00
input Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/dtor/input 2016-10-14 13:19:30 -07:00
iommu IOMMU Updates for Linux v4.9 2016-10-11 12:52:41 -07:00
ipack
irqchip GIC updates for Linux 4.9-rc2 2016-10-21 21:40:29 +02:00
isdn
leds leds: triggers: Check return value of kobject_uevent_env() 2016-09-20 10:22:10 +02:00
lguest
lightnvm Merge branch 'for-4.9/block' of git://git.kernel.dk/linux-block 2016-10-07 14:42:05 -07:00
macintosh powerpc: Remove all usages of NO_IRQ 2016-09-20 20:57:12 +10:00
mailbox Merge branch 'mailbox-for-next' of git://git.linaro.org/landing-teams/working/fujitsu/integration 2016-10-06 17:36:53 -07:00
mcb mcb: Add a dma_device to mcb_device 2016-09-27 12:33:47 +02:00
md kthread: kthread worker API cleanup 2016-10-11 15:06:33 -07:00
media Merge branch 'gup_flag-cleanups' 2016-10-19 08:39:47 -07:00
memory ARM: SoC driver updates for v4.9 2016-10-07 21:23:40 -07:00
memstick memstick: rtsx_usb_ms: Manage runtime PM when accessing the device 2016-10-17 15:43:05 +02:00
message
mfd - Core Frameworks 2016-10-07 08:35:35 -07:00
misc powerpc fixes for 4.9 #3 2016-10-21 19:13:00 -07:00
mmc mmc: rtsx_usb_sdmmc: Handle runtime PM while changing the led 2016-10-17 15:43:03 +02:00
mtd UBI: Fix crash in try_recover_peb() 2016-10-20 00:06:06 +02:00
net Merge of the qedr RoCE driver 2016-10-14 13:43:08 -07:00
nfc
ntb
nubus
nvdimm Merge branch 'for-4.9/dax' into libnvdimm-for-next 2016-10-07 16:46:30 -07:00
nvme Merge branch 'for-linus' of git://git.kernel.dk/linux-block 2016-10-21 10:54:01 -07:00
nvmem ARM: SoC driver updates for v4.9 2016-10-07 21:23:40 -07:00
of Merge branch 'upstream' of git://git.linux-mips.org/pub/scm/ralf/upstream-linus 2016-10-15 09:26:12 -07:00
oprofile Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs 2016-10-10 20:16:43 -07:00
parisc
parport
pci PCI: designware-plat: Update author email address 2016-10-21 09:54:46 -05:00
pcmcia pcmcia: soc_common: add driver-data pointer 2016-09-22 09:39:16 +01:00
perf perf: xgene: Remove bogus IS_ERR() check 2016-10-17 15:50:07 +01:00
phy
pinctrl pinctrl: intel: Only restore pins that are used by the driver 2016-10-18 14:38:16 +02:00
platform platform-drivers-x86 for 4.9-2 2016-10-19 11:45:06 -07:00
pnp
power power supply and reset changes for the v4.9 series 2016-10-06 18:21:15 -07:00
powercap
pps pps: kc: fix non-tickless system config dependency 2016-10-11 15:06:32 -07:00
ps3 powerpc: Remove all usages of NO_IRQ 2016-09-20 20:57:12 +10:00
ptp drivers/ptp: Fix kernel memory disclosure 2016-10-13 10:20:06 -04:00
pwm
rapidio mm: replace get_user_pages() write/force parameters with gup_flags 2016-10-19 08:11:43 -07:00
ras
regulator Merge branch 'next' of git://git.kernel.org/pub/scm/linux/kernel/git/rzhang/linux 2016-10-12 11:05:23 -07:00
remoteproc rpmsg updates for v4.9 2016-10-06 17:03:49 -07:00
reset
rpmsg
rtc RTC for 4.9 2016-10-14 13:13:44 -07:00
s390 scsi: zfcp: spin_lock_irqsave() is not nestable 2016-10-14 16:21:08 -04:00
sbus
scsi SCSI fixes on 20161021 2016-10-21 10:57:09 -07:00
sfi
sh
sn
soc powerpc updates for 4.9 #2 2016-10-14 11:07:42 -07:00
spi kthread: kthread worker API cleanup 2016-10-11 15:06:33 -07:00
spmi spmi: pmic-arb: Return an error code if sanity check fails 2016-09-27 12:43:34 +02:00
ssb
staging mm: replace get_user_pages() write/force parameters with gup_flags 2016-10-19 08:11:43 -07:00
target target/tcm_fc: use CPU affinity for responses 2016-10-21 01:19:44 -07:00
tc
thermal Merge branch 'next' of git://git.kernel.org/pub/scm/linux/kernel/git/rzhang/linux 2016-10-12 11:05:23 -07:00
thunderbolt
tty kthread: kthread worker API cleanup 2016-10-11 15:06:33 -07:00
uio
usb Merge branch 'upstream' of git://git.linux-mips.org/pub/scm/ralf/upstream-linus 2016-10-15 09:26:12 -07:00
uwb
vfio vfio_pci: use pci_alloc_irq_vectors 2016-09-29 13:36:38 -06:00
vhost
video Merge tag 'topic/drm-misc-2016-10-24' of git://anongit.freedesktop.org/drm-intel into drm-next 2016-10-25 16:35:20 +10:00
virt mm: replace get_user_pages() write/force parameters with gup_flags 2016-10-19 08:11:43 -07:00
virtio
vlynq
vme vme: fake: remove unexpected unlock in fake_master_set() 2016-09-27 12:43:35 +02:00
w1
watchdog Merge branches 'acpi-wdat' and 'acpi-cppc' 2016-10-21 22:24:23 +02:00
xen xen: features and fixes for 4.9-rc0 2016-10-06 11:19:10 -07:00
zorro
Kconfig
Makefile A small bug fix and a new driver for acting as an IPMI device. 2016-10-23 15:56:23 -07:00