linux_dsm_epyc7002/drivers/base
John Garry 0b777eee88 driver core: Postpone DMA tear-down until after devres release for probe failure
In commit 376991db4b ("driver core: Postpone DMA tear-down until after
devres release"), we changed the ordering of tearing down the device DMA
ops and releasing all the device's resources; this was because the DMA ops
should be maintained until we release the device's managed DMA memories.

However, we have seen another crash on an arm64 system when a
device driver probe fails:

  hisi_sas_v3_hw 0000:74:02.0: Adding to iommu group 2
  scsi host1: hisi_sas_v3_hw
  BUG: Bad page state in process swapper/0  pfn:313f5
  page:ffff7e0000c4fd40 count:1 mapcount:0
  mapping:0000000000000000 index:0x0
  flags: 0xfffe00000001000(reserved)
  raw: 0fffe00000001000 ffff7e0000c4fd48 ffff7e0000c4fd48
0000000000000000
  raw: 0000000000000000 0000000000000000 00000001ffffffff
0000000000000000
  page dumped because: PAGE_FLAGS_CHECK_AT_FREE flag(s) set
  bad because of flags: 0x1000(reserved)
  Modules linked in:
  CPU: 49 PID: 1 Comm: swapper/0 Not tainted
5.1.0-rc1-43081-g22d97fd-dirty #1433
  Hardware name: Huawei D06/D06, BIOS Hisilicon D06 UEFI
RC0 - V1.12.01 01/29/2019
  Call trace:
  dump_backtrace+0x0/0x118
  show_stack+0x14/0x1c
  dump_stack+0xa4/0xc8
  bad_page+0xe4/0x13c
  free_pages_check_bad+0x4c/0xc0
  __free_pages_ok+0x30c/0x340
  __free_pages+0x30/0x44
  __dma_direct_free_pages+0x30/0x38
  dma_direct_free+0x24/0x38
  dma_free_attrs+0x9c/0xd8
  dmam_release+0x20/0x28
  release_nodes+0x17c/0x220
  devres_release_all+0x34/0x54
  really_probe+0xc4/0x2c8
  driver_probe_device+0x58/0xfc
  device_driver_attach+0x68/0x70
  __driver_attach+0x94/0xdc
  bus_for_each_dev+0x5c/0xb4
  driver_attach+0x20/0x28
  bus_add_driver+0x14c/0x200
  driver_register+0x6c/0x124
  __pci_register_driver+0x48/0x50
  sas_v3_pci_driver_init+0x20/0x28
  do_one_initcall+0x40/0x25c
  kernel_init_freeable+0x2b8/0x3c0
  kernel_init+0x10/0x100
  ret_from_fork+0x10/0x18
  Disabling lock debugging due to kernel taint
  BUG: Bad page state in process swapper/0  pfn:313f6
  page:ffff7e0000c4fd80 count:1 mapcount:0
mapping:0000000000000000 index:0x0
[   89.322983] flags: 0xfffe00000001000(reserved)
  raw: 0fffe00000001000 ffff7e0000c4fd88 ffff7e0000c4fd88
0000000000000000
  raw: 0000000000000000 0000000000000000 00000001ffffffff
0000000000000000

The crash occurs for the same reason.

In this case, on the really_probe() failure path, we are still clearing
the DMA ops prior to releasing the device's managed memories.

This patch fixes this issue by reordering the DMA ops teardown and the
call to devres_release_all() on the failure path.

Reported-by: Xiang Chen <chenxiang66@hisilicon.com>
Tested-by: Xiang Chen <chenxiang66@hisilicon.com>
Signed-off-by: John Garry <john.garry@huawei.com>
Reviewed-by: Robin Murphy <robin.murphy@arm.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2019-04-25 21:48:37 +02:00
..
firmware_loader drivers: base: firmware_loader: add proper SPDX identifiers on files that did not have them. 2019-04-04 20:03:40 +02:00
power drivers: base: power: add proper SPDX identifiers on files that did not have them. 2019-04-04 20:03:40 +02:00
regmap Merge remote-tracking branch 'regmap/topic/irq' into regmap-next 2019-01-29 17:17:03 +00:00
test drivers: base: test: add proper SPDX identifier to Makefile 2019-04-04 20:03:40 +02:00
arch_topology.c arch_topology: Make cpu_capacity sysfs node as read-only 2019-04-04 18:41:21 +02:00
attribute_container.c driver core: Remove redundant license text 2017-12-07 18:36:44 +01:00
base.h driver core: Probe devices asynchronously instead of the driver 2019-01-31 14:20:53 +01:00
bus.c driver core: Probe devices asynchronously instead of the driver 2019-01-31 14:20:53 +01:00
cacheinfo.c cacheinfo: Keep the old value if of_property_read_u32 fails 2019-01-22 13:50:31 +01:00
class.c driver core: move device->knode_class to device_private 2019-01-18 16:55:48 +01:00
component.c drivers/component: kerneldoc polish 2019-02-19 13:20:35 +01:00
container.c driver core: Remove redundant license text 2017-12-07 18:36:44 +01:00
core.c driver core: Clarify which counterparts to use to device_add() 2019-04-25 19:34:54 +02:00
cpu.c Driver core patches for 5.1-rc1 2019-03-06 14:52:48 -08:00
dd.c driver core: Postpone DMA tear-down until after devres release for probe failure 2019-04-25 21:48:37 +02:00
devcon.c device connection: Find device connections also from device graphs 2019-02-14 10:52:25 +01:00
devcoredump.c driver core: Remove redundant license text 2017-12-07 18:36:44 +01:00
devres.c devres: Align data[] to ARCH_KMALLOC_MINALIGN 2018-11-11 11:40:04 -08:00
devtmpfs.c vfs: Suppress MS_* flag defs within the kernel unless explicitly enabled 2018-12-20 16:32:56 +00:00
driver.c driver-core: return EINVAL error instead of BUG_ON() 2018-05-25 18:18:45 +02:00
firmware.c driver core: Remove redundant license text 2017-12-07 18:36:44 +01:00
hypervisor.c driver core: Remove redundant license text 2017-12-07 18:36:44 +01:00
init.c base: fix order of OF initialization 2018-07-07 17:54:29 +02:00
isa.c Merge 4.15-rc3 into driver-core-next 2017-12-11 08:50:05 +01:00
Kconfig node: Add heterogenous memory access attributes 2019-04-04 18:41:21 +02:00
Makefile drivers: base: Introducing software nodes to the firmware node framework 2018-11-26 18:19:11 +01:00
map.c driver core: Remove redundant license text 2017-12-07 18:36:44 +01:00
memory.c mm/memory_hotplug: Do not unlock when fails to take the device_hotplug_lock 2019-04-25 19:40:20 +02:00
module.c driver core: Remove redundant license text 2017-12-07 18:36:44 +01:00
node.c node: Add memory-side caching attributes 2019-04-04 18:41:21 +02:00
pinctrl.c driver core: Remove redundant license text 2017-12-07 18:36:44 +01:00
platform-msi.c platform-msi: Free descriptors in platform_msi_domain_free() 2018-12-13 09:35:31 +00:00
platform.c driver core: platform: Propagate error from insert_resource() 2019-04-25 21:48:37 +02:00
property.c device property: fix fwnode_graph_get_next_endpoint() documentation 2018-12-18 23:39:45 +01:00
soc.c base: soc: use put_device() instead of kfree() 2018-03-15 14:37:03 +01:00
swnode.c drivers: base: swnode: Make two functions static 2019-03-19 16:37:56 +01:00
syscore.c driver core: Remove redundant license text 2017-12-07 18:36:44 +01:00
topology.c driver core: Remove redundant license text 2017-12-07 18:36:44 +01:00
transport_class.c driver core: Remove redundant license text 2017-12-07 18:36:44 +01:00