linux_dsm_epyc7002/drivers/base
David Hildenbrand 381eab4a6e mm/memory_hotplug: fix online/offline_pages called w.o. mem_hotplug_lock
There seem to be some problems as result of 30467e0b3b ("mm, hotplug:
fix concurrent memory hot-add deadlock"), which tried to fix a possible
lock inversion reported and discussed in [1] due to the two locks
	a) device_lock()
	b) mem_hotplug_lock

While add_memory() first takes b), followed by a) during
bus_probe_device(), onlining of memory from user space first took a),
followed by b), exposing a possible deadlock.

In [1], and it was decided to not make use of device_hotplug_lock, but
rather to enforce a locking order.

The problems I spotted related to this:

1. Memory block device attributes: While .state first calls
   mem_hotplug_begin() and the calls device_online() - which takes
   device_lock() - .online does no longer call mem_hotplug_begin(), so
   effectively calls online_pages() without mem_hotplug_lock.

2. device_online() should be called under device_hotplug_lock, however
   onlining memory during add_memory() does not take care of that.

In addition, I think there is also something wrong about the locking in

3. arch/powerpc/platforms/powernv/memtrace.c calls offline_pages()
   without locks. This was introduced after 30467e0b3b. And skimming over
   the code, I assume it could need some more care in regards to locking
   (e.g. device_online() called without device_hotplug_lock. This will
   be addressed in the following patches.

Now that we hold the device_hotplug_lock when
- adding memory (e.g. via add_memory()/add_memory_resource())
- removing memory (e.g. via remove_memory())
- device_online()/device_offline()

We can move mem_hotplug_lock usage back into
online_pages()/offline_pages().

Why is mem_hotplug_lock still needed? Essentially to make
get_online_mems()/put_online_mems() be very fast (relying on
device_hotplug_lock would be very slow), and to serialize against
addition of memory that does not create memory block devices (hmm).

[1] http://driverdev.linuxdriverproject.org/pipermail/ driverdev-devel/
    2015-February/065324.html

This patch is partly based on a patch by Vitaly Kuznetsov.

Link: http://lkml.kernel.org/r/20180925091457.28651-4-david@redhat.com
Signed-off-by: David Hildenbrand <david@redhat.com>
Reviewed-by: Pavel Tatashin <pavel.tatashin@microsoft.com>
Reviewed-by: Rashmica Gupta <rashmica.g@gmail.com>
Reviewed-by: Oscar Salvador <osalvador@suse.de>
Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Michael Ellerman <mpe@ellerman.id.au>
Cc: "Rafael J. Wysocki" <rjw@rjwysocki.net>
Cc: Len Brown <lenb@kernel.org>
Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Cc: "K. Y. Srinivasan" <kys@microsoft.com>
Cc: Haiyang Zhang <haiyangz@microsoft.com>
Cc: Stephen Hemminger <sthemmin@microsoft.com>
Cc: Martin Schwidefsky <schwidefsky@de.ibm.com>
Cc: Heiko Carstens <heiko.carstens@de.ibm.com>
Cc: Boris Ostrovsky <boris.ostrovsky@oracle.com>
Cc: Juergen Gross <jgross@suse.com>
Cc: Rashmica Gupta <rashmica.g@gmail.com>
Cc: Michael Neuling <mikey@neuling.org>
Cc: Balbir Singh <bsingharora@gmail.com>
Cc: Kate Stewart <kstewart@linuxfoundation.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Philippe Ombredanne <pombredanne@nexb.com>
Cc: Pavel Tatashin <pavel.tatashin@microsoft.com>
Cc: Vlastimil Babka <vbabka@suse.cz>
Cc: Dan Williams <dan.j.williams@intel.com>
Cc: Oscar Salvador <osalvador@suse.de>
Cc: YASUAKI ISHIMATSU <yasu.isimatu@gmail.com>
Cc: Mathieu Malaterre <malat@debian.org>
Cc: John Allen <jallen@linux.vnet.ibm.com>
Cc: Jonathan Corbet <corbet@lwn.net>
Cc: Joonsoo Kim <iamjoonsoo.kim@lge.com>
Cc: Nathan Fontenot <nfont@linux.vnet.ibm.com>
Cc: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2018-10-31 08:54:17 -07:00
..
firmware_loader firmware: Always initialize the fw_priv list object 2018-09-30 08:49:55 -07:00
power PM / Domains: Deal with multiple states but no governor in genpd 2018-10-18 12:25:09 +02:00
regmap Merge remote-tracking branches 'regmap/topic/noinc' and 'regmap/topic/single-rw' into regmap-next 2018-10-21 12:07:26 +01:00
test
arch_topology.c sched/topology, drivers/base/arch_topology: Rebuild the sched_domain hierarchy when capacities change 2018-09-10 11:05:47 +02:00
attribute_container.c
base.h driver core: remove unnecessary function extern declare 2018-07-16 13:32:20 +02:00
bus.c
cacheinfo.c drivers: base: cacheinfo: Do not populate sysfs for unknown cache types 2018-10-04 23:02:17 +02:00
class.c
component.c component: fix loop condition to call unbind() if bind() fails 2018-09-16 22:37:16 +02:00
container.c
core.c Driver core patches for 4.19-rc1 2018-08-18 11:44:53 -07:00
cpu.c x86/speculation/l1tf: Add sysfs reporting for l1tf 2018-06-20 19:10:00 +02:00
dd.c dma-mapping: remove dma_deconfigure 2018-09-08 11:19:28 +02:00
devcon.c
devcoredump.c
devres.c devres: provide devm_kstrdup_const() 2018-10-16 12:53:27 +02:00
devtmpfs.c drivers/base/devtmpfs.c: don't pretend path is const in delete_path 2018-09-16 22:37:16 +02:00
driver.c
firmware.c
hypervisor.c
init.c base: fix order of OF initialization 2018-07-07 17:54:29 +02:00
isa.c
Kconfig
Makefile dma-mapping: move all DMA mapping code to kernel/dma 2018-06-14 08:50:37 +02:00
map.c
memory.c mm/memory_hotplug: fix online/offline_pages called w.o. mem_hotplug_lock 2018-10-31 08:54:17 -07:00
module.c
node.c mm, proc: add KReclaimable to /proc/meminfo 2018-10-26 16:26:32 -07:00
pinctrl.c
platform-msi.c genirq/msi: Allow creation of a tree-based irqdomain for platform-msi 2018-10-02 11:16:38 +01:00
platform.c mm: remove include/linux/bootmem.h 2018-10-31 08:54:16 -07:00
property.c
soc.c
syscore.c
topology.c
transport_class.c