linux_dsm_epyc7002/mm
Suzuki K Poulose e577c8b64d mm, compaction: make sure we isolate a valid PFN
When we have holes in a normal memory zone, we could endup having
cached_migrate_pfns which may not necessarily be valid, under heavy memory
pressure with swapping enabled ( via __reset_isolation_suitable(),
triggered by kswapd).

Later if we fail to find a page via fast_isolate_freepages(), we may end
up using the migrate_pfn we started the search with, as valid page.  This
could lead to accessing NULL pointer derefernces like below, due to an
invalid mem_section pointer.

Unable to handle kernel NULL pointer dereference at virtual address 0000000000000008 [47/1825]
 Mem abort info:
   ESR = 0x96000004
   Exception class = DABT (current EL), IL = 32 bits
   SET = 0, FnV = 0
   EA = 0, S1PTW = 0
 Data abort info:
   ISV = 0, ISS = 0x00000004
   CM = 0, WnR = 0
 user pgtable: 4k pages, 48-bit VAs, pgdp = 0000000082f94ae9
 [0000000000000008] pgd=0000000000000000
 Internal error: Oops: 96000004 [#1] SMP
 ...
 CPU: 10 PID: 6080 Comm: qemu-system-aar Not tainted 510-rc1+ #6
 Hardware name: AmpereComputing(R) OSPREY EV-883832-X3-0001/OSPREY, BIOS 4819 09/25/2018
 pstate: 60000005 (nZCv daif -PAN -UAO)
 pc : set_pfnblock_flags_mask+0x58/0xe8
 lr : compaction_alloc+0x300/0x950
 [...]
 Process qemu-system-aar (pid: 6080, stack limit = 0x0000000095070da5)
 Call trace:
  set_pfnblock_flags_mask+0x58/0xe8
  compaction_alloc+0x300/0x950
  migrate_pages+0x1a4/0xbb0
  compact_zone+0x750/0xde8
  compact_zone_order+0xd8/0x118
  try_to_compact_pages+0xb4/0x290
  __alloc_pages_direct_compact+0x84/0x1e0
  __alloc_pages_nodemask+0x5e0/0xe18
  alloc_pages_vma+0x1cc/0x210
  do_huge_pmd_anonymous_page+0x108/0x7c8
  __handle_mm_fault+0xdd4/0x1190
  handle_mm_fault+0x114/0x1c0
  __get_user_pages+0x198/0x3c0
  get_user_pages_unlocked+0xb4/0x1d8
  __gfn_to_pfn_memslot+0x12c/0x3b8
  gfn_to_pfn_prot+0x4c/0x60
  kvm_handle_guest_abort+0x4b0/0xcd8
  handle_exit+0x140/0x1b8
  kvm_arch_vcpu_ioctl_run+0x260/0x768
  kvm_vcpu_ioctl+0x490/0x898
  do_vfs_ioctl+0xc4/0x898
  ksys_ioctl+0x8c/0xa0
  __arm64_sys_ioctl+0x28/0x38
  el0_svc_common+0x74/0x118
  el0_svc_handler+0x38/0x78
  el0_svc+0x8/0xc
 Code: f8607840 f100001f 8b011401 9a801020 (f9400400)
 ---[ end trace af6a35219325a9b6 ]---

The issue was reported on an arm64 server with 128GB with holes in the
zone (e.g, [32GB@4GB, 96GB@544GB]), with a swap device enabled, while
running 100 KVM guest instances.

This patch fixes the issue by ensuring that the page belongs to a valid
PFN when we fallback to using the lower limit of the scan range upon
failure in fast_isolate_freepages().

Link: http://lkml.kernel.org/r/1558711908-15688-1-git-send-email-suzuki.poulose@arm.com
Fixes: 5a811889de ("mm, compaction: use free lists to quickly locate a migration target")
Signed-off-by: Suzuki K Poulose <suzuki.poulose@arm.com>
Reported-by: Marc Zyngier <marc.zyngier@arm.com>
Reviewed-by: Mel Gorman <mgorman@techsingularity.net>
Reviewed-by: Anshuman Khandual <anshuman.khandual@arm.com>
Cc: Michal Hocko <mhocko@suse.com>
Cc: Qian Cai <cai@lca.pw>
Cc: Marc Zyngier <marc.zyngier@arm.com>
Cc: <stable@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2019-06-01 15:51:32 -07:00
..
kasan kasan: initialize tag to 0xff in __kasan_kmalloc 2019-06-01 15:51:31 -07:00
backing-dev.c treewide: Add SPDX license identifier for missed files 2019-05-21 10:50:45 +02:00
balloon_compaction.c treewide: Add SPDX license identifier for missed files 2019-05-21 10:50:45 +02:00
cleancache.c
cma_debug.c mm/cma_debug.c: fix the break condition in cma_maxchunk_get() 2019-05-14 09:47:45 -07:00
cma.c treewide: Replace GPLv2 boilerplate/reference with SPDX - rule 98 2019-05-24 17:37:54 +02:00
cma.h
compaction.c mm, compaction: make sure we isolate a valid PFN 2019-06-01 15:51:32 -07:00
debug_page_ref.c
debug.c mm: update references to page _refcount 2019-05-14 19:52:47 -07:00
dmapool.c docs/core-api/mm: fix return value descriptions in mm/ 2019-03-05 21:07:20 -08:00
early_ioremap.c
fadvise.c vfs: implement readahead(2) using POSIX_FADV_WILLNEED 2018-08-30 20:01:32 +02:00
failslab.c mm: no need to check return value of debugfs_create functions 2019-03-05 21:07:17 -08:00
filemap.c treewide: Add SPDX license identifier for missed files 2019-05-21 10:50:45 +02:00
frame_vector.c
frontswap.c
gup_benchmark.c mm/gup: replace get_user_pages_longterm() with FOLL_LONGTERM 2019-05-14 09:47:45 -07:00
gup.c mm/gup: continue VM_FAULT_RETRY processing even for pre-faults 2019-06-01 15:51:31 -07:00
highmem.c mm: convert totalram_pages and totalhigh_pages variables to atomic 2018-12-28 12:11:47 -08:00
hmm.c treewide: Replace GPLv2 boilerplate/reference with SPDX - rule 157 2019-05-30 11:26:37 -07:00
huge_memory.c mm/huge_memory.c: make __thp_get_unmapped_area static 2019-05-14 09:47:51 -07:00
hugetlb_cgroup.c
hugetlb.c treewide: Add SPDX license identifier for missed files 2019-05-21 10:50:45 +02:00
hwpoison-inject.c treewide: Add SPDX license identifier for more missed files 2019-05-21 10:50:45 +02:00
init-mm.c
internal.h treewide: Replace GPLv2 boilerplate/reference with SPDX - rule 152 2019-05-30 11:26:32 -07:00
interval_tree.c
Kconfig treewide: Add SPDX license identifier - Makefile/Kconfig 2019-05-21 10:50:46 +02:00
Kconfig.debug treewide: Add SPDX license identifier - Makefile/Kconfig 2019-05-21 10:50:46 +02:00
khugepaged.c mm/mmu_notifier: use correct mmu_notifier events for each invalidation 2019-05-14 09:47:49 -07:00
kmemleak-test.c
kmemleak.c Merge branch 'core-stacktrace-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip 2019-05-06 13:11:48 -07:00
ksm.c mm/mmu_notifier: use correct mmu_notifier events for each invalidation 2019-05-14 09:47:49 -07:00
list_lru.c memcg: make it work on sparse non-0-node systems 2019-06-01 15:51:31 -07:00
maccess.c treewide: Add SPDX license identifier for missed files 2019-05-21 10:50:45 +02:00
madvise.c mm/mmu_notifier: use correct mmu_notifier events for each invalidation 2019-05-14 09:47:49 -07:00
Makefile mm: shuffle initial free memory to improve memory-side-cache utilization 2019-05-14 19:52:48 -07:00
memblock.c treewide: Replace GPLv2 boilerplate/reference with SPDX - rule 152 2019-05-30 11:26:32 -07:00
memcontrol.c treewide: Replace GPLv2 boilerplate/reference with SPDX - rule 157 2019-05-30 11:26:37 -07:00
memfd.c mm: page cache: store only head pages in i_pages 2019-05-14 09:47:45 -07:00
memory_hotplug.c treewide: Add SPDX license identifier for missed files 2019-05-21 10:50:45 +02:00
memory-failure.c mm: hwpoison: fix thp split handing in soft_offline_in_use_page() 2019-03-05 21:07:13 -08:00
memory.c treewide: Add SPDX license identifier for missed files 2019-05-21 10:50:45 +02:00
mempolicy.c treewide: Replace GPLv2 boilerplate/reference with SPDX - rule 225 2019-05-30 11:29:56 -07:00
mempool.c docs/core-api/mm: fix return value descriptions in mm/ 2019-03-05 21:07:20 -08:00
memtest.c
migrate.c mm/mmu_notifier: use correct mmu_notifier events for each invalidation 2019-05-14 09:47:49 -07:00
mincore.c mm/mincore.c: make mincore() more conservative 2019-05-14 19:52:48 -07:00
mlock.c mm: remove zone_lru_lock() function, access ->lru_lock directly 2019-03-05 21:07:21 -08:00
mm_init.c treewide: Add SPDX license identifier for missed files 2019-05-21 10:50:45 +02:00
mmap.c treewide: Add SPDX license identifier for missed files 2019-05-21 10:50:45 +02:00
mmu_context.c
mmu_gather.c asm-generic/tlb: Remove tlb_table_flush() 2019-04-03 10:33:02 +02:00
mmu_notifier.c mm/mmu_notifier: mmu_notifier_range_update_to_read_only() helper 2019-05-14 09:47:49 -07:00
mmzone.c
mprotect.c mm/mprotect.c: fix compilation warning because of unused 'mm' variable 2019-05-14 09:47:51 -07:00
mremap.c mm/mmu_notifier: contextual information for event triggering invalidation 2019-05-14 09:47:49 -07:00
msync.c
nommu.c treewide: Add SPDX license identifier for missed files 2019-05-21 10:50:45 +02:00
oom_kill.c treewide: Add SPDX license identifier for missed files 2019-05-21 10:50:45 +02:00
page_alloc.c treewide: Add SPDX license identifier for missed files 2019-05-21 10:50:45 +02:00
page_counter.c
page_ext.c memblock: drop memblock_alloc_*_nopanic() variants 2019-03-12 10:04:02 -07:00
page_idle.c mm: remove zone_lru_lock() function, access ->lru_lock directly 2019-03-05 21:07:21 -08:00
page_io.c mm/page_io.c: fix polled swap page in 2019-01-04 13:13:48 -08:00
page_isolation.c mm/page_isolation.c: remove redundant pfn_valid_within() in __first_valid_page() 2019-05-14 09:47:46 -07:00
page_owner.c mm/page_owner: Simplify stack trace handling 2019-04-29 12:37:50 +02:00
page_poison.c page_poison: play nicely with KASAN 2019-03-05 21:07:13 -08:00
page_vma_mapped.c mm/rmap: map_pte() was not handling private ZONE_DEVICE page properly 2018-10-31 08:54:11 -07:00
page-writeback.c treewide: Add SPDX license identifier for missed files 2019-05-21 10:50:45 +02:00
pagewalk.c
percpu-internal.h percpu: convert chunk hints to be based on pcpu_block_md 2019-03-13 12:25:31 -07:00
percpu-km.c percpu: set PCPU_BITMAP_BLOCK_SIZE to PAGE_SIZE 2019-03-13 12:25:31 -07:00
percpu-stats.c percpu: convert chunk hints to be based on pcpu_block_md 2019-03-13 12:25:31 -07:00
percpu-vm.c
percpu.c Merge branch 'for-5.2' of git://git.kernel.org/pub/scm/linux/kernel/git/dennis/percpu 2019-05-13 15:34:03 -07:00
pgtable-generic.c x86/mm: Page size aware flush_tlb_mm_range() 2018-10-09 16:51:11 +02:00
process_vm_access.c treewide: Replace GPLv2 boilerplate/reference with SPDX - rule 152 2019-05-30 11:26:32 -07:00
quicklist.c
readahead.c treewide: Add SPDX license identifier for missed files 2019-05-21 10:50:45 +02:00
rmap.c mm/rmap.c: use the pra.mapcount to do the check 2019-05-14 09:47:49 -07:00
rodata_test.c
shmem.c mm: page cache: store only head pages in i_pages 2019-05-14 09:47:45 -07:00
shuffle.c mm: maintain randomization of page free lists 2019-05-14 19:52:48 -07:00
shuffle.h mm: maintain randomization of page free lists 2019-05-14 19:52:48 -07:00
slab_common.c mm: add support for kmem caches in DMA32 zone 2019-03-29 10:01:37 -07:00
slab.c slab: remove /proc/slab_allocators 2019-05-16 15:51:55 -07:00
slab.h mm: add support for kmem caches in DMA32 zone 2019-03-29 10:01:37 -07:00
slob.c slob: use slab_list instead of lru 2019-05-14 09:47:44 -07:00
slub.c mm/slub.c: update the comment about slab frozen 2019-05-14 09:47:45 -07:00
sparse-vmemmap.c mm: remove include/linux/bootmem.h 2018-10-31 08:54:16 -07:00
sparse.c mm/sparse.c: clean up obsolete code comment 2019-05-14 09:47:48 -07:00
swap_cgroup.c
swap_slots.c mm, swap, get_swap_pages: use entry_size instead of cluster in parameter 2018-08-22 10:52:44 -07:00
swap_state.c mm: page cache: store only head pages in i_pages 2019-05-14 09:47:45 -07:00
swap.c treewide: Add SPDX license identifier for missed files 2019-05-21 10:50:45 +02:00
swapfile.c treewide: Add SPDX license identifier for missed files 2019-05-21 10:50:45 +02:00
truncate.c treewide: Add SPDX license identifier for missed files 2019-05-21 10:50:45 +02:00
usercopy.c mm/usercopy.c: no check page span for stack objects 2019-01-08 17:15:11 -08:00
userfaultfd.c hugetlb: use same fault hash key for shared and private mappings 2019-05-14 09:47:48 -07:00
util.c prctl_set_mm: downgrade mmap_sem to read lock 2019-06-01 15:51:31 -07:00
vmacache.c mm: get rid of vmacache_flush_all() entirely 2018-09-13 15:18:04 -10:00
vmalloc.c mm/vmalloc.c: fix typo in comment 2019-06-01 15:51:31 -07:00
vmpressure.c
vmscan.c mm: memcontrol: make cgroup stats and events query API explicitly local 2019-05-14 19:52:53 -07:00
vmstat.c treewide: Add SPDX license identifier for missed files 2019-05-21 10:50:45 +02:00
workingset.c mm: memcontrol: make cgroup stats and events query API explicitly local 2019-05-14 19:52:53 -07:00
z3fold.c z3fold: fix sheduling while atomic 2019-06-01 15:51:31 -07:00
zbud.c treewide: Add SPDX license identifier for more missed files 2019-05-21 10:50:45 +02:00
zpool.c treewide: Add SPDX license identifier for more missed files 2019-05-21 10:50:45 +02:00
zsmalloc.c mm/zsmalloc.c: fix fall-through annotation 2018-10-26 16:26:35 -07:00
zswap.c treewide: Replace GPLv2 boilerplate/reference with SPDX - rule 157 2019-05-30 11:26:37 -07:00