linux_dsm_epyc7002/mm
Johannes Weiner 3a025760fc mm: page_alloc: spill to remote nodes before waking kswapd
On NUMA systems, a node may start thrashing cache or even swap anonymous
pages while there are still free pages on remote nodes.

This is a result of commits 81c0a2bb51 ("mm: page_alloc: fair zone
allocator policy") and fff4068cba ("mm: page_alloc: revert NUMA aspect
of fair allocation policy").

Before those changes, the allocator would first try all allowed zones,
including those on remote nodes, before waking any kswapds.  But now,
the allocator fastpath doubles as the fairness pass, which in turn can
only consider the local node to prevent remote spilling based on
exhausted fairness batches alone.  Remote nodes are only considered in
the slowpath, after the kswapds are woken up.  But if remote nodes still
have free memory, kswapd should not be woken to rebalance the local node
or it may thrash cache or swap prematurely.

Fix this by adding one more unfair pass over the zonelist that is
allowed to spill to remote nodes after the local fairness pass fails but
before entering the slowpath and waking the kswapds.
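
The pass ordering described above can be sketched in user space as follows.
This is a minimal illustration, not the kernel code: struct sketch_zone,
scan() and sketch_alloc_page() are hypothetical names, watermark checks are
omitted, and every allocation is assumed to be a single page.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical, simplified stand-in for a zone; not the kernel's struct zone. */
struct sketch_zone {
	int node;         /* NUMA node this zone belongs to */
	long free_pages;  /* pages currently free */
	long alloc_batch; /* remaining fairness batch; <= 0 means exhausted */
};

enum sketch_result {
	SKETCH_LOCAL_FAIR,   /* satisfied by the local fairness pass */
	SKETCH_UNFAIR_SPILL, /* satisfied by the unfair pass, possibly remotely */
	SKETCH_SLOWPATH,     /* nothing free: the slowpath would wake kswapd */
};

/* Walk the zonelist in order and return the first usable zone, or NULL. */
static struct sketch_zone *scan(struct sketch_zone *zones, size_t nr,
				int local_node, bool fair)
{
	for (size_t i = 0; i < nr; i++) {
		struct sketch_zone *z = &zones[i];

		/* The fair pass only looks at local zones with batch left. */
		if (fair && (z->node != local_node || z->alloc_batch <= 0))
			continue;
		if (z->free_pages <= 0)
			continue;
		return z;
	}
	return NULL;
}

static enum sketch_result sketch_alloc_page(struct sketch_zone *zones,
					    size_t nr, int local_node)
{
	struct sketch_zone *z;

	assert(zones != NULL && nr > 0);

	/* Pass 1: fairness pass, local node only. */
	z = scan(zones, nr, local_node, true);
	if (z) {
		z->free_pages--;
		z->alloc_batch--;
		return SKETCH_LOCAL_FAIR;
	}

	/* Pass 2: unfair pass, remote zones allowed, kswapd not yet woken. */
	z = scan(zones, nr, local_node, false);
	if (z) {
		z->free_pages--;
		z->alloc_batch--; /* may go negative on a spilled-to zone */
		return SKETCH_UNFAIR_SPILL;
	}

	/* Pass 3: only now would the slowpath run and wake the kswapds. */
	return SKETCH_SLOWPATH;
}
```

In this sketch a remote zone keeps serving allocations after its batch has
dropped below zero, which is the underflow that the batch reset has to
account for.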

This also gets rid of the GFP_THISNODE exemption from the fairness
protocol because the unfair pass is no longer tied to kswapd, which
GFP_THISNODE is not allowed to wake up.

However, because remote spills can be more frequent now - we prefer them
over local kswapd reclaim - the allocation batches on remote nodes could
underflow more heavily.  When resetting the batches, use
atomic_long_read() directly instead of zone_page_state() to calculate the
delta as the latter filters negative counter values.
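
To illustrate why the clamped read matters, here is a small user-space
sketch. The names batch_counter, read_clamped(), read_raw() and
reset_batch() are hypothetical; read_clamped() only mimics the
negative-filtering behavior attributed to zone_page_state() above, and the
counter is a plain long rather than an atomic.

```c
#include <assert.h>

/* Hypothetical stand-in for one zone's allocation batch counter. */
struct batch_counter {
	long value;
};

/* Mimics a reader that filters negative values, reporting them as 0. */
static long read_clamped(const struct batch_counter *c)
{
	return c->value < 0 ? 0 : c->value;
}

/* Mimics a direct read that returns the counter as stored. */
static long read_raw(const struct batch_counter *c)
{
	return c->value;
}

/*
 * Reset the counter to 'target' by adding a delta, the way a
 * modify-by-delta interface would. The delta is only correct if the
 * reader reports the true current value.
 */
static void reset_batch(struct batch_counter *c, long target,
			long (*reader)(const struct batch_counter *))
{
	assert(target >= 0);
	c->value += target - reader(c);
}
```

With a counter that has underflowed to -5 and a target of 32, the clamped
reader computes a delta of 32 and leaves the counter at 27, while the raw
reader computes 37 and restores the full batch of 32.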

Signed-off-by: Johannes Weiner <hannes@cmpxchg.org>
Acked-by: Rik van Riel <riel@redhat.com>
Acked-by: Mel Gorman <mgorman@suse.de>
Cc: <stable@kernel.org>		[3.12+]

Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2014-04-07 16:35:57 -07:00
..
backing-dev.c bdi: avoid oops on device removal 2014-04-03 16:20:49 -07:00
balloon_compaction.c mm: print more details for bad_page() 2014-01-23 16:36:50 -08:00
bootmem.c mm/bootmem.c: remove unused local `map' 2013-11-13 12:09:09 +09:00
bounce.c block: Convert bio_for_each_segment() to bvec_iter 2013-11-23 22:33:49 -08:00
cleancache.c mm: dump page when hitting a VM_BUG_ON using VM_BUG_ON_PAGE 2014-01-23 16:36:50 -08:00
compaction.c mm, compaction: determine isolation mode only once 2014-04-07 16:35:55 -07:00
debug-pagealloc.c
dmapool.c
fadvise.c
failslab.c
filemap_xip.c seqcount: Add lockdep functionality to seqcount/seqlock structures 2013-11-06 12:40:26 +01:00
filemap.c memcg: rename high level charging functions 2014-04-07 16:35:57 -07:00
fremap.c mm: fix bad rss-counter if remap_file_pages raced migration 2014-03-19 16:21:49 -07:00
frontswap.c
highmem.c
huge_memory.c memcg: rename high level charging functions 2014-04-07 16:35:57 -07:00
hugetlb_cgroup.c cgroup: drop const from @buffer of cftype->write_string() 2014-03-19 10:23:54 -04:00
hugetlb.c mm: fix 'ERROR: do not initialise globals to 0 or NULL' and coding style 2014-04-07 16:35:55 -07:00
hwpoison-inject.c mm/hwpoison: add '#' to hwpoison_inject 2014-01-21 16:19:48 -08:00
init-mm.c
internal.h mm: page_alloc: spill to remote nodes before waking kswapd 2014-04-07 16:35:57 -07:00
interval_tree.c
Kconfig mm: disable split page table lock for !MMU 2014-04-07 16:35:52 -07:00
Kconfig.debug
kmemcheck.c
kmemleak-test.c
kmemleak.c kmemleak: change some global variables to int 2014-04-03 16:20:50 -07:00
ksm.c mm: close PageTail race 2014-03-04 07:55:47 -08:00
list_lru.c mm: keep page cache radix tree nodes in check 2014-04-03 16:21:01 -07:00
maccess.c
madvise.c
Makefile mm: per-thread vma caching 2014-04-07 16:35:53 -07:00
memblock.c ARM: 7993/1: mm/memblock: add memblock_get_current_limit 2014-03-12 00:16:56 +00:00
memcontrol.c memcg: rename high level charging functions 2014-04-07 16:35:57 -07:00
memory_hotplug.c mm/memory_hotplug.c: move register_memory_resource out of the lock_memory_hotplug 2014-01-23 16:36:52 -08:00
memory-failure.c Merge branch 'for-3.15' of git://git.kernel.org/pub/scm/linux/kernel/git/tj/cgroup 2014-04-03 13:05:42 -07:00
memory.c memcg: rename high level charging functions 2014-04-07 16:35:57 -07:00
mempolicy.c mm, mempolicy: remove per-process flag 2014-04-07 16:35:54 -07:00
mempool.c mempool: add unlikely and likely hints 2014-04-07 16:35:55 -07:00
migrate.c mm: fix swapops.h:131 bug if remap_file_pages raced migration 2014-03-20 22:09:09 -07:00
mincore.c mm + fs: prepare for non-page entries in page cache radix trees 2014-04-03 16:21:00 -07:00
mlock.c mm: dump page when hitting a VM_BUG_ON using VM_BUG_ON_PAGE 2014-01-23 16:36:50 -08:00
mm_init.c mm: bring back /sys/kernel/mm 2014-01-27 21:02:39 -08:00
mmap.c mm: per-thread vma caching 2014-04-07 16:35:53 -07:00
mmu_context.c sched/mm: call finish_arch_post_lock_switch in idle_task_exit and use_mm 2014-02-21 08:50:17 +01:00
mmu_notifier.c mm: audit/fix non-modular users of module_init in core code 2014-01-23 16:36:52 -08:00
mmzone.c
mprotect.c mm: move mmu notifier call from change_protection to change_pmd_range 2014-04-07 16:35:50 -07:00
mremap.c
msync.c
nobootmem.c mm/nobootmem.c: mark function as static 2014-04-03 16:21:02 -07:00
nommu.c mm: fix 'ERROR: do not initialise globals to 0 or NULL' and coding style 2014-04-07 16:35:55 -07:00
oom_kill.c mm, oom: base root bonus on current usage 2014-01-30 16:56:56 -08:00
page_alloc.c mm: page_alloc: spill to remote nodes before waking kswapd 2014-04-07 16:35:57 -07:00
page_cgroup.c mm/page_cgroup.c: mark functions as static 2014-04-03 16:21:02 -07:00
page_io.c Merge branch 'for-3.14/core' of git://git.kernel.dk/linux-block 2014-01-30 11:19:05 -08:00
page_isolation.c
page-writeback.c mm: __set_page_dirty_nobuffers() uses spin_lock_irqsave() instead of spin_lock_irq() 2014-02-06 13:48:51 -08:00
pagewalk.c
percpu-km.c
percpu-vm.c
percpu.c percpu: renew the max_contig if we merge the head and previous block 2014-03-29 09:29:42 -04:00
pgtable-generic.c mm: fix TLB flush race between migration, and change_protection_range 2013-12-18 19:04:51 -08:00
process_vm_access.c mm/process_vm_access.c: mark function as static 2014-04-03 16:21:02 -07:00
quicklist.c
readahead.c mm/readahead.c: fix readahead failure for memoryless NUMA nodes and limit readahead pages 2014-04-03 16:21:05 -07:00
rmap.c Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/s390/linux 2014-03-31 14:35:30 -07:00
shmem.c memcg: rename high level charging functions 2014-04-07 16:35:57 -07:00
slab_common.c slab: fix wrong retval on kmem_cache_create_memcg error path 2014-01-29 16:22:40 -08:00
slab.c mm, mempolicy: remove per-process flag 2014-04-07 16:35:54 -07:00
slab.h memcg, slab: RCU protect memcg_params for root caches 2014-01-23 16:36:51 -08:00
slob.c
slub.c mm, mempolicy: rename slab_node for clarity 2014-04-07 16:35:54 -07:00
sparse-vmemmap.c mm/sparse: use memblock apis for early memory allocations 2014-01-21 16:19:47 -08:00
sparse.c mm: use macros from compiler.h instead of __attribute__((...)) 2014-04-07 16:35:54 -07:00
swap_state.c swap: add a simple detector for inappropriate swapin readahead 2014-02-06 13:48:51 -08:00
swap.c mm: thrash detection-based file cache sizing 2014-04-03 16:21:01 -07:00
swapfile.c mm/swap: fix race on swap_info reuse between swapoff and swapon 2014-02-06 13:48:51 -08:00
truncate.c mm: keep page cache radix tree nodes in check 2014-04-03 16:21:01 -07:00
util.c mm: use macros from compiler.h instead of __attribute__((...)) 2014-04-07 16:35:54 -07:00
vmacache.c mm: per-thread vma caching 2014-04-07 16:35:53 -07:00
vmalloc.c mm/vmalloc.c: enhance vm_map_ram() comment 2014-04-07 16:35:55 -07:00
vmpressure.c arm, pm, vmpressure: add missing slab.h includes 2014-02-03 13:24:01 -05:00
vmscan.c mm/vmscan: do not check compaction_ready on promoted zones 2014-04-07 16:35:50 -07:00
vmstat.c drop_caches: add some documentation and info message 2014-04-03 16:21:04 -07:00
workingset.c mm: keep page cache radix tree nodes in check 2014-04-03 16:21:01 -07:00
zbud.c
zsmalloc.c zsmalloc: add copyright 2014-01-30 16:56:55 -08:00
zswap.c mm/zswap.c: change params from hidden to ro 2014-01-23 16:36:50 -08:00