linux_dsm_epyc7002/mm
Mel Gorman 1a501907bb mm: vmscan: use proportional scanning during direct reclaim and full scan at DEF_PRIORITY
Commit "mm: vmscan: obey proportional scanning requirements for kswapd"
ensured that file/anon lists were scanned proportionally for reclaim from
kswapd but ignored it for direct reclaim.  The intent was to minimise
direct reclaim latency but Yuanhan Liu pointed out that it substitutes one
long stall for many small stalls and distorts aging for normal workloads
like streaming readers/writers.  Hugh Dickins pointed out that a
side-effect of the same commit was that when one LRU list dropped to zero,
the entirety of the other list was shrunk, leading to excessive
reclaim in memcgs.  This patch scans the file/anon lists proportionally
for direct reclaim to similarly age pages whether reclaimed by kswapd or
direct reclaim but takes care to abort reclaim if one LRU drops to zero
after reclaiming the requested number of pages.
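
The behaviour described above can be illustrated with a small userspace
sketch. This is not the kernel's code: the two-list model, the names
`proportional_scan`, `struct scan_result` and `BATCH`, and the assumption
that every scanned page is reclaimed are all hypothetical simplifications
chosen for the example.

```c
#include <stdbool.h>
#include <stdio.h>

#define BATCH 32UL /* illustrative batch size, an assumption of this sketch */

enum { ANON = 0, FILE_LRU = 1, NR_LISTS = 2 };

struct scan_result {
    unsigned long scanned[NR_LISTS]; /* pages scanned per list */
    unsigned long reclaimed;         /* total pages reclaimed */
};

static unsigned long min_ul(unsigned long a, unsigned long b)
{
    return a < b ? a : b;
}

/*
 * Scan both lists in batches. Once nr_to_reclaim pages are reclaimed:
 *  - if either list has nothing left to scan, abort instead of
 *    shrinking the whole of the other list;
 *  - otherwise stop scanning the list with less work remaining and trim
 *    the other list's remaining work so both end up scanned by roughly
 *    the same fraction of their original targets.
 */
static struct scan_result proportional_scan(const unsigned long targets[NR_LISTS],
                                            unsigned long nr_to_reclaim)
{
    unsigned long nr[NR_LISTS] = { targets[ANON], targets[FILE_LRU] };
    struct scan_result res = { { 0, 0 }, 0 };
    bool adjusted = false;

    while (nr[ANON] || nr[FILE_LRU]) {
        for (int lru = 0; lru < NR_LISTS; lru++) {
            unsigned long n = min_ul(nr[lru], BATCH);

            nr[lru] -= n;
            res.scanned[lru] += n;
            res.reclaimed += n; /* assumption: every scanned page is reclaimed */
        }

        if (res.reclaimed < nr_to_reclaim || adjusted)
            continue;

        /* One list dropped to zero: do not attack the other one. */
        if (!nr[ANON] || !nr[FILE_LRU])
            break;

        int smaller = nr[FILE_LRU] > nr[ANON] ? ANON : FILE_LRU;
        int larger = 1 - smaller;

        /* Percentage of the smaller list's target that is still unscanned. */
        unsigned long pct = nr[smaller] * 100 / (targets[smaller] + 1);

        /* Stop scanning the smaller list. */
        nr[smaller] = 0;

        /* Rescale the larger list to the same completed fraction. */
        unsigned long done = targets[larger] - nr[larger];

        nr[larger] = targets[larger] * (100 - pct) / 100;
        nr[larger] -= min_ul(nr[larger], done);
        adjusted = true;
    }
    return res;
}
```

In this model, scan targets of 100 anon and 1000 file pages with a request
for 64 pages end with 32 anon and 330 file pages scanned, about a third of
each target. With targets of 32 and 1000 and a request for 32 pages, the
anon list is exhausted after the first batch and the scan stops with 32
pages scanned from each list instead of draining the file list.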

Results on ext4 using the Intel VM scalability test:

                                              3.15.0-rc5            3.15.0-rc5
                                                shrinker            proportion
Unit  lru-file-readonce    elapsed      5.3500 (  0.00%)      5.4200 ( -1.31%)
Unit  lru-file-readonce time_range      0.2700 (  0.00%)      0.1400 ( 48.15%)
Unit  lru-file-readonce time_stddv      0.1148 (  0.00%)      0.0536 ( 53.33%)
Unit lru-file-readtwice    elapsed      8.1700 (  0.00%)      8.1700 (  0.00%)
Unit lru-file-readtwice time_range      0.4300 (  0.00%)      0.2300 ( 46.51%)
Unit lru-file-readtwice time_stddv      0.1650 (  0.00%)      0.0971 ( 41.16%)
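
The percentage columns appear to be computed relative to the shrinker
baseline, with a positive value meaning a lower (better) figure. A minimal
sketch under that assumption follows; `pct_gain` is a hypothetical helper
for illustration, not part of the test suite.

```c
#include <stdio.h>

/*
 * Relative gain of `patched` over `baseline` in percent, for metrics
 * where lower is better. Positive means the patched kernel improved.
 */
static double pct_gain(double baseline, double patched)
{
    return (baseline - patched) / baseline * 100.0;
}

/* Print one row in a layout similar to the table above. */
static void print_row(const char *name, double baseline, double patched)
{
    printf("%-32s %8.4f (%6.2f%%) %8.4f (%6.2f%%)\n",
           name, baseline, 0.0, patched, pct_gain(baseline, patched));
}
```

For example, `pct_gain(0.27, 0.14)` is about 48.15 and
`pct_gain(5.35, 5.42)` is about -1.31, matching the readonce rows. Some
entries can differ in the last digit because the table was presumably
computed from unrounded measurements.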

The test cases run multiple dd instances reading sparse files. The results are within
the noise for the small test machine. The impact of the patch is more noticeable from the vmstats:

                            3.15.0-rc5  3.15.0-rc5
                              shrinker  proportion
Minor Faults                     35154       36784
Major Faults                       611        1305
Swap Ins                           394        1651
Swap Outs                         4394        5891
Allocation stalls               118616       44781
Direct pages scanned           4935171     4602313
Kswapd pages scanned          15921292    16258483
Kswapd pages reclaimed        15913301    16248305
Direct pages reclaimed         4933368     4601133
Kswapd efficiency                  99%         99%
Kswapd velocity             670088.047  682555.961
Direct efficiency                  99%         99%
Direct velocity             207709.217  193212.133
Percentage direct scans            23%         22%
Page writes by reclaim        4858.000    6232.000
Page writes file                   464         341
Page writes anon                  4394        5891

Note that there are far fewer allocation stalls even though the amount
of direct reclaim scanning is roughly the same.
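
The derived rows in the vmstat table can be reproduced from the raw
counters. The sketch below assumes that efficiency is reclaimed/scanned,
that the direct scan percentage is direct/(direct + kswapd), and that the
reporting truncates to whole percent; the helper names are hypothetical.

```c
#include <stdio.h>

/* part/whole as a whole percentage, truncated (assumed reporting style). */
static unsigned long long pct_trunc(unsigned long long part,
                                    unsigned long long whole)
{
    return whole ? part * 100ULL / whole : 0ULL;
}

/* Share of all scanned pages that were scanned by direct reclaim. */
static unsigned long long direct_scan_pct(unsigned long long direct_scanned,
                                          unsigned long long kswapd_scanned)
{
    return pct_trunc(direct_scanned, direct_scanned + kswapd_scanned);
}

/* Percentage drop from `before` to `after`, truncated. */
static unsigned long long reduction_pct(unsigned long long before,
                                        unsigned long long after)
{
    return after < before ? pct_trunc(before - after, before) : 0ULL;
}
```

With the figures above, `pct_trunc(15913301, 15921292)` gives the 99%
kswapd efficiency, `direct_scan_pct` gives 23% and 22% for the two kernels,
and `reduction_pct(118616, 44781)` shows allocation stalls falling by about
62%.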

Signed-off-by: Mel Gorman <mgorman@suse.de>
Cc: Johannes Weiner <hannes@cmpxchg.org>
Cc: Hugh Dickins <hughd@google.com>
Cc: Tim Chen <tim.c.chen@linux.intel.com>
Cc: Dave Chinner <david@fromorbit.com>
Tested-by: Yuanhan Liu <yuanhan.liu@linux.intel.com>
Cc: Bob Liu <bob.liu@oracle.com>
Cc: Jan Kara <jack@suse.cz>
Cc: Rik van Riel <riel@redhat.com>
Cc: Al Viro <viro@zeniv.linux.org.uk>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2014-06-04 16:54:12 -07:00
backing-dev.c arch: Mass conversion of smp_mb__*() 2014-04-18 14:20:48 +02:00
balloon_compaction.c mm: print more details for bad_page() 2014-01-23 16:36:50 -08:00
bootmem.c mm/bootmem.c: remove unused local `map' 2013-11-13 12:09:09 +09:00
cleancache.c mm: dump page when hitting a VM_BUG_ON using VM_BUG_ON_PAGE 2014-01-23 16:36:50 -08:00
compaction.c mm, compaction: properly signal and act upon lock and need_sched() contention 2014-06-04 16:54:11 -07:00
debug-pagealloc.c mm, x86: Remove debug_pagealloc_enabled 2011-12-06 09:24:07 +01:00
dmapool.c mm/dmapool.c: reuse devres_release() to free resources 2014-06-04 16:54:08 -07:00
early_ioremap.c mm: create generic early_ioremap() support 2014-04-07 16:36:15 -07:00
fadvise.c teach SYSCALL_DEFINE<n> how to deal with long long/unsigned long long 2013-03-03 22:46:22 -05:00
failslab.c switch debugfs to umode_t 2012-01-03 22:54:56 -05:00
filemap_xip.c seqcount: Add lockdep functionality to seqcount/seqlock structures 2013-11-06 12:40:26 +01:00
filemap.c mm: avoid unnecessary atomic operations during end_page_writeback() 2014-06-04 16:54:10 -07:00
fremap.c mm: softdirty: make freshly remapped file pages being softdirty unconditionally 2014-06-04 16:53:56 -07:00
frontswap.c swap: change swap_list_head to plist, add swap_avail_head 2014-06-04 16:54:07 -07:00
gup.c mm: cleanup __get_user_pages() 2014-06-04 16:54:05 -07:00
highmem.c Some nice cleanups, and even a patch my wife did as a "live" demo for 2012-12-20 08:37:05 -08:00
huge_memory.c mm/huge_memory.c: complete conversion to pr_foo() 2014-06-04 16:53:58 -07:00
hugetlb_cgroup.c cgroup: drop const from @buffer of cftype->write_string() 2014-03-19 10:23:54 -04:00
hugetlb.c mm, hugetlb: move the error handle logic out of normal code path 2014-06-04 16:54:10 -07:00
hwpoison-inject.c mm/hwpoison: add '#' to hwpoison_inject 2014-01-21 16:19:48 -08:00
init-mm.c atomic: use <linux/atomic.h> 2011-07-26 16:49:47 -07:00
internal.h mm, compaction: properly signal and act upon lock and need_sched() contention 2014-06-04 16:54:11 -07:00
interval_tree.c mm: add CONFIG_DEBUG_VM_RB build option 2012-10-09 16:22:42 +09:00
iov_iter.c take iov_iter stuff to mm/iov_iter.c 2014-04-01 23:19:30 -04:00
Kconfig hugetlb: restrict hugepage_migration_support() to x86_64 2014-06-04 16:53:51 -07:00
Kconfig.debug mm: more intensive memory corruption debugging 2012-01-10 16:30:42 -08:00
kmemcheck.c
kmemleak-test.c kmemleak: remove memset by using kzalloc 2011-01-27 18:31:51 +00:00
kmemleak.c mem-hotplug: implement get/put_online_mems 2014-06-04 16:53:59 -07:00
ksm.c mm: close PageTail race 2014-03-04 07:55:47 -08:00
list_lru.c mm: keep page cache radix tree nodes in check 2014-04-03 16:21:01 -07:00
maccess.c mm: Map most files to use export.h instead of module.h 2011-10-31 09:20:12 -04:00
madvise.c mm: madvise: fix MADV_WILLNEED on shmem swapouts 2014-05-23 09:37:29 -07:00
Makefile mm: move get_user_pages()-related code to separate file 2014-06-04 16:54:04 -07:00
memblock.c mm/memblock.c: use PFN_DOWN 2014-06-04 16:54:02 -07:00
memcontrol.c memcg: cleanup kmem cache creation/destruction functions naming 2014-06-04 16:54:08 -07:00
memory_hotplug.c mm, migration: add destination page freeing callback 2014-06-04 16:54:06 -07:00
memory-failure.c mm/memory-failure.c: move comment 2014-06-04 16:54:10 -07:00
memory.c mm: fix typo in comment in do_fault_around() 2014-06-04 16:54:11 -07:00
mempolicy.c mm, migration: add destination page freeing callback 2014-06-04 16:54:06 -07:00
mempool.c mm/mempool: warn about __GFP_ZERO usage 2014-06-04 16:53:58 -07:00
migrate.c mm, migration: add destination page freeing callback 2014-06-04 16:54:06 -07:00
mincore.c mm + fs: prepare for non-page entries in page cache radix trees 2014-04-03 16:21:00 -07:00
mlock.c mm: try_to_unmap_cluster() should lock_page() before mlocking 2014-04-07 16:35:57 -07:00
mm_init.c mm: bring back /sys/kernel/mm 2014-01-27 21:02:39 -08:00
mmap.c mm/mmap.c: remove the first mapping check 2014-06-04 16:54:01 -07:00
mmu_context.c sched/mm: call finish_arch_post_lock_switch in idle_task_exit and use_mm 2014-02-21 08:50:17 +01:00
mmu_notifier.c mm: audit/fix non-modular users of module_init in core code 2014-01-23 16:36:52 -08:00
mmzone.c mm: numa: Change page last {nid,pid} into {cpu,pid} 2013-10-09 14:47:45 +02:00
mprotect.c mm: move mmu notifier call from change_protection to change_pmd_range 2014-04-07 16:35:50 -07:00
mremap.c mm, thp: close race between mremap() and split_huge_page() 2014-05-11 17:55:48 +09:00
msync.c mm/msync.c: sync only the requested range in msync() 2014-06-04 16:54:11 -07:00
nobootmem.c mm/nobootmem.c: mark function as static 2014-04-03 16:21:02 -07:00
nommu.c mm: fix 'ERROR: do not initialise globals to 0 or NULL' and coding style 2014-04-07 16:35:55 -07:00
oom_kill.c mm, oom: base root bonus on current usage 2014-01-30 16:56:56 -08:00
page_alloc.c mm: page_alloc: calculate classzone_idx once from the zonelist ref 2014-06-04 16:54:10 -07:00
page_cgroup.c mm/page_cgroup.c: mark functions as static 2014-04-03 16:21:02 -07:00
page_io.c swap: use bdev_read_page() / bdev_write_page() 2014-06-04 16:54:02 -07:00
page_isolation.c mm: memory-hotplug: enable memory hotplug to handle hugepage 2013-09-11 15:57:48 -07:00
page-writeback.c mm: replace __get_cpu_var uses with this_cpu_ptr 2014-06-04 16:54:03 -07:00
pagewalk.c mm/pagewalk.c: fix walk_page_range() access of wrong PTEs 2013-10-30 14:27:03 -07:00
percpu-km.c
percpu-vm.c mm: fix kernel-doc warnings 2012-06-20 14:39:36 -07:00
percpu.c percpu: make pcpu_alloc_chunk() use pcpu_mem_free() instead of kfree() 2014-04-14 16:18:06 -04:00
pgtable-generic.c mm: fix TLB flush race between migration, and change_protection_range 2013-12-18 19:04:51 -08:00
process_vm_access.c Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs 2014-04-12 14:49:50 -07:00
quicklist.c mm: delete various needless include <linux/module.h> 2011-10-31 09:20:11 -04:00
readahead.c mm/readahead.c: inline ra_submit 2014-04-07 16:35:58 -07:00
rmap.c mm: fold mlocked_vma_newpage() into its only call site 2014-06-04 16:54:07 -07:00
shmem.c mm: non-atomically mark page accessed during page cache allocation where possible 2014-06-04 16:54:10 -07:00
slab_common.c slab: delete cache from list after __kmem_cache_shutdown succeeds 2014-06-04 16:54:08 -07:00
slab.c memcg, slab: merge memcg_{bind,release}_pages to memcg_{un}charge_slab 2014-06-04 16:54:01 -07:00
slab.h memcg, slab: merge memcg_{bind,release}_pages to memcg_{un}charge_slab 2014-06-04 16:54:01 -07:00
slob.c slab: get_online_mems for kmem_cache_{create,destroy,shrink} 2014-06-04 16:53:59 -07:00
slub.c mm: replace __get_cpu_var uses with this_cpu_ptr 2014-06-04 16:54:03 -07:00
sparse-vmemmap.c mm/sparse: use memblock apis for early memory allocations 2014-01-21 16:19:47 -08:00
sparse.c mm: use macros from compiler.h instead of __attribute__((...)) 2014-04-07 16:35:54 -07:00
swap_state.c mm: page_alloc: convert hot/cold parameter and immediate callers to bool 2014-06-04 16:54:09 -07:00
swap.c mm: non-atomically mark page accessed during page cache allocation where possible 2014-06-04 16:54:10 -07:00
swapfile.c swap: change swap_list_head to plist, add swap_avail_head 2014-06-04 16:54:07 -07:00
truncate.c mm: filemap: update find_get_pages_tag() to deal with shadow entries 2014-05-06 13:04:59 -07:00
util.c nick kvfree() from apparmor 2014-05-06 14:02:53 -04:00
vmacache.c mm,vmacache: optimize overflow system-wide flushing 2014-06-04 16:53:57 -07:00
vmalloc.c mm/vmalloc.c: replace seq_printf by seq_puts 2014-06-04 16:54:04 -07:00
vmpressure.c arm, pm, vmpressure: add missing slab.h includes 2014-02-03 13:24:01 -05:00
vmscan.c mm: vmscan: use proportional scanning during direct reclaim and full scan at DEF_PRIORITY 2014-06-04 16:54:12 -07:00
vmstat.c mm: use the light version __mod_zone_page_state in mlocked_vma_newpage() 2014-06-04 16:54:07 -07:00
workingset.c mm: keep page cache radix tree nodes in check 2014-04-03 16:21:01 -07:00
zbud.c mm/zbud: fix some trivial typos in comments 2013-09-11 15:57:35 -07:00
zsmalloc.c mm: replace __get_cpu_var uses with this_cpu_ptr 2014-06-04 16:54:03 -07:00
zswap.c Merge branch 'akpm' (incoming from Andrew) 2014-04-07 16:38:06 -07:00