linux_dsm_epyc7002

mirror of https://github.com/AuxXxilium/linux_dsm_epyc7002.git synced 2025-02-04 11:35:28 +07:00

Shaohua Li eb709b0d06 mm: batch activate_page() to reduce lock contention		2011-05-25 08:39:37 -07:00

The zone->lru_lock is heavily contended in workloads where activate_page() is frequently used. We could batch activate_page() to reduce the lock contention. The batched pages will be added to the zone list when the pool is full or page reclaim is trying to drain them.

For example, in a 4 socket 64 CPU system, create a sparse file and 64 processes, with the processes shared-mapping the file. Each process reads the whole file and then exits. The process exit will do unmap_vmas() and cause a lot of activate_page() calls. In such a workload, we saw about 58% total time reduction with the patch below. Other workloads with a lot of activate_page() calls also benefit a lot.

Andrew Morton suggested activate_page() and putback_lru_pages() should follow the same path to activate pages, but this is hard to implement (see commit `7a608572a2` ("Revert "mm: batch activate_page() to reduce lock contention")). On the other hand, do we really need putback_lru_pages() to follow the same path? I tested several FIO/FFSB benchmarks (about 20 scripts for each benchmark) on 3 machines here, from 2 sockets to 4 sockets. My test doesn't show anything significant with/without the patch below (there is a slight difference, but mostly noise which we found even without the patch below before). The patch below basically returns to the same as my first post.

I tested some microbenchmarks:

    case-anon-cow-rand-mt            0.58%
    case-anon-cow-rand              -3.30%
    case-anon-cow-seq-mt            -0.51%
    case-anon-cow-seq               -5.68%
    case-anon-r-rand-mt              0.23%
    case-anon-r-rand                 0.81%
    case-anon-r-seq-mt              -0.71%
    case-anon-r-seq                 -1.99%
    case-anon-rx-rand-mt             2.11%
    case-anon-rx-seq-mt              3.46%
    case-anon-w-rand-mt             -0.03%
    case-anon-w-rand                -0.50%
    case-anon-w-seq-mt              -1.08%
    case-anon-w-seq                 -0.12%
    case-anon-wx-rand-mt            -5.02%
    case-anon-wx-seq-mt             -1.43%
    case-fork                        1.65%
    case-fork-sleep                 -0.07%
    case-fork-withmem                1.39%
    case-hugetlb                    -0.59%
    case-lru-file-mmap-read-mt      -0.54%
    case-lru-file-mmap-read          0.61%
    case-lru-file-mmap-read-rand    -2.24%
    case-lru-file-readonce          -0.64%
    case-lru-file-readtwice        -11.69%
    case-lru-memcg                  -1.35%
    case-mmap-pread-rand-mt          1.88%
    case-mmap-pread-rand           -15.26%
    case-mmap-pread-seq-mt           0.89%
    case-mmap-pread-seq            -69.72%
    case-mmap-xread-rand-mt          0.71%
    case-mmap-xread-seq-mt           0.38%

The most significant are:

    case-lru-file-readtwice        -11.69%
    case-mmap-pread-rand           -15.26%
    case-mmap-pread-seq            -69.72%

which use activate_page() a lot. The others are basically variations, because each run differs slightly.

In the UP case, 'size mm/swap.o' before the two patches:

       text    data     bss     dec     hex filename
       6466     896       4    7366    1cc6 mm/swap.o

after the two patches:

       text    data     bss     dec     hex filename
       6343     896       4    7243    1c4b mm/swap.o

Signed-off-by: Shaohua Li <shaohua.li@intel.com>
Cc: KOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com>
Cc: Hiroyuki Kamezawa <kamezawa.hiroyuki@gmail.com>
Cc: Andi Kleen <andi@firstfloor.org>
Cc: Minchan Kim <minchan.kim@gmail.com>
Cc: Rik van Riel <riel@redhat.com>
Cc: Mel Gorman <mel@csn.ul.ie>
Cc: Johannes Weiner <hannes@cmpxchg.org>
Cc: Hugh Dickins <hughd@google.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
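
The batching idea in the commit message above can be illustrated with a small userspace model. This is only a sketch, not the code in mm/swap.c: `struct zone_sim`, `struct page_batch`, `BATCH_SIZE`, and the three functions are hypothetical names invented for the example, the lock is modelled as a counter of acquisitions rather than a real spinlock, and pages are plain integers. It shows why collecting activations in a local pool and draining them when the pool is full (or on request) cuts the number of times the shared lock is taken.

```c
#include <assert.h>
#include <stddef.h>

/* Assumption: pool size chosen for illustration only. */
#define BATCH_SIZE 14

/* Hypothetical stand-in for a zone: one lock guarding one active list. */
struct zone_sim {
    unsigned long lock_acquisitions; /* how often the "lru_lock" was taken */
    unsigned long nr_active;         /* pages moved to the active list */
};

/* Hypothetical per-CPU pool of pages waiting to be activated. */
struct page_batch {
    size_t nr;
    int pages[BATCH_SIZE];
};

/* Unbatched path: every activation takes the lock once. */
static void activate_page_unbatched(struct zone_sim *zone, int page)
{
    (void)page;
    zone->lock_acquisitions++;       /* lock */
    zone->nr_active++;               /* move one page */
                                     /* unlock */
}

/* Drain the pool: one lock acquisition covers every pooled page. */
static void batch_drain(struct zone_sim *zone, struct page_batch *batch)
{
    if (batch->nr == 0)
        return;                      /* nothing pooled, skip the lock */
    zone->lock_acquisitions++;       /* lock */
    zone->nr_active += batch->nr;    /* move all pooled pages */
    batch->nr = 0;                   /* unlock, pool is empty again */
}

/* Batched path: pool the page, drain only when the pool is full. */
static void activate_page_batched(struct zone_sim *zone,
                                  struct page_batch *batch, int page)
{
    assert(batch->nr < BATCH_SIZE);
    batch->pages[batch->nr++] = page;
    if (batch->nr == BATCH_SIZE)
        batch_drain(zone, batch);
}
```

In this model, activating 100 pages through `activate_page_unbatched()` takes the lock 100 times, while `activate_page_batched()` followed by a final `batch_drain()` takes it 8 times (seven full pools of 14 plus one partial drain of 2). The explicit `batch_drain()` call plays the role the commit message assigns to page reclaim draining the pool.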
..
backing-dev.c	Fix common misspellings	2011-03-31 11:26:23 -03:00
bootmem.c	crash_dump: export is_kdump_kernel to modules, consolidate elfcorehdr_addr, setup_elfcorehdr and saved_max_pfn	2011-03-23 19:47:19 -07:00
bounce.c	bounce: call flush_dcache_page() after bounce_copy_vec()	2010-09-09 18:57:25 -07:00
compaction.c	mm: compaction: minimise the time IRQs are disabled while isolating pages for migration	2011-03-22 17:44:05 -07:00
debug-pagealloc.c
dmapool.c	mm/dmapool.c: use TASK_UNINTERRUPTIBLE in dma_pool_alloc()	2011-01-13 17:32:48 -08:00
fadvise.c	readahead: introduce FMODE_RANDOM for POSIX_FADV_RANDOM	2010-03-06 11:26:25 -08:00
failslab.c	include cleanup: Update gfp.h and slab.h includes to prepare for breaking implicit slab.h inclusion from percpu.h	2010-03-30 22:02:32 +09:00
filemap_xip.c	mm: Convert i_mmap_lock to a mutex	2011-05-25 08:39:18 -07:00
filemap.c	readahead: trigger mmap sequential readahead on PG_readahead	2011-05-25 08:39:27 -07:00
fremap.c	mm: Convert i_mmap_lock to a mutex	2011-05-25 08:39:18 -07:00
highmem.c	mm,x86: fix kmap_atomic_push vs ioremap_32.c	2010-10-27 18:03:05 -07:00
huge_memory.c	mm: thp: optimize memcg charge in khugepaged	2011-05-25 08:39:21 -07:00
hugetlb.c	mm: Convert i_mmap_lock to a mutex	2011-05-25 08:39:18 -07:00
hwpoison-inject.c	Fix common misspellings	2011-03-31 11:26:23 -03:00
init-mm.c	mm: convert mm->cpu_vm_cpumask into cpumask_var_t	2011-05-25 08:39:21 -07:00
internal.h	mm: nommu: sort mm->mmap list properly	2011-05-25 08:39:05 -07:00
Kconfig	mm: compaction: don't depend on HUGETLB_PAGE	2011-01-26 10:50:02 +10:00
Kconfig.debug	mm: debug-pagealloc: fix kconfig dependency warning	2011-03-22 17:44:02 -07:00
kmemcheck.c	kmemcheck: Fix build errors due to missing slab.h	2010-03-30 22:02:32 +09:00
kmemleak-test.c	kmemleak: remove memset by using kzalloc	2011-01-27 18:31:51 +00:00
kmemleak.c	kmemleak: Do not return a pointer to an object that kmemleak did not get	2011-05-19 17:35:28 +01:00
ksm.c	oom: replace PF_OOM_ORIGIN with toggling oom_score_adj	2011-05-25 08:39:10 -07:00
maccess.c	MN10300: Save frame pointer in thread_info struct rather than global var	2010-10-27 17:29:01 +01:00
madvise.c	thp: khugepaged: make khugepaged aware about madvise	2011-01-13 17:32:47 -08:00
Makefile	bootmem: Separate out CONFIG_NO_BOOTMEM code into nobootmem.c	2011-02-24 14:43:05 +01:00
memblock.c	mm/memblock: properly handle overlaps and fix error path	2011-03-22 17:44:09 -07:00
memcontrol.c	memsw: remove noswapaccount kernel parameter	2011-05-25 08:39:36 -07:00
memory_hotplug.c	mm: remove dependency on CONFIG_FLATMEM from online_page()	2011-05-25 08:39:28 -07:00
memory-failure.c	vmscan: change shrinker API by passing shrink_control struct	2011-05-25 08:39:26 -07:00
memory.c	mm: uninline large generic tlb.h functions	2011-05-25 08:39:20 -07:00
mempolicy.c	mm: proc: move show_numa_map() to fs/proc/task_mmu.c	2011-05-25 08:39:34 -07:00
mempool.c	mm: remove broken 'kzalloc' mempool	2009-09-22 07:17:35 -07:00
migrate.c	mm: use refcounts for page_lock_anon_vma()	2011-05-25 08:39:19 -07:00
mincore.c	thp: mincore transparent hugepage support	2011-01-13 17:32:44 -08:00
mlock.c	VM: skip the stack guard page lookup in get_user_pages only for mlock	2011-05-04 21:30:28 -07:00
mm_init.c
mmap.c	mm: convert anon_vma->lock to a mutex	2011-05-25 08:39:19 -07:00
mmu_context.c	exit: fix oops in sync_mm_rss	2010-03-24 16:31:21 -07:00
mmu_notifier.c	thp: mmu_notifier_test_young	2011-01-13 17:32:46 -08:00
mmzone.c	mm: page allocator: adjust the per-cpu counter threshold when memory is low	2011-01-13 17:32:31 -08:00
mprotect.c	thp: mprotect: transparent huge page support	2011-01-13 17:32:44 -08:00
mremap.c	mm: Convert i_mmap_lock to a mutex	2011-05-25 08:39:18 -07:00
msync.c	sanitize vfs_fsync calling conventions	2010-05-21 18:31:21 -04:00
nobootmem.c	memblock/nobootmem: remove unneeded code from alloc_bootmem_node_high()	2011-05-25 08:39:31 -07:00
nommu.c	mm: nommu: fix a compile warning in do_mmap_pgoff()	2011-05-25 08:39:07 -07:00
oom_kill.c	oom: replace PF_OOM_ORIGIN with toggling oom_score_adj	2011-05-25 08:39:10 -07:00
page_alloc.c	mm/page_alloc.c: prevent unending loop in __alloc_pages_slowpath()	2011-05-25 08:39:36 -07:00
page_cgroup.c	memcg: allocate memory cgroup structures in local nodes	2011-05-11 18:50:45 -07:00
page_io.c	block: kill off REQ_UNPLUG	2011-03-10 08:52:27 +01:00
page_isolation.c	mm: page_isolation: codeclean fix comment and rm unneeded val init	2010-10-26 16:52:11 -07:00
page-writeback.c	Merge branch 'for-2.6.39/core' of git://git.kernel.dk/linux-2.6-block	2011-03-24 10:16:26 -07:00
pagewalk.c	pagewalk: only split huge pages when necessary	2011-03-22 17:44:04 -07:00
percpu-km.c	percpu: clear memory allocated with the km allocator	2010-10-02 10:28:42 +03:00
percpu-vm.c	mm: remove gfp mask from pcpu_get_vm_areas	2011-01-13 17:32:34 -08:00
percpu.c	Merge branch 'for-2.6.40' of git://git.kernel.org/pub/scm/linux/kernel/git/tj/percpu	2011-05-24 11:53:42 -07:00
pgtable-generic.c	mm/pgtable-generic.c: fix CONFIG_SWAP=n build	2011-01-26 10:49:58 +10:00
prio_tree.c	sanitize <linux/prefetch.h> usage	2011-05-20 12:50:29 -07:00
quicklist.c	include cleanup: Update gfp.h and slab.h includes to prepare for breaking implicit slab.h inclusion from percpu.h	2010-03-30 22:02:32 +09:00
readahead.c	readahead: readahead page allocations are OK to fail	2011-05-25 08:39:25 -07:00
rmap.c	mm: optimize page_lock_anon_vma() fast-path	2011-05-25 08:39:20 -07:00
shmem.c	tmpfs: implement generic xattr support	2011-05-25 08:39:31 -07:00
slab.c	sanitize <linux/prefetch.h> usage	2011-05-20 12:50:29 -07:00
slob.c	mm: Remove support for kmem_cache_name()	2011-01-23 21:00:05 +02:00
slub.c	slub: Fix double bit unlock in debug mode	2011-05-25 08:38:24 -07:00
sparse-vmemmap.c	tree-wide: fix comment/printk typos	2010-11-01 15:38:34 -04:00
sparse.c	Fix common misspellings	2011-03-31 11:26:23 -03:00
swap_state.c	block: remove per-queue plugging	2011-03-10 08:52:07 +01:00
swap.c	mm: batch activate_page() to reduce lock contention	2011-05-25 08:39:37 -07:00
swapfile.c	oom: replace PF_OOM_ORIGIN with toggling oom_score_adj	2011-05-25 08:39:10 -07:00
thrash.c
truncate.c	mm: deactivate invalidated pages	2011-03-22 17:44:03 -07:00
util.c	mm: nommu: sort mm->mmap list properly	2011-05-25 08:39:05 -07:00
vmalloc.c	mm: print vmalloc() state after allocation failures	2011-05-25 08:39:22 -07:00
vmscan.c	vmscan: change shrinker API by passing shrink_control struct	2011-05-25 08:39:26 -07:00
vmstat.c	mm, mem-hotplug: update pcp->stat_threshold when memory hotplug occur	2011-05-25 08:39:09 -07:00