linux_dsm_epyc7002/mm
Claudio Imbrenda e86c59b1b1 mm/ksm: improve deduplication of zero pages with colouring
Some architectures have a set of zero pages (coloured zero pages)
instead of only one zero page, in order to improve the cache
performance.  In those cases, the kernel samepage merger (KSM) would
merge all the allocated pages that happen to be filled with zeroes to
the same deduplicated page, thus losing all the advantages of coloured
zero pages.

This behaviour is noticeable when a process accesses large arrays of
allocated pages containing zeroes.  A test I conducted on s390 shows
that there is a speed penalty when KSM merges such pages, compared to
not merging them or using actual zero pages from the start without
breaking the COW.

This patch fixes this behaviour.  When coloured zero pages are present,
the checksum of a zero page is calculated during initialisation, and
compared with the checksum of the current canditate during merging.  In
case of a match, the normal merging routine is used to merge the page
with the correct coloured zero page, which ensures the candidate page is
checked to be equal to the target zero page.

A sysfs entry is also added to toggle this behaviour, since it can
potentially introduce performance regressions, especially on
architectures without coloured zero pages.  The default value is
disabled, for backwards compatibility.

With this patch, the performance with KSM is the same as with non
COW-broken actual zero pages, which is also the same as without KSM.

[akpm@linux-foundation.org: make zero_checksum and ksm_use_zero_pages __read_mostly, per Andrea]
[imbrenda@linux.vnet.ibm.com: documentation for coloured zero pages deduplication]
  Link: http://lkml.kernel.org/r/1484927522-1964-1-git-send-email-imbrenda@linux.vnet.ibm.com
Link: http://lkml.kernel.org/r/1484850953-23941-1-git-send-email-imbrenda@linux.vnet.ibm.com
Signed-off-by: Claudio Imbrenda <imbrenda@linux.vnet.ibm.com>
Cc: Christian Borntraeger <borntraeger@de.ibm.com>
Cc: Hugh Dickins <hughd@google.com>
Cc: Andrea Arcangeli <aarcange@redhat.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2017-02-24 17:46:53 -08:00
..
kasan arm64 updates for 4.11: 2017-02-22 10:46:44 -08:00
backing-dev.c mm/backing-dev.c: use rb_entry() 2017-02-22 16:41:30 -08:00
balloon_compaction.c mm: balloon: use general non-lru movable page feature 2016-07-26 16:19:19 -07:00
bootmem.c mm/bootmem.c: cosmetic improvement of code readability 2017-02-22 16:41:29 -08:00
cleancache.c cleancache: constify cleancache_ops structure 2016-01-27 09:09:57 -05:00
cma_debug.c mm/cma_debug: correct size input to bitmap function 2015-07-17 16:39:54 -07:00
cma.c mm/cma: Cleanup highmem check 2017-01-11 13:56:49 +00:00
cma.h mm: cma: mark cma_bitmap_maxno() inline in header 2015-08-14 15:56:32 -07:00
compaction.c mm,compaction: serialize waitqueue_active() checks 2017-02-22 16:41:29 -08:00
debug_page_ref.c mm/page_ref: add tracepoint to track down page reference manipulation 2016-03-17 15:09:34 -07:00
debug.c mm, debug: print raw struct page data in __dump_page() 2016-12-12 18:55:08 -08:00
dmapool.c mm: convert printk(KERN_<LEVEL> to pr_<level> 2016-03-17 15:09:34 -07:00
early_ioremap.c mm/early_ioremap: use offset_in_page macro 2015-11-05 19:34:48 -08:00
fadvise.c mm: fadvise: avoid expensive remote LRU cache draining after FADV_DONTNEED 2016-12-20 09:48:46 -08:00
failslab.c mm: fault-inject take over bootstrap kmem_cache check 2016-03-15 16:55:16 -07:00
filemap.c mm: fix filemap.c kernel-doc warnings 2017-02-22 16:41:29 -08:00
frame_vector.c mm: replace get_vaddr_frames() write/force parameters with gup_flags 2016-10-19 08:11:24 -07:00
frontswap.c mm, frontswap: convert frontswap_enabled to static key 2016-07-26 16:19:19 -07:00
gup.c userfaultfd: hugetlbfs: gup: support VM_FAULT_RETRY 2017-02-22 16:41:28 -08:00
highmem.c mm/highmem: make nr_free_highpages() handles all highmem zones by itself 2016-05-19 19:12:14 -07:00
huge_memory.c mm, thp: add new defer+madvise defrag option 2017-02-22 16:41:30 -08:00
hugetlb_cgroup.c mm, hugetlb_cgroup: round limit_in_bytes down to hugepage size 2016-05-20 17:58:30 -07:00
hugetlb.c userfaultfd: hugetlbfs: add UFFDIO_COPY support for shared mappings 2017-02-22 16:41:28 -08:00
hwpoison-inject.c hwpoison: use page_cgroup_ino for filtering by memcg 2015-09-10 13:29:01 -07:00
init-mm.c mm: Add a user_ns owner to mm_struct and fix ptrace permission checks 2016-11-22 11:49:48 -06:00
internal.h oom-reaper: use madvise_dontneed() logic to decide if unmap the VMA 2017-02-22 16:41:30 -08:00
interval_tree.c mm: replace vma->sharead.linear with vma->shared 2015-02-10 14:30:31 -08:00
Kconfig mm: THP page cache support for ppc64 2016-12-12 18:55:08 -08:00
Kconfig.debug PM / Hibernate: allow hibernation with PAGE_POISONING_ZERO 2016-09-13 02:35:27 +02:00
khugepaged.c mm: get rid of __GFP_OTHER_NODE 2017-01-10 18:31:55 -08:00
kmemcheck.c mm: convert printk(KERN_<LEVEL> to pr_<level> 2016-03-17 15:09:34 -07:00
kmemleak-test.c mm: convert printk(KERN_<LEVEL> to pr_<level> 2016-03-17 15:09:34 -07:00
kmemleak.c kmemleak: fix reference to Documentation 2016-12-12 18:55:07 -08:00
ksm.c mm/ksm: improve deduplication of zero pages with colouring 2017-02-24 17:46:53 -08:00
list_lru.c mm/list_lru.c: avoid error-path NULL pointer deref 2016-10-27 18:43:42 -07:00
maccess.c x86: remove more uaccess_32.h complexity 2016-05-22 17:21:27 -07:00
madvise.c oom-reaper: use madvise_dontneed() logic to decide if unmap the VMA 2017-02-22 16:41:30 -08:00
Makefile mm/swap: add cache for swap slots allocation 2017-02-22 16:41:30 -08:00
memblock.c mm/memblock.c: remove unnecessary log and clean up 2017-02-22 16:41:30 -08:00
memcontrol.c slab: use memcg_kmem_cache_wq for slab destruction operations 2017-02-22 16:41:27 -08:00
memory_hotplug.c mm/memory_hotplug: set magic number to page->freelist instead of page->lru.next 2017-02-22 16:41:29 -08:00
memory-failure.c mm: Use owner_priv bit for PageSwapCache, valid when PageSwapBacked 2016-12-25 11:54:48 -08:00
memory.c mm: drop unused argument of zap_page_range() 2017-02-22 16:41:30 -08:00
mempolicy.c mm/mempolicy.c: do not put mempolicy before using its nodemask 2017-01-24 16:26:14 -08:00
mempool.c Revert "mm, mempool: only set __GFP_NOMEMALLOC if there are free elements" 2016-07-28 16:07:41 -07:00
memtest.c memtest: remove unused header files 2015-09-08 15:35:28 -07:00
migrate.c mm: Use owner_priv bit for PageSwapCache, valid when PageSwapBacked 2016-12-25 11:54:48 -08:00
mincore.c Replace <asm/uaccess.h> with <linux/uaccess.h> globally 2016-12-24 11:46:01 -08:00
mlock.c thp: fix corner case of munlock() of PTE-mapped THPs 2016-11-30 16:32:52 -08:00
mm_init.c mm: convert printk(KERN_<LEVEL> to pr_<level> 2016-03-17 15:09:34 -07:00
mmap.c powerpc: do not make the entire heap executable 2017-02-22 16:41:29 -08:00
mmu_context.c mm/mmu_context, sched/core: Fix mmu_context.h assumption 2016-04-28 11:44:19 +02:00
mmu_notifier.c fix Christoph's email addresses 2016-03-17 15:09:34 -07:00
mmzone.c mm/mmzone.c: swap likely to unlikely as code logic is different for next_zones_zonelist() 2017-02-22 16:41:29 -08:00
mprotect.c mm: mprotect: use pmd_trans_unstable instead of taking the pmd_lock 2017-02-22 16:41:29 -08:00
mremap.c userfaultfd: non-cooperative: optimize mremap_userfaultfd_complete() 2017-02-22 16:41:28 -08:00
msync.c mm/msync: use offset_in_page macro 2015-11-05 19:34:48 -08:00
nobootmem.c mm: kmemleak: avoid using __va() on addresses that don't have a lowmem mapping 2016-10-11 15:06:33 -07:00
nommu.c lib/show_mem.c: teach show_mem to work with the given nodemask 2017-02-22 16:41:30 -08:00
oom_kill.c oom-reaper: use madvise_dontneed() logic to decide if unmap the VMA 2017-02-22 16:41:30 -08:00
page_alloc.c mm, page_alloc: warn_alloc nodemask is NULL when cpusets are disabled 2017-02-22 16:41:30 -08:00
page_counter.c mm: page_counter: let page_counter_try_charge() return bool 2015-11-05 19:34:48 -08:00
page_ext.c mm/page_ext: support extra space allocation by page_ext user 2016-10-07 18:46:27 -07:00
page_idle.c mm, vmscan: move lru_lock to the node 2016-07-28 16:07:41 -07:00
page_io.c writeback: add wbc_to_write_flags() 2016-11-02 10:24:03 -06:00
page_isolation.c mm, page_alloc: avoid page_to_pfn() when merging buddies 2017-02-22 16:41:27 -08:00
page_owner.c mm/page_owner: don't define fields on struct page_ext by hard-coding 2016-10-07 18:46:27 -07:00
page_poison.c mm: check the return value of lookup_page_ext for all call sites 2016-06-03 15:06:22 -07:00
page-writeback.c block: Use pointer to backing_dev_info from request_queue 2017-02-02 08:20:48 -07:00
pagewalk.c thp: rename split_huge_page_pmd() to split_huge_pmd() 2016-01-15 17:56:32 -08:00
percpu-km.c mm: percpu: use pr_fmt to prefix output 2016-03-17 15:09:34 -07:00
percpu-vm.c percpu: move region iterations out of pcpu_[de]populate_chunk() 2014-09-02 14:46:02 -04:00
percpu.c Merge branch 'for-4.10' of git://git.kernel.org/pub/scm/linux/kernel/git/tj/percpu 2016-12-13 12:34:47 -08:00
pgtable-generic.c mm/thp/migration: switch from flush_tlb_range to flush_pmd_tlb_range 2016-03-17 15:09:34 -07:00
process_vm_access.c mm: unexport __get_user_pages_unlocked() 2016-12-14 16:04:09 -08:00
quicklist.c fix Christoph's email addresses 2016-03-17 15:09:34 -07:00
readahead.c mm: don't cap request size based on read-ahead setting 2016-12-12 18:55:08 -08:00
rmap.c mm, rmap: handle anon_vma_prepare() common case inline 2016-12-12 18:55:08 -08:00
shmem.c userfaultfd: shmem: avoid leaking blocks and used blocks in UFFDIO_COPY 2017-02-22 16:41:29 -08:00
slab_common.c slab: use memcg_kmem_cache_wq for slab destruction operations 2017-02-22 16:41:27 -08:00
slab.c slab: introduce __kmemcg_cache_deactivate() 2017-02-22 16:41:27 -08:00
slab.h slab: remove synchronous synchronize_sched() from memcg cache deactivation path 2017-02-22 16:41:27 -08:00
slob.c slab: introduce __kmemcg_cache_deactivate() 2017-02-22 16:41:27 -08:00
slub.c slub: make sysfs directories for memcg sub-caches optional 2017-02-22 16:41:27 -08:00
sparse-vmemmap.c treewide: replace obsolete _refok by __ref 2016-08-02 17:31:41 -04:00
sparse.c mm/memory_hotplug: set magic number to page->freelist instead of page->lru.next 2017-02-22 16:41:29 -08:00
swap_cgroup.c mm: convert printk(KERN_<LEVEL> to pr_<level> 2016-03-17 15:09:34 -07:00
swap_slots.c mm/swap: skip readahead only when swap slot cache is enabled 2017-02-22 16:41:30 -08:00
swap_state.c mm/swap: skip readahead only when swap slot cache is enabled 2017-02-22 16:41:30 -08:00
swap.c mm/swap: split swap cache into 64MB trunks 2017-02-22 16:41:30 -08:00
swapfile.c mm/swap: enable swap slots cache usage 2017-02-22 16:41:30 -08:00
truncate.c mm: Invalidate DAX radix tree entries only if appropriate 2016-12-26 20:29:24 -08:00
usercopy.c mm/usercopy: Switch to using lm_alias 2017-01-11 13:56:50 +00:00
userfaultfd.c userfaultfd: hugetlbfs: add UFFDIO_COPY support for shared mappings 2017-02-22 16:41:28 -08:00
util.c Replace <asm/uaccess.h> with <linux/uaccess.h> globally 2016-12-24 11:46:01 -08:00
vmacache.c mm: unrig VMA cache hit ratio 2016-10-07 18:46:27 -07:00
vmalloc.c mm, page_alloc: warn_alloc print nodemask 2017-02-22 16:41:30 -08:00
vmpressure.c mm/vmpressure.c: fix subtree pressure detection 2016-02-03 08:28:43 -08:00
vmscan.c Revert "mm: bail out in shrink_inactive_list()" 2017-02-22 16:41:30 -08:00
vmstat.c mm, compaction: add vmstats for kcompactd work 2017-02-22 16:41:29 -08:00
workingset.c mm, vmscan: cleanup lru size claculations 2017-02-22 16:41:30 -08:00
z3fold.c mm/z3fold.c: limit first_num to the actual range of possible buddy indexes 2017-02-22 16:41:31 -08:00
zbud.c mm/zbud.c: use list_last_entry() instead of list_tail_entry() 2016-01-15 11:40:52 -08:00
zpool.c mm: zsmalloc: constify struct zs_pool name 2015-11-06 17:50:42 -08:00
zsmalloc.c mm: fix some typos in mm/zsmalloc.c 2017-02-22 16:41:29 -08:00
zswap.c zswap: disable changing params if init fails 2017-02-03 14:13:19 -08:00