linux_dsm_epyc7002/mm
Michal Hocko 03049269de mm, oom_reaper: implement OOM victims queuing
wake_oom_reaper has allowed only 1 oom victim to be queued.  The main
reason for that was the simplicity as other solutions would require some
way of queuing.  The current approach is racy and that was deemed
sufficient as the oom_reaper is considered a best effort approach to
help with oom handling when the OOM victim cannot terminate in a
reasonable time.  The race could lead to missing an oom victim which can
get stuck

out_of_memory
  wake_oom_reaper
    cmpxchg // OK
    			oom_reaper
			  oom_reap_task
			    __oom_reap_task
oom_victim terminates
			      atomic_inc_not_zero // fail
out_of_memory
  wake_oom_reaper
    cmpxchg // fails
			  task_to_reap = NULL

This race requires 2 OOM invocations in a short time period which is not
very likely but certainly not impossible.  E.g.  the original victim
might have not released a lot of memory for some reason.

The situation would improve considerably if wake_oom_reaper used a more
robust queuing.  This is what this patch implements.  This means adding
oom_reaper_list list_head into task_struct (eat a hole before embeded
thread_struct for that purpose) and a oom_reaper_lock spinlock for
queuing synchronization.  wake_oom_reaper will then add the task on the
queue and oom_reaper will dequeue it.

Signed-off-by: Michal Hocko <mhocko@suse.com>
Cc: Vladimir Davydov <vdavydov@virtuozzo.com>
Cc: Andrea Argangeli <andrea@kernel.org>
Cc: David Rientjes <rientjes@google.com>
Cc: Hugh Dickins <hughd@google.com>
Cc: Johannes Weiner <hannes@cmpxchg.org>
Cc: Mel Gorman <mgorman@suse.de>
Cc: Oleg Nesterov <oleg@redhat.com>
Cc: Rik van Riel <riel@redhat.com>
Cc: Tetsuo Handa <penguin-kernel@I-love.SAKURA.ne.jp>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2016-03-25 16:37:42 -07:00
..
kasan kernel: add kcov code coverage 2016-03-22 15:36:02 -07:00
backing-dev.c mm: convert printk(KERN_<LEVEL> to pr_<level> 2016-03-17 15:09:34 -07:00
balloon_compaction.c Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jikos/trivial 2016-03-17 21:38:27 -07:00
bootmem.c mm: convert printk(KERN_<LEVEL> to pr_<level> 2016-03-17 15:09:34 -07:00
cleancache.c cleancache: constify cleancache_ops structure 2016-01-27 09:09:57 -05:00
cma_debug.c mm/cma_debug: correct size input to bitmap function 2015-07-17 16:39:54 -07:00
cma.c mm/cma.c: suppress warning 2015-11-05 19:34:48 -08:00
cma.h mm: cma: mark cma_bitmap_maxno() inline in header 2015-08-14 15:56:32 -07:00
compaction.c mm, kswapd: replace kswapd compaction with waking up kcompactd 2016-03-17 15:09:34 -07:00
debug_page_ref.c mm/page_ref: add tracepoint to track down page reference manipulation 2016-03-17 15:09:34 -07:00
debug.c mm: introduce page reference manipulation functions 2016-03-17 15:09:34 -07:00
dmapool.c mm: convert printk(KERN_<LEVEL> to pr_<level> 2016-03-17 15:09:34 -07:00
early_ioremap.c mm/early_ioremap: use offset_in_page macro 2015-11-05 19:34:48 -08:00
fadvise.c writeback: implement and use inode_congested() 2015-06-02 08:33:35 -06:00
failslab.c mm: fault-inject take over bootstrap kmem_cache check 2016-03-15 16:55:16 -07:00
filemap.c mm: use radix_tree_iter_retry() 2016-03-17 15:09:34 -07:00
frame_vector.c mm/gup: Switch all callers of get_user_pages() to not pass tsk/mm 2016-02-16 10:11:12 +01:00
frontswap.c frontswap: allow multiple backends 2015-06-24 17:49:45 -07:00
gup.c mm/core, x86/mm/pkeys: Differentiate instruction fetches 2016-02-18 19:46:29 +01:00
highmem.c
huge_memory.c powerpc updates for 4.6 2016-03-19 15:38:41 -07:00
hugetlb_cgroup.c mm: make compound_head() robust 2015-11-06 17:50:42 -08:00
hugetlb.c mm: convert pr_warning to pr_warn 2016-03-17 15:09:34 -07:00
hwpoison-inject.c hwpoison: use page_cgroup_ino for filtering by memcg 2015-09-10 13:29:01 -07:00
init-mm.c
internal.h mm, oom: introduce oom reaper 2016-03-25 16:37:42 -07:00
interval_tree.c mm: replace vma->sharead.linear with vma->shared 2015-02-10 14:30:31 -08:00
Kconfig Merge branch 'mm-pkeys-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip 2016-03-20 19:08:56 -07:00
Kconfig.debug mm/page_ref: add tracepoint to track down page reference manipulation 2016-03-17 15:09:34 -07:00
kmemcheck.c mm: convert printk(KERN_<LEVEL> to pr_<level> 2016-03-17 15:09:34 -07:00
kmemleak-test.c mm: convert printk(KERN_<LEVEL> to pr_<level> 2016-03-17 15:09:34 -07:00
kmemleak.c mm: coalesce split strings 2016-03-17 15:09:34 -07:00
ksm.c mm/core: Do not enforce PKEY permissions on remote mm access 2016-02-18 19:46:28 +01:00
list_lru.c mm: memcontrol: move kmem accounting code to CONFIG_MEMCG 2016-01-20 17:09:18 -08:00
maccess.c mm/maccess.c: actually return -EFAULT from strncpy_from_unsafe 2015-11-05 19:34:48 -08:00
madvise.c mm/madvise: update comment on sys_madvise() 2016-03-15 16:55:16 -07:00
Makefile kernel: add kcov code coverage 2016-03-22 15:36:02 -07:00
memblock.c mm: coalesce split strings 2016-03-17 15:09:34 -07:00
memcontrol.c mm: memcontrol: zap oom_info_lock 2016-03-17 15:09:34 -07:00
memory_hotplug.c mm: coalesce split strings 2016-03-17 15:09:34 -07:00
memory-failure.c mm: convert printk(KERN_<LEVEL> to pr_<level> 2016-03-17 15:09:34 -07:00
memory.c mm, oom: introduce oom reaper 2016-03-25 16:37:42 -07:00
mempolicy.c Merge branch 'mm-pkeys-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip 2016-03-20 19:08:56 -07:00
mempool.c mm, mempool: only set __GFP_NOMEMALLOC if there are free elements 2016-03-17 15:09:34 -07:00
memtest.c memtest: remove unused header files 2015-09-08 15:35:28 -07:00
migrate.c mm: make remove_migration_ptes() beyond mm/migration.c 2016-03-17 15:09:34 -07:00
mincore.c thp: change pmd_trans_huge_lock() interface to return ptl 2016-01-21 17:20:51 -08:00
mlock.c mm: fix mlock accouting 2016-01-21 17:20:51 -08:00
mm_init.c mm: convert printk(KERN_<LEVEL> to pr_<level> 2016-03-17 15:09:34 -07:00
mmap.c Merge branch 'mm-pkeys-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip 2016-03-20 19:08:56 -07:00
mmu_context.c
mmu_notifier.c fix Christoph's email addresses 2016-03-17 15:09:34 -07:00
mmzone.c mm/mmzone.c: memmap_valid_within() can be boolean 2016-01-14 16:00:49 -08:00
mprotect.c mm/mprotect.c: don't imply PROT_EXEC on non-exec fs 2016-03-22 15:36:02 -07:00
mremap.c mm: cleanup *pte_alloc* interfaces 2016-03-17 15:09:34 -07:00
msync.c mm/msync: use offset_in_page macro 2015-11-05 19:34:48 -08:00
nobootmem.c mm: convert printk(KERN_<LEVEL> to pr_<level> 2016-03-17 15:09:34 -07:00
nommu.c Merge branch 'mm-pkeys-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip 2016-03-20 19:08:56 -07:00
oom_kill.c mm, oom_reaper: implement OOM victims queuing 2016-03-25 16:37:42 -07:00
page_alloc.c mm,oom: do not loop !__GFP_FS allocation if the OOM killer is disabled 2016-03-17 15:09:34 -07:00
page_counter.c mm: page_counter: let page_counter_try_charge() return bool 2015-11-05 19:34:48 -08:00
page_ext.c mm/page_poisoning.c: allow for zero poisoning 2016-03-15 16:55:16 -07:00
page_idle.c mm: add page_check_address_transhuge() helper 2016-01-15 17:56:32 -08:00
page_io.c zram: revive swap_slot_free_notify 2016-03-22 15:36:02 -07:00
page_isolation.c mm/page_isolation: do some cleanup in "undo_isolate_page_range" 2016-01-15 17:56:32 -08:00
page_owner.c mm: coalesce split strings 2016-03-17 15:09:34 -07:00
page_poison.c mm/page_poisoning.c: allow for zero poisoning 2016-03-15 16:55:16 -07:00
page-writeback.c mm: remove unnecessary uses of lock_page_memcg() 2016-03-15 16:55:16 -07:00
pagewalk.c thp: rename split_huge_page_pmd() to split_huge_pmd() 2016-01-15 17:56:32 -08:00
percpu-km.c mm: percpu: use pr_fmt to prefix output 2016-03-17 15:09:34 -07:00
percpu-vm.c
percpu.c mm: percpu: use pr_fmt to prefix output 2016-03-17 15:09:34 -07:00
pgtable-generic.c mm/thp/migration: switch from flush_tlb_range to flush_pmd_tlb_range 2016-03-17 15:09:34 -07:00
process_vm_access.c mm/gup: Introduce get_user_pages_remote() 2016-02-16 10:04:09 +01:00
quicklist.c fix Christoph's email addresses 2016-03-17 15:09:34 -07:00
readahead.c mm: move lru_to_page to mm_inline.h 2016-01-14 16:00:49 -08:00
rmap.c thp: rewrite freeze_page()/unfreeze_page() with generic rmap walkers 2016-03-17 15:09:34 -07:00
shmem.c radix-tree,shmem: introduce radix_tree_iter_next() 2016-03-17 15:09:34 -07:00
slab_common.c mm: convert printk(KERN_<LEVEL> to pr_<level> 2016-03-17 15:09:34 -07:00
slab.c mm: convert printk(KERN_<LEVEL> to pr_<level> 2016-03-17 15:09:34 -07:00
slab.h mm: memcontrol: report slab usage in cgroup2 memory.stat 2016-03-17 15:09:34 -07:00
slob.c mm: slab: free kmem_cache_node after destroy sysfs file 2016-02-18 16:23:24 -08:00
slub.c mm: coalesce split strings 2016-03-17 15:09:34 -07:00
sparse-vmemmap.c mm: convert printk(KERN_<LEVEL> to pr_<level> 2016-03-17 15:09:34 -07:00
sparse.c mm: convert printk(KERN_<LEVEL> to pr_<level> 2016-03-17 15:09:34 -07:00
swap_cgroup.c mm: convert printk(KERN_<LEVEL> to pr_<level> 2016-03-17 15:09:34 -07:00
swap_state.c mm: memcontrol: charge swap to cgroup2 2016-01-20 17:09:18 -08:00
swap.c mm, x86: get_user_pages() for dax mappings 2016-01-15 17:56:32 -08:00
swapfile.c Merge branch 'drm-next' of git://people.freedesktop.org/~airlied/linux 2016-03-21 13:48:00 -07:00
truncate.c mm: remove unnecessary uses of lock_page_memcg() 2016-03-15 16:55:16 -07:00
userfaultfd.c mm: cleanup *pte_alloc* interfaces 2016-03-17 15:09:34 -07:00
util.c Merge branch 'mm-pkeys-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip 2016-03-20 19:08:56 -07:00
vmacache.c mm/vmacache: inline vmacache_valid_mm() 2015-11-05 19:34:48 -08:00
vmalloc.c mm/vmalloc: use PAGE_ALIGNED() to check PAGE_SIZE alignment 2016-03-17 15:09:34 -07:00
vmpressure.c mm/vmpressure.c: fix subtree pressure detection 2016-02-03 08:28:43 -08:00
vmscan.c mm: introduce page reference manipulation functions 2016-03-17 15:09:34 -07:00
vmstat.c thp, vmstats: count deferred split events 2016-03-17 15:09:34 -07:00
workingset.c mm: workingset: make shadow node shrinker memcg aware 2016-03-17 15:09:34 -07:00
zbud.c mm/zbud.c: use list_last_entry() instead of list_tail_entry() 2016-01-15 11:40:52 -08:00
zpool.c mm: zsmalloc: constify struct zs_pool name 2015-11-06 17:50:42 -08:00
zsmalloc.c mm/zsmalloc: add `freeable' column to pool stat 2016-03-17 15:09:34 -07:00
zswap.c mm/zswap: change incorrect strncmp use to strcmp 2015-12-18 14:25:40 -08:00