linux_dsm_epyc7002/mm
Ying Han 4779280d1e mm: make get_user_pages() interruptible
The initial implementation of checking TIF_MEMDIE covers the cases of OOM
killing.  If the process has been OOM killed, the TIF_MEMDIE is set and it
return immediately.  This patch includes:

1.  add the case that the SIGKILL is sent by user processes.  The
   process can try to get_user_pages() unlimited memory even if a user
   process has sent a SIGKILL to it(maybe a monitor find the process
   exceed its memory limit and try to kill it).  In the old
   implementation, the SIGKILL won't be handled until the get_user_pages()
   returns.

2.  change the return value to be ERESTARTSYS.  It makes no sense to
   return ENOMEM if the get_user_pages returned by getting a SIGKILL
   signal.  Considering the general convention for a system call
   interrupted by a signal is ERESTARTNOSYS, so the current return value
   is consistant to that.

Lee:

An unfortunate side effect of "make-get_user_pages-interruptible" is that
it prevents a SIGKILL'd task from munlock-ing pages that it had mlocked,
resulting in freeing of mlocked pages.  Freeing of mlocked pages, in
itself, is not so bad.  We just count them now--altho' I had hoped to
remove this stat and add PG_MLOCKED to the free pages flags check.

However, consider pages in shared libraries mapped by more than one task
that a task mlocked--e.g., via mlockall().  If the task that mlocked the
pages exits via SIGKILL, these pages would be left mlocked and
unevictable.

Proposed fix:

Add another GUP flag to ignore sigkill when calling get_user_pages from
munlock()--similar to Kosaki Motohiro's 'IGNORE_VMA_PERMISSIONS flag for
the same purpose.  We are not actually allocating memory in this case,
which "make-get_user_pages-interruptible" intends to avoid.  We're just
munlocking pages that are already resident and mapped, and we're reusing
get_user_pages() to access those pages.

??  Maybe we should combine 'IGNORE_VMA_PERMISSIONS and '_IGNORE_SIGKILL
into a single flag: GUP_FLAGS_MUNLOCK ???

[Lee.Schermerhorn@hp.com: ignore sigkill in get_user_pages during munlock]
Signed-off-by: Paul Menage <menage@google.com>
Signed-off-by: Ying Han <yinghan@google.com>
Reviewed-by: KOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com>
Reviewed-by: Pekka Enberg <penberg@cs.helsinki.fi>
Cc: Nick Piggin <nickpiggin@yahoo.com.au>
Cc: Hugh Dickins <hugh@veritas.com>
Cc: Oleg Nesterov <oleg@tv-sign.ru>
Cc: Lee Schermerhorn <lee.schermerhorn@hp.com>
Cc: Rohit Seth <rohitseth@google.com>
Cc: David Rientjes <rientjes@google.com>
Signed-off-by: Lee Schermerhorn <lee.schermerhorn@hp.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2009-01-06 15:59:08 -08:00
..
allocpercpu.c
backing-dev.c mm: change dirty limit type specifiers to unsigned long 2009-01-06 15:59:02 -08:00
bootmem.c
bounce.c
dmapool.c
fadvise.c
failslab.c
filemap_xip.c badpage: remove vma from page_remove_rmap 2009-01-06 15:59:07 -08:00
filemap.c mm: write_cache_pages integrity fix 2009-01-06 15:58:59 -08:00
fremap.c badpage: remove vma from page_remove_rmap 2009-01-06 15:59:07 -08:00
highmem.c
hugetlb.c hugetlb: fix sparse warnings 2009-01-06 15:59:06 -08:00
internal.h mm: make get_user_pages() interruptible 2009-01-06 15:59:08 -08:00
Kconfig
maccess.c
madvise.c
Makefile
memcontrol.c mm: make mem_cgroup_resize_limit() static 2009-01-06 15:59:04 -08:00
memory_hotplug.c mm: remove GFP_HIGHUSER_PAGECACHE 2009-01-06 15:59:01 -08:00
memory.c mm: make get_user_pages() interruptible 2009-01-06 15:59:08 -08:00
mempolicy.c
mempool.c
migrate.c mm: add Set,ClearPageSwapCache stubs 2009-01-06 15:59:02 -08:00
mincore.c
mlock.c mm: make get_user_pages() interruptible 2009-01-06 15:59:08 -08:00
mm_init.c
mmap.c mm: update my address 2009-01-05 17:44:42 -08:00
mmu_notifier.c
mmzone.c
mprotect.c mm: cleanup: remove #ifdef CONFIG_MIGRATION 2009-01-06 15:59:00 -08:00
mremap.c mm: update my address 2009-01-05 17:44:42 -08:00
msync.c add a vfs_fsync helper 2009-01-05 11:54:28 -05:00
nommu.c inode->i_op is never NULL 2009-01-05 11:54:28 -05:00
oom_kill.c oom: print triggering task's cpuset and mems allowed 2009-01-06 15:58:59 -08:00
page_alloc.c badpage: KERN_ALERT BUG instead of KERN_EMERG 2009-01-06 15:59:08 -08:00
page_cgroup.c mm: make init_section_page_cgroup() static 2009-01-06 15:59:04 -08:00
page_io.c mm: try_to_free_swap replaces remove_exclusive_swap_page 2009-01-06 15:59:03 -08:00
page_isolation.c
page-writeback.c mm: add dirty_background_bytes and dirty_bytes sysctls 2009-01-06 15:59:03 -08:00
pagewalk.c
pdflush.c cpumask: convert mm/ 2009-01-01 10:12:29 +10:30
prio_tree.c
quicklist.c
readahead.c
rmap.c badpage: remove vma from page_remove_rmap 2009-01-06 15:59:07 -08:00
shmem_acl.c
shmem.c mm: don't mark_page_accessed in shmem_fault 2009-01-06 15:58:58 -08:00
slab.c cpumask: convert mm/ 2009-01-01 10:12:29 +10:30
slob.c
slub.c cpumask: convert mm/ 2009-01-01 10:12:29 +10:30
sparse-vmemmap.c
sparse.c
swap_state.c mm: remove gfp_mask from add_to_swap 2009-01-06 15:59:04 -08:00
swap.c mm: try_to_free_swap replaces remove_exclusive_swap_page 2009-01-06 15:59:03 -08:00
swapfile.c badpage: zap print_bad_pte on swap and file 2009-01-06 15:59:07 -08:00
thrash.c
tiny-shmem.c
truncate.c
util.c
vmalloc.c mm: vmalloc make lazy unmapping configurable 2009-01-06 15:59:01 -08:00
vmscan.c vmscan: shrink_active_list(): reduce lru_lock hold time 2009-01-06 15:59:08 -08:00
vmstat.c cpumask: convert mm/ 2009-01-01 10:12:29 +10:30