linux_dsm_epyc7002/include
Nishanth Aravamudan d1c3fb1f8f hugetlb: introduce nr_overcommit_hugepages sysctl
hugetlb: introduce nr_overcommit_hugepages sysctl

While examining the code to support /proc/sys/vm/hugetlb_dynamic_pool, I
became convinced that having a boolean sysctl was insufficient:

1) To support per-node control of hugepages, I have previously submitted
patches to add a sysfs attribute related to nr_hugepages. However, with
a boolean global value and per-mount quota enforcement constraining the
dynamic pool, adding corresponding control of the dynamic pool on a
per-node basis seems inconsistent to me.

2) Administration of the hugetlb dynamic pool with multiple hugetlbfs
mount points is, arguably, more arduous than it needs to be. Each quota
would need to be set separately, and the sum would need to be monitored.

To ease the administration, and to help make the way for per-node
control of the static & dynamic hugepage pool, I added a separate
sysctl, nr_overcommit_hugepages. This value serves as a high watermark
for the overall hugepage pool, while nr_hugepages serves as a low
watermark. The boolean sysctl can then be removed, as the condition

	nr_overcommit_hugepages > 0

indicates the same administrative setting as

	hugetlb_dynamic_pool == 1

Quotas still serve as local enforcement of the size of the pool on a
per-mount basis.

A few caveats:

1) There is a race whereby the global surplus huge page counter is
incremented before a hugepage has allocated. Another process could then
try grow the pool, and fail to convert a surplus huge page to a normal
huge page and instead allocate a fresh huge page. I believe this is
benign, as no memory is leaked (the actual pages are still tracked
correctly) and the counters won't go out of sync.

2) Shrinking the static pool while a surplus is in effect will allow the
number of surplus huge pages to exceed the overcommit value. As long as
this condition holds, however, no more surplus huge pages will be
allowed on the system until one of the two sysctls are increased
sufficiently, or the surplus huge pages go out of use and are freed.

Successfully tested on x86_64 with the current libhugetlbfs snapshot,
modified to use the new sysctl.

Signed-off-by: Nishanth Aravamudan <nacc@us.ibm.com>
Acked-by: Adam Litke <agl@us.ibm.com>
Cc: William Lee Irwin III <wli@holomorphy.com>
Cc: Dave Hansen <haveblue@us.ibm.com>
Cc: David Gibson <david@gibson.dropbear.id.au>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2007-12-17 19:28:17 -08:00
..
acpi Pull cpuidle into release branch 2007-11-20 01:18:37 -05:00
asm-alpha alpha: build fixes 2007-12-17 19:28:16 -08:00
asm-arm Merge branch 'pxa-fixes' 2007-12-08 14:41:29 +00:00
asm-avr32 [AVR32] Fix copy_to_user_page() breakage 2007-12-07 14:54:47 +01:00
asm-blackfin Blackfin SPI driver: move hard coded pin_req to board file 2007-12-05 09:21:20 -08:00
asm-cris CRIS tlb.h should include linux/pagemap.h 2007-11-14 18:45:47 -08:00
asm-frv frv: Remove bogus NO_IRQ = -1 define 2007-11-09 15:11:44 -08:00
asm-generic sched: fix RLIMIT_CPU comment 2007-11-26 21:21:49 +01:00
asm-h8300 asm-h8300: parentheses around definition CLOCK_TICK_RATE 2007-12-10 19:43:54 -08:00
asm-ia64 [IA64] iosapic cleanup 2007-12-07 16:11:12 -08:00
asm-m32r m32r: Update sys_rt_sigsuspend 2007-11-28 01:24:04 +09:00
asm-m68k Add CONFIG_DEBUG_SG sg validation 2007-10-22 21:20:03 +02:00
asm-m68knommu m68knommu: fix pread/pwrite defines 2007-11-05 15:12:33 -08:00
asm-mips [MIPS] Alchemy: fix PCI resource conflict 2007-12-14 17:34:29 +00:00
asm-parisc [PARISC] print more than one character at a time for pdc console 2007-12-06 09:32:15 -08:00
asm-powerpc [POWERPC] Kill non-existent symbols from ksyms and commproc.h 2007-12-13 22:44:28 -06:00
asm-ppc ppc: fix AT_VECTOR_SIZE on arch/ppc 2007-10-22 19:18:54 -07:00
asm-s390 [S390] pud_present/pmd_present bug. 2007-12-17 16:25:56 +01:00
asm-sh sh: Fix copy_{to,from}_user_page() with cache disabled. 2007-11-19 12:55:51 +09:00
asm-sh64 sh: remove PTRACE_O_TRACESYSGOOD from asm/ptrace.h 2007-11-07 11:13:54 +09:00
asm-sparc [SPARC32]: Silence sparc32 warnings on missing syscalls. 2007-12-14 10:59:50 -08:00
asm-sparc64 [SPARC64]: Fix two kernel linear mapping setup bugs. 2007-12-13 06:13:38 -08:00
asm-um uml: update address space affected by pud_clear 2007-11-14 18:45:37 -08:00
asm-v850 Add CONFIG_DEBUG_SG sg validation 2007-10-22 21:20:03 +02:00
asm-x86 x86: disable hpet on shutdown 2007-12-03 17:17:10 +01:00
asm-xtensa xtensa: dma-mapping.h is using linux/scatterlist.h functions, so include it 2007-10-24 13:28:40 +02:00
crypto
keys
linux hugetlb: introduce nr_overcommit_hugepages sysctl 2007-12-17 19:28:17 -08:00
math-emu
media V4L/DVB (6601): V4L: videobuf-core locking fixes and comments 2007-12-11 18:08:08 -02:00
mtd
net [SCTP]: Fix the bind_addr info during migration. 2007-12-07 01:07:49 -08:00
pcmcia [AVR32] pcmcia ioaddr_t should be 32 bits on AVR32 2007-11-15 13:47:19 +01:00
rdma cleanup asm/scatterlist.h includes 2007-11-02 08:47:06 +01:00
rxrpc
scsi esp_scsi: fix reset cleanup spinlock recursion 2007-12-10 19:43:55 -08:00
sound [ALSA] version 1.0.15 2007-11-20 20:16:43 +01:00
video Make asm-x86/bootparam.h includable from userspace. 2007-10-23 15:49:47 +10:00
xen
Kbuild