mirror of
https://github.com/AuxXxilium/linux_dsm_epyc7002.git
synced 2024-12-28 01:15:15 +07:00
1d798ca3f1
Hugh has pointed that compound_head() call can be unsafe in some context. There's one example: CPU0 CPU1 isolate_migratepages_block() page_count() compound_head() !!PageTail() == true put_page() tail->first_page = NULL head = tail->first_page alloc_pages(__GFP_COMP) prep_compound_page() tail->first_page = head __SetPageTail(p); !!PageTail() == true <head == NULL dereferencing> The race is pure theoretical. I don't it's possible to trigger it in practice. But who knows. We can fix the race by changing how encode PageTail() and compound_head() within struct page to be able to update them in one shot. The patch introduces page->compound_head into third double word block in front of compound_dtor and compound_order. Bit 0 encodes PageTail() and the rest bits are pointer to head page if bit zero is set. The patch moves page->pmd_huge_pte out of word, just in case if an architecture defines pgtable_t into something what can have the bit 0 set. hugetlb_cgroup uses page->lru.next in the second tail page to store pointer struct hugetlb_cgroup. The patch switch it to use page->private in the second tail page instead. The space is free since ->first_page is removed from the union. The patch also opens possibility to remove HUGETLB_CGROUP_MIN_ORDER limitation, since there's now space in first tail page to store struct hugetlb_cgroup pointer. But that's out of scope of the patch. That means page->compound_head shares storage space with: - page->lru.next; - page->next; - page->rcu_head.next; That's too long list to be absolutely sure, but looks like nobody uses bit 0 of the word. page->rcu_head.next guaranteed[1] to have bit 0 clean as long as we use call_rcu(), call_rcu_bh(), call_rcu_sched(), or call_srcu(). But future call_rcu_lazy() is not allowed as it makes use of the bit and we can get false positive PageTail(). [1] http://lkml.kernel.org/g/20150827163634.GD4029@linux.vnet.ibm.com Signed-off-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com> Acked-by: Michal Hocko <mhocko@suse.com> Reviewed-by: Andrea Arcangeli <aarcange@redhat.com> Cc: Hugh Dickins <hughd@google.com> Cc: David Rientjes <rientjes@google.com> Cc: Vlastimil Babka <vbabka@suse.cz> Acked-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com> Cc: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com> Cc: Andi Kleen <ak@linux.intel.com> Cc: Christoph Lameter <cl@linux.com> Cc: Joonsoo Kim <iamjoonsoo.kim@lge.com> Cc: Sergey Senozhatsky <sergey.senozhatsky@gmail.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
671 lines
22 KiB
Plaintext
671 lines
22 KiB
Plaintext
config SELECT_MEMORY_MODEL
|
|
def_bool y
|
|
depends on ARCH_SELECT_MEMORY_MODEL
|
|
|
|
choice
|
|
prompt "Memory model"
|
|
depends on SELECT_MEMORY_MODEL
|
|
default DISCONTIGMEM_MANUAL if ARCH_DISCONTIGMEM_DEFAULT
|
|
default SPARSEMEM_MANUAL if ARCH_SPARSEMEM_DEFAULT
|
|
default FLATMEM_MANUAL
|
|
|
|
config FLATMEM_MANUAL
|
|
bool "Flat Memory"
|
|
depends on !(ARCH_DISCONTIGMEM_ENABLE || ARCH_SPARSEMEM_ENABLE) || ARCH_FLATMEM_ENABLE
|
|
help
|
|
This option allows you to change some of the ways that
|
|
Linux manages its memory internally. Most users will
|
|
only have one option here: FLATMEM. This is normal
|
|
and a correct option.
|
|
|
|
Some users of more advanced features like NUMA and
|
|
memory hotplug may have different options here.
|
|
DISCONTIGMEM is a more mature, better tested system,
|
|
but is incompatible with memory hotplug and may suffer
|
|
decreased performance over SPARSEMEM. If unsure between
|
|
"Sparse Memory" and "Discontiguous Memory", choose
|
|
"Discontiguous Memory".
|
|
|
|
If unsure, choose this option (Flat Memory) over any other.
|
|
|
|
config DISCONTIGMEM_MANUAL
|
|
bool "Discontiguous Memory"
|
|
depends on ARCH_DISCONTIGMEM_ENABLE
|
|
help
|
|
This option provides enhanced support for discontiguous
|
|
memory systems, over FLATMEM. These systems have holes
|
|
in their physical address spaces, and this option provides
|
|
more efficient handling of these holes. However, the vast
|
|
majority of hardware has quite flat address spaces, and
|
|
can have degraded performance from the extra overhead that
|
|
this option imposes.
|
|
|
|
Many NUMA configurations will have this as the only option.
|
|
|
|
If unsure, choose "Flat Memory" over this option.
|
|
|
|
config SPARSEMEM_MANUAL
|
|
bool "Sparse Memory"
|
|
depends on ARCH_SPARSEMEM_ENABLE
|
|
help
|
|
This will be the only option for some systems, including
|
|
memory hotplug systems. This is normal.
|
|
|
|
For many other systems, this will be an alternative to
|
|
"Discontiguous Memory". This option provides some potential
|
|
performance benefits, along with decreased code complexity,
|
|
but it is newer, and more experimental.
|
|
|
|
If unsure, choose "Discontiguous Memory" or "Flat Memory"
|
|
over this option.
|
|
|
|
endchoice
|
|
|
|
config DISCONTIGMEM
|
|
def_bool y
|
|
depends on (!SELECT_MEMORY_MODEL && ARCH_DISCONTIGMEM_ENABLE) || DISCONTIGMEM_MANUAL
|
|
|
|
config SPARSEMEM
|
|
def_bool y
|
|
depends on (!SELECT_MEMORY_MODEL && ARCH_SPARSEMEM_ENABLE) || SPARSEMEM_MANUAL
|
|
|
|
config FLATMEM
|
|
def_bool y
|
|
depends on (!DISCONTIGMEM && !SPARSEMEM) || FLATMEM_MANUAL
|
|
|
|
config FLAT_NODE_MEM_MAP
|
|
def_bool y
|
|
depends on !SPARSEMEM
|
|
|
|
#
|
|
# Both the NUMA code and DISCONTIGMEM use arrays of pg_data_t's
|
|
# to represent different areas of memory. This variable allows
|
|
# those dependencies to exist individually.
|
|
#
|
|
config NEED_MULTIPLE_NODES
|
|
def_bool y
|
|
depends on DISCONTIGMEM || NUMA
|
|
|
|
config HAVE_MEMORY_PRESENT
|
|
def_bool y
|
|
depends on ARCH_HAVE_MEMORY_PRESENT || SPARSEMEM
|
|
|
|
#
|
|
# SPARSEMEM_EXTREME (which is the default) does some bootmem
|
|
# allocations when memory_present() is called. If this cannot
|
|
# be done on your architecture, select this option. However,
|
|
# statically allocating the mem_section[] array can potentially
|
|
# consume vast quantities of .bss, so be careful.
|
|
#
|
|
# This option will also potentially produce smaller runtime code
|
|
# with gcc 3.4 and later.
|
|
#
|
|
config SPARSEMEM_STATIC
|
|
bool
|
|
|
|
#
|
|
# Architecture platforms which require a two level mem_section in SPARSEMEM
|
|
# must select this option. This is usually for architecture platforms with
|
|
# an extremely sparse physical address space.
|
|
#
|
|
config SPARSEMEM_EXTREME
|
|
def_bool y
|
|
depends on SPARSEMEM && !SPARSEMEM_STATIC
|
|
|
|
config SPARSEMEM_VMEMMAP_ENABLE
|
|
bool
|
|
|
|
config SPARSEMEM_ALLOC_MEM_MAP_TOGETHER
|
|
def_bool y
|
|
depends on SPARSEMEM && X86_64
|
|
|
|
config SPARSEMEM_VMEMMAP
|
|
bool "Sparse Memory virtual memmap"
|
|
depends on SPARSEMEM && SPARSEMEM_VMEMMAP_ENABLE
|
|
default y
|
|
help
|
|
SPARSEMEM_VMEMMAP uses a virtually mapped memmap to optimise
|
|
pfn_to_page and page_to_pfn operations. This is the most
|
|
efficient option when sufficient kernel resources are available.
|
|
|
|
config HAVE_MEMBLOCK
|
|
bool
|
|
|
|
config HAVE_MEMBLOCK_NODE_MAP
|
|
bool
|
|
|
|
config HAVE_MEMBLOCK_PHYS_MAP
|
|
bool
|
|
|
|
config HAVE_GENERIC_RCU_GUP
|
|
bool
|
|
|
|
config ARCH_DISCARD_MEMBLOCK
|
|
bool
|
|
|
|
config NO_BOOTMEM
|
|
bool
|
|
|
|
config MEMORY_ISOLATION
|
|
bool
|
|
|
|
config MOVABLE_NODE
|
|
bool "Enable to assign a node which has only movable memory"
|
|
depends on HAVE_MEMBLOCK
|
|
depends on NO_BOOTMEM
|
|
depends on X86_64
|
|
depends on NUMA
|
|
default n
|
|
help
|
|
Allow a node to have only movable memory. Pages used by the kernel,
|
|
such as direct mapping pages cannot be migrated. So the corresponding
|
|
memory device cannot be hotplugged. This option allows the following
|
|
two things:
|
|
- When the system is booting, node full of hotpluggable memory can
|
|
be arranged to have only movable memory so that the whole node can
|
|
be hot-removed. (need movable_node boot option specified).
|
|
- After the system is up, the option allows users to online all the
|
|
memory of a node as movable memory so that the whole node can be
|
|
hot-removed.
|
|
|
|
Users who don't use the memory hotplug feature are fine with this
|
|
option on since they don't specify movable_node boot option or they
|
|
don't online memory as movable.
|
|
|
|
Say Y here if you want to hotplug a whole node.
|
|
Say N here if you want kernel to use memory on all nodes evenly.
|
|
|
|
#
|
|
# Only be set on architectures that have completely implemented memory hotplug
|
|
# feature. If you are not sure, don't touch it.
|
|
#
|
|
config HAVE_BOOTMEM_INFO_NODE
|
|
def_bool n
|
|
|
|
# eventually, we can have this option just 'select SPARSEMEM'
|
|
config MEMORY_HOTPLUG
|
|
bool "Allow for memory hot-add"
|
|
depends on SPARSEMEM || X86_64_ACPI_NUMA
|
|
depends on ARCH_ENABLE_MEMORY_HOTPLUG
|
|
depends on (IA64 || X86 || PPC_BOOK3S_64 || SUPERH || S390)
|
|
|
|
config MEMORY_HOTPLUG_SPARSE
|
|
def_bool y
|
|
depends on SPARSEMEM && MEMORY_HOTPLUG
|
|
|
|
config MEMORY_HOTREMOVE
|
|
bool "Allow for memory hot remove"
|
|
select MEMORY_ISOLATION
|
|
select HAVE_BOOTMEM_INFO_NODE if (X86_64 || PPC64)
|
|
depends on MEMORY_HOTPLUG && ARCH_ENABLE_MEMORY_HOTREMOVE
|
|
depends on MIGRATION
|
|
|
|
# Heavily threaded applications may benefit from splitting the mm-wide
|
|
# page_table_lock, so that faults on different parts of the user address
|
|
# space can be handled with less contention: split it at this NR_CPUS.
|
|
# Default to 4 for wider testing, though 8 might be more appropriate.
|
|
# ARM's adjust_pte (unused if VIPT) depends on mm-wide page_table_lock.
|
|
# PA-RISC 7xxx's spinlock_t would enlarge struct page from 32 to 44 bytes.
|
|
# DEBUG_SPINLOCK and DEBUG_LOCK_ALLOC spinlock_t also enlarge struct page.
|
|
#
|
|
config SPLIT_PTLOCK_CPUS
|
|
int
|
|
default "999999" if !MMU
|
|
default "999999" if ARM && !CPU_CACHE_VIPT
|
|
default "999999" if PARISC && !PA20
|
|
default "4"
|
|
|
|
config ARCH_ENABLE_SPLIT_PMD_PTLOCK
|
|
bool
|
|
|
|
#
|
|
# support for memory balloon
|
|
config MEMORY_BALLOON
|
|
bool
|
|
|
|
#
|
|
# support for memory balloon compaction
|
|
config BALLOON_COMPACTION
|
|
bool "Allow for balloon memory compaction/migration"
|
|
def_bool y
|
|
depends on COMPACTION && MEMORY_BALLOON
|
|
help
|
|
Memory fragmentation introduced by ballooning might reduce
|
|
significantly the number of 2MB contiguous memory blocks that can be
|
|
used within a guest, thus imposing performance penalties associated
|
|
with the reduced number of transparent huge pages that could be used
|
|
by the guest workload. Allowing the compaction & migration for memory
|
|
pages enlisted as being part of memory balloon devices avoids the
|
|
scenario aforementioned and helps improving memory defragmentation.
|
|
|
|
#
|
|
# support for memory compaction
|
|
config COMPACTION
|
|
bool "Allow for memory compaction"
|
|
def_bool y
|
|
select MIGRATION
|
|
depends on MMU
|
|
help
|
|
Allows the compaction of memory for the allocation of huge pages.
|
|
|
|
#
|
|
# support for page migration
|
|
#
|
|
config MIGRATION
|
|
bool "Page migration"
|
|
def_bool y
|
|
depends on (NUMA || ARCH_ENABLE_MEMORY_HOTREMOVE || COMPACTION || CMA) && MMU
|
|
help
|
|
Allows the migration of the physical location of pages of processes
|
|
while the virtual addresses are not changed. This is useful in
|
|
two situations. The first is on NUMA systems to put pages nearer
|
|
to the processors accessing. The second is when allocating huge
|
|
pages as migration can relocate pages to satisfy a huge page
|
|
allocation instead of reclaiming.
|
|
|
|
config ARCH_ENABLE_HUGEPAGE_MIGRATION
|
|
bool
|
|
|
|
config PHYS_ADDR_T_64BIT
|
|
def_bool 64BIT || ARCH_PHYS_ADDR_T_64BIT
|
|
|
|
config ZONE_DMA_FLAG
|
|
int
|
|
default "0" if !ZONE_DMA
|
|
default "1"
|
|
|
|
config BOUNCE
|
|
bool "Enable bounce buffers"
|
|
default y
|
|
depends on BLOCK && MMU && (ZONE_DMA || HIGHMEM)
|
|
help
|
|
Enable bounce buffers for devices that cannot access
|
|
the full range of memory available to the CPU. Enabled
|
|
by default when ZONE_DMA or HIGHMEM is selected, but you
|
|
may say n to override this.
|
|
|
|
# On the 'tile' arch, USB OHCI needs the bounce pool since tilegx will often
|
|
# have more than 4GB of memory, but we don't currently use the IOTLB to present
|
|
# a 32-bit address to OHCI. So we need to use a bounce pool instead.
|
|
config NEED_BOUNCE_POOL
|
|
bool
|
|
default y if TILE && USB_OHCI_HCD
|
|
|
|
config NR_QUICK
|
|
int
|
|
depends on QUICKLIST
|
|
default "2" if AVR32
|
|
default "1"
|
|
|
|
config VIRT_TO_BUS
|
|
bool
|
|
help
|
|
An architecture should select this if it implements the
|
|
deprecated interface virt_to_bus(). All new architectures
|
|
should probably not select this.
|
|
|
|
|
|
config MMU_NOTIFIER
|
|
bool
|
|
select SRCU
|
|
|
|
config KSM
|
|
bool "Enable KSM for page merging"
|
|
depends on MMU
|
|
help
|
|
Enable Kernel Samepage Merging: KSM periodically scans those areas
|
|
of an application's address space that an app has advised may be
|
|
mergeable. When it finds pages of identical content, it replaces
|
|
the many instances by a single page with that content, so
|
|
saving memory until one or another app needs to modify the content.
|
|
Recommended for use with KVM, or with other duplicative applications.
|
|
See Documentation/vm/ksm.txt for more information: KSM is inactive
|
|
until a program has madvised that an area is MADV_MERGEABLE, and
|
|
root has set /sys/kernel/mm/ksm/run to 1 (if CONFIG_SYSFS is set).
|
|
|
|
config DEFAULT_MMAP_MIN_ADDR
|
|
int "Low address space to protect from user allocation"
|
|
depends on MMU
|
|
default 4096
|
|
help
|
|
This is the portion of low virtual memory which should be protected
|
|
from userspace allocation. Keeping a user from writing to low pages
|
|
can help reduce the impact of kernel NULL pointer bugs.
|
|
|
|
For most ia64, ppc64 and x86 users with lots of address space
|
|
a value of 65536 is reasonable and should cause no problems.
|
|
On arm and other archs it should not be higher than 32768.
|
|
Programs which use vm86 functionality or have some need to map
|
|
this low address space will need CAP_SYS_RAWIO or disable this
|
|
protection by setting the value to 0.
|
|
|
|
This value can be changed after boot using the
|
|
/proc/sys/vm/mmap_min_addr tunable.
|
|
|
|
config ARCH_SUPPORTS_MEMORY_FAILURE
|
|
bool
|
|
|
|
config MEMORY_FAILURE
|
|
depends on MMU
|
|
depends on ARCH_SUPPORTS_MEMORY_FAILURE
|
|
bool "Enable recovery from hardware memory errors"
|
|
select MEMORY_ISOLATION
|
|
select RAS
|
|
help
|
|
Enables code to recover from some memory failures on systems
|
|
with MCA recovery. This allows a system to continue running
|
|
even when some of its memory has uncorrected errors. This requires
|
|
special hardware support and typically ECC memory.
|
|
|
|
config HWPOISON_INJECT
|
|
tristate "HWPoison pages injector"
|
|
depends on MEMORY_FAILURE && DEBUG_KERNEL && PROC_FS
|
|
select PROC_PAGE_MONITOR
|
|
|
|
config NOMMU_INITIAL_TRIM_EXCESS
|
|
int "Turn on mmap() excess space trimming before booting"
|
|
depends on !MMU
|
|
default 1
|
|
help
|
|
The NOMMU mmap() frequently needs to allocate large contiguous chunks
|
|
of memory on which to store mappings, but it can only ask the system
|
|
allocator for chunks in 2^N*PAGE_SIZE amounts - which is frequently
|
|
more than it requires. To deal with this, mmap() is able to trim off
|
|
the excess and return it to the allocator.
|
|
|
|
If trimming is enabled, the excess is trimmed off and returned to the
|
|
system allocator, which can cause extra fragmentation, particularly
|
|
if there are a lot of transient processes.
|
|
|
|
If trimming is disabled, the excess is kept, but not used, which for
|
|
long-term mappings means that the space is wasted.
|
|
|
|
Trimming can be dynamically controlled through a sysctl option
|
|
(/proc/sys/vm/nr_trim_pages) which specifies the minimum number of
|
|
excess pages there must be before trimming should occur, or zero if
|
|
no trimming is to occur.
|
|
|
|
This option specifies the initial value of this option. The default
|
|
of 1 says that all excess pages should be trimmed.
|
|
|
|
See Documentation/nommu-mmap.txt for more information.
|
|
|
|
config TRANSPARENT_HUGEPAGE
|
|
bool "Transparent Hugepage Support"
|
|
depends on HAVE_ARCH_TRANSPARENT_HUGEPAGE
|
|
select COMPACTION
|
|
help
|
|
Transparent Hugepages allows the kernel to use huge pages and
|
|
huge tlb transparently to the applications whenever possible.
|
|
This feature can improve computing performance to certain
|
|
applications by speeding up page faults during memory
|
|
allocation, by reducing the number of tlb misses and by speeding
|
|
up the pagetable walking.
|
|
|
|
If memory constrained on embedded, you may want to say N.
|
|
|
|
choice
|
|
prompt "Transparent Hugepage Support sysfs defaults"
|
|
depends on TRANSPARENT_HUGEPAGE
|
|
default TRANSPARENT_HUGEPAGE_ALWAYS
|
|
help
|
|
Selects the sysfs defaults for Transparent Hugepage Support.
|
|
|
|
config TRANSPARENT_HUGEPAGE_ALWAYS
|
|
bool "always"
|
|
help
|
|
Enabling Transparent Hugepage always, can increase the
|
|
memory footprint of applications without a guaranteed
|
|
benefit but it will work automatically for all applications.
|
|
|
|
config TRANSPARENT_HUGEPAGE_MADVISE
|
|
bool "madvise"
|
|
help
|
|
Enabling Transparent Hugepage madvise, will only provide a
|
|
performance improvement benefit to the applications using
|
|
madvise(MADV_HUGEPAGE) but it won't risk to increase the
|
|
memory footprint of applications without a guaranteed
|
|
benefit.
|
|
endchoice
|
|
|
|
#
|
|
# UP and nommu archs use km based percpu allocator
|
|
#
|
|
config NEED_PER_CPU_KM
|
|
depends on !SMP
|
|
bool
|
|
default y
|
|
|
|
config CLEANCACHE
|
|
bool "Enable cleancache driver to cache clean pages if tmem is present"
|
|
default n
|
|
help
|
|
Cleancache can be thought of as a page-granularity victim cache
|
|
for clean pages that the kernel's pageframe replacement algorithm
|
|
(PFRA) would like to keep around, but can't since there isn't enough
|
|
memory. So when the PFRA "evicts" a page, it first attempts to use
|
|
cleancache code to put the data contained in that page into
|
|
"transcendent memory", memory that is not directly accessible or
|
|
addressable by the kernel and is of unknown and possibly
|
|
time-varying size. And when a cleancache-enabled
|
|
filesystem wishes to access a page in a file on disk, it first
|
|
checks cleancache to see if it already contains it; if it does,
|
|
the page is copied into the kernel and a disk access is avoided.
|
|
When a transcendent memory driver is available (such as zcache or
|
|
Xen transcendent memory), a significant I/O reduction
|
|
may be achieved. When none is available, all cleancache calls
|
|
are reduced to a single pointer-compare-against-NULL resulting
|
|
in a negligible performance hit.
|
|
|
|
If unsure, say Y to enable cleancache
|
|
|
|
config FRONTSWAP
|
|
bool "Enable frontswap to cache swap pages if tmem is present"
|
|
depends on SWAP
|
|
default n
|
|
help
|
|
Frontswap is so named because it can be thought of as the opposite
|
|
of a "backing" store for a swap device. The data is stored into
|
|
"transcendent memory", memory that is not directly accessible or
|
|
addressable by the kernel and is of unknown and possibly
|
|
time-varying size. When space in transcendent memory is available,
|
|
a significant swap I/O reduction may be achieved. When none is
|
|
available, all frontswap calls are reduced to a single pointer-
|
|
compare-against-NULL resulting in a negligible performance hit
|
|
and swap data is stored as normal on the matching swap device.
|
|
|
|
If unsure, say Y to enable frontswap.
|
|
|
|
config CMA
|
|
bool "Contiguous Memory Allocator"
|
|
depends on HAVE_MEMBLOCK && MMU
|
|
select MIGRATION
|
|
select MEMORY_ISOLATION
|
|
help
|
|
This enables the Contiguous Memory Allocator which allows other
|
|
subsystems to allocate big physically-contiguous blocks of memory.
|
|
CMA reserves a region of memory and allows only movable pages to
|
|
be allocated from it. This way, the kernel can use the memory for
|
|
pagecache and when a subsystem requests for contiguous area, the
|
|
allocated pages are migrated away to serve the contiguous request.
|
|
|
|
If unsure, say "n".
|
|
|
|
config CMA_DEBUG
|
|
bool "CMA debug messages (DEVELOPMENT)"
|
|
depends on DEBUG_KERNEL && CMA
|
|
help
|
|
Turns on debug messages in CMA. This produces KERN_DEBUG
|
|
messages for every CMA call as well as various messages while
|
|
processing calls such as dma_alloc_from_contiguous().
|
|
This option does not affect warning and error messages.
|
|
|
|
config CMA_DEBUGFS
|
|
bool "CMA debugfs interface"
|
|
depends on CMA && DEBUG_FS
|
|
help
|
|
Turns on the DebugFS interface for CMA.
|
|
|
|
config CMA_AREAS
|
|
int "Maximum count of the CMA areas"
|
|
depends on CMA
|
|
default 7
|
|
help
|
|
CMA allows to create CMA areas for particular purpose, mainly,
|
|
used as device private area. This parameter sets the maximum
|
|
number of CMA area in the system.
|
|
|
|
If unsure, leave the default value "7".
|
|
|
|
config MEM_SOFT_DIRTY
|
|
bool "Track memory changes"
|
|
depends on CHECKPOINT_RESTORE && HAVE_ARCH_SOFT_DIRTY && PROC_FS
|
|
select PROC_PAGE_MONITOR
|
|
help
|
|
This option enables memory changes tracking by introducing a
|
|
soft-dirty bit on pte-s. This bit it set when someone writes
|
|
into a page just as regular dirty bit, but unlike the latter
|
|
it can be cleared by hands.
|
|
|
|
See Documentation/vm/soft-dirty.txt for more details.
|
|
|
|
config ZSWAP
|
|
bool "Compressed cache for swap pages (EXPERIMENTAL)"
|
|
depends on FRONTSWAP && CRYPTO=y
|
|
select CRYPTO_LZO
|
|
select ZPOOL
|
|
default n
|
|
help
|
|
A lightweight compressed cache for swap pages. It takes
|
|
pages that are in the process of being swapped out and attempts to
|
|
compress them into a dynamically allocated RAM-based memory pool.
|
|
This can result in a significant I/O reduction on swap device and,
|
|
in the case where decompressing from RAM is faster that swap device
|
|
reads, can also improve workload performance.
|
|
|
|
This is marked experimental because it is a new feature (as of
|
|
v3.11) that interacts heavily with memory reclaim. While these
|
|
interactions don't cause any known issues on simple memory setups,
|
|
they have not be fully explored on the large set of potential
|
|
configurations and workloads that exist.
|
|
|
|
config ZPOOL
|
|
tristate "Common API for compressed memory storage"
|
|
default n
|
|
help
|
|
Compressed memory storage API. This allows using either zbud or
|
|
zsmalloc.
|
|
|
|
config ZBUD
|
|
tristate "Low density storage for compressed pages"
|
|
default n
|
|
help
|
|
A special purpose allocator for storing compressed pages.
|
|
It is designed to store up to two compressed pages per physical
|
|
page. While this design limits storage density, it has simple and
|
|
deterministic reclaim properties that make it preferable to a higher
|
|
density approach when reclaim will be used.
|
|
|
|
config ZSMALLOC
|
|
tristate "Memory allocator for compressed pages"
|
|
depends on MMU
|
|
default n
|
|
help
|
|
zsmalloc is a slab-based memory allocator designed to store
|
|
compressed RAM pages. zsmalloc uses virtual memory mapping
|
|
in order to reduce fragmentation. However, this results in a
|
|
non-standard allocator interface where a handle, not a pointer, is
|
|
returned by an alloc(). This handle must be mapped in order to
|
|
access the allocated space.
|
|
|
|
config PGTABLE_MAPPING
|
|
bool "Use page table mapping to access object in zsmalloc"
|
|
depends on ZSMALLOC
|
|
help
|
|
By default, zsmalloc uses a copy-based object mapping method to
|
|
access allocations that span two pages. However, if a particular
|
|
architecture (ex, ARM) performs VM mapping faster than copying,
|
|
then you should select this. This causes zsmalloc to use page table
|
|
mapping rather than copying for object mapping.
|
|
|
|
You can check speed with zsmalloc benchmark:
|
|
https://github.com/spartacus06/zsmapbench
|
|
|
|
config ZSMALLOC_STAT
|
|
bool "Export zsmalloc statistics"
|
|
depends on ZSMALLOC
|
|
select DEBUG_FS
|
|
help
|
|
This option enables code in the zsmalloc to collect various
|
|
statistics about whats happening in zsmalloc and exports that
|
|
information to userspace via debugfs.
|
|
If unsure, say N.
|
|
|
|
config GENERIC_EARLY_IOREMAP
|
|
bool
|
|
|
|
config MAX_STACK_SIZE_MB
|
|
int "Maximum user stack size for 32-bit processes (MB)"
|
|
default 80
|
|
range 8 256 if METAG
|
|
range 8 2048
|
|
depends on STACK_GROWSUP && (!64BIT || COMPAT)
|
|
help
|
|
This is the maximum stack size in Megabytes in the VM layout of 32-bit
|
|
user processes when the stack grows upwards (currently only on parisc
|
|
and metag arch). The stack will be located at the highest memory
|
|
address minus the given value, unless the RLIMIT_STACK hard limit is
|
|
changed to a smaller value in which case that is used.
|
|
|
|
A sane initial value is 80 MB.
|
|
|
|
# For architectures that support deferred memory initialisation
|
|
config ARCH_SUPPORTS_DEFERRED_STRUCT_PAGE_INIT
|
|
bool
|
|
|
|
config DEFERRED_STRUCT_PAGE_INIT
|
|
bool "Defer initialisation of struct pages to kswapd"
|
|
default n
|
|
depends on ARCH_SUPPORTS_DEFERRED_STRUCT_PAGE_INIT
|
|
depends on MEMORY_HOTPLUG
|
|
help
|
|
Ordinarily all struct pages are initialised during early boot in a
|
|
single thread. On very large machines this can take a considerable
|
|
amount of time. If this option is set, large machines will bring up
|
|
a subset of memmap at boot and then initialise the rest in parallel
|
|
when kswapd starts. This has a potential performance impact on
|
|
processes running early in the lifetime of the systemm until kswapd
|
|
finishes the initialisation.
|
|
|
|
config IDLE_PAGE_TRACKING
|
|
bool "Enable idle page tracking"
|
|
depends on SYSFS && MMU
|
|
select PAGE_EXTENSION if !64BIT
|
|
help
|
|
This feature allows to estimate the amount of user pages that have
|
|
not been touched during a given period of time. This information can
|
|
be useful to tune memory cgroup limits and/or for job placement
|
|
within a compute cluster.
|
|
|
|
See Documentation/vm/idle_page_tracking.txt for more details.
|
|
|
|
config ZONE_DEVICE
|
|
bool "Device memory (pmem, etc...) hotplug support" if EXPERT
|
|
default !ZONE_DMA
|
|
depends on !ZONE_DMA
|
|
depends on MEMORY_HOTPLUG
|
|
depends on MEMORY_HOTREMOVE
|
|
depends on X86_64 #arch_add_memory() comprehends device memory
|
|
|
|
help
|
|
Device memory hotplug support allows for establishing pmem,
|
|
or other device driver discovered memory regions, in the
|
|
memmap. This allows pfn_to_page() lookups of otherwise
|
|
"device-physical" addresses which is needed for using a DAX
|
|
mapping in an O_DIRECT operation, among other things.
|
|
|
|
If FS_DAX is enabled, then say Y.
|
|
|
|
config FRAME_VECTOR
|
|
bool
|