linux_dsm_epyc7002/mm
Roman Gushchin bf4f059954 mm: memcg/slab: obj_cgroup API
Obj_cgroup API provides an ability to account sub-page sized kernel
objects, which potentially outlive the original memory cgroup.

The top-level API consists of the following functions:
  bool obj_cgroup_tryget(struct obj_cgroup *objcg);
  void obj_cgroup_get(struct obj_cgroup *objcg);
  void obj_cgroup_put(struct obj_cgroup *objcg);

  int obj_cgroup_charge(struct obj_cgroup *objcg, gfp_t gfp, size_t size);
  void obj_cgroup_uncharge(struct obj_cgroup *objcg, size_t size);

  struct mem_cgroup *obj_cgroup_memcg(struct obj_cgroup *objcg);
  struct obj_cgroup *get_obj_cgroup_from_current(void);

Object cgroup is basically a pointer to a memory cgroup with a per-cpu
reference counter.  It substitutes a memory cgroup in places where it's
necessary to charge a custom amount of bytes instead of pages.

All charged memory rounded down to pages is charged to the corresponding
memory cgroup using __memcg_kmem_charge().

It implements reparenting: on memcg offlining it's getting reattached to
the parent memory cgroup.  Each online memory cgroup has an associated
active object cgroup to handle new allocations and the list of all
attached object cgroups.  On offlining of a cgroup this list is reparented
and for each object cgroup in the list the memcg pointer is swapped to the
parent memory cgroup.  It prevents long-living objects from pinning the
original memory cgroup in the memory.

The implementation is based on byte-sized per-cpu stocks.  A sub-page
sized leftover is stored in an atomic field, which is a part of obj_cgroup
object.  So on cgroup offlining the leftover is automatically reparented.

memcg->objcg is rcu protected.  objcg->memcg is a raw pointer, which is
always pointing at a memory cgroup, but can be atomically swapped to the
parent memory cgroup.  So a user must ensure the lifetime of the
cgroup, e.g.  grab rcu_read_lock or css_set_lock.

Suggested-by: Johannes Weiner <hannes@cmpxchg.org>
Signed-off-by: Roman Gushchin <guro@fb.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Reviewed-by: Shakeel Butt <shakeelb@google.com>
Cc: Christoph Lameter <cl@linux.com>
Cc: Michal Hocko <mhocko@kernel.org>
Cc: Tejun Heo <tj@kernel.org>
Cc: Vlastimil Babka <vbabka@suse.cz>
Link: http://lkml.kernel.org/r/20200623174037.3951353-7-guro@fb.com
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2020-08-07 11:33:24 -07:00
..
kasan mm: remove __ARCH_HAS_5LEVEL_HACK and include/asm-generic/5level-fixup.h 2020-06-04 19:06:21 -07:00
backing-dev.c writeback: remove struct bdi_writeback_congested 2020-07-08 17:05:53 -06:00
balloon_compaction.c mm/balloon_compaction: suppress allocation warnings 2019-09-04 07:42:01 -04:00
cleancache.c Driver Core and debugfs changes for 5.3-rc1 2019-07-12 12:24:03 -07:00
cma_debug.c debugfs: make sure we can remove u32_array files cleanly 2020-07-10 13:54:00 -07:00
cma.c mm/cma.c: use exact_nid true to fix possible per-numa cma leak 2020-07-03 16:15:25 -07:00
cma.h debugfs: make sure we can remove u32_array files cleanly 2020-07-10 13:54:00 -07:00
compaction.c mm, compaction: make capture control handling safe wrt interrupts 2020-06-26 00:27:36 -07:00
debug_page_ref.c License cleanup: add SPDX GPL-2.0 license identifier to files with no license 2017-11-02 11:10:55 +01:00
debug_vm_pgtable.c Documentation/mm: add descriptions for arch page table helpers 2020-08-07 11:33:23 -07:00
debug.c mm, dump_page: do not crash with bad compound_mapcount() 2020-08-07 11:33:23 -07:00
dmapool.c mm/dmapool.c: micro-optimisation remove unnecessary branch 2020-04-07 10:43:42 -07:00
early_ioremap.c mm/early_ioremap.c: use %pa to print resource_size_t variables 2020-01-31 10:30:38 -08:00
fadvise.c mm: return void from various readahead functions 2020-06-02 10:59:06 -07:00
failslab.c mm/failslab.c: by default, do not fail allocations with direct reclaim only 2019-07-12 11:05:43 -07:00
filemap.c mm: filemap: add missing FGP_ flags in kerneldoc comment for pagecache_get_page 2020-08-07 11:33:23 -07:00
frame_vector.c mmap locking API: convert mmap_sem comments 2020-06-09 09:39:14 -07:00
frontswap.c treewide: Remove uninitialized_var() usage 2020-07-16 12:35:15 -07:00
gup_benchmark.c mm/gup_benchmark: support pin_user_pages() and related calls 2020-04-02 09:35:27 -07:00
gup.c mm/gup.c: fix the comment of return value for populate_vma_page_range() 2020-08-07 11:33:23 -07:00
highmem.c mm, x86/mm: Untangle address space layout definitions from basic pgtable type definitions 2019-12-10 10:12:55 +01:00
hmm.c mm/hmm: provide the page mapping order in hmm_range_fault() 2020-07-10 16:24:28 -03:00
huge_memory.c mmap locking API: convert mmap_sem comments 2020-06-09 09:39:14 -07:00
hugetlb_cgroup.c mm: use fallthrough; 2020-04-07 10:43:41 -07:00
hugetlb.c mm/hugetlb: avoid hardcoding while checking if cma is enabled 2020-07-24 12:42:41 -07:00
hwpoison-inject.c mm/hwpoison-inject: use DEFINE_DEBUGFS_ATTRIBUTE to define debugfs fops 2019-12-01 12:59:09 -08:00
init-mm.c mmap locking API: add MMAP_LOCK_INITIALIZER 2020-06-09 09:39:14 -07:00
internal.h mmap locking API: convert mmap_sem comments 2020-06-09 09:39:14 -07:00
interval_tree.c treewide: Replace GPLv2 boilerplate/reference with SPDX - rule 248 2019-06-19 17:09:08 +02:00
Kconfig docs: move nommu-mmap.txt to admin-guide and rename to ReST 2020-06-26 11:33:35 -06:00
Kconfig.debug treewide: replace '---help---' in Kconfig files with 'help' 2020-06-14 01:57:21 +09:00
khugepaged.c khugepaged: fix null-pointer dereference due to race 2020-07-24 12:42:41 -07:00
kmemleak-test.c treewide: Replace GPLv2 boilerplate/reference with SPDX - rule 333 2019-06-05 17:37:06 +02:00
kmemleak.c mm/kmemleak.c: use address-of operator on section symbols 2020-04-02 09:35:26 -07:00
ksm.c treewide: Remove uninitialized_var() usage 2020-07-16 12:35:15 -07:00
list_lru.c mm/list_lru.c: Rename kvfree_rcu() to local variant 2020-06-29 11:59:25 -07:00
maccess.c maccess: rename probe_user_{read,write} to copy_{from,to}_user_nofault 2020-06-17 10:57:41 -07:00
madvise.c mmap locking API: convert mmap_sem comments 2020-06-09 09:39:14 -07:00
Makefile The Kernel Concurrency Sanitizer (KCSAN) 2020-06-11 18:55:43 -07:00
mapping_dirty_helpers.c mm/mapping_dirty_helpers: update huge page-table entry callbacks 2020-04-02 09:35:29 -07:00
memblock.c mm/memblock: expose only miminal interface to add/walk physmem 2020-07-10 15:08:09 +02:00
memcontrol.c mm: memcg/slab: obj_cgroup API 2020-08-07 11:33:24 -07:00
memfd.c mm: page cache: store only head pages in i_pages 2019-09-24 15:54:08 -07:00
memory_hotplug.c mm/memory_hotplug.c: fix false softlockup during pfn range removal 2020-06-26 00:27:38 -07:00
memory-failure.c mm/memory-failure: send SIGBUS(BUS_MCEERR_AR) only to current thread 2020-06-11 18:17:47 -07:00
memory.c Remove uninitialized_var() macro for v5.9-rc1 2020-08-04 13:49:43 -07:00
mempolicy.c treewide: Remove uninitialized_var() usage 2020-07-16 12:35:15 -07:00
mempool.c docs/core-api/mm: fix return value descriptions in mm/ 2019-03-05 21:07:20 -08:00
memremap.c mm/memremap: set caching mode for PCI P2PDMA memory to WC 2020-04-10 15:36:21 -07:00
memtest.c License cleanup: add SPDX GPL-2.0 license identifier to files with no license 2017-11-02 11:10:55 +01:00
migrate.c mm/migrate: fix migrate_pgmap_owner w/o CONFIG_MMU_NOTIFIER 2020-08-07 11:33:21 -07:00
mincore.c mmap locking API: use coccinelle to convert mmap_sem rwsem call sites 2020-06-09 09:39:14 -07:00
mlock.c mmap locking API: convert mmap_sem comments 2020-06-09 09:39:14 -07:00
mm_init.c mm/mm_init.c: report kasan-tag information stored in page->flags 2020-06-02 10:59:12 -07:00
mmap.c Merge branch 'for-mingo' of git://git.kernel.org/pub/scm/linux/kernel/git/paulmck/linux-rcu into core/rcu 2020-07-31 00:15:53 +02:00
mmu_gather.c mmap locking API: convert mmap_sem comments 2020-06-09 09:39:14 -07:00
mmu_notifier.c mmap locking API: convert mmap_sem comments 2020-06-09 09:39:14 -07:00
mmzone.c License cleanup: add SPDX GPL-2.0 license identifier to files with no license 2017-11-02 11:10:55 +01:00
mprotect.c mmap locking API: convert mmap_sem comments 2020-06-09 09:39:14 -07:00
mremap.c mm: document warning in move_normal_pmd() and make it warn only once 2020-07-13 11:37:39 -07:00
msync.c mmap locking API: use coccinelle to convert mmap_sem rwsem call sites 2020-06-09 09:39:14 -07:00
nommu.c It's been a busy cycle for documentation - hopefully the busiest for a 2020-08-04 22:47:54 -07:00
oom_kill.c mm: memcg: convert vmstat slab counters to bytes 2020-08-07 11:33:24 -07:00
page_alloc.c mm: memcg: convert vmstat slab counters to bytes 2020-08-07 11:33:24 -07:00
page_counter.c mm, memcg: prevent memory.min load/store tearing 2020-04-02 09:35:29 -07:00
page_ext.c mm/page_ext.c: drop pfn_present() check when onlining 2020-04-07 10:43:40 -07:00
page_idle.c mm/page_idle.c: skip offline pages 2020-06-08 11:05:55 -07:00
page_io.c mm/page_io.c: use blk_io_schedule() for avoiding task hung in sync io 2020-08-07 11:33:24 -07:00
page_isolation.c mm: Allow to offline unmovable PageOffline() pages via MEM_GOING_OFFLINE 2020-06-04 15:36:52 -04:00
page_owner.c mm: rename gfpflags_to_migratetype to gfp_migratetype for same convention 2020-06-03 20:09:45 -07:00
page_poison.c mm/page_poison.c: fix a typo in a comment 2019-09-24 15:54:08 -07:00
page_reporting.c mm/page_reporting: add budget limit on how many pages can be reported per pass 2020-04-07 10:43:39 -07:00
page_reporting.h mm: introduce include/linux/pgtable.h 2020-06-09 09:39:13 -07:00
page_vma_mapped.c mm/page_vma_mapped.c: explicitly compare pfn for normal, hugetlbfs and THP page 2020-01-31 10:30:38 -08:00
page-writeback.c mm/page-writeback: fix a typo in comment "effictive"->"effective" 2020-06-04 19:06:24 -07:00
pagewalk.c mmap locking API: convert mmap_sem comments 2020-06-09 09:39:14 -07:00
percpu-internal.h percpu: convert chunk hints to be based on pcpu_block_md 2019-03-13 12:25:31 -07:00
percpu-km.c treewide: Replace GPLv2 boilerplate/reference with SPDX - rule 428 2019-06-05 17:37:16 +02:00
percpu-stats.c percpu: update copyright emails to dennis@kernel.org 2020-04-01 10:09:12 -07:00
percpu-vm.c treewide: Replace GPLv2 boilerplate/reference with SPDX - rule 428 2019-06-05 17:37:16 +02:00
percpu.c treewide: Remove uninitialized_var() usage 2020-07-16 12:35:15 -07:00
pgtable-generic.c mm: introduce include/linux/pgtable.h 2020-06-09 09:39:13 -07:00
process_vm_access.c mmap locking API: use coccinelle to convert mmap_sem rwsem call sites 2020-06-09 09:39:14 -07:00
ptdump.c mmap locking API: use coccinelle to convert mmap_sem rwsem call sites 2020-06-09 09:39:14 -07:00
readahead.c mm: use memalloc_nofs_save in readahead path 2020-06-02 10:59:07 -07:00
rmap.c mmap locking API: convert mmap_sem comments 2020-06-09 09:39:14 -07:00
rodata_test.c maccess: rename probe_kernel_{read,write} to copy_{from,to}_kernel_nofault 2020-06-17 10:57:41 -07:00
shmem.c tmpfs: support 64-bit inums per-sb 2020-08-07 11:33:24 -07:00
shuffle.c mm/shuffle: don't move pages between zones and don't read garbage memmaps 2020-08-07 11:33:21 -07:00
shuffle.h mm: adjust shuffle code to allow for future coalescing 2020-04-07 10:43:38 -07:00
slab_common.c mm: memcg: convert vmstat slab counters to bytes 2020-08-07 11:33:24 -07:00
slab.c mm, kcsan: instrument SLAB/SLUB free with "ASSERT_EXCLUSIVE_ACCESS" 2020-08-07 11:33:23 -07:00
slab.h mm: memcontrol: decouple reference counting from page accounting 2020-08-07 11:33:24 -07:00
slob.c mm: memcg: convert vmstat slab counters to bytes 2020-08-07 11:33:24 -07:00
slub.c mm: slub: implement SLUB version of obj_to_index() 2020-08-07 11:33:24 -07:00
sparse-vmemmap.c mm: don't include asm/pgtable.h if linux/mm.h is already included 2020-06-09 09:39:13 -07:00
sparse.c mm: don't include asm/pgtable.h if linux/mm.h is already included 2020-06-09 09:39:13 -07:00
swap_cgroup.c mm: memcontrol: make swap tracking an integral part of memory control 2020-06-03 20:09:48 -07:00
swap_slots.c mm/swap_slots.c: remove redundant check for swap_slot_cache_initialized 2020-08-07 11:33:24 -07:00
swap_state.c mm: swap: fix kerneldoc of swap_vma_readahead() 2020-08-07 11:33:24 -07:00
swap.c treewide: Remove uninitialized_var() usage 2020-07-16 12:35:15 -07:00
swapfile.c block: remove the bd_queue field from struct block_device 2020-07-01 08:08:20 -06:00
truncate.c mm/thp: allow dropping THP from page cache 2019-10-19 06:32:33 -04:00
usercopy.c usercopy: Avoid HIGHMEM pfn warning 2019-09-17 15:20:17 -07:00
userfaultfd.c mmap locking API: convert mmap_sem comments 2020-06-09 09:39:14 -07:00
util.c mmap locking API: convert mmap_sem comments 2020-06-09 09:39:14 -07:00
vmacache.c kernel: better document the use_mm/unuse_mm API contract 2020-06-10 19:14:18 -07:00
vmalloc.c mm: remove vmalloc_exec 2020-06-26 00:27:38 -07:00
vmpressure.c mm: vmpressure: use mem_cgroup_is_root API 2020-04-02 09:35:31 -07:00
vmscan.c mm: memcg: convert vmstat slab counters to bytes 2020-08-07 11:33:24 -07:00
vmstat.c mm: memcg: prepare for byte-sized vmstat items 2020-08-07 11:33:24 -07:00
workingset.c mm: memcg: convert vmstat slab counters to bytes 2020-08-07 11:33:24 -07:00
z3fold.c mm/z3fold: silence kmemleak false positives of slots 2020-05-28 11:35:40 -07:00
zbud.c mm: use false for bool variable 2020-06-04 19:06:24 -07:00
zpool.c zpool: add malloc_support_movable to zpool_driver 2019-09-24 15:54:12 -07:00
zsmalloc.c mm: reorder includes after introduction of linux/pgtable.h 2020-06-09 09:39:13 -07:00
zswap.c mm/zswap: allow setting default status, compressor and allocator in Kconfig 2020-04-07 10:43:41 -07:00