Go to file
Roman Gushchin 475d0487a2 mm: memcontrol: use per-cpu stocks for socket memory uncharging
We've noticed a quite noticeable performance overhead on some hosts with
significant network traffic when socket memory accounting is enabled.

Perf top shows that socket memory uncharging path is hot:
  2.13%  [kernel]                [k] page_counter_cancel
  1.14%  [kernel]                [k] __sk_mem_reduce_allocated
  1.14%  [kernel]                [k] _raw_spin_lock
  0.87%  [kernel]                [k] _raw_spin_lock_irqsave
  0.84%  [kernel]                [k] tcp_ack
  0.84%  [kernel]                [k] ixgbe_poll
  0.83%  < workload >
  0.82%  [kernel]                [k] enqueue_entity
  0.68%  [kernel]                [k] __fget
  0.68%  [kernel]                [k] tcp_delack_timer_handler
  0.67%  [kernel]                [k] __schedule
  0.60%  < workload >
  0.59%  [kernel]                [k] __inet6_lookup_established
  0.55%  [kernel]                [k] __switch_to
  0.55%  [kernel]                [k] menu_select
  0.54%  libc-2.20.so            [.] __memcpy_avx_unaligned

To address this issue, the existing per-cpu stock infrastructure can be
used.

refill_stock() can be called from mem_cgroup_uncharge_skmem() to move
charge to a per-cpu stock instead of calling atomic
page_counter_uncharge().

To prevent the uncontrolled growth of per-cpu stocks, refill_stock()
will explicitly drain the cached charge, if the cached value exceeds
CHARGE_BATCH.

This allows significantly optimize the load:
  1.21%  [kernel]                [k] _raw_spin_lock
  1.01%  [kernel]                [k] ixgbe_poll
  0.92%  [kernel]                [k] _raw_spin_lock_irqsave
  0.90%  [kernel]                [k] enqueue_entity
  0.86%  [kernel]                [k] tcp_ack
  0.85%  < workload >
  0.74%  perf-11120.map          [.] 0x000000000061bf24
  0.73%  [kernel]                [k] __schedule
  0.67%  [kernel]                [k] __fget
  0.63%  [kernel]                [k] __inet6_lookup_established
  0.62%  [kernel]                [k] menu_select
  0.59%  < workload >
  0.59%  [kernel]                [k] __switch_to
  0.57%  libc-2.20.so            [.] __memcpy_avx_unaligned

Link: http://lkml.kernel.org/r/20170829100150.4580-1-guro@fb.com
Signed-off-by: Roman Gushchin <guro@fb.com>
Acked-by: Johannes Weiner <hannes@cmpxchg.org>
Acked-by: Michal Hocko <mhocko@suse.com>
Cc: Vladimir Davydov <vdavydov.dev@gmail.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2017-09-08 18:26:47 -07:00
arch mm/memory_hotplug: introduce add_pages 2017-09-08 18:26:46 -07:00
block SCSI misc on 20170907 2017-09-07 21:11:05 -07:00
certs modsign: add markers to endif-statements in certs/Makefile 2017-07-14 11:01:37 +10:00
crypto crypto: af_alg - get_page upon reassignment to TX SGL 2017-08-22 15:03:27 +08:00
Documentation hmm: heterogeneous memory management documentation 2017-09-08 18:26:45 -07:00
drivers mm: change the call sites of numa statistics items 2017-09-08 18:26:47 -07:00
firmware firmware/Makefile: force recompilation if makefile changes 2017-05-08 17:15:10 -07:00
fs userfaultfd: non-cooperative: closing the uffd without triggering SIGBUS 2017-09-08 18:26:47 -07:00
include mm: consider the number in local CPUs when reading NUMA stats 2017-09-08 18:26:47 -07:00
init Merge branch 'for-4.14' of git://git.kernel.org/pub/scm/linux/kernel/git/tj/percpu 2017-09-06 21:33:12 -07:00
ipc Merge branch 'for-mingo' of git://git.kernel.org/pub/scm/linux/kernel/git/paulmck/linux-rcu into core/rcu 2017-08-21 09:45:19 +02:00
kernel mm/device-public-memory: device memory cache coherent with CPU 2017-09-08 18:26:46 -07:00
lib Merge tag 'md/4.14-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/shli/md 2017-09-07 12:41:48 -07:00
mm mm: memcontrol: use per-cpu stocks for socket memory uncharging 2017-09-08 18:26:47 -07:00
net Merge branch 'for-4.14' of git://git.kernel.org/pub/scm/linux/kernel/git/tj/cgroup 2017-09-06 22:25:25 -07:00
samples media updates for v4.14-rc1 2017-09-07 12:53:14 -07:00
scripts Merge branch 'gperf-removal' 2017-09-07 21:39:15 -07:00
security audit/stable-4.14 PR 20170907 2017-09-07 20:48:25 -07:00
sound - New Drivers 2017-09-07 13:51:13 -07:00
tools powerpc updates for 4.14 2017-09-07 10:15:40 -07:00
usr ramfs: clarify help text that compression applies to ramfs as well as legacy ramdisk. 2017-07-06 16:24:30 -07:00
virt KVM: update to new mmu_notifier semantic v2 2017-08-31 16:13:00 -07:00
.cocciconfig scripts: add Linux .cocciconfig for coccinelle 2016-07-22 12:13:39 +02:00
.get_maintainer.ignore Add hch to .get_maintainer.ignore 2015-08-21 14:30:10 -07:00
.gitattributes .gitattributes: set git diff driver for C source code files 2016-10-07 18:46:30 -07:00
.gitignore kbuild: Add support to generate LLVM assembly files 2017-04-25 08:13:52 +09:00
.mailmap power supply and reset changes for the v4.12 series (part 2) 2017-05-12 12:02:21 -07:00
COPYING
CREDITS avr32: remove support for AVR32 architecture 2017-05-01 09:27:15 +02:00
Kbuild kbuild: Consolidate header generation from ASM offset information 2017-04-13 05:43:37 +09:00
Kconfig kbuild: migrate all arch to the kconfig mainmenu upgrade 2010-09-19 22:54:11 -04:00
MAINTAINERS hmm: heterogeneous memory management documentation 2017-09-08 18:26:45 -07:00
Makefile Merge branch 'docs-next' of git://git.lwn.net/linux 2017-09-03 21:07:29 -07:00
README README: add a new README file, pointing to the Documentation/ 2016-10-24 08:12:35 -02:00

Linux kernel
============

This file was moved to Documentation/admin-guide/README.rst

Please notice that there are several guides for kernel developers and users.
These guides can be rendered in a number of formats, like HTML and PDF.

In order to build the documentation, use ``make htmldocs`` or
``make pdfdocs``.

There are various text files in the Documentation/ subdirectory,
several of them using the Restructured Text markup notation.
See Documentation/00-INDEX for a list of what is contained in each file.

Please read the Documentation/process/changes.rst file, as it contains the
requirements for building and running the kernel, and information about
the problems which may result by upgrading your kernel.