linux_dsm_epyc7002/Documentation
David Ahern 9ab948a91b ipv4: Allow amount of dirty memory from fib resizing to be controllable
fib_trie implementation calls synchronize_rcu when a certain amount of
pages are dirty from freed entries. The number of pages was determined
experimentally in 2009 (commit c3059477fc).

At the current setting, synchronize_rcu is called often -- 51 times in a
second in one test with an average of an 8 msec delay adding a fib entry.
The total impact is a lot of slow down modifying the fib. This is seen
in the output of 'time' - the difference between real time and sys+user.
For example, using 720,022 single path routes and 'ip -batch'[1]:

    $ time ./ip -batch ipv4/routes-1-hops
    real    0m14.214s
    user    0m2.513s
    sys     0m6.783s

So roughly 35% of the actual time to install the routes is from the ip
command getting scheduled out, most notably due to synchronize_rcu (this
is observed using 'perf sched timehist').

This patch makes the amount of dirty memory configurable between 64k where
the synchronize_rcu is called often (small, low end systems that are memory
sensitive) to 64M where synchronize_rcu is called rarely during a large
FIB change (for high end systems with lots of memory). The default is 512kB
which corresponds to the current setting of 128 pages with a 4kB page size.

As an example, at 16MB the worst interval shows 4 calls to synchronize_rcu
in a second blocking for up to 30 msec in a single instance, and a total
of almost 100 msec across the 4 calls in the second. The trade off is
allowing FIB entries to consume more memory in a given time window but
but with much better fib insertion rates (~30% increase in prefixes/sec).
With this patch and net.ipv4.fib_sync_mem set to 16MB, the same batch
file runs in:

    $ time ./ip -batch ipv4/routes-1-hops
    real    0m9.692s
    user    0m2.491s
    sys     0m6.769s

So the dead time is reduced to about 1/2 second or <5% of the real time.

[1] 'ip' modified to not request ACK messages which improves route
    insertion times by about 20%

Signed-off-by: David Ahern <dsahern@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2019-03-21 13:29:53 -07:00
..
ABI A large number of bug fixes and cleanups. One new feature to allow 2019-03-12 15:03:21 -07:00
accelerators
accounting
acpi ACPI: Documentation: Fix path for acpidbg tool 2019-03-07 11:28:33 +01:00
admin-guide Merge branch 'core-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip 2019-03-10 13:46:08 -07:00
aoe
arm
arm64 arm64 updates for 5.1: 2019-03-10 10:17:23 -07:00
auxdisplay
backlight
block
blockdev
bpf docs/bpf: minor casing/punctuation fixes 2019-03-02 00:40:04 +01:00
bus-devices
cdrom
cgroup-v1 A fairly routine cycle for docs - lots of typo fixes, some new documents, 2019-03-09 09:56:17 -08:00
cma
connector
console
core-api Merge branch 'akpm' (patches from Andrew) 2019-03-12 10:39:53 -07:00
cpu-freq
crypto
dev-tools
device-mapper dm cache: add support for discard passdown to the origin device 2019-03-05 14:53:52 -05:00
devicetree Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net 2019-03-14 09:28:12 -07:00
doc-guide
driver-api dmaengine updates for v5.1-rc1 2019-03-14 09:11:54 -07:00
driver-model Merge branches 'clk-optional', 'clk-devm-clkdev-register', 'clk-allwinner', 'clk-meson' and 'clk-renesas' into clk-next 2019-03-08 10:27:21 -08:00
early-userspace
EDID
extcon
fault-injection
fb
features
filesystems The highlights are: 2019-03-12 14:58:35 -07:00
firmware_class
fmc
fpga
gpio
gpu
hid
hwmon A fairly routine cycle for docs - lots of typo fixes, some new documents, 2019-03-09 09:56:17 -08:00
i2c i2c: gpio: fault-injector: add 'inject_panic' injector 2019-02-23 10:34:08 +01:00
ia64
ide
iio
infiniband
input
interconnect
ioctl
isdn
kbuild kbuild: remove cc-version macro 2019-03-04 22:34:59 +09:00
kdump
kernel-hacking
laptops
leds
lightnvm
livepatch
locking Documentation/locking/lockdep: Drop last two chars of sample states 2019-03-04 12:55:18 -07:00
m68k
maintainer
md
media media: Documentation: fix several typos 2019-03-01 09:54:06 -05:00
memory-devices
mic
mips
misc-devices
mmc
mtd
namespaces
netlabel
networking ipv4: Allow amount of dirty memory from fib resizing to be controllable 2019-03-21 13:29:53 -07:00
nfc
nios2
nvdimm
nvmem
openrisc
parisc
PCI
pcmcia
perf
phy
platform
power
powerpc
pps
process A fairly routine cycle for docs - lots of typo fixes, some new documents, 2019-03-09 09:56:17 -08:00
pti
ptp
rapidio
RCU A fairly routine cycle for docs - lots of typo fixes, some new documents, 2019-03-09 09:56:17 -08:00
riscv
s390
scheduler
scsi scsi: ufs-bsg: Allow reading descriptors 2019-02-27 09:00:02 -05:00
security doc: security: Add kern-doc for lsm_hooks.h 2019-02-22 08:54:09 -07:00
serial
sh
sound ASoC: More changes for v5.1 2019-02-28 13:30:55 +01:00
sparc
sphinx
sphinx-static
spi
sysctl A fairly routine cycle for docs - lots of typo fixes, some new documents, 2019-03-09 09:56:17 -08:00
target
thermal
timers Docs: Correct /proc/stat path 2019-02-22 08:50:17 -07:00
trace doc: trace: Fix documentation for uprobe_profile 2019-02-21 10:28:55 -05:00
translations A fairly routine cycle for docs - lots of typo fixes, some new documents, 2019-03-09 09:56:17 -08:00
usb usb: core: add option of only authorizing internal devices 2019-02-22 09:27:55 +01:00
userspace-api
virtual virtio-ccw: diag 500 may return a negative cookie 2019-03-06 11:19:33 -05:00
vm
w1
watchdog Documentation/watchdog: Add documentation mlx-wdt driver 2019-03-02 15:28:20 +01:00
wimax
x86
xilinx
xtensa
.gitignore
atomic_bitops.txt
atomic_t.txt
bt8xxgpio.txt
btmrvl.txt
bus-virt-phys-mapping.txt
Changes
clearing-warn-once.txt
CodingStyle
conf.py
cpu-load.txt
cputopology.txt
crc32.txt
dcdbas.txt
debugging-modules.txt
debugging-via-ohci1394.txt
dell_rbu.txt
digsig.txt
DMA-API-HOWTO.txt Documentation/DMA-API-HOWTO: update dma_mask sections 2019-02-20 07:29:47 -07:00
DMA-API.txt virtio: fixes, cleanups 2019-03-10 12:47:57 -07:00
DMA-attributes.txt
DMA-ISA-LPC.txt
docutils.conf
dontdiff kernel/configs: use .incbin directive to embed config_data.gz 2019-03-07 18:32:02 -08:00
efi-stub.txt
eisa.txt
futex-requeue-pi.txt
gcc-plugins.txt
highuid.txt
hw_random.txt
hwspinlock.txt
index.rst
intel_txt.txt
Intel-IOMMU.txt
io_ordering.txt
io-mapping.txt
iostats.txt
IPMI.txt
IRQ-affinity.txt
IRQ-domain.txt
IRQ.txt
irqflags-tracing.txt
isa.txt
isapnp.txt
kernel-per-CPU-kthreads.txt
kobject.txt
kprobes.txt
kref.txt
ldm.txt
lockup-watchdogs.txt
logo.gif
logo.txt
lsm.txt
lzo.txt lib/lzo: separate lzo-rle from lzo 2019-03-07 18:32:03 -08:00
mailbox.txt
Makefile
memory-barriers.txt
men-chameleon-bus.txt
nommu-mmap.txt
ntb.txt
numastat.txt
padata.txt
parport-lowlevel.txt
percpu-rw-semaphore.txt
phy.txt
pi-futex.txt
pnp.txt
preempt-locking.txt
pwm.txt
rbtree.txt
remoteproc.txt
rfkill.txt
robust-futex-ABI.txt
robust-futexes.txt
rpmsg.txt
rtc.txt
SAK.txt
sgi-ioc4.txt
siphash.txt
SM501.txt
smsc_ece1099.txt
speculation.txt
static-keys.txt
SubmittingPatches
svga.txt
switchtec.txt
sync_file.txt
tee.txt
this_cpu_ops.txt
unaligned-memory-access.txt
vfio-mediated-device.txt
vfio.txt
video-output.txt
xillybus.txt
xz.txt
zorro.txt