linux_dsm_epyc7002/Documentation
Dave Rodgman 5ee4014af9 lib/lzo: implement run-length encoding
Patch series "lib/lzo: run-length encoding support", v5.

Following on from the previous lzo-rle patchset:

  https://lkml.org/lkml/2018/11/30/972

This patchset contains only the RLE patches, and should be applied on
top of the non-RLE patches ( https://lkml.org/lkml/2019/2/5/366 ).

Previously, some questions were raised around the RLE patches.  I've
done some additional benchmarking to answer these questions.  In short:

 - RLE offers significant additional performance (data-dependent)

 - I didn't measure any regressions that were clearly outside the noise

One concern with this patchset was around performance - specifically,
measuring RLE impact separately from Matt Sealey's patches (CTZ & fast
copy).  I have done some additional benchmarking which I hope clarifies
the benefits of each part of the patchset.

Firstly, I've captured some memory via /dev/fmem from a Chromebook with
many tabs open which is starting to swap, and then split this into 4178
4k pages.  I've excluded the all-zero pages (as zram does), and also the
no-zero pages (which won't tell us anything about RLE performance).
This should give a realistic test dataset for zram.  What I found was
that the data is VERY bimodal: 44% of pages in this dataset contain 5%
or fewer zeros, and 44% contain over 90% zeros (30% if you include the
no-zero pages).  This supports the idea of special-casing zeros in zram.

Next, I've benchmarked four variants of lzo on these pages (on 64-bit
Arm at max frequency): baseline LZO; baseline + Matt Sealey's patches
(aka MS); baseline + RLE only; baseline + MS + RLE.  Numbers are for
weighted roundtrip throughput (the weighting reflects that zram does
more compression than decompression).

  https://drive.google.com/file/d/1VLtLjRVxgUNuWFOxaGPwJYhl_hMQXpHe/view?usp=sharing

Matt's patches help in all cases for Arm (and no effect on Intel), as
expected.

RLE also behaves as expected: with few zeros present, it makes no
difference; above ~75%, it gives a good improvement (50 - 300 MB/s on
top of the benefit from Matt's patches).

Best performance is seen with both MS and RLE patches.

Finally, I have benchmarked the same dataset on an x86-64 device.  Here,
the MS patches make no difference (as expected); RLE helps, similarly as
on Arm.  There were no definite regressions; allowing for observational
error, 0.1% (3/4178) of cases had a regression > 1 standard deviation,
of which the largest was 4.6% (1.2 standard deviations).  I think this
is probably within the noise.

  https://drive.google.com/file/d/1xCUVwmiGD0heEMx5gcVEmLBI4eLaageV/view?usp=sharing

One point to note is that the graphs show RLE appears to help very
slightly with no zeros present! This is because the extra code causes
the clang optimiser to change code layout in a way that happens to have
a significant benefit.  Taking baseline LZO and adding a do-nothing line
like "__builtin_prefetch(out_len);" immediately before the "goto next"
has the same effect.  So this is a real, but basically spurious effect -
it's small enough not to upset the overall findings.

This patch (of 3):

When using zram, we frequently encounter long runs of zero bytes.  This
adds a special case which identifies runs of zeros and encodes them
using run-length encoding.

This is faster for both compression and decompresion.  For high-entropy
data which doesn't hit this case, impact is minimal.

Compression ratio is within a few percent in all cases.

This modifies the bitstream in a way which is backwards compatible
(i.e., we can decompress old bitstreams, but old versions of lzo cannot
decompress new bitstreams).

Link: http://lkml.kernel.org/r/20190205155944.16007-2-dave.rodgman@arm.com
Signed-off-by: Dave Rodgman <dave.rodgman@arm.com>
Cc: David S. Miller <davem@davemloft.net>
Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Cc: Herbert Xu <herbert@gondor.apana.org.au>
Cc: Markus F.X.J. Oberhumer <markus@oberhumer.com>
Cc: Matt Sealey <matt.sealey@arm.com>
Cc: Minchan Kim <minchan@kernel.org>
Cc: Nitin Gupta <nitingupta910@gmail.com>
Cc: Richard Purdie <rpurdie@openedhand.com>
Cc: Sergey Senozhatsky <sergey.senozhatsky.work@gmail.com>
Cc: Sonny Rao <sonnyrao@google.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2019-03-07 18:32:02 -08:00
..
ABI USB/PHY patches for 5.1-rc1 2019-03-06 16:48:27 -08:00
accelerators ocxl: Document new OCXL IOCTLs 2018-06-03 20:40:33 +10:00
accounting psi: cgroup support 2018-10-26 16:26:32 -07:00
acpi ACPI / tables: table override from built-in initrd 2019-01-14 11:42:18 +01:00
admin-guide USB/PHY patches for 5.1-rc1 2019-03-06 16:48:27 -08:00
aoe
arm Documentation: Use "while" instead of "whilst" 2018-11-20 09:30:43 -07:00
arm64 clocksource/drivers/arch_timer: Workaround for Allwinner A64 timer instability 2019-02-23 12:13:45 +01:00
auxdisplay Doc: misc-devices: move lcd-panel-cgram.txt to auxdisplay/ 2018-04-12 16:08:02 +02:00
backlight
block block: doc: add slice_idle_us to bfq documentation 2019-01-09 07:38:48 -07:00
blockdev zram: idle writeback fixes and cleanup 2019-01-08 17:15:10 -08:00
bpf docs/bpf: minor casing/punctuation fixes 2019-03-02 00:40:04 +01:00
bus-devices
cdrom Drop all 00-INDEX files from Documentation/ 2018-09-09 15:08:58 -06:00
cgroup-v1 mm: remove zone_lru_lock() function, access ->lru_lock directly 2019-03-05 21:07:21 -08:00
cma
connector
console Documentation: corrections to console/console.txt 2018-08-10 16:09:40 -06:00
core-api refcount_t: Add ACQUIRE ordering on success for dec(sub)_and_test() variants 2019-02-04 09:03:31 +01:00
cpu-freq Documentation: cpu-freq: Frequencies aren't always sorted 2018-11-07 13:29:04 +01:00
crypto crypto: skcipher - remove remnants of internal IV generators 2018-12-23 11:52:45 +08:00
dev-tools A fairly normal cycle for documentation stuff. We have a new 2018-12-29 11:21:49 -08:00
device-mapper Documentation: Use "while" instead of "whilst" 2018-11-20 09:30:43 -07:00
devicetree USB/PHY patches for 5.1-rc1 2019-03-06 16:48:27 -08:00
doc-guide doc:process: add links where missing 2018-12-06 10:21:19 -07:00
driver-api Driver core patches for 5.1-rc1 2019-03-06 14:52:48 -08:00
driver-model Documentation: driver core: remove use of BUS_ATTR 2019-01-08 15:17:45 +01:00
early-userspace Correct gen_init_cpio tool's documentation 2018-11-25 12:25:53 -07:00
EDID Docs/EDID: Calculate CRC while building the code 2018-11-06 07:36:22 -07:00
extcon
fault-injection
fb fbdev: fbmem: convert CONFIG_FB_LOGO_CENTER into a cmd line option 2019-01-16 17:42:35 +01:00
features Documentation/features: Add csky kernel features 2019-01-07 22:22:16 +08:00
filesystems Documentation: driver core: remove use of BUS_ATTR 2019-01-08 15:17:45 +01:00
firmware_class
fmc Drop all 00-INDEX files from Documentation/ 2018-09-09 15:08:58 -06:00
fpga docs: fpga: add a document for FPGA Device Feature List (DFL) Framework Overview 2018-07-15 13:55:44 +02:00
gpio Drop all 00-INDEX files from Documentation/ 2018-09-09 15:08:58 -06:00
gpu A fairly normal cycle for documentation stuff. We have a new 2018-12-29 11:21:49 -08:00
hid HID: doc: fix wrong data structure reference for UHID_OUTPUT 2018-12-18 14:55:22 +01:00
hwmon hwmon: (lm85) add support for LM96000 high frequencies 2019-02-18 14:23:29 -08:00
i2c i2c: add i2c bus driver for NVIDIA GPU 2018-11-09 17:46:43 +01:00
ia64
ide Drop all 00-INDEX files from Documentation/ 2018-09-09 15:08:58 -06:00
iio
infiniband
input Input: add REL_WHEEL_HI_RES and REL_HWHEEL_HI_RES 2018-12-07 16:27:11 +01:00
interconnect interconnect: Add generic on-chip interconnect API 2019-01-22 13:37:25 +01:00
ioctl seccomp: add a return code to trap to userspace 2018-12-11 16:28:41 -08:00
isdn Drop all 00-INDEX files from Documentation/ 2018-09-09 15:08:58 -06:00
kbuild kbuild: generate asm-generic wrappers if mandatory headers are missing 2019-01-06 09:46:51 +09:00
kdump
kernel-hacking doc:it_IT: translation for kernel-hacking 2018-07-26 16:21:09 -06:00
laptops platform-drivers-x86 for v4.20-1 2018-11-01 08:42:21 -07:00
leds Documentation: Use "while" instead of "whilst" 2018-11-20 09:30:43 -07:00
lightnvm
livepatch livepatch: Remove not longer valid limitations from the documentation 2018-05-24 15:37:57 +02:00
locking This is a fairly typical cycle for documentation. There's some welcome 2018-10-24 18:01:11 +01:00
m68k Drop all 00-INDEX files from Documentation/ 2018-09-09 15:08:58 -06:00
maintainer docs: Fix more broken references 2018-06-15 18:11:26 -03:00
md
media A fairly normal cycle for documentation stuff. We have a new 2018-12-29 11:21:49 -08:00
memory-devices
mic
mips Drop all 00-INDEX files from Documentation/ 2018-09-09 15:08:58 -06:00
misc-devices pci_endpoint_test: Add 2 ioctl commands 2018-07-19 11:46:57 +01:00
mmc Drop all 00-INDEX files from Documentation/ 2018-09-09 15:08:58 -06:00
mtd Documentation: mtd: remove stale pxa3xx NAND controller documentation 2018-09-04 23:37:38 +02:00
namespaces
netlabel Drop all 00-INDEX files from Documentation/ 2018-09-09 15:08:58 -06:00
networking Merge git://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf-next 2019-03-04 10:14:31 -08:00
nfc
nios2
nvdimm libnvdimm/security: Add documentation for nvdimm security support 2018-12-21 12:44:41 -08:00
nvmem Documentation: nvmem: document cell tables and lookup entries 2018-09-28 15:14:54 +02:00
openrisc
parisc Drop all 00-INDEX files from Documentation/ 2018-09-09 15:08:58 -06:00
PCI pci-v4.20-changes 2018-10-25 06:50:48 -07:00
pcmcia pcmcia: remove long deprecated pcmcia_request_exclusive_irq() function 2018-08-18 12:30:42 -07:00
perf Documentation: perf: Add documentation for ThunderX2 PMU uncore driver 2018-12-06 12:29:47 +00:00
phy
platform
power PM/EM: Document the Energy Model framework 2019-01-27 12:29:37 +01:00
powerpc powerpc/fadump: Reservationless firmware assisted dump 2018-12-21 11:32:49 +11:00
pps
process configs: get rid of obsolete CONFIG_ENABLE_WARN_DEPRECATED 2019-03-07 18:32:02 -08:00
pti
ptp
rapidio
RCU Merge branches 'doc.2019.01.26a', 'fixes.2019.01.26a', 'sil.2019.01.26a', 'spdx.2019.02.09a', 'srcu.2019.01.26a' and 'torture.2019.01.26a' into HEAD 2019-02-09 08:47:52 -08:00
riscv perf: riscv: Add Document for Future Porting Guide 2018-06-04 14:02:11 -07:00
s390 Documentation: Use "while" instead of "whilst" 2018-11-20 09:30:43 -07:00
scheduler sched/doc: Document Energy Aware Scheduling 2019-01-27 12:29:37 +01:00
scsi SCSI misc on 20181224 2018-12-28 14:48:06 -08:00
security Merge branch 'next-integrity' of git://git.kernel.org/pub/scm/linux/kernel/git/jmorris/linux-security 2019-01-02 09:43:14 -08:00
serial Documentation: Use "while" instead of "whilst" 2018-11-20 09:30:43 -07:00
sh sh: remove board_time_init() callback 2018-12-18 16:13:04 +01:00
sound ASoC: More changes for v5.1 2019-02-28 13:30:55 +01:00
sparc
sphinx Documentation/sphinx: allow "functions" with no parameters 2018-06-30 07:52:42 -06:00
sphinx-static docs: improve readability for people with poorer eyesight 2018-10-07 09:16:50 -06:00
spi pxa2xx: replace spi_master with spi_controller 2019-01-23 10:59:56 +00:00
sysctl Merge branch 'sched-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip 2019-03-06 08:14:05 -08:00
target
thermal Documentation: Use "while" instead of "whilst" 2018-11-20 09:30:43 -07:00
timers Drop all 00-INDEX files from Documentation/ 2018-09-09 15:08:58 -06:00
trace Merge branches 'pm-cpuidle', 'pm-cpufreq' and 'pm-sleep' 2019-01-11 10:09:51 +01:00
translations configs: get rid of obsolete CONFIG_ENABLE_WARN_DEPRECATED 2019-03-07 18:32:02 -08:00
usb usb: core: add option of only authorizing internal devices 2019-02-22 09:27:55 +01:00
userspace-api x86/speculation: Add PR_SPEC_DISABLE_NOEXEC 2019-01-29 22:11:49 +01:00
virtual Documentation/virtual/kvm: Update URL for AMD SEV API specification 2019-01-11 18:38:07 +01:00
vm A fairly normal cycle for documentation stuff. We have a new 2018-12-29 11:21:49 -08:00
w1 Drop all 00-INDEX files from Documentation/ 2018-09-09 15:08:58 -06:00
watchdog watchdog: docs: kernel-api: don't reference removed functions 2018-12-24 13:15:06 +01:00
wimax
x86 x86/resctrl: Avoid confusion over the new X86_RESCTRL config 2019-02-02 10:34:52 +01:00
xilinx Documentation: xilinx: Add documentation for eemi APIs 2018-10-09 13:26:05 +02:00
xtensa
.gitignore
atomic_bitops.txt
atomic_t.txt
bt8xxgpio.txt
btmrvl.txt
bus-virt-phys-mapping.txt
Changes
clearing-warn-once.txt
CodingStyle
conf.py This is a fairly typical cycle for documentation. There's some welcome 2018-10-24 18:01:11 +01:00
cpu-load.txt
cputopology.txt
crc32.txt
dcdbas.txt
debugging-modules.txt
debugging-via-ohci1394.txt
dell_rbu.txt Documentation: remove stale firmware API reference 2018-05-14 16:44:41 +02:00
digsig.txt
DMA-API-HOWTO.txt
DMA-API.txt dma-mapping: deprecate dma_zalloc_coherent 2018-12-20 08:14:09 +01:00
DMA-attributes.txt
DMA-ISA-LPC.txt
docutils.conf
dontdiff kernel/configs: use .incbin directive to embed config_data.gz 2019-03-07 18:32:02 -08:00
efi-stub.txt efi_stub: update documentation on dtb= parameter 2018-09-09 14:46:44 -06:00
eisa.txt
flexible-arrays.txt
futex-requeue-pi.txt
gcc-plugins.txt
highuid.txt
hw_random.txt
hwspinlock.txt
index.rst docs: tidy up TOCs and refs to license-rules.rst 2018-08-31 16:50:50 -06:00
intel_txt.txt
Intel-IOMMU.txt
io_ordering.txt
io-mapping.txt
iostats.txt block: Track DISCARD statistics and output them in stat and diskstat 2018-07-18 08:44:22 -06:00
IPMI.txt
IRQ-affinity.txt
IRQ-domain.txt
IRQ.txt
irqflags-tracing.txt
isa.txt
isapnp.txt
kernel-per-CPU-kthreads.txt doc: Update removal of RCU-bh/sched update machinery 2018-08-30 10:59:48 -07:00
kobject.txt kref/kobject: Improve documentation 2018-12-06 13:57:03 +01:00
kprobes.txt kprobes/Documentation: Fix various typos 2018-06-22 11:10:55 +02:00
kref.txt
ldm.txt
lockup-watchdogs.txt
logo.gif
logo.txt
lsm.txt
lzo.txt lib/lzo: implement run-length encoding 2019-03-07 18:32:02 -08:00
mailbox.txt
Makefile kbuild: Add support for DT binding schema checks 2018-12-13 09:41:32 -06:00
memory-barriers.txt Documentation: Use "while" instead of "whilst" 2018-11-20 09:30:43 -07:00
men-chameleon-bus.txt
nommu-mmap.txt Documentation: nommu-map: Fix duplicate word typo 2018-06-26 09:01:27 -06:00
ntb.txt
numastat.txt
padata.txt
parport-lowlevel.txt
percpu-rw-semaphore.txt
phy.txt
pi-futex.txt
pnp.txt
preempt-locking.txt Documentation: preempt-locking: Use better example 2018-10-12 11:35:47 -06:00
pwm.txt
rbtree.txt
remoteproc.txt
rfkill.txt rfkill: Fix several typos in documentation 2018-06-15 13:36:08 +02:00
robust-futex-ABI.txt
robust-futexes.txt
rpmsg.txt
rtc.txt
SAK.txt
sgi-ioc4.txt
siphash.txt
SM501.txt
smsc_ece1099.txt
speculation.txt
static-keys.txt Documentation: Use "while" instead of "whilst" 2018-11-20 09:30:43 -07:00
SubmittingPatches
svga.txt
switchtec.txt NTB: switchtec_ntb: Update switchtec documentation with prerequisites for NTB 2018-10-11 11:28:53 -05:00
sync_file.txt
tee.txt
this_cpu_ops.txt
unaligned-memory-access.txt
vfio-mediated-device.txt vfio/mdev: Check globally for duplicate devices 2018-06-08 10:24:27 -06:00
vfio.txt vfio: fix documentation 2018-05-08 09:16:41 -06:00
video-output.txt
xillybus.txt
xz.txt
zorro.txt