linux_dsm_epyc7002/drivers
Sergey Senozhatsky ebaf9ab56d zram: switch to crypto compress API
We don't have an idle zstreams list anymore and our write path now works
absolutely differently, preventing preemption during compression.  This
removes possibilities of read paths preempting writes at wrong places
(which could badly affect the performance of both paths) and at the same
time opens the door for a move from custom LZO/LZ4 compression backends
implementation to a more generic one, using crypto compress API.

Joonsoo Kim [1] attempted to do this a while ago, but faced with the
need of introducing a new crypto API interface.  The root cause was the
fact that crypto API compression algorithms require a compression stream
structure (in zram terminology) for both compression and decompression
ops, while in reality only several of compression algorithms really need
it.  This resulted in a concept of context-less crypto API compression
backends [2].  Both write and read paths, though, would have been
executed with the preemption enabled, which in the worst case could have
resulted in a decreased worst-case performance, e.g.  consider the
following case:

	CPU0

	zram_write()
	  spin_lock()
	    take the last idle stream
	  spin_unlock()

	<< preempted >>

		zram_read()
		  spin_lock()
		   no idle streams
			  spin_unlock()
			  schedule()

	resuming zram_write compression()

but it took me some time to realize that, and it took even longer to
evolve zram and to make it ready for crypto API.  The key turned out to be
-- drop the idle streams list entirely.  Without the idle streams list we
are free to use compression algorithms that require compression stream for
decompression (read), because streams are now placed in per-cpu data and
each write path has to disable preemption for compression op, almost
completely eliminating the aforementioned case (technically, we still have
a small chance, because write path has a fast and a slow paths and the
slow path is executed with the preemption enabled; but the frequency of
failed fast path is too low).

TEST
====

- 4 CPUs, x86_64 system
- 3G zram, lzo
- fio tests: read, randread, write, randwrite, rw, randrw

test script [3] command:
 ZRAM_SIZE=3G LOG_SUFFIX=XXXX FIO_LOOPS=5 ./zram-fio-test.sh

                   BASE           PATCHED
jobs1
READ:           2527.2MB/s	 2482.7MB/s
READ:           2102.7MB/s	 2045.0MB/s
WRITE:          1284.3MB/s	 1324.3MB/s
WRITE:          1080.7MB/s	 1101.9MB/s
READ:           430125KB/s	 437498KB/s
WRITE:          430538KB/s	 437919KB/s
READ:           399593KB/s	 403987KB/s
WRITE:          399910KB/s	 404308KB/s
jobs2
READ:           8133.5MB/s	 7854.8MB/s
READ:           7086.6MB/s	 6912.8MB/s
WRITE:          3177.2MB/s	 3298.3MB/s
WRITE:          2810.2MB/s	 2871.4MB/s
READ:           1017.6MB/s	 1023.4MB/s
WRITE:          1018.2MB/s	 1023.1MB/s
READ:           977836KB/s	 984205KB/s
WRITE:          979435KB/s	 985814KB/s
jobs3
READ:           13557MB/s	 13391MB/s
READ:           11876MB/s	 11752MB/s
WRITE:          4641.5MB/s	 4682.1MB/s
WRITE:          4164.9MB/s	 4179.3MB/s
READ:           1453.8MB/s	 1455.1MB/s
WRITE:          1455.1MB/s	 1458.2MB/s
READ:           1387.7MB/s	 1395.7MB/s
WRITE:          1386.1MB/s	 1394.9MB/s
jobs4
READ:           20271MB/s	 20078MB/s
READ:           18033MB/s	 17928MB/s
WRITE:          6176.8MB/s	 6180.5MB/s
WRITE:          5686.3MB/s	 5705.3MB/s
READ:           2009.4MB/s	 2006.7MB/s
WRITE:          2007.5MB/s	 2004.9MB/s
READ:           1929.7MB/s	 1935.6MB/s
WRITE:          1926.8MB/s	 1932.6MB/s
jobs5
READ:           18823MB/s	 19024MB/s
READ:           18968MB/s	 19071MB/s
WRITE:          6191.6MB/s	 6372.1MB/s
WRITE:          5818.7MB/s	 5787.1MB/s
READ:           2011.7MB/s	 1981.3MB/s
WRITE:          2011.4MB/s	 1980.1MB/s
READ:           1949.3MB/s	 1935.7MB/s
WRITE:          1940.4MB/s	 1926.1MB/s
jobs6
READ:           21870MB/s	 21715MB/s
READ:           19957MB/s	 19879MB/s
WRITE:          6528.4MB/s	 6537.6MB/s
WRITE:          6098.9MB/s	 6073.6MB/s
READ:           2048.6MB/s	 2049.9MB/s
WRITE:          2041.7MB/s	 2042.9MB/s
READ:           2013.4MB/s	 1990.4MB/s
WRITE:          2009.4MB/s	 1986.5MB/s
jobs7
READ:           21359MB/s	 21124MB/s
READ:           19746MB/s	 19293MB/s
WRITE:          6660.4MB/s	 6518.8MB/s
WRITE:          6211.6MB/s	 6193.1MB/s
READ:           2089.7MB/s	 2080.6MB/s
WRITE:          2085.8MB/s	 2076.5MB/s
READ:           2041.2MB/s	 2052.5MB/s
WRITE:          2037.5MB/s	 2048.8MB/s
jobs8
READ:           20477MB/s	 19974MB/s
READ:           18922MB/s	 18576MB/s
WRITE:          6851.9MB/s	 6788.3MB/s
WRITE:          6407.7MB/s	 6347.5MB/s
READ:           2134.8MB/s	 2136.1MB/s
WRITE:          2132.8MB/s	 2134.4MB/s
READ:           2074.2MB/s	 2069.6MB/s
WRITE:          2087.3MB/s	 2082.4MB/s
jobs9
READ:           19797MB/s	 19994MB/s
READ:           18806MB/s	 18581MB/s
WRITE:          6878.7MB/s	 6822.7MB/s
WRITE:          6456.8MB/s	 6447.2MB/s
READ:           2141.1MB/s	 2154.7MB/s
WRITE:          2144.4MB/s	 2157.3MB/s
READ:           2084.1MB/s	 2085.1MB/s
WRITE:          2091.5MB/s	 2092.5MB/s
jobs10
READ:           19794MB/s	 19784MB/s
READ:           18794MB/s	 18745MB/s
WRITE:          6984.4MB/s	 6676.3MB/s
WRITE:          6532.3MB/s	 6342.7MB/s
READ:           2150.6MB/s	 2155.4MB/s
WRITE:          2156.8MB/s	 2161.5MB/s
READ:           2106.4MB/s	 2095.6MB/s
WRITE:          2109.7MB/s	 2098.4MB/s

                                    BASE                       PATCHED
jobs1                              perfstat
stalled-cycles-frontend     102,480,595,419 (  41.53%)	  114,508,864,804 (  46.92%)
stalled-cycles-backend       51,941,417,832 (  21.05%)	   46,836,112,388 (  19.19%)
instructions                283,612,054,215 (    1.15)	  283,918,134,959 (    1.16)
branches                     56,372,560,385 ( 724.923)	   56,449,814,753 ( 733.766)
branch-misses                   374,826,000 (   0.66%)	      326,935,859 (   0.58%)
jobs2                              perfstat
stalled-cycles-frontend     155,142,745,777 (  40.99%)	  164,170,979,198 (  43.82%)
stalled-cycles-backend       70,813,866,387 (  18.71%)	   66,456,858,165 (  17.74%)
instructions                463,436,648,173 (    1.22)	  464,221,890,191 (    1.24)
branches                     91,088,733,902 ( 760.088)	   91,278,144,546 ( 769.133)
branch-misses                   504,460,363 (   0.55%)	      394,033,842 (   0.43%)
jobs3                              perfstat
stalled-cycles-frontend     201,300,397,212 (  39.84%)	  223,969,902,257 (  44.44%)
stalled-cycles-backend       87,712,593,974 (  17.36%)	   81,618,888,712 (  16.19%)
instructions                642,869,545,023 (    1.27)	  644,677,354,132 (    1.28)
branches                    125,724,560,594 ( 690.682)	  126,133,159,521 ( 694.542)
branch-misses                   527,941,798 (   0.42%)	      444,782,220 (   0.35%)
jobs4                              perfstat
stalled-cycles-frontend     246,701,197,429 (  38.12%)	  280,076,030,886 (  43.29%)
stalled-cycles-backend      119,050,341,112 (  18.40%)	  110,955,641,671 (  17.15%)
instructions                822,716,962,127 (    1.27)	  825,536,969,320 (    1.28)
branches                    160,590,028,545 ( 688.614)	  161,152,996,915 ( 691.068)
branch-misses                   650,295,287 (   0.40%)	      550,229,113 (   0.34%)
jobs5                              perfstat
stalled-cycles-frontend     298,958,462,516 (  38.30%)	  344,852,200,358 (  44.16%)
stalled-cycles-backend      137,558,742,122 (  17.62%)	  129,465,067,102 (  16.58%)
instructions              1,005,714,688,752 (    1.29)	1,007,657,999,432 (    1.29)
branches                    195,988,773,962 ( 697.730)	  196,446,873,984 ( 700.319)
branch-misses                   695,818,940 (   0.36%)	      624,823,263 (   0.32%)
jobs6                              perfstat
stalled-cycles-frontend     334,497,602,856 (  36.71%)	  387,590,419,779 (  42.38%)
stalled-cycles-backend      163,539,365,335 (  17.95%)	  152,640,193,639 (  16.69%)
instructions              1,184,738,177,851 (    1.30)	1,187,396,281,677 (    1.30)
branches                    230,592,915,640 ( 702.902)	  231,253,802,882 ( 702.356)
branch-misses                   747,934,786 (   0.32%)	      643,902,424 (   0.28%)
jobs7                              perfstat
stalled-cycles-frontend     396,724,684,187 (  37.71%)	  460,705,858,952 (  43.84%)
stalled-cycles-backend      188,096,616,496 (  17.88%)	  175,785,787,036 (  16.73%)
instructions              1,364,041,136,608 (    1.30)	1,366,689,075,112 (    1.30)
branches                    265,253,096,936 ( 700.078)	  265,890,524,883 ( 702.839)
branch-misses                   784,991,589 (   0.30%)	      729,196,689 (   0.27%)
jobs8                              perfstat
stalled-cycles-frontend     440,248,299,870 (  36.92%)	  509,554,793,816 (  42.46%)
stalled-cycles-backend      222,575,930,616 (  18.67%)	  213,401,248,432 (  17.78%)
instructions              1,542,262,045,114 (    1.29)	1,545,233,932,257 (    1.29)
branches                    299,775,178,439 ( 697.666)	  300,528,458,505 ( 694.769)
branch-misses                   847,496,084 (   0.28%)	      748,794,308 (   0.25%)
jobs9                              perfstat
stalled-cycles-frontend     506,269,882,480 (  37.86%)	  592,798,032,820 (  44.43%)
stalled-cycles-backend      253,192,498,861 (  18.93%)	  233,727,666,185 (  17.52%)
instructions              1,721,985,080,913 (    1.29)	1,724,666,236,005 (    1.29)
branches                    334,517,360,255 ( 694.134)	  335,199,758,164 ( 697.131)
branch-misses                   873,496,730 (   0.26%)	      815,379,236 (   0.24%)
jobs10                             perfstat
stalled-cycles-frontend     549,063,363,749 (  37.18%)	  651,302,376,662 (  43.61%)
stalled-cycles-backend      281,680,986,810 (  19.07%)	  277,005,235,582 (  18.55%)
instructions              1,901,859,271,180 (    1.29)	1,906,311,064,230 (    1.28)
branches                    369,398,536,153 ( 694.004)	  370,527,696,358 ( 688.409)
branch-misses                   967,929,335 (   0.26%)	      890,125,056 (   0.24%)

                            BASE           PATCHED
seconds elapsed        79.421641008	78.735285546
seconds elapsed        61.471246133	60.869085949
seconds elapsed        62.317058173	62.224188495
seconds elapsed        60.030739363	60.081102518
seconds elapsed        74.070398362	74.317582865
seconds elapsed        84.985953007	85.414364176
seconds elapsed        97.724553255	98.173311344
seconds elapsed        109.488066758	110.268399318
seconds elapsed        122.768189405	122.967164498
seconds elapsed        135.130035105	136.934770801

On my other system (8 x86_64 CPUs, short version of test results):

                            BASE           PATCHED
seconds elapsed        19.518065994	19.806320662
seconds elapsed        15.172772749	15.594718291
seconds elapsed        13.820925970	13.821708564
seconds elapsed        13.293097816	14.585206405
seconds elapsed        16.207284118	16.064431606
seconds elapsed        17.958376158	17.771825767
seconds elapsed        19.478009164	19.602961508
seconds elapsed        21.347152811	21.352318709
seconds elapsed        24.478121126	24.171088735
seconds elapsed        26.865057442	26.767327618

So performance-wise the numbers are quite similar.

Also update zcomp interface to be more aligned with the crypto API.

[1] http://marc.info/?l=linux-kernel&m=144480832108927&w=2
[2] http://marc.info/?l=linux-kernel&m=145379613507518&w=2
[3] https://github.com/sergey-senozhatsky/zram-perf-test

Link: http://lkml.kernel.org/r/20160531122017.2878-3-sergey.senozhatsky@gmail.com
Signed-off-by: Sergey Senozhatsky <sergey.senozhatsky@gmail.com>
Suggested-by: Minchan Kim <minchan@kernel.org>
Suggested-by: Joonsoo Kim <iamjoonsoo.kim@lge.com>
Acked-by: Minchan Kim <minchan@kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2016-07-26 16:19:19 -07:00
..
accessibility
acpi Merge branch 'x86-platform-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip 2016-07-25 19:15:35 -07:00
amba ARM: 8566/1: drivers: amba: properly handle devices with power domains 2016-05-05 19:00:40 +01:00
android
ata Merge branch 'for-4.7-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/tj/libata 2016-07-23 11:46:59 +09:00
atm atm: iphase: off by one in rx_pkt() 2016-05-31 11:52:59 -07:00
auxdisplay
base memory-hotplug: use zone_can_shift() for sysfs valid_zones attribute 2016-07-26 16:19:19 -07:00
bcma x86/quirks: Add early quirk to reset Apple AirPort card 2016-07-10 20:13:53 +02:00
block zram: switch to crypto compress API 2016-07-26 16:19:19 -07:00
bluetooth Bluetooth: Add USB ID 13D3:3487 to ath3k 2016-05-13 16:54:59 +02:00
bus Merge branch 'upstream' of git://git.linux-mips.org/pub/scm/ralf/upstream-linus 2016-05-19 10:02:26 -07:00
cdrom
char Merge branch 'x86-mm-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip 2016-07-25 15:34:18 -07:00
clk clk: at91: fix clk_programmable_set_parent() 2016-07-18 17:45:41 -07:00
clocksource clocksource/drivers/time-armada-370-xp: Fix return value check 2016-07-12 17:33:22 +02:00
connector connector: fix out-of-order cn_proc netlink message delivery 2016-06-28 08:48:33 -04:00
cpufreq Merge branch 'timers-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip 2016-07-25 20:43:12 -07:00
cpuidle cpuidle: Fix last_residency division 2016-07-04 14:17:34 +02:00
crypto crypto: qat - make qat_asym_algs.o depend on asn1 headers 2016-07-21 12:19:53 +08:00
dax /dev/dax, core: file operations and dax-mmap 2016-05-20 22:02:55 -07:00
dca
devfreq PM / devfreq: Send the DEVFREQ_POSTCHANGE notification when target() is failed 2016-06-23 23:15:12 +02:00
dio
dma dmaengine: hsu: Export hsu_dma_get_status() 2016-06-25 14:30:42 -07:00
dma-buf dma-buf: use vma_pages() 2016-05-31 22:17:05 +05:30
edac EDAC, sb_edac: Fix Knights Landing 2016-07-16 06:11:59 +09:00
eisa
extcon extcon: adc-jack: add suspend/resume support 2016-07-02 14:31:34 +09:00
firewire treewide: replace dev->trans_start update with helper 2016-05-04 14:16:49 -04:00
firmware efi: Convert efi_call_virt() to efi_call_virt_pointer() 2016-06-27 13:06:56 +02:00
fmc
fpga
gpio gpio: tegra: don't auto-enable for COMPILE_TEST 2016-07-22 15:29:32 +02:00
gpu Merge tag 'topic/kbl-4.7-fixes-2016-07-18' of git://anongit.freedesktop.org/drm-intel into drm-fixes 2016-07-19 18:00:15 +10:00
hid HID: multitouch: enable palm rejection for Windows Precision Touchpad 2016-06-28 13:24:14 +02:00
hsi HSI: omap-ssi: move omap_ssi_port_update_fclk 2016-05-09 22:45:18 +02:00
hv
hwmon hwmon: (ftsteutates) Remove unused including <linux/version.h> 2016-07-20 07:12:12 -07:00
hwspinlock drivers/hwspinlock: use correct radix tree API 2016-05-20 17:58:30 -07:00
hwtracing intel_th: Fixes -stable 2016-07-15 14:19:11 +09:00
i2c i2c: mux: reg: wrong condition checked for of_address_to_resource return value 2016-07-06 00:33:49 +09:00
ide
idle x86/intel_idle: Use Intel family macros for intel_idle 2016-06-08 13:03:25 +02:00
iio Third set of IIO new device support, features and cleanups for the 4.8 cycle. 2016-07-14 12:05:29 +09:00
infiniband i40iw: Enable remote access rights for stag allocation 2016-07-12 10:46:34 -04:00
input Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/dtor/input 2016-07-23 12:10:48 +09:00
iommu iommu/amd: Fix unity mapping initialization race 2016-07-06 18:04:55 +02:00
ipack
irqchip Merge branch 'irq-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip 2016-07-25 21:35:03 -07:00
isdn TTY and Serial driver update for 4.7-rc1 2016-05-20 20:57:27 -07:00
leds leds: handle suspend/resume in heartbeat trigger 2016-06-08 11:47:06 +02:00
lguest
lightnvm lightnvm: reserved space calculation incorrect 2016-05-06 12:51:10 -06:00
macintosh
mailbox mailbox: Fix devm_ioremap_resource error detection code 2016-05-08 22:44:46 +05:30
mcb mcb: Acquire reference to carrier module in core 2016-06-13 18:49:30 -07:00
md Merge branch 'for-linus' of git://git.kernel.dk/linux-block 2016-05-27 14:28:09 -07:00
media media: fix airspy usb probe error path 2016-07-16 06:15:40 +09:00
memory memory: omap-gpmc: Fix omap gpmc EXTRADELAY timing 2016-06-16 11:43:48 +03:00
memstick drivers/memstick/core/mspro_block: use kmemdup 2016-05-23 17:04:14 -07:00
message SCSI misc on 20160517 2016-05-18 16:38:59 -07:00
mfd mfd: max77620: Fix FPS switch statements 2016-06-30 07:44:23 +01:00
misc lkdtm: silence warnings about function declarations 2016-07-15 16:14:45 -07:00
mmc Merge branch 'timers-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip 2016-07-25 20:43:12 -07:00
mtd Late MTD fix for v4.7: 2016-07-16 09:53:34 +09:00
net Merge branch 'timers-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip 2016-07-25 20:43:12 -07:00
nfc NFC: pn533: handle interrupted commands in pn533_recv_frame 2016-05-10 00:01:47 +02:00
ntb
nubus
nvdimm libnvdimm, pfn, dax: fix initialization vs autodetect for mode + alignment 2016-06-23 17:50:39 -07:00
nvme nvme: Remove RCU namespace protection 2016-07-14 08:48:08 -07:00
nvmem nvmem: imx-ocotp: Fix assignment warning. 2016-06-25 07:42:55 -07:00
of drivers/of: Fix depth for sub-tree blob in unflatten_dt_nodes() 2016-06-09 14:36:34 -05:00
oprofile
parisc
parport
pci Merge branch 'x86-platform-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip 2016-07-25 19:15:35 -07:00
pcmcia
perf arm: pmu: Fix non-devicetree probing 2016-06-15 09:51:35 +01:00
phy phy: for 4.8 -rc1 2016-07-14 12:03:50 +09:00
pinctrl pinctrl: baytrail: Fix mingled clock pins 2016-06-23 11:05:04 +02:00
platform Merge branch 'x86-platform-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip 2016-07-25 19:15:35 -07:00
pnp x86/uaccess: Move thread_info::addr_limit to thread_struct 2016-07-15 10:26:30 +02:00
power Merge branch 'timers-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip 2016-07-25 20:43:12 -07:00
powercap x86, powercap, rapl: Add Skylake Server model number 2016-06-08 13:03:25 +02:00
pps pps: do not crash when failed to register 2016-07-23 10:25:54 +09:00
ps3
ptp ptp: oops in ptp_ioctl() 2016-05-29 22:32:27 -07:00
pwm pwm: atmel-hlcdc: Fix default PWM polarity 2016-06-14 10:51:45 +02:00
rapidio rapidio/mport_cdev: fix uapi type definitions 2016-05-05 17:38:53 -07:00
ras
regulator regulator: Fix qcom-smd list voltage issues for msm8974 2016-07-13 04:22:16 +09:00
remoteproc remoteproc: Add additional crash reasons 2016-05-12 15:50:19 -07:00
reset
rpmsg rpmsg: add THIS_MODULE to rpmsg_driver in rpmsg core 2016-05-06 11:08:58 -07:00
rtc rtc: tps6586x: rename so module can be autoloaded 2016-05-21 17:07:17 +02:00
s390 qeth: delete napi struct when removing a qeth device 2016-07-04 23:32:08 -07:00
sbus openprom: fix warning 2016-05-20 18:33:37 -07:00
scsi Merge branch 'jejb-fixes' into fixes 2016-07-06 07:25:55 -07:00
sfi
sh drivers: sh: Stop using the legacy clock domain on ARM 2016-05-30 09:41:11 +09:00
sn
soc soc: mtk-pmic-wrap: avoid integer overflow warning 2016-05-19 15:20:24 +02:00
spi Merge remote-tracking branches 'spi/fix/ep93xx', 'spi/fix/rockchip', 'spi/fix/sunxi' and 'spi/fix/ti-qspi' into spi-linus 2016-06-30 13:17:29 +01:00
spmi
ssb
staging Third set of IIO new device support, features and cleanups for the 4.8 cycle. 2016-07-14 12:05:29 +09:00
target Merge branch 'for-next' of git://git.kernel.org/pub/scm/linux/kernel/git/nab/target-pending 2016-05-28 12:04:17 -07:00
tc
thermal Merge branch 'x86/cpu' into x86/platform, to avoid conflict 2016-06-14 12:25:07 +02:00
thunderbolt
tty mm: oom: add memcg to oom_control 2016-07-26 16:19:19 -07:00
uio
usb Merge branch 'timers-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip 2016-07-25 20:43:12 -07:00
uwb
vfio vfio/pci: Allow VPD short read 2016-05-31 21:25:52 -06:00
vhost target: make close_session optional 2016-05-10 01:19:26 -07:00
video fbmon: remove unused function argument 2016-07-26 16:19:19 -07:00
virt
virtio virtio_balloon: fix PFN format for virtio-1 2016-05-22 19:44:13 +03:00
vlynq
vme
w1
watchdog watchdog: ebc-c384_wdt: Allow build for X86_64 2016-06-17 20:21:12 -07:00
xen xen/acpi: allow xen-acpi-processor driver to load on Xen 4.7 2016-07-08 14:53:13 +01:00
zorro
Kconfig libnvdimm for 4.7 2016-05-23 11:18:01 -07:00
Makefile drivers: sh: Stop using the legacy clock domain on ARM 2016-05-30 09:41:11 +09:00