linux_dsm_epyc7002/drivers
Yufen Yu e236858243 md/raid5: set default stripe_size as 4096
In RAID5, if an issued bio is bigger than stripe_size, it is split
into stripe_size units, which are processed one by one. Even for a
bio smaller than stripe_size, RAID5 still requests at least
stripe_size of data from disk.

Nowadays, stripe_size is equal to PAGE_SIZE. Since filesystems usually
issue bios in units of 4KB, this is no problem when PAGE_SIZE is 4KB.
But with a 64KB PAGE_SIZE, a bio from the filesystem requests 4KB of
data while RAID5 issues at least stripe_size (64KB) of IO each time.
That wastes disk bandwidth and xor computation.
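
The waste can be made concrete with a little arithmetic. The following
is a minimal sketch, not the raid5 code itself: the helper name
stripe_io_bytes is hypothetical, and it assumes the bio is aligned to
stripe_size and ignores parity and read-modify-write traffic.

```python
def stripe_io_bytes(bio_bytes, stripe_size):
    """Bytes handled for one aligned bio, in whole stripe_size units.

    Hypothetical helper for illustration only; it models the rounding
    described above, not the actual raid5 implementation.
    """
    # A bio is processed in stripe_size units, so round up to a whole unit.
    units = -(-bio_bytes // stripe_size)  # ceiling division
    return units * stripe_size


KB = 1024
for stripe_size in (4 * KB, 64 * KB):
    handled = stripe_io_bytes(4 * KB, stripe_size)
    factor = handled // (4 * KB)
    print(f"stripe_size={stripe_size // KB}KB: "
          f"4KB bio -> {handled // KB}KB handled ({factor}x)")
```

Under these assumptions a 4KB bio costs 4KB of IO with a 4KB
stripe_size and 64KB with a 64KB stripe_size.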

To avoid this waste, we want to make stripe_size configurable. This
patch just sets the default stripe_size to 4096. Users can also set a
value bigger than 4KB for special requirements, such as when the
issued IO size is known to be more than 4KB.

To evaluate the new feature, we create a raid5 device '/dev/md5' with
4 SSD disks and test it on an arm64 machine with 64KB PAGE_SIZE.

1) We format /dev/md5 with mkfs.ext4 and mount ext4 with the default
 configuration on the /mnt directory. Then we test it with dbench using
 the command: dbench -D /mnt -t 1000 10. The results are:

 'stripe_size = 64KB'

  Operation      Count    AvgLat    MaxLat
  ----------------------------------------
  NTCreateX    9805011     0.021    64.728
  Close        7202525     0.001     0.120
  Rename        415213     0.051    44.681
  Unlink       1980066     0.079    93.147
  Deltree          240     1.793     6.516
  Mkdir            120     0.004     0.007
  Qpathinfo    8887512     0.007    37.114
  Qfileinfo    1557262     0.001     0.030
  Qfsinfo      1629582     0.012     0.152
  Sfileinfo     798756     0.040    57.641
  Find         3436004     0.019    57.782
  WriteX       4887239     0.021    57.638
  ReadX        15370483     0.005    37.818
  LockX          31934     0.003     0.022
  UnlockX        31933     0.001     0.021
  Flush         687205    13.302   530.088

 Throughput 307.799 MB/sec  10 clients  10 procs  max_latency=530.091 ms
 -------------------------------------------------------

 'stripe_size = 4KB'

  Operation      Count    AvgLat    MaxLat
  ----------------------------------------
  NTCreateX    11999166     0.021    36.380
  Close        8814128     0.001     0.122
  Rename        508113     0.051    29.169
  Unlink       2423242     0.070    38.141
  Deltree          300     1.885     7.155
  Mkdir            150     0.004     0.006
  Qpathinfo    10875921     0.007    35.485
  Qfileinfo    1905837     0.001     0.032
  Qfsinfo      1994304     0.012     0.125
  Sfileinfo     977450     0.029    26.489
  Find         4204952     0.019     9.361
  WriteX       5981890     0.019    27.804
  ReadX        18809742     0.004    33.491
  LockX          39074     0.003     0.025
  UnlockX        39074     0.001     0.014
  Flush         841022    10.712   458.848

 Throughput 376.777 MB/sec  10 clients  10 procs  max_latency=458.852 ms
 -------------------------------------------------------

 It shows that setting stripe_size to 4KB gives higher throughput
 (376.777 vs 307.799 MB/sec) and lower latency than setting it to 64KB.
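
To put the dbench difference in relative terms, the sketch below
computes percentage changes from figures copied out of the two result
tables above. The helper name relative_change is hypothetical; the
numbers are the Throughput, Flush AvgLat and max_latency values as
reported.

```python
def relative_change(new, old):
    """Percentage change of `new` relative to `old` (hypothetical helper)."""
    return (new - old) / old * 100.0


# (stripe_size = 64KB, stripe_size = 4KB), copied from the dbench output.
dbench = {
    "throughput MB/sec": (307.799, 376.777),
    "Flush AvgLat": (13.302, 10.712),
    "max_latency ms": (530.091, 458.852),
}

for name, (with_64k, with_4k) in dbench.items():
    print(f"{name}: {relative_change(with_4k, with_64k):+.1f}% "
          f"going from 64KB to 4KB")
```

With these inputs, throughput rises by roughly 22% while average Flush
latency and maximum latency both drop.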

 2) We evaluate IO throughput for /dev/md5 with fio using the config:

 [4KB randwrite]
 direct=1
 numjobs=2
 iodepth=64
 ioengine=libaio
 filename=/dev/md5
 bs=4KB
 rw=randwrite

 [1MB write]
 direct=1
 numjobs=2
 iodepth=64
 ioengine=libaio
 filename=/dev/md5
 bs=1MB
 rw=write

 The results are as follows:

               +                   +
               | stripe_size(64KB) | stripe_size(4KB)
 +----------------------------------------------------+
 4KB randwrite |     15MB/s        |      100MB/s
 +----------------------------------------------------+
 1MB write     |   1000MB/s        |      700MB/s

 The results show that when the IO size is bigger than 4KB (the 1MB
 write), the 64KB stripe_size has much higher throughput. But for 4KB
 randwrite, where the IO issued to the device is smaller, the 4KB
 stripe_size performs better.
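
The trade-off in the fio table can be expressed as a ratio per
workload. This is a small illustrative sketch using only the four
throughput figures reported above; the function name
ratio_4k_over_64k is hypothetical.

```python
# fio throughput in MB/s: workload -> (stripe_size 64KB, stripe_size 4KB),
# copied from the result table.
fio_results = {
    "4KB randwrite": (15, 100),
    "1MB write": (1000, 700),
}


def ratio_4k_over_64k(workload):
    """Throughput with a 4KB stripe_size divided by that with 64KB."""
    with_64k, with_4k = fio_results[workload]
    return with_4k / with_64k


for workload in fio_results:
    ratio = ratio_4k_over_64k(workload)
    better = "4KB" if ratio > 1 else "64KB"
    print(f"{workload}: 4KB/64KB = {ratio:.2f}, "
          f"better stripe_size: {better}")
```

A ratio above 1 means the 4KB stripe_size wins for that workload; a
ratio below 1 means the 64KB stripe_size wins.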

Normally, the default value (4096) gives relatively good performance.
But if each issued IO is bigger than 4096, setting a value larger than
4096 may give better performance.

Here, we just set the default stripe_size to 4096; we will try to
support setting a different stripe_size through a sysfs interface in
a following patch.

Signed-off-by: Yufen Yu <yuyufen@huawei.com>
Signed-off-by: Song Liu <songliubraving@fb.com>
2020-07-21 17:18:17 -07:00
..
accessibility treewide: replace '---help---' in Kconfig files with 'help' 2020-06-14 01:57:21 +09:00
acpi Merge branch 'acpi-fan' 2020-07-03 16:15:31 +02:00
amba ARM: tegra: Replace zero-length array with flexible-array 2020-06-15 23:08:28 -05:00
android binder: fix null deref of proc->context 2020-06-23 07:54:46 +02:00
ata libata-5.8-2020-06-19 2020-06-19 13:09:40 -07:00
atm treewide: replace '---help---' in Kconfig files with 'help' 2020-06-14 01:57:21 +09:00
auxdisplay treewide: replace '---help---' in Kconfig files with 'help' 2020-06-14 01:57:21 +09:00
base Power management fixes for 5.8-rc3 2020-06-26 12:32:11 -07:00
bcma
block rsxx: switch from 'pci_free_consistent()' to 'dma_free_coherent()' 2020-07-11 09:27:09 -06:00
bluetooth Merge git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net-next 2020-06-03 16:27:18 -07:00
bus Fixes for omaps for v5.8 2020-06-28 14:41:55 +02:00
cdrom Merge branch 'work.sysctl' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs 2020-06-10 16:05:54 -07:00
char tpm_tis: Remove the HID IFX0102 2020-07-02 17:49:00 +03:00
clk clk: sifive: allocate sufficient memory for struct __prci_data 2020-06-25 15:04:13 -07:00
clocksource clocksource/drivers/timer-riscv: Use per-CPU timer interrupt 2020-06-09 19:11:22 -07:00
connector treewide: replace '---help---' in Kconfig files with 'help' 2020-06-14 01:57:21 +09:00
counter
cpufreq cpufreq: intel_pstate: Add one more OOB control bit 2020-06-23 17:24:32 +02:00
cpuidle cpuidle: Rearrange s2idle-specific idle state entry code 2020-06-25 13:52:53 +02:00
crypto Merge branch 'linus' of git://git.kernel.org/pub/scm/linux/kernel/git/herbert/crypto-2.6 2020-06-21 10:01:03 -07:00
dax block: remove the bd_queue field from struct block_device 2020-07-01 08:08:20 -06:00
dca
devfreq PM / devfreq: Use lockdep asserts instead of manual checks for locked mutex 2020-05-28 18:02:40 +09:00
dio maccess: rename probe_kernel_{read,write} to copy_{from,to}_kernel_nofault 2020-06-17 10:57:41 -07:00
dma dmaengine: tegra-apb: Replace zero-length array with flexible-array 2020-06-15 23:08:32 -05:00
dma-buf dma-buf: Move dma_buf_release() from fops to dentry_ops 2020-06-25 16:05:40 +05:30
edac EDAC/amd64: Read back the scrub rate PCI register on F15h 2020-06-18 20:25:25 +02:00
eisa treewide: replace '---help---' in Kconfig files with 'help' 2020-06-14 01:57:21 +09:00
extcon extcon: arizona: Fix runtime PM imbalance on error 2020-05-29 17:36:02 +09:00
firewire firewire: ohci: Replace zero-length array with flexible-array 2020-06-15 23:08:31 -05:00
firmware ARM: SoC fixes for v5.8 2020-06-28 14:55:18 -07:00
fpga FPGA Manager fixes for 5.8-rc1 2020-06-26 17:26:31 +02:00
fsi treewide: replace '---help---' in Kconfig files with 'help' 2020-06-14 01:57:21 +09:00
gnss treewide: replace '---help---' in Kconfig files with 'help' 2020-06-14 01:57:21 +09:00
gpio treewide: replace '---help---' in Kconfig files with 'help' 2020-06-14 01:57:21 +09:00
gpu * dma-buf: fix a use-after-free bug 2020-07-03 11:18:21 +10:00
greybus treewide: replace '---help---' in Kconfig files with 'help' 2020-06-14 01:57:21 +09:00
hid treewide: replace '---help---' in Kconfig files with 'help' 2020-06-14 01:57:21 +09:00
hsi treewide: replace '---help---' in Kconfig files with 'help' 2020-06-14 01:57:21 +09:00
hv Drivers: hv: Change flag to write log level in panic msg to false 2020-06-29 10:30:35 +00:00
hwmon hwmon: (pmbus) fix a typo in Kconfig SENSORS_IR35221 option 2020-07-02 17:43:14 -07:00
hwspinlock
hwtracing stm class: Replace zero-length array with flexible-array 2020-06-15 23:08:32 -05:00
i2c i2c: mlxcpld: check correct size of maximum RECV_LEN packet 2020-07-04 08:20:38 +02:00
i3c
ide treewide: replace '---help---' in Kconfig files with 'help' 2020-06-14 01:57:21 +09:00
idle
iio treewide: replace '---help---' in Kconfig files with 'help' 2020-06-14 01:57:21 +09:00
infiniband IB/hfi1: Add atomic triggered sleep/wakeup 2020-06-24 16:13:38 -03:00
input maccess: rename probe_kernel_{read,write} to copy_{from,to}_kernel_nofault 2020-06-17 10:57:41 -07:00
interconnect More power management updates for 5.8-rc1 2020-06-10 14:04:39 -07:00
iommu iommu/vt-d: Fix misuse of iommu_domain_identity_map() 2020-06-23 10:08:32 +02:00
ipack treewide: replace '---help---' in Kconfig files with 'help' 2020-06-14 01:57:21 +09:00
irqchip irqchip/gic: Atomically update affinity 2020-06-21 15:24:46 +01:00
isdn treewide: replace '---help---' in Kconfig files with 'help' 2020-06-14 01:57:21 +09:00
leds LEDs pull request for 5.8-rc1. 2020-06-04 11:03:45 -07:00
lightnvm block: rename generic_make_request to submit_bio_noacct 2020-07-01 07:27:24 -06:00
macintosh treewide: replace '---help---' in Kconfig files with 'help' 2020-06-14 01:57:21 +09:00
mailbox mailbox: qcom: Add ipq6018 apcs compatible 2020-06-10 22:43:57 -05:00
mcb
md md/raid5: set default stripe_size as 4096 2020-07-21 17:18:17 -07:00
media media: omap3isp: remove cacheflush.h 2020-06-26 00:27:37 -07:00
memory Merge branch 'baikal/drivers' into arm/drivers 2020-05-28 14:18:11 +02:00
memstick
message scsi: mptfusion: Don't use GFP_ATOMIC for larger DMA allocations 2020-06-26 22:51:53 -04:00
mfd mfd: mt6360: Fix register driver NULL pointer by adding driver name 2020-06-16 09:32:43 +01:00
misc habanalabs: increase h/w timer when checking idle 2020-06-24 12:35:23 +03:00
mmc blk-mq: move failure injection out of blk_mq_complete_request 2020-06-24 09:15:57 -06:00
most
mtd This pull request contains a single change for UBI: 2020-06-10 13:24:40 -07:00
mux
net wil6210: account for napi_gro_receive never returning GRO_DROP 2020-06-25 16:16:21 -07:00
nfc treewide: replace '---help---' in Kconfig files with 'help' 2020-06-14 01:57:21 +09:00
ntb NTB: perf: Fix race condition when run with ntb_test 2020-06-05 20:02:09 -04:00
nubus
nvdimm block: move ->make_request_fn to struct block_device_operations 2020-07-01 07:27:24 -06:00
nvme block: add max_active_zones to blk-sysfs 2020-07-15 14:26:11 -06:00
nvmem
of of: of_mdio: Correct loop scanning logic 2020-06-19 13:39:00 -07:00
opp treewide: replace '---help---' in Kconfig files with 'help' 2020-06-14 01:57:21 +09:00
oprofile oprofile: Replace zero-length array with flexible-array 2020-06-15 23:08:32 -05:00
parisc
parport treewide: replace '---help---' in Kconfig files with 'help' 2020-06-14 01:57:21 +09:00
pci Kbuild updates for v5.8 (2nd) 2020-06-13 13:29:16 -07:00
pcmcia treewide: replace '---help---' in Kconfig files with 'help' 2020-06-14 01:57:21 +09:00
perf arm64 merge window fixes for -rc1 2020-06-11 12:53:23 -07:00
phy phy: samsung: Replace zero-length array with flexible-array 2020-06-15 23:08:32 -05:00
pinctrl pinctrl: single: fix function name in documentation 2020-06-20 22:41:32 +02:00
platform Kbuild updates for v5.8 (2nd) 2020-06-13 13:29:16 -07:00
pnp treewide: replace '---help---' in Kconfig files with 'help' 2020-06-14 01:57:21 +09:00
power power supply and reset changes for the v5.8 series 2020-06-10 11:28:35 -07:00
powercap Kbuild updates for v5.8 (2nd) 2020-06-13 13:29:16 -07:00
pps treewide: replace '---help---' in Kconfig files with 'help' 2020-06-14 01:57:21 +09:00
ps3
ptp treewide: replace '---help---' in Kconfig files with 'help' 2020-06-14 01:57:21 +09:00
pwm pwm: Add missing "CONFIG_" prefix 2020-06-04 19:09:28 +02:00
rapidio rapidio: Replace zero-length array with flexible-array 2020-06-15 23:08:32 -05:00
ras
regulator regulator: mt6358: Remove BROKEN dependency 2020-06-17 13:01:19 +01:00
remoteproc remoteproc updates for v5.8 2020-06-08 13:01:08 -07:00
reset Char/Misc driver patches for 5.8-rc1 2020-06-07 10:59:32 -07:00
rpmsg remoteproc updates for v5.8 2020-06-08 13:01:08 -07:00
rtc RTC for 5.8 2020-06-07 16:11:23 -07:00
s390 s390/dasd: Use struct_size() helper 2020-07-15 08:47:11 -06:00
sbus treewide: replace '---help---' in Kconfig files with 'help' 2020-06-14 01:57:21 +09:00
scsi block: add max_active_zones to blk-sysfs 2020-07-15 14:26:11 -06:00
sfi treewide: replace '---help---' in Kconfig files with 'help' 2020-06-14 01:57:21 +09:00
sh
siox
slimbus
soc ARM: OMAP fixes for v5.8 2020-06-28 14:57:14 -07:00
soundwire
spi spi: Fixes for v5.8 2020-06-29 10:10:16 -07:00
spmi
ssb
staging Staging: rtl8723bs: prevent buffer overflow in update_sta_support_rate() 2020-06-16 21:25:38 +02:00
target Kbuild updates for v5.8 (2nd) 2020-06-13 13:29:16 -07:00
tc
tee mmap locking API: use coccinelle to convert mmap_sem rwsem call sites 2020-06-09 09:39:14 -07:00
thermal thermal/drivers/rcar_gen3: Fix undefined temperature if negative 2020-06-29 12:15:34 +02:00
thunderbolt USB/PHY driver updates for 5.8-rc1 2020-06-07 09:42:16 -07:00
tty Linux 5.8-rc4 2020-07-08 08:02:13 -06:00
uio
usb USB fixes for 5.8-rc3 2020-06-27 13:12:10 -07:00
vdpa vdpa: fix typos in the comments for __vdpa_alloc_device() 2020-06-22 12:34:21 -04:00
vfio vfio/pci: Fix SR-IOV VF handling with MMIO blocking 2020-06-25 11:04:23 -06:00
vhost tools/virtio: Add --reset 2020-06-22 12:34:21 -04:00
video Short summary of fixes pull (less than what git shortlog provides): 2020-06-26 13:49:17 +10:00
virt treewide: replace '---help---' in Kconfig files with 'help' 2020-06-14 01:57:21 +09:00
virtio virtio-mem: add memory via add_memory_driver_managed() 2020-06-22 12:34:21 -04:00
visorbus treewide: replace '---help---' in Kconfig files with 'help' 2020-06-14 01:57:21 +09:00
vlynq
vme treewide: replace '---help---' in Kconfig files with 'help' 2020-06-14 01:57:21 +09:00
w1 w1: Replace zero-length array with flexible-array 2020-06-15 23:08:32 -05:00
watchdog treewide: replace '---help---' in Kconfig files with 'help' 2020-06-14 01:57:21 +09:00
xen xen: branch for v5.8-rc4 2020-07-03 23:58:12 -07:00
zorro treewide: replace '---help---' in Kconfig files with 'help' 2020-06-14 01:57:21 +09:00
Kconfig
Makefile