linux_dsm_epyc7002/include
Jens Axboe fe0f07d08e direct-io: only inc/dec inode->i_dio_count for file systems
do_blockdev_direct_IO() increments and decrements the inode
->i_dio_count for each IO operation. It does this to protect against
truncate of a file. Block devices don't need this sort of protection.

For a capable multiqueue setup, this atomic int is the only shared
state between applications accessing the device for O_DIRECT, and it
presents a scaling wall for that. In my testing, as much as 30% of
system time is spent incrementing and decrementing this value. A mixed
read/write workload improved from ~2.5M IOPS to ~9.6M IOPS, with
better latencies too. Before:

clat percentiles (usec):
 |  1.00th=[   33],  5.00th=[   34], 10.00th=[   34], 20.00th=[   34],
 | 30.00th=[   34], 40.00th=[   34], 50.00th=[   35], 60.00th=[   35],
 | 70.00th=[   35], 80.00th=[   35], 90.00th=[   37], 95.00th=[   80],
 | 99.00th=[   98], 99.50th=[  151], 99.90th=[  155], 99.95th=[  155],
 | 99.99th=[  165]

After:

clat percentiles (usec):
 |  1.00th=[   95],  5.00th=[  108], 10.00th=[  129], 20.00th=[  149],
 | 30.00th=[  155], 40.00th=[  161], 50.00th=[  167], 60.00th=[  171],
 | 70.00th=[  177], 80.00th=[  185], 90.00th=[  201], 95.00th=[  270],
 | 99.00th=[  390], 99.50th=[  398], 99.90th=[  418], 99.95th=[  422],
 | 99.99th=[  438]

In other setups, Robert Elliott reported seeing good performance
improvements:

https://lkml.org/lkml/2015/4/3/557

The more applications accessing the device, the worse it gets.

Add a new direct-io flags, DIO_SKIP_DIO_COUNT, which tells
do_blockdev_direct_IO() that it need not worry about incrementing
or decrementing the inode i_dio_count for this caller.

Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Christoph Hellwig <hch@lst.de>
Cc: Theodore Ts'o <tytso@mit.edu>
Cc: Elliott, Robert (Server Storage) <elliott@hp.com>
Cc: Al Viro <viro@zeniv.linux.org.uk>
Signed-off-by: Jens Axboe <axboe@fb.com>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2015-04-24 15:45:28 -04:00
..
acpi Merge branch 'next' of git://git.kernel.org/pub/scm/linux/kernel/git/rzhang/linux 2015-02-19 11:28:36 -08:00
asm-generic OK, this has the big virtio 1.0 implementation, as specified by OASIS. 2015-02-18 09:24:01 -08:00
clocksource
crypto crypto: af_alg - Allow to link sgl 2015-03-23 16:41:37 -04:00
drm drm/ttm: device address space != CPU address space 2015-03-05 09:04:39 +10:00
dt-bindings ARM: dts: am43xx: fix SLEWCTRL_FAST pinctrl binding 2015-03-06 09:21:03 -08:00
keys
kvm arm/arm64: KVM: Keep elrsr/aisr in sync with software model 2015-03-14 13:42:07 +01:00
linux direct-io: only inc/dec inode->i_dio_count for file systems 2015-04-24 15:45:28 -04:00
math-emu
media
memory
misc
net 9p: switch p9_client_read() to passing struct iov_iter * 2015-04-11 22:28:27 -04:00
pcmcia
ras
rdma
rxrpc RxRPC: Handle VERSION Rx protocol packets 2015-04-01 16:31:26 +01:00
scsi Merge branch 'for-3.20/core' of git://git.kernel.dk/linux-block 2015-02-12 14:13:23 -08:00
soc pm: at91: Workaround DDRSDRC self-refresh bug with LPDDR1 memories. 2015-03-03 19:43:59 +01:00
sound
target target: do not reject FUA CDBs when write cache is enabled but emulate_write_cache is 0 2015-03-19 23:26:46 -07:00
trace VFS: normal filesystems (and lustre): d_inode() annotations 2015-04-15 15:06:57 -04:00
uapi Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net 2015-04-06 22:34:15 -04:00
video OMAPDSS: fix regression with display sysfs files 2015-02-26 10:23:15 +02:00
xen xen: Remove trailing semicolon from xenbus_register_frontend() definition 2015-03-02 10:38:59 +00:00
Kbuild