Davidlohr Bueso 002b343669 fs/epoll: loosen irq safety in ep_scan_ready_list()
Patch series "fs/epoll: loosen irq safety when possible".

Both patches replace saving+restoring interrupts when taking the ep->lock
(now the waitqueue lock) with just disabling local irqs.  This shows
immediate performance benefits in patch 1 for an epoll workload running on
Xen.  The main concern with this sort of change in epoll is
ep_poll_callback(), which is passed to the wait queue wakeup and is very
often called in irq context; this patch does not touch that call.

Patches have been tested pretty heavily with the customer workload,
microbenchmarks, ltp testcases and two high level workloads that use epoll
under the hood: nginx and libevent benchmarks.

This patch (of 2):

Saving and restoring interrupts in ep_scan_ready_list() is overkill,
as it is never called with interrupts disabled.  Loosen this to simply
disabling local irqs, which helps archs where managing irqs is
expensive as well as virtual environments.  This patch yields some
throughput improvements on an epoll-intensive workload running on a
single Xen DomU.

1 Job    7500  -->   8800 enq/s  (+17%)
2 Jobs  14000  -->  15200 enq/s  (+8%)
3 Jobs  20500  -->  22300 enq/s  (+8%)
4 Jobs  25000  -->  28000 enq/s  (+8-12%)
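
To illustrate why the looser variant is only safe when the caller is known
to run with interrupts enabled, here is a small userspace model.  It is not
kernel code: a single boolean stands in for the CPU's interrupt-enable
state, and every model_* and section_* name is hypothetical, loosely
mirroring the save/restore and disable/enable styles of locking.

```c
#include <assert.h>
#include <stdbool.h>

/* Userspace model only: one flag stands in for the interrupt-enable state. */
static bool irqs_enabled = true;

/* Save/restore style: remember the current state, then disable. */
static bool model_irq_save(void)
{
	bool flags = irqs_enabled;

	irqs_enabled = false;
	return flags;
}

/* Put back whatever state was saved on entry. */
static void model_irq_restore(bool flags)
{
	irqs_enabled = flags;
}

/* Disable/enable style: unconditional, no state is remembered. */
static void model_irq_disable(void) { irqs_enabled = false; }
static void model_irq_enable(void)  { irqs_enabled = true; }

/* Critical section in the save/restore style; returns the state on exit. */
static bool section_irqsave(bool enabled_on_entry)
{
	bool flags;

	irqs_enabled = enabled_on_entry;
	flags = model_irq_save();
	/* ... protected work would go here ... */
	model_irq_restore(flags);
	return irqs_enabled;
}

/* Critical section in the disable/enable style; returns the state on exit. */
static bool section_irq(bool enabled_on_entry)
{
	irqs_enabled = enabled_on_entry;
	model_irq_disable();
	/* ... protected work would go here ... */
	model_irq_enable();
	return irqs_enabled;
}

/* Walk through the four entry/exit combinations of the model. */
static void check_model(void)
{
	assert(section_irqsave(true) == true);
	assert(section_irqsave(false) == false); /* caller's state preserved */
	assert(section_irq(true) == true);
	assert(section_irq(false) == true);      /* caller's state clobbered */
}
```

In this model the two styles only differ when the section is entered with
interrupts already disabled, which the changelog says never happens for
ep_scan_ready_list(); that is the case where the cheaper variant would
wrongly re-enable interrupts.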

On bare metal:

The following was measured on a 2-socket, 40-core (ht) IvyBridge with a few
workloads.  Unfortunately I don't have a Xen environment, and for the Xen
results I do have (the numbers in patch 1) I don't have the actual
workload, so the two cannot be compared directly.

1) Different configurations were used for an epoll_wait (pipes io)
   microbench (http://linux-scalability.org/epoll/epoll-test.c), which
   shows around a 7-10% improvement in the total number of epoll_wait()
   loops completed when using both regular and nested epolls.  These are
   very raw numbers, but measurable nonetheless.

   # threads    vanilla      dirty
        1       1677717    1805587
        2       1660510    1854064
        4       1610184    1805484
        8       1577696    1751222
       16       1568837    1725299
       32       1291532    1378463
       64        752584     787368

   Note that stddev is pretty small.

2) Another pipe test, which shows no real measurable improvement.
   (http://www.xmailserver.org/linux-patches/pipetest.c)
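
For readers unfamiliar with this kind of test, the following is a minimal,
single-threaded sketch of an epoll_wait() loop over a pipe.  It is not the
linked epoll-test.c or pipetest.c, and it uses neither threads nor nested
epolls; the helper name count_epoll_wait_loops() is hypothetical.  It only
shows what "number of epoll_wait() loops" counts.

```c
#include <sys/epoll.h>
#include <unistd.h>

/*
 * Hypothetical helper: register the read end of a pipe with an epoll
 * instance, make it readable 'iterations' times, and count how many
 * epoll_wait() calls report it.  Returns -1 if setup fails.
 */
static long count_epoll_wait_loops(long iterations)
{
	struct epoll_event ev = { .events = EPOLLIN };
	struct epoll_event ready;
	int pipefd[2];
	long loops = 0;
	char byte = 'x';
	int epfd;

	epfd = epoll_create1(0);
	if (epfd < 0)
		return -1;
	if (pipe(pipefd) < 0) {
		close(epfd);
		return -1;
	}

	ev.data.fd = pipefd[0];
	if (epoll_ctl(epfd, EPOLL_CTL_ADD, pipefd[0], &ev) < 0) {
		loops = -1;
		goto cleanup;
	}

	for (long i = 0; i < iterations; i++) {
		/* Make the read end readable. */
		if (write(pipefd[1], &byte, 1) != 1)
			break;
		/* Wait up to one second for the readiness event. */
		if (epoll_wait(epfd, &ready, 1, 1000) != 1)
			break;
		/* Drain the byte so the next iteration starts empty. */
		if (read(ready.data.fd, &byte, 1) != 1)
			break;
		loops++;
	}

cleanup:
	close(pipefd[0]);
	close(pipefd[1]);
	close(epfd);
	return loops;
}
```

Timing a fixed wall-clock interval instead of a fixed iteration count, and
running one such loop per thread, would be closer in spirit to the
per-thread totals in the table above.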

Link: http://lkml.kernel.org/r/20180720172956.2883-2-dave@stgolabs.net
Signed-off-by: Davidlohr Bueso <dbueso@suse.de>
Cc: Jason Baron <jbaron@akamai.com>
Cc: Al Viro <viro@zeniv.linux.org.uk>
Cc: Peter Zijlstra <peterz@infradead.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2018-08-22 10:52:47 -07:00
arch module: use relative references for __ksymtab entries 2018-08-22 10:52:47 -07:00
block for-4.19/block-20180812 2018-08-14 10:23:25 -07:00
certs Replace magic for trusting the secondary keyring with #define 2018-08-16 09:57:20 -07:00
crypto DMAengine updates for v4.19-rc1 2018-08-18 15:55:59 -07:00
Documentation kernel/hung_task.c: allow to set checking interval separately from timeout 2018-08-22 10:52:47 -07:00
drivers PCI: Add support for relative addressing in quirk tables 2018-08-22 10:52:47 -07:00
firmware kbuild: remove all dummy assignments to obj- 2017-11-18 11:46:06 +09:00
fs fs/epoll: loosen irq safety in ep_scan_ready_list() 2018-08-22 10:52:47 -07:00
include kernel: tracepoints: add support for relative references 2018-08-22 10:52:47 -07:00
init init: allow initcall tables to be emitted using relative references 2018-08-22 10:52:47 -07:00
ipc Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-next 2018-08-15 15:04:25 -07:00
kernel sched/wait: assert the wait_queue_head lock is held in __wake_up_common 2018-08-22 10:52:47 -07:00
lib Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net 2018-08-19 11:51:45 -07:00
LICENSES LICENSES: Add Linux-OpenIB license text 2018-04-27 16:41:53 -06:00
mm bdi: use irqsave variant of refcount_dec_and_lock() 2018-08-22 10:52:46 -07:00
net The main things are support for cephx v2 authentication protocol and 2018-08-20 18:26:55 -07:00
samples samples/bpf: all XDP samples should unload xdp/bpf prog on SIGTERM 2018-08-16 21:55:32 +02:00
scripts spelling.txt: add more spellings to spelling.txt 2018-08-22 10:52:47 -07:00
security init: allow initcall tables to be emitted using relative references 2018-08-22 10:52:47 -07:00
sound DMAengine updates for v4.19-rc1 2018-08-18 15:55:59 -07:00
tools proc: test /proc/thread-self symlink 2018-08-22 10:52:45 -07:00
usr kbuild: rename built-in.o to built-in.a 2018-03-26 02:01:19 +09:00
virt mm, oom: distinguish blockable mode for mmu notifiers 2018-08-22 10:52:44 -07:00
.clang-format clang-format: Set IndentWrappedFunctionNames false 2018-08-01 18:38:51 +02:00
.cocciconfig scripts: add Linux .cocciconfig for coccinelle 2016-07-22 12:13:39 +02:00
.get_maintainer.ignore Add hch to .get_maintainer.ignore 2015-08-21 14:30:10 -07:00
.gitattributes .gitattributes: set git diff driver for C source code files 2016-10-07 18:46:30 -07:00
.gitignore Kbuild updates for v4.17 (2nd) 2018-04-15 17:21:30 -07:00
.mailmap Merge branch 'linus/master' into rdma.git for-next 2018-08-16 14:21:29 -06:00
COPYING COPYING: use the new text with points to the license files 2018-03-23 12:41:45 -06:00
CREDITS 9p: remove Ron Minnich from MAINTAINERS 2018-08-17 16:20:26 -07:00
Kbuild Kbuild updates for v4.15 2017-11-17 17:45:29 -08:00
Kconfig kconfig: move the "Executable file formats" menu to fs/Kconfig.binfmt 2018-08-02 08:06:55 +09:00
MAINTAINERS - New Drivers 2018-08-20 15:38:44 -07:00
Makefile Updates for v4.19: 2018-08-20 18:32:00 -07:00
README Docs: Added a pointer to the formatted docs to README 2018-03-21 09:02:53 -06:00

Linux kernel
============

There are several guides for kernel developers and users. These guides can
be rendered in a number of formats, like HTML and PDF. Please read
Documentation/admin-guide/README.rst first.

In order to build the documentation, use ``make htmldocs`` or
``make pdfdocs``.  The formatted documentation can also be read online at:

    https://www.kernel.org/doc/html/latest/

There are various text files in the Documentation/ subdirectory,
several of them using the reStructuredText markup notation.
See Documentation/00-INDEX for a list of what is contained in each file.

Please read the Documentation/process/changes.rst file, as it contains the
requirements for building and running the kernel, and information about
the problems which may result from upgrading your kernel.