Go to file
Sunil Muthuswamy cb359b6041 hvsock: fix epollout hang from race condition
Currently, hvsock can enter into a state where epoll_wait on EPOLLOUT will
not return even when the hvsock socket is writable, under some race
condition. This can happen under the following sequence:
- fd = socket(hvsocket)
- fd_out = dup(fd)
- fd_in = dup(fd)
- start a writer thread that writes data to fd_out with a combination of
  epoll_wait(fd_out, EPOLLOUT) and
- start a reader thread that reads data from fd_in with a combination of
  epoll_wait(fd_in, EPOLLIN)
- On the host, there are two threads that are reading/writing data to the
  hvsocket

stack:
hvs_stream_has_space
hvs_notify_poll_out
vsock_poll
sock_poll
ep_poll

Race condition:
check for epollout from ep_poll():
	assume no writable space in the socket
	hvs_stream_has_space() returns 0
check for epollin from ep_poll():
	assume socket has some free space < HVS_PKT_LEN(HVS_SEND_BUF_SIZE)
	hvs_stream_has_space() will clear the channel pending send size
	host will not notify the guest because the pending send size has
		been cleared and so the hvsocket will never mark the
		socket writable

Now, the EPOLLOUT will never return even if the socket write buffer is
empty.

The fix is to set the pending size to the default size and never change it.
This way the host will always notify the guest whenever the writable space
is bigger than the pending size. The host is already optimized to *only*
notify the guest when the pending size threshold boundary is crossed and
not everytime.

This change also reduces the cpu usage somewhat since hv_stream_has_space()
is in the hotpath of send:
vsock_stream_sendmsg()->hv_stream_has_space()
Earlier hv_stream_has_space was setting/clearing the pending size on every
call.

Signed-off-by: Sunil Muthuswamy <sunilmut@microsoft.com>
Reviewed-by: Dexuan Cui <decui@microsoft.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2019-06-18 21:41:12 -04:00
arch Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net 2019-06-17 15:55:34 -07:00
block blk-mq: remove WARN_ON(!q->elevator) from blk_mq_sched_free_requests 2019-06-13 03:05:58 -06:00
certs treewide: Replace GPLv2 boilerplate/reference with SPDX - rule 36 2019-05-24 17:27:11 +02:00
crypto SPDX update for 5.2-rc4 2019-06-08 12:52:42 -07:00
Documentation Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net 2019-06-17 15:55:34 -07:00
drivers net: lio_core: fix potential sign-extension overflow on large shift 2019-06-18 21:00:40 -04:00
fs fs/namespace: fix unprivileged mount propagation 2019-06-17 17:36:09 -04:00
include ip6_tunnel: allow not to count pkts on tstats by passing dev as NULL 2019-06-18 20:48:45 -04:00
init Char/Misc driver fixes for 5.2-rc4 2019-06-08 12:50:36 -07:00
ipc treewide: Replace GPLv2 boilerplate/reference with SPDX - rule 441 2019-06-05 17:37:17 +02:00
kernel Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net 2019-06-17 15:55:34 -07:00
lib lib/genalloc: introduce chunk owners 2019-06-13 17:34:56 -10:00
LICENSES LICENSES: Rename other to deprecated 2019-05-03 06:34:32 -06:00
mm Merge branch 'x86-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip 2019-06-16 07:28:14 -10:00
net hvsock: fix epollout hang from race condition 2019-06-18 21:41:12 -04:00
samples Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net 2019-06-17 15:55:34 -07:00
scripts scripts/decode_stacktrace.sh: prefix addr2line with $CROSS_COMPILE 2019-06-13 17:34:56 -10:00
security Smack: Restore the smackfsdef mount option and add missing prefixes 2019-06-14 14:25:04 -10:00
sound sound fixes for 5.2-rc5 2019-06-14 05:37:06 -10:00
tools Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net 2019-06-17 15:55:34 -07:00
usr user/Makefile: Fix typo and capitalization in comment section 2018-12-11 00:18:03 +09:00
virt treewide: Replace GPLv2 boilerplate/reference with SPDX - rule 333 2019-06-05 17:37:06 +02:00
.clang-format Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net 2019-04-17 11:26:25 -07:00
.cocciconfig
.get_maintainer.ignore Opt out of scripts/get_maintainer.pl 2019-05-16 10:53:40 -07:00
.gitattributes .gitattributes: set git diff driver for C source code files 2016-10-07 18:46:30 -07:00
.gitignore .gitignore: exclude .get_maintainer.ignore and .gitattributes 2019-05-18 11:49:54 +09:00
.mailmap A reasonably busy cycle for docs, including: 2019-05-08 12:42:50 -07:00
COPYING COPYING: use the new text with points to the license files 2018-03-23 12:41:45 -06:00
CREDITS MAINTAINERS: Farewell Martin Schwidefsky 2019-05-31 10:14:11 +02:00
Kbuild Kbuild updates for v5.1 2019-03-10 17:48:21 -07:00
Kconfig kconfig: move the "Executable file formats" menu to fs/Kconfig.binfmt 2018-08-02 08:06:55 +09:00
MAINTAINERS RISC-V patches for v5.2-rc6 2019-06-17 10:34:03 -07:00
Makefile Linux 5.2-rc5 2019-06-16 08:49:45 -10:00
README Drop all 00-INDEX files from Documentation/ 2018-09-09 15:08:58 -06:00

Linux kernel
============

There are several guides for kernel developers and users. These guides can
be rendered in a number of formats, like HTML and PDF. Please read
Documentation/admin-guide/README.rst first.

In order to build the documentation, use ``make htmldocs`` or
``make pdfdocs``.  The formatted documentation can also be read online at:

    https://www.kernel.org/doc/html/latest/

There are various text files in the Documentation/ subdirectory,
several of them using the Restructured Text markup notation.

Please read the Documentation/process/changes.rst file, as it contains the
requirements for building and running the kernel, and information about
the problems which may result by upgrading your kernel.