Stanse found that in init_vqs, memory is leaked under certain
circumstanses (the fail path order is incorrect). Fix that by checking
allocations in one turn and free all of them at once if some fails
(some may be NULL, but this is OK).
Signed-off-by: Jiri Slaby <jslaby@suse.cz>
Cc: Amit Shah <amit.shah@redhat.com>
Cc: virtualization@lists.linux-foundation.org
Cc: Greg Kroah-Hartman <gregkh@suse.de>
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
The ports are char devices; do not have seeking capabilities. Calling
nonseekable_open() from the fops_open() call and setting the llseek fops
pointer to no_llseek ensures an lseek() call from userspace returns
-ESPIPE.
Signed-off-by: Amit Shah <amit.shah@redhat.com>
CC: Arnd Bergmann <arnd@arndb.de>
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
If a port has registered for SIGIO signals, let the application
know that the port is getting unplugged.
Signed-off-by: Amit Shah <amit.shah@redhat.com>
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
Send a SIGIO signal when new data arrives on a port. This is sent only
when the process has requested for the signal to be sent using fcntl().
Signed-off-by: Amit Shah <amit.shah@redhat.com>
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
A process can request for SIGIO on host connect / disconnect events
using the O_ASYNC file flag using fcntl().
If that's requested, and if the guest-side connection for the port is
open, any host-side open/close events for that port will raise a SIGIO.
The process can then use poll() within the signal handler to find out
which port triggered the signal.
Signed-off-by: Amit Shah <amit.shah@redhat.com>
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
Explain in a comment why there's no need to reference-count the portdev
struct: when a device is yanked out, we can't do anything more with it
anyway so just give up doing anything more with the data or the vqs and
exit cleanly.
Signed-off-by: Amit Shah <amit.shah@redhat.com>
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
When a port got hot-unplugged, when a port was open, any file operation
after the unplugging resulted in a crash. This is fixed by ref-counting
the port structure, and releasing it only when the file is closed.
This splits the unplug operation in two parts: first marks the port
as unavailable, removes all the buffers in the vqs and removes the port
from the per-device list of ports. The second stage, invoked when all
references drop to zero, releases the chardev and frees all other memory.
Signed-off-by: Amit Shah <amit.shah@redhat.com>
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
This moves to using cdev on the heap instead of it being embedded in the
ports struct. This helps individual refcounting and will allow us to
properly remove cdev structs after hot-unplugs and close operations.
Signed-off-by: Amit Shah <amit.shah@redhat.com>
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
To convert to using cdev as a pointer to avoid kref troubles, we have to
use a different method to get to a port from an inode than the current
container_of method.
Add find_port_by_devt() that looks up all portdevs and ports with those
portdevs to find the right port.
Signed-off-by: Amit Shah <amit.shah@redhat.com>
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
The virtio_console.c driver is capable of handling multiple devices at a
time. Maintain a list of devices for future traversal.
Signed-off-by: Amit Shah <amit.shah@redhat.com>
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
When a port is removed, we have to assume the port is gone. So a
success/failure return value doesn't make sense.
Signed-off-by: Amit Shah <amit.shah@redhat.com>
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
When a port is hot-unplugged while an app was blocked on a write() call,
the call was unblocked but would not get an error returned.
Return -ENODEV to ensure the app knows the port has gone away.
Signed-off-by: Amit Shah <amit.shah@redhat.com>
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
When a port is hot-unplugged while an app was blocked on a read() call,
the call was unblocked but would not get an error returned.
Return -ENODEV to ensure the app knows the port has gone away.
Signed-off-by: Amit Shah <amit.shah@redhat.com>
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
When a port is hot-unplugged while an app is blocked on poll(), unblock
the poll() and return.
Signed-off-by: Amit Shah <amit.shah@redhat.com>
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
If a chardev is closed, any blocked read / poll calls should just return
and not attempt to use other state.
Signed-off-by: Amit Shah <amit.shah@redhat.com>
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
A portdev may have been hot-unplugged while a port was open()ed. Skip
sending control messages when the portdev isn't valid.
Signed-off-by: Amit Shah <amit.shah@redhat.com>
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
If a portdev isn't using multiport support, it won't have any control vq
data to remove.
Signed-off-by: Amit Shah <amit.shah@redhat.com>
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
The virtqueues should be disabled before attempting to remove the
device.
Signed-off-by: Amit Shah <amit.shah@redhat.com>
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
If the host is slow in reading data or doesn't read data at all,
blocking write calls not only blocked the program that called write()
but the entire guest itself.
To overcome this, let's not block till the host signals it has given
back the virtio ring element we passed it. Instead, send the buffer to
the host and return to userspace. This operation then becomes similar
to how non-blocking writes work, so let's use the existing code for this
path as well.
This code change also ensures blocking write calls do get blocked if
there's not enough room in the virtio ring as well as they don't return
-EAGAIN to userspace.
Signed-off-by: Amit Shah <amit.shah@redhat.com>
Acked-by: Hans de Goede <hdegoede@redhat.com>
CC: stable@kernel.org
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
A userspace could submit a buffer with 0 length to be written to the
host. Prevent such a situation.
This was not needed previously, but recent changes in the way write()
works exposed this condition to trigger a virtqueue event to the host,
causing a NULL buffer to be sent across.
Signed-off-by: Amit Shah <amit.shah@redhat.com>
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
CC: stable@kernel.org
I found this while working on a Linux agent for spice, the symptom I was
seeing was select blocking on the spice vdagent virtio serial port even
though there were messages queued up there.
virtio_console's port_fops_poll checks port->inbuf != NULL to determine
if read won't block. However if an application reads enough bytes from
inbuf through port_fops_read, to empty the current port->inbuf,
port->inbuf will be NULL even though there may be buffers left in the
virtqueue.
This causes poll() to block even though there is data to be read,
this patch fixes this by using will_read_block(port) instead of the
port->inbuf != NULL check.
Signed-off-By: Hans de Goede <hdegoede@redhat.com>
Signed-off-by: Amit Shah <amit.shah@redhat.com>
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
Cc: stable@kernel.org
When a program that has a virtio port opened and blocked for a write
operation, a port hot-unplug event will later led to a crash when
SIGTERM was sent to the program. Fix that.
Signed-off-by: Amit Shah <amit.shah@redhat.com>
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
When removing a port we don't check if a program was blocked for read.
This leads to a crash when SIGTERM is sent to the program after
hot-unplugging the port.
Signed-off-by: Amit Shah <amit.shah@redhat.com>
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
In each case, the first argument to send_control_msg or __send_control_msg,
respectively, has either not been successfully allocated or has been freed
at the point of the call. In the first case, the first argument, port, is
only used to access the portdev and id fields, in order to call
__send_control_msg. Thus it seems possible instead to call
__send_control_msg directly. In the second case, the call to
__send_control_msg is moved up to a place where it seems like the first
argument, portdev, has been initialized sufficiently to make the call to
__send_control_msg meaningful.
This has only been compile tested.
A simplified version of the semantic match that finds this problem is as
follows: (http://coccinelle.lip6.fr/)
// <smpl>
@free@
expression E;
position p;
@@
kfree@p(E)
@@
expression free.E, subE<=free.E, E1;
position free.p;
@@
kfree@p(E)
...
(
subE = E1
|
* E
)
// </smpl>
Signed-off-by: Julia Lawall <julia@diku.dk>
Acked-by: Amit Shah <amit.shah@redhat.com>
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
The VIRTIO_CONSOLE_RESIZE control message sent to us by the host now
contains the new {rows, cols} values for the console. This ensures each
console port gets its own size, and we don't depend on the config-space
rows and cols values at all now.
Signed-off-by: Amit Shah <amit.shah@redhat.com>
CC: Christian Borntraeger <borntraeger@de.ibm.com>
CC: linuxppc-dev@ozlabs.org
CC: Kusanagi Kouichi <slash@ac.auone-net.jp>
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
With support for multiple consoles, just using one {rows,cols} pair in
the config space is not going to suffice.
Store each console's size as part of the console struct.
This changes the behaviour for one case when multiport is not enabled:
when notifier_add_vio() is called, the console size is taken from that
of the last config-space update instead of fetching it afresh from the
config space.
Also add a helper to update the size in the console struct as we'll need
to use the same code to update the size via control messages when
multiport support is enabled.
Signed-off-by: Amit Shah <amit.shah@redhat.com>
CC: Christian Borntraeger <borntraeger@de.ibm.com>
CC: linuxppc-dev@ozlabs.org
CC: Kusanagi Kouichi <slash@ac.auone-net.jp>
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
When using multiport, we'll use control messages. Ensure we don't
accidentally update port 0 size on config interrupts.
Signed-off-by: Amit Shah <amit.shah@redhat.com>
CC: Christian Borntraeger <borntraeger@de.ibm.com>
CC: linuxppc-dev@ozlabs.org
CC: Kusanagi Kouichi <slash@ac.auone-net.jp>
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
If the host port is not open, a write() should either just return if the
file is opened in non-blocking mode, or block till the host port is
opened.
Also, don't spin till host consumes data for nonblocking ports. For
non-blocking ports, we can do away with the spinning and reclaim the
buffers consumed by the host on the next write call or on the condition
that'll make poll return.
Signed-off-by: Amit Shah <amit.shah@redhat.com>
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
We'll introduce a function that checks if write will block. Have
function names that are similar for the two cases.
Signed-off-by: Amit Shah <amit.shah@redhat.com>
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
If we're using multiport, there's no point in always creating a console
port. Create the console port only if the host doesn't support
multiport.
Signed-off-by: Amit Shah <amit.shah@redhat.com>
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
Instead of the host and guest independently enumerating ports, switch to
a control message to add ports where the host supplies the port number
so there's no ambiguity or a possibility of a race between the host and
the guest port numbers.
We now no longer need the 'nr_ports' config value. Since no kernel has
been released with the MULTIPORT changes yet, we have a chance to fiddle
with the config space without adding compatibility features.
This is beneficial for management software, which would now be able to
instantiate ports at known locations and avoid problems that arise with
implicit numbering in the host and the guest. This removes the 'guessing
game' part of it, and management software can now actually indicate
which id to spawn a particular port on.
Signed-off-by: Amit Shah <amit.shah@redhat.com>
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
We're going to use add_port() from handle_control_message() in the next
patch.
Move the add_port() and fill_queue(), which depends on it, above
handle_control_message() to avoid forward declarations.
Signed-off-by: Amit Shah <amit.shah@redhat.com>
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
We're going to switch to using control messages for port hot-plug and
initial port discovery. Remove the config work handler which handled
port hot-plug so far.
Signed-off-by: Amit Shah <amit.shah@redhat.com>
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
hvc_remove() has some bug which freezes other active hvc ports when one
port is removed.
So disable calling of hvc_remove() which deregisters a port with the
hvc_console.
If the hvc_console code calls into our get_chars() routine as a result
of a poll operation, we will return -EPIPE and the hvc_console code will
then do the necessary cleanup.
This call will be restored when the bug in hvc_remove() is found and
fixed.
Signed-off-by: Amit Shah <amit.shah@redhat.com>
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
hvc_console handles -EPIPE properly when the connection to the host is
lost.
Signed-off-by: Amit Shah <amit.shah@redhat.com>
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
The host may want to know and let management apps notify of port or
device add failures. Send a control message saying the device or port is
not ready in this case.
Signed-off-by: Amit Shah <amit.shah@redhat.com>
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
We will introduce control messages that operate on the device as a whole
rather than just ports. Make send_control_msg() a wrapper around
__send_control_msg() which does not need a valid port.
Signed-off-by: Amit Shah <amit.shah@redhat.com>
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
This reverts commit b7a413015d.
Multiport support was disabled for 2.6.34 because we wanted to introduce
a new ABI and since we didn't have any released kernel with the older
ABI and were out of the merge window, it didn't make sense keeping the
older ABI around.
Now we revert the patch disabling multiport and rework the ABI in the
following patches.
Signed-off-by: Amit Shah <amit.shah@redhat.com>
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
Switch virtio_console to new virtqueue_xxx wrappers.
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
Move MULTIPORT feature and related config changes
out of exported headers, and disable the feature
at runtime.
At this point, it seems less risky to keep code around
until we can enable it than rip it out completely.
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
The get_buf() API sets the second arg to the number of bytes *written*
by the other side; in this case it should be zero as these are output buffers.
lguest gets this right (obviously kvm's console doesn't), resulting in
continual buildup of console writes.
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
Acked-by: Amit Shah <amit.shah@redhat.com>
Currently early_put_chars is not used by virtio_console because it can
only be used once a port has been found, at which point it's too late
because it is no longer needed. This patch should fix it.
Acked-by: Christian Borntraeger <borntraeger@de.ibm.com>
Signed-off-by: Amit Shah <amit.shah@redhat.com>
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
percpu.h is included by sched.h and module.h and thus ends up being
included when building most .c files. percpu.h includes slab.h which
in turn includes gfp.h making everything defined by the two files
universally available and complicating inclusion dependencies.
percpu.h -> slab.h dependency is about to be removed. Prepare for
this change by updating users of gfp and slab facilities include those
headers directly instead of assuming availability. As this conversion
needs to touch large number of source files, the following script is
used as the basis of conversion.
http://userweb.kernel.org/~tj/misc/slabh-sweep.py
The script does the followings.
* Scan files for gfp and slab usages and update includes such that
only the necessary includes are there. ie. if only gfp is used,
gfp.h, if slab is used, slab.h.
* When the script inserts a new include, it looks at the include
blocks and try to put the new include such that its order conforms
to its surrounding. It's put in the include block which contains
core kernel includes, in the same order that the rest are ordered -
alphabetical, Christmas tree, rev-Xmas-tree or at the end if there
doesn't seem to be any matching order.
* If the script can't find a place to put a new include (mostly
because the file doesn't have fitting include block), it prints out
an error message indicating which .h file needs to be added to the
file.
The conversion was done in the following steps.
1. The initial automatic conversion of all .c files updated slightly
over 4000 files, deleting around 700 includes and adding ~480 gfp.h
and ~3000 slab.h inclusions. The script emitted errors for ~400
files.
2. Each error was manually checked. Some didn't need the inclusion,
some needed manual addition while adding it to implementation .h or
embedding .c file was more appropriate for others. This step added
inclusions to around 150 files.
3. The script was run again and the output was compared to the edits
from #2 to make sure no file was left behind.
4. Several build tests were done and a couple of problems were fixed.
e.g. lib/decompress_*.c used malloc/free() wrappers around slab
APIs requiring slab.h to be added manually.
5. The script was run on all .h files but without automatically
editing them as sprinkling gfp.h and slab.h inclusions around .h
files could easily lead to inclusion dependency hell. Most gfp.h
inclusion directives were ignored as stuff from gfp.h was usually
wildly available and often used in preprocessor macros. Each
slab.h inclusion directive was examined and added manually as
necessary.
6. percpu.h was updated not to include slab.h.
7. Build test were done on the following configurations and failures
were fixed. CONFIG_GCOV_KERNEL was turned off for all tests (as my
distributed build env didn't work with gcov compiles) and a few
more options had to be turned off depending on archs to make things
build (like ipr on powerpc/64 which failed due to missing writeq).
* x86 and x86_64 UP and SMP allmodconfig and a custom test config.
* powerpc and powerpc64 SMP allmodconfig
* sparc and sparc64 SMP allmodconfig
* ia64 SMP allmodconfig
* s390 SMP allmodconfig
* alpha SMP allmodconfig
* um on x86_64 SMP allmodconfig
8. percpu.h modifications were reverted so that it could be applied as
a separate patch and serve as bisection point.
Given the fact that I had only a couple of failures from tests on step
6, I'm fairly confident about the coverage of this conversion patch.
If there is a breakage, it's likely to be something in one of the arch
headers which should be easily discoverable easily on most builds of
the specific arch.
Signed-off-by: Tejun Heo <tj@kernel.org>
Guess-its-ok-by: Christoph Lameter <cl@linux-foundation.org>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Lee Schermerhorn <Lee.Schermerhorn@hp.com>
The console port could have been hot-unplugged. Check if it is valid
before working on it.
Signed-off-by: Amit Shah <amit.shah@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
When the host lets us know what 'name' a port is assigned, we create the
sysfs 'name' attribute. Generate a 'change' event after this so that
udev wakes up and acts on the rules for virtio-ports (currently there's
only one rule that creates a symlink from the 'name' to the actual char
device).
Signed-off-by: Amit Shah <amit.shah@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
We want to keep track of the number of buffers added to a vq. Use
nr_added_bufs instead of 'ret'.
Also, the users of fill_queue() overloaded a local 'err' variable to
check the numbers of buffers allocated. Use nr_added_bufs instead of
err.
Signed-off-by: Amit Shah <amit.shah@redhat.com>
Reported-by: Juan Quintela <quintela@redhat.com>
We declare 'len' as int type but it should be 'unsigned int', as
get_buf() wants it to be.
Signed-off-by: Amit Shah <amit.shah@redhat.com>
Reported-by: Juan Quintela <quintela@redhat.com>
Instead of allocating just one buffer for a port's in_vq, fill
the entire in_vq with buffers so the host need not stall while
an application consumes the data and makes the buffer available
again for the host.
Signed-off-by: Amit Shah <amit.shah@redhat.com>
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
With MULTIPORT support, the control queue is an integral part of the
functioning of the device. If we can't get any buffers allocated, the
host won't be able to relay important information and the device may not
function as intended.
Ensure 'probe' doesn't succeed until the control queue has at least one
buffer allocated for its ivq.
Signed-off-by: Amit Shah <amit.shah@redhat.com>
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>