linux_dsm_epyc7002/drivers/usb/usbip/stub_tx.c
Suwan Kim ea44d19076 usbip: Implement SG support to vhci-hcd and stub driver
vhci has a bug with USB 3.0 storage devices. In USB, the size of each
SG list entry buffer should be divisible by the bulk max packet size.
With native SG support this does not matter, because the SG buffer is
treated as one contiguous buffer. Without native SG support, the USB
storage driver breaks the SG list into several URBs, and an error
occurs when a URB's buffer size is not divisible by the bulk max
packet size. The error situation is as follows.

Suppose the USB storage driver requests 31.5 KB of data with an SG
list that has one 3584-byte buffer followed by seven 4096-byte
buffers. Because VHCI doesn't support SG, the USB storage driver
splits this SG list into several URBs and sends them separately, so
the first URB buffer is 3584 bytes. When the host receives data, a
USB 3.0 device sends 1024-byte packets because the max packet size of
a BULK pipe is 1024 bytes, so the device sends 4096 bytes. But the
first URB buffer holds only 3584 bytes, so the host controller
terminates the transfer even though there is more data to receive.
vhci therefore needs to support SG transfers to prevent this error.
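
The arithmetic behind this example can be checked with a small
user-space sketch. The helper names sg_total_length() and
tail_remainder() are hypothetical and are not part of usbip; they only
show that the eight entries add up to 31.5 KB (32256 bytes) and that a
3584-byte buffer leaves a 512-byte tail after three 1024-byte packets,
so a fourth full packet does not fit.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper: total number of bytes described by an SG list. */
size_t sg_total_length(const size_t *entry_len, size_t n)
{
	size_t total = 0;

	for (size_t i = 0; i < n; i++)
		total += entry_len[i];
	return total;
}

/*
 * Hypothetical helper: bytes left in a buffer after it has been filled
 * with as many whole max-size packets as fit. A non-zero result means
 * the next full-size packet would overrun the buffer.
 */
size_t tail_remainder(size_t buf_len, size_t max_packet)
{
	assert(max_packet != 0);
	return buf_len % max_packet;
}
```

With the sizes from the example, tail_remainder(3584, 1024) is 512
while tail_remainder(4096, 1024) is 0, which is why only the first
split URB is affected.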

In this patch, vhci supports SG regardless of whether the server's
host controller supports SG or not, because stub driver splits SG
list into several URBs if the server's host controller doesn't
support SG.

To support SG, vhci sets the URB_DMA_MAP_SG flag in urb->transfer_flags
if the URB has an SG list; this flag tells the stub driver to use an SG
list. After receiving the urb from the stub driver, vhci clears the
URB_DMA_MAP_SG flag to avoid unnecessary DMA unmapping in the HCD.

vhci sends each SG list entry to the stub driver. The stub driver then
looks at the total length of the buffer and allocates an SG table and
pages for that total length by calling sgl_alloc(). After the stub
driver receives the completed URB, it sends each SG list entry back to
vhci.

If the server's host controller doesn't support SG, stub driver
breaks a single SG request into several URBs and submits them to
the server's host controller. When all the split URBs are completed,
stub driver reassembles the URBs into a single return command and
sends it to vhci.
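
The reassembly step can be pictured with the following user-space
sketch. The struct and function names are hypothetical stand-ins, not
the stub driver's types; the sketch assumes the return command carries
the sum of the split URBs' actual lengths and the first non-zero
status. In the driver the first error is recorded in completion order,
while this sketch simply walks the array in index order.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for the result of one split URB. */
struct split_urb_sketch {
	int status;			/* 0 on success, negative errno otherwise */
	unsigned int actual_length;	/* bytes actually transferred */
};

/* Hypothetical stand-in for the fields of the single return command. */
struct ret_summary_sketch {
	int status;
	unsigned int actual_length;
};

/* Fold the split URB results into one status and one total length. */
struct ret_summary_sketch
summarize_split_urbs(const struct split_urb_sketch *urbs, size_t n)
{
	struct ret_summary_sketch ret = { 0, 0 };

	assert(urbs != NULL || n == 0);
	for (size_t i = 0; i < n; i++) {
		ret.actual_length += urbs[i].actual_length;
		/* Only keep the first error status. */
		if (urbs[i].status && !ret.status)
			ret.status = urbs[i].status;
	}
	return ret;
}
```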

Moreover, usbip works normally when vhci supports SG but the stub
driver does not, or vice versa. Because the protocol is not modified,
the server and the client can communicate without problems even if
one of them runs a kernel without SG support.

When vhci supports SG and the stub driver doesn't, vhci still sends
only the total buffer length to the stub driver, as it did before
this patch, so the stub driver just allocates a buffer of the
required length with kmalloc() regardless of whether vhci supports
SG. However, for an SG request the stub driver has to kmalloc() a
buffer as large as the whole SG buffer, which can be quite large, so
buffer allocation has extra overhead in this situation.

If the stub driver needs to send a data buffer to vhci for an IN pipe,
it also sends only the total buffer length as metadata and then sends
the real data, as vhci does. vhci then receives the data from the stub
driver and stores it in the corresponding buffers of the SG list
entries.
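
As an illustration of storing one contiguous stream of received data
into the buffers of several SG list entries, here is a user-space
sketch. The type sg_entry_sketch and the function
scatter_into_entries() are hypothetical; vhci reads from the socket
into each entry rather than copying from an array, so this only shows
the length bookkeeping: each entry takes at most its own size, and
filling stops when the received data runs out.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-in for one SG list entry: a buffer and its size. */
struct sg_entry_sketch {
	unsigned char *buf;
	size_t length;
};

/*
 * Copy len bytes from src into the entries in order. Returns the number
 * of bytes stored, which is less than len only if the entries are too
 * small to hold everything.
 */
size_t scatter_into_entries(const unsigned char *src, size_t len,
			    struct sg_entry_sketch *ents, size_t n)
{
	size_t copied = 0;

	assert(src != NULL && ents != NULL);
	for (size_t i = 0; i < n && copied < len; i++) {
		size_t chunk = len - copied;

		/* Never write past the end of this entry's buffer. */
		if (chunk > ents[i].length)
			chunk = ents[i].length;
		memcpy(ents[i].buf, src + copied, chunk);
		copied += chunk;
	}
	return copied;
}
```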

When the stub driver supports SG and vhci doesn't, the USB storage
driver sees that vhci doesn't support SG and sends the request to the
stub driver by splitting the SG list into multiple URBs, so the stub
driver allocates a buffer for each URB with kmalloc() as it did before
this patch.

* Test environment

The tests use two different machines and two different kernel versions
to create a mismatch between the client and the server, where vhci
supports SG but the stub driver does not, or vice versa. All tests are
run both with full SG support (both vhci and stub support SG) and with
half SG support (the mismatch situation). The test kernel is 5.3-rc6
with commit "usb: add a HCD_DMA flag instead of guestimating DMA
capabilities" to avoid unnecessary DMA mapping and unmapping.

 - Test kernel version
    - 5.3-rc6 with SG support
    - 5.1.20-200.fc29.x86_64 without SG support

* SG support test

 - Test devices
    - Super-speed storage device - SanDisk Ultra USB 3.0
    - High-speed storage device - SMI corporation USB 2.0 flash drive

 - Test description

Test read and write operations on a mass storage device that uses
BULK transfers. In the test, the client reads and writes files larger
than 1G, and it works normally.

* Regression test

 - Test devices
    - Super-speed device - Logitech Brio webcam
    - High-speed device  - Logitech C920 HD Pro webcam
    - Full-speed device  - Logitech bluetooth mouse
                         - Britz BR-Orion speaker
    - Low-speed device   - Logitech wired mouse

 - Test description

Move and click tests for the mouse. To test the webcam, use
gnome-cheese. To test the speaker, play music and video on the client.
All work normally.

* VUDC compatibility test

VUDC also works well with this patch. Tests are done with two USB
gadgets created by CONFIGFS USB gadget. Both use the BULK pipe.

        1. Serial gadget
        2. Mass storage gadget

 - Serial gadget test

The serial gadget on the host sends and receives data using the cat
command on /dev/ttyGS<N>. The client uses minicom to communicate with
the serial gadget.

 - Mass storage gadget test

After connecting the gadget with vhci, use "dd" to test read and
write operation on the client side.

Read  - dd if=/dev/sd<N> iflag=direct of=/dev/null bs=1G count=1
Write - dd if=<my file path> iflag=direct of=/dev/sd<N> bs=1G count=1

Signed-off-by: Suwan Kim <suwan.kim027@gmail.com>
Acked-by: Shuah khan <skhan@linuxfoundation.org>
Link: https://lore.kernel.org/r/20190828032741.12234-1-suwan.kim027@gmail.com
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2019-09-03 16:00:38 +02:00

// SPDX-License-Identifier: GPL-2.0+
/*
 * Copyright (C) 2003-2008 Takahiro Hirofuchi
 */

#include <linux/kthread.h>
#include <linux/socket.h>
#include <linux/scatterlist.h>

#include "usbip_common.h"
#include "stub.h"

/* be in spin_lock_irqsave(&sdev->priv_lock, flags) */
void stub_enqueue_ret_unlink(struct stub_device *sdev, __u32 seqnum,
			     __u32 status)
{
	struct stub_unlink *unlink;

	unlink = kzalloc(sizeof(struct stub_unlink), GFP_ATOMIC);
	if (!unlink) {
		usbip_event_add(&sdev->ud, VDEV_EVENT_ERROR_MALLOC);
		return;
	}

	unlink->seqnum = seqnum;
	unlink->status = status;

	list_add_tail(&unlink->list, &sdev->unlink_tx);
}

/**
 * stub_complete - completion handler of a usbip urb
 * @urb: pointer to the urb completed
 *
 * When a urb has completed, the USB core driver calls this function mostly in
 * the interrupt context. To return the result of a urb, the completed urb is
 * linked to the pending list of returning.
 *
 */
void stub_complete(struct urb *urb)
{
	struct stub_priv *priv = (struct stub_priv *) urb->context;
	struct stub_device *sdev = priv->sdev;
	unsigned long flags;

	usbip_dbg_stub_tx("complete! status %d\n", urb->status);

	switch (urb->status) {
	case 0:
		/* OK */
		break;
	case -ENOENT:
		dev_info(&urb->dev->dev,
			 "stopped by a call to usb_kill_urb() because of cleaning up a virtual connection\n");
		return;
	case -ECONNRESET:
		dev_info(&urb->dev->dev,
			 "unlinked by a call to usb_unlink_urb()\n");
		break;
	case -EPIPE:
		dev_info(&urb->dev->dev, "endpoint %d is stalled\n",
			 usb_pipeendpoint(urb->pipe));
		break;
	case -ESHUTDOWN:
		dev_info(&urb->dev->dev, "device removed?\n");
		break;
	default:
		dev_info(&urb->dev->dev,
			 "urb completion with non-zero status %d\n",
			 urb->status);
		break;
	}

	/*
	 * If the server breaks single SG request into the several URBs, the
	 * URBs must be reassembled before sending completed URB to the vhci.
	 * Don't wake up the tx thread until all the URBs are completed.
	 */
	if (priv->sgl) {
		priv->completed_urbs++;

		/* Only save the first error status */
		if (urb->status && !priv->urb_status)
			priv->urb_status = urb->status;

		if (priv->completed_urbs < priv->num_urbs)
			return;
	}

	/* link a urb to the queue of tx. */
	spin_lock_irqsave(&sdev->priv_lock, flags);
	if (sdev->ud.tcp_socket == NULL) {
		usbip_dbg_stub_tx("ignore urb for closed connection\n");
		/* It will be freed in stub_device_cleanup_urbs(). */
	} else if (priv->unlinking) {
		stub_enqueue_ret_unlink(sdev, priv->seqnum, urb->status);
		stub_free_priv_and_urb(priv);
	} else {
		list_move_tail(&priv->list, &sdev->priv_tx);
	}
	spin_unlock_irqrestore(&sdev->priv_lock, flags);

	/* wake up tx_thread */
	wake_up(&sdev->tx_waitq);
}

static inline void setup_base_pdu(struct usbip_header_basic *base,
				  __u32 command, __u32 seqnum)
{
	base->command	= command;
	base->seqnum	= seqnum;
	base->devid	= 0;
	base->ep	= 0;
	base->direction = 0;
}

static void setup_ret_submit_pdu(struct usbip_header *rpdu, struct urb *urb)
{
	struct stub_priv *priv = (struct stub_priv *) urb->context;

	setup_base_pdu(&rpdu->base, USBIP_RET_SUBMIT, priv->seqnum);
	usbip_pack_pdu(rpdu, urb, USBIP_RET_SUBMIT, 1);
}

static void setup_ret_unlink_pdu(struct usbip_header *rpdu,
				 struct stub_unlink *unlink)
{
	setup_base_pdu(&rpdu->base, USBIP_RET_UNLINK, unlink->seqnum);
	rpdu->u.ret_unlink.status = unlink->status;
}

static struct stub_priv *dequeue_from_priv_tx(struct stub_device *sdev)
{
	unsigned long flags;
	struct stub_priv *priv, *tmp;

	spin_lock_irqsave(&sdev->priv_lock, flags);

	list_for_each_entry_safe(priv, tmp, &sdev->priv_tx, list) {
		list_move_tail(&priv->list, &sdev->priv_free);
		spin_unlock_irqrestore(&sdev->priv_lock, flags);
		return priv;
	}

	spin_unlock_irqrestore(&sdev->priv_lock, flags);

	return NULL;
}

static int stub_send_ret_submit(struct stub_device *sdev)
{
	unsigned long flags;
	struct stub_priv *priv, *tmp;

	struct msghdr msg;
	size_t txsize;

	size_t total_size = 0;

	while ((priv = dequeue_from_priv_tx(sdev)) != NULL) {
		struct urb *urb = priv->urbs[0];
		struct usbip_header pdu_header;
		struct usbip_iso_packet_descriptor *iso_buffer = NULL;
		struct kvec *iov = NULL;
		struct scatterlist *sg;
		u32 actual_length = 0;
		int iovnum = 0;
		int ret;
		int i;

		txsize = 0;
		memset(&pdu_header, 0, sizeof(pdu_header));
		memset(&msg, 0, sizeof(msg));

		if (urb->actual_length > 0 && !urb->transfer_buffer &&
		    !urb->num_sgs) {
			dev_err(&sdev->udev->dev,
				"urb: actual_length %d transfer_buffer null\n",
				urb->actual_length);
			return -1;
		}

		/* Count the kvec entries needed: header plus payload pieces */
		if (usb_pipetype(urb->pipe) == PIPE_ISOCHRONOUS)
			iovnum = 2 + urb->number_of_packets;
		else if (usb_pipein(urb->pipe) && urb->actual_length > 0 &&
			 urb->num_sgs)
			iovnum = 1 + urb->num_sgs;
		else if (usb_pipein(urb->pipe) && priv->sgl)
			iovnum = 1 + priv->num_urbs;
		else
			iovnum = 2;

		iov = kcalloc(iovnum, sizeof(struct kvec), GFP_KERNEL);
		if (!iov) {
			usbip_event_add(&sdev->ud, SDEV_EVENT_ERROR_MALLOC);
			return -1;
		}

		iovnum = 0;

		/* 1. setup usbip_header */
		setup_ret_submit_pdu(&pdu_header, urb);
		usbip_dbg_stub_tx("setup txdata seqnum: %d\n",
				  pdu_header.base.seqnum);

		if (priv->sgl) {
			for (i = 0; i < priv->num_urbs; i++)
				actual_length += priv->urbs[i]->actual_length;

			pdu_header.u.ret_submit.status = priv->urb_status;
			pdu_header.u.ret_submit.actual_length = actual_length;
		}

		usbip_header_correct_endian(&pdu_header, 1);

		iov[iovnum].iov_base = &pdu_header;
		iov[iovnum].iov_len  = sizeof(pdu_header);
		iovnum++;
		txsize += sizeof(pdu_header);

		/* 2. setup transfer buffer */
		if (usb_pipein(urb->pipe) && priv->sgl) {
			/* If the server split a single SG request into several
			 * URBs because the server's HCD doesn't support SG,
			 * reassemble the split URB buffers into a single
			 * return command.
			 */
			for (i = 0; i < priv->num_urbs; i++) {
				iov[iovnum].iov_base =
					priv->urbs[i]->transfer_buffer;
				iov[iovnum].iov_len =
					priv->urbs[i]->actual_length;
				iovnum++;
			}
			txsize += actual_length;
		} else if (usb_pipein(urb->pipe) &&
		    usb_pipetype(urb->pipe) != PIPE_ISOCHRONOUS &&
		    urb->actual_length > 0) {
			if (urb->num_sgs) {
				unsigned int copy = urb->actual_length;
				int size;

				/* Send only actual_length bytes of the SG list */
				for_each_sg(urb->sg, sg, urb->num_sgs, i) {
					if (copy == 0)
						break;

					if (copy < sg->length)
						size = copy;
					else
						size = sg->length;

					iov[iovnum].iov_base = sg_virt(sg);
					iov[iovnum].iov_len = size;

					iovnum++;
					copy -= size;
				}
			} else {
				iov[iovnum].iov_base = urb->transfer_buffer;
				iov[iovnum].iov_len  = urb->actual_length;
				iovnum++;
			}
			txsize += urb->actual_length;
		} else if (usb_pipein(urb->pipe) &&
			   usb_pipetype(urb->pipe) == PIPE_ISOCHRONOUS) {
			/*
			 * For isochronous packets: actual length is the sum of
			 * the actual length of the individual packets, but as
			 * the packet offsets are not changed there will be
			 * padding between the packets. To optimally use the
			 * bandwidth the padding is not transmitted.
			 */
			for (i = 0; i < urb->number_of_packets; i++) {
				iov[iovnum].iov_base = urb->transfer_buffer +
					urb->iso_frame_desc[i].offset;
				iov[iovnum].iov_len =
					urb->iso_frame_desc[i].actual_length;
				iovnum++;
				txsize += urb->iso_frame_desc[i].actual_length;
			}

			if (txsize != sizeof(pdu_header) + urb->actual_length) {
				dev_err(&sdev->udev->dev,
					"actual length of urb %d does not match iso packet sizes %zu\n",
					urb->actual_length,
					txsize-sizeof(pdu_header));
				kfree(iov);
				usbip_event_add(&sdev->ud,
						SDEV_EVENT_ERROR_TCP);
				return -1;
			}
		}

		/* 3. setup iso_packet_descriptor */
		if (usb_pipetype(urb->pipe) == PIPE_ISOCHRONOUS) {
			ssize_t len = 0;

			iso_buffer = usbip_alloc_iso_desc_pdu(urb, &len);
			if (!iso_buffer) {
				usbip_event_add(&sdev->ud,
						SDEV_EVENT_ERROR_MALLOC);
				kfree(iov);
				return -1;
			}

			iov[iovnum].iov_base = iso_buffer;
			iov[iovnum].iov_len  = len;
			txsize += len;
			iovnum++;
		}

		ret = kernel_sendmsg(sdev->ud.tcp_socket, &msg,
				     iov, iovnum, txsize);
		if (ret != txsize) {
			dev_err(&sdev->udev->dev,
				"sendmsg failed!, retval %d for %zu\n",
				ret, txsize);
			kfree(iov);
			kfree(iso_buffer);
			usbip_event_add(&sdev->ud, SDEV_EVENT_ERROR_TCP);
			return -1;
		}

		kfree(iov);
		kfree(iso_buffer);

		total_size += txsize;
	}

	spin_lock_irqsave(&sdev->priv_lock, flags);
	list_for_each_entry_safe(priv, tmp, &sdev->priv_free, list) {
		stub_free_priv_and_urb(priv);
	}
	spin_unlock_irqrestore(&sdev->priv_lock, flags);

	return total_size;
}

static struct stub_unlink *dequeue_from_unlink_tx(struct stub_device *sdev)
{
	unsigned long flags;
	struct stub_unlink *unlink, *tmp;

	spin_lock_irqsave(&sdev->priv_lock, flags);

	list_for_each_entry_safe(unlink, tmp, &sdev->unlink_tx, list) {
		list_move_tail(&unlink->list, &sdev->unlink_free);
		spin_unlock_irqrestore(&sdev->priv_lock, flags);
		return unlink;
	}

	spin_unlock_irqrestore(&sdev->priv_lock, flags);

	return NULL;
}

static int stub_send_ret_unlink(struct stub_device *sdev)
{
	unsigned long flags;
	struct stub_unlink *unlink, *tmp;

	struct msghdr msg;
	struct kvec iov[1];
	size_t txsize;

	size_t total_size = 0;

	while ((unlink = dequeue_from_unlink_tx(sdev)) != NULL) {
		int ret;
		struct usbip_header pdu_header;

		txsize = 0;
		memset(&pdu_header, 0, sizeof(pdu_header));
		memset(&msg, 0, sizeof(msg));
		memset(&iov, 0, sizeof(iov));

		usbip_dbg_stub_tx("setup ret unlink %lu\n", unlink->seqnum);

		/* 1. setup usbip_header */
		setup_ret_unlink_pdu(&pdu_header, unlink);
		usbip_header_correct_endian(&pdu_header, 1);

		iov[0].iov_base = &pdu_header;
		iov[0].iov_len  = sizeof(pdu_header);
		txsize += sizeof(pdu_header);

		ret = kernel_sendmsg(sdev->ud.tcp_socket, &msg, iov,
				     1, txsize);
		if (ret != txsize) {
			dev_err(&sdev->udev->dev,
				"sendmsg failed!, retval %d for %zu\n",
				ret, txsize);
			usbip_event_add(&sdev->ud, SDEV_EVENT_ERROR_TCP);
			return -1;
		}

		usbip_dbg_stub_tx("send txdata\n");
		total_size += txsize;
	}

	spin_lock_irqsave(&sdev->priv_lock, flags);

	list_for_each_entry_safe(unlink, tmp, &sdev->unlink_free, list) {
		list_del(&unlink->list);
		kfree(unlink);
	}

	spin_unlock_irqrestore(&sdev->priv_lock, flags);

	return total_size;
}

int stub_tx_loop(void *data)
{
	struct usbip_device *ud = data;
	struct stub_device *sdev = container_of(ud, struct stub_device, ud);

	while (!kthread_should_stop()) {
		if (usbip_event_happened(ud))
			break;

		/*
		 * send_ret_submit comes earlier than send_ret_unlink. stub_rx
		 * looks at only the priv_init queue. If a URB completes
		 * before CMD_UNLINK is received, priv is moved to the
		 * priv_tx queue and stub_rx does not find the target priv. In
		 * this case, vhci_rx receives the result of the submit request
		 * and then receives the result of the unlink request. The
		 * result of the submit is given back to the usbcore as the
		 * completion of the unlink request. The unlink request
		 * itself is ignored. This is ok because a driver that calls
		 * usb_unlink_urb() understands the unlink was too late by
		 * getting the status of the given-back URB, which has the
		 * status of usb_submit_urb().
		 */
		if (stub_send_ret_submit(sdev) < 0)
			break;

		if (stub_send_ret_unlink(sdev) < 0)
			break;

		wait_event_interruptible(sdev->tx_waitq,
					 (!list_empty(&sdev->priv_tx) ||
					  !list_empty(&sdev->unlink_tx) ||
					  kthread_should_stop()));
	}

	return 0;
}