mirror of
https://github.com/AuxXxilium/linux_dsm_epyc7002.git
synced 2024-12-24 23:51:17 +07:00
cc62a7eb63
There are cases, when 80% max isochronous bandwidth is too limiting. For example I have two USB video capture cards which stream uncompressed video, and to stream full NTSC + PAL videos we'd need NTSC 640x480 YUV422 @30fps ~17.6 MB/s PAL 720x576 YUV422 @25fps ~19.7 MB/s isoc bandwidth. Now, due to limited alt settings in capture devices NTSC one ends up streaming with max_pkt_size=2688 and PAL with max_pkt_size=2892, both with interval=1. In terms of microframe time allocation this gives NTSC ~53us PAL ~57us and together ~110us > 100us == 80% of 125us uframe time. So those two devices can't work together simultaneously because the'd over allocate isochronous bandwidth. 80% seemed a bit arbitrary to me, and I've tried to raise it to 90% and both devices started to work together, so I though sometimes it would be a good idea for users to override hardcoded default of max 80% isoc bandwidth. After all, isn't it a user who should decide how to load the bus? If I can live with 10% or even 5% bulk bandwidth that should be ok. I'm a USB newcomer, but that 80% set in stone by USB 2.0 specification seems to be chosen pretty arbitrary to me, just to serve as a reasonable default. NOTE 1 ~~~~~~ for two streams with max_pkt_size=3072 (worst case) both time allocation would be 60us+60us=120us which is 96% periodic bandwidth leaving 4% for bulk and control. Alan Stern suggested that bulk then would be problematic (less than 300*8 bittimes left per microframe), but I think that is still enough for control traffic. NOTE 2 ~~~~~~ Sarah Sharp expressed concern that maxing out periodic bandwidth could lead to vendor-specific hardware bugs on host controllers, because > It's entirely possible that you'll run into > vendor-specific bugs if you try to pack the schedule with isochronous > transfers. I don't think any hardware designer would seriously test or > validate their hardware with a schedule that is basically a violation of > the USB bus spec (more than 80% for periodic transfers). So far I've only tested this patch on my HP Mini 5103 with N10 chipset kirr@mini:~$ lspci 00:00.0 Host bridge: Intel Corporation N10 Family DMI Bridge 00:02.0 VGA compatible controller: Intel Corporation N10 Family Integrated Graphics Controller 00:02.1 Display controller: Intel Corporation N10 Family Integrated Graphics Controller 00:1b.0 Audio device: Intel Corporation N10/ICH 7 Family High Definition Audio Controller (rev 02) 00:1c.0 PCI bridge: Intel Corporation N10/ICH 7 Family PCI Express Port 1 (rev 02) 00:1c.3 PCI bridge: Intel Corporation N10/ICH 7 Family PCI Express Port 4 (rev 02) 00:1d.0 USB Controller: Intel Corporation N10/ICH 7 Family USB UHCI Controller #1 (rev 02) 00:1d.1 USB Controller: Intel Corporation N10/ICH 7 Family USB UHCI Controller #2 (rev 02) 00:1d.2 USB Controller: Intel Corporation N10/ICH 7 Family USB UHCI Controller #3 (rev 02) 00:1d.3 USB Controller: Intel Corporation N10/ICH 7 Family USB UHCI Controller #4 (rev 02) 00:1d.7 USB Controller: Intel Corporation N10/ICH 7 Family USB2 EHCI Controller (rev 02) 00:1e.0 PCI bridge: Intel Corporation 82801 Mobile PCI Bridge (rev e2) 00:1f.0 ISA bridge: Intel Corporation NM10 Family LPC Controller (rev 02) 00:1f.2 SATA controller: Intel Corporation N10/ICH7 Family SATA AHCI Controller (rev 02) 01:00.0 Network controller: Broadcom Corporation BCM4313 802.11b/g/n Wireless LAN Controller (rev 01) 02:00.0 Ethernet controller: Marvell Technology Group Ltd. 88E8059 PCI-E Gigabit Ethernet Controller (rev 11) and the system works stable with 110us/uframe (~88%) isoc bandwith allocated for above-mentioned isochronous transfers. NOTE 3 ~~~~~~ This feature is off by default. I mean max periodic bandwidth is set to 100us/uframe by default exactly as it was before the patch. So only those of us who need the extreme settings are taking the risk - normal users who do not alter uframe_periodic_max sysfs attribute should not see any change at all. NOTE 4 ~~~~~~ I've tried to update documentation in Documentation/ABI/ thoroughly, but only "TBD" was put into Documentation/usb/ehci.txt -- the text there seems to be outdated and much needing refreshing, before it could be amended. Cc: Sarah Sharp <sarah.a.sharp@linux.intel.com> Signed-off-by: Kirill Smelkov <kirr@mns.spb.ru> Acked-by: Alan Stern <stern@rowland.harvard.edu> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
215 lines
9.7 KiB
Plaintext
215 lines
9.7 KiB
Plaintext
27-Dec-2002
|
|
|
|
The EHCI driver is used to talk to high speed USB 2.0 devices using
|
|
USB 2.0-capable host controller hardware. The USB 2.0 standard is
|
|
compatible with the USB 1.1 standard. It defines three transfer speeds:
|
|
|
|
- "High Speed" 480 Mbit/sec (60 MByte/sec)
|
|
- "Full Speed" 12 Mbit/sec (1.5 MByte/sec)
|
|
- "Low Speed" 1.5 Mbit/sec
|
|
|
|
USB 1.1 only addressed full speed and low speed. High speed devices
|
|
can be used on USB 1.1 systems, but they slow down to USB 1.1 speeds.
|
|
|
|
USB 1.1 devices may also be used on USB 2.0 systems. When plugged
|
|
into an EHCI controller, they are given to a USB 1.1 "companion"
|
|
controller, which is a OHCI or UHCI controller as normally used with
|
|
such devices. When USB 1.1 devices plug into USB 2.0 hubs, they
|
|
interact with the EHCI controller through a "Transaction Translator"
|
|
(TT) in the hub, which turns low or full speed transactions into
|
|
high speed "split transactions" that don't waste transfer bandwidth.
|
|
|
|
At this writing, this driver has been seen to work with implementations
|
|
of EHCI from (in alphabetical order): Intel, NEC, Philips, and VIA.
|
|
Other EHCI implementations are becoming available from other vendors;
|
|
you should expect this driver to work with them too.
|
|
|
|
While usb-storage devices have been available since mid-2001 (working
|
|
quite speedily on the 2.4 version of this driver), hubs have only
|
|
been available since late 2001, and other kinds of high speed devices
|
|
appear to be on hold until more systems come with USB 2.0 built-in.
|
|
Such new systems have been available since early 2002, and became much
|
|
more typical in the second half of 2002.
|
|
|
|
Note that USB 2.0 support involves more than just EHCI. It requires
|
|
other changes to the Linux-USB core APIs, including the hub driver,
|
|
but those changes haven't needed to really change the basic "usbcore"
|
|
APIs exposed to USB device drivers.
|
|
|
|
- David Brownell
|
|
<dbrownell@users.sourceforge.net>
|
|
|
|
|
|
FUNCTIONALITY
|
|
|
|
This driver is regularly tested on x86 hardware, and has also been
|
|
used on PPC hardware so big/little endianness issues should be gone.
|
|
It's believed to do all the right PCI magic so that I/O works even on
|
|
systems with interesting DMA mapping issues.
|
|
|
|
Transfer Types
|
|
|
|
At this writing the driver should comfortably handle all control, bulk,
|
|
and interrupt transfers, including requests to USB 1.1 devices through
|
|
transaction translators (TTs) in USB 2.0 hubs. But you may find bugs.
|
|
|
|
High Speed Isochronous (ISO) transfer support is also functional, but
|
|
at this writing no Linux drivers have been using that support.
|
|
|
|
Full Speed Isochronous transfer support, through transaction translators,
|
|
is not yet available. Note that split transaction support for ISO
|
|
transfers can't share much code with the code for high speed ISO transfers,
|
|
since EHCI represents these with a different data structure. So for now,
|
|
most USB audio and video devices can't be connected to high speed buses.
|
|
|
|
Driver Behavior
|
|
|
|
Transfers of all types can be queued. This means that control transfers
|
|
from a driver on one interface (or through usbfs) won't interfere with
|
|
ones from another driver, and that interrupt transfers can use periods
|
|
of one frame without risking data loss due to interrupt processing costs.
|
|
|
|
The EHCI root hub code hands off USB 1.1 devices to its companion
|
|
controller. This driver doesn't need to know anything about those
|
|
drivers; a OHCI or UHCI driver that works already doesn't need to change
|
|
just because the EHCI driver is also present.
|
|
|
|
There are some issues with power management; suspend/resume doesn't
|
|
behave quite right at the moment.
|
|
|
|
Also, some shortcuts have been taken with the scheduling periodic
|
|
transactions (interrupt and isochronous transfers). These place some
|
|
limits on the number of periodic transactions that can be scheduled,
|
|
and prevent use of polling intervals of less than one frame.
|
|
|
|
|
|
USE BY
|
|
|
|
Assuming you have an EHCI controller (on a PCI card or motherboard)
|
|
and have compiled this driver as a module, load this like:
|
|
|
|
# modprobe ehci-hcd
|
|
|
|
and remove it by:
|
|
|
|
# rmmod ehci-hcd
|
|
|
|
You should also have a driver for a "companion controller", such as
|
|
"ohci-hcd" or "uhci-hcd". In case of any trouble with the EHCI driver,
|
|
remove its module and then the driver for that companion controller will
|
|
take over (at lower speed) all the devices that were previously handled
|
|
by the EHCI driver.
|
|
|
|
Module parameters (pass to "modprobe") include:
|
|
|
|
log2_irq_thresh (default 0):
|
|
Log2 of default interrupt delay, in microframes. The default
|
|
value is 0, indicating 1 microframe (125 usec). Maximum value
|
|
is 6, indicating 2^6 = 64 microframes. This controls how often
|
|
the EHCI controller can issue interrupts.
|
|
|
|
If you're using this driver on a 2.5 kernel, and you've enabled USB
|
|
debugging support, you'll see three files in the "sysfs" directory for
|
|
any EHCI controller:
|
|
|
|
"async" dumps the asynchronous schedule, used for control
|
|
and bulk transfers. Shows each active qh and the qtds
|
|
pending, usually one qtd per urb. (Look at it with
|
|
usb-storage doing disk I/O; watch the request queues!)
|
|
"periodic" dumps the periodic schedule, used for interrupt
|
|
and isochronous transfers. Doesn't show qtds.
|
|
"registers" show controller register state, and
|
|
|
|
The contents of those files can help identify driver problems.
|
|
|
|
|
|
Device drivers shouldn't care whether they're running over EHCI or not,
|
|
but they may want to check for "usb_device->speed == USB_SPEED_HIGH".
|
|
High speed devices can do things that full speed (or low speed) ones
|
|
can't, such as "high bandwidth" periodic (interrupt or ISO) transfers.
|
|
Also, some values in device descriptors (such as polling intervals for
|
|
periodic transfers) use different encodings when operating at high speed.
|
|
|
|
However, do make a point of testing device drivers through USB 2.0 hubs.
|
|
Those hubs report some failures, such as disconnections, differently when
|
|
transaction translators are in use; some drivers have been seen to behave
|
|
badly when they see different faults than OHCI or UHCI report.
|
|
|
|
|
|
PERFORMANCE
|
|
|
|
USB 2.0 throughput is gated by two main factors: how fast the host
|
|
controller can process requests, and how fast devices can respond to
|
|
them. The 480 Mbit/sec "raw transfer rate" is obeyed by all devices,
|
|
but aggregate throughput is also affected by issues like delays between
|
|
individual high speed packets, driver intelligence, and of course the
|
|
overall system load. Latency is also a performance concern.
|
|
|
|
Bulk transfers are most often used where throughput is an issue. It's
|
|
good to keep in mind that bulk transfers are always in 512 byte packets,
|
|
and at most 13 of those fit into one USB 2.0 microframe. Eight USB 2.0
|
|
microframes fit in a USB 1.1 frame; a microframe is 1 msec/8 = 125 usec.
|
|
|
|
So more than 50 MByte/sec is available for bulk transfers, when both
|
|
hardware and device driver software allow it. Periodic transfer modes
|
|
(isochronous and interrupt) allow the larger packet sizes which let you
|
|
approach the quoted 480 MBit/sec transfer rate.
|
|
|
|
Hardware Performance
|
|
|
|
At this writing, individual USB 2.0 devices tend to max out at around
|
|
20 MByte/sec transfer rates. This is of course subject to change;
|
|
and some devices now go faster, while others go slower.
|
|
|
|
The first NEC implementation of EHCI seems to have a hardware bottleneck
|
|
at around 28 MByte/sec aggregate transfer rate. While this is clearly
|
|
enough for a single device at 20 MByte/sec, putting three such devices
|
|
onto one bus does not get you 60 MByte/sec. The issue appears to be
|
|
that the controller hardware won't do concurrent USB and PCI access,
|
|
so that it's only trying six (or maybe seven) USB transactions each
|
|
microframe rather than thirteen. (Seems like a reasonable trade off
|
|
for a product that beat all the others to market by over a year!)
|
|
|
|
It's expected that newer implementations will better this, throwing
|
|
more silicon real estate at the problem so that new motherboard chip
|
|
sets will get closer to that 60 MByte/sec target. That includes an
|
|
updated implementation from NEC, as well as other vendors' silicon.
|
|
|
|
There's a minimum latency of one microframe (125 usec) for the host
|
|
to receive interrupts from the EHCI controller indicating completion
|
|
of requests. That latency is tunable; there's a module option. By
|
|
default ehci-hcd driver uses the minimum latency, which means that if
|
|
you issue a control or bulk request you can often expect to learn that
|
|
it completed in less than 250 usec (depending on transfer size).
|
|
|
|
Software Performance
|
|
|
|
To get even 20 MByte/sec transfer rates, Linux-USB device drivers will
|
|
need to keep the EHCI queue full. That means issuing large requests,
|
|
or using bulk queuing if a series of small requests needs to be issued.
|
|
When drivers don't do that, their performance results will show it.
|
|
|
|
In typical situations, a usb_bulk_msg() loop writing out 4 KB chunks is
|
|
going to waste more than half the USB 2.0 bandwidth. Delays between the
|
|
I/O completion and the driver issuing the next request will take longer
|
|
than the I/O. If that same loop used 16 KB chunks, it'd be better; a
|
|
sequence of 128 KB chunks would waste a lot less.
|
|
|
|
But rather than depending on such large I/O buffers to make synchronous
|
|
I/O be efficient, it's better to just queue up several (bulk) requests
|
|
to the HC, and wait for them all to complete (or be canceled on error).
|
|
Such URB queuing should work with all the USB 1.1 HC drivers too.
|
|
|
|
In the Linux 2.5 kernels, new usb_sg_*() api calls have been defined; they
|
|
queue all the buffers from a scatterlist. They also use scatterlist DMA
|
|
mapping (which might apply an IOMMU) and IRQ reduction, all of which will
|
|
help make high speed transfers run as fast as they can.
|
|
|
|
|
|
TBD: Interrupt and ISO transfer performance issues. Those periodic
|
|
transfers are fully scheduled, so the main issue is likely to be how
|
|
to trigger "high bandwidth" modes.
|
|
|
|
TBD: More than standard 80% periodic bandwidth allocation is possible
|
|
through sysfs uframe_periodic_max parameter. Describe that.
|