linux_dsm_epyc7002/drivers/usb/host
Sarah Sharp 0d9f78a92e xhci: Fix hang on back-to-back Set TR Deq Ptr commands.
The Microsoft LifeChat 3000 USB headset was causing a very reproducible
hang whenever it was plugged in.  At first, I thought the host
controller was producing bad transfer events, because the log was filled
with errors like:

xhci_hcd 0000:00:14.0: ERROR Transfer event TRB DMA ptr not part of current TD

However, it turned out to be an xHCI driver bug in the ring expansion
patches.  The bug is triggered when there are two ring segments, and a
TD that ends just before a link TRB, like so:

 ______________                     _____________
|              |              ---> | setup TRB B |
 ______________               |     _____________
|              |              |    |  data TRB B |
 ______________               |     _____________
| setup TRB A  | <-- deq      |    |  data TRB B |
 ______________               |     _____________
| data TRB A   |              |    |             | <-- enq, deq''
 ______________               |     _____________
| status TRB A |              |    |             |
 ______________               |     _____________
|  link TRB    |---------------    |  link TRB   |
 _____________  <--- deq'           _____________

TD A (the first control transfer) stalls on the data phase.  That halts
the ring.  The xHCI driver moves the hardware dequeue pointer to the
first TRB after the stalled transfer, which happens to be the link TRB.

Once the Set TR dequeue pointer command completes, the function
update_ring_for_set_deq_completion runs.  That function is supposed to
update the xHCI driver's dequeue pointer to match the internal hardware
dequeue pointer.  On the first call this would work fine, and the
software dequeue pointer would move to deq'.

However, if the transfer immediately after that stalled (TD B in this
case), another Set TR Dequeue command would be issued.  That would move
the hardware dequeue pointer to deq''.  Once that command completed,
update_ring_for_set_deq_completion would run again.

The original code would unconditionally increment the software dequeue
pointer, which moved the pointer off the ring segment into la-la-land.
The while loop would happily increment the dequeue pointer (possibly
wrapping it) until it matched the hardware pointer value.
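
As a rough illustration of the safe alternative, the sketch below walks a
toy two-segment ring and follows the link TRB instead of incrementing
past it.  It is not the driver code: the toy_* names, the 6-TRB segment
size (chosen to match the diagram) and the is_link flag are all
assumptions made for this example.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define TOY_TRBS_PER_SEG 6	/* toy size taken from the diagram, not the real driver */

struct toy_trb {
	bool is_link;		/* hypothetical flag: true only for the last TRB of a segment */
};

struct toy_seg {
	struct toy_trb trbs[TOY_TRBS_PER_SEG];
	struct toy_seg *next;
};

struct toy_ring {
	struct toy_seg *deq_seg;	/* segment the software dequeue pointer lives in */
	struct toy_trb *dequeue;	/* software dequeue pointer */
};

/* Mark the last TRB of a segment as its link TRB and chain it to next. */
static void toy_seg_init(struct toy_seg *seg, struct toy_seg *next)
{
	for (size_t i = 0; i < TOY_TRBS_PER_SEG; i++)
		seg->trbs[i].is_link = false;
	seg->trbs[TOY_TRBS_PER_SEG - 1].is_link = true;
	seg->next = next;
}

/*
 * Advance by one TRB.  The link check happens BEFORE the increment, so
 * the pointer never leaves the segment, and deq_seg moves together with
 * dequeue.
 */
static void toy_advance(struct toy_ring *ring)
{
	assert(ring && ring->dequeue && ring->deq_seg);
	if (ring->dequeue->is_link) {
		ring->deq_seg = ring->deq_seg->next;
		ring->dequeue = ring->deq_seg->trbs;
	} else {
		ring->dequeue++;
	}
}

/*
 * Move the software dequeue pointer until it equals hw_deq.  The walk is
 * bounded to one full lap, so a target that is not on the ring makes the
 * function return false instead of looping forever.
 */
static bool toy_sync_dequeue(struct toy_ring *ring, struct toy_trb *hw_deq,
			     size_t num_segs)
{
	size_t max_steps = num_segs * TOY_TRBS_PER_SEG;

	for (size_t step = 0; step <= max_steps; step++) {
		if (ring->dequeue == hw_deq)
			return true;
		toy_advance(ring);
	}
	return false;
}
```

With the software dequeue pointer already sitting on the link TRB (deq'),
a second call targeting deq'' hops to the next segment first, which is
exactly the step the unconditional increment skipped.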

The while loop would also access all the memory in between the first
ring segment and the second ring segment to determine if it was a link
TRB.  This could cause general protection faults, although it was
unlikely because the ring segments came from a DMA pool, and would often
have consecutive memory addresses.

If nothing in that space looked like a link TRB, the deq_seg pointer for
the ring would remain on the first segment.  Thus, the deq_seg and the
software dequeue pointer would get out of sync.

When the next transfer event came in after the stalled transfer, the
xHCI driver code would attempt to convert the software dequeue pointer
into a DMA address in order to compare the DMA address for the completed
transfer.  Since the deq_seg and the dequeue pointer were out of sync,
xhci_trb_virt_to_dma would return NULL.
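
To see why a mismatched segment makes the conversion fail, consider the
simplified sketch below.  It only mirrors the general idea of a
segment-relative address translation; the sk_* names, the segment size
and the struct layout are assumptions for illustration, not the driver's
definitions.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define SK_TRBS_PER_SEG 6	/* toy size, not the real driver value */

typedef uint64_t sk_dma_addr_t;	/* stand-in for the kernel's dma_addr_t */

struct sk_trb {
	uint32_t field[4];
};
static_assert(sizeof(struct sk_trb) == 16, "a TRB is four 32-bit words");

struct sk_seg {
	struct sk_trb *trbs;	/* CPU address of the first TRB */
	sk_dma_addr_t dma;	/* bus address of the first TRB */
};

/*
 * Translate a TRB pointer into a DMA address relative to seg.  If the
 * pointer does not fall inside seg, no valid translation exists and the
 * function returns 0, the failure value the message above refers to.
 */
static sk_dma_addr_t sk_trb_virt_to_dma(const struct sk_seg *seg,
					const struct sk_trb *trb)
{
	uintptr_t base, addr, off;

	if (!seg || !trb)
		return 0;

	base = (uintptr_t)seg->trbs;
	addr = (uintptr_t)trb;
	if (addr < base)
		return 0;

	off = addr - base;
	if (off >= SK_TRBS_PER_SEG * sizeof(struct sk_trb))
		return 0;

	return seg->dma + off;
}
```

Passing a dequeue pointer that has already moved into the second segment
together with a deq_seg that still names the first segment takes one of
the out-of-range branches, so the caller gets 0 and cannot match the
transfer event.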

The transfer event would get ignored, the transfer would eventually
time out, and we would mistakenly convert the finished transfer to no-op
TRBs.  Some kernel driver (maybe xHCI?) would then get stuck in an
infinite loop in interrupt context, and the whole machine would hang.

This patch should be backported to kernels as old as 3.4 that contain
the commit b008df60c6 "xHCI: count free TRBs on transfer ring".

Signed-off-by: Sarah Sharp <sarah.a.sharp@linux.intel.com>
Cc: Andiry Xu <andiry.xu@amd.com>
Cc: stable@vger.kernel.org
2012-07-02 12:51:25 -07:00
whci Merge branch 'for-next/dwc3' of git://git.kernel.org/pub/scm/linux/kernel/git/balbi/usb into usb-next 2011-12-12 15:19:53 -08:00
alchemy-common.c MIPS: Alchemy: Au1300 SoC support 2011-12-07 22:02:05 +00:00
bcma-hcd.c usb/bcma: Add missing #include <linux/slab.h> 2012-04-23 13:22:00 -07:00
ehci-atmel.c USB: ehci-atmel: add needed of.h header file 2012-04-04 18:35:43 +02:00
ehci-au1xxx.c usb: Remove ehci_reset call from ehci_run 2011-12-08 09:38:53 -08:00
ehci-cns3xxx.c treewide: Convert uses of struct resource to resource_size(ptr) 2011-06-10 14:55:36 +02:00
ehci-dbg.c EHCI: maintain the ehci->command value properly 2012-04-23 12:05:44 -07:00
ehci-fsl.c USB: ehci-fsl: Use usb_put_transceiver instead of put_device 2012-05-14 08:49:50 -07:00
ehci-fsl.h fsl/usb: Add controller version based ULPI and UTMI phy support 2012-04-18 13:46:42 -07:00
ehci-grlib.c treewide: Convert uses of struct resource to resource_size(ptr) 2011-06-10 14:55:36 +02:00
ehci-hcd.c USB: fix PS3 EHCI systems 2012-06-13 17:24:54 -07:00
ehci-hub.c EHCI: maintain the ehci->command value properly 2012-04-23 12:05:44 -07:00
ehci-ixp4xx.c treewide: Convert uses of struct resource to resource_size(ptr) 2011-06-10 14:55:36 +02:00
ehci-lpm.c
ehci-ls1x.c USB: Add EHCI bus glue for Loongson1x SoCs (UPDATED) 2012-01-24 15:28:02 -08:00
ehci-mem.c
ehci-msm.c usb: otg: Convert all users to pass struct usb_otg for OTG functions 2012-02-27 15:41:52 +02:00
ehci-mv.c usb: otg: Convert all users to pass struct usb_otg for OTG functions 2012-02-27 15:41:52 +02:00
ehci-mxc.c USB ehci mxc: sanitize clock handling 2012-04-25 17:03:41 +02:00
ehci-octeon.c usb: Remove ehci_reset call from ehci_run 2011-12-08 09:38:53 -08:00
ehci-omap.c Fix OMAP EHCI suspend/resume failure (i693) 2012-06-13 17:36:22 -07:00
ehci-orion.c ARM: Orion: EHCI: Add support for enabling clocks 2012-05-08 16:33:59 -07:00
ehci-pci.c USB: add NO_D3_DURING_SLEEP flag and revert 151b612847 2012-06-13 13:11:39 -07:00
ehci-platform.c USB: ehci-platform: remove update_device 2012-05-18 16:37:55 -07:00
ehci-pmcmsp.c treewide: Convert uses of struct resource to resource_size(ptr) 2011-06-10 14:55:36 +02:00
ehci-ppc-of.c treewide: Convert uses of struct resource to resource_size(ptr) 2011-06-10 14:55:36 +02:00
ehci-ps3.c usb: PS3 EHCI HC reset work-around 2011-12-08 09:38:53 -08:00
ehci-q.c USB: ehci-q.c: remove dbg() usage 2012-05-01 21:33:35 -07:00
ehci-s5p.c USB: ehci-s5p: add clock gating to suspend/resume 2012-04-18 13:52:36 -07:00
ehci-sched.c USB: EHCI: improve full-speed isochronous scheduling routine 2012-05-14 12:50:22 -07:00
ehci-sead3.c usb: host: mips: sead3: Fix for big endian. 2012-05-11 15:17:30 -07:00
ehci-sh.c usb: ehci-sh: fix illegal phy_init() running when platform_data is NULL 2012-06-14 17:13:34 -07:00
ehci-spear.c USB: ehci: ohci: Add clk_{un}prepare() support 2012-04-18 14:33:43 -07:00
ehci-sysfs.c USB: EHCI: Allow users to override 80% max periodic bandwidth 2011-07-08 14:51:33 -07:00
ehci-tegra.c arm-soc: driver specific updates 2012-05-26 12:22:27 -07:00
ehci-vt8500.c usb: Remove ehci_reset call from ehci_run 2011-12-08 09:38:53 -08:00
ehci-w90x900.c usb: Remove ehci_reset call from ehci_run 2011-12-08 09:38:53 -08:00
ehci-xilinx-of.c USB: EHCI: Fix build warning in xilinx ehci driver 2012-06-13 17:24:54 -07:00
ehci-xls.c Merge 3.2-rc3 into usb-linus 2011-11-26 19:46:48 -08:00
ehci.h Revert "USB: EHCI: work around bug in the Philips ISP1562 controller" 2012-05-21 08:54:43 -07:00
fhci-dbg.c
fhci-hcd.c usb: convert drivers/usb/* to use module_platform_driver() 2011-11-28 06:48:32 +09:00
fhci-hub.c
fhci-mem.c
fhci-q.c
fhci-sched.c QE/FHCI: fixed the CONTROL bug 2011-10-18 13:51:34 -07:00
fhci-tds.c usb: Fix various typo within usb 2012-04-18 13:57:26 -07:00
fhci.h
fsl-mph-dr-of.c fsl/usb: Add controller version based ULPI and UTMI phy support 2012-04-18 13:46:42 -07:00
hwa-hc.c Merge branch 'usb-next' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/usb 2012-01-09 12:09:47 -08:00
imx21-dbg.c usb: Fix typo in imx21-dbg.c 2012-02-13 14:32:34 -08:00
imx21-hcd.c Merge branch 'usb-next' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/usb 2012-01-09 12:09:47 -08:00
imx21-hcd.h
isp116x-hcd.c Remove all #inclusions of asm/system.h 2012-03-28 18:30:03 +01:00
isp116x.h
isp1362-hcd.c Remove all #inclusions of asm/system.h 2012-03-28 18:30:03 +01:00
isp1362.h
isp1760-hcd.c isp1760-hcd: fix possible memory leak if urb could not be enqueued 2012-04-18 13:51:19 -07:00
isp1760-hcd.h usb/isp1760: Allow to optionally trigger low-level chip reset via GPIOLIB. 2011-10-19 13:29:06 -07:00
isp1760-if.c isp1760-if: make module unloads correctly 2012-04-18 13:50:44 -07:00
Kconfig Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jikos/trivial 2012-05-22 19:22:50 -07:00
Makefile USB: Add driver for the ssb bus 2012-04-18 13:43:30 -07:00
octeon2-common.c usb: Configure octeon2 glue logic for proper uSOF cycle period. 2011-05-03 10:09:32 -07:00
ohci-at91.c USB: ohci-at91: add a reset function to fix race condition 2012-05-09 15:22:27 -07:00
ohci-au1xxx.c usb: [MIPS] fix unresolved err() reference in host/ohci-au1xxx.c 2012-05-01 18:36:09 -04:00
ohci-cns3xxx.c USB: ohci-cns3xxx.c: remove err() usage 2012-04-27 11:24:40 -07:00
ohci-da8xx.c ohci-da8xx: set MODULE_ALIAS to allow autoloading 2012-05-08 09:26:10 -07:00
ohci-dbg.c USB: ohci-dbg.c: remove dbg() usage 2012-05-01 21:33:37 -07:00
ohci-ep93xx.c USB: ohci-ep93xx.c: remove dbg() usage 2012-05-01 21:33:38 -07:00
ohci-exynos.c USB: ohci-exynos.c: remove err() usage 2012-04-27 11:24:41 -07:00
ohci-hcd.c USB: OHCI: remove old SSB OHCI driver 2012-04-18 13:43:30 -07:00
ohci-hub.c USB: ohci-hub: Mark ohci_finish_controller_resume() as __maybe_unused 2012-06-13 17:26:11 -07:00
ohci-jz4740.c
ohci-mem.c
ohci-nxp.c USB: ohci-nxp: Use isp1301 driver 2012-05-01 13:36:18 -04:00
ohci-octeon.c USB: irq: Remove IRQF_DISABLED 2011-09-18 01:39:36 -07:00
ohci-omap3.c ARM: OMAP: USBHOST: Replace usbhs core driver APIs by Runtime pm APIs 2011-12-16 04:29:57 -07:00
ohci-omap.c USB 3.5-rc1 pull request 2012-05-22 15:50:46 -07:00
ohci-pci.c usb: add support for STA2X11 host driver 2012-01-24 14:15:37 -08:00
ohci-platform.c usb: Fix various typo within usb 2012-04-18 13:57:26 -07:00
ohci-pnx8550.c usb: [MIPS] fix unresolved err() reference in host/ohci-pnx8550.c 2012-05-01 18:36:09 -04:00
ohci-ppc-of.c USB: ohci-ppc-of.c: remove err() usage 2012-04-27 11:24:42 -07:00
ohci-ppc-soc.c USB: ohci-ppc-soc.c: remove err() usage 2012-04-27 11:24:42 -07:00
ohci-ps3.c USB: ohci-ps3.c: remove err() usage 2012-04-27 11:24:43 -07:00
ohci-pxa27x.c usb: [ARM] fix unresolved err() reference in host/ohci-pxa27x.c 2012-05-01 18:36:09 -04:00
ohci-q.c OHCI: remove uses of hcd->state 2011-11-18 10:51:00 -08:00
ohci-s3c2410.c USB: ohci-s3c2410.c: remove err() usage 2012-04-27 11:24:43 -07:00
ohci-sa1111.c USB: ohci-sa1111.c: remove dbg() usage 2012-05-01 21:33:39 -07:00
ohci-sh.c USB: ohci-sh.c: remove err() usage 2012-04-27 11:24:44 -07:00
ohci-sm501.c OHCI: remove uses of hcd->state 2011-11-18 10:51:00 -08:00
ohci-spear.c USB: ehci: ohci: Add clk_{un}prepare() support 2012-04-18 14:33:43 -07:00
ohci-tmio.c USB: ohci-tmio.c: remove err() usage 2012-04-27 11:24:44 -07:00
ohci-xls.c USB: ohci-xls.c: remove err() usage 2012-04-27 11:24:45 -07:00
ohci.h usb: otg: Rename otg_transceiver to usb_phy 2012-02-13 13:34:36 +02:00
oxu210hp-hcd.c USB: oxu210hp-hcd.c: remove dbg() usage 2012-05-01 21:33:43 -07:00
oxu210hp.h
pci-quirks.c xhci: Add Lynx Point to list of Intel switchable hosts. 2012-05-03 13:18:40 -07:00
pci-quirks.h Intel xhci: Support EHCI/xHCI port switching. 2011-05-27 12:07:36 -07:00
r8a66597-hcd.c Revert "usb: move struct usb_device->children to struct usb_hub_port->child" 2012-05-14 09:20:37 -07:00
r8a66597.h usb: r8a66597-hcd: add function for external controller 2011-07-08 14:57:11 -07:00
sl811_cs.c pcmcia: Convert pcmcia_device_id declarations to const 2011-05-06 07:46:22 +02:00
sl811-hcd.c Remove all #inclusions of asm/system.h 2012-03-28 18:30:03 +01:00
sl811.h
ssb-hcd.c usb/ssb: Add missing #include <linux/slab.h> 2012-04-23 13:22:00 -07:00
u132-hcd.c Remove all #inclusions of asm/system.h 2012-03-28 18:30:03 +01:00
uhci-debug.c USB: UHCI: Add support for big endian descriptors 2011-05-19 16:43:20 -07:00
uhci-grlib.c treewide: Convert uses of struct resource to resource_size(ptr) 2011-06-10 14:55:36 +02:00
uhci-hcd.c Remove all #inclusions of asm/system.h 2012-03-28 18:30:03 +01:00
uhci-hcd.h USB: UHCI: Add support for big endian descriptors 2011-05-19 16:43:20 -07:00
uhci-hub.c UHCI: hub_status_data should indicate if ports are resuming 2012-04-09 15:43:21 -07:00
uhci-pci.c USB: UHCI: Move PCI specific functions to uhci-pci.c 2011-05-06 18:24:00 -07:00
uhci-q.c usb: fix number of mapped SG DMA entries 2011-12-09 16:18:19 -08:00
xhci-dbg.c xHCI: correct to print the true HSEE of USBCMD 2012-04-10 15:21:52 -07:00
xhci-ext-caps.h xHCI: Correct the #define XHCI_LEGACY_DISABLE_SMI 2012-04-11 08:31:06 -07:00
xhci-hub.c usb: Add support for root hub port status CAS 2012-07-02 12:51:24 -07:00
xhci-mem.c xhci: Don't free endpoints in xhci_mem_cleanup() 2012-06-13 16:37:30 -07:00
xhci-pci.c xhci: Add Intel U1/U2 timeout policy. 2012-05-18 15:42:04 -07:00
xhci-plat.c usb: host: xhci: add platform driver support 2012-03-13 10:30:59 -07:00
xhci-ring.c xhci: Fix hang on back-to-back Set TR Deq Ptr commands. 2012-07-02 12:51:25 -07:00
xhci.c xHCI: Increase the timeout for controller save/restore state operation 2012-06-13 16:39:38 -07:00
xhci.h usb: Add support for root hub port status CAS 2012-07-02 12:51:24 -07:00