linux_dsm_epyc7002/drivers/pci/pcie
Sebastian Andrzej Siewior  4ae2182b1e
PCI/AER: Flush workqueue on device remove to avoid use-after-free

A Root Port's AER structure (rpc) contains a queue of events.  aer_irq()
enqueues AER status information and schedules aer_isr() to dequeue and
process it.  When we remove a device, aer_remove() waits for the queue to
be empty, then frees the rpc struct.

But aer_isr() references the rpc struct after dequeueing and possibly
emptying the queue, which can cause a use-after-free error as in the
following scenario with two threads, aer_isr() on the left and a
concurrent aer_remove() on the right:

  Thread A                      Thread B
  --------                      --------
  aer_irq():
    rpc->prod_idx++
                                aer_remove():
                                  wait_event(rpc->prod_idx == rpc->cons_idx)
                                  # now blocked until queue becomes empty
  aer_isr():                      # ...
    rpc->cons_idx++               # unblocked because queue is now empty
    ...                           kfree(rpc)
    mutex_unlock(&rpc->rpc_mutex)

To prevent this problem, use flush_work() to wait until the last scheduled
instance of aer_isr() has completed before freeing the rpc struct in
aer_remove().
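
The ordering that the fix relies on can be illustrated outside the kernel.
The following is only a userspace sketch under stated assumptions, not the
actual patch: struct rpc_sketch, isr_sketch(), remove_sketch() and
run_sketch() are hypothetical names, a pthread stands in for the scheduled
work item, and pthread_join() stands in for flush_work().  The point it
shows is that the remove path waits for the handler itself to finish, not
merely for the queue to drain, before freeing the struct.

```c
#include <pthread.h>
#include <stdlib.h>

/* Hypothetical stand-in for the Root Port's AER structure (rpc). */
struct rpc_sketch {
	pthread_mutex_t rpc_mutex;
	unsigned int prod_idx;	/* advanced by the aer_irq() analogue */
	unsigned int cons_idx;	/* advanced by the aer_isr() analogue */
	int processed;		/* number of events handled */
};

/* Plays the role of aer_isr(): drains the queue, then still touches rpc. */
static void *isr_sketch(void *arg)
{
	struct rpc_sketch *rpc = arg;

	pthread_mutex_lock(&rpc->rpc_mutex);
	while (rpc->cons_idx != rpc->prod_idx) {
		rpc->cons_idx++;
		rpc->processed++;
	}
	/* The queue is empty here, but rpc is referenced once more below. */
	pthread_mutex_unlock(&rpc->rpc_mutex);
	return NULL;
}

/* Plays the role of the fixed aer_remove(); returns the events handled. */
static int remove_sketch(struct rpc_sketch *rpc, pthread_t worker)
{
	int processed;

	/* Analogue of flush_work(): wait until the handler has returned. */
	pthread_join(worker, NULL);

	processed = rpc->processed;
	pthread_mutex_destroy(&rpc->rpc_mutex);
	free(rpc);		/* safe: nothing references rpc any more */
	return processed;
}

/* Enqueue 'events' entries, start the handler, then remove the device. */
static int run_sketch(unsigned int events)
{
	struct rpc_sketch *rpc = calloc(1, sizeof(*rpc));
	pthread_t worker;

	if (!rpc)
		return -1;
	pthread_mutex_init(&rpc->rpc_mutex, NULL);
	rpc->prod_idx = events;	/* aer_irq() analogue: enqueue first */

	if (pthread_create(&worker, NULL, isr_sketch, rpc) != 0) {
		pthread_mutex_destroy(&rpc->rpc_mutex);
		free(rpc);
		return -1;
	}
	return remove_sketch(rpc, worker);
}
```

Calling run_sketch(3) starts the handler with three queued events and
returns the number handled once the struct has been freed.  If
remove_sketch() instead waited only for cons_idx to equal prod_idx, the
final pthread_mutex_unlock() in isr_sketch() could run after free(), which
is the race shown in the diagram above.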

I reproduced this use-after-free by flashing a device FPGA and
re-enumerating the bus to find the new device.  With SLUB debug, this
crashes with 0x6b bytes (POISON_FREE, the use-after-free magic number) in
GPR25:

  pcieport 0000:00:00.0: AER: Multiple Corrected error received: id=0000
  Unable to handle kernel paging request for data at address 0x27ef9e3e
  Workqueue: events aer_isr
  GPR24: dd6aa000 6b6b6b6b 605f8378 605f8360 d99b12c0 604fc674 606b1704 d99b12c0
  NIP [602f5328] pci_walk_bus+0xd4/0x104

[bhelgaas: changelog, stable tag]
Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
CC: stable@vger.kernel.org
2016-01-25 10:08:00 -06:00
aer             PCI/AER: Flush workqueue on device remove to avoid use-after-free          2016-01-25 10:08:00 -06:00
aspm.c          PCI/ASPM: Make sysfs link_state_store() consistent with link_state_show()  2015-12-03 10:42:59 -06:00
Kconfig         PCI / PM: Drop CONFIG_PM_RUNTIME from the PCI core                         2014-12-04 00:50:33 +01:00
Makefile        PCI: PCIe: Move PCIe PME code to the pcie directory                        2010-08-24 13:47:48 -07:00
pme.c           PCI / PM: handle failure to enable wakeup on PCIe PME                      2014-10-23 22:47:28 +02:00
portdrv_acpi.c  PCI: Fix missing prototype for pcie_port_acpi_setup()                      2013-04-12 11:17:47 -06:00
portdrv_bus.c   PCI: Fix whitespace, capitalization, and spelling errors                   2013-11-14 11:28:18 -07:00
portdrv_core.c  PCI: Fix pcie_port_device_resume() comment                                 2015-07-14 13:41:04 -05:00
portdrv_pci.c   PCI/PM: Drop unused runtime PM support code for PCIe ports                 2014-09-02 17:12:15 -06:00
portdrv.h       PCI: Fix whitespace, capitalization, and spelling errors                   2013-11-14 11:28:18 -07:00