* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/suspend-2.6:
PCI PM: Make pci_prepare_to_sleep() disable wake-up if needed
radeonfb: Use __pci_complete_power_transition()
PCI PM: Introduce __pci_[start|complete]_power_transition() (rev. 2)
PCI PM: Restore config spaces of all devices during early resume
PCI PM: Make pci_set_power_state() handle devices with no PM support
PCI PM: Put devices into low power states during late suspend (rev. 2)
PCI PM: Move pci_restore_standard_config to pci-driver.c
PCI PM: Use pci_set_power_state during early resume
PCI PM: Consistently use variable name "error" for pm call return values
kexec: Change kexec jump code ordering
PM: Change hibernation code ordering
PM: Change suspend code ordering
PM: Rework handling of interrupts during suspend-resume
PM: Introduce functions for suspending and resuming device interrupts
* 'iommu-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip: (60 commits)
dma-debug: make memory range checks more consistent
dma-debug: warn of unmapping an invalid dma address
dma-debug: fix dma_debug_add_bus() definition for !CONFIG_DMA_API_DEBUG
dma-debug/x86: register pci bus for dma-debug leak detection
dma-debug: add a check dma memory leaks
dma-debug: add checks for kernel text and rodata
dma-debug: print stacktrace of mapping path on unmap error
dma-debug: Documentation update
dma-debug: x86 architecture bindings
dma-debug: add function to dump dma mappings
dma-debug: add checks for sync_single_sg_*
dma-debug: add checks for sync_single_range_*
dma-debug: add checks for sync_single_*
dma-debug: add checking for [alloc|free]_coherent
dma-debug: add checking for map/unmap_sg
dma-debug: add checking for map/unmap_page/single
dma-debug: add core checking functions
dma-debug: add debugfs interface
dma-debug: add kernel command line parameters
dma-debug: add initialization code
...
Fix trivial conflicts due to whitespace changes in arch/x86/kernel/pci-nommu.c
If the device is not supposed to wake up the system, i.e. when
device_may_wakeup(&dev->dev) returns 'false', pci_prepare_to_sleep()
should pass 'false' to pci_enable_wake() so that it calls the
platform to disable the wake-up capability of the device.
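The behaviour described above can be sketched in user space as follows.
This is only an illustration under stated assumptions: struct
mock_pci_dev, mock_enable_wake() and mock_prepare_to_sleep() are
hypothetical stand-ins, not the kernel's types or functions, and the
real pci_prepare_to_sleep() also chooses a target power state.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for struct pci_dev; not the kernel definition. */
struct mock_pci_dev {
	bool may_wakeup;   /* what device_may_wakeup() would report */
	bool wake_enabled; /* wake-up setting last requested from the platform */
};

/* Stand-in for pci_enable_wake(): records the requested wake-up setting. */
static int mock_enable_wake(struct mock_pci_dev *dev, bool enable)
{
	assert(dev != NULL);
	dev->wake_enabled = enable;
	return 0;
}

/*
 * Sketch of the described fix: the wake-up policy is always passed
 * through, so 'false' actively disables wake-up instead of leaving a
 * stale setting in place.
 */
int mock_prepare_to_sleep(struct mock_pci_dev *dev)
{
	return mock_enable_wake(dev, dev->may_wakeup);
}
```

With this shape, a device whose wake-up was enabled earlier but which
may no longer wake the system has the capability switched off on the
next call.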
Signed-off-by: Rafael J. Wysocki <rjw@sisk.pl>
Acked-by: Jesse Barnes <jbarnes@virtuousgeek.org>
The radeonfb driver needs to program the device's PMCSR directly due
to some quirky hardware it has to handle (see
http://bugzilla.kernel.org/show_bug.cgi?id=12846 for details) and
after doing that it needs to call the platform (usually ACPI) to
finish the power transition of the device. Currently it uses
pci_set_power_state() for this purpose, but in doing so it makes a
specific assumption about the internal behavior of that function,
which has changed recently so that the assumption no longer holds.
For this reason, introduce __pci_complete_power_transition() that may
be called by the radeonfb driver to complete the power transition of
the device. For symmetry, introduce __pci_start_power_transition().
Signed-off-by: Rafael J. Wysocki <rjw@sisk.pl>
Acked-by: Jesse Barnes <jbarnes@virtuousgeek.org>
At present the configuration spaces of PCI devices that have no
drivers or no PM support in the drivers (either legacy or through a
pm object) are not saved during suspend and, consequently, they are
not restored during resume. In general, this may lead to the state of
the system being slightly inconsistent after the resume, so it's
better to save and restore the configuration spaces of these devices
as well.
Signed-off-by: Rafael J. Wysocki <rjw@sisk.pl>
Acked-by: Ingo Molnar <mingo@elte.hu>
Acked-by: Jesse Barnes <jbarnes@virtuousgeek.org>
PCI devices without any PM support (either native or through the
platform) have the problem that pci_set_power_state() always returns
an error code for them, even if they are being put into D0.
However, such devices are always in D0, so pci_set_power_state()
should return success when attempting to put such a device into D0.
It also should update the current_state field for these devices as
appropriate. This modification is necessary so that the standard
configuration registers of these devices are successfully restored by
pci_restore_standard_config() during the "early" phase of resume.
In addition, pci_set_power_state() should check the value of
current_state before calling the platform to change the power state
of the device to avoid doing that unnecessarily.
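A minimal user-space sketch of the decision logic described above,
assuming a simplified device model. The names mock_dev,
mock_set_power_state() and the MOCK_* states are hypothetical and only
illustrate the intended return values; they are not the kernel code.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical power states; MOCK_UNKNOWN models an unset current_state. */
enum mock_power_state { MOCK_D0, MOCK_D1, MOCK_D2, MOCK_D3hot, MOCK_UNKNOWN };

struct mock_dev {
	bool has_pm;                         /* native or platform PM support */
	enum mock_power_state current_state;
	int platform_calls;                  /* counts simulated platform calls */
};

int mock_set_power_state(struct mock_dev *dev, enum mock_power_state state)
{
	assert(dev != NULL);

	if (!dev->has_pm) {
		/* Such devices are always in D0, so only D0 can succeed. */
		if (state != MOCK_D0)
			return -1; /* stands in for an error code */
		dev->current_state = MOCK_D0;
		return 0;
	}

	/* Skip the platform call if the device is already in the target state. */
	if (dev->current_state == state)
		return 0;

	dev->platform_calls++;
	dev->current_state = state;
	return 0;
}
```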
Signed-off-by: Rafael J. Wysocki <rjw@sisk.pl>
Acked-by: Ingo Molnar <mingo@elte.hu>
Acked-by: Jesse Barnes <jbarnes@virtuousgeek.org>
Now that we have allowed timer interrupts to be enabled during the
late phase of suspending devices, we are able to use the generic
pci_set_power_state() to put PCI devices into low power states at
that time. We can also use some related platform callbacks, like the
ones preparing devices for wake-up, during the late suspend.
Doing this will allow us to avoid the race condition where a device
using shared interrupts is put into a low power state with interrupts
enabled and then an interrupt (for another device) comes in and
confuses its driver. At the same time, devices that don't support
the native PCI PM or that require some additional, platform-specific
operations to be carried out to put them into low power states will
be handled as appropriate.
Signed-off-by: Rafael J. Wysocki <rjw@sisk.pl>
Acked-by: Ingo Molnar <mingo@elte.hu>
Acked-by: Jesse Barnes <jbarnes@virtuousgeek.org>
Move pci_restore_standard_config() from pci.c to pci-driver.c and
make it static.
Signed-off-by: Rafael J. Wysocki <rjw@sisk.pl>
Acked-by: Ingo Molnar <mingo@elte.hu>
Acked-by: Jesse Barnes <jbarnes@virtuousgeek.org>
Now that we have allowed timer interrupts to be enabled during the
early phase of resuming devices, we are able to use the generic
pci_set_power_state() to put PCI devices into D0 at that time. Then,
the platform-specific PM code will have a chance to handle devices
that don't implement the native PCI PM or that require some
additional, platform-specific operations to be carried out to power
them up. Also, by doing this we can simplify the code quite a bit.
Signed-off-by: Rafael J. Wysocki <rjw@sisk.pl>
Acked-by: Ingo Molnar <mingo@elte.hu>
Acked-by: Jesse Barnes <jbarnes@virtuousgeek.org>
I noticed two functions use a variable "i" to store the return value of PM
function calls while the rest of the file uses "error". As "i" normally
indicates a counter of some sort it seems better to keep this consistent.
Signed-off-by: Frans Pop <elendil@planet.nl>
Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
Signed-off-by: Rafael J. Wysocki <rjw@sisk.pl>
* 'irq-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip: (32 commits)
x86: disable __do_IRQ support
sparseirq, powerpc/cell: fix unused variable warning in interrupt.c
genirq: deprecate obsolete typedefs and defines
genirq: deprecate __do_IRQ
genirq: add doc to struct irqaction
genirq: use kzalloc instead of explicit zero initialization
genirq: make irqreturn_t an enum
genirq: remove redundant if condition
genirq: remove unused hw_irq_controller typedef
irq: export remove_irq() and setup_irq() symbols
irq: match remove_irq() args with setup_irq()
irq: add remove_irq() for freeing of setup_irq() irqs
genirq: assert that irq handlers are indeed running in hardirq context
irq: name 'p' variables a bit better
irq: further clean up the free_irq() code flow
irq: refactor and clean up the free_irq() code flow
irq: clean up manage.c
irq: use GFP_KERNEL for action allocation in request_irq()
kernel/irq: fix sparse warning: make symbol static
irq: optimize init_kstat_irqs/init_copy_kstat_irqs
...
Impact: invalid use of GFP_KERNEL in interrupt context
Queued invalidation and interrupt-remapping will get initialized with
interrupts disabled (while enabling interrupt-remapping). So use
GFP_ATOMIC instead of GFP_KERNEL for memory allocations.
Signed-off-by: Suresh Siddha <suresh.b.siddha@intel.com>
Signed-off-by: H. Peter Anvin <hpa@linux.intel.com>
Impact: fix interrupt table entry leak
Fix the typo which was not clearing all the interrupt remapping table
entries corresponding to an irq.
Signed-off-by: Suresh Siddha <suresh.b.siddha@intel.com>
Signed-off-by: H. Peter Anvin <hpa@linux.intel.com>
Impact: cleanup/sanitization
Start from a sane state while enabling DMA and interrupt-remapping by
clearing the previously recorded faults and disabling previously
enabled queued invalidation and interrupt-remapping.
Signed-off-by: Suresh Siddha <suresh.b.siddha@intel.com>
Signed-off-by: H. Peter Anvin <hpa@linux.intel.com>
Impact: new interfaces (not yet used)
Routines for disabling queued invalidation and interrupt remapping.
Signed-off-by: Suresh Siddha <suresh.b.siddha@intel.com>
Signed-off-by: H. Peter Anvin <hpa@linux.intel.com>
Impact: interface augmentation (not yet used)
Enable the fault handling flow for intr-remapping as well. The fault
handling code is now shared by both dma-remapping and intr-remapping.
Signed-off-by: Suresh Siddha <suresh.b.siddha@intel.com>
Signed-off-by: H. Peter Anvin <hpa@linux.intel.com>
Impact: code movement
Move page fault handling code to dmar.c.
This code will be shared by both the DMA-remapping and Intr-remapping code.
Signed-off-by: Suresh Siddha <suresh.b.siddha@intel.com>
Signed-off-by: H. Peter Anvin <hpa@linux.intel.com>
Impact: fix potential deadlock on x2apic
fix "hard-safe -> hard-unsafe lock order detected" with irq_2_ir_lock
On x2apic enabled system:
[ INFO: hard-safe -> hard-unsafe lock order detected ]
2.6.27-03151-g4480f15b #1
------------------------------------------------------
swapper/1 [HC0[0]:SC0[0]:HE0:SE1] is trying to acquire:
(irq_2_ir_lock){--..}, at: [<ffffffff8038ebc0>] get_irte+0x2f/0x95
and this task is already holding:
(&irq_desc_lock_class){+...}, at: [<ffffffff802649ed>] setup_irq+0x67/0x281
which would create a new lock dependency:
(&irq_desc_lock_class){+...} -> (irq_2_ir_lock){--..}
but this new dependency connects a hard-irq-safe lock:
(&irq_desc_lock_class){+...}
... which became hard-irq-safe at:
[<ffffffffffffffff>] 0xffffffffffffffff
to a hard-irq-unsafe lock:
(irq_2_ir_lock){--..}
... which became hard-irq-unsafe at:
... [<ffffffff802547b5>] __lock_acquire+0x571/0x706
[<ffffffff8025499f>] lock_acquire+0x55/0x71
[<ffffffff8062f2c4>] _spin_lock+0x2c/0x38
[<ffffffff8038ee50>] alloc_irte+0x8a/0x14b
[<ffffffff8021f733>] setup_IO_APIC_irq+0x119/0x30e
[<ffffffff8090860e>] setup_IO_APIC+0x146/0x6e5
[<ffffffff809058fc>] native_smp_prepare_cpus+0x24e/0x2e9
[<ffffffff808f982c>] kernel_init+0x5a/0x176
[<ffffffff8020c289>] child_rip+0xa/0x11
[<ffffffffffffffff>] 0xffffffffffffffff
Fix this theoretical lock order issue by using spin_lock_irqsave() instead of
spin_lock().
Signed-off-by: Suresh Siddha <suresh.b.siddha@intel.com>
Signed-off-by: H. Peter Anvin <hpa@linux.intel.com>
The PCIe port driver calls pci_enable_device() during probe but
never calls pci_disable_device() during remove.
Cc: stable@kernel.org
Signed-off-by: Alex Chiang <achiang@hp.com>
Signed-off-by: Matthew Wilcox <willy@linux.intel.com>
Prakash's system needs MSI disabled on some bridges, but not all.
This seems to be the minimal fix for 2.6.29, but should be replaced
during 2.6.30.
Signed-off-by: Prakash Punnoor <prakash@punnoor.de>
Signed-off-by: Matthew Wilcox <willy@linux.intel.com>
The RPA PCI hotplug driver calls EEH routines, so it should depend on
EEH. Also, PPC_PSERIES implies PPC64, so remove the PPC64 dependency.
Signed-off-by: Michael Ellerman <michael@ellerman.id.au>
Signed-off-by: Matthew Wilcox <willy@linux.intel.com>
Commit 47a8b0cc (Enable PCIe AER only after checking firmware
support) wants to walk the PCI bus in the remove path to disable
AER, and calls pci_walk_bus for downstream bridges.
Unfortunately, in the remove path, we remove devices and bridges
in a depth-first manner, starting with the furthest downstream
bridge and working our way backwards.
The furthest downstream bridges will not have a dev->subordinate,
and we hit a NULL deref in pci_walk_bus.
Check for dev->subordinate first before attempting to walk the
PCI hierarchy below us.
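The guard described above can be illustrated with a small user-space
model. The structures and the functions mock_walk_bus() and
mock_disable_aer_below() are hypothetical stand-ins chosen for this
sketch; only the idea of testing the subordinate pointer before walking
comes from the description.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical, simplified bus and bridge models. */
struct mock_bus {
	int ndevs; /* devices on this bus */
};

struct mock_bridge {
	struct mock_bus *subordinate; /* NULL for the furthest downstream bridge */
};

/* Stand-in for pci_walk_bus(): needs a valid bus, returns devices visited. */
static int mock_walk_bus(const struct mock_bus *bus)
{
	assert(bus != NULL);
	return bus->ndevs;
}

/* Walk the hierarchy below the bridge only if there is one. */
int mock_disable_aer_below(const struct mock_bridge *dev)
{
	if (!dev->subordinate)
		return 0;
	return mock_walk_bus(dev->subordinate);
}
```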
Acked-by: Andrew Patterson <andrew.patterson@hp.com>
Signed-off-by: Alex Chiang <achiang@hp.com>
Signed-off-by: Matthew Wilcox <willy@linux.intel.com>
This patch is intended to disable L0s ASPM link state for 82598 (ixgbe)
parts due to the fact that it is possible to corrupt TX data when coming
back out of L0s on some systems. The workaround had been added for 82575
(igb) previously, but did not use the ASPM API. This quirk uses the ASPM
API to prevent the ASPM subsystem from re-enabling the L0s state.
Instead of adding the igb fix to the ixgbe driver as well, it was
decided to move it into a PCI quirk. Moving the fix out of the driver
and into a PCI quirk is necessary to prevent the issue from occurring
before the driver is loaded, which covers the case where the device is
passed to a VM via direct assignment.
Signed-off-by: Alexander Duyck <alexander.h.duyck@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
CC: Jesse Barnes <jbarnes@virtuousgeek.org>
Signed-off-by: Matthew Wilcox <willy@linux.intel.com>
* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jbarnes/pci-2.6:
PCI: AMD 813x B2 devices do not need boot interrupt quirk
PCI: Enable PCIe AER only after checking firmware support
PCI: pciehp: Handle interrupts that happen during initialization.
PCI: don't enable too many HT MSI mappings
PCI: add some sysfs ABI docs
PCI quirk: enable MSI on 8132
Turns out that the new AMD 813x devices do not need the
quirk_disable_amd_813x_boot_interrupt quirk to be run on them. If it
is, no interrupts are seen on the PCI-X adapter.
From: Stefan Assmann <sassmann@novell.com>
Reported-by: Jamie Wellnitz <Jamie.Wellnitz@emulex.com>
Tested-by: Jamie Wellnitz <Jamie.Wellnitz@emulex.com>
Cc: stable <stable@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
Signed-off-by: Jesse Barnes <jbarnes@hobbes.lan>
The PCIe port driver currently sets the PCIe AER error reporting bits for
any root or switch port without first checking to see if firmware will grant
control. This patch moves setting these bits to the AER service driver
aer_enable_port routine. The bits are then set for the root port and any
downstream switch ports after the check for firmware support (aer_osc_setup)
is made. The patch also unsets the bits in a similar fashion when the AER
service driver is unloaded.
Reviewed-by: Alex Chiang <achiang@hp.com>
Signed-off-by: Andrew Patterson <andrew.patterson@hp.com>
Signed-off-by: Jesse Barnes <jbarnes@hobbes.lan>
Move the enabling of interrupts after all of the data structures
are set up so that we can safely run the interrupt handler as
soon as it is registered.
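The ordering requirement can be sketched in user space as follows. All
names here (mock_ctrl, mock_isr(), mock_request_irq(),
mock_ctrl_init()) are hypothetical; the registration stand-in calls the
handler immediately to model an interrupt that is already pending.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical controller state that the handler depends on. */
struct mock_ctrl {
	int *event_queue;
	int handled;
};

typedef void (*mock_handler_t)(struct mock_ctrl *);

/* The handler is only safe once event_queue has been set up. */
static void mock_isr(struct mock_ctrl *ctrl)
{
	assert(ctrl->event_queue != NULL);
	ctrl->event_queue[0]++;
	ctrl->handled++;
}

/* Stand-in for registering a handler; the interrupt fires at once. */
static void mock_request_irq(struct mock_ctrl *ctrl, mock_handler_t handler)
{
	handler(ctrl);
}

/* Set up every data structure first, register the handler last. */
void mock_ctrl_init(struct mock_ctrl *ctrl, int *queue_storage)
{
	ctrl->event_queue = queue_storage;
	ctrl->handled = 0;
	mock_request_irq(ctrl, mock_isr);
}
```

If the registration were moved above the assignments in this model, the
handler would run against an uninitialized structure.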
Reviewed-by: Kenji Kaneshige <kaneshige.kenji@jp.fujitsu.com>
Tested-by: Kenji Kaneshige <kaneshige.kenji@jp.fujitsu.com>
Signed-off-by: Eric W. Biederman <ebiederm@aristanetworks.com>
Signed-off-by: Jesse Barnes <jbarnes@hobbes.lan>
Prakash reported that his c51-mcp51 ondie sound card doesn't work with
MSI. But if he hacks out the HT-MSI quirk, MSI works fine.
So this patch reworks the nv_msi_ht_cap_quirk(). It will now only
enable ht_msi on its own root device, avoiding enabling it on devices
following that root dev.
Reported-by: Prakash Punnoor <prakash@punnoor.de>
Tested-by: Prakash Punnoor <prakash@punnoor.de>
Signed-off-by: Yinghai Lu <yinghai@kernel.org>
Signed-off-by: Jesse Barnes <jbarnes@hobbes.lan>
David reported that LSI SAS doesn't work with MSI. It turns out that
his BIOS doesn't enable it, but the HT MSI 8132 does support HT MSI.
Add a quirk to enable it.
Cc: stable@kernel.org
Reported-by: David Lang <david@lang.hm>
Signed-off-by: Yinghai Lu <yinghai@kernel.org>
Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
This is the cause of the DMA faults and disk corruption that people have
been seeing. Some chipsets neglect to report the RWBF "capability" --
the flag which says that we need to flush the chipset write-buffer when
changing the DMA page tables, to ensure that the change is visible to
the IOMMU.
Override that bit on the affected chipsets, and everything is happy
again.
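A rough sketch of what overriding the bit amounts to, assuming the
RWBF flag sits at bit 4 of the capability register and that affected
chipsets are identified by some quirk check. Both the bit position as
written here and the names mock_needs_wbf() and MOCK_CAP_RWBF are
assumptions for illustration, not the code of this patch.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Assumed position of the RWBF flag in the capability register. */
#define MOCK_CAP_RWBF (UINT64_C(1) << 4)

/*
 * Flush the chipset write-buffer if the hardware reports RWBF, or if
 * the chipset is known to need it despite not reporting it.
 */
bool mock_needs_wbf(uint64_t cap, bool chipset_is_affected)
{
	return chipset_is_affected || (cap & MOCK_CAP_RWBF) != 0;
}

/* Usage check covering the reported, overridden, and clear cases. */
void mock_rwbf_example(void)
{
	assert(mock_needs_wbf(MOCK_CAP_RWBF, false)); /* reported by hardware */
	assert(mock_needs_wbf(0, true));              /* forced by the quirk */
	assert(!mock_needs_wbf(0, false));            /* not needed */
}
```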
Thanks to Chris and Bhavesh and others for helping to debug.
Signed-off-by: David Woodhouse <David.Woodhouse@intel.com>
Tested-by: Chris Wright <chrisw@sous-sol.org>
Reviewed-by: Bhavesh Davda <bhavesh@vmware.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
This is the cause of the DMA faults and disk corruption that people have
been seeing. Some chipsets neglect to report the RWBF "capability" --
the flag which says that we need to flush the chipset write-buffer when
changing the DMA page tables, to ensure that the change is visible to
the IOMMU.
Override that bit on the affected chipsets, and everything is happy
again.
Thanks to Chris and Bhavesh and others for helping to debug.
Should resolve:
https://bugzilla.redhat.com/show_bug.cgi?id=479996
http://bugzilla.kernel.org/show_bug.cgi?id=12578
Signed-off-by: David Woodhouse <David.Woodhouse@intel.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Tested-and-acked-by: Chris Wright <chrisw@sous-sol.org>
Reviewed-by: Bhavesh Davda <bhavesh@vmware.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Fix missing kernel-doc parameter notation in pci.c, correct a
function name, and fix a typo:
Warning(linux-2.6.28-git10//drivers/pci/pci.c:1511): No description found for parameter 'exclusive'
Signed-off-by: Randy Dunlap <randy.dunlap@oracle.com>
Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
Hidetoshi Seto points out that commit
bffac3c593 has wrong values in the array.
Rather than correct the array, we can just use a bounds check and
perform the calculation specified in the comment. As a bonus, this will
not run off the end of the array if the device specifies an illegal
value in the MSI capability.
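The approach can be sketched as below. The commit text does not spell
out the calculation, so this sketch assumes it is a mask with the low
2^x bits set, where x is the encoded field from the MSI capability; the
function name mock_msi_mask() is hypothetical.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Compute a mask with the low 2^x bits set. For x >= 5 the shift would
 * be at least the width of the type, so return all ones instead; this
 * also covers illegal values reported by a device.
 */
uint32_t mock_msi_mask(unsigned int x)
{
	if (x >= 5)
		return UINT32_C(0xffffffff);
	return (UINT32_C(1) << (1u << x)) - 1;
}

/* Usage check for the legal range and for an out-of-range value. */
void mock_msi_mask_example(void)
{
	assert(mock_msi_mask(0) == UINT32_C(0x1));
	assert(mock_msi_mask(2) == UINT32_C(0xf));
	assert(mock_msi_mask(4) == UINT32_C(0xffff));
	assert(mock_msi_mask(7) == UINT32_C(0xffffffff));
}
```

Computing the value removes the table that held the wrong entries, and
the bounds check replaces an out-of-range array index with a defined
result.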
Signed-off-by: Matthew Wilcox <willy@linux.intel.com>
Signed-off-by: Hidetoshi Seto <seto.hidetoshi@jp.fujitsu.com>
Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>