Commit Graph

170 Commits

Author SHA1 Message Date
Mika Westerberg
9d26d3a8f1 PCI: Put PCIe ports into D3 during suspend
Currently the Linux PCI core does not touch power state of PCI bridges and
PCIe ports when system suspend is entered.  Leaving them in D0 consumes
power unnecessarily and may prevent the CPU from entering deeper C-states.

With recent PCIe hardware we can power down the ports to save power given
that we take into account few restrictions:

  - The PCIe port hardware is recent enough, starting from 2015.

  - Devices connected to PCIe ports are effectively in D3cold once the port
    is transitioned to D3 (the config space is not accessible anymore and
    the link may be powered down).

  - Devices behind the PCIe port need to be allowed to transition to D3cold
    and back.  There is a way both drivers and userspace can forbid this.

  - If the device behind the PCIe port is capable of waking the system it
    needs to be able to do so from D3cold.

This patch adds a new flag to struct pci_device called 'bridge_d3'.  This
flag is set and cleared by the PCI core whenever there is a change in power
management state of any of the devices behind the PCIe port.  When system
later on is suspended we only need to check this flag and if it is true
transition the port to D3 otherwise we leave it in D0.

Also provide override mechanism via command line parameter
"pcie_port_pm=[off|force]" that can be used to disable or enable the
feature regardless of the BIOS manufacturing date.

Tested-by: Lukas Wunner <lukas@wunner.de>
Signed-off-by: Mika Westerberg <mika.westerberg@linux.intel.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Acked-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2016-06-13 14:57:36 -05:00
Imre Deak
06bf403de3 PCI / PM: Tune down retryable runtime suspend error messages
The runtime PM core doesn't treat EBUSY and EAGAIN retvals from the driver
suspend hooks as errors, but they still show up as errors in dmesg. Tune
them down. See rpm_suspend() for details of handling these return values.

Note that we use dev_dbg() for the retryable retvals, so after this
change you'll need either CONFIG_DYNAMIC_DEBUG or CONFIG_PCI_DEBUG
for them to show up in the log.

One problem caused by this was noticed by Daniel: the i915 driver
returns EAGAIN to signal a temporary failure to suspend and as a request
towards the RPM core for scheduling a suspend again. This is a normal
event, but the resulting error message flags a breakage during the
driver's automated testing which parses dmesg and picks up the error.

Reported-by: Daniel Vetter <daniel.vetter@intel.com>
Link: https://bugs.freedesktop.org/show_bug.cgi?id=92992
Signed-off-by: Imre Deak <imre.deak@intel.com>
Acked-by: Bjorn Helgaas <bhelgaas@google.com>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2015-12-02 15:24:21 +01:00
Linus Torvalds
3c87b79188 PCI changes for the v4.4 merge window:
Resource management
     Add support for Enhanced Allocation devices (Sean O. Stalley)
     Add Enhanced Allocation register entries (Sean O. Stalley)
     Handle IORESOURCE_PCI_FIXED when sizing resources (David Daney)
     Handle IORESOURCE_PCI_FIXED when assigning resources (David Daney)
     Handle Enhanced Allocation capability for SR-IOV devices (David Daney)
     Clear IORESOURCE_UNSET when reverting to firmware-assigned address (Bjorn Helgaas)
     Make Enhanced Allocation bitmasks more obvious (Bjorn Helgaas)
     Expand Enhanced Allocation BAR output (Bjorn Helgaas)
     Add of_pci_check_probe_only to parse "linux,pci-probe-only" (Marc Zyngier)
     Fix lookup of linux,pci-probe-only property (Marc Zyngier)
     Add sparc mem64 resource parsing for root bus (Yinghai Lu)
 
   PCI device hotplug
     pciehp: Queue power work requests in dedicated function (Guenter Roeck)
 
   Driver binding
     Add builtin_pci_driver() to avoid registration boilerplate (Paul Gortmaker)
 
   Virtualization
     Set SR-IOV NumVFs to zero after enumeration (Alexander Duyck)
     Remove redundant validation of SR-IOV offset/stride registers (Alexander Duyck)
     Remove VFs in reverse order if virtfn_add() fails (Alexander Duyck)
     Reorder pcibios_sriov_disable() (Alexander Duyck)
     Wait 1 second between disabling VFs and clearing NumVFs (Alexander Duyck)
     Fix sriov_enable() error path for pcibios_enable_sriov() failures (Alexander Duyck)
     Enable SR-IOV ARI Capable Hierarchy before reading TotalVFs (Ben Shelton)
     Don't try to restore VF BARs (Wei Yang)
 
   MSI
     Don't alloc pcibios-irq when MSI is enabled (Joerg Roedel)
     Add msi_controller setup_irqs() method for special multivector setup (Lucas Stach)
     Export all remapped MSIs to sysfs attributes (Romain Bezut)
     Disable MSI on SiS 761 (Ondrej Zary)
 
   AER
     Clear error status registers during enumeration and restore (Taku Izumi)
 
   Generic host bridge driver
     Fix lookup of linux,pci-probe-only property (Marc Zyngier)
     Allow multiple hosts with different map_bus() methods (David Daney)
     Pass starting bus number to pci_scan_root_bus() (David Daney)
     Fix address window calculation for non-zero starting bus (David Daney)
 
   Altera host bridge driver
     Add msi.h to ARM Kbuild (Ley Foon Tan)
     Add Altera PCIe host controller driver (Ley Foon Tan)
     Add Altera PCIe MSI driver (Ley Foon Tan)
 
   APM X-Gene host bridge driver
     Remove msi_controller assignment (Duc Dang)
 
   Broadcom iProc host bridge driver
     Fix header comment "Corporation" misspelling (Florian Fainelli)
     Fix code comment to match code (Ray Jui)
     Remove unused struct iproc_pcie.irqs[] (Ray Jui)
     Call pci_fixup_irqs() for ARM64 as well as ARM (Ray Jui)
     Fix PCIe reset logic (Ray Jui)
     Improve link detection logic (Ray Jui)
     Update PCIe device tree bindings (Ray Jui)
     Add outbound mapping support (Ray Jui)
 
   Freescale i.MX6 host bridge driver
     Return real error code from imx6_add_pcie_port() (Fabio Estevam)
     Add PCIE_PHY_RX_ASIC_OUT_VALID definition (Fabio Estevam)
 
   Freescale Layerscape host bridge driver
     Remove ls_pcie_establish_link() (Minghuan Lian)
     Ignore PCIe controllers in Endpoint mode (Minghuan Lian)
     Factor out SCFG related function (Minghuan Lian)
     Update ls_add_pcie_port() (Minghuan Lian)
     Remove unused fields from struct ls_pcie (Minghuan Lian)
     Add support for LS1043a and LS2080a (Minghuan Lian)
     Add ls_pcie_msi_host_init() (Minghuan Lian)
 
   HiSilicon host bridge driver
     Add HiSilicon SoC Hip05 PCIe driver (Zhou Wang)
 
   Marvell MVEBU host bridge driver
     Return zero for reserved or unimplemented config space (Russell King)
     Use exact config access size; don't read/modify/write (Russell King)
     Use of_get_available_child_count() (Russell King)
     Use for_each_available_child_of_node() to walk child nodes (Russell King)
     Report full node name when reporting a DT error (Russell King)
     Use port->name rather than "PCIe%d.%d" (Russell King)
     Move port parsing and resource claiming to  separate function (Russell King)
     Fix memory leaks and refcount leaks (Russell King)
     Split port parsing and resource claiming from  port setup (Russell King)
     Use gpio_set_value_cansleep() (Russell King)
     Use devm_kcalloc() to allocate an array (Russell King)
     Use gpio_desc to carry around gpio (Russell King)
     Improve clock/reset handling (Russell King)
     Add PCI Express root complex capability block (Russell King)
     Remove code restricting accesses to slot 0 (Russell King)
 
   NVIDIA Tegra host bridge driver
     Wrap static pgprot_t initializer with __pgprot() (Ard Biesheuvel)
 
   Renesas R-Car host bridge driver
     Build pci-rcar-gen2.c only on ARM (Geert Uytterhoeven)
     Build pcie-rcar.c only on ARM (Geert Uytterhoeven)
     Make PCI aware of the I/O resources (Phil Edworthy)
     Remove dependency on ARM-specific struct hw_pci (Phil Edworthy)
     Set root bus nr to that provided in DT (Phil Edworthy)
     Fix I/O offset for multiple host bridges (Phil Edworthy)
 
   ST Microelectronics SPEAr13xx host bridge driver
     Fix dw_pcie_cfg_read/write() usage (Gabriele Paoloni)
 
   Synopsys DesignWare host bridge driver
     Make "clocks" and "clock-names" optional DT properties (Bhupesh Sharma)
     Use exact access size in dw_pcie_cfg_read() (Gabriele Paoloni)
     Simplify dw_pcie_cfg_read/write() interfaces (Gabriele Paoloni)
     Require config accesses to be naturally aligned (Gabriele Paoloni)
     Make "num-lanes" an optional DT property (Gabriele Paoloni)
     Move calculation of bus addresses to DRA7xx (Gabriele Paoloni)
     Replace ARM pci_sys_data->align_resource with global function pointer (Gabriele Paoloni)
     Factor out MSI msg setup (Lucas Stach)
     Implement multivector MSI IRQ setup (Lucas Stach)
     Make get_msi_addr() return phys_addr_t, not u32 (Lucas Stach)
     Set up high part of MSI target address (Lucas Stach)
     Fix PORT_LOGIC_LINK_WIDTH_MASK (Zhou Wang)
     Revert "PCI: designware: Program ATU with untranslated address" (Zhou Wang)
     Use of_pci_get_host_bridge_resources() to parse DT (Zhou Wang)
     Make driver arch-agnostic (Zhou Wang)
 
   Miscellaneous
     Make x86 pci_subsys_init() static (Alexander Kuleshov)
     Turn off Request Attributes to avoid Chelsio T5 Completion erratum (Hariprasad Shenai)
 -----BEGIN PGP SIGNATURE-----
 Version: GnuPG v1
 
 iQIcBAABAgAGBQJWPM9aAAoJEFmIoMA60/r89f8QALFMHegqKk5M08ZcCaG7unLF
 5U5t88y6KF/kNYF6HOqLQiHh79U3ToU5HdrNaAtutr0UnvgbFC2WulLqKYgLiq6Y
 YJnwR3EfgGmdG7DKAVAXq19+nc2hgTPAEe8sciU7HKlTbqQmGj6//y3sQULGNLx3
 zur0C33DCtrDgKDP7to273lkHO8Vl0YuLzyqwt0ePCMNcXR0h1dK8QxSTjuXBxaR
 c+T1V1Hw64MTxLz3xJd1/1ipy32u+9LnqqcdRz0zRq6qi48G9ch/i4Z6DHa8kTbj
 DUZrrTYKILQ2TcjcZSBmTueX11Z1Xa4/I/45sehIi6gVWL9qQbmGpt2E5YtFED+4
 GdcmBSbWG/qsNsabXk38uiM3ww7+ltXEOhTXbcK+EgjvIhE6gSK/plYG0fU9pybs
 AKViEXVdHoT1X0N1dLK12mq7kvDCQvShHn08lz97Q9YrZ32wv1Fnij6WVSbJvfWt
 DubtPtisVM+rVy+VTpOImNR9wO54lTmG5jK53yNqH7I20K89y1kqARlN9nMXMB1a
 2nQnwe9yWlsGj9gVNCn1KmyQSPOWjg+3Z+ekfwbxpca14s1AaN3Jm0N9Z61dXFoF
 y2ygoQtZ/z9BHr3quBpxXGt+aVUg2kcNw5GYeDYiALxXdJSObyzRrZ6HDb/zicU2
 ZH9hBj0ctXvucmy6I2mt
 =uZrt
 -----END PGP SIGNATURE-----

Merge tag 'pci-v4.4-changes' of git://git.kernel.org/pub/scm/linux/kernel/git/helgaas/pci

Pull PCI updates from Bjorn Helgaas:
 "Resource management:
   - Add support for Enhanced Allocation devices (Sean O. Stalley)
   - Add Enhanced Allocation register entries (Sean O. Stalley)
   - Handle IORESOURCE_PCI_FIXED when sizing resources (David Daney)
   - Handle IORESOURCE_PCI_FIXED when assigning resources (David Daney)
   - Handle Enhanced Allocation capability for SR-IOV devices (David Daney)
   - Clear IORESOURCE_UNSET when reverting to firmware-assigned address (Bjorn Helgaas)
   - Make Enhanced Allocation bitmasks more obvious (Bjorn Helgaas)
   - Expand Enhanced Allocation BAR output (Bjorn Helgaas)
   - Add of_pci_check_probe_only to parse "linux,pci-probe-only" (Marc Zyngier)
   - Fix lookup of linux,pci-probe-only property (Marc Zyngier)
   - Add sparc mem64 resource parsing for root bus (Yinghai Lu)

  PCI device hotplug:
   - pciehp: Queue power work requests in dedicated function (Guenter Roeck)

  Driver binding:
   - Add builtin_pci_driver() to avoid registration boilerplate (Paul Gortmaker)

  Virtualization:
   - Set SR-IOV NumVFs to zero after enumeration (Alexander Duyck)
   - Remove redundant validation of SR-IOV offset/stride registers (Alexander Duyck)
   - Remove VFs in reverse order if virtfn_add() fails (Alexander Duyck)
   - Reorder pcibios_sriov_disable() (Alexander Duyck)
   - Wait 1 second between disabling VFs and clearing NumVFs (Alexander Duyck)
   - Fix sriov_enable() error path for pcibios_enable_sriov() failures (Alexander Duyck)
   - Enable SR-IOV ARI Capable Hierarchy before reading TotalVFs (Ben Shelton)
   - Don't try to restore VF BARs (Wei Yang)

  MSI:
   - Don't alloc pcibios-irq when MSI is enabled (Joerg Roedel)
   - Add msi_controller setup_irqs() method for special multivector setup (Lucas Stach)
   - Export all remapped MSIs to sysfs attributes (Romain Bezut)
   - Disable MSI on SiS 761 (Ondrej Zary)

  AER:
   - Clear error status registers during enumeration and restore (Taku Izumi)

  Generic host bridge driver:
   - Fix lookup of linux,pci-probe-only property (Marc Zyngier)
   - Allow multiple hosts with different map_bus() methods (David Daney)
   - Pass starting bus number to pci_scan_root_bus() (David Daney)
   - Fix address window calculation for non-zero starting bus (David Daney)

  Altera host bridge driver:
   - Add msi.h to ARM Kbuild (Ley Foon Tan)
   - Add Altera PCIe host controller driver (Ley Foon Tan)
   - Add Altera PCIe MSI driver (Ley Foon Tan)

  APM X-Gene host bridge driver:
   - Remove msi_controller assignment (Duc Dang)

  Broadcom iProc host bridge driver:
   - Fix header comment "Corporation" misspelling (Florian Fainelli)
   - Fix code comment to match code (Ray Jui)
   - Remove unused struct iproc_pcie.irqs[] (Ray Jui)
   - Call pci_fixup_irqs() for ARM64 as well as ARM (Ray Jui)
   - Fix PCIe reset logic (Ray Jui)
   - Improve link detection logic (Ray Jui)
   - Update PCIe device tree bindings (Ray Jui)
   - Add outbound mapping support (Ray Jui)

  Freescale i.MX6 host bridge driver:
   - Return real error code from imx6_add_pcie_port() (Fabio Estevam)
   - Add PCIE_PHY_RX_ASIC_OUT_VALID definition (Fabio Estevam)

  Freescale Layerscape host bridge driver:
   - Remove ls_pcie_establish_link() (Minghuan Lian)
   - Ignore PCIe controllers in Endpoint mode (Minghuan Lian)
   - Factor out SCFG related function (Minghuan Lian)
   - Update ls_add_pcie_port() (Minghuan Lian)
   - Remove unused fields from struct ls_pcie (Minghuan Lian)
   - Add support for LS1043a and LS2080a (Minghuan Lian)
   - Add ls_pcie_msi_host_init() (Minghuan Lian)

  HiSilicon host bridge driver:
   - Add HiSilicon SoC Hip05 PCIe driver (Zhou Wang)

  Marvell MVEBU host bridge driver:
   - Return zero for reserved or unimplemented config space (Russell King)
   - Use exact config access size; don't read/modify/write (Russell King)
   - Use of_get_available_child_count() (Russell King)
   - Use for_each_available_child_of_node() to walk child nodes (Russell King)
   - Report full node name when reporting a DT error (Russell King)
   - Use port->name rather than "PCIe%d.%d" (Russell King)
   - Move port parsing and resource claiming to  separate function (Russell King)
   - Fix memory leaks and refcount leaks (Russell King)
   - Split port parsing and resource claiming from  port setup (Russell King)
   - Use gpio_set_value_cansleep() (Russell King)
   - Use devm_kcalloc() to allocate an array (Russell King)
   - Use gpio_desc to carry around gpio (Russell King)
   - Improve clock/reset handling (Russell King)
   - Add PCI Express root complex capability block (Russell King)
   - Remove code restricting accesses to slot 0 (Russell King)

  NVIDIA Tegra host bridge driver:
   - Wrap static pgprot_t initializer with __pgprot() (Ard Biesheuvel)

  Renesas R-Car host bridge driver:
   - Build pci-rcar-gen2.c only on ARM (Geert Uytterhoeven)
   - Build pcie-rcar.c only on ARM (Geert Uytterhoeven)
   - Make PCI aware of the I/O resources (Phil Edworthy)
   - Remove dependency on ARM-specific struct hw_pci (Phil Edworthy)
   - Set root bus nr to that provided in DT (Phil Edworthy)
   - Fix I/O offset for multiple host bridges (Phil Edworthy)

  ST Microelectronics SPEAr13xx host bridge driver:
   - Fix dw_pcie_cfg_read/write() usage (Gabriele Paoloni)

  Synopsys DesignWare host bridge driver:
   - Make "clocks" and "clock-names" optional DT properties (Bhupesh Sharma)
   - Use exact access size in dw_pcie_cfg_read() (Gabriele Paoloni)
   - Simplify dw_pcie_cfg_read/write() interfaces (Gabriele Paoloni)
   - Require config accesses to be naturally aligned (Gabriele Paoloni)
   - Make "num-lanes" an optional DT property (Gabriele Paoloni)
   - Move calculation of bus addresses to DRA7xx (Gabriele Paoloni)
   - Replace ARM pci_sys_data->align_resource with global function pointer (Gabriele Paoloni)
   - Factor out MSI msg setup (Lucas Stach)
   - Implement multivector MSI IRQ setup (Lucas Stach)
   - Make get_msi_addr() return phys_addr_t, not u32 (Lucas Stach)
   - Set up high part of MSI target address (Lucas Stach)
   - Fix PORT_LOGIC_LINK_WIDTH_MASK (Zhou Wang)
   - Revert "PCI: designware: Program ATU with untranslated address" (Zhou Wang)
   - Use of_pci_get_host_bridge_resources() to parse DT (Zhou Wang)
   - Make driver arch-agnostic (Zhou Wang)

  Miscellaneous:
   - Make x86 pci_subsys_init() static (Alexander Kuleshov)
   - Turn off Request Attributes to avoid Chelsio T5 Completion erratum (Hariprasad Shenai)"

* tag 'pci-v4.4-changes' of git://git.kernel.org/pub/scm/linux/kernel/git/helgaas/pci: (94 commits)
  PCI: altera: Add Altera PCIe MSI driver
  PCI: hisi: Add HiSilicon SoC Hip05 PCIe driver
  PCI: layerscape: Add ls_pcie_msi_host_init()
  PCI: layerscape: Add support for LS1043a and LS2080a
  PCI: layerscape: Remove unused fields from struct ls_pcie
  PCI: layerscape: Update ls_add_pcie_port()
  PCI: layerscape: Factor out SCFG related function
  PCI: layerscape: Ignore PCIe controllers in Endpoint mode
  PCI: layerscape: Remove ls_pcie_establish_link()
  PCI: designware: Make "clocks" and "clock-names" optional DT properties
  PCI: designware: Make driver arch-agnostic
  ARM/PCI: Replace pci_sys_data->align_resource with global function pointer
  PCI: designware: Use of_pci_get_host_bridge_resources() to parse DT
  Revert "PCI: designware: Program ATU with untranslated address"
  PCI: designware: Move calculation of bus addresses to DRA7xx
  PCI: designware: Make "num-lanes" an optional DT property
  PCI: designware: Require config accesses to be naturally aligned
  PCI: designware: Simplify dw_pcie_cfg_read/write() interfaces
  PCI: designware: Use exact access size in dw_pcie_cfg_read()
  PCI: spear: Fix dw_pcie_cfg_read/write() usage
  ...
2015-11-06 11:29:53 -08:00
Rafael J. Wysocki
58a1fbbb2e PM / PCI / ACPI: Kick devices that might have been reset by firmware
There is a concern that if the platform firmware was involved in
the system resume that's being completed,  some devices might have
been reset by it and if those devices had the power.direct_complete
flag set during the preceding suspend transition, they may stay
in a reset-power-on state indefinitely (until they are runtime-resumed
and then suspended again).  That may not be a big deal from the
individual device's perspective, but if the system is an SoC, it may
be prevented from entering deep SoC-wide low-power states on idle
because of that.

The devices that are most likely to be affected by this issue are
PCI devices and ACPI-enumerated devices using the general ACPI PM
domain, so to prevent it from happening for those devices, force a
runtime resume for them if they have their power.direct_complete
flags set and the platform firmware was involved in the resume
transition currently in progress.

Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2015-10-14 02:17:34 +02:00
Rafael J. Wysocki
c2df86ea92 PM / sleep: Drop pm_request_idle() from pm_generic_complete()
The pm_request_idle() in pm_generic_complete() is pointless as it is
called with the runtime PM usage counter different from zero (bumped
up by the core during the prepare phase of system suspend) and the
core calls pm_runtime_put() for all devices after executing their
complete callbacks, so drop it.

This allows the PCI PM layer to use pm_generic_complete() too.

Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Reviewed-by: Ulf Hansson <ulf.hansson@linaro.org>
2015-10-14 02:17:06 +02:00
Rafael J. Wysocki
2cef548adf PCI / PM: Avoid resuming more devices during system suspend
Commit bac2a909a0 (PCI / PM: Avoid resuming PCI devices during
system suspend) introduced a mechanism by which some PCI devices that
were runtime-suspended at the system suspend time might be left in
that state for the duration of the system suspend-resume cycle.
However, it overlooked devices that were marked as capable of waking
up the system just because PME support was detected in their PCI
config space.

Namely, in that case, device_can_wakeup(dev) returns 'true' for the
device and if the device is not configured for system wakeup,
device_may_wakeup(dev) returns 'false' and it will be resumed during
system suspend even though configuring it for system wakeup may not
really make sense at all.

To avoid this problem, simply disable PME for PCI devices that have
not been configured for system wakeup and are runtime-suspended at
the system suspend time for the duration of the suspend-resume cycle.

If the device is in D3cold, its config space is not available and it
shouldn't be written to, but that's only possible if the device
has platform PM support and the platform code is responsible for
checking whether or not the device's configuration is suitable for
system suspend in that case.

Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2015-10-12 22:29:57 +02:00
Rafael J. Wysocki
a8360062cc PCI / PM: Update runtime PM documentation for PCI devices
Section 3.2 "Device Runtime Power Management" of pci.txt has become
outdated, so update it to correctly reflect the current code flow.

Also update the comment in local_pci_probe() to document the fact
that pm_runtime_put_noidle() is not the only runtime PM helper
function that can be used to decrement the device's runtime PM
usage counter in .probe().

Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Acked-by: Alan Stern <stern@rowland.harvard.edu>
2015-09-25 02:48:44 +02:00
Zhen Lei
9222097f0d PCI: Remove unnecessary "if" statement
In store_remove_id(), set the default return value to -ENODEV, and
overwrite it with the input buffer size if we find a matching list entry.
Then we don't need to test whether to return an error or the count.

No functional change.

[bhelgaas: changelog]
Signed-off-by: Zhen Lei <thunder.leizhen@huawei.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
2015-09-18 15:55:07 -05:00
Dave Young
2965faa5e0 kexec: split kexec_load syscall from kexec core code
There are two kexec load syscalls, kexec_load another and kexec_file_load.
 kexec_file_load has been splited as kernel/kexec_file.c.  In this patch I
split kexec_load syscall code to kernel/kexec.c.

And add a new kconfig option KEXEC_CORE, so we can disable kexec_load and
use kexec_file_load only, or vice verse.

The original requirement is from Ted Ts'o, he want kexec kernel signature
being checked with CONFIG_KEXEC_VERIFY_SIG enabled.  But kexec-tools use
kexec_load syscall can bypass the checking.

Vivek Goyal proposed to create a common kconfig option so user can compile
in only one syscall for loading kexec kernel.  KEXEC/KEXEC_FILE selects
KEXEC_CORE so that old config files still work.

Because there's general code need CONFIG_KEXEC_CORE, so I updated all the
architecture Kconfig with a new option KEXEC_CORE, and let KEXEC selects
KEXEC_CORE in arch Kconfig.  Also updated general kernel code with to
kexec_load syscall.

[akpm@linux-foundation.org: coding-style fixes]
Signed-off-by: Dave Young <dyoung@redhat.com>
Cc: Eric W. Biederman <ebiederm@xmission.com>
Cc: Vivek Goyal <vgoyal@redhat.com>
Cc: Petr Tesarik <ptesarik@suse.cz>
Cc: Theodore Ts'o <tytso@mit.edu>
Cc: Josh Boyer <jwboyer@fedoraproject.org>
Cc: David Howells <dhowells@redhat.com>
Cc: Geert Uytterhoeven <geert@linux-m68k.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2015-09-10 13:29:01 -07:00
Jiang Liu
890e484758 PCI: Add pcibios_alloc_irq() and pcibios_free_irq()
Add pcibios_alloc_irq() and pcibios_free_irq(), which are called when
binding/unbinding PCI device drivers.

PCI arch code may implement these to manage IRQ resources for hotplugged
devices.

[bhelgaas: changelog]
Signed-off-by: Jiang Liu <jiang.liu@linux.intel.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Acked-by: Thomas Gleixner <tglx@linutronix.de>
2015-07-30 13:59:47 -05:00
Linus Torvalds
872912352c ACPI and power management updates for v3.20-rc1
- Rework of the core ACPI resources parsing code to fix issues
    in it and make using resource offsets more convenient and
    consolidation of some resource-handing code in a couple of places
    that have grown analagous data structures and code to cover the
    the same gap in the core (Jiang Liu, Thomas Gleixner, Lv Zheng).
 
  - ACPI-based IOAPIC hotplug support on top of the resources handling
    rework (Jiang Liu, Yinghai Lu).
 
  - ACPICA update to upstream release 20150204 including an interrupt
    handling rework that allows drivers to install raw handlers for
    ACPI GPEs which then become entirely responsible for the given GPE
    and the ACPICA core code won't touch it (Lv Zheng, David E Box,
    Octavian Purdila).
 
  - ACPI EC driver rework to fix several concurrency issues and other
    problems related to events handling on top of the ACPICA's new
    support for raw GPE handlers (Lv Zheng).
 
  - New ACPI driver for AMD SoCs analogous to the LPSS (Low-Power
    Subsystem) driver for Intel chips (Ken Xue).
 
  - Two minor fixes of the ACPI LPSS driver (Heikki Krogerus,
    Jarkko Nikula).
 
  - Two new blacklist entries for machines (Samsung 730U3E/740U3E and
    510R) where the native backlight interface doesn't work correctly
    while the ACPI one does (Hans de Goede).
 
  - Rework of the ACPI processor driver's handling of idle states
    to make the code more straightforward and less bloated overall
    (Rafael J Wysocki).
 
  - Assorted minor fixes related to ACPI and SFI (Andreas Ruprecht,
    Andy Shevchenko, Hanjun Guo, Jan Beulich, Rafael J Wysocki,
    Yaowei Bai).
 
  - PCI core power management modification to avoid resuming (some)
    runtime-suspended devices during system suspend if they are in
    the right states already (Rafael J Wysocki).
 
  - New SFI-based cpufreq driver for Intel platforms using SFI
    (Srinidhi Kasagar).
 
  - cpufreq core fixes, cleanups and simplifications (Viresh Kumar,
    Doug Anderson, Wolfram Sang).
 
  - SkyLake CPU support and other updates for the intel_pstate driver
    (Kristen Carlson Accardi, Srinivas Pandruvada).
 
  - cpufreq-dt driver cleanup (Markus Elfring).
 
  - Init fix for the ARM big.LITTLE cpuidle driver (Sudeep Holla).
 
  - Generic power domains core code fixes and cleanups (Ulf Hansson).
 
  - Operating Performance Points (OPP) core code cleanups and kernel
    documentation update (Nishanth Menon).
 
  - New dabugfs interface to make the list of PM QoS constraints
    available to user space (Nishanth Menon).
 
  - New devfreq driver for Tegra Activity Monitor (Tomeu Vizoso).
 
  - New devfreq class (devfreq_event) to provide raw utilization data
    to devfreq governors (Chanwoo Choi).
 
  - Assorted minor fixes and cleanups related to power management
    (Andreas Ruprecht, Krzysztof Kozlowski, Rickard Strandqvist,
    Pavel Machek, Todd E Brandt, Wonhong Kwon).
 
  - turbostat updates (Len Brown) and cpupower Makefile improvement
    (Sriram Raghunathan).
 
 /
 -----BEGIN PGP SIGNATURE-----
 Version: GnuPG v2.0.22 (GNU/Linux)
 
 iQIcBAABCAAGBQJU2neOAAoJEILEb/54YlRx51QP/jrv1Wb5eMaemzMksPIWI5Zn
 I8IbxzToxu7wDDsrTBRv+LuyllMPrnppFOHHvB35gUYu7Y6I066s3ErwuqeFlbmy
 +VicmyGMahv3yN74qg49MXzWtaJZa8hrFXn8ItujiUIcs08yELi0vBQFlZImIbTB
 PdQngO88VfiOVjDvmKkYUU//9Sc9LCU0ZcdUQXSnA1oNOxuUHjiARz98R03hhSqu
 BWR+7M0uaFbu6XeK+BExMXJTpKicIBZ1GAF6hWrS8V4aYg+hH1cwjf2neDAzZkcU
 UkXieJlLJrCq+ZBNcy7WEhkWQkqJNWei5WYiy6eoQeQpNoliY2V+2OtSMJaKqDye
 PIiMwXstyDc5rgyULN0d1UUzY6mbcUt2rOL0VN2bsFVIJ1HWCq8mr8qq689pQUYv
 tcH18VQ2/6r2zW28sTO/ByWLYomklD/Y6bw2onMhGx3Knl0D8xYJKapVnTGhr5eY
 d4k41ybHSWNKfXsZxdJc+RxndhPwj9rFLfvY/CZEhLcW+2pAiMarRDOPXDoUI7/l
 aJpmPzy/6mPXGBnTfr6jKDSY3gXNazRIvfPbAdiGayKcHcdRM4glbSbNH0/h1Iq6
 HKa8v9Fx87k1X5r4ZbhiPdABWlxuKDiM7725rfGpvjlWC3GNFOq7YTVMOuuBA225
 Mu9PRZbOsZsnyNkixBpX
 =zZER
 -----END PGP SIGNATURE-----

Merge tag 'pm+acpi-3.20-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm

Pull ACPI and power management updates from Rafael Wysocki:
 "We have a few new features this time, including a new SFI-based
  cpufreq driver, a new devfreq driver for Tegra Activity Monitor, a new
  devfreq class for providing its governors with raw utilization data
  and a new ACPI driver for AMD SoCs.

  Still, the majority of changes here are reworks of existing code to
  make it more straightforward or to prepare it for implementing new
  features on top of it.  The primary example is the rework of ACPI
  resources handling from Jiang Liu, Thomas Gleixner and Lv Zheng with
  support for IOAPIC hotplug implemented on top of it, but there is
  quite a number of changes of this kind in the cpufreq core, ACPICA,
  ACPI EC driver, ACPI processor driver and the generic power domains
  core code too.

  The most active developer is Viresh Kumar with his cpufreq changes.

  Specifics:

   - Rework of the core ACPI resources parsing code to fix issues in it
     and make using resource offsets more convenient and consolidation
     of some resource-handing code in a couple of places that have grown
     analagous data structures and code to cover the the same gap in the
     core (Jiang Liu, Thomas Gleixner, Lv Zheng).

   - ACPI-based IOAPIC hotplug support on top of the resources handling
     rework (Jiang Liu, Yinghai Lu).

   - ACPICA update to upstream release 20150204 including an interrupt
     handling rework that allows drivers to install raw handlers for
     ACPI GPEs which then become entirely responsible for the given GPE
     and the ACPICA core code won't touch it (Lv Zheng, David E Box,
     Octavian Purdila).

   - ACPI EC driver rework to fix several concurrency issues and other
     problems related to events handling on top of the ACPICA's new
     support for raw GPE handlers (Lv Zheng).

   - New ACPI driver for AMD SoCs analogous to the LPSS (Low-Power
     Subsystem) driver for Intel chips (Ken Xue).

   - Two minor fixes of the ACPI LPSS driver (Heikki Krogerus, Jarkko
     Nikula).

   - Two new blacklist entries for machines (Samsung 730U3E/740U3E and
     510R) where the native backlight interface doesn't work correctly
     while the ACPI one does (Hans de Goede).

   - Rework of the ACPI processor driver's handling of idle states to
     make the code more straightforward and less bloated overall (Rafael
     J Wysocki).

   - Assorted minor fixes related to ACPI and SFI (Andreas Ruprecht,
     Andy Shevchenko, Hanjun Guo, Jan Beulich, Rafael J Wysocki, Yaowei
     Bai).

   - PCI core power management modification to avoid resuming (some)
     runtime-suspended devices during system suspend if they are in the
     right states already (Rafael J Wysocki).

   - New SFI-based cpufreq driver for Intel platforms using SFI
     (Srinidhi Kasagar).

   - cpufreq core fixes, cleanups and simplifications (Viresh Kumar,
     Doug Anderson, Wolfram Sang).

   - SkyLake CPU support and other updates for the intel_pstate driver
     (Kristen Carlson Accardi, Srinivas Pandruvada).

   - cpufreq-dt driver cleanup (Markus Elfring).

   - Init fix for the ARM big.LITTLE cpuidle driver (Sudeep Holla).

   - Generic power domains core code fixes and cleanups (Ulf Hansson).

   - Operating Performance Points (OPP) core code cleanups and kernel
     documentation update (Nishanth Menon).

   - New dabugfs interface to make the list of PM QoS constraints
     available to user space (Nishanth Menon).

   - New devfreq driver for Tegra Activity Monitor (Tomeu Vizoso).

   - New devfreq class (devfreq_event) to provide raw utilization data
     to devfreq governors (Chanwoo Choi).

   - Assorted minor fixes and cleanups related to power management
     (Andreas Ruprecht, Krzysztof Kozlowski, Rickard Strandqvist, Pavel
     Machek, Todd E Brandt, Wonhong Kwon).

   - turbostat updates (Len Brown) and cpupower Makefile improvement
     (Sriram Raghunathan)"

* tag 'pm+acpi-3.20-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm: (151 commits)
  tools/power turbostat: relax dependency on APERF_MSR
  tools/power turbostat: relax dependency on invariant TSC
  Merge branch 'pci/host-generic' of git://git.kernel.org/pub/scm/linux/kernel/git/helgaas/pci into acpi-resources
  tools/power turbostat: decode MSR_*_PERF_LIMIT_REASONS
  tools/power turbostat: relax dependency on root permission
  ACPI / video: Add disable_native_backlight quirk for Samsung 510R
  ACPI / PM: Remove unneeded nested #ifdef
  USB / PM: Remove unneeded #ifdef and associated dead code
  intel_pstate: provide option to only use intel_pstate with HWP
  ACPI / EC: Add GPE reference counting debugging messages
  ACPI / EC: Add query flushing support
  ACPI / EC: Refine command storm prevention support
  ACPI / EC: Add command flushing support.
  ACPI / EC: Introduce STARTED/STOPPED flags to replace BLOCKED flag
  ACPI: add AMD ACPI2Platform device support for x86 system
  ACPI / table: remove duplicate NULL check for the handler of acpi_table_parse()
  ACPI / EC: Update revision due to raw handler mode.
  ACPI / EC: Reduce ec_poll() by referencing the last register access timestamp.
  ACPI / EC: Fix several GPE handling issues by deploying ACPI_GPE_DISPATCH_RAW_HANDLER mode.
  ACPICA: Events: Enable APIs to allow interrupt/polling adaptive request based GPE handling model
  ...
2015-02-10 15:09:41 -08:00
Rafael J. Wysocki
bac2a909a0 PCI / PM: Avoid resuming PCI devices during system suspend
Commit f25c0ae2b4 (ACPI / PM: Avoid resuming devices in ACPI PM
domain during system suspend) modified the ACPI PM domain's system
suspend callbacks to allow devices attached to it to be left in the
runtime-suspended state during system suspend so as to optimize
the suspend process.

This was based on the general mechanism introduced by commit
aae4518b31 (PM / sleep: Mechanism to avoid resuming runtime-suspended
devices unnecessarily).

Extend that approach to PCI devices by modifying the PCI bus type's
->prepare callback to return 1 for devices that are runtime-suspended
when it is being executed and that are in a suitable power state and
need not be resumed going forward.

Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Acked-by: Bjorn Helgaas <bhelgaas@google.com>
2015-01-23 22:13:54 +01:00
Ricardo Ribalda Delgado
145b3fe579 PCI: Generate uppercase hex for modalias var in uevent
Some implementations of modprobe fail to load the driver for a PCI device
automatically because the "interface" part of the modalias from the kernel
is lowercase, and the modalias from file2alias is uppercase.

The "interface" is the low-order byte of the Class Code, defined in PCI
r3.0, Appendix D.  Most interface types defined in the spec do not use
alpha characters, so they won't be affected.  For example, 00h, 01h, 10h,
20h, etc. are unaffected.

Print the "interface" byte of the Class Code in uppercase hex, as we
already do for the Vendor ID, Device ID, Class, etc.

Commit 89ec3dcf17 ("PCI: Generate uppercase hex for modalias interface
class") fixed only half of the problem.  Some udev implementations rely on
the uevent file and not the modalias file.

Fixes: d1ded203ad ("PCI: add MODALIAS to hotplug event for pci devices")
Fixes: 89ec3dcf17 ("PCI: Generate uppercase hex for modalias interface class")
Signed-off-by: Ricardo Ribalda Delgado <ricardo.ribalda@gmail.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Acked-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
CC: stable@vger.kernel.org
2015-01-09 10:19:59 -07:00
Rafael J. Wysocki
fbb988be7f PCI / PM: Drop CONFIG_PM_RUNTIME from the PCI core
After commit b2b49ccbdd (PM: Kconfig: Set PM_RUNTIME if PM_SLEEP is
selected) PM_RUNTIME is always set if PM is set, so quite a few
depend on CONFIG_PM.

Replace CONFIG_PM_RUNTIME with CONFIG_PM in the PCI core code.

Reviewed-by: Ulf Hansson <ulf.hansson@linaro.org>
Acked-by: Bjorn Helgaas <bhelgaas@google.com>
Acked-by: Kevin Hilman <khilman@linaro.org>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2014-12-04 00:50:33 +01:00
Tobias Klauser
3b7f101662 PCI: Remove unnecessary variable in pci_add_dynid()
The variable "retval" in pci_add_dynid() is only used to store the return
value of driver_attach() and is then directly returned.  Remove the
variable and directly pass on driver_attach()'s return value.

Signed-off-by: Tobias Klauser <tklauser@distanz.ch>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
2014-09-03 12:42:33 -06:00
Andreas Noever
7d2a01b87f PCI: Add pci_fixup_suspend_late quirk pass
Add pci_fixup_suspend_late as a new pci_fixup_pass. The pass is called
from suspend_noirq and poweroff_noirq. Using the same pass for suspend
and hibernate is consistent with resume_early which is called by
resume_noirq and restore_noirq.

The new quirk pass is required for Thunderbolt support on Apple
hardware.

Signed-off-by: Andreas Noever <andreas.noever@gmail.com>
Acked-by: Bjorn Helgaas <bhelgaas@google.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2014-06-19 14:08:41 -07:00
Ryan Desfosses
3c78bc61f5 PCI: Whitespace cleanup
Fix various whitespace errors.

No functional change.

[bhelgaas: fix other similar problems]
Signed-off-by: Ryan Desfosses <ryan@desfo.org>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
2014-06-10 20:20:19 -06:00
Ryan Desfosses
b7fe943421 PCI: Move EXPORT_SYMBOL so it immediately follows function/variable
Move EXPORT_SYMBOL so it immediately follows the function or variable.

No functional change.

[bhelgaas: squash similar changes, fix hotplug, probe, rom, search, too]
Signed-off-by: Ryan Desfosses <ryan@desfo.org>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
2014-06-10 13:36:10 -06:00
Bjorn Helgaas
d1a2523d2a Merge branches 'pci/hotplug', 'pci/pci_is_bridge' and 'pci/virtualization' into next
* pci/hotplug:
  PCI: cpqphp: Fix possible null pointer dereference
  NVMe: Implement PCIe reset notification callback
  PCI: Notify driver before and after device reset

* pci/pci_is_bridge:
  pcmcia: Use pci_is_bridge() to simplify code
  PCI: pciehp: Use pci_is_bridge() to simplify code
  PCI: acpiphp: Use pci_is_bridge() to simplify code
  PCI: cpcihp: Use pci_is_bridge() to simplify code
  PCI: shpchp: Use pci_is_bridge() to simplify code
  PCI: rpaphp: Use pci_is_bridge() to simplify code
  sparc/PCI: Use pci_is_bridge() to simplify code
  powerpc/PCI: Use pci_is_bridge() to simplify code
  ia64/PCI: Use pci_is_bridge() to simplify code
  x86/PCI: Use pci_is_bridge() to simplify code
  PCI: Use pci_is_bridge() to simplify code
  PCI: Add new pci_is_bridge() interface
  PCI: Rename pci_is_bridge() to pci_has_subordinate()

* pci/virtualization:
  PCI: Introduce new device binding path using pci_dev.driver_override

Conflicts:
	drivers/pci/pci-sysfs.c
2014-05-28 16:21:07 -06:00
Alex Williamson
782a985d7a PCI: Introduce new device binding path using pci_dev.driver_override
The driver_override field allows us to specify the driver for a device
rather than relying on the driver to provide a positive match of the
device.  This shortcuts the existing process of looking up the vendor and
device ID, adding them to the driver new_id, binding the device, then
removing the ID, but it also provides a couple advantages.

First, the above existing process allows the driver to bind to any device
matching the new_id for the window where it's enabled.  This is often not
desired, such as the case of trying to bind a single device to a meta
driver like pci-stub or vfio-pci.  Using driver_override we can do this
deterministically using:

  echo pci-stub > /sys/bus/pci/devices/0000:03:00.0/driver_override
  echo 0000:03:00.0 > /sys/bus/pci/devices/0000:03:00.0/driver/unbind
  echo 0000:03:00.0 > /sys/bus/pci/drivers_probe

Previously we could not invoke drivers_probe after adding a device to
new_id for a driver as we get non-deterministic behavior whether the driver
we intend or the standard driver will claim the device.  Now it becomes a
deterministic process, only the driver matching driver_override will probe
the device.

To return the device to the standard driver, we simply clear the
driver_override and reprobe the device:

  echo > /sys/bus/pci/devices/0000:03:00.0/driver_override
  echo 0000:03:00.0 > /sys/bus/pci/devices/0000:03:00.0/driver/unbind
  echo 0000:03:00.0 > /sys/bus/pci/drivers_probe

Another advantage to this approach is that we can specify a driver override
to force a specific binding or prevent any binding.  For instance when an
IOMMU group is exposed to userspace through VFIO we require that all
devices within that group are owned by VFIO.  However, devices can be
hot-added into an IOMMU group, in which case we want to prevent the device
from binding to any driver (override driver = "none") or perhaps have it
automatically bind to vfio-pci.  With driver_override it's a simple matter
for this field to be set internally when the device is first discovered to
prevent driver matches.

Signed-off-by: Alex Williamson <alex.williamson@redhat.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Reviewed-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
Reviewed-by: Alexander Graf <agraf@suse.de>
Acked-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2014-05-28 16:04:53 -06:00
Yijing Wang
326c1cdae7 PCI: Rename pci_is_bridge() to pci_has_subordinate()
Previously, pci_is_bridge() returned true only when a subordinate bus
existed.  Rename pci_is_bridge() to pci_has_subordinate() to better
indicate what we're checking.

No functional change.

[bhelgaas: changelog]
Signed-off-by: Yijing Wang <wangyijing@huawei.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
2014-05-27 11:23:46 -06:00
Bjorn Helgaas
efdd4070f3 PCI: Remove dead code
"pdev" can never be NULL here, so remove the test.

Found by Coverity (CID 744313).

Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
2014-04-29 17:36:44 -06:00
Bandan Das
8895d3bcb8 PCI: Fail new_id for vendor/device values already built into driver
While using the sysfs new_id interface, the user can unintentionally feed
incorrect values if the driver static table has a matching entry.  This is
possible since only the device and vendor fields are mandatory and the rest
are optional.  As a result, store_new_id() will fill in default values that
are then passed on to the driver and can have unintended consequences.

As an example, consider the ixgbe driver and the 82599EB network card:

  echo "8086 10fb" > /sys/bus/pci/drivers/ixgbe/new_id

This will pass a pci_device_id with driver_data = 0 to ixgbe_probe(), which
uses that zero to index a table of card operations.  The zeroth entry of
the table does *not* correspond to the 82599 operations.

This change returns an error if the user attempts to add a dynid for a
vendor/device combination for which a static entry already exists.
However, if the user intentionally wants a different set of values, she
must provide all the 7 fields and that will be accepted.

[bhelgaas: drop KVM text since the problem isn't KVM-specific]
Signed-off-by: Bandan Das <bsd@redhat.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Reviewed-by: Alex Williamson <alex.williamson@redhat.com>
2014-04-29 17:36:44 -06:00
Rafael J. Wysocki
7cd0602d78 PCI / PM: Resume runtime-suspended devices later during system suspend
Runtime-suspended devices are resumed during system suspend by
pci_pm_prepare() for two reasons: First, because they may need
to be reprogrammed in order to change their wakeup settings and,
second, because they may need to be operatonal for their children
to be successfully suspended.  That is a problem, though, if there
are many runtime-suspended devices that need to be resumed this
way during system suspend, because the .prepare() PM callbacks of
devices are executed sequentially and the times taken by them
accumulate, which may increase the total system suspend time quite
a bit.

For this reason, move the resume of runtime-suspended devices up
to the next phase of device suspend (during system suspend), except
for the ones that have power.ignore_children set.  The exception is
made, because the devices with power.ignore_children set may still
be necessary for their children to be successfully suspended (during
system suspend) and they won't be resumed automatically as a result
of the runtime resume of their children.

Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Acked-by: Bjorn Helgaas <bhelgaas@google.com>
2014-03-04 00:18:16 +01:00
Khalid Aziz
4fc9bbf98f PCI: Disable Bus Master only on kexec reboot
Add a flag to tell the PCI subsystem that kernel is shutting down in
preparation to kexec a kernel.  Add code in PCI subsystem to use this flag
to clear Bus Master bit on PCI devices only in case of kexec reboot.

This fixes a power-off problem on Acer Aspire V5-573G and likely other
machines and avoids any other issues caused by clearing Bus Master bit on
PCI devices in normal shutdown path.  The problem was introduced by
b566a22c23 ("PCI: disable Bus Master on PCI device shutdown").

This patch is based on discussion at
http://marc.info/?l=linux-pci&m=138425645204355&w=2

Link: https://bugzilla.kernel.org/show_bug.cgi?id=63861
Reported-by: Chang Liu <cl91tp@gmail.com>
Signed-off-by: Khalid Aziz <khalid.aziz@oracle.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Acked-by: Konstantin Khlebnikov <koct9i@gmail.com>
Cc: stable@vger.kernel.org	# v3.5+
2013-12-07 14:20:28 -07:00
Alexander Duyck
12c3156f10 PCI: Avoid unnecessary CPU switch when calling driver .probe() method
If we are already on a CPU local to the device, call the driver .probe()
method directly without using work_on_cpu().

This is a workaround for a lockdep warning in the following scenario:

  pci_call_probe
    work_on_cpu(cpu, local_pci_probe, ...)
      driver .probe
        pci_enable_sriov
          ...
            pci_bus_add_device
              ...
                pci_call_probe
                  work_on_cpu(cpu, local_pci_probe, ...)

It would be better to fix PCI so we don't call VF driver .probe() methods
from inside a PF driver .probe() method, but that's a bigger project.

[bhelgaas: open bugzilla, rework comments & changelog]
Link: https://bugzilla.kernel.org/show_bug.cgi?id=65071
Link: http://lkml.kernel.org/r/CAE9FiQXYQEAZ=0sG6+2OdffBqfLS9MpoN1xviRR9aDbxPxcKxQ@mail.gmail.com
Link: http://lkml.kernel.org/r/20130624195942.40795.27292.stgit@ahduyck-cp1.jf.intel.com
Tested-by: Yinghai Lu <yinghai@kernel.org>
Signed-off-by: Alexander Duyck <alexander.h.duyck@intel.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Acked-by: Tejun Heo <tj@kernel.org>
Acked-by: Yinghai Lu <yinghai@kernel.org>
2013-11-25 14:34:45 -07:00
Bjorn Helgaas
f7625980f5 PCI: Fix whitespace, capitalization, and spelling errors
Fix whitespace, capitalization, and spelling errors.  No functional change.
I know "busses" is not an error, but "buses" was more common, so I used it
consistently.

Signed-off-by: Marta Rybczynska <rybczynska@gmail.com> (pci_reset_bridge_secondary_bus())
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Acked-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2013-11-14 11:28:18 -07:00
Bjorn Helgaas
c245f24220 Merge branch 'pci/misc' into next
* pci/misc:
  PCI: Warn on driver probe return value greater than zero
  PCI: Drop warning about drivers that don't use pci_set_master()
  PCI: Workaround missing pci_set_master in pci drivers
  PCI: Update pcie_ports 'auto' behavior for non-ACPI platforms
2013-11-06 16:26:48 -07:00
Stephen M. Cameron
f92d74c1f5 PCI: Warn on driver probe return value greater than zero
Ages ago, drivers could return values greater than zero from their probe
function and this would be regarded as success.

But after f3ec4f87d6 ("PCI: change device runtime PM settings for probe
and remove") and 967577b062 ("PCI/PM: Keep runtime PM enabled for unbound
PCI devices"), we set dev->driver to NULL if the driver's probe function
returns a value greater than zero.

__pci_device_probe() treats this as success, and drivers can still mostly
work even with dev->driver == NULL, but PCI power management doesn't work,
and we don't call the driver's remove function on rmmod.

To help catch these driver problems, issue a warning in this case.

[bhelgaas: changelog]
Signed-off-by: Stephen M. Cameron <scameron@beardog.cce.hp.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
2013-11-06 16:08:17 -07:00
Bjorn Helgaas
33de1b8bf6 Merge branch 'pci/misc' into next
* pci/misc:
  PCI: Report pci_pme_active() kmalloc failure
  mn10300/PCI: Remove useless pcibios_last_bus
  frv/PCI: Remove pcibios_last_bus
  PCI: Fail MSI/MSI-X initialization if device is not in PCI_D0
  x86/PCI: Coalesce multiple overlapping host bridge windows
  MAINTAINERS: Add arch/x86/pci to PCI file patterns
  PCI/PM: Remove pci_pm_complete()
  PCI: Add pci_dev_show_local_cpu() to simplify code
  mn10300/PCI: Remove unused pci_mem_start
  cris/PCI: Remove unused pci_mem_start
  PCI: Make pci_dev_pm_ops static

Conflicts:
	drivers/pci/pci-sysfs.c
2013-10-31 14:12:40 -06:00
Liu Chuansheng
84822b158f PCI/PM: Remove pci_pm_complete()
88d26136 ("PM: Prevent runtime suspend during system resume") removed the
pm_runtime_put_sync() from pci_pm_complete() to PM core code
device_complete().

Here the pci_pm_complete() is doing the same work which can be done in
device_complete(), so we can remove it directly.

Signed-off-by: Liu Chuansheng <chuansheng.liu@intel.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Acked-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2013-10-07 15:29:27 -06:00
Sachin Kamat
f91da04d0a PCI: Make pci_dev_pm_ops static
pci_dev_pm_ops is local to pci-driver.c.  Make it static.

Signed-off-by: Sachin Kamat <sachin.kamat@linaro.org>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
2013-10-07 15:12:45 -06:00
Greg Kroah-Hartman
5136b2da77 PCI: convert bus code to use dev_groups
The dev_attrs field of struct bus_type is going away soon, dev_groups
should be used instead.  This converts the PCI bus code to use the
correct field.

Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
2013-10-07 14:58:42 -06:00
Greg Kroah-Hartman
2229c1fbc5 PCI: convert bus code to use drv_groups
The drv_attrs field of struct bus_type is going away soon, drv_groups
should be used instead.  This converts the PCI bus code to use the
correct field.

Cc: Bjorn Helgaas <bhelgaas@google.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2013-10-07 14:51:20 -06:00
Greg Kroah-Hartman
0f49ba5599 PCI: convert bus code to use bus_groups
The bus_attrs field of struct bus_type is going away soon, dev_groups
should be used instead.  This converts the PCI bus code to use the
correct field.

Cc: Bjorn Helgaas <bhelgaas@google.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2013-10-07 14:51:02 -06:00
Sebastian Ott
699c198558 PCI: Add pcibios_pm_ops for optional arch-specific hibernate functionality
Platforms may want to provide architecture-specific functionality when
a PCI device is doing a hibernate transition.  Add a weak symbol
pcibios_pm_ops that architectures can override to do so.

[bhelgaas: fold in return value checks from v2 patch]
Signed-off-by: Sebastian Ott <sebott@linux.vnet.ibm.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
2013-08-22 14:11:32 -06:00
Rafael J. Wysocki
45f0a85c82 PM / Runtime: Rework the "runtime idle" helper routine
The "runtime idle" helper routine, rpm_idle(), currently ignores
return values from .runtime_idle() callbacks executed by it.
However, it turns out that many subsystems use
pm_generic_runtime_idle() which checks the return value of the
driver's callback and executes pm_runtime_suspend() for the device
unless that value is not 0.  If that logic is moved to rpm_idle()
instead, pm_generic_runtime_idle() can be dropped and its users
will not need any .runtime_idle() callbacks any more.

Moreover, the PCI, SCSI, and SATA subsystems' .runtime_idle()
routines, pci_pm_runtime_idle(), scsi_runtime_idle(), and
ata_port_runtime_idle(), respectively, as well as a few drivers'
ones may be simplified if rpm_idle() calls rpm_suspend() after 0 has
been returned by the .runtime_idle() callback executed by it.

To reduce overall code bloat, make the changes described above.

Tested-by: Mika Westerberg <mika.westerberg@linux.intel.com>
Tested-by: Kevin Hilman <khilman@linaro.org>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Acked-by: Kevin Hilman <khilman@linaro.org>
Reviewed-by: Ulf Hansson <ulf.hansson@linaro.org>
Acked-by: Alan Stern <stern@rowland.harvard.edu>
2013-06-03 21:49:52 +02:00
Konstantin Khlebnikov
6e0eda3c38 PCI: Don't try to disable Bus Master on disconnected PCI devices
This is a fix for commit 7897e60227 ("PCI: Disable Bus Master
unconditionally in pci_device_shutdown()").  Vivek reported that
with this commit, kexec failed because none of his SATA disks
came up.

A ->shutdown() callback might put the device in D3cold, which means config
space is no longer available.

[bhelgaas: changelog]
Link: https://lkml.org/lkml/2013/3/12/529
Reported-and-Tested-by: Vivek Goyal <vgoyal@redhat.com>
Signed-off-by: Konstantin Khlebnikov <khlebnikov@openvz.org>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Cc: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2013-04-02 18:03:09 -06:00
Bjorn Helgaas
20f24208f6 Merge branch 'pci/konstantin-runtime-pm' into next
* pci/konstantin-runtime-pm:
  PCI/PM: Clear state_saved during suspend
  PCI: Use atomic_inc_return() rather than atomic_add_return()
  PCI: Catch attempts to disable already-disabled devices
  PCI: Disable Bus Master unconditionally in pci_device_shutdown()
2013-02-12 13:42:36 -07:00
Rafael J. Wysocki
82fee4d67a PCI/PM: Clear state_saved during suspend
This patch clears pci_dev->state_saved at the beginning of suspending.
PCI config state may be saved long before that.  Some drivers call
pci_save_state() from the ->probe() callback to get snapshot of sane
configuration space to use in the ->slot_reset() callback.

Signed-off-by: Konstantin Khlebnikov <khlebnikov@openvz.org> # add comment
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
2013-02-11 17:40:27 -07:00
Konstantin Khlebnikov
7897e60227 PCI: Disable Bus Master unconditionally in pci_device_shutdown()
Commit b566a22c23 ("PCI: disable Bus Master on PCI device shutdown")
used pci_disable_device(), but that doesn't disable Bus Mastering
unconditionally; we allow nested enable/disable calls, and only the
last disable call actually does anything.

This uses pci_clear_master() to unconditionally clear the Bus Master
bit.

Matthew Garrett and Alan Cox said (see LKML link below) that clearing Bus
Master for all PCI devices may lead to unpredictable consequences: some
devices ignores this bit and continue DMA, some of them hang after that or
crash the whole system.  But we're already trying to clear Bus Master in
general because of b566a22c23; this merely deals with the cases where
drivers haven't shut down the device correctly.

[bhelgaas: changelog]
Link: https://lkml.org/lkml/2012/6/6/278
Signed-off-by: Konstantin Khlebnikov <khlebnikov@openvz.org>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Acked-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2013-02-11 16:28:50 -07:00
Yinghai Lu
58d9a38f6f PCI: Skip attaching driver in device_add()
We want to add PCI devices to the device tree as early as possible but
delay attaching drivers.

device_add() adds a device to the device hierarchy and (via
device_attach()) attaches a matching driver and calls its .probe() method.
We want to separate adding the device to the hierarchy from attaching the
driver.

This patch does that by adding "match_driver" in struct pci_dev.  When
false, we return failure from pci_bus_match(), which makes device_attach()
believe there's no matching driver.

Later, we set "match_driver = true" and call device_attach() again, which
now attaches the driver and calls its .probe() method.

[bhelgaas: changelog, explicitly init dev->match_driver,
fold device_attach() call into pci_bus_add_device()]
Signed-off-by: Yinghai Lu <yinghai@kernel.org>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Acked-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2013-01-25 15:10:12 -07:00
Linus Torvalds
193c0d6825 PCI changes for the v3.8 merge window:
Host bridge hotplug:
     - Untangle _PRT from struct pci_bus (Bjorn Helgaas)
     - Request _OSC control before scanning root bus (Taku Izumi)
     - Assign resources when adding host bridge (Yinghai Lu)
     - Remove root bus when removing host bridge (Yinghai Lu)
     - Remove _PRT during hot remove (Yinghai Lu)
 
   SRIOV
     - Add sysfs knobs to control numVFs (Don Dutile)
 
   Power management
     - Notify devices when power resource turned on (Huang Ying)
 
   Bug fixes
     - Work around broken _SEG on HP xw9300 (Bjorn Helgaas)
     - Keep runtime PM enabled for unbound PCI devices (Huang Ying)
     - Fix Optimus dual-GPU runtime D3 suspend issue (Dave Airlie)
     - Fix xen frontend shutdown issue (David Vrabel)
     - Work around PLX PCI 9050 BAR alignment erratum (Ian Abbott)
 
   Miscellaneous
     - Add GPL license for drivers/pci/ioapic (Andrew Cooks)
     - Add standard PCI-X, PCIe ASPM register #defines (Bjorn Helgaas)
     - NumaChip remote PCI support (Daniel Blueman)
     - Fix PCIe Link Capabilities Supported Link Speed definition (Jingoo Han)
     - Convert dev_printk() to dev_info(), etc (Joe Perches)
     - Add support for non PCI BAR ROM data (Matthew Garrett)
     - Add x86 support for host bridge translation offset (Mike Yoknis)
     - Report success only when every driver supports AER (Vijay Pandarathil)
 -----BEGIN PGP SIGNATURE-----
 Version: GnuPG v1.4.10 (GNU/Linux)
 
 iQIcBAABAgAGBQJQyKwSAAoJEPGMOI97Hn6zScgQAJZK2VDfCv74mKrgSDNokIzH
 5nVDrc9AHKJm7CUODs6keJK5d4TD/za3Zao68zrYHsJJKes2ni2Z3W34HP2RXKK2
 eOmePXOHYPPZMlimP9r9cVxNu1ZJCyp/yWSBcsPF4zUgWhBWLRaSj85I049gQ0sz
 +05nZYfLjVd3HNiaXsG4CQyMrNF46XEsLhF9vs+Nr2GHPwrpzhfScgYv63oDS86C
 3ICKsjmiRUZcNelxIFYmyxa5u89QdW5XHjzc9eHGQuus24Vxw+TZzsdfc17sUJEE
 HTyXY+RjDpOVhdtwwUjrCEOiyZYvy3g9+3sKxoxgt/76ghdUaR7fxITwB97qVMFD
 T0ESlKjSV/Qv5QYdyy5uP4zwNs/PXCWXkTg/L1m71F30BxKWDa7tgiA6uK7Z7fl5
 1aokKBdk3mtJJJIDJG1YkxPXx/JItTGCNYrx7CcFj49rSjrUWLQdmrYahersRIsB
 3wiD2xTi9e4dXeP/+VGzGOWB/sHk+73jvrvZe/REa1FCnMINDz4+9V9WaGROMqyq
 MQ8kX0KfYcNVNxy1GOXjU5wLpMN/t/QbvI7gwzRP1DAUCJPoOgFy7AjvSTVG3zuy
 8CtdOFttVkUn5dqsbQR0gVbyQVTS3PGSKz5XC/s8kVDWhja0xZTBYwrskM/4zdSD
 Xf48OyYV5EjpC3FYUSiU
 =OE3Q
 -----END PGP SIGNATURE-----

Merge tag 'for-3.8' of git://git.kernel.org/pub/scm/linux/kernel/git/helgaas/pci

Pull PCI update from Bjorn Helgaas:
 "Host bridge hotplug:
   - Untangle _PRT from struct pci_bus (Bjorn Helgaas)
   - Request _OSC control before scanning root bus (Taku Izumi)
   - Assign resources when adding host bridge (Yinghai Lu)
   - Remove root bus when removing host bridge (Yinghai Lu)
   - Remove _PRT during hot remove (Yinghai Lu)

  SRIOV
    - Add sysfs knobs to control numVFs (Don Dutile)

  Power management
   - Notify devices when power resource turned on (Huang Ying)

  Bug fixes
   - Work around broken _SEG on HP xw9300 (Bjorn Helgaas)
   - Keep runtime PM enabled for unbound PCI devices (Huang Ying)
   - Fix Optimus dual-GPU runtime D3 suspend issue (Dave Airlie)
   - Fix xen frontend shutdown issue (David Vrabel)
   - Work around PLX PCI 9050 BAR alignment erratum (Ian Abbott)

  Miscellaneous
   - Add GPL license for drivers/pci/ioapic (Andrew Cooks)
   - Add standard PCI-X, PCIe ASPM register #defines (Bjorn Helgaas)
   - NumaChip remote PCI support (Daniel Blueman)
   - Fix PCIe Link Capabilities Supported Link Speed definition (Jingoo
     Han)
   - Convert dev_printk() to dev_info(), etc (Joe Perches)
   - Add support for non PCI BAR ROM data (Matthew Garrett)
   - Add x86 support for host bridge translation offset (Mike Yoknis)
   - Report success only when every driver supports AER (Vijay
     Pandarathil)"

Fix up trivial conflicts.

* tag 'for-3.8' of git://git.kernel.org/pub/scm/linux/kernel/git/helgaas/pci: (48 commits)
  PCI: Use phys_addr_t for physical ROM address
  x86/PCI: Add NumaChip remote PCI support
  ath9k: Use standard #defines for PCIe Capability ASPM fields
  iwlwifi: Use standard #defines for PCIe Capability ASPM fields
  iwlwifi: collapse wrapper for pcie_capability_read_word()
  iwlegacy: Use standard #defines for PCIe Capability ASPM fields
  iwlegacy: collapse wrapper for pcie_capability_read_word()
  cxgb3: Use standard #defines for PCIe Capability ASPM fields
  PCI: Add standard PCIe Capability Link ASPM field names
  PCI/portdrv: Use PCI Express Capability accessors
  PCI: Use standard PCIe Capability Link register field names
  x86: Use PCI setup data
  PCI: Add support for non-BAR ROMs
  PCI: Add pcibios_add_device
  EFI: Stash ROMs if they're not in the PCI BAR
  PCI: Add and use standard PCI-X Capability register names
  PCI/PM: Keep runtime PM enabled for unbound PCI devices
  xen-pcifront: Handle backend CLOSED without CLOSING
  PCI: SRIOV control and status via sysfs (documentation)
  PCI/AER: Report success only when every device has AER-aware driver
  ...
2012-12-13 12:14:47 -08:00
Bjorn Helgaas
edb1daab8e Merge branch 'pci/huang-d3cold-fixes' into next
* pci/huang-d3cold-fixes:
  PCI/PM: Keep runtime PM enabled for unbound PCI devices
2012-12-04 16:13:03 -07:00
Huang Ying
967577b062 PCI/PM: Keep runtime PM enabled for unbound PCI devices
For unbound PCI devices, what we need is:

 - Always in D0 state, because some devices do not work again after
   being put into D3 by the PCI bus.

 - In SUSPENDED state if allowed, so that the parent devices can still
   be put into low power state.

To satisfy these requirements, the runtime PM for the unbound PCI
devices are disabled and set to SUSPENDED state.  One issue of this
solution is that the PCI devices will be put into SUSPENDED state even
if the SUSPENDED state is forbidden via the sysfs interface
(.../power/control) of the device.  This is not an issue for most
devices, because most PCI devices are not used at all if unbound.
But there are exceptions.  For example, unbound VGA card can be used
for display, but suspending its parents makes it stop working.

To fix the issue, we keep the runtime PM enabled when the PCI devices
are unbound.  But the runtime PM callbacks will do nothing if the PCI
devices are unbound.  This way, we can put the PCI devices into
SUSPENDED state without putting the PCI devices into D3 state.

Reference: https://bugzilla.kernel.org/show_bug.cgi?id=48201
Signed-off-by: Huang Ying <ying.huang@intel.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Acked-by: Alan Stern <stern@rowland.harvard.edu>
Acked-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
CC: stable@vger.kernel.org          # v3.6+
2012-12-04 16:04:09 -07:00
Bill Pemberton
8ccc9aa17a PCI: Move pci_uevent into pci-driver.c
With the demise of CONFIG_HOTPLUG as an option, the pci_uevent
function located in hotplug.c will now always be used and doesn't need
special treatment in the Makefile.  Move pci_uevent into pci-driver.c
and remove hotplug.c

Signed-off-by: Bill Pemberton <wfp5p@virginia.edu>
Acked-by: Bjorn Helgaas <bhelgaas@google.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2012-11-28 13:00:58 -08:00
Bill Pemberton
b40b97ae73 PCI: Remove CONFIG_HOTPLUG ifdefs
Remove conditional code based on CONFIG_HOTPLUG being false.  It's
always on now in preparation of it going away as an option.

Signed-off-by: Bill Pemberton <wfp5p@virginia.edu>
Acked-by: Bjorn Helgaas <bhelgaas@google.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2012-11-28 12:53:46 -08:00
Bjorn Helgaas
d3fe3988fb Merge branch 'for-linus' into next
* for-linus:
  PCI/portdrv: Don't create hotplug slots unless port supports hotplug
  PCI/PM: Fix proc config reg access for D3cold and bridge suspending
  PCI/PM: Resume device before shutdown
  PCI/PM: Fix deadlock when unbinding device if parent in D3cold
2012-11-26 13:00:57 -07:00
Dave Airlie
42eca23021 PCI: Don't touch card regs after runtime suspend D3
If the driver takes care of state saving, don't touch any registers on it.

Optimus (dual-gpu) laptops seem to have their own form of D3cold, but
unfortunately enter it on normal D3 transitions via the ACPI callback.

So when we use runtime PM to transition to D3, the card disappears off
the PCI bus, however we then try to access registers on it in the
runtime suspend finish, which really doesn't work.

This patch checks whether the pci state is saved and doesn't attempt to hit
any registers after that point if it is.

(Looks okay to Rafael)

Signed-off-by: Dave Airlie <airlied@redhat.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
2012-11-07 15:24:18 -07:00
Huang Ying
3ff2de9ba1 PCI/PM: Resume device before shutdown
Some actions during shutdown need device to be in D0 state, such as
MSI shutdown etc, so resume device before shutdown.

Without this patch, a device may not be enumerated after a kexec
because the corresponding bridge is not in D0, so that
configuration space of the device is not accessible.

Signed-off-by: Huang Ying <ying.huang@intel.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Acked-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
CC: stable@vger.kernel.org		# v3.6+
2012-11-02 16:57:18 -06:00