* pci/virtualization:
PCI: Add comments about ROM BAR updating
PCI: Decouple IORESOURCE_ROM_ENABLE and PCI_ROM_ADDRESS_ENABLE
PCI: Remove pci_resource_bar() and pci_iov_resource_bar()
PCI: Don't update VF BARs while VF memory space is enabled
PCI: Separate VF BAR updates from standard BAR updates
PCI: Update BARs using property bits appropriate for type
PCI: Ignore BAR updates on virtual functions
PCI: Do any VF BAR updates before enabling the BARs
PCI: Support INTx masking on ConnectX-4 with firmware x.14.1100+
PCI: Convert Mellanox broken INTx quirks to be for listed devices only
PCI: Convert broken INTx masking quirks from HEADER to FINAL
net/mlx4_core: Use device ID defines
PCI: Add Mellanox device IDs
* pci/pm:
x86/platform/intel-mid: Constify mid_pci_platform_pm
PCI: pciehp: Add runtime PM support for PCIe hotplug ports
ACPI / hotplug / PCI: Make device_is_managed_by_native_pciehp() public
ACPI / hotplug / PCI: Use cached copy of PCI_EXP_SLTCAP_HPC bit
PCI: Unfold conditions to block runtime PM on PCIe ports
PCI: Consolidate conditions to allow runtime PM on PCIe ports
PCI: Activate runtime PM on a PCIe port only if it can suspend
PCI: Speed up algorithm in pci_bridge_d3_update()
PCI: Autosense device removal in pci_bridge_d3_update()
PCI: Don't acquire ref on parent in pci_bridge_d3_update()
USB: UHCI: report non-PME wakeup signalling for Intel hardware
PCI: Check for PME in targeted sleep state
Move PCI configuration space size macros (PCI_CFG_SPACE_SIZE and
PCI_CFG_SPACE_EXP_SIZE) from drivers/pci/pci.h to
include/uapi/linux/pci_regs.h so they can be used by more drivers and
eliminate duplicate definitions.
[bhelgaas: Expand comment to include PCI-X details]
Signed-off-by: Wang Sheng-Hui <shhuiw@foxmail.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
The acpi_get_rc_resources() is used to get the RC register address that can
not be described in MCFG. It takes the _HID & segment to look for and
outputs the RC address resource. Use PNP0C02 devices to describe such RC
address resource. Use _UID to match segment to tell which root bus the
PNP0C02 resource belongs to.
[bhelgaas: add dev argument, wrap in #ifdef CONFIG_PCI_QUIRKS]
Signed-off-by: Dongdong Liu <liudongdong3@huawei.com>
Signed-off-by: Tomasz Nowicki <tn@semihalf.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
pci_std_update_resource() only deals with standard BARs, so we don't have
to worry about the complications of VF BARs in an SR-IOV capability.
Compute the BAR address inline and remove pci_resource_bar(). That makes
pci_iov_resource_bar() unused, so remove that as well.
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Reviewed-by: Gavin Shan <gwshan@linux.vnet.ibm.com>
Previously pci_update_resource() used the same code path for updating
standard BARs and VF BARs in SR-IOV capabilities.
Split the VF BAR update into a new pci_iov_update_resource() internal
interface, which makes it simpler to compute the BAR address (we can get
rid of pci_resource_bar() and pci_iov_resource_bar()).
This patch:
- Renames pci_update_resource() to pci_std_update_resource(),
- Adds pci_iov_update_resource(),
- Makes pci_update_resource() a wrapper that calls the appropriate one,
No functional change intended.
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Reviewed-by: Gavin Shan <gwshan@linux.vnet.ibm.com>
Currently pcie_portdrv_probe() activates runtime PM on a PCIe port even
if it will never actually suspend because the BIOS is too old or the
"pcie_port_pm=off" option was specified on the kernel command line.
A few CPU cycles can be saved by not activating runtime PM at all in these
cases, because rpm_idle() and rpm_suspend() will bail out right at the
beginning when calling rpm_check_suspend_allowed(), instead of carrying out
various locking and assignments, invoking rpm_callback(), getting back
-EBUSY and rolling everything back.
The conditions checked in pci_bridge_d3_possible() are all static, they
never change during uptime of the system, hence it's safe to call this to
determine if runtime PM should be activated.
No functional change intended.
Tested-by: Mika Westerberg <mika.westerberg@linux.intel.com>
Signed-off-by: Lukas Wunner <lukas@wunner.de>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Reviewed-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
The algorithm to update the flag indicating whether a bridge may go to D3
makes a few optimizations based on whether the update was caused by the
removal of a device on the one hand, versus the addition of a device or the
change of its D3cold flags on the other hand.
The information whether the update pertains to a removal is currently
passed in by the caller, but the function may as well determine that itself
by examining the device in question, thereby allowing for a considerable
simplification and reduction of the code.
Out of several options to determine removal, I've chosen the function
device_is_registered() because it's cheap: It merely returns the
dev->kobj.state_in_sysfs flag. That flag is set through device_add() when
the root bus is scanned and cleared through device_remove(). The call to
pci_bridge_d3_update() happens after each of these calls, respectively, so
the ordering is correct.
No functional change intended.
Tested-by: Mika Westerberg <mika.westerberg@linux.intel.com>
Signed-off-by: Lukas Wunner <lukas@wunner.de>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Reviewed-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
* pci/pm:
PCI: Avoid unnecessary resume after direct-complete
PCI: Recognize D3cold in pci_update_current_state()
PCI: Query platform firmware for device power state
PCI: Afford direct-complete to devices with non-standard PM
Usually the most accurate way to determine a PCI device's power state is to
read its PM Control & Status Register. There are two cases however when
this is not an option: If the device doesn't have the PM capability at
all, or if it is in D3cold (in which case its config space is
inaccessible).
In both cases, we can alternatively query the platform firmware for its
opinion on the device's power state. To facilitate this, augment struct
pci_platform_pm_ops with a ->get_power callback and implement it for
acpi_pci_platform_pm (the only pci_platform_pm_ops existing so far).
It is used by a forthcoming commit to let pci_update_current_state()
recognize D3cold.
Signed-off-by: Lukas Wunner <lukas@wunner.de>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Acked-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Add Precision Time Measurement (PTM) support (see PCIe r3.1, sec 6.22).
Enable PTM on PTM Root devices and switch ports. This does not enable PTM
on endpoints.
There currently are no PTM-capable devices on the market, but it is
expected to be supported by the Intel Apollo Lake platform.
[bhelgaas: complete rework]
Signed-off-by: Jonathan Yong <jonathan.yong@intel.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Currently the Linux PCI core does not touch power state of PCI bridges and
PCIe ports when system suspend is entered. Leaving them in D0 consumes
power unnecessarily and may prevent the CPU from entering deeper C-states.
With recent PCIe hardware we can power down the ports to save power given
that we take into account few restrictions:
- The PCIe port hardware is recent enough, starting from 2015.
- Devices connected to PCIe ports are effectively in D3cold once the port
is transitioned to D3 (the config space is not accessible anymore and
the link may be powered down).
- Devices behind the PCIe port need to be allowed to transition to D3cold
and back. There is a way both drivers and userspace can forbid this.
- If the device behind the PCIe port is capable of waking the system it
needs to be able to do so from D3cold.
This patch adds a new flag to struct pci_device called 'bridge_d3'. This
flag is set and cleared by the PCI core whenever there is a change in power
management state of any of the devices behind the PCIe port. When system
later on is suspended we only need to check this flag and if it is true
transition the port to D3 otherwise we leave it in D0.
Also provide override mechanism via command line parameter
"pcie_port_pm=[off|force]" that can be used to disable or enable the
feature regardless of the BIOS manufacturing date.
Tested-by: Lukas Wunner <lukas@wunner.de>
Signed-off-by: Mika Westerberg <mika.westerberg@linux.intel.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Acked-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
After 104daa71b3 ("PCI: Determine actual VPD size on first access"), the
PCI core computes the valid VPD size by parsing the VPD starting at offset
0x0. We don't attempt to read past that valid size because that causes
some devices to crash.
However, some devices do have data past that valid size. For example,
Chelsio adapters contain two VPD structures, and the driver needs both of
them.
Add pci_set_vpd_size(). If a driver knows it is safe to read past the end
of the VPD data structure at offset 0, it can use pci_set_vpd_size() to
allow access to as much data as it needs.
[bhelgaas: changelog, split patches, rename to pci_set_vpd_size() and
return int (not ssize_t)]
Fixes: 104daa71b3 ("PCI: Determine actual VPD size on first access")
Tested-by: Steve Wise <swise@opengridcomputing.com>
Signed-off-by: Casey Leedom <leedom@chelsio.com>
Signed-off-by: Hariprasad Shenai <hariprasad@chelsio.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
We only support one flavor of VPD, so there's no need to complicate things
by having a "generic" struct pci_vpd and a more specific struct
pci_vpd_pci22.
Fold struct pci_vpd_pci22 directly into struct pci_vpd.
[bhelgaas: remove NULL check before kfree of dev->vpd (per kfreeaddr.cocci)]
Tested-by: Shane Seymour <shane.seymour@hpe.com>
Tested-by: Babu Moger <babu.moger@oracle.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Reviewed-by: Hannes Reinecke <hare@suse.com>
There's only one kind of VPD, so we don't need to qualify it as "the
version described by PCI spec rev 2.2."
Rename the following symbols to remove unnecessary "pci22":
PCI_VPD_PCI22_SIZE -> PCI_VPD_MAX_SIZE
pci_vpd_pci22_size() -> pci_vpd_size()
pci_vpd_pci22_wait() -> pci_vpd_wait()
pci_vpd_pci22_read() -> pci_vpd_read()
pci_vpd_pci22_write() -> pci_vpd_write()
pci_vpd_pci22_ops -> pci_vpd_ops
pci_vpd_pci22_init() -> pci_vpd_init()
Tested-by: Shane Seymour <shane.seymour@hpe.com>
Tested-by: Babu Moger <babu.moger@oracle.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Reviewed-by: Hannes Reinecke <hare@suse.com>
The struct pci_vpd_ops.release function pointer is always
pci_vpd_pci22_release(), so there's no need for the flexibility of a
function pointer.
Inline the pci_vpd_pci22_release() body into pci_vpd_release() and remove
pci_vpd_pci22_release() and the struct pci_vpd_ops.release function
pointer.
Tested-by: Shane Seymour <shane.seymour@hpe.com>
Tested-by: Babu Moger <babu.moger@oracle.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Reviewed-by: Hannes Reinecke <hare@suse.com>
Move pci_vpd_release() so it's next to the other VPD functions. This puts
it next to pci_vpd_pci22_init(), which allocates the space freed by
pci_vpd_release().
Tested-by: Shane Seymour <shane.seymour@hpe.com>
Tested-by: Babu Moger <babu.moger@oracle.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Reviewed-by: Hannes Reinecke <hare@suse.com>
The pci_platform_pm_ops structure is never modified, so declare it as
const.
Done with the help of Coccinelle.
Signed-off-by: Julia Lawall <Julia.Lawall@lip6.fr>
Acked-by: Bjorn Helgaas <bhelgaas@google.com>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
4a7cc83167 ("genirq/MSI: Move msi_list from struct pci_dev to struct
device") removed the contents of pci_msi_init_pci_dev(). All
implementation of it are now empty, so remove it completely.
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Commit b3a72384fe ("ARM/PCI: Replace pci_sys_data->align_resource with
global function pointer") introduced an ARM-specific align_resource()
function pointer. This is not portable to other arches and doesn't work
for platforms with two different PCIe host bridge controllers.
Move the function pointer to the pci_host_bridge structure so each host
bridge driver can specify its own align_resource() function.
Signed-off-by: Gabriele Paoloni <gabriele.paoloni@huawei.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Reviewed-by: Arnd Bergmann <arnd@arndb.de>
Add support for devices using Enhanced Allocation entries instead of BARs.
This allows the kernel to parse the EA Extended Capability structure in PCI
config space and claim the BAR-equivalent resources.
See https://pcisig.com/sites/default/files/specification_documents/ECN_Enhanced_Allocation_23_Oct_2014_Final.pdf
[bhelgaas: add spec URL, s/pci_ea_set_flags/pci_ea_flags/, consolidate
declarations, print unknown property in hex to match spec]
Signed-off-by: Sean O. Stalley <sean.stalley@intel.com>
[david.daney@cavium.com: Add more support/checking for Entry Properties,
allow EA behind bridges, rewrite some error messages.]
Signed-off-by: David Daney <david.daney@cavium.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Commit bac2a909a0 (PCI / PM: Avoid resuming PCI devices during
system suspend) introduced a mechanism by which some PCI devices that
were runtime-suspended at the system suspend time might be left in
that state for the duration of the system suspend-resume cycle.
However, it overlooked devices that were marked as capable of waking
up the system just because PME support was detected in their PCI
config space.
Namely, in that case, device_can_wakeup(dev) returns 'true' for the
device and if the device is not configured for system wakeup,
device_may_wakeup(dev) returns 'false' and it will be resumed during
system suspend even though configuring it for system wakeup may not
really make sense at all.
To avoid this problem, simply disable PME for PCI devices that have
not been configured for system wakeup and are runtime-suspended at
the system suspend time for the duration of the suspend-resume cycle.
If the device is in D3cold, its config space is not available and it
shouldn't be written to, but that's only possible if the device
has platform PM support and the platform code is responsible for
checking whether or not the device's configuration is suitable for
system suspend in that case.
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Some quirks search for a HyperTransport capability and use a hard-coded TTL
value of 48 to avoid an infinite loop.
Move the definition of PCI_FIND_CAP_TTL to pci.h and use it instead of the
hard-coded TTL values.
[bhelgaas: changelog]
Signed-off-by: Wei Yang <weiyang@linux.vnet.ibm.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
pci_ari_enabled() is useful outside of drivers/pci, particularly for
deriving INTx routing via ACPI _PRT, so move it to the global header.
Also convert to bool return.
Signed-off-by: Alex Williamson <alex.williamson@redhat.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Reviewed-by: Don Dutile <ddutile@redhat.com>
Acked-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Move pci_msi_set_enable() and pci_msix_clear_and_set_ctrl() to
drivers/pci/pci.h so they're available even when MSI isn't configured
into the kernel.
No functional change.
[bhelgaas: changelog, split into separate patch]
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Reviewed-by: Fam Zheng <famz@redhat.com>
- Numerous minor fixes, cleanups etc.
- More EEH work from Gavin to remove its dependency on device_nodes.
- Memory hotplug implemented entirely in the kernel from Nathan Fontenot.
- Removal of redundant CONFIG_PPC_OF by Kevin Hao.
- Rewrite of VPHN parsing logic & tests from Greg Kurz.
- A fix from Nish Aravamudan to reduce memory usage by clamping
nodes_possible_map.
- Support for pstore on powernv from Hari Bathini.
- Removal of old powerpc specific byte swap routines by David Gibson.
- Fix from Vasant Hegde to prevent the flash driver telling you it was flashing
your firmware when it wasn't.
- Patch from Ben Herrenschmidt to add an OPAL heartbeat driver.
- Fix for an oops causing get/put_cpu_var() imbalance in perf by Jan Stancek.
- Some fixes for migration from Tyrel Datwyler.
- A new syscall to switch the cpu endian by Michael Ellerman.
- Large series from Wei Yang to implement SRIOV, reviewed and acked by Bjorn.
- A fix for the OPAL sensor driver from Cédric Le Goater.
- Fixes to get STRICT_MM_TYPECHECKS building again by Michael Ellerman.
- Large series from Daniel Axtens to make our PCI hooks per PHB rather than per
machine.
- Small patch from Sam Bobroff to explicitly abort non-suspended transactions
on syscalls, plus a test to exercise it.
- Numerous reworks and fixes for the 24x7 PMU from Sukadev Bhattiprolu.
- Small patch to enable the hard lockup detector from Anton Blanchard.
- Fix from Dave Olson for missing L2 cache information on some CPUs.
- Some fixes from Michael Ellerman to get Cell machines booting again.
- Freescale updates from Scott: Highlights include BMan device tree nodes, an
MSI erratum workaround, a couple minor performance improvements, config
updates, and misc fixes/cleanup.
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1
iQIcBAABAgAGBQJVL2cxAAoJEFHr6jzI4aWAR8cP/19VTo/CzCE4ffPSx7qR464n
F+WFZcbNjIMXu6+B0YLuJZEsuWtKKrCit/MCg3+mSgE4iqvxmtI+HDD0445Buszj
UD4E4HMdPrXQ+KUSUDORvRjv/FFUXIa94LSv/0g2UeMsPz/HeZlhMxEu7AkXw9Nf
rTxsmRTsOWME85Y/c9ss7XHuWKXT3DJV7fOoK9roSaN3dJAuWTtG3WaKS0nUu0ok
0M81D6ZczoD6ybwh2DUMPD9K6SGxLdQ4OzQwtW6vWzcQIBDfy5Pdeo0iAFhGPvXf
T4LLPkv4cF4AwHsAC4rKDPHQNa+oZBoLlScrHClaebAlDiv+XYKNdMogawUObvSh
h7avKmQr0Ygp1OvvZAaXLhuDJI9FJJ8lf6AOIeULgHsDR9SyKMjZWxRzPe11uarO
Fyi0qj3oJaQu6LjazZraApu8mo+JBtQuD3z3o5GhLxeFtBBF60JXj6zAXJikufnl
kk1/BUF10nKUhtKcDX767AMUCtMH3fp5hx8K/z9T5v+pobJB26Wup1bbdT68pNBT
NjdKUppV6QTjZvCsA6U2/ECu6E9KeIaFtFSL2IRRoiI0dWBN5/5eYn3RGkO2ZFoL
1NdwKA2XJcchwTPkpSRrUG70sYH0uM2AldNYyaLfjzrQqza7Y6lF699ilxWmCN/H
OplzJAE5cQ8Am078veTW
=03Yh
-----END PGP SIGNATURE-----
Merge tag 'powerpc-4.1-1' of git://git.kernel.org/pub/scm/linux/kernel/git/mpe/linux
Pull powerpc updates from Michael Ellerman:
- Numerous minor fixes, cleanups etc.
- More EEH work from Gavin to remove its dependency on device_nodes.
- Memory hotplug implemented entirely in the kernel from Nathan
Fontenot.
- Removal of redundant CONFIG_PPC_OF by Kevin Hao.
- Rewrite of VPHN parsing logic & tests from Greg Kurz.
- A fix from Nish Aravamudan to reduce memory usage by clamping
nodes_possible_map.
- Support for pstore on powernv from Hari Bathini.
- Removal of old powerpc specific byte swap routines by David Gibson.
- Fix from Vasant Hegde to prevent the flash driver telling you it was
flashing your firmware when it wasn't.
- Patch from Ben Herrenschmidt to add an OPAL heartbeat driver.
- Fix for an oops causing get/put_cpu_var() imbalance in perf by Jan
Stancek.
- Some fixes for migration from Tyrel Datwyler.
- A new syscall to switch the cpu endian by Michael Ellerman.
- Large series from Wei Yang to implement SRIOV, reviewed and acked by
Bjorn.
- A fix for the OPAL sensor driver from Cédric Le Goater.
- Fixes to get STRICT_MM_TYPECHECKS building again by Michael Ellerman.
- Large series from Daniel Axtens to make our PCI hooks per PHB rather
than per machine.
- Small patch from Sam Bobroff to explicitly abort non-suspended
transactions on syscalls, plus a test to exercise it.
- Numerous reworks and fixes for the 24x7 PMU from Sukadev Bhattiprolu.
- Small patch to enable the hard lockup detector from Anton Blanchard.
- Fix from Dave Olson for missing L2 cache information on some CPUs.
- Some fixes from Michael Ellerman to get Cell machines booting again.
- Freescale updates from Scott: Highlights include BMan device tree
nodes, an MSI erratum workaround, a couple minor performance
improvements, config updates, and misc fixes/cleanup.
* tag 'powerpc-4.1-1' of git://git.kernel.org/pub/scm/linux/kernel/git/mpe/linux: (196 commits)
powerpc/powermac: Fix build error seen with powermac smp builds
powerpc/pseries: Fix compile of memory hotplug without CONFIG_MEMORY_HOTREMOVE
powerpc: Remove PPC32 code from pseries specific find_and_init_phbs()
powerpc/cell: Fix iommu breakage caused by controller_ops change
powerpc/eeh: Fix crash in eeh_add_device_early() on Cell
powerpc/perf: Cap 64bit userspace backtraces to PERF_MAX_STACK_DEPTH
powerpc/perf/hv-24x7: Fail 24x7 initcall if create_events_from_catalog() fails
powerpc/pseries: Correct memory hotplug locking
powerpc: Fix missing L2 cache size in /sys/devices/system/cpu
powerpc: Add ppc64 hard lockup detector support
oprofile: Disable oprofile NMI timer on ppc64
powerpc/perf/hv-24x7: Add missing put_cpu_var()
powerpc/perf/hv-24x7: Break up single_24x7_request
powerpc/perf/hv-24x7: Define update_event_count()
powerpc/perf/hv-24x7: Whitespace cleanup
powerpc/perf/hv-24x7: Define add_event_to_24x7_request()
powerpc/perf/hv-24x7: Rename hv_24x7_event_update
powerpc/perf/hv-24x7: Move debug prints to separate function
powerpc/perf/hv-24x7: Drop event_24x7_request()
powerpc/perf/hv-24x7: Use pr_devel() to log message
...
Conflicts:
tools/testing/selftests/powerpc/Makefile
tools/testing/selftests/powerpc/tm/Makefile
The find_pci_host_bridge() function can be useful to other PCI code so
export it. Change its name to pci_find_host_bridge().
Signed-off-by: Aaron Lu <aaron.lu@intel.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
An SR-IOV device can change its First VF Offset and VF Stride based on the
values of ARI Capable Hierarchy and NumVFs. The number of buses required
for all VFs is determined by NumVFs, First VF Offset, and VF Stride (see
SR-IOV spec r1.1, sec 2.1.2).
Previously pci_iov_bus_range() computed how many buses would be required by
TotalVFs, but this was based on a single NumVFs value and may not have been
the maximum for all NumVFs configurations.
Iterate over all valid NumVFs and calculate the maximum number of bus
numbers that could ever be required for VFs of this device.
[bhelgaas: changelog, compute busnr of NumVFs, not TotalVFs, remove
kerenl-doc comment marker]
Signed-off-by: Wei Yang <weiyang@linux.vnet.ibm.com>
Acked-by: Bjorn Helgaas <bhelgaas@google.com>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Currently we don't store the individual VF BAR size. We calculate it when
needed by dividing the PF's IOV resource size (which contains space for
*all* the VFs) by total_VFs or by reading the BAR in the SR-IOV capability
again.
Keep the individual VF BAR size in struct pci_sriov.barsz[], add
pci_iov_resource_size() to retrieve it, and use that instead of doing the
division or reading the SR-IOV capability BAR.
[bhelgaas: rename to "barsz[]", simplify barsz[] index computation, remove
SR-IOV capability BAR sizing]
Signed-off-by: Wei Yang <weiyang@linux.vnet.ibm.com>
Acked-by: Bjorn Helgaas <bhelgaas@google.com>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Commit f25c0ae2b4 (ACPI / PM: Avoid resuming devices in ACPI PM
domain during system suspend) modified the ACPI PM domain's system
suspend callbacks to allow devices attached to it to be left in the
runtime-suspended state during system suspend so as to optimize
the suspend process.
This was based on the general mechanism introduced by commit
aae4518b31 (PM / sleep: Mechanism to avoid resuming runtime-suspended
devices unnecessarily).
Extend that approach to PCI devices by modifying the PCI bus type's
->prepare callback to return 1 for devices that are runtime-suspended
when it is being executed and that are in a suitable power state and
need not be resumed going forward.
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Acked-by: Bjorn Helgaas <bhelgaas@google.com>
Add pci_bus_clip_resource(). If a PCI-PCI bridge window overlaps an
upstream bridge window but is not completely contained by it, this clips
the downstream window so it fits inside the upstream one.
No functional change (this adds the function but no callers).
[bhelgaas: changelog, split into separate patch]
Link: https://bugzilla.kernel.org/show_bug.cgi?id=85491
Reported-by: Marek Kordik <kordikmarek@gmail.com>
Fixes: 5b28541552 ("PCI: Restrict 64-bit prefetchable bridge windows to 64-bit resources")
Signed-off-by: Yinghai Lu <yinghai@kernel.org>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
CC: stable@vger.kernel.org # v3.16+
pci_iov_resource_bar() always sets its 'pci_bar_type' parameter to
'pci_bar_unknown'. Drop the parameter and just use 'pci_bar_unknown'
directly in the callers.
No functional change intended.
Signed-off-by: Myron Stowe <myron.stowe@redhat.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
CC: Chris Wright <chrisw@sous-sol.org>
CC: Yu Zhao <yuzhao@google.com>
Previously we applied _HPX type 2 record Link Control register settings
only to bridges with a subordinate bus. But it's better to apply them to
all devices with a link because if the subordinate bus has not been
allocated yet, we won't apply settings to the device.
Use pcie_cap_has_lnkctl() to determine whether the device has a Link
Control register instead of looking at dev->subordinate.
[bhelgaas: changelog]
Fixes: 6cd33649fa ("PCI: Add pci_configure_device() during enumeration")
Signed-off-by: Yinghai Lu <yinghai@kernel.org>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
* pci/hotplug:
PCI: cpqphp: Fix possible null pointer dereference
NVMe: Implement PCIe reset notification callback
PCI: Notify driver before and after device reset
* pci/pci_is_bridge:
pcmcia: Use pci_is_bridge() to simplify code
PCI: pciehp: Use pci_is_bridge() to simplify code
PCI: acpiphp: Use pci_is_bridge() to simplify code
PCI: cpcihp: Use pci_is_bridge() to simplify code
PCI: shpchp: Use pci_is_bridge() to simplify code
PCI: rpaphp: Use pci_is_bridge() to simplify code
sparc/PCI: Use pci_is_bridge() to simplify code
powerpc/PCI: Use pci_is_bridge() to simplify code
ia64/PCI: Use pci_is_bridge() to simplify code
x86/PCI: Use pci_is_bridge() to simplify code
PCI: Use pci_is_bridge() to simplify code
PCI: Add new pci_is_bridge() interface
PCI: Rename pci_is_bridge() to pci_has_subordinate()
* pci/virtualization:
PCI: Introduce new device binding path using pci_dev.driver_override
Conflicts:
drivers/pci/pci-sysfs.c
Previously, pci_is_bridge() returned true only when a subordinate bus
existed. Rename pci_is_bridge() to pci_has_subordinate() to better
indicate what we're checking.
No functional change.
[bhelgaas: changelog]
Signed-off-by: Yijing Wang <wangyijing@huawei.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Some PCI functions used to be marked __devinit. When CONFIG_HOTPLUG was
not set, these functions were discarded after boot. A few callers of these
__devinit functions were marked __ref to indicate that they could safely
call the __devinit functions even though the callers were not __devinit.
But CONFIG_HOTPLUG and __devinit are now gone, and the need for the __ref
annotations is also gone, so remove them. Relevant historical commits:
54b956b903 Remove __dev* markings from init.h
a8e4b9c101 PCI: add generic pci_hp_add_bridge()
0ab2b57f8d PCI: fix section mismatch warning in pci_scan_child_bus
451124a7cc PCI: fix 4x section mismatch warnings
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
This reverts commit 74bb1bcc7d ("PCI: handle SR-IOV Virtual Function
Migration"), removing this exported interface:
pci_sriov_migration()
Since pci_sriov_migration() is unused, it is impossible to schedule
sriov_migration_task() or use any of the other migration infrastructure.
This is based on Stephen Hemminger's patch (see link below), but goes a bit
further.
Link: http://lkml.kernel.org/r/20131227132710.7190647c@nehalam.linuxnetplumber.net
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
CC: Stephen Hemminger <stephen@networkplumber.org>
Using 'make namespacecheck' identify code which should be declared static.
Checked for users in other driver/archs as well. Compile tested only.
This stops exporting the following interfaces to modules:
pci_target_state()
pci_load_saved_state()
[bhelgaas: retained pci_find_next_ext_capability() and pci_cfg_space_size()]
Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
The dev_attrs field of struct bus_type is going away soon, dev_groups
should be used instead. This converts the PCI bus code to use the
correct field.
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
The bus_attrs field of struct bus_type is going away soon, dev_groups
should be used instead. This converts the PCI bus code to use the
correct field.
Cc: Bjorn Helgaas <bhelgaas@google.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Pull networking changes from David Miller:
"Noteworthy changes this time around:
1) Multicast rejoin support for team driver, from Jiri Pirko.
2) Centralize and simplify TCP RTT measurement handling in order to
reduce the impact of bad RTO seeding from SYN/ACKs. Also, when
both timestamps and local RTT measurements are available prefer
the later because there are broken middleware devices which
scramble the timestamp.
From Yuchung Cheng.
3) Add TCP_NOTSENT_LOWAT socket option to limit the amount of kernel
memory consumed to queue up unsend user data. From Eric Dumazet.
4) Add a "physical port ID" abstraction for network devices, from
Jiri Pirko.
5) Add a "suppress" operation to influence fib_rules lookups, from
Stefan Tomanek.
6) Add a networking development FAQ, from Paul Gortmaker.
7) Extend the information provided by tcp_probe and add ipv6 support,
from Daniel Borkmann.
8) Use RCU locking more extensively in openvswitch data paths, from
Pravin B Shelar.
9) Add SCTP support to openvswitch, from Joe Stringer.
10) Add EF10 chip support to SFC driver, from Ben Hutchings.
11) Add new SYNPROXY netfilter target, from Patrick McHardy.
12) Compute a rate approximation for sending in TCP sockets, and use
this to more intelligently coalesce TSO frames. Furthermore, add
a new packet scheduler which takes advantage of this estimate when
available. From Eric Dumazet.
13) Allow AF_PACKET fanouts with random selection, from Daniel
Borkmann.
14) Add ipv6 support to vxlan driver, from Cong Wang"
Resolved conflicts as per discussion.
* git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-next: (1218 commits)
openvswitch: Fix alignment of struct sw_flow_key.
netfilter: Fix build errors with xt_socket.c
tcp: Add missing braces to do_tcp_setsockopt
caif: Add missing braces to multiline if in cfctrl_linkup_request
bnx2x: Add missing braces in bnx2x:bnx2x_link_initialize
vxlan: Fix kernel panic on device delete.
net: mvneta: implement ->ndo_do_ioctl() to support PHY ioctls
net: mvneta: properly disable HW PHY polling and ensure adjust_link() works
icplus: Use netif_running to determine device state
ethernet/arc/arc_emac: Fix huge delays in large file copies
tuntap: orphan frags before trying to set tx timestamp
tuntap: purge socket error queue on detach
qlcnic: use standard NAPI weights
ipv6:introduce function to find route for redirect
bnx2x: VF RSS support - VF side
bnx2x: VF RSS support - PF side
vxlan: Notify drivers for listening UDP port changes
net: usbnet: update addr_assign_type if appropriate
driver/net: enic: update enic maintainers and driver
driver/net: enic: Exposing symbols for Cisco's low latency driver
...
pcie_link_speed and pcix_bus_speed are arrays used by probe.c to correctly
convert lnksta register values into the pci_bus_speed enum. These static arrays
are useful outside probe for this purpose. This patch makes these defines into
conist arrays and exposes them with an extern header in drivers/pci/pci.h
-v2-
* move extern declarations to drivers/pci/pci.h
CC: Bjorn Helgaas <bhelgaas@google.com>
Signed-off-by: Jacob Keller <jacob.e.keller@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
The dev_attrs field of struct class is going away soon, dev_groups
should be used instead. This converts the PCI class code to use the
correct field.
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
On x86 platforms, the kernel respects PCI resource assignments from
the BIOS and only reassigns resources for unassigned BARs at boot
time. However, with the ACPI-based hotplug (acpiphp), it ignores the
BIOS' PCI resource assignments completely and reassigns all resources
by itself. This causes differences in PCI resource allocation
between boot time and runtime hotplug to occur, which is generally
undesirable and sometimes actively breaks things.
Namely, if there are enough resources, reassigning all PCI resources
during runtime hotplug should work, but it may fail if the resources
are constrained. This may happen, for instance, when some PCI
devices with huge MMIO BARs are involved in the runtime hotplug
operations, because the current PCI MMIO alignment algorithm may
waste huge chunks of MMIO address space in those cases.
On the Alexander's Sony VAIO VPCZ23A4R the BIOS allocates limited
MMIO resources for the dock station which contains a device
(graphics adapter) with a 256MB MMIO BAR. An attempt to reassign
that during runtime hotplug causes the dock station MMIO window to be
exhausted and acpiphp fails to allocate resources for the majority
of devices on the dock station as a result.
To prevent that from happening, modify acpiphp to follow the boot
time resources allocation behavior so that the BIOS' resource
assignments are respected during runtime hotplug too.
[rjw: Changelog]
References: https://bugzilla.kernel.org/show_bug.cgi?id=56531
Reported-and-tested-by: Alexander E. Patrakov <patrakov@gmail.com>
Tested-by: Illya Klymov <xanf@xanf.me>
Signed-off-by: Jiang Liu <jiang.liu@huawei.com>
Acked-by: Yinghai Lu <yinghai@kernel.org>
Cc: 3.9+ <stable@vger.kernel.org>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>