The dw_pcie_readl_rc() and dw_pcie_writel_rc() interfaces already add in
pp->dbi_base, so use those instead of doing it ourselves in the imx6
driver. No functional change intended.
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Only interfaces used from outside the driver, e.g., those called by the
DesignWare core, need to accept pointers to the generic struct pcie_port.
Internal interfaces can accept pointers to the device-specific struct,
which makes them more straightforward. No functional change intended.
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Pass the struct imx6_pcie pointer, not dbi_base address, to PHY accessors.
This enables future simplifications. No functional change intended.
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
"np" and "node" are redundant copies of the of_node pointer. Remove "np"
and use "node" instead. Replace the "fsl,max-link-speed" use with "node"
as well. No functional change intended.
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Use a local "struct device *dev" for brevity and consistency with other
drivers. No functional change intended.
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Swap order of dw_pcie_readl_unroll() arguments to match the "dev, pos, val"
order used by pci_write_config_word() and other drivers. No functional
change intended.
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
The register accessors are not performance critical and small enough that
the compiler can inline them itself if it makes sense.
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Export dw_pcie_readl_rc() and dw_pcie_writel_rc(). Many other drivers can
use these instead of implementing their own versions. No functional change
intended.
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Swap order of dw_pcie_writel_rc() arguments to match the "dev, pos, val"
order used by pci_write_config_word() and other drivers. No functional
change intended.
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
The struct pcie_host_ops.readl_rc() and .writel_rc() function pointers
allow a driver to override the default DesignWare register accessors.
Make the signature of the override functions the same as the default
accessors. This makes the default dw_pcie_readl_rc() and the corresponding
override more structurally similar: both will compute the final register
address with "pp->dbi_base + reg". Previously dw_pcie_readl_rc() computed
the address and passed it to the override.
No functional change intended.
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
dw_pcie_readl_unroll() and dw_pcie_writel_unroll() duplicate what
dw_pcie_readl_rc() and dw_pcie_writel_rc() already do, so call them
directly.
[bhelgaas: reworked into patch series]
Signed-off-by: Kishon Vijay Abraham I <kishon@ti.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Rename dw_pcie_valid_config() to dw_pcie_valid_device() and use the result
directly as a boolean value instead of testing against 0. No functional
change intended.
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
* pci/host-vmd:
x86/PCI: VMD: Move VMD driver to drivers/pci/host
x86/PCI: VMD: Synchronize with RCU freeing MSI IRQ descs
x86/PCI: VMD: Eliminate index member from IRQ list
x86/PCI: VMD: Eliminate vmd_vector member from list type
x86/PCI: VMD: Convert to use pci_alloc_irq_vectors() API
x86/PCI: VMD: Allocate IRQ lists with correct MSI-X count
PCI: Use positive flags in pci_alloc_irq_vectors()
PCI: Update "pci=resource_alignment" documentation
Conflicts:
drivers/pci/host/Kconfig
drivers/pci/host/Makefile
* pci/host-aardvark:
PCI: aardvark: Remove redundant dev_err call in advk_pcie_probe()
* pci/host-altera:
PCI: altera: Remove redundant platform_get_resource() return value check
PCI: altera: Move retrain from fixup to altera_pcie_host_init()
PCI: altera: Rework config accessors for use without a struct pci_bus
PCI: altera: Poll for link training status after retraining the link
* pci/host-artpec:
PCI: artpec6: Drop __init from artpec6_add_pcie_port()
* pci/host-designware:
PCI: designware: Remove redundant platform_get_resource() return value check
PCI: designware: Exchange viewport of `MEMORYs' and `CFGs/IOs'
PCI: designware: Keep viewport fixed for IO transaction if num_viewport > 2
PCI: designware: Check LTSSM training bit before deciding link is up
PCI: designware: Add iATU Unroll feature
PCI: designware: Wait for iATU enable
PCI: designware: Move link wait definitions to .c file
PCI: designware: Return data directly from dw_pcie_readl_rc()
* pci/host-hv:
PCI: hv: Handle hv_pci_generic_compl() error case
PCI: hv: Handle vmbus_sendpacket() failure in hv_compose_msi_msg()
PCI: hv: Remove the unused 'wrk' in struct hv_pcibus_device
PCI: hv: Use pci_function_description[0] in struct definitions
PCI: hv: Use zero-length array in struct pci_packet
PCI: hv: Use list_move_tail() instead of list_del() + list_add_tail()
* pci/host-keystone:
PCI: keystone: Propagate request_irq() failure
* pci/host-rcar:
PCI: rcar: Try increasing PCIe link speed to 5 GT/s at boot
PCI: rcar: Fix some checkpatch warnings
PCI: rcar: Add multi-MSI support
PCI: rcar: Don't disable/unprepare clocks on prepare/enable failure
PCI: rcar: Consolidate register space lookup and ioremap
* pci/host-rockchip:
PCI: rockchip: Fix wrong transmitted FTS count
PCI: rockchip: Improve the deassert sequence of four reset pins
PCI: rockchip: Increase the Max Credit update interval
PCI: rockchip: Add Rockchip PCIe controller support
dt-bindings: PCI: rockchip: Add DT bindings for Rockchip PCIe controller
* pci/host-tegra:
PCI: tegra: Use of_device_get_match_data()
PCI: tegra: Remove redundant _data suffix
* pci/host-xilinx:
microblaze/PCI: Add multidomain support for procfs
PCI: xilinx: Dispose of MSI virtual IRQ
PCI: xilinx: Clear correct MSI set bit
PCI: xilinx: Clear interrupt register for invalid interrupt
PCI: xilinx: Keep both legacy and MSI interrupt domain references
PCI: xilinx-nwl: Enable all MSI interrupts using MSI mask
PCI: xilinx-nwl: Expand error logging
Conflicts:
drivers/pci/host/pcie-xilinx.c
Move the driver source and Kconfig to the PCI host bridge drivers directory
and move the config option to a more appropriate sub-menu instead of
occupying the top-level location.
Update the Kconfig option with the X86_64 dependency that was implicitly
included from the previous location, and add information about the module
name when built as a loadable module.
Signed-off-by: Keith Busch <keith.busch@intel.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
CC: Jon Derrick <jonathan.derrick@intel.com>
If the expected number of FTS aren't received by RC when exiting from L0s,
the LTSSM will fall into recover state, which means it will need to send TS
for retraining which makes the latency of exiting from L0s a little longer
than expected. This issue is caused by an incorrect reset value of FTS
count on PLC1 register (offset 0x4). The expected value for Gen1/2 should
be more than 240 and we may leave a little margin here. Fix this before
starting Gen1 training which will make TS1 contain the correct FTS count.
Signed-off-by: Shawn Lin <shawn.lin@rock-chips.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Per TRM, we need to deassert the four reset pins simultaneously. Currently
the reset framework doesn't support that so we did it one by one. It seems
no side effect found but it does impact the state machine of controller, so
sometimes the change speed bit is not set when sending training sequence
from recover state. After the silicon RTL review from SoC guys, we don't
need to do the sequence recommended by TRM, and could just move the
deassert of mgmt_sticky_rst to the first place.
Signed-off-by: Shawn Lin <shawn.lin@rock-chips.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Increase the likelihood of link state to automatically go to L1 and save
some power.
The default credit update interval of 7.5 us results in the rootport
sending UpdateFC-P and UpdateFC-NP packets too often, thus resulting in the
link never going to L1, and always staying in L0/L0s. The value 24 us was
chosen after some experiments and peeking over the PCIe bus to see that we
do enter L1 substate when there is not enough traffic on the PCIe bus.
Signed-off-by: Rajat Jain <rajatja@google.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Acked-by: Shawn Lin <shawn.lin@rock-chips.com>
The PCIe link speed is initially set to 2.5 GT/s. Try to increase the link
speed to 5 GT/s.
Based on original patch by Grigory Kletsko
<grigory.kletsko@cogentembedded.com>.
[bhelgaas: remove "Trying speed up" message, remove unused SPCHG]
Signed-off-by: Sergei Shtylyov <sergei.shtylyov@cogentembedded.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Acked-by: Simon Horman <horms+renesas@verge.net.au>
* pci/virtualization:
PCI: xilinx: Relax device number checking to allow SR-IOV
PCI: designware: Relax device number checking to allow SR-IOV
PCI: altera: Relax device number checking to allow SR-IOV
PCI: Check for pci_setup_device() failure in pci_iov_add_virtfn()
PCI: Mark Atheros AR9580 to avoid bus reset
* pci/pm:
PCI: Avoid unnecessary resume after direct-complete
PCI: Recognize D3cold in pci_update_current_state()
PCI: Query platform firmware for device power state
PCI: Afford direct-complete to devices with non-standard PM
* pci/hotplug:
x86/PCI: VMD: Request userspace control of PCIe hotplug indicators
PCI: pciehp: Allow exclusive userspace control of indicators
PCI: pciehp: Remove useless pciehp_get_latch_status() calls
PCI: pciehp: Clean up dmesg "Slot(%s)" messages
PCI: pciehp: Remove unnecessary guard
PCI: pciehp: Don't re-read Slot Status when handling surprise event
PCI: pciehp: Don't re-read Slot Status when queuing hotplug event
PCI: pciehp: Process all hotplug events before looking for new ones
PCI: pciehp: Return IRQ_NONE when we can't read interrupt status
PCI: pciehp: Rename pcie_isr() locals for clarity
PCI: pciehp: Clear attention LED on device add
0516c8bcd2 ("PCI: PCIe portdrv: Simplily probe callback of service
drivers") removed the "id" argument of aer_probe() but neglected to remove
the kernel-doc comment. Update the comment.
[bhelgaas: changelog]
Signed-off-by: Cao jin <caoj.fnst@cn.fujitsu.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Resource allocation for VFs is done via the VF BARx registers in the PF's
SR-IOV Capability, and the BARs in the VFs themselves are read-only zeros
(see SR-IOV spec r1.1, secs 3.3.14 and 3.4.1.11).
Even though the actual VF BARs are read-only zeros, the VF dev->resource[]
structs describe the space allocated for the VF (this is a piece of the
space described by the VF BARx register in the PF's SR-IOV capability).
It's meaningless to request additional alignment for a VF: the VF BAR
alignment is completely determined by the alignment of the VF BARx in the
PF and the size of the VF BAR.
Ignore the user's alignment requests for VF devices.
Signed-off-by: Yongji Xie <xyjxie@linux.vnet.ibm.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Users may request additional alignment of PCI resources, e.g., to align
BARs on page boundaries so they can be shared with guests via VFIO. This
of course may require reallocation if firmware has already assigned the
BARs with smaller alignments.
If the platform has requested PCI_PROBE_ONLY, we should never change any
PCI BARs, so we can't provide any additional alignment. Also, if a BAR is
marked as IORESOURCE_PCI_FIXED, e.g., for PCI Enhanced Allocation or if the
firmware depends on the current BAR value, we can't change the alignment.
In these cases, log a message and ignore the user's alignment requests.
[bhelgaas: changelog, use goto to simplify PCI_PROBE_ONLY check]
Signed-off-by: Yongji Xie <xyjxie@linux.vnet.ibm.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Commit 58a1fbbb2e ("PM / PCI / ACPI: Kick devices that might have been
reset by firmware") added a runtime resume for devices that were runtime
suspended when the system entered sleep.
The motivation was that devices might be in a reset-power-on state after
waking from system sleep, so their power state as perceived by Linux
(stored in pci_dev->current_state) would no longer reflect reality. By
resuming such devices, we allow them to return to a low-power state via
autosuspend and also bring their current_state in sync with reality.
However for devices that are *not* in a reset-power-on state, doing an
unconditional resume wastes energy. A more refined approach is called for
which issues a runtime resume only if the power state after direct-complete
is shallower than it was before. To achieve this, update the device's
current_state and compare it to its pre-sleep value.
Signed-off-by: Lukas Wunner <lukas@wunner.de>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Acked-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Whenever a device is resumed or its power state is changed using the
platform, its new power state is read from the PM Control & Status Register
and cached in pci_dev->current_state by calling pci_update_current_state().
If the device is in D3cold, reading from config space typically results in
a fabricated "all ones" response. But if it's in D3hot, the two bits
representing the power state in the PMCSR are *also* set to 1. Thus D3hot
and D3cold are not discernible by just reading the PMCSR.
To account for this, pci_update_current_state() uses two workarounds:
- When transitioning to D3cold using pci_platform_power_transition(), the
new power state is set blindly by pci_update_current_state(), i.e.
without verifying that the device actually *is* in D3cold. This is
achieved by setting the "state" argument to PCI_D3cold. The "state"
argument was originally intended to convey the new state in case the
device doesn't have the PM capability. It is *also* used to convey the
device state if the PM capability is present and the new state is D3cold,
but this was never explained in the kerneldoc.
- Once the current_state is set to D3cold, further invocations of
pci_update_current_state() will blindly assume that the device is still
in D3cold and leave the current_state unmodified. To get out of this
impasse, the current_state has to be set directly, typically by calling
pci_raw_set_power_state() or pci_enable_device().
It would be desirable if pci_update_current_state() could reliably detect
D3cold by itself. That would allow us to do away with these workarounds,
and it would allow for a smarter, more energy conserving runtime resume
strategy after system sleep: Currently devices which utilize
direct_complete are mandatorily runtime resumed in their ->complete stage.
This can be avoided if their power state after system sleep is the same as
before, but it requires a mechanism to detect the power state reliably.
We've just gained the ability to query the platform firmware for its
opinion on the device's power state. On platforms conforming to ACPI 4.0
or newer, this allows recognition of D3cold. Pre-4.0 platforms lack _PR3
and therefore the deepest power state that will ever be reported is D3hot,
even though the device may actually be in D3cold. To detect D3cold in
those cases, accessibility of the vendor ID in config space is probed using
pci_device_is_present(). This also works for devices which are not
platform-power-manageable at all, but can be suspended to D3cold using a
nonstandard mechanism (e.g. some hybrid graphics laptops or Thunderbolt on
the Mac).
Signed-off-by: Lukas Wunner <lukas@wunner.de>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Acked-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Usually the most accurate way to determine a PCI device's power state is to
read its PM Control & Status Register. There are two cases however when
this is not an option: If the device doesn't have the PM capability at
all, or if it is in D3cold (in which case its config space is
inaccessible).
In both cases, we can alternatively query the platform firmware for its
opinion on the device's power state. To facilitate this, augment struct
pci_platform_pm_ops with a ->get_power callback and implement it for
acpi_pci_platform_pm (the only pci_platform_pm_ops existing so far).
It is used by a forthcoming commit to let pci_update_current_state()
recognize D3cold.
Signed-off-by: Lukas Wunner <lukas@wunner.de>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Acked-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
There are devices not power-manageable by the platform, but still able to
runtime suspend to D3cold with a non-standard mechanism. One example is
laptop hybrid graphics where the discrete GPU and its built-in HDA
controller are power-managed either with a _DSM (AMD PowerXpress, Nvidia
Optimus) or a separate gmux controller (MacBook Pro). Another example is
Thunderbolt on Macs which is power-managed with custom ACPI methods.
When putting the system to sleep, we currently handle such devices
improperly by transitioning them from D3cold to D3hot (the default power
state defined at the top of pci_target_state()). This wastes energy and
prolongs the suspend sequence (powering up the Thunderbolt controller takes
2 seconds).
Avoid that by assuming that a non-standard PM mechanism is at work if the
device is not platform-power-manageable but currently in D3cold.
If the device is wakeup enabled, we might still have to wake it up from
D3cold if PME cannot be signaled from that power state.
The check for devices without PM capability comes before the check for
D3cold since such devices could in theory also be powered down by
non-standard means and should then be afforded direct-complete as well.
Signed-off-by: Lukas Wunner <lukas@wunner.de>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Acked-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Save the position of the error reporting capability so it doesn't need to
be rediscovered during error handling.
Signed-off-by: Keith Busch <keith.busch@intel.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
CC: Lukas Wunner <lukas@wunner.de>
When handling AER events, we previously allocated a struct aer_err_info,
processed the error, and freed the struct. But aer_isr_one_error() is
serialized by rpc_mutex, so we never need more than one copy of the struct,
and the struct is only about 70 bytes, so we're not saving much by
allocating it dynamically.
Embed a struct aer_err_info directly in struct aer_rpc, which is allocated
at probe-time by aer_probe().
[bhelgaas: changelog]
Suggested-by: Bjorn Helgaas <bhelgaas@google.com>
Signed-off-by: Jon Derrick <jonathan.derrick@intel.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Add set_dev_domain_options() to set PCI domain-specific options as devices
are added. The first usage is to request exclusive userspace control of
PCIe hotplug indicators in VMD domains.
Devices in a VMD domain use PCIe hotplug Attention and Power Indicators in
a non-standard way; tell pciehp to ignore the indicators so userspace can
control them via the sysfs "attention" file.
To determine whether a bus is within a VMD domain, add a bool to the
pci_sysdata structure that the VMD driver sets during initialization.
[bhelgaas: changelog]
Requested-by: Kapil Karkra <kapil.karkra@intel.com>
Tested-by: Artur Paszkiewicz <artur.paszkiewicz@intel.com>
Signed-off-by: Keith Busch <keith.busch@intel.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
PCIe hotplug supports optional Attention and Power Indicators, which are
used internally by pciehp. Users can't control the Power Indicator, but
they can control the Attention Indicator by writing to a sysfs "attention"
file.
The Slot Control register has two bits for each indicator, and the PCIe
spec defines the encodings for each as (Reserved/On/Blinking/Off). For
sysfs "attention" writes, pciehp_set_attention_status() maps into these
encodings, so the only useful write values are 0 (Off), 1 (On), and 2
(Blinking).
However, some platforms use all four bits for platform-specific indicators,
and they need to allow direct user control of them while preventing pciehp
from using them at all.
Add a "hotplug_user_indicators" flag to the pci_dev structure. When set,
pciehp does not use either the Attention Indicator or the Power Indicator,
and the low four bits (values 0x0 - 0xf) of sysfs "attention" write values
are written directly to the Attention Indicator Control and Power Indicator
Control fields.
[bhelgaas: changelog, rename flag and accessors to s/attention/indicator/]
Signed-off-by: Keith Busch <keith.busch@intel.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Currently the AER severity is calculated by calling cper_severity_to_aer(),
but the parameter sent is actually the GHES severity. This causes the AER
severity to be incorrect.
Fix the parameter to be the CPER severity instead of the GHES severity.
Signed-off-by: Tyler Baicar <tbaicar@codeaurora.org>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Reviewed-by: Borislav Petkov <bp@suse.de>
Currently the AER severity is being translated twice in the code flow for
PCIe errors. It is first translated in ghes_do_proc() before calling into
the AER driver. Then it is translated again when the AER driver calls
cper_print_aer(). This causes the severity that is used in
cper_print_aer() to be incorrect.
Remove the second translation that is in cper_print_aer() since this
function is already receiving the correct AER severity.
Signed-off-by: Tyler Baicar <tbaicar@codeaurora.org>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Reviewed-by: Borislav Petkov <bp@suse.de>
Fix a potential race when disabling MSI/MSI-X on a VMD domain device. If
the VMD interrupt service is running, it may see a disabled IRQ. We can
synchronize RCU just before freeing the MSI descriptor. This is safe since
the irq_desc lock isn't held, and the descriptor is valid even though it is
disabled. After vmd_msi_free(), though, the handler is reinitialized to
handle_bad_irq(), so we can't let the VMD ISR's list iteration see the
disabled IRQ after this.
Signed-off-by: Keith Busch <keith.busch@intel.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Acked-by Jon Derrick: <jonathan.derrick@intel.com>
Use math to discover the IRQ list index number relative to the IRQ list
head.
Signed-off-by: Jon Derrick <jonathan.derrick@intel.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Acked-by: Keith Busch <keith.busch@intel.com>
Eliminate unused vmd and vector members from vmd_irq_list and discover the
vector using pci_irq_vector().
Signed-off-by: Jon Derrick <jonathan.derrick@intel.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Acked-by: Keith Busch <keith.busch@intel.com>
Convert to use the pci_alloc_irq_vectors() API.
Signed-off-by: Jon Derrick <jonathan.derrick@intel.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Reviewed-by: Keith Busch <keith.busch@intel.com>
To reduce the amount of memory required for IRQ lists, only allocate their
space after calling pci_msix_enable_range() which may reduce the number of
MSI-X vectors allocated.
Signed-off-by: Jon Derrick <jonathan.derrick@intel.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Reviewed-by: Keith Busch <keith.busch@intel.com>
artpec6_add_pcie_port() is called from artpec6_pcie_probe(), which is not
marked __init. It is wrong to call an __init function from a non-__init
one, so remove __init from artpec6_add_pcie_port().
[bhelgaas: changelog]
Signed-off-by: Niklas Cassel <niklas.cassel@axis.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
The R-Car PCIe driver causes 13 warnings from scripts/checkpatch.pl --
let's fix at least 10 easier ones:
- line over 80 characters;
- blank line missing after declarations;
- statements not starting on a tabstop.
Signed-off-by: Sergei Shtylyov <sergei.shtylyov@cogentembedded.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Acked-by: Simon Horman <horms+renesas@verge.net.au>
Implement the MSI .setup_irqs() method which enables allocation of several
MSIs at once.
[Sergei Shtylyov: removed unrelated/unneeded changes, fixed too long lines,
reordered the variable declarations, reworded the summary/description.]
Signed-off-by: Grigory Kletsko <grigory.kletsko@cogentembedded.com>
Signed-off-by: Sergei Shtylyov <sergei.shtylyov@cogentembedded.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Acked-by: Simon Horman <horms+renesas@verge.net.au>