PCI configuration space should be mapped with a memory region type that
generates on the CPU host bus non-posted write transations. Update the
driver to use correct memory mapping attributes to map config space
regions to enforce configuration space non-posted writes behaviour.
Signed-off-by: Lorenzo Pieralisi <lorenzo.pieralisi@arm.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Cc: Thierry Reding <thierry.reding@gmail.com>
PCI configuration space should be mapped with a memory region type that
generates on the CPU host bus non-posted write transations. Update the
driver to use the devm_pci_remap_cfg* interface to make sure the correct
memory mappings for PCI configuration space are used.
Signed-off-by: Lorenzo Pieralisi <lorenzo.pieralisi@arm.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Cc: Tanmay Inamdar <tinamdar@apm.com>
PCI configuration space should be mapped with a memory region type that
generates on the CPU host bus non-posted write transations. Update the
driver to use the devm_pci_remap_cfg* interface to make sure the correct
memory mappings for PCI configuration space are used.
Signed-off-by: Lorenzo Pieralisi <lorenzo.pieralisi@arm.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Cc: Thomas Petazzoni <thomas.petazzoni@free-electrons.com>
PCI configuration space should be mapped with a memory region type that
generates on the CPU host bus non-posted write transations. Update the
driver to use the devm_pci_remap_cfg* interface to make sure the correct
memory mappings for PCI configuration space are used.
Signed-off-by: Lorenzo Pieralisi <lorenzo.pieralisi@arm.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Cc: Jingoo Han <jingoohan1@gmail.com>
Cc: Joao Pinto <Joao.Pinto@synopsys.com>
PCI configuration space should be mapped with a memory region type that
generates on the CPU host bus non-posted write transations. Update the
driver to use the devm_pci_remap_cfg* interface to make sure the correct
memory mappings for PCI configuration space are used.
Signed-off-by: Lorenzo Pieralisi <lorenzo.pieralisi@arm.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Cc: Ray Jui <rjui@broadcom.com>
Cc: Jon Mason <jonmason@broadcom.com>
PCI configuration space should be mapped with a memory region type that
generates on the CPU host bus non-posted write transations. Update the
driver to use the devm_pci_remap_cfg* interface to make sure the correct
memory mappings for PCI configuration space are used.
Signed-off-by: Lorenzo Pieralisi <lorenzo.pieralisi@arm.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Cc: Stanimir Varbanov <svarbanov@mm-sol.com>
PCI configuration space should be mapped with a memory region type that
generates on the CPU host bus non-posted write transations. Update the
driver to use the devm_pci_remap_cfg* interface to make sure the correct
memory mappings for PCI configuration space are used.
Signed-off-by: Lorenzo Pieralisi <lorenzo.pieralisi@arm.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Cc: Wenrui Li <wenrui.li@rock-chips.com>
Cc: Shawn Lin <shawn.lin@rock-chips.com>
PCI configuration space should be mapped with a memory region type that
generate on the CPU host bus non-posted write transations. Update the
driver to use the devm_pci_remap_cfg* interface to make sure the correct
memory mappings for PCI configuration space are used.
Signed-off-by: Lorenzo Pieralisi <lorenzo.pieralisi@arm.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Cc: Pratyush Anand <pratyush.anand@gmail.com>
PCI configuration space should be mapped with a memory region type that
generates on the CPU host bus non-posted write transations. Update the
driver to use the devm_pci_remap_cfg* interface to make sure the correct
memory mappings for PCI configuration space are used.
Signed-off-by: Lorenzo Pieralisi <lorenzo.pieralisi@arm.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Cc: Bharat Kumar Gogada <bharat.kumar.gogada@xilinx.com>
Cc: Michal Simek <michal.simek@xilinx.com>
PCI configuration space should be mapped with a memory region type that
generates on the CPU host bus non-posted write transations. Update the
driver to use the devm_pci_remap_cfg* interface to make sure the correct
memory mappings for PCI configuration space are used.
Signed-off-by: Lorenzo Pieralisi <lorenzo.pieralisi@arm.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Cc: Bharat Kumar Gogada <bharat.kumar.gogada@xilinx.com>
Cc: Michal Simek <michal.simek@xilinx.com>
The current ECAM kernel implementation uses ioremap() to map the ECAM
configuration space memory region; this is not safe in that on some
architectures the ioremap interface provides mappings that allow posted
write transactions. This, as highlighted in the PCIe specifications (4.0 -
Rev0.3, "Ordering Considerations for the Enhanced Configuration Address
Mechanism"), can create ordering issues for software because posted writes
transactions on the CPU host bus are non posted in the PCI express fabric.
Update the ioremap() interface to use pci_remap_cfgspace() whose mapping
attributes guarantee that non-posted writes transactions are issued for
memory writes within the ECAM memory mapped address region.
Signed-off-by: Lorenzo Pieralisi <lorenzo.pieralisi@arm.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Cc: Jayachandran C <jnair@caviumnetworks.com>
The introduction of the pci_remap_cfgspace() interface allows PCI host
controller drivers to map PCI config space through a dedicated kernel
interface. Current PCI host controller drivers use the devm_ioremap_*()
devres interfaces to map PCI configuration space regions so in order to
update them to the new pci_remap_cfgspace() mapping interface a new set of
devres interfaces should be implemented so that PCI host controller drivers
can make use of them.
Introduce two new functions in the PCI kernel layer and Devres
documentation:
- devm_pci_remap_cfgspace()
- devm_pci_remap_cfg_resource()
so that PCI host controller drivers can make use of them to map PCI
configuration space regions.
Signed-off-by: Lorenzo Pieralisi <lorenzo.pieralisi@arm.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Cc: Jonathan Corbet <corbet@lwn.net>
Both conflict were simple overlapping changes.
In the kaweth case, Eric Dumazet's skb_cow() bug fix overlapped the
conversion of the driver in net-next to use in-netdev stats.
Signed-off-by: David S. Miller <davem@davemloft.net>
Now that we've exported pci_remap_iospace() and added proper remove()
support, there's no reason this can't be a loadable module.
Signed-off-by: Brian Norris <briannorris@chromium.org>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Acked-by: Shawn Lin <shawn.lin@rock-chips.com>
These are useful for PCIe host drivers, and those drivers can be modules.
[bhelgaas: don't remove __weak; it's removed elsewhere]
Signed-off-by: Brian Norris <briannorris@chromium.org>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Acked-by: Shawn Lin <shawn.lin@rock-chips.com>
Currently, if we try to unbind the platform device, the remove will
succeed, but the removal won't undo most of the registration, leaving
partially-configured PCI devices in the system.
This allows, for example, a simple 'lspci' to crash the system, as it will
try to touch the freed (via devm_*) driver structures, e.g., on RK3399:
# echo f8000000.pcie > /sys/bus/platform/drivers/rockchip-pcie/unbind
# lspci
So let's implement device remove().
Signed-off-by: Brian Norris <briannorris@chromium.org>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Acked-by: Shawn Lin <shawn.lin@rock-chips.com>
Configuring DMA ops at probe time will allow deferring device probe when
the IOMMU isn't available yet. The dma_configure for the device is
now called from the generic device_attach callback just before the
bus/driver probe is called. This way, configuring the DMA ops for the
device would be called at the same place for all bus_types, hence the
deferred probing mechanism should work for all buses as well.
pci_bus_add_devices (platform/amba)(_device_create/driver_register)
| |
pci_bus_add_device (device_add/driver_register)
| |
device_attach device_initial_probe
| |
__device_attach_driver __device_attach_driver
|
driver_probe_device
|
really_probe
|
dma_configure
Similarly on the device/driver_unregister path __device_release_driver is
called which inturn calls dma_deconfigure.
This patch changes the dma ops configuration to probe time for
both OF and ACPI based platform/amba/pci bus devices.
Tested-by: Marek Szyprowski <m.szyprowski@samsung.com>
Tested-by: Hanjun Guo <hanjun.guo@linaro.org>
Reviewed-by: Robin Murphy <robin.murphy@arm.com>
Acked-by: Rob Herring <robh@kernel.org>
Acked-by: Bjorn Helgaas <bhelgaas@google.com> (drivers/pci part)
Acked-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Signed-off-by: Sricharan R <sricharan@codeaurora.org>
Signed-off-by: Joerg Roedel <jroedel@suse.de>
Instead of copy & pasting and old version of the code.
Signed-off-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
The 82599 quirk contained an outdated copy of the FLR code.
Signed-off-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Currently we opencode the FLR sequence in lots of place; export a core
helper instead. We split out the probing for FLR support as all the
non-core callers already know their hardware.
Note that in the new pci_has_flr() function the quirk check has been moved
before the capability check as there is no point in reading the capability
in this case.
Signed-off-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Sometimes it is not desirable to bind SR-IOV VFs to drivers. This can save
host side resource usage by VF instances that will be assigned to VMs.
Add a new PCI sysfs interface "sriov_drivers_autoprobe" to control that
from the PF. To modify it, echo 0/n/N (disable probe) or 1/y/Y (enable
probe) to:
/sys/bus/pci/devices/<DOMAIN:BUS:DEVICE.FUNCTION>/sriov_drivers_autoprobe
Note that this must be done before enabling VFs. The change will not take
effect if VFs are already enabled. Simply, one can disable VFs by setting
sriov_numvfs to 0, choose whether to probe or not, and then re-enable the
VFs by restoring sriov_numvfs.
[bhelgaas: changelog, ABI doc]
Signed-off-by: Bodong Wang <bodong@mellanox.com>
Signed-off-by: Eli Cohen <eli@mellanox.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Reviewed-by: Gavin Shan <gwshan@linux.vnet.ibm.com>
Reviewed-by: Alex Williamson <alex.williamson@redhat.com>
This will need to call into an arch-provided pci_iobar_pfn() function.
Signed-off-by: David Woodhouse <dwmw@amazon.co.uk>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Starting to leave behind the legacy of the pci_mmap_page_range() interface
which takes "user-visible" BAR addresses. This takes just the resource and
offset.
For now, both APIs coexist and depending on the platform, one is
implemented as a wrapper around the other.
Signed-off-by: David Woodhouse <dwmw@amazon.co.uk>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
In all cases we know which BAR it is. Passing it in means that arch code
(or generic code; watch this space) won't have to go looking for it again.
Signed-off-by: David Woodhouse <dwmw@amazon.co.uk>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
We store the pointer, and then on *every* use of it we loop over the
device's resources to find out the index. That's kind of silly.
Signed-off-by: David Woodhouse <dwmw@amazon.co.uk>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
When the kernel is running in secure boot mode, we lock down the kernel to
prevent userspace from modifying the running kernel image. Whilst this
includes prohibiting access to things like /dev/mem, it must also prevent
access by means of configuring driver modules in such a way as to cause a
device to access or modify the kernel image.
To this end, annotate module_param* statements that refer to hardware
configuration and indicate for future reference what type of parameter they
specify. The parameter parser in the core sees this information and can
skip such parameters with an error message if the kernel is locked down.
The module initialisation then runs as normal, but just sees whatever the
default values for those parameters is.
Note that we do still need to do the module initialisation because some
drivers have viable defaults set in case parameters aren't specified and
some drivers support automatic configuration (e.g. PNP or PCI) in addition
to manually coded parameters.
This patch annotates drivers in drivers/pci/hotplug/.
Suggested-by: Alan Cox <gnomes@lxorguk.ukuu.org.uk>
Signed-off-by: David Howells <dhowells@redhat.com>
Acked-by: Bjorn Helgaas <bhelgaas@google.com>
cc: Scott Murray <scott@spiteful.org>
cc: linux-pci@vger.kernel.org
pci_remap_iospace() is marked as a weak symbol even though no architecture
is currently overriding it; given that its implementation internals have
already code paths that are arch specific (ie PCI_IOBASE and
ioremap_page_range() attributes) there is no need to leave the weak symbol
in the kernel since the same functionality can be achieved by customizing
per-arch the corresponding functionality.
Remove the __weak symbol from pci_remap_iospace().
Signed-off-by: Lorenzo Pieralisi <lorenzo.pieralisi@arm.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Acked-by: Arnd Bergmann <arnd@arndb.de>
The "pci=resource_alignment" argument aligns BARs of designated devices by
artificially increasing their size. Increasing the size increases the
alignment and prevents other resources from being assigned in the same
alignment region, e.g., in the same page, but it can break drivers that use
the BAR size to locate things, e.g., ilo_map_device() does this:
off = pci_resource_len(pdev, bar) - 0x2000;
The new pcibios_default_alignment() interface allows an arch to request
that *all* BARs in the system be aligned to a larger size. In this case,
we don't need to artificially increase the resource size because we know
every BAR of every device will be realigned, so nothing will share the same
alignment region.
Use IORESOURCE_STARTALIGN to request realignment of PCI BARs when we know
we're realigning all BARs in the system.
[bhelgaas: comment, changelog]
Signed-off-by: Yongji Xie <elohimes@gmail.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
The "pci=resource_alignment=" kernel argument designates devices for which
we want alignment greater than is required by the PCI specs. Previously we
set IORESOURCE_UNSET for every MEM resource of those devices, even if the
resource was *already* sufficiently aligned.
If a resource is already sufficiently aligned, leave it alone and don't try
to reassign it.
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Pull the BAR size adjustment out into a new function,
pci_request_resource_alignment(), and add a comment about how and why we
increase the resource size and alignment.
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
When VFIO passes through a PCI device to a guest, it does not allow the
guest to mmap BARs that are smaller than PAGE_SIZE unless it can reserve
the rest of the page (see vfio_pci_probe_mmaps()). This is because a page
might contain several small BARs for unrelated devices and a guest should
not be able to access all of them.
VFIO emulates guest accesses to non-mappable BARs, which is functional but
slow. On systems with large page sizes, e.g., PowerNV with 64K pages, BARs
are more likely to share a page and performance is more likely to be a
problem.
Add a weak function to set default alignment for all PCI devices. An arch
can override it to force the PCI core to place memory BARs on their own
pages.
Signed-off-by: Yongji Xie <elohimes@gmail.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
A PCI/PCI-X to PCI Express bridge, sometimes referred to as a "reverse
bridge", is a bridge with conventional PCI or PCI-X on its primary side and
a PCI Express Port on its secondary (downstream) side.
That PCIe Port is a Downstream Port and could be connected to a slot, just
like a Root Port or a Switch Downstream Port. Make pcie_downstream_port()
return true for them, so we can access the Slot registers in the PCIe
capability.
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Laurent Pinchart reported that the Renesas R-Car H2 Lager board (r8a7790)
crashes during suspend tests. Geert Uytterhoeven managed to reproduce the
issue on an M2-W Koelsch board (r8a7791):
It occurs when the PME scan runs, once per second. During PME scan, the
PCI host bridge (rcar-pci) registers are accessed while its module clock
has already been disabled, leading to the crash.
One reproducer is to configure s2ram to use "s2idle" instead of "deep"
suspend:
# echo 0 > /sys/module/printk/parameters/console_suspend
# echo s2idle > /sys/power/mem_sleep
# echo mem > /sys/power/state
Another reproducer is to write either "platform" or "processors" to
/sys/power/pm_test. It does not (or is less likely) to happen during full
system suspend ("core" or "none") because system suspend also disables
timers, and thus the workqueue handling PME scans no longer runs. Geert
believes the issue may still happen in the small window between disabling
module clocks and disabling timers:
# echo 0 > /sys/module/printk/parameters/console_suspend
# echo platform > /sys/power/pm_test # Or "processors"
# echo mem > /sys/power/state
(Make sure CONFIG_PCI_RCAR_GEN2 and CONFIG_USB_OHCI_HCD_PCI are enabled.)
Rafael Wysocki agrees that PME scans should be suspended before the host
bridge registers become inaccessible. To that end, queue the task on a
workqueue that gets frozen before devices suspend.
Rafael notes however that as a result, some wakeup events may be missed if
they are delivered via PME from a device without working IRQ (which hence
must be polled) and occur after the workqueue has been frozen. If that
turns out to be an issue in practice, it may be possible to solve it by
calling pci_pme_list_scan() once directly from one of the host bridge's
pm_ops callbacks.
Stacktrace for posterity:
PM: Syncing filesystems ... [ 38.566237] done.
PM: Preparing system for sleep (mem)
Freezing user space processes ... [ 38.579813] (elapsed 0.001 seconds) done.
Freezing remaining freezable tasks ... (elapsed 0.001 seconds) done.
PM: Suspending system (mem)
PM: suspend of devices complete after 152.456 msecs
PM: late suspend of devices complete after 2.809 msecs
PM: noirq suspend of devices complete after 29.863 msecs
suspend debug: Waiting for 5 second(s).
Unhandled fault: asynchronous external abort (0x1211) at 0x00000000
pgd = c0003000
[00000000] *pgd=80000040004003, *pmd=00000000
Internal error: : 1211 [#1] SMP ARM
Modules linked in:
CPU: 1 PID: 20 Comm: kworker/1:1 Not tainted
4.9.0-rc1-koelsch-00011-g68db9bc814362e7f #3383
Hardware name: Generic R8A7791 (Flattened Device Tree)
Workqueue: events pci_pme_list_scan
task: eb56e140 task.stack: eb58e000
PC is at pci_generic_config_read+0x64/0x6c
LR is at rcar_pci_cfg_base+0x64/0x84
pc : [<c041d7b4>] lr : [<c04309a0>] psr: 600d0093
sp : eb58fe98 ip : c041d750 fp : 00000008
r10: c0e2283c r9 : 00000000 r8 : 600d0013
r7 : 00000008 r6 : eb58fed6 r5 : 00000002 r4 : eb58feb4
r3 : 00000000 r2 : 00000044 r1 : 00000008 r0 : 00000000
Flags: nZCv IRQs off FIQs on Mode SVC_32 ISA ARM Segment user
Control: 30c5387d Table: 6a9f6c80 DAC: 55555555
Process kworker/1:1 (pid: 20, stack limit = 0xeb58e210)
Stack: (0xeb58fe98 to 0xeb590000)
fe80: 00000002 00000044
fea0: eb6f5800 c041d9b0 eb58feb4 00000008 00000044 00000000 eb78a000 eb78a000
fec0: 00000044 00000000 eb9aff00 c0424bf0 eb78a000 00000000 eb78a000 c0e22830
fee0: ea8a6fc0 c0424c5c eaae79c0 c0424ce0 eb55f380 c0e22838 eb9a9800 c0235fbc
ff00: eb55f380 c0e22838 eb55f380 eb9a9800 eb9a9800 eb58e000 eb9a9824 c0e02100
ff20: eb55f398 c02366c4 eb56e140 eb5631c0 00000000 eb55f380 c023641c 00000000
ff40: 00000000 00000000 00000000 c023a928 cd105598 00000000 40506a34 eb55f380
ff60: 00000000 00000000 dead4ead ffffffff ffffffff eb58ff74 eb58ff74 00000000
ff80: 00000000 dead4ead ffffffff ffffffff eb58ff90 eb58ff90 eb58ffac eb5631c0
ffa0: c023a844 00000000 00000000 c0206d68 00000000 00000000 00000000 00000000
ffc0: 00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000
ffe0: 00000000 00000000 00000000 00000000 00000013 00000000 3a81336c 10ccd1dd
[<c041d7b4>] (pci_generic_config_read) from [<c041d9b0>]
(pci_bus_read_config_word+0x58/0x80)
[<c041d9b0>] (pci_bus_read_config_word) from [<c0424bf0>]
(pci_check_pme_status+0x34/0x78)
[<c0424bf0>] (pci_check_pme_status) from [<c0424c5c>] (pci_pme_wakeup+0x28/0x54)
[<c0424c5c>] (pci_pme_wakeup) from [<c0424ce0>] (pci_pme_list_scan+0x58/0xb4)
[<c0424ce0>] (pci_pme_list_scan) from [<c0235fbc>]
(process_one_work+0x1bc/0x308)
[<c0235fbc>] (process_one_work) from [<c02366c4>] (worker_thread+0x2a8/0x3e0)
[<c02366c4>] (worker_thread) from [<c023a928>] (kthread+0xe4/0xfc)
[<c023a928>] (kthread) from [<c0206d68>] (ret_from_fork+0x14/0x2c)
Code: ea000000 e5903000 f57ff04f e3a00000 (e5843000)
---[ end trace 667d43ba3aa9e589 ]---
Fixes: df17e62e5b ("PCI: Add support for polling PME state on suspended legacy PCI devices")
Reported-and-tested-by: Laurent Pinchart <laurent.pinchart+renesas@ideasonboard.com>
Reported-and-tested-by: Geert Uytterhoeven <geert+renesas@glider.be>
Signed-off-by: Lukas Wunner <lukas@wunner.de>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Reviewed-by: Laurent Pinchart <laurent.pinchart@ideasonboard.com>
Acked-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Cc: stable@vger.kernel.org # 2.6.37+
Cc: Mika Westerberg <mika.westerberg@linux.intel.com>
Cc: Niklas Söderlund <niklas.soderlund+renesas@ragnatech.se>
Cc: Simon Horman <horms+renesas@verge.net.au>
Cc: Yinghai Lu <yinghai@kernel.org>
Cc: Matthew Garrett <mjg59@srcf.ucam.org>
In case that one device's alignment is greater than its size, we may
get an incorrect size and alignment for its bus's memory window in
pbus_size_mem(). Fix this case.
Signed-off-by: Yongji Xie <elohimes@gmail.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
We would call pci_reassigndev_resource_alignment() before
pci_init_capabilities(). So the requested alignment would never work for
IOV BARs.
Furthermore, it's meaningless to request additional alignment for IOV BARs,
the IOV BAR alignment is only determined by the VF BAR size.
Signed-off-by: Yongji Xie <xyjxie@linux.vnet.ibm.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Reviewed-by: Gavin Shan <gwshan@linux.vnet.ibm.com>
A 64-bit value is not needed since a PCI ROM address consists in 32 bits.
This fixes a clang warning about "implicit conversion from 'unsigned long'
to 'u32'".
Also remove now unnecessary casts to u32 from __pci_read_base() and
pci_std_update_resource().
Signed-off-by: Matthias Kaehlcke <mka@chromium.org>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Local variables 'l' and 'sz' are uninitialized. Normally, they would
be initialized by pci_read_config_dword() but when an error occurs,
some drivers immediately return an error code, which leaves the
argument uninitialized.
Provide a safe initial value to make the code more robust.
Signed-off-by: Marc Gonzalez <marc_gonzalez@sigmadesigns.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
These are small wrappers around request_threaded_irq() and free_irq(),
which dynamically allocate space for the device name so that drivers don't
need to keep static buffers for these around. Additionally it works with
device-relative vector numbers to make the usage easier, and force the
IRQF_SHARED flag on given that it has no runtime overhead and should be
supported by all PCI devices.
Signed-off-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Reviewed-by: Thomas Gleixner <tglx@linutronix.de>
This is relatively esoteric, and knowing that we don't have it makes life
easier in some cases rather than just an eventual -EINVAL from
pci_mmap_page_range().
Signed-off-by: David Woodhouse <dwmw@amazon.co.uk>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Most of the almost-identical versions of pci_mmap_page_range() silently
ignore the 'write_combine' argument and give uncached mappings.
Yet we allow the PCIIOC_WRITE_COMBINE ioctl in /proc/bus/pci, expose the
'resourceX_wc' file in sysfs, and allow an attempted mapping to apparently
succeed.
To fix this, introduce a macro arch_can_pci_mmap_wc() which indicates
whether the platform can do a write-combining mapping. On x86 this ends up
being pat_enabled(), while the few other platforms that support it can just
set it to a literal '1'.
Signed-off-by: David Woodhouse <dwmw@amazon.co.uk>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
The /proc/bus/pci mmap interface allows the user to specify whether they
want WC or not. Don't let them do so on non-prefetchable BARs.
Signed-off-by: David Woodhouse <dwmw@amazon.co.uk>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Cc: stable@vger.kernel.org
Don't match MMIO maps with I/O BARs and vice versa.
Signed-off-by: David Woodhouse <dwmw@amazon.co.uk>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Cc: stable@vger.kernel.org
refcount_t type and corresponding API should be used instead of atomic_t
when the variable is used as a reference counter. This allows to avoid
accidental refcounter overflows that might lead to use-after-free
situations.
Signed-off-by: Elena Reshetova <elena.reshetova@intel.com>
Signed-off-by: Hans Liljestrand <ishkamiel@gmail.com>
Signed-off-by: Kees Cook <keescook@chromium.org>
Signed-off-by: David Windsor <dwindsor@gmail.com>
Reviewed-by: Stephen Hemminger <sthemmin@microsoft.com>
On Cavium ThunderX2 arm64 SoCs (formerly known as Broadcom Vulcan), the PCI
topology is slightly unusual. For a multi-node system, it looks like:
00:00.0 PCI bridge to [bus 01-1e]
01:0a.0 PCI-to-PCIe bridge to [bus 02-04]
02:00.0 PCIe Root Port bridge to [bus 03-04] (XLATE_ROOT)
03:00.0 PCIe Endpoint
pci_for_each_dma_alias() assumes IOMMU translation is done at the root of
the PCI hierarchy. It generates 03:00.0, 01:0a.0, and 00:00.0 as DMA
aliases for 03:00.0 because buses 01 and 00 are non-PCIe buses that don't
carry the Requester ID.
Because the ThunderX2 IOMMU is at 02:00.0, the Requester IDs 01:0a.0 and
00:00.0 are never valid for the endpoint. This quirk stops alias
generation at the XLATE_ROOT bridge so we won't generate 01:0a.0 or
00:00.0.
The current IOMMU code only maps the last alias (this is a separate bug in
itself). Prior to this quirk, we only created IOMMU mappings for the
invalid Requester ID 00:00:0, which never matched any DMA transactions.
With this quirk, we create IOMMU mappings for a valid Requester ID, which
fixes devices with no aliases but leaves devices with aliases still broken.
The last alias for the endpoint is also used by the ARM GICv3 MSI-X code.
Without this quirk, the GIC Interrupt Translation Tables are setup with the
invalid Requester ID, and the MSI-X generated by the device fails to be
translated and routed.
Link: https://bugzilla.kernel.org/show_bug.cgi?id=195447
Signed-off-by: Jayachandran C <jnair@caviumnetworks.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Reviewed-by: Robin Murphy <robin.murphy@arm.com>
Acked-by: David Daney <david.daney@cavium.com>
Conflicts were simply overlapping changes. In the net/ipv4/route.c
case the code had simply moved around a little bit and the same fix
was made in both 'net' and 'net-next'.
In the net/sched/sch_generic.c case a fix in 'net' happened at
the same time that a new argument was added to qdisc_hash_add().
Signed-off-by: David S. Miller <davem@davemloft.net>
Add a new quirk flag PCI_DEV_FLAGS_BRIDGE_XLATE_ROOT to limit the DMA alias
search to go no further than the bridge where the IOMMU unit is attached.
The flag will be used to indicate a bridge device which forwards the
address translation requests to the IOMMU, i.e., where the interrupt and
DMA requests leave the PCIe hierarchy and go into the system blocks.
Usually this happens at the PCI RC, so this flag is not needed. But on
systems where there are bridges that introduce aliases above the IOMMU,
this flag prevents pci_for_each_dma_alias() from generating aliases that
the IOMMU will never see.
The function pci_for_each_dma_alias() is updated to stop when it see a
bridge with this flag set.
Link: https://bugzilla.kernel.org/show_bug.cgi?id=195447
Signed-off-by: Jayachandran C <jnair@caviumnetworks.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Reviewed-by: Robin Murphy <robin.murphy@arm.com>
Acked-by: David Daney <david.daney@cavium.com>
In the PCI_MMAP_PROCFS case when the address being passed by the user is a
'user visible' resource address based on the bus window, and not the actual
contents of the resource, that's what we need to be checking it against.
Signed-off-by: David Woodhouse <dwmw@amazon.co.uk>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Cc: stable@vger.kernel.org
The ITE 8893 bridge has the same problems as the ITE 8892, which were
resulting in crippling an older PCI 1Gbps NIC down to 45Mbps throughput
with IOMMU and VT-d enabled. With the patch, this old e1000 goes back up
to ~900Mbps.
Suggested-by: Alex Williamson <alex.williamson@redhat.com>
Signed-off-by: Jarod Wilson <jarod@redhat.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Reviewed-by: Alex Williamson <alex.williamson@redhat.com>
Add a couple of special IOCTLs to:
* Inform userspace of firmware partition locations
* Pass event counts and allow userspace to wait on events
* Translate PFF numbers used by the switch to port numbers
[Dan Carpenter <dan.carpenter@oracle.com>: fix off-by-one in
ioctl_event_ctl()]
Tested-by: Krishna Dhulipala <krishnad@fb.com>
Signed-off-by: Logan Gunthorpe <logang@deltatee.com>
Signed-off-by: Stephen Bates <stephen.bates@microsemi.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Reviewed-by: Wei Zhang <wzhang@fb.com>
Reviewed-by: Jens Axboe <axboe@fb.com>
Add a few read-only sysfs attributes which provide some device information
that is exposed from the devices, primarily component and device names and
versions.
These are documented in Documentation/ABI/testing/sysfs-class-switchtec.
Tested-by: Krishna Dhulipala <krishnad@fb.com>
Signed-off-by: Logan Gunthorpe <logang@deltatee.com>
Signed-off-by: Stephen Bates <stephen.bates@microsemi.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Reviewed-by: Wei Zhang <wzhang@fb.com>
Reviewed-by: Jens Axboe <axboe@fb.com>
Reviewed-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
The "hisilicon,pcie-almost-ecam" binding goes against the usual DT
conventions, and is non-sensical in that it describes the IP based on
what it isn't. Fix the DT binding with "hisilicon,hip06-pcie-ecam"
and "hisilicon,hip07-pcie-ecam".
Signed-off-by: Mark Rutland <mark.rutland@arm.com>
Signed-off-by: Dongdong Liu <liudongdong3@huawei.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
All platforms using Rockchip use a common clock for the Root Port and the
slot connected to it. Indicate this by setting the Slot Clock Configuration
(PCI_EXP_LNKSTA_SLC) bit in the Root Port's Link Status.
Per the Implementation Note in the spec (PCIe r3.1, sec 7.8.7), if the
downstream component also sets PCI_EXP_LNKSTA_SLC, software may set the
Common Clock Configuration (PCI_EXP_LNKCTL_CCC) bits on both ends of the
Link. This is done by pcie_aspm_configure_common_clock().
Signed-off-by: Shawn Lin <shawn.lin@rock-chips.com>
Cc: Brian Norris <briannorris@chromium.org>
Cc: jeffy.chen <jeffy.chen@rock-chips.com>
Adds a new endpoint function driver (to program the virtual test device)
making use of the EP-core library.
[bhelgaas: fold in pci_epf_test_probe() -ENOMEM test from Wei Yongjun
<weiyongjun1@huawei.com>]
Signed-off-by: Kishon Vijay Abraham I <kishon@ti.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Invoke APIs provided by pci-ep-cfs to create configfs entry for every EPC
device and EPF driver to help users in creating EPF device and binding the
EPF device to the EPC device.
Signed-off-by: Kishon Vijay Abraham I <kishon@ti.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Introduce a new configfs entry to configure the EP function (like
configuring the standard configuration header entries) and to bind the EP
function with EP controller.
Signed-off-by: Kishon Vijay Abraham I <kishon@ti.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Introduce a new EP core layer in order to support endpoint functions in
linux kernel. This comprises the EPC library (Endpoint Controller Library)
and EPF library (Endpoint Function Library). EPC library implements
functions specific to an endpoint controller and EPF library implements
functions specific to an endpoint function.
Signed-off-by: Kishon Vijay Abraham I <kishon@ti.com>
Acked-by: Joao Pinto <jpinto@synopsys.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Unused now that all callers switched to pci_alloc_irq_vectors.
Signed-off-by: Christoph Hellwig <hch@lst.de>
Acked-by: Bjorn Helgaas <bhelgaas@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
-----BEGIN PGP SIGNATURE-----
iQEcBAABAgAGBQJY6mY1AAoJEHm+PkMAQRiGB14IAImsH28JPjxJVDasMIRPBxVc
euPPlZgoBieu7sNt+kEsEqdkXuu0MLk6gln0IGxWLeoB2S+u3Tz5LMa2YArVqV9Z
tWzOnI9auE73P2Pz/tUMOdyMs5tO0PolQxX3uljbULBozOHjHRh13fsXchX2yQvl
mFeFCDqpPV0KhWRH/ciA8uIHdvYPhMpkKgRtmR8jXL0yzqLp6+2J+Bs8nHG4NNng
HMVxZPC8jOE/TgWq6k/GmXgxh3H/AideFdHFbLKYnIFJW41ZGOI8a262zq3NmjPd
lywpVU7O7RMhSITY5PnuR3LpNV8ftw1hz2y6t35unyFK1P02adOSj5GJ3hGdhaQ=
=Xz5O
-----END PGP SIGNATURE-----
Backmerge tag 'v4.11-rc6' into drm-next
Linux 4.11-rc6
drm-misc needs 4.11-rc5, may as well fix conflicts with rc6.
Save a bit of time and avoid going through link speed change procedure in
configuration where link max speed is limited to Gen1 in DT.
Signed-off-by: Andrey Smirnov <andrew.smirnov@gmail.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Reviewed-by: Lucas Stach <l.stach@pengutronix.de>
Cc: yurovsky@gmail.com
Cc: Fabio Estevam <fabio.estevam@nxp.com>
Cc: Dong Aisheng <dongas86@gmail.com>
Cc: linux-arm-kernel@lists.infradead.org
As can be seen from [1]:
"...the different behavior between iMX6Q PCIe and iMX7D PCIe maybe caused
by the different controller version.
Regarding to the DOC description, the DIRECT_SPEED_CHANGE should be
cleared after the speed change from GEN1 to GEN2. Unfortunately, when
GEN1 device is used, the behavior is not documented.
So, IC design guys run the simulation and find out the following
behaviors:
1. DIRECT_SPEED_CHANGE will be cleared in 7D after speed change
from GEN1 to GEN2. This matches doc’s description
2. set MAX link speed(PCIE_CAP_TARGET_LINK_SPEED=0x01) as GEN1 and
re-run the simulation, DIRECT_SPEED_CHANGE will not be cleared;
remain as 1, this matches your result, but function test is
passed, so this bit should not affect the normal PCIe function."
imx6_pcie_wait_for_speed_change() will report false failures for Gen1 ->
Gen1 speed transition, so avoid doing that check and just rely on
imx6_pcie_wait_for_link() only.
[1] https://community.nxp.com/message/867943
Signed-off-by: Andrey Smirnov <andrew.smirnov@gmail.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Reviewed-by: Lucas Stach <l.stach@pengutronix.de>
Cc: yurovsky@gmail.com
Cc: Fabio Estevam <fabio.estevam@nxp.com>
Cc: Dong Aisheng <dongas86@gmail.com>
Cc: linux-arm-kernel@lists.infradead.org
Some designs implement reset GPIO via a GPIO expander connected to a
peripheral bus. One such example would be i.MX7 Sabre board where said
GPIO is provided by SPI shift register connected to a bitbanged SPI bus.
To support such designs, allow reset GPIO request to defer probing of the
driver.
Signed-off-by: Andrey Smirnov <andrew.smirnov@gmail.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Reviewed-by: Lucas Stach <l.stach@pengutronix.de>
Cc: yurovsky@gmail.com
Cc: Fabio Estevam <fabio.estevam@nxp.com>
Cc: Dong Aisheng <dongas86@gmail.com>
Cc: linux-arm-kernel@lists.infradead.org
Add various bits of code needed to support i.MX7D variant of the IP.
Signed-off-by: Andrey Smirnov <andrew.smirnov@gmail.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Reviewed-by: Lucas Stach <l.stach@pengutronix.de>
Acked-by: Lee Jones <lee.jones@linaro.org>
Acked-by: Rob Herring <robh@kernel.org>
Cc: yurovsky@gmail.com
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Fabio Estevam <fabio.estevam@nxp.com>
Cc: Dong Aisheng <dongas86@gmail.com>
Cc: linux-arm-kernel@lists.infradead.org
Cc: devicetree@vger.kernel.org
The memory allocation here needs to be non-blocking. Fix the issue.
Signed-off-by: K. Y. Srinivasan <kys@microsoft.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Reviewed-by: Long Li <longli@microsoft.com>
Cc: <stable@vger.kernel.org>
When we have 32 or more CPUs in the affinity mask, we should use a special
constant to specify that to the host. Fix this issue.
Signed-off-by: K. Y. Srinivasan <kys@microsoft.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Reviewed-by: Long Li <longli@microsoft.com>
Cc: <stable@vger.kernel.org>
There is no pci_cfg_access_unlocked(). I think the author meant
pci_cfg_access_unlock().
Signed-off-by: Brian Norris <briannorris@chromium.org>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Currently devm_request_irq() is being called before base, PCI fields of
dra7xx_pcie structure are populated. It is called even before
pm_runtime_enable() and pm_runtime_get_sync() are called. This will lead
to exceptions if in case an interrupt is triggered before the all of the
above are done. Hence push the devm_request_irq() call to the end of the
probe.
Signed-off-by: Keerthy <j-keerthy@ti.com>
Signed-off-by: Kishon Vijay Abraham I <kishon@ti.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
No functional change. Rename dw_pcie_writel_unroll/dw_pcie_readl_unroll to
dw_pcie_writel_ob_unroll/dw_pcie_readl_ob_unroll respectively as these
functions are used to perform only outbound configurations. Also move
these _unroll configurations to a separate function.
Signed-off-by: Kishon Vijay Abraham I <kishon@ti.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Previously dbi accessors can be used to access data of size 4 bytes. But
there might be situations (like accessing MSI_MESSAGE_CONTROL in order to
set/get the number of required MSI interrupts in EP mode) where dbi
accessors must be used to access data of size 2. This is in preparation
for adding endpoint mode support to designware driver.
Signed-off-by: Kishon Vijay Abraham I <kishon@ti.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Acked-by: Niklas Cassel <niklas.cassel@axis.com>
Cc: Jingoo Han <jingoohan1@gmail.com>
Cc: Joao Pinto <Joao.Pinto@synopsys.com>
dwc has 2 dbi address space labeled dbics and dbics2. The existing helper
to access dbi address space can access only dbics. However dbics2 has to
be accessed for programming the BAR registers in the case of EP mode. This
is in preparation for adding EP mode support to dwc driver.
Signed-off-by: Kishon Vijay Abraham I <kishon@ti.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Acked-by: Niklas Cassel <niklas.cassel@axis.com>
Cc: Jingoo Han <jingoohan1@gmail.com>
Cc: Joao Pinto <Joao.Pinto@synopsys.com>
Populate cpu_addr_fixup ops to extract the least 28 bits of the
corresponding CPU address.
Signed-off-by: Kishon Vijay Abraham I <kishon@ti.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Acked-by: Niklas Cassel <niklas.cassel@axis.com>
Acked-by: Joao Pinto <jpinto@synopsys.com>
Populate cpu_addr_fixup ops to extract the least 28 bits of the
corresponding CPU address.
Signed-off-by: Kishon Vijay Abraham I <kishon@ti.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Acked-by: Joao Pinto <jpinto@synopsys.com>
Some platforms (like dra7xx) require only the least 28 bits of the
corresponding 32 bit CPU address to be programmed in the address
translation unit. This modified address is stored in io_base/mem_base/
cfg0_base/cfg1_base in dra7xx_pcie_host_init(). While this is okay for
host mode where the address range is fixed, device mode requires different
addresses to be programmed based on the host buffer address. Add a new
ops to get the least 28 bits of the corresponding 32 bit CPU address and
invoke it before programming the address translation unit.
Signed-off-by: Kishon Vijay Abraham I <kishon@ti.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Acked-by: Joao Pinto <jpinto@synopsys.com>
The bug is that "val" is unsigned long but we only initialize 32 bits of
it. Then we test "if (val)" and that might be true not because we set the
bits but because some were never initialized.
Fixes: f342d940ee ("PCI: exynos: Add support for MSI")
Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Use "continue" to skip rest of the loop when possible to save an indent
level. No functional change intended.
Suggested-by: walter harms <wharms@bfs.de>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Fix a crash from dereferencing a NULL dw_pcie_ops pointer. For example,
on ARTPEC-6:
Unable to handle kernel NULL pointer dereference at virtual address 00000004
pgd = c0204000
[00000004] *pgd=00000000
Internal error: Oops: 5 [#1] SMP ARM
Modules linked in:
CPU: 0 PID: 1 Comm: swapper/0 Not tainted 4.11.0-rc3-next-20170321 #1
Hardware name: Axis ARTPEC-6 Platform
task: db098000 task.stack: db096000
PC is at dw_pcie_writel_dbi+0x2c/0xd0
Prior to 442ec4c04d ("PCI: dwc: all: Split struct pcie_port into
host-only and core structures"), every driver had a struct pcie_host_ops
with function pointers, typically used as:
if (pp->ops->readl_rc)
return pp->ops->readl_rc(...);
442ec4c04d split struct pcie_host_ops into two pieces: struct
dw_pcie_host_ops and struct dw_pcie_ops, so the above became:
if (pci->ops->readl_dbi)
return pci->ops->readl_dbi(...);
But pcie-artpec6.c and pcie-designware-plat.c don't need the dw_pcie_ops
pointers and didn't supply a pci->ops struct, which leads to NULL pointer
dereferences.
Supply an empty struct dw_pcie_ops to avoid the NULL pointer dereferences.
[bhelgaas: changelog]
Fixes: 442ec4c04d ("PCI: dwc: all: Split struct pcie_port into host-only and core structures")
Signed-off-by: Niklas Cassel <niklas.cassel@axis.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Acked-by: Kishon Vijay Abraham I <kishon@ti.com>
Acked-by: Joao Pinto <jpinto@synopsys.com>
Without PCI_HOST_COMMON support enabled, we get a link error:
drivers/pci/dwc/built-in.o: In function `hisi_pcie_map_bus':
pcie-hisi.c:(.text+0x8860): undefined reference to `pci_ecam_map_bus'
drivers/pci/dwc/built-in.o: In function `hisi_pcie_almost_ecam_probe':
pcie-hisi.c:(.text+0x88b4): undefined reference to `pci_host_common_probe'
Add an explicit 'select', as the other users have.
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Acked-by: Jingoo Han <jingoohan1@gmail.com>
Rockchip Root Ports support either 64 or 128 byte Read Completion Boundary
(RCB). Set the RCB bit in the Link Control register to indicate this.
A 128 byte RCB significantly improves performance of NVMe with libaio.
[bhelgaas: changelog]
Signed-off-by: Shawn Lin <shawn.lin@rock-chips.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Cc: Brian Norris <briannorris@chromium.org>
Cc: Jeffy Chen <jeffy.chen@rock-chips.com>
Per Intel Specification Update 335553-002 (see link below), some 82579
network adapters advertise a Function Level Reset (FLR) capability, but
they can hang when an FLR is triggered.
To reproduce the problem, attach the device to a VM, then detach and try to
attach again.
Add a quirk to prevent the use of FLR on these devices.
[bhelgaas: changelog, comments]
Link: http://www.intel.com/content/dam/www/public/us/en/documents/specification-updates/82579lm-82579v-gigabit-network-connection-spec-update.pdf
Signed-off-by: Sasha Neftin <sasha.neftin@intel.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
SZ_16M PEM resource size includes PEM-specific register and its children
resources. Reservation of the whole SZ_16M range leads to child device
driver failure when pcieport driver is requesting resources:
pcieport 0004:1f:00.0: can't enable device: BAR 0 [mem 0x87e0c0f00000-0x87e0c0ffffff 64bit] not claimed
So we cannot reserve full 16M here and instead we want to reserve
PEM-specific register only which is SZ_64K.
At the end increase PEM resource to SZ_16M since this is what
thunder_pem_init() call expects for proper initialization.
Fixes: 9abb27c759 ("PCI: thunder-pem: Add legacy firmware support for Cavium ThunderX host controller")
Signed-off-by: Tomasz Nowicki <tn@semihalf.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
CC: stable@vger.kernel.org # v4.10+
Only apply the Cavium ACS quirk to devices with ID in the range
0xa000-0xa0ff. These are the on-chip PCI devices for CN81xx/CN83xx/CN88xx.
Fixes: b404bcfbf0 ("PCI: Add ACS quirk for all Cavium devices")
Reported-by: Alex Williamson <alex.williamson@redhat.com>
Signed-off-by: Manish Jaggi <mjaggi@cavium.com>
Acked-by: David Daney <david.daney@cavium.com>
Acked-by: Alex Williamson <alex.williamson@redhat.com>
Detect on probe whether a PCI device is part of a Thunderbolt controller.
Intel uses a Vendor-Specific Extended Capability (VSEC) with ID 0x1234
on such devices. Detect presence of this VSEC and cache it in a newly
added is_thunderbolt bit in struct pci_dev.
Also, add a helper to check whether a given PCI device is situated on a
Thunderbolt daisy chain (i.e., below a PCI device with is_thunderbolt
set).
The necessity arises from the following:
* If an external Thunderbolt GPU is connected to a dual GPU laptop,
that GPU is currently registered with vga_switcheroo even though it
can neither drive the laptop's panel nor be powered off by the
platform. To vga_switcheroo it will appear as if two discrete
GPUs are present. As a result, when the external GPU is runtime
suspended, vga_switcheroo will cut power to the internal discrete GPU
which may not be runtime suspended at all at this moment. The
solution is to not register external GPUs with vga_switcheroo, which
necessitates a way to recognize if they're on a Thunderbolt daisy
chain.
* Dual GPU MacBook Pros introduced 2011+ can no longer switch external
DisplayPort ports between GPUs. (They're no longer just used for DP
but have become combined DP/Thunderbolt ports.) The driver to switch
the ports, drivers/platform/x86/apple-gmux.c, needs to detect presence
of a Thunderbolt controller and, if found, keep external ports
permanently switched to the discrete GPU.
v2: Make kerneldoc for pci_is_thunderbolt_attached() more precise,
drop portion of commit message pertaining to separate series.
(Bjorn Helgaas)
Cc: Andreas Noever <andreas.noever@gmail.com>
Cc: Michael Jamet <michael.jamet@intel.com>
Cc: Tomas Winkler <tomas.winkler@intel.com>
Cc: Amir Levy <amir.jer.levy@intel.com>
Acked-by: Bjorn Helgaas <bhelgaas@google.com>
Signed-off-by: Lukas Wunner <lukas@wunner.de>
Link: http://patchwork.freedesktop.org/patch/msgid/0ab165a4a35c0b60f29d4c306c653ead14fcd8f9.1489145162.git.lukas@wunner.de
If the PCI device is disconnected, return false immediately from
pci_device_is_present(). pci_device_is_present() uses the bus accessors,
so the early return in the device accessors doesn't help here.
Tested-by: Krishna Dhulipala <krishnad@fb.com>
Signed-off-by: Keith Busch <keith.busch@intel.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Wei Zhang <wzhang@fb.com>
Check the device connected state prior to executing device shutdown
operations or writing MSI messages so that tear down on disconnected
devices completes quicker.
Tested-by: Krishna Dhulipala <krishnad@fb.com>
Signed-off-by: Keith Busch <keith.busch@intel.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Wei Zhang <wzhang@fb.com>
If we've detected the PCI device is disconnected, there is no need to
attempt to access its config space since we know the operation will fail.
Make all the config reads and writes return -ENODEV error immediately when
in such a state.
If a caller requests a config read to a disconnected device, return a data
value of all 1's. This is the same as what hardware is expected to return
when accessing a removed device, but software can do this faster without
relying on hardware.
Tested-by: Krishna Dhulipala <krishnad@fb.com>
Signed-off-by: Keith Busch <keith.busch@intel.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Wei Zhang <wzhang@fb.com>
Add a new state to pci_dev to be set when it is unexpectedly disconnected.
The PCI driver tear down functions can observe this new device state so
they may skip operations that will fail.
The pciehp and pcie-dpc drivers are aware when the link is down, so these
set the flag when their handlers detect the device is disconnected.
Tested-by: Krishna Dhulipala <krishnad@fb.com>
Signed-off-by: Keith Busch <keith.busch@intel.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Wei Zhang <wzhang@fb.com>
Replace the inline PCI device config read and write accessors with exported
functions. This is preparing for these functions to make use of private
data.
Tested-by: Krishna Dhulipala <krishnad@fb.com>
Signed-off-by: Keith Busch <keith.busch@intel.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Wei Zhang <wzhang@fb.com>
It seems on later Armada 38x, the slot clock configuration bit is not
read-only, but can be written. This means that our RW1C protection ends up
clearing this bit when the link control register is written.
Adjust the mask so that we only avoid writing '1' bits to the RW1C bits of
this register (bits 15 and 14 of the link status) rather than masking out
all the status register bits.
Signed-off-by: Russell King <rmk+kernel@armlinux.org.uk>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Add a host bridge driver for the Faraday Technology FPPCI100 host bridge,
used for Cortina Systems Gemini SoC (SL3516) PCI Host Bridge.
This code is inspired by the out-of-tree OpenWRT patch and then extensively
rewritten for device tree and using the modern helpers to cut down and
modernize the code to all new PCI frameworks. A driver exists in U-Boot as
well.
Tested on the ITian Square One SQ201 NAS with the following result in the
boot log (trimmed to relevant parts):
OF: PCI: host bridge /soc/pci@50000000 ranges:
OF: PCI: IO 0x50000000..0x500fffff -> 0x00000000
OF: PCI: MEM 0x58000000..0x5fffffff -> 0x58000000
ftpci100 50000000.pci: PCI host bridge to bus 0000:00
pci_bus 0000:00: root bus resource [bus 00-ff]
pci_bus 0000:00: root bus resource [io 0x0000-0xfffff]
pci_bus 0000:00: root bus resource [mem 0x58000000-0x5fffffff]
ftpci100 50000000.pci:
DMA MEM1 BASE: 0x0000000000000000 -> 0x0000000007ffffff config 00070000
ftpci100 50000000.pci:
DMA MEM2 BASE: 0x0000000000000000 -> 0x0000000003ffffff config 00060000
ftpci100 50000000.pci:
DMA MEM3 BASE: 0x0000000000000000 -> 0x0000000003ffffff config 00060000
PCI: bus0: Fast back to back transfers disabled
pci 0000:00:00.0: of_irq_parse_pci() failed with rc=-22
pci 0000:00:0c.0: BAR 0: assigned [mem 0x58000000-0x58007fff]
pci 0000:00:09.2: BAR 0: assigned [mem 0x58008000-0x580080ff]
pci 0000:00:09.0: BAR 4: assigned [io 0x1000-0x101f]
pci 0000:00:09.1: BAR 4: assigned [io 0x1020-0x103f]
pci 0000:00:09.0: enabling device (0140 -> 0141)
pci 0000:00:09.0: HCRESET not completed yet!
pci 0000:00:09.1: enabling device (0140 -> 0141)
pci 0000:00:09.1: HCRESET not completed yet!
pci 0000:00:09.2: enabling device (0140 -> 0142)
rt61pci 0000:00:0c.0: enabling device (0140 -> 0142)
ieee80211 phy0: rt2x00_set_chip: Info - Chipset detected -
rt: 2561, rf: 0003, rev: 000c
ehci_hcd: USB 2.0 'Enhanced' Host Controller (EHCI) Driver
ehci-pci: EHCI PCI platform driver
ehci-pci 0000:00:09.2: EHCI Host Controller
ehci-pci 0000:00:09.2: new USB bus registered, assigned bus number 1
ehci-pci 0000:00:09.2: irq 125, io mem 0x58008000
ehci-pci 0000:00:09.2: USB 2.0 started, EHCI 1.00
hub 1-0:1.0: USB hub found
hub 1-0:1.0: 4 ports detected
uhci_hcd: USB Universal Host Controller Interface driver
uhci_hcd 0000:00:09.0: UHCI Host Controller
uhci_hcd 0000:00:09.0: new USB bus registered, assigned bus number 2
uhci_hcd 0000:00:09.0: HCRESET not completed yet!
uhci_hcd 0000:00:09.0: irq 123, io base 0x00001000
hub 2-0:1.0: USB hub found
hub 2-0:1.0: config failed, hub doesn't have any ports! (err -19)
uhci_hcd 0000:00:09.1: UHCI Host Controller
uhci_hcd 0000:00:09.1: new USB bus registered, assigned bus number 3
uhci_hcd 0000:00:09.1: HCRESET not completed yet!
uhci_hcd 0000:00:09.1: irq 124, io base 0x00001020
hub 3-0:1.0: USB hub found
hub 3-0:1.0: config failed, hub doesn't have any ports! (err -19)
scsi 0:0:0:0: Direct-Access USB Flash Disk 1.00 PQ: 0 ANSI: 2
sd 0:0:0:0: [sda] 7900336 512-byte logical blocks: (4.04 GB/3.77 GiB)
sd 0:0:0:0: [sda] Write Protect is off
sd 0:0:0:0: [sda] No Caching mode page found
sd 0:0:0:0: [sda] Assuming drive cache: write through
sda: sda1 sda2 sda3
sd 0:0:0:0: [sda] Attached SCSI removable disk
ieee80211 phy0: rt2x00lib_request_firmware: Info -
Loading firmware file 'rt2561s.bin'
ieee80211 phy0: rt2x00lib_request_firmware: Info -
Firmware detected - version: 0.8
IPv6: ADDRCONF(NETDEV_UP): wlan0: link is not ready
$ lspci
00:00.0 Class 0600: 159b:4321
00:09.2 Class 0c03: 1106:3104
00:09.0 Class 0c03: 1106:3038
00:09.1 Class 0c03: 1106:3038
00:0c.0 Class 0280: 1814:0301
$ cat /proc/interrupts
CPU0
123: 0 PCI 0 Edge uhci_hcd:usb2
124: 0 PCI 1 Edge uhci_hcd:usb3
125: 159 PCI 2 Edge ehci_hcd:usb1
126: 1082 PCI 3 Edge rt61pci
$ cat /proc/iomem
50000000-500000ff : /soc/pci@50000000
58000000-5fffffff : Gemini PCI MEM
58000000-58007fff : 0000:00:0c.0
58000000-58007fff : 0000:00:0c.0
58008000-580080ff : 0000:00:09.2
58008000-580080ff : ehci_hcd
The EHCI USB hub works fine; I can mount and manage files and the IRQs just
keep ticking up. I can issue iwlist wlan0 scanning and see all the WLANs
here. I don't have wpa_supplicant so have not tried connecting to them.
[bhelgaas: fold in %pap change from Arnd Bergmann <arnd@arndb.de>]
Signed-off-by: Linus Walleij <linus.walleij@linaro.org>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
CC: Janos Laube <janos.dev@gmail.com>
CC: Paulius Zaleckas <paulius.zaleckas@gmail.com>
CC: Hans Ulli Kroll <ulli.kroll@googlemail.com>
CC: Florian Fainelli <f.fainelli@gmail.com>
CC: Feng-Hsin Chiang <john453@faraday-tech.com>
CC: Greentime Hu <green.hu@gmail.com>
A PCI_EJECT message can arrive at the same time we are calling
pci_scan_child_bus() in the workqueue for the previous PCI_BUS_RELATIONS
message or in create_root_hv_pci_bus(). In this case we could potentially
modify the bus from multiple places.
Properly lock the bus access.
Thanks Dexuan Cui <decui@microsoft.com> for pointing out the race condition
in create_root_hv_pci_bus().
Reported-by: Xiaofeng Wang <xiaofwan@redhat.com>
Signed-off-by: Long Li <longli@microsoft.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Acked-by: K. Y. Srinivasan <kys@microsoft.com>
hv_pci_devices_present() is called in hv_pci_remove() when we remove a PCI
device from the host, e.g., by disabling SR-IOV on a device. In
hv_pci_remove(), the bus is already removed before the call, so we don't
need to rescan the bus in the workqueue scheduled from
hv_pci_devices_present().
By introducing bus state hv_pcibus_removed, we can avoid this situation.
Reported-by: Xiaofeng Wang <xiaofwan@redhat.com>
Signed-off-by: Long Li <longli@microsoft.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Acked-by: K. Y. Srinivasan <kys@microsoft.com>
There's no way to get here with 'err != 0'. Just return 0 to be more
obvious and prevent future changes from accidentally erroring out here
without going through the right error paths.
Signed-off-by: Brian Norris <briannorris@chromium.org>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
If regulator_get_current_limit() returns 0 or error, return early so the
body of the function doesn't have to be indented as the body of an "if"
statement. No functional change intended.
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
During early days of PCI quirks support, ThunderX firmware did not provide
PNP0c02 node with PCI configuration space and PEM-specific register ranges.
This means that for legacy FW we are not reserving these resources and
cannot gather PEM-specific resources for further PEM initialization.
To support already deployed legacy FW, calculate PEM-specific ranges and
provide resources reservation as fallback scenario into PEM driver when we
could not gather PEM reg base from ACPI tables.
Tested-by: Robert Richter <rrichter@cavium.com>
Signed-off-by: Tomasz Nowicki <tn@semihalf.com>
Signed-off-by: Vadim Lomovtsev <Vadim.Lomovtsev@caviumnetworks.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Acked-by: Robert Richter <rrichter@cavium.com>
CC: stable@vger.kernel.org # v4.10+
"CAV" is the only PNP/ACPI hardware ID vendor prefix assigned to Cavium so
fix this as it should be from day one.
Fixes: 44f22bd91e ("PCI: Add MCFG quirks for Cavium ThunderX pass2.x host controller")
Tested-by: Robert Richter <rrichter@cavium.com>
Signed-off-by: Tomasz Nowicki <tn@semihalf.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Acked-by: Robert Richter <rrichter@cavium.com>
CC: stable@vger.kernel.org # v4.10+
regulator_get_current_limit() can return negative error codes. We saved
the return value in an unsigned "curr", and a subsequent check interpreted
a negative error code as a positive (invalid) current limit.
Save the return code as a signed value, which avoids messages like this,
seen on Samsung Chromebook Plus:
rockchip-pcie f8000000.pcie: invalid power supply
[bhelgaas: changelog]
Fixes: 4816c4c7b8 ("PCI: rockchip: Provide captured slot power limit and scale")
Signed-off-by: Brian Norris <briannorris@chromium.org>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Acked-by: Shawn Lin <shawn.lin@rock-chips.com>
Use dev_printk() when possible. This makes messages more consistent with
other device-related messages and, in some cases, adds useful information.
This changes messages like this:
Unable to allocate affinity masks, ignoring
to this:
pci 0000:01:00.0: can't allocate MSI affinity masks for 4 vectors
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
2+ PCI devices fail to be discovered due to each bus having the same PCI
domain. This is because the domain defined in the device tree file is not
being added due to PCI_DOMAIN not being enabled. So, every PCI bus has a
domain of zero. When PCI_DOMAIN is selected by the Kconfig, it picks up
the domain defined in the device tree file and everything works as
expected.
Since both PCIE_IPROC_PLATFORM and PCIE_IPROC_BCMA need PCI_DOMAIN, move
it to PCIE_IPROC so it will be automatically selected for both.
Signed-off-by: Jon Mason <jonmason@broadcom.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Since commit fcc392d501 ("irqchip/armada-370-xp: Use the generic MSI
infrastructure"), the irqchip driver used on Armada 370, XP, 375, 38x, 39x
for the MPIC interrupt controller has been converted to use the generic MSI
infrastructure.
Since this commit, it is no longer registering an msi_controller structure
with the of_pci_msi_chip_add() function. Therefore, having the PCI driver
used on the same platform calling of_pci_find_msi_chip_by_node() is pretty
useless.
The MSI resolution is now done in the generic interrupt resolution code,
since the MSI controller is an irq domain attached to the interrupt
controller node, which is pointed to by the msi-parent DT property in the
PCIe controller node.
Signed-off-by: Thomas Petazzoni <thomas.petazzoni@free-electrons.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Acked-by: Marc Zyngier <marc.zyngier@arm.com>
The MSI support introduced with the initial Aardvark driver was based
on the msi_controller structure and the of_pci_msi_chip_add() /
of_pci_find_msi_chip_by_node() API, which are being deprecated in
favor of the generic MSI support.
Update the Aardvark driver to use the generic MSI support.
Signed-off-by: Thomas Petazzoni <thomas.petazzoni@free-electrons.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Acked-by: Marc Zyngier <marc.zyngier@arm.com>
msleep() still sleeps 1 jiffy even when told to sleep for zero
milliseconds. That can end up being 1-2 milliseconds or more. In the
cases of d3_delay and d3cold_delay, that unnecessarily increases suspend
and/or resume latencies.
Do not sleep at all for the respective cases if d3_delay is zero or
d3cold_delay is zero.
Signed-off-by: Adrian Hunter <adrian.hunter@intel.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Acked-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
The pci_bus_type .shutdown method, pci_device_shutdown(), is called from
device_shutdown() in the kernel restart and shutdown paths.
Previously, pci_device_shutdown() called pci_msi_shutdown() and
pci_msix_shutdown(). This disables MSI and MSI-X, which causes the device
to fall back to raising interrupts via INTx. But the driver is still bound
to the device, it doesn't know about this change, and it likely doesn't
have an INTx handler, so these INTx interrupts cause "nobody cared"
warnings like this:
irq 16: nobody cared (try booting with the "irqpoll" option)
CPU: 0 PID: 0 Comm: swapper/0 Not tainted 4.8.2-1.el7_UNSUPPORTED.x86_64 #1
Hardware name: Hewlett-Packard HP Z820 Workstation/158B, BIOS J63 v03.90 06/
...
The MSI disabling code was added by d52877c7b1 ("pci/irq: let
pci_device_shutdown to call pci_msi_shutdown v2") because a driver left MSI
enabled and kdump failed because the kexeced kernel wasn't prepared to
receive the MSI interrupts.
Subsequent commits 1851617cd2 ("PCI/MSI: Disable MSI at enumeration even
if kernel doesn't support MSI") and e80e7edc55 ("PCI/MSI: Initialize MSI
capability for all architectures") changed the kexeced kernel to disable
all MSIs itself so it no longer depends on the crashed kernel to clean up
after itself.
Stop disabling MSI/MSI-X in pci_device_shutdown(). This resolves the
"nobody cared" unhandled IRQ issue above. It also allows PCI serial
devices, which may rely on the MSI interrupts, to continue outputting
messages during reboot/shutdown.
[bhelgaas: changelog, drop pci_msi_shutdown() and pci_msix_shutdown() calls
altogether]
Fixes: https://bugzilla.kernel.org/show_bug.cgi?id=187351
Signed-off-by: Prarit Bhargava <prarit@redhat.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
CC: Alex Williamson <alex.williamson@redhat.com>
CC: David Arcari <darcari@redhat.com>
CC: Myron Stowe <mstowe@redhat.com>
CC: Lukas Wunner <lukas@wunner.de>
CC: Keith Busch <keith.busch@intel.com>
CC: Mika Westerberg <mika.westerberg@linux.intel.com>
The host bridge memory window resource is inserted into the iomem_resource
tree and cannot be deallocated until the host bridge itself is removed.
Previously, the window was on the stack, which meant the iomem_resource
entry pointed into the stack and was corrupted as soon as the probe
function returned, which caused memory corruption and errors like this:
pcie_iproc_bcma bcma0:8: resource collision: [mem 0x40000000-0x47ffffff] conflicts with PCIe MEM space [mem 0x40000000-0x47ffffff]
Move the memory window resource from the stack into struct iproc_pcie so
its lifetime matches that of the host bridge.
Fixes: c3245a5664 ("PCI: iproc: Request host bridge window resources")
Reported-and-tested-by: Rafał Miłecki <zajec5@gmail.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
CC: stable@vger.kernel.org # v4.8+
We call pcie_aspm_exit_link_state() when we remove a device. If the device
is the last PCIe function to be removed below a bridge and the bridge has
an ASPM link_state struct, we disable ASPM on the link. Disabling ASPM
requires link->downstream (used in pcie_config_aspm_link()).
We previously set link->downstream in pcie_aspm_cap_init(), but only if the
device was not blacklisted. Removing the blacklisted device caused a NULL
pointer dereference in the pcie_aspm_exit_link_state() ->
pcie_config_aspm_link() path:
# echo 1 > /sys/bus/pci/devices/0000\:0b\:00.0/remove
...
BUG: unable to handle kernel NULL pointer dereference at 0000000000000080
IP: pcie_config_aspm_link+0x5d/0x2b0
Call Trace:
pcie_aspm_exit_link_state+0x75/0x130
pci_stop_bus_device+0xa4/0xb0
pci_stop_and_remove_bus_device_locked+0x1a/0x30
remove_store+0x50/0x70
dev_attr_store+0x18/0x30
sysfs_kf_write+0x44/0x60
kernfs_fop_write+0x10e/0x190
__vfs_write+0x28/0x110
? rcu_read_lock_sched_held+0x5d/0x80
? rcu_sync_lockdep_assert+0x2c/0x60
? __sb_start_write+0x173/0x1a0
? vfs_write+0xb3/0x180
vfs_write+0xc4/0x180
SyS_write+0x49/0xa0
do_syscall_64+0xa6/0x1c0
entry_SYSCALL64_slow_path+0x25/0x25
---[ end trace bd187ee0267df5d9 ]---
To avoid this, set link->downstream in alloc_pcie_link_state(), so every
pcie_link_state structure has a valid link->downstream pointer.
[bhelgaas: changelog]
Signed-off-by: Yinghai Lu <yinghai@kernel.org>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Acked-by: Rajat Jain <rajatja@google.com>
CC: stable@vger.kernel.org
Even when using the PHY framework, we need the elbi_base. Before this
patch, we didn't initialize elbi_base, which caused NULL pointer
dereferences later.
Fixes: e7cd7ef58e ("PCI: exynos: Support the PHY generic framework")
Signed-off-by: Jaehoon Chung <jh80.chung@samsung.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Microsemi's "Switchtec" line of PCI switch devices is already well
supported by the kernel with standard PCI switch drivers. However, the
Switchtec device advertises a special management endpoint with a separate
PCI function address and class code. This endpoint enables some additional
functionality which includes:
* Packet and Byte Counters
* Switch Firmware Upgrades
* Event and Error logs
* Querying port link status
* Custom user firmware commands
Add a switchtec kernel module which provides PCI driver that exposes a char
device. The char device provides userspace access to this interface
through read, write and (optionally) poll calls.
A userspace tool and library which utilizes this interface is available
at [1]. This tool takes inspiration (and borrows some code) from
nvme-cli [2]. The tool is largely complete at this time but additional
features may be added in the future.
[1] https://github.com/sbates130272/switchtec-user
[2] https://github.com/linux-nvme/nvme-cli
[Dan Carpenter <dan.carpenter@oracle.com>: don't invert error codes]
[Christophe JAILLET <christophe.jaillet@wanadoo.fr>: fix
switchtec_dev_open() error handling]
Tested-by: Krishna Dhulipala <krishnad@fb.com>
Signed-off-by: Logan Gunthorpe <logang@deltatee.com>
Signed-off-by: Stephen Bates <stephen.bates@microsemi.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Reviewed-by: Wei Zhang <wzhang@fb.com>
Reviewed-by: Jens Axboe <axboe@fb.com>
Reviewed-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1
iQIcBAABAgAGBQJYuZvtAAoJEFmIoMA60/r8CUoP/0n8jRKH98KGQdjP14IvfjNY
XivyQzQGRPhEUgOopIPUzQYHZTS3R0uL3RJcJy6ZuZDGUKrPacayv0nMikBJDVHn
KrnVJYkLpCvQZV5TyOq18BVFHNOxG9vMOKo73quy9nLrmDiVtQ+F6mFL3z8CRKS1
r9QOGW0/tVtbHxwGINKFo572DXHKVkQAvhL36hYXmLzrFFUuvUK8XMxQz4Bgudo/
TsRNkES5h4pdBwnRNVjrMUl/uIIfUFrP//Q1OUTitcEVep4PsOV7WJKvGqL2HtGY
5xq0lfEIP9aLEgBJd9jydPXgWnNXXbC2PV2MtoTCw3kw4Jxq//sNRWVycbRgjLoA
nfQ2vsvhaX3eE7CHz2w8e4Jpi4/lz2QPq2U3oTbwco+HoX2jsPiufGhQ2CSSBKJP
13AKBbfMpLkgDmG/eTwvb5Lqsvzj2aSFKkKuhFV5KGx7iXjLQu2G6hXr/3PgsQfB
cI0hbL34sExydsYxD0wPhr+PsPzxymumcJ0cCpN4omErl2Ovt5fIoKdI+/olhTNO
zvIZquE9GMTDh3ajWn5hSIqTiohZlJ0jyo06QMxoy7lmwloI0OTK0xpRY3ZfSncZ
WvWRg4BN1m6E5IFJvMLSzVJ5lTZgjm7h6RvSskfOfV/zMykd2d6E7j9v4cOMU1Hl
MG/amy4uXcWbkhMse2kp
=hgdy
-----END PGP SIGNATURE-----
Merge tag 'pci-v4.11-fixes-1' of git://git.kernel.org/pub/scm/linux/kernel/git/helgaas/pci
Pull PCI fixes from Bjorn Helgaas:
- fix NULL pointer dereferences in many DesignWare-based drivers due to
refactoring error
- fix Altera config write breakage due to my refactoring error
* tag 'pci-v4.11-fixes-1' of git://git.kernel.org/pub/scm/linux/kernel/git/helgaas/pci:
PCI: altera: Fix TLP_CFG_DW0 for TLP write
PCI: dwc: Fix crashes seen due to missing assignments
Pull block layer fixes from Jens Axboe:
"A collection of fixes for this merge window, either fixes for existing
issues, or parts that were waiting for acks to come in. This pull
request contains:
- Allocation of nvme queues on the right node from Shaohua.
This was ready long before the merge window, but waiting on an ack
from Bjorn on the PCI bit. Now that we have that, the three patches
can go in.
- Two fixes for blk-mq-sched with nvmeof, which uses hctx specific
request allocations. This caused an oops. One part from Sagi, one
part from Omar.
- A loop partition scan deadlock fix from Omar, fixing a regression
in this merge window.
- A three-patch series from Keith, closing up a hole on clearing out
requests on shutdown/resume.
- A stable fix for nbd from Josef, fixing a leak of sockets.
- Two fixes for a regression in this window from Jan, fixing a
problem with one of his earlier patches dealing with queue vs bdi
life times.
- A fix for a regression with virtio-blk, causing an IO stall if
scheduling is used. From me.
- A fix for an io context lock ordering problem. From me"
* 'for-linus' of git://git.kernel.dk/linux-block:
block: Move bdi_unregister() to del_gendisk()
blk-mq: ensure that bd->last is always set correctly
block: don't call ioc_exit_icq() with the queue lock held for blk-mq
block: Initialize bd_bdi on inode initialization
loop: fix LO_FLAGS_PARTSCAN hang
nvme: Complete all stuck requests
blk-mq: Provide freeze queue timeout
blk-mq: Export blk_mq_freeze_queue_wait
nbd: stop leaking sockets
blk-mq: move update of tags->rqs to __blk_mq_alloc_request()
blk-mq: kill blk_mq_set_alloc_data()
blk-mq: make blk_mq_alloc_request_hctx() allocate a scheduler request
blk-mq-sched: Allocate sched reserved tags as specified in the original queue tagset
nvme: allocate nvme_queue in correct node
PCI: add an API to get node from vector
blk-mq: allocate blk_mq_tags and requests in correct node
Next patch will use the API to get the node from vector for nvme device
Signed-off-by: Shaohua Li <shli@fb.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Acked-by: Bjorn Helgaas <bhelgaas@google.com>
Signed-off-by: Jens Axboe <axboe@fb.com>
Fix up affected files that include this signal functionality via sched.h.
Acked-by: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Mike Galbraith <efault@gmx.de>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: linux-kernel@vger.kernel.org
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Highlights include:
- An update of the disassembly code used by xmon to the latest versions in
binutils. We've received permission from all the authors of the relevant
binutils changes to relicense their changes to the relevant files from GPLv3
to GPLv2, for inclusion in Linux. Thanks to Peter Bergner for doing the leg
work to get permission from everyone.
- Addition of the "architected" Power9 CPU table entry, allowing us to boot
in Power9 architected mode under a hypervisor.
- Updates to the Power9 PMU code.
- Implementation of clear_bit_unlock_is_negative_byte() to optimise
unlock_page().
- Freescale updates from Scott: "Highlights include 8xx breakpoints and perf,
t1042rdb display support, and board updates."
Thanks to:
Al Viro, Andrew Donnellan, Aneesh Kumar K.V, Balbir Singh, Douglas Miller,
Frédéric Weisbecker, Gavin Shan, Madhavan Srinivasan, Michael Roth, Nathan
Fontenot, Naveen N. Rao, Nicholas Piggin, Peter Bergner, Paul E. McKenney,
Rashmica Gupta, Russell Currey, Sahil Mehta, Stewart Smith.
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1
iQIcBAABAgAGBQJYthsKAAoJEFHr6jzI4aWAaWMQAJ7mAwX98ncoYschPgRmmIun
f6DtE4IonrxiZ22gp1ct4+c9OFtA+B5FXMcEhOKpfh93lg38PTDjHs9e5kfauD7+
oTQ2Bg1eXaL48FKdmC5Vs4Kt+/J8e9guGafUC1OVIpTyyRPoZeUDH0lx+kSPV5bd
PkL+wY/k3W0Njo8WgD1P9u3W15+BxISo/k8c7ajzKTHGBZlAvj5h2gO6XUBNMLyy
YClB/qIymjZriSB+AeWYD79k8gPbBZPsmZG0ZF1hY060894LgqLB9mPOJdffx/DY
H7/uP6jcsRDOXTOmyueW1SEmPoQbtysiMd1lNrCXKtC/Okr5uhn2cUhi88AsgWvd
1QFly2lobcDAKPah/yB7YQGMAcmYvGGNuqrWaosaV2T7r0KprzUYYgCOqzvC3WSJ
QtVatBzMIqRTMYq+3U4G1aHeCXlRazVQHDuvPby8RdR5b2gIexiqMab2eS7tSMIH
mCOIunRIvT14g/7wxUV7tahN+ifncNxzAk4DvPO+Wc4FQ4sy7wArv2YipSaWRWtE
u7tNdBkEwlDkKhJgRU5T0Op2PyMbHwCP8pWuz7PQIhKIcgwmP9wb07BIWG/GGIqn
07TxJYX2ItabyEMZMsYhzILZqjLyiAaCARANB7ScbQbdP8wdcGZcwismhwnfROIU
NuxsZg63BUDMoxk7Sauu
=rspd
-----END PGP SIGNATURE-----
Merge tag 'powerpc-4.11-2' of git://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux
Pull more powerpc updates from Michael Ellerman:
"Highlights include:
- an update of the disassembly code used by xmon to the latest
versions in binutils. We've received permission from all the
authors of the relevant binutils changes to relicense their changes
to the relevant files from GPLv3 to GPLv2, for inclusion in Linux.
Thanks to Peter Bergner for doing the leg work to get permission
from everyone.
- addition of the "architected" Power9 CPU table entry, allowing us
to boot in Power9 architected mode under a hypervisor.
- updates to the Power9 PMU code.
- implementation of clear_bit_unlock_is_negative_byte() to optimise
unlock_page().
- Freescale updates from Scott: "Highlights include 8xx breakpoints
and perf, t1042rdb display support, and board updates."
Thanks to:
Al Viro, Andrew Donnellan, Aneesh Kumar K.V, Balbir Singh, Douglas
Miller, Frédéric Weisbecker, Gavin Shan, Madhavan Srinivasan,
Michael Roth, Nathan Fontenot, Naveen N. Rao, Nicholas Piggin, Peter
Bergner, Paul E. McKenney, Rashmica Gupta, Russell Currey, Sahil
Mehta, Stewart Smith"
* tag 'powerpc-4.11-2' of git://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux: (48 commits)
powerpc: Remove leftover cputime_to_nsecs call causing build error
powerpc/mm/hash: Always clear UPRT and Host Radix bits when setting up CPU
powerpc/optprobes: Fix TOC handling in optprobes trampoline
powerpc/pseries: Advertise Hot Plug Event support to firmware
cxl: fix nested locking hang during EEH hotplug
powerpc/xmon: Dump memory in CPU endian format
powerpc/pseries: Revert 'Auto-online hotplugged memory'
powerpc/powernv: Make PCI non-optional
powerpc/64: Implement clear_bit_unlock_is_negative_byte()
powerpc/powernv: Remove unused variable in pnv_pci_sriov_disable()
powerpc/kernel: Remove error message in pcibios_setup_phb_resources()
powerpc/mm: Fix typo in set_pte_at()
pci/hotplug/pnv-php: Disable MSI and PCI device properly
pci/hotplug/pnv-php: Disable surprise hotplug capability on conflicts
pci/hotplug/pnv-php: Remove WARN_ON() in pnv_php_put_slot()
powerpc: Add POWER9 architected mode to cputable
powerpc/perf: use is_kernel_addr macro in perf_get_misc_flags()
powerpc/perf: Avoid FAB_*_MATCH checks for power9
powerpc/perf: Add restrictions to PMC5 in power9 DD1
powerpc/perf: Use Instruction Counter value
...
eb5767122f ("PCI: altera: Simplify TLB_CFG_DW0 usage") used
TLP_FMTTYPE_CFGRD* (instead of TLP_FMTTYPE_CFGWR*) for TLP writes, which
causes writing to configuration space to fail. Fix it by using correct
FMTTYPE for write operation.
Fixes: eb5767122f ("PCI: altera: Simplify TLB_CFG_DW0 usage")
Signed-off-by: Ley Foon Tan <ley.foon.tan@intel.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
CC: stable@vger.kernel.org # v4.9+
Fix typos and add the following to the scripts/spelling.txt:
followings||following
While we are here, add a missing colon in the boilerplate in DT binding
documents. The "you SoC" in allwinner,sunxi-pinctrl.txt was fixed as
well.
I reworded "as the followings:" to "as follows:" for
drivers/usb/gadget/udc/renesas_usb3.c.
Link: http://lkml.kernel.org/r/1481573103-11329-32-git-send-email-yamada.masahiro@socionext.com
Signed-off-by: Masahiro Yamada <yamada.masahiro@socionext.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Bart Van Assche noted that the ib DMA mapping code was significantly
similar enough to the core DMA mapping code that with a few changes
it was possible to remove the IB DMA mapping code entirely and
switch the RDMA stack to use the core DMA mapping code. This resulted
in a nice set of cleanups, but touched the entire tree. This branch
will be submitted separately to Linus at the end of the merge window
as per normal practice for tree wide changes like this.
-----BEGIN PGP SIGNATURE-----
iQIcBAABAgAGBQJYo06oAAoJELgmozMOVy/d9Z8QALedWHdu98St1L0u2c8sxnR9
2zo/4sF5Vb9u7FpmdIX32L4SQ9s9KhPE8Qp8NtZLf9v10zlDebIRJDpXknXtKooV
CAXxX4sxBXV27/UrhbZEfXiPrmm6ccJFyIfRnMU6NlMqh2AtAsRa5AC2/RMp8oUD
Med97PFiF0o6TD22/UH1VFbRpX1zjaKyqm7a3as5sJfzNA+UGIZAQ7Euz8000DKZ
xCgVLTEwS0FmOujtBkCst7xa9TjuqR1HLOB4DdGvAhP6BHdz2yamM7Qmh9NN+NEX
0BtjsuXomtn6j6AszGC+bpipCZh3NUigcwoFAARXCYFHibBvo4DPdFeGsraFgXdy
1+KyR8CCeQG3Aly5Vwr264RFPGkGpwMj8PsBlXgQVtrlg4rriaCzOJNmIIbfdADw
ftqhxBOzReZw77aH2s+9p2ILRfcAmPqhynLvFGFo9LBvsik8LVso7YgZN0xGxwcI
IjI/XGC8UskPVsIZBIYA6sl2bYzgOjtBIHiXjRrPlW3uhduIXLrvKFfLPP/5XLAG
ehLXK+J0bfsyY9ClmlNS8oH/WdLhXAyy/KNmnj5bRRm9qg6BRJR3bsOBhZJODuoC
XgEXFfF6/7roNESWxowff7pK0rTkRg/m/Pa4VQpeO+6NWHE7kgZhL6kyIp5nKcwS
3e7mgpcwC+3XfA/6vU3F
=e0Si
-----END PGP SIGNATURE-----
Merge tag 'for-next-dma_ops' of git://git.kernel.org/pub/scm/linux/kernel/git/dledford/rdma
Pull rdma DMA mapping updates from Doug Ledford:
"Drop IB DMA mapping code and use core DMA code instead.
Bart Van Assche noted that the ib DMA mapping code was significantly
similar enough to the core DMA mapping code that with a few changes it
was possible to remove the IB DMA mapping code entirely and switch the
RDMA stack to use the core DMA mapping code.
This resulted in a nice set of cleanups, but touched the entire tree
and has been kept separate for that reason."
* tag 'for-next-dma_ops' of git://git.kernel.org/pub/scm/linux/kernel/git/dledford/rdma: (37 commits)
IB/rxe, IB/rdmavt: Use dma_virt_ops instead of duplicating it
IB/core: Remove ib_device.dma_device
nvme-rdma: Switch from dma_device to dev.parent
RDS: net: Switch from dma_device to dev.parent
IB/srpt: Modify a debug statement
IB/srp: Switch from dma_device to dev.parent
IB/iser: Switch from dma_device to dev.parent
IB/IPoIB: Switch from dma_device to dev.parent
IB/rxe: Switch from dma_device to dev.parent
IB/vmw_pvrdma: Switch from dma_device to dev.parent
IB/usnic: Switch from dma_device to dev.parent
IB/qib: Switch from dma_device to dev.parent
IB/qedr: Switch from dma_device to dev.parent
IB/ocrdma: Switch from dma_device to dev.parent
IB/nes: Remove a superfluous assignment statement
IB/mthca: Switch from dma_device to dev.parent
IB/mlx5: Switch from dma_device to dev.parent
IB/mlx4: Switch from dma_device to dev.parent
IB/i40iw: Remove a superfluous assignment statement
IB/hns: Switch from dma_device to dev.parent
...
Fix the following crash, seen in dwc/pci-imx6.
Unable to handle kernel NULL pointer dereference at virtual address 00000070
pgd = c0004000
[00000070] *pgd=00000000
Internal error: Oops: 805 [#1] SMP ARM
Modules linked in:
CPU: 0 PID: 1 Comm: swapper/0 Not tainted 4.10.0-09686-g9e31489 #1
Hardware name: Freescale i.MX6 Quad/DualLite (Device Tree)
task: cb850000 task.stack: cb84e000
PC is at imx6_pcie_probe+0x2f4/0x414
...
While at it, fix the same problem in various drivers instead of waiting for
individual crash reports.
The change in the imx6 driver was tested with qemu. The changes in other
drivers are based on code inspection and have been compile tested only.
Fixes: 442ec4c04d ("PCI: dwc: all: Split struct pcie_port into host-only and core structures")
Signed-off-by: Guenter Roeck <linux@roeck-us.net>
Signed-off-by: Vivek Gautam <vivek.gautam@codeaurora.org> # designware-plat
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Acked-by: Kishon Vijay Abraham I <kishon@ti.com>
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1
iQIcBAABAgAGBQJYrvYlAAoJEFmIoMA60/r8FuQQAMDpia3kacyCAJpa+zjmyMNF
1slytaoIvP37dFq9XF1em031lwGNr5sahZ7nP1EKgALz4odZUzait7BUABcfviIn
Uesz2E1s/miMo4/0X1j9DqY9xV649DmmSIgk1yn3kvCkH/+Ix27dexu47auGzPEb
H/sEfd1RZidjZ5EWaG0ww5FrHcuge+JHtcH6vFQtWsTOspcx++IhaVIGjC0JCpqK
DnlQKilsJ38KUkvuDcxWjtFKxAc8De9jvCR4kX96OvbHahfAWwBO4AtUv7U3JpJN
2nyQk+I5kRagbfBucaXZISUtWM7h4peLiL+TGkvKg8eOVlOCedjYlrZW4SWkbAN+
0qwcHRQ8lwhNmgp3VYq7pmnugIvW4P2Fh3uqaplCAIwlpODxWPDQP7HLM2kyzmvq
gPGi0R4Yo2PdIXqfbilrzbFVeyqkIFECr287a6+5PekC0DxsqZvOG0uA1mWKLIaH
pRQMT0FO2SCCSOpcxRExeIj+XxhXlDVOrIBP6eMiFXAMgzUAyU8fLSZVMtXAvsTS
02hVDOc/Fq2jKlCSoJRIiRp5aj1QDFS/DjBhOnW7pXuvUTCrfYBXY5NCdT9UV3Q7
W6qHWkizRmRDGxUzqSODRt5aU7VOKbWvZnp10eJyKt5s2Iawe6We5V1NX+u18UIS
Scc1nbuPTL6u1n8PsaBG
=4Owc
-----END PGP SIGNATURE-----
Merge tag 'pci-v4.11-changes' of git://git.kernel.org/pub/scm/linux/kernel/git/helgaas/pci
Pull PCI updates from Bjorn Helgaas:
- add ASPM L1 substate support
- enable PCIe Extended Tags when supported
- configure PCIe MPS settings on iProc, Versatile, X-Gene, and Xilinx
- increase VPD access timeout
- add ACS quirks for Intel Union Point, Qualcomm QDF2400 and QDF2432
- use new pci_irq_alloc_vectors() in more drivers
- fix MSI affinity memory leak
- remove unused MSI interfaces and update documentation
- remove unused AER .link_reset() callback
- avoid pci_lock / p->pi_lock deadlock seen with perf
- serialize sysfs enable/disable num_vfs operations
- move DesignWare IP from drivers/pci/host/ to drivers/pci/dwc/ and
refactor so we can support both hosts and endpoints
- add DT ECAM-like support for HiSilicon Hip06/Hip07 controllers
- add Rockchip system power management support
- add Thunder-X cn81xx and cn83xx support
- add Exynos 5440 PCIe PHY support
* tag 'pci-v4.11-changes' of git://git.kernel.org/pub/scm/linux/kernel/git/helgaas/pci: (93 commits)
PCI: dwc: Remove dependency of designware on CONFIG_PCI
PCI: dwc: Add CONFIG_PCIE_DW_HOST to enable PCI dwc host
PCI: dwc: Split pcie-designware.c into host and core files
PCI: dwc: designware: Fix style errors in pcie-designware.c
PCI: dwc: designware: Parse "num-lanes" property in dw_pcie_setup_rc()
PCI: dwc: all: Split struct pcie_port into host-only and core structures
PCI: dwc: designware: Get device pointer at the start of dw_pcie_host_init()
PCI: dwc: all: Rename cfg_read/cfg_write to read/write
PCI: dwc: all: Use platform_set_drvdata() to save private data
PCI: dwc: designware: Move register defines to designware header file
PCI: dwc: Use PTR_ERR_OR_ZERO to simplify code
PCI: dra7xx: Group PHY API invocations
PCI: dra7xx: Enable MSI and legacy interrupts simultaneously
PCI: dra7xx: Add support to force RC to work in GEN1 mode
PCI: dra7xx: Simplify probe code with devm_gpiod_get_optional()
PCI: Move DesignWare IP support to new drivers/pci/dwc/ directory
PCI: exynos: Support the PHY generic framework
Documentation: binding: Modify the exynos5440 PCIe binding
phy: phy-exynos-pcie: Add support for Exynos PCIe PHY
Documentation: samsung-phy: Add exynos-pcie-phy binding
...
Pull networking updates from David Miller:
"Highlights:
1) Support TX_RING in AF_PACKET TPACKET_V3 mode, from Sowmini
Varadhan.
2) Simplify classifier state on sk_buff in order to shrink it a bit.
From Willem de Bruijn.
3) Introduce SIPHASH and it's usage for secure sequence numbers and
syncookies. From Jason A. Donenfeld.
4) Reduce CPU usage for ICMP replies we are going to limit or
suppress, from Jesper Dangaard Brouer.
5) Introduce Shared Memory Communications socket layer, from Ursula
Braun.
6) Add RACK loss detection and allow it to actually trigger fast
recovery instead of just assisting after other algorithms have
triggered it. From Yuchung Cheng.
7) Add xmit_more and BQL support to mvneta driver, from Simon Guinot.
8) skb_cow_data avoidance in esp4 and esp6, from Steffen Klassert.
9) Export MPLS packet stats via netlink, from Robert Shearman.
10) Significantly improve inet port bind conflict handling, especially
when an application is restarted and changes it's setting of
reuseport. From Josef Bacik.
11) Implement TX batching in vhost_net, from Jason Wang.
12) Extend the dummy device so that VF (virtual function) features,
such as configuration, can be more easily tested. From Phil
Sutter.
13) Avoid two atomic ops per page on x86 in bnx2x driver, from Eric
Dumazet.
14) Add new bpf MAP, implementing a longest prefix match trie. From
Daniel Mack.
15) Packet sample offloading support in mlxsw driver, from Yotam Gigi.
16) Add new aquantia driver, from David VomLehn.
17) Add bpf tracepoints, from Daniel Borkmann.
18) Add support for port mirroring to b53 and bcm_sf2 drivers, from
Florian Fainelli.
19) Remove custom busy polling in many drivers, it is done in the core
networking since 4.5 times. From Eric Dumazet.
20) Support XDP adjust_head in virtio_net, from John Fastabend.
21) Fix several major holes in neighbour entry confirmation, from
Julian Anastasov.
22) Add XDP support to bnxt_en driver, from Michael Chan.
23) VXLAN offloads for enic driver, from Govindarajulu Varadarajan.
24) Add IPVTAP driver (IP-VLAN based tap driver) from Sainath Grandhi.
25) Support GRO in IPSEC protocols, from Steffen Klassert"
* git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-next: (1764 commits)
Revert "ath10k: Search SMBIOS for OEM board file extension"
net: socket: fix recvmmsg not returning error from sock_error
bnxt_en: use eth_hw_addr_random()
bpf: fix unlocking of jited image when module ronx not set
arch: add ARCH_HAS_SET_MEMORY config
net: napi_watchdog() can use napi_schedule_irqoff()
tcp: Revert "tcp: tcp_probe: use spin_lock_bh()"
net/hsr: use eth_hw_addr_random()
net: mvpp2: enable building on 64-bit platforms
net: mvpp2: switch to build_skb() in the RX path
net: mvpp2: simplify MVPP2_PRS_RI_* definitions
net: mvpp2: fix indentation of MVPP2_EXT_GLOBAL_CTRL_DEFAULT
net: mvpp2: remove unused register definitions
net: mvpp2: simplify mvpp2_bm_bufs_add()
net: mvpp2: drop useless fields in mvpp2_bm_pool and related code
net: mvpp2: remove unused 'tx_skb' field of 'struct mvpp2_tx_queue'
net: mvpp2: release reference to txq_cpu[] entry after unmapping
net: mvpp2: handle too large value in mvpp2_rx_time_coal_set()
net: mvpp2: handle too large value handling in mvpp2_rx_pkts_coal_set()
net: mvpp2: remove useless arguments in mvpp2_rx_{pkts, time}_coal_set
...
* pci/host-rockchip:
PCI: rockchip: Set vendor ID from local core config space
PCI: rockchip: Fix rockchip_pcie_probe() error path to free resource list
PCI: rockchip: Mark PM functions as __maybe_unused
PCI: rockchip: Use readl_poll_timeout() instead of open-coding it
PCI: rockchip: Disable RC's ASPM L0s based on DT "aspm-no-l0s"
PCI: rockchip: Add system PM support
* pci/host-rcar:
PCI: rcar: Use of_device_get_match_data() to simplify probe
PCI: rcar: Add compatible string for r8a7796
PCI: rcar: Return -ENODEV from host bridge probe when no card present
* pci/host-hisi:
PCI: generic: Call pci_fixup_irqs() only on ARM
PCI: Disable MSI for HiSilicon Hip06/Hip07 Root Ports
PCI: hisi: Rename config space accessors to remove "acpi"
PCI: hisi: Add DT almost-ECAM support for Hip06/Hip07 host controllers
PCI: hisi: Use of_device_get_match_data() to simplify probe
Conflicts:
drivers/pci/dwc/pcie-hisi.c
* pci/host-exynos:
PCI: exynos: Support the PHY generic framework
Documentation: binding: Modify the exynos5440 PCIe binding
phy: phy-exynos-pcie: Add support for Exynos PCIe PHY
Documentation: samsung-phy: Add exynos-pcie-phy binding
PCI: exynos: Refactor to make it easier to support other SoCs
PCI: exynos: Remove duplicated code
PCI: exynos: Use the bitops BIT() macro to build bitmasks
PCI: exynos: Remove unnecessary local variables
PCI: exynos: Replace the *_blk/*_phy/*_elb accessors
PCI: exynos: Rename all pointer names from "exynos_pcie" to "ep"
Conflicts:
drivers/pci/dwc/pci-exynos.c
* pci/host-designware:
PCI: dwc: Remove dependency of designware on CONFIG_PCI
PCI: dwc: Add CONFIG_PCIE_DW_HOST to enable PCI dwc host
PCI: dwc: Split pcie-designware.c into host and core files
PCI: dwc: designware: Fix style errors in pcie-designware.c
PCI: dwc: designware: Parse "num-lanes" property in dw_pcie_setup_rc()
PCI: dwc: all: Split struct pcie_port into host-only and core structures
PCI: dwc: designware: Get device pointer at the start of dw_pcie_host_init()
PCI: dwc: all: Rename cfg_read/cfg_write to read/write
PCI: dwc: all: Use platform_set_drvdata() to save private data
PCI: dwc: designware: Move register defines to designware header file
PCI: dwc: Use PTR_ERR_OR_ZERO to simplify code
PCI: dra7xx: Group PHY API invocations
PCI: dra7xx: Enable MSI and legacy interrupts simultaneously
PCI: dra7xx: Add support to force RC to work in GEN1 mode
PCI: dra7xx: Simplify probe code with devm_gpiod_get_optional()
PCI: Move DesignWare IP support to new drivers/pci/dwc/ directory
PCI: designware: Check for iATU unroll only on platforms that use ATU
CONFIG_PCI is used to enable host mode PCI. In preparation for adding
endpoint mode support to designware driver, remove the dependency of
designware on CONFIG_PCI and make only the host-specific part depend on
CONFIG_PCI.
Signed-off-by: Kishon Vijay Abraham I <kishon@ti.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Now that PCI designware host has a separate file, add a new PCIE_DW_HOST
config symbol to select the host-only driver. This will enable to
independently select host support and endpoint support (when it's added).
Signed-off-by: Kishon Vijay Abraham I <kishon@ti.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Split pcie-designware.c into pcie-designware-host.c that contains the host
specific parts of the driver and pcie-designware.c that contains the parts
used by both host driver and endpoint driver.
Signed-off-by: Kishon Vijay Abraham I <kishon@ti.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
No functional change. Fix all checkpatch warnings and check errors in
pcie-designware.c
Signed-off-by: Kishon Vijay Abraham I <kishon@ti.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Acked-By: Joao Pinto <jpinto@synopsys.com>
The "num-lanes" DT property is parsed in dw_pcie_host_init(). However
num-lanes is applicable to both root complex mode and endpoint mode. As a
first step, move the parsing of this property outside dw_pcie_host_init().
This is in preparation for splitting pcie-designware.c to pcie-designware.c
and pcie-designware-host.c
Signed-off-by: Kishon Vijay Abraham I <kishon@ti.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Keep only the host-specific members in struct pcie_port and move the common
members (i.e common to both host and endpoint) to struct dw_pcie. This is
in preparation for adding endpoint mode support to designware driver.
While at that also fix checkpatch warnings.
Signed-off-by: Kishon Vijay Abraham I <kishon@ti.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
CC: Jingoo Han <jingoohan1@gmail.com>
CC: Richard Zhu <hongxing.zhu@nxp.com>
CC: Lucas Stach <l.stach@pengutronix.de>
CC: Murali Karicheri <m-karicheri2@ti.com>
CC: Minghuan Lian <minghuan.Lian@freescale.com>
CC: Mingkai Hu <mingkai.hu@freescale.com>
CC: Roy Zang <tie-fei.zang@freescale.com>
CC: Thomas Petazzoni <thomas.petazzoni@free-electrons.com>
CC: Niklas Cassel <niklas.cassel@axis.com>
CC: Jesper Nilsson <jesper.nilsson@axis.com>
CC: Joao Pinto <Joao.Pinto@synopsys.com>
CC: Zhou Wang <wangzhou1@hisilicon.com>
CC: Gabriele Paoloni <gabriele.paoloni@huawei.com>
CC: Stanimir Varbanov <svarbanov@mm-sol.com>
CC: Pratyush Anand <pratyush.anand@gmail.com>
No functional change. Get device pointer at the beginning of
dw_pcie_host_init() instead of getting it all over dw_pcie_host_init().
This is in preparation for splitting struct pcie_port into host and core
structures (once split pcie_port will not have device pointer).
Signed-off-by: Kishon Vijay Abraham I <kishon@ti.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
No functional change. dw_pcie_cfg_read()/dw_pcie_cfg_write() doesn't do
anything specific to access configuration space. It can be just renamed to
dw_pcie_read()/dw_pcie_write() and used to read/write data to dbi space.
Signed-off-by: Kishon Vijay Abraham I <kishon@ti.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Reviewed-By: Joao Pinto <jpinto@synopsys.com>
CC: Jingoo Han <jingoohan1@gmail.com>
CC: Murali Karicheri <m-karicheri2@ti.com>
CC: Stanimir Varbanov <svarbanov@mm-sol.com>
CC: Pratyush Anand <pratyush.anand@gmail.com>
Add platform_set_drvdata() in all designware-based drivers to store the
private data structure of the driver so that dev_set_drvdata() can be used
to get back private data structure in add_pcie_port/host_init. This is in
preparation for splitting struct pcie_port into core and host only
structures. After the split pcie_port will not be part of the driver's
private data structure and *container_of* used now to get the private data
pointer cannot be used.
Signed-off-by: Kishon Vijay Abraham I <kishon@ti.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
CC: Jingoo Han <jingoohan1@gmail.com>
CC: Richard Zhu <hongxing.zhu@nxp.com>
CC: Lucas Stach <l.stach@pengutronix.de>
CC: Murali Karicheri <m-karicheri2@ti.com>
CC: Minghuan Lian <minghuan.Lian@freescale.com>
CC: Mingkai Hu <mingkai.hu@freescale.com>
CC: Roy Zang <tie-fei.zang@freescale.com>
CC: Thomas Petazzoni <thomas.petazzoni@free-electrons.com>
CC: Niklas Cassel <niklas.cassel@axis.com>
CC: Jesper Nilsson <jesper.nilsson@axis.com>
CC: Joao Pinto <Joao.Pinto@synopsys.com>
CC: Zhou Wang <wangzhou1@hisilicon.com>
CC: Gabriele Paoloni <gabriele.paoloni@huawei.com>
CC: Stanimir Varbanov <svarbanov@mm-sol.com>
CC: Pratyush Anand <pratyush.anand@gmail.com>
No functional change. Move the register defines and other macros from
pcie-designware.c to pcie-designware.h. This is in preparation to split the
pcie-designware.c file into designware core file and host-specific file.
While at that also fix a checkpatch warning.
Signed-off-by: Kishon Vijay Abraham I <kishon@ti.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Reviewed-By: Joao Pinto <jpinto@synopsys.com>
Use PTR_ERR_OR_ZERO rather than if(IS_ERR(...)) + PTR_ERR to avoid the
following warnings found by scripts/coccinelle/api/ptr_ret.cocci:
drivers/pci/dwc/pcie-qcom.c:215:1-3: WARNING: PTR_ERR_OR_ZERO can be used
drivers/pci/dwc/pcie-qcom.c:247:1-3: WARNING: PTR_ERR_OR_ZERO can be used
drivers/pci/dwc/pcie-qcom.c:481:1-3: WARNING: PTR_ERR_OR_ZERO can be used
Signed-off-by: Fengguang Wu <fengguang.wu@intel.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
CC: Kishon Vijay Abraham I <kishon@ti.com>
No functional change. PHY APIs like phy_init()/phy_power_on() are invoked
from multiple places. Group all the PHY APIs in dra7xx_pcie_enable_phy()
and dra7xx_pcie_disable_phy() and use these functions for enabling or
disabling the PHY.
Signed-off-by: Kishon Vijay Abraham I <kishon@ti.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
pci-dra7xx driver had a bug in that if CONFIG_PCI_MSI config is enabled, it
doesn't support legacy interrupt. Fix it here so that both MSI and legacy
interrupts can be enabled simultaneously and the interrupt mechanism
supported by the endpoint device will be used.
Signed-off-by: Kishon Vijay Abraham I <kishon@ti.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
PCIe in AM57x/DRA7x devices is by default configured to work in GEN2 mode.
However there may be situations when working in GEN1 mode is desired. One
example is limitation i925 (PCIe GEN2 mode not supported at junction
temperatures < 0C).
Add support to force Root Complex to work in GEN1 mode if so desired, but
don't force GEN1 mode on any board just yet.
Signed-off-by: Kishon Vijay Abraham I <kishon@ti.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
No functional change. Use the new devm_gpiod_get_optional() to simplify
the probe code.
Signed-off-by: Kishon Vijay Abraham I <kishon@ti.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Group all the PCI drivers that use DesignWare core in dwc directory.
dwc IP is capable of operating in both host mode and device mode and
keeping it inside the *host* directory is misleading.
Signed-off-by: Kishon Vijay Abraham I <kishon@ti.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Acked-by: Jingoo Han <jingoohan1@gmail.com>
Acked-By: Joao Pinto <jpinto@synopsys.com>
Cc: Thomas Petazzoni <thomas.petazzoni@free-electrons.com>
Cc: Minghuan Lian <minghuan.Lian@freescale.com>
Cc: Mingkai Hu <mingkai.hu@freescale.com>
Cc: Roy Zang <tie-fei.zang@freescale.com>
Cc: Richard Zhu <hongxing.zhu@nxp.com>
Cc: Lucas Stach <l.stach@pengutronix.de>
Cc: Murali Karicheri <m-karicheri2@ti.com>
Cc: Pratyush Anand <pratyush.anand@gmail.com>
Cc: Niklas Cassel <niklas.cassel@axis.com>
Cc: Jesper Nilsson <jesper.nilsson@axis.com>
Cc: Zhou Wang <wangzhou1@hisilicon.com>
Cc: Gabriele Paoloni <gabriele.paoloni@huawei.com>
Cc: Stanimir Varbanov <svarbanov@mm-sol.com>
Switch the pci-exynos driver to generic PHY framework. At the same time
backward compatibility is preserved: Warning will be printed for old DTB.
Refer to the binding file:
- Documentation/devicetree/bindings/pci/samsung,exynos5440-pcie.txt
Signed-off-by: Jaehoon Chung <jh80.chung@samsung.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Reviewed-by: Pankaj Dubey <pankaj.dubey@samsung.com>
Reviewed-by: Alim Akhtar <alim.akhtar@samsung.com>
Acked-by: Krzysztof Kozlowski <krzk@kernel.org>
Acked-by: Jingoo Han <jingoohan1@gmail.com>
Pull locking updates from Ingo Molnar:
"The main changes in this cycle were:
- Implement wraparound-safe refcount_t and kref_t types based on
generic atomic primitives (Peter Zijlstra)
- Improve and fix the ww_mutex code (Nicolai Hähnle)
- Add self-tests to the ww_mutex code (Chris Wilson)
- Optimize percpu-rwsems with the 'rcuwait' mechanism (Davidlohr
Bueso)
- Micro-optimize the current-task logic all around the core kernel
(Davidlohr Bueso)
- Tidy up after recent optimizations: remove stale code and APIs,
clean up the code (Waiman Long)
- ... plus misc fixes, updates and cleanups"
* 'locking-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (50 commits)
fork: Fix task_struct alignment
locking/spinlock/debug: Remove spinlock lockup detection code
lockdep: Fix incorrect condition to print bug msgs for MAX_LOCKDEP_CHAIN_HLOCKS
lkdtm: Convert to refcount_t testing
kref: Implement 'struct kref' using refcount_t
refcount_t: Introduce a special purpose refcount type
sched/wake_q: Clarify queue reinit comment
sched/wait, rcuwait: Fix typo in comment
locking/mutex: Fix lockdep_assert_held() fail
locking/rtmutex: Flip unlikely() branch to likely() in __rt_mutex_slowlock()
locking/rwsem: Reinit wake_q after use
locking/rwsem: Remove unnecessary atomic_long_t casts
jump_labels: Move header guard #endif down where it belongs
locking/atomic, kref: Implement kref_put_lock()
locking/ww_mutex: Turn off __must_check for now
locking/atomic, kref: Avoid more abuse
locking/atomic, kref: Use kref_get_unless_zero() more
locking/atomic, kref: Kill kref_sub()
locking/atomic, kref: Add kref_read()
locking/atomic, kref: Add KREF_INIT()
...
During device setup, msix_setup_entries() and msi_setup_entry() allocate
msi_desc by calling alloc_msi_entry(). alloc_msi_entry() can also allocate
a affinity cpumask. During device teardown free_msi_irqs() is called and
the msi_desc is freed, but the affinity cpumask is leaked.
Fix it by calling free_msi_entry() which frees both the msi_desc and the
affinity cpumask.
[bhelgaas: aa48b6f708 ("genirq/MSI: Move alloc_msi_entry() from PCI into
generic MSI code") moved alloc_msi_entry() from drivers/pci/msi.c to
kernel/irq/msi.c and added a new corresponding free_msi_entry() interface.
After aa48b6f708, pci/msi.c used alloc_msi_entry(), but did its own
kfree() instead of using free_msi_entry(). 28f4b04143 ("genirq/msi: Add
cpumask allocation to alloc_msi_entry") added affinity to both
alloc_msi_entry() and free_msi_entry(), but pci/msi.c didn't use
free_msi_entry(), resulting in this leak.]
Fixes: aa48b6f708 ("genirq/MSI: Move alloc_msi_entry() from PCI into generic MSI code")
Signed-off-by: Prarit Bhargava <prarit@redhat.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
CC: Myron Stowe <mstowe@redhat.com>
Previously we extracted 'Completion Status' from b14:12, but it is actually
b15:13. Extract it from the correct bits.
Signed-off-by: Hu Yadi<yadi.hu@windriver.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Acked-by: Ley Foon Tan <ley.foon.tan@intel.com>
The TRM says the vendor ID in the RC's configure space can be rewritten
and the value must be the same as the value read from the local core
configure space. But we misread that and didn't notice it before. Actually
we should only able to rewrite it from the local core configure space.
Fix that issue to make lspci show the correct IP vendor infomation.
Signed-off-by: Shawn Lin <shawn.lin@rock-chips.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
The Qualcomm QDF2xxx root ports don't advertise an ACS capability, but they
do provide ACS-like features to disable peer transactions and validate bus
numbers in requests.
To be specific:
* Hardware supports source validation but it will report the issue as
Completer Abort instead of ACS Violation.
* Hardware doesn't support peer-to-peer and each root port is a root
complex with unique segment numbers.
* It is not possible for one root port to pass traffic to the other root
port. All PCIe transactions are terminated inside the root port.
Add an ACS quirk for the QDF2400 and QDF2432 products.
[bhelgaas: changelog]
Signed-off-by: Sinan Kaya <okaya@codeaurora.org>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Reviewed-by: Alex Williamson <alex.williamson@redhat.com>
Use the device serial number as the PCI domain. The serial numbers start
with 1 and are unique within a VM. So names, such as VF NIC names, that
include domain number as part of the name, can be shorter than that based
on part of bus UUID previously. The new names will also stay same for VMs
created with copied VHD and same number of devices.
Signed-off-by: Haiyang Zhang <haiyangz@microsoft.com>
Signed-off-by: Stephen Hemminger <sthemmin@microsoft.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Reviewed-by: K. Y. Srinivasan <kys@microsoft.com>
pnv_php_disable_irq() can be called in two paths: Bailing path in
pnv_php_enable_irq() or releasing slot. The MSI (or MSIx) interrupts
is disabled unconditionally in pnv_php_disable_irq(). It's wrong
because that might be enabled by drivers other than pnv-php.
This disables MSI (or MSIx) interrupts and the PCI device only if
it was enabled by pnv-php. In the error path of pnv_php_enable_irq(),
we rely on the newly added parameter @disable_device. In the path
of releasing slot, @pnv_php->irq is checked.
Cc: <stable@vger.kernel.org> # v4.9+
Fixes: 360aebd85a ("drivers/pci/hotplug: Support surprise hotplug in powernv driver")
Signed-off-by: Gavin Shan <gwshan@linux.vnet.ibm.com>
Reviewed-by: Andrew Donnellan <andrew.donnellan@au1.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
The root port or PCIe switch downstream port might have been associated
with driver other than pnv-php. The MSI or MSIx might also have been
enabled by that driver (e.g. pcieport_drv). Attempt to enable MSI incurs
below backtrace:
PowerPC PowerNV PCI Hotplug Driver version: 0.1
------------[ cut here ]------------
WARNING: CPU: 19 PID: 1004 at drivers/pci/msi.c:1071 \
__pci_enable_msi_range+0x84/0x4e0
NIP [c000000000665c34] __pci_enable_msi_range+0x84/0x4e0
LR [c000000000665c24] __pci_enable_msi_range+0x74/0x4e0
Call Trace:
[c000000384d67600] [c000000000665c24] __pci_enable_msi_range+0x74/0x4e0
[c000000384d676e0] [d00000000aa31b04] pnv_php_register+0x564/0x5a0 [pnv_php]
[c000000384d677c0] [d00000000aa31658] pnv_php_register+0xb8/0x5a0 [pnv_php]
[c000000384d678a0] [d00000000aa31658] pnv_php_register+0xb8/0x5a0 [pnv_php]
[c000000384d67980] [d00000000aa31dfc] pnv_php_init+0x60/0x98 [pnv_php]
[c000000384d679f0] [c00000000000cfdc] do_one_initcall+0x6c/0x1d0
[c000000384d67ab0] [c000000000b92354] do_init_module+0x94/0x254
[c000000384d67b40] [c00000000019719c] load_module+0x258c/0x2c60
[c000000384d67d30] [c000000000197bb0] SyS_finit_module+0xf0/0x170
[c000000384d67e30] [c00000000000b184] system_call+0x38/0xe0
This fixes the issue by skipping enabling the surprise hotplug
capability if the MSI or MSIx on the PCI slot's upstream port has
been enabled by other driver.
Cc: <stable@vger.kernel.org> # v4.9+
Fixes: 360aebd85a ("drivers/pci/hotplug: Support surprise hotplug in powernv driver")
Signed-off-by: Gavin Shan <gwshan@linux.vnet.ibm.com>
Reviewed-by: Andrew Donnellan <andrew.donnellan@au1.ibm.com>
Tested-by: Vaibhav Jain <vaibhav@linux.vnet.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Sort the list of Intel devices that have no PCI D3 delay by ID. Add a
comment for group of devices that had not been marked yet.
There is no functional change.
Signed-off-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
* pci/msi:
PCI/MSI: Update MSI/MSI-X bits in PCIEBUS-HOWTO
PCI/MSI: Document pci_alloc_irq_vectors(), deprecate pci_enable_msi()
PCI/MSI: Return -ENOSPC if pci_enable_msi_range() can't get enough vectors
PCI/portdrv: Use pci_irq_alloc_vectors()
PCI/MSI: Check that we have a legacy interrupt line before using it
PCI/MSI: Remove pci_msi_domain_{alloc,free}_irqs()
PCI/MSI: Remove unused pci_msi_create_default_irq_domain()
PCI/MSI: Return failure when msix_setup_entries() fails
PCI/MSI: Remove pci_enable_msi_{exact,range}()
amd-xgbe: Update PCI support to use new IRQ functions
[media] cobalt: use pci_irq_allocate_vectors()
PCI/MSI: Fix msi_capability_init() kernel-doc warnings
* pci/enumeration:
PCI: Remove duplicate check for positive return value from probe() functions
PCI: Enable PCIe Extended Tags if supported
PCI: Avoid possible deadlock on pci_lock and p->pi_lock
PCI/ACPI: Fix bus range comparison in pci_mcfg_lookup()
PCI: Apply _HPX settings only to relevant devices
We're supporting surprise hotplug on PCI slots behind root port
or PCIe switch downstream ports, which don't claim the capability
in hardware register (offset: PCIe cap + PCI_EXP_SLTCAP). PEX8718
is one of the examples. For those PCI slots, the PDC (Presence
Detection Change) event isn't reliable and the underly (skiboot)
firmware has best judgement.
This masks the PDC event when skiboot requests by "ibm,slot-broken-pdc"
property in PCI slot's device-tree node.
Reported-by: Hank Chang <hankmax0000@gmail.com>
Signed-off-by: Gavin Shan <gwshan@linux.vnet.ibm.com>
Tested-by: Willie Liauw <williel@supermicro.com.tw>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
In PowerNV PCI hotplug driver, the initial PCI slot's state is set
to PNV_PHP_STATE_POPULATED if no PCI devices are connected to the
slot. The PCI devices that are hot added to the slot won't be probed
and populated because of the check in pnv_php_enable():
/* Check if the slot has been configured */
if (php_slot->state != PNV_PHP_STATE_REGISTERED)
return 0;
This fixes the issue by leaving the slot in PNV_PHP_STATE_REGISTERED
state initially if nothing is connected to the slot.
Fixes: 360aebd85a ("drivers/pci/hotplug: Support surprise hotplug in powernv driver")
Cc: stable@vger.kernel.org #v4.9+
Reported-by: Hank Chang <hankmax0000@gmail.com>
Signed-off-by: Gavin Shan <gwshan@linux.vnet.ibm.com>
Tested-by: Willie Liauw <williel@supermicro.com.tw>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
The surprise hotplug is driven by interrupt in PowerNV PCI hotplug
driver. In the interrupt handler, pnv_php_interrupt(), we bail when
pnv_pci_get_presence_state() returns zero wrongly. It causes the
presence change event is always ignored incorrectly.
This fixes the issue by bailing on error (non-zero value) returned
from pnv_pci_get_presence_state().
Fixes: 360aebd85a ("drivers/pci/hotplug: Support surprise hotplug in powernv driver")
Cc: stable@vger.kernel.org #v4.9+
Reported-by: Hank Chang <hankmax0000@gmail.com>
Signed-off-by: Gavin Shan <gwshan@linux.vnet.ibm.com>
Tested-by: Willie Liauw <williel@supermicro.com.tw>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Since the exit latencies for L1 substates are not advertised by a device,
it is not clear in spec how to do a L1 substate exit latency check. We
assume that the L1 exit latencies advertised by a device include L1
substate latencies (and hence do not do any check). If that is not true,
we should do some sort of check here.
(I'm not clear about what that check should like currently. I'd be glad to
take up any suggestions).
Signed-off-by: Rajat Jain <rajatja@google.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Configure the L1 substate settings on the upstream and downstream devices,
while taking care of the rules dictated by the PCIe spec.
[bhelgaas: drop "inline"]
Signed-off-by: Rajat Jain <rajatja@google.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
The PCIe spec (r3.1, sec 7.33) says the L1 PM Substates Capability may be
implemented only in function 0.
Read the L1 substate capability structures of upstream and downstream
components of the link and set it up in the device structure.
[bhelgaas: add specific spec reference]
Signed-off-by: Rajat Jain <rajatja@google.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Add support for ASPM L1 substates. For details about L1 substates, see the
PCIe r3.1 spec, which includes the ECN below in secs 5.5 and 7.33.
Add macros for the 4 new L1 substates, and add a new ASPM "POWER_SUPERSAVE"
policy that can be used to enable L1 substates on a system if desired. The
new policy is in a sense, a superset of the existing POWERSAVE policy. The
4 policies are now:
DEFAULT: Reads and uses whatever ASPM states BIOS enabled
PERFORMANCE: Everything except L0 disabled.
POWERSAVE: L0s and L1 enabled (but not L1 substates)
POWER_SUPERSAVE: L0s + L1 + L1 substates also enabled
[bhelgaas: add PCIe r3.1 spec reference]
Link: https://pcisig.com/sites/default/files/specification_documents/ECN_L1_PM_Substates_with_CLKREQ_31_May_2013_Rev10a.pdf
Signed-off-by: Rajat Jain <rajatja@google.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Currently the Exynos PCIe driver only supports the Exynos5440 SoC.
Refactor the driver to allow support for other Exynos SoC.
Following are the main changes in this patch:
1) Add separate structs for memory, clock resources
Future Exynos SoC will have different hardware resources such as iomem,
clocks, regmap handles, etc., so keeping these resources in separate
structs will let us initialize them via per-SoC ops and avoid littering
the code with of_machine_is_compatible().
2) Add exynos_pcie_ops struct which will allow us to support the
differences in resources in different Exynos SoC.
No functional change intended.
Signed-off-by: Niyas Ahmed S T <niyas.ahmed@samsung.com>
Signed-off-by: Pankaj Dubey <pankaj.dubey@samsung.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Reviewed-by: Bartlomiej Zolnierkiewicz <b.zolnierkie@samsung.com>
Reviewed-by: Alim Akhtar <alim.akhtar@samsung.com>
Reviewed-by: Jaehoon Chung <jh80.chung@samsung.com>
Acked-by: Krzysztof Kozlowski <krzk@kernel.org>
Acked-by: Jingoo Han <jingoohan1@gmail.com>
If device doesn't support as many MSI vectors as the driver requested, we
previously returned -EINVAL from __pci_enable_msi_range() and
pci_enable_msi_range(). In other similar situations in both
__pci_enable_msi_range() and __pci_enable_msix_range(), we returned
-ENOSPC.
Return -ENOSPC from __pci_enable_msi_range() so we do it consistently.
[bhelgaas: changelog]
Signed-off-by: Dennis Chen <dennis.chen@arm.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
CC: Tejun Heo <tj@kernel.org>
CC: Christoph Hellwig <hch@lst.de>
CC: Tom Long Nguyen <tom.l.nguyen@intel.com>
CC: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
CC: Marc Zyngier <marc.zyngier@arm.com>
CC: Lorenzo Pieralisi <lorenzo.pieralisi@arm.com>
CC: Steve Capper <steve.capper@arm.com>
Use pci_irq_alloc_vectors() and greatly simplify the code by managing the
vector number for the subservices directly.
Signed-off-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
It seems like there are some devices (e.g. the PCIe root port driver) that
may not always have a INTx interrupt. Check for dev->irq before returning
a legacy interrupt in pci_irq_alloc_vectors to properly handle this case.
Signed-off-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
rockchip_pcie_probe() calls of_pci_get_host_bridge_resources() to parse
resources from DT and build a resource list. The caller is responsible for
disposing of the resource list. This is normally done by
pci_release_host_bridge_dev() when the host bridge is removed.
If the host bridge probe fails, dispose of the resource list in the probe
error path.
[bhelgaas: changelog]
Suggested-by: Bjorn Helgaas <bhelgaas@google.com>
Signed-off-by: Shawn Lin <shawn.lin@rock-chips.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
The devfn of 00:02.0 is 0x10. devfn_to_wslot(0x10) == 0x2, and
wslot_to_devfn(0x2) should be 0x10, while it's 0x2 in the current code.
Due to this, hv_eject_device_work() -> pci_get_domain_bus_and_slot()
returns NULL and pci_stop_and_remove_bus_device() is not called.
Later when the real device driver's .remove() is invoked by
hv_pci_remove() -> pci_stop_root_bus(), some warnings can be noticed
because the VM has lost the access to the underlying device at that
time.
Signed-off-by: Jake Oshins <jakeo@microsoft.com>
Signed-off-by: Dexuan Cui <decui@microsoft.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Acked-by: Haiyang Zhang <haiyangz@microsoft.com>
CC: stable@vger.kernel.org
CC: K. Y. Srinivasan <kys@microsoft.com>
CC: Stephen Hemminger <sthemmin@microsoft.com>
Function __pci_device_probe() tries to be careful about a PCI driver
probe() hook returning a positive value, but this is not really necessary,
since the same fix up is already done in local_pci_probe() (preceded by a
noisy warning), which renders this instance dead code.
Signed-off-by: Gabriel Krisman Bertazi <krisman@collabora.co.uk>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Per PCIe r3.1, sec 6.2.10 and sec 7.13.4, on Root Ports that support "RP
Extensions for DPC",
When the DPC Trigger Status bit is Set and the DPC RP Busy bit is Set,
software must leave the Root Port in DPC until the DPC RP Busy bit reads
0b.
Wait up to 1 second for the Root Port to become non-busy.
[bhelgaas: changelog, spec references]
Signed-off-by: Keith Busch <keith.busch@intel.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Decode the currently defined extended event reasons rather than just using
the generic "extended" explanation.
Signed-off-by: Keith Busch <keith.busch@intel.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Just call the msi_* version directly instead of having trivial wrappers for
one or two callsites.
Signed-off-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Reviewed-by: Thomas Gleixner <tglx@linutronix.de>
pci_msi_create_default_irq_domain() is never called in the whole tree, so
remove it as well as all the supporting code for a default PCI MSI domain.
Signed-off-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Reviewed-by: Thomas Gleixner <tglx@linutronix.de>
Remove support for vendor-defined messages which are not supported by AXI.
Signed-off-by: Bharat Kumar Gogada <bharatku@xilinx.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
If alloc_msi_entry() fails, we free resources and set ret = -ENOMEM.
However, msix_setup_entries() returns 0 unconditionally. Return the error
code instead.
Fixes: e75eafb9b0 ("genirq/msi: Switch to new irq spreading infrastructure")
Signed-off-by: Christophe JAILLET <christophe.jaillet@wanadoo.fr>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Make sure PCIe MPS settings are valid when we enumerate a new hierarchy.
Based-on-patch-by: Jon Mason <jon.mason@broadcom.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Every PCIe device can generate 5-bit transaction Tags, which allow up to 32
concurrent requests. Some devices can generate 8-bit Extended Tags, which
allow up to 256 concurrent requests.
Per the ECN mentioned below, all PCIe Receivers are expected to support
Extended Tags, so devices are allowed (but not required) to enable them by
default.
If a device supports Extended Tags but does not enable them by default,
enable them. This allows the device to have up to 256 outstanding
transactions at a time, which may improve performance.
[bhelgaas: changelog, check for PCIe device]
Link: https://pcisig.com/sites/default/files/specification_documents/ECN_Extended_Tag_Enable_Default_05Sept2008_final.pdf
Signed-off-by: Sinan Kaya <okaya@codeaurora.org>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
pci_fixup_irqs() is problematic because:
- it's called when we enumerate a host bridge, so we don't fixup IRQs for
hot-added PCI devices, and
- it fixes up IRQs for all PCI devices in the system, so if we call it
multiple times, e.g., if we have several host controllers, we may
reallocate an IRQ for a device after a driver has already claimed it.
We plan to replace pci_fixup_irqs() soon, but we still need it on ARM
because we don't have any other generic method for doing this.
On ARM64, we don't need pci_fixup_irqs() because we do IRQ setup when we
bind a driver to the device (in the pci_device_probe() ->
pcibios_alloc_irq() path).
pci-host-common.c is currently only used on ARM and ARM64. In principle,
it could be used on x86, and we wouldn't want pci_fixup_irqs() there
either, because x86 does IRQ setup in the pci_enable_device() path.
[bhelgaas: changelog, use #ifdef ARM, not #ifndef ARM64]
Signed-off-by: Dongdong Liu <liudongdong3@huawei.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Reviewed-by: Gabriele Paoloni <gabriele.paoloni@huawei.com>
Reviewed-by: Zhou Wang <wangzhou1@hisilicon.com>
Acked-by: Lorenzo Pieralisi <lorenzo.pieralisi@arm.com>
The PCIe Root Port in Hip06/Hip07 SoCs advertises an MSI capability, but it
cannot generate MSIs. It can transfer MSI/MSI-X from downstream devices,
but does not support MSI/MSI-X itself.
Add a quirk to prevent use of MSI/MSI-X by the Root Port.
[bhelgaas: changelog, sort vendor ID #define, drop device ID #define]
Signed-off-by: Dongdong Liu <liudongdong3@huawei.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Reviewed-by: Gabriele Paoloni <gabriele.paoloni@huawei.com>
Reviewed-by: Zhou Wang <wangzhou1@hisilicon.com>
There's nothing ACPI-specific about the config space accessors
hisi_pcie_acpi_rd_conf() and hisi_pcie_acpi_wr_conf(), and they're used for
both the ACPI and the DT driver model.
Rename them to hisi_pcie_rd_conf() and hisi_pcie_wr_conf(). No functional
change intended.
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
The Broadcom Northstar2 SoC has a number of quirks for the PAXC
(internal/fake) PCI bus. Specifically, the PCI config space is shared
between the root port and the first PF (ie., PF0), and a number of fields
are tied to zero (thus preventing them from being set). These cannot be
"fixed" in device firmware, so we must fix them with a quirk.
Signed-off-by: Jon Mason <jon.mason@broadcom.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Make sure PCIe MPS settings are valid when we enumerate a new hierarchy.
Based-on-patch-by: Jon Mason <jon.mason@broadcom.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Make sure PCIe MPS settings are valid when we enumerate a new hierarchy.
Based-on-patch-by: Jon Mason <jon.mason@broadcom.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Make sure PCIe MPS settings are valid when we enumerate a new hierarchy.
[bhelgaas: changelog]
Signed-off-by: Jon Mason <jon.mason@broadcom.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Acked-by: Ray Jui <ray.jui@broadcom.com>
Remove unnecessary local variables: elbi_base, phy_base, block_base. We
need one resource structure for assigning each resource. Reuse the single
'res' variable for all.
Signed-off-by: Jaehoon Chung <jh80.chung@samsung.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Reviewed-by: Pankaj Dubey <pankaj.dubey@samsung.com>
Acked-by: Krzysztof Kozlowski <krzk@kernel.org>
Acked-by: Jingoo Han <jingoohan1@gmail.com>
There is no reason to maintain *_blk/phy/elbi_* as register accessors.
They can be replaced by one accessor to make maintenance easier.
Signed-off-by: Jaehoon Chung <jh80.chung@samsung.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Reviewed-by: Pankaj Dubey <pankaj.dubey@samsung.com>
Acked-by: Krzysztof Kozlowski <krzk@kernel.org>
Acked-by: Jingoo Han <jingoohan1@gmail.com>
The current default of 20ms cause some devices, which are slow to
initialize, to not show up during the bus scanning. Change this to the
PCIe spec mandated 100ms and document this in the DT binding.
From PCIe base spec rev 3.0, chapter "6.6.1. Conventional Reset":
To allow components to perform internal initialization, system software
must wait a specified minimum period following the end of a Conventional
Reset of one or more devices before it is permitted to issue
Configuration Requests to those devices.
With a Downstream Port that does not support Link speeds greater than 5.0
GT/s, software must wait a minimum of 100 ms before sending a
Configuration Request to the device immediately below that Port.
Signed-off-by: Lucas Stach <l.stach@pengutronix.de>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Acked-by: Jason Cooper <jason@lakedaemon.net>
The conflict was an interaction between a bug fix in the
netvsc driver in 'net' and an optimization of the RX path
in 'net-next'.
Signed-off-by: David S. Miller <davem@davemloft.net>
The PCIe controller in HiSilicon Hip06/Hip07 SoCs is not completely
ECAM-compliant. It is non-ECAM only for the RC bus config space; for any
other bus underneath the root bus it does support ECAM access.
Add DT support for the almost-ECAM Hip06/Hip07 controllers.
[bhelgaas: drop dev->of_node test, driver name "hisi-pcie-almost-ecam"]
Signed-off-by: Dongdong Liu <liudongdong3@huawei.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Reviewed-by: Gabriele Paoloni <gabriele.paoloni@huawei.com>
Reviewed-by: Zhou Wang <wangzhou1@hisilicon.com>
The only way to call hisi_pcie_probe() is to match an entry in
hisi_pcie_of_match[], so match cannot be NULL.
Use of_device_get_match_data() to retrieve the soc_ops pointer. No
functional change intended.
[bhelgaas: use of_device_get_match_data(), changelog]
Based-on-suggestion-from: Geert Uytterhoeven <geert@linux-m68k.org>
Signed-off-by: Shailendra Verma <shailendra.v@samsung.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
The PCI core uses a fixed 50ms timeout when waiting for VPD accesses to
complete. When an access does not complete within this period, a warning
is logged and an error returned to the caller.
While this default timeout is valid for most hardware, some devices can
experience longer access delays under certain circumstances. For example,
one of the IBM CXL Flash devices can take up to ~120ms in a worst-case
scenario. These types of devices can benefit from an extended timeout.
To support devices with a longer access delay, increase the timeout in
pci_vpd_wait() to 125ms. The PCI specification is silent with respect to
VPD delays, therefore there is no concern for violating a threshold.
Tested-by: Uma Krishnan <ukrishn@linux.vnet.ibm.com>
Signed-off-by: Matthew R. Ochs <mrochs@linux.vnet.ibm.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Reviewed-by: Gavin Shan <gwshan@linux.vnet.ibm.com>
Bart reported a problem wіth an out of bounds access in the low-level IRQ
affinity code, which we root caused to the qla2xxx driver assigning all its
MSI-X vectors to the pre and post vectors, and not having any left for the
actually spread IRQs.
Fix this issue by not asking for affinity assignment when there are no
vectors to assign left.
Fixes: 402723ad5c ("PCI/MSI: Provide pci_alloc_irq_vectors_affinity()")
Link: https://lkml.kernel.org/r/1485359225.3093.3.camel@sandisk.com
Reported-by: Bart Van Assche <bart.vanassche@sandisk.com>
Tested-by: Bart Van Assche <bart.vanassche@sandisk.com>
Signed-off-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
The only way to call iproc_pcie_pltfm_probe() is to match an entry in
iproc_pcie_of_match_table[], so match cannot be NULL.
Use of_device_get_match_data() to retrieve the pcie->type. No functional
change intended.
Based-on-suggestion-from: Geert Uytterhoeven <geert@linux-m68k.org>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
The only way to call ls_pcie_probe() is to match an entry in
ls_pcie_of_match[], so match cannot be NULL.
Use of_device_get_match_data() to retrieve the drvdata pointer. No
functional change intended.
Based-on-suggestion-from: Geert Uytterhoeven <geert@linux-m68k.org>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
This is a DT-only driver, so the only way to call rcar_pcie_probe() is to
match an entry in rcar_pcie_of_match[], so of_id cannot be NULL.
Furthermore, of_id->data can only be NULL if an rcar_pcie_of_match[] entry
has a NULL .data member. That's a driver defect, and we don't want to
return -EINVAL, which is easy to ignore. We'd rather take the NULL pointer
dereference so we notice the problem and fix it.
Use of_device_get_match_data() to retrieve the hw_init_fn pointer. No
functional change intended.
Suggested-by: Geert Uytterhoeven <geert+renesas@glider.be>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Reviewed-by: Geert Uytterhoeven <geert+renesas@glider.be>
Acked-by: Simon Horman <horms+renesas@verge.net.au>
The "port" variable was allocated with devm_kzalloc() so if we free it with
kfree() it will be freed twice. Also I changed it to propogate the error
from devm_ioremap_resource() instead of returning -ENOMEM.
Fixes: c5d4603961 ("PCI: Add MCFG quirks for X-Gene host controller")
Also-posted-by: Shawn Lin <shawn.lin@rock-chips.com>
Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Acked-by: Tanmay Inamdar <tinamdar@apm.com>
This causes CPU hangs when the system is reset by the watchdog, as the GPRs
aren't cleared, but the clocks are back to disabled state.
If the bootloader uses PCIe, it must take care to bring it down into a safe
state, before passing control to the Linux kernel. This is the only way to
get a properly operating system at all times and circumstances.
Signed-off-by: Lucas Stach <l.stach@pengutronix.de>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
pci_lock is an IRQ-safe spinlock that protects all accesses to PCI
configuration space (see PCI_OP_READ() and PCI_OP_WRITE() in pci/access.c).
The pci_cfg_access_unlock() path acquires pci_lock, then p->pi_lock (inside
wake_up_all()). According to lockdep, there is a possible path involving
snbep_uncore_pci_read_counter() that could acquire them in the reverse
order: acquiring p->pi_lock, then pci_lock, which could result in a
deadlock. Lockdep details are in the bugzilla below.
Avoid the possible deadlock by dropping pci_lock before waking up any
config access waiters.
Link: https://bugzilla.kernel.org/show_bug.cgi?id=192901
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
When CONFIG_PM_SLEEP is disabled, we get harmless build warnings:
host/pcie-rockchip.c:1267:12: error: 'rockchip_pcie_resume_noirq' defined but not used [-Werror=unused-function]
host/pcie-rockchip.c:1240:12: error: 'rockchip_pcie_suspend_noirq' defined but not used [-Werror=unused-function]
Marking both functions as __maybe_unused avoids the warning without the
need for #ifdef around them.
Fixes: 013dd3d5e1 ("PCI: rockchip: Add system PM support")
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Acked-by: Shawn Lin <shawn.lin@rock-chips.com>
Use readl_poll_timeout() instead of open-coding it.
Signed-off-by: Shawn Lin <shawn.lin@rock-chips.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
The PCI core will write to the bridge window config multiple times while
they are enabled. This can lead to mbus failures like this:
mvebu_mbus: cannot add window '4:e8', conflicts with another window
mvebu-pcie mbus:pex@e0000000: Could not create MBus window at [mem 0xe0000000-0xe00fffff]: -22
For me this is happening during a hotplug cycle. The PCI core is not
changing the values, just writing them twice while active.
The patch addresses the general case of any change to an active window, but
not atomically. The code is slightly refactored so io and mem can share
more of the window logic.
Signed-off-by: Jason Gunthorpe <jgunthorpe@obsidianresearch.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Acked-by: Jason Cooper <jason@lakedaemon.net>
Rename the simple pointer name as "ep" instead of "exynos_pcie". After
applying this patch, it can save the 10 characthers within one line.
Signed-off-by: Jaehoon Chung <jh80.chung@samsung.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Reviewed-by: Pankaj Dubey <pankaj.dubey@samsung.com>
Acked-by: Krzysztof Kozlowski <krzk@kernel.org>
Acked-by: Jingoo Han <jingoohan1@gmail.com>
Intel 200-series chipsets have the same errata as 100-series: the ACS
capability doesn't follow the PCIe spec, the capability and control
registers are dwords rather than words. Add PCIe root port device IDs to
existing quirk.
Signed-off-by: Alex Williamson <alex.williamson@redhat.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
In a struct pcie_link_state, link->root points to the pcie_link_state of
the root of the PCIe hierarchy. For the topmost link, this points to
itself (link->root = link). For others, we copy the pointer from the
parent (link->root = link->parent->root).
Previously we recognized that Root Ports originated PCIe hierarchies, but
we treated PCI/PCI-X to PCIe Bridges as being in the middle of the
hierarchy, and when we tried to copy the pointer from link->parent->root,
there was no parent, and we dereferenced a NULL pointer:
BUG: unable to handle kernel NULL pointer dereference at 0000000000000090
IP: [<ffffffff9e424350>] pcie_aspm_init_link_state+0x170/0x820
Recognize that PCI/PCI-X to PCIe Bridges originate PCIe hierarchies just
like Root Ports do, so link->root for these devices should also point to
itself.
Fixes: 51ebfc92b7 ("PCI: Enumerate switches below PCI-to-PCIe bridges")
Link: https://bugzilla.kernel.org/show_bug.cgi?id=193411
Link: https://bugzilla.opensuse.org/show_bug.cgi?id=1022181
Tested-by: lists@ssl-mail.com
Tested-by: Jayachandran C. <jnair@caviumnetworks.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
CC: stable@vger.kernel.org # v4.2+
The conversion to the new hotplug state machine introduced a regression
where a successful hotplug registration would be treated as an error,
effectively disabling the MSI driver forever.
Fix it by doing the proper check on the return value.
Fixes: 9c248f8896 ("PCI/xgene-msi: Convert to hotplug state machine")
Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Acked-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de>
Tested-by: Duc Dang <dhdang@apm.com>
CC: Thomas Gleixner <tglx@linutronix.de>
CC: stable@vger.kernel.org
Since we need to change the implementation, stop exposing internals.
Provide kref_read() to read the current reference count; typically
used for debug messages.
Kills two anti-patterns:
atomic_read(&kref->refcount)
kref->refcount.counter
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: linux-kernel@vger.kernel.org
Signed-off-by: Ingo Molnar <mingo@kernel.org>
All multi-MSI allocations are now done through pci_irq_alloc_vectors(), so
remove the old pci_enable_msi_range() and pci_enable_msi_exact()
interfaces.
Signed-off-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
The pci-thunder-pem driver was initially developed for cn88xx SoCs. The
cn81xx and cn83xx members of the same family of SoCs have a slightly
different configuration of interrupt resources in the PEM hardware, which
prevents the INTA legacy interrupt source from functioning with the current
driver.
There are two fixes required:
1) Don't fixup the PME interrupt on the newer SoCs as it already has the
proper value.
2) Report MSI-X Capability Table Size of 2 for the newer SoCs, so the core
MSI-X code doesn't inadvertently clobber the INTA machinery that happens to
reside immediately following the table.
Signed-off-by: David Daney <david.daney@cavium.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Rockchip's RC produces a 100MHz reference clock but there are two methods
for the PHY to generate it:
(1) Use the system PLL to generate a 100MHz clock. The PHY will relock
it, filter signal noise, and output the reference clock. ASPM L0s
works correctly, but circuit noise issues make it difficult to pass
the TX compatibility test.
(2) Share the SoC's 24MHZ crystal oscillator with the PHY and force the
PHY's PLL to generate 100MHz internally. In this case, exit from
ASPM L0s sometimes fails due to a design error in the RC receiver
circuit. Even if we use extended-synch, the PHY sometimes fails to
relock the bits from FTS, which will hang the system.
We want the flexibility to use both clocking methods, so add a DT property,
"aspm-no-l0s". If that's present, disable L0s to avoid the issues with
case (2).
[bhelgaas: changelog]
Reported-by: Jeffy Chen <jeffy.chen@rock-chips.com>
Signed-off-by: Shawn Lin <shawn.lin@rock-chips.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Reviewed-by: Brian Norris <briannorris@chromium.org>
Acked-by: Rob Herring <robh@kernel.org>
Fix kernel-doc warnings in pci/msi.c:
..//drivers/pci/msi.c:623: warning: No description found for parameter 'affd'
..//drivers/pci/msi.c:623: warning: Excess function parameter 'affinity' description in 'msi_capability_init'
Signed-off-by: Randy Dunlap <rdunlap@infradead.org>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
When a PCI card is not connected, the following messages are seen on mx6:
imx6q-pcie 1ffc000.pcie: phy link never came up
imx6q-pcie 1ffc000.pcie: Link never came up
The first one comes from the pcie-designware.c core file, so remove
the redundant one from the imx6 driver.
Signed-off-by: Fabio Estevam <fabio.estevam@nxp.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Reviewed-by: Lucas Stach <l.stach@pengutronix.de>
ibm_apci_table_attr is not modified after being initialized by
ibm_acpiphp_init(). It is passed as an argument to the functions
sysfs_{remove/create}_bin_file(), but both the arguments are const.
Add __ro_after_init to its declaration.
[bhelgaas: changelog]
Signed-off-by: Bhumika Goyal <bhumirks@gmail.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Reviewed-by: Kees Cook <keescook@chromium.org>
R-Car PCIe does not support hotplug so it is appropriate to treat the
absence of a PCIe card as an -ENODEV error.
Signed-off-by: Harunobu Kurokawa <harunobu.kurokawa.dn@renesas.com>
[simon: updated changelog]
Signed-off-by: Simon Horman <horms+renesas@verge.net.au>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Add system PM support for Rockchip's RC. For pre S3, the EP is configured
into D3 state which guarantees the link state should be in L1. So we could
send PME_Turn_Off message to the EP and wait for its ACK to make the link
state into L2 or L3 without the aux-supply. This could help save more
power which I think should be very important for mobile devices.
As note that there is a 5s timeout for RC to wait for the PMA_ACK after
sending PME_Turn_Off. Technically it should depend on the hierarchy of
devices but seems PCIe core framework doesn't handle the L2/3 for S3 at
all. So that means we should presume to set a default value for PME_ACK.
From the bug report[1], we could find a statement that Microsoft Windows
versions typically wait for 5 seconds. So we are prone to take 5s for this
timeout here.
[1] https://lists.launchpad.net/kernel-packages/msg123315.html
Signed-off-by: Shawn Lin <shawn.lin@rock-chips.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Reviewed-by: Brian Norris <briannorris@chromium.org>
A PCI-to-PCIe bridge (a "reverse bridge") has a PCI or PCI-X primary
interface and a PCI Express secondary interface. The PCIe interface is a
Downstream Port that originates a Link. See the "PCI Express to PCI/PCI-X
Bridge Specification", rev 1.0, sections 1.2 and A.6.
The bug report below involves a PCI-to-PCIe bridge and a PCIe switch below
the bridge:
00:1e.0 Intel 82801 PCI Bridge to [bus 01-0a]
01:00.0 Pericom PI7C9X111SL PCIe-to-PCI Reversible Bridge to [bus 02-0a]
02:00.0 Pericom Device 8608 [PCIe Upstream Port] to [bus 03-0a]
03:01.0 Pericom Device 8608 [PCIe Downstream Port] to [bus 0a]
01:00.0 is configured as a PCI-to-PCIe bridge (despite the name printed by
lspci). As we traverse a PCIe hierarchy, device connections alternate
between PCIe Links and internal Switch logic. Previously we did not
recognize that 01:00.0 had a secondary link, so we thought the 02:00.0
Upstream Port *did* have a secondary link. In fact, it's the other way
around: 01:00.0 has a secondary link, and 02:00.0 has internal Switch logic
on its secondary side.
When we thought 02:00.0 had a secondary link, the pci_scan_slot() ->
only_one_child() path assumed 02:00.0 could have only one child, so 03:00.0
was the only possible downstream device. But 03:00.0 doesn't exist, so we
didn't look for any other devices on bus 03.
Booting with "pci=pcie_scan_all" is a workaround, but we don't want users
to have to do that.
Recognize that PCI-to-PCIe bridges originate links on their secondary
interfaces.
Link: https://bugzilla.kernel.org/show_bug.cgi?id=189361
Fixes: d0751b98df ("PCI: Add dev->has_secondary_link to track downstream PCIe links")
Tested-by: Blake Moore <blake.moore@men.de>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
CC: stable@vger.kernel.org # v4.2+
Previously we checked for iATU unroll support by reading PCIE_ATU_VIEWPORT
even on platforms, e.g., Keystone, that do not have ATU ports. This can
cause bad behavior such as asynchronous external aborts:
OF: PCI: MEM 0x60000000..0x6fffffff -> 0x60000000
Unhandled fault: asynchronous external abort (0x1211) at 0x00000000
pgd = c0003000
[00000000] *pgd=80000800004003, *pmd=00000000
Internal error: : 1211 [#1] PREEMPT SMP ARM
Modules linked in:
CPU: 0 PID: 1 Comm: swapper/0 Not tainted 4.9.0-00009-g6ff59d2-dirty #7
Hardware name: Keystone
task: eb878000 task.stack: eb866000
PC is at dw_pcie_setup_rc+0x24/0x380
LR is at ks_pcie_host_init+0x10/0x170
Move the dw_pcie_iatu_unroll_enabled() check so we only call it on
platforms that do not use the ATU. These platforms supply their own
->rd_other_conf() and ->wr_other_conf() methods.
[bhelgaas: changelog]
Fixes: a0601a4705 ("PCI: designware: Add iATU Unroll feature")
Fixes: 416379f9eb ("PCI: designware: Check for iATU unroll support after initializing host")
Tested-by: Kishon Vijay Abraham I <kishon@ti.com>
Signed-off-by: Murali Karicheri <m-karicheri2@ti.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Acked-By: Joao Pinto <jpinto@synopsys.com>
CC: stable@vger.kernel.org # v4.9+
Previously we didn't check the type of device before trying to apply Type 1
(PCI-X) or Type 2 (PCIe) Setting Records from _HPX.
We don't support PCI-X Setting Records, so this was harmless, but the
warning was useless.
We do support PCIe Setting Records, and we didn't check whether a device
was PCIe before applying settings. I don't think anything bad happened on
non-PCIe devices because pcie_capability_clear_and_set_word(),
pcie_cap_has_lnkctl(), etc., would fail before doing any harm. But it's
ugly to depend on those internals.
Check the device type before attempting to apply Type 1 and Type 2 Setting
Records (Type 0 records are applicable to PCI, PCI-X, and PCIe devices).
A side benefit is that this prevents useless "not supported" warnings when
a BIOS supplies a Type 1 (PCI-X) Setting Record and we try to apply it to
every single device:
pci 0000:00:00.0: PCI-X settings not supported
After this patch, we'll get the warning only when a BIOS supplies a Type 1
record and we have a PCI-X device to which it should be applied.
Link: https://bugzilla.kernel.org/show_bug.cgi?id=187731
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Remove res_to_dev_res() debug message. This is printed from a lookup
function. If the message is important, it should be printed from the
caller with more context.
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Acked-by: Yinghai Lu <yinghai@kernel.org>
ktime_set(S,N) was required for the timespec storage type and is still
useful for situations where a Seconds and Nanoseconds part of a time value
needs to be converted. For anything where the Seconds argument is 0, this
is pointless and can be replaced with a simple assignment.
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Cc: Peter Zijlstra <peterz@infradead.org>
This was entirely automated, using the script by Al:
PATT='^[[:blank:]]*#[[:blank:]]*include[[:blank:]]*<asm/uaccess.h>'
sed -i -e "s!$PATT!#include <linux/uaccess.h>!" \
$(git grep -l "$PATT"|grep -v ^include/linux/uaccess.h)
to do the replacement at the end of the merge window.
Requested-by: Al Viro <viro@zeniv.linux.org.uk>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Highlights include:
- Support for the kexec_file_load() syscall, which is a prereq for secure and
trusted boot.
- Prevent kernel execution of userspace on P9 Radix (similar to SMEP/PXN).
- Sort the exception tables at build time, to save time at boot, and store
them as relative offsets to save space in the kernel image & memory.
- Allow building the kernel with thin archives, which should allow us to build
an allyesconfig once some other fixes land.
- Build fixes to allow us to correctly rebuild when changing the kernel endian
from big to little or vice versa.
- Plumbing so that we can avoid doing a full mm TLB flush on P9 Radix.
- Initial stack protector support (-fstack-protector).
- Support for dumping the radix (aka. Linux) and hash page tables via debugfs.
- Fix an oops in cxl coredump generation when cxl_get_fd() is used.
- Freescale updates from Scott: "Highlights include 8xx hugepage support,
qbman fixes/cleanup, device tree updates, and some misc cleanup."
- Many and varied fixes and minor enhancements as always.
Thanks to:
Alexey Kardashevskiy, Andrew Donnellan, Aneesh Kumar K.V, Anshuman Khandual,
Anton Blanchard, Balbir Singh, Bartlomiej Zolnierkiewicz, Christophe Jaillet,
Christophe Leroy, Denis Kirjanov, Elimar Riesebieter, Frederic Barrat,
Gautham R. Shenoy, Geliang Tang, Geoff Levand, Jack Miller, Johan Hovold,
Lars-Peter Clausen, Libin, Madhavan Srinivasan, Michael Neuling, Nathan
Fontenot, Naveen N. Rao, Nicholas Piggin, Pan Xinhui, Peter Senna Tschudin,
Rashmica Gupta, Rui Teng, Russell Currey, Scott Wood, Simon Guo, Suraj
Jitindar Singh, Thiago Jung Bauermann, Tobias Klauser, Vaibhav Jain.
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1
iQIcBAABAgAGBQJYU4YSAAoJEFHr6jzI4aWAC4gQALtIAqqPon0Cd5b/FVVcMbW7
mMqB2b/0FGEl5GoRTzGUDaQqElilm6AEVfHO86C7DFji/a6olneFfw87iz+mtWuZ
JvrNq68ZiSnoeszdUy4MgtXFLb5sTzNMev4skaHfjI9E5CepWBoR0zH4G+kNVnd5
WSgudv8Cq4Px+MEuTOigt3QYjHzZ3cw/XNOOm9c+oGj+PDW4O9UItVI+S1WLoey4
rAB2nRcLMDPuwfRQC9XsF3zEbkv4h1dEXo/EBRuRpcF+0lLTzFw1lv1WE8OxlUmS
kAXbty3dIytBfSbtJT0c0Ps6sfQ4HFhu6ZV2fjnxNTz2KDkBIN7LBYHmBYiqY9oZ
9zvbUWtfiTu5ocfRtTq7rC/Hcj4Kbr9S9F/FvXR0WyDsKgu4xxAovqC3gcn6YjYK
Rr1tcCI4nUzyhVJVmd+OEhUvc5JbFy9aGage+YeOyejfvvSbXIunaxWlPjoDkvim
Vjl+UKU8gw51XFssqY5ZBi/HNlMFKYedLpMFp/fItnLglhj50V0eFWkpDgdSCYom
vo9ifPLZx8n8m8De3H7TV4E0F4gCHcTeqZdu7tW9AAUVM6iLJcDLm3asGmtNh21t
snOHNOJ5QSIno6ezUUg29T6VBjbPh46fdJJSlIZrEe8OzLZ1haGyttf0tD00PQvY
Z2W/m3gxafnOeGgBqvyv
=xOzf
-----END PGP SIGNATURE-----
Merge tag 'powerpc-4.10-1' of git://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux
Pull powerpc updates from Michael Ellerman:
"Highlights include:
- Support for the kexec_file_load() syscall, which is a prereq for
secure and trusted boot.
- Prevent kernel execution of userspace on P9 Radix (similar to
SMEP/PXN).
- Sort the exception tables at build time, to save time at boot, and
store them as relative offsets to save space in the kernel image &
memory.
- Allow building the kernel with thin archives, which should allow us
to build an allyesconfig once some other fixes land.
- Build fixes to allow us to correctly rebuild when changing the
kernel endian from big to little or vice versa.
- Plumbing so that we can avoid doing a full mm TLB flush on P9
Radix.
- Initial stack protector support (-fstack-protector).
- Support for dumping the radix (aka. Linux) and hash page tables via
debugfs.
- Fix an oops in cxl coredump generation when cxl_get_fd() is used.
- Freescale updates from Scott: "Highlights include 8xx hugepage
support, qbman fixes/cleanup, device tree updates, and some misc
cleanup."
- Many and varied fixes and minor enhancements as always.
Thanks to:
Alexey Kardashevskiy, Andrew Donnellan, Aneesh Kumar K.V, Anshuman
Khandual, Anton Blanchard, Balbir Singh, Bartlomiej Zolnierkiewicz,
Christophe Jaillet, Christophe Leroy, Denis Kirjanov, Elimar
Riesebieter, Frederic Barrat, Gautham R. Shenoy, Geliang Tang, Geoff
Levand, Jack Miller, Johan Hovold, Lars-Peter Clausen, Libin,
Madhavan Srinivasan, Michael Neuling, Nathan Fontenot, Naveen N.
Rao, Nicholas Piggin, Pan Xinhui, Peter Senna Tschudin, Rashmica
Gupta, Rui Teng, Russell Currey, Scott Wood, Simon Guo, Suraj
Jitindar Singh, Thiago Jung Bauermann, Tobias Klauser, Vaibhav Jain"
[ And thanks to Michael, who took time off from a new baby to get this
pull request done. - Linus ]
* tag 'powerpc-4.10-1' of git://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux: (174 commits)
powerpc/fsl/dts: add FMan node for t1042d4rdb
powerpc/fsl/dts: add sg_2500_aqr105_phy4 alias on t1024rdb
powerpc/fsl/dts: add QMan and BMan nodes on t1024
powerpc/fsl/dts: add QMan and BMan nodes on t1023
soc/fsl/qman: test: use DEFINE_SPINLOCK()
powerpc/fsl-lbc: use DEFINE_SPINLOCK()
powerpc/8xx: Implement support of hugepages
powerpc: get hugetlbpage handling more generic
powerpc: port 64 bits pgtable_cache to 32 bits
powerpc/boot: Request no dynamic linker for boot wrapper
soc/fsl/bman: Use resource_size instead of computation
soc/fsl/qe: use builtin_platform_driver
powerpc/fsl_pmc: use builtin_platform_driver
powerpc/83xx/suspend: use builtin_platform_driver
powerpc/ftrace: Fix the comments for ftrace_modify_code
powerpc/perf: macros for power9 format encoding
powerpc/perf: power9 raw event format encoding
powerpc/perf: update attribute_group data structure
powerpc/perf: factor out the event format field
powerpc/mm/iommu, vfio/spapr: Put pages on VFIO container shutdown
...