A surprise link down may retrain very quickly causing the same slot
generate a link up event before handling the link down event completes.
Since the link is active, the power off work queued from the first link
down will cause a second down event when power is disabled. However, the
link up event sets the slot state to POWERON_STATE before the event to
handle this is enqueued, making the second down event believe it needs to
do something.
This creates constant link up and down event cycle.
To prevent this it is better to handle each event at the time in order it
occurred, so change the driver to use ordered workqueue instead.
A normal device hotplug triggers two events (presense detect and link up)
that are already handled properly in the driver but we currently log an
error if we find an existing device in the slot. Since this is not an error
change the log level to be debug instead to avoid scaring users.
This is based on the original work by Ashok Raj.
Link: https://patchwork.kernel.org/patch/9469023
Suggested-by: Bjorn Helgaas <bhelgaas@google.com>
Signed-off-by: Mika Westerberg <mika.westerberg@linux.intel.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
In preparation for unconditionally passing the struct timer_list pointer to
all timer callbacks, switch to using the new timer_setup() and from_timer()
to pass the timer pointer explicitly. This fixes what appears to be a bug
in passing the wrong pointer to the timer handler (address of ctrl pointer
instead of ctrl pointer).
Signed-off-by: Kees Cook <keescook@chromium.org>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Cc: Mika Westerberg <mika.westerberg@linux.intel.com>
Cc: Mayurkumar Patel <mayurkumar.patel@intel.com>
Cc: Keith Busch <keith.busch@intel.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
When a power fault occurs, the power controller sets Power Fault Detected
in the Slot Status register, and pciehp_isr() queues an INT_POWER_FAULT
event to handle it.
It also clears Power Fault Detected, but since nothing has yet changed to
correct the power fault, the power controller will likely set it again
immediately, which may cause an infinite loop when pcie_isr() rechecks
Slot Status.
Fix that by masking off Power Fault Detected from new events if the driver
hasn't seen the power fault clear from the previous handling attempt.
Fixes: fad214b0aa ("PCI: pciehp: Process all hotplug events before looking for new ones")
Signed-off-by: Keith Busch <keith.busch@intel.com>
[bhelgaas: changelog, pull test out and add comment]
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Cc: Mayurkumar Patel <mayurkumar.patel@intel.com>
Cc: stable@vger.kernel.org # 4.9+
If Slot Status indicates changes in both Data Link Layer Status and
Presence Detect, prioritize the Link status change.
When both events are observed, pciehp currently relies on the Slot Status
Presence Detect State (PDS) to agree with the Link Status Data Link Layer
Active status. The Presence Detect State, however, may be set to 1 through
out-of-band presence detect even if the link is down, which creates
conflicting events.
Since the Link Status accurately reflects the reachability of the
downstream bus, the Link Status event should take precedence over a
Presence Detect event. Skip checking the PDC status if we handled a link
event in the same handler.
Signed-off-by: Ashok Raj <ashok.raj@intel.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Reviewed-by: Keith Busch <keith.busch@intel.com>
PCIe hotplug supports optional Attention and Power Indicators, which are
used internally by pciehp. Users can't control the Power Indicator, but
they can control the Attention Indicator by writing to a sysfs "attention"
file.
The Slot Control register has two bits for each indicator, and the PCIe
spec defines the encodings for each as (Reserved/On/Blinking/Off). For
sysfs "attention" writes, pciehp_set_attention_status() maps into these
encodings, so the only useful write values are 0 (Off), 1 (On), and 2
(Blinking).
However, some platforms use all four bits for platform-specific indicators,
and they need to allow direct user control of them while preventing pciehp
from using them at all.
Add a "hotplug_user_indicators" flag to the pci_dev structure. When set,
pciehp does not use either the Attention Indicator or the Power Indicator,
and the low four bits (values 0x0 - 0xf) of sysfs "attention" write values
are written directly to the Attention Indicator Control and Power Indicator
Control fields.
[bhelgaas: changelog, rename flag and accessors to s/attention/indicator/]
Signed-off-by: Keith Busch <keith.busch@intel.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Print slot name consistently as "Slot(%s)". I don't know whether that's
ideal, but we can at least do it the same way all the time. No functional
change intended.
Tested-by: Lukas Wunner <lukas@wunner.de>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Reviewed-by: Mika Westerberg <mika.westerberg@linux.intel.com>
In pcie_isr(), we return early if no status bits other than
PCI_EXP_SLTSTA_CC are set. This was introduced by dbd79aed1a ("pciehp:
fix NULL dereference in interrupt handler"), but it is no longer necessary
because all the subsequent pcie_isr() code is already predicated on a
status bit being set.
Remove the unnecessary test for ~PCI_EXP_SLTSTA_CC. No functional change
intended.
Tested-by: Lukas Wunner <lukas@wunner.de>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Reviewed-by: Mika Westerberg <mika.westerberg@linux.intel.com>
Previously we read Slot Status to learn about hotplug events, then cleared
the events, then re-read Slot Status to find out what happened. But Slot
Status might have changed before the second read.
Capture the Slot Status once before clearing the events. Also capture the
Link Status if we had a link status change.
[bhelgaas: changelog, split to separate patch]
Tested-by: Lukas Wunner <lukas@wunner.de>
Signed-off-by: Mayurkumar Patel <mayurkumar.patel@intel.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Reviewed-by: Mika Westerberg <mika.westerberg@linux.intel.com>
Previously we accumulated hotplug events, then processed them, essentially
like this:
events = 0
do {
status = read(Slot Status)
status &= EVENT_MASK # only look at events
events |= status # accumulate events
write(Slot Status, events) # clear events
} while (status)
process events
The problem is that as soon as we clear events in Slot Status, the hardware
may send notifications for new events, and we lose information about the
first events. For example, we might see two Presence Detect Changed
events, but lose the fact that the slot was temporarily empty:
read PCI_EXP_SLTSTA_PDC set, PCI_EXP_SLTSTA_PDS clear # slot empty
write PCI_EXP_SLTSTA_PDC # clear PDC event
read PCI_EXP_SLTSTA_PDC set, PCI_EXP_SLTSTA_PDS set # slot occupied
The current code does not process a removal; it only processes the
insertion, which fails because we didn't remove the original device.
To avoid this problem, read Slot Status once and process all the events
before reading it again, like this:
do {
read events
clear events
process events
} while (events)
[bhelgaas: changelog, add external loop around pciehp_isr()]
Tested-by: Lukas Wunner <lukas@wunner.de>
Signed-off-by: Mayurkumar Patel <mayurkumar.patel@intel.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Reviewed-by: Mika Westerberg <mika.westerberg@linux.intel.com>
After 1469d17dd3 ("PCI: pciehp: Handle invalid data when reading from
non-existent devices"), we returned IRQ_HANDLED when we failed to read
interrupt status from the bridge. I think it's better to return IRQ_NONE,
as we do in other cases where there's no interrupt pending. This will
facilitate refactoring the loop in pcie_isr(): we'll be able to call the
ISR in a loop as long as it returns IRQ_HANDLED.
Return IRQ_NONE if we couldn't read interrupt status.
Tested-by: Lukas Wunner <lukas@wunner.de>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Reviewed-by: Mika Westerberg <mika.westerberg@linux.intel.com>
Rename "detected" and "intr_loc" to "status" and "events" for clarity.
"status" is the value we read from the Slot Status register; "events" is
the set of hot-plug events we need to process. No functional change
intended.
Tested-by: Lukas Wunner <lukas@wunner.de>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Reviewed-by: Mika Westerberg <mika.westerberg@linux.intel.com>
If a hotplug port is suspended to D3cold, its slot status register cannot
be read. If that hotplug port happens to share its IRQ with other devices,
whenever an interrupt occurs for one of these devices, pciehp logs a
"no response from device" message and tries to read the PCI_EXP_SLTSTA
register, even though we know that will fail.
Ignore interrupts while we're in D3cold.
[bhelgaas: changelog]
Signed-off-by: Lukas Wunner <lukas@wunner.de>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
We queued interrupt events for the MRL being opened or closed, but the code
in interrupt_event_handler() that handles these events ignored them.
Stop enabling MRL interrupts and remove the ignored events.
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
It's platform-dependent, but an MMIO read to a non-existent PCI device
generally returns data with all bits set. This happens when the host
bridge or Root Complex times out waiting for a response from the device and
fabricates return data to complete the CPU's read.
One example, reported in the bugzilla below, involved this hierarchy:
pci 0000:00:1c.0: PCI bridge to [bus 02-3a] Root Port
pci 0000:02:00.0: PCI bridge to [bus 03-0a] Upstream Port
pci 0000:03:03.0: PCI bridge to [bus 05-07] Downstream Port
pci 0000:05:00.0: PCI bridge to [bus 06-07] Thunderbolt Upstream Port
pci 0000:06:00.0: PCI bridge to [bus 07] Thunderbolt Downstream Port
pci 0000:07:00.0: BCM57762 NIC
Unplugging the Thunderbolt switch and the NIC below it resulted in this:
pciehp 0000:03:03.0: Surprise Removal
tg3 0000:07:00.0: tg3_abort_hw timed out, TX_MODE_ENABLE will not clear MAC_TX_MODE=ffffffff
pciehp 0000:06:00.0: unloading service driver pciehp
pciehp 0000:06:00.0: pcie_isr: intr_loc 11f
pciehp 0000:06:00.0: Switch interrupt received
pciehp 0000:06:00.0: Latch open on Slot
pciehp 0000:06:00.0: Attention button interrupt received
pciehp 0000:06:00.0: Button pressed on Slot
pciehp 0000:06:00.0: Presence/Notify input change
pciehp 0000:06:00.0: Card present on Slot
pciehp 0000:06:00.0: Power fault interrupt received
pciehp 0000:06:00.0: Data Link Layer State change
pciehp 0000:06:00.0: Link Up event
The pciehp driver correctly noticed that the Thunderbolt switch (05:00.0
and 06:00.0) and NIC (07:00.0) had been removed, and it called their driver
remove methods.
Since the NIC was already gone, tg3 received 0xffffffff when it tried to
read from the device. The resulting timeout is a tg3 issue and not of
interest here.
Similarly, since the 06:00.0 Thunderbolt switch was already gone,
pcie_isr() received 0xffff when it tried to read PCI_EXP_SLTSTA, and pciehp
thought that was valid status showing that many events had happened: the
latch had been opened, the attention button had been pressed, a card was
now present, and the link was now up. These are all wrong, of course, but
pciehp went on to try to power up and enumerate devices below the
non-existent bridge:
pciehp 0000:06:00.0: PCI slot - powering on due to button press
pciehp 0000:06:00.0: Surprise Insertion
pci 0000:07:00.0 id reading try 50 times with interval 20 ms to get ffffffff
[bhelgaas: changelog, also check in pcie_poll_cmd() & pcie_do_write_cmd()]
Link: https://bugzilla.kernel.org/show_bug.cgi?id=99841
Suggested-by: Bjorn Helgaas <bhelgaas@google.com>
Signed-off-by: Jarod Wilson <jarod@redhat.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Move first slot status read into while to simplify code.
Signed-off-by: Yijing Wang <wangyijing@huawei.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
The pciehp_handle_*() functions (pciehp_handle_attention_button(), etc.)
only contain a line or two of useful code, so it's clumsy to put
them in separate functions. All they so is add an event to a work queue,
and it's clearer to see that directly in the ISR.
Inline them directly into pcie_isr(). No functional change.
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Reviewed-by: Rajat Jain <rajatja@google.com>
Acked-by: Yinghai Lu <yinghai@kernel.org>
The pciehp debug logging is overly verbose and often redundant. Almost all
of the information printed by dbg_ctrl() is also printed by the normal PCI
core enumeration code and by pcie_init().
Remove the redundant debug info.
When claiming a pciehp bridge, we print the slot characteristics, e.g.,
Slot #6 AttnBtn- AttnInd- PwrInd- PwrCtrl- MRL- Interlock- NoCompl+ LLActRep+
Add the Hot-Plug Capable and Hot-Plug Surprise bits to this information,
and print it all in the same order as lspci does.
No functional change except the message text changes.
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Reviewed-by: Rajat Jain <rajatja@google.com>
Acked-by: Yinghai Lu <yinghai@kernel.org>
The commit referenced below deferred waiting for command completion until
the start of the next command, allowing hardware to do the latching
asynchronously. Unfortunately, being ready to accept a new command is the
only indication we have that the previous command is completed. In cases
where we need that state change to be enabled, we must still wait for
completion. For instance, pciehp_reset_slot() attempts to disable anything
that might generate a surprise hotplug on slots that support presence
detection. If we don't wait for those settings to latch before the
secondary bus reset, we negate any value in attempting to prevent the
spurious hotplug.
Create a base function with optional wait and helper functions so that
pcie_write_cmd() turns back into the "safe" interface which waits before
and after issuing a command and add pcie_write_cmd_nowait(), which
eliminates the trailing wait for asynchronous completion. The following
functions are returned to their previous behavior:
pciehp_power_on_slot
pciehp_power_off_slot
pcie_disable_notification
pciehp_reset_slot
The rationale is that pciehp_power_on_slot() enables the link and therefore
relies on completion of power-on. pciehp_power_off_slot() and
pcie_disable_notification() need a wait because data structures may be
freed after these calls and continued signaling from the device would be
unexpected. And, of course, pciehp_reset_slot() needs to wait for the
scenario outlined above.
Fixes: 3461a06866 ("PCI: pciehp: Wait for hotplug command completion lazily")
Signed-off-by: Alex Williamson <alex.williamson@redhat.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
CC: stable@vger.kernel.org # v3.17+
During pciehp initialization, we previously wrote two hotplug commands:
pciehp_probe
pcie_init
pcie_disable_notification
pcie_write_cmd # command 1
pcie_init_notification
pcie_enable_notification
pcie_write_cmd # command 2
For controllers with errata like Intel CF118, we previously waited for a
timeout before issuing the second hotplug command because the first command
only updates interrupt enable bits and is not a "real" hotplug command, so
the controller doesn't report Command Completed for it.
But there's no need to disable notifications in the first place. If BIOS
left them enabled, we could easily take an interrupt before disabling them,
so there's no benefit in disabling them for the tiny window before we
enable them.
Drop the unnecessary pcie_disable_notification() call.
[bhelgaas: changelog]
Link: http://www.intel.com/content/www/us/en/processors/xeon/xeon-e7-v2-spec-update.html
Signed-off-by: Yinghai Lu <yinghai@kernel.org>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Add more Slot Control debug output and move one print after
pcie_write_cmd() to be consistent with other debug output.
Signed-off-by: Yinghai Lu <yinghai@kernel.org>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
When we warned about a timeout on a hotplug command, we previously printed
the time between calls to pcie_write_cmd(), without accounting for any time
spent actually waiting. Consider this sequence:
pcie_write_cmd
write SLTCTL
cmd_started = jiffies # T1
pcie_write_cmd
pcie_wait_cmd
now = jiffies # T2
wait_event_timeout # we may wait here
if (timeout)
ctrl_info("Timeout on command issued %u msec ago",
jiffies_to_msecs(now - cmd_started))
We previously printed (T2 - T1), but that doesn't include the time spent in
wait_event_timeout().
Fix this by using the current jiffies value, not the one cached before
calling wait_event_timeout().
[bhelgaas: changelog, use current jiffies instead of adding timeout]
Fixes: 40b960831c ("PCI: pciehp: Compute timeout from hotplug command start time")
Signed-off-by: Yinghai Lu <yinghai@kernel.org>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
pcie_poll_cmd() take msecs instead of jiffies, so convert timeout to msecs.
Fixes: 40b960831c ("PCI: pciehp: Compute timeout from hotplug command start time")
Signed-off-by: Yinghai Lu <yinghai@kernel.org>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
4283c70e91 ("PCI: pciehp: Make pcie_wait_cmd() self-contained") added
a cache of the most recent command written to the Slot Control register.
This register is only 16 bits wide, but the cache ("slot_ctrl") is 32 bits.
Reduce slot_ctrl to a u16 so it matches the register size. No functional
change.
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Powering off a hot-pluggable device, e.g., with pci_set_power_state(D3cold),
normally generates a hot-remove event that unbinds the driver.
Some drivers expect to remain bound to a device even while they power it
off and back on again. This can be dangerous, because if the device is
removed or replaced while it is powered off, the driver doesn't know that
anything changed. But some drivers accept that risk.
Add pci_ignore_hotplug() for use by drivers that know their device cannot
be removed. Using pci_ignore_hotplug() tells the PCI core that hot-plug
events for the device should be ignored.
The radeon and nouveau drivers use this to switch between a low-power,
integrated GPU and a higher-power, higher-performance discrete GPU. They
power off the unused GPU, but they want to remain bound to it.
This is a reimplementation of f244d8b623 ("ACPIPHP / radeon / nouveau:
Fix VGA switcheroo problem related to hotplug") but extends it to work with
both acpiphp and pciehp.
This fixes a problem where systems with dual GPUs using the radeon drivers
become unusable, freezing every few seconds (see bugzillas below). The
resume of the radeon device may also fail, e.g.,
This fixes problems on dual GPU systems where the radeon driver becomes
unusable because of problems while suspending the device, as in bug 79701:
[drm] radeon: finishing device.
radeon 0000:01:00.0: Userspace still has active objects !
radeon 0000:01:00.0: ffff8800cb4ec288 ffff8800cb4ec000 16384 4294967297 force free
...
WARNING: CPU: 0 PID: 67 at /home/apw/COD/linux/drivers/gpu/drm/radeon/radeon_gart.c:234 radeon_gart_unbind+0xd2/0xe0 [radeon]()
trying to unbind memory from uninitialized GART !
or while resuming it, as in bug 77261:
radeon 0000:01:00.0: ring 0 stalled for more than 10158msec
radeon 0000:01:00.0: GPU lockup ...
radeon 0000:01:00.0: GPU pci config reset
pciehp 0000:00:01.0:pcie04: Card not present on Slot(1-1)
radeon 0000:01:00.0: GPU reset succeeded, trying to resume
*ERROR* radeon: dpm resume failed
radeon 0000:01:00.0: Wait for MC idle timedout !
Link: https://bugzilla.kernel.org/show_bug.cgi?id=77261
Link: https://bugzilla.kernel.org/show_bug.cgi?id=79701
Reported-by: Shawn Starr <shawn.starr@rogers.com>
Reported-by: Jose P. <lbdkmjdf@sharklasers.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Acked-by: Alex Deucher <alexander.deucher@amd.com>
Acked-by: Rajat Jain <rajatxjain@gmail.com>
Acked-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Acked-by: Dave Airlie <airlied@redhat.com>
CC: stable@vger.kernel.org # v3.15+
During PCIe hot-plug initialization - pciehp_probe() - data structures
related to slot capabilities are set up. As part of this set up, ISRs are
put in place to handle slot events and all event bits are cleared out.
This patch adds the Data Link Layer State Changed (PCI_EXP_SLTSTA_DLLSC)
Slot Status bit to the event bits that are cleared out during
initialization.
If the BIOS doesn't clear DLLSC before handoff to the OS, pciehp notices
that it's set and interprets it as a new Link Up event, which results in
spurious messages:
pciehp 0000:82:04.0:pcie24: slot(4): Link Up event
pciehp 0000:82:04.0:pcie24: Device 0000:83:00.0 already exists at 0000:83:00, cannot hot-add
pciehp 0000:82:04.0:pcie24: Cannot add device at 0000:83:00
Prior to e48f1b67f6 ("PCI: pciehp: Use link change notifications for
hot-plug and removal"), pciehp ignored DLLSC.
Reference:
PCI-SIG. PCI Express Base Specification Revision 4.0 Version 0.3
(PCI-SIG, 2014): 7.8.11. Slot Status Register (Offset 1Ah).
[bhelgaas: add e48f1b67f6 ref and stable tag]
Fixes: e48f1b67f6 ("PCI: pciehp: Use link change notifications for hot-plug and removal")
Link: https://bugzilla.kernel.org/show_bug.cgi?id=79611
Signed-off-by: Myron Stowe <myron.stowe@redhat.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
CC: stable@vger.kernel.org # v3.15+
"no_cmd_complete" is only used once, and it duplicates read-only
information we already have in the cached Slot Capabilities value.
Remove the field and use the existing macro NO_CMD_CMPL() instead.
[bhelgaas: changelog]
Signed-off-by: Rajat Jain <rajatxjain@gmail.com>
Signed-off-by: Rajat Jain <rajatjain@juniper.net>
Signed-off-by: Guenter Roeck <groeck@juniper.net>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
We use incorrect logic to decide whether a PCIe hotplug controller
generates command completion events.
5808639bfa ("pciehp: fix slow probing") assumed that the Slot Status
"Command Completed" bit was set only for commands affecting slot power,
indicators, or electromechanical interlock. That assumption is false: per
sec. 6.7.3.2 of PCIe spec r3.0, a write targeting any portion of the Slot
Control register is a command, and (if command completed events are
supported) software must wait for a command to complete before issuing the
next command.
5808639bfa was to fix boot-time timeouts (see bugzilla below) on a Lenovo
Thinkpad R61 with an Intel hotplug controller. The controller probably has
the Intel CF118 erratum, which means it doesn't report Command Completed
unless the Slot Control power, indicator, or interlock bits are changed.
This causes a timeout because pciehp always waits for Command Complete (if
supported), regardless of which bits are changed.
Remove the incorrect logic because the timeouts have been addressed
differently by these changes:
PCI: pciehp: Wait for hotplug command completion lazily
PCI: pciehp: Compute timeout from hotplug command start time
Link: https://bugzilla.kernel.org/show_bug.cgi?id=10751
Tested-by: Rajat Jain <rajatxjain@gmail.com> (IDT 807a controller)
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Acked-by: Yinghai Lu <yinghai@kernel.org>
If we issue a hotplug command, go do something else, then come back and
wait for the command to complete, we don't have to wait the whole timeout
period, because some of it elapsed while we were doing something else.
Keep track of the time we issued the command, and wait only until the
timeout period from that point has elapsed.
For controllers with errata like Intel CF118, we previously timed out
before issuing the second hotplug command:
At time T1 (during boot):
- Write DLLSCE, ABPE, PDCE, etc. to Slot Control
At time T2 (hotplug event):
- Wait for command completion (CC) in Slot Status
- Timeout at T2 + 1 second because CC is never set in Slot Status
- Write PCC, PIC, etc. to Slot Control
With this change, we wait until T1 + 1 second instead of T2 + 1 second.
If the hotplug event is more than 1 second after the boot-time
initialization, we won't wait for the timeout at all.
We still emit a "Timeout on hotplug command" message if it timed out; we
should see this on the first hotplug event on every controller with this
erratum, as well as on real errors on controllers without the erratum.
Link: http://www.intel.com/content/www/us/en/processors/xeon/xeon-e7-v2-spec-update.html
Tested-by: Rajat Jain <rajatxjain@gmail.com> (IDT 807a controller)
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Acked-by: Yinghai Lu <yinghai@kernel.org>
Previously we issued a hotplug command and waited for it to complete. But
there's no need to wait until we're ready to issue the *next* command. The
next command will probably be much later, so the first one may have already
completed and we may not have to actually wait at all.
Because of hardware errata, some controllers generate command completion
events for some commands but not others. In the case of Intel CF118 (see
spec update reference), the controller indicates command completion only
for Slot Control writes that change the value of the following bits:
Power Controller Control
Power Indicator Control
Attention Indicator Control
Electromechanical Interlock Control
Changes to other bits, e.g., the interrupt enable bits, do not cause the
Command Completed bit to be set. Controllers from AMD and Nvidia are
reported to have similar errata.
These errata cause timeouts when pcie_enable_notification() enables
interrupts. Previously that timeout occurred at boot-time. With this
change, the timeout occurs later, when we change the state of the slot
power, indicators, or interlock. This speeds up boot but causes a timeout
at the first hotplug event on the slot. Subsequent events don't timeout
because only the first (boot-time) hotplug command updates Slot Control
without touching the power/indicator/interlock controls.
Link: http://www.intel.com/content/www/us/en/processors/xeon/xeon-e7-v2-spec-update.html
Tested-by: Rajat Jain <rajatxjain@gmail.com> (IDT 807a controller)
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Acked-by: Yinghai Lu <yinghai@kernel.org>
pcie_wait_cmd() waits for the controller to finish a hotplug command. Move
the associated logic (to determine whether waiting is required and whether
we're using interrupts or polling) from pcie_write_cmd() to
pcie_wait_cmd().
No functional change.
Tested-by: Rajat Jain <rajatxjain@gmail.com> (IDT 807a controller)
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Acked-by: Yinghai Lu <yinghai@kernel.org>
Merge quoted strings that are broken across lines into a single entity.
The compiler merges them anyway, but checkpatch complains about it, and
merging them makes it easier to grep for strings.
No functional change.
[bhelgaas: changelog, do the same for everything under drivers/pci]
Signed-off-by: Ryan Desfosses <ryan@desfo.org>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Fix various whitespace errors.
No functional change.
[bhelgaas: fix other similar problems]
Signed-off-by: Ryan Desfosses <ryan@desfo.org>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
In case of a spurious "cmd completed", pcie_write_cmd() does not clear it,
but yet expects more "cmd completed" events to be generated. This does not
happen because the previous (spurious) event has not been acknowledged.
Fix that.
Signed-off-by: Rajat Jain <rajatxjain@gmail.com>
Signed-off-by: Rajat Jain <rajatjain@juniper.net>
Signed-off-by: Guenter Roeck <groeck@juniper.net>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
In case a card is physically yanked out, it should immediately be removed,
regardless of the "surprise" capability bit. Thus:
- Always handle the physical removal - regardless of the "surprise" bit.
- Don't use "surprise" capability when making decisions about enabling
presence detect notifications.
- Reword the comments to indicate the intent.
Signed-off-by: Rajat Jain <rajatxjain@gmail.com>
Signed-off-by: Rajat Jain <rajatjain@juniper.net>
Signed-off-by: Guenter Roeck <groeck@juniper.net>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Today it is there is no protection around pciehp_enable_slot() and
pciehp_disable_slot() to ensure that they complete before another
hot-plug operation can be done on that particular slot.
This patch introduces the slot->hotplug_lock to ensure that any hotplug
operations (add / remove) complete before another hotplug event can begin
processing on that particular slot.
Signed-off-by: Rajat Jain <rajatxjain@gmail.com>
Signed-off-by: Rajat Jain <rajatjain@juniper.net>
Signed-off-by: Guenter Roeck <groeck@juniper.net>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Disable the link notification (in addition to presence detect
notifications) across the slot reset since the reset could flap the link,
and we don't want to treat it as hot unplug followed by a hotplug.
Signed-off-by: Rajat Jain <rajatxjain@gmail.com>
Signed-off-by: Rajat Jain <rajatjain@juniper.net>
Signed-off-by: Guenter Roeck <groeck@juniper.net>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
We need future link up events for hot-add, thus don't disable the link
permanently during device removal. Also, remove the static functions that
are now left unused.
This reverts part of 2debd92899 ("PCI: pciehp: Disable/enable link during
slot power off/on"). This was discussed at the URL below, where it was
revealed that it was done for a bug in a PCIe repeater chip on that
particular platform.
Link: https://lkml.kernel.org/r/CAErSpo72KZ-a2OSQLWoK71GCgwBt676XZdGt4tEYm-6UYnLmPw@mail.gmail.com
Signed-off-by: Rajat Jain <rajatxjain@gmail.com>
Signed-off-by: Rajat Jain <rajatjain@juniper.net>
Signed-off-by: Guenter Roeck <groeck@juniper.net>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Enable the Link state notifications unconditionally. Enable the
presence detection notification only if attention button is absent.
This was discussed at this thread:
https://lkml.kernel.org/r/529E5C0E.80903@gmail.com
Signed-off-by: Rajat Jain <rajatxjain@gmail.com>
Signed-off-by: Rajat Jain <rajatjain@juniper.net>
Signed-off-by: Guenter Roeck <groeck@juniper.net>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
A lot of systems do not have the fancy buttons and LEDs, and instead
want to rely only on the Link state change events to drive the hotplug
and removal state machinery.
(http://www.spinics.net/lists/hotplug/msg05802.html)
This patch adds support for that functionality. Here are the details
about the patch itself:
* Define and use interrupt events for linkup / linkdown.
* Make the pcie_isr() also look at link events, and direct control to
corresponding (new) link state change handler function.
* Introduce the functions to handle link-up and link-down events and
queue the add / removal work in the slot->wq to be processed by
pciehp_power_thread()
As a side note, this patch also fixes the bug
https://bugzilla.kernel.org/show_bug.cgi?id=65521 "pciehp ignores Data Link
Layer State Changed bit."
Signed-off-by: Rajat Jain <rajatxjain@gmail.com>
Signed-off-by: Rajat Jain <rajatjain@juniper.net>
Signed-off-by: Guenter Roeck <groeck@juniper.net>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
check_link_active() functionality needs to be used by subsequent patches
(that introduce link state change based hotplug). Thus make the function
non-static, and rename it to pciehp_check_link_active() so as to be
consistent with other non-static functions.
Signed-off-by: Rajat Jain <rajatxjain@gmail.com>
Signed-off-by: Rajat Jain <rajatjain@juniper.net>
Signed-off-by: Guenter Roeck <groeck@juniper.net>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Previously, the caller checked ATTN_LED() or PWR_LED() to see whether the
slot has indicators before setting the indicator state. That clutters the
caller unnecessarily, so this moves the test inside the callees. The test
may not even be necessary; per spec it should be harmless to try to turn on
a non-existent LED. But checking first does avoid unnecessary hotplug
commands.
No functional change.
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Add symbolic constants for the PCIe Slot Control indicator and power
control fields defined by spec and use them instead of open-coded hex
constants.
No functional change.
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
It's simpler to test the PCI_EXP_SLTSTA_PFD bit directly and to write the
constant back to PCI_EXP_SLTSTA.
No functional change.
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
We already have the vendor/device IDs from pci_setup_device(), so drop that
info and print things that will be more useful for debugging: the slot
number and presence of button/indicators/link active reporting/etc.
No functional change.
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
These functions:
pcie_enable_notification()
pciehp_power_off_slot()
pciehp_get_power_status()
pciehp_get_attention_status()
pciehp_set_attention_status()
pciehp_get_latch_status()
pciehp_get_adapter_status()
pcie_write_cmd()
now always return success, so this patch makes them void and drops the
error-checking code in their callers.
No functional change.
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
There's not much point in checking the return value from every config space
access because the only likely errors are design-time things like unaligned
accesses or invalid register numbers. The checking clutters the code
significantly, so this patch removes it.
No functional change.
Reference: http://lkml.kernel.org/r/CA+55aFzP4xEbcNmZ+MS0SQ3LrULzSq+dBiT_X9U-bPpR-Ukgrw@mail.gmail.com
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
The pciehp_readw() and pciehp_writew() wrappers only look up the pci_dev
and call the PCIe Capability accessors, so we can make things a little
more straightforward by just using the PCIe Capability accessors directly.
No functional change.
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Fix whitespace, capitalization, and spelling errors. No functional change.
I know "busses" is not an error, but "buses" was more common, so I used it
consistently.
Signed-off-by: Marta Rybczynska <rybczynska@gmail.com> (pci_reset_bridge_secondary_bus())
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Acked-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>