linux_dsm_epyc7002/arch/powerpc
Gavin Shan d2b0f6f77e powerpc/eeh: No hotplug on permanently removed dev
The issue was detected in a bit complicated test case where
we have multiple hierarchical PEs shown as following figure:

                +-----------------+
                | PE#3     p2p#0  |
                |          p2p#1  |
                +-----------------+
                        |
                +-----------------+
                | PE#4     pdev#0 |
                |          pdev#1 |
                +-----------------+

PE#4 (have 2 PCI devices) is the child of PE#3, which has 2 p2p
bridges. We accidentally had less-known scenario: PE#4 was removed
permanently from the system because of permanent failure (e.g.
exceeding the max allowd failure times in last hour), then we detects
EEH errors on PE#3 and tried to recover it. However, eeh_dev instances
for pdev#0/1 were not detached from PE#4, which was still connected to
PE#3. All of that was because of the fact that we rely on count-based
pcibios_release_device(), which isn't reliable enough. When doing
recovery for PE#3, we still apply hotplug on PE#4 and pdev#0/1, which
are not valid any more. Eventually, we run into kernel crash.

The patch fixes above issue from two aspects. For unplug, we simply
skip those permanently removed PE, whose state is (EEH_PE_STATE_ISOLATED
&& !EEH_PE_STATE_RECOVERING) and its frozen count should be greater
than EEH_MAX_ALLOWED_FREEZES. For plug, we marked all permanently
removed EEH devices with EEH_DEV_REMOVED and return 0xFF's on read
its PCI config so that PCI core will omit them.

Signed-off-by: Gavin Shan <gwshan@linux.vnet.ibm.com>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2014-04-28 17:34:32 +10:00
..
boot powerpc: Bump BOOT_COMMAND_LINE_SIZE to 2048 2014-04-28 16:32:02 +10:00
configs Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs 2014-04-12 14:49:50 -07:00
crypto powerpc: Fix compile of sha1-powerpc-asm.S on 32-bit 2013-03-05 16:56:26 +11:00
include powerpc/eeh: No hotplug on permanently removed dev 2014-04-28 17:34:32 +10:00
kernel powerpc/eeh: No hotplug on permanently removed dev 2014-04-28 17:34:32 +10:00
kvm ppc/kvm: Clear the runlatch bit of a vcpu before napping 2014-04-28 16:32:49 +10:00
lib selftests/powerpc: Import Anton's memcpy / copy_tofrom_user tests 2014-03-07 15:53:12 +11:00
math-emu powerpc: Correct emulated mtfsf instruction 2014-04-07 10:33:11 +10:00
mm powerpc/mm: Fix tlbie to add AVAL fields for 64K pages 2014-04-28 13:11:24 +10:00
net net: filter: add jited flag to indicate jit compiled filters 2014-03-31 00:45:08 -04:00
oprofile cpufreq: remove unused notifier: CPUFREQ_{SUSPENDCHANGE|RESUMECHANGE} 2014-03-19 14:10:24 +01:00
perf powerpc/perf/hv-24x7: Catalog version number is be64, not be32 2014-04-28 16:31:50 +10:00
platforms powerpc/eeh: No hotplug on permanently removed dev 2014-04-28 17:34:32 +10:00
sysdev powerpc/4xx: Fix section mismatch in ppc4xx_pci.c 2014-04-28 16:32:53 +10:00
xmon powerpc: Fix xmon disassembler for little-endian 2014-03-07 15:50:12 +11:00
Kconfig Merge git://git.infradead.org/users/eparis/audit 2014-04-12 12:38:53 -07:00
Kconfig.debug Merge branch 'kconfig-diet' from Dave Hansen 2013-07-04 11:25:51 -07:00
Makefile powerpc/le: Avoid creatng R_PPC64_TOCSAVE relocations for modules. 2014-04-09 12:53:44 +10:00
relocs_check.pl Fix warning typo "CONFIG_RELCOATABLE" 2013-05-29 15:11:30 +02:00