linux_dsm_epyc7002/drivers/misc/cxl
Daniel Axtens 9d8e27673c cxl: Remove racy attempt to force EEH invocation in reset
cxl_reset currently PERSTs the slot, and then repeatedly tries to
read MMIO space in order to kick off EEH.

There are two problems with this: it's unnecessary, and it's racy.

It's unnecessary because the PERST will bring down the PHB link.
That will be picked up by the CAPP, which will send out an HMI.
Skiboot, noticing an HMI from the CAPP, will send an OPAL
notification to the kernel, which will trigger EEH recovery.

It's also racy: the EEH recovery triggered by the CAPP will
eventually cause the MMIO space to have its mapping invalidated
and the pointer NULLed out. This races with our attempt to read
the MMIO space. This is causing OOPSes in testing.

Simply drop all the attempts to force EEH detection, and trust
that Skiboot will send the notification and that we'll act on it.
Skiboot has sent this EEH notification for as long as CAPP recovery
has been supported, so we don't need to worry about breaking obscure
setups with ancient firmware.

Cc: Ryan Grimm <grimm@linux.vnet.ibm.com>
Cc: stable@vger.kernel.org
Fixes: 62fa19d4b4 ("cxl: Add ability to reset the card")
Signed-off-by: Daniel Axtens <dja@axtens.net>
Acked-by: Ian Munsie <imunsie@au1.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2015-08-27 13:51:36 +10:00
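
The race described in the commit message above can be illustrated with a small userspace model. This is only a sketch under stated assumptions: `struct toy_adapter`, `toy_eeh_recovery`, `toy_reset_racy` and `toy_reset_fixed` are hypothetical names invented for illustration and are not taken from the cxl driver. The model makes the NULL check explicit so that it can report the hazard instead of crashing.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for adapter state; not the driver's real structure. */
struct toy_adapter {
    volatile uint64_t *mmio; /* mapped register space; NULL once recovery unmaps it */
    bool perst_asserted;     /* records that the slot reset was requested */
};

/* Models EEH recovery: the mapping is invalidated and the pointer NULLed. */
static void toy_eeh_recovery(struct toy_adapter *a)
{
    assert(a != NULL);
    a->mmio = NULL;
}

/*
 * Models the removed pattern: reset, then read MMIO to provoke EEH.
 * recovery_wins_race simulates recovery running between the reset and
 * the read. Returns false where an unguarded read would hit NULL.
 */
static bool toy_reset_racy(struct toy_adapter *a, bool recovery_wins_race)
{
    assert(a != NULL);
    a->perst_asserted = true;
    if (recovery_wins_race)
        toy_eeh_recovery(a);
    if (a->mmio == NULL)
        return false;        /* an unguarded read here would dereference NULL */
    (void)*a->mmio;          /* the MMIO read meant to kick off EEH */
    return true;
}

/* Models the fixed pattern: request the reset and never touch MMIO. */
static void toy_reset_fixed(struct toy_adapter *a)
{
    assert(a != NULL);
    a->perst_asserted = true;
}
```

In this model `toy_reset_fixed` is safe regardless of when `toy_eeh_recovery` runs, because it never reads through `mmio`. That mirrors the reasoning in the commit message: once firmware is trusted to deliver the notification, the read serves no purpose.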
File      | Last commit | Date
api.c     | cxl: Allow release of contexts which have been OPENED but not STARTED | 2015-08-20 16:15:23 +10:00
base.c    | cxl: Move include file cxl.h -> cxl-base.h | 2015-06-03 13:27:19 +10:00
context.c | cxl: Add alternate MMIO error handling | 2015-08-18 19:34:43 +10:00
cxl.h     | cxl: Add alternate MMIO error handling | 2015-08-18 19:34:43 +10:00
debugfs.c | cxl: sparse: Silence iomem warning in debugfs file creation | 2015-08-12 14:49:29 +10:00
fault.c   | cxl: Only check pid for userspace contexts | 2015-06-03 13:27:18 +10:00
file.c    | cxl: Add alternate MMIO error handling | 2015-08-18 19:34:43 +10:00
irq.c     | cxl: Release irqs if memory allocation fails | 2015-08-27 13:51:18 +10:00
Kconfig   | cxl: Add CONFIG_CXL_EEH symbol | 2015-08-17 13:56:29 +10:00
main.c    | cxl: Destroy cxl_adapter_idr on module_exit | 2015-07-16 14:14:55 +10:00
Makefile  | cxl: Compile with -Werror | 2015-08-11 07:43:40 +10:00
native.c  | cxl: Allocate and release the SPA with the AFU | 2015-08-14 21:32:04 +10:00
pci.c     | cxl: Remove racy attempt to force EEH invocation in reset | 2015-08-27 13:51:36 +10:00
sysfs.c   | cxl: Allow the kernel to trust that an image won't change on PERST. | 2015-08-14 21:32:07 +10:00
trace.c   | cxl: Add tracepoints | 2015-01-22 17:31:51 +11:00
trace.h   | cxl: use more common format specifier | 2015-07-13 10:10:54 +10:00
vphb.c    | cxl: EEH support | 2015-08-14 21:32:08 +10:00