linux_dsm_epyc7002/arch
Balbir Singh 4145f35864 powernv/kdump: Fix cases where the kdump kernel can get HMI's
Certain HMI's such as malfunction error propagate through
all threads/core on the system. If a thread was offline
prior to us crashing the system and jumping to the kdump
kernel, bad things happen when it wakes up due to an HMI
in the kdump kernel.

There are several possible ways to solve this problem

1. Put the offline cores in a state such that they are
not woken up for machine check and HMI errors. This
does not work, since we might need to wake up offline
threads to handle TB errors
2. Ignore HMI errors, setup HMEER to mask HMI errors,
but this still leads the window open for any MCEs
and masking them for the duration of the dump might
be a concern
3. Wake up offline CPUs, as in send them to
crash_ipi_callback (not wake them up as in mark them
online as seen by the hotplug). kexec does a
wake_online_cpus() call, this patch does something
similar, but instead sends an IPI and forces them to
crash_ipi_callback()

This patch takes approach #3.

Care is taken to enable this only for powenv platforms
via crash_wake_offline (a global value set at setup
time). The crash code sends out IPI's to all CPU's
which then move to crash_ipi_callback and kexec_smp_wait().

Signed-off-by: Balbir Singh <bsingharora@gmail.com>
Reviewed-by: Nicholas Piggin <npiggin@gmail.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2018-01-16 23:47:11 +11:00
..
alpha treewide: setup_timer() -> timer_setup() 2017-11-21 15:57:07 -08:00
arc ARC updates for 4.15-rc1 2017-11-25 08:21:54 -10:00
arm Merge branch 'fixes' of git://git.armlinux.org.uk/~rmk/linux-arm 2017-12-03 10:51:08 -05:00
arm64 arm64 fixes: 2017-12-01 19:37:03 -05:00
blackfin treewide: setup_timer() -> timer_setup() 2017-11-21 15:57:07 -08:00
c6x Kbuild updates for v4.15 2017-11-17 17:45:29 -08:00
cris pci-v4.15-changes 2017-11-15 15:01:28 -08:00
frv Kbuild updates for v4.15 2017-11-17 17:45:29 -08:00
h8300 mm, arch: remove empty_bad_page* 2017-11-15 18:21:03 -08:00
hexagon Kbuild updates for v4.15 2017-11-17 17:45:29 -08:00
ia64 arch/ia64/include/asm/topology.h: remove unused parent_node() macro 2017-11-17 16:10:04 -08:00
m32r m32r: fix endianness constraints 2017-11-15 18:21:00 -08:00
m68k m68k/macboing: Fix missed timer callback assignment 2017-11-24 16:19:40 +01:00
metag DeviceTree for 4.15: 2017-11-14 18:25:40 -08:00
microblaze Microblaze patch for 4.15-rc2 2017-11-29 14:19:22 -08:00
mips * x86 bugfixes: APIC, nested virtualization, IOAPIC 2017-11-30 08:15:19 -08:00
mn10300 bug: define the "cut here" string in a single place 2017-11-17 16:10:01 -08:00
nios2 DeviceTree for 4.15: 2017-11-14 18:25:40 -08:00
openrisc kmemcheck: remove annotations 2017-11-15 18:21:04 -08:00
parisc treewide: Switch DEFINE_TIMER callbacks to struct timer_list * 2017-11-21 15:57:05 -08:00
powerpc powernv/kdump: Fix cases where the kdump kernel can get HMI's 2018-01-16 23:47:11 +11:00
riscv RISC-V: Fixes for clean allmodconfig build 2017-12-01 13:31:31 -08:00
s390 * x86 bugfixes: APIC, nested virtualization, IOAPIC 2017-11-30 08:15:19 -08:00
score License cleanup: add SPDX license identifier to uapi header files with no license 2017-11-02 11:19:54 +01:00
sh treewide: setup_timer() -> timer_setup() 2017-11-21 15:57:07 -08:00
sparc Merge branch 'akpm' (patches from Andrew) 2017-11-29 19:12:44 -08:00
tile mm: switch to 'define pmd_write' instead of __HAVE_ARCH_PMD_WRITE 2017-11-29 18:40:42 -08:00
um This pull request contains the following core changes: 2017-11-22 20:46:06 -10:00
unicore32 kmemcheck: stop using GFP_NOTRACK and SLAB_NOTRACK 2017-11-15 18:21:04 -08:00
x86 * x86 bugfixes: APIC, nested virtualization, IOAPIC 2017-11-30 08:15:19 -08:00
xtensa libnvdimm for 4.15 2017-11-17 09:51:57 -08:00
.gitignore
Kconfig bpf: Revert bpf_overrid_function() helper changes. 2017-11-11 18:24:55 +09:00