linux_dsm_epyc7002/arch/arm64/kernel
Cristian Marussi f50b7daccc arm64: smp: fix crash_smp_send_stop() behaviour
On a system configured to trigger a crash_kexec() reboot, when only one CPU
is online and another CPU panics while starting-up, crash_smp_send_stop()
will fail to send any STOP message to the other already online core,
resulting in fail to freeze and registers not properly saved.

Moreover even if the proper messages are sent (case CPUs > 2)
it will similarly fail to account for the booting CPU when executing
the final stop wait-loop, so potentially resulting in some CPU not
been waited for shutdown before rebooting.

A tangible effect of this behaviour can be observed when, after a panic
with kexec enabled and loaded, on the following reboot triggered by kexec,
the cpu that could not be successfully stopped fails to come back online:

[  362.291022] ------------[ cut here ]------------
[  362.291525] kernel BUG at arch/arm64/kernel/cpufeature.c:886!
[  362.292023] Internal error: Oops - BUG: 0 [#1] PREEMPT SMP
[  362.292400] Modules linked in:
[  362.292970] CPU: 3 PID: 0 Comm: swapper/3 Kdump: loaded Not tainted 5.6.0-rc4-00003-gc780b890948a #105
[  362.293136] Hardware name: Foundation-v8A (DT)
[  362.293382] pstate: 200001c5 (nzCv dAIF -PAN -UAO)
[  362.294063] pc : has_cpuid_feature+0xf0/0x348
[  362.294177] lr : verify_local_elf_hwcaps+0x84/0xe8
[  362.294280] sp : ffff800011b1bf60
[  362.294362] x29: ffff800011b1bf60 x28: 0000000000000000
[  362.294534] x27: 0000000000000000 x26: 0000000000000000
[  362.294631] x25: 0000000000000000 x24: ffff80001189a25c
[  362.294718] x23: 0000000000000000 x22: 0000000000000000
[  362.294803] x21: ffff8000114aa018 x20: ffff800011156a00
[  362.294897] x19: ffff800010c944a0 x18: 0000000000000004
[  362.294987] x17: 0000000000000000 x16: 0000000000000000
[  362.295073] x15: 00004e53b831ae3c x14: 00004e53b831ae3c
[  362.295165] x13: 0000000000000384 x12: 0000000000000000
[  362.295251] x11: 0000000000000000 x10: 00400032b5503510
[  362.295334] x9 : 0000000000000000 x8 : ffff800010c7e204
[  362.295426] x7 : 00000000410fd0f0 x6 : 0000000000000001
[  362.295508] x5 : 00000000410fd0f0 x4 : 0000000000000000
[  362.295592] x3 : 0000000000000000 x2 : ffff8000100939d8
[  362.295683] x1 : 0000000000180420 x0 : 0000000000180480
[  362.296011] Call trace:
[  362.296257]  has_cpuid_feature+0xf0/0x348
[  362.296350]  verify_local_elf_hwcaps+0x84/0xe8
[  362.296424]  check_local_cpu_capabilities+0x44/0x128
[  362.296497]  secondary_start_kernel+0xf4/0x188
[  362.296998] Code: 52805001 72a00301 6b01001f 54000ec0 (d4210000)
[  362.298652] SMP: stopping secondary CPUs
[  362.300615] Starting crashdump kernel...
[  362.301168] Bye!
[    0.000000] Booting Linux on physical CPU 0x0000000003 [0x410fd0f0]
[    0.000000] Linux version 5.6.0-rc4-00003-gc780b890948a (crimar01@e120937-lin) (gcc version 8.3.0 (GNU Toolchain for the A-profile Architecture 8.3-2019.03 (arm-rel-8.36))) #105 SMP PREEMPT Fri Mar 6 17:00:42 GMT 2020
[    0.000000] Machine model: Foundation-v8A
[    0.000000] earlycon: pl11 at MMIO 0x000000001c090000 (options '')
[    0.000000] printk: bootconsole [pl11] enabled
.....
[    0.138024] rcu: Hierarchical SRCU implementation.
[    0.153472] its@2f020000: unable to locate ITS domain
[    0.154078] its@2f020000: Unable to locate ITS domain
[    0.157541] EFI services will not be available.
[    0.175395] smp: Bringing up secondary CPUs ...
[    0.209182] psci: failed to boot CPU1 (-22)
[    0.209377] CPU1: failed to boot: -22
[    0.274598] Detected PIPT I-cache on CPU2
[    0.278707] GICv3: CPU2: found redistributor 1 region 0:0x000000002f120000
[    0.285212] CPU2: Booted secondary processor 0x0000000001 [0x410fd0f0]
[    0.369053] Detected PIPT I-cache on CPU3
[    0.372947] GICv3: CPU3: found redistributor 2 region 0:0x000000002f140000
[    0.378664] CPU3: Booted secondary processor 0x0000000002 [0x410fd0f0]
[    0.401707] smp: Brought up 1 node, 3 CPUs
[    0.404057] SMP: Total of 3 processors activated.

Make crash_smp_send_stop() account also for the online status of the
calling CPU while evaluating how many CPUs are effectively online: this way
the right number of STOPs is sent and all other stopped-cores's registers
are properly saved.

Fixes: 78fd584cde ("arm64: kdump: implement machine_crash_shutdown()")
Acked-by: Mark Rutland <mark.rutland@arm.com>
Signed-off-by: Cristian Marussi <cristian.marussi@arm.com>
Signed-off-by: Will Deacon <will@kernel.org>
2020-03-17 22:51:19 +00:00
..
probes arm64: remove __exception annotations 2019-10-28 11:22:38 +00:00
vdso arm64: vdso: Remove stale files from old assembly implementation 2019-10-07 11:07:16 +01:00
vdso32 kbuild: rename hostprogs-y/always to hostprogs/always-y 2020-02-04 01:53:07 +09:00
.gitignore
acpi_numa.c acpi: Create subtable parsing infrastructure 2019-04-04 18:41:12 +02:00
acpi_parking_protocol.c treewide: Replace GPLv2 boilerplate/reference with SPDX - rule 234 2019-06-19 17:09:07 +02:00
acpi.c arm64: acpi: fix DAIF manipulation with pNMI 2020-01-22 14:41:22 +00:00
alternative.c treewide: Replace GPLv2 boilerplate/reference with SPDX - rule 234 2019-06-19 17:09:07 +02:00
armv8_deprecated.c arm64: armv8_deprecated: update the comments of armv8_deprecated_init() 2020-01-08 17:29:41 +00:00
asm-offsets.c arm64: asm-offsets: add S_FP 2019-11-06 14:17:34 +00:00
cacheinfo.c arm64 updates for 5.3: 2019-07-08 09:54:55 -07:00
cpu_errata.c Merge branch 'for-next/errata' into for-next/core 2020-01-22 11:35:05 +00:00
cpu_ops.c treewide: Replace GPLv2 boilerplate/reference with SPDX - rule 234 2019-06-19 17:09:07 +02:00
cpu-reset.h treewide: Replace GPLv2 boilerplate/reference with SPDX - rule 500 2019-06-19 17:09:55 +02:00
cpu-reset.S arm64: kernel: avoid x18 in __cpu_soft_restart 2020-01-16 17:32:56 +00:00
cpufeature.c Merge branch 'for-next/rng' into for-next/core 2020-01-22 11:38:53 +00:00
cpuidle.c PSCI: cpuidle: Refactor CPU suspend power_state parameter handling 2019-08-09 17:51:39 +01:00
cpuinfo.c Merge branch 'for-next/rng' into for-next/core 2020-01-22 11:38:53 +00:00
crash_core.c
crash_dump.c treewide: Replace GPLv2 boilerplate/reference with SPDX - rule 500 2019-06-19 17:09:55 +02:00
debug-monitors.c arm64: Remove unneeded rcu_read_lock from debug handlers 2019-08-01 15:00:27 +01:00
efi-entry.S treewide: Replace GPLv2 boilerplate/reference with SPDX - rule 500 2019-06-19 17:09:55 +02:00
efi-header.S treewide: Replace GPLv2 boilerplate/reference with SPDX - rule 500 2019-06-19 17:09:55 +02:00
efi-rt-wrapper.S treewide: Replace GPLv2 boilerplate/reference with SPDX - rule 500 2019-06-19 17:09:55 +02:00
efi.c mm/pgtable: drop pgtable_t variable from pte_fn_t functions 2019-07-12 11:05:46 -07:00
entry-common.c arm64: entry: cleanup el0 svc handler naming 2020-01-17 13:22:14 +00:00
entry-fpsimd.S treewide: Replace GPLv2 boilerplate/reference with SPDX - rule 234 2019-06-19 17:09:07 +02:00
entry-ftrace.S arm64: ftrace: fix ifdeffery 2019-12-06 13:25:14 +00:00
entry.S Merge branch 'sched-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip 2020-01-28 10:07:09 -08:00
fpsimd.c arm64: nofpsmid: Handle TIF_FOREIGN_FPSTATE flag cleanly 2020-01-14 17:11:53 +00:00
ftrace.c arm64: ftrace: minimize ifdeffery 2019-11-06 14:17:36 +00:00
head.S Merge branches 'for-next/52-bit-kva', 'for-next/cpu-topology', 'for-next/error-injection', 'for-next/perf', 'for-next/psci-cpuidle', 'for-next/rng', 'for-next/smpboot', 'for-next/tbi' and 'for-next/tlbi' into for-next/core 2019-08-30 12:46:12 +01:00
hibernate-asm.S arm64: mm: Logic to make offset_ttbr1 conditional 2019-08-09 11:17:24 +01:00
hibernate.c arm64: hibernate: add trans_pgd public functions 2020-01-08 16:55:12 +00:00
hw_breakpoint.c Printk changes for 5.5 2019-11-25 19:40:40 -08:00
hyp-stub.S treewide: Replace GPLv2 boilerplate/reference with SPDX - rule 234 2019-06-19 17:09:07 +02:00
image-vars.h arm64/efi: Move variable assignments after SECTIONS 2019-08-14 17:18:15 +01:00
image.h arm64/efi: Move variable assignments after SECTIONS 2019-08-14 17:18:15 +01:00
insn.c arm64: insn: consistently handle exit text 2019-12-04 11:32:20 +00:00
io.c treewide: Replace GPLv2 boilerplate/reference with SPDX - rule 234 2019-06-19 17:09:07 +02:00
irq.c arm64 updates for 5.3: 2019-07-08 09:54:55 -07:00
jump_label.c treewide: Replace GPLv2 boilerplate/reference with SPDX - rule 234 2019-06-19 17:09:07 +02:00
kaslr.c arm64: Fix CONFIG_ARCH_RANDOM=n build 2020-02-11 09:47:01 +00:00
kexec_image.c arm64: kexec_file: add crash dump support 2020-01-08 17:05:23 +00:00
kgdb.c treewide: Replace GPLv2 boilerplate/reference with SPDX - rule 234 2019-06-19 17:09:07 +02:00
kuser32.S docs: arm: convert docs to ReST and rename to *.rst 2019-07-15 09:20:24 -03:00
machine_kexec_file.c arm64: kexec_file: add crash dump support 2020-01-08 17:05:23 +00:00
machine_kexec.c Revert "arm64: kexec: make dtb_mem always enabled" 2020-01-10 16:00:50 +00:00
Makefile arm64: entry: convert el1_sync to C 2019-10-28 11:22:47 +00:00
module-plts.c arm64: implement ftrace with regs 2019-11-06 14:17:35 +00:00
module.c arm64: implement ftrace with regs 2019-11-06 14:17:35 +00:00
module.lds
paravirt.c arm64: Retrieve stolen time as paravirtualized guest 2019-10-21 19:20:31 +01:00
pci.c pci-v5.3-changes 2019-07-15 20:44:49 -07:00
perf_callchain.c arm64: stacktrace: Factor out backtrace initialisation 2019-07-22 11:44:08 +01:00
perf_event.c arm64: perf: Simplify the ARMv8 PMUv3 event attributes 2019-11-01 14:51:19 +00:00
perf_regs.c
pointer_auth.c arm64: ptr auth: Move per-thread keys from thread_info to thread_struct 2018-12-13 16:42:47 +00:00
process.c arm64: ssbs: Fix context-switch when SSBS is present on all CPUs 2020-02-10 11:29:02 +00:00
psci.c arm64: psci: Reduce the waiting time for cpu_psci_cpu_kill() 2019-10-25 16:29:11 +01:00
ptrace.c arm64: ptrace: nofpsimd: Fail FP/SIMD regset operations 2020-01-14 17:11:39 +00:00
reloc_test_core.c treewide: Replace GPLv2 boilerplate/reference with SPDX - rule 500 2019-06-19 17:09:55 +02:00
reloc_test_syms.S treewide: Replace GPLv2 boilerplate/reference with SPDX - rule 500 2019-06-19 17:09:55 +02:00
relocate_kernel.S treewide: Replace GPLv2 boilerplate/reference with SPDX - rule 500 2019-06-19 17:09:55 +02:00
return_address.c arm64: unwind: Prohibit probing on return_address() 2019-08-01 15:00:26 +01:00
sdei.c firmware: arm_sdei: use common SMCCC_CONDUIT_* 2019-10-14 10:55:14 +01:00
setup.c TTY/Serial driver updates for 5.6-rc1 2020-01-29 10:13:27 -08:00
signal32.c arm64: signal: nofpsimd: Handle fp/simd context for signal frames 2020-01-14 17:11:46 +00:00
signal.c arm64: signal: nofpsimd: Handle fp/simd context for signal frames 2020-01-14 17:11:46 +00:00
sigreturn32.S arm64: compat: Split kuser32 2019-04-23 18:01:57 +01:00
sleep.S arm64: kernel: use aff3 instead of aff2 in comment 2019-06-04 14:51:01 +01:00
smccc-call.S treewide: Replace GPLv2 boilerplate/reference with SPDX - rule 174 2019-05-30 11:26:41 -07:00
smp_spin_table.c arm64: prefer __section from compiler_attributes.h 2019-08-13 18:32:15 +01:00
smp.c arm64: smp: fix crash_smp_send_stop() behaviour 2020-03-17 22:51:19 +00:00
ssbd.c Return ENODEV when the selected speculation misfeature is unsupported 2020-01-08 17:27:41 +00:00
stacktrace.c arm64: unwind: Prohibit probing on return_address() 2019-08-01 15:00:26 +01:00
suspend.c
sys32.c treewide: Replace GPLv2 boilerplate/reference with SPDX - rule 452 2019-06-19 17:09:08 +02:00
sys_compat.c arm64: Silence clang warning on mismatched value/register sizes 2019-10-28 09:13:21 +00:00
sys.c treewide: Replace GPLv2 boilerplate/reference with SPDX - rule 234 2019-06-19 17:09:07 +02:00
syscall.c arm64: entry: cleanup el0 svc handler naming 2020-01-17 13:22:14 +00:00
time.c arm64: time: Replace <linux/clk-provider.h> by <linux/of_clk.h> 2020-02-12 17:26:38 +00:00
topology.c Merge tag 'common/for-v5.4-rc1/cpu-topology' of git://git.kernel.org/pub/scm/linux/kernel/git/riscv/linux into for-next/cpu-topology 2019-08-14 10:07:00 +01:00
trace-events-emulation.h
traps.c sched/rt, arm64: Use CONFIG_PREEMPTION 2019-12-08 14:37:32 +01:00
vdso.c arm64: compat: VDSO setup for compat layer 2019-06-22 21:21:08 +02:00
vmlinux.lds.S arm64 updates for 5.5: 2019-12-06 14:18:01 -08:00