linux_dsm_epyc7002/arch/powerpc
Michael Ellerman 875ebe940d powerpc/smp: Wait until secondaries are active & online
Anton has a busy ppc64le KVM box where guests sometimes hit the infamous
"kernel BUG at kernel/smpboot.c:134!" issue during boot:

  BUG_ON(td->cpu != smp_processor_id());

Basically, a per-CPU hotplug thread was scheduled on the wrong CPU. The oops
output confirms it:

  CPU: 0
  Comm: watchdog/130

The problem is that we aren't ensuring the CPU active bit is set for the
secondary before allowing the master to continue on. The master unparks
the secondary CPU's kthreads and the scheduler looks for a CPU to run
on. It calls select_task_rq() and realises the suggested CPU is not in
the cpus_allowed mask. It then ends up in select_fallback_rq(), and
since the active bit isn't set we choose some other CPU to run on.
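
The placement decision described above can be sketched in plain userspace C. This is only a simplified model, not the scheduler's code: `demo_masks`, `demo_select_cpu()` and `DEMO_NR_CPUS` are hypothetical names, and the real select_fallback_rq() considers more than the active bit.

```c
#include <stdbool.h>

#define DEMO_NR_CPUS 4	/* hypothetical, small for illustration */

/* Hypothetical stand-in for the kernel's active mask. */
struct demo_masks {
	bool active[DEMO_NR_CPUS];
};

/*
 * Keep the suggested CPU if its active bit is set, otherwise fall back
 * to the first active CPU. Returns -1 if no CPU is active.
 */
static int demo_select_cpu(const struct demo_masks *m, int suggested)
{
	if (m->active[suggested])
		return suggested;

	for (int cpu = 0; cpu < DEMO_NR_CPUS; cpu++)
		if (m->active[cpu])
			return cpu;

	return -1;
}
```

With only CPU 0 active, asking for CPU 3 yields CPU 0. In this model that is the situation in the oops: a per-CPU thread bound to one CPU ends up running on another.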

This seems to have been introduced by 6acbfb9697 "sched: Fix hotplug
vs. set_cpus_allowed_ptr()", which changed from setting active before
online to setting active after online. However that was in turn fixing a
bug where other code assumed an active CPU was also online, so we can't
just revert that fix.

The simplest fix is just to spin waiting for both active & online to be
set. We already have a barrier prior to set_cpu_online() (which also
sets active), to ensure all other setup is completed before online &
active are set.
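
To illustrate why waiting on online alone is not enough, here is a minimal single-threaded model of the wait, assuming the secondary sets online one step before active. It is not the patch itself: `demo_cpu`, `demo_secondary_step()` and `demo_wait()` are hypothetical names, and the step function stands in for the other CPU making progress while the master spins in cpu_relax().

```c
#include <stdbool.h>

/* Hypothetical model of one secondary CPU's bring-up state. */
struct demo_cpu {
	int step;
	bool online;
	bool active;
};

/* Advance the secondary by one bring-up step: setup, online, then active. */
static void demo_secondary_step(struct demo_cpu *c)
{
	switch (c->step++) {
	case 0:			/* still finishing other setup */
		break;
	case 1:
		c->online = true;
		break;
	case 2:
		c->active = true;
		break;
	default:
		break;
	}
}

/*
 * Master-side wait. Spins until online is set and, if wait_for_active
 * is true, until active is set as well. Returns whether the active bit
 * was set at the moment the master continued.
 */
static bool demo_wait(struct demo_cpu *c, bool wait_for_active)
{
	while (!c->online || (wait_for_active && !c->active))
		demo_secondary_step(c);

	return c->active;
}
```

Calling demo_wait() with wait_for_active set to false returns while active is still clear, which is the window in which the master can unpark kthreads too early. With it set to true, the master only continues once both bits are set.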

Fixes: 6acbfb9697 ("sched: Fix hotplug vs. set_cpus_allowed_ptr()")
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Signed-off-by: Anton Blanchard <anton@samba.org>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2015-03-04 13:19:33 +11:00
boot powerpc: dts: pq3/85xx: Fix GPIO address 2015-01-29 23:04:32 -06:00
configs The clock framework changes for 3.20 contain the usual driver additions, 2015-02-21 12:30:30 -08:00
crypto crypto: add missing crypto module aliases 2015-01-13 22:29:11 +11:00
include powerpc: Re-enable dynticks 2015-02-23 14:52:04 +11:00
kernel powerpc/smp: Wait until secondaries are active & online 2015-03-04 13:19:33 +11:00
kvm Tighten rules for ACCESS_ONCE 2015-02-14 10:54:28 -08:00
lib powerpc/lib: Makefile, use obj64-y to consolidate 64-bit rules 2015-01-28 15:00:24 +11:00
math-emu powerpc: Correct emulated mtfsf instruction 2014-04-07 10:33:11 +10:00
mm powerpc: drop _PAGE_FILE and pte_file()-related helpers 2015-02-16 17:56:05 -08:00
net module: remove mod arg from module_free, rename module_memfree(). 2015-01-20 11:38:33 +10:30
oprofile powerpc updates for 3.19 2014-12-11 17:48:14 -08:00
perf Merge branch 'next' of git://git.kernel.org/pub/scm/linux/kernel/git/scottwood/linux into next 2015-02-04 12:03:21 +11:00
platforms The clock framework changes for 3.20 contain the usual driver additions, 2015-02-21 12:30:30 -08:00
sysdev powerpc: use %*pb[l] to print bitmaps including cpumasks and nodemasks 2015-02-13 21:21:36 -08:00
xmon powerpc updates for 3.20 2015-02-11 18:15:38 -08:00
Kconfig powerpc/mm: fix undefined reference to `.__kernel_map_pages' on FSL PPC64 2015-01-28 14:22:22 +11:00
Kconfig.debug Patch queue for ppc - 2014-08-01 2014-08-05 09:58:11 +02:00
Makefile kbuild: do not add $(call ...) to invoke cc-version or cc-fullversion 2015-01-09 17:25:44 +01:00
relocs_check.pl Fix warning typo "CONFIG_RELCOATABLE" 2013-05-29 15:11:30 +02:00