linux_dsm_epyc7002/kernel
Peter Zijlstra 85f1abe001 kthread, sched/wait: Fix kthread_parkme() completion issue
Even with the wait-loop fixed, there is a further issue with
kthread_parkme(). Upon hotplug, when we do takedown_cpu(),
smpboot_park_threads() can return before all those threads are in fact
blocked, due to the placement of the complete() in __kthread_parkme().

When that happens, sched_cpu_dying() -> migrate_tasks() can end up
migrating such a still runnable task onto another CPU.

Normally the task will have hit schedule() and gone to sleep by the
time we do kthread_unpark(), which will then do __kthread_bind() to
re-bind the task to the correct CPU.

However, when we loose the initial TASK_PARKED store to the concurrent
wakeup issue described previously, do the complete(), get migrated, it
is possible to either:

 - observe kthread_unpark()'s clearing of SHOULD_PARK and terminate
   the park and set TASK_RUNNING, or

 - __kthread_bind()'s wait_task_inactive() to observe the competing
   TASK_RUNNING store.

Either way the WARN() in __kthread_bind() will trigger and fail to
correctly set the CPU affinity.

Fix this by only issuing the complete() when the kthread has scheduled
out. This does away with all the icky 'still running' nonsense.

The alternative is to promote TASK_PARKED to a special state, this
guarantees wait_task_inactive() cannot observe a 'stale' TASK_RUNNING
and we'll end up doing the right thing, but this preserves the whole
icky business of potentially migating the still runnable thing.

Reported-by: Gaurav Kohli <gkohli@codeaurora.org>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Oleg Nesterov <oleg@redhat.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2018-05-03 07:38:05 +02:00
..
bpf bpf: sockmap remove dead check 2018-04-20 22:09:51 +02:00
cgroup Merge branch 'for-4.17' of git://git.kernel.org/pub/scm/linux/kernel/git/tj/wq 2018-04-03 18:00:13 -07:00
configs
debug * Fix 2032 time access issues and new compiler warnings 2018-04-12 10:21:19 -07:00
events Various fixes in tracing: 2018-05-02 17:38:37 -10:00
gcov
irq genirq/affinity: Spread irq vectors among present CPUs as far as possible 2018-04-06 12:19:51 +02:00
livepatch livepatch: Allow to call a custom callback when freeing shadow variables 2018-04-17 13:42:48 +02:00
locking locking/rwsem: Add DEBUG_RWSEMS to look for lock/unlock mismatches 2018-03-31 07:30:50 +02:00
power PM / QoS: mark expected switch fall-throughs 2018-04-09 13:49:40 +02:00
printk New features: 2018-04-10 11:27:30 -07:00
rcu
sched kthread, sched/wait: Fix kthread_parkme() completion issue 2018-05-03 07:38:05 +02:00
time Revert: Unify CLOCK_MONOTONIC and CLOCK_BOOTTIME 2018-04-26 14:53:32 +02:00
trace Various fixes in tracing: 2018-05-02 17:38:37 -10:00
.gitignore
acct.c
async.c
audit_fsnotify.c
audit_tree.c
audit_watch.c
audit.c audit/stable-4.17 PR 20180403 2018-04-06 15:01:25 -07:00
audit.h
auditfilter.c
auditsc.c
backtracetest.c
bounds.c
capability.c
compat.c mm: add kernel_move_pages() helper, move compat syscall to mm/migrate.c 2018-04-02 20:15:32 +02:00
configs.c
context_tracking.c
cpu_pm.c
cpu.c cpu/hotplug: Fix unused function warning 2018-03-15 20:34:40 +01:00
crash_core.c kexec: export PG_swapbacked to VMCOREINFO 2018-04-13 17:10:27 -07:00
crash_dump.c
cred.c
delayacct.c
dma.c
elfcore.c
exec_domain.c
exit.c kernel: use kernel_wait4() instead of sys_wait4() 2018-04-02 20:14:51 +02:00
extable.c
fail_function.c error-injection: Fix to prohibit jump optimization 2018-03-12 16:16:00 +01:00
fork.c fork: unconditionally clear stack on fork 2018-04-20 17:18:35 -07:00
freezer.c
futex_compat.c
futex.c
groups.c
hung_task.c
irq_work.c
jump_label.c jump_label: Disable jump labels in __exit code 2018-03-20 08:57:17 +01:00
kallsyms.c
kcmp.c
Kconfig.freezer
Kconfig.hz
Kconfig.locks
Kconfig.preempt
kcov.c
kexec_core.c
kexec_file.c kernel/kexec_file.c: allow archs to set purgatory load address 2018-04-13 17:10:28 -07:00
kexec_internal.h
kexec.c kexec: call do_kexec_load() in compat syscall directly 2018-04-02 20:15:01 +02:00
kmod.c
kprobes.c kprobes: Fix random address output of blacklist file 2018-04-25 10:27:56 -04:00
ksysfs.c
kthread.c kthread, sched/wait: Fix kthread_parkme() completion issue 2018-05-03 07:38:05 +02:00
latencytop.c
Makefile
memremap.c
module_signing.c
module-internal.h
module.c module: Fix display of wrong module .text address 2018-04-18 22:59:46 +02:00
notifier.c
nsproxy.c
padata.c
panic.c taint: add taint for randstruct 2018-04-11 10:28:35 -07:00
params.c kernel/params.c: downgrade warning for unsafe parameters 2018-04-11 10:28:37 -07:00
pid_namespace.c Merge branch 'userns-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/ebiederm/user-namespace 2018-04-03 19:15:32 -07:00
pid.c xarray: add the xa_lock to the radix_tree_root 2018-04-11 10:28:39 -07:00
profile.c
ptrace.c
range.c
reboot.c
relay.c
resource.c resource: fix integer overflow at reallocation 2018-04-13 17:10:27 -07:00
seccomp.c
signal.c Merge branch 'next-general' of git://git.kernel.org/pub/scm/linux/kernel/git/jmorris/linux-security 2018-04-07 11:11:41 -07:00
smp.c
smpboot.c
smpboot.h
softirq.c
stacktrace.c
stop_machine.c stop_machine, sched: Fix migrate_swap() vs. active_balance() deadlock 2018-05-03 07:38:03 +02:00
sys_ni.c syscalls/core: Prepare CONFIG_ARCH_HAS_SYSCALL_WRAPPER=y for compat syscalls 2018-04-05 16:59:38 +02:00
sys.c kernel: add ksys_setsid() helper; remove in-kernel call to sys_setsid() 2018-04-02 20:16:06 +02:00
sysctl_binary.c staging: irda: remove remaining remants of irda code removal 2018-04-16 11:26:49 +02:00
sysctl.c kernel/sysctl.c: add kdoc comments to do_proc_do{u}intvec_minmax_conv_param 2018-04-11 10:28:38 -07:00
task_work.c
taskstats.c
test_kprobes.c
torture.c
tracepoint.c tracepoint: Do not warn on ENOMEM 2018-04-30 12:09:56 -04:00
tsacct.c
ucount.c headers: untangle kmemleak.h from mm.h 2018-04-05 21:36:27 -07:00
uid16.c fs: add do_fchownat(), ksys_fchown() helpers and ksys_{,l}chown() wrappers 2018-04-02 20:15:59 +02:00
uid16.h kernel: provide ksys_*() wrappers for syscalls called by kernel/uid16.c 2018-04-02 20:15:30 +02:00
umh.c kernel: use kernel_wait4() instead of sys_wait4() 2018-04-02 20:14:51 +02:00
up.c
user_namespace.c
user-return-notifier.c
user.c
utsname_sysctl.c
utsname.c uts: create "struct uts_namespace" from kmem_cache 2018-04-11 10:28:35 -07:00
watchdog_hld.c
watchdog.c
workqueue_internal.h
workqueue.c Merge branch 'for-4.17' of git://git.kernel.org/pub/scm/linux/kernel/git/tj/wq 2018-04-03 18:00:13 -07:00