linux_dsm_epyc7002/kernel
Peter Zijlstra ecf7d01c22 sched/core: Fix an SMP ordering race in try_to_wake_up() vs. schedule()
Oleg noticed that its possible to falsely observe p->on_cpu == 0 such
that we'll prematurely continue with the wakeup and effectively run p on
two CPUs at the same time.

Even though the overlap is very limited; the task is in the middle of
being scheduled out; it could still result in corruption of the
scheduler data structures.

        CPU0                            CPU1

        set_current_state(...)

        <preempt_schedule>
          context_switch(X, Y)
            prepare_lock_switch(Y)
              Y->on_cpu = 1;
            finish_lock_switch(X)
              store_release(X->on_cpu, 0);

                                        try_to_wake_up(X)
                                          LOCK(p->pi_lock);

                                          t = X->on_cpu; // 0

          context_switch(Y, X)
            prepare_lock_switch(X)
              X->on_cpu = 1;
            finish_lock_switch(Y)
              store_release(Y->on_cpu, 0);
        </preempt_schedule>

        schedule();
          deactivate_task(X);
          X->on_rq = 0;

                                          if (X->on_rq) // false

                                          if (t) while (X->on_cpu)
                                            cpu_relax();

          context_switch(X, ..)
            finish_lock_switch(X)
              store_release(X->on_cpu, 0);

Avoid the load of X->on_cpu being hoisted over the X->on_rq load.

Reported-by: Oleg Nesterov <oleg@redhat.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Mike Galbraith <efault@gmx.de>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2015-12-04 10:26:43 +01:00
..
bpf
configs
debug
events Merge branch 'perf-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip 2015-11-15 09:36:24 -08:00
gcov
irq Merge branches 'irq-urgent-for-linus' and 'timers-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip 2015-11-15 09:30:48 -08:00
livepatch livepatch: x86: fix relocation computation with kASLR 2015-11-11 17:36:04 +01:00
locking
power
printk
rcu
sched sched/core: Fix an SMP ordering race in try_to_wake_up() vs. schedule() 2015-12-04 10:26:43 +01:00
time Merge branches 'irq-urgent-for-linus' and 'timers-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip 2015-11-15 09:30:48 -08:00
trace This contains three more clean up patches. 2015-11-12 16:22:54 -08:00
.gitignore
acct.c
async.c
audit_fsnotify.c
audit_tree.c
audit_watch.c
audit.c
audit.h
auditfilter.c
auditsc.c
backtracetest.c
bounds.c
capability.c
cgroup_freezer.c
cgroup_pids.c
cgroup.c
compat.c
configs.c
context_tracking.c context_tracking: avoid irq_save/irq_restore on guest entry and exit 2015-11-10 12:06:23 +01:00
cpu_pm.c
cpu.c
cpuset.c
crash_dump.c
cred.c
delayacct.c
dma.c
elfcore.c
exec_domain.c
exit.c
extable.c
fork.c
freezer.c
futex_compat.c
futex.c
groups.c
hung_task.c
irq_work.c
jump_label.c
kallsyms.c
kcmp.c
Kconfig.freezer
Kconfig.hz
Kconfig.locks
Kconfig.preempt
kexec_core.c
kexec_file.c
kexec_internal.h
kexec.c
kmod.c
kprobes.c
ksysfs.c
kthread.c
latencytop.c
Makefile
membarrier.c
memremap.c libnvdimm for 4.4: 2015-11-10 12:07:22 -08:00
module_signing.c
module-internal.h
module.c
notifier.c
nsproxy.c
padata.c
panic.c kernel/panic.c: turn off locks debug before releasing console lock 2015-11-20 16:17:32 -08:00
params.c Nothing exciting, minor tweaks and cleanups. 2015-11-09 15:53:39 -08:00
pid_namespace.c
pid.c
profile.c
ptrace.c
range.c
reboot.c
relay.c
resource.c
seccomp.c
signal.c kernel/signal.c: unexport sigsuspend() 2015-11-20 16:17:32 -08:00
smp.c
smpboot.c
smpboot.h
softirq.c
stacktrace.c
stop_machine.c
sys_ni.c
sys.c
sysctl_binary.c
sysctl.c
task_work.c
taskstats.c
test_kprobes.c
torture.c
tracepoint.c
tsacct.c
uid16.c
up.c
user_namespace.c
user-return-notifier.c
user.c
utsname_sysctl.c
utsname.c
watchdog.c
workqueue_internal.h
workqueue.c