linux_dsm_epyc7002/kernel/sched
Peter Zijlstra aacedf26fb sched/core: Optimize try_to_wake_up() for local wakeups
Jens reported that significant performance can be had on some block
workloads by special casing local wakeups. That is, wakeups on the
current task before it schedules out.

Given something like the normal wait pattern:

	for (;;) {
		set_current_state(TASK_UNINTERRUPTIBLE);

		if (cond)
			break;

		schedule();
	}
	__set_current_state(TASK_RUNNING);

Any wakeup (on this CPU) after set_current_state() and before
schedule() would benefit from this.

Normal wakeups take p->pi_lock, which serializes wakeups to the same
task. By eliding that we gain concurrency on:

 - ttwu_stat(); we already had concurrency on rq stats, this now also
   brings it to task stats. -ENOCARE

 - tracepoints; it is now possible to get multiple instances of
   trace_sched_waking() (and possibly trace_sched_wakeup()) for the
   same task. Tracers will have to learn to cope.

Furthermore, p->pi_lock is used by set_special_state(), to order
against TASK_RUNNING stores from other CPUs. But since this is
strictly CPU local, we don't need the lock, and set_special_state()'s
disabling of IRQs is sufficient.

After the normal wakeup takes p->pi_lock it issues
smp_mb__after_spinlock(), in order to ensure the woken task must
observe prior stores before we observe the p->state. If this is CPU
local, this will be satisfied with a compiler barrier, and we rely on
try_to_wake_up() being a funcation call, which implies such.

Since, when 'p == current', 'p->on_rq' must be true, the normal wakeup
would continue into the ttwu_remote() branch, which normally is
concerned with exactly this wakeup scenario, except from a remote CPU.
IOW we're waking a task that is still running. In this case, we can
trivially avoid taking rq->lock, all that's left from this is to set
p->state.

This then yields an extremely simple and fast path for 'p == current'.

Reported-by: Jens Axboe <axboe@kernel.dk>
Tested-by: Jens Axboe <axboe@kernel.dk>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Qian Cai <cai@lca.pw>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: akpm@linux-foundation.org
Cc: gkohli@codeaurora.org
Cc: hch@lst.de
Cc: oleg@redhat.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2019-06-17 12:15:59 +02:00
..
autogroup.c sched/autogroup: Fix possible Spectre-v1 indexing for sched_prio_to_weight[] 2018-05-05 08:34:42 +02:00
autogroup.h sched/headers: Simplify and clean up header usage in the scheduler 2018-03-04 12:39:29 +01:00
clock.c treewide: Add SPDX license identifier for missed files 2019-05-21 10:50:45 +02:00
completion.c sched/Documentation: Update wake_up() & co. memory-barrier guarantees 2018-07-17 09:30:34 +02:00
core.c sched/core: Optimize try_to_wake_up() for local wakeups 2019-06-17 12:15:59 +02:00
cpuacct.c sched/headers: Simplify and clean up header usage in the scheduler 2018-03-04 12:39:29 +01:00
cpudeadline.c Linux 5.2-rc5 2019-06-17 12:12:27 +02:00
cpudeadline.h sched/headers: Simplify and clean up header usage in the scheduler 2018-03-04 12:39:29 +01:00
cpufreq_schedutil.c Driver core/kobject patches for 5.2-rc1 2019-05-07 13:01:40 -07:00
cpufreq.c sched/cpufreq: Annotate cpufreq_update_util_data pointer with __rcu 2019-04-03 12:34:31 +02:00
cpupri.c Linux 5.2-rc5 2019-06-17 12:12:27 +02:00
cpupri.h sched/headers: Simplify and clean up header usage in the scheduler 2018-03-04 12:39:29 +01:00
cputime.c treewide: Add SPDX license identifier for missed files 2019-05-21 10:50:45 +02:00
deadline.c sched/core: Provide a pointer to the valid CPU mask 2019-06-03 11:49:37 +02:00
debug.c sched/core: Remove sd->*_idx 2019-06-03 11:49:40 +02:00
fair.c sched/fair: Clean up definition of NOHZ blocked load functions 2019-06-17 12:15:57 +02:00
features.h sched/fair: Replace source_load() & target_load() with weighted_cpuload() 2019-06-03 11:49:39 +02:00
idle.c treewide: Add SPDX license identifier for missed files 2019-05-21 10:50:45 +02:00
isolation.c treewide: Add SPDX license identifier for missed files 2019-05-21 10:50:45 +02:00
loadavg.c sched: loadavg: make calc_load_n() public 2018-10-26 16:26:32 -07:00
Makefile psi: pressure stall information for CPU, memory, and IO 2018-10-26 16:26:32 -07:00
membarrier.c treewide: Replace GPLv2 boilerplate/reference with SPDX - rule 157 2019-05-30 11:26:37 -07:00
pelt.c sched/fair: Update scale invariance of PELT 2019-02-04 09:13:21 +01:00
pelt.h sched/fair: Update scale invariance of PELT 2019-02-04 09:13:21 +01:00
psi.c kernel/sched/psi.c: expose pressure metrics on root cgroup 2019-05-14 19:52:48 -07:00
rt.c sched/core: Provide a pointer to the valid CPU mask 2019-06-03 11:49:37 +02:00
sched-pelt.h sched/fair: Fix "runnable_avg_yN_inv" not used warnings 2019-06-17 12:15:58 +02:00
sched.h sched/core: Remove rq->cpu_load[] 2019-06-03 11:49:40 +02:00
stats.c proc: introduce proc_create_seq{,_data} 2018-05-16 07:23:35 +02:00
stats.h psi: make disabling/enabling easier for vendor kernels 2018-11-30 14:56:14 -08:00
stop_task.c sched: Clean up and harmonize the coding style of the scheduler code base 2018-03-03 15:50:21 +01:00
swait.c kernel/sched/: remove caller signal_pending branch predictions 2019-01-04 13:13:48 -08:00
topology.c sched/core: Remove sd->*_idx 2019-06-03 11:49:40 +02:00
wait_bit.c treewide: Add SPDX license identifier for missed files 2019-05-21 10:50:45 +02:00
wait.c treewide: Add SPDX license identifier for missed files 2019-05-21 10:50:45 +02:00