mirror of
https://github.com/AuxXxilium/linux_dsm_epyc7002.git
synced 2024-12-13 18:16:41 +07:00
c469933e77
A ~10% regression has been reported for UnixBench's execl throughput test by Aaron Lu and Ye Xiaolong: https://lkml.org/lkml/2018/10/30/765 That test is pretty simple, it does a "recursive" execve() syscall on the same binary. Starting from the syscall, this sequence is possible: do_execve() do_execveat_common() __do_execve_file() sched_exec() select_task_rq_fair() <==| Task already enqueued find_idlest_cpu() find_idlest_group() capacity_spare_wake() <==| Functions not called from cpu_util_wake() | the wakeup path which means we can end up calling cpu_util_wake() not only from the "wakeup path", as its name would suggest. Indeed, the task doing an execve() syscall is already enqueued on the CPU we want to get the cpu_util_wake() for. The estimated utilization for a CPU computed in cpu_util_wake() was written under the assumption that function can be called only from the wakeup path. If instead the task is already enqueued, we end up with a utilization which does not remove the current task's contribution from the estimated utilization of the CPU. This will wrongly assume a reduced spare capacity on the current CPU and increase the chances to migrate the task on execve. The regression is tracked down to: commit |
||
---|---|---|
.. | ||
autogroup.c | ||
autogroup.h | ||
clock.c | ||
completion.c | ||
core.c | ||
cpuacct.c | ||
cpudeadline.c | ||
cpudeadline.h | ||
cpufreq_schedutil.c | ||
cpufreq.c | ||
cpupri.c | ||
cpupri.h | ||
cputime.c | ||
deadline.c | ||
debug.c | ||
fair.c | ||
features.h | ||
idle.c | ||
isolation.c | ||
loadavg.c | ||
Makefile | ||
membarrier.c | ||
pelt.c | ||
pelt.h | ||
psi.c | ||
rt.c | ||
sched-pelt.h | ||
sched.h | ||
stats.c | ||
stats.h | ||
stop_task.c | ||
swait.c | ||
topology.c | ||
wait_bit.c | ||
wait.c |