mirror of
https://github.com/AuxXxilium/linux_dsm_epyc7002.git
synced 2024-12-08 03:56:39 +07:00
7f65ea42eb
The util_avg signal computed by PELT is too variable for some use-cases. For example, a big task waking up after a long sleep period will have its utilization almost completely decayed. This introduces some latency before schedutil will be able to pick the best frequency to run a task.

The same issue can affect task placement. Indeed, since the task utilization is already decayed at wakeup, when the task is enqueued on a CPU, this can result in a CPU running a big task being temporarily represented as almost empty. This leads to a race condition where other tasks can potentially be allocated on a CPU which just started to run a big task which slept for a relatively long period.

Moreover, the PELT utilization of a task can be updated every [ms], thus making it a continuously changing value for certain longer running tasks. This means that the instantaneous PELT utilization of a RUNNING task is not really meaningful to properly support scheduler decisions.

For all these reasons, a more stable signal can do a better job of representing the expected/estimated utilization of a task/cfs_rq. Such a signal can be easily created on top of PELT by still using it as an estimator which produces values to be aggregated on meaningful events.

This patch adds a simple implementation of util_est, a new signal built on top of PELT's util_avg where:

    util_est(task) = max(task::util_avg, f(task::util_avg@dequeue))

This allows remembering how big a task has been reported to be by PELT in its previous activations via f(task::util_avg@dequeue), which is the new _task_util_est(struct task_struct*) function added by this patch.

If a task should change its behavior and it runs longer in a new activation, after a certain time its util_est will just track the original PELT signal (i.e. task::util_avg).

The estimated utilization of cfs_rq is defined only for root ones. That's because the only sensible consumers of this signal are the scheduler and schedutil when looking for the overall CPU utilization due to FAIR tasks.

For this reason, the estimated utilization of a root cfs_rq is simply defined as:

    util_est(cfs_rq) = max(cfs_rq::util_avg, cfs_rq::util_est::enqueued)

where:

    cfs_rq::util_est::enqueued = sum(_task_util_est(task))
                                 for each RUNNABLE task on that root cfs_rq

It's worth noting that the estimated utilization is tracked only for objects of interest, specifically:

 - Tasks: to better support task placement decisions
 - root cfs_rqs: to better support both task placement decisions as well as frequency selection

Signed-off-by: Patrick Bellasi <patrick.bellasi@arm.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Reviewed-by: Dietmar Eggemann <dietmar.eggemann@arm.com>
Cc: Joel Fernandes <joelaf@google.com>
Cc: Juri Lelli <juri.lelli@redhat.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Morten Rasmussen <morten.rasmussen@arm.com>
Cc: Paul Turner <pjt@google.com>
Cc: Rafael J . Wysocki <rafael.j.wysocki@intel.com>
Cc: Steve Muckle <smuckle@google.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Todd Kjos <tkjos@android.com>
Cc: Vincent Guittot <vincent.guittot@linaro.org>
Cc: Viresh Kumar <viresh.kumar@linaro.org>
Link: http://lkml.kernel.org/r/20180309095245.11071-2-patrick.bellasi@arm.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
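The two util_est definitions in the commit message can be illustrated with a small userspace model. This is only a sketch, not the kernel code: every name below (`task_model`, `model_dequeue`, and so on) is hypothetical, and the shape of f() is an assumption. The commit message does not spell out f(); here it is taken to be the larger of the last util_avg sample captured at dequeue and a moving average of such samples with a weight of 1/4.

```c
#include <assert.h>

/* Hypothetical userspace model; these are NOT kernel types or APIs. */
struct task_model {
	unsigned int util_avg;     /* current PELT utilization of the task */
	unsigned int est_enqueued; /* util_avg sampled at the last dequeue */
	unsigned int est_ewma;     /* moving average of the dequeue samples */
};

static unsigned int max_u(unsigned int a, unsigned int b)
{
	return a > b ? a : b;
}

/* Assumption: at dequeue, store the sample and fold 1/4 of the delta into the average. */
static void model_dequeue(struct task_model *t)
{
	int diff = (int)t->util_avg - (int)t->est_ewma;

	t->est_enqueued = t->util_avg;
	t->est_ewma = (unsigned int)((int)t->est_ewma + diff / 4);
}

/* f(task::util_avg@dequeue): what the task looked like in past activations. */
static unsigned int model_task_est_f(const struct task_model *t)
{
	return max_u(t->est_ewma, t->est_enqueued);
}

/* util_est(task) = max(task::util_avg, f(task::util_avg@dequeue)) */
static unsigned int model_task_util_est(const struct task_model *t)
{
	return max_u(t->util_avg, model_task_est_f(t));
}

/* util_est(cfs_rq) = max(cfs_rq::util_avg, sum of f() over the runnable tasks) */
static unsigned int model_rq_util_est(unsigned int rq_util_avg,
				      const struct task_model *tasks, int nr)
{
	unsigned int enqueued = 0;
	int i;

	for (i = 0; i < nr; i++)
		enqueued += model_task_est_f(&tasks[i]);

	return max_u(rq_util_avg, enqueued);
}
```

In this model, a task dequeued with util_avg = 800 whose PELT signal then decays to 50 during a long sleep still reports an estimate of 800 at wakeup. If it later runs long enough for util_avg to exceed the stored value, the estimate simply follows util_avg.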
93 lines
2.4 KiB
C
/* SPDX-License-Identifier: GPL-2.0 */
/*
 * Only give sleepers 50% of their service deficit. This allows
 * them to run sooner, but does not allow tons of sleepers to
 * rip the spread apart.
 */
SCHED_FEAT(GENTLE_FAIR_SLEEPERS, true)

/*
 * Place new tasks ahead so that they do not starve already running
 * tasks
 */
SCHED_FEAT(START_DEBIT, true)

/*
 * Prefer to schedule the task we woke last (assuming it failed
 * wakeup-preemption), since it's likely going to consume data we
 * touched, increases cache locality.
 */
SCHED_FEAT(NEXT_BUDDY, false)

/*
 * Prefer to schedule the task that ran last (when we did
 * wake-preempt) as that likely will touch the same data, increases
 * cache locality.
 */
SCHED_FEAT(LAST_BUDDY, true)

/*
 * Consider buddies to be cache hot, decreases the likeliness of a
 * cache buddy being migrated away, increases cache locality.
 */
SCHED_FEAT(CACHE_HOT_BUDDY, true)

/*
 * Allow wakeup-time preemption of the current task:
 */
SCHED_FEAT(WAKEUP_PREEMPTION, true)

SCHED_FEAT(HRTICK, false)
SCHED_FEAT(DOUBLE_TICK, false)
SCHED_FEAT(LB_BIAS, true)

/*
 * Decrement CPU capacity based on time not spent running tasks
 */
SCHED_FEAT(NONTASK_CAPACITY, true)

/*
 * Queue remote wakeups on the target CPU and process them
 * using the scheduler IPI. Reduces rq->lock contention/bounces.
 */
SCHED_FEAT(TTWU_QUEUE, true)

/*
 * When doing wakeups, attempt to limit superfluous scans of the LLC domain.
 */
SCHED_FEAT(SIS_AVG_CPU, false)
SCHED_FEAT(SIS_PROP, true)

/*
 * Issue a WARN when we do multiple update_rq_clock() calls
 * in a single rq->lock section. Default disabled because the
 * annotations are not complete.
 */
SCHED_FEAT(WARN_DOUBLE_CLOCK, false)

#ifdef HAVE_RT_PUSH_IPI
/*
 * In order to avoid a thundering herd attack of CPUs that are
 * lowering their priorities at the same time, and there being
 * a single CPU that has an RT task that can migrate and is waiting
 * to run, where the other CPUs will try to take that CPU's
 * rq lock and possibly create a large contention, sending an
 * IPI to that CPU and let that CPU push the RT task to where
 * it should go may be a better scenario.
 */
SCHED_FEAT(RT_PUSH_IPI, true)
#endif

SCHED_FEAT(RT_RUNTIME_SHARE, true)
SCHED_FEAT(LB_MIN, false)
SCHED_FEAT(ATTACH_AGE_LOAD, true)

SCHED_FEAT(WA_IDLE, true)
SCHED_FEAT(WA_WEIGHT, true)
SCHED_FEAT(WA_BIAS, true)

/*
 * UtilEstimation. Use estimated CPU utilization.
 */
SCHED_FEAT(UTIL_EST, false)
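
A list of `SCHED_FEAT(name, enabled)` entries like the one above is the "X-macro" pattern: the includer defines `SCHED_FEAT` one way to produce an enum of feature indices and another way to produce a default bitmask. The following is a self-contained userspace sketch of that idea, not the kernel's actual consumer code. The `MODEL_*` names, the three-entry subset and the helper functions are all hypothetical, and a local list macro stands in for including the header several times.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-in for the feature list: a short subset, same (name, enabled) shape. */
#define MODEL_FEATURES(X)		\
	X(GENTLE_FAIR_SLEEPERS, true)	\
	X(NEXT_BUDDY, false)		\
	X(UTIL_EST, false)

/* First expansion: one enum constant per feature, giving each a bit index. */
#define MODEL_AS_ENUM(name, enabled) MODEL_FEAT_##name,
enum model_feat {
	MODEL_FEATURES(MODEL_AS_ENUM)
	MODEL_FEAT_NR
};
#undef MODEL_AS_ENUM

/* Second expansion: OR together the bits of all features that default to true. */
#define MODEL_AS_BIT(name, enabled) ((unsigned long)(enabled) << MODEL_FEAT_##name) |
static unsigned long model_features = MODEL_FEATURES(MODEL_AS_BIT) 0UL;
#undef MODEL_AS_BIT

/* Test whether a feature bit is currently set. */
static bool model_feat_enabled(enum model_feat f)
{
	return (model_features & (1UL << f)) != 0;
}

/* Flip a feature at run time, e.g. to turn on a default-off feature. */
static void model_feat_set(enum model_feat f, bool on)
{
	if (on)
		model_features |= 1UL << f;
	else
		model_features &= ~(1UL << f);
}
```

With this subset, `model_feat_enabled(MODEL_FEAT_UTIL_EST)` is false until `model_feat_set(MODEL_FEAT_UTIL_EST, true)` is called, mirroring the `false` default that UTIL_EST has in the list above. Keeping the list in one place means adding a feature is a one-line change that every expansion picks up.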