linux_dsm_epyc7002/kernel/time
Rafael J. Wysocki b7eaf1aab9 cpufreq: schedutil: Avoid reducing frequency of busy CPUs prematurely
The way the schedutil governor uses the PELT metric causes it to
underestimate the CPU utilization in some cases.

That can be easily demonstrated by running kernel compilation on
a Sandy Bridge Intel processor, running turbostat in parallel with
it and looking at the values written to the MSR_IA32_PERF_CTL
register.  Namely, the expected result would be that when all CPUs
were 100% busy, all of them would be requested to run in the maximum
P-state, but observation shows that this clearly isn't the case.
The CPUs run in the maximum P-state for a while and then are
requested to run slower and go back to the maximum P-state after
a while again.  That causes the actual frequency of the processor to
visibly oscillate below the sustainable maximum in a jittery fashion
which clearly is not desirable.

That has been attributed to CPU utilization metric updates on task
migration that cause the total utilization value for the CPU to be
reduced by the utilization of the migrated task.  If that happens,
the schedutil governor may see a CPU utilization reduction and will
attempt to reduce the CPU frequency accordingly right away.  That
may be premature, though, for example if the system is generally
busy and there are other runnable tasks waiting to be run on that
CPU already.

This is unlikely to be an issue on systems where cpufreq policies are
shared between multiple CPUs, because in those cases the policy
utilization is computed as the maximum of the CPU utilization values
over the whole policy and if that turns out to be low, reducing the
frequency for the policy most likely is a good idea anyway.  On
systems with one CPU per policy, however, it may affect performance
adversely and even lead to increased energy consumption in some cases.

On those systems it may be addressed by taking another utilization
metric into consideration, like whether or not the CPU whose
frequency is about to be reduced has been idle recently, because if
that's not the case, the CPU is likely to be busy in the near future
and its frequency should not be reduced.

To that end, use the counter of idle calls in the timekeeping code.
Namely, make the schedutil governor look at that counter for the
current CPU every time before its frequency is about to be reduced.
If the counter has not changed since the previous iteration of the
governor computations for that CPU, the CPU has been busy for all
that time and its frequency should not be decreased, so if the new
frequency would be lower than the one set previously, the governor
will skip the frequency update.

Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Acked-by: Viresh Kumar <viresh.kumar@linaro.org>
Reviewed-by: Joel Fernandes <joelaf@google.com>
2017-03-23 02:12:14 +01:00
..
alarmtimer.c sched/headers: Prepare for new header dependencies before moving code to <linux/sched/debug.h> 2017-03-02 08:42:34 +01:00
clockevents.c ktime: Get rid of the union 2016-12-25 17:21:22 +01:00
clocksource.c sched/clock, clocksource: Add optional cs::mark_unstable() method 2017-01-14 11:29:43 +01:00
hrtimer.c sched/headers: Prepare for new header dependencies before moving code to <linux/sched/debug.h> 2017-03-02 08:42:34 +01:00
itimer.c sched/headers: Prepare to move cputime functionality from <linux/sched.h> into <linux/sched/cputime.h> 2017-03-02 08:42:39 +01:00
jiffies.c jiffies: Revert bogus conversion of NSEC_PER_SEC to TICK_NSEC 2017-03-07 11:03:28 +01:00
Kconfig rcu: Drop RCU_USER_QS in favor of NO_HZ_FULL 2015-07-06 13:52:18 -07:00
Makefile time: Remove CONFIG_TIMER_STATS 2017-02-10 11:15:08 +01:00
ntp_internal.h ntp: Fix second_overflow's input parameter type to be 64bits 2015-12-16 16:50:56 -08:00
ntp.c ktime: Get rid of the union 2016-12-25 17:21:22 +01:00
posix-clock.c posix-clock: Fix return code on the poll method's error path 2015-12-29 11:33:06 +01:00
posix-cpu-timers.c sched/headers: Prepare to move cputime functionality from <linux/sched.h> into <linux/sched/cputime.h> 2017-03-02 08:42:39 +01:00
posix-stubs.c posix-timers: Make them configurable 2016-11-16 09:26:35 +01:00
posix-timers.c sched/headers: Prepare to move exit_files() and exit_itimers() from <linux/sched.h> to <linux/sched/task.h> 2017-03-02 08:42:40 +01:00
sched_clock.c sched/headers: Prepare for new header dependencies before moving code to <linux/sched/clock.h> 2017-03-02 08:42:27 +01:00
test_udelay.c time: Avoid timespec in udelay_test 2016-06-20 12:47:26 -07:00
tick-broadcast-hrtimer.c ktime: Get rid of the union 2016-12-25 17:21:22 +01:00
tick-broadcast.c Merge branch 'timers-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip 2017-02-20 10:06:32 -08:00
tick-common.c ktime: Cleanup ktime_set() usage 2016-12-25 17:21:22 +01:00
tick-internal.h timers: Forward the wheel clock whenever possible 2016-07-07 10:35:11 +02:00
tick-oneshot.c ktime: Get rid of the union 2016-12-25 17:21:22 +01:00
tick-sched.c cpufreq: schedutil: Avoid reducing frequency of busy CPUs prematurely 2017-03-23 02:12:14 +01:00
tick-sched.h Revert "nohz: Fix collision between tick and other hrtimers" 2017-02-16 12:19:18 -08:00
time.c time: Introduce jiffies64_to_nsecs() 2017-02-01 09:13:45 +01:00
timeconst.bc time: Introduce jiffies64_to_nsecs() 2017-02-01 09:13:45 +01:00
timeconv.c time: Add time64_to_tm() 2016-06-20 12:47:15 -07:00
timecounter.c clocksource: Use a plain u64 instead of cycle_t 2016-12-25 11:04:12 +01:00
timekeeping_debug.c timekeeping: Use deferred printk() in debug code 2017-02-15 11:46:08 +01:00
timekeeping_internal.h clocksource: Use a plain u64 instead of cycle_t 2016-12-25 11:04:12 +01:00
timekeeping.c sched/headers: Prepare for new header dependencies before moving code to <linux/sched/nmi.h> 2017-03-02 08:42:30 +01:00
timekeeping.h timekeeping: Remove unused timekeeping_{get,set}_tai_offset() 2017-01-06 16:47:28 -08:00
timer_list.c timer_list: Remove useless cast when printing 2017-02-10 11:15:08 +01:00
timer.c sched/headers: Prepare for new header dependencies before moving code to <linux/sched/debug.h> 2017-03-02 08:42:34 +01:00