linux_dsm_epyc7002/kernel/time
Nicolas Saenz Julienne 0ff2ea9d8f timers: Fix get_next_timer_interrupt() with no timers pending
[ Upstream commit aebacb7f6ca1926918734faae14d1f0b6fae5cb7 ]

31cd0e119d ("timers: Recalculate next timer interrupt only when
necessary") subtly altered get_next_timer_interrupt()'s behaviour. The
function no longer consistently returns KTIME_MAX with no timers
pending.

In order to decide if there are any timers pending we check whether the
next expiry will happen NEXT_TIMER_MAX_DELTA jiffies from now.
Unfortunately, the next expiry time and the timer base clock are no
longer updated in unison. The former changes upon certain timer
operations (enqueue, expire, detach), whereas the latter keeps track of
jiffies as they move forward. Ultimately breaking the logic above.

A simplified example:

- Upon entering get_next_timer_interrupt() with:

	jiffies = 1
	base->clk = 0;
	base->next_expiry = NEXT_TIMER_MAX_DELTA;

  'base->next_expiry == base->clk + NEXT_TIMER_MAX_DELTA', the function
  returns KTIME_MAX.

- 'base->clk' is updated to the jiffies value.

- The next time we enter get_next_timer_interrupt(), taking into account
  no timer operations happened:

	base->clk = 1;
	base->next_expiry = NEXT_TIMER_MAX_DELTA;

  'base->next_expiry != base->clk + NEXT_TIMER_MAX_DELTA', the function
  returns a valid expire time, which is incorrect.

This ultimately might unnecessarily rearm sched's timer on nohz_full
setups, and add latency to the system[1].

So, introduce 'base->timers_pending'[2], update it every time
'base->next_expiry' changes, and use it in get_next_timer_interrupt().

[1] See tick_nohz_stop_tick().
[2] A quick pahole check on x86_64 and arm64 shows it doesn't make
    'struct timer_base' any bigger.

Fixes: 31cd0e119d ("timers: Recalculate next timer interrupt only when necessary")
Signed-off-by: Nicolas Saenz Julienne <nsaenzju@redhat.com>
Signed-off-by: Frederic Weisbecker <frederic@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
2021-07-28 14:35:37 +02:00
..
alarmtimer.c kernel, fs: Introduce and use set_restart_fn() and arch_set_restart_data() 2021-03-25 09:04:16 +01:00
clockevents.c tick: Remove outgoing CPU from broadcast masks 2019-03-23 18:26:43 +01:00
clocksource.c clocksource: Check per-CPU clock synchronization when marked unstable 2021-07-14 16:56:01 +02:00
hrtimer.c kernel, fs: Introduce and use set_restart_fn() and arch_set_restart_data() 2021-03-25 09:04:16 +01:00
itimer.c time: Prevent undefined behaviour in timespec64_to_ns() 2020-10-26 11:48:11 +01:00
jiffies.c timekeeping: Split jiffies seqlock 2020-03-21 16:00:23 +01:00
Kconfig posix-cpu-timers: Provide mechanisms to defer timer handling to task_work 2020-08-06 16:50:59 +02:00
Makefile ns: Introduce Time Namespace 2020-01-14 12:20:48 +01:00
namespace.c nsproxy: support CLONE_NEWTIME with setns() 2020-07-08 11:14:22 +02:00
ntp_internal.h ntp: Audit NTP parameters adjustment 2019-04-15 18:14:01 -04:00
ntp.c ntp/y2038: Remove incorrect time_t truncation 2019-11-12 08:13:44 +01:00
posix-clock.c posix-clocks: Rename the clock_get() callback to clock_get_timespec() 2020-01-14 12:20:49 +01:00
posix-cpu-timers.c kernel, fs: Introduce and use set_restart_fn() and arch_set_restart_data() 2021-03-25 09:04:16 +01:00
posix-stubs.c posix-timers: Make clock_nanosleep() time namespace aware 2020-01-14 12:20:55 +01:00
posix-timers.c posix-timers: Preserve return value in clock_adjtime32() 2021-05-11 14:47:16 +02:00
posix-timers.h posix-clocks: Introduce clock_get_ktime() callback 2020-01-14 12:20:51 +01:00
sched_clock.c time/sched_clock: Mark sched_clock_read_begin/retry() as notrace 2020-10-26 11:34:31 +01:00
test_udelay.c time/debug: Remove license boilerplate 2018-11-23 11:51:21 +01:00
tick-broadcast-hrtimer.c tick: broadcast-hrtimer: Fix a race in bc_set_next 2019-09-27 14:45:55 +02:00
tick-broadcast.c treewide: Use fallthrough pseudo-keyword 2020-08-23 17:36:59 -05:00
tick-common.c timekeeping: Split jiffies seqlock 2020-03-21 16:00:23 +01:00
tick-internal.h tick: Remove outgoing CPU from broadcast masks 2019-03-23 18:26:43 +01:00
tick-oneshot.c hrtimers/tick/clockevents: Remove sloppy license references 2018-11-23 11:51:21 +01:00
tick-sched.c tick/sched: Remove bogus boot "safety" check 2021-01-06 14:56:55 +01:00
tick-sched.h tick/sched: Update tick_sched struct documentation 2019-03-24 20:29:32 +01:00
time.c y2038: remove unused time32 interfaces 2020-02-21 11:22:15 -08:00
timeconst.bc time: Add SPDX license identifiers 2018-11-23 11:51:20 +01:00
timeconv.c time: Add SPDX license identifiers 2018-11-23 11:51:20 +01:00
timecounter.c time: Remove license boilerplate 2018-11-23 11:51:21 +01:00
timekeeping_debug.c timekeeping/debug: No need to check return value of debugfs_create functions 2019-01-29 20:08:41 +01:00
timekeeping_internal.h timekeeping/vsyscall: Provide vdso_update_begin/end() 2020-08-06 10:57:30 +02:00
timekeeping.c These are the locking updates for v5.10: 2020-10-12 13:06:20 -07:00
timekeeping.h timekeeping: Split jiffies seqlock 2020-03-21 16:00:23 +01:00
timer_list.c timer_list: Guard procfs specific code 2019-06-23 00:08:52 +02:00
timer.c timers: Fix get_next_timer_interrupt() with no timers pending 2021-07-28 14:35:37 +02:00
vsyscall.c timekeeping/vsyscall: Provide vdso_update_begin/end() 2020-08-06 10:57:30 +02:00