linux_dsm_epyc7002/arch/x86/kernel
Nicolai Stange 1a9e4c564a x86/timers/apic: Fix imprecise timer interrupts by eliminating TSC clockevents frequency roundoff error
I noticed the following misbehavior on certain Intel systems: with a
single task running on a NOHZ CPU on an Intel Haswell, I got not just the
one expected local_timer APIC interrupt per second, but at least two. (!)

Further tracing showed that the first one precedes the programmed deadline
by up to ~50us and hence does nothing except reprogram the TSC deadline
clockevent device to trigger again shortly thereafter.

The reason for this is imprecise calibration: the timeout we program into
the APIC results in 'too short' timer interrupts. The core (hr)timer code
notices this (because it has a precise ktime source and sees the short
interrupt) and fixes it up by programming an additional very short
interrupt period.

This is obviously suboptimal.

The reason for the imprecise calibration is twofold, and this patch
fixes the first reason:

In setup_APIC_timer(), the registered clockevent device's frequency
is calculated by first dividing tsc_khz by TSC_DIVISOR and multiplying
it with 1000 afterwards:

  (tsc_khz / TSC_DIVISOR) * 1000

The multiplication with 1000 is done for converting from kHz to Hz and the
division by TSC_DIVISOR is carried out in order to make sure that the final
result fits into an u32.

However, with the order given in this calculation, the roundoff error
introduced by the division gets magnified by a factor of 1000 by the
following multiplication.

Reversing the order of the division and the multiplication, a la:

  (tsc_khz * 1000) / TSC_DIVISOR

... already reduces the roundoff error.
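
As a rough user-space illustration of the two orderings (not the kernel
code itself; the helper names hz_divide_first() and hz_multiply_first()
and the sample tsc_khz value are made up for this sketch):

```c
#include <stdint.h>

/*
 * Old order: divide first, then convert kHz to Hz. Whatever remainder the
 * division drops is effectively multiplied by 1000 afterwards.
 */
static uint32_t hz_divide_first(uint32_t tsc_khz, uint32_t divisor)
{
	return (tsc_khz / divisor) * 1000;
}

/*
 * Reversed order: convert to Hz first (in 64 bits, so the product cannot
 * overflow), then divide. The truncation happens once, at the very end.
 */
static uint32_t hz_multiply_first(uint32_t tsc_khz, uint32_t divisor)
{
	return (uint32_t)(((uint64_t)tsc_khz * 1000) / divisor);
}
```

For an assumed tsc_khz of 2394230 and a divisor of 32, hz_divide_first()
yields 74819000 whereas hz_multiply_first() yields 74819687. The exact
quotient is 74819687.5, so the old order is off by roughly 687 Hz and the
reversed order by less than 1 Hz.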

Furthermore, if TSC_DIVISOR divides 1000, associativity holds:

  (tsc_khz * 1000) / TSC_DIVISOR = tsc_khz * (1000 / TSC_DIVISOR)

and thus, the roundoff error even vanishes and the whole operation can be
carried out within 32 bits.
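
A minimal sketch of that exact 32-bit form, assuming a TSC_DIVISOR of 8
(clockevent_hz() and clockevent_hz_reference() are hypothetical names used
only for this illustration, not kernel functions):

```c
#include <stdint.h>

#define TSC_DIVISOR 8

/* The exactness argument only holds if the divisor divides 1000. */
#if 1000 % TSC_DIVISOR != 0
#error "TSC_DIVISOR must divide 1000"
#endif

/* 1000 / TSC_DIVISOR is an exact constant (125), so no rounding occurs. */
static uint32_t clockevent_hz(uint32_t tsc_khz)
{
	return tsc_khz * (1000 / TSC_DIVISOR);
}

/* 64-bit reference: multiply first, then divide. */
static uint64_t clockevent_hz_reference(uint32_t tsc_khz)
{
	return ((uint64_t)tsc_khz * 1000) / TSC_DIVISOR;
}
```

Both helpers return 299278750 for an assumed tsc_khz of 2394230, and they
agree for any tsc_khz whose result fits into 32 bits.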

The powers of two that divide 1000 are 2, 4 and 8. A value of 8 for
TSC_DIVISOR still allows for TSC frequencies up to
2^32 / 10^9ns * 8 = 34.4GHz, which is far more than anything to be
expected in the coming years.
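
The bound can be double-checked with a small sketch (max_tsc_khz() is a
hypothetical helper for this illustration; it assumes the divisor divides
1000 and that the result has to fit into an u32):

```c
#include <stdint.h>

/*
 * Largest tsc_khz for which tsc_khz * (1000 / divisor) still fits into
 * 32 bits. Only meaningful for divisors that divide 1000 evenly.
 */
static uint32_t max_tsc_khz(uint32_t divisor)
{
	return UINT32_MAX / (1000 / divisor);
}
```

max_tsc_khz(8) evaluates to 34359738 kHz, i.e. the ~34.4GHz quoted above,
while divisors of 4 and 2 would cap the TSC frequency at about 17.2GHz and
8.6GHz respectively.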

Thus, replace the current TSC_DIVISOR value of 32 with 8 and reverse
the order of the division and the multiplication in the calculation of
the registered clockevent device's frequency.

Signed-off-by: Nicolai Stange <nicstange@gmail.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Reviewed-by: Paolo Bonzini <pbonzini@redhat.com>
Acked-by: Thomas Gleixner <tglx@linutronix.de>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Borislav Petkov <bp@suse.de>
Cc: Christopher S. Hall <christopher.s.hall@intel.com>
Cc: H. Peter Anvin <hpa@zytor.com>
Cc: Hidehiro Kawai <hidehiro.kawai.ez@hitachi.com>
Cc: Len Brown <len.brown@intel.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Viresh Kumar <viresh.kumar@linaro.org>
Link: http://lkml.kernel.org/r/20160714152255.18295-2-nicstange@gmail.com
[ Improved changelog. ]
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2016-08-10 12:37:38 +02:00
..
acpi x86/acpi: store ACPI ids from MADT for future usage 2016-07-25 13:30:53 +01:00
apic x86/timers/apic: Fix imprecise timer interrupts by eliminating TSC clockevents frequency roundoff error 2016-08-10 12:37:38 +02:00
cpu Merge branch 'x86-microcode-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip 2016-07-30 13:18:33 -07:00
fpu x86/fpu: Do not BUG_ON() in early FPU code 2016-07-21 18:18:45 +02:00
kprobes kprobes/x86: Clear TF bit in fault on single-stepping 2016-06-14 12:00:54 +02:00
.gitignore
alternative.c x86/asm: Stop depending on ptrace.h in alternative.h 2016-04-29 11:56:40 +02:00
amd_gart_64.c
amd_nb.c x86/amd_nb: Clean up init path 2016-07-01 09:36:12 +02:00
apb_timer.c x86/apb_timer: Convert to hotplug state machine 2016-07-15 10:40:22 +02:00
aperture_64.c
apm_32.c x86/apm32: Remove paravirt_enabled() use 2016-04-22 10:29:03 +02:00
asm-offsets_32.c
asm-offsets_64.c
asm-offsets.c x86/uaccess: Move thread_info::addr_limit to thread_struct 2016-07-15 10:26:30 +02:00
audit_64.c
bootflag.c x86: don't use module_init for non-modular core bootflag code 2015-06-16 14:12:34 -04:00
check.c
cpuid.c
crash_dump_32.c
crash_dump_64.c
crash.c
devicetree.c x86/cpufeature: Replace cpu_has_apic with boot_cpu_has() usage 2016-04-13 11:37:41 +02:00
doublefault.c
dumpstack_32.c x86/dumpstack: Honor supplied @regs arg 2016-07-08 11:33:18 +02:00
dumpstack_64.c Merge branch 'x86-debug-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip 2016-07-25 18:18:04 -07:00
dumpstack.c Merge branch 'x86-debug-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip 2016-07-25 18:18:04 -07:00
e820.c
early_printk.c
early-quirks.c x86/quirks: Add early quirk to reset Apple AirPort card 2016-07-10 20:13:53 +02:00
ebda.c x86/boot: Simplify EBDA-vs-BIOS reservation logic 2016-07-22 11:46:01 +02:00
espfix_64.c x86: get rid of superfluous __GFP_REPEAT 2016-06-24 17:23:52 -07:00
ftrace.c
head32.c x86/boot: Reorganize and clean up the BIOS area reservation code 2016-07-21 10:11:57 +02:00
head64.c x86/boot: Reorganize and clean up the BIOS area reservation code 2016-07-21 10:11:57 +02:00
head_32.S Merge branch 'x86/urgent' into x86/asm, to refresh the tree 2016-04-29 11:55:04 +02:00
head_64.S x86/mm: Enable KASLR for physical mapping memory regions 2016-07-08 17:35:15 +02:00
hpet.c x86/hpet: Convert to hotplug state machine 2016-07-14 09:34:44 +02:00
hw_breakpoint.c
i386_ksyms_32.c x86/hweight: Get rid of the special calling convention 2016-06-08 15:01:02 +02:00
i8237.c
i8253.c
i8259.c
io_delay.c
ioport.c
irq_32.c x86: fix up a few misc stack pointer vs thread_info confusions 2016-06-24 16:55:53 -07:00
irq_64.c
irq_work.c
irq.c
irqinit.c
jump_label.c x86/asm: Stop depending on ptrace.h in alternative.h 2016-04-29 11:56:40 +02:00
kdebugfs.c
kexec-bzimage64.c KEYS: Generalise system_verify_data() to provide access to internal content 2016-04-06 16:14:24 +01:00
kgdb.c x86/asm: Stop depending on ptrace.h in alternative.h 2016-04-29 11:56:40 +02:00
ksysfs.c
kvm.c KVM: Fix steal clock warp during guest CPU hotplug 2016-06-14 11:13:14 +02:00
kvmclock.c
ldt.c
machine_kexec_32.c
machine_kexec_64.c kexec: provide arch_kexec_protect(unprotect)_crashkres() 2016-05-23 17:04:14 -07:00
Makefile Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jikos/livepatching 2016-05-17 17:11:27 -07:00
mcount_64.S ftrace/x86: Set ftrace_stub to weak to prevent gcc from using short jumps to it 2016-05-20 13:28:40 -04:00
mmconf-fam10h_64.c
module.c x86/asm: Stop depending on ptrace.h in alternative.h 2016-04-29 11:56:40 +02:00
mpparse.c
msr.c
nmi_selftest.c
nmi.c
paravirt_patch_32.c
paravirt_patch_64.c
paravirt-spinlocks.c
paravirt.c Merge branch 'x86-boot-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip 2016-05-16 15:54:01 -07:00
pci-calgary_64.c
pci-dma.c
pci-iommu_table.c x86: Fix non-static inlines 2016-04-16 13:21:40 +02:00
pci-nommu.c
pci-swiotlb.c
pcspeaker.c
perf_regs.c
platform-quirks.c x86/boot: Reorganize and clean up the BIOS area reservation code 2016-07-21 10:11:57 +02:00
pmem.c
probe_roms.c
process_32.c
process_64.c x86/fsgsbase/64: Use TASK_SIZE_MAX for FSBASE/GSBASE upper limits 2016-05-20 09:10:03 +02:00
process.c x86/cpu: Add workaround for MONITOR instruction erratum on Goldmont based CPUs 2016-07-20 09:48:40 +02:00
ptrace.c x86/fsgsbase/64: Use TASK_SIZE_MAX for FSBASE/GSBASE upper limits 2016-05-20 09:10:03 +02:00
pvclock.c pvclock: Get rid of __pvclock_read_cycles in function pvclock_read_flags 2016-06-27 15:12:15 +02:00
quirks.c
reboot_fixups_32.c
reboot.c x86/reboot: Add Dell Optiplex 7450 AIO reboot quirk 2016-07-14 20:57:22 +02:00
relocate_kernel_32.S
relocate_kernel_64.S
resource.c
rtc.c x86/init: Disable pnpbios and rtc for X86_SUBARCH_CE4100 2016-04-22 10:29:09 +02:00
setup_percpu.c x86/acpi: store ACPI ids from MADT for future usage 2016-07-25 13:30:53 +01:00
setup.c ACPI material for v4.8-rc1 2016-07-26 17:56:45 -07:00
signal_compat.c x86/signals: Add build-time checks to the siginfo compat code 2016-06-14 12:19:24 +02:00
signal.c Merge branch 'x86-asm-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip 2016-05-16 15:15:17 -07:00
smp.c
smpboot.c Power management material for v4.8-rc1 2016-07-26 17:29:07 -07:00
stacktrace.c
step.c
sys_x86_64.c
sysfb_efi.c Merge branch 'linus' into efi/core, to pick up fixes 2016-05-07 07:00:07 +02:00
sysfb_simplefb.c
sysfb.c
tboot.c x86/tboot: Convert to hotplug state machine 2016-07-15 10:40:30 +02:00
tce_64.c x86/cpufeature: Remove cpu_has_clflush 2016-03-31 13:35:09 +02:00
test_nx.c
test_rodata.c
time.c
tls.c x86/tls: Synchronize segment registers in set_thread_area() 2016-04-29 11:56:42 +02:00
tls.h
topology.c
trace_clock.c
tracepoint.c
traps.c x86/entry/traps: Don't force in_interrupt() to return true in IST handlers 2016-06-10 13:54:47 +02:00
tsc_msr.c x86/tsc_msr: Remove irqoff around MSR-based TSC enumeration 2016-07-11 21:30:12 +02:00
tsc_sync.c
tsc.c x86/tsc: Remove the unused check_tsc_disabled() 2016-07-15 10:35:08 +02:00
uprobes.c Merge branch 'x86-asm-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip 2016-05-16 15:15:17 -07:00
verify_cpu.S
vm86_32.c x86, bitops: remove use of "sbb" to return CF 2016-06-08 12:41:20 -07:00
vmlinux.lds.S x86/boot: Move compressed kernel to the end of the decompression buffer 2016-04-29 11:03:29 +02:00
vsmp_64.c
x86_init.c x86/tsc: Enumerate SKL cpu_khz and tsc_khz via CPUID 2016-07-11 21:30:13 +02:00
x8664_ksyms_64.c x86/hweight: Get rid of the special calling convention 2016-06-08 15:01:02 +02:00