linux_dsm_epyc7002/arch/x86
Lendacky, Thomas 6d3edaae16 x86/perf/amd: Resolve NMI latency issues for active PMCs
On AMD processors, the detection of an overflowed PMC counter in the NMI
handler relies on the current value of the PMC. So, for example, to check
for overflow on a 48-bit counter, bit 47 is checked to see if it is 1 (not
overflowed) or 0 (overflowed).

When the perf NMI handler executes it does not know in advance which PMC
counters have overflowed. As such, the NMI handler will process all active
PMC counters that have overflowed. NMI latency in newer AMD processors can
result in multiple overflowed PMC counters being processed in one NMI and
then a subsequent NMI, that does not appear to be a back-to-back NMI, not
finding any PMC counters that have overflowed. This may appear to be an
unhandled NMI resulting in either a panic or a series of messages,
depending on how the kernel was configured.

To mitigate this issue, add an AMD handle_irq callback function,
amd_pmu_handle_irq(), that will invoke the common x86_pmu_handle_irq()
function and upon return perform some additional processing that will
indicate if the NMI has been handled or would have been handled had an
earlier NMI not handled the overflowed PMC. Using a per-CPU variable, a
minimum value of the number of active PMCs or 2 will be set whenever a
PMC is active. This is used to indicate the possible number of NMIs that
can still occur. The value of 2 is used for when an NMI does not arrive
at the LAPIC in time to be collapsed into an already pending NMI. Each
time the function is called without having handled an overflowed counter,
the per-CPU value is checked. If the value is non-zero, it is decremented
and the NMI indicates that it handled the NMI. If the value is zero, then
the NMI indicates that it did not handle the NMI.

Signed-off-by: Tom Lendacky <thomas.lendacky@amd.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Cc: <stable@vger.kernel.org> # 4.14.x-
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Arnaldo Carvalho de Melo <acme@kernel.org>
Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Stephane Eranian <eranian@google.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Vince Weaver <vincent.weaver@maine.edu>
Link: https://lkml.kernel.org/r/Message-ID:
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2019-04-03 11:40:32 +02:00
..
boot x86/boot: Fix incorrect ifdeffery scope 2019-03-27 14:00:51 +01:00
configs Merge branch 'akpm' (patches from Andrew) 2019-03-07 19:25:37 -08:00
crypto crypto: x86/poly1305 - Clear key material from stack in SSE2 variant 2019-02-28 14:17:59 +08:00
entry pidfd patches for v5.1-rc1 2019-03-16 13:47:14 -07:00
events x86/perf/amd: Resolve NMI latency issues for active PMCs 2019-04-03 11:40:32 +02:00
hyperv x86/hyperv: Prevent potential NULL pointer dereference 2019-03-21 12:24:39 +01:00
ia32 a.out: remove core dumping support 2019-03-05 10:00:35 -08:00
include A collection of x86 and ARM bugfixes, and some improvements to documentation. 2019-03-31 08:55:59 -07:00
kernel x86/resctrl: Remove unused variable 2019-03-24 22:09:27 +01:00
kvm KVM: x86: update %rip after emulating IO 2019-03-28 17:29:04 +01:00
lib x86/lib: Fix indentation issue, remove extra tab 2019-03-21 12:24:38 +01:00
math-emu Remove 'type' argument from access_ok() function 2019-01-03 18:57:57 -08:00
mm x86/mm: Don't exceed the valid physical address space 2019-03-28 14:13:51 +01:00
net x32: bpf: implement jitting of JMP32 2019-01-26 13:33:02 -08:00
oprofile
pci x86/PCI: Fixup RTIT_BAR of Intel Denverton Trace Hub 2019-02-07 08:43:58 -06:00
platform x86/realmode: Make set_real_mode_mem() static inline 2019-03-29 10:16:27 +01:00
power mm: remove include/linux/bootmem.h 2018-10-31 08:54:16 -07:00
purgatory kbuild: move bin2c back to scripts/ from scripts/basic/ 2018-07-18 01:18:05 +09:00
ras
realmode x86/realmode: Make set_real_mode_mem() static inline 2019-03-29 10:16:27 +01:00
tools x86: Clean up 'sizeof x' => 'sizeof(x)' 2018-10-29 07:13:28 +01:00
um Merge branch 'timers-2038-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip 2019-03-05 14:08:26 -08:00
video
xen Merge branch 'akpm' (patches from Andrew) 2019-03-12 10:39:53 -07:00
.gitignore
Kbuild KVM: x86: Allow Qemu/KVM to use PVH entry point 2018-12-13 13:41:49 -05:00
Kconfig x86/smp: Enforce CONFIG_HOTPLUG_CPU when SMP=y 2019-03-28 13:34:58 +01:00
Kconfig.cpu x86/cpu: Create Hygon Dhyana architecture support file 2018-09-27 16:14:05 +02:00
Kconfig.debug efi/x86: Convert x86 EFI earlyprintk into generic earlycon implementation 2019-02-04 08:27:30 +01:00
Makefile x86/retpolines: Disable switch jump tables when retpolines are enabled 2019-03-28 13:39:48 +01:00
Makefile_32.cpu
Makefile.um x86, powerpc: Remove -funit-at-a-time compiler option entirely 2018-12-09 11:55:32 +01:00