linux_dsm_epyc7002/arch/x86
Kim Phillips 0f4cd769c4 perf/x86/amd/ibs: Fix sample bias for dispatched micro-ops
When counting dispatched micro-ops with cnt_ctl=1, in order to prevent
sample bias, IBS hardware preloads the least significant 7 bits of
current count (IbsOpCurCnt) with random values, such that, after the
interrupt is handled and counting resumes, the next sample taken
will be slightly perturbed.

The current count bitfield is in the IBS execution control h/w register,
alongside the maximum count field.

Currently, the IBS driver writes that register with the maximum count,
leaving zeroes to fill the current count field, thereby overwriting
the random bits the hardware preloaded for itself.

Fix the driver to actually retain and carry those random bits from the
read of the IBS control register, through to its write, instead of
overwriting the lower current count bits with zeroes.

Tested with:

perf record -c 100001 -e ibs_op/cnt_ctl=1/pp -a -C 0 taskset -c 0 <workload>

'perf annotate' output before:

 15.70  65:   addsd     %xmm0,%xmm1
 17.30        add       $0x1,%rax
 15.88        cmp       %rdx,%rax
              je        82
 17.32  72:   test      $0x1,%al
              jne       7c
  7.52        movapd    %xmm1,%xmm0
  5.90        jmp       65
  8.23  7c:   sqrtsd    %xmm1,%xmm0
 12.15        jmp       65

'perf annotate' output after:

 16.63  65:   addsd     %xmm0,%xmm1
 16.82        add       $0x1,%rax
 16.81        cmp       %rdx,%rax
              je        82
 16.69  72:   test      $0x1,%al
              jne       7c
  8.30        movapd    %xmm1,%xmm0
  8.13        jmp       65
  8.24  7c:   sqrtsd    %xmm1,%xmm0
  8.39        jmp       65

Tested on Family 15h and 17h machines.

Machines prior to family 10h Rev. C don't have the RDWROPCNT capability,
and have the IbsOpCurCnt bitfield reserved, so this patch shouldn't
affect their operation.

It is unknown why commit db98c5faf8 ("perf/x86: Implement 64-bit
counter support for IBS") ignored the lower 4 bits of the IbsOpCurCnt
field; the number of preloaded random bits has always been 7, AFAICT.

Signed-off-by: Kim Phillips <kim.phillips@amd.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Cc: "Arnaldo Carvalho de Melo" <acme@kernel.org>
Cc: <x86@kernel.org>
Cc: Ingo Molnar <mingo@kernel.org>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: "Borislav Petkov" <bp@alien8.de>
Cc: Stephane Eranian <eranian@google.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: "Namhyung Kim" <namhyung@kernel.org>
Cc: "H. Peter Anvin" <hpa@zytor.com>
Link: https://lkml.kernel.org/r/20190826195730.30614-1-kim.phillips@amd.com
2019-08-30 14:27:47 +02:00
..
boot x86/boot/compressed/64: Fix boot on machines with broken E820 table 2019-08-19 15:59:13 +02:00
configs x86/defconfigs: Remove useless UEVENT_HELPER_PATH 2019-06-21 19:22:08 +02:00
crypto Merge branch 'linus' of git://git.kernel.org/pub/scm/linux/kernel/git/herbert/crypto-2.6 2019-07-08 20:57:08 -07:00
entry Merge branch master from git://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git 2019-07-28 22:22:40 +02:00
events perf/x86/amd/ibs: Fix sample bias for dispatched micro-ops 2019-08-30 14:27:47 +02:00
hyperv x86/hyper-v: Zero out the VP ASSIST PAGE on allocation 2019-07-19 09:48:15 +02:00
ia32 clone: fix CLONE_PIDFD support 2019-07-14 20:36:12 +02:00
include perf/x86/amd/ibs: Fix sample bias for dispatched micro-ops 2019-08-30 14:27:47 +02:00
kernel x86/CPU/AMD: Clear RDRAND CPUID bit on AMD family 15h/16h 2019-08-19 19:42:52 +02:00
kvm Revert "KVM: x86/mmu: Zap only the relevant pages when removing a memslot" 2019-08-21 10:28:41 +02:00
lib x86/lib/cpu: Address missing prototypes warning 2019-08-08 08:25:53 +02:00
math-emu x86/fpu/math-emu: Address fallthrough warnings 2019-08-12 20:35:05 +02:00
mm x86/mm: Sync also unmappings in vmalloc_sync_all() 2019-07-22 10:18:30 +02:00
net bpf: fix x64 JIT code generation for jmp to 1st insn 2019-08-01 13:12:09 -07:00
oprofile
pci treewide: Replace GPLv2 boilerplate/reference with SPDX - rule 387 2019-06-05 17:37:11 +02:00
platform platform-drivers-x86 for v5.3-1 2019-07-14 16:51:47 -07:00
power x86/CPU/AMD: Clear RDRAND CPUID bit on AMD family 15h/16h 2019-08-19 19:42:52 +02:00
purgatory x86/purgatory: Use CFLAGS_REMOVE rather than reset KBUILD_CFLAGS 2019-08-08 08:25:53 +02:00
ras RAS/CEC: Add CONFIG_RAS_CEC_DEBUG and move CEC debug features there 2019-06-08 17:39:24 +02:00
realmode x86/realmode: Make set_real_mode_mem() static inline 2019-03-29 10:16:27 +01:00
tools Merge branch 'x86-paravirt-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip 2019-07-08 17:34:44 -07:00
um Merge branch 'siginfo-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/ebiederm/user-namespace 2019-07-08 21:48:15 -07:00
video treewide: Add SPDX license identifier - Makefile/Kconfig 2019-05-21 10:50:46 +02:00
xen Merge branch 'x86-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip 2019-07-20 11:24:49 -07:00
.gitignore
Kbuild treewide: Add SPDX license identifier - Kbuild 2019-05-30 11:32:33 -07:00
Kconfig dma-mapping fixes for 5.3-rc1 2019-07-20 12:09:52 -07:00
Kconfig.cpu x86/cpu: Create Zhaoxin processors architecture support file 2019-06-22 11:45:57 +02:00
Kconfig.debug It's been a relatively busy cycle for docs: 2019-07-09 12:34:26 -07:00
Makefile x86/build: Keep local relocations with ld.lld 2019-04-05 12:34:35 +02:00
Makefile_32.cpu
Makefile.um