linux_dsm_epyc7002/arch/x86/kernel/cpu
Andi Kleen 75925e1ad7 perf/x86: Optimize stack walk user accesses
Change the perf user stack walking to use the new
__copy_from_user_nmi(), and split each access into word-sized
transfers. This allows the complete access to be inlined and
optimized into a single load.

The main advantage is that this avoids the overhead of double page
faults. When a normal copy_from_user() fails, it re-executes the copy
to compute the exact number of bytes that were not copied. As a
result, the expensive page fault is taken twice.

Hitting a fault at some point while walking stacks is relatively
common (typically when some part of the program isn't compiled with
frame pointers), so this is a large overhead.

With the optimized copies we avoid this problem because they perform
each access only once. They are also much faster when the access does
not fault, because they are single instructions instead of complex
function calls.
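
The word-sized access pattern described above can be sketched in user
space as follows. This is only an illustration under stated
assumptions, not the kernel code from the patch: sim_mem,
sim_read_word(), sim_walk() and sim_faults are hypothetical names, the
"user memory" is a small array indexed by word, and a frame is assumed
to hold the next frame pointer followed by the return address. Each
field is read with its own single-word access, and a failed access is
reported once without being retried.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical stand-in for user memory: a small array of words. */
#define SIM_WORDS 16
static uintptr_t sim_mem[SIM_WORDS];

/* Number of simulated faults taken so far. */
static unsigned int sim_faults;

/*
 * Hypothetical helper: read exactly one word at word index idx.
 * Returns 0 on success and -1 on a simulated fault. A failed access is
 * counted once and never re-executed, which is the property the commit
 * message attributes to the optimized copies.
 */
static int sim_read_word(uintptr_t *dst, size_t idx)
{
	if (idx >= SIM_WORDS) {
		sim_faults++;
		return -1;
	}
	*dst = sim_mem[idx];
	return 0;
}

/*
 * Walk a frame-pointer chain starting at word index fp.
 * Assumed frame layout: word 0 = next frame pointer,
 *                       word 1 = return address.
 * The two fields are fetched with two separate word-sized reads rather
 * than one multi-word copy. Returns the number of return addresses
 * stored in out.
 */
static size_t sim_walk(size_t fp, uintptr_t *out, size_t max)
{
	size_t n = 0;

	assert(out != NULL);

	while (n < max) {
		uintptr_t next, ret;

		if (sim_read_word(&next, fp))
			break;		/* fault: stop, do not retry */
		if (sim_read_word(&ret, fp + 1))
			break;

		out[n++] = ret;

		if (next <= fp)
			break;		/* chain must move up the stack */
		fp = (size_t)next;
	}
	return n;
}
```

With a two-frame chain whose last frame pointer points outside
sim_mem, sim_walk() records both return addresses and sim_faults ends
up at 1, since the faulting read is attempted only once.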

While profiling a kernel build with -g, the patch brings down the
average time of the PMI handler from 966ns to 552ns (-43%).

Signed-off-by: Andi Kleen <ak@linux.intel.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Mike Galbraith <efault@gmx.de>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Stephane Eranian <eranian@google.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Vince Weaver <vincent.weaver@maine.edu>
Link: http://lkml.kernel.org/r/1445551641-13379-2-git-send-email-andi@firstfloor.org
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2015-11-23 09:58:25 +01:00
mcheck x86/mce: Add a default case to the switch in __mcheck_cpu_ancient_init() 2015-11-01 11:26:14 +01:00
microcode x86/microcode/intel: Move #ifdef DEBUG inside the function 2015-10-21 11:22:12 +02:00
mtrr x86/mm/mtrr: Remove kernel internal MTRR interfaces: unexport mtrr_add() and mtrr_del() 2015-08-28 10:09:28 +02:00
.gitignore
amd.c x86/AMD: Fix last level cache topology for AMD Fam17h systems 2015-11-07 10:37:51 +01:00
bugs_64.c
bugs.c
centaur.c
common.c x86/cpu: Fix SMAP check in PVOPS environments 2015-11-19 11:07:49 +01:00
cpu.h
cyrix.c
hypervisor.c
intel_cacheinfo.c perf/core, perf/x86: Change needlessly global functions and a variable to static 2015-09-28 08:09:52 +02:00
intel_pt.h perf/x86/intel/pt: Clean up files of Intel Processor Trace 2015-08-12 11:43:22 +02:00
intel.c x86/cpu/intel: Enable X86_FEATURE_NONSTOP_TSC_S3 for Merrifield 2015-11-07 10:37:30 +01:00
Makefile perf/x86: Add Intel cstate PMUs support 2015-10-06 17:31:51 +02:00
match.c
mkcapflags.sh
mshyperv.c x86/hyperv: Fix the build in the !CONFIG_KEXEC_CORE case 2015-09-30 07:44:15 +02:00
perf_event_amd_ibs.c
perf_event_amd_iommu.c
perf_event_amd_iommu.h
perf_event_amd_uncore.c
perf_event_amd.c
perf_event_intel_bts.c perf/x86/intel/bts: Disallow use by unprivileged users on paranoid systems 2015-09-13 11:27:22 +02:00
perf_event_intel_cqm.c perf/core: Robustify the perf_cgroup_from_task() RCU checks 2015-11-23 09:21:03 +01:00
perf_event_intel_cstate.c perf/x86: Add Intel cstate PMUs support 2015-10-06 17:31:51 +02:00
perf_event_intel_ds.c perf/x86/intel/ds: Work around BTS leaking kernel addresses 2015-09-13 11:27:21 +02:00
perf_event_intel_lbr.c perf/x86: Fix LBR call stack save/restore 2015-11-23 09:44:57 +01:00
perf_event_intel_pt.c perf/x86/intel/pt: Fix KVM warning due to doing rdmsr() before the CPUID test 2015-09-13 11:27:23 +02:00
perf_event_intel_rapl.c perf/x86/intel/rapl: Remove the unused RAPL_EVENT_DESC() macro 2015-11-12 09:44:25 +01:00
perf_event_intel_uncore_nhmex.c
perf_event_intel_uncore_snb.c perf/x86/intel/uncore: Fix multi-segment problem of perf_event_intel_uncore 2015-10-06 17:31:51 +02:00
perf_event_intel_uncore_snbep.c perf/x86/intel/uncore: Fix multi-segment problem of perf_event_intel_uncore 2015-10-06 17:31:51 +02:00
perf_event_intel_uncore.c perf/x86/intel/uncore: Fix multi-segment problem of perf_event_intel_uncore 2015-10-06 17:31:51 +02:00
perf_event_intel_uncore.h perf/x86/intel/uncore: Fix multi-segment problem of perf_event_intel_uncore 2015-10-06 17:31:51 +02:00
perf_event_intel.c perf/x86/intel: Fix Skylake FRONTEND MSR extrareg mask 2015-09-18 09:20:23 +02:00
perf_event_knc.c
perf_event_msr.c arch/x86/kernel/cpu/perf_event_msr.c: use sign_extend64() for sign extension 2015-11-06 17:50:42 -08:00
perf_event_p4.c
perf_event_p6.c
perf_event.c perf/x86: Optimize stack walk user accesses 2015-11-23 09:58:25 +01:00
perf_event.h treewide: Remove old email address 2015-11-23 09:44:58 +01:00
perfctr-watchdog.c
powerflags.c
proc.c
rdrand.c
scattered.c x86/cpufeatures: Correct spelling of the HWP_NOTIFY flag 2015-09-23 09:57:24 +02:00
topology.c
transmeta.c
umc.c
vmware.c