linux_dsm_epyc7002/tools/perf
Anton Blanchard 6bb8f311a8 perf sort: Fix symbol sort output by separating unresolved samples by type
I took a profile that suggested 60% of total CPU time was in the
hypervisor:

...
    60.20%  [H] 0x33d43c
     4.43%  [k] ._spin_lock_irqsave
     1.07%  [k] ._spin_lock

Using perf stat to get the user/kernel/hypervisor breakdown contradicted
this.

The problem is that we merge all unresolved samples into the one unknown
bucket. If we add a comparison by sample type to sort__sym_cmp we get the
real picture:

...
    57.11%  [.] 0x80fbf63c
     4.43%  [k] ._spin_lock_irqsave
     1.07%  [k] ._spin_lock
     0.65%  [H] 0x33d43c

So it was almost all userspace, not hypervisor as the initial profile
suggested.

I found another issue while adding this. Symbol sorting sometimes shows
multiple entries for the unknown bucket:

...
    16.65%  [.] 0x6cd3a8
     7.25%  [.] 0x422460
     5.37%  [.] yylex
     4.79%  [.] malloc
     4.78%  [.] _int_malloc
     4.03%  [.] _int_free
     3.95%  [.] hash_source_code_string
     2.82%  [.] 0x532908
     2.64%  [.] 0x36b538
     0.94%  [H] 0x8000000000e132a4
     0.82%  [H] 0x800000000000e8b0

This happens because our sorting is inconsistent. On the one hand we
check whether both symbols match, and since sym is NULL for two
unresolved samples, they compare equal:

        if (left->ms.sym == right->ms.sym)
                return 0;

On the other hand we use sample IP for unresolved samples when
comparing against a symbol:

       ip_l = left->ms.sym ? left->ms.sym->start : left->ip;
       ip_r = right->ms.sym ? right->ms.sym->start : right->ip;

This means unresolved samples end up spread across the rbtree and we
can't merge them all.
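
The inconsistency can be reproduced outside of perf. The following is a
minimal sketch, not the perf source: struct sym, struct entry and
old_sym_cmp are simplified, hypothetical stand-ins that keep only the
two comparisons quoted above.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical, simplified stand-ins for perf's symbol and hist entry. */
struct sym {
	uint64_t start;		/* start address of the symbol */
};

struct entry {
	struct sym *sym;	/* NULL when the sample could not be resolved */
	uint64_t ip;		/* raw sample address */
};

/* Sketch of the comparison as described above, before the fix. */
static int64_t old_sym_cmp(const struct entry *left, const struct entry *right)
{
	uint64_t ip_l, ip_r;

	/* Two unresolved samples both have sym == NULL, so they "match". */
	if (left->sym == right->sym)
		return 0;

	/* Against a real symbol, an unresolved sample is ordered by its IP. */
	ip_l = left->sym ? left->sym->start : left->ip;
	ip_r = right->sym ? right->sym->start : right->ip;

	return (int64_t)(ip_r - ip_l);
}
```

With a symbol starting at 0x1000 and two unresolved samples at 0x500 and
0x2000, the two samples compare equal to each other but land on opposite
sides of the symbol, so the ordering is not transitive.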

If we use cmp_null, all unresolved samples of a given type end up in
one bucket and the output makes more sense:

...
    39.12%  [.] 0x36b538
     5.37%  [.] yylex
     4.79%  [.] malloc
     4.78%  [.] _int_malloc
     4.03%  [.] _int_free
     3.95%  [.] hash_source_code_string
     2.26%  [H] 0x800000000000e8b0
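
For illustration, the two changes described in this message could be
combined roughly as follows. This is a sketch under assumptions, not the
actual diff: struct sym, struct entry, the level field and new_sym_cmp
are hypothetical simplifications, and cmp_null is written here as a
plain NULL-ordering helper.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical, simplified stand-ins for perf's symbol and hist entry. */
struct sym {
	uint64_t start;
};

struct entry {
	struct sym *sym;	/* NULL when the sample could not be resolved */
	char level;		/* sample type: '.', 'k' or 'H' as in the report */
	uint64_t ip;		/* raw sample address, no longer used for sorting */
};

/* Order NULL before non-NULL; callers pass at least one NULL pointer. */
static int64_t cmp_null(const void *l, const void *r)
{
	if (!l && !r)
		return 0;
	else if (!l)
		return -1;
	else
		return 1;
}

/* Sketch of a consistent comparison (assumed shape, not the real patch). */
static int64_t new_sym_cmp(const struct entry *left, const struct entry *right)
{
	/* Both unresolved: same bucket only if the sample type matches. */
	if (!left->sym && !right->sym)
		return right->level - left->level;

	/* Exactly one unresolved: it always sorts on the same side. */
	if (!left->sym || !right->sym)
		return cmp_null(left->sym, right->sym);

	if (left->sym == right->sym)
		return 0;

	return (int64_t)(right->sym->start - left->sym->start);
}
```

Because the sample IP no longer takes part in the comparison, every
unresolved sample of one type compares equal to the others and sorts on
the same side of every resolved symbol, so they can all be merged.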

Acked-by: Eric B Munson <emunson@mgebm.net>
Cc: Eric B Munson <emunson@mgebm.net>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Ingo Molnar <mingo@elte.hu>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Ian Munsie <imunsie@au1.ibm.com>
Link: http://lkml.kernel.org/r/20110831115145.4f598ab2@kryten
Signed-off-by: Anton Blanchard <anton@samba.org>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2011-09-23 14:37:17 -03:00
arch ARM: fix perf build with uclibc toolchains 2011-08-12 15:40:20 +01:00
bench perf tool: Fix gcc 4.6.0 issues 2011-02-07 12:41:41 -02:00
config perf tools: git mv tools/perf/{features-tests.mak,config/} 2011-04-19 08:18:36 -03:00
Documentation perf probe: Support adding probes on offline kernel modules 2011-07-15 16:25:12 -04:00
python perf evlist: Store pointer to the cpu and thread maps 2011-01-31 12:40:52 -02:00
scripts perf script: Finish the rename from trace to script 2010-12-25 11:29:02 -02:00
util perf sort: Fix symbol sort output by separating unresolved samples by type 2011-09-23 14:37:17 -03:00
.gitignore perf tools: Makefile: Remove various and sundry cruft 2011-02-18 07:43:06 -02:00
builtin-annotate.c perf report/annotate/script: Add option to specify a CPU range 2011-07-05 10:44:44 +02:00
builtin-bench.c perf options: Type check all the remaining OPT_ variants 2010-05-17 16:22:41 -03:00
builtin-buildid-cache.c perf buildid: add perfconfig option to specify buildid cache dir 2010-06-05 09:34:04 -03:00
builtin-buildid-list.c Merge commit 'v2.6.37-rc8' into perf/core 2011-01-04 08:08:54 +01:00
builtin-diff.c perf session: Pass evsel in event_ops->sample() 2011-03-23 19:28:58 -03:00
builtin-evlist.c perf evlist: New command to list the names of events present in a perf.data file 2011-03-15 11:10:48 -03:00
builtin-help.c perf options: Type check all the remaining OPT_ variants 2010-05-17 16:22:41 -03:00
builtin-inject.c perf session: Pass evsel in event_ops->sample() 2011-03-23 19:28:58 -03:00
builtin-kmem.c perf session: Pass evsel in event_ops->sample() 2011-03-23 19:28:58 -03:00
builtin-kvm.c perf options: Type check all the remaining OPT_ variants 2010-05-17 16:22:41 -03:00
builtin-list.c perf list: Allow filtering list of events 2011-02-17 15:38:58 -02:00
builtin-lock.c perf lock: Dropping unsupported ':r' modifier 2011-08-08 09:41:35 -03:00
builtin-probe.c perf probe: Warn when more than one line are given 2011-08-12 09:27:11 -03:00
builtin-record.c perf record: Create events initially disabled and enable after init 2011-09-23 14:36:53 -03:00
builtin-report.c perf report: Use ui__warning in some more places 2011-08-03 12:33:24 -03:00
builtin-sched.c perf sched: Usage leftover from trace -> script rename 2011-08-09 13:32:12 -03:00
builtin-script.c perf report/annotate/script: Add option to specify a CPU range 2011-07-05 10:44:44 +02:00
builtin-stat.c perf tools: Add group event scheduling option to perf record/stat 2011-08-18 07:35:46 -03:00
builtin-test.c perf tools: Make test use the preset debugfs path 2011-07-21 10:41:14 +02:00
builtin-timechart.c perf session: Pass evsel in event_ops->sample() 2011-03-23 19:28:58 -03:00
builtin-top.c perf tools: De-opt the parse_events function 2011-07-21 10:41:11 +02:00
builtin.h perf evlist: New command to list the names of events present in a perf.data file 2011-03-15 11:10:48 -03:00
command-list.txt perf evlist: New command to list the names of events present in a perf.data file 2011-03-15 11:10:48 -03:00
CREDITS perf_counter tools: Add CREDITS file for Git contributors 2009-06-24 19:54:29 +02:00
design.txt perf: Fix few typos + cosmetics 2010-01-13 17:39:44 +01:00
Makefile Merge branch 'perf/core' of git://git.kernel.org/pub/scm/linux/kernel/git/acme/linux into perf/urgent 2011-08-09 16:44:27 +02:00
MANIFEST perf packaging: add memcpy to perf MANIFEST 2010-11-30 23:00:10 -02:00
perf-archive.sh perf buildid: add perfconfig option to specify buildid cache dir 2010-06-05 09:34:04 -03:00
perf.c perf evlist: New command to list the names of events present in a perf.data file 2011-03-15 11:10:48 -03:00
perf.h perf record: Move perf_mmap__write_tail to perf.h 2011-01-22 19:56:29 -02:00