linux_dsm_epyc7002/tools/perf/Documentation
Andi Kleen 8b7bad58ef perf callchain: Support handling complete branch stacks as histograms
Currently branch stacks can only be shown as edge histograms for
individual branches. I never found this display particularly useful.

This implements an alternative mode that creates histograms over
complete branch traces, instead of individual branches, similar to how
normal callgraphs are handled. This is done by putting the branch trace
in front of the normal callgraph and then using the normal callgraph
histogram infrastructure to unify them.
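
As a rough illustration of "putting the branch trace in front of the normal callgraph", the sketch below builds one combined chain from a branch stack and a callchain. It is not the code from this patch: `struct branch`, `build_chain` and the flat `uint64_t` output array are hypothetical simplifications, and it assumes the branch stack is ordered newest entry first, with each entry contributing its target and then its source address.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical, simplified branch record; not perf's real structure. */
struct branch {
	uint64_t from;	/* address of the branch instruction */
	uint64_t to;	/* address the branch jumped to */
};

/*
 * Build one combined chain in 'out': first the branch stack (assumed to be
 * newest entry first), each entry emitted as target then source, followed
 * by the ordinary callchain. Returns the number of entries written.
 */
static size_t build_chain(const struct branch *br, size_t nr_br,
			  const uint64_t *callchain, size_t nr_cc,
			  uint64_t *out, size_t max)
{
	size_t n = 0;

	assert(out != NULL);

	/* Branch entries go in front, two addresses per entry. */
	for (size_t i = 0; i < nr_br && n + 2 <= max; i++) {
		out[n++] = br[i].to;
		out[n++] = br[i].from;
	}

	/* The normal callchain follows unchanged. */
	for (size_t i = 0; i < nr_cc && n < max; i++)
		out[n++] = callchain[i];

	return n;
}
```

Once the two sources are merged into a single sequence like this, a histogram layer that only understands callchains can treat branch entries and call entries alike.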

This way, in complex functions we can understand the control flow that
led to a particular sample, and may even see some control flow in the
caller for short functions.

Example (simplified; for code this simple the feature is usually not
needed). Please run this only after the whole patchkit is in: at this
point in the patch order there is no --branch-history option yet; it is
added in a patch after this one:

tcall.c:

volatile int a = 10000, b = 100000, c;

__attribute__((noinline)) void f2(void)
{
	c = a / b;	/* volatile accesses keep the division from being optimized away */
}

__attribute__((noinline)) void f1(void)
{
	f2();
	f2();
}
int main(void)
{
	int i;
	for (i = 0; i < 1000000; i++)
		f1();
}

% perf record -b -g ./tsrc/tcall
[ perf record: Woken up 1 times to write data ]
[ perf record: Captured and wrote 0.044 MB perf.data (~1923 samples) ]
% perf report --no-children --branch-history
...
    54.91%  tcall.c:6  [.] f2                      tcall
            |
            |--65.53%-- f2 tcall.c:5
            |          |
            |          |--70.83%-- f1 tcall.c:11
            |          |          f1 tcall.c:10
            |          |          main tcall.c:18
            |          |          main tcall.c:18
            |          |          main tcall.c:17
            |          |          main tcall.c:17
            |          |          f1 tcall.c:13
            |          |          f1 tcall.c:13
            |          |          f2 tcall.c:7
            |          |          f2 tcall.c:5
            |          |          f1 tcall.c:12
            |          |          f1 tcall.c:12
            |          |          f2 tcall.c:7
            |          |          f2 tcall.c:5
            |          |          f1 tcall.c:11
            |          |
            |           --29.17%-- f1 tcall.c:12
            |                     f1 tcall.c:12
            |                     f2 tcall.c:7
            |                     f2 tcall.c:5
            |                     f1 tcall.c:11
            |                     f1 tcall.c:10
            |                     main tcall.c:18
            |                     main tcall.c:18
            |                     main tcall.c:17
            |                     main tcall.c:17
            |                     f1 tcall.c:13
            |                     f1 tcall.c:13
            |                     f2 tcall.c:7
            |                     f2 tcall.c:5
            |                     f1 tcall.c:12

The default output is unchanged.

This is only implemented in perf report, no change to record or anywhere
else.

This adds the basic code to report:

- add a new "branch" option to the -g option parser to enable this mode
- when the flag is set, include the LBR (Last Branch Record) in the
callstack in machine.c
- detect overlaps with the callchain
- remove small loop duplicates in the LBR

The rest of the history code is unchanged and doesn't know the
difference between an LBR entry and a normal call entry.
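
To give an idea of what removing small loop duplicates from a branch trace involves, here is a minimal brute-force sketch. It is not the implementation in machine.c: `collapse_loops` and `MAX_LOOP` are hypothetical names, and the trace is reduced to a plain array of addresses. It collapses any block of up to `MAX_LOOP` entries that is immediately followed by an identical copy of itself.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define MAX_LOOP 4	/* assumed limit: longest repeated block we look for */

/*
 * Collapse immediately repeated blocks of up to MAX_LOOP entries in place,
 * e.g. A B A B A B C becomes A B C. Returns the new number of entries.
 */
static size_t collapse_loops(uint64_t *ip, size_t n)
{
	size_t i = 0;

	assert(ip != NULL || n == 0);

	while (i < n) {
		int removed = 0;

		for (size_t p = 1; p <= MAX_LOOP && i + 2 * p <= n; p++) {
			if (memcmp(&ip[i], &ip[i + p], p * sizeof(*ip)) == 0) {
				/* Drop the second copy of the block. */
				memmove(&ip[i + p], &ip[i + 2 * p],
					(n - i - 2 * p) * sizeof(*ip));
				n -= p;
				removed = 1;
				break;
			}
		}
		/* Stay at the same position while it keeps shrinking. */
		if (!removed)
			i++;
	}
	return n;
}
```

Without a step like this, a tight loop can fill the limited number of LBR entries with many copies of the same few branches, which would split otherwise identical histories into separate histogram entries.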

Current limitations:

- The LBR flags (mispredict etc.) are not shown in the history
and LBR entries have no special marker.
- It would be nice if annotate marked the LBR entries somehow
(e.g. with arrows)

v2: Various fixes.
v3: Merge further patches into this one. Fix white space.
v4: Improve manpage. Address review feedback.
v5: Rename functions. Better error message without -g. Fix crash without
    -b.
v6: Rebase
v7: Rebase. Use NO_ENTRY in memset.
v8: Port to latest tip. Move add_callchain_ip to separate
    patch. Skip initial entries in callchain. Minor cleanups.

Signed-off-by: Andi Kleen <ak@linux.intel.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Link: http://lkml.kernel.org/r/1415844328-4884-3-git-send-email-andi@firstfloor.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2014-12-01 20:00:31 -03:00
android.txt
asciidoc.conf
examples.txt perf record: Remove -f/--force option 2013-07-08 17:37:25 -03:00
jit-interface.txt
Makefile perf tools: Implement summary output for 'make install' 2013-10-11 12:18:11 -03:00
manpage-1.72.xsl
manpage-base.xsl
manpage-bold-literal.xsl
manpage-normal.xsl
manpage-suppress-sp.xsl
perf-annotate.txt
perf-archive.txt perf archive: Remove duplicated 'runs' in man page 2013-12-09 15:21:45 -03:00
perf-bench.txt perf bench: Add --repeat option 2014-06-19 16:13:15 -03:00
perf-buildid-cache.txt perf buildid-cache: Add ability to add kcore to the cache 2013-10-14 12:20:38 -03:00
perf-buildid-list.txt
perf-diff.txt perf Documentation: Fix typos in perf/Documentation 2014-10-15 17:39:02 -03:00
perf-evlist.txt
perf-help.txt
perf-inject.txt perf inject: Add --kallsyms parameter 2014-07-25 12:08:34 -03:00
perf-kmem.txt
perf-kvm.txt perf Documentation: Fix typos in perf/Documentation 2014-10-15 17:39:02 -03:00
perf-list.txt perf Documentation: Fix typos in perf/Documentation 2014-10-15 17:39:02 -03:00
perf-lock.txt perf lock: Account for lock average wait time 2013-10-09 11:24:01 -03:00
perf-mem.txt perf mem: Clarify load-latency in documentation 2014-03-14 11:20:44 -03:00
perf-probe.txt perf tools: Disable kernel symbol demangling by default 2014-09-17 17:08:09 -03:00
perf-record.txt perf record: Add new -I option to sample interrupted machine state 2014-11-16 11:42:02 +01:00
perf-report.txt perf callchain: Support handling complete branch stacks as histograms 2014-12-01 20:00:31 -03:00
perf-sched.txt
perf-script-perl.txt perf Documentation: Fix typos in perf/Documentation 2014-10-15 17:39:02 -03:00
perf-script-python.txt perf Documentation: Fix typos in perf/Documentation 2014-10-15 17:39:02 -03:00
perf-script.txt perf script: Add period data column 2014-10-17 15:21:30 -03:00
perf-stat.txt perf stat: Fix --delay option in man page 2014-01-13 10:06:24 -03:00
perf-test.txt perf Documentation: Fix typos in perf/Documentation 2014-10-15 17:39:02 -03:00
perf-timechart.txt perf timechart: Add more options to IO mode 2014-07-10 00:22:54 +02:00
perf-top.txt perf tools: Disable kernel symbol demangling by default 2014-09-17 17:08:09 -03:00
perf-trace.txt perf Documentation: Fix typos in perf/Documentation 2014-10-15 17:39:02 -03:00
perf.txt perf tools: Add --debug option to set debug variable 2014-07-17 12:58:59 -03:00
perfconfig.example