The perf_sample->ip_callchain->nr value includes all the entries in the
ip_callchain->ip[] array, real addresses and PERF_CONTEXT_{KERNEL,USER,etc},
while what the user expects is that what is in the kernel.perf_event_max_stack
sysctl or in the upcoming per event perf_event_attr.sample_max_stack knob be
honoured in terms of IP addresses in the stack trace.
So match the kernel support and validate chain->nr taking into account
both kernel.perf_event_max_stack and kernel.perf_event_max_contexts_per_stack.
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Alexei Starovoitov <ast@kernel.org>
Cc: Brendan Gregg <brendan.d.gregg@gmail.com>
Cc: David Ahern <dsahern@gmail.com>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: He Kuang <hekuang@huawei.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Masami Hiramatsu <mhiramat@kernel.org>
Cc: Milian Wolff <milian.wolff@kdab.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Stephane Eranian <eranian@google.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Vince Weaver <vincent.weaver@maine.edu>
Cc: Wang Nan <wangnan0@huawei.com>
Cc: Zefan Li <lizefan@huawei.com>
Link: http://lkml.kernel.org/n/tip-mgx0jpzfdq4uq4abfa40byu0@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Instead of using a raw string, use DSO__NAME_KALLSYMS and
DSO__NAME_KCORE macros for kallsyms and kcore.
Signed-off-by: Masami Hiramatsu <mhiramat@kernel.org>
Cc: Ananth N Mavinakayanahalli <ananth@linux.vnet.ibm.com>
Cc: Brendan Gregg <brendan.d.gregg@gmail.com>
Cc: Hemant Kumar <hemant@linux.vnet.ibm.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: http://lkml.kernel.org/r/20160515031935.4017.50971.stgit@devbox
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Currently only the task-clock event updates the runtime_nsec so it
cannot show the metric when using cpu-clock events. However cpu clock
works basically same as task-clock, so no need to not update the runtime
IMHO.
Before:
# perf stat -a -e cpu-clock,context-switches,page-faults,cycles sleep 0.1
Performance counter stats for 'system wide':
1217.759506 cpu-clock (msec)
93 context-switches
61 page-faults
18,958,022 cycles
0.101393794 seconds time elapsed
After:
Performance counter stats for 'system wide':
1220.471884 cpu-clock (msec) # 12.013 CPUs utilized
118 context-switches # 0.097 K/sec
59 page-faults # 0.048 K/sec
17,941,247 cycles # 0.015 GHz
0.101594777 seconds time elapsed
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Andi Kleen <andi@firstfloor.org>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Link: http://lkml.kernel.org/r/1463119263-5569-2-git-send-email-namhyung@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
When unwinding callchains on a different machine, vdso info should be
available so the unwind process won't be interrupted if address falls
into vdso region. But in most cases, the addresses of sample events are
not in vdso range, the buildid of a zero hit vdso won't be stored into
perf.data.
This patch stores vdso buildid regardless of whether the vdso is hit or
not.
Signed-off-by: He Kuang <hekuang@huawei.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: David Ahern <dsahern@gmail.com>
Cc: Ekaterina Tumanova <tumanova@linux.vnet.ibm.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Josh Poimboeuf <jpoimboe@redhat.com>
Cc: Kan Liang <kan.liang@intel.com>
Cc: Masami Hiramatsu <masami.hiramatsu.pt@hitachi.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Pekka Enberg <penberg@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Stephane Eranian <eranian@google.com>
Cc: Sukadev Bhattiprolu <sukadev@linux.vnet.ibm.com>
Cc: Wang Nan <wangnan0@huawei.com>
Link: http://lkml.kernel.org/r/1463042596-61703-3-git-send-email-hekuang@huawei.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
When the scaling factor is a full integer don't display fractional
digits. This avoids unnecessary .00 output for topdown metrics with
scale factors.
v2: Remove redundant check.
Signed-off-by: Andi Kleen <ak@linux.intel.com>
Acked-by: Jiri Olsa <jolsa@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: http://lkml.kernel.org/r/1462489447-31832-7-git-send-email-andi@firstfloor.org
[ Rename 'round' to 'stat_round' as 'round' is defined in math.h,
included by this patch, and this breaks the build on ubuntu 12.04 ]
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Use new lsdir() for looking up buildid caches. This changes logic a bit
to ignore all dot files, since the build-id cache must not start with
dot.
Signed-off-by: Masami Hiramatsu <mhiramat@kernel.org>
Signed-off-by: Masami Hiramatsu <masami.hiramatsu.pt@hitachi.com>
Cc: Ananth N Mavinakayanahalli <ananth@linux.vnet.ibm.com>
Cc: Brendan Gregg <brendan.d.gregg@gmail.com>
Cc: Hemant Kumar <hemant@linux.vnet.ibm.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: http://lkml.kernel.org/r/20160511135217.23943.94596.stgit@devbox
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Use lsdir() to search in kcore cache directory. This also avoids
checking hidden dot directory entries, because kcore cache directories
must always have the name from timestamps when taking the kcore
snapshots, and it never start with dot.
Signed-off-by: Masami Hiramatsu <mhiramat@kernel.org>
Cc: Ananth N Mavinakayanahalli <ananth@linux.vnet.ibm.com>
Cc: Brendan Gregg <brendan.d.gregg@gmail.com>
Cc: Hemant Kumar <hemant@linux.vnet.ibm.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: http://lkml.kernel.org/r/20160511135208.23943.68071.stgit@devbox
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Use the existing SBUILD_ID_SIZE macro instead of the equivalent
BUILD_ID_SIZE * 2 + 1 expression for allocating a buffer for build-id
strings.
Signed-off-by: Masami Hiramatsu <mhiramat@kernel.org>
Cc: Ananth N Mavinakayanahalli <ananth@linux.vnet.ibm.com>
Cc: Brendan Gregg <brendan.d.gregg@gmail.com>
Cc: Hemant Kumar <hemant@linux.vnet.ibm.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: http://lkml.kernel.org/r/20160511135159.23943.57120.stgit@devbox
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Fix lsdir() to set correct positive error number (ENOMEM). Since
"errno" must have a positive error number instead of negative number,
fix lsdir to set it correctly.
Signed-off-by: Masami Hiramatsu <mhiramat@kernel.org>
Cc: Ananth N Mavinakayanahalli <ananth@linux.vnet.ibm.com>
Cc: Brendan Gregg <brendan.d.gregg@gmail.com>
Cc: Hemant Kumar <hemant@linux.vnet.ibm.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Fixes: e1ce726e1d ("perf tools: Add lsdir() helper to read a directory")
Link: http://lkml.kernel.org/r/20160511135127.23943.40644.stgit@devbox
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
To reduce the size of builtin-trace.c.
Cc: David Ahern <dsahern@gmail.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Link: http://lkml.kernel.org/n/tip-ovxifncj34ynrjjseg33lil3@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
To reduce the size of builtin-trace.c.
Cc: David Ahern <dsahern@gmail.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Link: http://lkml.kernel.org/n/tip-c4c47w2a2jx13terl2p2hros@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Debug-frame for remote platforms is not related to the host platform, so
we should test each platform separately.
Signed-off-by: He Kuang <hekuang@huawei.com>
Acked-by: Jiri Olsa <jolsa@kernel.org>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: David Ahern <dsahern@gmail.com>
Cc: Ekaterina Tumanova <tumanova@linux.vnet.ibm.com>
Cc: Josh Poimboeuf <jpoimboe@redhat.com>
Cc: Kan Liang <kan.liang@intel.com>
Cc: Masami Hiramatsu <masami.hiramatsu.pt@hitachi.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Pekka Enberg <penberg@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Stephane Eranian <eranian@google.com>
Cc: Sukadev Bhattiprolu <sukadev@linux.vnet.ibm.com>
Cc: Wang Nan <wangnan0@huawei.com>
Link: http://lkml.kernel.org/r/1462866037-30382-5-git-send-email-hekuang@huawei.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Currently only test for local libunwind. We should check all supported
platforms so we can use them to parse perf.data with callchain info on
different machines.
Signed-off-by: He Kuang <hekuang@huawei.com>
Acked-by: Jiri Olsa <jolsa@kernel.org>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: David Ahern <dsahern@gmail.com>
Cc: Ekaterina Tumanova <tumanova@linux.vnet.ibm.com>
Cc: Josh Poimboeuf <jpoimboe@redhat.com>
Cc: Kan Liang <kan.liang@intel.com>
Cc: Masami Hiramatsu <mhiramat@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Pekka Enberg <penberg@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Stephane Eranian <eranian@google.com>
Cc: Sukadev Bhattiprolu <sukadev@linux.vnet.ibm.com>
Cc: Wang Nan <wangnan0@huawei.com>
Link: http://lkml.kernel.org/r/1462866037-30382-4-git-send-email-hekuang@huawei.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
When an IP with an unresolved symbol occurs in the callchain more than
once (ie. recursion), then duplicate symbols can be created because
the callchain nodes are never updated after they are first created.
To fix this issue we call dso__find_symbol whenever we encounter a NULL
symbol, in case we already added a symbol at that IP since we started
traversing the callchain.
This change prevents duplicate symbols from being exported when duplicate
IPs are present in the callchain.
Signed-off-by: Chris Phlipot <cphlipot0@gmail.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: http://lkml.kernel.org/r/1462937209-6032-5-git-send-email-cphlipot0@gmail.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Remove the call to map_ip() to adjust al.addr, because it has already
been called when assembling the callchain, in:
thread__resolve_callchain_sample(perf_sample)
add_callchain_ip(ip = perf_sample->callchain->ips[j])
thread__find_addr_location(addr = ip)
thread__find_addr_map(addr) {
al->addr = addr
if (al->map)
al->addr = al->map->map_ip(al->map, al->addr);
}
Calling it a second time can result in incorrect addresses being used.
This can have effects such as duplicate symbols being created and
exported.
Signed-off-by: Chris Phlipot <cphlipot0@gmail.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: http://lkml.kernel.org/r/1462937209-6032-4-git-send-email-cphlipot0@gmail.com
[ Show the callchain where it is done, to help reviewing this change down the line ]
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Use the dso__insert_symbol function instead of symbols__insert() in
order to properly update the dso symbol cache.
If the cache is not updated, then duplicate symbols can be
unintentionally created, inserted, and exported.
This change prevents duplicate symbols from being exported due to
dso__find_symbol() using a stale symbol cache.
Signed-off-by: Chris Phlipot <cphlipot0@gmail.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: http://lkml.kernel.org/r/1462937209-6032-3-git-send-email-cphlipot0@gmail.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
The current method for inserting symbols is to use the symbols__insert()
function. However symbols__insert() does not update the dso symbol
cache. This causes problems in the following scenario:
1. symbol not found at addr using dso__find_symbol
2. symbol inserted at addr using the existing symbols__insert function
3. symbol still not found at addr using dso__find_symbol() because cache isn't
updated. This is undesired behavior.
The undesired behavior in (3) is addressed by creating a new function,
dso__insert_symbol() to both insert the symbol and update the symbol
cache if necessary.
If dso__insert_symbol() is used in (2) instead of symbols__insert(),
then the undesired behavior in (3) is avoided.
Signed-off-by: Chris Phlipot <cphlipot0@gmail.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: http://lkml.kernel.org/r/1462937209-6032-2-git-send-email-cphlipot0@gmail.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
It probably is equivalent, but that seems to be the "pythonic" way of
dieing? Anyway, one less die() in the tools/perf codebase.
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: David Ahern <dsahern@gmail.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Wang Nan <wangnan0@huawei.com>
Cc: Chris Phlipot <cphlipot0@gmail.com>
Link: http://lkml.kernel.org/n/tip-nlzgepdv2818zs4e7faif9tu@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Pull perf/core improvements and fixes from Arnaldo Carvalho de Melo:
User visible changes:
- Recording 'dwarf' callchains do not need DWARF unwinding support (He Kuang)
- Print recently added perf_event_attr.write_backward bit flag in -vv
verbose mode (Arnaldo Carvalho de Melo)
- Fix incorrect python db-export error message in 'perf script' (Chris Phlipot)
- Fix handling of zero-length symbols (Chris Phlipot)
- perf stat: Scale values by unit before metrics (Andi Kleen)
Infrastructure changes:
- Rewrite strbuf not to die(), making tools using it to check its
return value instead (Masami Hiramatsu)
- Support reading from backward ring buffer, add a 'perf test' entry
for it (Wang Nan)
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Replace ALLOC_GROW with normal realloc code in add_cmd_list() so that it
can handle errors directly.
Signed-off-by: Masami Hiramatsu <mhiramat@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: http://lkml.kernel.org/r/20160510054752.6158.30562.stgit@devbox
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Make pmu_formats_string() to check return value of strbuf APIs so that
it can detect errors in it.
Signed-off-by: Masami Hiramatsu <mhiramat@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: http://lkml.kernel.org/r/20160510054744.6158.37810.stgit@devbox
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Make topology checkers to check the return value of strbuf APIs so that
it can detect errors in it.
Signed-off-by: Masami Hiramatsu <mhiramat@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: http://lkml.kernel.org/r/20160510054735.6158.98650.stgit@devbox
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Make alias handler and sq_quote_argv to check the return value of strbuf
APIs.
In sq_quote_argv() calls die(), but this fix handles strbuf failure as a
special case and returns to caller, since the caller - handle_alias()
also has to check the return value of other strbuf APIs and those checks
can be merged to one if() statement.
Signed-off-by: Masami Hiramatsu <mhiramat@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: http://lkml.kernel.org/r/20160510054725.6158.84597.stgit@devbox
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Make check_emacsclient_version() to check the return value of strbuf
APIs so that it can handle errors in strbuf.
Signed-off-by: Masami Hiramatsu <mhiramat@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: http://lkml.kernel.org/r/20160510054716.6158.11755.stgit@devbox
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Check the return value of strbuf APIs in perf-probe
related code, so that it can handle errors in strbuf.
Signed-off-by: Masami Hiramatsu <mhiramat@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: http://lkml.kernel.org/r/20160510054707.6158.69861.stgit@devbox
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Rewrite strbuf implementation not to use die() nor xrealloc(). Instead
of die(), now most of the API returns error code or 0 if succeeded.
Suggested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Signed-off-by: Masami Hiramatsu <mhiramat@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: http://lkml.kernel.org/r/20160510054658.6158.24080.stgit@devbox
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
This change introduces a fix to symbols__find, so that it is able to
find symbols of length zero (where start == end).
The current code has the following problem:
- The current implementation of symbols__find is unable to find any symbols
of length zero.
- The db-export framework explicitly creates zero length symbols at
locations where no symbol currently exists.
The combination of the two above behaviors results in behavior similar
to the example below.
1. addr_location is created for a sample, but symbol is unable to be
resolved.
2. db export creates an "unknown" symbol of length zero at that address
and inserts it into the dso.
3. A new sample comes in at the same address, but symbol__find is unable
to find the zero length symbol, so it is still unresolved.
4. db export sees the symbol is unresolved, and allocated a duplicate
symbol, even though it already did this in step 2.
This behavior continues every time an address without symbol information
is seen, which causes a very large number of these symbols to be
allocated.
The effect of this fix can be observed by looking at the contents of an
exported database before/after the fix (generated with
scripts/python/export-to-postgresql.py)
Ex.
BEFORE THE CHANGE:
example_db=# select count(*) from symbols;
count
--------
900213
(1 row)
example_db=# select count(*) from symbols where symbols.name='unknown';
count
--------
897355
(1 row)
example_db=# select count(*) from symbols where symbols.name!='unknown';
count
-------
2858
(1 row)
AFTER THE CHANGE:
example_db=# select count(*) from symbols;
count
-------
25217
(1 row)
example_db=# select count(*) from symbols where name='unknown';
count
-------
22359
(1 row)
example_db=# select count(*) from symbols where name!='unknown';
count
-------
2858
(1 row)
Signed-off-by: Chris Phlipot <cphlipot0@gmail.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: http://lkml.kernel.org/r/1462612620-25008-1-git-send-email-cphlipot0@gmail.com
[ Moved the test to later in the rb_tree tests, as this not the likely case ]
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Now we can see if it is set when using verbose mode in various tools,
such as 'perf test':
# perf test -vv back
45: Test backward reading from ring buffer :
--- start ---
<SNIP>
------------------------------------------------------------
perf_event_attr:
type 2
size 112
config 0x98
{ sample_period, sample_freq } 1
sample_type IP|TID|TIME|CPU|PERIOD|RAW
disabled 1
mmap 1
comm 1
task 1
sample_id_all 1
exclude_guest 1
mmap2 1
comm_exec 1
write_backward 1
------------------------------------------------------------
sys_perf_event_open: pid 20911 cpu -1 group_fd -1 flags 0x8
<SNIP>
---- end ----
Test backward reading from ring buffer: Ok
#
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: David Ahern <dsahern@gmail.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Milian Wolff <milian.wolff@kdab.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Wang Nan <wangnan0@huawei.com>
Link: http://lkml.kernel.org/n/tip-kxv05kv9qwl5of7rzfeiiwbv@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
This test checks reading from backward ring buffer.
Test result:
# ~/perf test 'ring buffer'
45: Test backward reading from ring buffer : Ok
The test case is a while loop which calls prctl(PR_SET_NAME) multiple
times. Each prctl should issue 2 events: one PERF_RECORD_SAMPLE, one
PERF_RECORD_COMM.
The first round creates a relative large ring buffer (256 pages). It can
afford all events. Read from it and check the count of each type of
events.
The second round creates a small ring buffer (1 page) and makes it
overwritable. Check the correctness of the buffer.
Signed-off-by: Wang Nan <wangnan0@huawei.com>
Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Zefan Li <lizefan@huawei.com>
Cc: pi3orama@163.com
Link: http://lkml.kernel.org/r/1462758471-89706-3-git-send-email-wangnan0@huawei.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
perf_evlist__mmap_read_backward() is introduced for reading backward
ring buffer. Since direction for reading such ring buffer is different
from the direction kernel writing to it, and since user need to fetch
most recent record from it, a perf_evlist__mmap_read_catchup() is
introduced to move the reading pointer to the end of the buffer.
Signed-off-by: Wang Nan <wangnan0@huawei.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Zefan Li <lizefan@huawei.com>
Cc: pi3orama@163.com
Link: http://lkml.kernel.org/r/1462758471-89706-2-git-send-email-wangnan0@huawei.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Fix the error message printed when attempting and failing to create the
call path root incorrectly references the call return process.
This change fixes the message to properly reference the failure to
create the call path root.
Signed-off-by: Chris Phlipot <cphlipot0@gmail.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: http://lkml.kernel.org/r/1462612620-25008-2-git-send-email-cphlipot0@gmail.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Scale values by unit before passing them to the metrics printing
functions. This is needed for TopDown, because it needs to scale the
slots correctly by pipeline width / SMTness.
For existing metrics it shouldn't make any difference, as those
generally use events that don't have any units.
Signed-off-by: Andi Kleen <ak@linux.intel.com>
Acked-by: Jiri Olsa <jolsa@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: http://lkml.kernel.org/r/1462489447-31832-8-git-send-email-andi@firstfloor.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
There is no need to check for DWARF unwinding support when using the
'dwarf' callchain record method, as this will only ask the kernel to
collect stack dumps for later DWARF CFI processing, which can be done in
another machine, where the support for DWARF unwinding need to be
present.
Signed-off-by: He Kuang <hekuang@huawei.com>
Acked-by: Jiri Olsa <jolsa@kernel.org>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: David Ahern <dsahern@gmail.com>
Cc: Ekaterina Tumanova <tumanova@linux.vnet.ibm.com>
Cc: Josh Poimboeuf <jpoimboe@redhat.com>
Cc: Kan Liang <kan.liang@intel.com>
Cc: Masami Hiramatsu <mhiramat@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Pekka Enberg <penberg@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Stephane Eranian <eranian@google.com>
Cc: Sukadev Bhattiprolu <sukadev@linux.vnet.ibm.com>
Cc: Wang Nan <wangnan0@huawei.com>
Link: http://lkml.kernel.org/r/1462525154-125656-2-git-send-email-hekuang@huawei.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
klogctl can fail and return -ve len, so check for this and
return NULL to avoid passing a (size_t)-1 to malloc.
Signed-off-by: Colin Ian King <colin.king@canonical.com>
Acked-by: Daniel Borkmann <daniel@iogearbox.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
To reduce the size of builtin-trace.c.
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: David Ahern <dsahern@gmail.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Wang Nan <wangnan0@huawei.com>
Link: http://lkml.kernel.org/n/tip-vb8dpy7bptkf219q5c25ulfp@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
To reduce the size of builtin-trace.c.
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: David Ahern <dsahern@gmail.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Wang Nan <wangnan0@huawei.com>
Link: http://lkml.kernel.org/n/tip-jt293541hv9od7gqw6lilioh@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
To reduce the size of builtin-trace.c.
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: David Ahern <dsahern@gmail.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Wang Nan <wangnan0@huawei.com>
Link: http://lkml.kernel.org/n/tip-qecqxwwtreio6eaatfv58yq5@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Add debug output of raw counter values per CPU when perf stat -v is
specified, together with their cpu numbers. This is very useful to
debug problems with per core counters, where we can normally only see
aggregated values.
v2: Make it depend on -vv, not -v
Signed-off-by: Andi Kleen <ak@linux.intel.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: http://lkml.kernel.org/r/1461787251-6702-12-git-send-email-andi@firstfloor.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Update the export-to-postgresql.py to support the newly introduced
callchain export.
callchains are added into the existing call_paths table and can now
be associated with samples when the "callpaths" commandline option
is used with the script.
Ex.:
$ perf script -s export-to-postgresql.py example_db all callchains
Includes the following changes to enable callchain export via the python export
APIs:
- Add the "callchains" commandline option, which is used to enable
callchain export by setting the perf_db_export_callchains global
- Add perf_db_export_callchains checks for call_path table creation
and population.
- Add call_path_id to samples_table to conform with the new API
example usage and output using a small test app:
test_app.c:
volatile int x = 0;
void inc_x_loop()
{
int i;
for(i=0; i<100000000; i++)
x++;
}
void a()
{
inc_x_loop();
}
void b()
{
inc_x_loop();
}
int main()
{
a();
b();
return 0;
}
example usage:
$ gcc -g -O0 test_app.c
$ perf record --call-graph=dwarf ./a.out
[ perf record: Woken up 77 times to write data ]
[ perf record: Captured and wrote 19.373 MB perf.data (2404 samples) ]
$ perf script -s scripts/python/export-to-postgresql.py
example_db all callchains
$ psql example_db
example_db=#
SELECT
(SELECT name FROM symbols WHERE id = cps.symbol_id) as symbol,
(SELECT name FROM symbols WHERE id =
(SELECT symbol_id from call_paths where id = cps.parent_id))
as parent_symbol,
sum(period) as event_count
FROM samples join call_paths as cps on call_path_id = cps.id
GROUP BY cps.id,evsel_id
ORDER BY event_count DESC
LIMIT 5;
symbol | parent_symbol | event_count
------------------+--------------------------+-------------
inc_x_loop | a | 734250982
inc_x_loop | b | 731028057
unknown | unknown | 1335858
task_tick_fair | scheduler_tick | 1238842
update_wall_time | tick_do_update_jiffies64 | 650373
(5 rows)
The above data shows total "self time" in cycles for each call path that was
sampled. It is intended to demonstrate how it accounts separately for the two
ways to reach the "inc_x_loop" function(via "a" and "b"). Recursive common
table expressions can be used as well to get cumulative time spent in a
function as well, but that is beyond the scope of this basic example.
Signed-off-by: Chris Phlipot <cphlipot0@gmail.com>
Acked-by: Adrian Hunter <adrian.hunter@intel.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: http://lkml.kernel.org/r/1461831551-12213-7-git-send-email-cphlipot0@gmail.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
This change allows python scripts to be able to utilize the recent
changes to the db export api allowing the export of call_paths derived
from sampled callchains. These call paths are also now associated with
the samples from which they were derived.
- This feature is enabled by setting "perf_db_export_callchains" to true
- When enabled, samples that have callchain information will have the
callchains exported via call_path_table
- The call_path_id field is added to sample_table to enable association of
samples with the corresponding callchain stored in the call paths
table. A call_path_id of 0 will be exported if there is no
corresponding callchain.
- When "perf_db_export_callchains" and "perf_db_export_calls" are both
set to True, the call path root data structure will be shared. This
prevents duplicating of data and call path ids that would result from
building two separate call path trees in memory.
- The call_return_processor structure definition was relocated to the header
file to make its contents visible to db-export.c. This enables the
sharing of call path trees between the two features, as mentioned
above.
This change is visible to python scripts using the python db export api.
The change is backwards compatible with scripts written against the
previous API, assuming that the scripts model the sample_table function
after the one in export-to-postgresql.py script by allowing for
additional arguments to be added in the future. ie. using *x as the
final argument of the sample_table function.
Signed-off-by: Chris Phlipot <cphlipot0@gmail.com>
Acked-by: Adrian Hunter <adrian.hunter@intel.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: http://lkml.kernel.org/r/1461831551-12213-6-git-send-email-cphlipot0@gmail.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
The exported sample now contains a reference to the call_path_id that
represents its callchain.
While callchains themselves are nice to have, being able to associate
them with samples makes them much more useful, and can allow for such
things as determining how much cumulative time is spent in a particular
function. This information is normally possible to get from the call
return processor. However, when doing normal sampling, call/return
information is not available, thus necessitating the need for
associating samples directly with call paths.
This commit include changes to db-export layer to make this information
available for subsequent patches in this change set, but by itself, does
not make any changes visible to the user.
Signed-off-by: Chris Phlipot <cphlipot0@gmail.com>
Acked-by: Adrian Hunter <adrian.hunter@intel.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: http://lkml.kernel.org/r/1461831551-12213-5-git-send-email-cphlipot0@gmail.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
This change enables the db export api to export callchains. This is
accomplished by adding callchains obtained from samples to the
call_path_root structure and exporting them via the current call path
export API.
While the current API does support exporting call paths, this is not
supported when sampling. This commit addresses that missing feature by
allowing the export of call paths when callchains are present in
samples.
Summary:
- This feature is activated by initializing the call_path_root member
inside the db_export structure to a non-null value.
- Callchains are resolved with thread__resolve_callchain() and then stored
and exported by adding a call path under call path root.
- Symbol and DSO for each callchain node are exported via db_ids_from_al()
This commit puts in place infrastructure to be used by subsequent commits,
and by itself, does not introduce any user-visible changes.
Signed-off-by: Chris Phlipot <cphlipot0@gmail.com>
Acked-by: Adrian Hunter <adrian.hunter@intel.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: http://lkml.kernel.org/r/1461831551-12213-4-git-send-email-cphlipot0@gmail.com
[ Made adjustments suggested by Adrian Hunter, see thread via this cset's Link: tag ]
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Move the call path handling code out of thread-stack.c and
thread-stack.h to allow other components that are not part of
thread-stack to create call paths.
Summary:
- Create call-path.c and call-path.h and add them to the build.
- Move all call path related code out of thread-stack.c and thread-stack.h
and into call-path.c and call-path.h.
- A small subset of structures and functions are now visible through
call-path.h, which is required for thread-stack.c to continue to
compile.
This change is a prerequisite for subsequent patches in this change set
and by itself contains no user-visible changes.
Signed-off-by: Chris Phlipot <cphlipot0@gmail.com>
Acked-by: Adrian Hunter <adrian.hunter@intel.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: http://lkml.kernel.org/r/1461831551-12213-3-git-send-email-cphlipot0@gmail.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
The existing implementation of thread__resolve_callchain, under certain
circumstances, can assemble callchain entries in the incorrect order.
The callchain entries are resolved incorrectly for a sample when all of
the following conditions are met:
1. callchain_param.order is set to ORDER_CALLER
2. thread__resolve_callchain_sample is able to resolve callchain entries
for the sample.
3. unwind__get_entries is also able to resolve callchain entries for the
sample.
The fix is accomplished by reversing the order in which
thread__resolve_callchain_sample and unwind__get_entries are called when
callchain_param.order is set to ORDER_CALLER.
Unwind specific code from thread__resolve_callchain is also moved into a
new static function to improve readability of the fix.
How to Reproduce the Existing Bug:
Modifying perf script to print call trees in the opposite order or
applying the remaining patches from this series and comparing the
results output from export-to-postgtresql.py are the easiest ways to see
the bug, however it can still be seen in current builds using perf
report.
Here is how i can reproduce the bug using perf report:
# perf record --call-graph=dwarf stress -c 1 -t 5
when i run this command:
# perf report --call-graph=flat,0,0,callee
This callchain, containing kernel (handle_irq_event, etc) and userspace
samples (__libc_start_main, etc) is contained in the output, which looks
correct (callee order):
gen8_irq_handler
handle_irq_event_percpu
handle_irq_event
handle_edge_irq
handle_irq
do_IRQ
ret_from_intr
__random
rand
0x558f2a04dded
0x558f2a04c774
__libc_start_main
0x558f2a04dcd9
Now run this command using caller order:
# perf report --call-graph=flat,0,0,caller
It is expected to see the exact reverse of the above when using caller
order (with "0x558f2a04dcd9" at the top and "gen8_irq_handler" at the
bottom) in the output, but it is nowhere to be found.
instead you see this:
ret_from_intr
do_IRQ
handle_irq
handle_edge_irq
handle_irq_event
handle_irq_event_percpu
gen8_irq_handler
0x558f2a04dcd9
__libc_start_main
0x558f2a04c774
0x558f2a04dded
rand
__random
Notice how internally the kernel symbols are reversed and the user space
symbols are reversed, but the kernel symbols still appear above the user
space symbols.
if this patch is applied and perf script is re-run, you will see the
expected output (with "0x558f2a04dcd9" at the top and "gen8_irq_handler"
at the bottom):
0x558f2a04dcd9
__libc_start_main
0x558f2a04c774
0x558f2a04dded
rand
__random
ret_from_intr
do_IRQ
handle_irq
handle_edge_irq
handle_irq_event
handle_irq_event_percpu
gen8_irq_handler
Signed-off-by: Chris Phlipot <cphlipot0@gmail.com>
Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Acked-by: Jiri Olsa <jolsa@kernel.org>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: http://lkml.kernel.org/r/1461831551-12213-2-git-send-email-cphlipot0@gmail.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
The test to check if the arg format had been read from the
syscall:sys_enter_name/format file was looking at the list of non-commom
fields, and if that is empty, it would think it had failed to read it,
because it doesn't exist, for instance, for the clone() syscall.
So instead before dumping the raw syscall args list check
IS_ERR(sc->tp_format), if that is true, then an attempt was made to read
the format file and failed, in which case dump the raw arg list values.
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: David Ahern <dsahern@gmail.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Milian Wolff <milian.wolff@kdab.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Wang Nan <wangnan0@huawei.com>
Link: http://lkml.kernel.org/n/tip-ls7pmdqb2xy9339vdburwvnk@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>