Implement functions of initialization, finalization and processing of
control command messages coming from control file descriptors.
Allocate control file descriptor as descriptor at struct pollfd object
of evsel_list for atomic poll() operation.
Signed-off-by: Alexey Budankov <alexey.budankov@linux.intel.com>
Acked-by: Jiri Olsa <jolsa@redhat.com>
Acked-by: Namhyung Kim <namhyung@kernel.org>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: http://lore.kernel.org/lkml/62518ceb-1cc9-2aba-593b-55408d07c1bf@linux.intel.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Store flags per struct pollfd *entries object in a bitmap of int size.
Implement fdarray_flag__nonfilterable flag to skip object from counting
by fdarray__filter().
Fixed fdarray test issue reported by kernel test robot.
Reported-by: kernel test robot <rong.a.chen@intel.com>
Signed-off-by: Alexey Budankov <alexey.budankov@linux.intel.com>
Acked-by: Jiri Olsa <jolsa@redhat.com>
Acked-by: Namhyung Kim <namhyung@kernel.org>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: http://lore.kernel.org/lkml/6b7d43ff-0801-d5dd-4e90-fcd86b17c1c8@linux.intel.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Change the counter name DLFT_CCERROR to DLFT_CCFINISH on IBM z15.
This counter counts completed DEFLATE instructions with exit code
0, 1 or 2. Since exit code 0 means success and exit code 1 or 2
indicate errors, change the counter name to avoid confusion.
This counter is incremented each time the DEFLATE instruction
completed regardless if an error was detected or not.
Fixes: d68d5d51dc ("s390/cpum_cf: Add new extended counters for IBM z15")
Fixes: e7950166e4 ("perf vendor events s390: Add new deflate counters for IBM z15")
Cc: stable@vger.kernel.org # v5.7
Signed-off-by: Thomas Richter <tmricht@linux.ibm.com>
Reviewed-by: Sumanth Korikkar <sumanthk@linux.ibm.com>
Signed-off-by: Heiko Carstens <hca@linux.ibm.com>
Avoid moving of fds by fdarray__filter() so fds indices returned by
fdarray__add() can be used for access and processing of objects at
struct pollfd *entries.
Signed-off-by: Alexey Budankov <alexey.budankov@linux.intel.com>
Acked-by: Jiri Olsa <jolsa@redhat.com>
Acked-by: Namhyung Kim <namhyung@kernel.org>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: http://lore.kernel.org/lkml/676844f8-55d3-c628-23db-aa163a81519e@linux.intel.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Now that the ->compat_{get,set}sockopt proto_ops methods are gone
there is no good reason left to keep the compat syscalls separate.
This fixes the odd use of unsigned int for the compat_setsockopt
optlen and the missing sock_use_custom_sol_socket.
It would also easily allow running the eBPF hooks for the compat
syscalls, but such a large change in behavior does not belong into
a consolidation patch like this one.
Signed-off-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: David S. Miller <davem@davemloft.net>
To pick up the changes in:
b2f9f1535b ("libbpf: Fix libbpf hashmap on (I)LP32 architectures")
Silencing this warning:
Warning: Kernel ABI header at 'tools/perf/util/hashmap.h' differs from latest version at 'tools/lib/bpf/hashmap.h'
diff -u tools/perf/util/hashmap.h tools/lib/bpf/hashmap.h
I'll eventually update the warning to remove the "Kernel ABI" part
and instead state libbpf when noticing that the original is at
"tools/lib/something".
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Alexei Starovoitov <ast@kernel.org>
Cc: Andrii Nakryiko <andriin@fb.com>
Cc: Jakub Bogusz <qboosh@pld-linux.org>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Ian Rogers <irogers@google.com>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Add 'struct expr_id_data' to keep an expr value instead of just a simple
double pointer, so we can store more data for ID in the following
changes.
Signed-off-by: Jiri Olsa <jolsa@kernel.org>
Acked-by: Ian Rogers <irogers@google.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: John Garry <john.garry@huawei.com>
Cc: Kajol Jain <kjain@linux.ibm.com>
Cc: Michael Petlan <mpetlan@redhat.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Paul Clarke <pc@us.ibm.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Stephane Eranian <eranian@google.com>
Link: http://lore.kernel.org/lkml/20200712132634.138901-3-jolsa@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Rename expr__add_id() to expr__add_val() so we can use expr__add_id() to
actually add just the id without any value in following changes.
There's no functional change.
Signed-off-by: Jiri Olsa <jolsa@kernel.org>
Acked-by: Ian Rogers <irogers@google.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: John Garry <john.garry@huawei.com>
Cc: Kajol Jain <kjain@linux.ibm.com>
Cc: Michael Petlan <mpetlan@redhat.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Paul Clarke <pc@us.ibm.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Stephane Eranian <eranian@google.com>
Link: http://lore.kernel.org/lkml/20200712132634.138901-2-jolsa@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Warn if the probe target function is a GNU indirect function (GNU_IFUNC)
because it may not be what the user wants to probe.
The GNU indirect function ( https://sourceware.org/glibc/wiki/GNU_IFUNC )
is the dynamic symbol solved at runtime. An IFUNC function is a selector
which is invoked from the ELF loader, but the symbol address of the
function which will be modified by the IFUNC is the same as the IFUNC in
the symbol table. This can confuse users trying to probe such functions.
For example, memcpy is an IFUNC.
probe_libc:memcpy (on __new_memcpy_ifunc@x86_64/multiarch/memcpy.c in /usr/lib64/libc-2.30.so)
the probe is put on an IFUNC.
perf 1742 [000] 26201.715632: probe_libc:memcpy: (7fdaa53824c0)
7fdaa53824c0 __new_memcpy_ifunc+0x0 (inlined)
7fdaa5d4a980 elf_machine_rela+0x6c0 (inlined)
7fdaa5d4a980 elf_dynamic_do_Rela+0x6c0 (inlined)
7fdaa5d4a980 _dl_relocate_object+0x6c0 (/usr/lib64/ld-2.30.so)
7fdaa5d42155 dl_main+0x1cc5 (/usr/lib64/ld-2.30.so)
7fdaa5d5831a _dl_sysdep_start+0x54a (/usr/lib64/ld-2.30.so)
7fdaa5d3ffeb _dl_start_final+0x25b (inlined)
7fdaa5d3ffeb _dl_start+0x25b (/usr/lib64/ld-2.30.so)
7fdaa5d3f117 .annobin_rtld.c+0x7 (inlined)
And the event is invoked from the ELF loader instead of the target
program's main code.
Moreover, at this moment, we can not probe on the function which will
be selected by the IFUNC, because it is determined at runtime. But
uprobe will be prepared before running the target binary.
Thus, I decided to warn user when 'perf probe' detects that the probe
point is on an GNU IFUNC symbol. Someone who wants to probe an IFUNC
symbol to debug the IFUNC function can ignore this warning.
Committer notes:
I.e., this warning will be emitted if the probe point is an IFUNC:
"Warning: The probe function (%s) is a GNU indirect function.\n"
"Consider identifying the final function used at run time and set the probe directly on that.\n"
Complete set of steps:
# readelf -sW /lib64/libc-2.29.so | grep IFUNC | tail
22196: 0000000000109a80 183 IFUNC GLOBAL DEFAULT 14 __memcpy_chk
22214: 00000000000b7d90 191 IFUNC GLOBAL DEFAULT 14 __gettimeofday
22336: 000000000008b690 60 IFUNC GLOBAL DEFAULT 14 memchr
22350: 000000000008b9b0 89 IFUNC GLOBAL DEFAULT 14 __stpcpy
22420: 000000000008bb10 76 IFUNC GLOBAL DEFAULT 14 __strcasecmp_l
22582: 000000000008a970 60 IFUNC GLOBAL DEFAULT 14 strlen
22585: 00000000000a54d0 92 IFUNC WEAK DEFAULT 14 wmemset
22600: 000000000010b030 92 IFUNC GLOBAL DEFAULT 14 __wmemset_chk
22618: 000000000008b8a0 183 IFUNC GLOBAL DEFAULT 14 __mempcpy
22675: 000000000008ba70 76 IFUNC WEAK DEFAULT 14 strcasecmp
#
# perf probe -x /lib64/libc-2.29.so strlen
Warning: The probe function (strlen) is a GNU indirect function.
Consider identifying the final function used at run time and set the probe directly on that.
Added new event:
probe_libc:strlen (on strlen in /usr/lib64/libc-2.29.so)
You can now use it in all perf tools, such as:
perf record -e probe_libc:strlen -aR sleep 1
#
Reported-by: Andi Kleen <andi@firstfloor.org>
Signed-off-by: Masami Hiramatsu <mhiramat@kernel.org>
Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Oleg Nesterov <oleg@redhat.com>
Cc: Srikar Dronamraju <srikar@linux.vnet.ibm.com>
Link: http://lore.kernel.org/lkml/159438669349.62703.5978345670436126948.stgit@devnote2
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Fix the memory leakage in debuginfo__find_trace_events() when the probe
point is not found in the debuginfo. If there is no probe point found in
the debuginfo, debuginfo__find_probes() will NOT return -ENOENT, but 0.
Thus the caller of debuginfo__find_probes() must check the tf.ntevs and
release the allocated memory for the array of struct probe_trace_event.
The current code releases the memory only if the debuginfo__find_probes()
hits an error but not checks tf.ntevs. In the result, the memory allocated
on *tevs are not released if tf.ntevs == 0.
This fixes the memory leakage by checking tf.ntevs == 0 in addition to
ret < 0.
Fixes: ff74178350 ("perf probe: Introduce debuginfo to encapsulate dwarf information")
Signed-off-by: Masami Hiramatsu <mhiramat@kernel.org>
Reviewed-by: Srikar Dronamraju <srikar@linux.vnet.ibm.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Oleg Nesterov <oleg@redhat.com>
Cc: stable@vger.kernel.org
Link: http://lore.kernel.org/lkml/159438668346.62703.10887420400718492503.stgit@devnote2
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Fix a wrong "variable not found" warning when the probe point is not
found in the debuginfo.
Since the debuginfo__find_probes() can return 0 even if it does not find
given probe point in the debuginfo, fill_empty_trace_arg() can be called
with tf.ntevs == 0 and it can emit a wrong warning. To fix this, reject
ntevs == 0 in fill_empty_trace_arg().
E.g. without this patch;
# perf probe -x /lib64/libc-2.30.so -a "memcpy arg1=%di"
Failed to find the location of the '%di' variable at this address.
Perhaps it has been optimized out.
Use -V with the --range option to show '%di' location range.
Added new events:
probe_libc:memcpy (on memcpy in /usr/lib64/libc-2.30.so with arg1=%di)
probe_libc:memcpy (on memcpy in /usr/lib64/libc-2.30.so with arg1=%di)
You can now use it in all perf tools, such as:
perf record -e probe_libc:memcpy -aR sleep 1
With this;
# perf probe -x /lib64/libc-2.30.so -a "memcpy arg1=%di"
Added new events:
probe_libc:memcpy (on memcpy in /usr/lib64/libc-2.30.so with arg1=%di)
probe_libc:memcpy (on memcpy in /usr/lib64/libc-2.30.so with arg1=%di)
You can now use it in all perf tools, such as:
perf record -e probe_libc:memcpy -aR sleep 1
Fixes: cb40273085 ("perf probe: Trace a magic number if variable is not found")
Reported-by: Andi Kleen <andi@firstfloor.org>
Signed-off-by: Masami Hiramatsu <mhiramat@kernel.org>
Reviewed-by: Srikar Dronamraju <srikar@linux.vnet.ibm.com>
Tested-by: Andi Kleen <ak@linux.intel.com>
Cc: Oleg Nesterov <oleg@redhat.com>
Cc: stable@vger.kernel.org
Link: http://lore.kernel.org/lkml/159438667364.62703.2200642186798763202.stgit@devnote2
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
There is a case that several same-name symbols points to the same
address. In that case, 'perf probe' returns an error.
E.g.
# perf probe -x /lib64/libc-2.30.so -v -a "memcpy arg1=%di"
probe-definition(0): memcpy arg1=%di
symbol:memcpy file:(null) line:0 offset:0 return:0 lazy:(null)
parsing arg: arg1=%di into name:arg1 %di
1 arguments
symbol:setjmp file:(null) line:0 offset:0 return:0 lazy:(null)
symbol:longjmp file:(null) line:0 offset:0 return:0 lazy:(null)
symbol:longjmp_target file:(null) line:0 offset:0 return:0 lazy:(null)
symbol:lll_lock_wait_private file:(null) line:0 offset:0 return:0 lazy:(null)
symbol:memory_mallopt_arena_max file:(null) line:0 offset:0 return:0 lazy:(null)
symbol:memory_mallopt_arena_test file:(null) line:0 offset:0 return:0 lazy:(null)
symbol:memory_tunable_tcache_max_bytes file:(null) line:0 offset:0 return:0 lazy:(null)
symbol:memory_tunable_tcache_count file:(null) line:0 offset:0 return:0 lazy:(null)
symbol:memory_tunable_tcache_unsorted_limit file:(null) line:0 offset:0 return:0 lazy:(null)
symbol:memory_mallopt_trim_threshold file:(null) line:0 offset:0 return:0 lazy:(null)
symbol:memory_mallopt_top_pad file:(null) line:0 offset:0 return:0 lazy:(null)
symbol:memory_mallopt_mmap_threshold file:(null) line:0 offset:0 return:0 lazy:(null)
symbol:memory_mallopt_mmap_max file:(null) line:0 offset:0 return:0 lazy:(null)
symbol:memory_mallopt_perturb file:(null) line:0 offset:0 return:0 lazy:(null)
symbol:memory_mallopt_mxfast file:(null) line:0 offset:0 return:0 lazy:(null)
symbol:memory_heap_new file:(null) line:0 offset:0 return:0 lazy:(null)
symbol:memory_arena_reuse_free_list file:(null) line:0 offset:0 return:0 lazy:(null)
symbol:memory_arena_reuse file:(null) line:0 offset:0 return:0 lazy:(null)
symbol:memory_arena_reuse_wait file:(null) line:0 offset:0 return:0 lazy:(null)
symbol:memory_arena_new file:(null) line:0 offset:0 return:0 lazy:(null)
symbol:memory_arena_retry file:(null) line:0 offset:0 return:0 lazy:(null)
symbol:memory_sbrk_less file:(null) line:0 offset:0 return:0 lazy:(null)
symbol:memory_heap_free file:(null) line:0 offset:0 return:0 lazy:(null)
symbol:memory_heap_less file:(null) line:0 offset:0 return:0 lazy:(null)
symbol:memory_tcache_double_free file:(null) line:0 offset:0 return:0 lazy:(null)
symbol:memory_heap_more file:(null) line:0 offset:0 return:0 lazy:(null)
symbol:memory_sbrk_more file:(null) line:0 offset:0 return:0 lazy:(null)
symbol:memory_malloc_retry file:(null) line:0 offset:0 return:0 lazy:(null)
symbol:memory_memalign_retry file:(null) line:0 offset:0 return:0 lazy:(null)
symbol:memory_mallopt_free_dyn_thresholds file:(null) line:0 offset:0 return:0 lazy:(null)
symbol:memory_realloc_retry file:(null) line:0 offset:0 return:0 lazy:(null)
symbol:memory_calloc_retry file:(null) line:0 offset:0 return:0 lazy:(null)
symbol:memory_mallopt file:(null) line:0 offset:0 return:0 lazy:(null)
Open Debuginfo file: /usr/lib/debug/usr/lib64/libc-2.30.so.debug
Try to find probe point from debuginfo.
Opening /sys/kernel/debug/tracing//README write=0
Failed to find the location of the '%di' variable at this address.
Perhaps it has been optimized out.
Use -V with the --range option to show '%di' location range.
An error occurred in debuginfo analysis (-2).
Trying to use symbols.
Opening /sys/kernel/debug/tracing//uprobe_events write=1
Writing event: p:probe_libc/memcpy /usr/lib64/libc-2.30.so:0x914c0 arg1=%di
Writing event: p:probe_libc/memcpy /usr/lib64/libc-2.30.so:0x914c0 arg1=%di
Failed to write event: File exists
Error: Failed to add events. Reason: File exists (Code: -17)
You can see that perf tried to write completely the same probe
definition twice, which caused an error.
To fix this issue, check the symbol list and drop duplicated symbols
(which has the same symbol name and address) from it.
With this patch:
# perf probe -x /lib64/libc-2.30.so -a "memcpy arg1=%di"
Failed to find the location of the '%di' variable at this address.
Perhaps it has been optimized out.
Use -V with the --range option to show '%di' location range.
Added new events:
probe_libc:memcpy (on memcpy in /usr/lib64/libc-2.30.so with arg1=%di)
probe_libc:memcpy (on memcpy in /usr/lib64/libc-2.30.so with arg1=%di)
You can now use it in all perf tools, such as:
perf record -e probe_libc:memcpy -aR sleep 1
Committer notes:
Fix this build error on 32-bit arches by using PRIx64 for symbol->start,
that is an u64:
In file included from util/probe-event.c:27:
util/probe-event.c: In function 'find_probe_trace_events_from_map':
util/probe-event.c:2978:14: error: format '%lx' expects argument of type 'long unsigned int', but argument 5 has type 'u64' {aka 'long long unsigned int'} [-Werror=format=]
pr_debug("Found duplicated symbol %s @ %lx\n",
^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
util/debug.h:17:21: note: in definition of macro 'pr_fmt'
#define pr_fmt(fmt) fmt
^~~
util/probe-event.c:2978:5: note: in expansion of macro 'pr_debug'
pr_debug("Found duplicated symbol %s @ %lx\n",
^~~~~~~~
Reported-by: Andi Kleen <andi@firstfloor.org>
Signed-off-by: Masami Hiramatsu <mhiramat@kernel.org>
Reviewed-by: Srikar Dronamraju <srikar@linux.vnet.ibm.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Oleg Nesterov <oleg@redhat.com>
Link: http://lore.kernel.org/lkml/159438666401.62703.15196394835032087840.stgit@devnote2
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
'perf kmem' has an input file option but current an output file option
fails:
$ sudo perf kmem record -o /tmp/p.data sleep 1
Error: unknown switch `o'
Usage: perf kmem [<options>] {record|stat}
-f, --force don't complain, do it
-i, --input <file> input file name
-l, --line <num> show n lines
-s, --sort <key[,key2...]>
sort by keys: ptr, callsite, bytes, hit, pingpong, frag, page, order, mig>
-v, --verbose be more verbose (show symbol address, etc)
--alloc show per-allocation statistics
--caller show per-callsite statistics
--live Show live page stat
--page Analyze page allocator
--raw-ip show raw ip instead of symbol
--slab Analyze slab allocator
--time <str> Time span of interest (start,stop)
'perf sched' is similar in implementation and avoids the problem by
passing additional arguments to 'perf record'.
This change makes 'perf kmem' parse command line options consistently
with 'perf sched', although neither actually list that -o is a supported
option.
Signed-off-by: Ian Rogers <irogers@google.com>
Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Stephane Eranian <eranian@google.com>
Link: http://lore.kernel.org/lkml/20200708183919.4141023-1-irogers@google.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Setting the parse_events_error directly doesn't increment num_errors
causing the error message not to be displayed. Use the
parse_events__handle_error function that sets num_errors and handle
multiple errors.
Committer notes:
Ian provided a before/after upon request:
Before:
$ /tmp/perf/perf record -e /tmp/perf/util/parse-events.o
Run 'perf list' for a list of valid events
Usage: perf record [<options>] [<command>]
or: perf record [<options>] -- <command> [<options>]
-e, --event <event> event selector. use 'perf list' to list available event
After:
$ /tmp/perf/perf record -e /tmp/perf/util/parse-events.o
event syntax error: '/tmp/perf/util/parse-events.o'
\___ Failed to load /tmp/perf/util/parse-events.o: BPF object format invalid
(add -v to see detail)
Run 'perf list' for a list of valid events
Usage: perf record [<options>] [<command>]
or: perf record [<options>] -- <command> [<options>]
-e, --event <event> event selector. use 'perf list' to list available events
Signed-off-by: Ian Rogers <irogers@google.com>
Acked-by: Jiri Olsa <jolsa@redhat.com>
Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Alexei Starovoitov <ast@kernel.org>
Cc: Andrii Nakryiko <andriin@fb.com>
Cc: bpf@vger.kernel.org
Cc: Daniel Borkmann <daniel@iogearbox.net>
Cc: John Fastabend <john.fastabend@gmail.com>
Cc: KP Singh <kpsingh@chromium.org>
Cc: Leo Yan <leo.yan@linaro.org>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Martin KaFai Lau <kafai@fb.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: netdev@vger.kernel.org
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Song Liu <songliubraving@fb.com>
Cc: Stephane Eranian <eranian@google.com>
Cc: Yonghong Song <yhs@fb.com>
Link: http://lore.kernel.org/lkml/20200707211449.3868944-1-irogers@google.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
It is generally more useful to show the symbol with an address. In this
case, the print function requires the 'machine' which means changing
callers to provide it as a parameter. It is optional because most events
do not need it and the callers that matter can provide it.
Committer notes:
Made 'union perf_event' continue to be the first parameter to the
perf_event__fprintf() and perf_event__fprintf_text_poke() events.
Signed-off-by: Adrian Hunter <adrian.hunter@intel.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Borislav Petkov <bp@alien8.de>
Cc: H. Peter Anvin <hpa@zytor.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Leo Yan <leo.yan@linaro.org>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Masami Hiramatsu <mhiramat@kernel.org>
Cc: Mathieu Poirier <mathieu.poirier@linaro.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Steven Rostedt (VMware) <rostedt@goodmis.org>
Cc: x86@kernel.org
Link: http://lore.kernel.org/lkml/20200512121922.8997-16-adrian.hunter@intel.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Consistent with other new events, add an option to perf script to
display text poke events and ksymbol events. Both text poke events and
ksymbol events are displayed because some text pokes (e.g. ftrace
trampolines) have corresponding ksymbol events.
Signed-off-by: Adrian Hunter <adrian.hunter@intel.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Borislav Petkov <bp@alien8.de>
Cc: H. Peter Anvin <hpa@zytor.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Leo Yan <leo.yan@linaro.org>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Masami Hiramatsu <mhiramat@kernel.org>
Cc: Mathieu Poirier <mathieu.poirier@linaro.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Steven Rostedt (VMware) <rostedt@goodmis.org>
Cc: x86@kernel.org
Link: http://lore.kernel.org/lkml/20200512121922.8997-15-adrian.hunter@intel.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Select text poke events when available and the kernel is being traced.
Process text poke events to invalidate entries in Intel PT's instruction
cache.
Example:
The example requires kernel config:
CONFIG_PROC_SYSCTL=y
CONFIG_SCHED_DEBUG=y
CONFIG_SCHEDSTATS=y
Before:
# perf record -o perf.data.before --kcore -a -e intel_pt//k -m,64M &
# cat /proc/sys/kernel/sched_schedstats
0
# echo 1 > /proc/sys/kernel/sched_schedstats
# cat /proc/sys/kernel/sched_schedstats
1
# echo 0 > /proc/sys/kernel/sched_schedstats
# cat /proc/sys/kernel/sched_schedstats
0
# kill %1
[ perf record: Woken up 1 times to write data ]
[ perf record: Captured and wrote 3.341 MB perf.data.before ]
[1]+ Terminated perf record -o perf.data.before --kcore -a -e intel_pt//k -m,64M
# perf script -i perf.data.before --itrace=e >/dev/null
Warning:
474 instruction trace errors
After:
# perf record -o perf.data.after --kcore -a -e intel_pt//k -m,64M &
# cat /proc/sys/kernel/sched_schedstats
0
# echo 1 > /proc/sys/kernel/sched_schedstats
# cat /proc/sys/kernel/sched_schedstats
1
# echo 0 > /proc/sys/kernel/sched_schedstats
# cat /proc/sys/kernel/sched_schedstats
0
# kill %1
[ perf record: Woken up 1 times to write data ]
[ perf record: Captured and wrote 2.646 MB perf.data.after ]
[1]+ Terminated perf record -o perf.data.after --kcore -a -e intel_pt//k -m,64M
# perf script -i perf.data.after --itrace=e >/dev/null
Example:
The example requires kernel config:
# CONFIG_FUNCTION_TRACER is not set
Before:
# perf record --kcore -m,64M -o t1 -a -e intel_pt//k &
# perf probe __schedule
Added new event:
probe:__schedule (on __schedule)
You can now use it in all perf tools, such as:
perf record -e probe:__schedule -aR sleep 1
# perf record -e probe:__schedule -aR sleep 1
[ perf record: Woken up 1 times to write data ]
[ perf record: Captured and wrote 0.026 MB perf.data (68 samples) ]
# perf probe -d probe:__schedule
Removed event: probe:__schedule
# kill %1
[ perf record: Woken up 1 times to write data ]
[ perf record: Captured and wrote 41.268 MB t1 ]
[1]+ Terminated perf record --kcore -m,64M -o t1 -a -e intel_pt//k
# perf script -i t1 --itrace=e >/dev/null
Warning:
207 instruction trace errors
After:
# perf record --kcore -m,64M -o t1 -a -e intel_pt//k &
# perf probe __schedule
Added new event:
probe:__schedule (on __schedule)
You can now use it in all perf tools, such as:
perf record -e probe:__schedule -aR sleep 1
# perf record -e probe:__schedule -aR sleep 1
[ perf record: Woken up 1 times to write data ]
[ perf record: Captured and wrote 0.028 MB perf.data (107 samples) ]
# perf probe -d probe:__schedule
Removed event: probe:__schedule
# kill %1
[ perf record: Woken up 1 times to write data ]
[ perf record: Captured and wrote 39.978 MB t1 ]
[1]+ Terminated perf record --kcore -m,64M -o t1 -a -e intel_pt//k
# perf script -i t1 --itrace=e >/dev/null
# perf script -i t1 --no-itrace -D | grep 'POKE\|KSYMBOL'
6 565303693547 0x291f18 [0x50]: PERF_RECORD_KSYMBOL addr ffffffffc027a000 len 4096 type 2 flags 0x0 name kprobe_insn_page
6 565303697010 0x291f68 [0x40]: PERF_RECORD_TEXT_POKE addr 0xffffffffc027a000 old len 0 new len 6
6 565303838278 0x291fa8 [0x50]: PERF_RECORD_KSYMBOL addr ffffffffc027c000 len 4096 type 2 flags 0x0 name kprobe_optinsn_page
6 565303848286 0x291ff8 [0xa0]: PERF_RECORD_TEXT_POKE addr 0xffffffffc027c000 old len 0 new len 106
6 565369336743 0x292af8 [0x40]: PERF_RECORD_TEXT_POKE addr 0xffffffff88ab8890 old len 5 new len 5
7 566434327704 0x217c208 [0x40]: PERF_RECORD_TEXT_POKE addr 0xffffffff88ab8890 old len 5 new len 5
6 566456313475 0x293198 [0xa0]: PERF_RECORD_TEXT_POKE addr 0xffffffffc027c000 old len 106 new len 0
6 566456314935 0x293238 [0x40]: PERF_RECORD_TEXT_POKE addr 0xffffffffc027a000 old len 6 new len 0
Example:
The example requires kernel config:
CONFIG_FUNCTION_TRACER=y
Before:
# perf record --kcore -m,64M -o t1 -a -e intel_pt//k &
# perf probe __kmalloc
Added new event:
probe:__kmalloc (on __kmalloc)
You can now use it in all perf tools, such as:
perf record -e probe:__kmalloc -aR sleep 1
# perf record -e probe:__kmalloc -aR sleep 1
[ perf record: Woken up 1 times to write data ]
[ perf record: Captured and wrote 0.022 MB perf.data (6 samples) ]
# perf probe -d probe:__kmalloc
Removed event: probe:__kmalloc
# kill %1
[ perf record: Woken up 2 times to write data ]
[ perf record: Captured and wrote 43.850 MB t1 ]
[1]+ Terminated perf record --kcore -m,64M -o t1 -a -e intel_pt//k
# perf script -i t1 --itrace=e >/dev/null
Warning:
8 instruction trace errors
After:
# perf record --kcore -m,64M -o t1 -a -e intel_pt//k &
# perf probe __kmalloc
Added new event:
probe:__kmalloc (on __kmalloc)
You can now use it in all perf tools, such as:
perf record -e probe:__kmalloc -aR sleep 1
# perf record -e probe:__kmalloc -aR sleep 1
[ perf record: Woken up 1 times to write data ]
[ perf record: Captured and wrote 0.037 MB perf.data (206 samples) ]
# perf probe -d probe:__kmalloc
Removed event: probe:__kmalloc
# kill %1
[ perf record: Woken up 1 times to write data ]
[ perf record: Captured and wrote 41.442 MB t1 ]
[1]+ Terminated perf record --kcore -m,64M -o t1 -a -e intel_pt//k
# perf script -i t1 --itrace=e >/dev/null
# perf script -i t1 --no-itrace -D | grep 'POKE\|KSYMBOL'
5 312216133258 0x8bafe0 [0x50]: PERF_RECORD_KSYMBOL addr ffffffffc0360000 len 415 type 2 flags 0x0 name ftrace_trampoline
5 312216133494 0x8bb030 [0x1d8]: PERF_RECORD_TEXT_POKE addr 0xffffffffc0360000 old len 0 new len 415
5 312216229563 0x8bb208 [0x40]: PERF_RECORD_TEXT_POKE addr 0xffffffffac6016f5 old len 5 new len 5
5 312216239063 0x8bb248 [0x40]: PERF_RECORD_TEXT_POKE addr 0xffffffffac601803 old len 5 new len 5
5 312216727230 0x8bb288 [0x40]: PERF_RECORD_TEXT_POKE addr 0xffffffffabbea190 old len 5 new len 5
5 312216739322 0x8bb2c8 [0x40]: PERF_RECORD_TEXT_POKE addr 0xffffffffac6016f5 old len 5 new len 5
5 312216748321 0x8bb308 [0x40]: PERF_RECORD_TEXT_POKE addr 0xffffffffac601803 old len 5 new len 5
7 313287163462 0x2817430 [0x40]: PERF_RECORD_TEXT_POKE addr 0xffffffffac6016f5 old len 5 new len 5
7 313287174890 0x2817470 [0x40]: PERF_RECORD_TEXT_POKE addr 0xffffffffac601803 old len 5 new len 5
7 313287818979 0x28174b0 [0x40]: PERF_RECORD_TEXT_POKE addr 0xffffffffabbea190 old len 5 new len 5
7 313287829357 0x28174f0 [0x40]: PERF_RECORD_TEXT_POKE addr 0xffffffffac6016f5 old len 5 new len 5
7 313287841246 0x2817530 [0x40]: PERF_RECORD_TEXT_POKE addr 0xffffffffac601803 old len 5 new len 5
Signed-off-by: Adrian Hunter <adrian.hunter@intel.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Borislav Petkov <bp@alien8.de>
Cc: H. Peter Anvin <hpa@zytor.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Leo Yan <leo.yan@linaro.org>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Masami Hiramatsu <mhiramat@kernel.org>
Cc: Mathieu Poirier <mathieu.poirier@linaro.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Steven Rostedt (VMware) <rostedt@goodmis.org>
Cc: x86@kernel.org
Link: http://lore.kernel.org/lkml/20200512121922.8997-14-adrian.hunter@intel.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
PERF_RECORD_KSYMBOL_TYPE_OOL marks an executable page. Create a map
backed only by memory, which will be populated as necessary by text poke
events.
Committer notes:
From the patch:
OOL stands for "Out of line" code such as kprobe-replaced instructions
or optimized kprobes or ftrace trampolines.
Signed-off-by: Adrian Hunter <adrian.hunter@intel.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Borislav Petkov <bp@alien8.de>
Cc: H. Peter Anvin <hpa@zytor.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Leo Yan <leo.yan@linaro.org>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Masami Hiramatsu <mhiramat@kernel.org>
Cc: Mathieu Poirier <mathieu.poirier@linaro.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Steven Rostedt (VMware) <rostedt@goodmis.org>
Cc: x86@kernel.org
Link: http://lore.kernel.org/lkml/20200512121922.8997-13-adrian.hunter@intel.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Add processing for PERF_RECORD_TEXT_POKE events. When a text poke event
is processed, then the kernel dso data cache is updated with the poked
bytes.
Signed-off-by: Adrian Hunter <adrian.hunter@intel.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Borislav Petkov <bp@alien8.de>
Cc: H. Peter Anvin <hpa@zytor.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Leo Yan <leo.yan@linaro.org>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Masami Hiramatsu <mhiramat@kernel.org>
Cc: Mathieu Poirier <mathieu.poirier@linaro.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Steven Rostedt (VMware) <rostedt@goodmis.org>
Cc: x86@kernel.org
Link: http://lore.kernel.org/lkml/20200512121922.8997-12-adrian.hunter@intel.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Our local MSAN (Memory Sanitizer) build of perf throws a warning that
comes from the "dso__disassemble_filename" function in
"tools/perf/util/annotate.c" when running perf record.
The warning stems from the call to readlink, in which "build_id_path"
was being read into "linkname". Since readlink does not null terminate,
an uninitialized memory access would later occur when "linkname" is
passed into the strstr function. This is simply fixed by
null-terminating "linkname" after the call to readlink.
To reproduce this warning, build perf by running:
$ make -C tools/perf CLANG=1 CC=clang EXTRA_CFLAGS="-fsanitize=memory -fsanitize-memory-track-origins"
(Additionally, llvm might have to be installed and clang might have to
be specified as the compiler - export CC=/usr/bin/clang)
Then running:
tools/perf/perf record -o - ls / | tools/perf/perf --no-pager annotate -i - --stdio
Please see the cover letter for why false positive warnings may be
generated.
Signed-off-by: Numfor Mbiziwo-Tiapo <nums@google.com>
Acked-by: Ian Rogers <irogers@google.com>
Acked-by: Jiri Olsa <jolsa@kernel.org>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Ian Rogers <irogers@google.com>
Cc: Mark Drayton <mbd@fb.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Song Liu <songliubraving@fb.com>
Cc: Stephane Eranian <eranian@google.com>
Link: http://lore.kernel.org/lkml/20190729205750.193289-1-nums@google.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
**perf-<pid>.map and jit-<pid>.dump designs:
When a JIT generates code to be executed, it must allocate memory and
mark it executable using an mmap call.
*** perf-<pid>.map design
The perf-<pid>.map assumes that any sample recorded in an anonymous
memory page is JIT code. It then tries to resolve the symbol name by
looking at the process' perf-<pid>.map.
*** jit-<pid>.dump design
The jit-<pid>.dump mechanism takes a different approach. It requires a
JIT to write a `<path>/jit-<pid>.dump` file. This file must also be
mmapped so that perf inject -jit can find the file. The JIT must also
add JIT_CODE_LOAD records for any functions it generates. The records
are timestamped using a clock which can be correlated to the perf record
clock.
After perf record, the `perf inject -jit` pass parses the recording
looking for a `<path>/jit-<pid>.dump` file. When it finds the file, it
parses it and for each JIT_CODE_LOAD record:
* creates an elf file `<path>/jitted-<pid>-<code_index>.so
* injects a new mmap record mapping the new elf file into the process.
*** Coexistence design
The kernel and perf support both of these mechanisms. We need to make
sure perf works on an app supporting either or both of these mechanisms.
Both designs rely on mmap records to determine how to resolve an ip
address.
The mmap records of both techniques by definition overlap. When the JIT
compiles a method, it must:
* allocate memory (mmap)
* add execution privilege (mprotect or mmap. either will
generate an mmap event form the kernel to perf)
* compile code into memory
* add a function record to perf-<pid>.map and/or jit-<pid>.dump
Because the jit-<pid>.dump mechanism supports greater capabilities, perf
prefers the symbols from jit-<pid>.dump. It implements this based on
timestamp ordering of events. There is an implicit ASSUMPTION that the
JIT_CODE_LOAD record timestamp will be after the // anon mmap event that
was generated during memory allocation or adding the execution privilege setting.
*** Problems with the ASSUMPTION
The ASSUMPTION made in the Coexistence design section above is violated
in the following scenario.
*** Scenario
While a JIT is jitting code it will eventually need to commit more
pages and change these pages to executable permissions. Typically the
JIT will want these collocated to minimize branch displacements.
The kernel will coalesce these anonymous mapping with identical
permissions before sending an MMAP event for the new pages. The address
range of the new mmap will not be just the most recently mmap pages.
It will include the entire coalesced mmap region.
See mm/mmap.c
unsigned long mmap_region(struct file *file, unsigned long addr,
unsigned long len, vm_flags_t vm_flags, unsigned long pgoff,
struct list_head *uf)
{
...
/*
* Can we just expand an old mapping?
*/
...
perf_event_mmap(vma);
...
}
*** Symptoms
The coalesced // anon mmap event will be timestamped after the
JIT_CODE_LOAD records. This means it will be used as the most recent
mapping for that entire address range. For remaining events it will look
at the inferior perf-<pid>.map for symbols.
If both mechanisms are supported, the symbol will appear twice with
different module names. This causes weird behavior in reporting.
If only jit-<pid>.dump is supported, the symbol will no longer be resolved.
** Implemented solution
This patch solves the issue by removing // anon mmap events for any
process which has a valid jit-<pid>.dump file.
It tracks on a per process basis to handle the case where some running
apps support jit-<pid>.dump, but some only support perf-<pid>.map.
It adds new assumptions:
* // anon mmap events are only required for perf-<pid>.map support.
* An app that uses jit-<pid>.dump, no longer needs
perf-<pid>.map support. It assumes that any perf-<pid>.map info is
inferior.
*** Details
Use thread->priv to store whether a jitdump file has been processed
During "perf inject --jit", discard "//anon*" mmap events for any pid which
has sucessfully processed a jitdump file.
** Testing:
// jitdump case
perf record <app with jitdump>
perf inject --jit --input perf.data --output perfjit.data
// verify mmap "//anon" events present initially
perf script --input perf.data --show-mmap-events | grep '//anon'
// verify mmap "//anon" events removed
perf script --input perfjit.data --show-mmap-events | grep '//anon'
// no jitdump case
perf record <app without jitdump>
perf inject --jit --input perf.data --output perfjit.data
// verify mmap "//anon" events present initially
perf script --input perf.data --show-mmap-events | grep '//anon'
// verify mmap "//anon" events not removed
perf script --input perfjit.data --show-mmap-events | grep '//anon'
** Repro:
This issue was discovered while testing the initial CoreCLR jitdump
implementation. https://github.com/dotnet/coreclr/pull/26897.
** Alternate solutions considered
These were also briefly considered:
* Change kernel to not coalesce mmap regions.
* Change kernel reporting of coalesced mmap regions to perf. Only
include newly mapped memory.
* Only strip parts of // anon mmap events overlapping existing
jitted-<pid>-<code_index>.so mmap events.
Signed-off-by: Steve MacLean <Steve.MacLean@Microsoft.com>
Acked-by: Ian Rogers <irogers@google.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Stephane Eranian <eranian@google.com>
Link: http://lore.kernel.org/lkml/1590544271-125795-1-git-send-email-steve.maclean@linux.microsoft.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
To pick up fixes and move perf/core forward, minor conflict as
perf_evlist__add_dummy() lost its 'perf_' prefix as it operates on a
'struct evlist', not on a 'struct perf_evlist', i.e. its tools/perf/
specific, it is not in libperf.
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Add the s390 idle functions so they don't show up in top when using
software sampling.
Signed-off-by: Sven Schnelle <svens@linux.ibm.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Ian Rogers <irogers@google.com>
Cc: Igor Lubashev <ilubashe@akamai.com>
Cc: Jin Yao <yao.jin@linux.intel.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Kim Phillips <kim.phillips@amd.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Song Liu <songliubraving@fb.com>
Link: http://lore.kernel.org/lkml/20200707171457.85707-1-svens@linux.ibm.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Fixing the common case of:
perf record
perf report
And getting just the cycles events.
We now have a 'dummy' event to get perf metadata events that take place
while we synthesize metadata records for pre-existing processes by
traversing procfs, so we always have this extra 'dummy' evsel, but we
don't have to offer it as there will be no samples on it, remove this
distraction.
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Ian Rogers <irogers@google.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Link: https://lore.kernel.org/lkml/20200706115452.GA2772@redhat.com/
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
The condition to add XMM registers was missing, the regs array needed to
be in the outer scope, and the size of the regs array was too small.
Fixes: 143d34a6b3 ("perf intel-pt: Add XMM registers to synthesized PEBS sample")
Signed-off-by: Adrian Hunter <adrian.hunter@intel.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Luwei Kang <luwei.kang@intel.com>
Link: http://lore.kernel.org/lkml/20200630133935.11150-4-adrian.hunter@intel.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
After recording PEBS-via-PT, perf script will not accept 'iregs' field e.g.
# perf record -c 10000 -e '{intel_pt/branch=0/,branch-loads/aux-output/ppp}' -I -- ls -l
...
[ perf record: Woken up 1 times to write data ]
[ perf record: Captured and wrote 0.062 MB perf.data ]
# ./perf script --itrace=eop -F+iregs
Samples for 'dummy:u' event do not have IREGS attribute set. Cannot print 'iregs' field.
Fix by using allow_user_set, which is true when recording AUX area data.
Fixes: 9e64cefe43 ("perf intel-pt: Process options for PEBS event synthesis")
Signed-off-by: Adrian Hunter <adrian.hunter@intel.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Luwei Kang <luwei.kang@intel.com>
Link: http://lore.kernel.org/lkml/20200630133935.11150-3-adrian.hunter@intel.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
When recording PEBS-via-PT, the kernel will not accept the intel_pt
event with register sampling e.g.
# perf record --kcore -c 10000 -e '{intel_pt/branch=0/,branch-loads/aux-output/ppp}' -I -- ls -l
Error:
intel_pt/branch=0/: PMU Hardware doesn't support sampling/overflow-interrupts. Try 'perf stat'
Fix by suppressing register sampling on the intel_pt evsel.
Committer notes:
Adrian informed that this is only available from Tremont onwards, so on
older processors the error continues the same as before.
Fixes: 9e64cefe43 ("perf intel-pt: Process options for PEBS event synthesis")
Signed-off-by: Adrian Hunter <adrian.hunter@intel.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Luwei Kang <luwei.kang@intel.com>
Link: http://lore.kernel.org/lkml/20200630133935.11150-2-adrian.hunter@intel.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
The segmentation fault can be reproduced as following steps:
1) Executing perf report in tui.
2) Typing '/xxxxx' to filter the symbol to get nothing matched.
3) Pressing enter with no entry selected.
Then it will report a segmentation fault.
It is caused by the lack of check of browser->he_selection when
accessing it's member res_samples in perf_evsel__hists_browse().
These processes are meaningful for specified samples, so we can skip
these when nothing is selected.
Fixes: 4968ac8fb7 ("perf report: Implement browsing of individual samples")
Signed-off-by: Wei Li <liwei391@huawei.com>
Acked-by: Jiri Olsa <jolsa@redhat.com>
Acked-by: Namhyung Kim <namhyung@kernel.org>
Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Hanjun Guo <guohanjun@huawei.com>
Cc: Jin Yao <yao.jin@linux.intel.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Link: http://lore.kernel.org/lkml/20200612094322.39565-1-liwei391@huawei.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Using Python version 3.8.2 and PySide2 version 5.14.0, time chart call tree
would not expand the tree to the result. Fix by using setExpanded().
Example:
$ perf record -e intel_pt//u uname
Linux
[ perf record: Woken up 1 times to write data ]
[ perf record: Captured and wrote 0.034 MB perf.data ]
$ perf script --itrace=bep -s ~/libexec/perf-core/scripts/python/export-to-sqlite.py perf.data.db branches calls
2020-06-26 15:32:14.928997 Creating database ...
2020-06-26 15:32:14.933971 Writing records...
2020-06-26 15:32:15.535251 Adding indexes
2020-06-26 15:32:15.542993 Dropping unused tables
2020-06-26 15:32:15.549716 Done
$ python3 ~/libexec/perf-core/scripts/python/exported-sql-viewer.py perf.data.db
Select: Charts -> Time chart by CPU
Move mouse over middle of chart
Right-click and select Show Call Tree
Before: displays Call Tree but not expanded to selected time
After: displays Call Tree expanded to selected time
Fixes: e69d5df75d ("perf scripts python: exported-sql-viewer.py: Add ability for Call tree to open at a specified task and time")
Signed-off-by: Adrian Hunter <adrian.hunter@intel.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: stable@vger.kernel.org
Link: http://lore.kernel.org/lkml/20200629091955.17090-7-adrian.hunter@intel.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Using ctrl-F ('Find') would not find 'unknown' because it matches id
zero. Fix by excluding id zero from selection.
Example:
$ perf record -e intel_pt//u uname
Linux
[ perf record: Woken up 1 times to write data ]
[ perf record: Captured and wrote 0.034 MB perf.data ]
$ perf script --itrace=bep -s ~/libexec/perf-core/scripts/python/export-to-sqlite.py perf.data.db branches calls
2020-06-26 15:32:14.928997 Creating database ...
2020-06-26 15:32:14.933971 Writing records...
2020-06-26 15:32:15.535251 Adding indexes
2020-06-26 15:32:15.542993 Dropping unused tables
2020-06-26 15:32:15.549716 Done
$ python3 ~/libexec/perf-core/scripts/python/exported-sql-viewer.py perf.data.db
Select: Reports -> Call Tree
Press: Ctrl-F
Enter: unknown
Press: Enter
Before: displays 'unknown' not found
After: tree is expanded to line showing 'unknown'
Fixes: ae8b887c00 ("perf scripts python: exported-sql-viewer.py: Add call tree")
Signed-off-by: Adrian Hunter <adrian.hunter@intel.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: stable@vger.kernel.org
Link: http://lore.kernel.org/lkml/20200629091955.17090-6-adrian.hunter@intel.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Using ctrl-F ('Find') would not find 'unknown' because it matches id zero.
Fix by excluding id zero from selection.
Example:
$ perf record -e intel_pt//u uname
Linux
[ perf record: Woken up 1 times to write data ]
[ perf record: Captured and wrote 0.034 MB perf.data ]
$ perf script --itrace=bep -s ~/libexec/perf-core/scripts/python/export-to-sqlite.py perf.data.db branches calls
2020-06-26 15:32:14.928997 Creating database ...
2020-06-26 15:32:14.933971 Writing records...
2020-06-26 15:32:15.535251 Adding indexes
2020-06-26 15:32:15.542993 Dropping unused tables
2020-06-26 15:32:15.549716 Done
$ python3 ~/libexec/perf-core/scripts/python/exported-sql-viewer.py perf.data.db
Select: Reports -> Context-Sensitive Call Graph
Press: Ctrl-F
Enter: unknown
Press: Enter
Before: gets stuck
After: tree is expanded to line showing 'unknown'
Fixes: 254c0d820b ("perf scripts python: exported-sql-viewer.py: Factor out CallGraphModelBase")
Signed-off-by: Adrian Hunter <adrian.hunter@intel.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: stable@vger.kernel.org
Link: http://lore.kernel.org/lkml/20200629091955.17090-5-adrian.hunter@intel.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Using Python version 3.8.2 and PySide2 version 5.14.0, ctrl-F ('Find')
would not expand the tree to the result. Fix by using setExpanded().
Example:
$ perf record -e intel_pt//u uname
Linux
[ perf record: Woken up 1 times to write data ]
[ perf record: Captured and wrote 0.034 MB perf.data ]
$ perf script --itrace=bep -s ~/libexec/perf-core/scripts/python/export-to-sqlite.py perf.data.db branches calls
2020-06-26 15:32:14.928997 Creating database ...
2020-06-26 15:32:14.933971 Writing records...
2020-06-26 15:32:15.535251 Adding indexes
2020-06-26 15:32:15.542993 Dropping unused tables
2020-06-26 15:32:15.549716 Done
$ python3 ~/libexec/perf-core/scripts/python/exported-sql-viewer.py perf.data.db
Select: Reports -> Context-Sensitive Call Graph or Reports -> Call Tree
Press: Ctrl-F
Enter: main
Press: Enter
Before: line showing 'main' does not display
After: tree is expanded to line showing 'main'
Fixes: ebd70c7dc2 ("perf scripts python: exported-sql-viewer.py: Add ability to find symbols in the call-graph")
Signed-off-by: Adrian Hunter <adrian.hunter@intel.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: stable@vger.kernel.org
Link: http://lore.kernel.org/lkml/20200629091955.17090-4-adrian.hunter@intel.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Commit 0a892c1c94 ("perf record: Add dummy event during system wide
synthesis") reveals an issue with Intel PT system wide tracing.
Specifically that Intel PT already adds a dummy tracking event, and it
is not the first event. Adding another dummy tracking event causes
duplicated sideband events. Fix by checking for an existing dummy
tracking event first.
Example showing duplicated switch events:
Before:
# perf record -a -e intel_pt//u uname
Linux
[ perf record: Woken up 1 times to write data ]
[ perf record: Captured and wrote 0.895 MB perf.data ]
# perf script --no-itrace --show-switch-events | head
swapper 0 [007] 6390.516222: PERF_RECORD_SWITCH_CPU_WIDE OUT preempt next pid/tid: 11/11
swapper 0 [007] 6390.516222: PERF_RECORD_SWITCH_CPU_WIDE OUT preempt next pid/tid: 11/11
rcu_sched 11 [007] 6390.516223: PERF_RECORD_SWITCH_CPU_WIDE IN prev pid/tid: 0/0
rcu_sched 11 [007] 6390.516224: PERF_RECORD_SWITCH_CPU_WIDE IN prev pid/tid: 0/0
rcu_sched 11 [007] 6390.516227: PERF_RECORD_SWITCH_CPU_WIDE OUT next pid/tid: 0/0
rcu_sched 11 [007] 6390.516227: PERF_RECORD_SWITCH_CPU_WIDE OUT next pid/tid: 0/0
swapper 0 [007] 6390.516228: PERF_RECORD_SWITCH_CPU_WIDE IN prev pid/tid: 11/11
swapper 0 [007] 6390.516228: PERF_RECORD_SWITCH_CPU_WIDE IN prev pid/tid: 11/11
swapper 0 [002] 6390.516415: PERF_RECORD_SWITCH_CPU_WIDE OUT preempt next pid/tid: 5556/5559
swapper 0 [002] 6390.516416: PERF_RECORD_SWITCH_CPU_WIDE OUT preempt next pid/tid: 5556/5559
After:
# perf record -a -e intel_pt//u uname
Linux
[ perf record: Woken up 1 times to write data ]
[ perf record: Captured and wrote 0.868 MB perf.data ]
# perf script --no-itrace --show-switch-events | head
swapper 0 [005] 6450.567013: PERF_RECORD_SWITCH_CPU_WIDE OUT preempt next pid/tid: 7179/7181
perf 7181 [005] 6450.567014: PERF_RECORD_SWITCH_CPU_WIDE IN prev pid/tid: 0/0
perf 7181 [005] 6450.567028: PERF_RECORD_SWITCH_CPU_WIDE OUT next pid/tid: 0/0
swapper 0 [005] 6450.567029: PERF_RECORD_SWITCH_CPU_WIDE IN prev pid/tid: 7179/7181
swapper 0 [005] 6450.571699: PERF_RECORD_SWITCH_CPU_WIDE OUT preempt next pid/tid: 11/11
rcu_sched 11 [005] 6450.571700: PERF_RECORD_SWITCH_CPU_WIDE IN prev pid/tid: 0/0
rcu_sched 11 [005] 6450.571702: PERF_RECORD_SWITCH_CPU_WIDE OUT next pid/tid: 0/0
swapper 0 [005] 6450.571703: PERF_RECORD_SWITCH_CPU_WIDE IN prev pid/tid: 11/11
swapper 0 [005] 6450.579703: PERF_RECORD_SWITCH_CPU_WIDE OUT preempt next pid/tid: 11/11
rcu_sched 11 [005] 6450.579704: PERF_RECORD_SWITCH_CPU_WIDE IN prev pid/tid: 0/0
Signed-off-by: Adrian Hunter <adrian.hunter@intel.com>
Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Link: http://lore.kernel.org/lkml/20200629091955.17090-3-adrian.hunter@intel.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Rather than disable all warnings with -w, disable specific warnings.
Predicate enabling the warnings on a recent version of bison.
Tested with GCC 9.3.0 and clang 9.0.1.
Committer testing:
The full set of compilers, gcc and clang that this will be tested on
will be on the signed tag when this change goes upstream.
Had to add -Wno-switch-enum to build on opensuse tumbleweed:
/tmp/build/perf/util/parse-events-bison.c: In function 'yydestruct':
/tmp/build/perf/util/parse-events-bison.c:1200:3: error: enumeration value 'YYSYMBOL_YYEMPTY' not handled in switch [-Werror=switch-enum]
1200 | switch (yykind)
| ^~~~~~
/tmp/build/perf/util/parse-events-bison.c:1200:3: error: enumeration value 'YYSYMBOL_YYEOF' not handled in switch [-Werror=switch-enum]
Also replace -Wno-error=implicit-function-declaration with -Wno-implicit-function-declaration.
Also needed to check just the first two levels of the bison version, as
the patch was assuming that all versions were of the form x.y.z, and
there are several cases where it is just x.y, breaking the build.
Signed-off-by: Ian Rogers <irogers@google.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Jin Yao <yao.jin@linux.intel.com>
Cc: John Garry <john.garry@huawei.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Stephane Eranian <eranian@google.com>
Link: http://lore.kernel.org/lkml/20200619043356.90024-11-irogers@google.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Rather than disable all warnings with -w, disable specific warnings.
Predicate enabling the warnings on more recent flex versions.
Tested with GCC 9.3.0 and clang 9.0.1.
Committer notes:
The full set of compilers, gcc and clang that this will be tested on
will be on the signed tag when this change goes upstream.
Added -Wno-misleading-indentation to the flex_flags to overcome this on
opensuse tumbleweed when building with clang:
CC /tmp/build/perf/util/parse-events-flex.o
CC /tmp/build/perf/util/pmu.o
/tmp/build/perf/util/parse-events-flex.c:5038:13: error: misleading indentation; statement is not part of the previous 'if' [-Werror,-Wmisleading-indentation]
if ( ! yyg->yy_state_buf )
^
/tmp/build/perf/util/parse-events-flex.c:5036:9: note: previous statement is here
if ( ! yyg->yy_state_buf )
^
And we need to use this to redirect stderr to stdin and then grep in a
way that is acceptable for BusyBox shell:
2>&1 |
Previously I was using:
|&
Which seems to be bash specific.
Added -Wno-sign-compare to overcome this on systems such as centos:7:
CC /tmp/build/perf/util/parse-events-flex.o
CC /tmp/build/perf/util/pmu.o
CC /tmp/build/perf/util/pmu-flex.o
util/parse-events.l: In function 'parse_events_lex':
/tmp/build/perf/util/parse-events-flex.c:193:36: error: comparison between signed and unsigned integer expressions [-Werror=sign-compare]
for ( yyl = n; yyl < yyleng; ++yyl )\
^
/tmp/build/perf/util/parse-events-flex.c:204:9: note: in expansion of macro 'YY_LESS_LINENO'
Added -Wno-unused-parameter to overcome this in systems such as
centos:7:
CC /tmp/build/perf/util/parse-events-flex.o
CC /tmp/build/perf/util/pmu.o
/tmp/build/perf/util/parse-events-flex.c: In function 'yy_fatal_error':
/tmp/build/perf/util/parse-events-flex.c:6265:58: error: unused parameter 'yyscanner' [-Werror=unused-parameter]
static void yy_fatal_error (yyconst char* msg , yyscan_t yyscanner)
^
Added -Wno-missing-declarations to build in systems such as centos:6:
/tmp/build/perf/util/parse-events-flex.c:6313: error: no previous prototype for 'parse_events_get_column'
/tmp/build/perf/util/parse-events-flex.c:6389: error: no previous prototype for 'parse_events_set_column'
And -Wno-missing-prototypes to cover older compilers:
-Wmissing-prototypes (C only)
Warn if a global function is defined without a previous prototype declaration. This warning is issued even if the definition itself provides a prototype. The aim is to detect global functions that fail to be declared in header files.
-Wmissing-declarations (C only)
Warn if a global function is defined without a previous declaration. Do so even if the definition itself provides a prototype. Use this option to detect global functions that are not declared in header files.
Older C compilers lack -Wno-misleading-indentation, check if it is
available before using it.
Also needed to check just the first two levels of the flex version, as
the patch was assuming that all versions were of the form x.y.z, and
there are several cases where it is just x.y, breaking the build.
Signed-off-by: Ian Rogers <irogers@google.com>
Acked-by: Jiri Olsa <jolsa@redhat.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Jin Yao <yao.jin@linux.intel.com>
Cc: John Garry <john.garry@huawei.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Stephane Eranian <eranian@google.com>
Link: http://lore.kernel.org/lkml/20200619043356.90024-8-irogers@google.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Declare bison header file output so that C files can depend upon them.
As there are multiple output targets $@ is replaced by the target name.
Signed-off-by: Ian Rogers <irogers@google.com>
Acked-by: Jiri Olsa <jolsa@redhat.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Jin Yao <yao.jin@linux.intel.com>
Cc: John Garry <john.garry@huawei.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Stephane Eranian <eranian@google.com>
Link: http://lore.kernel.org/lkml/20200619043356.90024-7-irogers@google.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
These will break the build as soon as we stop disabling all warnings
when building flex and bison generated files, so add them before we do
that to keep the tree bisectable.
Noticed when building on centos:7 with NO_LIBBPF=1:
util/expr.c: In function 'key_equal':
util/expr.c:29:2: error: implicit declaration of function 'strcmp' [-Werror=implicit-function-declaration]
return !strcmp((const char *)key1, (const char *)key2);
^
util/expr.c: In function 'expr__add_id':
util/expr.c:40:3: error: implicit declaration of function 'malloc' [-Werror=implicit-function-declaration]
val_ptr = malloc(sizeof(double));
^
util/expr.c:40:13: error: incompatible implicit declaration of built-in function 'malloc' [-Werror]
val_ptr = malloc(sizeof(double));
^
util/expr.c:42:12: error: 'ENOMEM' undeclared (first use in this function)
return -ENOMEM;
^
Signed-off-by: Ian Rogers <irogers@google.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Jin Yao <yao.jin@linux.intel.com>
Cc: John Garry <john.garry@huawei.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Stephane Eranian <eranian@google.com>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Declare flex header file output so that bison C files can depend upon
them. As there are multiple output targets $@ is replaced by the target
name.
Signed-off-by: Ian Rogers <irogers@google.com>
Acked-by: Jiri Olsa <jolsa@redhat.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Jin Yao <yao.jin@linux.intel.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: John Garry <john.garry@huawei.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Stephane Eranian <eranian@google.com>
Link: http://lore.kernel.org/lkml/20200619043356.90024-6-irogers@google.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Allow pmu parser's flex to be debugged as the parse-events and expr
currently are. Enabling this requires the C code to call
perf_pmu__flex_debug.
Signed-off-by: Ian Rogers <irogers@google.com>
Acked-by: Jiri Olsa <jolsa@redhat.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Jin Yao <yao.jin@linux.intel.com>
Cc: John Garry <john.garry@huawei.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Stephane Eranian <eranian@google.com>
Link: http://lore.kernel.org/lkml/20200619043356.90024-5-irogers@google.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Allow pmu parser to be debugged as the parse-events and expr currently
are. Enabling this requires the C code to set perf_pmu_debug.
Signed-off-by: Ian Rogers <irogers@google.com>
Acked-by: Jiri Olsa <jolsa@redhat.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Jin Yao <yao.jin@linux.intel.com>
Cc: John Garry <john.garry@huawei.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Stephane Eranian <eranian@google.com>
Link: http://lore.kernel.org/lkml/20200619043356.90024-4-irogers@google.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
This reduces the command line size slightly.
Signed-off-by: Ian Rogers <irogers@google.com>
Acked-by: Jiri Olsa <jolsa@redhat.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Jin Yao <yao.jin@linux.intel.com>
Cc: John Garry <john.garry@huawei.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Stephane Eranian <eranian@google.com>
Link: http://lore.kernel.org/lkml/20200619043356.90024-3-irogers@google.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
This reduces the command line size slightly.
Signed-off-by: Ian Rogers <irogers@google.com>
Acked-by: Jiri Olsa <jolsa@redhat.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Jin Yao <yao.jin@linux.intel.com>
Cc: John Garry <john.garry@huawei.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Stephane Eranian <eranian@google.com>
Link: http://lore.kernel.org/lkml/20200619043356.90024-2-irogers@google.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
To differentiate from libperf's 'struct perf_evlist' methods.
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
To differentiate from libperf's 'struct perf_evlist' methods.
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
To differentiate from libperf's 'struct perf_evlist' methods.
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
To differentiate from libperf's 'struct perf_evlist' methods.
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
To differentiate from libperf's 'struct perf_evlist' methods.
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
For perf list, the CPU core PMU HW event ordering is such that not all
events may will be listed adjacent - consider this example:
$ tools/perf/perf list
List of pre-defined events (to be used in -e):
duration_time [Tool event]
branch-instructions OR cpu/branch-instructions/ [Kernel PMU event]
branch-misses OR cpu/branch-misses/ [Kernel PMU event]
bus-cycles OR cpu/bus-cycles/ [Kernel PMU event]
cache-misses OR cpu/cache-misses/ [Kernel PMU event]
cache-references OR cpu/cache-references/ [Kernel PMU event]
cpu-cycles OR cpu/cpu-cycles/ [Kernel PMU event]
cstate_core/c3-residency/ [Kernel PMU event]
cstate_core/c6-residency/ [Kernel PMU event]
cstate_core/c7-residency/ [Kernel PMU event]
cstate_pkg/c2-residency/ [Kernel PMU event]
cstate_pkg/c3-residency/ [Kernel PMU event]
cstate_pkg/c6-residency/ [Kernel PMU event]
cstate_pkg/c7-residency/ [Kernel PMU event]
cycles-ct OR cpu/cycles-ct/ [Kernel PMU event]
cycles-t OR cpu/cycles-t/ [Kernel PMU event]
el-abort OR cpu/el-abort/ [Kernel PMU event]
el-capacity OR cpu/el-capacity/ [Kernel PMU event]
Notice in the above example how the cstate_core PMU events are mixed in
the middle of the CPU core events.
For my arm64 platform, all the uncore events get mixed in, making the list
very disorganised:
page-faults OR faults [Software event]
task-clock [Software event]
duration_time [Tool event]
L1-dcache-load-misses [Hardware cache event]
L1-dcache-loads [Hardware cache event]
L1-icache-load-misses [Hardware cache event]
L1-icache-loads [Hardware cache event]
branch-load-misses [Hardware cache event]
branch-loads [Hardware cache event]
dTLB-load-misses [Hardware cache event]
dTLB-loads [Hardware cache event]
iTLB-load-misses [Hardware cache event]
iTLB-loads [Hardware cache event]
br_mis_pred OR armv8_pmuv3_0/br_mis_pred/ [Kernel PMU event]
br_mis_pred_retired OR armv8_pmuv3_0/br_mis_pred_retired/ [Kernel PMU event]
br_pred OR armv8_pmuv3_0/br_pred/ [Kernel PMU event]
br_retired OR armv8_pmuv3_0/br_retired/ [Kernel PMU event]
br_return_retired OR armv8_pmuv3_0/br_return_retired/ [Kernel PMU event]
bus_access OR armv8_pmuv3_0/bus_access/ [Kernel PMU event]
bus_cycles OR armv8_pmuv3_0/bus_cycles/ [Kernel PMU event]
cid_write_retired OR armv8_pmuv3_0/cid_write_retired/ [Kernel PMU event]
cpu_cycles OR armv8_pmuv3_0/cpu_cycles/ [Kernel PMU event]
dtlb_walk OR armv8_pmuv3_0/dtlb_walk/ [Kernel PMU event]
exc_return OR armv8_pmuv3_0/exc_return/ [Kernel PMU event]
exc_taken OR armv8_pmuv3_0/exc_taken/ [Kernel PMU event]
hisi_sccl1_ddrc0/act_cmd/ [Kernel PMU event]
hisi_sccl1_ddrc0/flux_rcmd/ [Kernel PMU event]
hisi_sccl1_ddrc0/flux_rd/ [Kernel PMU event]
hisi_sccl1_ddrc0/flux_wcmd/ [Kernel PMU event]
hisi_sccl1_ddrc0/flux_wr/ [Kernel PMU event]
hisi_sccl1_ddrc0/pre_cmd/ [Kernel PMU event]
hisi_sccl1_ddrc0/rnk_chg/ [Kernel PMU event]
...
hisi_sccl7_l3c21/wr_hit_cpipe/ [Kernel PMU event]
hisi_sccl7_l3c21/wr_hit_spipe/ [Kernel PMU event]
hisi_sccl7_l3c21/wr_spipe/ [Kernel PMU event]
inst_retired OR armv8_pmuv3_0/inst_retired/ [Kernel PMU event]
inst_spec OR armv8_pmuv3_0/inst_spec/ [Kernel PMU event]
itlb_walk OR armv8_pmuv3_0/itlb_walk/ [Kernel PMU event]
l1d_cache OR armv8_pmuv3_0/l1d_cache/ [Kernel PMU event]
l1d_cache_refill OR armv8_pmuv3_0/l1d_cache_refill/ [Kernel PMU event]
l1d_cache_wb OR armv8_pmuv3_0/l1d_cache_wb/ [Kernel PMU event]
l1d_tlb OR armv8_pmuv3_0/l1d_tlb/ [Kernel PMU event]
l1d_tlb_refill OR armv8_pmuv3_0/l1d_tlb_refill/ [Kernel PMU event]
So the events are list alphabetically. However, CPU core event listing is
special from commit dc098b35b5 ("perf list: List kernel supplied event
aliases"), in that the alias and full event is shown (in that order).
As such, the core events may become sparse.
Improve this by grouping the CPU core events and ensure that they are
listed first for kernel PMU events. For the first example, above, this
now looks like:
duration_time [Tool event]
branch-instructions OR cpu/branch-instructions/ [Kernel PMU event]
branch-misses OR cpu/branch-misses/ [Kernel PMU event]
bus-cycles OR cpu/bus-cycles/ [Kernel PMU event]
cache-misses OR cpu/cache-misses/ [Kernel PMU event]
cache-references OR cpu/cache-references/ [Kernel PMU event]
cpu-cycles OR cpu/cpu-cycles/ [Kernel PMU event]
cycles-ct OR cpu/cycles-ct/ [Kernel PMU event]
cycles-t OR cpu/cycles-t/ [Kernel PMU event]
el-abort OR cpu/el-abort/ [Kernel PMU event]
el-capacity OR cpu/el-capacity/ [Kernel PMU event]
el-commit OR cpu/el-commit/ [Kernel PMU event]
el-conflict OR cpu/el-conflict/ [Kernel PMU event]
el-start OR cpu/el-start/ [Kernel PMU event]
instructions OR cpu/instructions/ [Kernel PMU event]
mem-loads OR cpu/mem-loads/ [Kernel PMU event]
mem-stores OR cpu/mem-stores/ [Kernel PMU event]
ref-cycles OR cpu/ref-cycles/ [Kernel PMU event]
topdown-fetch-bubbles OR cpu/topdown-fetch-bubbles/ [Kernel PMU event]
topdown-recovery-bubbles OR cpu/topdown-recovery-bubbles/ [Kernel PMU event]
topdown-slots-issued OR cpu/topdown-slots-issued/ [Kernel PMU event]
topdown-slots-retired OR cpu/topdown-slots-retired/ [Kernel PMU event]
topdown-total-slots OR cpu/topdown-total-slots/ [Kernel PMU event]
tx-abort OR cpu/tx-abort/ [Kernel PMU event]
tx-capacity OR cpu/tx-capacity/ [Kernel PMU event]
tx-commit OR cpu/tx-commit/ [Kernel PMU event]
tx-conflict OR cpu/tx-conflict/ [Kernel PMU event]
tx-start OR cpu/tx-start/ [Kernel PMU event]
cstate_core/c3-residency/ [Kernel PMU event]
cstate_core/c6-residency/ [Kernel PMU event]
cstate_core/c7-residency/ [Kernel PMU event]
cstate_pkg/c2-residency/ [Kernel PMU event]
cstate_pkg/c3-residency/ [Kernel PMU event]
cstate_pkg/c6-residency/ [Kernel PMU event]
cstate_pkg/c7-residency/ [Kernel PMU event]
Signed-off-by: John Garry <john.garry@huawei.com>
Acked-by: Jiri Olsa <jolsa@redhat.com>
Acked-by: Namhyung Kim <namhyung@kernel.org>
Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Ian Rogers <irogers@google.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Will Deacon <will@kernel.org>
Cc: linux-arm-kernel@lists.infradead.org
Cc: linuxarm@huawei.com
Link: http://lore.kernel.org/lkml/1592384514-119954-3-git-send-email-john.garry@huawei.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
In commit dc098b35b5 ("perf list: List kernel supplied event aliases"),
the aliases for events are supplied in addition to CPU event in perf list.
This relies on the name of the core PMU being "cpu", which is not the case
for arm64, so arm64 has always missed this. Use generic is_pmu_core()
helper which takes account of arm64 to make this feature work for arm64
(and possibly other archs).
Sample, before:
armv8_pmuv3_0/br_mis_pred/ [Kernel PMU event]
after:
br_mis_pred OR armv8_pmuv3_0/br_mis_pred/ [Kernel PMU event]
Signed-off-by: John Garry <john.garry@huawei.com>
Acked-by: Jiri Olsa <jolsa@redhat.com>
Acked-by: Namhyung Kim <namhyung@kernel.org>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Ian Rogers <irogers@google.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Will Deacon <will@kernel.org>
Cc: linux-arm-kernel@lists.infradead.org
Cc: linuxarm@huawei.com
Link: http://lore.kernel.org/lkml/1592384514-119954-2-git-send-email-john.garry@huawei.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Adjust the handling of the session sink selection to allow no sink to be
selected on the command line. This then forwards the sink selection to
the CoreSight infrastructure which will attempt to select a sink based
on the default sink select priorities.
Signed-off-by: Mike Leach <mike.leach@linaro.org>
Tested-by: Leo Yan <leo.yan@linaro.org>
Cc: Mathieu Poirier <mathieu.poirier@linaro.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Suzuki Poulouse <suzuki.poulose@arm.com>
Cc: coresight@lists.linaro.org
Cc: linux-arm-kernel@lists.infradead.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
These are broadly useful but required to handle TMA metrics. For example
encoding Ports_Utilization from:
https://download.01.org/perfmon/TMA_Metrics.csv
requires '<'.
{
"BriefDescription": "This metric estimates fraction of cycles the CPU performance was potentially limited due to Core computation issues (non divider-related). Two distinct categories can be attributed into this metric: (1) heavy data-dependency among contiguous instructions would manifest in this metric - such cases are often referred to as low Instruction Level Parallelism (ILP). (2) Contention on some hardware execution unit other than Divider. For example; when there are too many multiply operations.",
"MetricExpr": "( ( cpu@EXE_ACTIVITY.EXE_BOUND_0_PORTS@ + cpu@EXE_ACTIVITY.1_PORTS_UTIL@ + ( cpu@EXE_ACTIVITY.2_PORTS_UTIL@ * ( ( ( cpu@UOPS_RETIRED.RETIRE_SLOTS@ ) / ( cpu@CPU_CLK_UNHALTED.THREAD@ ) ) / ( ( 4.000000 ) + 1.000000 ) ) ) ) / ( cpu@CPU_CLK_UNHALTED.THREAD@ ) if ( cpu@ARITH.DIVIDER_ACTIVE\\,cmask\\=1@ < cpu@EXE_ACTIVITY.EXE_BOUND_0_PORTS@ ) else ( ( cpu@EXE_ACTIVITY.EXE_BOUND_0_PORTS@ + cpu@EXE_ACTIVITY.1_PORTS_UTIL@ + ( cpu@EXE_ACTIVITY.2_PORTS_UTIL@ * ( ( ( cpu@UOPS_RETIRED.RETIRE_SLOTS@ ) / ( cpu@CPU_CLK_UNHALTED.THREAD@ ) ) / ( ( 4.000000 ) + 1.000000 ) ) ) ) - cpu@EXE_ACTIVITY.EXE_BOUND_0_PORTS@ ) / ( cpu@CPU_CLK_UNHALTED.THREAD@ ) )",
"MetricGroup": "Topdown_Group_Ports_Utilization",
"MetricName": "Topdown_Metric_Ports_Utilization"
},
Signed-off-by: Ian Rogers <irogers@google.com>
Acked-by: Jiri Olsa <jolsa@redhat.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Jin Yao <yao.jin@linux.intel.com>
Cc: John Garry <john.garry@huawei.com>
Cc: Kajol Jain <kjain@linux.ibm.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Paul Clarke <pc@us.ibm.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Stephane Eranian <eranian@google.com>
Link: http://lore.kernel.org/lkml/20200610235823.52557-2-irogers@google.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
d_ratio avoids division by 0 yielding infinity, such as when a counter
doesn't get scheduled. An example usage is:
{
"BriefDescription": "DCache L1 misses",
"MetricExpr": "d_ratio(MEM_LOAD_RETIRED.L1_MISS, MEM_LOAD_RETIRED.L1_HIT + MEM_LOAD_RETIRED.L1_MISS + MEM_LOAD_RETIRED.FB_HIT)",
"MetricGroup": "DCache;DCache_L1",
"MetricName": "DCache_L1_Miss",
"ScaleUnit": "100%",
}
Signed-off-by: Ian Rogers <irogers@google.com>
Acked-by: Jiri Olsa <jolsa@redhat.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Jin Yao <yao.jin@linux.intel.com>
Cc: John Garry <john.garry@huawei.com>
Cc: Kajol Jain <kjain@linux.ibm.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Paul Clarke <pc@us.ibm.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Stephane Eranian <eranian@google.com>
Link: http://lore.kernel.org/lkml/20200610235823.52557-1-irogers@google.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Fixups related to the introduction of libperf, where the
perf_{evsel,evlist}__ prefix is reserved for functions operating on
struct perf_{evsel,evlist}.
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Adding new metric test for frontend metric. It's stolen from x86 pmu
events.
Committer testing:
# perf test "Parse and process metrics"
67: Parse and process metrics : Ok
# perf test -v "Parse and process metrics"
#
67: Parse and process metrics :
--- start ---
test child forked, pid 104881
metric expr inst_retired.any / cpu_clk_unhalted.thread for IPC
found event inst_retired.any
found event cpu_clk_unhalted.thread
adding {inst_retired.any,cpu_clk_unhalted.thread}:W
metric expr idq_uops_not_delivered.core / (4 * (( ( cpu_clk_unhalted.thread / 2 ) * ( 1 + cpu_clk_unhalted.one_thread_active / cpu_clk_unhalted.ref_xclk ) ))) for Frontend_Bound_SMT
found event cpu_clk_unhalted.one_thread_active
found event cpu_clk_unhalted.ref_xclk
found event idq_uops_not_delivered.core
found event cpu_clk_unhalted.thread
adding {cpu_clk_unhalted.one_thread_active,cpu_clk_unhalted.ref_xclk,idq_uops_not_delivered.core,cpu_clk_unhalted.thread}:W
test child finished with 0
---- end ----
Parse and process metrics: Ok
#
Had to fix it to initialize that 'struct value' array sentinel with a
named initializer to fix the build with some versions of clang:
tests/parse-metric.c:154:7: error: missing field 'val' initializer [-Werror,-Wmissing-field-initializers]
{ 0 },
Signed-off-by: Jiri Olsa <jolsa@kernel.org>
Acked-by: Ian Rogers <irogers@google.com>
Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Michael Petlan <mpetlan@redhat.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Stephane Eranian <eranian@google.com>
Link: http://lore.kernel.org/lkml/20200602214741.1218986-14-jolsa@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Adding new test that process metrics code and checks the expected
results. Starting with easy ipc metric.
Committer testing:
# perf test "Parse and process metrics"
67: Parse and process metrics : Ok
#
# perf test -v "Parse and process metrics"
67: Parse and process metrics :
--- start ---
test child forked, pid 103402
metric expr inst_retired.any / cpu_clk_unhalted.thread for IPC
found event inst_retired.any
found event cpu_clk_unhalted.thread
adding {inst_retired.any,cpu_clk_unhalted.thread}:W
test child finished with 0
---- end ----
Parse and process metrics: Ok
#
Had to fix it to initialize that 'struct value' array sentinel with a
named initializer to fix the build with some versions of clang:
tests/parse-metric.c:135:7: error: missing field 'val' initializer [-Werror,-Wmissing-field-initializers]
{ 0 },
Signed-off-by: Jiri Olsa <jolsa@kernel.org>
Acked-by: Ian Rogers <irogers@google.com>
Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Michael Petlan <mpetlan@redhat.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Stephane Eranian <eranian@google.com>
Link: http://lore.kernel.org/lkml/20200602214741.1218986-13-jolsa@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Adding test_generic_metric that prepares and runs given metric over the
data from struct runtime_stat object.
Signed-off-by: Jiri Olsa <jolsa@kernel.org>
Acked-by: Ian Rogers <irogers@google.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Michael Petlan <mpetlan@redhat.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Stephane Eranian <eranian@google.com>
Link: http://lore.kernel.org/lkml/20200602214741.1218986-12-jolsa@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
We don't release metric_events rblist, add the missing delete hook and
call the release before leaving cmd_stat.
Signed-off-by: Jiri Olsa <jolsa@kernel.org>
Acked-by: Ian Rogers <irogers@google.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Michael Petlan <mpetlan@redhat.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Stephane Eranian <eranian@google.com>
Link: http://lore.kernel.org/lkml/20200602214741.1218986-11-jolsa@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Factoring out prepare_metric function so it can be used in test
interface coming in following changes.
Signed-off-by: Jiri Olsa <jolsa@kernel.org>
Acked-by: Ian Rogers <irogers@google.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Michael Petlan <mpetlan@redhat.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Stephane Eranian <eranian@google.com>
Link: http://lore.kernel.org/lkml/20200602214741.1218986-10-jolsa@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Add the metricgroup__parse_groups_test function. It will be used as
test's interface to metric parsing in following changes.
Signed-off-by: Jiri Olsa <jolsa@kernel.org>
Acked-by: Ian Rogers <irogers@google.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Michael Petlan <mpetlan@redhat.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Stephane Eranian <eranian@google.com>
Link: http://lore.kernel.org/lkml/20200602214741.1218986-9-jolsa@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
For testing purposes we need to pass our own map of events from
parse_groups() through metricgroup__add_metric.
Acked-by: Ian Rogers <irogers@google.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Ingo Molnar <mingo@kernel.org>
Cc: Michael Petlan <mpetlan@redhat.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Stephane Eranian <eranian@google.com>
Link: http://lore.kernel.org/lkml/20200602214741.1218986-8-jolsa@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Allow to pass fake_pmu in parse_groups function so it can be used in
parse_events call.
It's will be passed by the upcoming metricgroup__parse_groups_test
function.
Committer notes:
Made it a 'struct perf_pmu' pointer, in line with the changes at the
start of this patchkit to avoid statics deep down in library code.
Acked-by: Ian Rogers <irogers@google.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Ingo Molnar <mingo@kernel.org>
Cc: Michael Petlan <mpetlan@redhat.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Stephane Eranian <eranian@google.com>
Link: http://lore.kernel.org/lkml/20200602214741.1218986-6-jolsa@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Factor out the parse_groups function, it will be used for new test
interface coming in following changes.
Signed-off-by: Jiri Olsa <jolsa@kernel.org>
Acked-by: Ian Rogers <irogers@google.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Michael Petlan <mpetlan@redhat.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Stephane Eranian <eranian@google.com>
Link: http://lore.kernel.org/lkml/20200602214741.1218986-6-jolsa@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
The test goes through all metrics compiled for arch within pmu events
and try to parse them.
This test is different from 'test_parsing' in that we go through all the
events in the current arch, not just one defined for current CPU model.
Using 'fake_pmu' to parse events which do not have PMUs defined in the
system.
Say there's bad change in ivybridge metrics file, like:
- a/tools/perf/pmu-events/arch/x86/ivybridge/ivb-metrics.json
+ b/tools/perf/pmu-events/arch/x86/ivybridge/ivb-metrics.json
@@ -8,7 +8,7 @@
- "MetricExpr": "IDQ_UOPS_NOT_DELIVERED.CORE / (4 * ((
+ "MetricExpr": "IDQ_UOPS_NOT_DELIVERED.CORE / / (4 *
the test fails with (on my kabylake laptop):
$ perf test 'Parsing of PMU event table metrics with fake PMUs' -v
parsing 'idq_uops_not_delivered.core / / (4 * (( ( cpu_clk_unh...
syntax error, line 1
expr__parse failed
test child finished with -1
...
The test also defines its own list of metrics and tries to parse them.
It's handy for developing.
Committer notes:
Testing it:
$ perf test fake
10: PMU events :
10.4: Parsing of PMU event table metrics with fake PMUs : FAILED!
$ perf test -v fake |& tail
parsing '(unc_p_freq_trans_cycles / unc_p_clockticks) * 100.'
parsing '(unc_m_power_channel_ppd / unc_m_clockticks) * 100.'
parsing '(unc_m_power_critical_throttle_cycles / unc_m_clockticks) * 100.'
parsing '(unc_m_power_self_refresh / unc_m_clockticks) * 100.'
parsing 'idq_uops_not_delivered.core / * (4 * cycles)'
syntax error
expr__parse failed
test child finished with -1
---- end ----
PMU events subtest 4: FAILED!
$
And fix this error:
tests/pmu-events.c:437:40: error: missing field 'idx' initializer [-Werror,-Wmissing-field-initializers]
struct parse_events_error error = { 0 };
Signed-off-by: Jiri Olsa <jolsa@kernel.org>
Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Ian Rogers <irogers@google.com>
Cc: Ingo Molnar <mingo@kernel.org>
Cc: Michael Petlan <mpetlan@redhat.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Stephane Eranian <eranian@google.com>
Link: http://lore.kernel.org/lkml/20200602214741.1218986-5-jolsa@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
When wanting to use the support in __parse_events() for fake pmus, just
pass it.
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Ian Rogers <irogers@google.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Michael Petlan <mpetlan@redhat.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Stephane Eranian <eranian@google.com>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
This is an alternative patch to what Jiri sent that instead of changing
all callers to parse_events() for allowing to pass a fake_pmu, provide
another function specifically for that.
From Jiri's patch:
This way it's possible to parse events from PMUs which are not present
in the system. It's available only for testing purposes coming in
following changes, so all the current users set fake_pmu argument as
false.
Based-on-a-patch-by: Jiri Olsa <jolsa@kernel.org>
Link: http://lore.kernel.org/lkml/20200602214741.1218986-3-jolsa@kernel.org
Acked-by: Jiri Olsa <jolsa@kernel.org>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Ian Rogers <irogers@google.com>
Cc: Ingo Molnar <mingo@kernel.org>
Cc: Michael Petlan <mpetlan@redhat.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Stephane Eranian <eranian@google.com>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Separating the generic part of check_parse_id function,
so it can be used in following changes for the new test.
Committer notes:
Fix this error:
tests/pmu-events.c:413:40: error: missing field 'idx' initializer [-Werror,-Wmissing-field-initializers]
struct parse_events_error error = { 0 };
Signed-off-by: Jiri Olsa <jolsa@kernel.org>
Acked-by: Ian Rogers <irogers@google.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Michael Petlan <mpetlan@redhat.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Stephane Eranian <eranian@google.com>
Link: http://lore.kernel.org/lkml/20200602214741.1218986-4-jolsa@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Add a way to create a pmu event without the actual PMU being in place.
This way we can test metrics defined for any processor.
The interface is to define fake_pmu in struct parse_events_state data.
It will be used only in tests via special interface function added in
following changes.
Signed-off-by: Jiri Olsa <jolsa@kernel.org>
Acked-by: Ian Rogers <irogers@google.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Michael Petlan <mpetlan@redhat.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Stephane Eranian <eranian@google.com>
Link: http://lore.kernel.org/lkml/20200602214741.1218986-2-jolsa@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
The '>' expression itself is bool, no need to convert it to bool again.
This fixes the following coccicheck warning:
tools/perf/ui/browsers/annotate.c:212:30-35: WARNING: conversion to bool
not needed here
Signed-off-by: Jason Yan <yanaijie@huawei.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: http://lore.kernel.org/lkml/20200420123528.11655-1-yanaijie@huawei.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
On some platforms the default encoding is not utf-8, which causes an
UnicodeDecodeError when reading the flamegraph template and writing the
flamegraph
Signed-off-by: Andreas Gerstmayr <agerstmayr@redhat.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Michael Petlan <mpetlan@redhat.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: http://lore.kernel.org/lkml/20200619153232.203537-1-agerstmayr@redhat.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
When build perf with ASan or UBSan, if libasan or libubsan can not find,
the feature-glibc is 0 and there exists the following error log which is
wrong, because we can find gnu/libc-version.h in /usr/include,
glibc-devel is also installed.
[yangtiezhu@linux perf]$ make DEBUG=1 EXTRA_CFLAGS='-fno-omit-frame-pointer -fsanitize=address'
BUILD: Doing 'make -j4' parallel build
HOSTCC fixdep.o
HOSTLD fixdep-in.o
LINK fixdep
<stdin>:1:0: warning: -fsanitize=address and -fsanitize=kernel-address are not supported for this target
<stdin>:1:0: warning: -fsanitize=address not supported for this target
Auto-detecting system features:
... dwarf: [ OFF ]
... dwarf_getlocations: [ OFF ]
... glibc: [ OFF ]
... gtk2: [ OFF ]
... libaudit: [ OFF ]
... libbfd: [ OFF ]
... libcap: [ OFF ]
... libelf: [ OFF ]
... libnuma: [ OFF ]
... numa_num_possible_cpus: [ OFF ]
... libperl: [ OFF ]
... libpython: [ OFF ]
... libcrypto: [ OFF ]
... libunwind: [ OFF ]
... libdw-dwarf-unwind: [ OFF ]
... zlib: [ OFF ]
... lzma: [ OFF ]
... get_cpuid: [ OFF ]
... bpf: [ OFF ]
... libaio: [ OFF ]
... libzstd: [ OFF ]
... disassembler-four-args: [ OFF ]
Makefile.config:393: *** No gnu/libc-version.h found, please install glibc-dev[el]. Stop.
Makefile.perf:224: recipe for target 'sub-make' failed
make[1]: *** [sub-make] Error 2
Makefile:69: recipe for target 'all' failed
make: *** [all] Error 2
[yangtiezhu@linux perf]$ ls /usr/include/gnu/libc-version.h
/usr/include/gnu/libc-version.h
After install libasan and libubsan, the feature-glibc is 1 and the build
process is success, so the cause is related with libasan or libubsan, we
should check them and print an error log to reflect the reality.
Committer testing:
$ rm -rf /tmp/build/perf ; mkdir -p /tmp/build/perf
$ make DEBUG=1 EXTRA_CFLAGS='-fno-omit-frame-pointer -fsanitize=address' O=/tmp/build/perf -C tools/perf/ install-bin
make: Entering directory '/home/acme/git/perf/tools/perf'
BUILD: Doing 'make -j12' parallel build
HOSTCC /tmp/build/perf/fixdep.o
HOSTLD /tmp/build/perf/fixdep-in.o
LINK /tmp/build/perf/fixdep
Auto-detecting system features:
... dwarf: [ OFF ]
... dwarf_getlocations: [ OFF ]
... glibc: [ OFF ]
... gtk2: [ OFF ]
... libbfd: [ OFF ]
... libcap: [ OFF ]
... libelf: [ OFF ]
... libnuma: [ OFF ]
... numa_num_possible_cpus: [ OFF ]
... libperl: [ OFF ]
... libpython: [ OFF ]
... libcrypto: [ OFF ]
... libunwind: [ OFF ]
... libdw-dwarf-unwind: [ OFF ]
... zlib: [ OFF ]
... lzma: [ OFF ]
... get_cpuid: [ OFF ]
... bpf: [ OFF ]
... libaio: [ OFF ]
... libzstd: [ OFF ]
... disassembler-four-args: [ OFF ]
Makefile.config:401: *** No libasan found, please install libasan. Stop.
make[1]: *** [Makefile.perf:231: sub-make] Error 2
make: *** [Makefile:70: all] Error 2
make: Leaving directory '/home/acme/git/perf/tools/perf'
$
$
$ sudo dnf install libasan
<SNIP>
Installed:
libasan-9.3.1-2.fc31.x86_64
$
$
$ make DEBUG=1 EXTRA_CFLAGS='-fno-omit-frame-pointer -fsanitize=address' O=/tmp/build/perf -C tools/perf/ install-bin
make: Entering directory '/home/acme/git/perf/tools/perf'
BUILD: Doing 'make -j12' parallel build
Auto-detecting system features:
... dwarf: [ on ]
... dwarf_getlocations: [ on ]
... glibc: [ on ]
... gtk2: [ on ]
... libbfd: [ on ]
... libcap: [ on ]
... libelf: [ on ]
... libnuma: [ on ]
... numa_num_possible_cpus: [ on ]
... libperl: [ on ]
... libpython: [ on ]
... libcrypto: [ on ]
... libunwind: [ on ]
... libdw-dwarf-unwind: [ on ]
... zlib: [ on ]
... lzma: [ on ]
... get_cpuid: [ on ]
... bpf: [ on ]
... libaio: [ on ]
... libzstd: [ on ]
... disassembler-four-args: [ on ]
<SNIP>
CC /tmp/build/perf/util/pmu-flex.o
FLEX /tmp/build/perf/util/expr-flex.c
CC /tmp/build/perf/util/expr-bison.o
CC /tmp/build/perf/util/expr.o
CC /tmp/build/perf/util/expr-flex.o
CC /tmp/build/perf/util/parse-events-flex.o
CC /tmp/build/perf/util/parse-events.o
LD /tmp/build/perf/util/intel-pt-decoder/perf-in.o
LD /tmp/build/perf/util/perf-in.o
LD /tmp/build/perf/perf-in.o
LINK /tmp/build/perf/perf
<SNIP>
INSTALL python-scripts
INSTALL perf_completion-script
INSTALL perf-tip
make: Leaving directory '/home/acme/git/perf/tools/perf'
$ ldd ~/bin/perf | grep asan
libasan.so.5 => /lib64/libasan.so.5 (0x00007f0904164000)
$
And if we rebuild without -fsanitize-address:
$ rm -rf /tmp/build/perf ; mkdir -p /tmp/build/perf
$ make O=/tmp/build/perf -C tools/perf/ install-bin
make: Entering directory '/home/acme/git/perf/tools/perf'
BUILD: Doing 'make -j12' parallel build
HOSTCC /tmp/build/perf/fixdep.o
HOSTLD /tmp/build/perf/fixdep-in.o
LINK /tmp/build/perf/fixdep
Auto-detecting system features:
... dwarf: [ on ]
... dwarf_getlocations: [ on ]
... glibc: [ on ]
... gtk2: [ on ]
... libbfd: [ on ]
... libcap: [ on ]
... libelf: [ on ]
... libnuma: [ on ]
... numa_num_possible_cpus: [ on ]
... libperl: [ on ]
... libpython: [ on ]
... libcrypto: [ on ]
... libunwind: [ on ]
... libdw-dwarf-unwind: [ on ]
... zlib: [ on ]
... lzma: [ on ]
... get_cpuid: [ on ]
... bpf: [ on ]
... libaio: [ on ]
... libzstd: [ on ]
... disassembler-four-args: [ on ]
GEN /tmp/build/perf/common-cmds.h
CC /tmp/build/perf/exec-cmd.o
<SNIP>
INSTALL perf_completion-script
INSTALL perf-tip
make: Leaving directory '/home/acme/git/perf/tools/perf'
$ ldd ~/bin/perf | grep asan
$
Signed-off-by: Tiezhu Yang <yangtiezhu@loongson.cn>
Acked-by: Jiri Olsa <jolsa@redhat.com>
Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: tiezhu yang <yangtiezhu@loongson.cn>
Cc: xuefeng li <lixuefeng@loongson.cn>
Link: http://lore.kernel.org/lkml/1592445961-28044-1-git-send-email-yangtiezhu@loongson.cn
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Fixes segmentation fault when trying to interpret zstd-compressed data
with perf script:
```
$ perf record -z ls
...
[ perf record: Captured and wrote 0,010 MB perf.data, compressed (original 0,001 MB, ratio is 2,190) ]
$ memcheck perf script
...
==67911== Invalid read of size 4
==67911== at 0x5568188: ZSTD_decompressStream (in /usr/lib/libzstd.so.1.4.5)
==67911== by 0x6E726B: zstd_decompress_stream (zstd.c:100)
==67911== by 0x65729C: perf_session__process_compressed_event (session.c:72)
==67911== by 0x6598E8: perf_session__process_user_event (session.c:1583)
==67911== by 0x65BA59: reader__process_events (session.c:2177)
==67911== by 0x65BA59: __perf_session__process_events (session.c:2234)
==67911== by 0x65BA59: perf_session__process_events (session.c:2267)
==67911== by 0x5A7397: __cmd_script (builtin-script.c:2447)
==67911== by 0x5A7397: cmd_script (builtin-script.c:3840)
==67911== by 0x5FE9D2: run_builtin (perf.c:312)
==67911== by 0x711627: handle_internal_command (perf.c:364)
==67911== by 0x711627: run_argv (perf.c:408)
==67911== by 0x711627: main (perf.c:538)
==67911== Address 0x71d8 is not stack'd, malloc'd or (recently) free'd
```
Signed-off-by: Milian Wolff <milian.wolff@kdab.com>
Acked-by: Alexey Budankov <alexey.budankov@linux.intel.com>
Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
LPU-Reference: 20200612230333.72140-1-milian.wolff@kdab.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
This avoids multiple declarations if the flex header is included.
Signed-off-by: Ian Rogers <irogers@google.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Jin Yao <yao.jin@linux.intel.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: John Garry <john.garry@huawei.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Stephane Eranian <eranian@google.com>
Link: http://lore.kernel.org/lkml/20200609234344.3795-1-irogers@google.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Arrays are pointer types and don't need their address taking.
Fixes: 8255718f4b (perf pmu: Expand PMU events by prefix match)
Signed-off-by: Ian Rogers <irogers@google.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Stephane Eranian <eranian@google.com>
Link: http://lore.kernel.org/lkml/20200609053610.206588-1-irogers@google.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
If config->aggr_map is NULL and config->aggr_get_id is not NULL,
the function print_aggr() will still calling arrg_update_shadow(),
which can result in accessing the invalid pointer.
Fixes: 088519f318 ("perf stat: Move the display functions to stat-display.c")
Signed-off-by: Hongbo Yao <yaohongbo@huawei.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Wei Li <liwei391@huawei.com>
Link: https://lore.kernel.org/lkml/20200608163625.GC3073@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
The 'evname' variable can be NULL, as it is checked a few lines back,
check it before using.
Fixes: 9e207ddfa2 ("perf report: Show call graph from reference events")
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Kan Liang <kan.liang@intel.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: http://lore.kernel.org/lkml/
Signed-off-by: Gaurav Singh <gaurav1086@gmail.com>
Update the copies of files affected by:
c8ffd8bcdd ("vfs: add faccessat2 syscall")
To address this perf build warning:
Warning: Kernel ABI header at 'tools/include/uapi/linux/fcntl.h' differs from latest version at 'include/uapi/linux/fcntl.h'
diff -u tools/include/uapi/linux/fcntl.h include/uapi/linux/fcntl.h
Warning: Kernel ABI header at 'tools/include/uapi/asm-generic/unistd.h' differs from latest version at 'include/uapi/asm-generic/unistd.h'
diff -u tools/include/uapi/asm-generic/unistd.h include/uapi/asm-generic/unistd.h
Warning: Kernel ABI header at 'tools/perf/arch/x86/entry/syscalls/syscall_64.tbl' differs from latest version at 'arch/x86/entry/syscalls/syscall_64.tbl'
diff -u tools/perf/arch/x86/entry/syscalls/syscall_64.tbl arch/x86/entry/syscalls/syscall_64.tbl
Which results in 'perf trace' gaining support for the 'faccessat2'
syscall, now one can use:
# perf trace -e faccessat2
And have system wide tracing of this syscall. And this also will include
it;
# perf trace -e faccess*
Together with the other variants.
How it affects building/usage (on an x86_64 system):
$ cp /tmp/build/perf/arch/x86/include/generated/asm/syscalls_64.c /tmp/syscalls_64.c.before
$
[root@five ~]# perf trace -e faccessat2
event syntax error: 'faccessat2'
\___ parser error
Run 'perf list' for a list of valid events
Usage: perf trace [<options>] [<command>]
or: perf trace [<options>] -- <command> [<options>]
or: perf trace record [<options>] [<command>]
or: perf trace record [<options>] -- <command> [<options>]
-e, --event <event> event/syscall selector. use 'perf list' to list available events
[root@five ~]#
$ cp arch/x86/entry/syscalls/syscall_64.tbl tools/perf/arch/x86/entry/syscalls/syscall_64.tbl
$ git diff
diff --git a/tools/perf/arch/x86/entry/syscalls/syscall_64.tbl b/tools/perf/arch/x86/entry/syscalls/syscall_64.tbl
index 37b844f839bc..78847b32e137 100644
--- a/tools/perf/arch/x86/entry/syscalls/syscall_64.tbl
+++ b/tools/perf/arch/x86/entry/syscalls/syscall_64.tbl
@@ -359,6 +359,7 @@
435 common clone3 sys_clone3
437 common openat2 sys_openat2
438 common pidfd_getfd sys_pidfd_getfd
+439 common faccessat2 sys_faccessat2
#
# x32-specific system call numbers start at 512 to avoid cache impact
$
$ make -C tools/perf O=/tmp/build/perf/ install-bin
<SNIP>
CC /tmp/build/perf/util/syscalltbl.o
LD /tmp/build/perf/util/perf-in.o
LD /tmp/build/perf/perf-in.o
LINK /tmp/build/perf/perf
<SNIP>
[root@five ~]# perf trace -e faccessat2
^C[root@five ~]#
Cc: Miklos Szeredi <mszeredi@redhat.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
There exists some duplicated includes in tools/perf, remove them.
Signed-off-by: Tiezhu Yang <yangtiezhu@loongson.cn>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: xuefeng li <lixuefeng@loongson.cn>
Link: http://lore.kernel.org/lkml/1591071304-19338-2-git-send-email-yangtiezhu@loongson.cn
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Jin Yao reported the issue (and posted first versions of this change)
with groups being defined over events with different cpu mask.
This causes assert aborts in get_group_fd, like:
# perf stat -M "C2_Pkg_Residency" -a -- sleep 1
perf: util/evsel.c:1464: get_group_fd: Assertion `!(fd == -1)' failed.
Aborted
All the events in the group have to be defined over the same cpus so the
group_fd can be found for every leader/member pair.
Adding check to ensure this condition is met and removing the group
(with warning) if we detect mixed cpus, like:
$ sudo perf stat -e '{power/energy-cores/,cycles},{instructions,power/energy-cores/}'
WARNING: event cpu maps do not match, disabling group:
anon group { power/energy-cores/, cycles }
anon group { instructions, power/energy-cores/ }
Ian asked also for cpu maps details, it's displayed in verbose mode:
$ sudo perf stat -e '{cycles,power/energy-cores/}' -v
WARNING: group events cpu maps do not match, disabling group:
anon group { power/energy-cores/, cycles }
power/energy-cores/: 0
cycles: 0-7
anon group { instructions, power/energy-cores/ }
instructions: 0-7
power/energy-cores/: 0
Committer testing:
[root@seventh ~]# perf stat -e '{power/energy-cores/,cycles},{instructions,power/energy-cores/}'
WARNING: grouped events cpus do not match, disabling group:
anon group { power/energy-cores/, cycles }
anon group { instructions, power/energy-cores/ }
^C
Performance counter stats for 'system wide':
12.62 Joules power/energy-cores/
106,920,637 cycles
80,228,899 instructions # 0.75 insn per cycle
12.62 Joules power/energy-cores/
14.514476987 seconds time elapsed
[root@seventh ~]#
But if we put compatible events in each group it works:
[root@seventh ~]# perf stat -e '{power/energy-cores/,power/energy-ram/},{instructions,cycles}' -a sleep 2
Performance counter stats for 'system wide':
1.95 Joules power/energy-cores/
0.92 Joules power/energy-ram/
29,305,715 instructions # 1.03 insn per cycle
28,423,338 cycles
2.001438142 seconds time elapsed
[root@seventh ~]#
This needs improvement tho:
[root@seventh ~]# perf stat -e '{power/energy-cores/,power/energy-ram/},{instructions,cycles}' sleep 2
Error:
The sys_perf_event_open() syscall returned with 22 (Invalid argument) for event (power/energy-cores/).
/bin/dmesg | grep -i perf may provide additional information.
[root@seventh ~]#
We need to emit a better message, one stating that the power/ events
can't be used for a specific workload, instead it is per-cpu or system
wide.
Fixes: 6a4bb04caa ("perf tools: Enable grouping logic for parsed events")
Co-developed-by: Jin Yao <yao.jin@linux.intel.com>
Signed-off-by: Jiri Olsa <jolsa@kernel.org>
Acked-by: Ian Rogers <irogers@google.com>
Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Michael Petlan <mpetlan@redhat.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Stephane Eranian <eranian@google.com>
Link: http://lore.kernel.org/lkml/20200602101736.GE1112120@krava
Signed-off-by: Jin Yao <yao.jin@linux.intel.com>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
This is currently working due to extra include paths in the build.
Before:
$ cd tools/perf/arch/arm64/util
$ ls -la ../../util/unwind-libdw.h
ls: cannot access '../../util/unwind-libdw.h': No such file or directory
After:
$ ls -la ../../../util/unwind-libdw.h
-rw-r----- 1 irogers irogers 553 Apr 17 14:31 ../../../util/unwind-libdw.h
Signed-off-by: Ian Rogers <irogers@google.com>
Acked-by: Namhyung Kim <namhyung@kernel.org>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Stephane Eranian <eranian@google.com>
Link: http://lore.kernel.org/lkml/20200529225232.207532-1-irogers@google.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
This patch is to add four options to synthesize events which are
described as below:
'f': synthesize first level cache events
'm': synthesize last level cache events
't': synthesize TLB events
'a': synthesize remote access events
This four options will be used by ARM SPE as their first consumer.
Signed-off-by: Tan Xiaojun <tanxiaojun@huawei.com>
Tested-by: James Clark <james.clark@arm.com>
Acked-by: Adrian Hunter <adrian.hunter@intel.com>
Cc: Al Grant <al.grant@arm.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Ian Rogers <irogers@google.com>
Cc: Jin Yao <yao.jin@linux.intel.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Leo Yan <leo.yan@linaro.org>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Mathieu Poirier <mathieu.poirier@linaro.org>
Cc: Mike Leach <mike.leach@linaro.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Will Deacon <will@kernel.org>
Cc: linux-arm-kernel@lists.infradead.org
Link: http://lore.kernel.org/lkml/20200530122442.490-3-leo.yan@linaro.org
Signed-off-by: James Clark <james.clark@arm.com>
Signed-off-by: Leo Yan <leo.yan@linaro.org>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Create a new arm-spe-decoder directory for subsequent extensions and
move arm-spe-pkt-decoder.h/c to this directory. No code changes.
Signed-off-by: Tan Xiaojun <tanxiaojun@huawei.com>
Tested-by: James Clark <james.clark@arm.com>
Tested-by: Qi Liu <liuqi115@hisilicon.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Al Grant <al.grant@arm.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Ian Rogers <irogers@google.com>
Cc: Jin Yao <yao.jin@linux.intel.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Leo Yan <leo.yan@linaro.org>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Mathieu Poirier <mathieu.poirier@linaro.org>
Cc: Mike Leach <mike.leach@linaro.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Will Deacon <will@kernel.org>
Cc: linux-arm-kernel@lists.infradead.org
Link: http://lore.kernel.org/lkml/20200530122442.490-2-leo.yan@linaro.org
Signed-off-by: James Clark <james.clark@arm.com>
Signed-off-by: Leo Yan <leo.yan@linaro.org>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Avoid a false positive caused by assembly code in arch/x86.
In tests, zero the perf_event to avoid uninitialized memory uses.
Warnings were caught using clang with -fsanitize=memory.
Signed-off-by: Ian Rogers <irogers@google.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Alexei Starovoitov <ast@kernel.org>
Cc: Jakub Kicinski <kuba@kernel.org>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Quentin Monnet <quentin@isovalent.com>
Cc: Stephane Eranian <eranian@google.com>
Cc: clang-built-linux@googlegroups.com
Link: http://lore.kernel.org/lkml/20200530082015.39162-4-irogers@google.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
The tail call optimization can unexpectedly make the stack smaller and
cause the test to fail.
Signed-off-by: Ian Rogers <irogers@google.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Alexei Starovoitov <ast@kernel.org>
Cc: clang-built-linux@googlegroups.com
Cc: Jakub Kicinski <kuba@kernel.org>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Quentin Monnet <quentin@isovalent.com>
Cc: Stephane Eranian <eranian@google.com>
Link: http://lore.kernel.org/lkml/20200530082015.39162-3-irogers@google.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
So that when one runs:
$ make -C tools/perf build-test
We make sure that recent changes don't break that opt-in build.
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Alexei Starovoitov <ast@kernel.org>
Cc: Alexey Budankov <alexey.budankov@linux.intel.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Andrii Nakryiko <andriin@fb.com>
Cc: Daniel Borkmann <daniel@iogearbox.net>
Cc: Florian Fainelli <f.fainelli@gmail.com>
Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Cc: Ian Rogers <irogers@google.com>
Cc: Igor Lubashev <ilubashe@akamai.com>
Cc: Jin Yao <yao.jin@linux.intel.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Jiwei Sun <jiwei.sun@windriver.com>
Cc: John Garry <john.garry@huawei.com>
Cc: Kan Liang <kan.liang@linux.intel.com>
Cc: Leo Yan <leo.yan@linaro.org>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Martin KaFai Lau <kafai@fb.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Stephane Eranian <eranian@google.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Yonghong Song <yhs@fb.com>
Cc: yuzhoujian <yuzhoujian@didichuxing.com>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
This patch links perf with the libpfm4 library if it is available and
LIBPFM4 is passed to the build. The libpfm4 library contains hardware
event tables for all processors supported by perf_events. It is a helper
library that helps convert from a symbolic event name to the event
encoding required by the underlying kernel interface. This library is
open-source and available from: http://perfmon2.sf.net.
With this patch, it is possible to specify full hardware events by name.
Hardware filters are also supported. Events must be specified via the
--pfm-events and not -e option. Both options are active at the same time
and it is possible to mix and match:
$ perf stat --pfm-events inst_retired:any_p:c=1:i -e cycles ....
One needs to explicitely ask for its inclusion by using the LIBPFM4 make
command line option, ie its opt-in rather than opt-out of feature
detection and build support.
Signed-off-by: Stephane Eranian <eranian@google.com>
Reviewed-by: Ian Rogers <irogers@google.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Alexei Starovoitov <ast@kernel.org>
Cc: Alexey Budankov <alexey.budankov@linux.intel.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Andrii Nakryiko <andriin@fb.com>
Cc: Daniel Borkmann <daniel@iogearbox.net>
Cc: Florian Fainelli <f.fainelli@gmail.com>
Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Cc: Igor Lubashev <ilubashe@akamai.com>
Cc: Jin Yao <yao.jin@linux.intel.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Jiwei Sun <jiwei.sun@windriver.com>
Cc: John Garry <john.garry@huawei.com>
Cc: Kan Liang <kan.liang@linux.intel.com>
Cc: Leo Yan <leo.yan@linaro.org>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Martin KaFai Lau <kafai@fb.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Yonghong Song <yhs@fb.com>
Cc: bpf@vger.kernel.org
Cc: netdev@vger.kernel.org
Cc: yuzhoujian <yuzhoujian@didichuxing.com>
Link: http://lore.kernel.org/lkml/20200505182943.218248-2-irogers@google.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
This header is part of the jsmn JSON parser, introduced in 867a979a83.
Correct the SPDX tag to indicate that it is under the MIT license.
Signed-off-by: Ed Maste <emaste@freebsd.org>
Acked-by: Andi Kleen <ak@linux.intel.com>
Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Link: http://lore.kernel.org/lkml/20200528170858.48457-1-emaste@freefall.freebsd.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Fix an issue where addresses in the DWARF line table are offset by -0x40
(GEN_ELF_TEXT_OFFSET). This can be seen with `objdump -S` on the ELF
files after perf inject.
Committer notes:
Ian added this in his Acked-by reply:
---
Without too much knowledge this looks good to me. The original code came
from oprofile's jit support:
https://sourceforge.net/p/oprofile/oprofile/ci/master/tree/opjitconv/debug_line.c#l325
---
Signed-off-by: Nick Gasson <nick.gasson@arm.com>
Acked-by: Ian Rogers <irogers@google.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Stephane Eranian <eranian@google.com>
Link: http://lore.kernel.org/lkml/20200528051916.6722-1-nick.gasson@arm.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
For each PC/BCI pair in the JVMTI compiler inlining record table, the
jitdump plugin emits debug line table entries for every source line in
the method preceding that BCI. Instead only emit one source line per
PC/BCI pair. Reported by Ian Rogers. This reduces the .dump size for
SPECjbb from ~230MB to ~40MB.
Signed-off-by: Nick Gasson <nick.gasson@arm.com>
Acked-by: Ian Rogers <irogers@google.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Stephane Eranian <eranian@google.com>
Link: http://lore.kernel.org/lkml/20200528054049.13662-1-nick.gasson@arm.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
We forgot to add it, so one would have to explicitely ask for it to be
run, fix that by adding it to the set of tests that are performed by
default when one does:
$ make -C tools/perf build-test
It was being exercised only in the make_minimal test, this patch makes
it be tested in isolation, i.e. disabling only this feature.
Fixes: e26e63be64 ("perf build: Add sdt feature detection")
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Masami Hiramatsu <mhiramat@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
We forgot to add it, so one would have to explicitely ask for it to be
run, fix that by adding it to the set of tests that are performed by
default when one does:
$ make -C tools/perf build-test
It was being exercised only in the make_minimal test, this patch makes
it be tested in isolation, i.e. disabling only this feature.
Fixes: 8ee4646038 ("perf build: Add libcrypto feature detection")
Cc: Stephane Eranian <eranian@google.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
So that we make sure that even on x86-64 and other architectures where
that is the default method we test build the fallback to libaudit that
other architectures use.
I.e. now this line got added to:
$ make -C tools/perf build-test
<SNIP>
make_no_syscall_tbl_O: cd . && make NO_SYSCALL_TABLE=1 FEATURES_DUMP=/home/acme/git/perf/tools/perf/BUILD_TEST_FEATURE_DUMP -j12 O=/tmp/tmp.W0HtKR1mfr DESTDIR=/tmp/tmp.lNezgCVPzW
<SNIP>
$
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Ingo Molnar <mingo@kernel.org>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>