Commit Graph

4313 Commits

Author SHA1 Message Date
Arnaldo Carvalho de Melo
ecc4c5614b perf tools: Propagate perf_config() errors
Previously these were being ignored, sometimes silently.

Stop doing that, emitting debug messages and handling the errors.

Testing it:

  $ cat ~/.perfconfig
  cat: /home/acme/.perfconfig: No such file or directory
  $ perf stat -e cycles usleep 1

   Performance counter stats for 'usleep 1':

           938,996      cycles:u

       0.003813731 seconds time elapsed

  $ perf top --stdio
  Error:
  You may not have permission to collect system-wide stats.

  Consider tweaking /proc/sys/kernel/perf_event_paranoid,
  <SNIP>
  [ perf record: Captured and wrote 0.019 MB perf.data (7 samples) ]
  [acme@jouet linux]$ perf report --stdio
  # To display the perf.data header info, please use --header/--header-only options.
  # Overhead  Command  Shared Object      Symbol
  # ........  .......  .................  .........................
    71.77%  usleep   libc-2.24.so       [.] _dl_addr
    27.07%  usleep   ld-2.24.so         [.] _dl_next_ld_env_entry
     1.13%  usleep   [kernel.kallsyms]  [k] page_fault
  $
  $ touch ~/.perfconfig
  $ ls -la ~/.perfconfig
  -rw-rw-r--. 1 acme acme 0 Jan 27 12:14 /home/acme/.perfconfig
  $
  $ perf stat -e instructions usleep 1

   Performance counter stats for 'usleep 1':

           244,610      instructions:u

       0.000805383 seconds time elapsed

  $
  [root@jouet ~]# chown acme.acme ~/.perfconfig
  [root@jouet ~]# perf stat -e cycles usleep 1
    Warning: File /root/.perfconfig not owned by current user or root, ignoring it.

   Performance counter stats for 'usleep 1':

           937,615      cycles

       0.000836931 seconds time elapsed
  #

Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: David Ahern <dsahern@gmail.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Wang Nan <wangnan0@huawei.com>
Link: http://lkml.kernel.org/n/tip-j2rq96so6xdqlr8p8rd6a3jx@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2017-01-27 12:23:33 -03:00
Arnaldo Carvalho de Melo
afc45cf52c perf config: Do not consider an error not to have any perfconfig file
While propagating the errors from perf_config(), which were being
completely ignored, everything stopped working for people without a
~/.perfconfig file, because the perf_config_set__init() was considering
an error not to have a .perfconfig file, duh, fix it by checking the
errno after the failed stat() call.

It should also not return an error when it says it is ignoring the file,
and also a empty file should not return an error either.

Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: David Ahern <dsahern@gmail.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Taeung Song <treeze.taeung@gmail.com>
Cc: Wang Nan <wangnan0@huawei.com>
Fixes: 8beeb00f2c ("perf config: Use new perf_config_set__init() to initialize config set")
Link: http://lkml.kernel.org/n/tip-ygpbab3apbs6l8wr97xedwks@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2017-01-27 10:28:34 -03:00
Ingo Molnar
e2cf00c257 New features:
- Introduce 'perf ftrace' a perf front end to the kernel's ftrace
   function and function_graph tracer, defaulting to the "function_graph"
   tracer, more work will be done in reviving this effort, forward porting
   it from its initial patch submission (Namhyung Kim)
 
 - Add 'e' and 'c' hotkeys to expand/collapse call chains for a single
   hist entry in the 'perf report' and 'perf top' TUI (Jiri Olsa)
 
 Fixes:
 
 - Fix wrong register name for arm64, used in 'perf probe' (He Kuang)
 
 - Fix map offsets in relocation in libbpf (Joe Stringer)
 
 - Fix looking up dwarf unwind stack info (Matija Glavinic Pecotic)
 
 Infrastructure:
 
 - libbpf prog functions sync with what is exported via uapi (Joe Stringer)
 
 Trivial:
 
 - Remove unnecessary checks and assignments in 'perf probe's
   try_to_find_absolute_address() (Markus Elfring)
 
 Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
 -----BEGIN PGP SIGNATURE-----
 Version: GnuPG v2
 
 iQIcBAABCAAGBQJYig7UAAoJENZQFvNTUqpAhJQP/iI0T7A8TNekPGLv7j20c302
 89N9+9TAFtVqjgr1hIzqQgGOqbOdAW1tU3VTPW92nNDBn9JV5qwuF9YWEiDaAVv2
 0bmV5hLnrNlymddm3pdg/PbD1TVlwk2NFxtrkPxuf/vx0ZhEGqsSrRUCR/xGXbtQ
 TcMg3rQquspV9JNv4HzFdQC9nsG1CGNotZKsE1avRw70pWAqCtF81B0m8teb6OWo
 5qnN+AMJlYcC+OGffROemUksuehkMvi5L8v1e/6RO/lU1qt9Jrc/2sT9cqvjVFNR
 k4c76cUgWOCYzDEotENMpU4bc6e/24DE2ydFeovihdXw8Qs4ajEA9LXKM4yW+ZoE
 MZE3GS153a8n+CvTfkB9Ow1QJ8rgmR/L0BuhmGb6bYW/MtuTRTShhSduZwOrIyap
 9KckHYti4p3oN3CKFYGO9PN3DRUdx+Xqg/miwrgjkPo09QFp+lzfFFOk0P2/Zqw2
 yfvdWeHxkkrwoWQIyMHVKp/E9jQPuyYqwnKdp68LCN+DgNiFpPpSA8id5e47RQDE
 otqrK8U/82ktakfrBijSPBI6EEqFg7ltip2KT/xlDMfnP9HtxgFhzrk52dyi6pM/
 jkBhJaTQhVZTyaFvUXuaLmBSdPpcaaGM4KJ+2iAayA2r0KLiDj6IdzD5ROCRFOvJ
 SFA472mIxNxUjpQEUTtc
 =tYKN
 -----END PGP SIGNATURE-----

Merge tag 'perf-core-for-mingo-4.11-20170126' of git://git.kernel.org/pub/scm/linux/kernel/git/acme/linux into perf/core

Pull the latest perf/core updates from Arnaldo Carvalho de Melo:

New features:

 - Introduce 'perf ftrace' a perf front end to the kernel's ftrace
   function and function_graph tracer, defaulting to the "function_graph"
   tracer, more work will be done in reviving this effort, forward porting
   it from its initial patch submission (Namhyung Kim)

 - Add 'e' and 'c' hotkeys to expand/collapse call chains for a single
   hist entry in the 'perf report' and 'perf top' TUI (Jiri Olsa)

Fixes:

 - Fix wrong register name for arm64, used in 'perf probe' (He Kuang)

 - Fix map offsets in relocation in libbpf (Joe Stringer)

 - Fix looking up dwarf unwind stack info (Matija Glavinic Pecotic)

Infrastructure changes:

 - libbpf prog functions sync with what is exported via uapi (Joe Stringer)

Trivial changes:

 - Remove unnecessary checks and assignments in 'perf probe's
   try_to_find_absolute_address() (Markus Elfring)

Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2017-01-26 16:20:59 +01:00
Namhyung Kim
a7619aef6d perf util: Add more debug message on failure path
It's helpful for debugging on tracing features.

Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Tested-by: Masami Hiramatsu <mhiramat@kernel.org>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Jeremy Eder <jeder@redhat.com>
Cc: Jiri Olsa <jolsa@redhat.com>,
Cc: Paul Mackerras <paulus@samba.org>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Stephane Eranian <eranian@google.com>
Cc: Steven Rostedt <rostedt@goodmis.org>
Link: http://lkml.kernel.org/n/tip-rjysr9ljiesymgk4qblteaty@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2017-01-26 11:43:00 -03:00
Namhyung Kim
cd4ceb6343 perf util: Save pid-cmdline mapping into tracing header
Current trace info data lacks the saved cmdline mapping which is needed
for pevent to find out the comm of a task.  Add this and bump up the
version number so that perf can determine its presence when reading.

This is mostly corresponding to trace.dat file version 6, but still
lacks 4 byte of number of cpus, and 10 bytes of type string - and I
think we don't need those anyway.

Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Tested-by: Masami Hiramatsu <mhiramat@kernel.org>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Jeremy Eder <jeder@redhat.com>
Cc: Jiri Olsa <jolsa@redhat.com>,
Cc: Paul Mackerras <paulus@samba.org>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Stephane Eranian <eranian@google.com>
Cc: Steven Rostedt <rostedt@goodmis.org>
[ Change version test from == to >= ]
Link: http://lkml.kernel.org/n/tip-vaooqpxsikxbb3359p0corcb@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2017-01-26 11:42:59 -03:00
Arnaldo Carvalho de Melo
0a87e7bc6c perf scripting perl: Do not die() when not founding event for a type
Do just like handling other cases i.e. print some debug message and
ignore the sample.

Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: David Ahern <dsahern@gmail.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Wang Nan <wangnan0@huawei.com>
Link: http://lkml.kernel.org/n/tip-t7kzlm3cxyvbd7d9n9554ai9@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2017-01-26 11:42:59 -03:00
Markus Elfring
d1d0e29cb7 perf probe: Delete an unnecessary assignment in try_to_find_absolute_address()
Remove an error code assignment which is redundant in an if branch for
the handling of a memory allocation failure because the same value was
set for the local variable "err" before.

Signed-off-by: Markus Elfring <elfring@users.sourceforge.net>
Acked-by: Masami Hiramatsu <mhiramat@kernel.org>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: He Kuang <hekuang@huawei.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Milian Wolff <milian.wolff@kdab.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Ravi Bangoria <ravi.bangoria@linux.vnet.ibm.com>
Cc: Wang Nan <wangnan0@huawei.com>
Cc: kernel-janitors@vger.kernel.org
Link: http://lkml.kernel.org/r/0ede09ec-79b6-c8bd-5b20-02c63ed98aab@users.sourceforge.net
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2017-01-26 11:42:56 -03:00
Markus Elfring
42e233cacc perf probe: Delete an unnecessary check in try_to_find_absolute_address()
Remove a condition check which is unnecessary at the end
because this source code place should usually only be reached
with a non-zero pointer.

Signed-off-by: Markus Elfring <elfring@users.sourceforge.net>
Acked-by: Masami Hiramatsu <mhiramat@kernel.org>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: He Kuang <hekuang@huawei.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Milian Wolff <milian.wolff@kdab.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Ravi Bangoria <ravi.bangoria@linux.vnet.ibm.com>
Cc: Wang Nan <wangnan0@huawei.com>
Cc: kernel-janitors@vger.kernel.org
Link: http://lkml.kernel.org/r/a3f2473b-6383-a326-bce0-b826423608b8@users.sourceforge.net
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2017-01-26 11:42:55 -03:00
Ingo Molnar
47cd95a632 Merge branch 'linus' into perf/core, to pick up fixes
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2017-01-25 15:52:46 +01:00
Matija Glavinic Pecotic
9343e45bf6 perf unwind: Fix looking up dwarf unwind stack info
Using perf with call graph method dwarf fails to provide backtrace
support for stripped binary even though .gnu_debuglink points to *.dbg
flavor with properly populated debug symbols.

Problem is reproduced on ARM (v7, v8), kernels 3.14.y, 4.4.y and
4.10.rc3.  Perf is configured with libunwind, and unwind dwarf support
[1]. Test code (stress_bt.c) can be found on [2].

Running (explicitly disable other unwinding methods):

  $ gcc -g -o stress_bt -fomit-frame-pointer -fno-unwind-tables \
	-fno-asynchronous-unwind-tables stress_bt.c
  $ perf record -N --call-graph dwarf ./stress_bt
  $ perf report

results in properly generated call graph. Stripping the binary and running
it results with missing call graph. Expected result is to have call graph:

  $ gcc -g -o stress_bt -fomit-frame-pointer -fno-unwind-tables \
	  -fno-asynchronous-unwind-tables stress_bt.c
  $ objcopy --only-keep-debug stress_bt stress_bt.dbg
  $ objcopy --strip-debug stress_bt
  $ objcopy --add-gnu-debuglink=stress_bt.dbg stress_bt
  $ perf record -N --call-graph dwarf ./stress_bt
  $ perf report

Problem is that perf doesn't try to read symbols pointed by gnu
debuglink.  Patch adds checking, and reading of the symbols from
debuglink and symsrc.  Order of the check is to first check within dso,
then check whether symsrc is defined and try to read from it. Finally,
debuglink is checked. Default locations of debug files are discussed in
[3] and [4]. Comments on RFC are on [5].

[1] https://wiki.linaro.org/LEG/Engineering/TOOLS/perf-callstack-unwinding
[2] [1]#Backtrace_stress_application
[3] https://sourceware.org/gdb/onlinedocs/gdb/Separate-Debug-Files.html
[4] https://sourceware.org/binutils/docs/binutils/objcopy.html
[5] https://lkml.org/lkml/2016/8/22/473

Signed-off-by: Matija Glavinic Pecotic <matija.glavinic-pecotic.ext@nokia.com>
Acked-by: Jiri Olsa <jolsa@kernel.org>
Cc: Alexander Sverdlin <alexander.sverdlin@nokia.com>
Cc: Masami Hiramatsu <mhiramat@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Link: http://lkml.kernel.org/r/d309d40a-463f-482b-68e1-1465326efdc1@nokia.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2017-01-18 12:29:52 -03:00
Soramichi AKIYAMA
d94386f28a perf evlist: Fix typo in deliver_sample()
This patch fixes a typo: s/delievery/delivery/

Signed-off-by: Soramichi Akiyama <akiyama@m.soramichi.jp>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: http://lkml.kernel.org/r/20170117222233.dfd92de0ad701e7c53396950@m.soramichi.jp
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2017-01-17 11:36:45 -03:00
Soramichi AKIYAMA
d25ed5d9fa perf tools: Move two variables usied in libperf from perf.c
The use_browser and perf_version_string variables are both declared in
perf.c but they are also referenced by other functions of libperf.a.

Therefore a user linking an own main() with libperf.a must declare those
two variables in their files even if the files never use the browser or
the version information.

This patch fixes this issue by moving use_browser and
perf_version_string out of perf.c to some other files.

Signed-off-by: Soramichi Akiyama <akiyama@m.soramichi.jp>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: http://lkml.kernel.org/r/20170117002237.c1aec0ce3b4d675dca018deb@m.soramichi.jp
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2017-01-17 11:36:45 -03:00
Masami Hiramatsu
613f050d68 perf probe: Fix to probe on gcc generated functions in modules
Fix to probe on gcc generated functions on modules. Since
probing on a module is based on its symbol name, it should
be adjusted on actual symbols.

E.g. without this fix, perf probe shows probe definition
on non-exist symbol as below.

  $ perf probe -m build-x86_64/net/netfilter/nf_nat.ko -F in_range*
  in_range.isra.12
  $ perf probe -m build-x86_64/net/netfilter/nf_nat.ko -D in_range
  p:probe/in_range nf_nat:in_range+0

With this fix, perf probe correctly shows a probe on
gcc-generated symbol.

  $ perf probe -m build-x86_64/net/netfilter/nf_nat.ko -D in_range
  p:probe/in_range nf_nat:in_range.isra.12+0

This also fixes same problem on online module as below.

  $ perf probe -m i915 -D assert_plane
  p:probe/assert_plane i915:assert_plane.constprop.134+0

Signed-off-by: Masami Hiramatsu <mhiramat@kernel.org>
Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: http://lkml.kernel.org/r/148411450673.9978.14905987549651656075.stgit@devbox
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2017-01-16 15:43:04 -03:00
Masami Hiramatsu
3e96dac7c9 perf probe: Add error checks to offline probe post-processing
Add error check codes on post processing and improve it for offline
probe events as:

 - post processing fails if no matched symbol found in map(-ENOENT)
   or strdup() failed(-ENOMEM).

 - Even if the symbol name is the same, it updates symbol address
   and offset.

Signed-off-by: Masami Hiramatsu <mhiramat@kernel.org>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: http://lkml.kernel.org/r/148411443738.9978.4617979132625405545.stgit@devbox
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2017-01-16 15:35:25 -03:00
Masami Hiramatsu
d2d4edbebe perf probe: Fix to show correct locations for events on modules
Fix to show correct locations for events on modules by relocating given
address instead of retrying after failure.

This happens when the module text size is big enough, bigger than
sh_addr, because the original code retries with given address + sh_addr
if it failed to find CU DIE at the given address.

Any address smaller than sh_addr always fails and it retries with the
correct address, but addresses bigger than sh_addr will get a CU DIE
which is on the given address (not adjusted by sh_addr).

In my environment(x86-64), the sh_addr of ".text" section is 0x10030.
Since i915 is a huge kernel module, we can see this issue as below.

  $ grep "[Tt] .*\[i915\]" /proc/kallsyms | sort | head -n1
  ffffffffc0270000 t i915_switcheroo_can_switch	[i915]

ffffffffc0270000 + 0x10030 = ffffffffc0280030, so we'll check
symbols cross this boundary.

  $ grep "[Tt] .*\[i915\]" /proc/kallsyms | grep -B1 ^ffffffffc028\
  | head -n 2
  ffffffffc027ff80 t haswell_init_clock_gating	[i915]
  ffffffffc0280110 t valleyview_init_clock_gating	[i915]

So setup probes on both function and see what happen.

  $ sudo ./perf probe -m i915 -a haswell_init_clock_gating \
        -a valleyview_init_clock_gating
  Added new events:
    probe:haswell_init_clock_gating (on haswell_init_clock_gating in i915)
    probe:valleyview_init_clock_gating (on valleyview_init_clock_gating in i915)

  You can now use it in all perf tools, such as:

  	perf record -e probe:valleyview_init_clock_gating -aR sleep 1

  $ sudo ./perf probe -l
    probe:haswell_init_clock_gating (on haswell_init_clock_gating@gpu/drm/i915/intel_pm.c in i915)
    probe:valleyview_init_clock_gating (on i915_vga_set_decode:4@gpu/drm/i915/i915_drv.c in i915)

As you can see, haswell_init_clock_gating is correctly shown,
but valleyview_init_clock_gating is not.

With this patch, both events are shown correctly.

  $ sudo ./perf probe -l
    probe:haswell_init_clock_gating (on haswell_init_clock_gating@gpu/drm/i915/intel_pm.c in i915)
    probe:valleyview_init_clock_gating (on valleyview_init_clock_gating@gpu/drm/i915/intel_pm.c in i915)

Committer notes:

In my case:

  # perf probe -m i915 -a haswell_init_clock_gating -a valleyview_init_clock_gating
  Added new events:
    probe:haswell_init_clock_gating (on haswell_init_clock_gating in i915)
    probe:valleyview_init_clock_gating (on valleyview_init_clock_gating in i915)

  You can now use it in all perf tools, such as:

	  perf record -e probe:valleyview_init_clock_gating -aR sleep 1

  # perf probe -l
    probe:haswell_init_clock_gating (on i915_getparam+432@gpu/drm/i915/i915_drv.c in i915)
    probe:valleyview_init_clock_gating (on __i915_printk+240@gpu/drm/i915/i915_drv.c in i915)
  #

  # readelf -SW /lib/modules/4.9.0+/build/vmlinux | egrep -w '.text|Name'
   [Nr] Name   Type      Address          Off    Size   ES Flg Lk Inf Al
   [ 1] .text  PROGBITS  ffffffff81000000 200000 822fd3 00  AX  0   0 4096
  #

  So both are b0rked, now with the fix:

  # perf probe -m i915 -a haswell_init_clock_gating -a valleyview_init_clock_gating
  Added new events:
    probe:haswell_init_clock_gating (on haswell_init_clock_gating in i915)
    probe:valleyview_init_clock_gating (on valleyview_init_clock_gating in i915)

  You can now use it in all perf tools, such as:

	perf record -e probe:valleyview_init_clock_gating -aR sleep 1

  # perf probe -l
    probe:haswell_init_clock_gating (on haswell_init_clock_gating@gpu/drm/i915/intel_pm.c in i915)
    probe:valleyview_init_clock_gating (on valleyview_init_clock_gating@gpu/drm/i915/intel_pm.c in i915)
  #

Both looks correct.

Signed-off-by: Masami Hiramatsu <mhiramat@kernel.org>
Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: http://lkml.kernel.org/r/148411436777.9978.1440275861947194930.stgit@devbox
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2017-01-16 15:14:06 -03:00
Andi Kleen
d02fc6bcd5 perf pmu: Factor out scale conversion code
Move the scale factor parsing code to an own function to reuse it in an
upcoming patch.

v2: Return error in case strdup returns NULL.

Signed-off-by: Andi Kleen <ak@linux.intel.com>
Acked-by: Jiri Olsa <jolsa@kernel.org>
Link: http://lkml.kernel.org/r/20170103150833.6694-2-andi@firstfloor.org
[ Keep returning -ENOMEM when strdup() fails in perf_pmu__parse_scale()/convert_scale() ]
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2017-01-16 14:59:15 -03:00
Jiri Olsa
0c5824498e perf record: Add switch-output size warning
Adding switch-output size warning if the requested
size of lower than the wakeup ring buffer size.

  $ perf record --switch-output=1K ls
  WARNING: switch-output data size lower than wakeup kernel buffer size (258K) expect bigger perf.data sizes
  ...

Signed-off-by: Jiri Olsa <jolsa@kernel.org>
Suggested-and-Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: David Ahern <dsahern@gmail.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Wang Nan <wangnan0@huawei.com>
Link: http://lkml.kernel.org/r/1483955520-29063-6-git-send-email-jolsa@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2017-01-11 16:48:02 -03:00
Jiri Olsa
9808143ba2 perf tools: Add unit_number__scnprintf function
Add unit_number__scnprintf function to display size units and use it in
-m option info message.

Before:
  $ perf record -m 10M ls
  rounding mmap pages size to 16777216 bytes (4096 pages)
  ...

After:
  $ perf record -m 10M ls
  rounding mmap pages size to 16M (4096 pages)
  ...

Signed-off-by: Jiri Olsa <jolsa@kernel.org>
Cc: David Ahern <dsahern@gmail.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Wang Nan <wangnan0@huawei.com>
Link: http://lkml.kernel.org/r/1483955520-29063-2-git-send-email-jolsa@kernel.org
[ Rename it to unit_number__scnprintf for consistency ]
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2017-01-11 16:48:01 -03:00
Soramichi Akiyama
e978be9ea2 perf evlist: Fix typo in perf_evlist__start_workload()
This patch fixes a typo: s/enable to/unable to/

Signed-off-by: Soramichi AKIYAMA <akiyama@m.soramichi.jp>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Namhyung Kim <namhyung.kim@lge.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Fixes: bcf3145fbe ("perf evlist: Enhance perf_evlist__start_workload()")
Link: http://lkml.kernel.org/r/20170110200006.e1f7a766b4faf1f107ae2e1b@m.soramichi.jp
[ Wasn't applying, fixed it up by hand, added Fixes: tag ]
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2017-01-11 16:48:01 -03:00
Arnaldo Carvalho de Melo
7d132caaf9 perf machine: Add a kallsyms loading constructor
To reduce the boilerplate for searching for functions in the running
kernel and modules.

Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: David Ahern <dsahern@gmail.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Masami Hiramatsu <mhiramat@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Wang Nan <wangnan0@huawei.com>
Link: http://lkml.kernel.org/n/tip-93iqzayafpaxaguoiwjqezgz@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2017-01-11 16:48:00 -03:00
Masami Hiramatsu
8a937a25a7 perf probe: Fix to probe on gcc generated symbols for offline kernel
Fix perf-probe to show probe definition on gcc generated symbols for
offline kernel (including cross-arch kernel image).

gcc sometimes optimizes functions and generate new symbols with suffixes
such as ".constprop.N" or ".isra.N" etc. Since those symbol names are
not recorded in DWARF, we have to find correct generated symbols from
offline ELF binary to probe on it (kallsyms doesn't correct it).  For
online kernel or uprobes we don't need it because those are rebased on
_text, or a section relative address.

E.g. Without this:

  $ perf probe -k build-arm/vmlinux -F __slab_alloc*
  __slab_alloc.constprop.9
  $ perf probe -k build-arm/vmlinux -D __slab_alloc
  p:probe/__slab_alloc __slab_alloc+0

If you put above definition on target machine, it should fail
because there is no __slab_alloc in kallsyms.

With this fix, perf probe shows correct probe definition on
__slab_alloc.constprop.9:

  $ perf probe -k build-arm/vmlinux -D __slab_alloc
  p:probe/__slab_alloc __slab_alloc.constprop.9+0

Signed-off-by: Masami Hiramatsu <mhiramat@kernel.org>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: http://lkml.kernel.org/r/148350060434.19001.11864836288580083501.stgit@devbox
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2017-01-04 11:44:22 -03:00
Masami Hiramatsu
eebc509b20 perf probe: Fix --funcs to show correct symbols for offline module
Fix --funcs (-F) option to show correct symbols for offline module.
Since previous perf-probe uses machine__findnew_module_map() for offline
module, even if user passes a module file (with full path) which is for
other architecture, perf-probe always tries to load symbol map for
current kernel module.

This fix uses dso__new_map() to load the map from given binary as same
as a map for user applications.

Signed-off-by: Masami Hiramatsu <mhiramat@kernel.org>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: http://lkml.kernel.org/r/148350053478.19001.15435255244512631545.stgit@devbox
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2017-01-04 11:15:09 -03:00
Arnaldo Carvalho de Melo
7934c98a6e perf symbols: Robustify reading of build-id from sysfs
Markus reported that perf segfaults when reading /sys/kernel/notes from
a kernel linked with GNU gold, due to what looks like a gold bug, so do
some bounds checking to avoid crashing in that case.

Reported-by: Markus Trippelsdorf <markus@trippelsdorf.de>
Report-Link: http://lkml.kernel.org/r/20161219161821.GA294@x4
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: David Ahern <dsahern@gmail.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Wang Nan <wangnan0@huawei.com>
Link: http://lkml.kernel.org/n/tip-ryhgs6a6jxvz207j2636w31c@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2017-01-03 16:11:13 -03:00
Masami Hiramatsu
1f2ed153b9 perf probe: Fix to get correct modname from elf header
Since 'perf probe' supports cross-arch probes, it is possible to analyze
different arch kernel image which has different bits-per-long.

In that case, it fails to get the module name because it uses the
MOD_NAME_OFFSET macro based on the host machine bits-per-long, instead
of the target arch bits-per-long.

This fixes above issue by changing modname-offset based on the target
archs bit width. This is ok because linux kernel uses LP64 model on
64bit arch.

E.g. without this (on x86_64, and target module is arm32):

  $ perf probe -m build-arm/fs/configfs/configfs.ko -D configfs_lookup
  p:probe/configfs_lookup :configfs_lookup+0
                          ^-Here is an empty module name.

With this fix, you can see correct module name:

  $ perf probe -m build-arm/fs/configfs/configfs.ko -D configfs_lookup
  p:probe/configfs_lookup configfs:configfs_lookup+0

Signed-off-by: Masami Hiramatsu <mhiramat@kernel.org>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: http://lkml.kernel.org/r/148337043836.6752.383495516397005695.stgit@devbox
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2017-01-02 14:09:17 -03:00
Kan Liang
ed6c166cc7 perf diff: Do not overwrite valid build id
Fixes a perf diff regression issue which was introduced by commit
5baecbcd9c ("perf symbols: we can now read separate debug-info files
based on a build ID")

The binary name could be same when perf diff different binaries. Build
id is used to distinguish between them.
However, the previous patch assumes the same binary name has same build
id. So it overwrites the build id according to the binary name,
regardless of whether the build id is set or not.

Check the has_build_id in dso__load. If the build id is already set, use
it.

Before the fix:

  $ perf diff 1.perf.data 2.perf.data
  # Event 'cycles'
  #
  # Baseline    Delta  Shared Object     Symbol
  # ........  .......  ................  .............................
  #
    99.83%  -99.80%  tchain_edit       [.] f2
     0.12%  +99.81%  tchain_edit       [.] f3
     0.02%   -0.01%  [ixgbe]           [k] ixgbe_read_reg

  After the fix:
  $ perf diff 1.perf.data 2.perf.data
  # Event 'cycles'
  #
  # Baseline    Delta  Shared Object     Symbol
  # ........  .......  ................  .............................
  #
    99.83%   +0.10%  tchain_edit       [.] f3
     0.12%   -0.08%  tchain_edit       [.] f2

Signed-off-by: Kan Liang <kan.liang@intel.com>
Cc: Andi Kleen <andi@firstfloor.org>
CC: Dima Kogan <dima@secretsauce.net>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Fixes: 5baecbcd9c ("perf symbols: we can now read separate debug-info files based on a build ID")
Link: http://lkml.kernel.org/r/1481642984-13593-1-git-send-email-kan.liang@intel.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2016-12-20 12:00:38 -03:00
Ravi Bangoria
edee44be59 perf annotate: Don't throw error for zero length symbols
'perf report --tui' exits with error when it finds a sample of zero
length symbol (i.e. addr == sym->start == sym->end). Actually these are
valid samples. Don't exit TUI and show report with such symbols.

Reported-and-Tested-by: Anton Blanchard <anton@samba.org>
Link: https://lkml.org/lkml/2016/10/8/189
Signed-off-by: Ravi Bangoria <ravi.bangoria@linux.vnet.ibm.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Cc: Chris Riyder <chris.ryder@arm.com>
Cc: linuxppc-dev@lists.ozlabs.org
Cc: Masami Hiramatsu <mhiramat@kernel.org>
Cc: Michael Ellerman <mpe@ellerman.id.au>
Cc: Nicholas Piggin <npiggin@gmail.com>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: stable@kernel.org # v4.9+
Link: http://lkml.kernel.org/r/1479804050-5028-1-git-send-email-ravi.bangoria@linux.vnet.ibm.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2016-12-20 12:00:32 -03:00
Ravi Bangoria
e216874cc1 perf annotate: Fix jump target outside of function address range
If jump target is outside of function range, perf is not handling it
correctly. Especially when target address is lesser than function start
address, target offset will be negative. But, target address declared to
be unsigned, converts negative number into 2's complement. See below
example. Here target of 'jumpq' instruction at 34cf8 is 34ac0 which is
lesser than function start address(34cf0).

        34ac0 - 34cf0 = -0x230 = 0xfffffffffffffdd0

Objdump output:

  0000000000034cf0 <__sigaction>:
  __GI___sigaction():
    34cf0: lea    -0x20(%rdi),%eax
    34cf3: cmp    -bashx1,%eax
    34cf6: jbe    34d00 <__sigaction+0x10>
    34cf8: jmpq   34ac0 <__GI___libc_sigaction>
    34cfd: nopl   (%rax)
    34d00: mov    0x386161(%rip),%rax        # 3bae68 <_DYNAMIC+0x2e8>
    34d07: movl   -bashx16,%fs:(%rax)
    34d0e: mov    -bashxffffffff,%eax
    34d13: retq

perf annotate before applying patch:

  __GI___sigaction  /usr/lib64/libc-2.22.so
           lea    -0x20(%rdi),%eax
           cmp    -bashx1,%eax
        v  jbe    10
        v  jmpq   fffffffffffffdd0
           nop
    10:    mov    _DYNAMIC+0x2e8,%rax
           movl   -bashx16,%fs:(%rax)
           mov    -bashxffffffff,%eax
           retq

perf annotate after applying patch:

  __GI___sigaction  /usr/lib64/libc-2.22.so
           lea    -0x20(%rdi),%eax
           cmp    -bashx1,%eax
        v  jbe    10
        ^  jmpq   34ac0 <__GI___libc_sigaction>
           nop
    10:    mov    _DYNAMIC+0x2e8,%rax
           movl   -bashx16,%fs:(%rax)
           mov    -bashxffffffff,%eax
           retq

Signed-off-by: Ravi Bangoria <ravi.bangoria@linux.vnet.ibm.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Chris Riyder <chris.ryder@arm.com>
Cc: Kim Phillips <kim.phillips@arm.com>
Cc: Markus Trippelsdorf <markus@trippelsdorf.de>
Cc: Masami Hiramatsu <mhiramat@kernel.org>
Cc: Naveen N. Rao <naveen.n.rao@linux.vnet.ibm.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Taeung Song <treeze.taeung@gmail.com>
Cc: linuxppc-dev@lists.ozlabs.org
Link: http://lkml.kernel.org/r/1480953407-7605-3-git-send-email-ravi.bangoria@linux.vnet.ibm.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2016-12-15 16:25:46 -03:00
Ravi Bangoria
3ee2eb6da2 perf annotate: Support jump instruction with target as second operand
Architectures like PowerPC have jump instructions that includes a target
address as a second operand. For example, 'bne cr7,0xc0000000000f6154'.
Add support for such instruction in perf annotate.

objdump o/p:
  c0000000000f6140:   ld     r9,1032(r31)
  c0000000000f6144:   cmpdi  cr7,r9,0
  c0000000000f6148:   bne    cr7,0xc0000000000f6154
  c0000000000f614c:   ld     r9,2312(r30)
  c0000000000f6150:   std    r9,1032(r31)
  c0000000000f6154:   ld     r9,88(r31)

Corresponding perf annotate o/p:

Before patch:
         ld     r9,1032(r31)
         cmpdi  cr7,r9,0
      v  bne    3ffffffffff09f2c
         ld     r9,2312(r30)
         std    r9,1032(r31)
  74:    ld     r9,88(r31)

After patch:
         ld     r9,1032(r31)
         cmpdi  cr7,r9,0
      v  bne    74
         ld     r9,2312(r30)
         std    r9,1032(r31)
  74:    ld     r9,88(r31)

Signed-off-by: Ravi Bangoria <ravi.bangoria@linux.vnet.ibm.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Chris Riyder <chris.ryder@arm.com>
Cc: Kim Phillips <kim.phillips@arm.com>
Cc: Markus Trippelsdorf <markus@trippelsdorf.de>
Cc: Masami Hiramatsu <mhiramat@kernel.org>
Cc: Naveen N. Rao <naveen.n.rao@linux.vnet.ibm.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Taeung Song <treeze.taeung@gmail.com>
Cc: linuxppc-dev@lists.ozlabs.org
Link: http://lkml.kernel.org/r/1480953407-7605-2-git-send-email-ravi.bangoria@linux.vnet.ibm.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2016-12-15 16:25:46 -03:00
Jiri Olsa
a359c17a7e perf evsel: Allow to ignore missing pid
Adding perf_evsel::ignore_missing_cpu_thread bool.

When set true, it allows perf to ignore error of missing pid of perf
event syscall.

We remove missing thread id from the thread_map, so the rest of the
processing like ioctl and mmap won't get disturbed with -1 fd.

The reason for supporting this is to ease up monitoring group of pids,
that 'disappear' before perf opens their event. This currently leads
perf to report error and exit and makes perf record's -u option unusable
under certain setup.

With this change we will allow this race and ignore such failure with
following warning:

  WARNING: Ignored open failure for pid 8605

Signed-off-by: Jiri Olsa <jolsa@kernel.org>
Cc: David Ahern <dsahern@gmail.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Link: http://lkml.kernel.org/r/20161213074622.GA3084@krava
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2016-12-15 16:25:46 -03:00
Jiri Olsa
38af91f01d perf thread_map: Add thread_map__remove function
Add thread_map__remove function to remove thread from thread map.

Add automated test also.

Committer notes:

Testing it:

  # perf test "Remove thread map"
  39: Remove thread map                          : Ok
  # perf test -v "Remove thread map"
  39: Remove thread map                          :
  --- start ---
  test child forked, pid 4483
  2 threads: 4482, 4483
  1 thread: 4483
  0 thread:
  test child finished with 0
  ---- end ----
  Remove thread map: Ok
  #

Signed-off-by: Jiri Olsa <jolsa@kernel.org>
Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: David Ahern <dsahern@gmail.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Link: http://lkml.kernel.org/r/1481538943-21874-4-git-send-email-jolsa@kernel.org
[ Added stdlib.h, to get the free() declaration ]
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2016-12-15 16:25:45 -03:00
Jiri Olsa
83c2e4f396 perf evsel: Use variable instead of repeating lengthy FD macro
It's more readable and will ease up following patches.

Signed-off-by: Jiri Olsa <jolsa@kernel.org>
Cc: David Ahern <dsahern@gmail.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Link: http://lkml.kernel.org/r/1481538943-21874-3-git-send-email-jolsa@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2016-12-15 16:25:45 -03:00
Namhyung Kim
571f1eb9b9 perf callchain: Introduce callchain_cursor__copy()
The callchain_cursor__copy() function is to save current callchain
captured by a cursor.  It'll be used to keep callchains when switching
to idle task for each cpu.

Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Cc: Andi Kleen <andi@firstfloor.org>
Cc: David Ahern <dsahern@gmail.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Minchan Kim <minchan@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: http://lkml.kernel.org/r/20161206034010.6499-3-namhyung@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2016-12-07 12:00:33 -03:00
Ravi Bangoria
bec60e50af perf annotate: Show raw form for jump instruction with indirect target
For jump instructions that does not include target address as direct operand,
show the original disassembled line for them. This is needed for certain
powerpc jump instructions that use target address in a register (such as bctr,
btar, ...).

Before:
     ld     r12,32088(r12)
     mtctr  r12
  v  bctr   ffffffffffffca2c
     std    r2,24(r1)
     addis  r12,r2,-1

After:
     ld     r12,32088(r12)
     mtctr  r12
  v  bctr
     std    r2,24(r1)
     addis  r12,r2,-1

Committer notes:

Testing it using a perf.data file and vmlinux for powerpc64,
cross-annotating it on a x86_64 workstation:

Before:

  .__bpf_prog_run  vmlinux.powerpc
         │        std    r10,512(r9)                      ▒
         │        lbz    r9,0(r31)                        ▒
         │        rldicr r9,r9,3,60                       ▒
         │        ldx    r9,r30,r9                        ▒
         │        mtctr  r9                               ▒
  100.00 │      ↓ bctr   3fffffffffe01510                 ▒
         │        lwa    r10,4(r31)                       ▒
         │        lwz    r9,0(r31)                        ▒
  <SNIP>
  Invalid jump offset: 3fffffffffe01510

After:

  .__bpf_prog_run  vmlinux.powerpc
         │        std    r10,512(r9)                      ▒
         │        lbz    r9,0(r31)                        ▒
         │        rldicr r9,r9,3,60                       ▒
         │        ldx    r9,r30,r9                        ▒
         │        mtctr  r9                               ▒
  100.00 │      ↓ bctr                                    ▒
         │        lwa    r10,4(r31)                       ▒
         │        lwz    r9,0(r31)                        ▒
  <SNIP>
  Invalid jump offset: 3fffffffffe01510

This, in turn, uncovers another problem with jumps without operands, the
ENTER/-> operation, to jump to the target, still continues using the bogus
target :-)

BTW, this was the file used for the above tests:

  [acme@jouet ravi_bangoria]$ perf report --header-only -i perf.data.f22vm.powerdev
  # ========
  # captured on: Thu Nov 24 12:40:38 2016
  # hostname : pdev-f22-qemu
  # os release : 4.4.10-200.fc22.ppc64
  # perf version : 4.9.rc1.g6298ce
  # arch : ppc64
  # nrcpus online : 48
  # nrcpus avail : 48
  # cpudesc : POWER7 (architected), altivec supported
  # cpuid : 74,513
  # total memory : 4158976 kB
  # cmdline : /home/ravi/Workspace/linux/tools/perf/perf record -a
  # event : name = cycles:ppp, , size = 112, { sample_period, sample_freq } = 4000, sample_type = IP|TID|TIME|CPU|PERIOD, disabled = 1, inherit = 1, mmap = 1, c
  # HEADER_CPU_TOPOLOGY info available, use -I to display
  # HEADER_NUMA_TOPOLOGY info available, use -I to display
  # pmu mappings: cpu = 4, software = 1, tracepoint = 2, breakpoint = 5
  # missing features: HEADER_TRACING_DATA HEADER_BRANCH_STACK HEADER_GROUP_DESC HEADER_AUXTRACE HEADER_STAT HEADER_CACHE
  # ========
  #
  [acme@jouet ravi_bangoria]$

Suggested-by: Michael Ellerman <mpe@ellerman.id.au>
Signed-off-by: Ravi Bangoria <ravi.bangoria@linux.vnet.ibm.com>
Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Chris Riyder <chris.ryder@arm.com>
Cc: Kim Phillips <kim.phillips@arm.com>
Cc: Markus Trippelsdorf <markus@trippelsdorf.de>
Cc: Masami Hiramatsu <mhiramat@kernel.org>
Cc: Naveen N. Rao <naveen.n.rao@linux.vnet.ibm.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Taeung Song <treeze.taeung@gmail.com>
Cc: linuxppc-dev@lists.ozlabs.org
Link: http://lkml.kernel.org/r/1480953407-7605-1-git-send-email-ravi.bangoria@linux.vnet.ibm.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2016-12-05 17:21:57 -03:00
Wang Nan
edd695b032 perf clang: Compile BPF script using builtin clang support
After this patch, perf utilizes builtin clang support to build BPF
script, no longer depend on external clang, but fallbacking to it
if for some reason the builtin compiling framework fails.

Test:

  $ type clang
  -bash: type: clang: not found
  $ cat ~/.perfconfig
  $ echo '#define LINUX_VERSION_CODE 0x040700' > ./test.c
  $ cat ./tools/perf/tests/bpf-script-example.c >> ./test.c
  $ ./perf record -v --dry-run -e ./test.c 2>&1 | grep builtin
  bpf: successfull builtin compilation
  $

Can't pass cflags so unable to include kernel headers now. Will be fixed
by following commits.

Committer notes:

Make sure '-v' comes before the '-e ./test.c' in the command line otherwise the
'verbose' variable will not be set when the bpf event is parsed and thus the
pr_debug indicating a 'successfull builtin compilation' will not be output, as
the debug level (1) will be less than what 'verbose' has at that point (0).

Signed-off-by: Wang Nan <wangnan0@huawei.com>
Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Alexei Starovoitov <ast@fb.com>
Cc: He Kuang <hekuang@huawei.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Joe Stringer <joe@ovn.org>
Cc: Zefan Li <lizefan@huawei.com>
Cc: pi3orama@163.com
Link: http://lkml.kernel.org/r/20161126070354.141764-16-wangnan0@huawei.com
[ Spell check/reflow successfull pr_debug string ]
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2016-12-05 15:51:45 -03:00
Wang Nan
5e08a76525 perf clang: Support compile IR to BPF object and add testcase
getBPFObjectFromModule() is introduced to compile LLVM IR(Module)
to BPF object. Add new testcase for it.

Test result:
  $ ./buildperf/perf test -v clang
  51: builtin clang support                               :
  51.1: builtin clang compile C source to IR              :
  --- start ---
  test child forked, pid 21822
  test child finished with 0
  ---- end ----
  builtin clang support subtest 0: Ok
  51.2: builtin clang compile C source to ELF object      :
  --- start ---
  test child forked, pid 21823
  test child finished with 0
  ---- end ----
  builtin clang support subtest 1: Ok

Signed-off-by: Wang Nan <wangnan0@huawei.com>
Cc: Alexei Starovoitov <ast@fb.com>
Cc: He Kuang <hekuang@huawei.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Joe Stringer <joe@ovn.org>
Cc: Zefan Li <lizefan@huawei.com>
Cc: pi3orama@163.com
Link: http://lkml.kernel.org/r/20161126070354.141764-15-wangnan0@huawei.com
[ Remove redundant "Test" from entry descriptions ]
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2016-12-05 15:51:44 -03:00
Wang Nan
e67d52d411 perf clang: Update test case to use real BPF script
Allow C++ code to use util.h and tests/llvm.h. Let 'perf test' compile a
real BPF script.

Signed-off-by: Wang Nan <wangnan0@huawei.com>
Cc: Alexei Starovoitov <ast@fb.com>
Cc: He Kuang <hekuang@huawei.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Joe Stringer <joe@ovn.org>
Cc: Zefan Li <lizefan@huawei.com>
Cc: pi3orama@163.com
Link: http://lkml.kernel.org/r/20161126070354.141764-14-wangnan0@huawei.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2016-12-05 15:51:44 -03:00
Wang Nan
a9495fe9dc perf clang: Allow passing CFLAGS to builtin clang
Improve getModuleFromSource() API to accept a cflags list. This feature
will be used to pass LINUX_VERSION_CODE and -I flags.

Signed-off-by: Wang Nan <wangnan0@huawei.com>
Cc: Alexei Starovoitov <ast@fb.com>
Cc: He Kuang <hekuang@huawei.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Joe Stringer <joe@ovn.org>
Cc: Zefan Li <lizefan@huawei.com>
Cc: pi3orama@163.com
Link: http://lkml.kernel.org/r/20161126070354.141764-13-wangnan0@huawei.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2016-12-05 15:51:44 -03:00
Wang Nan
77dfa84a84 perf clang: Use real file system for #include
Utilize clang's OverlayFileSystem facility, allow CompilerInstance to
access real file system.

With this patch the '#include' directive can be used.

Add a new getModuleFromSource for real file.

Signed-off-by: Wang Nan <wangnan0@huawei.com>
Cc: Alexei Starovoitov <ast@fb.com>
Cc: He Kuang <hekuang@huawei.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Joe Stringer <joe@ovn.org>
Cc: Zefan Li <lizefan@huawei.com>
Cc: pi3orama@163.com
Link: http://lkml.kernel.org/r/20161126070354.141764-12-wangnan0@huawei.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2016-12-05 15:51:44 -03:00
Wang Nan
00b86691c7 perf clang: Add builtin clang support ant test case
Add basic clang support in clang.cpp and test__clang() testcase. The
first testcase checks if builtin clang is able to generate LLVM IR.

tests/clang.c is a proxy. Real testcase resides in
utils/c++/clang-test.cpp in c++ and exports C interface to perf test
subsystem.

Test result:

   $ perf test -v clang
   51: builtin clang support                               :
   51.1: Test builtin clang compile C source to IR              :
   --- start ---
   test child forked, pid 13215
   test child finished with 0
   ---- end ----
   Test builtin clang support subtest 0: Ok

Committer note:

Make sure you've enabled CLANG and LLVM builtin support by setting
the LIBCLANGLLVM variable on the make command line, e.g.:

  make LIBCLANGLLVM=1 O=/tmp/build/perf -C tools/perf install-bin

Otherwise you'll get this when trying to do the 'perf test' call above:

  # perf test clang
  51: builtin clang support                      : Skip (not compiled in)
  #

Signed-off-by: Wang Nan <wangnan0@huawei.com>
Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Alexei Starovoitov <ast@fb.com>
Cc: He Kuang <hekuang@huawei.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Joe Stringer <joe@ovn.org>
Cc: Zefan Li <lizefan@huawei.com>
Cc: pi3orama@163.com
Link: http://lkml.kernel.org/r/20161126070354.141764-11-wangnan0@huawei.com
[ Removed "Test" from descriptions, redundant and already removed from all the other entries ]
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2016-12-05 15:51:43 -03:00
Wang Nan
2bd42de0e1 perf llvm: Extract helpers in llvm-utils.c
The following commits will use builtin clang to compile BPF scripts.

llvm__get_kbuild_opts() and llvm__get_nr_cpus() are extracted to help
building '-DKERNEL_VERSION_CODE' and '-D__NR_CPUS__' macros.

Doing object dumping in bpf loader, so further builtin clang compiling
needn't consider it.

Signed-off-by: Wang Nan <wangnan0@huawei.com>
Cc: Alexei Starovoitov <ast@fb.com>
Cc: He Kuang <hekuang@huawei.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Joe Stringer <joe@ovn.org>
Cc: Zefan Li <lizefan@huawei.com>
Cc: pi3orama@163.com
Link: http://lkml.kernel.org/r/20161126070354.141764-7-wangnan0@huawei.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2016-12-05 15:51:42 -03:00
Wang Nan
8ad85e9e6f perf tools: Pass context to perf hook functions
Pass a pointer to perf hook functions so they receive context
information during setup.

Signed-off-by: Wang Nan <wangnan0@huawei.com>
Cc: Alexei Starovoitov <ast@fb.com>
Cc: He Kuang <hekuang@huawei.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Joe Stringer <joe@ovn.org>
Cc: Zefan Li <lizefan@huawei.com>
Cc: pi3orama@163.com
Link: http://lkml.kernel.org/r/20161126070354.141764-6-wangnan0@huawei.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2016-12-05 15:51:42 -03:00
Kim Phillips
0fcb1da4ab perf annotate: AArch64 support
This is a regex converted version from the original:

	https://lkml.org/lkml/2016/5/19/461

Add basic support to recognise AArch64 assembly. This allows perf to
identify AArch64 instructions that branch to other parts within the
same function, thereby properly annotating them.

Rebased onto new cross-arch annotation bits:

	https://lkml.org/lkml/2016/11/25/546

Sample output:

security_file_permission  vmlinux
  5.80 │    ← ret                                                  ▒
       │70:   ldr    w0, [x21,#68]                                 ▒
  4.44 │    ↓ tbnz   d0                                            ▒
       │      mov    w0, #0x24                       // #36        ▒
  1.37 │      ands   w0, w22, w0                                   ▒
       │    ↑ b.eq   60                                            ▒
  1.37 │    ↓ tbnz   e4                                            ▒
       │      mov    w19, #0x20000                   // #131072    ▒
  1.02 │    ↓ tbz    ec                                            ▒
       │90:┌─→ldr    x3, [x21,#24]                                 ▒
  1.37 │   │  add    x21, x21, #0x10                               ▒
       │   │  mov    w2, w19                                       ▒
  1.02 │   │  mov    x0, x21                                       ▒
       │   │  mov    x1, x3                                        ▒
  1.71 │   │  ldr    x20, [x3,#48]                                 ▒
       │   │→ bl     __fsnotify_parent                             ▒
  0.68 │   │↑ cbnz   60                                            ▒
       │   │  mov    x2, x21                                       ▒
  1.37 │   │  mov    w1, w19                                       ▒
       │   │  mov    x0, x20                                       ▒
  0.68 │   │  mov    w5, #0x0                        // #0         ▒
       │   │  mov    x4, #0x0                        // #0         ▒
  1.71 │   │  mov    w3, #0x1                        // #1         ▒
       │   │→ bl     fsnotify                                      ▒
  1.37 │   │↑ b      60                                            ▒
       │d0:│  mov    w0, #0x0                        // #0         ▒
       │   │  ldp    x19, x20, [sp,#16]                            ▒
       │   │  ldp    x21, x22, [sp,#32]                            ▒
       │   │  ldp    x29, x30, [sp],#48                            ▒
       │   │← ret                                                  ▒
       │e4:│  mov    w19, #0x10000                   // #65536     ▒
       │   └──b      90                                            ◆
       │ec:   brk    #0x800                                        ▒
Press 'h' for help on key bindings

Signed-off-by: Kim Phillips <kim.phillips@arm.com>
Signed-off-by: Chris Ryder <chris.ryder@arm.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Pawel Moll <pawel.moll@arm.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Will Deacon <will.deacon@arm.com>
Link: http://lkml.kernel.org/r/20161130092344.012e18e3e623bea395162f95@arm.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2016-12-01 13:03:19 -03:00
Kim Phillips
859afa6ca9 perf annotate: Use arch->objdump.comment_char in dec__parse()
Presume neglected in commit 786c1b5 "perf annotate: Start supporting
cross arch annotation".  This doesn't fix a bug since none of the
affected arches support parsing dec/inc instructions yet.

Signed-off-by: Kim Phillips <kim.phillips@arm.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Chris Ryder <chris.ryder@arm.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Pawel Moll <pawel.moll@arm.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Will Deacon <will.deacon@arm.com>
Link: http://lkml.kernel.org/r/20161130092333.1cca5dd2c77e1790d61c1e9c@arm.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2016-12-01 13:03:18 -03:00
David Ahern
c284d669a2 perf tools: Move parse_nsec_time to time-utils.c
Code move only; no functional change intended.

Committer notes:

Fix the build on Ubuntu 16.04 x86-64 cross-compiling to S/390, with this
set of auto-detected features:

  ...                         dwarf: [ on  ]
  ...            dwarf_getlocations: [ on  ]
  ...                         glibc: [ on  ]
  ...                          gtk2: [ OFF ]
  ...                      libaudit: [ OFF ]
  ...                        libbfd: [ OFF ]
  ...                        libelf: [ on  ]
  ...                       libnuma: [ OFF ]
  ...        numa_num_possible_cpus: [ OFF ]
  ...                       libperl: [ OFF ]
  ...                     libpython: [ OFF ]
  ...                      libslang: [ OFF ]
  ...                     libcrypto: [ OFF ]
  ...                     libunwind: [ OFF ]
  ...            libdw-dwarf-unwind: [ on  ]
  ...                          zlib: [ on  ]
  ...                          lzma: [ OFF ]
  ...                     get_cpuid: [ OFF ]
  ...                           bpf: [ on  ]

Where it was failing with:

    CC       /tmp/build/perf/util/time-utils.o
  util/time-utils.c: In function 'parse_nsec_time':
  util/time-utils.c:17:13: error: implicit declaration of function 'strtoul' [-Werror=implicit-function-declaration]
    time_sec = strtoul(str, &end, 10);
               ^
  util/time-utils.c:17:2: error: nested extern declaration of 'strtoul' [-Werror=nested-externs]
    time_sec = strtoul(str, &end, 10);
    ^
  util/time-utils.c: In function 'perf_time__parse_str':
  util/time-utils.c:93:2: error: implicit declaration of function 'free' [-Werror=implicit-function-declaration]
    free(str);
    ^
  util/time-utils.c:93:2: error: incompatible implicit declaration of built-in function 'free' [-Werror]
  util/time-utils.c:93:2: note: include '<stdlib.h>' or provide a declaration of 'free'

Do as suggested and add a '#include <stdlib.h>' to get the free() and strtoul()
declarations and fix the build.

Signed-off-by: David Ahern <dsahern@gmail.com>
Acked-by: Namhyung Kim <namhyung@kernel.org>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: http://lkml.kernel.org/r/1480439746-42695-3-git-send-email-dsahern@gmail.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2016-12-01 13:02:39 -03:00
David Ahern
fdf9dc4b34 perf tools: Add time-based utility functions
Add function to parse a user time string of the form <start>,<stop>
where start and stop are time in sec.nsec format. Both start and stop
times are optional.

Add function to determine if a sample time is within a given time
time window of interest.

Signed-off-by: David Ahern <dsahern@gmail.com>
Acked-by: Namhyung Kim <namhyung@kernel.org>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: http://lkml.kernel.org/r/1480439746-42695-2-git-send-email-dsahern@gmail.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2016-12-01 13:02:32 -03:00
David Ahern
64eff7d9c4 perf script: Add option to stop printing callchain
Allow user to specify list of symbols which cause the dump of callchains
to stop at that symbol.

Committer notes:

Testing it:

  # perf record -ag usleep 1
  [ perf record: Woken up 1 times to write data ]
  [ perf record: Captured and wrote 1.177 MB perf.data (33 samples) ]
  #
  # # Without it:
  #
  # perf script
  swapper   0 [000]  9693.370039:          1 cycles:ppp:
                  2072ad x86_pmu_enable (/usr/lib/debug/lib/modules/4.8.8-300.fc25.x86_64/vmlinux)
                  3a29d7 perf_pmu_enable.part.90 (/usr/lib/debug/lib/modules/4.8.8-300.fc25.x86_64/vmlinux)
                  3a713a ctx_resched (/usr/lib/debug/lib/modules/4.8.8-300.fc25.x86_64/vmlinux)
                  3a76c1 __perf_event_enable (/usr/lib/debug/lib/modules/4.8.8-300.fc25.x86_64/vmlinux)
                  3a0390 event_function (/usr/lib/debug/lib/modules/4.8.8-300.fc25.x86_64/vmlinux)
                  3a1cff remote_function (/usr/lib/debug/lib/modules/4.8.8-300.fc25.x86_64/vmlinux)
                  326978 flush_smp_call_function_queue (/usr/lib/debug/lib/modules/4.8.8-300.fc25.x86_64/vmlinux)
                  327413 generic_smp_call_function_single_interrupt (/usr/lib/debug/lib/modules/4.8.8-300.fc25.x86_64/vmlinux)
                  249b37 smp_call_function_single_interrupt (/usr/lib/debug/lib/modules/4.8.8-300.fc25.x86_64/vmlinux)
                  a04b2c call_function_single_interrupt (/usr/lib/debug/lib/modules/4.8.8-300.fc25.x86_64/vmlinux)
                  889427 cpuidle_enter (/usr/lib/debug/lib/modules/4.8.8-300.fc25.x86_64/vmlinux)
                  2e534a call_cpuidle (/usr/lib/debug/lib/modules/4.8.8-300.fc25.x86_64/vmlinux)
                  2e5730 cpu_startup_entry (/usr/lib/debug/lib/modules/4.8.8-300.fc25.x86_64/vmlinux)
                  9f5167 rest_init (/usr/lib/debug/lib/modules/4.8.8-300.fc25.x86_64/vmlinux)
                 137ffeb start_kernel ([kernel.vmlinux].init.text)
                 137f2ca x86_64_start_reservations ([kernel.vmlinux].init.text)
                 137f419 x86_64_start_kernel ([kernel.vmlinux].init.text)

  swapper   0 [000]  9693.370044:          1 cycles:ppp:
                  20ca1b intel_pmu_handle_irq (/usr/lib/debug/lib/modules/4.8.8-300.fc25.x86_64/vmlinux)
                  205b0c perf_event_nmi_handler (/usr/lib/debug/lib/modules/4.8.8-300.fc25.x86_64/vmlinux)
                  22a14a nmi_handle (/usr/lib/debug/lib/modules/4.8.8-300.fc25.x86_64/vmlinux)
                  22a6b3 default_do_nmi (/usr/lib/debug/lib/modules/4.8.8-300.fc25.x86_64/vmlinux)
                  22a83c do_nmi (/usr/lib/debug/lib/modules/4.8.8-300.fc25.x86_64/vmlinux)
                  a03fb1 end_repeat_nmi (/usr/lib/debug/lib/modules/4.8.8-300.fc25.x86_64/vmlinux)
                  3a29d7 perf_pmu_enable.part.90 (/usr/lib/debug/lib/modules/4.8.8-300.fc25.x86_64/vmlinux)
                  3a713a ctx_resched (/usr/lib/debug/lib/modules/4.8.8-300.fc25.x86_64/vmlinux)
                  3a76c1 __perf_event_enable (/usr/lib/debug/lib/modules/4.8.8-300.fc25.x86_64/vmlinux)
                  3a0390 event_function (/usr/lib/debug/lib/modules/4.8.8-300.fc25.x86_64/vmlinux)
                  3a1cff remote_function (/usr/lib/debug/lib/modules/4.8.8-300.fc25.x86_64/vmlinux)
                  326978 flush_smp_call_function_queue (/usr/lib/debug/lib/modules/4.8.8-300.fc25.x86_64/vmlinux)
                  327413 generic_smp_call_function_single_interrupt (/usr/lib/debug/lib/modules/4.8.8-300.fc25.x86_64/vmlinux)
                  249b37 smp_call_function_single_interrupt (/usr/lib/debug/lib/modules/4.8.8-300.fc25.x86_64/vmlinux)
                  a04b2c call_function_single_interrupt (/usr/lib/debug/lib/modules/4.8.8-300.fc25.x86_64/vmlinux)
                  889427 cpuidle_enter (/usr/lib/debug/lib/modules/4.8.8-300.fc25.x86_64/vmlinux)
                  2e534a call_cpuidle (/usr/lib/debug/lib/modules/4.8.8-300.fc25.x86_64/vmlinux)
                  2e5730 cpu_startup_entry (/usr/lib/debug/lib/modules/4.8.8-300.fc25.x86_64/vmlinux)
                  9f5167 rest_init (/usr/lib/debug/lib/modules/4.8.8-300.fc25.x86_64/vmlinux)
                 137ffeb start_kernel ([kernel.vmlinux].init.text)
                 137f2ca x86_64_start_reservations ([kernel.vmlinux].init.text)
  #
  # # Using it to see just what are the calls from the 'remote_function' function:
  #
  # perf script --stop-bt remote_function
  swapper   0 [000]  9693.370039:          1 cycles:ppp:
                  2072ad x86_pmu_enable (/usr/lib/debug/lib/modules/4.8.8-300.fc25.x86_64/vmlinux)
                  3a29d7 perf_pmu_enable.part.90 (/usr/lib/debug/lib/modules/4.8.8-300.fc25.x86_64/vmlinux)
                  3a713a ctx_resched (/usr/lib/debug/lib/modules/4.8.8-300.fc25.x86_64/vmlinux)
                  3a76c1 __perf_event_enable (/usr/lib/debug/lib/modules/4.8.8-300.fc25.x86_64/vmlinux)
                  3a0390 event_function (/usr/lib/debug/lib/modules/4.8.8-300.fc25.x86_64/vmlinux)
                  3a1cff remote_function (/usr/lib/debug/lib/modules/4.8.8-300.fc25.x86_64/vmlinux)

  swapper   0 [000]  9693.370044:          1 cycles:ppp:
                  20ca1b intel_pmu_handle_irq (/usr/lib/debug/lib/modules/4.8.8-300.fc25.x86_64/vmlinux)
                  205b0c perf_event_nmi_handler (/usr/lib/debug/lib/modules/4.8.8-300.fc25.x86_64/vmlinux)
                  22a14a nmi_handle (/usr/lib/debug/lib/modules/4.8.8-300.fc25.x86_64/vmlinux)
                  22a6b3 default_do_nmi (/usr/lib/debug/lib/modules/4.8.8-300.fc25.x86_64/vmlinux)
                  22a83c do_nmi (/usr/lib/debug/lib/modules/4.8.8-300.fc25.x86_64/vmlinux)
                  a03fb1 end_repeat_nmi (/usr/lib/debug/lib/modules/4.8.8-300.fc25.x86_64/vmlinux)
                  3a29d7 perf_pmu_enable.part.90 (/usr/lib/debug/lib/modules/4.8.8-300.fc25.x86_64/vmlinux)
                  3a713a ctx_resched (/usr/lib/debug/lib/modules/4.8.8-300.fc25.x86_64/vmlinux)
                  3a76c1 __perf_event_enable (/usr/lib/debug/lib/modules/4.8.8-300.fc25.x86_64/vmlinux)
                  3a0390 event_function (/usr/lib/debug/lib/modules/4.8.8-300.fc25.x86_64/vmlinux)
                  3a1cff remote_function (/usr/lib/debug/lib/modules/4.8.8-300.fc25.x86_64/vmlinux)

Signed-off-by: David Ahern <dsa@cumulusnetworks.com>
Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: http://lkml.kernel.org/r/1480104021-36275-1-git-send-email-dsahern@gmail.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2016-11-29 13:06:19 -03:00
Wang Nan
a074865e60 perf tools: Introduce perf hooks
Perf hooks allow hooking user code at perf events. They can be used for
manipulation of BPF maps, taking snapshot and reporting results. In this
patch two perf hook points are introduced: record_start and record_end.

To avoid buggy user actions, a SIGSEGV signal handler is introduced into
'perf record'. It turns off perf hook if it causes a segfault and report
an error to help debugging.

A test case for perf hook is introduced.

Test result:
  $ ./buildperf/perf test -v hook
  50: Test perf hooks                                          :
  --- start ---
  test child forked, pid 10311
  SIGSEGV is observed as expected, try to recover.
  Fatal error (SEGFAULT) in perf hook 'test'
  test child finished with 0
  ---- end ----
  Test perf hooks: Ok

Signed-off-by: Wang Nan <wangnan0@huawei.com>
Cc: Alexei Starovoitov <ast@fb.com>
Cc: He Kuang <hekuang@huawei.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Joe Stringer <joe@ovn.org>
Cc: Zefan Li <lizefan@huawei.com>
Cc: pi3orama@163.com
Link: http://lkml.kernel.org/r/20161126070354.141764-5-wangnan0@huawei.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2016-11-29 12:13:27 -03:00
Wang Nan
d6be16719e perf tools: Add missing struct definition in probe_event.h
Commit 0b3c2264ae ("perf symbols: Fix kallsyms perf test on ppc64le")
refers struct symbol in probe_event.h, but forgets to include its
definition.  Gcc will complain about it when that definition is not
added, by sheer luck, by some other header included before
probe_event.h.

Signed-off-by: Wang Nan <wangnan0@huawei.com>
Cc: Alexei Starovoitov <ast@fb.com>
Cc: He Kuang <hekuang@huawei.com>
Cc: Naveen N. Rao <naveen.n.rao@linux.vnet.ibm.com>
Cc: Zefan Li <lizefan@huawei.com>
Cc: pi3orama@163.com
Link: http://lkml.kernel.org/r/20161115040617.69788-4-wangnan0@huawei.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2016-11-25 11:25:46 -03:00
Wang Nan
3dbe46c524 perf record: Fix segfault when running with suid and kptr_restrict is 1
Before this patch perf panics if kptr_restrict is set to 1 and perf is
owned by root with suid set:

  $ whoami
  wangnan
  $ ls -l ./perf
  -rwsr-xr-x 1 root root 19781908 Sep 21 19:29 /home/wangnan/perf
  $ cat /proc/sys/kernel/kptr_restrict
  1
  $ cat /proc/sys/kernel/perf_event_paranoid
  -1
  $ ./perf record -a
  Segmentation fault (core dumped)
  $

The reason is that perf assumes it is allowed to read kptr from
/proc/kallsyms when euid is root, but in fact the kernel doesn't allow
reading kptr when euid and uid do not match with each other:

  $ cp /bin/cat .
  $ sudo chown root:root ./cat
  $ sudo chmod u+s ./cat
  $ cat /proc/kallsyms | grep do_fork
  0000000000000000 T _do_fork          <--- kptr is hidden even euid is root
  $ sudo cat /proc/kallsyms | grep do_fork
  ffffffff81080230 T _do_fork

See lib/vsprintf.c for kernel side code.

This patch fixes this problem by checking both uid and euid.

Signed-off-by: Wang Nan <wangnan0@huawei.com>
Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Alexei Starovoitov <ast@fb.com>
Cc: He Kuang <hekuang@huawei.com>
Cc: Zefan Li <lizefan@huawei.com>
Cc: pi3orama@163.com
Link: http://lkml.kernel.org/r/20161115040617.69788-3-wangnan0@huawei.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2016-11-25 11:11:10 -03:00
Wang Nan
d18acd15c6 perf tools: Fix kernel version error in ubuntu
On ubuntu the internal kernel version code is different from what can
be retrived from uname:

 $ uname -r
 4.4.0-47-generic
 $ cat /lib/modules/`uname -r`/build/include/generated/uapi/linux/version.h
 #define LINUX_VERSION_CODE 263192
 #define KERNEL_VERSION(a,b,c) (((a) << 16) + ((b) << 8) + (c))
 $ cat /lib/modules/`uname -r`/build/include/generated/utsrelease.h
 #define UTS_RELEASE "4.4.0-47-generic"
 #define UTS_UBUNTU_RELEASE_ABI 47
 $ cat /proc/version_signature
 Ubuntu 4.4.0-47.68-generic 4.4.24

The macro LINUX_VERSION_CODE is set to 4.4.24 (263192 == 0x40418), but
`uname -r` reports 4.4.0.

This mismatch causes LINUX_VERSION_CODE macro passed to BPF script become
an incorrect value, results in magic failure in BPF loading:

 $ sudo ./buildperf/perf record -e ./tools/perf/tests/bpf-script-example.c ls
 event syntax error: './tools/perf/tests/bpf-script-example.c'
                      \___ Failed to load program for unknown reason

According to Ubuntu document (https://wiki.ubuntu.com/Kernel/FAQ), the
correct kernel version can be retrived through /proc/version_signature, which
is ubuntu specific.

This patch checks the existance of /proc/version_signature, and returns
version number through parsing this file instead of uname. Version string
is untouched (value returns from uname) because `uname -r` is required
to be consistence with path of kbuild directory in /lib/module.

Signed-off-by: Wang Nan <wangnan0@huawei.com>
Cc: Alexei Starovoitov <ast@fb.com>
Cc: He Kuang <hekuang@huawei.com>
Cc: Zefan Li <lizefan@huawei.com>
Cc: pi3orama@163.com
Link: http://lkml.kernel.org/r/20161115040617.69788-2-wangnan0@huawei.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2016-11-25 11:00:30 -03:00