Commit Graph

7390 Commits

Author SHA1 Message Date
Andi Kleen
d581141970 perf jevents: Parse eventcode as number
The next patch needs to modify event code. Previously eventcode was just
passed through as a string. Now parse it as a number.

v2: Don't special case 0

Signed-off-by: Andi Kleen <ak@linux.intel.com>
Acked-by: Jiri Olsa <jolsa@kernel.org>
Link: http://lkml.kernel.org/r/20170128020345.19007-2-andi@firstfloor.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2017-02-08 08:55:02 -03:00
He Kuang
4d416436f3 perf bpf: Add missing newline in debug messages
These two debug messages are missing the trailing newline.

Signed-off-by: He Kuang <hekuang@huawei.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Bintian Wang <bintian.wang@huawei.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Masami Hiramatsu <mhiramat@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Wang Nan <wangnan0@huawei.com>
Cc: Will Deacon <will.deacon@arm.com>
Cc: linux-arm-kernel@lists.infradead.org
Link: http://lkml.kernel.org/r/20170207073412.26983-2-hekuang@huawei.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2017-02-08 08:55:02 -03:00
He Kuang
3bb53c9f12 perf tools arm64: Add support for generating bpf prologue
Since HAVE_KPROBES can be enabled in arm64, this patch introduces
regs_query_register_offset() to convert register name to offset for
arm64, so the BPF prologue feature is ready to use.

Signed-off-by: He Kuang <hekuang@huawei.com>
Reviewed-by: Will Deacon <will.deacon@arm.com>
Acked-by: Masami Hiramatsu <mhiramat@kernel.org>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Bintian Wang <bintian.wang@huawei.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Wang Nan <wangnan0@huawei.com>
Cc: linux-arm-kernel@lists.infradead.org
Link: http://lkml.kernel.org/r/20170207073412.26983-1-hekuang@huawei.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2017-02-08 08:55:01 -03:00
Arnaldo Carvalho de Melo
e06094ab67 Merge remote-tracking branch 'tip/perf/urgent' into perf/core
To pick fixes that are affecting tests of new 'perf diff' features in
perf/core.

Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2017-02-06 11:09:26 -03:00
Krister Johansen
aa33b9b9a2 perf callchain: Reference count maps
If dso__load_kcore frees all of the existing maps, but one has already
been attached to a callchain cursor node, then we can get a SIGSEGV in
any function that happens to try to use this invalid cursor.  Use the
existing map refcount mechanism to forestall cleanup of a map until the
cursor iterates past the node.

Signed-off-by: Krister Johansen <kjlx@templeofstupid.com>
Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Masami Hiramatsu <mhiramat@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: stable@kernel.org
Fixes: 84c2cafa28 ("perf tools: Reference count struct map")
Link: http://lkml.kernel.org/r/20170106062331.GB2707@templeofstupid.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2017-02-02 11:39:09 -03:00
Namhyung Kim
a1c9f97f0b perf diff: Fix -o/--order option behavior (again)
Commit 21e6d84286 ("perf diff: Use perf_hpp__register_sort_field
interface") changed list_add() to perf_hpp__register_sort_field().

This resulted in a behavior change since the field was added to the tail
instead of the head.  So the -o option is mostly ignored due to its
order in the list.

This patch fixes it by adding perf_hpp__prepend_sort_field().

Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Acked-by: Jiri Olsa <jolsa@kernel.org>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Fixes: 21e6d84286 ("perf diff: Use perf_hpp__register_sort_field interface")
Link: http://lkml.kernel.org/r/20170118051457.30946-2-namhyung@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2017-02-02 11:39:09 -03:00
Namhyung Kim
8381cdd0e3 perf diff: Fix segfault on 'perf diff -o N' option
The -o/--order option is to select column number to sort a diff result.

It does the job by adding a hpp field at the beginning of the sort list.
But it should not be added to the output field list as it has no
callbacks required by a output field.

During the setup_sorting(), the perf_hpp__setup_output_field() appends
the given sort keys to the output field if it's not there already.

Originally it was checked by fmt->list being non-empty.  But commit
3f931f2c42 ("perf hists: Make hpp setup function generic") changed it
to check the ->equal callback.

Anyways, we don't need to add the pseudo hpp field to the output field
list since it won't be used for output.  So just skip fields if they
have no ->color or ->entry callbacks.

Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Acked-by: Jiri Olsa <jolsa@kernel.org>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Fixes: 3f931f2c42 ("perf hists: Make hpp setup function generic")
Link: http://lkml.kernel.org/r/20170118051457.30946-1-namhyung@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2017-02-02 11:39:04 -03:00
Taeung Song
b05d109398 perf ftrace: Add ftrace.tracer config option
Currently 'perf ftrace' command allows selecting 'function_graph' or
'function', defaulting to 'function_graph'.

Add ftrace.tracer config option to select the default tracer:

    # cat ~/.perfconfig
    [ftrace]
        tracer = function

    # perf ftrace usleep 123456 | head -10
      <...>-14450 [002] d... 10089.284231: finish_task_switch <-__schedule
      <...>-14450 [002] .... 10089.284232: finish_wait <-pipe_wait
      <...>-14450 [002] .... 10089.284232: mutex_lock <-pipe_wait
      <...>-14450 [002] .... 10089.284232: _cond_resched <-mutex_lock

Committer notes:

Retesting it with invalid variables, invalid values for ftrace.tracer,
and a valid one:

  # cat ~/.perfconfig
  [ftrace]
        trace = function
  # perf ftrace usleep 1
  Error: wrong config key-value pair ftrace.trace=function
  # cat ~/.perfconfig
  [ftrace]
        tracer = functin
  # perf ftrace usleep 1
  Please select "function_graph" (default) or "function"
  Error: wrong config key-value pair ftrace.tracer=functin
  # cat ~/.perfconfig
  [ftrace]
        tracer = function
  # perf ftrace usleep 1 | head -5
          <idle>-0     [000] d...  3855.820847: switch_mm_irqs_off <-__schedule
           <...>-18550 [000] d...  3855.820849: finish_task_switch <-__schedule
           <...>-18550 [000] d...  3855.820851: smp_irq_work_interrupt <-irq_work_interrupt
           <...>-18550 [000] d...  3855.820851: irq_enter <-smp_irq_work_interrupt
           <...>-18550 [000] d...  3855.820851: rcu_irq_enter <-irq_enter
  #

Signed-off-by: Taeung Song <treeze.taeung@gmail.com>
Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Wang Nan <wangnan0@huawei.com>
Link: http://lkml.kernel.org/r/1485862711-20216-3-git-send-email-treeze.taeung@gmail.com
[ Added missign space in error message, changed the logic to make it more compact and less error prone ]
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2017-01-31 16:20:09 -03:00
Taeung Song
43d41deb71 perf tools: Create for_each_event macro for tracepoints iteration
Similar to for_each_subsystem and for_each_event in util/parse-events.c,
add new macro 'for_each_event' for easy iteration over the tracepoints
in order to be more compact and readable.

Signed-off-by: Taeung Song <treeze.taeung@gmail.com>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Steven Rostedt <rostedt@goodmis.org>
Cc: Wang Nan <wangnan0@huawei.com>
Link: http://lkml.kernel.org/r/1485862711-20216-2-git-send-email-treeze.taeung@gmail.com
[ Slight change to keep existing style for checking strcmp() return ]
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2017-01-31 16:20:08 -03:00
Joe Stringer
a26305363d perf test: Add libbpf pinning test
Add a test for the newly added BPF object pinning functionality.

For example:

  # tools/perf/perf test 37
    37: BPF filter                                 :
    37.1: Basic BPF filtering                      : Ok
    37.2: BPF pinning                              : Ok
    37.3: BPF prologue generation                  : Ok
    37.4: BPF relocation checker                   : Ok

  # tools/perf/perf test 37 -v 2>&1 | grep pinned
    libbpf: pinned map '/sys/fs/bpf/perf_test/flip_table'
    libbpf: pinned program '/sys/fs/bpf/perf_test/func=SyS_epoll_wait/0'

Signed-off-by: Joe Stringer <joe@ovn.org>
Requested-and-Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Alexei Starovoitov <ast@fb.com>
Cc: Daniel Borkmann <daniel@iogearbox.net>
Cc: Wang Nan <wangnan0@huawei.com>
Cc: netdev@vger.kernel.org
Link: http://lkml.kernel.org/r/20170126212001.14103-7-joe@ovn.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2017-01-31 16:20:08 -03:00
Joe Stringer
9a9c733d68 tools perf util: Make rm_rf(path) argument const
rm_rf() doesn't modify its path argument, and a future caller will pass
a string constant into it to delete.

Signed-off-by: Joe Stringer <joe@ovn.org>
Cc: Alexei Starovoitov <ast@fb.com>
Cc: Daniel Borkmann <daniel@iogearbox.net>
Cc: Wang Nan <wangnan0@huawei.com>
Cc: netdev@vger.kernel.org
Link: http://lkml.kernel.org/r/20170126212001.14103-5-joe@ovn.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2017-01-31 16:20:07 -03:00
Krister Johansen
9c68ae98c6 perf callchain: Reference count maps
If dso__load_kcore frees all of the existing maps, but one has already
been attached to a callchain cursor node, then we can get a SIGSEGV in
any function that happens to try to use this invalid cursor.  Use the
existing map refcount mechanism to forestall cleanup of a map until the
cursor iterates past the node.

Signed-off-by: Krister Johansen <kjlx@templeofstupid.com>
Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Masami Hiramatsu <mhiramat@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: stable@kernel.org
Fixes: 84c2cafa28 ("perf tools: Reference count struct map")
Link: http://lkml.kernel.org/r/20170106062331.GB2707@templeofstupid.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2017-01-31 16:19:06 -03:00
Arnaldo Carvalho de Melo
ecc4c5614b perf tools: Propagate perf_config() errors
Previously these were being ignored, sometimes silently.

Stop doing that, emitting debug messages and handling the errors.

Testing it:

  $ cat ~/.perfconfig
  cat: /home/acme/.perfconfig: No such file or directory
  $ perf stat -e cycles usleep 1

   Performance counter stats for 'usleep 1':

           938,996      cycles:u

       0.003813731 seconds time elapsed

  $ perf top --stdio
  Error:
  You may not have permission to collect system-wide stats.

  Consider tweaking /proc/sys/kernel/perf_event_paranoid,
  <SNIP>
  [ perf record: Captured and wrote 0.019 MB perf.data (7 samples) ]
  [acme@jouet linux]$ perf report --stdio
  # To display the perf.data header info, please use --header/--header-only options.
  # Overhead  Command  Shared Object      Symbol
  # ........  .......  .................  .........................
    71.77%  usleep   libc-2.24.so       [.] _dl_addr
    27.07%  usleep   ld-2.24.so         [.] _dl_next_ld_env_entry
     1.13%  usleep   [kernel.kallsyms]  [k] page_fault
  $
  $ touch ~/.perfconfig
  $ ls -la ~/.perfconfig
  -rw-rw-r--. 1 acme acme 0 Jan 27 12:14 /home/acme/.perfconfig
  $
  $ perf stat -e instructions usleep 1

   Performance counter stats for 'usleep 1':

           244,610      instructions:u

       0.000805383 seconds time elapsed

  $
  [root@jouet ~]# chown acme.acme ~/.perfconfig
  [root@jouet ~]# perf stat -e cycles usleep 1
    Warning: File /root/.perfconfig not owned by current user or root, ignoring it.

   Performance counter stats for 'usleep 1':

           937,615      cycles

       0.000836931 seconds time elapsed
  #

Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: David Ahern <dsahern@gmail.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Wang Nan <wangnan0@huawei.com>
Link: http://lkml.kernel.org/n/tip-j2rq96so6xdqlr8p8rd6a3jx@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2017-01-27 12:23:33 -03:00
Arnaldo Carvalho de Melo
afc45cf52c perf config: Do not consider an error not to have any perfconfig file
While propagating the errors from perf_config(), which were being
completely ignored, everything stopped working for people without a
~/.perfconfig file, because the perf_config_set__init() was considering
an error not to have a .perfconfig file, duh, fix it by checking the
errno after the failed stat() call.

It should also not return an error when it says it is ignoring the file,
and also a empty file should not return an error either.

Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: David Ahern <dsahern@gmail.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Taeung Song <treeze.taeung@gmail.com>
Cc: Wang Nan <wangnan0@huawei.com>
Fixes: 8beeb00f2c ("perf config: Use new perf_config_set__init() to initialize config set")
Link: http://lkml.kernel.org/n/tip-ygpbab3apbs6l8wr97xedwks@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2017-01-27 10:28:34 -03:00
Taeung Song
bf062bd20e perf ftrace: Remove needless code setting default tracer
As a result of commit a3497642c261 ("perf ftrace: Make 'function_graph'
be the default tracer") the ftrace.tracer variable can't be NULL but the
other code setting default tracer remained.

Signed-off-by: Taeung Song <treeze.taeung@gmail.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Wang Nan <wangnan0@huawei.com>
Link: http://lkml.kernel.org/r/1485423339-22780-1-git-send-email-treeze.taeung@gmail.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2017-01-26 15:53:27 -03:00
Ingo Molnar
e2cf00c257 New features:
- Introduce 'perf ftrace' a perf front end to the kernel's ftrace
   function and function_graph tracer, defaulting to the "function_graph"
   tracer, more work will be done in reviving this effort, forward porting
   it from its initial patch submission (Namhyung Kim)
 
 - Add 'e' and 'c' hotkeys to expand/collapse call chains for a single
   hist entry in the 'perf report' and 'perf top' TUI (Jiri Olsa)
 
 Fixes:
 
 - Fix wrong register name for arm64, used in 'perf probe' (He Kuang)
 
 - Fix map offsets in relocation in libbpf (Joe Stringer)
 
 - Fix looking up dwarf unwind stack info (Matija Glavinic Pecotic)
 
 Infrastructure:
 
 - libbpf prog functions sync with what is exported via uapi (Joe Stringer)
 
 Trivial:
 
 - Remove unnecessary checks and assignments in 'perf probe's
   try_to_find_absolute_address() (Markus Elfring)
 
 Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
 -----BEGIN PGP SIGNATURE-----
 Version: GnuPG v2
 
 iQIcBAABCAAGBQJYig7UAAoJENZQFvNTUqpAhJQP/iI0T7A8TNekPGLv7j20c302
 89N9+9TAFtVqjgr1hIzqQgGOqbOdAW1tU3VTPW92nNDBn9JV5qwuF9YWEiDaAVv2
 0bmV5hLnrNlymddm3pdg/PbD1TVlwk2NFxtrkPxuf/vx0ZhEGqsSrRUCR/xGXbtQ
 TcMg3rQquspV9JNv4HzFdQC9nsG1CGNotZKsE1avRw70pWAqCtF81B0m8teb6OWo
 5qnN+AMJlYcC+OGffROemUksuehkMvi5L8v1e/6RO/lU1qt9Jrc/2sT9cqvjVFNR
 k4c76cUgWOCYzDEotENMpU4bc6e/24DE2ydFeovihdXw8Qs4ajEA9LXKM4yW+ZoE
 MZE3GS153a8n+CvTfkB9Ow1QJ8rgmR/L0BuhmGb6bYW/MtuTRTShhSduZwOrIyap
 9KckHYti4p3oN3CKFYGO9PN3DRUdx+Xqg/miwrgjkPo09QFp+lzfFFOk0P2/Zqw2
 yfvdWeHxkkrwoWQIyMHVKp/E9jQPuyYqwnKdp68LCN+DgNiFpPpSA8id5e47RQDE
 otqrK8U/82ktakfrBijSPBI6EEqFg7ltip2KT/xlDMfnP9HtxgFhzrk52dyi6pM/
 jkBhJaTQhVZTyaFvUXuaLmBSdPpcaaGM4KJ+2iAayA2r0KLiDj6IdzD5ROCRFOvJ
 SFA472mIxNxUjpQEUTtc
 =tYKN
 -----END PGP SIGNATURE-----

Merge tag 'perf-core-for-mingo-4.11-20170126' of git://git.kernel.org/pub/scm/linux/kernel/git/acme/linux into perf/core

Pull the latest perf/core updates from Arnaldo Carvalho de Melo:

New features:

 - Introduce 'perf ftrace' a perf front end to the kernel's ftrace
   function and function_graph tracer, defaulting to the "function_graph"
   tracer, more work will be done in reviving this effort, forward porting
   it from its initial patch submission (Namhyung Kim)

 - Add 'e' and 'c' hotkeys to expand/collapse call chains for a single
   hist entry in the 'perf report' and 'perf top' TUI (Jiri Olsa)

Fixes:

 - Fix wrong register name for arm64, used in 'perf probe' (He Kuang)

 - Fix map offsets in relocation in libbpf (Joe Stringer)

 - Fix looking up dwarf unwind stack info (Matija Glavinic Pecotic)

Infrastructure changes:

 - libbpf prog functions sync with what is exported via uapi (Joe Stringer)

Trivial changes:

 - Remove unnecessary checks and assignments in 'perf probe's
   try_to_find_absolute_address() (Markus Elfring)

Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2017-01-26 16:20:59 +01:00
Arnaldo Carvalho de Melo
ec347870a9 perf ftrace: Make 'function_graph' be the default tracer
So that we can suppress the '-t function_graph' and get a more compact command
line:

  # perf ftrace usleep 123456 | grep raw_spin_lock | sort -k2 -nr | head -5
  2)   0.555 us    |                _raw_spin_lock();
  2)   0.516 us    |          _raw_spin_lock();
  2)   0.410 us    |          _raw_spin_lock_irq();
  2)   0.374 us    |                        _raw_spin_lock_irqsave();
  #

Tested-by: Masami Hiramatsu <mhiramat@kernel.org>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: David Ahern <dsahern@gmail.com>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Jeremy Eder <jeder@redhat.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Stephane Eranian <eranian@google.com>
Cc: Steven Rostedt <rostedt@goodmis.org>
Cc: Wang Nan <wangnan0@huawei.com>
Link: http://lkml.kernel.org/n/tip-ss9xgx5htpxcv86x42pnh3m6@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2017-01-26 11:43:01 -03:00
Namhyung Kim
d01f4e8db2 perf ftrace: Introduce new 'ftrace' tool
The 'perf ftrace' command is a simple wrapper of kernel's ftrace
functionality.  It only supports single thread tracing currently and
just reads trace_pipe in text and then write it to stdout.

Committer notes:

Testing it:

  # perf ftrace -f function_graph usleep 123456
  <SNIP>
  2)               |  SyS_nanosleep() {
  2)               |    _copy_from_user() {
  <SNIP>
  2)   0.900 us    |      }
  2)   1.354 us    |    }
  2)               |    hrtimer_nanosleep() {
  2)   0.062 us    |      __hrtimer_init();
  2)               |      do_nanosleep() {
  2)               |        hrtimer_start_range_ns() {
  <SNIP>
  2)   5.025 us    |        }
  2)               |        schedule() {
  2)   0.125 us    |          rcu_note_context_switch();
  2)   0.057 us    |          _raw_spin_lock();
  2)               |          deactivate_task() {
  2)   0.369 us    |            update_rq_clock.part.77();
  2)               |            dequeue_task_fair() {
  <SNIP>
  2) + 22.453 us   |            }
  2) + 23.736 us   |          }
  2)               |          pick_next_task_fair() {
  <SNIP>
  2) + 47.167 us   |          }
  2)               |          pick_next_task_idle() {
  <SNIP>
  2)   4.462 us    |          }
  ------------------------------------------
  2)  usleep-20387  =>    <idle>-0
  ------------------------------------------

  2)   0.806 us    |  switch_mm_irqs_off();
  ------------------------------------------
  2)    <idle>-0    =>  usleep-20387
  ------------------------------------------

  2)   0.151 us    |          finish_task_switch();
  2) @ 123597.2 us |        }
  2)   0.037 us    |        _cond_resched();
  2)               |        hrtimer_try_to_cancel() {
  2)   0.064 us    |          hrtimer_active();
  2)   0.353 us    |        }
  2) @ 123605.3 us |      }
  2) @ 123606.2 us |    }
  2) @ 123608.3 us |  } /* SyS_nanosleep */
  2)               |  __do_page_fault() {
 <SNIP>

Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Tested-by: Masami Hiramatsu <mhiramat@kernel.org>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Jeremy Eder <jeder@redhat.com>
Cc: Jiri Olsa <jolsa@redhat.com>,
Cc: Paul Mackerras <paulus@samba.org>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Stephane Eranian <eranian@google.com>
Cc: Steven Rostedt <rostedt@goodmis.org>
Link: http://lkml.kernel.org/n/tip-r1hgmsj4dxny8arn3o9mw512@git.kernel.org
[ Various foward port fixes, add man page ]
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2017-01-26 11:43:01 -03:00
Namhyung Kim
a7619aef6d perf util: Add more debug message on failure path
It's helpful for debugging on tracing features.

Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Tested-by: Masami Hiramatsu <mhiramat@kernel.org>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Jeremy Eder <jeder@redhat.com>
Cc: Jiri Olsa <jolsa@redhat.com>,
Cc: Paul Mackerras <paulus@samba.org>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Stephane Eranian <eranian@google.com>
Cc: Steven Rostedt <rostedt@goodmis.org>
Link: http://lkml.kernel.org/n/tip-rjysr9ljiesymgk4qblteaty@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2017-01-26 11:43:00 -03:00
Namhyung Kim
cd4ceb6343 perf util: Save pid-cmdline mapping into tracing header
Current trace info data lacks the saved cmdline mapping which is needed
for pevent to find out the comm of a task.  Add this and bump up the
version number so that perf can determine its presence when reading.

This is mostly corresponding to trace.dat file version 6, but still
lacks 4 byte of number of cpus, and 10 bytes of type string - and I
think we don't need those anyway.

Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Tested-by: Masami Hiramatsu <mhiramat@kernel.org>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Jeremy Eder <jeder@redhat.com>
Cc: Jiri Olsa <jolsa@redhat.com>,
Cc: Paul Mackerras <paulus@samba.org>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Stephane Eranian <eranian@google.com>
Cc: Steven Rostedt <rostedt@goodmis.org>
[ Change version test from == to >= ]
Link: http://lkml.kernel.org/n/tip-vaooqpxsikxbb3359p0corcb@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2017-01-26 11:42:59 -03:00
Arnaldo Carvalho de Melo
0a87e7bc6c perf scripting perl: Do not die() when not founding event for a type
Do just like handling other cases i.e. print some debug message and
ignore the sample.

Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: David Ahern <dsahern@gmail.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Wang Nan <wangnan0@huawei.com>
Link: http://lkml.kernel.org/n/tip-t7kzlm3cxyvbd7d9n9554ai9@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2017-01-26 11:42:59 -03:00
Joe Stringer
e28ff1a838 tools lib bpf: Add libbpf_get_error()
This function will turn a libbpf pointer into a standard error code (or
0 if the pointer is valid).

This also allows removal of the dependency on linux/err.h in the public
header file, which causes problems in userspace programs built against
libbpf.

Signed-off-by: Joe Stringer <joe@ovn.org>
Acked-by: Wang Nan <wangnan0@huawei.com>
Cc: Alexei Starovoitov <ast@fb.com>
Cc: Daniel Borkmann <daniel@iogearbox.net>
Cc: netdev@vger.kernel.org
Link: http://lkml.kernel.org/r/20170123011128.26534-5-joe@ovn.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2017-01-26 11:42:58 -03:00
Markus Elfring
d1d0e29cb7 perf probe: Delete an unnecessary assignment in try_to_find_absolute_address()
Remove an error code assignment which is redundant in an if branch for
the handling of a memory allocation failure because the same value was
set for the local variable "err" before.

Signed-off-by: Markus Elfring <elfring@users.sourceforge.net>
Acked-by: Masami Hiramatsu <mhiramat@kernel.org>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: He Kuang <hekuang@huawei.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Milian Wolff <milian.wolff@kdab.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Ravi Bangoria <ravi.bangoria@linux.vnet.ibm.com>
Cc: Wang Nan <wangnan0@huawei.com>
Cc: kernel-janitors@vger.kernel.org
Link: http://lkml.kernel.org/r/0ede09ec-79b6-c8bd-5b20-02c63ed98aab@users.sourceforge.net
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2017-01-26 11:42:56 -03:00
Markus Elfring
42e233cacc perf probe: Delete an unnecessary check in try_to_find_absolute_address()
Remove a condition check which is unnecessary at the end
because this source code place should usually only be reached
with a non-zero pointer.

Signed-off-by: Markus Elfring <elfring@users.sourceforge.net>
Acked-by: Masami Hiramatsu <mhiramat@kernel.org>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: He Kuang <hekuang@huawei.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Milian Wolff <milian.wolff@kdab.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Ravi Bangoria <ravi.bangoria@linux.vnet.ibm.com>
Cc: Wang Nan <wangnan0@huawei.com>
Cc: kernel-janitors@vger.kernel.org
Link: http://lkml.kernel.org/r/a3f2473b-6383-a326-bce0-b826423608b8@users.sourceforge.net
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2017-01-26 11:42:55 -03:00
He Kuang
1b29dfbba1 perf probe: Fix wrong register name for arm64
The register name of arm64 architecture is x0-x31 not r0-r31, this patch
changes this typo.

Before this patch:

  # perf probe --definition 'sys_write count'
  p:probe/sys_write _text+1502872 count=%r2:s64

  # echo 'p:probe/sys_write _text+1502872 count=%r2:s64' > \
    /sys/kernel/debug/tracing/kprobe_events
  Parse error at argument[0]. (-22)

After this patch:

  # perf probe --definition 'sys_write count'
  p:probe/sys_write _text+1502872 count=%x2:s64

  # echo 'p:probe/sys_write _text+1502872 count=%x2:s64' > \
    /sys/kernel/debug/tracing/kprobe_events
  # echo 1 >/sys/kernel/debug/tracing/events/probe/enable
  # cat /sys/kernel/debug/tracing/trace
  ...
  sh-422   [000] d... 650.495930: sys_write: (SyS_write+0x0/0xc8) count=22
  sh-422   [000] d... 651.102389: sys_write: (SyS_write+0x0/0xc8) count=26
  sh-422   [000] d... 651.358653: sys_write: (SyS_write+0x0/0xc8) count=86

Signed-off-by: He Kuang <hekuang@huawei.com>
Acked-by: Masami Hiramatsu <mhiramat@kernel.org>
Acked-by: Will Deacon <will.deacon@arm.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Bintian Wang <bintian.wang@huawei.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Wang Nan <wangnan0@huawei.com>
Link: http://lkml.kernel.org/r/20170124103015.1936-2-hekuang@huawei.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2017-01-26 11:42:43 -03:00
Ingo Molnar
47cd95a632 Merge branch 'linus' into perf/core, to pick up fixes
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2017-01-25 15:52:46 +01:00
Jiri Olsa
190bacca16 perf c2c report: Coalesce by default only by pid,iaddr
It seems to be the most used argument for -c option so far.  In the
beginning when you want to have the overall process report, so it makes
sense to make it the default one.

Signed-off-by: Jiri Olsa <jolsa@kernel.org>
Cc: David Ahern <dsahern@gmail.com>
Cc: Don Zickus <dzickus@redhat.com>
Cc: Joe Mario <jmario@redhat.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Link: http://lkml.kernel.org/r/1484904032-11040-5-git-send-email-jolsa@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2017-01-20 16:52:56 -03:00
Jiri Olsa
8763e6ac2d perf c2c report: Display Total records column in offset view
Adding "Total records" column into cacheline pareto table, between
cycles and cpu info.

  $ perf c2c report
  ...

  ---    ---------- cycles ----------    Total       cpu
         rmt hitm  lcl hitm      load  records       cnt
  ...    ........  ........  ........  .......  ........

                0       112        71       34         4
                0         0         0       18         1
                0         0         0        2         1
                0       132         0        3         3

  ...

It's useful to see how many recorded samples represent each offset.

Signed-off-by: Jiri Olsa <jolsa@kernel.org>
Cc: David Ahern <dsahern@gmail.com>
Cc: Don Zickus <dzickus@redhat.com>
Cc: Joe Mario <jmario@redhat.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Link: http://lkml.kernel.org/r/1484904032-11040-4-git-send-email-jolsa@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2017-01-20 16:52:43 -03:00
Jiri Olsa
0e3fa7a7ac perf hists browser: Add e/c hotkeys to expand/collapse callchain for current entry
Currently we allow only to expand or collapse all entries in the browser
with 'E' or 'C' keys. Allow user to expand or collapse only current
entry in the browser with e or c key.

Signed-off-by: Jiri Olsa <jolsa@kernel.org>
Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: David Ahern <dsahern@gmail.com>
Cc: Don Zickus <dzickus@redhat.com>
Cc: Joe Mario <jmario@redhat.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Link: http://lkml.kernel.org/r/1484904032-11040-3-git-send-email-jolsa@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2017-01-20 13:37:26 -03:00
Jiri Olsa
b33f922651 perf hists browser: Put hist_entry folding logic into single function
It will be used in following patch to expand or collapse only the
current browser entry.

Signed-off-by: Jiri Olsa <jolsa@kernel.org>
Cc: David Ahern <dsahern@gmail.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Link: http://lkml.kernel.org/r/1484904032-11040-2-git-send-email-jolsa@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2017-01-20 10:04:45 -03:00
Matija Glavinic Pecotic
9343e45bf6 perf unwind: Fix looking up dwarf unwind stack info
Using perf with call graph method dwarf fails to provide backtrace
support for stripped binary even though .gnu_debuglink points to *.dbg
flavor with properly populated debug symbols.

Problem is reproduced on ARM (v7, v8), kernels 3.14.y, 4.4.y and
4.10.rc3.  Perf is configured with libunwind, and unwind dwarf support
[1]. Test code (stress_bt.c) can be found on [2].

Running (explicitly disable other unwinding methods):

  $ gcc -g -o stress_bt -fomit-frame-pointer -fno-unwind-tables \
	-fno-asynchronous-unwind-tables stress_bt.c
  $ perf record -N --call-graph dwarf ./stress_bt
  $ perf report

results in properly generated call graph. Stripping the binary and running
it results with missing call graph. Expected result is to have call graph:

  $ gcc -g -o stress_bt -fomit-frame-pointer -fno-unwind-tables \
	  -fno-asynchronous-unwind-tables stress_bt.c
  $ objcopy --only-keep-debug stress_bt stress_bt.dbg
  $ objcopy --strip-debug stress_bt
  $ objcopy --add-gnu-debuglink=stress_bt.dbg stress_bt
  $ perf record -N --call-graph dwarf ./stress_bt
  $ perf report

Problem is that perf doesn't try to read symbols pointed by gnu
debuglink.  Patch adds checking, and reading of the symbols from
debuglink and symsrc.  Order of the check is to first check within dso,
then check whether symsrc is defined and try to read from it. Finally,
debuglink is checked. Default locations of debug files are discussed in
[3] and [4]. Comments on RFC are on [5].

[1] https://wiki.linaro.org/LEG/Engineering/TOOLS/perf-callstack-unwinding
[2] [1]#Backtrace_stress_application
[3] https://sourceware.org/gdb/onlinedocs/gdb/Separate-Debug-Files.html
[4] https://sourceware.org/binutils/docs/binutils/objcopy.html
[5] https://lkml.org/lkml/2016/8/22/473

Signed-off-by: Matija Glavinic Pecotic <matija.glavinic-pecotic.ext@nokia.com>
Acked-by: Jiri Olsa <jolsa@kernel.org>
Cc: Alexander Sverdlin <alexander.sverdlin@nokia.com>
Cc: Masami Hiramatsu <mhiramat@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Link: http://lkml.kernel.org/r/d309d40a-463f-482b-68e1-1465326efdc1@nokia.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2017-01-18 12:29:52 -03:00
Ingo Molnar
31f5260a76 perf/urgent 'perf probe' fixes:
Fixes:
 
 - Show correct locations for 'perf probe' on modules (Masami Hiramatsu)
 
 - Correctly handle 'perf probe's on gcc generated functions in modules (Masami Hiramatsu)
 
 Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
 -----BEGIN PGP SIGNATURE-----
 Version: GnuPG v2
 
 iQIcBAABCAAGBQJYfg3EAAoJENZQFvNTUqpAdsYP/i+1KHLJo5E6w6DErr2gBscJ
 gqjR6ZRPRCdj3EPUkhMZFDHRN8m3+XZjTqpbc59trhRw144eANHfYuniIaY4LcmM
 qHhIyhSz7mFb3WSg/zYIfetGRMMheKCUE5UraHObgY2WC0n5c/5qSB0jOawyN+fR
 oXn0WhdqcekHIym+P19F/BivnHQsRKQVHwcswXm/4flB87olTAkKIluvIFbQVUfY
 rK/o32VsDn/vb+9DZ6l9qgc6kteYPk+f/2zZ7gcjNx4Tw3zA+i1mZBErVZ1SyQau
 zg84AWMYTNyQN1PrjZBVVFsoF+m4QKKcV53DO2Ml4+PspmY6Bqm5bw06hXSbhOH+
 QF3x3fzO7fjrAQQOXXj+UTaRyrH9e2Gxewz3McMaAZvVc/gDIAKvU0GaBpEb4Sm4
 6/4QV6aTJpejP/jG68jH8/8TbrkZzuPZ6wXRm/WxPMSqGFWj08GVuuYxfBI1GXeP
 A1rOmo4TqNJZiA4MeakCHyHJ+59DbAYwdSgZwGXalqc288paDcvGX1kRYvs7iph8
 enKErzAg+LGpIWTJdOcqk8kF7i5u6x53ACuLLR1/p1wdYEvzGWdPwmfUxpWhwVLA
 xseVD5bl8y/jjz1eqWu8latEAlSbsFeAUv7870T9Is70/fg6x11+V944brrtsmYK
 LalUTkcAqYqrzC/9nvg+
 =ebMR
 -----END PGP SIGNATURE-----

Merge tag 'perf-urgent-for-mingo-4.10-20170117' of git://git.kernel.org/pub/scm/linux/kernel/git/acme/linux into perf/urgent

Pull 'perf probe' fixes from Arnaldo Carvalho de Melo <acme@redhat.com>

  - Show correct locations for 'perf probe' on modules (Masami Hiramatsu)

  - Correctly handle 'perf probe's on GCC generated functions in modules (Masami Hiramatsu)

Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2017-01-17 17:03:16 +01:00
Soramichi AKIYAMA
d94386f28a perf evlist: Fix typo in deliver_sample()
This patch fixes a typo: s/delievery/delivery/

Signed-off-by: Soramichi Akiyama <akiyama@m.soramichi.jp>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: http://lkml.kernel.org/r/20170117222233.dfd92de0ad701e7c53396950@m.soramichi.jp
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2017-01-17 11:36:45 -03:00
Soramichi AKIYAMA
d25ed5d9fa perf tools: Move two variables usied in libperf from perf.c
The use_browser and perf_version_string variables are both declared in
perf.c but they are also referenced by other functions of libperf.a.

Therefore a user linking an own main() with libperf.a must declare those
two variables in their files even if the files never use the browser or
the version information.

This patch fixes this issue by moving use_browser and
perf_version_string out of perf.c to some other files.

Signed-off-by: Soramichi Akiyama <akiyama@m.soramichi.jp>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: http://lkml.kernel.org/r/20170117002237.c1aec0ce3b4d675dca018deb@m.soramichi.jp
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2017-01-17 11:36:45 -03:00
Namhyung Kim
587782c52a perf sched timehist: Show total wait times for summary
When --state option is given, the summary will show total run, sleep,
iowait, preempt and delay time instead of statistics of runtime.

  $ perf sched timehist -s --state

  Wait-time summary
            comm  parent sched-in run-time  sleep iowait preempt delay
                          (count)   (msec) (msec) (msec)  (msec) (msec)
  ---------------------------------------------------------------------
       systemd[1]      0        3    0.497   1.685 0.000   0.000 0.061
   ksoftirqd/0[3]      2       21    0.434 989.948 0.000   0.000 0.325
   rcu_preempt[7]      2       28    0.386 993.211 0.000   0.000 0.712
  migration/0[10]      2       12    0.126  50.174 0.000   0.000 0.044
   watchdog/0[11]      2        1    0.009   0.000 0.000   0.000 0.016
  migration/1[13]      2        2    0.029  11.755 0.000   0.000 0.007
  <SNIP>

Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: David Ahern <dsahern@gmail.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Minchan Kim <minchan@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: http://lkml.kernel.org/r/20170113104523.31212-3-namhyung@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2017-01-17 11:36:44 -03:00
Namhyung Kim
414e050c68 perf sched timehist: Add --state option
The --state option is to show task state when switched out.  The state
is printed as a single character like in the /proc but I added 'I' for
idle state rather than 'R'.

  $ perf sched timehist --state | head
  Samples do not have callchains.
      time cpu task name              wait time sch delay run time state
               [tid/pid]                 (msec)    (msec)   (msec)
  -------- --- ----------------------- -------- ------------------ -----
  1.753791 [3] <idle>                     0.000     0.000    0.000     I
  1.753834 [1] perf[27469]                0.000     0.000    0.000     S
  1.753904 [3] perf[27470]                0.000     0.006    0.112     S
  1.753914 [1] <idle>                     0.000     0.000    0.079     I
  1.753915 [3] migration/3[23]            0.000     0.002    0.011     S
  1.754287 [2] <idle>                     0.000     0.000    0.000     I
  1.754335 [2] transmission[1773/1739]    0.000     0.004    0.047     S

Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: David Ahern <dsahern@gmail.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Minchan Kim <minchan@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: http://lkml.kernel.org/r/20170113104523.31212-2-namhyung@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2017-01-17 11:36:44 -03:00
Namhyung Kim
941bdea79e perf sched timehist: Account thread wait time separately
Separate thread wait time into 3 parts - sleep, iowait and preempt based
on the prev_state of the last event.

Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Cc: David Ahern <dsahern@gmail.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Minchan Kim <minchan@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: http://lkml.kernel.org/r/20170113104523.31212-1-namhyung@kernel.org
[ Fix the build on centos:5 where 'wait' shadows a global declaration ]
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2017-01-17 11:35:44 -03:00
Masami Hiramatsu
613f050d68 perf probe: Fix to probe on gcc generated functions in modules
Fix to probe on gcc generated functions on modules. Since
probing on a module is based on its symbol name, it should
be adjusted on actual symbols.

E.g. without this fix, perf probe shows probe definition
on non-exist symbol as below.

  $ perf probe -m build-x86_64/net/netfilter/nf_nat.ko -F in_range*
  in_range.isra.12
  $ perf probe -m build-x86_64/net/netfilter/nf_nat.ko -D in_range
  p:probe/in_range nf_nat:in_range+0

With this fix, perf probe correctly shows a probe on
gcc-generated symbol.

  $ perf probe -m build-x86_64/net/netfilter/nf_nat.ko -D in_range
  p:probe/in_range nf_nat:in_range.isra.12+0

This also fixes same problem on online module as below.

  $ perf probe -m i915 -D assert_plane
  p:probe/assert_plane i915:assert_plane.constprop.134+0

Signed-off-by: Masami Hiramatsu <mhiramat@kernel.org>
Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: http://lkml.kernel.org/r/148411450673.9978.14905987549651656075.stgit@devbox
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2017-01-16 15:43:04 -03:00
Masami Hiramatsu
3e96dac7c9 perf probe: Add error checks to offline probe post-processing
Add error check codes on post processing and improve it for offline
probe events as:

 - post processing fails if no matched symbol found in map(-ENOENT)
   or strdup() failed(-ENOMEM).

 - Even if the symbol name is the same, it updates symbol address
   and offset.

Signed-off-by: Masami Hiramatsu <mhiramat@kernel.org>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: http://lkml.kernel.org/r/148411443738.9978.4617979132625405545.stgit@devbox
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2017-01-16 15:35:25 -03:00
Masami Hiramatsu
d2d4edbebe perf probe: Fix to show correct locations for events on modules
Fix to show correct locations for events on modules by relocating given
address instead of retrying after failure.

This happens when the module text size is big enough, bigger than
sh_addr, because the original code retries with given address + sh_addr
if it failed to find CU DIE at the given address.

Any address smaller than sh_addr always fails and it retries with the
correct address, but addresses bigger than sh_addr will get a CU DIE
which is on the given address (not adjusted by sh_addr).

In my environment(x86-64), the sh_addr of ".text" section is 0x10030.
Since i915 is a huge kernel module, we can see this issue as below.

  $ grep "[Tt] .*\[i915\]" /proc/kallsyms | sort | head -n1
  ffffffffc0270000 t i915_switcheroo_can_switch	[i915]

ffffffffc0270000 + 0x10030 = ffffffffc0280030, so we'll check
symbols cross this boundary.

  $ grep "[Tt] .*\[i915\]" /proc/kallsyms | grep -B1 ^ffffffffc028\
  | head -n 2
  ffffffffc027ff80 t haswell_init_clock_gating	[i915]
  ffffffffc0280110 t valleyview_init_clock_gating	[i915]

So setup probes on both function and see what happen.

  $ sudo ./perf probe -m i915 -a haswell_init_clock_gating \
        -a valleyview_init_clock_gating
  Added new events:
    probe:haswell_init_clock_gating (on haswell_init_clock_gating in i915)
    probe:valleyview_init_clock_gating (on valleyview_init_clock_gating in i915)

  You can now use it in all perf tools, such as:

  	perf record -e probe:valleyview_init_clock_gating -aR sleep 1

  $ sudo ./perf probe -l
    probe:haswell_init_clock_gating (on haswell_init_clock_gating@gpu/drm/i915/intel_pm.c in i915)
    probe:valleyview_init_clock_gating (on i915_vga_set_decode:4@gpu/drm/i915/i915_drv.c in i915)

As you can see, haswell_init_clock_gating is correctly shown,
but valleyview_init_clock_gating is not.

With this patch, both events are shown correctly.

  $ sudo ./perf probe -l
    probe:haswell_init_clock_gating (on haswell_init_clock_gating@gpu/drm/i915/intel_pm.c in i915)
    probe:valleyview_init_clock_gating (on valleyview_init_clock_gating@gpu/drm/i915/intel_pm.c in i915)

Committer notes:

In my case:

  # perf probe -m i915 -a haswell_init_clock_gating -a valleyview_init_clock_gating
  Added new events:
    probe:haswell_init_clock_gating (on haswell_init_clock_gating in i915)
    probe:valleyview_init_clock_gating (on valleyview_init_clock_gating in i915)

  You can now use it in all perf tools, such as:

	  perf record -e probe:valleyview_init_clock_gating -aR sleep 1

  # perf probe -l
    probe:haswell_init_clock_gating (on i915_getparam+432@gpu/drm/i915/i915_drv.c in i915)
    probe:valleyview_init_clock_gating (on __i915_printk+240@gpu/drm/i915/i915_drv.c in i915)
  #

  # readelf -SW /lib/modules/4.9.0+/build/vmlinux | egrep -w '.text|Name'
   [Nr] Name   Type      Address          Off    Size   ES Flg Lk Inf Al
   [ 1] .text  PROGBITS  ffffffff81000000 200000 822fd3 00  AX  0   0 4096
  #

  So both are b0rked, now with the fix:

  # perf probe -m i915 -a haswell_init_clock_gating -a valleyview_init_clock_gating
  Added new events:
    probe:haswell_init_clock_gating (on haswell_init_clock_gating in i915)
    probe:valleyview_init_clock_gating (on valleyview_init_clock_gating in i915)

  You can now use it in all perf tools, such as:

	perf record -e probe:valleyview_init_clock_gating -aR sleep 1

  # perf probe -l
    probe:haswell_init_clock_gating (on haswell_init_clock_gating@gpu/drm/i915/intel_pm.c in i915)
    probe:valleyview_init_clock_gating (on valleyview_init_clock_gating@gpu/drm/i915/intel_pm.c in i915)
  #

Both looks correct.

Signed-off-by: Masami Hiramatsu <mhiramat@kernel.org>
Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: http://lkml.kernel.org/r/148411436777.9978.1440275861947194930.stgit@devbox
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2017-01-16 15:14:06 -03:00
Yannick Brosseau
be3d466c73 perf script: Also allow forcing reading of non-root owned files by root
In 2059fc7a5a ("perf symbols: Allow forcing reading of non-root owned
files by root") 'perf report' was added the option of forcing reading of
non-root owned symbol file.

This add the same behavior for perf script.

Reported-by: Mark Drayton <mbd@fb.com>
Signed-off-by: Yannick Brosseau <scientist@fb.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: kernel-team@fb.com
Link: http://lkml.kernel.org/r/20170113182527.18625-1-scientist@fb.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2017-01-16 14:59:15 -03:00
Michael Petlan
5c64f99b1d perf script: Fix man page about --dump-raw-trace option
The "--dump-raw-script" is not a valid option, replace it with the valid
one, "--dump-raw-trace"

Signed-off-by: Michael Petlan <mpetlan@redhat.com>
Cc: Ingo Molnar <mingo@kernel.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Fixes: 133dc4c39c ("perf: Rename 'perf trace' to 'perf script'")
LPU-Reference: 728644547.14560155.1484320012612.JavaMail.zimbra@redhat.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2017-01-16 14:59:15 -03:00
David Carrillo-Cisneros
2484c4c58f perf tools: Remove unneccessary feature-dwarf warning
Don't warn for feature-dwarf==0 if user explicitily disabled DWARF by
using NO_DWARF=1.

Signed-off-by: David Carrillo-Cisneros <davidcc@google.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Stephane Eranian <eranian@google.com>
Link: http://lkml.kernel.org/r/20170112210159.76143-1-davidcc@google.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2017-01-16 14:59:15 -03:00
Andi Kleen
d02fc6bcd5 perf pmu: Factor out scale conversion code
Move the scale factor parsing code to an own function to reuse it in an
upcoming patch.

v2: Return error in case strdup returns NULL.

Signed-off-by: Andi Kleen <ak@linux.intel.com>
Acked-by: Jiri Olsa <jolsa@kernel.org>
Link: http://lkml.kernel.org/r/20170103150833.6694-2-andi@firstfloor.org
[ Keep returning -ENOMEM when strdup() fails in perf_pmu__parse_scale()/convert_scale() ]
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2017-01-16 14:59:15 -03:00
Linus Torvalds
79078c53ba Merge branch 'perf-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull perf fixes from Ingo Molnar:
 "Misc race fixes uncovered by fuzzing efforts, a Sparse fix, two PMU
  driver fixes, plus miscellanous tooling fixes"

* 'perf-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
  perf/x86: Reject non sampling events with precise_ip
  perf/x86/intel: Account interrupts for PEBS errors
  perf/core: Fix concurrent sys_perf_event_open() vs. 'move_group' race
  perf/core: Fix sys_perf_event_open() vs. hotplug
  perf/x86/intel: Use ULL constant to prevent undefined shift behaviour
  perf/x86/intel/uncore: Fix hardcoded socket 0 assumption in the Haswell init code
  perf/x86: Set pmu->module in Intel PMU modules
  perf probe: Fix to probe on gcc generated symbols for offline kernel
  perf probe: Fix --funcs to show correct symbols for offline module
  perf symbols: Robustify reading of build-id from sysfs
  perf tools: Install tools/lib/traceevent plugins with install-bin
  tools lib traceevent: Fix prev/next_prio for deadline tasks
  perf record: Fix --switch-output documentation and comment
  perf record: Make __record_options static
  tools lib subcmd: Add OPT_STRING_OPTARG_SET option
  perf probe: Fix to get correct modname from elf header
  samples/bpf trace_output_user: Remove duplicate sys/ioctl.h include
  samples/bpf sock_example: Avoid getting ethhdr from two includes
  perf sched timehist: Show total scheduling time
2017-01-15 11:37:43 -08:00
Jiri Olsa
bfacbe3bf2 perf record: Add switch-output time option argument
It's now possible to specify the threshold time for perf.data like:

  $ perf record --switch-output=30s ...

Once it's reached, the current data are dumped in to the
perf.data.<timestamp> file and session does on.

  $ perf record --switch-output=30s ...
  [ perf record: dump data: Woken up 44 times ]
  [ perf record: Dump perf.data.2017010213043746 ]
  ...

The time is expected to be a number with appended unit
character - s/m/h/d.

Signed-off-by: Jiri Olsa <jolsa@kernel.org>
Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Acked-by: Wang Nan <wangnan0@huawei.com>
Cc: David Ahern <dsahern@gmail.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Link: http://lkml.kernel.org/r/1483955520-29063-7-git-send-email-jolsa@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2017-01-11 16:48:02 -03:00
Jiri Olsa
0c5824498e perf record: Add switch-output size warning
Adding switch-output size warning if the requested
size of lower than the wakeup ring buffer size.

  $ perf record --switch-output=1K ls
  WARNING: switch-output data size lower than wakeup kernel buffer size (258K) expect bigger perf.data sizes
  ...

Signed-off-by: Jiri Olsa <jolsa@kernel.org>
Suggested-and-Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: David Ahern <dsahern@gmail.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Wang Nan <wangnan0@huawei.com>
Link: http://lkml.kernel.org/r/1483955520-29063-6-git-send-email-jolsa@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2017-01-11 16:48:02 -03:00
Jiri Olsa
dc0c6127c2 perf record: Add switch-output size option argument
It's now possible to specify the threshold size for perf.data like:

  $ perf record --switch-output=2G ...

Once it's reached, the current data are dumped in to the
perf.data.<timestamp> file and session does on.

  $ perf record --switch-output=2G ...
  [ perf record: dump data: Woken up 7244 times ]
  [ perf record: Dump perf.data.2017010214093746 ]
  ...

The size is expected to be a number with appended unit character -
B/K/M/G.

Signed-off-by: Jiri Olsa <jolsa@kernel.org>
Acked-by: Wang Nan <wangnan0@huawei.com>
Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: David Ahern <dsahern@gmail.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Link: http://lkml.kernel.org/r/1483955520-29063-5-git-send-email-jolsa@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2017-01-11 16:48:02 -03:00
Jiri Olsa
cb4e1ebb6a perf record: Change switch-output option to take optional argument
Next patches will add --switch-output option arguments, changing the
option to allow that and adding its default value to 'signal'.

Signed-off-by: Jiri Olsa <jolsa@kernel.org>
Acked-by: Wang Nan <wangnan0@huawei.com>
Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: David Ahern <dsahern@gmail.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Link: http://lkml.kernel.org/r/1483955520-29063-4-git-send-email-jolsa@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2017-01-11 16:48:01 -03:00
Jiri Olsa
1b43b70484 perf record: Add struct switch_output
Next patches will add more --switch-output option arguments,
so preparing the data holder.

Signed-off-by: Jiri Olsa <jolsa@kernel.org>
Acked-by: Wang Nan <wangnan0@huawei.com>
Cc: David Ahern <dsahern@gmail.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Link: http://lkml.kernel.org/r/1483955520-29063-3-git-send-email-jolsa@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2017-01-11 16:48:01 -03:00
Jiri Olsa
9808143ba2 perf tools: Add unit_number__scnprintf function
Add unit_number__scnprintf function to display size units and use it in
-m option info message.

Before:
  $ perf record -m 10M ls
  rounding mmap pages size to 16777216 bytes (4096 pages)
  ...

After:
  $ perf record -m 10M ls
  rounding mmap pages size to 16M (4096 pages)
  ...

Signed-off-by: Jiri Olsa <jolsa@kernel.org>
Cc: David Ahern <dsahern@gmail.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Wang Nan <wangnan0@huawei.com>
Link: http://lkml.kernel.org/r/1483955520-29063-2-git-send-email-jolsa@kernel.org
[ Rename it to unit_number__scnprintf for consistency ]
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2017-01-11 16:48:01 -03:00
Soramichi Akiyama
e978be9ea2 perf evlist: Fix typo in perf_evlist__start_workload()
This patch fixes a typo: s/enable to/unable to/

Signed-off-by: Soramichi AKIYAMA <akiyama@m.soramichi.jp>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Namhyung Kim <namhyung.kim@lge.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Fixes: bcf3145fbe ("perf evlist: Enhance perf_evlist__start_workload()")
Link: http://lkml.kernel.org/r/20170110200006.e1f7a766b4faf1f107ae2e1b@m.soramichi.jp
[ Wasn't applying, fixed it up by hand, added Fixes: tag ]
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2017-01-11 16:48:01 -03:00
Arnaldo Carvalho de Melo
017037ff3d perf trace: Allow specifying list of syscalls and events in -e/--expr/--event
Makes it easier to specify both events and syscalls (to be formatter
strace-like), i.e. previously one would have to do:

  # perf trace -e nanosleep --event sched:sched_switch usleep 1

Now it is possible to do:

  # perf trace -e nanosleep,sched:sched_switch usleep 1
     0.000 ( 0.021 ms): usleep/17962 nanosleep(rqtp: 0x7ffdedd61ec0) ...
     0.021 (         ): sched:sched_switch:usleep:17962 [120] S ==> swapper/1:0 [120])
     0.000 ( 0.066 ms): usleep/17962  ... [continued]: nanosleep()) = 0
  #

The old style --expr and using both -e and --event continues to work.

Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: David Ahern <dsahern@gmail.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Milian Wolff <milian.wolff@kdab.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Wang Nan <wangnan0@huawei.com>
Link: http://lkml.kernel.org/n/tip-ieg6bakub4657l9e6afn85r4@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2017-01-11 16:48:01 -03:00
Arnaldo Carvalho de Melo
355637717d perf kallsyms: Introduce tool to look for extended symbol information on the running kernel
Its similar to doing grep on a /proc/kallsyms, but it also shows extra
information like the path to the kernel module and the unrelocated
addresses in it, to help in diagnosing problems.

It is also helps demonstrate the use of the symbols routines so that
tool writers can use them more effectively.

Using it:

  $ perf kallsyms e1000_xmit_frame netif_rx usb_stor_set_xfer_buf
  e1000_xmit_frame: [e1000e] /lib/modules/4.9.0+/kernel/drivers/net/ethernet/intel/e1000e/e1000e.ko 0xffffffffc046fc10-0xffffffffc0470bb0 (0x19c80-0x1ac20)
  netif_rx: [kernel] [kernel.kallsyms] 0xffffffff916f03a0-0xffffffff916f0410 (0xffffffff916f03a0-0xffffffff916f0410)
  usb_stor_set_xfer_buf: [usb_storage] /lib/modules/4.9.0+/kernel/drivers/usb/storage/usb-storage.ko 0xffffffffc057aea0-0xffffffffc057af19 (0xf10-0xf89)
  $

Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: David Ahern <dsahern@gmail.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Masami Hiramatsu <mhiramat@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Wang Nan <wangnan0@huawei.com>
Link: http://lkml.kernel.org/n/tip-79bk9pakujn4l4vq0f90klv3@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2017-01-11 16:48:01 -03:00
Arnaldo Carvalho de Melo
7d132caaf9 perf machine: Add a kallsyms loading constructor
To reduce the boilerplate for searching for functions in the running
kernel and modules.

Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: David Ahern <dsahern@gmail.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Masami Hiramatsu <mhiramat@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Wang Nan <wangnan0@huawei.com>
Link: http://lkml.kernel.org/n/tip-93iqzayafpaxaguoiwjqezgz@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2017-01-11 16:48:00 -03:00
Laura Abbott
cd7f355ac4 perf jvmti: Create libdir directory before installing libperf-jvmti.so
The install command for libperf-jvmti.so does not check if libdir exists
before installing. This means that when the install command is run:

install libperf-jvmti.so '/tmp/test_root/usr/lib64';

libperf-jvmti.so will get installed to /usr/lib64 as a file and break
further installation. Fix this by ensuring the directory gets created
first.

See https://bugzilla.redhat.com/show_bug.cgi?id=1410296

Signed-off-by: Laura Abbott <labbott@redhat.com>
Acked-by: Jiri Olsa <jolsa@redhat.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Fixes: d4dfdf00d4 ("perf jvmti: Plug compilation into perf build")
Link: http://lkml.kernel.org/r/1483741088-13543-1-git-send-email-labbott@redhat.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2017-01-11 16:48:00 -03:00
Michal Hocko
41b6167e8f mm: get rid of __GFP_OTHER_NODE
The flag was introduced by commit 78afd5612d ("mm: add
__GFP_OTHER_NODE flag") to allow proper accounting of remote node
allocations done by kernel daemons on behalf of a process - e.g.
khugepaged.

After "mm: fix remote numa hits statistics" we do not need and actually
use the flag so we can safely remove it because all allocations which
are satisfied from their "home" node are accounted properly.

[mhocko@suse.com: fix build]
Link: http://lkml.kernel.org/r/20170106122225.GK5556@dhcp22.suse.cz
Link: http://lkml.kernel.org/r/20170102153057.9451-3-mhocko@kernel.org
Signed-off-by: Michal Hocko <mhocko@suse.com>
Acked-by: Mel Gorman <mgorman@suse.de>
Acked-by: Vlastimil Babka <vbabka@suse.cz>
Cc: Michal Hocko <mhocko@suse.com>
Cc: Johannes Weiner <hannes@cmpxchg.org>
Cc: Joonsoo Kim <iamjoonsoo.kim@lge.com>
Cc: Taku Izumi <izumi.taku@jp.fujitsu.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2017-01-10 18:31:55 -08:00
Ingo Molnar
4e06d4f083 perf/urgent fixes and one improvement:
Fixes:
 
 - Fix prev/next_prio formatting for deadline tasks in libtraceevent (Daniel Bristot de Oliveira)
 
 - Robustify reading of build-ids from /sys/kernel/note (Arnaldo Carvalho de Melo)
 
 - Fix building some sample/bpf in Alpine Linux 3.4 (Arnaldo Carvalho de Melo)
 
 - Fix 'make install-bin' to install libtraceevent plugins (Arnaldo Carvalho de Melo)
 
 - Fix 'perf record --switch-output' documentation and comment (Jiri Olsa)
 
 - 'perf probe' fixes for cross arch probing (Masami Hiramatsu)
 
 Improvement:
 
 - Show total scheduling time in 'perf sched timehist' (Namhyumg Kim)
 
 Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
 -----BEGIN PGP SIGNATURE-----
 Version: GnuPG v2
 
 iQIcBAABCAAGBQJYbS6KAAoJENZQFvNTUqpAWqkP/1xdvbDqeTA2YjcWG+ffQJye
 LULSNQpRShvBGpB3zsOzOebOtw0R5oWb7YAp4JFZNyMIk5qcnS9gVwdw9Qx1lc/W
 Gq8OuhI28Pr9AHE37eTJ1+12HI2aAOylwKFqUpEWMzoaUqBDRh5JV2MeVGVQXBRe
 YBJ4Qwk1a7Ng74uzN/pdiC6V7pjddPEoLZEDG5p35vrgtNAY1JGlX5rvNR7CO6nN
 72QHRnB4r1d9fgNGS5oO5zEU1q5+JN5phe+gDANdk/8pCqDegRjZpHnj5gcm71jr
 mEvbstmlV1XPZSA9aFVu2L5LZuv6asS02HjrbGydqO2DUTpDgrynHsjAuqqaTnwk
 kLGwA2I+Kyh8g155H+HrQANONe2TTOXRkBVWIoFBWkhd1JUS/p3GE2qvqx5N36sr
 U26d9i/Bk/pbOYOdRUFMc4lckZY2HrrRWj8dVr06qLr2rlbYoLv8hpqvdyHr9MJS
 Id2Yqqd1JVzBDU3/+GfdFDBjltVtpSG1mdc6LWy9IpXFMJoSOQNsrbjf+mgMadeV
 EbtY8MdDZ525EB8niv9E9Vfd56LiWTKCjg9Mc758+U+Ns50YlQFsDq2hBO2+jQ/k
 W3QNX+WJiLgxqQK8adveXoYgCSWPNVttIvg8Lj1KNk8Y//vUC1KSrvDfaKiROzLB
 iF0u5mLRAQ+Pfec6X5PP
 =o9KL
 -----END PGP SIGNATURE-----

Merge tag 'perf-urgent-for-mingo-4.10-20170104' of git://git.kernel.org/pub/scm/linux/kernel/git/acme/linux into perf/urgent

Pull perf/urgent fixes and one improvement from Arnaldo Carvalho de Melo:

Fixes:

  - Fix prev/next_prio formatting for deadline tasks in libtraceevent (Daniel Bristot de Oliveira)

  - Robustify reading of build-ids from /sys/kernel/note (Arnaldo Carvalho de Melo)

  - Fix building some sample/bpf in Alpine Linux 3.4 (Arnaldo Carvalho de Melo)

  - Fix 'make install-bin' to install libtraceevent plugins (Arnaldo Carvalho de Melo)

  - Fix 'perf record --switch-output' documentation and comment (Jiri Olsa)

  - Fix 'perf probe' for cross arch probing (Masami Hiramatsu)

Improvement:

  - Show total scheduling time in 'perf sched timehist' (Namhyumg Kim)

Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2017-01-05 08:33:02 +01:00
Masami Hiramatsu
8a937a25a7 perf probe: Fix to probe on gcc generated symbols for offline kernel
Fix perf-probe to show probe definition on gcc generated symbols for
offline kernel (including cross-arch kernel image).

gcc sometimes optimizes functions and generate new symbols with suffixes
such as ".constprop.N" or ".isra.N" etc. Since those symbol names are
not recorded in DWARF, we have to find correct generated symbols from
offline ELF binary to probe on it (kallsyms doesn't correct it).  For
online kernel or uprobes we don't need it because those are rebased on
_text, or a section relative address.

E.g. Without this:

  $ perf probe -k build-arm/vmlinux -F __slab_alloc*
  __slab_alloc.constprop.9
  $ perf probe -k build-arm/vmlinux -D __slab_alloc
  p:probe/__slab_alloc __slab_alloc+0

If you put above definition on target machine, it should fail
because there is no __slab_alloc in kallsyms.

With this fix, perf probe shows correct probe definition on
__slab_alloc.constprop.9:

  $ perf probe -k build-arm/vmlinux -D __slab_alloc
  p:probe/__slab_alloc __slab_alloc.constprop.9+0

Signed-off-by: Masami Hiramatsu <mhiramat@kernel.org>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: http://lkml.kernel.org/r/148350060434.19001.11864836288580083501.stgit@devbox
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2017-01-04 11:44:22 -03:00
Masami Hiramatsu
eebc509b20 perf probe: Fix --funcs to show correct symbols for offline module
Fix --funcs (-F) option to show correct symbols for offline module.
Since previous perf-probe uses machine__findnew_module_map() for offline
module, even if user passes a module file (with full path) which is for
other architecture, perf-probe always tries to load symbol map for
current kernel module.

This fix uses dso__new_map() to load the map from given binary as same
as a map for user applications.

Signed-off-by: Masami Hiramatsu <mhiramat@kernel.org>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: http://lkml.kernel.org/r/148350053478.19001.15435255244512631545.stgit@devbox
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2017-01-04 11:15:09 -03:00
Arnaldo Carvalho de Melo
7934c98a6e perf symbols: Robustify reading of build-id from sysfs
Markus reported that perf segfaults when reading /sys/kernel/notes from
a kernel linked with GNU gold, due to what looks like a gold bug, so do
some bounds checking to avoid crashing in that case.

Reported-by: Markus Trippelsdorf <markus@trippelsdorf.de>
Report-Link: http://lkml.kernel.org/r/20161219161821.GA294@x4
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: David Ahern <dsahern@gmail.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Wang Nan <wangnan0@huawei.com>
Link: http://lkml.kernel.org/n/tip-ryhgs6a6jxvz207j2636w31c@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2017-01-03 16:11:13 -03:00
Arnaldo Carvalho de Melo
30a9c64448 perf tools: Install tools/lib/traceevent plugins with install-bin
Those are binaries as well, so should be installed by:

  make -C tools/perf install-bin'

too.

Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Daniel Bristot de Oliveira <bristot@redhat.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Steven Rostedt <rostedt@goodmis.org>
Link: http://lkml.kernel.org/n/tip-3841b37u05evxrs1igkyu6ks@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2017-01-03 16:11:13 -03:00
Jiri Olsa
60437ac02f perf record: Fix --switch-output documentation and comment
There's no --signal-trigger option, also adding the code comment into
record man page.

Signed-off-by: Jiri Olsa <jolsa@kernel.org>
Tested-by: Wang Nan <wangnan0@huawei.com>
Cc: David Ahern <dsahern@gmail.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Link: http://lkml.kernel.org/r/1483431600-19887-4-git-send-email-jolsa@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2017-01-03 11:11:38 -03:00
Jiri Olsa
efd2130711 perf record: Make __record_options static
There's no need for this one to be global.

Signed-off-by: Jiri Olsa <jolsa@kernel.org>
Tested-by: Wang Nan <wangnan0@huawei.com>
Cc: David Ahern <dsahern@gmail.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Link: http://lkml.kernel.org/r/1483431600-19887-3-git-send-email-jolsa@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2017-01-03 11:11:10 -03:00
Masami Hiramatsu
1f2ed153b9 perf probe: Fix to get correct modname from elf header
Since 'perf probe' supports cross-arch probes, it is possible to analyze
different arch kernel image which has different bits-per-long.

In that case, it fails to get the module name because it uses the
MOD_NAME_OFFSET macro based on the host machine bits-per-long, instead
of the target arch bits-per-long.

This fixes above issue by changing modname-offset based on the target
archs bit width. This is ok because linux kernel uses LP64 model on
64bit arch.

E.g. without this (on x86_64, and target module is arm32):

  $ perf probe -m build-arm/fs/configfs/configfs.ko -D configfs_lookup
  p:probe/configfs_lookup :configfs_lookup+0
                          ^-Here is an empty module name.

With this fix, you can see correct module name:

  $ perf probe -m build-arm/fs/configfs/configfs.ko -D configfs_lookup
  p:probe/configfs_lookup configfs:configfs_lookup+0

Signed-off-by: Masami Hiramatsu <mhiramat@kernel.org>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: http://lkml.kernel.org/r/148337043836.6752.383495516397005695.stgit@devbox
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2017-01-02 14:09:17 -03:00
Namhyung Kim
9396c9cb0d perf sched timehist: Show total scheduling time
Show length of analyzed sample time and rate of idle task running.
This also takes care of time range given by --time option.

  $ perf sched timehist -sI | tail
  Samples do not have callchains.
  Idle stats:
      CPU  0 idle for    930.316  msec  ( 92.93%)
      CPU  1 idle for    963.614  msec  ( 96.25%)
      CPU  2 idle for    885.482  msec  ( 88.45%)
      CPU  3 idle for    938.635  msec  ( 93.76%)

      Total number of unique tasks: 118
  Total number of context switches: 2337
             Total run time (msec): 3718.048
      Total scheduling time (msec): 1001.131  (x 4)

Suggested-by: David Ahern <dsahern@gmail.com>
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Link: http://lkml.kernel.org/r/20161222060350.17655-3-namhyung@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2016-12-27 21:47:57 -03:00
Linus Torvalds
00198dab3b Merge branch 'perf-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull perf fixes from Ingo Molnar:
 "On the kernel side there's two x86 PMU driver fixes and a uprobes fix,
  plus on the tooling side there's a number of fixes and some late
  updates"

* 'perf-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (36 commits)
  perf sched timehist: Fix invalid period calculation
  perf sched timehist: Remove hardcoded 'comm_width' check at print_summary
  perf sched timehist: Enlarge default 'comm_width'
  perf sched timehist: Honour 'comm_width' when aligning the headers
  perf/x86: Fix overlap counter scheduling bug
  perf/x86/pebs: Fix handling of PEBS buffer overflows
  samples/bpf: Move open_raw_sock to separate header
  samples/bpf: Remove perf_event_open() declaration
  samples/bpf: Be consistent with bpf_load_program bpf_insn parameter
  tools lib bpf: Add bpf_prog_{attach,detach}
  samples/bpf: Switch over to libbpf
  perf diff: Do not overwrite valid build id
  perf annotate: Don't throw error for zero length symbols
  perf bench futex: Fix lock-pi help string
  perf trace: Check if MAP_32BIT is defined (again)
  samples/bpf: Make perf_event_read() static
  uprobes: Fix uprobes on MIPS, allow for a cache flush after ixol breakpoint creation
  samples/bpf: Make samples more libbpf-centric
  tools lib bpf: Add flags to bpf_create_map()
  tools lib bpf: use __u32 from linux/types.h
  ...
2016-12-23 16:49:12 -08:00
Namhyung Kim
bdd75729e5 perf sched timehist: Fix invalid period calculation
When --time option is given with a value outside recorded time, the last
sample time (tprev) was set to that value and run time calculation might
be incorrect.  This is a problem of the first samples for each cpus
since it would skip the runtime update when tprev is 0.  But with --time
option it had non-zero (which is invalid) value so the calculation is
also incorrect.

For example, let's see the followging:

  $ perf sched timehist
             time    cpu  task name                       wait time  sch delay   run time
                          [tid/pid]                          (msec)     (msec)     (msec)
  --------------- ------  ------------------------------  ---------  ---------  ---------
      3195.968367 [0003]  <idle>                              0.000      0.000      0.000
      3195.968386 [0002]  Timer[4306/4277]                    0.000      0.000      0.018
      3195.968397 [0002]  Web Content[4277]                   0.000      0.000      0.000
      3195.968595 [0001]  JS Helper[4302/4277]                0.000      0.000      0.000
      3195.969217 [0000]  <idle>                              0.000      0.000      0.621
      3195.969251 [0001]  kworker/1:1H[291]                   0.000      0.000      0.033

The sample starts at 3195.968367 but when I gave a time interval from
3194 to 3196 (in sec) it will calculate the whole 2 second as runtime.
In below, 2 cpus accounted it as runtime, other 2 cpus accounted it as
idle time.

Before:

  $ perf sched timehist --time 3194,3196 -s | tail
  Idle stats:
      CPU  0 idle for   1995.991  msec
      CPU  1 idle for     20.793  msec
      CPU  2 idle for     30.191  msec
      CPU  3 idle for   1999.852  msec

      Total number of unique tasks: 23
  Total number of context switches: 128
             Total run time (msec): 3724.940

After:

  $ perf sched timehist --time 3194,3196 -s | tail
  Idle stats:
      CPU  0 idle for     10.811  msec
      CPU  1 idle for     20.793  msec
      CPU  2 idle for     30.191  msec
      CPU  3 idle for     18.337  msec

      Total number of unique tasks: 23
  Total number of context switches: 128
             Total run time (msec): 18.139

Committer notes:

Further testing:

Before:

  Idle stats:
      CPU  0 idle for    229.785  msec
      CPU  1 idle for    937.944  msec
      CPU  2 idle for    188.931  msec
      CPU  3 idle for    986.185  msec

  After:

  # perf sched timehist --time 40602,40603 -s | tail

  Idle stats:
      CPU  0 idle for    229.785  msec
      CPU  1 idle for    175.407  msec
      CPU  2 idle for    188.931  msec
      CPU  3 idle for    223.657  msec

      Total number of unique tasks: 68
  Total number of context switches: 814
             Total run time (msec): 97.688

  # for cpu in `seq 0 3` ; do echo -n "CPU $cpu idle for " ; perf sched timehist --time 40602,40603 | grep "\[000${cpu}\].*\<idle\>" | tr -s ' ' | cut -d' ' -f7 | awk '{entries++ ; s+=$1} END {print s " msec (entries: " entries ")"}' ; done
  CPU 0 idle for 229.721 msec (entries: 123)
  CPU 1 idle for 175.381 msec (entries: 65)
  CPU 2 idle for 188.903 msec (entries: 56)
  CPU 3 idle for 223.61 msec (entries: 102)

Difference due to the idle stats being accounted at nanoseconds precision while
the <idle> entries in 'perf sched timehist' are trucated at msec.usec.

Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: David Ahern <dsahern@gmail.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Fixes: 853b740711 ("perf sched timehist: Add option to specify time window of interest")
Link: http://lkml.kernel.org/r/20161222060350.17655-2-namhyung@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2016-12-22 16:35:46 -03:00
Namhyung Kim
4fa0d1aa27 perf sched timehist: Remove hardcoded 'comm_width' check at print_summary
Now that the default 'comm_width' value is 30, no need to check that at
print_summary,

Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Cc: David Ahern <dsahern@gmail.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Link: http://lkml.kernel.org/r/20161222060350.17655-1-namhyung@kernel.org
[ Split from a larger patch ]
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2016-12-22 16:35:46 -03:00
Namhyung Kim
9b8087d720 perf sched timehist: Enlarge default 'comm_width'
Current default value is 20 but it's easily changed to a bigger value as
task has a long name and different tid and pid.  And it makes the output
not aligned.  So change it to have a large value as summary shows.

Committer notes:

Before:

  # perf sched record
  ^C
  # perf sched timehist
  <SNIP>
    40602.770537 [0001]  rcuos/2[29]               7.970      0.002      0.020
    40602.771512 [0003]  <idle>                    0.003      0.000      0.986
    40602.771586 [0001]  <idle>                    0.020      0.000      1.049
    40602.771606 [0001]  qemu-system-x86[3593/3510]      0.000      0.002      0.020
    40602.771629 [0003]  qemu-system-x86[3510]           0.000      0.003      0.116
    40602.771776 [0000]  <idle>                          0.001      0.000      1.892
  <SNIP>

After:

  # perf sched timehist
  <SNIP>
   40602.770537 [0001]  rcuos/2[29]                         7.970      0.002      0.020
   40602.771512 [0003]  <idle>                              0.003      0.000      0.986
   40602.771586 [0001]  <idle>                              0.020      0.000      1.049
   40602.771606 [0001]  qemu-system-x86[3593/3510]          0.000      0.002      0.020
   40602.771629 [0003]  qemu-system-x86[3510]               0.000      0.003      0.116
  <SNIP>

Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: David Ahern <dsahern@gmail.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Link: http://lkml.kernel.org/r/20161222060350.17655-1-namhyung@kernel.org
[ Split from a larger patch ]
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2016-12-22 16:35:45 -03:00
Namhyung Kim
0e6758e823 perf sched timehist: Honour 'comm_width' when aligning the headers
Current default value is 20, but that may change in the future, so make
places where we have 20 hardcoded use 'comm_width'.

Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Cc: David Ahern <dsahern@gmail.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Link: http://lkml.kernel.org/r/20161222060350.17655-1-namhyung@kernel.org
[ Split from a larger patch ]
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2016-12-22 16:35:45 -03:00
Kan Liang
ed6c166cc7 perf diff: Do not overwrite valid build id
Fixes a perf diff regression issue which was introduced by commit
5baecbcd9c ("perf symbols: we can now read separate debug-info files
based on a build ID")

The binary name could be same when perf diff different binaries. Build
id is used to distinguish between them.
However, the previous patch assumes the same binary name has same build
id. So it overwrites the build id according to the binary name,
regardless of whether the build id is set or not.

Check the has_build_id in dso__load. If the build id is already set, use
it.

Before the fix:

  $ perf diff 1.perf.data 2.perf.data
  # Event 'cycles'
  #
  # Baseline    Delta  Shared Object     Symbol
  # ........  .......  ................  .............................
  #
    99.83%  -99.80%  tchain_edit       [.] f2
     0.12%  +99.81%  tchain_edit       [.] f3
     0.02%   -0.01%  [ixgbe]           [k] ixgbe_read_reg

  After the fix:
  $ perf diff 1.perf.data 2.perf.data
  # Event 'cycles'
  #
  # Baseline    Delta  Shared Object     Symbol
  # ........  .......  ................  .............................
  #
    99.83%   +0.10%  tchain_edit       [.] f3
     0.12%   -0.08%  tchain_edit       [.] f2

Signed-off-by: Kan Liang <kan.liang@intel.com>
Cc: Andi Kleen <andi@firstfloor.org>
CC: Dima Kogan <dima@secretsauce.net>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Fixes: 5baecbcd9c ("perf symbols: we can now read separate debug-info files based on a build ID")
Link: http://lkml.kernel.org/r/1481642984-13593-1-git-send-email-kan.liang@intel.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2016-12-20 12:00:38 -03:00
Ravi Bangoria
edee44be59 perf annotate: Don't throw error for zero length symbols
'perf report --tui' exits with error when it finds a sample of zero
length symbol (i.e. addr == sym->start == sym->end). Actually these are
valid samples. Don't exit TUI and show report with such symbols.

Reported-and-Tested-by: Anton Blanchard <anton@samba.org>
Link: https://lkml.org/lkml/2016/10/8/189
Signed-off-by: Ravi Bangoria <ravi.bangoria@linux.vnet.ibm.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Cc: Chris Riyder <chris.ryder@arm.com>
Cc: linuxppc-dev@lists.ozlabs.org
Cc: Masami Hiramatsu <mhiramat@kernel.org>
Cc: Michael Ellerman <mpe@ellerman.id.au>
Cc: Nicholas Piggin <npiggin@gmail.com>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: stable@kernel.org # v4.9+
Link: http://lkml.kernel.org/r/1479804050-5028-1-git-send-email-ravi.bangoria@linux.vnet.ibm.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2016-12-20 12:00:32 -03:00
Davidlohr Bueso
9de3ffa1b7 perf bench futex: Fix lock-pi help string
Obvious copy/paste typo from the requeue program.

Signed-off-by: Davidlohr Bueso <dbueso@suse.de>
Cc: Davidlohr Bueso <dbueso@suse.de>
Link: http://lkml.kernel.org/r/1481830584-30909-1-git-send-email-dave@stgolabs.net
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2016-12-20 09:37:40 -03:00
Jiri Olsa
2bd42f3aaa perf trace: Check if MAP_32BIT is defined (again)
There might be systems where MAP_32BIT is not defined, like some some
RHEL7 powerpc versions.

Signed-off-by: Jiri Olsa <jolsa@kernel.org>
Cc: David Ahern <dsahern@gmail.com>
Cc: Kyle McMartin <kyle@redhat.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Fixes: 256763b017 ("perf trace beauty mmap: Add more conditional defines")
Link: http://lkml.kernel.org/r/1481831814-23683-1-git-send-email-jolsa@kernel.org
[ Changed the Fixme cset to the one removing the conditional switch case for MAP_32BIT ]
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2016-12-20 09:37:40 -03:00
Linus Torvalds
41e0e24b45 Merge branch 'kbuild' of git://git.kernel.org/pub/scm/linux/kernel/git/mmarek/kbuild
Pull kbuild updates from Michal Marek:

 - prototypes for x86 asm-exported symbols (Adam Borowski) and a warning
   about missing CRCs (Nick Piggin)

 - asm-exports fix for LTO (Nicolas Pitre)

 - thin archives improvements (Nick Piggin)

 - linker script fix for CONFIG_LD_DEAD_CODE_DATA_ELIMINATION (Nick
   Piggin)

 - genksyms support for __builtin_va_list keyword

 - misc minor fixes

* 'kbuild' of git://git.kernel.org/pub/scm/linux/kernel/git/mmarek/kbuild:
  x86/kbuild: enable modversions for symbols exported from asm
  kbuild: fix scripts/adjust_autoksyms.sh* for the no modules case
  scripts/kallsyms: remove last remnants of --page-offset option
  make use of make variable CURDIR instead of calling pwd
  kbuild: cmd_export_list: tighten the sed script
  kbuild: minor improvement for thin archives build
  kbuild: modpost warn if export version crc is missing
  kbuild: keep data tables through dead code elimination
  kbuild: improve linker compatibility with lib-ksyms.o build
  genksyms: Regenerate parser
  kbuild/genksyms: handle va_list type
  kbuild: thin archives for multi-y targets
  kbuild: kallsyms allow 3-pass generation if symbols size has changed
2016-12-17 16:24:13 -08:00
Ravi Bangoria
e216874cc1 perf annotate: Fix jump target outside of function address range
If jump target is outside of function range, perf is not handling it
correctly. Especially when target address is lesser than function start
address, target offset will be negative. But, target address declared to
be unsigned, converts negative number into 2's complement. See below
example. Here target of 'jumpq' instruction at 34cf8 is 34ac0 which is
lesser than function start address(34cf0).

        34ac0 - 34cf0 = -0x230 = 0xfffffffffffffdd0

Objdump output:

  0000000000034cf0 <__sigaction>:
  __GI___sigaction():
    34cf0: lea    -0x20(%rdi),%eax
    34cf3: cmp    -bashx1,%eax
    34cf6: jbe    34d00 <__sigaction+0x10>
    34cf8: jmpq   34ac0 <__GI___libc_sigaction>
    34cfd: nopl   (%rax)
    34d00: mov    0x386161(%rip),%rax        # 3bae68 <_DYNAMIC+0x2e8>
    34d07: movl   -bashx16,%fs:(%rax)
    34d0e: mov    -bashxffffffff,%eax
    34d13: retq

perf annotate before applying patch:

  __GI___sigaction  /usr/lib64/libc-2.22.so
           lea    -0x20(%rdi),%eax
           cmp    -bashx1,%eax
        v  jbe    10
        v  jmpq   fffffffffffffdd0
           nop
    10:    mov    _DYNAMIC+0x2e8,%rax
           movl   -bashx16,%fs:(%rax)
           mov    -bashxffffffff,%eax
           retq

perf annotate after applying patch:

  __GI___sigaction  /usr/lib64/libc-2.22.so
           lea    -0x20(%rdi),%eax
           cmp    -bashx1,%eax
        v  jbe    10
        ^  jmpq   34ac0 <__GI___libc_sigaction>
           nop
    10:    mov    _DYNAMIC+0x2e8,%rax
           movl   -bashx16,%fs:(%rax)
           mov    -bashxffffffff,%eax
           retq

Signed-off-by: Ravi Bangoria <ravi.bangoria@linux.vnet.ibm.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Chris Riyder <chris.ryder@arm.com>
Cc: Kim Phillips <kim.phillips@arm.com>
Cc: Markus Trippelsdorf <markus@trippelsdorf.de>
Cc: Masami Hiramatsu <mhiramat@kernel.org>
Cc: Naveen N. Rao <naveen.n.rao@linux.vnet.ibm.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Taeung Song <treeze.taeung@gmail.com>
Cc: linuxppc-dev@lists.ozlabs.org
Link: http://lkml.kernel.org/r/1480953407-7605-3-git-send-email-ravi.bangoria@linux.vnet.ibm.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2016-12-15 16:25:46 -03:00
Ravi Bangoria
3ee2eb6da2 perf annotate: Support jump instruction with target as second operand
Architectures like PowerPC have jump instructions that includes a target
address as a second operand. For example, 'bne cr7,0xc0000000000f6154'.
Add support for such instruction in perf annotate.

objdump o/p:
  c0000000000f6140:   ld     r9,1032(r31)
  c0000000000f6144:   cmpdi  cr7,r9,0
  c0000000000f6148:   bne    cr7,0xc0000000000f6154
  c0000000000f614c:   ld     r9,2312(r30)
  c0000000000f6150:   std    r9,1032(r31)
  c0000000000f6154:   ld     r9,88(r31)

Corresponding perf annotate o/p:

Before patch:
         ld     r9,1032(r31)
         cmpdi  cr7,r9,0
      v  bne    3ffffffffff09f2c
         ld     r9,2312(r30)
         std    r9,1032(r31)
  74:    ld     r9,88(r31)

After patch:
         ld     r9,1032(r31)
         cmpdi  cr7,r9,0
      v  bne    74
         ld     r9,2312(r30)
         std    r9,1032(r31)
  74:    ld     r9,88(r31)

Signed-off-by: Ravi Bangoria <ravi.bangoria@linux.vnet.ibm.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Chris Riyder <chris.ryder@arm.com>
Cc: Kim Phillips <kim.phillips@arm.com>
Cc: Markus Trippelsdorf <markus@trippelsdorf.de>
Cc: Masami Hiramatsu <mhiramat@kernel.org>
Cc: Naveen N. Rao <naveen.n.rao@linux.vnet.ibm.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Taeung Song <treeze.taeung@gmail.com>
Cc: linuxppc-dev@lists.ozlabs.org
Link: http://lkml.kernel.org/r/1480953407-7605-2-git-send-email-ravi.bangoria@linux.vnet.ibm.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2016-12-15 16:25:46 -03:00
Jiri Olsa
23dc4f1586 perf record: Force ignore_missing_thread for uid option
Enable perf_evsel::ignore_missing_thread for -u option to ignore
complete failure if any of the user's processes die between its
enumeration and time we open the event.

Committer notes:

While doing a 'make -j4 allmodconfig' we sometimes get into the race:

Before:

  # perf record -u acme
  Error:
  The sys_perf_event_open() syscall returned with 3 (No such process) for event (cycles:ppp).
  /bin/dmesg may provide additional information.
  No CONFIG_PERF_EVENTS=y kernel support configured?
  #

After:

  [root@jouet ~]# perf record -u acme
  WARNING: Ignored open failure for pid 9888
  WARNING: Ignored open failure for pid 18059
  [root@jouet ~]#

Which is an improvement, with the races not preventing the remaining threads
for the specified user from being monitored, but the message probably needs
further clarification.

Signed-off-by: Jiri Olsa <jolsa@kernel.org>
Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: David Ahern <dsahern@gmail.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Link: http://lkml.kernel.org/r/1481538943-21874-6-git-send-email-jolsa@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2016-12-15 16:25:46 -03:00
Jiri Olsa
a359c17a7e perf evsel: Allow to ignore missing pid
Adding perf_evsel::ignore_missing_cpu_thread bool.

When set true, it allows perf to ignore error of missing pid of perf
event syscall.

We remove missing thread id from the thread_map, so the rest of the
processing like ioctl and mmap won't get disturbed with -1 fd.

The reason for supporting this is to ease up monitoring group of pids,
that 'disappear' before perf opens their event. This currently leads
perf to report error and exit and makes perf record's -u option unusable
under certain setup.

With this change we will allow this race and ignore such failure with
following warning:

  WARNING: Ignored open failure for pid 8605

Signed-off-by: Jiri Olsa <jolsa@kernel.org>
Cc: David Ahern <dsahern@gmail.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Link: http://lkml.kernel.org/r/20161213074622.GA3084@krava
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2016-12-15 16:25:46 -03:00
Jiri Olsa
38af91f01d perf thread_map: Add thread_map__remove function
Add thread_map__remove function to remove thread from thread map.

Add automated test also.

Committer notes:

Testing it:

  # perf test "Remove thread map"
  39: Remove thread map                          : Ok
  # perf test -v "Remove thread map"
  39: Remove thread map                          :
  --- start ---
  test child forked, pid 4483
  2 threads: 4482, 4483
  1 thread: 4483
  0 thread:
  test child finished with 0
  ---- end ----
  Remove thread map: Ok
  #

Signed-off-by: Jiri Olsa <jolsa@kernel.org>
Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: David Ahern <dsahern@gmail.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Link: http://lkml.kernel.org/r/1481538943-21874-4-git-send-email-jolsa@kernel.org
[ Added stdlib.h, to get the free() declaration ]
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2016-12-15 16:25:45 -03:00
Jiri Olsa
83c2e4f396 perf evsel: Use variable instead of repeating lengthy FD macro
It's more readable and will ease up following patches.

Signed-off-by: Jiri Olsa <jolsa@kernel.org>
Cc: David Ahern <dsahern@gmail.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Link: http://lkml.kernel.org/r/1481538943-21874-3-git-send-email-jolsa@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2016-12-15 16:25:45 -03:00
Jiri Olsa
631ac41b46 perf mem: Fix --all-user/--all-kernel options
Removing extra '--' prefix.

Signed-off-by: Jiri Olsa <jolsa@kernel.org>
Cc: David Ahern <dsahern@gmail.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Fixes: ad16511b0e ("perf mem: Add -U/-K (--all-user/--all-kernel) options")
Link: http://lkml.kernel.org/r/1481538943-21874-2-git-send-email-jolsa@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2016-12-15 16:25:45 -03:00
Arnaldo Carvalho de Melo
7e6a79981b perf tools: Remove some needless __maybe_unused
I.e. those parameters/functions _are_ used, so ditch that misleading attribute.

Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: David Ahern <dsahern@gmail.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Wang Nan <wangnan0@huawei.com>
Link: http://lkml.kernel.org/n/tip-13cqtjh0yojg5gzvpq1zzpl0@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2016-12-15 16:25:45 -03:00
Namhyung Kim
ba957ebb54 perf sched timehist: Show callchains for idle stat
When --idle-hist option is used with --summary, it now shows idle stats
with callchains like below:

  Idle stats by callchain:
  CPU  0:   902.195 msec
  Idle time (msec)    Count Callchains
  ----------------  ------- --------------------------------------------------
           370.589       69 futex_wait_queue_me <- futex_wait <- do_futex <- sys_futex <- entry_SYSCALL_64_fastpath
           178.799       17 worker_thread <- kthread <- ret_from_fork
           128.352       17 schedule_timeout <- rcu_gp_kthread <- kthread <- ret_from_fork
           125.111       19 schedule_hrtimeout_range_clock <- schedule_hrtimeout_range <- poll_schedule_timeout <- do_select <- core_sys_select
            71.599       50 schedule_hrtimeout_range_clock <- schedule_hrtimeout_range <- poll_schedule_timeout <- do_sys_poll <- sys_poll
            23.146        1 rcu_gp_kthread <- kthread <- ret_from_fork
             4.510        1 schedule_hrtimeout_range_clock <- schedule_hrtimeout_range <- ep_poll <- sys_epoll_wait <- do_syscall_64
             0.085        1 schedule_hrtimeout_range_clock <- schedule_hrtimeout_range <- poll_schedule_timeout <- do_sys_poll <- do_restart_poll
  ...

Committer notes:

Extra testing:

  # uname -a
  Linux jouet 4.8.8-300.fc25.x86_64 #1 SMP Tue Nov 15 18:10:06 UTC 2016 x86_64 x86_64 x86_64 GNU/Linux

1) Run 'perf sched record -g'

2) Run 'perf sched timehist --idle --summary'

<SNIP>
  Idle stats by callchain:
  CPU  0: 13456.840 msec
  Idle time (msec) Count Callchains
  ---------------- ----- --------------------------------------------------
          5386.637  3283 schedule_hrtimeout_range_clock <- schedule_hrtimeout_range <- poll_schedule_timeout <- do_sys_poll <- sys_poll
          2750.238  2299 futex_wait_queue_me <- futex_wait <- do_futex <- sys_futex <- do_syscall_64
          1275.672  1287 schedule_hrtimeout_range_clock <- schedule_hrtimeout_range <- ep_poll <- sys_epoll_wait <- entry_SYSCALL_64_fastpath
           936.322   452 worker_thread <- kthread <- ret_from_fork
           741.311   385 rcu_nocb_kthread <- kthread <- ret_from_fork
           729.385   248 schedule_hrtimeout_range_clock <- schedule_hrtimeout_range <- poll_schedule_timeout <- do_sys_poll <- sys_ppoll
           365.386   229 irq_thread <- kthread <- ret_from_fork
           338.934   265 futex_wait_queue_me <- futex_wait <- do_futex <- sys_futex <- entry_SYSCALL_64_fastpath
           219.488   201 schedule_timeout <- rcu_gp_kthread <- kthread <- ret_from_fork
           186.839   410 schedule_hrtimeout_range_clock <- schedule_hrtimeout_range <- ep_poll <- sys_epoll_wait <- do_syscall_64
           142.541    59 kvm_vcpu_block <- kvm_arch_vcpu_ioctl_run <- kvm_vcpu_ioctl <- do_vfs_ioctl <- sys_ioctl
            83.887    92 smpboot_thread_fn <- kthread <- ret_from_fork
            62.722    96 do_exit <- do_group_exit <- 0x2a5594 <- entry_SYSCALL_64_fastpath
            47.894    83 pipe_wait <- pipe_read <- __vfs_read <- vfs_read <- sys_read
            46.554    61 rcu_gp_kthread <- kthread <- ret_from_fork
            34.337    21 schedule_timeout <- intel_fbc_work_fn <- process_one_work <- worker_thread <- kthread
            29.521    14 schedule_hrtimeout_range_clock <- schedule_hrtimeout_range <- poll_schedule_timeout <- do_select <- core_sys_select
            20.274    10 schedule_timeout <- io_schedule_timeout <- bit_wait_io <- __wait_on_bit <- out_of_line_wait_on_bit
            15.085    55 schedule_timeout <- unix_stream_read_generic <- unix_stream_recvmsg <- sock_recvmsg <- SYSC_recvfrom
<SNIP>

Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Acked-by: David Ahern <dsahern@gmail.com>
Cc: Andi Kleen <andi@firstfloor.org>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Minchan Kim <minchan@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: http://lkml.kernel.org/r/20161208144755.16673-7-namhyung@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2016-12-15 16:25:45 -03:00
Namhyung Kim
07235f84ec perf sched timehist: Add -I/--idle-hist option
The --idle-hist option is to analyze system idle state so which process
makes cpu to go idle.  If this option is specified, non-idle events will
be skipped and processes switching to/from idle will be shown.

This option is mostly useful when used with --summary(-only) option.  In
the idle-time summary view, idle time is accounted to previous thread
which is run before idle task.

The example output looks like following:

  Idle-time summary
                  comm parent sched-out idle-time min-idle avg-idle max-idle stddev migrations
                                (count)    (msec)   (msec)   (msec)   (msec)      %
  --------------------------------------------------------------------------------------------
        rcu_preempt[7]      2        95   550.872    0.011    5.798   23.146   7.63      0
       migration/1[16]      2         1    15.558   15.558   15.558   15.558   0.00      0
        khugepaged[39]      2         1     3.062    3.062    3.062    3.062   0.00      0
     kworker/0:1H[124]      2         2     4.728    0.611    2.364    4.116  74.12      0
  systemd-journal[167]      1         1     4.510    4.510    4.510    4.510   0.00      0
    kworker/u16:3[558]      2        13    74.737    0.080    5.749   12.960  21.96      0
   irq/34-iwlwifi[628]      2        21   118.403    0.032    5.638   23.990  24.00      0
    kworker/u17:0[673]      2         1     3.523    3.523    3.523    3.523   0.00      0
      dbus-daemon[722]      1         1     6.743    6.743    6.743    6.743   0.00      0
          ifplugd[741]      1         1    58.826   58.826   58.826   58.826   0.00      0
  wpa_supplicant[1490]      1         1    13.302   13.302   13.302   13.302   0.00      0
     wpa_actiond[1492]      1         2     4.064    0.168    2.032    3.896  91.72      0
         dockerd[1500]      1         1     0.055    0.055    0.055    0.055   0.00      0
  ...

Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Acked-by: David Ahern <dsahern@gmail.com>
Cc: Andi Kleen <andi@firstfloor.org>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Minchan Kim <minchan@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: http://lkml.kernel.org/r/20161208144755.16673-6-namhyung@kernel.org
Link: http://lkml.kernel.org/r/20161213080632.19099-2-namhyung@kernel.org
[ Merged fix sent by Namhyumg, as posted in the second Link: tag ]
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2016-12-15 16:25:45 -03:00
Namhyung Kim
a4b2b6f56e perf sched timehist: Skip non-idle events when necessary
Sometimes it only focuses on idle-related events like upcoming idle-hist
feature.  In this case we don't want to see other event to reduce noise.

Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Acked-by: David Ahern <dsahern@gmail.com>
Cc: Andi Kleen <andi@firstfloor.org>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Minchan Kim <minchan@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: http://lkml.kernel.org/r/20161208144755.16673-5-namhyung@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2016-12-15 16:25:44 -03:00
Namhyung Kim
699b5b920d perf sched timehist: Save callchain when entering idle
In order to investigate the idleness reason, it is necessary to keep the
callchains when entering idle.  This can be identified by the
sched:sched_switch event having the next_pid field as 0.

Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Acked-by: David Ahern <dsahern@gmail.com>
Cc: Andi Kleen <andi@firstfloor.org>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Minchan Kim <minchan@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: http://lkml.kernel.org/r/20161208144755.16673-4-namhyung@kernel.org
Link: http://lkml.kernel.org/r/20161213080632.19099-1-namhyung@kernel.org
[ Merged fix from Namhyung, see second Link: tag ]
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2016-12-15 16:25:44 -03:00
Namhyung Kim
3bc2fa9cb8 perf sched timehist: Introduce struct idle_time_data
The struct idle_time_data is to keep idle stats with callchains entering
to the idle task.  The normal thread_runtime calculation is done
transparently since it extends the struct thread_runtime.

Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Acked-by: David Ahern <dsahern@gmail.com>
Cc: Andi Kleen <andi@firstfloor.org>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Minchan Kim <minchan@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: http://lkml.kernel.org/r/20161208144755.16673-3-namhyung@kernel.org
[ Align struct field names ]
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2016-12-15 16:25:44 -03:00
Namhyung Kim
96039c7c52 perf sched timehist: Split is_idle_sample()
The is_idle_sample() function actually does more than determining
whether sample come from idle task.  Split the callchain part into
save_task_callchain() to make it clearer.

Also checking prev_pid from trace data looks preferred than just
checking sample->pid since it's possible, although rare, to have invalid
0 pid/tid on scheduling an exiting task.

Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Acked-by: David Ahern <dsahern@gmail.com>
Cc: Andi Kleen <andi@firstfloor.org>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Minchan Kim <minchan@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: http://lkml.kernel.org/r/20161208144755.16673-2-namhyung@kernel.org
[ Remove some needless () in some return statements ]
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2016-12-15 16:25:44 -03:00
Jiri Olsa
aeafd623f8 perf tools: Move headers check into bash script
To make it nicer and easily maintainable.

Also moving the check into fixdep sub make, so its output is not
scattered around the build output.

Removing extra $$ from mman*.h checks.

Signed-off-by: Jiri Olsa <jolsa@kernel.org>
Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: David Ahern <dsahern@gmail.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Link: http://lkml.kernel.org/r/1481030331-31944-5-git-send-email-jolsa@kernel.org
[ Use /bin/sh, and 'function check() {' -> 'check () {' to make it work with busybox, in Alpine Linux, for instance ]
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2016-12-15 16:25:44 -03:00
Uwe Kleine-König
e19b7cee02 make use of make variable CURDIR instead of calling pwd
make already provides the current working directory in a variable, so make
use of it instead of forking a shell. Also replace usage of PWD by
CURDIR. PWD is provided by most shells, but not all, so this makes the
build system more robust.

Signed-off-by: Uwe Kleine-König <u.kleine-koenig@pengutronix.de>
Signed-off-by: Michal Marek <mmarek@suse.com>
2016-12-11 12:12:56 +01:00
Yannick Brosseau
108a7c103b perf tools: Explicitly document that --children is enabled by default
The fact that the --children option is enabled by default is buried deep
at the end of the help page, in the overhead calculation section. This
make it explicit right where the option is listed, following the same
way other default options are described

Signed-off-by: Yannick Brosseau <scientist@fb.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: kernel-team@fb.com
Link: http://lkml.kernel.org/r/20161202160732.29058-1-scientist@fb.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2016-12-07 12:00:35 -03:00
Namhyung Kim
b336352b41 perf sched timehist: Cleanup idle_max_cpu handling
It treats the idle_max_cpu little bit confusingly IMHO.  Let's make it
more straight forward.

Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Acked-by: David Ahern <dsahern@gmail.com>
Cc: Andi Kleen <andi@firstfloor.org>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Minchan Kim <minchan@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: http://lkml.kernel.org/r/20161206034010.6499-6-namhyung@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2016-12-07 12:00:34 -03:00
Namhyung Kim
5d92d96a94 perf sched timehist: Handle zero sample->tid properly
Sometimes samples have tid of 0 but non-0 pid.  It ends up having a new
thread of 0 tid/pid (instead of referring idle task) since tid is used
to search matching task.  But I guess it's wrong to use 0 as a tid when
pid is set.  This patch uses tid only if it has a non-zero value or same
as pid (of 0).

Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Cc: Andi Kleen <andi@firstfloor.org>
Cc: David Ahern <dsahern@gmail.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Minchan Kim <minchan@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: http://lkml.kernel.org/r/20161206034010.6499-4-namhyung@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2016-12-07 12:00:34 -03:00
Namhyung Kim
571f1eb9b9 perf callchain: Introduce callchain_cursor__copy()
The callchain_cursor__copy() function is to save current callchain
captured by a cursor.  It'll be used to keep callchains when switching
to idle task for each cpu.

Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Cc: Andi Kleen <andi@firstfloor.org>
Cc: David Ahern <dsahern@gmail.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Minchan Kim <minchan@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: http://lkml.kernel.org/r/20161206034010.6499-3-namhyung@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2016-12-07 12:00:33 -03:00
Namhyung Kim
6fa94258ce perf sched: Cleanup option processing
The -D/--dump-raw-trace option is in the parent option so no need to
repeat it.  Also move -f/--force option to parent as it's common to
handle data file.

Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Acked-by: David Ahern <dsahern@gmail.com>
Cc: Andi Kleen <andi@firstfloor.org>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Minchan Kim <minchan@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: http://lkml.kernel.org/r/20161206034010.6499-2-namhyung@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2016-12-07 12:00:33 -03:00
David Ahern
f45bf8d393 perf sched timehist: Improve error message when analyzing wrong file
Arnaldo reported an unhelpful error message when running perf sched
timehist on a file that did not contain sched tracepoints:

    [root@jouet ~]# perf sched timehist
    No trace sample to read. Did you call 'perf record -R'?

    [root@jouet ~]# perf evlist -v
    cycles:ppp: size: 112, { sample_period, sample_freq }: 4000, sample_type: IP|TID|TIME|CALLCHAIN|CPU|PERIOD, disabled: 1, inherit: 1, mmap: 1, comm: 1, freq: 1, task: 1, precise_ip: 3, sample_id_all: 1, exclude_guest: 1, mmap2: 1, comm_exec: 1

Change the has_traces check to look for the sched_switch event. Analysis
for perf sched timehist requires at least this event.

Now when analyzing a file without sched tracepoints you get:

    root@f21-vbox:/tmp$ perf sched timehist
    No sched_switch events found. Have you run 'perf sched record'?

Signed-off-by: David Ahern <dsahern@gmail.com>
Reported-and-Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Acked-by: Namhyung Kim <namhyung@kernel.org>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: http://lkml.kernel.org/r/1480451988-43673-1-git-send-email-dsahern@gmail.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2016-12-07 12:00:32 -03:00
Jiri Olsa
8ac1eb7bab perf tools: Move perf build related variables under non fixdep leg
Because there's no need for them in fixdep build.

Signed-off-by: Jiri Olsa <jolsa@kernel.org>
Cc: David Ahern <dsahern@gmail.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Link: http://lkml.kernel.org/r/1481030331-31944-4-git-send-email-jolsa@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2016-12-06 13:23:03 -03:00
Jiri Olsa
abb26210a3 perf tools: Force fixdep compilation at the start of the build
The fixdep tool needs to be built before everything else, because it fixes
every object dependency file.

We handle this currently by making all objects to depend on fixdep, which is
error prone and is easily forgotten when new object is added.

Instead of this, this patch force fixdep tool to be built as the first target
in the separate make session. This way we don't need to handle extra fixdep
dependencies and we are certain there's no fixdep race with any parallel make
job.

Committer notes:

Testing it:

Before:

  $ rm -rf /tmp/build/perf/ ; mkdir -p /tmp/build/perf ; make -k O=/tmp/build/perf -C tools/perf install-bin
  make: Entering directory '/home/acme/git/linux/tools/perf'
    BUILD:   Doing 'make -j4' parallel build

  Auto-detecting system features:
  ...                         dwarf: [ on  ]
  ...            dwarf_getlocations: [ on  ]
  ...                         glibc: [ on  ]
  ...                          gtk2: [ on  ]
  ...                      libaudit: [ on  ]
  ...                        libbfd: [ on  ]
  ...                        libelf: [ on  ]
  ...                       libnuma: [ on  ]
  ...        numa_num_possible_cpus: [ on  ]
  ...                       libperl: [ on  ]
  ...                     libpython: [ on  ]
  ...                      libslang: [ on  ]
  ...                     libcrypto: [ on  ]
  ...                     libunwind: [ on  ]
  ...            libdw-dwarf-unwind: [ on  ]
  ...                          zlib: [ on  ]
  ...                          lzma: [ on  ]
  ...                     get_cpuid: [ on  ]
  ...                           bpf: [ on  ]

    GEN      /tmp/build/perf/common-cmds.h
    HOSTCC   /tmp/build/perf/fixdep.o
    HOSTLD   /tmp/build/perf/fixdep-in.o
    LINK     /tmp/build/perf/fixdep
    MKDIR    /tmp/build/perf/pmu-events/
    HOSTCC   /tmp/build/perf/pmu-events/json.o
    MKDIR    /tmp/build/perf/pmu-events/
    HOSTCC   /tmp/build/perf/pmu-events/jsmn.o
    HOSTCC   /tmp/build/perf/pmu-events/jevents.o
    HOSTLD   /tmp/build/perf/pmu-events/jevents-in.o
    PERF_VERSION = 4.9.rc8.g868cd5
    CC       /tmp/build/perf/perf-read-vdso32
  <SNIP>

After:

  $ rm -rf /tmp/build/perf/ ; mkdir -p /tmp/build/perf ; make -k O=/tmp/build/perf -C tools/perf install-bin
  make: Entering directory '/home/acme/git/linux/tools/perf'
    BUILD:   Doing 'make -j4' parallel build
    HOSTCC   /tmp/build/perf/fixdep.o
    HOSTLD   /tmp/build/perf/fixdep-in.o
    LINK     /tmp/build/perf/fixdep

  Auto-detecting system features:
  ...                         dwarf: [ on  ]
  ...            dwarf_getlocations: [ on  ]
  ...                         glibc: [ on  ]
  ...                          gtk2: [ on  ]
  ...                      libaudit: [ on  ]
  ...                        libbfd: [ on  ]
  ...                        libelf: [ on  ]
  ...                       libnuma: [ on  ]
  ...        numa_num_possible_cpus: [ on  ]
  ...                       libperl: [ on  ]
  ...                     libpython: [ on  ]
  ...                      libslang: [ on  ]
  ...                     libcrypto: [ on  ]
  ...                     libunwind: [ on  ]
  ...            libdw-dwarf-unwind: [ on  ]
  ...                          zlib: [ on  ]
  ...                          lzma: [ on  ]
  ...                     get_cpuid: [ on  ]
  ...                           bpf: [ on  ]

    GEN      /tmp/build/perf/common-cmds.h
    MKDIR    /tmp/build/perf/fd/
    CC       /tmp/build/perf/fd/array.o
    LD       /tmp/build/perf/fd/libapi-in.o
    MKDIR    /tmp/build/perf/fs/
    CC       /tmp/build/perf/event-parse.o
    CC       /tmp/build/perf/fs/fs.o
    PERF_VERSION = 4.9.rc8.g57a92f
    CC       /tmp/build/perf/event-plugin.o
    MKDIR    /tmp/build/perf/fs/
    CC       /tmp/build/perf/fs/tracing_path.o
  <SNIP>

Signed-off-by: Jiri Olsa <jolsa@kernel.org>
Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: David Ahern <dsahern@gmail.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Link: http://lkml.kernel.org/r/1481030331-31944-3-git-send-email-jolsa@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2016-12-06 13:23:02 -03:00