Commit Graph

285 Commits

Author SHA1 Message Date
Namhyung Kim
740b97f950 perf report: Show progress bar for output resorting
Sometimes it takes a long time to resort hist entries for output in case
of a large data file.  Show a progress bar window and inform the user.

Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: David Ahern <dsahern@gmail.com>
Cc: Ingo Molnar <mingo@kernel.org>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Stephane Eranian <eranian@google.com>
Link: http://lkml.kernel.org/r/1419223455-4362-3-git-send-email-namhyung@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2014-12-23 12:01:37 -03:00
Arnaldo Carvalho de Melo
a635fc511e perf tools: Remove hists from evsel
Now tools that want to have a hists per evsel need to call
hists__init() before creating any evsels, which can be as early as when
parsing the command line, so do it before calling parse_options().

The current tools using hists/hist_entries are report, top and annotate,
change them to request per evsel hists.

This is in preparation for making evsels usable by 3rd party tools that
do not necessarily live in perf's source code repository.

Acked-by: Borislav Petkov <bp@suse.de>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Borislav Petkov <bp@suse.de>
Cc: David Ahern <dsahern@gmail.com>
Cc: Don Zickus <dzickus@redhat.com>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Jean Pihet <jean.pihet@linaro.org>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Mike Galbraith <efault@gmx.de>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Stephane Eranian <eranian@google.com>
Link: http://lkml.kernel.org/n/tip-usjx2la743f10ippj7p1b20x@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2014-10-14 17:32:52 -03:00
Arnaldo Carvalho de Melo
2a1731fb85 perf session: Remove last reference to hists struct
Now perf_session doesn't require that the evsels in its evlist are hists
containing ones.

Tools that are hists based and want to do per evsel events_stats
updates, if at some point this turns into a necessity, should do it in
the tool specific code, keeping the session class hists agnostic.

Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Borislav Petkov <bp@suse.de>
Cc: David Ahern <dsahern@gmail.com>
Cc: Don Zickus <dzickus@redhat.com>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Jean Pihet <jean.pihet@linaro.org>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Mike Galbraith <efault@gmx.de>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Stephane Eranian <eranian@google.com>
Link: http://lkml.kernel.org/n/tip-cli1bgwpo82mdikuhy3djsuy@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2014-10-14 11:41:25 -03:00
Arnaldo Carvalho de Melo
4ea062ed43 perf evsel: Add hists helper
Not all tools need a hists instance per perf_evsel, so let's pave the way
to remove evsel->hists while leaving a way to access the hists from a
specially allocated evsel, one that comes with space at the end where
the hists lives.
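
A minimal sketch of the "specially allocated evsel" idea, assuming simplified stand-in structs (the real perf structs carry many more fields). The allocator name hists_evsel__new() is hypothetical and only illustrates how a helper can reach the hists that sits right behind the evsel:

```c
#include <assert.h>
#include <stdlib.h>

/* Simplified stand-ins for illustration only. */
struct hists { unsigned long nr_entries; };
struct evsel { int idx; };

/* An evsel with room for a hists instance right after it. */
struct hists_evsel {
	struct evsel evsel;	/* must stay the first member */
	struct hists hists;
};

/* Hypothetical allocator: allocate the larger object, hand out the evsel. */
static struct evsel *hists_evsel__new(int idx)
{
	struct hists_evsel *hevsel = calloc(1, sizeof(*hevsel));

	if (hevsel == NULL)
		return NULL;
	hevsel->evsel.idx = idx;
	return &hevsel->evsel;
}

/* Only valid for evsels that were allocated as a struct hists_evsel. */
static struct hists *evsel__hists(struct evsel *evsel)
{
	struct hists_evsel *hevsel = (struct hists_evsel *)evsel;

	assert(evsel != NULL);
	return &hevsel->hists;
}
```

Tools that never need hists can keep allocating a plain evsel and simply never call the helper.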

Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Borislav Petkov <bp@suse.de>
Cc: David Ahern <dsahern@gmail.com>
Cc: Don Zickus <dzickus@redhat.com>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Jean Pihet <jean.pihet@linaro.org>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Mike Galbraith <efault@gmx.de>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Stephane Eranian <eranian@google.com>
Link: http://lkml.kernel.org/n/tip-qlktkhe31w4mgtbd84035sr2@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2014-10-09 13:13:41 -03:00
Jiri Olsa
23aadb1fcd perf callchain: Move callchain_param to util object in to fix python test
In the following commit we changed the location of callchain data:

  72a128aa08
  perf tools: Move callchain config from record_opts to callchain_param

Now all callchain stuff stays in the callchain_param struct, which adds a
dependency on it to the evsel.c object and breaks python perf.so usage
(unresolved callchain_param).

Moving callchain_param into callchain.c and adding it to
python-ext-sources unleashes yet another dependency hell, so I ended up
adding callchain_param into util.c for now.

Signed-off-by: Jiri Olsa <jolsa@kernel.org>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Corey Ashford <cjashfor@linux.vnet.ibm.com>
Cc: David Ahern <dsahern@gmail.com>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Ingo Molnar <mingo@kernel.org>
Cc: Milian Wolff <mail@milianw.de>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Link: http://lkml.kernel.org/r/1412179229-19466-2-git-send-email-jolsa@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2014-10-03 09:39:48 -03:00
Namhyung Kim
701937bd59 perf top: Fix -z option behavior
The current -z option does almost nothing.  It doesn't zero the existing
samples, so we can still see profiles of exited processes after the last
refresh.  It seems it only affects annotation.

This patch clears existing entries before processing if the -z option is
given.  For this, the original decaying logic is also moved before processing.

Reported-by: Stephane Eranian <eranian@google.com>
Tested-by: Stephane Eranian <eranian@google.com>
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Cc: David Ahern <dsahern@gmail.com>
Cc: Ingo Molnar <mingo@kernel.org>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Namhyung Kim <namhyung.kim@lge.com>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Stephane Eranian <eranian@google.com>
Link: http://lkml.kernel.org/r/1407831366-28892-1-git-send-email-namhyung@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2014-08-13 17:28:07 -03:00
Don Zickus
9b32ba71ba perf tools: Add dcacheline sort
In perf's 'mem-mode', one can get access to a whole bunch of details specific to a
particular sample instruction.  A bunch of those details relate to the data
address.

One interesting thing you can do with data addresses is to convert them into the unique
cacheline they belong to.  Organizing these data cachelines into similar groups and sorting
them can reveal cache contention.

This patch creates an algorithm based on various sample details that can help group
entries together into data cachelines and allows 'perf report' to sort on it.

The algorithm relies on having proper mmap2 support in the kernel to help determine
if the memory map the data address belongs to is private to a pid or globally shared.

The algorithm is as follows:

o group cpumodes together
o group entries with discovered maps together
o sort on major, minor, inode and inode generation numbers
o if userspace anon, then sort on pid
o sort on cachelines based on data addresses

The 'dcacheline' sort option in 'perf report' only works in 'mem-mode'.

Sample output:

 #
 # Samples: 206  of event 'cpu/mem-loads/pp'
 # Total weight : 2534
 # Sort order   : dcacheline,pid
 #
 # Overhead       Samples                                                          Data Cacheline       Command:  Pid
 # ........  ............  ......................................................................  ..................
 #
    13.22%             1  [k] 0xffff88042f08ebc0                                                       swapper:    0
     9.27%             1  [k] 0xffff88082e8cea80                                                       swapper:    0
     3.59%             2  [k] 0xffffffff819ba180                                                       swapper:    0
     0.32%             1  [k] arch_trigger_all_cpu_backtrace_handler_na.23901+0xffffffffffffffe0       swapper:    0
     0.32%             1  [k] timekeeper_seq+0xfffffffffffffff8                                        swapper:    0

Note:  Added a '+1' to symlen size in hists__calc_col_len to prevent the next column
from prematurely tabbing over and mis-aligning.  Not sure what the problem is.

Signed-off-by: Don Zickus <dzickus@redhat.com>
Link: http://lkml.kernel.org/r/1401208087-181977-8-git-send-email-dzickus@redhat.com
Signed-off-by: Jiri Olsa <jolsa@kernel.org>
2014-06-09 13:34:49 +02:00
Don Zickus
7365be55ee perf tools: Add cpumode to struct hist_entry
The next patch needs to sort on cpumode, so add it to hist_entry to be tracked.

Signed-off-by: Don Zickus <dzickus@redhat.com>
Link: http://lkml.kernel.org/r/1401208087-181977-6-git-send-email-dzickus@redhat.com
Signed-off-by: Jiri Olsa <jolsa@kernel.org>
2014-06-09 13:34:48 +02:00
Namhyung Kim
9d3c02d718 perf tools: Add callback function to hist_entry_iter
The new ->add_entry_cb() will be called after an entry was added to
the histogram.  It's used for code sharing between perf report and
perf top.  Note that ops->add_*_entry() should set iter->he properly
in order to call the ->add_entry_cb.

Also pass @arg to the callback function.  It'll be used by perf top
later.

Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Tested-by: Arun Sharma <asharma@fb.com>
Tested-by: Rodrigo Campos <rodrigo@sdfg.com.ar>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Link: http://lkml.kernel.org/r/87k393g999.fsf@sejong.aot.lge.com
Signed-off-by: Jiri Olsa <jolsa@kernel.org>
2014-06-01 14:35:05 +02:00
Namhyung Kim
be7f855a3e perf tools: Save callchain info for each cumulative entry
When accumulating a callchain entry, also save the current snapshot of the
chain so that it can show the rest of the chain.

Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Tested-by: Arun Sharma <asharma@fb.com>
Tested-by: Rodrigo Campos <rodrigo@sdfg.com.ar>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Link: http://lkml.kernel.org/r/1401335910-16832-10-git-send-email-namhyung@kernel.org
Signed-off-by: Jiri Olsa <jolsa@kernel.org>
2014-06-01 14:35:00 +02:00
Namhyung Kim
b4d3c8bd86 perf report: Cache cumulative callchains
It is possible that a callchain has cycles or recursive calls.  In that
case it'll end up having entries with more than 100% overhead in the
output.  In order to prevent such entries, cache each callchain node
and skip it if the same entry was already accumulated.
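
A rough sketch of the caching idea, assuming callchain nodes are reduced to small integer symbol ids (the real code caches hist entries).  The names struct acc and accumulate_chain() are made up for this example:

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define MAX_SYMS 16	/* assumed upper bound on symbol ids */

/* Accumulated period per symbol id. */
struct acc { uint64_t period[MAX_SYMS]; };

/*
 * Credit 'period' to every distinct symbol in chain[0..n).
 * Returns the number of symbols that were credited.
 */
static size_t accumulate_chain(struct acc *acc, const int *chain, size_t n,
			       uint64_t period)
{
	int seen[MAX_SYMS];	/* per-sample cache of credited symbols */
	size_t nr_seen = 0, i, j;

	for (i = 0; i < n; i++) {
		int sym = chain[i];

		assert(sym >= 0 && sym < MAX_SYMS);

		for (j = 0; j < nr_seen; j++)
			if (seen[j] == sym)
				break;
		if (j < nr_seen)
			continue;	/* already accumulated for this sample */

		seen[nr_seen++] = sym;
		acc->period[sym] += period;
	}
	return nr_seen;
}
```

With the cache in place a recursive function is credited once per sample, so its accumulated overhead cannot exceed 100%.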

Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Tested-by: Arun Sharma <asharma@fb.com>
Tested-by: Rodrigo Campos <rodrigo@sdfg.com.ar>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Link: http://lkml.kernel.org/r/1401335910-16832-8-git-send-email-namhyung@kernel.org
Signed-off-by: Jiri Olsa <jolsa@kernel.org>
2014-06-01 14:34:58 +02:00
Namhyung Kim
c7405d85d7 perf tools: Update cpumode for each cumulative entry
The cpumode and level in struct addr_location were set for a sample
but not updated as cumulative callchains were added.  This led to
non-matching symbol and cpumode in the output.

Update it accordingly based on the fact whether the map is a part of
the kernel or not.  This is a reverse of what thread__find_addr_map()
does.

Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Tested-by: Arun Sharma <asharma@fb.com>
Tested-by: Rodrigo Campos <rodrigo@sdfg.com.ar>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Link: http://lkml.kernel.org/r/1401335910-16832-7-git-send-email-namhyung@kernel.org
Signed-off-by: Jiri Olsa <jolsa@kernel.org>
2014-06-01 14:34:58 +02:00
Namhyung Kim
7a13aa28aa perf hists: Accumulate hist entry stat based on the callchain
Call __hists__add_entry() for each callchain node to get an
accumulated stat for an entry.  Introduce new cumulative_iter ops to
process them properly.

Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Tested-by: Arun Sharma <asharma@fb.com>
Tested-by: Rodrigo Campos <rodrigo@sdfg.com.ar>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Link: http://lkml.kernel.org/r/1401335910-16832-6-git-send-email-namhyung@kernel.org
Signed-off-by: Jiri Olsa <jolsa@kernel.org>
2014-06-01 14:34:57 +02:00
Namhyung Kim
a0b51af367 perf hists: Check if accumulated when adding a hist entry
To support callchain accumulation, add_hist_entry() needs to know
whether @entry is accumulated or not when it is called.  The period of an
accumulated entry should be added to ->stat_acc but not ->stat.  Add a
@sample_self arg for that.
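
The split between ->stat and ->stat_acc might look roughly like the following.  This is a simplified sketch: the structs are cut down to two counters and hist_entry__account() is a hypothetical name, not the function touched by the patch.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Cut-down stand-ins for illustration. */
struct he_stat {
	uint64_t period;
	uint32_t nr_events;
};

struct hist_entry {
	struct he_stat stat;		/* self overhead */
	struct he_stat stat_acc;	/* self plus accumulated from children */
};

static void he_stat__add(struct he_stat *stat, uint64_t period)
{
	stat->period += period;
	stat->nr_events++;
}

/* Hypothetical helper: sample_self is false for accumulated entries. */
static void hist_entry__account(struct hist_entry *he, uint64_t period,
				bool sample_self)
{
	assert(he != NULL);

	if (sample_self)
		he_stat__add(&he->stat, period);
	he_stat__add(&he->stat_acc, period);
}
```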

Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Tested-by: Arun Sharma <asharma@fb.com>
Tested-by: Rodrigo Campos <rodrigo@sdfg.com.ar>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Link: http://lkml.kernel.org/r/1401335910-16832-5-git-send-email-namhyung@kernel.org
Signed-off-by: Jiri Olsa <jolsa@kernel.org>
2014-06-01 14:34:56 +02:00
Namhyung Kim
f8be1c8c48 perf hists: Add support for accumulated stat of hist entry
Maintain accumulated stat information in hist_entry->stat_acc if
symbol_conf.cumulate_callchain is set.  Fields in ->stat_acc have the same
values initially, and will be updated as the callchain is processed later.

Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Tested-by: Arun Sharma <asharma@fb.com>
Tested-by: Rodrigo Campos <rodrigo@sdfg.com.ar>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Link: http://lkml.kernel.org/r/1401335910-16832-4-git-send-email-namhyung@kernel.org
Signed-off-by: Jiri Olsa <jolsa@kernel.org>
2014-06-01 14:34:56 +02:00
Namhyung Kim
69bcb019fc perf tools: Introduce struct hist_entry_iter
There is some duplicate code when adding hist entries.  The copies
differ in that some have branch info or mem info, but generally they do
the same thing.  So introduce a new struct hist_entry_iter and add callbacks
to customize each case in a general way.

The new perf_evsel__add_entry() function will look like:

  iter->prepare_entry();
  iter->add_single_entry();

  while (iter->next_entry())
    iter->add_next_entry();

  iter->finish_entry();

This will help further work like the cumulative callchain patchset.
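
For illustration, the callback flow above can be written out as a small self-contained driver.  The names below (struct entry_iter, entry_iter__add() and the demo_* callbacks) are assumptions made for this sketch and do not match the real perf signatures:

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

struct entry_iter;

/* One set of callbacks per kind of sample (normal, branch, mem, ...). */
struct entry_iter_ops {
	int  (*prepare_entry)(struct entry_iter *iter);
	int  (*add_single_entry)(struct entry_iter *iter);
	bool (*next_entry)(struct entry_iter *iter);
	int  (*add_next_entry)(struct entry_iter *iter);
	int  (*finish_entry)(struct entry_iter *iter);
};

struct entry_iter {
	const struct entry_iter_ops *ops;
	int total;	/* number of extra entries to add */
	int curr;	/* extra entries added so far */
	int added;	/* all entries added for this sample */
};

/* Generic driver that follows the flow from the commit message. */
static int entry_iter__add(struct entry_iter *iter)
{
	int err, err2;

	assert(iter != NULL && iter->ops != NULL);

	err = iter->ops->prepare_entry(iter);
	if (err)
		goto out;

	err = iter->ops->add_single_entry(iter);
	if (err)
		goto out;

	while (iter->ops->next_entry(iter)) {
		err = iter->ops->add_next_entry(iter);
		if (err)
			break;
	}
out:
	/* always give the ops a chance to clean up */
	err2 = iter->ops->finish_entry(iter);
	return err ? err : err2;
}

/* Demo callbacks: one single entry plus 'total' follow-up entries. */
static int demo_prepare(struct entry_iter *iter)
{
	iter->curr = 0;
	iter->added = 0;
	return 0;
}

static int demo_add_single(struct entry_iter *iter)
{
	iter->added++;
	return 0;
}

static bool demo_next(struct entry_iter *iter)
{
	return iter->curr < iter->total;
}

static int demo_add_next(struct entry_iter *iter)
{
	iter->curr++;
	iter->added++;
	return 0;
}

static int demo_finish(struct entry_iter *iter)
{
	(void)iter;
	return 0;
}

static const struct entry_iter_ops demo_ops = {
	.prepare_entry    = demo_prepare,
	.add_single_entry = demo_add_single,
	.next_entry       = demo_next,
	.add_next_entry   = demo_add_next,
	.finish_entry     = demo_finish,
};
```

Each kind of sample only has to supply its own ops table while the driver stays shared.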

Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Tested-by: Arun Sharma <asharma@fb.com>
Tested-by: Rodrigo Campos <rodrigo@sdfg.com.ar>
Cc: David Ahern <dsahern@gmail.com>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Stephane Eranian <eranian@google.com>
Link: http://lkml.kernel.org/r/1401335910-16832-3-git-send-email-namhyung@kernel.org
Signed-off-by: Jiri Olsa <jolsa@kernel.org>
2014-06-01 14:34:55 +02:00
Namhyung Kim
1844dbcbe7 perf tools: Introduce hists__inc_nr_samples()
There is some duplicate code for counting the number of samples.  Add
hists__inc_nr_samples() and reuse it.

Suggested-by: Jiri Olsa <jolsa@kernel.org>
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Link: http://lkml.kernel.org/r/1401335910-16832-2-git-send-email-namhyung@kernel.org
Signed-off-by: Jiri Olsa <jolsa@kernel.org>
2014-06-01 14:34:55 +02:00
Namhyung Kim
e67d49a72d perf tools: Skip elided sort entries
When sort entries were converted to hpp formats, se->elide handling
was missed, so add it for compatibility.

Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Link: http://lkml.kernel.org/r/1400480762-22852-16-git-send-email-namhyung@kernel.org
Signed-off-by: Jiri Olsa <jolsa@kernel.org>
2014-05-21 11:45:36 +02:00
Namhyung Kim
26d8b33827 perf tools: Consolidate output field handling to hpp format routines
Until now the hpp and sort functions did similar jobs in different ways.
Since the sort functions are now converted/wrapped to hpp formats, the
job can be done in a uniform way.

The perf_hpp__sort_list has a list of hpp formats to sort entries and
the perf_hpp__list has a list of hpp formats to print the output result.

For backward compatibility, it automatically adds the 'overhead'
field in front of the sort list.  Then all fields in the sort list are
added to the output list (if they are not already there).

Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Acked-by: Ingo Molnar <mingo@kernel.org>
Link: http://lkml.kernel.org/n/tip-7g3h86woz2sckg3h1lj42ygj@git.kernel.org
Signed-off-by: Jiri Olsa <jolsa@kernel.org>
2014-05-21 11:45:34 +02:00
Namhyung Kim
043ca389a3 perf tools: Use hpp formats to sort final output
Convert output sorting function to use ->sort hpp functions.

Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Acked-by: Ingo Molnar <mingo@kernel.org>
Link: http://lkml.kernel.org/r/1400480762-22852-6-git-send-email-namhyung@kernel.org
Signed-off-by: Jiri Olsa <jolsa@kernel.org>
2014-05-21 11:45:34 +02:00
Namhyung Kim
093f0ef34c perf tools: Use hpp formats to sort hist entries
Sort entries are now wrapped as hpp functions, so use the hpp sort list
to sort entries.

Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Acked-by: Ingo Molnar <mingo@kernel.org>
Link: http://lkml.kernel.org/r/1400480762-22852-4-git-send-email-namhyung@kernel.org
Signed-off-by: Jiri Olsa <jolsa@kernel.org>
2014-05-21 11:45:34 +02:00
Namhyung Kim
3186b6815d perf hists: Add missing update on filtered stats in hists__decay_entries()
When a filter is used for perf top, its hists->nr_non_filtered_entries
was not updated after it removed an entry in hists__decay_entries().
hists->stats.total_non_filtered_period was missed too.

Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Link: http://lkml.kernel.org/r/1398327843-31845-8-git-send-email-namhyung@kernel.org
Signed-off-by: Jiri Olsa <jolsa@kernel.org>
2014-04-24 16:32:44 +02:00
Namhyung Kim
820bc81f4c perf tools: Account entry stats when it's added to the output tree
Currently, accounting each sample is done in multiple places - once
when adding them to the input tree, and again when adding them to the
output tree.  It's not only confusing but can also cause a subtle
problem since concurrent processing like in perf top might see the
updated stats before adding entries into the output tree - like seeing
more (blank) lines at the end and/or slightly inaccurate percentages.

To fix this, only account the entries when they are moved into the output
tree so that they cannot be seen prematurely.  There are some
exceptional cases here and there - they should be addressed separately
with comments.

Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Link: http://lkml.kernel.org/r/1398327843-31845-7-git-send-email-namhyung@kernel.org
Signed-off-by: Jiri Olsa <jolsa@kernel.org>
2014-04-24 16:32:15 +02:00
Namhyung Kim
87e90f4328 perf hists: Collapse expanded callchains after filter is applied
When a filter is applied, a hist entry checks whether its callchain was
folded and accounts it to the output stat.  But this is rather hacky
and TUI-specific.  Simply folding the callchains for the entry looks
like a simpler and more generic solution IMHO.

Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Link: http://lkml.kernel.org/r/1398327843-31845-6-git-send-email-namhyung@kernel.org
Signed-off-by: Jiri Olsa <jolsa@kernel.org>
2014-04-24 16:31:50 +02:00
Namhyung Kim
9283ba9bd7 perf hists: Add a couple of hists stat helper functions
Add hists__{reset,inc}_[filter_]stats() functions to cleanup accesses
to hist stats (for output).  Note that number of samples in the stat
is not handled here since it belongs to the input stage.

Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Link: http://lkml.kernel.org/r/1398327843-31845-5-git-send-email-namhyung@kernel.org
Signed-off-by: Jiri Olsa <jolsa@kernel.org>
2014-04-24 16:31:25 +02:00
Namhyung Kim
ae993efc9c perf hists: Move column length calculation out of hists__inc_stats()
It's not part of the logic of hists__inc_stats() so it'd be better to
move it out of the function.

Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Link: http://lkml.kernel.org/r/1398327843-31845-4-git-send-email-namhyung@kernel.org
Signed-off-by: Jiri Olsa <jolsa@kernel.org>
2014-04-24 16:30:58 +02:00
Namhyung Kim
6263835a1b perf hists: Rename hists__inc_stats()
The existing hists__inc_nr_entries() is a misnomer as it's not only
increasing ->nr_entries but also other stats.  So rename it to the more
general hists__inc_stats().

Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Link: http://lkml.kernel.org/r/1398327843-31845-3-git-send-email-namhyung@kernel.org
Signed-off-by: Jiri Olsa <jolsa@kernel.org>
2014-04-24 16:30:30 +02:00
Namhyung Kim
0b93da1756 perf tools: Add hist.percentage config option
Add the hist.percentage option for setting the default value of
symbol_conf.filter_relative.  It affects the output of various perf
commands (like perf report, top and diff) only if filter(s) are applied.

A user can write a .perfconfig file like below to show the absolute
percentage of filtered entries by default:

  $ cat ~/.perfconfig
  [hist]
  percentage = absolute

And it can be changed through command line:

  $ perf report --percentage relative

Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Link: http://lkml.kernel.org/r/1397145720-8063-6-git-send-email-namhyung@kernel.org
Signed-off-by: Jiri Olsa <jolsa@redhat.com>
2014-04-16 17:16:04 +02:00
Namhyung Kim
33db4568e1 perf top: Add --percentage option
The --percentage option controls how the overhead percentage is
displayed.  It accepts either "relative" or "absolute".
Move the parser callback function into a common location since it's
used by multiple commands now.

For more information, please see the previous commit, which did the same
thing for "perf report".

Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Link: http://lkml.kernel.org/r/1397145720-8063-4-git-send-email-namhyung@kernel.org
Signed-off-by: Jiri Olsa <jolsa@redhat.com>
2014-04-16 17:16:03 +02:00
Namhyung Kim
f214833054 perf report: Add --percentage option
The --percentage option controls how the overhead percentage is
displayed.  It accepts either "relative" or "absolute".

"relative" means it's relative to filtered entries only so that the
sum of shown entries will always be 100%.  "absolute" means it retains
the original value before and after the filter is applied.

  $ perf report -s comm
  # Overhead       Command
  # ........  ............
  #
      74.19%           cc1
       7.61%           gcc
       6.11%            as
       4.35%            sh
       4.14%          make
       1.13%        fixdep
  ...

  $ perf report -s comm -c cc1,gcc --percentage absolute
  # Overhead       Command
  # ........  ............
  #
      74.19%           cc1
       7.61%           gcc

  $ perf report -s comm -c cc1,gcc --percentage relative
  # Overhead       Command
  # ........  ............
  #
      90.69%           cc1
       9.31%           gcc

Note that it has zero effect if no filter was applied.
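
The difference between the two modes comes down to which total the period is divided by.  A minimal sketch, assuming a cut-down totals struct and a hypothetical entry_percent() helper (the field names follow hists->stats but the struct itself is made up):

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Assumed subset of the per-hists totals. */
struct hists_totals {
	uint64_t total_period;			/* all entries */
	uint64_t total_non_filtered_period;	/* entries that pass the filter */
};

/* Hypothetical helper: percentage of one entry in the chosen mode. */
static double entry_percent(const struct hists_totals *t, uint64_t period,
			    bool relative)
{
	uint64_t total;

	assert(t != NULL);

	total = relative ? t->total_non_filtered_period : t->total_period;
	return total ? 100.0 * (double)period / (double)total : 0.0;
}
```

When no filter is applied both totals are equal, so the two modes give the same result.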

Suggested-by: Arnaldo Carvalho de Melo <acme@ghostprotocols.net>
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Link: http://lkml.kernel.org/r/1397145720-8063-3-git-send-email-namhyung@kernel.org
Signed-off-by: Jiri Olsa <jolsa@redhat.com>
2014-04-16 17:16:03 +02:00
Namhyung Kim
1ab1fa5dfb perf hists: Add support for showing relative percentage
When filtering by thread, dso or symbol in the TUI, it also updates the total
period so that the output shows a different result than with no filter - the
percentage changes to be relative to filtered entries only.  Sometimes
this is not desired since users might expect the same results with a filter.

So add new filtered_* fields to hists->stats to count them separately.
They'll be controlled/used by the user later.

Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Link: http://lkml.kernel.org/r/1397145720-8063-2-git-send-email-namhyung@kernel.org
Signed-off-by: Jiri Olsa <jolsa@redhat.com>
2014-04-16 17:16:03 +02:00
Namhyung Kim
2c86c7ca76 perf report: Merge al->filtered with hist_entry->filtered
I.e. don't drop al->filtered entries, create the hist_entries and use
its ->filtered bitmap, that is kept with the same semantics for its
bitmap, leaving the filtering to be done at the hist_entry level, i.e.
in the UIs.

This will allow zooming in/out the filters.

Signed-off-by: Namhyung Kim <namhyung.kim@lge.com>
Acked-by: Jiri Olsa <jolsa@redhat.com>
Cc: Andi Kleen <andi@firstfloor.org>
Cc: David Ahern <dsahern@gmail.com>
Cc: Ingo Molnar <mingo@kernel.org>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Namhyung Kim <namhyung.kim@lge.com>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Link: http://lkml.kernel.org/n/tip-xeyhkepu7plw716lrtb0zlnu@git.kernel.org
[ yanked this out of a previous patch ]
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2014-03-18 18:16:59 -03:00
Namhyung Kim
b3cef7f60f perf symbols: Record the reason for filtering an address_location
By turning the addr_location->filtered member from a boolean to a u8
bitmap, reusing (and extending) the hist_filter enum for that.

This patch doesn't change the logic at all, as it keeps the meaning of
al->filtered != 0 to mean that the entry _was_ filtered, so no change in
how this value is interpreted needs to be done at this point.

This will soon be used in upcoming patches.
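
A small sketch of the boolean-to-bitmap change, assuming a reduced set of filter reasons.  The enumerator names are modeled on the hist_filter enum but are an assumption here, as are struct al_sketch and the al__* helpers:

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Assumed subset of filter reasons; each value is a bit position. */
enum hist_filter {
	HIST_FILTER__DSO,
	HIST_FILTER__THREAD,
	HIST_FILTER__SYMBOL,
};

/* Stand-in for the one addr_location member that matters here. */
struct al_sketch {
	uint8_t filtered;	/* one bit per filter reason */
};

static void al__set_filter(struct al_sketch *al, enum hist_filter f)
{
	assert(al != NULL);
	al->filtered |= (uint8_t)(1u << f);
}

static void al__clear_filter(struct al_sketch *al, enum hist_filter f)
{
	assert(al != NULL);
	al->filtered &= (uint8_t)~(1u << f);
}

/* Same meaning as the old boolean: non-zero means "was filtered". */
static bool al__is_filtered(const struct al_sketch *al)
{
	return al->filtered != 0;
}

static bool al__filtered_by(const struct al_sketch *al, enum hist_filter f)
{
	return (al->filtered & (1u << f)) != 0;
}
```

Existing callers that only test for non-zero keep working, while new code can ask why an entry was filtered.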

Signed-off-by: Namhyung Kim <namhyung.kim@lge.com>
Acked-by: Jiri Olsa <jolsa@redhat.com>
Cc: Andi Kleen <andi@firstfloor.org>
Cc: David Ahern <dsahern@gmail.com>
Cc: Ingo Molnar <mingo@kernel.org>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Namhyung Kim <namhyung.kim@lge.com>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Link: http://lkml.kernel.org/n/tip-89hmfgtr9t22sky1lyg7nw7l@git.kernel.org
[ yanked this out of a previous patch ]
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2014-03-18 18:16:57 -03:00
Arnaldo Carvalho de Melo
644f2df29f perf tools: Shorten sample symbol resolving function signature
Since two of the parameters come from the same 'struct
addr_location', rename machine__resolve_bstack() to sample__resolve_bstack()
and pass that addr_location instead.

This is also for consistency with the same change that resulted in the
sample__resolve_mem() function.

Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: David Ahern <dsahern@gmail.com>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Mike Galbraith <efault@gmx.de>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Stephane Eranian <eranian@google.com>
Link: http://lkml.kernel.org/n/tip-99ecqt8jiyyksiyx3se7l5ia@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2014-02-18 09:34:46 -03:00
Arnaldo Carvalho de Melo
e80faac046 perf tools: Shorten sample symbol resolving function signature
Since three of the parameters come from the same 'struct addr_location',
rename machine__resolve_mem() to sample__resolve_mem() and pass
that addr_location instead.

Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: David Ahern <dsahern@gmail.com>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Mike Galbraith <efault@gmx.de>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Stephane Eranian <eranian@google.com>
Link: http://lkml.kernel.org/n/tip-3f5otpssefh9l5hi1t259h8n@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2014-02-18 09:34:46 -03:00
Namhyung Kim
f39056f9c3 perf hists: Convert hist entry functions to use struct he_stat
The hist_entry__add_cpumode_period() and hist_entry__decay() functions
are dealing with hist_entry's stat fields only.

Make them he_stat methods then.

Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Acked-by: Jiri Olsa <jolsa@redhat.com>
Cc: Andi Kleen <andi@firstfloor.org>
Cc: Arun Sharma <asharma@fb.com>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Ingo Molnar <mingo@kernel.org>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Namhyung Kim <namhyung.kim@lge.com>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Rodrigo Campos <rodrigo@sdfg.com.ar>
Link: http://lkml.kernel.org/r/1389677157-30513-5-git-send-email-namhyung@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2014-01-15 15:34:00 -03:00
Arnaldo Carvalho de Melo
74cf249d5c perf tools: Use zfree to help detect use after free bugs
Several areas already used this technique, so do some audit to
consistently use it elsewhere.
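
The technique is to free a pointer and reset it to NULL in one step, so a later use faults on NULL instead of silently touching freed memory.  A portable sketch follows; the macro body here is an assumption and may differ from the definition in the perf tree, and struct named is invented for the example:

```c
#include <assert.h>
#include <stdlib.h>

/* Free what *ptr points to and reset it, so stale uses hit NULL. */
#define zfree(ptr) do { free(*(ptr)); *(ptr) = NULL; } while (0)

/* Hypothetical object owning one allocated member. */
struct named {
	char *name;
};

static void named__exit(struct named *n)
{
	assert(n != NULL);
	zfree(&n->name);	/* n->name is NULL afterwards */
}
```

Calling named__exit() twice is harmless because free(NULL) is a no-op.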

Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: David Ahern <dsahern@gmail.com>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Mike Galbraith <efault@gmx.de>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Stephane Eranian <eranian@google.com>
Link: http://lkml.kernel.org/n/tip-9sbere0kkplwe45ak6rk4a1f@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2013-12-27 17:08:19 -03:00
Arnaldo Carvalho de Melo
f626adffe1 perf annotate: Adopt methods from hists
Those are just wrappers to annotation methods, so move them to
annotate.c

Acked-by: Namhyung Kim <namhyung@kernel.org>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: David Ahern <dsahern@gmail.com>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Mike Galbraith <efault@gmx.de>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Stephane Eranian <eranian@google.com>
Link: http://lkml.kernel.org/n/tip-336h7z0bi2k51cbfi6mkpo5k@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2013-12-19 11:34:27 -03:00
Namhyung Kim
f1cbf78d17 perf hists: Do not pass period and weight to add_hist_entry()
The @entry argument already has the info so no need to pass them.

Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Acked-by: Jiri Olsa <jolsa@redhat.com>
Cc: Arun Sharma <asharma@fb.com>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Ingo Molnar <mingo@kernel.org>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Rodrigo Campos <rodrigo@sdfg.com.ar>
Link: http://lkml.kernel.org/r/1387344086-12744-4-git-send-email-namhyung@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2013-12-18 14:44:05 -03:00
Namhyung Kim
41a4e6e2a0 perf hists: Consolidate __hists__add_*entry()
The __hists__add_{branch,mem}_entry() does almost the same thing that
__hists__add_entry() does.  Consolidate them into one.

Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Acked-by: Jiri Olsa <jolsa@redhat.com>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Ingo Molnar <mingo@kernel.org>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Rodrigo Campos <rodrigo@sdfg.com.ar>
Cc: Stephane Eranian <eranian@google.com>
Link: http://lkml.kernel.org/r/1383202576-28141-2-git-send-email-namhyung@kernel.org
[ Fixup clash with new COMM infrastructure ]
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2013-11-04 20:59:09 -03:00
Namhyung Kim
4dfced359f perf tools: Get current comm instead of last one
At insert time, a hist entry should reference the comm at that time,
otherwise it'll get the last comm anyway.

Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Acked-by: Frederic Weisbecker <fweisbec@gmail.com>
Tested-by: Jiri Olsa <jolsa@redhat.com>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Link: http://lkml.kernel.org/n/tip-n6pykiiymtgmcjs834go2t8x@git.kernel.org
[ Fixed up const pointer issues ]
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2013-11-04 12:16:39 -03:00
Namhyung Kim
c1fb5651bb perf tools: Show progress on histogram collapsing
It can take quite some time, so add a progress bar UI to inform the user.

Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Ingo Molnar <mingo@kernel.org>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Link: http://lkml.kernel.org/r/1381468543-25334-4-git-send-email-namhyung@kernel.org
[ perf_progress -> ui_progress ]
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2013-10-23 15:48:24 -03:00
Arnaldo Carvalho de Melo
c824c4338a perf tools: Stop using 'self' in some more places
As suggested by tglx, 'self' should be replaced by something that is
more useful.

Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: David Ahern <dsahern@gmail.com>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Mike Galbraith <efault@gmx.de>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Stephane Eranian <eranian@google.com>
Link: http://lkml.kernel.org/n/tip-fmblhc6tbb99tk1q8vowtsbj@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2013-10-23 09:55:37 -03:00
Namhyung Kim
f048d548f8 perf annotate: Factor out get/free_srcline()
Currently the external addr2line tool is used for the srcline sort key
and for annotate with srcline info.  Separate out the common code to
prepare for upcoming enhancements.

Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Reviewed-by: Jiri Olsa <jolsa@redhat.com>
Cc: Ingo Molnar <mingo@kernel.org>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Link: http://lkml.kernel.org/r/1378876173-13363-5-git-send-email-namhyung@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2013-10-09 15:59:39 -03:00
Namhyung Kim
909b143162 perf hists: Free srcline when freeing hist_entry
We've been leaking the srcline of hist_entry; it should be freed too.

Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Reviewed-by: Jiri Olsa <jolsa@redhat.com>
Cc: Ingo Molnar <mingo@kernel.org>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Link: http://lkml.kernel.org/r/1378876173-13363-4-git-send-email-namhyung@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2013-10-09 15:58:28 -03:00
Andi Kleen
475eeab9f3 tools/perf: Add support for record transaction flags
Add support for recording and displaying the transaction flags.
They are essentially a new sort key. Also display them
in a nice way to the user.

Signed-off-by: Andi Kleen <ak@linux.intel.com>
Acked-by: Jiri Olsa <jolsa@redhat.com>
Signed-off-by: Peter Zijlstra <peterz@infradead.org>
Link: http://lkml.kernel.org/r/1379688044-14173-6-git-send-email-andi@firstfloor.org
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2013-10-04 10:06:12 +02:00
Andi Kleen
354cc40e3b tools/perf: Fix sorting for 64bit entries
Some of the node comparisons in hist.c dropped the upper
32bit by using an int variable to store the compare
result. This broke various 64bit fields, causing
incorrect collapsing (found for the TSX transaction field)

Just use int64_t always.
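
The truncation described above can be sketched roughly as follows.  This
is not the code from hist.c; the struct and function names are
hypothetical and only illustrate why an int result can make two
different 64bit values look equal.

```c
#include <stdint.h>

/* Hypothetical entry with a single 64-bit sort field. */
struct entry {
	uint64_t transaction;
};

/* Field comparison that returns the full 64-bit difference. */
int64_t entry_cmp(const struct entry *a, const struct entry *b)
{
	return (int64_t)(a->transaction - b->transaction);
}

/*
 * Problematic pattern: the result is stored in an int.  On common
 * compilers this keeps only the low 32 bits, so values that differ
 * only in the upper half can compare as equal.
 */
int entries_match_truncating(const struct entry *a, const struct entry *b)
{
	int cmp = (int)entry_cmp(a, b);

	return cmp == 0;
}

/* Fixed pattern: keep the comparison result in an int64_t. */
int entries_match(const struct entry *a, const struct entry *b)
{
	int64_t cmp = entry_cmp(a, b);

	return cmp == 0;
}
```

With transaction values of 1ULL << 32 and 0, entries_match() reports a
mismatch, while the truncating variant would typically treat them as the
same entry and collapse them.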

Acked-by: Namhyung Kim <namhyung@kernel.org>
Signed-off-by: Andi Kleen <ak@linux.intel.com>
Signed-off-by: Peter Zijlstra <peterz@infradead.org>
Link: http://lkml.kernel.org/r/1380637335-30110-1-git-send-email-andi@firstfloor.org
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2013-10-04 10:06:06 +02:00
Arnaldo Carvalho de Melo
33e940a25d perf session: Check for SIGINT in more loops
When processing big files we were not checking if session_done was set
by the SIGINT signal handler, for instance in 'perf report'. Fix it.
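
A minimal sketch of the pattern, assuming a flag named session_done as
in the text above; the handler and loop function are hypothetical and
stand in for the real processing loops.

```c
#include <signal.h>
#include <stddef.h>

/* Set by the signal handler, polled by long-running loops. */
static volatile sig_atomic_t session_done;

/* Hypothetical SIGINT handler: only records that we should stop. */
static void sigint_handler(int sig)
{
	(void)sig;
	session_done = 1;
}

/*
 * Hypothetical processing loop: walks up to nr_events items and
 * returns how many were handled before session_done was observed.
 */
size_t process_events(size_t nr_events)
{
	size_t i;

	for (i = 0; i < nr_events; i++) {
		if (session_done)
			break;
		/* per-event work would go here */
	}
	return i;
}
```

The handler would be installed with signal(SIGINT, sigint_handler) before
processing starts; every loop that can run for a long time then needs its
own check of the flag.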

Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: David Ahern <dsahern@gmail.com>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Mike Galbraith <efault@gmx.de>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Stephane Eranian <eranian@google.com>
Link: http://lkml.kernel.org/n/tip-pyad42lgrtq7xhg2dpsoauq7@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2013-09-19 11:32:17 -03:00
Andi Kleen
99571ab3d9 perf tools: Support callchain sorting based on addresses
With programs with very large functions it can be useful to distinguish
the callgraph nodes on more than just function names. So for example if
you have multiple calls to the same function, it ends up being separate
nodes in the chain.

This patch adds a new key field to the callgraph options, that allows
comparing nodes on functions (as today, default) and addresses.

Longer term it would be nice to also handle src lines, but that would
need more changes and address is a reasonable proxy for it today.

Right now I reference the global params, as there was no simple way to
register a params pointer.

Signed-off-by: Andi Kleen <ak@linux.intel.com>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Link: http://lkml.kernel.org/n/tip-0uskktybf0e7wrnoi5e9b9it@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2013-07-22 12:42:18 -03:00
Jiri Olsa
e0af43d248 perf hists: Marking dummy hists entries
It does not make sense to make some computation (ratio, wdiff), when the
hist_entry is 'dummy' - added via hists__link.

Adding dummy field to struct hist_entry which indicates that it was
added by hists__link and avoiding some of the processing for such
entries.

Signed-off-by: Jiri Olsa <jolsa@redhat.com>
Reviewed-by: Namhyung Kim <namhyung@kernel.org>
Cc: Corey Ashford <cjashfor@linux.vnet.ibm.com>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Link: http://lkml.kernel.org/n/tip-g8bxml0n0pnqsrpyd98p0ird@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2013-07-12 13:54:04 -03:00
Namhyung Kim
27a0dcb7ad perf hists: Move locking to its call-sites
It's a preparation patch to eliminate unneeded locking in the perf
report path.

Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Acked-by: Jiri Olsa <jolsa@redhat.com>
Cc: David Ahern <dsahern@gmail.com>
Cc: Ingo Molnar <mingo@kernel.org>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Stephane Eranian <eranian@google.com>
Link: http://lkml.kernel.org/r/1368497347-9628-5-git-send-email-namhyung@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2013-05-28 16:24:00 +03:00
Namhyung Kim
3a5714f8b5 perf top: Get rid of *_threaded() functions
Those _threaded() functions are needed to make hist tree handling
thread-safe, but AFAICS the only thing they do is force the use of
the intermediate 'collapsed' tree.

This can be achieved by setting sort__need_collapse to 1 in cmd_top() so
there is no need to keep those _threaded() variants.

Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Acked-by: Jiri Olsa <jolsa@redhat.com>
Cc: David Ahern <dsahern@gmail.com>
Cc: Ingo Molnar <mingo@kernel.org>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Stephane Eranian <eranian@google.com>
Link: http://lkml.kernel.org/r/1368497347-9628-4-git-send-email-namhyung@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2013-05-28 16:23:59 +03:00
Namhyung Kim
ded19d57a6 perf report: Fix alignment of symbol column when -v is given
When -v option is given, the symbol sort key prints its address also but
it wasn't properly aligned since hists__calc_col_len() misses the
additional part.  Also it missed 2 spaces for 0x prefix when printing.

  $ perf report --stdio -v -s sym
  # Samples: 133  of event 'cycles'
  # Event count (approx.): 50536717
  #
  # Overhead                          Symbol
  # ........  ..............................
  #
      12.20%  0xffffffff81384c50 v [k] intel_idle
       7.62%  0xffffffff8170976a v [k] ftrace_caller
       7.02%  0x2d986d         B [.] 0x00000000002d986d

Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Cc: Andi Kleen <andi@firstfloor.org>
Cc: David Ahern <dsahern@gmail.com>
Cc: Ingo Molnar <mingo@kernel.org>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Stephane Eranian <eranian@google.com>
Link: http://lkml.kernel.org/r/1364816125-12212-4-git-send-email-namhyung@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2013-05-28 16:23:53 +03:00
Namhyung Kim
ceb2acbc2c perf hists: Free unused mem info of a matched hist entry
The mem info is shared between matched entries so one should be freed.

Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Cc: Andi Kleen <andi@firstfloor.org>
Cc: David Ahern <dsahern@gmail.com>
Cc: Ingo Molnar <mingo@kernel.org>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Stephane Eranian <eranian@google.com>
Link: http://lkml.kernel.org/r/1364816125-12212-3-git-send-email-namhyung@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2013-05-28 16:23:52 +03:00
Namhyung Kim
26353a61b9 perf hists: Fix an invalid memory free on he->branch_info
The branch info was allocated for the whole stack and passed to the
matching hist entry for each level while processing samples.  Thus when
a hist entry tries to free its branch info, as in
hists__collapse_insert_entry, it hits the following error.

  *** glibc detected *** perf: munmap_chunk(): invalid pointer: 0x00000000014e9d20 ***
  ======= Backtrace: =========
  /lib64/libc.so.6[0x387d47ae16]
  perf[0x4923bd]
  perf(cmd_report+0xd68)[0x432a08]
  perf[0x41a663]
  perf(main+0x58f)[0x419eaf]
  /lib64/libc.so.6(__libc_start_main+0xf5)[0x387d421735]
  perf[0x419f95]

Fix it by allocating and copying branch info for each new hist entry.
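
The idea of the fix can be sketched as below.  The structs are heavily
simplified stand-ins and the function names are hypothetical; the point
is only that each entry owns a private copy, so free() is always called
on a pointer that malloc() returned.

```c
#include <stdlib.h>
#include <string.h>

/* Simplified stand-ins for the real structures. */
struct branch_info {
	unsigned long from;
	unsigned long to;
};

struct hist_entry {
	struct branch_info *branch_info;
};

/*
 * Create an entry that owns a private copy of *bi.  bi may point into
 * the middle of an array allocated for the whole branch stack.
 */
struct hist_entry *hist_entry_new(const struct branch_info *bi)
{
	struct hist_entry *he = calloc(1, sizeof(*he));

	if (he == NULL)
		return NULL;

	if (bi != NULL) {
		he->branch_info = malloc(sizeof(*he->branch_info));
		if (he->branch_info == NULL) {
			free(he);
			return NULL;
		}
		memcpy(he->branch_info, bi, sizeof(*bi));
	}
	return he;
}

/* Safe: branch_info is either NULL or its own allocation. */
void hist_entry_free(struct hist_entry *he)
{
	if (he == NULL)
		return;
	free(he->branch_info);
	free(he);
}
```

Without the copy, an entry created for the second level of the stack
would hold a pointer into the middle of the array, and freeing it gives
the "invalid pointer" abort shown above.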

Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Cc: Andi Kleen <andi@firstfloor.org>
Cc: David Ahern <dsahern@gmail.com>
Cc: Ingo Molnar <mingo@kernel.org>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Stephane Eranian <eranian@google.com>
Link: http://lkml.kernel.org/r/1364816125-12212-2-git-send-email-namhyung@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2013-05-28 16:23:52 +03:00
Stephane Eranian
028f12ee6b perf tools: Add new mem command for memory access profiling
This new command is a wrapper on top of perf record and perf report to
make it easier to configure for memory access profiling.

To record loads:
$ perf mem -t load rec .....

To record stores:
$ perf mem -t store rec .....

To get the report:
$ perf mem -t load rep

Signed-off-by: Stephane Eranian <eranian@google.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Ingo Molnar <mingo@elte.hu>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Namhyung Kim <namhyung.kim@lge.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: http://lkml.kernel.org/r/1359040242-8269-15-git-send-email-eranian@google.com
[ Fixed minor conflict with 66857b5 "Sort command-list.txt alphabetically" ]
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2013-04-01 12:21:44 -03:00
Stephane Eranian
98a3b32c99 perf tools: Add mem access sampling core support
This patch adds the sorting and histogram support
functions to enable profiling of memory accesses.

The following sorting orders are added:
 - symbol_daddr: data address symbol (or raw address)
 - dso_daddr: data address shared object
 - locked: access uses locked transaction
 - tlb : TLB access
 - mem : memory level of the access (L1, L2, L3, RAM, ...)
 - snoop: access snoop mode

Signed-off-by: Stephane Eranian <eranian@google.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Ingo Molnar <mingo@elte.hu>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Namhyung Kim <namhyung.kim@lge.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: http://lkml.kernel.org/r/1359040242-8269-12-git-send-email-eranian@google.com
[ committer note: changed to cope with fc5871ed, the move of methods to
  machine.[ch], and the rename of dsrc to data_src, to match the change
  made in the PERF_SAMPLE_DSRC in a previous patch. ]
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2013-04-01 12:20:13 -03:00
Andi Kleen
05484298cb perf tools: Add support for weight v7 (modified)
perf record has a new option -W that enables weighted sampling.

Add sorting support in top/report for the average weight per sample and the
total weight sum. This allows comparing both the relative cost per event
and the total cost over the measurement period.
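
As a rough illustration of the two quantities, assuming a per-entry
accumulator (the struct and helpers below are hypothetical, not the perf
data structures):

```c
#include <stdint.h>

/* Hypothetical per-entry accumulator. */
struct weight_stat {
	uint64_t total_weight;	/* sum of all sample weights */
	uint64_t nr_samples;	/* number of samples seen */
};

/* Account one sample with the given weight. */
void weight_stat_add(struct weight_stat *ws, uint64_t weight)
{
	ws->total_weight += weight;
	ws->nr_samples++;
}

/* Average weight per sample; 0 when no sample was recorded. */
uint64_t weight_stat_avg(const struct weight_stat *ws)
{
	return ws->nr_samples ? ws->total_weight / ws->nr_samples : 0;
}
```

Sorting on the average shows which entries are expensive per event,
while sorting on the total shows where the cost accumulated over the
whole run.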

Add the necessary glue to perf report, record and the library.

v2: Merge with new hist refactoring.
v3: Fix manpage. Remove value check.
Rename global_weight to weight and weight to local_weight.
v4: Readd sort keys to manpage
v5: Move weight to end
v6: Move weight to template
v7: Rename weight key.

Original patch from Andi modified by Stephane Eranian <eranian@google.com>
to include ONLY the weight supporting code and apply to pristine 3.8.0-rc4.

Signed-off-by: Andi Kleen <ak@linux.intel.com>
Cc: Ingo Molnar <mingo@elte.hu>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Namhyung Kim <namhyung.kim@lge.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: http://lkml.kernel.org/r/1359040242-8269-6-git-send-email-eranian@google.com
[ committer note: changed to cope with fc5871ed and the hists_link perf test entry ]
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2013-04-01 12:19:43 -03:00
Namhyung Kim
29d720ed5f perf hists: Resort hist entries using group members for output
When event group is enabled, sorting hist entries on periods for output
should consider the group members' periods too.  To do that, build a
period table using link/pair information and compare the tables.

Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Acked-by: Jiri Olsa <jolsa@redhat.com>
Cc: Ingo Molnar <mingo@kernel.org>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Stephane Eranian <eranian@google.com>
Link: http://lkml.kernel.org/r/1358845787-1350-6-git-send-email-namhyung@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2013-01-31 13:07:45 -03:00
Stephane Eranian
3cf0cb1f89 perf tools: Mark branch_info maps as referenced
As noticed by Jiri, the hist_entry->branch_info.to/from maps need to be
marked as referenced to avoid problems later on.  So we do this when the
hist_entry is allocated.

Signed-off-by: Stephane Eranian <eranian@google.com>
Acked-by: Jiri Olsa <jolsa@redhat.com>
Cc: Ingo Molnar <mingo@elte.hu>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Namhyung Kim <namhyung.kim@lge.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: http://lkml.kernel.org/r/20130114140245.GA4692@quad
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2013-01-24 16:40:38 -03:00
Namhyung Kim
cb99374455 perf sort: Calculate parent column width too
When hists__calc_col_len() is called, most of the column lengths are
refreshed, but it missed the parent column.  So if the parent sort key
was used along with other keys, the rest will be misaligned since parent
has no proper column width.

Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Acked-by: Jiri Olsa <jolsa@redhat.com>
Cc: David Ahern <dsahern@gmail.com>
Cc: Ingo Molnar <mingo@kernel.org>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Stephane Eranian <eranian@google.com>
Link: http://lkml.kernel.org/r/1356599507-14226-6-git-send-email-namhyung@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2013-01-24 16:40:24 -03:00
Arnaldo Carvalho de Melo
28a6b6aa54 perf session: There is no need for a per session hists instance
It was being used just for its stats member, so ditch session->hists and
use just what is needed, session->stats.

This completes the move to support multiple events in the hists layer, the
last user of session->hists was 'perf diff' but Jiri Olsa has fixed that
some time ago.

Cc: David Ahern <dsahern@gmail.com>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Mike Galbraith <efault@gmx.de>
Cc: Namhyung Kim <namhyung@gmail.com>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Stephane Eranian <eranian@google.com>
Link: http://lkml.kernel.org/n/tip-pimk92kek8kcp4dmb1jakoro@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2013-01-24 16:40:12 -03:00
Namhyung Kim
66f97ed3ac perf diff: Use internal rb tree for compute resort
There's no reason to run hists_compute_resort() using output tree.
Convert it to use internal tree so that it can remove unnecessary
_output_resort.

Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Acked-by: Jiri Olsa <jolsa@redhat.com>
Cc: Ingo Molnar <mingo@kernel.org>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Stephane Eranian <eranian@google.com>
Link: http://lkml.kernel.org/r/1355128197-18193-4-git-send-email-namhyung@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2013-01-24 16:40:06 -03:00
Namhyung Kim
ce74f60eab perf hists: Link hist entries before inserting to an output tree
For matching and/or linking hist entries, they need to be sorted by
given sort keys.  However current hists__match/link did this on the
output trees, so that the entries in the output tree need to be resorted
before doing it.

This looks not so good since we have trees for collecting or collapsing
entries before passing them to an output tree and they're already sorted
by the given sort keys.  Since we don't need to print anything at the
time of matching/linking, we can use these internal trees directly
instead of bothering with double resort on the output tree.

Its only user - at the time of this writing - perf diff can be easily
converted to use the internal tree and can save some lines too by
getting rid of unnecessary resorting codes.

Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Acked-by: Jiri Olsa <jolsa@redhat.com>
Cc: Ingo Molnar <mingo@kernel.org>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Stephane Eranian <eranian@google.com>
Link: http://lkml.kernel.org/r/1355128197-18193-3-git-send-email-namhyung@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2013-01-24 16:40:06 -03:00
Namhyung Kim
9afcf930b1 perf hists: Exchange order of comparing items when collapsing hists
When comparing entries for collapsing put the given entry first, and
then the iterated entry.  This is not the case of hist_entry__cmp() when
called if given sort keys don't require collapsing.  So change the order
for the sake of consistency.  It will be required for matching and/or
linking multiple hist entries.

Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Acked-by: Jiri Olsa <jolsa@redhat.com>
Cc: Ingo Molnar <mingo@kernel.org>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Stephane Eranian <eranian@google.com>
Link: http://lkml.kernel.org/r/1355128197-18193-2-git-send-email-namhyung@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2013-01-24 16:40:05 -03:00
Namhyung Kim
5fa9041bba perf hists: Link hist entry pairs to leader
Current hists__match/link() link a leader to its pair, so if multiple
pairs were linked, the leader will lose the pointer to previous pairs
since it was overwritten.  Fix it by making the leader the list head.
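
A simplified sketch of the difference, using a plain singly linked list
rather than the kernel-style list_head the real code uses; all names
here are hypothetical.

```c
#include <stddef.h>

/* Hypothetical entry: the leader heads a list of its pairs. */
struct entry {
	const char *name;
	struct entry *pairs;		/* leader only: first linked pair */
	struct entry *next_pair;	/* pair only: next pair of the leader */
};

/* Link a pair to its leader without dropping earlier pairs. */
void entry_add_pair(struct entry *pair, struct entry *leader)
{
	pair->next_pair = leader->pairs;
	leader->pairs = pair;
}

/* Count how many pairs are reachable from the leader. */
size_t entry_nr_pairs(const struct entry *leader)
{
	const struct entry *pos;
	size_t nr = 0;

	for (pos = leader->pairs; pos != NULL; pos = pos->next_pair)
		nr++;
	return nr;
}
```

With a single pointer in the leader, the second call would overwrite the
first pair; with the leader as list head, both remain reachable.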

Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Cc: Andi Kleen <andi@firstfloor.org>
Cc: Ingo Molnar <mingo@kernel.org>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Stephane Eranian <eranian@google.com>
Link: http://lkml.kernel.org/r/1354171126-14387-8-git-send-email-namhyung@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2012-12-09 08:46:06 -03:00
Namhyung Kim
2850d94872 perf hists: Fix typo on hist__entry_add_pair
Fix a misplaced underscore.  In this case, 'hist_entry' is the name of
data structure and we usually put double underscores between data
structure and actual function name.

Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Acked-by: Jiri Olsa <jolsa@redhat.com>
Cc: Andi Kleen <andi@firstfloor.org>
Cc: Ingo Molnar <mingo@kernel.org>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Stephane Eranian <eranian@google.com>
Link: http://lkml.kernel.org/n/tip-8jdq8g6kl6v54hkexrfwsy72@git.kernel.org
[ committer note: put it in front of the patch queue where it came from ]
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2012-12-09 08:46:06 -03:00
Arnaldo Carvalho de Melo
30193d78d8 perf hists: Initialize all of he->stat with zeroes
Not just nr_events and period.

Reported-by: Namhyung Kim <namhyung@gmail.com>
Cc: David Ahern <dsahern@gmail.com>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Mike Galbraith <efault@gmx.de>
Cc: Namhyung Kim <namhyung@gmail.com>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Stephane Eranian <eranian@google.com>
Link: http://lkml.kernel.org/n/tip-8nodd6b4bytyf1snf96oy531@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2012-11-14 16:50:47 -03:00
Arnaldo Carvalho de Melo
494d70a181 perf hists: Introduce hists__link
That given two hists will find the hist_entries (buckets) in the second
hists that are for the same bucket in the first and link them, then it
will look for all buckets in the second that don't have a counterpart in
the first and will create a dummy counterpart that will then be linked
to the entry in the second.

For multiple events this will be done pairing the leader with all the
other events in the group, so that in the end the leader will have all
the buckets in all the hists in a group, dummy or not while the other
hists will be left untouched.

Cc: David Ahern <dsahern@gmail.com>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Mike Galbraith <efault@gmx.de>
Cc: Namhyung Kim <namhyung@gmail.com>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Stephane Eranian <eranian@google.com>
Link: http://lkml.kernel.org/n/tip-l9l9ieozqdhn9lieokd95okw@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2012-11-08 18:08:15 -03:00
Arnaldo Carvalho de Melo
95529be478 perf diff: Move hists__match to the hists lib
It's not 'diff' specific and will be useful for other use cases, like
bucketizing multiple events in a single session.

Cc: David Ahern <dsahern@gmail.com>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Mike Galbraith <efault@gmx.de>
Cc: Namhyung Kim <namhyung@gmail.com>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Stephane Eranian <eranian@google.com>
Link: http://lkml.kernel.org/n/tip-o35urjgxfxxm70aw1wa81s4w@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2012-11-08 17:57:37 -03:00
Arnaldo Carvalho de Melo
b821c73253 perf diff: Start moving to support matching more than two hists
We want to match more than two hists, so that we can match more than two
perf.data files and moreover, match hist_entries (buckets) in multiple
events in a group.

So the "baseline"/"leader" will, instead of a ->pair pointer, use a
list_head that links to the pairs, and hists__match will use it.

Following that perf_evlist__link will link the hists in its evsel
groups.

Cc: David Ahern <dsahern@gmail.com>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Mike Galbraith <efault@gmx.de>
Cc: Namhyung Kim <namhyung@gmail.com>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Stephane Eranian <eranian@google.com>
Link: http://lkml.kernel.org/n/tip-2kbmzepoi544ygj9godseqpv@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2012-11-08 17:43:09 -03:00
Namhyung Kim
580e338d7e perf hists: Free branch_info when freeing hist_entry
Those data should be freed along with the associated hist_entry,
otherwise it'll be leaked.

Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Ingo Molnar <mingo@kernel.org>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Stephane Eranian <eranian@google.com>
Link: http://lkml.kernel.org/r/1352273234-28912-7-git-send-email-namhyung@kernel.org
[ committer note: mem_info is not yet in perf/core, free just branch_info ]
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2012-11-08 12:05:12 -03:00
Namhyung Kim
139c081590 perf hists: Add more helpers for hist entry stat
Add and use he_stat__add_{period,stat} for calculating hist entry's
stat.  It will be used for accumulated stats later as well.

Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Cc: Arun Sharma <asharma@fb.com>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Ingo Molnar <mingo@kernel.org>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Stephane Eranian <eranian@google.com>
Link: http://lkml.kernel.org/r/1349354994-17853-10-git-send-email-namhyung@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2012-10-04 13:36:18 -03:00
Namhyung Kim
c4b35351ef perf hists: Move he->stat.nr_events initialization to a template
Since it is set to 1 for a new hist entry, there is no need to set it separately.
Move it to a template entry.

Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Cc: Arun Sharma <asharma@fb.com>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Ingo Molnar <mingo@kernel.org>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Stephane Eranian <eranian@google.com>
Link: http://lkml.kernel.org/r/1349354994-17853-9-git-send-email-namhyung@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2012-10-04 13:35:14 -03:00
Namhyung Kim
b24c28f794 perf hists: Introduce struct he_stat
The struct he_stat is for separating out statistics data of a hist
entry.  It is required for later changes.

It's just a mechanical change and should have no functional differences.

Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Cc: Arun Sharma <asharma@fb.com>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Ingo Molnar <mingo@kernel.org>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Stephane Eranian <eranian@google.com>
Link: http://lkml.kernel.org/r/1349354994-17853-8-git-send-email-namhyung@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2012-10-04 13:34:22 -03:00
Jiri Olsa
ae359f193a perf hists: Add struct hists pointer to struct hist_entry
Adding a pointer back to the parent struct hists for struct hist_entry.

This will be useful in future for any hist_entry's data computation,
that depends on total data of its parent hists.

Signed-off-by: Jiri Olsa <jolsa@redhat.com>
Cc: Corey Ashford <cjashfor@linux.vnet.ibm.com>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Ingo Molnar <mingo@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Stephane Eranian <eranian@google.com>
Link: http://lkml.kernel.org/r/1349354994-17853-2-git-send-email-namhyung@kernel.org
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2012-10-04 13:27:00 -03:00
Namhyung Kim
9ec60972a3 perf hists: Add missing period_* fields when collapsing a hist entry
So that perf report won't lose the cpu utilization information.

For example, if there are two processes that have the same name.

  $ perf report --stdio --showcpuutilization -s pid
  [SNIP]
  #   Overhead       sys        us  Command:  Pid
  #   ........  ........  ........  .............
  #
        55.12%     0.01%    55.10%  noploop:28781
        44.88%     0.06%    44.83%  noploop:28782

Before:
  $ perf report --stdio --showcpuutilization -s comm
  [SNIP]
  #   Overhead       sys        us
  #   ........  ........  ........
  #
       100.00%     0.06%    44.83%

After:
  $ perf report --stdio --showcpuutilization -s comm
  [SNIP]
  #   Overhead       sys        us
  #   ........  ........  ........
  #
       100.00%     0.07%    99.93%

Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Cc: Arun Sharma <asharma@fb.com>
Cc: David Ahern <dsahern@gmail.com>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Ingo Molnar <mingo@kernel.org>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Stephane Eranian <eranian@google.com>
Link: http://lkml.kernel.org/r/1348645663-25303-2-git-send-email-namhyung@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2012-09-26 20:44:11 -03:00
Irina Tirdea
1d037ca164 perf tools: Use __maybe_used for unused variables
perf defines both __used and __unused variables to use for marking
unused variables. The variable __used is defined to
__attribute__((__unused__)), which contradicts the kernel definition to
__attribute__((__used__)) for new gcc versions. On Android, __used is
also defined in system headers and this leads to warnings like: warning:
'__used__' attribute ignored

__unused is not defined in the kernel and is not a standard definition.
If __unused is included everywhere instead of __used, this leads to
conflicts with glibc headers, since glibc has variables with this name
in its headers.

The best approach is to use __maybe_unused, the definition used in the
kernel for __attribute__((unused)). In this way there is only one
definition in perf sources (instead of 2 definitions that point to the
same thing: __used and __unused) and it works on both Linux and Android.
This patch simply replaces all instances of __used and __unused with
__maybe_unused.
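
For illustration, a minimal sketch of the resulting usage, assuming a
gcc-compatible compiler.  The macro definition mirrors the kernel one
named above; the callback is a hypothetical example.

```c
/* Same definition the kernel uses; guarded in case a header has it. */
#ifndef __maybe_unused
#define __maybe_unused __attribute__((unused))
#endif

/*
 * Hypothetical callback whose signature is fixed by its caller.  The
 * ctx argument is not needed here, so it is marked to silence
 * -Wunused-parameter without changing the signature.
 */
int handle_event(int value, void *ctx __maybe_unused)
{
	return value * 2;
}
```

Unlike __used, the name says what the attribute does: the parameter or
variable may be left unused without a warning.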

Signed-off-by: Irina Tirdea <irina.tirdea@intel.com>
Acked-by: Pekka Enberg <penberg@kernel.org>
Cc: David Ahern <dsahern@gmail.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Namhyung Kim <namhyung.kim@lge.com>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Steven Rostedt <rostedt@goodmis.org>
Link: http://lkml.kernel.org/r/1347315303-29906-7-git-send-email-irina.tirdea@intel.com
[ committer note: fixed up conflict with a116e05 in builtin-sched.c ]
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2012-09-11 12:19:15 -03:00
Namhyung Kim
7e62ef44e8 perf hists: Use perf_hpp__format->width to calculate the column widths
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Cc: Ingo Molnar <mingo@kernel.org>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Link: http://lkml.kernel.org/r/1346640790-17197-4-git-send-email-namhyung@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2012-09-08 13:20:05 -03:00
Namhyung Kim
7ccf4f9058 perf hists: Separate out hist print functions
Separate out those functions into ui/stdio/hist.c. This is required for
upcoming changes.

Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Cc: Ingo Molnar <mingo@kernel.org>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Link: http://lkml.kernel.org/r/1345438331-20234-2-git-send-email-namhyung@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2012-08-20 09:46:34 -03:00
Frederic Weisbecker
6654f5d8bd perf hists: Print newline between hists callchains
Tiny cosmetic fix. The lack of a newline between hists callchains was
looking slightly messy.

Before:

     0.24%      swapper  [kernel.kallsyms]  [k] _raw_spin_lock_irq
                |
                --- _raw_spin_lock_irq
                    run_timer_softirq
                    __do_softirq
                    call_softirq
                    do_softirq
                    irq_exit
                    smp_apic_timer_interrupt
                    apic_timer_interrupt
                    default_idle
                    amd_e400_idle
                    cpu_idle
                    start_secondary
     0.10%         perf  [kernel.kallsyms]  [k] lock_is_held
                   |
                   --- lock_is_held
                       __might_sleep
                       mutex_lock_nested
                       perf_event_for_each_child
                       perf_ioctl
                       do_vfs_ioctl
                       sys_ioctl
                       system_call_fastpath
                       ioctl
                       cmd_record
                       run_builtin
                       main
                       __libc_start_main

After:

     0.24%      swapper  [kernel.kallsyms]  [k] _raw_spin_lock_irq
                |
                --- _raw_spin_lock_irq
                    run_timer_softirq
                    __do_softirq
                    call_softirq
                    do_softirq
                    irq_exit
                    smp_apic_timer_interrupt
                    apic_timer_interrupt
                    default_idle
                    amd_e400_idle
                    cpu_idle
                    start_secondary

     0.10%         perf  [kernel.kallsyms]  [k] lock_is_held
                   |
                   --- lock_is_held
                       __might_sleep
                       mutex_lock_nested
                       perf_event_for_each_child
                       perf_ioctl
                       do_vfs_ioctl
                       sys_ioctl
                       system_call_fastpath
                       ioctl
                       cmd_record
                       run_builtin
                       main
                       __libc_start_main

Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com>
Cc: David Ahern <dsahern@gmail.com>
Cc: Ingo Molnar <mingo@kernel.org>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Namhyung Kim <namhyung@gmail.com>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Link: http://lkml.kernel.org/r/1342631456-7233-3-git-send-email-fweisbec@gmail.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2012-07-25 11:32:26 -03:00
Frederic Weisbecker
8760db726e perf hists: Return correct number of characters printed in callchain
Include the omitted number of characters printed for the first entry.

Not that it really matters because nobody seems to care about the number
of printed characters for now. But just in case.

Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com>
Cc: David Ahern <dsahern@gmail.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Namhyung Kim <namhyung@gmail.com>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Link: http://lkml.kernel.org/r/1342631456-7233-2-git-send-email-fweisbec@gmail.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2012-07-25 11:31:37 -03:00
Namhyung Kim
472606458f perf callchain: Make callchain cursors TLS
perf top -G has a race on the callchain cursor between the main thread
and the display thread. Since the callchain cursors are only used
locally, making them thread-local data solves the problem.

Signed-off-by: Namhyung Kim <namhyung.kim@lge.com>
Reported-by: Sunjin Yang <fan4326@gmail.com>
Suggested-by: Arnaldo Carvalho de Melo <acme@ghostprotocols.net>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Namhyung Kim <namhyung@gmail.com>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Sunjin Yang <fan4326@gmail.com>
Link: http://lkml.kernel.org/r/1338443007-24857-1-git-send-email-namhyung.kim@lge.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2012-05-31 10:47:12 -03:00
Jiri Olsa
a0187060f4 perf hists: Fix callchain ip printf format
The callchain address is stored as u64. The current code uses the
following format string to display the callchain address:

  "%p\n", (void *)(long)chain->ip

This way we lose the upper 32 bits if we report 64-bit addresses in a
32-bit environment. Fix this to always display the whole 64 bits.
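
The truncation described above can be reproduced with a small sketch.
Both helpers are hypothetical and not taken from the perf sources; the
second one models the cast through a 32-bit 'long' with an explicit
uint32_t so the effect is visible on any host:

```c
#include <assert.h>
#include <inttypes.h>
#include <stddef.h>
#include <stdint.h>
#include <stdio.h>

/*
 * Hypothetical helper: format a 64-bit callchain address in full,
 * independent of the pointer width of the host.
 */
static int format_ip(char *buf, size_t size, uint64_t ip)
{
	assert(buf != NULL);
	return snprintf(buf, size, "%#" PRIx64, ip);
}

/*
 * Hypothetical helper: models what a cast through a 32-bit 'long' does
 * to the same value. The upper 32 bits are dropped before printing.
 */
static int format_ip_truncated(char *buf, size_t size, uint64_t ip)
{
	assert(buf != NULL);
	return snprintf(buf, size, "%#" PRIx32, (uint32_t)ip);
}
```

For the address 0xffffffff81000000, format_ip() stores
"0xffffffff81000000" while format_ip_truncated() stores "0x81000000".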

Note: run the following to test perf endianness handling:
test 1)
  - origin system:
    # perf record -a -- sleep 10 (any perf record will do)
    # perf report > report.origin
    # perf archive perf.data

  - copy the perf.data, report.origin and perf.data.tar.bz2
    to a target system and run:
    # tar xjvf perf.data.tar.bz2 -C ~/.debug
    # perf report > report.target
    # diff -u report.origin report.target

  - the diff should produce no output
    (besides some white space stuff and possibly different
     date/TZ output)

test 2)
  - origin system:
    # perf record -ag -fo /tmp/perf.data -- sleep 1
  - mount origin system root to the target system on /mnt/origin
  - target system:
    # perf script --symfs /mnt/origin -I -i /mnt/origin/tmp/perf.data \
     --kallsyms /mnt/origin/proc/kallsyms
  - complete perf.data header is displayed

Signed-off-by: Jiri Olsa <jolsa@redhat.com>
Cc: Corey Ashford <cjashfor@linux.vnet.ibm.com>
Cc: David Ahern <dsahern@gmail.com>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Ingo Molnar <mingo@elte.hu>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Link: http://lkml.kernel.org/r/1337151548-2396-8-git-send-email-jolsa@redhat.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2012-05-17 13:18:19 -03:00
David Miller
63fa471dd4 perf hists: Catch and handle out-of-date hist entry maps.
When a process exec()'s, all the maps are retired, but we keep the hist
entries around which hold references to those outdated maps.

If the same library gets mapped in for which we have hist entries, a new
map will be created.  But when we take a perf entry hit within that map,
we'll find the existing hist entry with the older map.

This causes symbol translations to be done incorrectly.  For example,
the perf entry processing will look up the correct up-to-date map entry and
use that to calculate the symbol and DSO relative address.  But later
when we update the histogram we'll translate the address using the
outdated map file instead leading to conditions such as out-of-range
offsets in symbol__inc_addr_samples().

Therefore, update the map of the hist_entry dynamically at lookup/
creation time.
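
A minimal sketch of the idea, assuming heavily simplified stand-ins for
the perf structures. The struct layouts, the lookup by symbol name and
the function name lookup_and_refresh() are all illustrative and are not
the actual hist code:

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Simplified stand-in for a map; the single field is illustrative. */
struct map {
	uint64_t start;		/* address where the DSO is mapped */
};

/* Simplified stand-in for a hist entry. */
struct hist_entry {
	const char *sym;	/* lookup key used by this sketch */
	struct map *map;	/* map used later to translate addresses */
	uint64_t period;
};

/*
 * Look up an entry by symbol name among n entries. On a hit, replace
 * the entry's map with the one the current sample resolved to, so that
 * a map retired by exec() is not used for later address translation.
 * Returns NULL when no entry matches.
 */
static struct hist_entry *lookup_and_refresh(struct hist_entry *entries,
					     size_t n, const char *sym,
					     struct map *cur_map,
					     uint64_t period)
{
	assert(sym != NULL);

	for (size_t i = 0; i < n; i++) {
		if (strcmp(entries[i].sym, sym) == 0) {
			entries[i].map = cur_map;
			entries[i].period += period;
			return &entries[i];
		}
	}
	return NULL;
}
```

The design point is that the refresh happens on every hit rather than
only at creation, so an existing entry never keeps pointing at a map
from before the exec().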

Signed-off-by: David S. Miller <davem@davemloft.net>
Cc: stable@kernel.org
Link: http://lkml.kernel.org/r/20120327.031418.1220315351537060808.davem@davemloft.net
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2012-04-05 18:53:47 -03:00
Frederic Weisbecker
6d4818c524 perf tools: Fix display of first level of callchains
The callchain stdio mode display was written against a report sorted by
symbol. In this mode we have only one callchain root per hist, so we
forgot to handle cases where we have multiple callchain roots, as in
per-DSO sorting for example.

Fix this by handling these roots like any other branch, with the hist as
the parent.

Before:

     1.97%  libpthread-2.12.1.so
            |
            --- __libc_write
                create_worker
                bench_sched_messaging
                cmd_bench
                run_builtin
                main
                __libc_start_main

            |
            --- __libc_read
                create_worker
                bench_sched_messaging
                cmd_bench
                run_builtin
                main
                __libc_start_main

After:

     1.97%  libpthread-2.12.1.so
            |
            |--36.97%-- __libc_write
            |          create_worker
            |          bench_sched_messaging
            |          cmd_bench
            |          run_builtin
            |          main
            |          __libc_start_main
            |
            |--31.47%-- __libc_read
            |          create_worker
            |          bench_sched_messaging
            |          cmd_bench
            |          run_builtin
            |          main
            |          __libc_start_main
           ...

Single roots keep their entry without percentage because they have
the same overhead as the hist they refer to, i.e. 100% in fractal
mode and the percentage of the hist in graph mode:

     0.00%  [k] reschedule_interrupt
            |
            --- default_idle
                amd_e400_idle
                cpu_idle
                start_secondary

Reported-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Cc: David Ahern <dsahern@gmail.com>
Cc: Ingo Molnar <mingo@kernel.org>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Stephane Eranian <eranian@google.com>
Link: http://lkml.kernel.org/r/1332526010-15400-1-git-send-email-fweisbec@gmail.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2012-03-26 15:14:40 -03:00
Jiri Olsa
4bf9ce1b5e perf diff: Fix to work with new hists design
The perf diff command is broken since:
  perf hists: Threaded addition and sorting of entries
  commit 1980c2ebd7

Several places were broken:
  - hists data need to be collected into opened sessions instead
    of into events
  - session's hists data need to be initialized properly when the
    session is created
  - hist_entry__pcnt_snprintf: the percentage and displacement
    buffer preparation must not use 'ret' because it's used
    as a pointer to the final buffer

Signed-off-by: Jiri Olsa <jolsa@redhat.com>
Cc: Corey Ashford <cjashfor@linux.vnet.ibm.com>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Ingo Molnar <mingo@elte.hu>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Link: http://lkml.kernel.org/r/20120322133726.GB1601@m.brq.redhat.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2012-03-22 15:12:09 -03:00
Arnaldo Carvalho de Melo
0d09eb7a9a Merge branch 'perf/urgent' into perf/core
Merge Reason: to pick the fix:

 commit e7f01d1
     perf tools: Use scnprintf where applicable

Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2012-03-22 15:09:08 -03:00
Namhyung Kim
e94d53ebec perf hists: Add hists__filter_by_symbol
This function will be used for simple (sub-)string matching filter based
on user input.

Cc: Ingo Molnar <mingo@elte.hu>
Cc: Namhyung Kim <namhyung@gmail.com>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Link: http://lkml.kernel.org/r/1331887855-874-1-git-send-email-namhyung.kim@lge.com
Signed-off-by: Namhyung Kim <namhyung.kim@lge.com>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2012-03-16 16:31:09 -03:00
Arnaldo Carvalho de Melo
e7f01d1e3d perf tools: Use scnprintf where applicable
Several places were expecting that the value returned was the number of
characters printed, not what would be printed if there was space.

Fix it by using the scnprintf and vscnprintf variants we inherited from
the kernel sources.

Some corner cases where the number of printed characters was not
accounted for were fixed too.
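
The difference between the two return conventions can be sketched as
follows. sketch_scnprintf() is a local re-implementation written for
this illustration, named differently to make clear it is not the helper
perf actually uses:

```c
#include <assert.h>
#include <stdarg.h>
#include <stddef.h>
#include <stdio.h>

/*
 * Illustrative re-implementation: returns the number of characters
 * actually stored in buf, not counting the trailing NUL, whereas
 * snprintf() returns the length the full output would have had.
 */
static int sketch_scnprintf(char *buf, size_t size, const char *fmt, ...)
{
	va_list args;
	int n;

	assert(fmt != NULL);

	va_start(args, fmt);
	n = vsnprintf(buf, size, fmt, args);
	va_end(args);

	if (n < 0)
		return 0;	/* report an output error as zero characters */
	if ((size_t)n < size)
		return n;	/* everything fit */
	return size ? (int)(size - 1) : 0;	/* output was truncated */
}
```

With an 8-byte buffer and the string "hello world", snprintf() returns
11 while sketch_scnprintf() returns 7, which is the value a caller
needs when advancing a write position inside the buffer.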

Reported-by: Anton Blanchard <anton@samba.org>
Cc: Anton Blanchard <anton@samba.org>
Cc: Eric B Munson <emunson@mgebm.net>
Cc: David Ahern <dsahern@gmail.com>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Mike Galbraith <efault@gmx.de>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Stephane Eranian <eranian@google.com>
Cc: Yanmin Zhang <yanmin_zhang@linux.intel.com>
Cc: stable@kernel.org
Link: http://lkml.kernel.org/n/tip-kwxo2eh29cxmd8ilixi2005x@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2012-03-14 12:36:19 -03:00
Roberto Agostino Vitillo
b5387528f3 perf tools: Add code to support PERF_SAMPLE_BRANCH_STACK
This patch adds:

 - ability to parse samples with PERF_SAMPLE_BRANCH_STACK
 - sort on branches (dso_from, symbol_from, dso_to, symbol_to, mispredict)
 - build histograms on branches

Signed-off-by: Roberto Agostino Vitillo <ravitillo@lbl.gov>
Signed-off-by: Stephane Eranian <eranian@google.com>
Cc: peterz@infradead.org
Cc: acme@redhat.com
Cc: robert.richter@amd.com
Cc: ming.m.lin@intel.com
Cc: andi@firstfloor.org
Cc: asharma@fb.com
Cc: vweaver1@eecs.utk.edu
Cc: khandual@linux.vnet.ibm.com
Cc: dsahern@gmail.com
Link: http://lkml.kernel.org/r/1328826068-11713-12-git-send-email-eranian@google.com
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2012-03-09 08:26:04 +01:00
Namhyung Kim
0ed35abc2b perf report: Fix --stdio output alignment when --showcpuutilization used
Current perf report output is broken if --showcpuutilization is used.
Combination with -n and/or --show-total-period make things worse.
This patch fixes it as follows:

before:
    48.25%    48.25%     0.00%    sleep  [kernel.kallsyms]  [k] trace_hardirqs_off
    34.99%    34.99%     0.00%    sleep  [kernel.kallsyms]  [k] __find_get_block_slow
    15.99%    15.99%     0.00%    sleep  [kernel.kallsyms]  [k] lock_release_holdtime
     0.77%     0.77%     0.00%    sleep  [kernel.kallsyms]  [k] native_write_msr_safe

after:
    48.25%    48.25%     0.00%    sleep  [kernel.kallsyms]  [k] trace_hardirqs_off
    34.99%    34.99%     0.00%    sleep  [kernel.kallsyms]  [k] __find_get_block_slow
    15.99%    15.99%     0.00%    sleep  [kernel.kallsyms]  [k] lock_release_holdtime
     0.77%     0.77%     0.00%    sleep  [kernel.kallsyms]  [k] native_write_msr_safe

Cc: Ingo Molnar <mingo@elte.hu>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Link: http://lkml.kernel.org/r/1325957132-10600-8-git-send-email-namhyung@gmail.com
Signed-off-by: Namhyung Kim <namhyung@gmail.com>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2012-01-08 13:32:51 -02:00
Arnaldo Carvalho de Melo
12c142781e perf hists: Stop using 'self' for struct hist_entry
Stop using this Python/OOP convention, it doesn't really help. Will do
more from time to time till we get it cleaned up in all of /perf.

Suggested-by: Thomas Gleixner <tglx@linutronix.de>
Cc: David Ahern <dsahern@gmail.com>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Mike Galbraith <efault@gmx.de>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Stephane Eranian <eranian@google.com>
Link: http://lkml.kernel.org/n/tip-me4dyj6s5snh7jr8wb9gzt82@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2012-01-06 15:42:52 -02:00
Arnaldo Carvalho de Melo
13d3ee5402 perf hists: Rename total_session to total_period
Nowadays we do it per evsel, not per session (that may have multiple
evsels), so rename it to avoid confusion.

Cc: David Ahern <dsahern@gmail.com>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Mike Galbraith <efault@gmx.de>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Stephane Eranian <eranian@google.com>
Link: http://lkml.kernel.org/n/tip-azsgomr5h4dmaudoogw48w49@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2012-01-06 15:42:08 -02:00
Arnaldo Carvalho de Melo
0e2a5f10fb perf python: Fix undefined symbol problem
Recently we made perf_evsel__init call hists__init, which broke the perf
python binding:

[root@emilia linux]# ./tools/perf/python/twatch.py
Traceback (most recent call last):
  File "./tools/perf/python/twatch.py", line 16, in <module>
    import perf
ImportError: /home/acme/git/build/perf/python/perf.so: undefined symbol: hists__init

Fix it by moving the hists__init function to its only caller, evsel.c.

This way we avoid dragging in other parts of tools/perf/util/ to the
perf python binding.

Cc: David Ahern <dsahern@gmail.com>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Mike Galbraith <efault@gmx.de>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Stephane Eranian <eranian@google.com>
Link: http://lkml.kernel.org/n/tip-5nffmdt5mu6ozxgj54oi4qon@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2011-11-16 10:02:26 -02:00
Arnaldo Carvalho de Melo
7928631a66 perf hists: Fix recalculation of total_period when sorting entries
We were doing parts of it in hists__collapse_resort and parts of it in
hists__output_resort, leading to a bogus total_period.

Fix it by doing just the filtering operation when collapsing, because
there we know that the Zoom operations add filters just to what is in
hists->entries, not to the new batch of entries being collapsed.

And move all the nr_entries + total_period recalculation to
hists__output_resort since we will traverse all entries anyway there.
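
A rough sketch of doing the whole recalculation in a single traversal.
The struct layouts and the name recalc_totals() are made up for this
example, an array stands in for the rb tree, and skipping filtered
entries is an assumption of the sketch:

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Illustrative subset of a hist entry. */
struct hist_entry {
	uint64_t period;
	unsigned int filtered;	/* non-zero when hidden by a filter */
};

/* Illustrative subset of the per-hists totals. */
struct hists {
	uint64_t nr_entries;
	uint64_t total_period;
};

/*
 * Reset both totals and rebuild them in one pass over all entries, so
 * that no other code path contributes a partial sum.
 */
static void recalc_totals(struct hists *hists,
			  const struct hist_entry *entries, size_t n)
{
	assert(hists != NULL);

	hists->nr_entries = 0;
	hists->total_period = 0;

	for (size_t i = 0; i < n; i++) {
		if (entries[i].filtered)
			continue;
		hists->nr_entries++;
		hists->total_period += entries[i].period;
	}
}
```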

Problem introduced when developing threaded addition of new batches
of hist_entries, i.e. post v3.1.

Reported-by: Frederic Weisbecker <fweisbec@gmail.com>
Cc: David Ahern <dsahern@gmail.com>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Mike Galbraith <efault@gmx.de>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Stephane Eranian <eranian@google.com>
Link: http://lkml.kernel.org/n/tip-8xyh165h7hmwy0696hu25en6@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2011-10-27 09:19:48 -02:00
Arnaldo Carvalho de Melo
d197fd5d74 perf hists: Don't consider filtered entries when calculating column widths
Cc: David Ahern <dsahern@gmail.com>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Mike Galbraith <efault@gmx.de>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Stephane Eranian <eranian@google.com>
Link: http://lkml.kernel.org/n/tip-rf01wktu1e3f3az32nry86vu@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2011-10-20 07:35:45 -02:00
Arnaldo Carvalho de Melo
c64550cfdd perf hists: Don't decay total_period for filtered entries
Following the 'perf report' model we don't zap hist_entry instances from
the rb tree, we just keep them with he->filtered set to a mask of the
filters applied to it (thread, parent, DSO so far).

In top we need to decay even filtered entries, but we better not touch
total_period for them...

Now everything seems to work when filters are applied on top as they
worked in 'report', i.e. both dynamic and static hist entry browsing
works with filters.

Cc: David Ahern <dsahern@gmail.com>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Mike Galbraith <efault@gmx.de>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Stephane Eranian <eranian@google.com>
Link: http://lkml.kernel.org/n/tip-yt4xsbq20u9x9ypuwwyw2kao@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2011-10-20 06:45:44 -02:00
Arnaldo Carvalho de Melo
90cf1fb5c0 perf hists browser: Apply the dso and thread filters when merging new batches
Now that we dynamically add entries on the timer, we need to not only
traverse all entries when the user zooms into threads and/or DSOs, but
also apply the filters afterwards to the new batches of hist entries in
hists__collapse_resort.

Reported-by: Mike Galbraith <efault@gmx.de>
Cc: David Ahern <dsahern@gmail.com>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Mike Galbraith <efault@gmx.de>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Stephane Eranian <eranian@google.com>
Link: http://lkml.kernel.org/n/tip-zustn633c7hnrae94x6nld1p@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2011-10-19 13:09:10 -02:00
Arnaldo Carvalho de Melo
d7b76f0935 perf hists: Move the dso and thread filters from hist_browser
Since with dynamic addition of new hist entries we need to apply those
filters as we merge new batches of hist_entry instances, for instance in
perf top.

Cc: David Ahern <dsahern@gmail.com>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Mike Galbraith <efault@gmx.de>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Stephane Eranian <eranian@google.com>
Link: http://lkml.kernel.org/n/tip-zjhhf8kh9w1buty9p10od6rz@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2011-10-19 09:28:19 -02:00
Arnaldo Carvalho de Melo
f1cf602c16 perf hists: Don't format the percentage on hist_entry__snprintf
We can't have color correctly set there because in libslang (and in a future
GUI) the colors must be set on a separate function call, so move that part to a
separate function and make the stdio fprintf function call it.

Cc: David Ahern <dsahern@gmail.com>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Mike Galbraith <efault@gmx.de>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Stephane Eranian <eranian@google.com>
Link: http://lkml.kernel.org/n/tip-jpgy42438ce9tgbqppm397lq@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2011-10-18 17:03:08 -02:00
Arnaldo Carvalho de Melo
b079d4e975 perf top: Honour --hide_{user,kernel}_symbols and the 'U' hotkey
The new decay routine (__hists__decay_entries) wasn't being passed the
toggles, fix it.

Reported-by: Mike Galbraith <efault@gmx.de>
Cc: David Ahern <dsahern@gmail.com>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Mike Galbraith <efault@gmx.de>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Stephane Eranian <eranian@google.com>
Link: http://lkml.kernel.org/n/tip-hg6m0mi1colket982oq9hhly@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2011-10-17 09:05:04 -02:00
Arnaldo Carvalho de Melo
e345fa185a perf top: Remove entries from entries_collapsed on decay
We were removing only when using a --sort order that needs collapsing,
while we also use it in the threaded case, causing memory corruption
because we were scribbling freed hist entries, oops.

Cc: David Ahern <dsahern@gmail.com>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Mike Galbraith <efault@gmx.de>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Stephane Eranian <eranian@google.com>
Link: http://lkml.kernel.org/n/tip-k16fb4jsulr7x0ixv43amb6d@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2011-10-13 10:29:17 -03:00
Arnaldo Carvalho de Melo
df71d95f86 perf hists: Don't free decayed entries if in the annotation browser
Just leave it there until the user exits the annotation browser.

Cc: David Ahern <dsahern@gmail.com>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Mike Galbraith <efault@gmx.de>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Stephane Eranian <eranian@google.com>
Link: http://lkml.kernel.org/n/tip-nmaxuzreqhm5k10t2co5sk9a@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2011-10-13 08:01:33 -03:00
Stephane Eranian
e39622ceb1 perf tools: Fix broken number of samples for perf report -n
The perf report -n option was broken because it was not reporting the
correct number of samples depending on the sorting mode. By default,
samples are sorted by comm,dso,sym. That means that samples for the same
command (binary) get collapsed.

The hists__collapse_insert_entry() had a bug whereby it was aggregating
the number of events observed (periods) but not the number of samples.
Consequently, the number of samples reported could be below reality. The
percentage remained correct because it is based on the periods.

This patch fixes the problem by also aggregating the number of samples.
Here is an example:

$ perf report -n --stdio
    12.38%        842     pong  [kernel.kallsyms]     [k] __lock_acquire

Here pong (a ctxsw stress test), is the only program running
and thus it is the only one responsible for the lock_acquire samples.

If we change the sorting mode:

$ perf report -n --stdio --sort=sym
    12.38%       1732  [k] __lock_acquire

The actual number of samples is shown.

With the fix:

$ perf report -n --stdio
    12.38%       1732     pong  [kernel.kallsyms]     [k] __lock_acquire
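
The aggregation step can be sketched as below. The struct and the
function name he_stat__merge() are illustrative only and do not
reproduce hists__collapse_insert_entry():

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Reduced set of per-entry counters; the names are illustrative. */
struct he_stat {
	uint64_t period;	/* sum of sample periods, drives the percentage */
	uint32_t nr_events;	/* number of samples, shown by -n */
};

/*
 * Fold 'he' into 'iter' when both collapse to the same sort key.
 * Adding only the period keeps the percentage right but undercounts
 * the samples, so both counters are accumulated.
 */
static void he_stat__merge(struct he_stat *iter, const struct he_stat *he)
{
	assert(iter != NULL && he != NULL);

	iter->period += he->period;
	iter->nr_events += he->nr_events;
}
```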

Cc: David Ahern <dsahern@gmail.com>
Cc: Ingo Molnar <mingo@elte.hu>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: http://lkml.kernel.org/r/20111003093815.GA6393@quad
Signed-off-by: Stephane Eranian <eranian@google.com>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2011-10-07 17:00:31 -03:00
Arnaldo Carvalho de Melo
ab81f3fd35 perf top: Reuse the 'report' hist_entry/hists classes
This actually fixes several problems we had in the old 'perf top':

1. Unresolved symbols were not shown, a limitation that came from the
   old "KernelTop" codebase; to solve it we would need to make changes
   that would make sym_entry have most of the hist_entry fields.
2. It was using the number of samples, not the sum of sample->period.

And brings the --sort code that allows us to have all the views in
'perf report', for instance:

[root@emilia ~]# perf top --sort dso
PerfTop: 5903 irqs/sec kernel:77.5% exact: 0.0% [1000Hz cycles], (all, 8 CPUs)
------------------------------------------------------------------------------

    31.59%  libcrypto.so.1.0.0
    21.55%  [kernel]
    18.57%  libpython2.6.so.1.0
     7.04%  libc-2.12.so
     6.99%  _backend_agg.so
     4.72%  sshd
     1.48%  multiarray.so
     1.39%  libfreetype.so.6.3.22
     1.37%  perf
     0.71%  libgobject-2.0.so.0.2200.5
     0.53%  [tg3]
     0.48%  libglib-2.0.so.0.2200.5
     0.44%  libstdc++.so.6.0.13
     0.40%  libcairo.so.2.10800.8
     0.38%  libm-2.12.so
     0.34%  umath.so
     0.30%  libgdk-x11-2.0.so.0.1800.9
     0.22%  libpthread-2.12.so
     0.20%  libgtk-x11-2.0.so.0.1800.9
     0.20%  librt-2.12.so
     0.15%  _path.so
     0.13%  libpango-1.0.so.0.2800.1
     0.11%  libatlas.so.3.0
     0.09%  ft2font.so
     0.09%  libpangoft2-1.0.so.0.2800.1
     0.08%  libX11.so.6.3.0
     0.07%  [vdso]
     0.06%  cyclictest
^C

All the filter lists can be used as well: --dsos, --comms, --symbols,
etc.

The 'perf report' TUI is also reused, so it is possible to apply all the
zoom operations, do annotation, etc.

This change will allow multiple simplifications in the symbol system as
well, that will be detailed in upcoming changesets.

Cc: David Ahern <dsahern@gmail.com>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Mike Galbraith <efault@gmx.de>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Stephane Eranian <eranian@google.com>
Link: http://lkml.kernel.org/n/tip-xzaaldxq7zhqrrxdxjifk1mh@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2011-10-07 16:56:44 -03:00
Arnaldo Carvalho de Melo
1980c2ebd7 perf hists: Threaded addition and sorting of entries
By using a mutex just for inserting and rotating two hist_entry rb
trees, so that when sorting we can get the last batch of entries created
from the ring buffer, merge it with whatever we have processed so far
and show the output while new entries are being added.

The 'report' tool continues, for now, to do it without threading, but
will use this in the future to allow visualization of results in long
perf.data sessions while the entries are being processed.

The new 'top' tool will be the first user.

Cc: David Ahern <dsahern@gmail.com>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Mike Galbraith <efault@gmx.de>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Stephane Eranian <eranian@google.com>
Link: http://lkml.kernel.org/n/tip-9b05atsn0q6m7fqgrug8fk2i@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2011-10-07 12:12:29 -03:00
Arnaldo Carvalho de Melo
3f2728bdb6 perf report: Add option to show total period
Just like --show-nr-samples, to help in diagnosing problems in the
tools.

Cc: David Ahern <dsahern@gmail.com>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Mike Galbraith <efault@gmx.de>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Stephane Eranian <eranian@google.com>
Link: http://lkml.kernel.org/n/tip-1lr7ejdjfvy2uwy2wkmatcpq@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2011-10-07 12:12:13 -03:00
Arnaldo Carvalho de Melo
ef9dfe6ec3 perf hists: Allow limiting the number of rows and columns in fprintf
So that we can reuse hists__fprintf in the new perf top tool.

Cc: David Ahern <dsahern@gmail.com>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Mike Galbraith <efault@gmx.de>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Stephane Eranian <eranian@google.com>
Link: http://lkml.kernel.org/n/tip-huazw48x05h8r9niz5cf63za@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2011-10-07 12:11:49 -03:00
Arnaldo Carvalho de Melo
42b28ac071 perf hists: Stop using 'self' for struct hists
Stop using this Python/OOP convention, it doesn't really help. Will do
more from time to time till we get it cleaned up in all of /perf.

Suggested-by: Thomas Gleixner <tglx@linutronix.de>
Cc: David Ahern <dsahern@gmail.com>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Mike Galbraith <efault@gmx.de>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Stephane Eranian <eranian@google.com>
Link: http://lkml.kernel.org/n/tip-91i56jwnzq9edhsj9y2y9l3b@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2011-10-07 12:11:36 -03:00
Frederic Weisbecker
e84d21227c perf tools: Don't display ignored entries on stdio ui
As for newt ui, don't display entries that have been marked
as ignored.

The practical current effect of this is to make parent
filtering really work. Before, entries that were ignored
were given a null parent but were still displayed. This
resulted in some weird effects:

 # Overhead      Command      Shared Object        Symbol
 # ........  ...........  .................  ............
 #
^A
                   |
                   --- __lock_acquire
                      |
                      |--95.97%-- lock_acquire
                      |          |
                      |          |--30.75%-- _raw_spin_lock

Discard these from the stdio display.

Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Ingo Molnar <mingo@elte.hu>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Stephane Eranian <eranian@google.com>
Cc: David Ahern <dsahern@gmail.com>
Cc: Sam Liao <phyomh@gmail.com>
2011-06-30 00:26:33 +02:00
Sam Liao
d797fdc5c5 perf tools: Add inverted call graph report support.
Add a "caller/callee" option to support an inverted butterfly report.
In the inverted report (with the caller option), the call graph starts
from the callee's ancestor. Users can use such a view to catch the
system's performance bottleneck from a sysprof-like view. Using this
option with a specified sort order like pid gives us a high-level view
of call graph statistics.

Also add "-G" alias for inverted call graph.

Signed-off-by: Sam Liao <phyomh@gmail.com>
Cc: Ingo Molnar <mingo@elte.hu>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Stephane Eranian <eranian@google.com>
Cc: David Ahern <dsahern@gmail.com>
Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com>
2011-06-30 00:24:30 +02:00
Arnaldo Carvalho de Melo
e248de331a perf tools: Improve support for sessions with multiple events
By creating a perf_evlist out of the attributes in the perf.data file
header, so that we can use evlists and evsels when reading recorded
sessions in addition to when we record sessions.

More work is needed for tools to let the user select which events are
wanted when browsing sessions, be it just one or a subset of them,
aggregated or shown at the same time but with different indications on
the UI, to allow seeing workloads through different views at the same
time.

But the overall goal/trend is to more uniformly use evsels and evlists.

Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Ingo Molnar <mingo@elte.hu>
Cc: Mike Galbraith <efault@gmx.de>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Stephane Eranian <eranian@google.com>
Cc: Tom Zanussi <tzanussi@gmail.com>
LKML-Reference: <new-submission>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2011-03-06 13:13:40 -03:00
Arnaldo Carvalho de Melo
d7603d5122 perf hists: Remove needless global col lenght calcs
To support multiple events we need to do these calcs per 'struct hists'
instance, and it turns out we already do that at:

	__hists__add_entry
		hists__inc_nr_entries
			hists__calc_col_len

for all the unfiltered hist_entry instances we stash in the rb tree, so
throw away the dead code.

Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Ingo Molnar <mingo@elte.hu>
Cc: Mike Galbraith <efault@gmx.de>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Stephane Eranian <eranian@google.com>
Cc: Tom Zanussi <tzanussi@gmail.com>
LKML-Reference: <new-submission>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2011-03-05 22:31:04 -03:00
Arnaldo Carvalho de Melo
fec9cbd15b perf hists: Print number of samples, not the period sum
So that we match the header where we state the number of events with the
"Samples" column when using 'perf report -n/--show-nr-samples':

 [root@emilia ~]# perf record -a sleep 1
 [ perf record: Woken up 1 times to write data ]
 [ perf record: Captured and wrote 0.111 MB perf.data (~4860 samples) ]
 [root@emilia ~]# perf report --stdio --show-nr-samples
 # Events: 11  cycles
 #
 # Overhead  Samples        Command       Shared Object                        Symbol
 # ........ ..........  ...........  ..................  ............................
 #
     16.65%          1        sleep  [kernel.kallsyms]   [k] unmap_vmas
     16.10%          1         perf  libpthread-2.12.so  [.] __pthread_cleanup_push_defer
     15.79%          2         perf  [kernel.kallsyms]   [k] format_decode
     12.88%          1  kworker/1:2  [kernel.kallsyms]   [k] cache_reap
     10.69%          1      swapper  [kernel.kallsyms]   [k] _raw_spin_lock
      7.55%          1        sleep  [kernel.kallsyms]   [k] prepare_exec_creds
      6.00%          1         perf  [jbd2]              [k] start_this_handle
      5.29%          1         perf  [kernel.kallsyms]   [k] seq_read
      4.75%          1         perf  [kernel.kallsyms]   [k] get_pid_task
      4.30%          1         perf  [kernel.kallsyms]   [k] _raw_spin_unlock_irqrestore

 #
 # (For a higher level overview, try: perf report --sort comm,dso)
 #
 [root@emilia ~]#

Reported-by: Stephane Eranian <eranian@google.com>
Acked-by: Stephane Eranian <eranian@google.com>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Ingo Molnar <mingo@elte.hu>
Cc: Mike Galbraith <efault@gmx.de>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Stephane Eranian <eranian@google.com>
Cc: Tom Zanussi <tzanussi@gmail.com>
LKML-Reference: <new-submission>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2011-02-17 13:56:20 -02:00
Arnaldo Carvalho de Melo
ce6f4fab40 perf annotate: Move locking to struct annotation
Since we'll need it when implementing the live annotate TUI browser.

This also simplifies things a bit by having the list head for the source
code in the dynamically allocated part of struct annotation; that
way we don't have to pass it around, since it can be found from the struct
symbol that is passed everywhere.

Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Ingo Molnar <mingo@elte.hu>
Cc: Mike Galbraith <efault@gmx.de>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Stephane Eranian <eranian@google.com>
Cc: Tom Zanussi <tzanussi@gmail.com>
LKML-Reference: <new-submission>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2011-02-08 15:03:36 -02:00
Arnaldo Carvalho de Melo
2f525d0148 perf annotate: Support multiple histograms in annotation
The perf annotate tool continues aggregating everything in just one
histogram, but to support the top model add support for one histogram
per evsel in the evlist.

Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Ingo Molnar <mingo@elte.hu>
Cc: Mike Galbraith <efault@gmx.de>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Stephane Eranian <eranian@google.com>
Cc: Tom Zanussi <tzanussi@gmail.com>
LKML-Reference: <new-submission>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2011-02-05 12:28:48 -02:00
Arnaldo Carvalho de Melo
78f7defedb perf annotate: Move annotate functions to util/
They will be used by perf top, so that we have just one set of routines
to do annotation.

Rename "struct sym_priv" to "struct annotation", etc, to clarify this
code a bit.

Rename "struct sym_ext" to "struct source_line", to give it a meaningful
name that clarifies that it is the result of an addr2line call,
sorted by the percentage with which one particular source code line
appeared in the annotation.

And since we're moving things around also rename 'sym_hist->ip' to
'sym_hist->addr' as we want to do data structure annotation at some
point.

Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Ingo Molnar <mingo@elte.hu>
Cc: Mike Galbraith <efault@gmx.de>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Stephane Eranian <eranian@google.com>
Cc: Tom Zanussi <tzanussi@gmail.com>
LKML-Reference: <new-submission>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2011-02-05 12:28:21 -02:00
Arnaldo Carvalho de Melo
8115d60c32 perf tools: Kill event_t typedef, use 'union perf_event' instead
And move the event_t methods to the perf_event__ namespace too.

No code changes, just namespace consistency.

Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Ingo Molnar <mingo@elte.hu>
Cc: Mike Galbraith <efault@gmx.de>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Stephane Eranian <eranian@google.com>
Cc: Tom Zanussi <tzanussi@gmail.com>
LKML-Reference: <new-submission>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2011-01-29 16:25:37 -02:00
Frederic Weisbecker
f08c3154ac perf callchain: Rename cumul_hits into callchain_cumul_hits
That makes the callchain API naming more consistent and
reduces potential naming clashes.

Cc: Ingo Molnar <mingo@elte.hu>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
LKML-Reference: <1294977121-5700-3-git-send-email-fweisbec@gmail.com>
Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2011-01-22 19:56:31 -02:00
Frederic Weisbecker
1b3a0e9592 perf callchain: Feed callchains into a cursor
The callchains are fed with an array of a fixed size.
As a result we iterate over each callchain three times:

- 1st to resolve symbols
- 2nd to filter out context boundaries
- 3rd for the insertion into the tree

This also involves some pairs of memory allocation/deallocation
every time we insert a callchain, for the filtered out array of
addresses and for the array of symbols that comes along.

Instead, feed the callchains through a linked list with persistent
allocations. This brings several advantages:

- Merge the 1st and 2nd iterations into one. That was possible before,
but in a way that would involve allocating an array slightly larger
than necessary because we don't know in advance the number of context
boundaries to filter out.

- Far fewer allocations/deallocations. The linked list keeps
persistent empty entries for the next usages and is extendable at
will.

- Makes it easier for multiple sources of callchains to feed a
stacktrace together. This is meant to pave the way for cfi based
callchains wherein traditional frame pointer based kernel
stacktraces will precede cfi based user ones, producing an overall
callchain whose size is hardly predictable. This requirement
makes the static array obsolete and makes a linked list based
iterator a much more flexible fit.
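
As an illustration only (not the code from this patch), the persistent
linked list can be sketched as below. The names struct chain_cursor,
cursor_init(), cursor_reset() and cursor_append() are hypothetical. The
point it shows is that a reset keeps the nodes, so the next callchain only
allocates when it is deeper than any seen before.

```c
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical names; a sketch of a reusable linked-list cursor. */
struct cursor_node {
	uint64_t ip;
	struct cursor_node *next;
};

struct chain_cursor {
	struct cursor_node *first;	/* head of the persistent node list */
	struct cursor_node **last;	/* slot where the next entry goes */
	size_t nr;			/* entries valid for the current callchain */
};

static void cursor_init(struct chain_cursor *c)
{
	c->first = NULL;
	c->last = &c->first;
	c->nr = 0;
}

/* Start a new callchain: keep the nodes, forget their contents. */
static void cursor_reset(struct chain_cursor *c)
{
	c->last = &c->first;
	c->nr = 0;
}

/* Append one address, allocating only when no spare node is left. */
static int cursor_append(struct chain_cursor *c, uint64_t ip)
{
	struct cursor_node *node = *c->last;

	if (node == NULL) {
		node = calloc(1, sizeof(*node));
		if (node == NULL)
			return -1;
		*c->last = node;
	}
	node->ip = ip;
	c->last = &node->next;
	c->nr++;
	return 0;
}
```

Nodes past nr still hold stale addresses after a reset, so a reader of
such a cursor has to stop at nr rather than at the end of the list.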

Basic testing on a big perf file containing callchains (~ 176 MB)
has shown a throughput gain of about 11% with perf report.

Cc: Ingo Molnar <mingo@elte.hu>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
LKML-Reference: <1294977121-5700-2-git-send-email-fweisbec@gmail.com>
Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2011-01-22 19:56:31 -02:00
Arnaldo Carvalho de Melo
9486aa3877 perf tools: Fix 64 bit integer format strings
Using %L[uxd] has issues in some architectures, like on ppc64.  Fix it
by making our 64 bit integers typedefs of stdint.h types and using
PRI[ux]64 like, for instance, git does.
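
A minimal sketch of the PRI[ux]64 style, assuming the 64 bit types are the
stdint.h ones. format_period() and format_addr() are hypothetical helpers
written for this example, not functions from the perf sources.

```c
#include <inttypes.h>
#include <stdint.h>
#include <stdio.h>

/* Format a 64 bit counter; PRIu64 expands to the right conversion. */
static int format_period(char *buf, size_t size, uint64_t period)
{
	return snprintf(buf, size, "%" PRIu64, period);
}

/* Format a 64 bit address in hex, with the "0x" prefix added by '#'. */
static int format_addr(char *buf, size_t size, uint64_t addr)
{
	return snprintf(buf, size, "%#" PRIx64, addr);
}
```

The macros are string literals, so they are pasted next to the "%" rather
than written inside the quotes.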

Reported by Denis Kirjanov, who provided a patch for one case; I went
and changed all cases.

Reported-by: Denis Kirjanov <dkirjanov@kernel.org>
Tested-by: Denis Kirjanov <dkirjanov@kernel.org>
LKML-Reference: <20110120093246.GA8031@hera.kernel.org>
Cc: Denis Kirjanov <dkirjanov@kernel.org>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Ingo Molnar <mingo@elte.hu>
Cc: Mike Galbraith <efault@gmx.de>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Pingtian Han <phan@redhat.com>
Cc: Stephane Eranian <eranian@google.com>
Cc: Tom Zanussi <tzanussi@gmail.com>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2011-01-22 23:41:57 -02:00
Ingo Molnar
aef1b9cef7 Merge commit 'v2.6.37' into perf/core
Merge reason: Add the final .37 tree.

Signed-off-by: Ingo Molnar <mingo@elte.hu>
2011-01-05 14:22:10 +01:00
Frederic Weisbecker
d425de5436 perf: Fix callchain hit bad cast on ascii display
ipchain__fprintf_graph() casts the number of hits in a branch to an
int, which means we lose its highest bits.

This results in meaningless numbers of callchain hits for perf.data
files that have a high number of hits recorded, typically those whose
callchain branch hits exceed INT_MAX. This happens
easily as those are weighted by the event period.

Reported-by: Nick Piggin <npiggin@kernel.dk>
Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Ingo Molnar <mingo@elte.hu>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Paul Mackerras <paulus@samba.org>
2011-01-03 16:13:11 +01:00
David Ahern
ec5761eab3 perf symbols: Add symfs option for off-box analysis using specified tree
The symfs argument allows analysis of a perf.data file using a locally accessible
filesystem tree with debug symbols - e.g., tree created during image builds,
sshfs mount, loop mounted KVM disk images, USB keys, initrds, etc. Anything
with an OS tree can be analyzed from anywhere without the need to populate a
local data store with build-ids.

Committer notes:

o Fixed up symfs="/" variants handling.

o prefixed DSO__ORIG_GUEST_KMODULE case with symfs too, avoiding use of files
  outside the symfs directory.
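
A rough sketch of the prefixing idea, under the assumption that the DSO
path is absolute. symfs_path() is a hypothetical helper for illustration;
it treats "" and any run of '/' as "no prefix", which is one way to handle
the symfs="/" variants.

```c
#include <stdio.h>
#include <string.h>

/* Prefix an absolute DSO path with a symfs root (hypothetical helper). */
static int symfs_path(char *buf, size_t size, const char *symfs,
		      const char *path)
{
	size_t len = strlen(symfs);

	/* Drop trailing slashes so "/mnt/img/" and "/" do not yield "//". */
	while (len > 0 && symfs[len - 1] == '/')
		len--;

	return snprintf(buf, size, "%.*s%s", (int)len, symfs, path);
}
```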

LKML-Reference: <1291926427-28846-1-git-send-email-daahern@cisco.com>
Signed-off-by: David Ahern <daahern@cisco.com>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2010-12-21 20:17:51 -02:00
Thomas Gleixner
3835bc00c5 perf event: Prevent unbound event__name array access
event__name[] is missing an entry for PERF_RECORD_FINISHED_ROUND, but we
happily access the array from the dump code.

Make event__name[] static and provide an accessor function, fix up all
callers and add the missing string.
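
The accessor pattern can be sketched as follows. The record ids and
strings are made up for the example (they are not the real event__name[]
contents); what matters is that callers can no longer index past the end
of the table or print a NULL entry.

```c
#include <stddef.h>

#define ARRAY_SIZE(a) (sizeof(a) / sizeof((a)[0]))

/* Hypothetical record ids; REC_GAP has no string on purpose. */
enum { REC_TOTAL, REC_MMAP, REC_GAP, REC_COMM };

static const char *record_names[] = {
	[REC_TOTAL] = "TOTAL",
	[REC_MMAP]  = "MMAP",
	[REC_COMM]  = "COMM",
};

/* Bounds-checked lookup: never reads outside record_names[]. */
static const char *record_name(unsigned int id)
{
	if (id >= ARRAY_SIZE(record_names))
		return "INVALID";
	if (record_names[id] == NULL)
		return "UNKNOWN";
	return record_names[id];
}
```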

Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Ian Munsie <imunsie@au1.ibm.com>
Cc: Ingo Molnar <mingo@elte.hu>
Cc: Peter Zijlstra <peterz@infradead.org>
LKML-Reference: <20101207124550.432593943@linutronix.de>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2010-12-09 11:15:07 -02:00
Frederic Weisbecker
612d4fd7d0 perf: Support for callchains merge
If we sort the histograms by comm, which is the default,
we need to merge some of them, typically different thread
histograms of the same process, or just the same comm. But during
this merge, we forgot to merge callchains.

So imagine we have three threads (tids: 1000, 1001, 1002) that
belong to comm "foo".

tid 1000 got 100 events
tid 1001 got 10 events
tid 1002 got 3 events

Once we merge these histograms to get a per comm result, we'll
finally get:

"foo" got 113 events

The problem is if we merge 1000 and 1001 histograms into 1002, then
the end merge result, wrt callchains, will be only callchains that
belong to 1002.
This is because we haven't handled callchains in the merge. Only those
from one of the threads inside a common comm survive.

It means during this merge, we can lose a lot of callchains.

Fix this by implementing callchains merge and apply it on histograms
that collapse.
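
A deliberately simplified sketch of the idea, using the numbers from the
example above. The real callchains form a tree whose matching branches
must be merged; here each entry just carries a flat list that is spliced,
and struct chain, struct entry and entry_merge() are hypothetical names.

```c
#include <stddef.h>
#include <stdint.h>

struct chain {
	uint64_t hits;
	struct chain *next;
};

struct entry {
	const char *comm;
	uint64_t events;
	struct chain *chains;
};

/* Fold 'src' into 'dst': add the events and keep src's callchains too. */
static void entry_merge(struct entry *dst, struct entry *src)
{
	struct chain **tail = &dst->chains;

	dst->events += src->events;
	while (*tail != NULL)
		tail = &(*tail)->next;
	*tail = src->chains;	/* splice instead of dropping them */
	src->chains = NULL;
	src->events = 0;
}

/* Sum of the hits carried by all callchains of an entry. */
static uint64_t entry_chain_hits(const struct entry *e)
{
	const struct chain *c;
	uint64_t sum = 0;

	for (c = e->chains; c != NULL; c = c->next)
		sum += c->hits;
	return sum;
}
```

Merging tids 1000 and 1001 into 1002 this way leaves "foo" with 113
events and with callchains accounting for all 113 of them.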

Reported-by: Christoph Hellwig <hch@infradead.org>
Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Ingo Molnar <mingo@elte.hu>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Paul Mackerras <paulus@samba.org>
2010-08-22 21:10:35 +02:00
Frederic Weisbecker
d2009c5130 perf: Keep track of the max depth of a callchain
In order to implement callchains collapsing, we need to keep
track of the maximum depth in a histogram tree of callchains.
This way we'll avoid allocating an arbitrary temporary buffer
size at callchain merge time.

Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Ingo Molnar <mingo@elte.hu>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Christoph Hellwig <hch@infradead.org>
2010-08-22 20:43:17 +02:00
Arnaldo Carvalho de Melo
9222116287 perf annotate: Sort by hottest lines in the TUI
Right now it will just sort and position at the hottest line, i.e.
the one where the most samples were taken.

It will be at the center of the screen and later TAB/shift-TAB will
cycle through the hottest lines.

Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Mike Galbraith <efault@gmx.de>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Stephane Eranian <eranian@google.com>
LKML-Reference: <new-submission>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2010-08-10 16:11:42 -03:00
Arnaldo Carvalho de Melo
903cce6eb9 perf hists: Handle verbose in hists__sort_list_width
Otherwise entries will get chopped up on the window.

Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Mike Galbraith <efault@gmx.de>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Stephane Eranian <eranian@google.com>
LKML-Reference: <new-submission>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2010-08-05 19:38:01 -03:00
Arnaldo Carvalho de Melo
0a1eae391d perf tools: Don't keep unreferenced maps when unmaps are detected
For a file with:

[root@emilia linux-2.6-tip]# perf report -D -fi allmodconfig-j32.perf.data | grep events:
     TOTAL events:      36933
      MMAP events:       9056
      LOST events:          0
      COMM events:       1702
      EXIT events:       1887
  THROTTLE events:          8
UNTHROTTLE events:          8
      FORK events:       1894
      READ events:          0
    SAMPLE events:      22378
      ATTR events:          0
EVENT_TYPE events:          0
TRACING_DATA events:          0
  BUILD_ID events:          0
[root@emilia linux-2.6-tip]#

Testing with valgrind and making perf_session__delete() a nop, so that
we can notice how many maps were actually deleted due to not having any
samples on it:

==== HEAP SUMMARY:

Before:

==10339==     in use at exit: 8,909,997 bytes in 68,690 blocks
==10339==   total heap usage: 78,696 allocs, 10,007 frees, 11,925,853 bytes allocated

After:

==10506==     in use at exit: 8,902,605 bytes in 68,606 blocks
==10506==   total heap usage: 78,696 allocs, 10,091 frees, 11,925,853 bytes allocated

I.e. just 84 detected unmaps with no hits out of 9056 for this workload,
not much, but in some other long running workload this may save more
bytes.

Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Mike Galbraith <efault@gmx.de>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Stephane Eranian <eranian@google.com>
LKML-Reference: <new-submission>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2010-08-02 19:45:23 -03:00
Ingo Molnar
3772b73472 Merge commit 'v2.6.35' into perf/core
Conflicts:
	tools/perf/Makefile
	tools/perf/util/hist.c

Merge reason: Resolve the conflicts and update to latest upstream.

Signed-off-by: Ingo Molnar <mingo@elte.hu>
2010-08-02 08:31:54 +02:00
Arnaldo Carvalho de Melo
0f0cbf7aa3 perf ui: New hists tree widget
The stock newt checkbox tree widget we were using was not really
suitable for hist entry + callchain browsing.

The problems with it were manifold:

- We needed to traverse the whole hist_entry rb_tree to add each entry +
  callchains beforehand.

- No control over the colors used for each row

So a new tree widget, based mostly on slang, was written.

It extends the ui_browser class already used for annotate to allow the
user to fold/unfold branches in the callchains tree, using extra fields
in the symbol_map class that is embedded in hist_entry and
callchain_node instances to store the folding state and when changing
this state calculates the number of rows that are produced when showing
a particular hist_entry instance.

This greatly speeds up browsing as we don't have to upfront touch all
the entries and only calculate callchain related operations when some
callchain branch is actually unfolded.

The memory footprint is also reduced as the data structure is not
duplicated, just some extra fields for controlling callchain state and to
simplify the process of seeking through entries (nr_rows, row_offset) were
added.

Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Mike Galbraith <efault@gmx.de>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Stephane Eranian <eranian@google.com>
LKML-Reference: <new-submission>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2010-07-27 11:24:31 -03:00
Arnaldo Carvalho de Melo
06daaaba7c perf hist: Introduce routine to measure length of formatted entry
Will be used to figure out the window width needed in the new tree
widget.

Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Mike Galbraith <efault@gmx.de>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Stephane Eranian <eranian@google.com>
LKML-Reference: <new-submission>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2010-07-27 11:24:31 -03:00
Arnaldo Carvalho de Melo
8a6c5b261c perf sort: Make column width code per hists instance
They were globals, and since we support multiple hists and sessions
at the same time, it doesn't make sense to calculate those values
considering all symbols in all sessions.
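
Roughly, the per instance bookkeeping looks like the sketch below. The
column enum and the names struct hists_widths and widths_update() are
hypothetical and only meant to show widths living in each instance instead
of in globals.

```c
#include <string.h>

enum col { COL_COMM, COL_DSO, COL_SYM, COL_MAX };	/* hypothetical */

struct hists_widths {
	unsigned short col_len[COL_MAX];	/* widest value seen, per column */
};

/* Grow a column only when this instance sees a longer string. */
static int widths_update(struct hists_widths *w, enum col c, const char *s)
{
	size_t len = strlen(s);

	if (len > w->col_len[c]) {
		w->col_len[c] = (unsigned short)len;
		return 1;
	}
	return 0;
}
```

Two instances fed different symbols end up with different widths, so a
long name in one session no longer widens the columns of another.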

Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Mike Galbraith <efault@gmx.de>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Stephane Eranian <eranian@google.com>
LKML-Reference: <new-submission>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2010-07-23 08:55:59 -03:00
Arnaldo Carvalho de Melo
7a007ca90b perf hists: Mark entries filtered by parent
And don't consider them in hists__inc_nr_entries.

Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Mike Galbraith <efault@gmx.de>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Stephane Eranian <eranian@google.com>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2010-07-23 08:55:59 -03:00
Arnaldo Carvalho de Melo
70a7cb3b39 perf annotate: Fix handling of goto labels that are valid hex numbers
When parsing the objdump disassembly output we can have goto labels that
are valid hex numbers and thus get confused with lines with machine
code.

Handle the common case of a label that has nothing after it and other
cases where there is just source code by validating the resulting "ip".

It is still possible that we find goto labels that are in the function
address range, but only if they are located before the real address we
should be OK.
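
The checks described above might look roughly like this. It is a sketch
under assumptions, not the patch itself: parse_insn_addr() is a
hypothetical name, and the function range is passed in as [start, end).

```c
#include <ctype.h>
#include <stdint.h>
#include <stdlib.h>

/* Returns 1 and stores the address if 'line' looks like machine code. */
static int parse_insn_addr(const char *line, uint64_t start, uint64_t end,
			   uint64_t *addr)
{
	char *tail;
	unsigned long long value = strtoull(line, &tail, 16);

	if (tail == line || *tail != ':')
		return 0;	/* no hex number followed by a colon */

	tail++;
	while (isspace((unsigned char)*tail))
		tail++;
	if (*tail == '\0')
		return 0;	/* a label with nothing after it */

	if (value < start || value >= end)
		return 0;	/* "address" outside the function: source code */

	*addr = value;
	return 1;
}
```

A label whose hex value happens to fall inside the function range would
still pass, which is the remaining ambiguity noted above.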

A change in the objdump output to have a clear marker separating
addresses from the disassembly would come in handy, but we would still have
to deal with older versions.

Reported-by: Gleb Natapov <gleb@redhat.com>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Gleb Natapov <gleb@redhat.com>
Cc: Mike Galbraith <efault@gmx.de>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Stephane Eranian <eranian@google.com>
LKML-Reference: <20100722170541.GF17631@ghostprotocols.net>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2010-07-22 14:04:13 -03:00
Arnaldo Carvalho de Melo
cc5edb0eb9 perf hists: Factor out duplicated code
Introducing hists__remove_entry_filter.

Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Ingo Molnar <mingo@elte.hu>
Cc: Mike Galbraith <efault@gmx.de>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Stephane Eranian <eranian@google.com>
LKML-Reference: <new-submission>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2010-07-17 15:45:55 -03:00
Frederic Weisbecker
58c3439083 perf: Fix various display bugs with parent filtering
Hists that have been filtered, because they don't have callchains
matching the parent filter, won't be printed. As such,
hist_entry__snprintf() returns 0 for them, but we don't check
this value and we always print the buffer, which might be
untouched and then only made of random stack garbage.

Not only does it paint the screen with barf, it also prints
the callchains for these hists, even though they have been filtered,
since the hist has been filtered as well.

We need to check the return value of hist_entry__snprintf() and
ignore the hist if it is 0, which means it didn't get any callchain
matching the parent filter. This fixes the barf and the undesired
callchains.
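
The pattern of the fix, sketched with hypothetical names (struct row,
row_snprintf() and rows_fprintf() stand in for the real hist entry code):
a formatter that returns 0 leaves the buffer untouched, so the caller must
skip the row instead of printing the buffer.

```c
#include <stdio.h>

struct row {
	const char *sym;
	int filtered;
};

/* Returns the number of characters written, or 0 for a filtered row. */
static int row_snprintf(const struct row *r, char *buf, size_t size)
{
	if (r->filtered)
		return 0;	/* buf is left untouched */
	return snprintf(buf, size, "%s", r->sym);
}

/* Print only rows that produced output; returns how many were printed. */
static int rows_fprintf(const struct row *rows, int nr, FILE *fp)
{
	char buf[128];
	int i, printed = 0;

	for (i = 0; i < nr; i++) {
		if (row_snprintf(&rows[i], buf, sizeof(buf)) == 0)
			continue;	/* never print an unwritten buffer */
		fprintf(fp, "%s\n", buf);
		printed++;
	}
	return printed;
}
```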

Reported-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Ingo Molnar <mingo@elte.hu>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Paul Mackerras <paulus@samba.org>
2010-07-16 04:56:09 +02:00
Arun Sharma
f60f359383 perf report: Implement --sort cpu
In a shared multi-core environment, users want to analyze why their
program was slow. In particular, if the code ran slower only on certain
CPUs due to interference from other programs or kernel threads, the user
should be able to notice that.

Sample usage:

perf record -f -a -- sleep 3
perf report --sort cpu,comm

Workload:

program is running on 16 CPUs
Experiencing interference from an antagonist only on 4 CPUs.

  Samples: 106218177676 cycles

  Overhead  CPU          Command
  ........  ...  ...............

     6.25%  2            program
     6.24%  6            program
     6.24%  11           program
     6.24%  5            program
     6.24%  9            program
     6.24%  10           program
     6.23%  15           program
     6.23%  7            program
     6.23%  3            program
     6.23%  14           program
     6.22%  1            program
     6.20%  13           program
     3.17%  12           program
     3.15%  8            program
     3.14%  0            program
     3.13%  4            program
     3.11%  4         antagonist
     3.11%  0         antagonist
     3.10%  8         antagonist
     3.07%  12        antagonist

Cc: David S. Miller <davem@davemloft.net>
Cc: Frédéric Weisbecker <fweisbec@gmail.com>
Cc: Ingo Molnar <mingo@elte.hu>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Tom Zanussi <tzanussi@gmail.com>
LKML-Reference: <20100505181612.GA5091@sharma-home.net>
Signed-off-by: Arun Sharma <aruns@google.com>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2010-06-05 09:35:53 -03:00
Stephane Eranian
45d8e8025a perf annotate: Ask objdump to demangle symbols
perf report demangles symbols, but perf annotate does not.

The former uses internal demangling via libbfd or libiberty. The latter
executes objdump which by default does not demangle symbols.

This patch adds the -C option to the objdump cmdline to enable symbol
demangling.

Cc: David S. Miller <davem@davemloft.net>
Cc: Frédéric Weisbecker <fweisbec@gmail.com>
Cc: Ingo Molnar <mingo@elte.hu>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Tom Zanussi <tzanussi@gmail.com>
LKML-Reference: <4c07b323.2126e30a.6245.0e1e@mx.google.com>
Signed-off-by: Stephane Eranian <eranian@google.com>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2010-06-05 09:34:59 -03:00
Konstantin Stepanyuk
75d9ef1707 perf hist: fix objdump output parsing
hist_entry__annotate() runs objdump with -S option so the output may contain
lines of any format. If a line starts with a colon, strtoull() returns 0 and
the calculated offset will be negative. This causes perf annotate to segfault.

Make sure that strtoull() has parsed at least one digit.
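
A minimal sketch of that check, using the end pointer that strtoull()
reports. parse_line_addr() is a hypothetical helper, not the function
changed by this patch.

```c
#include <stdint.h>
#include <stdlib.h>

/* Parse the leading hex address of a line; 0 on success, -1 if none. */
static int parse_line_addr(const char *line, uint64_t *addr)
{
	char *end;
	unsigned long long value = strtoull(line, &end, 16);

	if (end == line)
		return -1;	/* no digit consumed: the 0 is not an address */
	*addr = value;
	return 0;
}
```

When no conversion is performed, strtoull() stores the start of the
string in the end pointer, which is what distinguishes "parsed a zero"
from "parsed nothing".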

Cc: David S. Miller <davem@davemloft.net>
Cc: Frédéric Weisbecker <fweisbec@gmail.com>
Cc: Ingo Molnar <mingo@elte.hu>
Cc: Mike Galbraith <efault@gmx.de>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Stephane Eranian <eranian@google.com>
Cc: Tom Zanussi <tzanussi@gmail.com>
LKML-Reference: <new-submission>
Signed-off-by: Konstantin Stepanyuk <konstantin.stepanyuk@gmail.com>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2010-06-01 05:44:36 -03:00
Arnaldo Carvalho de Melo
44bf460649 perf annotate: Fix up usage of the build id cache
It was assuming that the cache was always available and also wasn't
checking if the file found in the build id cache was just a kallsyms
file, which is not supported by objdump for disassembly.

Reported-by: Ingo Molnar <mingo@elte.hu>
Cc: Frédéric Weisbecker <fweisbec@gmail.com>
Cc: Mike Galbraith <efault@gmx.de>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Stephane Eranian <eranian@google.com>
Cc: Tom Zanussi <tzanussi@gmail.com>
LKML-Reference: <new-submission>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2010-05-23 22:35:07 -03:00
Arnaldo Carvalho de Melo
46e3e055ce perf annotate: Add TUI interface
When annotating multiple entries, for instance, when running simply as:

$ perf annotate

the right and left keys, as well as TAB can be used to cycle thru the
multiple symbols being annotated.

If one doesn't like TUI annotate, disable it by editing ~/.perfconfig
and adding:

[tui]

	annotate = off

Just like it is possible for report.

Cc: Frédéric Weisbecker <fweisbec@gmail.com>
Cc: Mike Galbraith <efault@gmx.de>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Stephane Eranian <eranian@google.com>
Cc: Tom Zanussi <tzanussi@gmail.com>
LKML-Reference: <new-submission>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2010-05-22 11:25:40 -03:00
Frederic Weisbecker
598357eba6 perf: Fix getline undeclared
We need to have stdio.h included with _GNU_SOURCE for getline,
which is broken with the inclusion of build-id.h.

Keep util.h included first in hist.c

Fixes:
	util/hist.c: In function 'hist_entry__parse_objdump_line':
	util/hist.c:938: warning: implicit declaration of function 'getline'
	util/hist.c:938: warning: nested extern declaration of 'getline'
	make: *** [util/hist.o] Error 1

Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Paul Mackerras <paulus@samba.org>
LKML-Reference: <1274438919-5104-1-git-send-regression-fweisbec@gmail.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2010-05-21 13:55:32 +02:00
Arnaldo Carvalho de Melo
b36f19d572 perf annotate: Use build-ids to find the right DSO
We were still using the pathname found in the MMAP event, which might not
be the one we used when recording, so use the build-id cache for that,
only falling back to the pathname in the MMAP event if no build-ids
are available.

With this we now also are able to do secure, seamless offline annotation.

Example:

[root@doppio linux-2.6-tip]# perf report -g none -v 2> /dev/null | head -10
     8.12%     Xorg  /usr/lib64/libpixman-1.so.0.14.0       0x0000000000026d02 B [.] pixman_rasterize_edges
     4.68%  firefox  /usr/lib64/xulrunner-1.9.1/libxul.so   0x00000000005dbdba B [.] 0x000000005dbdba
     3.70%  swapper  /lib/modules/2.6.34-rc6/build/vmlinux  0xffffffff81022cea ! [k] read_hpet
     2.96%     init  /lib/modules/2.6.34-rc6/build/vmlinux  0xffffffff81022cea ! [k] read_hpet
     2.73%  swapper  /lib/modules/2.6.34-rc6/build/vmlinux  0xffffffff8100a738 ! [k] mwait_idle_with_hints
[root@doppio linux-2.6-tip]# perf annotate -v pixman_rasterize_edges 2>&1 | grep Executing
Executing: objdump --start-address=0x000000371ce26670 --stop-address=0x000000371ce2709f -dS /root/.debug/.build-id/bd/6ac5199137aaeb279f864717d8d061477466c1|grep -v /root/.debug/.build-id/bd/6ac5199137aaeb279f864717d8d061477466c1|expand
[root@doppio linux-2.6-tip]# perf buildid-list | grep libpixman-1.so.0.14.0
bd6ac5199137aaeb279f864717d8d061477466c1 /usr/lib64/libpixman-1.so.0.14.0
[root@doppio linux-2.6-tip]#
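
The cache file name in the transcript above follows a visible pattern: the
first two hex characters of the build-id become a directory and the rest
the file name. A sketch of that mapping, with buildid_cache_path() being a
hypothetical helper and the layout inferred only from the output shown:

```c
#include <stdio.h>
#include <string.h>

/* Build "<cache>/.build-id/xx/yyyy..." from a hex build-id string. */
static int buildid_cache_path(char *buf, size_t size, const char *cache_dir,
			      const char *hex)
{
	int n;

	if (strlen(hex) < 3)
		return -1;	/* need a directory part and a file part */

	n = snprintf(buf, size, "%s/.build-id/%.2s/%s", cache_dir, hex, hex + 2);
	return (n < 0 || (size_t)n >= size) ? -1 : 0;
}
```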

Reported-by: Stephane Eranian <eranian@google.com>
Cc: Frédéric Weisbecker <fweisbec@gmail.com>
Cc: Mike Galbraith <efault@gmx.de>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Stephane Eranian <eranian@google.com>
Cc: Tom Zanussi <tzanussi@gmail.com>
LKML-Reference: <new-submission>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2010-05-20 12:15:33 -03:00
Arnaldo Carvalho de Melo
edb7c60e27 perf options: Type check all the remaining OPT_ variants
OPT_SET_INT was renamed to OPT_SET_UINT since the only use in these
tools is to set something that has an enum type, that is builtin
compatible with unsigned int.

Several string constifications were done to make OPT_STRING require a
const char * type.

Cc: Frédéric Weisbecker <fweisbec@gmail.com>
Cc: Mike Galbraith <efault@gmx.de>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Stephane Eranian <eranian@google.com>
Cc: Tom Zanussi <tzanussi@gmail.com>
LKML-Reference: <new-submission>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2010-05-17 16:22:41 -03:00
Arnaldo Carvalho de Melo
c82ee828aa perf report: Report number of events, not samples
Number of samples is meaningless after we switched to auto-freq, so
report the number of events, i.e. not the sum of the different periods,
but the number of PERF_RECORD_SAMPLE events emitted by the kernel.

While doing this I noticed that using "count" as the name for the sum of
all the event periods can be confusing, so rename it to .period, just like in
struct sample.data, so that we become more consistent.

This helps with the next step, which was to record in struct hist_entry
the number of sample events for each instance. We need that because we
use it to generate the number of events when applying filters to the
tree of hist entries, as is done in the TUI report browser.

Suggested-by: Ingo Molnar <mingo@elte.hu>
Cc: Frédéric Weisbecker <fweisbec@gmail.com>
Cc: Mike Galbraith <efault@gmx.de>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Tom Zanussi <tzanussi@gmail.com>
LKML-Reference: <new-submission>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2010-05-14 14:19:35 -03:00
Arnaldo Carvalho de Melo
cee75ac7ec perf hist: Clarify events_stats fields usage
The events_stats.total field is too generic, rename it to .total_period,
and also add a comment explaining that it is the sum of all the .period
fields in samples, that is needed because we use auto-freq to avoid
sampling artifacts.

Ditto for events_stats.lost, that is the sum of all lost_event.lost
fields, i.e. the number of events the kernel dropped.
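
To illustrate the distinction between the sums and the plain event count,
here is a small sketch. struct stats and its helpers are hypothetical; only
the field meanings (sum of periods, sum of lost counts) come from the text
above.

```c
#include <stdint.h>

struct stats {
	uint64_t total_period;	/* sum of all sample periods */
	uint64_t total_lost;	/* sum of all lost counts */
	uint32_t nr_samples;	/* number of sample records seen */
};

static void stats_add_sample(struct stats *s, uint64_t period)
{
	s->total_period += period;
	s->nr_samples++;
}

static void stats_add_lost(struct stats *s, uint64_t lost)
{
	s->total_lost += lost;
}

/* Overhead of one entry: its period relative to the period sum. */
static double stats_overhead(const struct stats *s, uint64_t period)
{
	return s->total_period ?
		100.0 * (double)period / (double)s->total_period : 0.0;
}
```

Percentages come from the period sum, while the count shown to the user is
simply how many samples were added.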

Looking at the users, builtin-sched.c can make use of these fields and
stop computing the same sums again.

Cc: Frédéric Weisbecker <fweisbec@gmail.com>
Cc: Mike Galbraith <efault@gmx.de>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Tom Zanussi <tzanussi@gmail.com>
LKML-Reference: <new-submission>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2010-05-14 13:16:55 -03:00
Arnaldo Carvalho de Melo
c8446b9bda perf hist: Make event__totals per hists
This is one more thing that started global but is more useful per hist
or per session.

Cc: Frédéric Weisbecker <fweisbec@gmail.com>
Cc: Mike Galbraith <efault@gmx.de>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Tom Zanussi <tzanussi@gmail.com>
LKML-Reference: <new-submission>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2010-05-14 10:36:42 -03:00