Commit Graph

19377 Commits

Author SHA1 Message Date
Arnaldo Carvalho de Melo
9afec87ec1 perf trace: Pass a syscall_arg to syscall_arg_fmt->strtoul()
With just what we need for the STUL_STRARRAY, i.e. the 'struct strarray'
pointer to be used, just like with syscall_arg_fmt->scnprintf() for the
other direction (number -> string).

With this all the strarrays that are associated with syscalls can be
used with '-e syscalls:sys_enter_SYSCALLNAME --filter', and soon will be
possible as well to use with the strace-like shorter form, with just the
syscall names, i.e. something like:

   -e lseek/whence==END/

For now we have to use the longer form:

    # perf trace -e syscalls:sys_enter_lseek
       0.000 pool/2242 syscalls:sys_enter_lseek(fd: 14<anon_inode:[timerfd]>, offset: 0, whence: CUR)
       0.031 pool/2242 syscalls:sys_enter_lseek(fd: 15<anon_inode:[timerfd]>, offset: 0, whence: CUR)
       0.046 pool/2242 syscalls:sys_enter_lseek(fd: 16<anon_inode:[timerfd]>, offset: 0, whence: CUR)
    5003.528 pool/2242 syscalls:sys_enter_lseek(fd: 14<anon_inode:[timerfd]>, offset: 0, whence: CUR)
    5003.575 pool/2242 syscalls:sys_enter_lseek(fd: 15<anon_inode:[timerfd]>, offset: 0, whence: CUR)
    5003.593 pool/2242 syscalls:sys_enter_lseek(fd: 16<anon_inode:[timerfd]>, offset: 0, whence: CUR)
   10002.017 pool/2242 syscalls:sys_enter_lseek(fd: 14<anon_inode:[timerfd]>, offset: 0, whence: CUR)
   10002.051 pool/2242 syscalls:sys_enter_lseek(fd: 15<anon_inode:[timerfd]>, offset: 0, whence: CUR)
   10002.068 pool/2242 syscalls:sys_enter_lseek(fd: 16<anon_inode:[timerfd]>, offset: 0, whence: CUR)
  ^C# perf trace -e syscalls:sys_enter_lseek --filter="whence!=CUR"
       0.000 sshd/24476 syscalls:sys_enter_lseek(fd: 3, offset: 9032, whence: SET)
       0.060 sshd/24476 syscalls:sys_enter_lseek(fd: 3</usr/lib64/libcrypt.so.2.0.0>, offset: 9032, whence: SET)
       0.187 sshd/24476 syscalls:sys_enter_lseek(fd: 3</usr/lib64/libcrypt.so.2.0.0>, offset: 118632, whence: SET)
       0.203 sshd/24476 syscalls:sys_enter_lseek(fd: 3</usr/lib64/libcrypt.so.2.0.0>, offset: 118632, whence: SET)
       0.349 sshd/24476 syscalls:sys_enter_lseek(fd: 3</usr/lib64/libcrypt.so.2.0.0>, offset: 61936, whence: SET)
  ^C#

And for those curious about what are those lseek(DSO, offset, SET), well, its the loader:

  # perf trace -e syscalls:sys_enter_lseek/max-stack=16/ --filter="whence!=CUR"
     0.000 sshd/24495 syscalls:sys_enter_lseek(fd: 3</usr/lib64/libgcrypt.so.20.2.5>, offset: 9032, whence: SET)
                                       __libc_lseek64 (/usr/lib64/ld-2.29.so)
                                       _dl_map_object (/usr/lib64/ld-2.29.so)
     0.067 sshd/24495 syscalls:sys_enter_lseek(fd: 3</usr/lib64/libgcrypt.so.20.2.5>, offset: 9032, whence: SET)
                                       __libc_lseek64 (/usr/lib64/ld-2.29.so)
                                       _dl_map_object_from_fd (/usr/lib64/ld-2.29.so)
                                       _dl_map_object (/usr/lib64/ld-2.29.so)
     0.198 sshd/24495 syscalls:sys_enter_lseek(fd: 3</usr/lib64/libgcrypt.so.20.2.5>, offset: 118632, whence: SET)
                                       __libc_lseek64 (/usr/lib64/ld-2.29.so)
                                       _dl_map_object (/usr/lib64/ld-2.29.so)
     0.219 sshd/24495 syscalls:sys_enter_lseek(fd: 3</usr/lib64/libgcrypt.so.20.2.5>, offset: 118632, whence: SET)
                                       __libc_lseek64 (/usr/lib64/ld-2.29.so)
                                       _dl_map_object_from_fd (/usr/lib64/ld-2.29.so)
                                       _dl_map_object (/usr/lib64/ld-2.29.so)
  ^C#

:-)

With this we can use strings in strarrays in filters, which allows us to
reuse all these that are in place for syscalls:

  $ find tools/perf/trace/beauty/ -name "*.c" | xargs grep -w DEFINE_STRARRAY
  tools/perf/trace/beauty/fcntl.c:	static DEFINE_STRARRAY(fcntl_setlease, "F_");
  tools/perf/trace/beauty/mmap.c:       static DEFINE_STRARRAY(mmap_flags, "MAP_");
  tools/perf/trace/beauty/mmap.c:       static DEFINE_STRARRAY(madvise_advices, "MADV_");
  tools/perf/trace/beauty/sync_file_range.c:       static DEFINE_STRARRAY(sync_file_range_flags, "SYNC_FILE_RANGE_");
  tools/perf/trace/beauty/socket.c:	static DEFINE_STRARRAY(socket_ipproto, "IPPROTO_");
  tools/perf/trace/beauty/mount_flags.c:	static DEFINE_STRARRAY(mount_flags, "MS_");
  tools/perf/trace/beauty/pkey_alloc.c:	static DEFINE_STRARRAY(pkey_alloc_access_rights, "PKEY_");
  tools/perf/trace/beauty/sockaddr.c:DEFINE_STRARRAY(socket_families, "PF_");
  tools/perf/trace/beauty/tracepoints/x86_irq_vectors.c:static DEFINE_STRARRAY(x86_irq_vectors, "_VECTOR");
  tools/perf/trace/beauty/tracepoints/x86_msr.c:static DEFINE_STRARRAY(x86_MSRs, "MSR_");
  tools/perf/trace/beauty/prctl.c:	static DEFINE_STRARRAY(prctl_options, "PR_");
  tools/perf/trace/beauty/prctl.c:	static DEFINE_STRARRAY(prctl_set_mm_options, "PR_SET_MM_");
  tools/perf/trace/beauty/fspick.c:       static DEFINE_STRARRAY(fspick_flags, "FSPICK_");
  tools/perf/trace/beauty/ioctl.c:	static DEFINE_STRARRAY(ioctl_tty_cmd, "");
  tools/perf/trace/beauty/ioctl.c:	static DEFINE_STRARRAY(drm_ioctl_cmds, "");
  tools/perf/trace/beauty/ioctl.c:	static DEFINE_STRARRAY(sndrv_pcm_ioctl_cmds, "");
  tools/perf/trace/beauty/ioctl.c:	static DEFINE_STRARRAY(sndrv_ctl_ioctl_cmds, "");
  tools/perf/trace/beauty/ioctl.c:	static DEFINE_STRARRAY(kvm_ioctl_cmds, "");
  tools/perf/trace/beauty/ioctl.c:	static DEFINE_STRARRAY(vhost_virtio_ioctl_cmds, "");
  tools/perf/trace/beauty/ioctl.c:	static DEFINE_STRARRAY(vhost_virtio_ioctl_read_cmds, "");
  tools/perf/trace/beauty/ioctl.c:	static DEFINE_STRARRAY(perf_ioctl_cmds, "");
  tools/perf/trace/beauty/ioctl.c:	static DEFINE_STRARRAY(usbdevfs_ioctl_cmds, "");
  tools/perf/trace/beauty/fsmount.c:       static DEFINE_STRARRAY(fsmount_attr_flags, "MOUNT_ATTR_");
  tools/perf/trace/beauty/renameat.c:       static DEFINE_STRARRAY(rename_flags, "RENAME_");
  tools/perf/trace/beauty/kcmp.c:	static DEFINE_STRARRAY(kcmp_types, "KCMP_");
  tools/perf/trace/beauty/move_mount.c:       static DEFINE_STRARRAY(move_mount_flags, "MOVE_MOUNT_");
  $

Well, some, as the mmap flags are like:

  $ tools/perf/trace/beauty/mmap_flags.sh
  static const char *mmap_flags[] = {
  	[ilog2(0x40) + 1] = "32BIT",
  	[ilog2(0x01) + 1] = "SHARED",
  	[ilog2(0x02) + 1] = "PRIVATE",
  	[ilog2(0x10) + 1] = "FIXED",
  	[ilog2(0x20) + 1] = "ANONYMOUS",
  	[ilog2(0x008000) + 1] = "POPULATE",
  	[ilog2(0x010000) + 1] = "NONBLOCK",
  	[ilog2(0x020000) + 1] = "STACK",
  	[ilog2(0x040000) + 1] = "HUGETLB",
  	[ilog2(0x080000) + 1] = "SYNC",
  	[ilog2(0x100000) + 1] = "FIXED_NOREPLACE",
  	[ilog2(0x0100) + 1] = "GROWSDOWN",
  	[ilog2(0x0800) + 1] = "DENYWRITE",
  	[ilog2(0x1000) + 1] = "EXECUTABLE",
  	[ilog2(0x2000) + 1] = "LOCKED",
  	[ilog2(0x4000) + 1] = "NORESERVE",
  };
  $

So we'll need a strarray__strtoul_flags() that will break donw the flags
into tokens separated by '|' before doing the lookup and then go on
reconstructing the value from, say:

      # perf trace -e syscalls:sys_enter_mmap --filter="flags==PRIVATE|FIXED|DENYWRITE"

into:

      # perf trace -e syscalls:sys_enter_mmap --filter="flags==0x2|0x10|0x0800"

and finally into:

      # perf trace -e syscalls:sys_enter_mmap --filter="flags==0x812"

That is what we see if we don't use the augmented view obtained from:

  # perf trace -e mmap
  <SNIP>
  211792.885 procmail/15393 mmap(addr: 0x7fcd11645000, len: 8192, prot: READ, flags: PRIVATE|FIXED|DENYWRITE, fd: 8, off: 0xa000) = 0x7fcd11645000
  <SNIP>

But plain use tracefs:

        procmail-15559 [000] .... 54557.178262: sys_mmap(addr: 7f5c9bf7a000, len: 9b000, prot: 1, flags: 812, fd: 3, off: a9000)

Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Brendan Gregg <brendan.d.gregg@gmail.com>
Cc: David Ahern <dsahern@gmail.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Luis Cláudio Gonçalves <lclaudio@redhat.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Link: https://lkml.kernel.org/n/tip-c6mgkjt8ujnc263eld5tb7q3@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2019-10-19 15:34:48 -03:00
Arnaldo Carvalho de Melo
db25bf98a3 perf trace: Honour --max-events in processing syscalls:sys_enter_*
We were doing this only at the sys_exit syscall tracepoint, as for
strace-like we count the pair of sys_enter and sys_exit as one event,
but when asking specifically for a the syscalls:sys_enter_NAME
tracepoint we need to count each of those as an event.

I.e. things like:

  # perf trace --max-events=4 -e syscalls:sys_enter_lseek
     0.000 pool/2242 syscalls:sys_enter_lseek(fd: 14<anon_inode:[timerfd]>, offset: 0, whence: CUR)
     0.034 pool/2242 syscalls:sys_enter_lseek(fd: 15<anon_inode:[timerfd]>, offset: 0, whence: CUR)
     0.051 pool/2242 syscalls:sys_enter_lseek(fd: 16<anon_inode:[timerfd]>, offset: 0, whence: CUR)
  2307.900 sshd/30800 syscalls:sys_enter_lseek(fd: 3</usr/lib64/libsystemd.so.0.25.0>, offset: 9032, whence: SET)
  #

Were going on forever, since we only had sys_enter events.

Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: David Ahern <dsahern@gmail.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Luis Cláudio Gonçalves <lclaudio@redhat.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Link: https://lkml.kernel.org/n/tip-0ob1dky1a9ijlfrfhxyl40wr@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2019-10-18 12:19:02 -03:00
Arnaldo Carvalho de Melo
d066da978f libbeauty: Introduce syscall_arg__strtoul_strarray()
To go from strarrays strings to its indexes.

Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: David Ahern <dsahern@gmail.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Luis Cláudio Gonçalves <lclaudio@redhat.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Link: https://lkml.kernel.org/n/tip-wta0qvo207z27huib2c4ijxq@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2019-10-18 12:07:46 -03:00
Steven Rostedt (VMware)
9bdff5b643 perf tools: Remove unused trace_find_next_event()
trace_find_next_event() was buggy and pretty much a useless helper. As
there are no more users, just remove it.

Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Tzvetomir Stoyanov <tstoyanov@vmware.com>
Cc: linux-trace-devel@vger.kernel.org
Link: http://lore.kernel.org/lkml/20191017210636.224045576@goodmis.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2019-10-18 12:07:46 -03:00
Steven Rostedt (VMware)
a5e05abc6b perf scripting engines: Iterate on tep event arrays directly
Instead of calling a useless (and broken) helper function to get the
next event of a tep event array, just get the array directly and iterate
over it.

Note, the broken part was from trace_find_next_event() which after this
will no longer be used, and can be removed.

Committer notes:

This fixes a segfault when generating python scripts from perf.data
files with multiple tracepoint events, i.e. the following use case is
fixed by this patch:

  # perf record -e sched:* sleep 1
  [ perf record: Woken up 31 times to write data ]
  [ perf record: Captured and wrote 0.031 MB perf.data (9 samples) ]
  # perf script -g python
  Segmentation fault (core dumped)
  #

Reported-by: Daniel Bristot de Oliveira <bristot@redhat.com>
Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Tzvetomir Stoyanov <tstoyanov@vmware.com>
Cc: linux-trace-devel@vger.kernel.org
Link: http://lkml.kernel.org/r/20191017153733.630cd5eb@gandalf.local.home
Link: http://lore.kernel.org/lkml/20191017210636.061448713@goodmis.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2019-10-18 12:07:46 -03:00
Arnaldo Carvalho de Melo
362222f877 perf trace: Initialize evsel_trace->fmt for syscalls:sys_enter_* tracepoints
From the syscall_fmts->arg entries for formatting strace-like syscalls.

This is when resolving the string "whence" on a filter expression for
the syscalls:sys_enter_lseek:

  Breakpoint 3, perf_evsel__syscall_arg_fmt (evsel=0xc91ed0, arg=0x7fffffff7cd0 "whence") at builtin-trace.c:3626
  3626	{
  (gdb) n
  3628		struct syscall_arg_fmt *fmt = __evsel__syscall_arg_fmt(evsel);
  (gdb) n
  3630		if (evsel->tp_format == NULL || fmt == NULL)
  (gdb) n
  3633		for (field = evsel->tp_format->format.fields; field; field = field->next, ++fmt)
  (gdb) n
  3634			if (strcmp(field->name, arg) == 0)
  (gdb) p field->name
  $3 = 0xc945e0 "__syscall_nr"
  (gdb) n
  3633		for (field = evsel->tp_format->format.fields; field; field = field->next, ++fmt)
  (gdb) p *fmt
  $4 = {scnprintf = 0x0, strtoul = 0x0, mask_val = 0x0, parm = 0x0, name = 0x0, nr_entries = 0, show_zero = false}
  (gdb) n
  3634			if (strcmp(field->name, arg) == 0)
  (gdb) p field->name
  $5 = 0xc94690 "fd"
  (gdb) n
  3633		for (field = evsel->tp_format->format.fields; field; field = field->next, ++fmt)
  (gdb) n
  3634			if (strcmp(field->name, arg) == 0)
  (gdb) n
  3633		for (field = evsel->tp_format->format.fields; field; field = field->next, ++fmt)
  (gdb) n
  3634			if (strcmp(field->name, arg) == 0)
  (gdb) p *fmt
  $9 = {scnprintf = 0x489be2 <syscall_arg__scnprintf_strarray>, strtoul = 0x0, mask_val = 0x0, parm = 0xa2da80 <strarray.whences>, name = 0x0,
    nr_entries = 0, show_zero = false}
  (gdb) p field->name
  $10 = 0xc947b0 "whence"
  (gdb) p fmt->parm
  $11 = (void *) 0xa2da80 <strarray.whences>
  (gdb) p *(struct strarray *)fmt->parm
  $12 = {offset = 0, nr_entries = 5, prefix = 0x724d37 "SEEK_", entries = 0xa2da40 <whences>}
  (gdb) p (struct strarray *)fmt->parm)->entries
  Junk after end of expression.
  (gdb) p ((struct strarray *)fmt->parm)->entries
  $13 = (const char **) 0xa2da40 <whences>
  (gdb) p ((struct strarray *)fmt->parm)->entries[0]
  $14 = 0x724d21 "SET"
  (gdb) p ((struct strarray *)fmt->parm)->entries[1]
  $15 = 0x724d25 "CUR"
  (gdb) p ((struct strarray *)fmt->parm)->entries[2]
  $16 = 0x724d29 "END"
  (gdb) p ((struct strarray *)fmt->parm)->entries[2]
  $17 = 0x724d29 "END"
  (gdb) p ((struct strarray *)fmt->parm)->entries[3]
  $18 = 0x724d2d "DATA"
  (gdb) p ((struct strarray *)fmt->parm)->entries[4]
  $19 = 0x724d32 "HOLE"
  (gdb)

Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: David Ahern <dsahern@gmail.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Luis Cláudio Gonçalves <lclaudio@redhat.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Link: https://lkml.kernel.org/n/tip-lc8h9jgvbnboe0g7ic8tra1y@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2019-10-18 12:07:42 -03:00
Arnaldo Carvalho de Melo
2b00bb627f perf trace: Introduce 'struct evsel__trace' for evsel->priv needs
For syscalls we need to cache the 'syscall_id' and 'ret' field offsets
but as well have a pointer to the syscall_fmt_arg array for the fields,
so that we can expand strings in filter expressions, so introduce
a 'struct evsel_trace' to have in evsel->priv that allows for that.

Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: David Ahern <dsahern@gmail.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Luis Cláudio Gonçalves <lclaudio@redhat.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Link: https://lkml.kernel.org/n/tip-hx8ukasuws5sz6rsar73cocv@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2019-10-17 17:27:43 -03:00
Arnaldo Carvalho de Melo
8b913df50f perf trace: Hide evsel->access further, simplify code
Next step will be to have a 'struct evsel_trace' to allow for handling
the syscalls tracepoints via the strace-like code while reusing parts of
that code with the other tracepoints, where we don't have things like
the 'syscall_nr' or 'ret' ((raw_)?syscalls:sys_{enter,exit}(_SYSCALL)?)
args that we want to cache offsets and have been using evsel->priv for
that, while for the other tracepoints we'll have just an array of
'struct syscall_arg_fmt' (i.e. ->scnprint() for number->string and
->strtoul() string->number conversions and other state those functions
need).

Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: David Ahern <dsahern@gmail.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Luis Cláudio Gonçalves <lclaudio@redhat.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Link: https://lkml.kernel.org/n/tip-fre21jbyoqxmmquxcho7oa0x@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2019-10-17 17:26:35 -03:00
Arnaldo Carvalho de Melo
fecd990720 perf trace: Introduce accessors to trace specific evsel->priv
We're using evsel->priv in syscalls:sys_{enter,exit}_SYSCALL and in
raw_syscalls:sys_{enter,exit} to cache the offset of the common fields,
the multiplexor id/syscall_id in the sys_enter case and syscall_id + ret
for sys_exit.

And for the rest of the tracepoints we use it to have a syscall_arg_fmt
array to have scnprintf/strtoul for tracepoint args.

So we better clearly mark them with accessors so that we can move to
having a 'struct evsel_trace' struct for all 'perf trace' specific
evsel->priv usage.

Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Luis Cláudio Gonçalves <lclaudio@redhat.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Link: https://lkml.kernel.org/n/tip-dcoyxfslg7atz821tz9aupjh@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2019-10-17 17:26:35 -03:00
Arnaldo Carvalho de Melo
3cdc8db91e perf trace: Show error message when not finding a field used in a filter expression
It was there, but as pr_debug(), make it pr_err() so that we can see it
without -v:

  # trace -e syscalls:*lseek --filter="whenc==SET" sleep 1
  "whenc" not found in "syscalls:sys_enter_lseek", can't set filter "whenc==SET"
  #

Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: David Ahern <dsahern@gmail.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Luis Cláudio Gonçalves <lclaudio@redhat.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Link: https://lkml.kernel.org/n/tip-ly4rgm1bto8uwc2itpaixjob@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2019-10-17 17:26:35 -03:00
Arnaldo Carvalho de Melo
df604bfda6 perf trace: Hook the 'vec' tracepoint argument with the x86 IRQ vectors scnprintf/strtoul
Ended up only being useful when filtering multiple irq_vectors
tracepoints, as we end up having a tracepoint for each of the entries,
i.e.:

This will always come with the "RESCHEDULE_VECTOR" in the 'vector' arg:

  # perf trace --max-events 8 -e irq_vectors:reschedule*
     0.000 cc1/29067 irq_vectors:reschedule_entry(vector: RESCHEDULE)
     0.004 cc1/29067 irq_vectors:reschedule_exit(vector: RESCHEDULE)
     0.553 cc1/29067 irq_vectors:reschedule_entry(vector: RESCHEDULE)
     0.556 cc1/29067 irq_vectors:reschedule_exit(vector: RESCHEDULE)
     1.182 cc1/29067 irq_vectors:reschedule_entry(vector: RESCHEDULE)
     1.185 cc1/29067 irq_vectors:reschedule_exit(vector: RESCHEDULE)
     1.203 :29052/29052 irq_vectors:reschedule_entry(vector: RESCHEDULE)
     1.206 :29052/29052 irq_vectors:reschedule_exit(vector: RESCHEDULE)
  #

While filtering that value will produce nothing:

  # perf trace --max-events 8 -e irq_vectors:reschedule* --filter="vector != RESCHEDULE"
  ^C#

Maybe it'll be useful for those other tracepoints:

  # perf list irq_vectors:vector_*

  List of pre-defined events (to be used in -e):

    irq_vectors:vector_activate                        [Tracepoint event]
    irq_vectors:vector_alloc                           [Tracepoint event]
    irq_vectors:vector_alloc_managed                   [Tracepoint event]
    irq_vectors:vector_clear                           [Tracepoint event]
    irq_vectors:vector_config                          [Tracepoint event]
    irq_vectors:vector_deactivate                      [Tracepoint event]
    irq_vectors:vector_free_moved                      [Tracepoint event]
    irq_vectors:vector_reserve                         [Tracepoint event]
    irq_vectors:vector_reserve_managed                 [Tracepoint event]
    irq_vectors:vector_setup                           [Tracepoint event]
    irq_vectors:vector_teardown                        [Tracepoint event]
    irq_vectors:vector_update                          [Tracepoint event]
  #

But since we have it done, keep it.

This at least served to teach me that all those irq vectors have a entry
and an exit tracepoint that I can then use just like with
raw_syscalls:sys_{enter,exit}, i.e. pair them, use just a
trace__irq_vectors_entry() + trace__irq_vectors_exit() and use the
'vector' arg as I use the 'syscall id' one for syscalls.

Then the default for 'perf trace' will include irq_vectors in addition
to syscalls.

Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: David Ahern <dsahern@gmail.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Luis Cláudio Gonçalves <lclaudio@redhat.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Link: https://lkml.kernel.org/n/tip-wer4cwbbqub3o7sa8h1j3uzb@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2019-10-15 16:50:13 -03:00
Arnaldo Carvalho de Melo
573ed8985d perf trace beauty: Add the glue for the autogenerated x86 IRQ vector array
We need to wrap this autogenerated string array with the
strarray__scnprintf() formatter and the strarray__strotul() lookup
method, do it.

Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: David Ahern <dsahern@gmail.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Luis Cláudio Gonçalves <lclaudio@redhat.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Link: https://lkml.kernel.org/n/tip-bx2cjcyv6aerhyy3gvu3uwcy@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2019-10-15 16:13:27 -03:00
Arnaldo Carvalho de Melo
97c2a7806f libbeauty: Add a strarray__scnprintf_suffix() method
In some cases, like with x86 IRQ vectors, the common part in names is at
the end, so a suffix, add a scnprintf function for that.

Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: David Ahern <dsahern@gmail.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Luis Cláudio Gonçalves <lclaudio@redhat.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Link: https://lkml.kernel.org/n/tip-agxbj6es2ke3rehwt4gkdw23@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2019-10-15 16:01:42 -03:00
Arnaldo Carvalho de Melo
f19a85c68c libbeauty: Hook up the x86 irq_vectors table generator
I.e. after running:

  $ make -C tools/perf O=/tmp/build/perf

We end up with:

  $ cat /tmp/build/perf/trace/beauty/generated/x86_arch_irq_vectors_array.c
  static const char *x86_irq_vectors[] = {
  	[0x02] = "NMI",
  	[0x12] = "MCE",
  	[0x20] = "IRQ_MOVE_CLEANUP",
  	[0x80] = "IA32_SYSCALL",
  	[0xec] = "LOCAL_TIMER",
  	[0xed] = "HYPERV_STIMER0",
  	[0xee] = "HYPERV_REENLIGHTENMENT",
  	[0xef] = "MANAGED_IRQ_SHUTDOWN",
  	[0xf0] = "POSTED_INTR_NESTED",
  	[0xf1] = "POSTED_INTR_WAKEUP",
  	[0xf2] = "POSTED_INTR",
  	[0xf3] = "HYPERVISOR_CALLBACK",
  	[0xf4] = "DEFERRED_ERROR",
  	[0xf6] = "IRQ_WORK",
  	[0xf7] = "X86_PLATFORM_IPI",
  	[0xf8] = "REBOOT",
  	[0xf9] = "THRESHOLD_APIC",
  	[0xfa] = "THERMAL_APIC",
  	[0xfb] = "CALL_FUNCTION_SINGLE",
  	[0xfc] = "CALL_FUNCTION",
  	[0xfd] = "RESCHEDULE",
  	[0xfe] = "ERROR_APIC",
  	[0xff] = "SPURIOUS_APIC",
  };
  $

Now its just a matter of using it, associating it to tracepoint arguments named
'vector', all of which can be correctly used with this table, for int args.

At some point we should move tools/perf/trace/beauty to tools/beauty/,
so that it can be used more generally and even made available externally
like libbpf, libperf, libtraceevent, etc.

Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: David Ahern <dsahern@gmail.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Luis Cláudio Gonçalves <lclaudio@redhat.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Link: https://lkml.kernel.org/n/tip-0p2df4kq1afrxbck4e4ct34r@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2019-10-15 15:48:50 -03:00
Arnaldo Carvalho de Melo
5fa022aeba libbeauty: Add a generator for x86's IRQ vectors -> strings
We'll wire this up with the 'vector' arg in irq_vectors:*, etc:

Just run it straight away and check what it produces:

  $ tools/perf/trace/beauty/tracepoints/x86_irq_vectors.sh
  static const char *x86_irq_vectors[] = {
  	[0x02] = "NMI",
  	[0x12] = "MCE",
  	[0x20] = "IRQ_MOVE_CLEANUP",
  	[0x80] = "IA32_SYSCALL",
  	[0xec] = "LOCAL_TIMER",
  	[0xed] = "HYPERV_STIMER0",
  	[0xee] = "HYPERV_REENLIGHTENMENT",
  	[0xef] = "MANAGED_IRQ_SHUTDOWN",
  	[0xf0] = "POSTED_INTR_NESTED",
  	[0xf1] = "POSTED_INTR_WAKEUP",
  	[0xf2] = "POSTED_INTR",
  	[0xf3] = "HYPERVISOR_CALLBACK",
  	[0xf4] = "DEFERRED_ERROR",
  	[0xf6] = "IRQ_WORK",
  	[0xf7] = "X86_PLATFORM_IPI",
  	[0xf8] = "REBOOT",
  	[0xf9] = "THRESHOLD_APIC",
  	[0xfa] = "THERMAL_APIC",
  	[0xfb] = "CALL_FUNCTION_SINGLE",
  	[0xfc] = "CALL_FUNCTION",
  	[0xfd] = "RESCHEDULE",
  	[0xfe] = "ERROR_APIC",
  	[0xff] = "SPURIOUS_APIC",
  };
  $

Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: David Ahern <dsahern@gmail.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Luis Cláudio Gonçalves <lclaudio@redhat.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Link: https://lkml.kernel.org/n/tip-cpl1pa7kkwn0llufi5qw4li8@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2019-10-15 15:42:44 -03:00
Arnaldo Carvalho de Melo
d2b72b7280 tools arch x86: Grab a copy of the file containing the IRQ vector defines
We'll use it to generate a table and then convert the irq_vectors:*
tracepoint 'vector' arg in things like perf trace, script, etc.

Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: David Ahern <dsahern@gmail.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Luis Cláudio Gonçalves <lclaudio@redhat.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Link: https://lkml.kernel.org/n/tip-z7gi058lzhnrm32slevg3xod@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2019-10-15 15:42:01 -03:00
John Garry
2b78471581 perf vendor events arm64: Add some missing events for Hisi hip08 HHA PMU
Add some more missing events.

A trivial typo is also fixed.

Signed-off-by: John Garry <john.garry@huawei.com>
Reviewed-by: Shaokun Zhang <zhangshaokun@hisilicon.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Will Deacon <will@kernel.org>
Cc: linuxarm@huawei.com
Link: http://lore.kernel.org/lkml/1567612484-195727-5-git-send-email-john.garry@huawei.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2019-10-15 13:03:58 -03:00
John Garry
e3ae569541 perf vendor events arm64: Add some missing events for Hisi hip08 L3C PMU
Add some more missing events.

Signed-off-by: John Garry <john.garry@huawei.com>
Reviewed-by: Shaokun Zhang <zhangshaokun@hisilicon.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Will Deacon <will@kernel.org>
Cc: linuxarm@huawei.com
Link: http://lore.kernel.org/lkml/1567612484-195727-4-git-send-email-john.garry@huawei.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2019-10-15 13:03:58 -03:00
John Garry
1410732a1b perf vendor events arm64: Add some missing events for Hisi hip08 DDRC PMU
Add some more missing events.

Signed-off-by: John Garry <john.garry@huawei.com>
Reviewed-by: Shaokun Zhang <zhangshaokun@hisilicon.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Will Deacon <will@kernel.org>
Cc: linuxarm@huawei.com
Link: http://lore.kernel.org/lkml/1567612484-195727-3-git-send-email-john.garry@huawei.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2019-10-15 13:03:58 -03:00
John Garry
84b0975f48 perf vendor events arm64: Fix Hisi hip08 DDRC PMU eventname
The "EventName" for the DDRC precharge command event is incorrect, so
fix it.

Fixes: 57cc732479 ("perf jevents: Add support for Hisi hip08 DDRC PMU aliasing")
Signed-off-by: John Garry <john.garry@huawei.com>
Reviewed-by: Shaokun Zhang <zhangshaokun@hisilicon.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Will Deacon <will@kernel.org>
Cc: linuxarm@huawei.com
Link: http://lore.kernel.org/lkml/1567612484-195727-2-git-send-email-john.garry@huawei.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2019-10-15 13:03:58 -03:00
Arnaldo Carvalho de Melo
c5e006cdbd perf trace: Support tracepoint dynamic char arrays
Things like:

  # grep __data_loc /sys/kernel/debug/tracing/events/sched/sched_process_exec/format
	field:__data_loc char[] filename;	offset:8;	size:4;	signed:1;
  #

That, at that offset (8) and with that size(8) have an integer that
contains the real length and offset for the contents of that array.

Now this works:

  # perf trace --max-events 1 -e sched:*exec -a
     0.000 sed/19441 sched:sched_process_exec(filename: "/usr/bin/sync", pid: 19441 (sync), old_pid: 19441 (sync))
  #

As when using the libtraceevent based beautifier:

  # perf trace --libtraceevent --max-events 1 -e sched:*exec -a
     0.000 sync/19463 sched:sched_process_exec(filename=/usr/bin/sync pid=19463 old_pid=19463)
  #

I.e. that 'filename' is implemented as a dynamic char array.

Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: David Ahern <dsahern@gmail.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Luis Cláudio Gonçalves <lclaudio@redhat.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Link: https://lkml.kernel.org/n/tip-950p0m842fe6n7sxsdwqj5i2@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2019-10-15 13:03:58 -03:00
Arnaldo Carvalho de Melo
7fbfe22cf4 perf trace: Filter own pid to avoid a feedback look in 'perf trace record -a'
When doing a system wide 'perf trace record' we need, just like in 'perf
trace' live mode, to filter out perf trace's own pid, so set up a
tracepoint filter for the raw_syscalls tracepoints right after adding
them to the argv array that is set up to then call cmd_record().

Reported-by: Andi Kleen <ak@linux.intel.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: David Ahern <dsahern@gmail.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Luis Cláudio Gonçalves <lclaudio@redhat.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Link: https://lkml.kernel.org/n/tip-uysx5w8f2y5ndoln5cq370tv@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2019-10-15 13:03:57 -03:00
Arnaldo Carvalho de Melo
da949f507a perf string: Export asprintf__tp_filter_pids()
Will be used directly in 'perf trace' for setting up the command line
argv array to pass to cmd_record, as this was how 'perf trace record'
was implemented, following the model used in 'perf kvm record', 'perf
sched record', etc.

Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: David Ahern <dsahern@gmail.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Luis Cláudio Gonçalves <lclaudio@redhat.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Link: https://lkml.kernel.org/n/tip-w3cuwjs63lxf5zpryy3145uv@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2019-10-15 13:03:57 -03:00
Arnaldo Carvalho de Melo
b88b14db21 perf trace: Introduce --errno-summary
To be used with -S or -s, using just this new option implies -s,
examples:

  # perf trace --errno-summary sleep 1

   Summary of events:

   sleep (10793), 80 events, 93.0%

     syscall            calls  errors  total       min       avg       max       stddev
                                       (msec)    (msec)    (msec)    (msec)        (%)
     --------------- --------  ------ -------- --------- --------- ---------     ------
     nanosleep              1      0  1000.427  1000.427  1000.427  1000.427      0.00%
     mmap                   8      0     0.026     0.002     0.003     0.005      9.18%
     close                  5      0     0.018     0.001     0.004     0.009     48.97%
     mprotect               4      0     0.017     0.003     0.004     0.006     16.49%
     openat                 3      0     0.012     0.003     0.004     0.005      9.41%
     munmap                 1      0     0.010     0.010     0.010     0.010      0.00%
     brk                    4      0     0.005     0.001     0.001     0.002     22.77%
     read                   4      0     0.005     0.001     0.001     0.002     22.33%
     access                 1      1     0.004     0.004     0.004     0.004      0.00%
  				ENOENT: 1
     fstat                  3      0     0.004     0.001     0.001     0.002     17.18%
     lseek                  3      0     0.003     0.001     0.001     0.001     11.62%
     arch_prctl             2      1     0.002     0.001     0.001     0.001      3.32%
  				EINVAL: 1
     execve                 1      0     0.000     0.000     0.000     0.000      0.00%

  #

Works as well together with --failure and -S, i.e. collect the stats and
show just the syscalls that failed:

  # perf trace --failure -S --errno-summary sleep 1
       0.032 arch_prctl(option: 0x3001, arg2: 0x7fffdb11b580) = -1 EINVAL (Invalid argument)
       0.045 access(filename: "/etc/ld.so.preload", mode: R) = -1 ENOENT (No such file or directory)

   Summary of events:

   sleep (10806), 80 events, 93.0%

     syscall            calls  errors  total       min       avg       max       stddev
                                       (msec)    (msec)    (msec)    (msec)        (%)
     --------------- --------  ------ -------- --------- --------- ---------     ------
     nanosleep              1      0  1000.094  1000.094  1000.094  1000.094      0.00%
     mmap                   8      0     0.026     0.002     0.003     0.005      9.06%
     close                  5      0     0.018     0.001     0.004     0.010     49.58%
     mprotect               4      0     0.017     0.003     0.004     0.006     17.56%
     openat                 3      0     0.014     0.004     0.005     0.006     12.29%
     munmap                 1      0     0.010     0.010     0.010     0.010      0.00%
     brk                    4      0     0.005     0.001     0.001     0.002     22.75%
     read                   4      0     0.005     0.001     0.001     0.002     17.19%
     access                 1      1     0.005     0.005     0.005     0.005      0.00%
  				ENOENT: 1
     fstat                  3      0     0.004     0.001     0.001     0.002     21.66%
     lseek                  3      0     0.003     0.001     0.001     0.001     11.71%
     arch_prctl             2      1     0.002     0.001     0.001     0.001      2.66%
  				EINVAL: 1
     execve                 1      0     0.000     0.000     0.000     0.000      0.00%

  #

Suggested-by: Andi Kleen <ak@linux.intel.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Brendan Gregg <brendan.d.gregg@gmail.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Luis Cláudio Gonçalves <lclaudio@redhat.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Link: https://lkml.kernel.org/n/tip-l0mjwczkpouov7lss5zn8d9h@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2019-10-15 13:03:49 -03:00
Arnaldo Carvalho de Melo
8eded45fcd perf trace: Add syscall failure stats to -s/--summary and -S/--with-summary
Just like strace has:

  # trace -s sleep 1

  Summary of events:

  sleep (32370), 80 events, 93.0%

    syscall            calls  errors  total       min       avg       max       stddev
                                      (msec)    (msec)    (msec)    (msec)        (%)
    --------------- --------  ------ -------- --------- --------- ---------     ------
    nanosleep              1      0  1000.402  1000.402  1000.402  1000.402      0.00%
    mmap                   8      0     0.023     0.002     0.003     0.004      8.49%
    close                  5      0     0.015     0.001     0.003     0.009     51.39%
    mprotect               4      0     0.014     0.002     0.003     0.005     16.95%
    openat                 3      0     0.013     0.003     0.004     0.005     14.29%
    munmap                 1      0     0.010     0.010     0.010     0.010      0.00%
    read                   4      0     0.005     0.001     0.001     0.002     16.83%
    brk                    4      0     0.004     0.001     0.001     0.002     20.82%
    access                 1      1     0.004     0.004     0.004     0.004      0.00%
    fstat                  3      0     0.003     0.001     0.001     0.001     12.17%
    lseek                  3      0     0.003     0.001     0.001     0.001     11.45%
    arch_prctl             2      1     0.002     0.001     0.001     0.001      2.30%
    execve                 1      0     0.000     0.000     0.000     0.000      0.00%

  #

  # perf trace -S sleep 1
         ?  ... [continued]: execve())             = 0
     0.028 brk(brk: NULL)                          = 0x559f5bd96000
     0.033 arch_prctl(option: 0x3001, arg2: 0x7ffda8b715a0) = -1 EINVAL (Invalid argument)
     0.046 access(filename: "/etc/ld.so.preload", mode: R) = -1 ENOENT (No such file or directory)
     0.055 openat(dfd: CWD, filename: "/etc/ld.so.cache", flags: RDONLY|CLOEXEC) = 3
     0.060 fstat(fd: 3, statbuf: 0x7ffda8b707a0)   = 0
     0.062 mmap(addr: NULL, len: 134346, prot: READ, flags: PRIVATE, fd: 3, off: 0) = 0x7f3aedfc4000
     0.066 close(fd: 3)                            = 0
     0.079 openat(dfd: CWD, filename: "/lib64/libc.so.6", flags: RDONLY|CLOEXEC) = 3
     0.085 read(fd: 3, buf: 0x7ffda8b70948, count: 832) = 832
     0.088 lseek(fd: 3, offset: 792, whence: SET)  = 792
     0.090 read(fd: 3, buf: 0x7ffda8b70810, count: 68) = 68
     0.093 fstat(fd: 3, statbuf: 0x7ffda8b707f0)   = 0
     0.095 mmap(addr: NULL, len: 8192, prot: READ|WRITE, flags: PRIVATE|ANONYMOUS) = 0x7f3aedfc2000
     0.101 lseek(fd: 3, offset: 792, whence: SET)  = 792
     0.103 read(fd: 3, buf: 0x7ffda8b70450, count: 68) = 68
     0.105 lseek(fd: 3, offset: 864, whence: SET)  = 864
     0.107 read(fd: 3, buf: 0x7ffda8b70470, count: 32) = 32
     0.110 mmap(addr: NULL, len: 1857472, prot: READ, flags: PRIVATE|DENYWRITE, fd: 3, off: 0) = 0x7f3aeddfc000
     0.114 mprotect(start: 0x7f3aede1e000, len: 1679360, prot: NONE) = 0
     0.121 mmap(addr: 0x7f3aede1e000, len: 1363968, prot: READ|EXEC, flags: PRIVATE|FIXED|DENYWRITE, fd: 3, off: 0x22000) = 0x7f3aede1e000
     0.127 mmap(addr: 0x7f3aedf6b000, len: 311296, prot: READ, flags: PRIVATE|FIXED|DENYWRITE, fd: 3, off: 0x16f000) = 0x7f3aedf6b000
     0.131 mmap(addr: 0x7f3aedfb8000, len: 24576, prot: READ|WRITE, flags: PRIVATE|FIXED|DENYWRITE, fd: 3, off: 0x1bb000) = 0x7f3aedfb8000
     0.138 mmap(addr: 0x7f3aedfbe000, len: 14272, prot: READ|WRITE, flags: PRIVATE|FIXED|ANONYMOUS) = 0x7f3aedfbe000
     0.147 close(fd: 3)                            = 0
     0.158 arch_prctl(option: SET_FS, arg2: 0x7f3aedfc3580) = 0
     0.210 mprotect(start: 0x7f3aedfb8000, len: 16384, prot: READ) = 0
     0.230 mprotect(start: 0x559f5b27d000, len: 4096, prot: READ) = 0
     0.236 mprotect(start: 0x7f3aee00f000, len: 4096, prot: READ) = 0
     0.240 munmap(addr: 0x7f3aedfc4000, len: 134346) = 0
     0.300 brk(brk: NULL)                          = 0x559f5bd96000
     0.302 brk(brk: 0x559f5bdb7000)                = 0x559f5bdb7000
     0.305 brk(brk: NULL)                          = 0x559f5bdb7000
     0.310 openat(dfd: CWD, filename: "/usr/lib/locale/locale-archive", flags: RDONLY|CLOEXEC) = 3
     0.315 fstat(fd: 3, statbuf: 0x7f3aedfbdac0)   = 0
     0.318 mmap(addr: NULL, len: 217750512, prot: READ, flags: PRIVATE, fd: 3, off: 0) = 0x7f3ae0e52000
     0.325 close(fd: 3)                            = 0
     0.358 nanosleep(rqtp: 0x7ffda8b714b0, rmtp: NULL) = 0
  1000.622 close(fd: 1)                            = 0
  1000.641 close(fd: 2)                            = 0
  1000.664 exit_group(error_code: 0)               = ?

   Summary of events:

   sleep (722), 80 events, 93.0%

     syscall            calls  errors  total       min       avg       max       stddev
                                       (msec)    (msec)    (msec)    (msec)        (%)
     --------------- --------  ------ -------- --------- --------- ---------     ------
     nanosleep              1      0  1000.194  1000.194  1000.194  1000.194      0.00%
     mmap                   8      0     0.025     0.002     0.003     0.005     10.17%
     close                  5      0     0.018     0.001     0.004     0.010     50.18%
     mprotect               4      0     0.016     0.003     0.004     0.006     16.81%
     openat                 3      0     0.011     0.003     0.004     0.004      6.57%
     munmap                 1      0     0.010     0.010     0.010     0.010      0.00%
     brk                    4      0     0.005     0.001     0.001     0.002     20.72%
     read                   4      0     0.005     0.001     0.001     0.002     16.71%
     access                 1      1     0.005     0.005     0.005     0.005      0.00%
     fstat                  3      0     0.004     0.001     0.001     0.002     14.82%
     lseek                  3      0     0.003     0.001     0.001     0.001     11.66%
     arch_prctl             2      1     0.002     0.001     0.001     0.001      3.59%
     execve                 1      0     0.000     0.000     0.000     0.000      0.00%

  #

Works for system wide, e.g. for 1ms:

  # perf trace -s -a sleep 0.001

   Summary of events:

   sleep (768), 94 events, 37.9%

     syscall            calls  errors  total       min       avg       max       stddev
                                       (msec)    (msec)    (msec)    (msec)        (%)
     --------------- --------  ------ -------- --------- --------- ---------     ------
     nanosleep              1      0     1.133     1.133     1.133     1.133      0.00%
     execve                 7      6     0.351     0.003     0.050     0.316     88.53%
     mmap                   8      0     0.024     0.002     0.003     0.004      8.86%
     mprotect               4      0     0.017     0.003     0.004     0.006     16.02%
     openat                 3      0     0.013     0.004     0.004     0.005      8.34%
     munmap                 1      0     0.010     0.010     0.010     0.010      0.00%
     brk                    4      0     0.007     0.001     0.002     0.002     10.99%
     close                  5      0     0.005     0.001     0.001     0.002     11.69%
     read                   5      0     0.005     0.000     0.001     0.002     30.53%
     access                 1      1     0.004     0.004     0.004     0.004      0.00%
     fstat                  3      0     0.004     0.001     0.001     0.002     10.74%
     lseek                  3      0     0.003     0.001     0.001     0.001     10.20%
     arch_prctl             2      1     0.002     0.001     0.001     0.001      3.34%

   Web Content (21258), 46 events, 18.5%

     syscall            calls  errors  total       min       avg       max       stddev
                                       (msec)    (msec)    (msec)    (msec)        (%)
     --------------- --------  ------ -------- --------- --------- ---------     ------
     recvmsg               12     12     0.015     0.001     0.001     0.002      8.50%
     futex                  2      0     0.008     0.003     0.004     0.005     27.08%
     poll                   6      0     0.006     0.000     0.001     0.002     22.14%
     read                   2      0     0.006     0.002     0.003     0.003     26.08%
     write                  1      0     0.002     0.002     0.002     0.002      0.00%

   Web Content (4365), 36 events, 14.5%

     syscall            calls  errors  total       min       avg       max       stddev
                                       (msec)    (msec)    (msec)    (msec)        (%)
     --------------- --------  ------ -------- --------- --------- ---------     ------
     recvmsg               10     10     0.015     0.001     0.002     0.003     11.83%
     poll                   5      0     0.006     0.000     0.001     0.002     28.44%
     futex                  2      0     0.005     0.001     0.003     0.004     48.29%
     read                   1      0     0.003     0.003     0.003     0.003      0.00%

   Timer (21275), 14 events, 5.6%

     syscall            calls  errors  total       min       avg       max       stddev
                                       (msec)    (msec)    (msec)    (msec)        (%)
     --------------- --------  ------ -------- --------- --------- ---------     ------
     futex                  6      1     0.240     0.000     0.040     0.149     64.58%
     write                  1      0     0.008     0.008     0.008     0.008      0.00%

   Timer (4383), 14 events, 5.6%

     syscall            calls  errors  total       min       avg       max       stddev
                                       (msec)    (msec)    (msec)    (msec)        (%)
     --------------- --------  ------ -------- --------- --------- ---------     ------
     futex                  6      2     0.186     0.000     0.031     0.181     96.45%
     write                  1      0     0.010     0.010     0.010     0.010      0.00%

   Web Content (20354), 28 events, 11.3%

     syscall            calls  errors  total       min       avg       max       stddev
                                       (msec)    (msec)    (msec)    (msec)        (%)
     --------------- --------  ------ -------- --------- --------- ---------     ------
     recvmsg                8      8     0.010     0.001     0.001     0.002     15.24%
     poll                   4      0     0.004     0.000     0.001     0.002     35.68%
     futex                  1      0     0.003     0.003     0.003     0.003      0.00%
     read                   1      0     0.003     0.003     0.003     0.003      0.00%

   Timer (20371), 10 events, 4.0%

     syscall            calls  errors  total       min       avg       max       stddev
                                       (msec)    (msec)    (msec)    (msec)        (%)
     --------------- --------  ------ -------- --------- --------- ---------     ------
     futex                  4      1     0.077     0.000     0.019     0.075     95.46%
     write                  1      0     0.005     0.005     0.005     0.005      0.00%

  [root@quaco ~]#

Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Brendan Gregg <brendan.d.gregg@gmail.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Luis Cláudio Gonçalves <lclaudio@redhat.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Link: https://lkml.kernel.org/n/tip-k7kh2muo5oeg56yx446hnw9v@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2019-10-15 08:39:42 -03:00
Jin Yao
dd071024bf perf stat: Support --all-kernel/--all-user
'perf record' has supported --all-kernel / --all-user to configure all
used events to run in kernel space or run in user space. But 'perf stat'
doesn't support these options.

It would be useful to support these options in 'perf stat' too to keep
the same semantics available in both tools.

Signed-off-by: Jin Yao <yao.jin@linux.intel.com>
Acked-by: Jiri Olsa <jolsa@kernel.org>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Kan Liang <kan.liang@linux.intel.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: http://lore.kernel.org/lkml/20191011050545.3899-1-yao.jin@linux.intel.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2019-10-15 08:39:42 -03:00
Thomas Richter
5fb470bc29 perf jvmti: Link against tools/lib/ctype.h to have weak strlcpy()
The build of file libperf-jvmti.so succeeds but the resulting
object fails to load:

 # ~/linux/tools/perf/perf record -k mono -- java  \
      -XX:+PreserveFramePointer \
      -agentpath:/root/linux/tools/perf/libperf-jvmti.so \
       hog 100000 123450
  Error occurred during initialization of VM
  Could not find agent library /root/linux/tools/perf/libperf-jvmti.so
      in absolute path, with error:
      /root/linux/tools/perf/libperf-jvmti.so: undefined symbol: _ctype

Add the missing _ctype symbol into the build script.

Fixes: 79743bc927 ("perf jvmti: Link against tools/lib/string.o to have weak strlcpy()")
Signed-off-by: Thomas Richter <tmricht@linux.ibm.com>
Cc: Heiko Carstens <heiko.carstens@de.ibm.com>
Cc: Vasily Gorbik <gor@linux.ibm.com>
Link: http://lore.kernel.org/lkml/20191008093841.59387-1-tmricht@linux.ibm.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2019-10-15 08:39:42 -03:00
Ian Rogers
c5baf90892 perf annotate: Fix objdump --no-show-raw-insn flag
Remove redirection of objdump's stderr to /dev/null to help diagnose
failures.

Fix the '--no-show-raw' flag to be '--no-show-raw-insn' which binutils
is permissive and allows, but fails with LLVM objdump.

Signed-off-by: Ian Rogers <irogers@google.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Jin Yao <yao.jin@linux.intel.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Song Liu <songliubraving@fb.com>
Cc: Stephane Eranian <eranian@google.com>
Cc: clang-built-linux@googlegroups.com
Link: http://lore.kernel.org/lkml/20191010183649.23768-6-irogers@google.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2019-10-15 08:39:42 -03:00
Ian Rogers
b34b45eef1 perf annotate: Don't pipe objdump output through 'expand' command
Avoiding a pipe allows objdump command failures to surface.  Move to the
caller of symbol__parse_objdump_line the call to strim that removes
leading and trailing tabs.  Add a new expand_tabs function that if a tab
is present allocate a new line in which tabs are expanded.  In
symbol__parse_objdump_line the line had no leading spaces, so simplify
the line_ip processing.

Signed-off-by: Ian Rogers <irogers@google.com>
Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Jin Yao <yao.jin@linux.intel.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Song Liu <songliubraving@fb.com>
Cc: Stephane Eranian <eranian@google.com>
Cc: clang-built-linux@googlegroups.com
Link: http://lore.kernel.org/lkml/20191010183649.23768-5-irogers@google.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2019-10-15 08:39:42 -03:00
Ian Rogers
7a675de428 perf annotate: Don't pipe objdump output through 'grep' command
Simplify the objdump command by not piping the output of objdump through
grep. Instead, drop lines that match the grep pattern during the reading
loop.

Signed-off-by: Ian Rogers <irogers@google.com>
Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Jin Yao <yao.jin@linux.intel.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Song Liu <songliubraving@fb.com>
Cc: Stephane Eranian <eranian@google.com>
Cc: clang-built-linux@googlegroups.com
Link: http://lore.kernel.org/lkml/20191010183649.23768-4-irogers@google.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2019-10-15 08:39:42 -03:00
Ian Rogers
4235949944 perf annotate: Use libsubcmd's run-command.h to fork objdump
Reduce duplicated logic by using the subcmd library. Ensure when errors
occur they are reported to the caller. Before this patch, if no lines
are read the error status is 0.

Signed-off-by: Ian Rogers <irogers@google.com>
Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Jin Yao <yao.jin@linux.intel.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Song Liu <songliubraving@fb.com>
Cc: Stephane Eranian <eranian@google.com>
Cc: clang-built-linux@googlegroups.com
Link: http://lore.kernel.org/lkml/20191010183649.23768-3-irogers@google.com
Link: http://lore.kernel.org/lkml/20191015003418.62563-1-irogers@google.com
[ merged follow up fix for NULL termination as in the 2nd link above ]
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2019-10-15 08:39:01 -03:00
Ian Rogers
353dcaa2f9 perf annotate: Avoid reallocation in objdump parsing
Objdump output is parsed using getline which allocates memory for the
read. Getline will realloc if the memory is too small, but currently the
line is always freed after the call.

Simplify parse_objdump_line by performing the reading in symbol__disassemble.

Signed-off-by: Ian Rogers <irogers@google.com>
Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Jin Yao <yao.jin@linux.intel.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Song Liu <songliubraving@fb.com>
Cc: Stephane Eranian <eranian@google.com>
Cc: clang-built-linux@googlegroups.com
Link: http://lore.kernel.org/lkml/20191010183649.23768-2-irogers@google.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2019-10-15 08:36:22 -03:00
Jin Yao
800d3f5616 perf report: Add warning when libunwind not compiled in
We received a user report that call-graph DWARF mode was enabled in
'perf record' but 'perf report' didn't unwind the callstack correctly.
The reason was, libunwind was not compiled in.

We can use 'perf -vv' to check the compiled libraries but it would be
valuable to report a warning to user directly (especially valuable for
a perf newbie).

The warning is:

Warning:
Please install libunwind development packages during the perf build.

Both TUI and stdio are supported.

Signed-off-by: Jin Yao <yao.jin@linux.intel.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Kan Liang <kan.liang@linux.intel.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: http://lore.kernel.org/lkml/20191011022122.26369-1-yao.jin@linux.intel.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2019-10-15 08:36:22 -03:00
Leo Yan
791ce9c48c perf test: Avoid infinite loop for task exit case
When executing the task exit testing case, perf gets stuck in an endless
loop this case and doesn't return back on Arm64 Juno board.

After digging into this issue, since Juno board has Arm's big.LITTLE
CPUs, thus the PMUs are not compatible between the big CPUs and little
CPUs.  This leads to a PMU event that cannot be enabled properly when
the traced task is migrated from one variant's CPU to another variant.
Finally, the test case runs into infinite loop for cannot read out any
event data after return from polling.

Eventually, we need to work out formal solution to allow PMU events can
be freely migrated from one CPU variant to another, but this is a
difficult task and a different topic.  This patch tries to fix the Perf
test case to avoid infinite loop, when the testing detects 1000 times
retrying for reading empty events, it will directly bail out and return
failure.  This allows the Perf tool can continue its other test cases.

Signed-off-by: Leo Yan <leo.yan@linaro.org>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Link: http://lore.kernel.org/lkml/20191011091942.29841-2-leo.yan@linaro.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2019-10-15 08:36:22 -03:00
Leo Yan
6add129c5d perf test: Report failure for mmap events
When fail to mmap events in task exit case, it misses to set 'err' to
-1; thus the testing will not report failure for it.

This patch sets 'err' to -1 when fails to mmap events, thus Perf tool
can report correct result.

Fixes: d723a55096 ("perf test: Add test case for checking number of EXIT events")
Signed-off-by: Leo Yan <leo.yan@linaro.org>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Link: http://lore.kernel.org/lkml/20191011091942.29841-1-leo.yan@linaro.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2019-10-15 08:36:22 -03:00
Andi Kleen
5a40e19948 perf evlist: Fix fix for freed id arrays
In the earlier fix for the memory overrun of id arrays I managed to typo
the wrong event in the fix.

Of course we need to close the current event in the loop, not the
original failing event.

The same test case as in the original patch still passes.

Fixes: 7834fa948b ("perf evlist: Fix access of freed id arrays")
Signed-off-by: Andi Kleen <ak@linux.intel.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Link: http://lore.kernel.org/lkml/20191011182140.8353-2-andi@firstfloor.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2019-10-15 08:36:22 -03:00
Andi Kleen
b3509b6ed7 perf script: Fix --reltime with --time
My earlier patch to just enable --reltime with --time was a little too
optimistic.  The --time parsing would accept absolute time, which is
very confusing to the user.

Support relative time in --time parsing too. This only works with recent
perf record that records the first sample time. Otherwise we error out.

Fixes: 3714437d3f ("perf script: Allow --time with --reltime")
Signed-off-by: Andi Kleen <ak@linux.intel.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Link: http://lore.kernel.org/lkml/20191011182140.8353-1-andi@firstfloor.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2019-10-15 08:36:22 -03:00
Jiri Olsa
bb91a073ed perf tools: Allow to build with -ltcmalloc
By using "make TCMALLOC=1" you can enable perf to be build for usage
with libtcmalloc.so (gperftools).

Get heap profile (tools/perf directory):

  $ <install gperftools>
  $ make TCMALLOC=1 DEBUG=1
  $ HEAPPROFILE=/tmp/heapprof ./perf ...
  $ pprof ./perf /tmp/heapprof.000*
  (pprof) top
  Total: 2335.5 MB
    1735.1  74.3%  74.3%   1735.1  74.3% memdup
     402.0  17.2%  91.5%    402.0  17.2% zalloc
     140.2   6.0%  97.5%    145.8   6.2% map__new
      33.6   1.4%  98.9%     33.6   1.4% symbol__new
      12.4   0.5%  99.5%     12.4   0.5% alloc_event
       6.2   0.3%  99.7%      6.2   0.3% nsinfo__new
       5.5   0.2% 100.0%      5.5   0.2% nsinfo__copy
       0.3   0.0% 100.0%      0.3   0.0% dso__new
       0.1   0.0% 100.0%      0.1   0.0% do_read_string
       0.0   0.0% 100.0%      0.0   0.0% __GI__IO_file_doallocate

See callstack:
  $ pprof --pdf ./perf /tmp/heapprof.00* > callstack.pdf
  $ pprof --web ./perf /tmp/heapprof.00*

Committer testing:

Install gperftools, on fedora:

  # dnf install gperftools-devel

Then build:

 $ make TCMALLOC=1 DEBUG=1 -C tools/perf O=/tmp/build/perf install-bin

Verify that it linked against the right library:

  $ ldd ~/bin/perf | grep tcma
	libtcmalloc.so.4 => /lib64/libtcmalloc.so.4 (0x00007fb2953a7000)
  $

Run 'perf trace' system wide for 1 minute:

  # HEAPPROFILE=/tmp/heapprof perf trace -a sleep 1m
  <SNIP>
   59985.524 ( 0.006 ms): Web Content/20354 recvmsg(fd: 9<socket:[1762817]>, msg: 0x7ffee5fdafb0) = -1 EAGAIN (Resource temporarily unavailable)
   59985.536 ( 0.005 ms): Web Content/20354 recvmsg(fd: 9<socket:[1762817]>, msg: 0x7ffee5fdafc0) = -1 EAGAIN (Resource temporarily unavailable)
   59981.956 (10.143 ms): SCTP timer/21716  ... [continued]: select())                            = 0 (Timeout)
   59985.549 (         ): Web Content/20354 poll(ufds: 0x7f1df38af180, nfds: 3, timeout_msecs: 4294967295) ...
       0.926 (59999.481 ms): sleep/29764  ... [continued]: nanosleep())                           = 0
   59992.133 (         ): SCTP timer/21716 select(tvp: 0x7ff5bf7fee80)                            ...
   60000.477 ( 0.009 ms): sleep/29764 close(fd: 1)                                                = 0
   60000.493 ( 0.005 ms): sleep/29764 close(fd: 2)                                                = 0
   60000.514 (         ): sleep/29764 exit_group()                                                = ?
  Dumping heap profile to /tmp/heapprof.0001.heap (Exiting, 3 MB in use)
[root@quaco ~]#

Install pprof:

  # dnf install pprof

And run it:

  # pprof ~/bin/perf /tmp/heapprof.0001.heap
  Using local file /root/bin/perf.
  Using local file /tmp/heapprof.0001.heap.
  Welcome to pprof!  For help, type 'help'.
  (pprof) top
  Total: 4.0 MB
       1.7  42.0%  42.0%      2.2  54.1% map__new
       0.9  23.3%  65.3%      0.9  23.3% zalloc
       0.5  11.4%  76.7%      0.5  11.4% dso__new
       0.2   5.6%  82.3%      0.3   8.5% trace__sys_enter
       0.2   4.9%  87.2%      0.2   4.9% __GI___strdup
       0.2   3.8%  91.0%      0.2   3.8% new_term
       0.1   2.2%  93.2%      0.4  10.1% __perf_pmu__new_alias
       0.0   1.0%  94.3%      0.0   1.2% event_read_fields
       0.0   0.8%  95.1%      0.0   0.8% nsinfo__new
       0.0   0.7%  95.8%      0.1   3.2% trace__read_syscall_info
  (pprof)

Signed-off-by: Jiri Olsa <jolsa@kernel.org>
Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Michael Petlan <mpetlan@redhat.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: http://lore.kernel.org/lkml/20191013151427.11941-2-jolsa@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2019-10-15 08:36:22 -03:00
Ingo Molnar
39b656ee9f perf/core improvements and fixes:
perf trace:
 
   Arnaldo Carvalho de Melo:
 
   - Reuse the strace-like syscall_arg_fmt->scnprintf() beautification routines
     (convert integer arguments into strings, like open flags, etc) in tracepoint
     arguments.
 
     For now the type based scnprintf routines (pid_t, umode_t, etc) and the
     ones based in well known arg name based ("fd", etc) gets associated with
     tracepoint args of that type.
 
     A tracepoint only arg, "msr", for the msr:{write,read}_msr gets added as
     an initial step.
 
   - Introduce syscall_arg_fmt->strtoul() methods to be the reverse operation
     of ->scnprintf(), i.e. to go from a string to an integer.
 
   - Implement --filter, just like in 'perf record', that affects the tracepoint
     events specied thus far in the command line, use the ->strtoul() methods
     to allow strings in tables associated with beautifiers to the integers
     the in-kernel tracepoint (eBPF later) filters expect, e.g.:
 
      # perf trace --max-events 1 -e sched:*ipi --filter="cpu==1 || cpu==2"
       0.000 as/24630 sched:sched_wake_idle_without_ipi(cpu: 1)
      #
 
      # perf trace --max-events 1 --max-stack=32 -e msr:* --filter="msr==IA32_TSC_DEADLINE"
       207.000 cc1/19963 msr:write_msr(msr: IA32_TSC_DEADLINE, val: 5442316760822)
                                         do_trace_write_msr ([kernel.kallsyms])
                                         do_trace_write_msr ([kernel.kallsyms])
                                         lapic_next_deadline ([kernel.kallsyms])
                                         clockevents_program_event ([kernel.kallsyms])
                                         hrtimer_interrupt ([kernel.kallsyms])
                                         smp_apic_timer_interrupt ([kernel.kallsyms])
                                         apic_timer_interrupt ([kernel.kallsyms])
                                         [0x6ff66c] (/usr/lib/gcc-cross/alpha-linux-gnu/8/cc1)
                                         [0x7047c3] (/usr/lib/gcc-cross/alpha-linux-gnu/8/cc1)
                                         [0x707708] (/usr/lib/gcc-cross/alpha-linux-gnu/8/cc1)
                                         execute_one_pass (/usr/lib/gcc-cross/alpha-linux-gnu/8/cc1)
                                         [0x4f3d37] (/usr/lib/gcc-cross/alpha-linux-gnu/8/cc1)
                                         [0x4f3d49] (/usr/lib/gcc-cross/alpha-linux-gnu/8/cc1)
                                         execute_pass_list (/usr/lib/gcc-cross/alpha-linux-gnu/8/cc1)
                                         cgraph_node::expand (/usr/lib/gcc-cross/alpha-linux-gnu/8/cc1)
                                         [0x2625b4] (/usr/lib/gcc-cross/alpha-linux-gnu/8/cc1)
                                         symbol_table::finalize_compilation_unit (/usr/lib/gcc-cross/alpha-linux-gnu/8/cc1)
                                         [0x5ae8b9] (/usr/lib/gcc-cross/alpha-linux-gnu/8/cc1)
                                         toplev::main (/usr/lib/gcc-cross/alpha-linux-gnu/8/cc1)
                                         main (/usr/lib/gcc-cross/alpha-linux-gnu/8/cc1)
                                         [0x26b6a] (/usr/lib/x86_64-linux-gnu/libc-2.29.so)
      #
      # perf trace --max-events 8 -e msr:* --filter="msr==IA32_SPEC_CTRL"
          0.000 :13281/13281 msr:write_msr(msr: IA32_SPEC_CTRL, val: 6)
          0.063 migration/3/25 msr:write_msr(msr: IA32_SPEC_CTRL)
          0.217 kworker/u16:1-/4826 msr:write_msr(msr: IA32_SPEC_CTRL)
          0.687 rcu_sched/11 msr:write_msr(msr: IA32_SPEC_CTRL)
          0.696 :13280/13280 msr:write_msr(msr: IA32_SPEC_CTRL, val: 6)
          0.305 :13281/13281 msr:write_msr(msr: IA32_SPEC_CTRL, val: 6)
          0.355 :13274/13274 msr:write_msr(msr: IA32_SPEC_CTRL, val: 6)
          2.743 kworker/u16:0-/6711 msr:write_msr(msr: IA32_SPEC_CTRL)
      #
      # perf trace --max-events 8 --cpu 1 -e msr:* --filter="msr!=IA32_SPEC_CTRL && msr!=IA32_TSC_DEADLINE && msr != FS_BASE"
            0.000 mtr-packet/30819 msr:write_msr(msr: 0x830, val: 68719479037)
            0.096 :0/0 msr:read_msr(msr: IA32_TSC_ADJUST)
          238.925 mtr-packet/30819 msr:write_msr(msr: 0x830, val: 8589936893)
          511.010 :0/0 msr:write_msr(msr: 0x830, val: 68719479037)
         1005.052 :0/0 msr:read_msr(msr: IA32_TSC_ADJUST)
         1235.131 CPU 0/KVM/3750 msr:write_msr(msr: 0x830, val: 4294969595)
         1235.195 CPU 0/KVM/3750 msr:read_msr(msr: IA32_SYSENTER_ESP, val: -2199023037952)
         1235.201 CPU 0/KVM/3750 msr:read_msr(msr: IA32_APICBASE, val: 4276096000)
      #
 
   - Default to not using libtraceevent and its plugins for beautifying
     tracepoint arguments, since now we're reusing the strace-like beatufiers.
     Use --libtraceevent_print (using just --libtrace is unambiguous and can
     be used as a short hand) to go back to those beautifiers.
 
     This will help in the transition, as can be seen in some of the sched tracepoints
     that still need some work in the libbeauty based mode:
 
     # trace --no-inherit -e msr:*,*sleep,sched:* sleep 1
          0.000 (         ): sched:sched_waking(comm: "trace", pid: 3319 (trace), prio: 120, success: 1)
          0.006 (         ): sched:sched_wakeup(comm: "trace", pid: 3319 (trace), prio: 120, success: 1)
          0.348 (         ): sched:sched_process_exec(filename: 140212596720100, pid: 3319 (sleep), old_pid: 3319 (sleep))
          0.490 (         ): msr:write_msr(msr: FS_BASE, val: 139631189321088)
          0.670 (         ): nanosleep(rqtp: 0x7ffc52c23bc0)                                    ...
          0.674 (         ): sched:sched_stat_runtime(comm: "sleep", pid: 3319 (sleep), runtime: 659259, vruntime: 78942418342)
          0.675 (         ): sched:sched_switch(prev_comm: "sleep", prev_pid: 3319 (sleep), prev_prio: 120, prev_state: 1, next_comm: "swapper/0", next_prio: 120)
       1001.059 (         ): sched:sched_waking(comm: "sleep", pid: 3319 (sleep), prio: 120, success: 1)
       1001.098 (         ): sched:sched_wakeup(comm: "sleep", pid: 3319 (sleep), prio: 120, success: 1)
          0.670 (1000.504 ms):  ... [continued]: nanosleep())                                        = 0
       1001.456 (         ): sched:sched_process_exit(comm: "sleep", pid: 3319 (sleep), prio: 120)
     # trace --libtrace --no-inherit -e msr:*,*sleep,sched:* sleep 1
     # trace --libtrace --no-inherit -e msr:*,*sleep,sched:* sleep 1
          0.000 (         ): sched:sched_waking(comm=trace pid=3323 prio=120 target_cpu=000)
          0.007 (         ): sched:sched_wakeup(comm=trace pid=3323 prio=120 target_cpu=000)
          0.382 (         ): sched:sched_process_exec(filename=/usr/bin/sleep pid=3323 old_pid=3323)
          0.525 (         ): msr:write_msr(c0000100, value 7f5d508a0580)
          0.713 (         ): nanosleep(rqtp: 0x7fff487fb4a0)                                    ...
          0.717 (         ): sched:sched_stat_runtime(comm=sleep pid=3323 runtime=617722 [ns] vruntime=78957731636 [ns])
          0.719 (         ): sched:sched_switch(prev_comm=sleep prev_pid=3323 prev_prio=120 prev_state=S ==> next_comm=swapper/0 next_pid=0 next_prio=120)
       1001.117 (         ): sched:sched_waking(comm=sleep pid=3323 prio=120 target_cpu=000)
       1001.157 (         ): sched:sched_wakeup(comm=sleep pid=3323 prio=120 target_cpu=000)
          0.713 (1000.522 ms):  ... [continued]: nanosleep())                                        = 0
       1001.538 (         ): sched:sched_process_exit(comm=sleep pid=3323 prio=120)
     #
 
   - Make -v (verbose) mode be honoured for .perfconfig based trace.add_events,
     to help in diagnosing problems with building eBPF events (-e source.c).
 
   - When using eBPF syscall payload augmentation do not show strace-like
     syscalls when all the user specified was some tracepoint event, bringing
     the behaviour in line with that of when not using eBPF augmentation.
 
 Intel PT:
 
   exported-sql-viewer GUI:
 
   Adrian Hunter:
 
   - Add LookupModel, HBoxLayout, VBoxLayout, global time range calculations
     so as to add a time chart by CPU.
 
 perf script:
 
   Andi Kleen:
 
   - Allow --time (to specify a time span of interest) with --reltime
 
 perf diff:
 
   Jin Yao:
 
   - Report noise for cycles diff, i.e. a histogram + stddev.
     (timestamps relative to start).
 
 perf annotate:
 
   Arnaldo Carvalho de Melo:
 
   - Initialize env->cpuid when running in live mode (perf top), as it
     is used in some of the per arch annotation init routines.
 
 samples bpf:
 
   Björn Töpel:
 
   - Fixup fallout of using tools/perf/perf-sys. from outside tools/perf.
 
 Core:
 
   Ian Rogers:
 
   - Avoid 'sample_reg_masks' being const + weak, as this breaks with some
     compilers that constant-propagate from the weak symbol.
 
 libperf:
 
   - First part of moving the perf_mmap class from tools/perf to libperf.
 
   - Propagate CFLAGS to libperf from the tools/perf Makefile.
 
 Vendor events:
 
   John Garry:
 
   - Add entry in MAINTAINERS with reviewers for the for perf tool arm64
     pmu-events files.
 
 Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
 -----BEGIN PGP SIGNATURE-----
 
 iHUEABYIAB0WIQR2GiIUctdOfX2qHhGyPKLppCJ+JwUCXaDUqgAKCRCyPKLppCJ+
 J5ypAP9spIHEx/thoT72iaApkdD9tOXls4lMTUFhOW39rrZV1AD9F3+3kfQ62kav
 0j1sZGd7s5sDiLr1joJCAFiVPZc86wo=
 =r+Ik
 -----END PGP SIGNATURE-----

Merge tag 'perf-core-for-mingo-5.5-20191011' of git://git.kernel.org/pub/scm/linux/kernel/git/acme/linux into perf/core

Pull perf/core improvements and fixes from Arnaldo Carvalho de Melo:

perf trace:

  Arnaldo Carvalho de Melo:

  - Reuse the strace-like syscall_arg_fmt->scnprintf() beautification routines
    (convert integer arguments into strings, like open flags, etc) in tracepoint
    arguments.

    For now the type based scnprintf routines (pid_t, umode_t, etc) and the
    ones based in well known arg name based ("fd", etc) gets associated with
    tracepoint args of that type.

    A tracepoint only arg, "msr", for the msr:{write,read}_msr gets added as
    an initial step.

  - Introduce syscall_arg_fmt->strtoul() methods to be the reverse operation
    of ->scnprintf(), i.e. to go from a string to an integer.

  - Implement --filter, just like in 'perf record', that affects the tracepoint
    events specied thus far in the command line, use the ->strtoul() methods
    to allow strings in tables associated with beautifiers to the integers
    the in-kernel tracepoint (eBPF later) filters expect, e.g.:

     # perf trace --max-events 1 -e sched:*ipi --filter="cpu==1 || cpu==2"
      0.000 as/24630 sched:sched_wake_idle_without_ipi(cpu: 1)
     #

     # perf trace --max-events 1 --max-stack=32 -e msr:* --filter="msr==IA32_TSC_DEADLINE"
      207.000 cc1/19963 msr:write_msr(msr: IA32_TSC_DEADLINE, val: 5442316760822)
                                        do_trace_write_msr ([kernel.kallsyms])
                                        do_trace_write_msr ([kernel.kallsyms])
                                        lapic_next_deadline ([kernel.kallsyms])
                                        clockevents_program_event ([kernel.kallsyms])
                                        hrtimer_interrupt ([kernel.kallsyms])
                                        smp_apic_timer_interrupt ([kernel.kallsyms])
                                        apic_timer_interrupt ([kernel.kallsyms])
                                        [0x6ff66c] (/usr/lib/gcc-cross/alpha-linux-gnu/8/cc1)
                                        [0x7047c3] (/usr/lib/gcc-cross/alpha-linux-gnu/8/cc1)
                                        [0x707708] (/usr/lib/gcc-cross/alpha-linux-gnu/8/cc1)
                                        execute_one_pass (/usr/lib/gcc-cross/alpha-linux-gnu/8/cc1)
                                        [0x4f3d37] (/usr/lib/gcc-cross/alpha-linux-gnu/8/cc1)
                                        [0x4f3d49] (/usr/lib/gcc-cross/alpha-linux-gnu/8/cc1)
                                        execute_pass_list (/usr/lib/gcc-cross/alpha-linux-gnu/8/cc1)
                                        cgraph_node::expand (/usr/lib/gcc-cross/alpha-linux-gnu/8/cc1)
                                        [0x2625b4] (/usr/lib/gcc-cross/alpha-linux-gnu/8/cc1)
                                        symbol_table::finalize_compilation_unit (/usr/lib/gcc-cross/alpha-linux-gnu/8/cc1)
                                        [0x5ae8b9] (/usr/lib/gcc-cross/alpha-linux-gnu/8/cc1)
                                        toplev::main (/usr/lib/gcc-cross/alpha-linux-gnu/8/cc1)
                                        main (/usr/lib/gcc-cross/alpha-linux-gnu/8/cc1)
                                        [0x26b6a] (/usr/lib/x86_64-linux-gnu/libc-2.29.so)
     #
     # perf trace --max-events 8 -e msr:* --filter="msr==IA32_SPEC_CTRL"
         0.000 :13281/13281 msr:write_msr(msr: IA32_SPEC_CTRL, val: 6)
         0.063 migration/3/25 msr:write_msr(msr: IA32_SPEC_CTRL)
         0.217 kworker/u16:1-/4826 msr:write_msr(msr: IA32_SPEC_CTRL)
         0.687 rcu_sched/11 msr:write_msr(msr: IA32_SPEC_CTRL)
         0.696 :13280/13280 msr:write_msr(msr: IA32_SPEC_CTRL, val: 6)
         0.305 :13281/13281 msr:write_msr(msr: IA32_SPEC_CTRL, val: 6)
         0.355 :13274/13274 msr:write_msr(msr: IA32_SPEC_CTRL, val: 6)
         2.743 kworker/u16:0-/6711 msr:write_msr(msr: IA32_SPEC_CTRL)
     #
     # perf trace --max-events 8 --cpu 1 -e msr:* --filter="msr!=IA32_SPEC_CTRL && msr!=IA32_TSC_DEADLINE && msr != FS_BASE"
           0.000 mtr-packet/30819 msr:write_msr(msr: 0x830, val: 68719479037)
           0.096 :0/0 msr:read_msr(msr: IA32_TSC_ADJUST)
         238.925 mtr-packet/30819 msr:write_msr(msr: 0x830, val: 8589936893)
         511.010 :0/0 msr:write_msr(msr: 0x830, val: 68719479037)
        1005.052 :0/0 msr:read_msr(msr: IA32_TSC_ADJUST)
        1235.131 CPU 0/KVM/3750 msr:write_msr(msr: 0x830, val: 4294969595)
        1235.195 CPU 0/KVM/3750 msr:read_msr(msr: IA32_SYSENTER_ESP, val: -2199023037952)
        1235.201 CPU 0/KVM/3750 msr:read_msr(msr: IA32_APICBASE, val: 4276096000)
     #

  - Default to not using libtraceevent and its plugins for beautifying
    tracepoint arguments, since now we're reusing the strace-like beatufiers.
    Use --libtraceevent_print (using just --libtrace is unambiguous and can
    be used as a short hand) to go back to those beautifiers.

    This will help in the transition, as can be seen in some of the sched tracepoints
    that still need some work in the libbeauty based mode:

    # trace --no-inherit -e msr:*,*sleep,sched:* sleep 1
         0.000 (         ): sched:sched_waking(comm: "trace", pid: 3319 (trace), prio: 120, success: 1)
         0.006 (         ): sched:sched_wakeup(comm: "trace", pid: 3319 (trace), prio: 120, success: 1)
         0.348 (         ): sched:sched_process_exec(filename: 140212596720100, pid: 3319 (sleep), old_pid: 3319 (sleep))
         0.490 (         ): msr:write_msr(msr: FS_BASE, val: 139631189321088)
         0.670 (         ): nanosleep(rqtp: 0x7ffc52c23bc0)                                    ...
         0.674 (         ): sched:sched_stat_runtime(comm: "sleep", pid: 3319 (sleep), runtime: 659259, vruntime: 78942418342)
         0.675 (         ): sched:sched_switch(prev_comm: "sleep", prev_pid: 3319 (sleep), prev_prio: 120, prev_state: 1, next_comm: "swapper/0", next_prio: 120)
      1001.059 (         ): sched:sched_waking(comm: "sleep", pid: 3319 (sleep), prio: 120, success: 1)
      1001.098 (         ): sched:sched_wakeup(comm: "sleep", pid: 3319 (sleep), prio: 120, success: 1)
         0.670 (1000.504 ms):  ... [continued]: nanosleep())                                        = 0
      1001.456 (         ): sched:sched_process_exit(comm: "sleep", pid: 3319 (sleep), prio: 120)
    # trace --libtrace --no-inherit -e msr:*,*sleep,sched:* sleep 1
    # trace --libtrace --no-inherit -e msr:*,*sleep,sched:* sleep 1
         0.000 (         ): sched:sched_waking(comm=trace pid=3323 prio=120 target_cpu=000)
         0.007 (         ): sched:sched_wakeup(comm=trace pid=3323 prio=120 target_cpu=000)
         0.382 (         ): sched:sched_process_exec(filename=/usr/bin/sleep pid=3323 old_pid=3323)
         0.525 (         ): msr:write_msr(c0000100, value 7f5d508a0580)
         0.713 (         ): nanosleep(rqtp: 0x7fff487fb4a0)                                    ...
         0.717 (         ): sched:sched_stat_runtime(comm=sleep pid=3323 runtime=617722 [ns] vruntime=78957731636 [ns])
         0.719 (         ): sched:sched_switch(prev_comm=sleep prev_pid=3323 prev_prio=120 prev_state=S ==> next_comm=swapper/0 next_pid=0 next_prio=120)
      1001.117 (         ): sched:sched_waking(comm=sleep pid=3323 prio=120 target_cpu=000)
      1001.157 (         ): sched:sched_wakeup(comm=sleep pid=3323 prio=120 target_cpu=000)
         0.713 (1000.522 ms):  ... [continued]: nanosleep())                                        = 0
      1001.538 (         ): sched:sched_process_exit(comm=sleep pid=3323 prio=120)
    #

  - Make -v (verbose) mode be honoured for .perfconfig based trace.add_events,
    to help in diagnosing problems with building eBPF events (-e source.c).

  - When using eBPF syscall payload augmentation do not show strace-like
    syscalls when all the user specified was some tracepoint event, bringing
    the behaviour in line with that of when not using eBPF augmentation.

Intel PT:

  exported-sql-viewer GUI:

  Adrian Hunter:

  - Add LookupModel, HBoxLayout, VBoxLayout, global time range calculations
    so as to add a time chart by CPU.

perf script:

  Andi Kleen:

  - Allow --time (to specify a time span of interest) with --reltime

perf diff:

  Jin Yao:

  - Report noise for cycles diff, i.e. a histogram + stddev.
    (timestamps relative to start).

perf annotate:

  Arnaldo Carvalho de Melo:

  - Initialize env->cpuid when running in live mode (perf top), as it
    is used in some of the per arch annotation init routines.

samples bpf:

  Björn Töpel:

  - Fixup fallout of using tools/perf/perf-sys. from outside tools/perf.

Core:

  Ian Rogers:

  - Avoid 'sample_reg_masks' being const + weak, as this breaks with some
    compilers that constant-propagate from the weak symbol.

libperf:

  - First part of moving the perf_mmap class from tools/perf to libperf.

  - Propagate CFLAGS to libperf from the tools/perf Makefile.

Vendor events:

  John Garry:

  - Add entry in MAINTAINERS with reviewers for the for perf tool arm64
    pmu-events files.

Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2019-10-15 07:19:55 +02:00
Linus Torvalds
465a7e291f Merge branch 'perf-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull perf fixes from Ingo Molnar:
 "Mostly tooling fixes, but also a couple of updates for new Intel
  models (which are technically hw-enablement, but to users it's a fix
  to perf behavior on those new CPUs - hope this is fine), an AUX
  inheritance fix, event time-sharing fix, and a fix for lost non-perf
  NMI events on AMD systems"

* 'perf-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (36 commits)
  perf/x86/cstate: Add Tiger Lake CPU support
  perf/x86/msr: Add Tiger Lake CPU support
  perf/x86/intel: Add Tiger Lake CPU support
  perf/x86/cstate: Update C-state counters for Ice Lake
  perf/x86/msr: Add new CPU model numbers for Ice Lake
  perf/x86/cstate: Add Comet Lake CPU support
  perf/x86/msr: Add Comet Lake CPU support
  perf/x86/intel: Add Comet Lake CPU support
  perf/x86/amd: Change/fix NMI latency mitigation to use a timestamp
  perf/core: Fix corner case in perf_rotate_context()
  perf/core: Rework memory accounting in perf_mmap()
  perf/core: Fix inheritance of aux_output groups
  perf annotate: Don't return -1 for error when doing BPF disassembly
  perf annotate: Return appropriate error code for allocation failures
  perf annotate: Fix arch specific ->init() failure errors
  perf annotate: Propagate the symbol__annotate() error return
  perf annotate: Fix the signedness of failure returns
  perf annotate: Propagate perf_env__arch() error
  perf evsel: Fall back to global 'perf_env' in perf_evsel__env()
  perf tools: Propagate get_cpuid() error
  ...
2019-10-12 15:15:17 -07:00
Linus Torvalds
db60a5a035 powerpc fixes for 5.4 #3
Fix a kernel crash in spufs_create_root() on Cell machines, since the new mount
 API went in.
 
 Fix a regression in our KVM code caused by our recent PCR changes.
 
 Avoid a warning message about a failing hypervisor API on systems that don't
 have that API.
 
 A couple of minor build fixes.
 
 Thanks to:
   Alexey Kardashevskiy, Alistair Popple, Desnes A. Nunes do Rosario, Emmanuel
   Nicolet, Jordan Niethe, Laurent Dufour, Stephen Rothwell.
 -----BEGIN PGP SIGNATURE-----
 
 iQJHBAABCAAxFiEEJFGtCPCthwEv2Y/bUevqPMjhpYAFAl2hq8ETHG1wZUBlbGxl
 cm1hbi5pZC5hdQAKCRBR6+o8yOGlgHouD/4nfAtKfhYJSo84dsCKeneRD6uLyJXI
 SdWqVMneJn8uZU53aRg2CjO97cD3dKi1V/ZpMJnf2zJCUZ2fWytkTnXNHUPx8n/r
 W5YRyGFENuwVF7UfjBRgJ1mczRa/vNkQK4YmumZ7commjoGroVxorfecy+re3B+6
 xGFJA3SGvvMYB3v1KzOsJ+ymrRAt3uNLOcbwkBfRWvptY4MtYGcPpuv8TtsSzRXB
 sfjDUTLNIvl+yH/hm5Tvh04yNLLU/SJKaptKMzc/L4dFRSlqzWDXfUMui8ePLZTy
 crh9ckNa+tcKzNG9SUbuLKE1nHSB0x1nsQYzdnG7T10AR5rDmcQi9Q+Pz87WiT59
 pdTebV6AS71RbOpBuAKUOX3XZuhi7NozmExSwDqNU6PCb/zqTdAWe1dQAitcwrUE
 g9TNwfluihdHJ/cojE7jylPEkJI3VNguJHLRZzLBv/+X8CQ03X8oKJJJiS5xYRuv
 RgA8p+oc7Hsl6h32tcLYJ22IaJ0dNZde+/oV+WxYAT0IE0r1ZaYfhtRgGrXVJDwC
 Ev2m55UyqfRTSEAszzT3du+GCAwdLqEPPBGR+Xavb+5SgEupqzn23RTa7Orkagtc
 l89ImHXKjB5eI6gZZ5a57gDj320Kdxd6LhrxEoxP+RnRveEF4hNS0B+MwLtmt9L4
 OH6FrWrGKbqIIg==
 =jRp0
 -----END PGP SIGNATURE-----

Merge tag 'powerpc-5.4-3' of git://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux

Pull powerpc fixes from Michael Ellerman:
 "Fix a kernel crash in spufs_create_root() on Cell machines, since the
  new mount API went in.

  Fix a regression in our KVM code caused by our recent PCR changes.

  Avoid a warning message about a failing hypervisor API on systems that
  don't have that API.

  A couple of minor build fixes.

  Thanks to: Alexey Kardashevskiy, Alistair Popple, Desnes A. Nunes do
  Rosario, Emmanuel Nicolet, Jordan Niethe, Laurent Dufour, Stephen
  Rothwell"

* tag 'powerpc-5.4-3' of git://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux:
  spufs: fix a crash in spufs_create_root()
  powerpc/kvm: Fix kvmppc_vcore->in_guest value in kvmhv_switch_to_host
  selftests/powerpc: Fix compile error on tlbie_test due to newer gcc
  powerpc/pseries: Remove confusing warning message.
  powerpc/64s/radix: Fix build failure with RADIX_MMU=n
2019-10-12 14:13:55 -07:00
Jin Yao
cebf7d51a6 perf diff: Report noisy for cycles diff
This patch prints the stddev and hist for the cycles diff of program
block. It can help us to understand if the cycles is noisy or not.

This patch is inspired by Andi Kleen's patch:

  https://lwn.net/Articles/600471/

We create new option '--cycles-hist'.

Example:

  perf record -b ./div
  perf record -b ./div
  perf diff -c cycles

  # Baseline                                [Program Block Range] Cycles Diff  Shared Object      Symbol
  # ........  .......................................................... ....  .................  ............................
  #
      46.72%                                      [div.c:40 -> div.c:40]    0  div                [.] main
      46.72%                                      [div.c:42 -> div.c:44]    0  div                [.] main
      46.72%                                      [div.c:42 -> div.c:39]    0  div                [.] main
      20.54%                          [random_r.c:357 -> random_r.c:394]    1  libc-2.27.so       [.] __random_r
      20.54%                          [random_r.c:357 -> random_r.c:380]    0  libc-2.27.so       [.] __random_r
      20.54%                          [random_r.c:388 -> random_r.c:388]    0  libc-2.27.so       [.] __random_r
      20.54%                          [random_r.c:388 -> random_r.c:391]    0  libc-2.27.so       [.] __random_r
      17.04%                              [random.c:288 -> random.c:291]    0  libc-2.27.so       [.] __random
      17.04%                              [random.c:291 -> random.c:291]    0  libc-2.27.so       [.] __random
      17.04%                              [random.c:293 -> random.c:293]    0  libc-2.27.so       [.] __random
      17.04%                              [random.c:295 -> random.c:295]    0  libc-2.27.so       [.] __random
      17.04%                              [random.c:295 -> random.c:295]    0  libc-2.27.so       [.] __random
      17.04%                              [random.c:298 -> random.c:298]    0  libc-2.27.so       [.] __random
       8.40%                                      [div.c:22 -> div.c:25]    0  div                [.] compute_flag
       8.40%                                      [div.c:27 -> div.c:28]    0  div                [.] compute_flag
       5.14%                                    [rand.c:26 -> rand.c:27]    0  libc-2.27.so       [.] rand
       5.14%                                    [rand.c:28 -> rand.c:28]    0  libc-2.27.so       [.] rand
       2.15%                                  [rand@plt+0 -> rand@plt+0]    0  div                [.] rand@plt
       0.00%                                                                   [kernel.kallsyms]  [k] __x86_indirect_thunk_rax
       0.00%                                [do_mmap+714 -> do_mmap+732]  -10  [kernel.kallsyms]  [k] do_mmap
       0.00%                                [do_mmap+737 -> do_mmap+765]    1  [kernel.kallsyms]  [k] do_mmap
       0.00%                                [do_mmap+262 -> do_mmap+299]    0  [kernel.kallsyms]  [k] do_mmap
       0.00%  [__x86_indirect_thunk_r15+0 -> __x86_indirect_thunk_r15+0]    7  [kernel.kallsyms]  [k] __x86_indirect_thunk_r15
       0.00%            [native_sched_clock+0 -> native_sched_clock+119]   -1  [kernel.kallsyms]  [k] native_sched_clock
       0.00%                 [native_write_msr+0 -> native_write_msr+16]  -13  [kernel.kallsyms]  [k] native_write_msr

When we enable the option '--cycles-hist', the output is

  perf diff -c cycles --cycles-hist

  # Baseline                                [Program Block Range] Cycles Diff        stddev/Hist  Shared Object      Symbol
  # ........  .......................................................... ....  .................  .................  ............................
  #
      46.72%                                      [div.c:40 -> div.c:40]    0  ± 37.8% ▁█▁▁██▁█   div                [.] main
      46.72%                                      [div.c:42 -> div.c:44]    0  ± 49.4% ▁▁▂█▂▂▂▂   div                [.] main
      46.72%                                      [div.c:42 -> div.c:39]    0  ± 24.1% ▃█▂▄▁▃▂▁   div                [.] main
      20.54%                          [random_r.c:357 -> random_r.c:394]    1  ± 33.5% ▅▂▁█▃▁▂▁   libc-2.27.so       [.] __random_r
      20.54%                          [random_r.c:357 -> random_r.c:380]    0  ± 39.4% ▁▁█▁██▅▁   libc-2.27.so       [.] __random_r
      20.54%                          [random_r.c:388 -> random_r.c:388]    0                     libc-2.27.so       [.] __random_r
      20.54%                          [random_r.c:388 -> random_r.c:391]    0  ± 41.2% ▁▃▁▂█▄▃▁   libc-2.27.so       [.] __random_r
      17.04%                              [random.c:288 -> random.c:291]    0  ± 48.8% ▁▁▁▁███▁   libc-2.27.so       [.] __random
      17.04%                              [random.c:291 -> random.c:291]    0  ±100.0% ▁█▁▁▁▁▁▁   libc-2.27.so       [.] __random
      17.04%                              [random.c:293 -> random.c:293]    0  ±100.0% ▁█▁▁▁▁▁▁   libc-2.27.so       [.] __random
      17.04%                              [random.c:295 -> random.c:295]    0  ±100.0% ▁█▁▁▁▁▁▁   libc-2.27.so       [.] __random
      17.04%                              [random.c:295 -> random.c:295]    0                     libc-2.27.so       [.] __random
      17.04%                              [random.c:298 -> random.c:298]    0  ± 75.6% ▃█▁▁▁▁▁▁   libc-2.27.so       [.] __random
       8.40%                                      [div.c:22 -> div.c:25]    0  ± 42.1% ▁▃▁▁███▁   div                [.] compute_flag
       8.40%                                      [div.c:27 -> div.c:28]    0  ± 41.8% ██▁▁▄▁▁▄   div                [.] compute_flag
       5.14%                                    [rand.c:26 -> rand.c:27]    0  ± 37.8% ▁▁▁████▁   libc-2.27.so       [.] rand
       5.14%                                    [rand.c:28 -> rand.c:28]    0                     libc-2.27.so       [.] rand
       2.15%                                  [rand@plt+0 -> rand@plt+0]    0                     div                [.] rand@plt
       0.00%                                                                                      [kernel.kallsyms]  [k] __x86_indirect_thunk_rax
       0.00%                                [do_mmap+714 -> do_mmap+732]  -10                     [kernel.kallsyms]  [k] do_mmap
       0.00%                                [do_mmap+737 -> do_mmap+765]    1                     [kernel.kallsyms]  [k] do_mmap
       0.00%                                [do_mmap+262 -> do_mmap+299]    0                     [kernel.kallsyms]  [k] do_mmap
       0.00%  [__x86_indirect_thunk_r15+0 -> __x86_indirect_thunk_r15+0]    7                     [kernel.kallsyms]  [k] __x86_indirect_thunk_r15
       0.00%            [native_sched_clock+0 -> native_sched_clock+119]   -1  ± 38.5% ▄█▁        [kernel.kallsyms]  [k] native_sched_clock
       0.00%                 [native_write_msr+0 -> native_write_msr+16]  -13  ± 47.1% ▁█▇▃▁▁     [kernel.kallsyms]  [k] native_write_msr

 v8:
 ---
 Rebase to perf/core branch

 v7:
 ---
 1. v6 got Jiri's ACK.
 2. Rebase to latest perf/core branch.

 v6:
 ---
 1. Jiri provides better code for using data__hpp_register() in ui_init().
    Use this code in v6.

 v5:
 ---
 1. Refine the use of data__hpp_register() in ui_init() according to
    Jiri's suggestion.

 v4:
 ---
 1. Rename the new option from '--noisy' to '--cycles-hist'
 2. Remove the option '-n'.
 3. Only update the spark value and stats when '--cycles-hist' is enabled.
 4. Remove the code of printing '..'.

 v3:
 ---
 1. Move the histogram to a separate column
 2. Move the svals[] out of struct stats

 v2:
 ---
 Jiri got a compile error,

  CC       builtin-diff.o
  builtin-diff.c: In function ‘compute_cycles_diff’:
  builtin-diff.c:712:10: error: taking the absolute value of unsigned type ‘u64’ {aka ‘long unsigned int’} has no effect [-Werror=absolute-value]
  712 |          labs(pair->block_info->cycles_spark[i] -
      |          ^~~~

 Because the result of u64 - u64 is still u64. Now we change the type of
 cycles_spark[] to s64.

Signed-off-by: Jin Yao <yao.jin@linux.intel.com>
Acked-by: Jiri Olsa <jolsa@kernel.org>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Kan Liang <kan.liang@linux.intel.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: http://lore.kernel.org/lkml/20190925011446.30678-1-yao.jin@linux.intel.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2019-10-11 10:57:00 -03:00
Jiri Olsa
55542113c6 perf tools: Propagate CFLAGS to libperf
Andi reported that 'make DEBUG=1' does not propagate to the libbperf
code. It's true also for the other flags. Changing the code to propagate
the global build flags to libperf compilation.

Reported-by: Andi Kleen <ak@linux.intel.com>
Signed-off-by: Jiri Olsa <jolsa@kernel.org>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Michael Petlan <mpetlan@redhat.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: http://lore.kernel.org/lkml/20191011122155.15738-1-jolsa@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2019-10-11 10:55:22 -03:00
Jiri Olsa
84227cb11f libperf: Adopt perf_evlist__filter_pollfd() from tools/perf
Introduce the perf_evlist__filter_pollfd function and export it in the
perf/evlist.h header, so that libperf users can check if the descriptor
is still alive.

Signed-off-by: Jiri Olsa <jolsa@kernel.org>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Michael Petlan <mpetlan@redhat.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: http://lore.kernel.org/lkml/20191007125344.14268-27-jolsa@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2019-10-10 12:58:45 -03:00
Jiri Olsa
696f27c994 libperf: Introduce perf_evlist__purge()
Add a static perf_evlist__purge() function to purge evsels from a evlist.

Add also perf_evlist__for_each_entry_safe() which is used by
perf_evlist__purge().

Signed-off-by: Jiri Olsa <jolsa@kernel.org>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Michael Petlan <mpetlan@redhat.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: http://lore.kernel.org/lkml/20191007125344.14268-26-jolsa@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2019-10-10 12:57:22 -03:00
Jiri Olsa
93dd6e2831 libperf: Introduce perf_evlist__exit()
Add the perf_evlist__exit() function, so far it's not exported and added
only for internal use for perf and libperf.

USe it to release cpus/threads and pollfd array.

Signed-off-by: Jiri Olsa <jolsa@kernel.org>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Michael Petlan <mpetlan@redhat.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: http://lore.kernel.org/lkml/20191007125344.14268-25-jolsa@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2019-10-10 12:56:01 -03:00
Jiri Olsa
230662e15e libperf: Move the pollfd allocation from tools/perf to libperf
It's needed in libperf only, so move it to the perf_evlist__mmap_ops()
function.

Signed-off-by: Jiri Olsa <jolsa@kernel.org>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Michael Petlan <mpetlan@redhat.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: http://lore.kernel.org/lkml/20191007125344.14268-24-jolsa@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2019-10-10 12:54:35 -03:00
Jiri Olsa
285aaeac8c libperf: Centralize map refcnt setting
Currently when a new map is mmapped we set its refcnt to 2 in the
perf_evlist_mmap_ops::mmap callback.

Every mmap gets its refcnt set to 2 when it's first mmaped:

  - 1 for the current user, which will be taken out by a call to
    perf_evlist__munmap_filtered(), where we find out there's
    no more data comming from kernel to this mmap.

  - 1 for the drain code where in perf_mmap__consume() the mmap
    is released if it is empty.

Move this common setup into libperf's generic code before the mmap
callback is called.

Signed-off-by: Jiri Olsa <jolsa@kernel.org>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Michael Petlan <mpetlan@redhat.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: http://lore.kernel.org/lkml/20191007125344.14268-23-jolsa@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2019-10-10 12:52:41 -03:00
Jiri Olsa
923d0f1868 perf evlist: Switch to libperf's mmap interface
Switch to the libperf mmap interface by calling directly
perf_evlist__mmap_ops() and removing perf's evlist__mmap_per_*
functions.

By switching to libperf perf_evlist__mmap() we need to operate over
'struct perf_mmap' in evlist__add_pollfd, so make the related changes
there.

Signed-off-by: Jiri Olsa <jolsa@kernel.org>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Michael Petlan <mpetlan@redhat.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: http://lore.kernel.org/lkml/20191007125344.14268-22-jolsa@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2019-10-10 12:46:04 -03:00
Jiri Olsa
b80132b12a perf evlist: Introduce perf_evlist__mmap_cb_mmap()
Add the perf_evlist__mmap_cb_mmap() function to call perf specific
mmap__mmap() function during perf_evlist__mmap_ops() call.

Signed-off-by: Jiri Olsa <jolsa@kernel.org>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Michael Petlan <mpetlan@redhat.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: http://lore.kernel.org/lkml/20191007125344.14268-21-jolsa@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2019-10-10 12:44:59 -03:00