Commit Graph

9606 Commits

Author SHA1 Message Date
Mattias Jacobsson
099be74886 perf strbuf: Remove redundant va_end() in strbuf_addv()
Each call to va_copy() should have one, and only one, corresponding call
to va_end(). In strbuf_addv() some code paths result in va_end() getting
called multiple times. Remove the superfluous va_end().

Signed-off-by: Mattias Jacobsson <2pi@mok.nu>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Sanskriti Sharma <sansharm@redhat.com>
Link: http://lkml.kernel.org/r/20181229141750.16945-1-2pi@mok.nu
Fixes: ce49d8436c ("perf strbuf: Match va_{add,copy} with va_end")
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2019-01-04 12:54:49 -03:00
Ivan Krylov
442b4eb3af perf annotate: Pass filename to objdump via execl
The symbol__disassemble() function uses shell to launch objdump and
filter its output via grep. Passing filenames by interpolating them into
the command line via "%s" may lead to problems if said filenames contain
special characters.

Instead, pass the filename as a command line argument where it is not
subject to any kind of interpretation, then use quoted shell
interpolation to build the strings we need safely.

Signed-off-by: Ivan Krylov <krylov.r00t@gmail.com>
Acked-by: Jiri Olsa <jolsa@kernel.org>
Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: http://lkml.kernel.org/r/20181014111803.5d83b806@Tarkus
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2019-01-04 12:54:49 -03:00
Jin Yao
a3366db06b perf report: Fix wrong iteration count in --branch-history
By calculating the removed loops, we can get the iteration count.

But the iteration count could be reported incorrectly, reporting
impossibly high counts.

That's because previous code uses the number of removed LBR entries for
the iteration count. That's not good. Fix this by increasing the
iteration count when a loop is detected.

When matching the chain, the iteration count would be added up, finally we need
to compute the average value when printing out.

For example,

  $ perf report --branch-history --stdio --no-children

Before:

  ---f2 +0
     |
     |--33.62%--f1 +9 (cycles:1)
     |          f1 +0
     |          main +22 (cycles:1)
     |          main +17
     |          main +38 (cycles:1)
     |          main +27
     |          f1 +26 (cycles:1)
     |          f1 +24
     |          f2 +27 (cycles:7)
     |          f2 +0
     |          f1 +19 (cycles:1)
     |          f1 +14
     |          f2 +27 (cycles:11)
     |          f2 +0
     |          f1 +9 (cycles:1 iter:2968 avg_cycles:3)
     |          f1 +0
     |          main +22 (cycles:1 iter:2968 avg_cycles:3)
     |          main +17
     |          main +38 (cycles:1 iter:2968 avg_cycles:3)

2968 is an impossible high iteration count and avg_cycles is too small.

After:

  ---f2 +0
     |
     |--33.62%--f1 +9 (cycles:1)
     |          f1 +0
     |          main +22 (cycles:1)
     |          main +17
     |          main +38 (cycles:1)
     |          main +27
     |          f1 +26 (cycles:1)
     |          f1 +24
     |          f2 +27 (cycles:7)
     |          f2 +0
     |          f1 +19 (cycles:1)
     |          f1 +14
     |          f2 +27 (cycles:11)
     |          f2 +0
     |          f1 +9 (cycles:1 iter:1 avg_cycles:23)
     |          f1 +0
     |          main +22 (cycles:1 iter:1 avg_cycles:23)
     |          main +17
     |          main +38 (cycles:1 iter:1 avg_cycles:23)

avg_cycles:23 is the average cycles of this iteration.

Fixes: c4ee06251d ("perf report: Calculate the average cycles of iterations")

Signed-off-by: Jin Yao <yao.jin@linux.intel.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Kan Liang <kan.liang@linux.intel.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: http://lkml.kernel.org/r/1546582230-17507-1-git-send-email-yao.jin@linux.intel.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2019-01-04 12:54:49 -03:00
Linus Torvalds
96d4f267e4 Remove 'type' argument from access_ok() function
Nobody has actually used the type (VERIFY_READ vs VERIFY_WRITE) argument
of the user address range verification function since we got rid of the
old racy i386-only code to walk page tables by hand.

It existed because the original 80386 would not honor the write protect
bit when in kernel mode, so you had to do COW by hand before doing any
user access.  But we haven't supported that in a long time, and these
days the 'type' argument is a purely historical artifact.

A discussion about extending 'user_access_begin()' to do the range
checking resulted this patch, because there is no way we're going to
move the old VERIFY_xyz interface to that model.  And it's best done at
the end of the merge window when I've done most of my merges, so let's
just get this done once and for all.

This patch was mostly done with a sed-script, with manual fix-ups for
the cases that weren't of the trivial 'access_ok(VERIFY_xyz' form.

There were a couple of notable cases:

 - csky still had the old "verify_area()" name as an alias.

 - the iter_iov code had magical hardcoded knowledge of the actual
   values of VERIFY_{READ,WRITE} (not that they mattered, since nothing
   really used it)

 - microblaze used the type argument for a debug printout

but other than those oddities this should be a total no-op patch.

I tried to fix up all architectures, did fairly extensive grepping for
access_ok() uses, and the changes are trivial, but I may have missed
something.  Any missed conversion should be trivially fixable, though.

Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2019-01-03 18:57:57 -08:00
Arnaldo Carvalho de Melo
805e4c8b61 tools beauty: Make the prctl option table generator catch all PR_ options
In ba83088565 ("arm64: add prctl control for resetting ptrauth keys")
the PR_PAC_RESET_KEYS prctl option was introduced, get that into the
regex in addition to PR_GET_* and PR_SET_*:

So just get everything that matches '^#define PR_\w+' this ends up
adding these entries:

  $ tools/perf/trace/beauty/prctl_option.sh  > after
  $ diff -u before after
  --- before	2019-01-03 14:58:51.541807353 -0300
  +++ after	2019-01-03 15:17:05.909583804 -0300
  @@ -19,12 +19,18 @@
          [20] = "SET_ENDIAN",
          [21] = "GET_SECCOMP",
          [22] = "SET_SECCOMP",
  +       [23] = "CAPBSET_READ",
  +       [24] = "CAPBSET_DROP",
          [25] = "GET_TSC",
          [26] = "SET_TSC",
          [27] = "GET_SECUREBITS",
          [28] = "SET_SECUREBITS",
          [29] = "SET_TIMERSLACK",
          [30] = "GET_TIMERSLACK",
  +       [31] = "TASK_PERF_EVENTS_DISABLE",
  +       [32] = "TASK_PERF_EVENTS_ENABLE",
  +       [33] = "MCE_KILL",
  +       [34] = "MCE_KILL_GET",
          [35] = "SET_MM",
          [36] = "SET_CHILD_SUBREAPER",
          [37] = "GET_CHILD_SUBREAPER",
  @@ -33,8 +39,13 @@
          [40] = "GET_TID_ADDRESS",
          [41] = "SET_THP_DISABLE",
          [42] = "GET_THP_DISABLE",
  +       [43] = "MPX_ENABLE_MANAGEMENT",
  +       [44] = "MPX_DISABLE_MANAGEMENT",
          [45] = "SET_FP_MODE",
          [46] = "GET_FP_MODE",
  +       [47] = "CAP_AMBIENT",
  +       [50] = "SVE_SET_VL",
  +       [51] = "SVE_GET_VL",
          [52] = "GET_SPECULATION_CTRL",
          [53] = "SET_SPECULATION_CTRL",
          [54] = "PAC_RESET_KEYS",
  $

Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Kristina Martsenko <kristina.martsenko@arm.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Will Deacon <will.deacon@arm.com>
Link: https://lkml.kernel.org/n/tip-sg2pkmtjr5988bhbcp4yp6sw@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2019-01-03 15:16:04 -03:00
Jin Yao
8a99255a50 perf stat: Fix endless wait for child process
We hit a 'perf stat' issue by using following script:

  #!/bin/bash

  sleep 1000 &
  exec perf stat -a -e cycles -I1000 -- sleep 5

Since "perf stat" is launched by exec, the "sleep 1000" would be the
child process of "perf stat". The wait4() call will not return because
it's waiting for the child process "sleep 1000" to end. So 'perf stat'
doesn't return even after 5s passes.

This patch lets 'perf stat' return when the specified child process ends
(in this case, the specified child process is "sleep 5").

Committer testing:

  # cat test.sh
  #!/bin/bash

  sleep 10 &
  exec perf stat -a -e cycles -I1000 -- sleep 5
  #

Before:

  # time ./test.sh
  #           time             counts unit events
       1.001113090        108,453,351      cycles
       2.002062196        142,075,435      cycles
       3.002896194        164,801,068      cycles
       4.003731666        107,062,140      cycles
       5.002068867        112,241,832      cycles

  real	0m10.066s
  user	0m0.016s
  sys	0m0.101s
  #

After:

  # time ./test.sh
  #           time             counts unit events
       1.001016096         91,412,027      cycles
       2.002014963        124,063,708      cycles
       3.002883964        125,993,929      cycles
       4.003706470        120,465,734      cycles
       5.002006778        163,560,355      cycles

  real	0m5.123s
  user	0m0.014s
  sys	0m0.105s
  #

Signed-off-by: Jin Yao <yao.jin@linux.intel.com>
Reviewed-by: Jiri Olsa <jolsa@kernel.org>
Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Kan Liang <kan.liang@linux.intel.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: http://lkml.kernel.org/r/1546501245-4512-1-git-send-email-yao.jin@linux.intel.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2019-01-03 12:12:18 -03:00
Adrian Hunter
b25756df5b perf session: Add comment for perf_session__register_idle_thread()
Add a comment to perf_session__register_idle_thread() to bring attention to
a pitfall with the idle task thread structure. The pitfall is that there
should really be a 'struct thread' for the idle task of each cpu, but there
is only one that can have pid == tid == 0.

Signed-off-by: Adrian Hunter <adrian.hunter@intel.com>
Acked-by: Jiri Olsa <jolsa@kernel.org>
Link: http://lkml.kernel.org/r/20181221120620.9659-9-adrian.hunter@intel.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2019-01-02 11:05:06 -03:00
Adrian Hunter
256d92bc93 perf thread-stack: Fix thread stack processing for the idle task
perf creates a single 'struct thread' to represent the idle task. That
is because threads are identified by PID and TID, and the idle task
always has PID == TID == 0.

However, there are actually separate idle tasks for each CPU. That
creates a problem for thread stack processing which assumes that each
thread has a single stack, not one stack per CPU.

Fix that by passing through the CPU number, and in the case of the idle
"thread", pick the thread stack from an array based on the CPU number.

Signed-off-by: Adrian Hunter <adrian.hunter@intel.com>
Acked-by: Jiri Olsa <jolsa@kernel.org>
Link: http://lkml.kernel.org/r/20181221120620.9659-8-adrian.hunter@intel.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2019-01-02 11:03:17 -03:00
Adrian Hunter
139f42f3b3 perf thread-stack: Allocate an array of thread stacks
In preparation for fixing thread stack processing for the idle task,
allocate an array of thread stacks.

Signed-off-by: Adrian Hunter <adrian.hunter@intel.com>
Acked-by: Jiri Olsa <jolsa@kernel.org>
Link: http://lkml.kernel.org/r/20181221120620.9659-7-adrian.hunter@intel.com
[ No need to check for NULL when calling zfree(), noticed by Jiri Olsa ]
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2019-01-02 10:55:55 -03:00
Adrian Hunter
2e9e868876 perf thread-stack: Factor out thread_stack__init()
In preparation for fixing thread stack processing for the idle task,
factor out thread_stack__init().

Signed-off-by: Adrian Hunter <adrian.hunter@intel.com>
Acked-by: Jiri Olsa <jolsa@kernel.org>
Link: http://lkml.kernel.org/r/20181221120620.9659-6-adrian.hunter@intel.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2019-01-02 10:53:41 -03:00
Adrian Hunter
f6060ac601 perf thread-stack: Allow for a thread stack array
In preparation for fixing thread stack processing for the idle task,
allow for a thread stack array.

Signed-off-by: Adrian Hunter <adrian.hunter@intel.com>
Acked-by: Jiri Olsa <jolsa@kernel.org>
Link: http://lkml.kernel.org/r/20181221120620.9659-5-adrian.hunter@intel.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2019-01-02 10:49:51 -03:00
Adrian Hunter
bd8e68ace1 perf thread-stack: Avoid direct reference to the thread's stack
In preparation for fixing thread stack processing for the idle task,
avoid direct reference to the thread's stack. The thread stack will
change to an array of thread stacks, at which point the meaning of the
direct reference will change.

Signed-off-by: Adrian Hunter <adrian.hunter@intel.com>
Acked-by: Jiri Olsa <jolsa@kernel.org>
Link: http://lkml.kernel.org/r/20181221120620.9659-4-adrian.hunter@intel.com
[ Rename thread_stack__ts() to thread__stack() since this operates on a 'thread' struct ]
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2019-01-02 10:48:18 -03:00
Adrian Hunter
e0b8951190 perf thread-stack: Tidy thread_stack__bottom() usage
In preparation for fixing thread stack processing for the idle task,
tidy thread_stack__bottom() usage. Specifically, the parameter 'thread'
is not needed.

Signed-off-by: Adrian Hunter <adrian.hunter@intel.com>
Acked-by: Jiri Olsa <jolsa@kernel.org>
Link: http://lkml.kernel.org/r/20181221120620.9659-3-adrian.hunter@intel.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2019-01-02 10:45:26 -03:00
Adrian Hunter
03b32cb281 perf thread-stack: Simplify some code in thread_stack__process()
In preparation for fixing thread stack processing for the idle task,
simplify some code in thread_stack__process().

Signed-off-by: Adrian Hunter <adrian.hunter@intel.com>
Acked-by: Jiri Olsa <jolsa@kernel.org>
Link: http://lkml.kernel.org/r/20181221120620.9659-2-adrian.hunter@intel.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2019-01-02 10:42:45 -03:00
Linus Torvalds
889bb74302 nds32 patches for 4.21
Here is the nds32 patch set based on 4.20-rc1.
 Contained in here are
 1. Perf support
 2. Power management support
 3. FPU support
 4. Hardware prefetcher support
 5. Build error fixed
 6. Performance enhancement
 
 These are the LTP20170427 testing results.
 Total Tests: 1902
 Total Skipped Tests: 603
 Total Failures: 410
 Kernel Version: 4.20.0-rc1-00016-ge0db606bc023
 Machine Architecture: nds32
 Hostname: greentime-d15-ae3xx
 -----BEGIN PGP SIGNATURE-----
 Version: GnuPG v2.0.17 (GNU/Linux)
 
 iQIcBAABAgAGBQJcJdsZAAoJEHfB0l0b2JxEV7QQAJLwF0ixvOhCO+y4tM9596ai
 BiV+duMg9tvJkbrfM4Rli5Bd2PpZdNoWtwXRi6azgORkczx5ioYJFSFmkodvhlb9
 WQfYiDeD1PF1/kWQyT9xQm4x/kpDTWDHROacUENLlwJn/36iqTKVPn2aSFR5hhDv
 fVbYUyCqvUq+jRaxvcL95KirGMJZNFZhT+OMnLwVbxwcFCstOTkTAS+K5GIOfg6Z
 I0ONlcM+N9ezrsqfIiaO45nXD9OVsTTHGqrXVuh5GF8KMVARImCOxAtehpt5jdmE
 xw3YMlzUNzKfdB8olu9rb903UcW1Vy2g/5H9paFhPGPNmWtlMV5zgKrTAQM1ETWC
 JNJaL4oDWfQPJdV191rmAgcTOxvZbbAGlGjjViOZMvwgrjUIWgA0+vAzmBQvW0cQ
 EYj4nHwaAIVA2p3Mobt5i9inH/xm7vKoLHqvqUNgdl4JVDbtyGBOxV2f9pEtU7ij
 AZCDc0EBhR/3Tqj48YLSrInkMVyc4CRtSPTZxkQmot02+iJsEROo7GZyDTwmxdgw
 epKDZeMnTGNF3atGBtuVLBhrj+l2W88WGFq52hT841WqfFknTar0J/M4b3FXCm6g
 EjeADk6Oy9eI/gDAAWnRDptZbZEqtA0qguTBrNtS5kqI1rX6kREMJnnJ3KuqB0bK
 qT/3aw6a4nFOVdtgYw5z
 =Gy5E
 -----END PGP SIGNATURE-----

Merge tag 'nds32-for-linus-4.21' of git://git.kernel.org/pub/scm/linux/kernel/git/greentime/linux

Pull nds32 updates from Greentime Hu:

 - Perf support

 - Power management support

 - FPU support

 - Hardware prefetcher support

 - Build error fixed

 - Performance enhancement

* tag 'nds32-for-linus-4.21' of git://git.kernel.org/pub/scm/linux/kernel/git/greentime/linux:
  nds32: support hardware prefetcher
  nds32: Fix the items of hwcap_str ordering issue.
  math-emu/soft-fp.h: (_FP_ROUND_ZERO) cast 0 to void to fix warning
  math-emu/op-2.h: Use statement expressions to prevent negative constant shift
  nds32: support denormalized result through FP emulator
  nds32: Support FP emulation
  nds32: nds32 FPU port
  nds32: Remove duplicated include from pm.c
  nds32: Power management for nds32
  nds32: Add document for NDS32 PMU.
  nds32: Add perf call-graph support.
  nds32: Perf porting
  nds32: Fix bug in bitfield.h
  nds32: Fix gcc 8.0 compiler option incompatible.
  nds32: Fill all TLB entries with kernel image mapping
  nds32: Remove the redundant assignment
2018-12-29 09:37:03 -08:00
Jiri Olsa
c4a75bb948 perf c2c: Increase the HITM ratio limit for displayed cachelines
The cachelines being reported are the ones with percentages all the way
down to 0.05%.  That makes for very long output files. Raising that to
0.1%.  The user can always specify --show-all if they want all the
cachelines with hits.

Suggested-by: Joe Mario <jmario@redhat.com>
Signed-off-by: Jiri Olsa <jolsa@kernel.org>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: http://lkml.kernel.org/r/20181228101820.28010-2-jolsa@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2018-12-28 16:33:07 -03:00
Jiri Olsa
423701a0c8 perf c2c: Change the default coalesce setup
Joe suggested to have the coalesce default set just to 'iaddr', because
it's easier to read on the default 'perf c2c report' output.

By removing the "pid" field from the default -c/--coalesce option, the
'perf c2c' report will group all the relevant PIDs under the instruction
address ('iaddr') bucket. User can always run "-c pid,iaddr" for a more
fine grained output on particular PIDs.

Suggested-by: Joe Mario <jmario@redhat.com>
Signed-off-by: Jiri Olsa <jolsa@kernel.org>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: http://lkml.kernel.org/r/20181228101820.28010-1-jolsa@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2018-12-28 16:33:06 -03:00
Arnaldo Carvalho de Melo
38fc9da69f perf trace beauty ioctl: Beautify USBDEVFS_ commands
For instance, while debugging the 'galileo' python utility to
synchronize fitbit trackers:

  # perf trace -e ioctl ./run --force
  ioctl(0</dev/pts/8>, TCSETS, 0x7ffe28666420) = 0
  ioctl(0</dev/pts/8>, TCSETS, 0x7ffe28666290) = 0
  ioctl(1</dev/pts/8>, TCSETS, 0x7ffe28666290) = 0
  ioctl(2</dev/pts/8>, TCSETS, 0x7ffe28666290) = 0
  ioctl(3</home/acme/hg/galileo/run>, TCSETS, 0x7ffe286663f0) = -1 ENOTTY (Inappropriate ioctl for device)
  ioctl(1</dev/pts/8>, TCSETS, 0x7ffe286655a0) = 0
  ioctl(1</dev/pts/8>, TCSETS, 0x7ffe28665470) = 0
  ioctl(1</dev/pts/8>, TCSETS, 0x7ffe28665470) = 0
  ioctl(1</dev/pts/8>, TCSETS, 0x7ffe286654a0) = 0
  ioctl(1</dev/pts/8>, TCSETS, 0x7ffe286654a0) = 0
  ioctl(1</dev/pts/8>, TCSETS, 0x7ffe28665400) = 0
  ioctl(1</dev/pts/8>, TIOCSWINSZ, 0x7ffe286654c0) = 0
  ioctl(0</dev/pts/8>, TIOCSWINSZ, 0x7ffe28665560) = 0
  ioctl(0</dev/pts/8>, TIOCSWINSZ, 0x7ffe28665560) = 0
  ioctl(0</dev/pts/8>, TIOCMGET, 0x7ffe28665560) = 0
  ioctl(0</dev/pts/8>, TCSETS, 0x7ffe28665530) = 0
  ioctl(10</dev/bus/usb/001/011>, USBDEVFS_GET_CAPABILITIES, 0x561468dad048) = 0
  ioctl(10</dev/bus/usb/001/011>, USBDEVFS_GETDRIVER, 0x7ffe28665500) = -1 ENODATA (No data available)
  ioctl(10</dev/bus/usb/001/011>, USBDEVFS_GETDRIVER, 0x7ffe28665500) = -1 ENODATA (No data available)
  ioctl(10</dev/bus/usb/001/011>, USBDEVFS_SETCONFIGURATION, 0x7ffe2866513c) = 0
  ioctl(10</dev/bus/usb/001/011>, USBDEVFS_CLAIMINTERFACE, 0x7ffe286647bc) = 0
  ioctl(10</dev/bus/usb/001/011>, USBDEVFS_SUBMITURB, 0x561468dace40) = 0
  ioctl(10</dev/bus/usb/001/011>, USBDEVFS_REAPURBNDELAY, 0x7ffe28664c10) = 0
  ioctl(10</dev/bus/usb/001/011>, USBDEVFS_REAPURBNDELAY, 0x7ffe28664c10) = -1 EAGAIN (Resource temporarily unavailable)
  ioctl(10</dev/bus/usb/001/011>, USBDEVFS_SUBMITURB, 0x561468dace40) = 0
  ioctl(10</dev/bus/usb/001/011>, USBDEVFS_REAPURBNDELAY, 0x7ffe28664dd0) = 0
  ioctl(10</dev/bus/usb/001/011>, USBDEVFS_REAPURBNDELAY, 0x7ffe28664dd0) = -1 EAGAIN (Resource temporarily unavailable)
  <SNIP>
  ioctl(10</dev/bus/usb/001/011>, USBDEVFS_SUBMITURB, 0x561468e72ec0) = 0
  ioctl(10</dev/bus/usb/001/011>, USBDEVFS_REAPURBNDELAY, 0x7ffe28664cc0) = 0
  ioctl(10</dev/bus/usb/001/011>, USBDEVFS_REAPURBNDELAY, 0x7ffe28664cc0) = -1 EAGAIN (Resource temporarily unavailable)
  ioctl(10</dev/bus/usb/001/011>, USBDEVFS_RELEASEINTERFACE, 0x7ffe2866463c) = 0
  ioctl(10</dev/bus/usb/001/011>, USBDEVFS_RELEASEINTERFACE, 0x7ffe2866463c) = 0
  Tracker: 813F4690C3D1: Synchronisation successful
  #

Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Luis Cláudio Gonçalves <lclaudio@redhat.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Wang Nan <wangnan0@huawei.com>
Link: https://lkml.kernel.org/n/tip-6x2cawak7jno3gpp5pagzj50@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2018-12-28 16:33:06 -03:00
Arnaldo Carvalho de Melo
2d473389f8 perf trace beauty: Export function to get the files for a thread
So that beautifiers can access things like dev_maj.

Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Luis Cláudio Gonçalves <lclaudio@redhat.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Wang Nan <wangnan0@huawei.com>
Link: https://lkml.kernel.org/n/tip-wm5o51f206c5pi063dsaeraq@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2018-12-28 16:33:05 -03:00
Arnaldo Carvalho de Melo
86cf4c659c perf trace: Wire up ioctl's USBDEBFS_ cmd table generator
That ends up generating this:

  [acme@quaco perf]$ cat /tmp/build/perf/trace/beauty/generated/ioctl/usbdevfs_ioctl_array.c
  static const char *usbdevfs_ioctl_cmds[] = {
	[0] = "CONTROL",
	[10] = "SUBMITURB",
	[11] = "DISCARDURB",
	[12] = "REAPURB",
	[13] = "REAPURBNDELAY",
	[14] = "DISCSIGNAL",
	[15] = "CLAIMINTERFACE",
	[16] = "RELEASEINTERFACE",
	[17] = "CONNECTINFO",
	[18] = "IOCTL",
	[19] = "HUB_PORTINFO",
	[2] = "BULK",
	[20] = "RESET",
	[21] = "CLEAR_HALT",
	[22] = "DISCONNECT",
	[23] = "CONNECT",
	[24] = "CLAIM_PORT",
	[25] = "RELEASE_PORT",
	[26] = "GET_CAPABILITIES",
	[27] = "DISCONNECT_CLAIM",
	[28] = "ALLOC_STREAMS",
	[29] = "FREE_STREAMS",
	[3] = "RESETEP",
	[30] = "DROP_PRIVILEGES",
	[31] = "GET_SPEED",
	[4] = "SETINTERFACE",
	[5] = "SETCONFIGURATION",
	[8] = "GETDRIVER",
  };

  #if 0
  static const char *usbdevfs_ioctl_32_cmds[] = {
	[0] = "CONTROL32",
	[10] = "SUBMITURB32",
	[12] = "REAPURB32",
	[13] = "REAPURBNDELAY32",
	[14] = "DISCSIGNAL32",
	[18] = "IOCTL32",
	[2] = "BULK32",
  };
  #endif
  $

Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Luis Cláudio Gonçalves <lclaudio@redhat.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Wang Nan <wangnan0@huawei.com>
Link: https://lkml.kernel.org/n/tip-hkam6lt1g806l0p4b7buif3n@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2018-12-28 16:33:05 -03:00
Arnaldo Carvalho de Melo
870c3f40dc perf beauty ioctl: Add generator for USBDEVFS_ ioctl commands
Will be associated with fds with the right device major.

  $ tools/perf/trace/beauty/usbdevfs_ioctl.sh
  static const char *usbdevfs_ioctl_cmds[] = {
	[0] = "CONTROL",
	[10] = "SUBMITURB",
	[11] = "DISCARDURB",
	[12] = "REAPURB",
	[13] = "REAPURBNDELAY",
	[14] = "DISCSIGNAL",
	[15] = "CLAIMINTERFACE",
	[16] = "RELEASEINTERFACE",
	[17] = "CONNECTINFO",
	[18] = "IOCTL",
	[19] = "HUB_PORTINFO",
	[20] = "RESET",
	[21] = "CLEAR_HALT",
	[22] = "DISCONNECT",
	[23] = "CONNECT",
	[24] = "CLAIM_PORT",
	[25] = "RELEASE_PORT",
	[26] = "GET_CAPABILITIES",
	[27] = "DISCONNECT_CLAIM",
	[28] = "ALLOC_STREAMS",
	[29] = "FREE_STREAMS",
	[2] = "BULK",
	[30] = "DROP_PRIVILEGES",
	[31] = "GET_SPEED",
	[3] = "RESETEP",
	[4] = "SETINTERFACE",
	[5] = "SETCONFIGURATION",
	[8] = "GETDRIVER",
  };

  #if 0
  static const char *usbdevfs_ioctl_32_cmds[] = {
	[0] = "CONTROL32",
	[10] = "SUBMITURB32",
	[12] = "REAPURB32",
	[13] = "REAPURBNDELAY32",
	[14] = "DISCSIGNAL32",
	[18] = "IOCTL32",
	[2] = "BULK32",
  };
  #endif
  $

Leaving the '32' variants commented, later we can try to support those
as well, from some other hint (maybe something about the thread issuing
the ioctls) and from the _IOC_SIZE(cmd).

Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Luis Cláudio Gonçalves <lclaudio@redhat.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Wang Nan <wangnan0@huawei.com>
Link: https://lkml.kernel.org/n/tip-neq1lrji5k4ku0rktn7ytnri@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2018-12-28 16:33:04 -03:00
Arnaldo Carvalho de Melo
2bd71d11a8 tools headers uapi: Grab a copy of usbdevice_fs.h
Will be used to generate the string table for the USBDEVFS_ prefixed
ioctl commands.

Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Luis Cláudio Gonçalves <lclaudio@redhat.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Wang Nan <wangnan0@huawei.com>
Link: https://lkml.kernel.org/n/tip-3vrm9b55tdhzn8sw9qazh4z5@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2018-12-28 16:33:04 -03:00
Arnaldo Carvalho de Melo
4bcc4cff6a perf trace: Store the major number for a file when storing its pathname
We keep a table for the fds to map them back to pathnames when showing
'fd' based APIs such as write(), store as well the major number for the
device the path is in, to use in things like choosing the right ioctl
'cmd' beautifier.

Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Luis Cláudio Gonçalves <lclaudio@redhat.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Wang Nan <wangnan0@huawei.com>
Link: https://lkml.kernel.org/n/tip-qjkds7bnk7v7fk2xhqsb0a4v@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2018-12-28 16:33:04 -03:00
Arnaldo Carvalho de Melo
d7e134845d perf trace: Move the files table resizing to outside set_pathname()
So that we can have that table expanded when setting other attributes.

Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Luis Cláudio Gonçalves <lclaudio@redhat.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Wang Nan <wangnan0@huawei.com>
Link: https://lkml.kernel.org/n/tip-hzvpe3qwafe6sqcq3bhtbxds@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2018-12-28 16:33:03 -03:00
Arnaldo Carvalho de Melo
f4a74fcbfd perf trace: Rename thread_thread->paths to thread_trace->files
So that we can add more per file attributes besides the pathname, such
as which ioctl beautifier to use, for cases such as the sound and
usbdeffs ioctls, that both use the 'U' command, so we have to
differentiate at the major number for the device file.

Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Luis Cláudio Gonçalves <lclaudio@redhat.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Wang Nan <wangnan0@huawei.com>
Link: https://lkml.kernel.org/n/tip-1895cmhrdz2dkl5prf2cj2yj@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2018-12-28 16:33:03 -03:00
Andi Kleen
61f611593f perf script: Fix LBR skid dump problems in brstackinsn
This is a fix for another instance of the skid problem Milian recently
found [1]

The LBRs don't freeze at the exact same time as the PMI is triggered.
The perf script brstackinsn code that dumps LBR assembler assumes that
the last branch in the LBR leads to the sample point.  But with skid
it's possible that the CPU executes one or more branches before the
sample, but which do not appear in the LBR.

What happens then is either that the sample point is before the last LBR
branch. In this case the dumper sees a negative length and ignores it.
Or it the sample point is long after the last branch. Then the dumper
sees a very long block and dumps it upto its block limit (16k bytes),
which is noise in the output.

On typical sample session this can happen regularly.

This patch tries to detect and handle the situation. On the last block
that is dumped by the LBR dumper we always stop on the first branch. If
the block length is negative just scan forward to the first branch.
Otherwise scan until a branch is found.

The PT decoder already has a function that uses the instruction decoder
to detect branches, so we can just reuse it here.

Then when a terminating branch is found print an indication and stop
dumping. This might miss a few instructions, but at least shows no
runaway blocks.

Signed-off-by: Andi Kleen <ak@linux.intel.com>
Acked-by: Adrian Hunter <adrian.hunter@intel.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Milian Wolff <milian.wolff@kdab.com>
Link: http://lkml.kernel.org/r/20181120050617.4119-1-andi@firstfloor.org
[ Resolved conflict with dd2e18e9ac ("perf tools: Support 'srccode' output") ]
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2018-12-28 16:33:02 -03:00
Jiri Olsa
a389aece97 perf python: Do not force closing original perf descriptor in evlist.get_pollfd()
Ondřej reported that when compiled with python3, the python extension
regresses in evlist.get_pollfd function behaviour.

The evlist.get_pollfd function creates file objects from evlist's fds
and returns them in a list. The python3 version also sets them to 'close
the original descriptor' when the object dies (is closed), by passing
True via the 'closefd' arg in the PyFile_FromFd call.

The python's closefd doc says:

  If closefd is False, the underlying file descriptor will be kept open
  when the file is closed.

That's why the following line in python3 closes all evlist fds:

  evlist.get_pollfd()

the returned list is immediately destroyed and that takes down the
original events fds.

Passing closefd as False to PyFile_FromFd to fix this.

Reported-by: Ondřej Lysoněk <olysonek@redhat.com>
Signed-off-by: Jiri Olsa <jolsa@kernel.org>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Jaroslav Škarvada <jskarvad@redhat.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Fixes: 66dfdff03d ("perf tools: Add Python 3 support")
Link: http://lkml.kernel.org/r/20181226112121.5285-1-jolsa@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2018-12-28 16:33:02 -03:00
Colin Ian King
fbe7e42515 perf trace: Use correct SECCOMP prefix spelling, "SECOMP_*" -> "SECCOMP_*"
The spelling of the SECCOMP is incorrect, fix these.

Signed-off-by: Colin King <colin.king@canonical.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: kernel-janitors@vger.kernel.org
Fixes: c65c83ffe9 ("perf trace: Allow asking for not suppressing common string prefixes")
Link: http://lkml.kernel.org/r/20181221084809.6108-1-colin.king@canonical.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2018-12-28 16:32:54 -03:00
Linus Torvalds
8d6973327e powerpc updates for 4.21
Notable changes:
 
  - Mitigations for Spectre v2 on some Freescale (NXP) CPUs.
 
  - A large series adding support for pass-through of Nvidia V100 GPUs to guests
    on Power9.
 
  - Another large series to enable hardware assistance for TLB table walk on
    MPC8xx CPUs.
 
  - Some preparatory changes to our DMA code, to make way for further cleanups
    from Christoph.
 
  - Several fixes for our Transactional Memory handling discovered by fuzzing the
    signal return path.
 
  - Support for generating our system call table(s) from a text file like other
    architectures.
 
  - A fix to our page fault handler so that instead of generating a WARN_ON_ONCE,
    user accesses of kernel addresses instead print a ratelimited and
    appropriately scary warning.
 
  - A cosmetic change to make our unhandled page fault messages more similar to
    other arches and also more compact and informative.
 
  - Freescale updates from Scott:
    "Highlights include elimination of legacy clock bindings use from dts
     files, an 83xx watchdog handler, fixes to old dts interrupt errors, and
     some minor cleanup."
 
 And many clean-ups, reworks and minor fixes etc.
 
 Thanks to:
  Alexandre Belloni, Alexey Kardashevskiy, Andrew Donnellan, Aneesh Kumar K.V,
  Arnd Bergmann, Benjamin Herrenschmidt, Breno Leitao, Christian Lamparter,
  Christophe Leroy, Christoph Hellwig, Daniel Axtens, Darren Stevens, David
  Gibson, Diana Craciun, Dmitry V. Levin, Firoz Khan, Geert Uytterhoeven, Greg
  Kurz, Gustavo Romero, Hari Bathini, Joel Stanley, Kees Cook, Madhavan
  Srinivasan, Mahesh Salgaonkar, Markus Elfring, Mathieu Malaterre, Michal
  Suchánek, Naveen N. Rao, Nick Desaulniers, Oliver O'Halloran, Paul Mackerras,
  Ram Pai, Ravi Bangoria, Rob Herring, Russell Currey, Sabyasachi Gupta, Sam
  Bobroff, Satheesh Rajendran, Scott Wood, Segher Boessenkool, Stephen Rothwell,
  Tang Yuantian, Thiago Jung Bauermann, Yangtao Li, Yuantian Tang, Yue Haibing.
 -----BEGIN PGP SIGNATURE-----
 
 iQIcBAABAgAGBQJcJLwZAAoJEFHr6jzI4aWAAv4P/jMvP52lA90i2E8G72LOVSF1
 33DbE/Okib3VfmmMcXZpgpEfwIcEmJcIj86WWcLWzBfXLunehkgwh+AOfBLwqWch
 D08+RR9EZb7ppvGe91hvSgn4/28CWVKAxuDviSuoE1OK8lOTncu889r2+AxVFZiY
 f6Al9UPlB3FTJonNx8iO4r/GwrPigukjbzp1vkmJJg59LvNUrMQ1Fgf9D3cdlslH
 z4Ff9zS26RJy7cwZYQZI4sZXJZmeQ1DxOZ+6z6FL/nZp/O4WLgpw6C6o1+vxo1kE
 9ZnO/3+zIRhoWiXd6OcOQXBv3NNCjJZlXh9HHAiL8m5ZqbmxrStQWGyKW/jjEZuK
 wVHxfUT19x9Qy1p+BH3XcUNMlxchYgcCbEi5yPX2p9ZDXD6ogNG7sT1+NO+FBTww
 ueCT5PCCB/xWOccQlBErFTMkFXFLtyPDNFK7BkV7uxbH0PQ+9guCvjWfBZti6wjD
 /6NK4mk7FpmCiK13Y1xjwC5OqabxLUYwtVuHYOMr5TOPh8URUPS4+0pIOdoYDM6z
 Ensrq1CC843h59MWADgFHSlZ78FRtZlG37JAXunjLbqGupLOvL7phC9lnwkylHga
 2hWUWFeOV8HFQBP4gidZkLk64pkT9LzqHgdgIB4wUwrhc8r2mMZGdQTq5H7kOn3Q
 n9I48PWANvEC0PBCJ/KL
 =cr6s
 -----END PGP SIGNATURE-----

Merge tag 'powerpc-4.21-1' of git://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux

Pull powerpc updates from Michael Ellerman:
 "Notable changes:

   - Mitigations for Spectre v2 on some Freescale (NXP) CPUs.

   - A large series adding support for pass-through of Nvidia V100 GPUs
     to guests on Power9.

   - Another large series to enable hardware assistance for TLB table
     walk on MPC8xx CPUs.

   - Some preparatory changes to our DMA code, to make way for further
     cleanups from Christoph.

   - Several fixes for our Transactional Memory handling discovered by
     fuzzing the signal return path.

   - Support for generating our system call table(s) from a text file
     like other architectures.

   - A fix to our page fault handler so that instead of generating a
     WARN_ON_ONCE, user accesses of kernel addresses instead print a
     ratelimited and appropriately scary warning.

   - A cosmetic change to make our unhandled page fault messages more
     similar to other arches and also more compact and informative.

   - Freescale updates from Scott:
       "Highlights include elimination of legacy clock bindings use from
        dts files, an 83xx watchdog handler, fixes to old dts interrupt
        errors, and some minor cleanup."

  And many clean-ups, reworks and minor fixes etc.

  Thanks to: Alexandre Belloni, Alexey Kardashevskiy, Andrew Donnellan,
  Aneesh Kumar K.V, Arnd Bergmann, Benjamin Herrenschmidt, Breno Leitao,
  Christian Lamparter, Christophe Leroy, Christoph Hellwig, Daniel
  Axtens, Darren Stevens, David Gibson, Diana Craciun, Dmitry V. Levin,
  Firoz Khan, Geert Uytterhoeven, Greg Kurz, Gustavo Romero, Hari
  Bathini, Joel Stanley, Kees Cook, Madhavan Srinivasan, Mahesh
  Salgaonkar, Markus Elfring, Mathieu Malaterre, Michal Suchánek, Naveen
  N. Rao, Nick Desaulniers, Oliver O'Halloran, Paul Mackerras, Ram Pai,
  Ravi Bangoria, Rob Herring, Russell Currey, Sabyasachi Gupta, Sam
  Bobroff, Satheesh Rajendran, Scott Wood, Segher Boessenkool, Stephen
  Rothwell, Tang Yuantian, Thiago Jung Bauermann, Yangtao Li, Yuantian
  Tang, Yue Haibing"

* tag 'powerpc-4.21-1' of git://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux: (201 commits)
  Revert "powerpc/fsl_pci: simplify fsl_pci_dma_set_mask"
  powerpc/zImage: Also check for stdout-path
  powerpc: Fix HMIs on big-endian with CONFIG_RELOCATABLE=y
  macintosh: Use of_node_name_{eq, prefix} for node name comparisons
  ide: Use of_node_name_eq for node name comparisons
  powerpc: Use of_node_name_eq for node name comparisons
  powerpc/pseries/pmem: Convert to %pOFn instead of device_node.name
  powerpc/mm: Remove very old comment in hash-4k.h
  powerpc/pseries: Fix node leak in update_lmb_associativity_index()
  powerpc/configs/85xx: Enable CONFIG_DEBUG_KERNEL
  powerpc/dts/fsl: Fix dtc-flagged interrupt errors
  clk: qoriq: add more compatibles strings
  powerpc/fsl: Use new clockgen binding
  powerpc/83xx: handle machine check caused by watchdog timer
  powerpc/fsl-rio: fix spelling mistake "reserverd" -> "reserved"
  powerpc/fsl_pci: simplify fsl_pci_dma_set_mask
  arch/powerpc/fsl_rmu: Use dma_zalloc_coherent
  vfio_pci: Add NVIDIA GV100GL [Tesla V100 SXM2] subdriver
  vfio_pci: Allow regions to add own capabilities
  vfio_pci: Allow mapping extra regions
  ...
2018-12-27 10:43:24 -08:00
Arnaldo Carvalho de Melo
b9b6a2ea2b perf trace: Do not hardcode the size of the tracepoint common_ fields
We shouldn't hardcode the size of the tracepoint common_ fields, use the
offset of the 'id'/'__syscallnr' field in the sys_enter event instead.

This caused the augmented syscalls code to fail on a particular build of a
PREEMPT_RT_FULL kernel where these extra 'common_migrate_disable' and
'common_padding' fields were before the syscall id one:

  # cat /sys/kernel/debug/tracing/events/raw_syscalls/sys_enter/format
  name: sys_enter
  ID: 22
  format:
	field:unsigned short common_type;	offset:0;	size:2;	signed:0;
	field:unsigned char common_flags;	offset:2;	size:1;	signed:0;
	field:unsigned char common_preempt_count;	offset:3;	size:1;	signed:0;
	field:int common_pid;	offset:4;	size:4;	signed:1;
	field:unsigned short common_migrate_disable;	offset:8;	size:2;	signed:0;
	field:unsigned short common_padding;	offset:10;	size:2;	signed:0;

	field:long id;	offset:16;	size:8;	signed:1;
	field:unsigned long args[6];	offset:24;	size:48;	signed:0;

  print fmt: "NR %ld (%lx, %lx, %lx, %lx, %lx, %lx)", REC->id, REC->args[0], REC->args[1], REC->args[2], REC->args[3], REC->args[4], REC->args[5]
  #

All those 'common_' prefixed fields are zeroed when they hit a BPF tracepoint
hook, we better just discard those, i.e. somehow pass an offset to the
BPF program from the start of the ctx and make adjustments in the 'perf trace'
handlers to adjust the offset of the syscall arg offsets obtained from tracefs.

Till then, fix it the quick way and add this to the augmented_raw_syscalls.c to
bet it to work in such kernels:

  diff --git a/tools/perf/examples/bpf/augmented_raw_syscalls.c b/tools/perf/examples/bpf/augmented_raw_syscalls.c
  index 53c233370fae..1f746f931e13 100644
  --- a/tools/perf/examples/bpf/augmented_raw_syscalls.c
  +++ b/tools/perf/examples/bpf/augmented_raw_syscalls.c
  @@ -38,12 +38,14 @@ struct bpf_map SEC("maps") syscalls = {

   struct syscall_enter_args {
          unsigned long long common_tp_fields;
  +       long               rt_common_tp_fields;
          long               syscall_nr;
          unsigned long      args[6];
   };

   struct syscall_exit_args {
          unsigned long long common_tp_fields;
  +       long               rt_common_tp_fields;
          long               syscall_nr;
          long               ret;
   };

Just to check that this was the case. Fix it properly later, for now remove the
hardcoding of the offset in the 'perf trace' side and document the situation
with this patch.

Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Luis Cláudio Gonçalves <lclaudio@redhat.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Wang Nan <wangnan0@huawei.com>
Link: https://lkml.kernel.org/n/tip-2pqavrktqkliu5b9nzouio21@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2018-12-21 09:42:46 -03:00
Stanislav Fomichev
14541b1e7e perf build: Don't unconditionally link the libbfd feature test to -liberty and -lz
Current libbfd feature test unconditionally links against -liberty and -lz.
While it's required on some systems (e.g. opensuse), it's completely
unnecessary on the others, where only -lbdf is sufficient (debian).
This patch streamlines (and renames) the following feature checks:

feature-libbfd           - only link against -lbfd (debian),
                           see commit 2cf9040714 ("perf tools: Fix bfd
			   dependency libraries detection")
feature-libbfd-liberty   - link against -lbfd and -liberty
feature-libbfd-liberty-z - link against -lbfd, -liberty and -lz (opensuse),
                           see commit 280e7c48c3 ("perf tools: fix BFD
			   detection on opensuse")

(feature-liberty{,-z} were renamed to feature-libbfd-liberty{,z}
for clarity)

The main motivation is to fix this feature test for bpftool which is
currently broken on debian (libbfd feature shows OFF, but we still
unconditionally link against -lbfd and it works).

Tested on debian with only -lbfd installed (without -liberty); I'd
appreciate if somebody on the other systems can test this new detection
method.

Signed-off-by: Stanislav Fomichev <sdf@google.com>
Acked-by: Jiri Olsa <jolsa@kernel.org>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Mathieu Poirier <mathieu.poirier@linaro.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: http://lkml.kernel.org/r/4dfc634cfcfb236883971b5107cf3c28ec8a31be.1542328222.git.sdf@google.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2018-12-21 09:42:46 -03:00
Arnaldo Carvalho de Melo
5ce29d522e perf beauty mmap: PROT_WRITE should come before PROT_EXEC
To match strace output:

  # cat mmap.c
  #include <sys/mman.h>

  int main(void)
  {
	  mmap(0, 4096, PROT_READ|PROT_WRITE|PROT_EXEC, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0);
	  return 0;
  }
  # strace -e mmap ./mmap |& grep -v ^+++
  mmap(NULL, 103484, PROT_READ, MAP_PRIVATE, 3, 0) = 0x7f5bae400000
  mmap(NULL, 8192, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0) = 0x7f5bae3fe000
  mmap(NULL, 3889792, PROT_READ|PROT_EXEC, MAP_PRIVATE|MAP_DENYWRITE, 3, 0) = 0x7f5bade40000
  mmap(0x7f5bae1ec000, 24576, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_FIXED|MAP_DENYWRITE, 3, 0x1ac000) = 0x7f5bae1ec000
  mmap(0x7f5bae1f2000, 14976, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_FIXED|MAP_ANONYMOUS, -1, 0) = 0x7f5bae1f2000
  mmap(NULL, 4096, PROT_READ|PROT_WRITE|PROT_EXEC, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0) = 0x7f5bae419000
  # trace -e mmap ./mmap |& grep -v ^+++
  mmap(NULL, 103484, PROT_READ, MAP_PRIVATE, 3, 0) = 0x7f6646c25000
  mmap(NULL, 8192, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS) = 0x7f6646c23000
  mmap(NULL, 3889792, PROT_READ|PROT_EXEC, MAP_PRIVATE|MAP_DENYWRITE, 3, 0) = 0x7f6646665000
  mmap(0x7f6646a11000, 24576, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_FIXED|MAP_DENYWRITE, 3, 0x1ac000) = 0x7f6646a11000
  mmap(0x7f6646a17000, 14976, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_FIXED|MAP_ANONYMOUS) = 0x7f6646a17000
  mmap(NULL, 4096, PROT_READ|PROT_WRITE|PROT_EXEC, MAP_PRIVATE|MAP_ANONYMOUS) = 0x7f6646c3e000
  #

Reported-by: Namhyung Kim <namhyung@kernel.org>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Luis Cláudio Gonçalves <lclaudio@redhat.com>
Cc: Wang Nan <wangnan0@huawei.com>
Link: https://lkml.kernel.org/n/tip-nt49d6iqle80cw8f529ovaqi@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2018-12-21 09:42:46 -03:00
Arnaldo Carvalho de Melo
f76214f937 perf trace: Check if the raw_syscalls:sys_{enter,exit} are setup before setting tp filter
While updating 'perf trace' on an machine with an old precompiled
augmented_raw_syscalls.o that didn't setup the syscall map the new 'perf
trace' codebase notices the augmented_raw_syscalls.o eBPF event, decides
to use it instead of the old raw_syscalls:sys_{enter,exit} method, but
then because we don't have the syscall map tries to set the tracepoint
filter on the sys_{enter,exit} evsels, that are NULL, segfaulting.

Make the code more robust by checking it those tracepoints have
their respective evsels in place before trying to set the tp filter.

With this we still get everything to work, just not setting up the
syscall filters, which is better than a segfault. Now to update the
precompiled augmented_raw_syscalls.o and continue development :-)

Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Luis Cláudio Gonçalves <lclaudio@redhat.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Wang Nan <wangnan0@huawei.com>
Link: https://lkml.kernel.org/n/tip-3ft5rjdl05wgz2pwpx2z8btu@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2018-12-21 09:42:46 -03:00
Madhavan Srinivasan
333804dc3b powerpc/perf: Update perf_regs structure to include SIER
On each sample, Sample Instruction Event Register (SIER) content
is saved in pt_regs. SIER does not have a entry as-is in the pt_regs
but instead, SIER content is saved in the "dar" register of pt_regs.

Patch adds another entry to the perf_regs structure to include the "SIER"
printing which internally maps to the "dar" of pt_regs.

It also check for the SIER availability in the platform and present
value accordingly

Signed-off-by: Madhavan Srinivasan <maddy@linux.vnet.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2018-12-20 20:53:11 +11:00
Arnaldo Carvalho de Melo
bc055c54b8 perf symbols: Relax checks on perf-PID.map ownership
Those are simple enough, and usually not produced by root, instead by
whatever user is running java, rust, Node.js JIT code that end up
generating those /tmp/perf-PID.map for resolution of symbols in the
anonymous executable maps.

Having to use --force to resolve symbols in 'perf top' is a distraction,
as recently I experienced when node.js symbols were not being resolved
by 'perf top'.

Acked-by: Ingo Molnar <mingo@kernel.org>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Hítalo Silva <hitalos@gmail.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Pekka Enberg <penberg@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: https://lkml.kernel.org/n/tip-tk2jgo2v4v2yjuj28axbpppo@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2018-12-18 16:17:41 -03:00
Arnaldo Carvalho de Melo
42337cb768 perf trace: Wire up the fadvise 'advice' table generator
That ends up generating this:

  $ cat /tmp/build/perf/trace/beauty/generated/fadvise_advice_array.c
  static const char *fadvise_advices[] = {
	[0] = "NORMAL",
	[1] = "RANDOM",
	[2] = "SEQUENTIAL",
	[3] = "WILLNEED",
	[4] = "DONTNEED",
	[5] = "NOREUSE",
  };
$

Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Luis Cláudio Gonçalves <lclaudio@redhat.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Wang Nan <wangnan0@huawei.com>
Link: https://lkml.kernel.org/n/tip-zwbslubagram8a8zdc003u8h@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2018-12-18 16:17:41 -03:00
Arnaldo Carvalho de Melo
069c1c6cc3 perf beauty: Add generator for fadvise64's 'advice' arg constants
$ tools/perf/trace/beauty/fadvise.sh
  static const char *fadvise_advices[] = {
	[0] = "NORMAL",
	[1] = "RANDOM",
	[2] = "SEQUENTIAL",
	[3] = "WILLNEED",
	[4] = "DONTNEED",
	[5] = "NOREUSE",
  };
  $

This has a hack wrt the s390 difference.

Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Luis Cláudio Gonçalves <lclaudio@redhat.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Wang Nan <wangnan0@huawei.com>
Link: https://lkml.kernel.org/n/tip-tb7jguv01u8p570piq13eioh@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2018-12-18 16:17:41 -03:00
Arnaldo Carvalho de Melo
f9cdd63e79 tools headers uapi: Grab a copy of fadvise.h
Will be used to generate the string table for fadvise64's 'advice'
argument.

Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Luis Cláudio Gonçalves <lclaudio@redhat.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Wang Nan <wangnan0@huawei.com>
Link: https://lkml.kernel.org/n/tip-muswpnft8q9krktv052yrgsc@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2018-12-18 16:17:40 -03:00
Arnaldo Carvalho de Melo
a66313408a perf beauty mmap: Print mmap's 'offset' arg in hexadecimal
Also to make it match 'strace' output, for regression testing.

Both now produce this option, when 'perf trace' uses a .perfconfig
asking for the strace like output:

  mmap(0x7faf66e6a000, 1363968, PROT_READ|PROT_EXEC, MAP_PRIVATE|MAP_FIXED|MAP_DENYWRITE, 3, 0x22000) = 0x7faf66e6a000

Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Luis Cláudio Gonçalves <lclaudio@redhat.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Wang Nan <wangnan0@huawei.com>
Link: https://lkml.kernel.org/n/tip-27qhouo1kaac2iyl85nfnsf5@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2018-12-18 16:15:20 -03:00
Arnaldo Carvalho de Melo
1355e09ab0 perf beauty mmap: Print PROT_READ before PROT_EXEC to match strace output
Helps with comparing 'strace' and 'perf trace' output, for mutual
regression testing.

Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Luis Cláudio Gonçalves <lclaudio@redhat.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Wang Nan <wangnan0@huawei.com>
Link: https://lkml.kernel.org/n/tip-va0qe95xbhep5hy52aq5qe0v@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2018-12-18 16:15:19 -03:00
Arnaldo Carvalho de Melo
fb7068e73d perf trace beauty: Beautify arch_prctl()'s arguments
This actually so far, AFAIK is available only in x86, so the code was
put in place with x86 prefixes, in arches where it is not available it
will just not be called, so no further mechanisms are needed at this
time.

Later, when other arches wire this up, we'll just look at the uname
(live sessions) or perf_env data in the perf.data header to auto-wire
the right beautifier.

With this the output is the same as produced by 'strace' when used with
the following ~/.perfconfig:

  # cat ~/.perfconfig
  [llvm]
	dump-obj = true
  [trace]
	  add_events = /home/acme/git/perf/tools/perf/examples/bpf/augmented_raw_syscalls.o
	  show_zeros = yes
	  show_duration = no
	  no_inherit = yes
	  show_timestamp = no
	  show_arg_names = no
	  args_alignment = -40
	  show_prefix = yes
  #

And, on fedora 29, since the string tables are generated from the kernel
sources, we don't know about 0x3001, just like strace:

  --- /tmp/strace 2018-12-17 11:22:08.707586721 -0300
  +++ /tmp/trace  2018-12-18 11:11:32.037512729 -0300
  @@ -1,49 +1,49 @@
  -arch_prctl(0x3001 /* ARCH_??? */, 0x7ffc8a92dc80) = -1 EINVAL (Invalid argument)
  +arch_prctl(0x3001 /* ARCH_??? */, 0x7ffe4eb93ae0) = -1 EINVAL (Invalid argument)
  -arch_prctl(ARCH_SET_FS, 0x7faf6700f540) = 0
  +arch_prctl(ARCH_SET_FS, 0x7fb507364540) = 0

And that seems to be related to the CET/Shadow Stack feature, that
userland in Fedora 29 (glibc 2.28) are querying the kernel about, that
0x3001 seems to be ARCH_CET_STATUS, I'll check the situation and test
with a fedora 29 kernel to see if the other codes are used.

A diff that ignores the different pointers for different runs needs to
be put in place in the upcoming regression tests comparing 'perf trace's
output to strace's.

Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Luis Cláudio Gonçalves <lclaudio@redhat.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Wang Nan <wangnan0@huawei.com>
Link: https://lkml.kernel.org/n/tip-73a9prs8ktkrt97trtdmdjs8@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2018-12-18 16:15:19 -03:00
Arnaldo Carvalho de Melo
9614b8d697 perf trace: When showing string prefixes show prefix + ??? for unknown entries
To match 'strace' output, like in:

  arch_prctl(0x3001 /* ARCH_??? */, 0x7ffc8a92dc80) = -1 EINVAL (Invalid argument)

Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Luis Cláudio Gonçalves <lclaudio@redhat.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Wang Nan <wangnan0@huawei.com>
Link: https://lkml.kernel.org/n/tip-kx59j2dk5l1x04ou57mt99ck@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2018-12-18 16:15:19 -03:00
Arnaldo Carvalho de Melo
1f2d085e0f perf trace: Move strarrays to beauty.h for further reuse
We'll use it in the upcoming arch_prctl() 'code' arg beautifier.

Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Luis Cláudio Gonçalves <lclaudio@redhat.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Wang Nan <wangnan0@huawei.com>
Link: https://lkml.kernel.org/n/tip-6e4tj2fjen8qa73gy4u49vav@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2018-12-18 16:15:19 -03:00
Arnaldo Carvalho de Melo
40714e8b37 perf beauty: Wire up the x86_arch prctl code table generator
$ cat /tmp/build/perf/trace/beauty/generated/x86_arch_prctl_code_array.c
  #define x86_arch_prctl_codes_1_offset 0x1001
  static const char *x86_arch_prctl_codes_1[] = {
	[0x1001 - 0x1001]= "SET_GS",
	[0x1002 - 0x1001]= "SET_FS",
	[0x1003 - 0x1001]= "GET_FS",
	[0x1004 - 0x1001]= "GET_GS",
	[0x1011 - 0x1001]= "GET_CPUID",
	[0x1012 - 0x1001]= "SET_CPUID",
  };

  #define x86_arch_prctl_codes_2_offset 0x2001
  static const char *x86_arch_prctl_codes_2[] = {
	[0x2001 - 0x2001]= "MAP_VDSO_X32",
	[0x2002 - 0x2001]= "MAP_VDSO_32",
	[0x2003 - 0x2001]= "MAP_VDSO_64",
  };
  $

Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Luis Cláudio Gonçalves <lclaudio@redhat.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Wang Nan <wangnan0@huawei.com>
Link: https://lkml.kernel.org/n/tip-3r9blij6n8wdlsyd5dujx86r@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2018-12-18 16:15:19 -03:00
Arnaldo Carvalho de Melo
ff4cb769bc perf beauty: Add a string table generator for x86's 'arch_prctl' codes
$ tools/perf/trace/beauty/x86_arch_prctl.sh
  #define x86_arch_prctl_codes_1_offset 0x1001
  static const char *x86_arch_prctl_codes_1[] = {
	[0x1001 - 0x1001]= "SET_GS",
	[0x1002 - 0x1001]= "SET_FS",
	[0x1003 - 0x1001]= "GET_FS",
	[0x1004 - 0x1001]= "GET_GS",
	[0x1011 - 0x1001]= "GET_CPUID",
	[0x1012 - 0x1001]= "SET_CPUID",
  };

  #define x86_arch_prctl_codes_2_offset 0x2001
  static const char *x86_arch_prctl_codes_2[] = {
	[0x2001 - 0x2001]= "MAP_VDSO_X32",
	[0x2002 - 0x2001]= "MAP_VDSO_32",
	[0x2003 - 0x2001]= "MAP_VDSO_64",
  };
  $

Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Luis Cláudio Gonçalves <lclaudio@redhat.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Wang Nan <wangnan0@huawei.com>
Link: https://lkml.kernel.org/n/tip-w0fux1psivphhx6rve8kn3vq@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2018-12-18 16:15:18 -03:00
Arnaldo Carvalho de Melo
c22e2683c0 tools include arch: Grab a copy of x86's prctl.h
We need it to generate the tables for the 'code' arch_prctl's syscall
argument.

Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Luis Cláudio Gonçalves <lclaudio@redhat.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Wang Nan <wangnan0@huawei.com>
Link: https://lkml.kernel.org/n/tip-vu890pi18fpd4eyz61cazckj@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2018-12-18 16:15:18 -03:00
Arnaldo Carvalho de Melo
ce05539f20 perf trace: Show NULL when syscall pointer args are 0
Matching strace's output format. The 'format' file for the syscall
tracepoints have an indication if the arg is a pointer, with some
exceptions like 'mmap' that has its first arg as an 'unsigned long', so
use a heuristic using the argument name, i.e. if it contains the 'addr'
substring, format it with the pointer formatter.

Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Luis Cláudio Gonçalves <lclaudio@redhat.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Wang Nan <wangnan0@huawei.com>
Link: https://lkml.kernel.org/n/tip-ddghemr8qrm6i0sb8awznbze@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2018-12-18 16:15:18 -03:00
Arnaldo Carvalho de Melo
2c83dfae02 perf trace: Enclose the errno strings with ()
To match strace, now both emit the same line for calls like:

 access("/etc/ld.so.preload", R_OK)      = -1 ENOENT (No such file or directory)

Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Luis Cláudio Gonçalves <lclaudio@redhat.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Wang Nan <wangnan0@huawei.com>
Link: https://lkml.kernel.org/n/tip-krxl6klsqc9qyktoaxyih942@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2018-12-18 16:15:18 -03:00
Arnaldo Carvalho de Melo
c48ee107bb perf augmented_raw_syscalls: Copy 'access' arg as well
This will all come from userspace, but to test the changes to make 'perf
trace' output similar to strace's, do this one more now manually.

To update the precompiled augmented_raw_syscalls.o binary I just run:

  # perf record -e ~acme/git/perf/tools/perf/examples/bpf/augmented_raw_syscalls.c sleep 1
  LLVM: dumping /home/acme/git/perf/tools/perf/examples/bpf/augmented_raw_syscalls.o
  [ perf record: Woken up 1 times to write data ]
  [ perf record: Captured and wrote 0.022 MB perf.data ]
  #

Because to have augmented_raw_syscalls to be always used and a fast
startup and remove the need to have the llvm toolchain installed, I'm
using:

  # perf config | grep add_events
  trace.add_events=/home/acme/git/perf/tools/perf/examples/bpf/augmented_raw_syscalls.o
  #

So when doing changes to augmented_raw_syscals.c one needs to rebuild
the .o file.

This will be done automagically later, i.e. have a 'make' behaviour of
recompiling when the .c gets changed.

Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Luis Cláudio Gonçalves <lclaudio@redhat.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Wang Nan <wangnan0@huawei.com>
Link: https://lkml.kernel.org/n/tip-lw3i2atyq8549fpqwmszn3qp@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2018-12-18 16:15:18 -03:00
Arnaldo Carvalho de Melo
4b8a240ed5 perf trace: Add alignment spaces after the closing parens
To use strace's style, helping in comparing the output of 'perf trace'
with the one from 'strace', to help in upcoming regression tests.

Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Luis Cláudio Gonçalves <lclaudio@redhat.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Wang Nan <wangnan0@huawei.com>
Link: https://lkml.kernel.org/n/tip-mw6peotz4n84rga0fk78buff@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2018-12-18 16:15:10 -03:00
Arnaldo Carvalho de Melo
601d66d433 perf trace beauty: Print O_RDONLY when (flags & O_ACCMODE) == 0
And there are more flags, to match strace's output.

 openat(AT_FDCWD, "/etc/ld.so.cache", O_RDONLY|O_CLOEXEC) = 3

Also to help with regression tests.

Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Luis Cláudio Gonçalves <lclaudio@redhat.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Wang Nan <wangnan0@huawei.com>
Link: https://lkml.kernel.org/n/tip-ofovpmvdli3bwch30936xn7t@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2018-12-18 16:07:42 -03:00
Arnaldo Carvalho de Melo
c65c83ffe9 perf trace: Allow asking for not suppressing common string prefixes
So far we've been suppressing common stuff such as "MAP_" in the mmap
flags, showing "SHARED" instead of "MAP_SHARED", allow for those
prefixes (and a few suffixes) to be shown:

  # trace -e *map,open*,*seek sleep 1
  openat("/etc/ld.so.cache", CLOEXEC) = 3
  mmap(0, 109093, READ, PRIVATE, 3, 0) = 0x7ff61c695000
  openat("/lib64/libc.so.6", CLOEXEC) = 3
  lseek(3, 792, SET) = 792
  mmap(0, 8192, READ|WRITE, PRIVATE|ANONYMOUS) = 0x7ff61c693000
  lseek(3, 792, SET) = 792
  lseek(3, 864, SET) = 864
  mmap(0, 1857568, READ, PRIVATE|DENYWRITE, 3, 0) = 0x7ff61c4cd000
  mmap(0x7ff61c4ef000, 1363968, EXEC|READ, PRIVATE|FIXED|DENYWRITE, 3, 139264) = 0x7ff61c4ef000
  mmap(0x7ff61c63c000, 311296, READ, PRIVATE|FIXED|DENYWRITE, 3, 1503232) = 0x7ff61c63c000
  mmap(0x7ff61c689000, 24576, READ|WRITE, PRIVATE|FIXED|DENYWRITE, 3, 1814528) = 0x7ff61c689000
  mmap(0x7ff61c68f000, 14368, READ|WRITE, PRIVATE|FIXED|ANONYMOUS) = 0x7ff61c68f000
  munmap(0x7ff61c695000, 109093) = 0
  openat("/usr/lib/locale/locale-archive", CLOEXEC) = 3
  mmap(0, 217749968, READ, PRIVATE, 3, 0) = 0x7ff60f523000
  #
  # vim ~/.perfconfig
  #
  # perf config
  llvm.dump-obj=true
  trace.add_events=/home/acme/git/perf/tools/perf/examples/bpf/augmented_raw_syscalls.o
  trace.show_zeros=yes
  trace.show_duration=no
  trace.no_inherit=yes
  trace.show_timestamp=no
  trace.show_arg_names=no
  trace.args_alignment=0
  trace.string_quote="
  trace.show_prefix=yes
  #
  #
  # trace -e *map,open*,*seek sleep 1
  openat(AT_FDCWD, "/etc/ld.so.cache", O_CLOEXEC) = 3
  mmap(0, 109093, PROT_READ, MAP_PRIVATE, 3, 0) = 0x7f7ebbe59000
  openat(AT_FDCWD, "/lib64/libc.so.6", O_CLOEXEC) = 3
  lseek(3, 792, SEEK_SET) = 792
  mmap(0, 8192, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS) = 0x7f7ebbe57000
  lseek(3, 792, SEEK_SET) = 792
  lseek(3, 864, SEEK_SET) = 864
  mmap(0, 1857568, PROT_READ, MAP_PRIVATE|MAP_DENYWRITE, 3, 0) = 0x7f7ebbc91000
  mmap(0x7f7ebbcb3000, 1363968, PROT_EXEC|PROT_READ, MAP_PRIVATE|MAP_FIXED|MAP_DENYWRITE, 3, 139264) = 0x7f7ebbcb3000
  mmap(0x7f7ebbe00000, 311296, PROT_READ, MAP_PRIVATE|MAP_FIXED|MAP_DENYWRITE, 3, 1503232) = 0x7f7ebbe00000
  mmap(0x7f7ebbe4d000, 24576, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_FIXED|MAP_DENYWRITE, 3, 1814528) = 0x7f7ebbe4d000
  mmap(0x7f7ebbe53000, 14368, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_FIXED|MAP_ANONYMOUS) = 0x7f7ebbe53000
  munmap(0x7f7ebbe59000, 109093) = 0
  openat(AT_FDCWD, "/usr/lib/locale/locale-archive", O_CLOEXEC) = 3
  mmap(0, 217749968, PROT_READ, MAP_PRIVATE, 3, 0) = 0x7f7eaece7000
  #

Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Luis Cláudio Gonçalves <lclaudio@redhat.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Wang Nan <wangnan0@huawei.com>
Link: https://lkml.kernel.org/n/tip-mtn1i4rjowjl72trtnbmvjd4@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2018-12-18 16:07:42 -03:00
Arnaldo Carvalho de Melo
2e3d7fac9d perf trace: Add a prefix member to the strarray class
So that the user, in an upcoming patch, can select printing it to get
the full string as used in the source code, not one with a common prefix
chopped off so as to make the output more compact.

Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Luis Cláudio Gonçalves <lclaudio@redhat.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Wang Nan <wangnan0@huawei.com>
Link: https://lkml.kernel.org/n/tip-zypczc88gzbmeqx7b372s138@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2018-12-18 16:07:42 -03:00
Arnaldo Carvalho de Melo
721f5326fb perf trace: Enclose strings with double quotes
To match 'strace' output, helping with upcoming regression tests
comparing both outputs.

Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Luis Cláudio Gonçalves <lclaudio@redhat.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Wang Nan <wangnan0@huawei.com>
Link: https://lkml.kernel.org/n/tip-jab52t1dcuh6vlztqle9g7u9@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2018-12-18 16:07:41 -03:00
Arnaldo Carvalho de Melo
9ed45d59ae perf trace: Make the alignment of the syscall args be configurable
Since the start 'perf trace' aligns the parens enclosing the list of
syscall args to align the syscall results, allow this to be
configurable, keeping the default of 70. Using:

  # perf config
  llvm.dump-obj=true
  trace.add_events=/home/acme/git/perf/tools/perf/examples/bpf/augmented_raw_syscalls.o
  trace.show_zeros=yes
  trace.show_duration=no
  trace.no_inherit=yes
  trace.show_timestamp=no
  trace.show_arg_names=no
  trace.args_alignment=0
  # trace -e open*,close,*sleep sleep 1
  openat(CWD, /etc/ld.so.cache, CLOEXEC) = 3
  close(3) = 0
  openat(CWD, /lib64/libc.so.6, CLOEXEC) = 3
  close(3) = 0
  openat(CWD, /usr/lib/locale/locale-archive, CLOEXEC) = 3
  close(3) = 0
  nanosleep(0x7ffc00de66f0, 0) = 0
  close(1) = 0
  close(2) = 0
  #

Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Luis Cláudio Gonçalves <lclaudio@redhat.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Wang Nan <wangnan0@huawei.com>
Link: https://lkml.kernel.org/n/tip-r8cbhoz1lr5npq9tutpvoigr@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2018-12-18 16:07:26 -03:00
Arnaldo Carvalho de Melo
9d6dc178f0 perf trace: Allow suppressing the syscall argument names
To show just the values:

Default:

  # trace -e open*,close,*sleep sleep 1
  openat(dfd: CWD, filename: /etc/ld.so.cache, flags: CLOEXEC           ) = 3
  close(fd: 3                                                           ) = 0
  openat(dfd: CWD, filename: /lib64/libc.so.6, flags: CLOEXEC           ) = 3
  close(fd: 3                                                           ) = 0
  openat(dfd: CWD, filename: /usr/lib/locale/locale-archive, flags: CLOEXEC) = 3
  close(fd: 3                                                           ) = 0
  nanosleep(rqtp: 0x7ffc0c4ea0d0, rmtp: 0                               ) = 0
  close(fd: 1                                                           ) = 0
  close(fd: 2                                                           ) = 0
  #

Remove it:

  # perf config trace.show_arg_names=no
  # trace -e open*,close,*sleep sleep 1
  openat(CWD, /etc/ld.so.cache, CLOEXEC                                 ) = 3
  close(3                                                               ) = 0
  openat(CWD, /lib64/libc.so.6, CLOEXEC                                 ) = 3
  close(3                                                               ) = 0
  openat(CWD, /usr/lib/locale/locale-archive, CLOEXEC                   ) = 3
  close(3                                                               ) = 0
  nanosleep(0x7ffced3a8c40, 0                                           ) = 0
  close(1                                                               ) = 0
  close(2                                                               ) = 0
  #

Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Luis Cláudio Gonçalves <lclaudio@redhat.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Wang Nan <wangnan0@huawei.com>
Link: https://lkml.kernel.org/n/tip-ta9tbdwgodpw719sr2bjm8eb@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2018-12-18 12:24:01 -03:00
Arnaldo Carvalho de Melo
b036146fd0 perf trace: Allow configuring if the syscall start timestamp should be printed
# trace -e open*,close,*sleep sleep 1
     0.000 openat(dfd: CWD, filename: /etc/ld.so.cache, flags: CLOEXEC           ) = 3
     0.016 close(fd: 3                                                           ) = 0
     0.024 openat(dfd: CWD, filename: /lib64/libc.so.6, flags: CLOEXEC           ) = 3
     0.074 close(fd: 3                                                           ) = 0
     0.235 openat(dfd: CWD, filename: /usr/lib/locale/locale-archive, flags: CLOEXEC) = 3
     0.251 close(fd: 3                                                           ) = 0
     0.285 nanosleep(rqtp: 0x7ffd68e6d620, rmtp: 0                               ) = 0
  1000.386 close(fd: 1                                                           ) = 0
  1000.395 close(fd: 2                                                           ) = 0
  #

  # perf config trace.show_timestamp=no
  # trace -e open*,close,*sleep sleep 1
  openat(dfd: CWD, filename: /etc/ld.so.cache, flags: CLOEXEC           ) = 3
  close(fd: 3                                                           ) = 0
  openat(dfd: CWD, filename: /lib64/libc.so.6, flags: CLOEXEC           ) = 3
  close(fd: 3                                                           ) = 0
  openat(dfd: CWD, filename: , flags: CLOEXEC                           ) = 3
  close(fd: 3                                                           ) = 0
  nanosleep(rqtp: 0x7fffa79c38e0, rmtp: 0                               ) = 0
  close(fd: 1                                                           ) = 0
  close(fd: 2                                                           ) = 0
  #

Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Luis Cláudio Gonçalves <lclaudio@redhat.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Wang Nan <wangnan0@huawei.com>
Link: https://lkml.kernel.org/n/tip-mjjnicy48367jah6ls4k0nk8@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2018-12-18 12:24:01 -03:00
Arnaldo Carvalho de Melo
d32de87e73 perf trace: Allow configuring default for perf_event_attr.inherit
I.f. if children should inherit the parent perf_event configuration,
i.e. if we should trace children as well or just the parent.

The default is to follow children, to disable this and have a behaviour
similar to strace, set this config option or use the --no_inherit 'perf
trace' option.

E.g.:

Default:

  # perf config trace.no_inherit
  # trace -e clone,*sleep time sleep 1
     0.000 time/21107 clone(clone_flags: CHILD_CLEARTID|CHILD_SETTID|0x11, newsp: 0, child_tidptr: 0x7f7b8f9ae810) = 21108 (time)
         ? time/21108  ... [continued]: clone()
     0.691 sleep/21108 nanosleep(rqtp: 0x7ffed01d0540, rmtp: 0                               ) = 0
  0.00user 0.00system 0:01.00elapsed 0%CPU (0avgtext+0avgdata 1988maxresident)k
  0inputs+0outputs (0major+76minor)pagefaults 0swaps
  #

Disable it:

  # trace -e clone,*sleep time sleep 1
     0.000 clone(clone_flags: CHILD_CLEARTID|CHILD_SETTID|0x11, newsp: 0, child_tidptr: 0x7ff41e100810) = 21414 (time)
  0.00user 0.00system 0:01.00elapsed 0%CPU (0avgtext+0avgdata 1964maxresident)k
  0inputs+0outputs (0major+76minor)pagefaults 0swaps
  #

Notice that since there is just one thread, the "comm/TID" column is
suppressed.

Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Luis Cláudio Gonçalves <lclaudio@redhat.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Wang Nan <wangnan0@huawei.com>
Link: https://lkml.kernel.org/n/tip-thd8s16pagyza71ufi5vjlan@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2018-12-18 12:24:01 -03:00
Arnaldo Carvalho de Melo
41e0d040c4 perf config: Show the configuration when no arguments are provided
More convenient thah having to recall what letter is about
showing/listing/dumping the configuration, i.e. no arguments means
-l/--list:

  # perf config
  llvm.dump-obj=true
  trace.default_events=/home/acme/git/perf/tools/perf/examples/bpf/augmented_raw_syscalls.o
  trace.show_zeros=yes
  trace.show_duration=no
  # perf config -l
  llvm.dump-obj=true
  trace.default_events=/home/acme/git/perf/tools/perf/examples/bpf/augmented_raw_syscalls.o
  trace.show_zeros=yes
  trace.show_duration=no
  # perf config -h

   Usage: perf config [<file-option>] [options] [section.name[=value] ...]

      -l, --list            show current config variables
          --system          use system config file
          --user            use user config file

  #

Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Luis Cláudio Gonçalves <lclaudio@redhat.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Taeung Song <treeze.taeung@gmail.com>
Cc: Wang Nan <wangnan0@huawei.com>
Link: https://lkml.kernel.org/n/tip-z2n63avz6tliqb5gmu4l1dti@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2018-12-18 12:24:00 -03:00
Arnaldo Carvalho de Melo
42e4a52d01 perf trace: Allow configuring if the syscall duration should be printed
# perf config trace.show_duration=no
  # perf config -l | grep trace
  trace.default_events=/home/acme/git/perf/tools/perf/examples/bpf/augmented_raw_syscalls.o
  trace.show_zeros=true
  trace.show_duration=no
  # trace -e *sleep sleep 1
     0.000 sleep/8729 nanosleep(rqtp: 0x7ffcb0b4c940, rmtp: 0) = 0
  # perf config trace.show_duration=yes
  # trace -e *sleep sleep 1
     0.000 (1000.212 ms): sleep/8735 nanosleep(rqtp: 0x7ffca15fa770, rmtp: 0) = 0
  #

Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Luis Cláudio Gonçalves <lclaudio@redhat.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Wang Nan <wangnan0@huawei.com>
Link: https://lkml.kernel.org/n/tip-2c7h1m8fhzb9puxtj9nlevi8@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2018-12-18 12:24:00 -03:00
Arnaldo Carvalho de Melo
e7c634fcc6 perf trace: Allow configuring if zeroed syscall args should be printed
The default so far, since we show argument names followed by its values,
was to make the output more compact by suppressing most zeroed args.

Make this configurable so that users can choose what best suit their
needs.

Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Luis Cláudio Gonçalves <lclaudio@redhat.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Wang Nan <wangnan0@huawei.com>
Link: https://lkml.kernel.org/n/tip-q0gxws02ygodh94o0hzim5xd@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2018-12-18 12:24:00 -03:00
Arnaldo Carvalho de Melo
ac96287cae perf trace: Allow specifying a set of events to add in perfconfig
To add augmented_raw_syscalls to the events speficied by the user, or be
the only one if no events were specified by the user, one can add this
to perfconfig:

  # cat ~/.perfconfig
  [trace]
	  add_events = /home/acme/git/perf/tools/perf/examples/bpf/augmented_raw_syscalls.o
  #

I.e. pre-compile the augmented_raw_syscalls.c BPF program and make it
always load, this way:

  # perf trace -e open* cat /etc/passwd > /dev/null
     0.000 ( 0.013 ms): cat/31557 openat(dfd: CWD, filename: /etc/ld.so.cache, flags: CLOEXEC) = 3
     0.035 ( 0.007 ms): cat/31557 openat(dfd: CWD, filename: /lib64/libc.so.6, flags: CLOEXEC) = 3
     0.353 ( 0.009 ms): cat/31557 openat(dfd: CWD, filename: /usr/lib/locale/locale-archive, flags: CLOEXEC) = 3
     0.424 ( 0.006 ms): cat/31557 openat(dfd: CWD, filename: /etc/passwd) = 3
  #

Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Luis Cláudio Gonçalves <lclaudio@redhat.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Wang Nan <wangnan0@huawei.com>
Link: https://lkml.kernel.org/n/tip-0lgj7vh64hg3ce44gsmvj7ud@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2018-12-18 12:24:00 -03:00
Arnaldo Carvalho de Melo
4623ce405d perf augmented_raw_syscalls: Do not include stdio.h
We're not using that puts() thing, and thus we don't need to define the
__bpf_stdout__ map, reducing the setup time.

Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Luis Cláudio Gonçalves <lclaudio@redhat.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Wang Nan <wangnan0@huawei.com>
Link: https://lkml.kernel.org/n/tip-3452xgatncpil7v22minkwbo@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2018-12-18 12:24:00 -03:00
Leo Yan
7100b12cf4 perf cs-etm: Generate branch sample for exception packet
The exception packet appears as one element with 'elem_type' ==
OCSD_GEN_TRC_ELEM_EXCEPTION or OCSD_GEN_TRC_ELEM_EXCEPTION_RET, which is
present for exception entry and exit respectively.  The decoder sets the
packet fields 'packet->exc' and 'packet->exc_ret' to indicate the
exception packets; but exception packets don't have a dedicated sample
type and shares the same sample type CS_ETM_RANGE with normal
instruction packets.

As a result, the exception packets are taken as normal instruction
packets and this introduces confusion in mixing different packet types.
Furthermore, these instruction range packets will be processed for
branch samples only when 'packet->last_instr_taken_branch' is true,
otherwise they will be omitted, this can introduce a mess for exception
and exception returning due to not having the complete address range
info for context switching.

To process exception packets properly, this patch introduces two new
sample types: CS_ETM_EXCEPTION and CS_ETM_EXCEPTION_RET; these two types
of packets will be handled by cs_etm__exception().  The function
cs_etm__exception() forces setting the previous CS_ETM_RANGE packet flag
'prev_packet->last_instr_taken_branch' to true, this matches well with
the program flow when the exception is trapped from user space to kernel
space, no matter if the most recent flow has branch taken or not; this
is also safe for returning to user space after exception handling.

After exception packets have their own sample type, the packet fields
'packet->exc' and 'packet->exc_ret' aren't needed anymore, so remove
them.

Signed-off-by: Leo Yan <leo.yan@linaro.org>
Reviewed-by: Mathieu Poirier <mathieu.poirier@linaro.org>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Mike Leach <mike.leach@linaro.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Robert Walker <robert.walker@arm.com>
Cc: coresight ml <coresight@lists.linaro.org>
Cc: linux-arm-kernel@lists.infradead.org
Link: http://lkml.kernel.org/r/1544513908-16805-9-git-send-email-leo.yan@linaro.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2018-12-18 12:24:00 -03:00
Leo Yan
02e7e2509e perf cs-etm: Treat EO_TRACE element as trace discontinuity
If the decoder outputs an EO_TRACE element, it means the end of the
trace buffer; this is a discontinuity and in this case the end of trace
data needs to be saved.

This patch generates a CS_ETM_DISCONTINUITY packet for the EO_TRACE
element hereby flushing the end of trace data in cs-etm.c.

Signed-off-by: Leo Yan <leo.yan@linaro.org>
Reviewed-by: Mathieu Poirier <mathieu.poirier@linaro.org>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Mike Leach <mike.leach@linaro.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Robert Walker <robert.walker@arm.com>
Cc: coresight@lists.linaro.org
Cc: linux-arm-kernel@lists.infradead.org
Link: http://lkml.kernel.org/r/1544513908-16805-8-git-send-email-leo.yan@linaro.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2018-12-18 12:23:59 -03:00
Leo Yan
37bb37168d perf cs-etm: Treat NO_SYNC element as trace discontinuity
The CoreSight tracer driver might insert barrier packets between
different buffers, thus the decoder can spot the boundaries based on the
barrier packet; it is possible for the decoder to hit a barrier packet
and emit a NO_SYNC element, then the decoder will find a periodic
synchronisation point inside that next trace block that starts the trace
again but does not have the TRACE_ON element as indicator - usually
because this trace block has wrapped the buffer so we have lost the
original point when the trace was enabled.

In the first case it causes the insertion of a OCSD_GEN_TRC_ELEM_NO_SYNC
in the middle of the tracing stream, but as we were not handling the
NO_SYNC element properly this ends up making users miss the
discontinuity indications.

Though OCSD_GEN_TRC_ELEM_NO_SYNC is different from CS_ETM_TRACE_ON when
output from the decoder, both indicate that the trace data is
discontinuous; this patch treats OCSD_GEN_TRC_ELEM_NO_SYNC as a trace
discontinuity and generates a CS_ETM_DISCONTINUITY packet for it, so
cs-etm can handle the discontinuity for this case, finally it saves the
last trace data for the previous trace block and restart samples for the
new block.

Signed-off-by: Leo Yan <leo.yan@linaro.org>
Reviewed-by: Mathieu Poirier <mathieu.poirier@linaro.org>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Mike Leach <mike.leach@linaro.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Robert Walker <robert.walker@arm.com>
Cc: coresight ml <coresight@lists.linaro.org>
Cc: linux-arm-kernel@lists.infradead.org
Link: http://lkml.kernel.org/r/1544513908-16805-7-git-send-email-leo.yan@linaro.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2018-12-18 12:23:59 -03:00
Leo Yan
49ccf87bfb perf cs-etm: Rename CS_ETM_TRACE_ON to CS_ETM_DISCONTINUITY
TRACE_ON element is used at the beginning of trace, it also can be
appeared in the middle of trace data to indicate discontinuity; for
example, it's possible to see multiple TRACE_ON elements in the trace
stream if the trace is being limited by address range filtering.

Furthermore, except TRACE_ON element is for discontinuity, NO_SYNC and
EO_TRACE also can be used to indicate discontinuity, though they are
used for different scenarios for which the trace is interrupted.

This patch renames sample type CS_ETM_TRACE_ON to CS_ETM_DISCONTINUITY,
firstly the new name describes more closely the purpose of the packet;
secondly this is a preparation for other output elements which also
cause the trace discontinuity thus they can share the same one packet
type.

Signed-off-by: Leo Yan <leo.yan@linaro.org>
Reviewed-by: Mathieu Poirier <mathieu.poirier@linaro.org>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Mike Leach <mike.leach@linaro.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Robert Walker <robert.walker@arm.com>
Cc: coresight@lists.linaro.org
Cc: linux-arm-kernel@lists.infradead.org
Link: http://lkml.kernel.org/r/1544513908-16805-6-git-send-email-leo.yan@linaro.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2018-12-18 12:23:59 -03:00
Leo Yan
cfc1d4276b perf cs-etm: Refactor enumeration cs_etm_sample_type
The values in enumeration cs_etm_sample_type are defined with setting
bit N for each packet type, this is not suggested in the usual case.

This patch refactor cs_etm_sample_type by converting from bit shifting
values to continuous numbers.

Signed-off-by: Leo Yan <leo.yan@linaro.org>
Reviewed-by: Mathieu Poirier <mathieu.poirier@linaro.org>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Mike Leach <mike.leach@linaro.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Robert Walker <robert.walker@arm.com>
Cc: coresight@lists.linaro.org
Cc: linux-arm-kernel@lists.infradead.org
Link: http://lkml.kernel.org/r/1544513908-16805-5-git-send-email-leo.yan@linaro.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2018-12-18 12:23:59 -03:00
Leo Yan
cee7a6a212 perf cs-etm: Remove unused 'trace_on' in cs_etm_decoder
cs_etm_decoder::trace_on is being assigned when TRACE_ON or NO_SYNC
element is coming, but it is never used hence it is redundant and can
be removed.

So let's remove 'trace_on' field from cs_etm_decoder struct.

Suggested-by: Mathieu Poirier <mathieu.poirier@linaro.org>
Signed-off-by: Leo Yan <leo.yan@linaro.org>
Reviewed-by: Mathieu Poirier <mathieu.poirier@linaro.org>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Mathieu Poirier <mathieu.poirier@linaro.org>
Cc: Mike Leach <mike.leach@linaro.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Robert Walker <robert.walker@arm.com>
Cc: coresight@lists.linaro.org
Cc: linux-arm-kernel@lists.infradead.org
Link: http://lkml.kernel.org/r/1544513908-16805-4-git-send-email-leo.yan@linaro.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2018-12-18 12:23:59 -03:00
Leo Yan
24fff5eb2b perf cs-etm: Avoid stale branch samples when flush packet
At the end of trace buffer handling, function cs_etm__flush() is invoked
to flush any remaining branch stack entries.  As a side effect, it also
generates branch sample, because the 'etmq->packet' doesn't contains any
new coming packet but point to one stale packet after packets swapping,
so it wrongly makes synthesize branch samples with stale packet info.

We could review below detailed flow which causes issue:

  Packet1: start_addr=0xffff000008b1fbf0 end_addr=0xffff000008b1fbfc
  Packet2: start_addr=0xffff000008b1fb5c end_addr=0xffff000008b1fb6c

  step 1: cs_etm__sample():
	sample: ip=(0xffff000008b1fbfc-4) addr=0xffff000008b1fb5c

  step 2: flush packet in cs_etm__run_decoder():
	cs_etm__run_decoder()
	  `-> err = cs_etm__flush(etmq, false);
	sample: ip=(0xffff000008b1fb6c-4) addr=0xffff000008b1fbf0

Packet1 and packet2 are two continuous packets, when packet2 is the new
coming packet, cs_etm__sample() generates branch sample for these two
packets and use [packet1::end_addr - 4 => packet2::start_addr] as branch
jump flow, thus we can see the first generated branch sample in step 1.
At the end of cs_etm__sample() it swaps packets so 'etm->prev_packet'=
packet2 and 'etm->packet'=packet1, so far it's okay for branch sample.

If packet2 is the last one packet in trace buffer, even there have no
any new coming packet, cs_etm__run_decoder() invokes cs_etm__flush() to
flush branch stack entries as expected, but it also generates branch
samples by taking 'etm->packet' as a new coming packet, thus the branch
jump flow is as [packet2::end_addr - 4 =>  packet1::start_addr]; this
is the second sample which is generated in step 2.  So actually the
second sample is a stale sample and we should not generate it.

This patch introduces a new function cs_etm__end_block(), at the end of
trace block this function is invoked to only flush branch stack entries
and thus can avoid to generate branch sample for stale packet.

Signed-off-by: Leo Yan <leo.yan@linaro.org>
Reviewed-by: Mathieu Poirier <mathieu.poirier@linaro.org>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Mike Leach <mike.leach@linaro.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Robert Walker <robert.walker@arm.com>
Cc: coresight@lists.linaro.org
Cc: linux-arm-kernel@lists.infradead.org
Link: http://lkml.kernel.org/r/1544513908-16805-3-git-send-email-leo.yan@linaro.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2018-12-18 12:23:59 -03:00
Leo Yan
43fd56669c perf cs-etm: Correct packets swapping in cs_etm__flush()
The structure cs_etm_queue uses 'prev_packet' to point to previous
packet, this can be used to combine with new coming packet to generate
samples.

In function cs_etm__flush() it swaps packets only when the flag
'etm->synth_opts.last_branch' is true, this means that it will not swap
packets if without option '--itrace=il' to generate last branch entries;
thus for this case the 'prev_packet' doesn't point to the correct
previous packet and the stale packet still will be used to generate
sequential sample.  Thus if dump trace with 'perf script' command we can
see the incorrect flow with the stale packet's address info.

This patch corrects packets swapping in cs_etm__flush(); except using
the flag 'etm->synth_opts.last_branch' it also checks the another flag
'etm->sample_branches', if any flag is true then it swaps packets so can
save correct content to 'prev_packet'.  Finally this can fix the wrong
program flow dumping issue.

The patch has a minor refactoring to use 'etm->synth_opts.last_branch'
instead of 'etmq->etm->synth_opts.last_branch' for condition checking,
this is consistent with that is done in cs_etm__sample().

Signed-off-by: Leo Yan <leo.yan@linaro.org>
Reviewed-by: Mathieu Poirier <mathieu.poirier@linaro.org>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Mike Leach <mike.leach@linaro.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Robert Walker <robert.walker@arm.com>
Cc: coresight@lists.linaro.org
Cc: linux-arm-kernel@lists.infradead.org
Link: http://lkml.kernel.org/r/1544513908-16805-2-git-send-email-leo.yan@linaro.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2018-12-18 12:23:58 -03:00
Arnaldo Carvalho de Melo
bbab50dda7 perf trace: Switch to using a struct for the aumented_raw_syscalls syscalls map values
We'll start adding more perf-syscall stuff, so lets do this prep step so
that the next ones are just about adding more fields.

Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Wang Nan <wangnan0@huawei.com>
Link: https://lkml.kernel.org/n/tip-vac4sn1ns1vj4y07lzj7y4b8@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2018-12-18 12:23:58 -03:00
Arnaldo Carvalho de Melo
27f2992e7b perf augmented_syscalls: Switch to using a struct for the syscalls map values
We'll start adding more perf-syscall stuff, so lets do this prep step so
that the next ones are just about adding more fields.

Run it with the .c file once to cache the .o file:

  # trace --filter-pids 2834,2199 -e openat,augmented_raw_syscalls.c
  LLVM: dumping augmented_raw_syscalls.o
       0.000 ( 0.021 ms): tmux: server/4952 openat(dfd: CWD, filename: /proc/5691/cmdline                         ) = 11
     349.807 ( 0.040 ms): DNS Res~er #39/11082 openat(dfd: CWD, filename: /etc/hosts, flags: CLOEXEC                 ) = 44
    4988.759 ( 0.052 ms): gsd-color/2431 openat(dfd: CWD, filename: /etc/localtime                             ) = 18
    4988.976 ( 0.029 ms): gsd-color/2431 openat(dfd: CWD, filename: /etc/localtime                             ) = 18
  ^C[root@quaco bpf]#

From now on, we can use just the newly built .o file, skipping the
compilation step for a faster startup:

  # trace --filter-pids 2834,2199 -e openat,augmented_raw_syscalls.o
       0.000 ( 0.046 ms): DNS Res~er #39/11088 openat(dfd: CWD, filename: /etc/hosts, flags: CLOEXEC                 ) = 44
    1946.408 ( 0.190 ms): systemd/1 openat(dfd: CWD, filename: /proc/1071/cgroup, flags: CLOEXEC          ) = 20
    1946.792 ( 0.215 ms): systemd/1 openat(dfd: CWD, filename: /proc/954/cgroup, flags: CLOEXEC           ) = 20
  ^C#

Now on to do the same in the builtin-trace.c side of things.

Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Link: https://lkml.kernel.org/n/tip-k8mwu04l8es29rje5loq9vg7@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2018-12-18 12:23:58 -03:00
Arnaldo Carvalho de Melo
61d007138a perf bpf: Move perf_event_output() from stdio.h to bpf.h
So that we don't always carry that __bpf_output__ map, leaving that to
the scripts wanting to use that facility.

'perf trace' will be changed to look if that map is present and only
setup the bpf-output events if so.

Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Wang Nan <wangnan0@huawei.com>
Link: https://lkml.kernel.org/n/tip-azwys8irxqx9053vpajr0k5h@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2018-12-18 12:23:58 -03:00
Arnaldo Carvalho de Melo
b27b38ed94 perf trace: Implement syscall filtering in augmented_syscalls
Just another map, this time an BPF_MAP_TYPE_ARRAY, stating with
one bool per syscall, stating if it should be filtered or not.

So, with a pre-built augmented_raw_syscalls.o file, we use:

  # perf trace -e open*,augmented_raw_syscalls.o
     0.000 ( 0.016 ms): DNS Res~er #37/29652 openat(dfd: CWD, filename: /etc/hosts, flags: CLOEXEC                 ) = 138
   187.039 ( 0.048 ms): gsd-housekeepi/2436 openat(dfd: CWD, filename: /etc/fstab, flags: CLOEXEC                 ) = 11
   187.348 ( 0.041 ms): gsd-housekeepi/2436 openat(dfd: CWD, filename: /proc/self/mountinfo, flags: CLOEXEC       ) = 11
   188.793 ( 0.036 ms): gsd-housekeepi/2436 openat(dfd: CWD, filename: /proc/self/mountinfo, flags: CLOEXEC       ) = 11
   189.803 ( 0.029 ms): gsd-housekeepi/2436 openat(dfd: CWD, filename: /proc/self/mountinfo, flags: CLOEXEC       ) = 11
   190.774 ( 0.027 ms): gsd-housekeepi/2436 openat(dfd: CWD, filename: /proc/self/mountinfo, flags: CLOEXEC       ) = 11
   284.620 ( 0.149 ms): DataStorage/3076 openat(dfd: CWD, filename: /home/acme/.mozilla/firefox/ina67tev.default/SiteSecurityServiceState.txt, flags: CREAT|TRUNC|WRONLY, mode: IRUGO|IWUSR|IWGRP) = 167
  ^C#

What is it that this gsd-housekeeping thingy needs to open
/proc/self/mountinfo four times periodically? :-)

This map will be extended to tell per-syscall parameters, i.e. how many
bytes to copy per arg, using the function signature to get the types and
then the size of those types, via BTF.

Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Wang Nan <wangnan0@huawei.com>
Link: https://lkml.kernel.org/n/tip-cy222g9ucvnym3raqvxp0hpg@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2018-12-18 12:23:58 -03:00
Arnaldo Carvalho de Melo
0df50e0b0e perf trace: Avoid using raw_syscalls in duplicity with eBPF augmentation
So when we do something like:

   # perf trace -e open*,augmented_raw_syscalls.o

We need to set trace->trace_syscalls because there is logic that use
that when mixing strace-like output with other events, such as scheduler
tracepoints, but with that set we ended up having multiple
raw_syscalls:sys_{enter,exit} setup, which garbled the output, so
check if trace->augmented_raw_syscalls is set and avoid the two extra
tracepoints.

Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Wang Nan <wangnan0@huawei.com>
Link: https://lkml.kernel.org/n/tip-kjmnbrlgu0c38co1ye8egbsb@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2018-12-18 12:23:57 -03:00
Arnaldo Carvalho de Melo
246fbe03ed perf trace: Rename set_ev_qualifier_filter to clarify its a tracepoint filter
Rename it to trace__set_ev_qualifier_tp_filter(), as this just sets up
tracepoint filters on the raw_syscalls:sys_{enter,exit} tracepoints, and
since we're going to do the same for the augmented_raw_syscalls
codepath, when used, rename it to clarify.

Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Wang Nan <wangnan0@huawei.com>
Link: https://lkml.kernel.org/n/tip-8bjsul8x7osw7nxjodnyfn14@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2018-12-18 12:23:57 -03:00
Jiri Olsa
3f643937aa perf tools: Link libperf-jvmti.so with LDFLAGS variable
So we could propagate distro flags into libperf-jvmti.so library.

Signed-off-by: Jiri Olsa <jolsa@kernel.org>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: http://lkml.kernel.org/r/20181212132940.840-1-jolsa@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2018-12-18 12:23:57 -03:00
Arnaldo Carvalho de Melo
866053bb64 perf tools: Cast off_t to s64 to avoid warning on bionic libc
To avoid this warning:

    CC       /tmp/build/perf/util/s390-cpumsf.o
  util/s390-cpumsf.c: In function 's390_cpumsf_samples':
  util/s390-cpumsf.c:508:3: warning: format '%llx' expects argument of type 'long long unsigned int', but argument 4 has type 'off_t' [-Wformat=]
     pr_err("[%#08" PRIx64 "] Invalid AUX trailer entry TOD clock base\n",
     ^

Now the various Android cross toolchains used in the perf tools
container test builds are all clean and we can remove this:

  export EXTRA_MAKE_ARGS="WERROR=0"

Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Thomas Richter <tmricht@linux.ibm.com>
Link: https://lkml.kernel.org/n/tip-5rav4ccyb0sjciysz2i4p3sx@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2018-12-18 12:23:57 -03:00
Arnaldo Carvalho de Melo
d7a8c4a6a0 perf tools: Add missing open_memstream() prototype for systems lacking it
There are systems such as the Android NDK API level 24 has the
open_memstream() function but doesn't provide a prototype, adding noise
to the build:

  builtin-timechart.c: In function 'cat_backtrace':
  builtin-timechart.c:486:2: warning: implicit declaration of function 'open_memstream' [-Wimplicit-function-declaration]
    FILE *f = open_memstream(&p, &p_len);
    ^
  builtin-timechart.c:486:2: warning: nested extern declaration of 'open_memstream' [-Wnested-externs]
  builtin-timechart.c:486:12: warning: initialization makes pointer from integer without a cast
    FILE *f = open_memstream(&p, &p_len);
              ^

Define a LACKS_OPEN_MEMSTREAM_PROTOTYPE define so that code needing that
can get a prototype.

Checked in the bionic git repo to be available since level 23:

https://android.googlesource.com/platform/bionic/+/master/libc/include/stdio.h#241

  FILE* open_memstream(char** __ptr, size_t* __size_ptr) __INTRODUCED_IN(23);

Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Link: https://lkml.kernel.org/n/tip-343ashae97e5bq6vizusyfno@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2018-12-18 12:23:57 -03:00
Arnaldo Carvalho de Melo
0afcf29bab perf header: Fix up argument to ctime()
Reducing this noise when cross building to the Android NDK:

  util/header.c: In function 'perf_header__fprintf_info':
  util/header.c:2710:45: warning: pointer targets in passing argument 1 of 'ctime' differ in signedness [-Wpointer-sign]
    fprintf(fp, "# captured on    : %s", ctime(&st.st_ctime));
                                               ^
  In file included from util/../perf.h:5:0,
                   from util/evlist.h:11,
                   from util/header.c:22:
  /opt/android-ndk-r15c/platforms/android-26/arch-arm/usr/include/time.h:81:14: note: expected 'const time_t *' but argument is of type 'long unsigned int *'
   extern char* ctime(const time_t*) __LIBC_ABI_PUBLIC__;
                ^

Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Link: https://lkml.kernel.org/n/tip-6bz74zp080yhmtiwb36enso9@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2018-12-18 12:23:56 -03:00
Arnaldo Carvalho de Melo
748fe0889c perf tools: Add missing sigqueue() prototype for systems lacking it
There are systems such as the Android NDK API level 24 has the
sigqueue() function but doesn't provide a prototype, adding noise to the
build:

  util/evlist.c: In function 'perf_evlist__prepare_workload':
  util/evlist.c:1494:4: warning: implicit declaration of function 'sigqueue' [-Wimplicit-function-declaration]
      if (sigqueue(getppid(), SIGUSR1, val))
      ^
  util/evlist.c:1494:4: warning: nested extern declaration of 'sigqueue' [-Wnested-externs]

Define a LACKS_SIGQUEUE_PROTOTYPE define so that code needing that can
get a prototype.

Checked in the bionic git repo to be available since level 23:

https://android.googlesource.com/platform/bionic/+/master/libc/include/signal.h#123

  int sigqueue(pid_t __pid, int __signal, const union sigval __value) __INTRODUCED_IN(23);

Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Link: https://lkml.kernel.org/n/tip-lmhpev1uni9kdrv7j29glyov@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2018-12-18 12:23:56 -03:00
Arnaldo Carvalho de Melo
436651caa1 perf trace beauty: renameat's newdirfd may also be AT_FDCWD
Noticed while working on renameat2.

Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Link: https://lkml.kernel.org/n/tip-8omchrcjcvlwoxxv6wrjehfh@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2018-12-18 12:23:56 -03:00
Arnaldo Carvalho de Melo
ca7ff2c8e7 perf trace: Beautify renameat2's flags argument
# strace -e renameat2 -f perf trace -e rename* mv c /tmp
  strace: Process 10824 attached
  [pid 10824] renameat2(AT_FDCWD, "c", AT_FDCWD, "/tmp", RENAME_NOREPLACE) = -1 EXDEV (Invalid cross-device link)
  [pid 10824] renameat2(AT_FDCWD, "c", AT_FDCWD, "/tmp/c", RENAME_NOREPLACE) = -1 EEXIST (File exists)
       1.857 ( 0.008 ms): mv/10824 renameat2(olddfd: CWD, oldname: 0x7ffc72ff3d81, newdfd: CWD, newname: 0x7ffc72ff3d83, flags: NOREPLACE) = -1 EXDEV Invalid cross-device link
       2.002 ( 0.006 ms): mv/10824 renameat2(olddfd: CWD, oldname: 0x7ffc72ff3d81, newdfd: CWD, newname: 0x55ad609efcc0, flags: NOREPLACE) = -1 EEXIST File exists
  mv: 'c' and '/tmp/c' are the same file
  [pid 10824] +++ exited with 1 +++
  --- SIGCHLD {si_signo=SIGCHLD, si_code=CLD_EXITED, si_pid=10824, si_uid=0, si_status=1, si_utime=0, si_stime=0} ---
  +++ exited with 0 +++
  #

Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Link: https://lkml.kernel.org/n/tip-glyt6nzlt1yx56m5bshy6g83@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2018-12-18 12:23:56 -03:00
Arnaldo Carvalho de Melo
5a1cb7edfb perf beauty: Wire up the renameat flags table generator to the Makefile
Now when we run 'make -C tools/perf O=/tmp/build/perf' we end up with:

  $ cat /tmp/build/perf/trace/beauty/generated/rename_flags_array.c
  static const char *rename_flags[] = {
	[0 + 1] = "NOREPLACE",
	[1 + 1] = "EXCHANGE",
	[2 + 1] = "WHITEOUT",
  };
  $

Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Link: https://lkml.kernel.org/n/tip-4fad4xahrn04y06o0lc49clm@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2018-12-18 12:23:56 -03:00
Arnaldo Carvalho de Melo
bdc2a9d64a perf beauty: Add a string table generator for renameat2's flags constants
Using the already copied tools/include/uapi/linux/fs.h file:

  $ tools/perf/trace/beauty/rename_flags.sh
  static const char *rename_flags[] = {
	[0 + 1] = "NOREPLACE",
	[1 + 1] = "EXCHANGE",
	[2 + 1] = "WHITEOUT",
  };
  $

Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Link: https://lkml.kernel.org/n/tip-ta2jbh03spkymp4sbdh489g8@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2018-12-18 12:23:55 -03:00
Arnaldo Carvalho de Melo
84a835412c perf trace beauty: Beautify renameat2's fd arg wrt AT_FDCWD
Just like is done with the 'renameat' syscall.

  # strace -e renameat2 -f perf trace -e rename* mv c /tmp
  [pid 12334] renameat2(AT_FDCWD, "c", AT_FDCWD, "/tmp", RENAME_NOREPLACE) = -1 EXDEV (Invalid cross-device link)
  [pid 12334] renameat2(AT_FDCWD, "c", AT_FDCWD, "/tmp/c", RENAME_NOREPLACE) = -1 EEXIST (File exists)
     1.947 ( 0.007 ms): mv/12334 renameat2(olddfd: CWD, oldname: 0x7ffce8b7fd81, newdfd: CWD, newname: 0x7ffce8b7fd83, flags: 1) = -1 EXDEV Invalid cross-device link
     2.073 ( 0.009 ms): mv/12334 renameat2(olddfd: CWD, oldname: 0x7ffce8b7fd81, newdfd: CWD, newname: 0x55ce7f0a1cc0, flags: 1mv: ) = -1 EEXIST File exists'c' and '/tmp/c' are the same file
  #

Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Link: https://lkml.kernel.org/n/tip-8q9l92eh9eee3y2bwyqku3tc@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2018-12-18 12:23:55 -03:00
Arnaldo Carvalho de Melo
a761a8d102 perf trace: Allow selecting use the use of the ordered_events code
I was trigger happy on this one, as using ordered_events as implemented
by Jiri for use with the --block code under discussion on lkml incurs
in delaying processing to form batches that then get ordered and then
printed.

With 'perf trace' we want to process the events as they go, without that
delay, and doing it that way works well for the common case which is to
trace a thread or a workload started by 'perf trace'.

So revert back to not using ordered_events but add an option to select
that mode so that users can experiment with their particular use case to
see if works better, i.e. if the added delay is not a problem and the
ordering helps.

Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Link: https://lkml.kernel.org/n/tip-8ki7sld6rusnjhhtaly26i5o@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2018-12-18 12:23:55 -03:00
Arnaldo Carvalho de Melo
7ba61524fa perf trace: Rename delivery functions to ease making ordered_events selectable
Just hide a bit more how events gets delivered, hiding ordered_events
details from the main loop.

Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Link: https://lkml.kernel.org/n/tip-lxwwf3238ta4neq2zh1y1h45@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2018-12-18 12:23:38 -03:00
Michael Petlan
51433ead14 perf stat: Avoid segfaults caused by negated options
Some 'perf stat' options do not make sense to be negated (event,
cgroup), some do not have negated path implemented (metrics). Due to
that, it is better to disable the "no-" prefix for them, since
otherwise, the later opt-parsing segfaults.

Before:

  $ perf stat --no-metrics -- ls
  Segmentation fault (core dumped)

After:

  $ perf stat --no-metrics -- ls
   Error: option `no-metrics' isn't available
   Usage: perf stat [<options>] [<command>]

Signed-off-by: Michael Petlan <mpetlan@redhat.com>
Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
LPU-Reference: 1485912065.62416880.1544457604340.JavaMail.zimbra@redhat.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2018-12-18 12:21:44 -03:00
Michael Petlan
4eaf97e8c5 perf tests: Use shebangs in the shell scripts
Since the first line was used as a test identification, it needs to be
skipped by shell_test__description() function now.

Further notes from Hendrik:

It might be worth to note that adding the shebang is necessary to spot
them as scripts.

Using /bin/sh looks fine to.  Just briefly checked whether the scripts
contains some bash-specifics, which is not the case.

Signed-off-by: Michael Petlan <mpetlan@redhat.com>
Reviewed-by: Hendrik Brueckner <brueckner@linux.ibm.com>
Cc: Jiri Olsa <jolsa@redhat.com>
LPU-Reference: 2127419430.57657104.1542836358464.JavaMail.zimbra@redhat.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2018-12-18 12:21:44 -03:00
Adrian Hunter
571766010e perf auxtrace: Alter addr_filter__entire_dso() to work if there are no symbols
addr_filter__entire_dso() uses the first and last symbols from a dso,
and so does not work when there are no symbols.  Alter it to filter the
whole file instead.

Signed-off-by: Adrian Hunter <adrian.hunter@intel.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Masami Hiramatsu <mhiramat@kernel.org>
Cc: Mathieu Poirier <mathieu.poirier@linaro.org>
Fixes: 1b36c03e35 ("perf record: Add support for using symbols in address filters")
Link: http://lkml.kernel.org/r/20181127084634.12469-1-adrian.hunter@intel.com
[ split from a larger patch ]
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2018-12-18 12:21:44 -03:00
Adrian Hunter
b5c2161cc4 perf dso: Export data_file_size() method there are no symbols
Will be used outside dso.c in a followup patch, so rename it and make it
non-static.

Signed-off-by: Adrian Hunter <adrian.hunter@intel.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Link: http://lkml.kernel.org/r/20181127084634.12469-1-adrian.hunter@intel.com
[ split from a larger patch ]
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2018-12-18 12:21:44 -03:00
Jiri Olsa
028713aa83 perf trace: Add ordered processing
Sort events to provide the precise outcome of ordered events, just like
is done with 'perf report' and 'perf top'.

Signed-off-by: Jiri Olsa <jolsa@kernel.org>
Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Dmitry Levin <ldv@altlinux.org>
Cc: Eugene Syromiatnikov <esyr@redhat.com>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Luis Cláudio Gonçalves <lclaudio@uudg.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Steven Rostedt (VMware) <rostedt@goodmis.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Link: http://lkml.kernel.org/r/20181205160509.1168-9-jolsa@kernel.org
[ split from a larger patch, added trace__ prefixes to new 'struct trace' methods ]
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2018-12-17 15:21:17 -03:00
Jiri Olsa
83356b3d12 perf ordered_events: Add first_time() method
To get the timestamp in the first event in the queue.

Signed-off-by: Jiri Olsa <jolsa@kernel.org>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Dmitry Levin <ldv@altlinux.org>
Cc: Eugene Syromiatnikov <esyr@redhat.com>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Luis Cláudio Gonçalves <lclaudio@uudg.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Steven Rostedt (VMware) <rostedt@goodmis.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Link: https://lkml.kernel.org/n/tip-appp27jw1ul8kgg872j43r5o@git.kernel.org
[ split from a larger patch ]
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2018-12-17 15:02:17 -03:00
Jiri Olsa
1f44b3e2fc perf trace: Move event delivery to a new deliver_event() function
Mov event delivery code to a new trace__deliver_event() function, so
it's easier to add ordered delivery coming in the following patches.

Signed-off-by: Jiri Olsa <jolsa@kernel.org>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Dmitry Levin <ldv@altlinux.org>
Cc: Eugene Syromiatnikov <esyr@redhat.com>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Luis Cláudio Gonçalves <lclaudio@uudg.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Steven Rostedt (VMware) <rostedt@goodmis.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Link: http://lkml.kernel.org/r/20181205160509.1168-8-jolsa@kernel.org
[ Add trace__ prefix to the deliver_event method ]
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2018-12-17 15:02:17 -03:00
Jiri Olsa
68ca5d07de perf ordered_events: Add ordered_events__flush_time interface
Add OE_FLUSH__TIME flush type, to be able to flush only certain amount
of the queue based on the provided timestamp. It will be used in the
following patches.

Signed-off-by: Jiri Olsa <jolsa@kernel.org>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Dmitry Levin <ldv@altlinux.org>
Cc: Eugene Syromiatnikov <esyr@redhat.com>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Luis Cláudio Gonçalves <lclaudio@uudg.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Steven Rostedt (VMware) <rostedt@goodmis.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Link: http://lkml.kernel.org/r/20181205160509.1168-7-jolsa@kernel.org
[ Fix the build on older systems such as centos 5 and 6 where 'time' shadows a global declaration ]
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2018-12-17 15:02:12 -03:00
Eugeniy Paltsev
6d99a79cb4 perf annotate: Introduce basic support for ARC
Introduce basic 'perf annotate' support for ARC to be able to use
anotation via stdio interface.

Signed-off-by: Eugeniy Paltsev <Eugeniy.Paltsev@synopsys.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Alexey Brodkin <alexey.brodkin@synopsys.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: linux-snps-arc@lists.infradead.org
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Vineet Gupta <vineet.gupta1@synopsys.com>
Link: http://lkml.kernel.org/r/20181204175118.25232-1-Eugeniy.Paltsev@synopsys.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2018-12-17 14:59:42 -03:00
Sihyeon Jang
75c375c0ae perf config: Modify size factor of snprintf
According to definition of snprintf, it gets size factor including
null('\0') byte.  So '-1' is not neccessary. Also it will be helpful
unfied style with other cases. (eg. builtin-script.c)

Signed-off-by: Sihyeon Jang <uneedsihyeon@gmail.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Link: http://lkml.kernel.org/r/20181201154603.10093-1-uneedsihyeon@gmail.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2018-12-17 14:59:40 -03:00
Alexey Budankov
c8dd6ee51a perf record: Fix memory leak on AIO objects deallocation
Sending a part which was missed between v12 and v13 of the patch set
introducing AIO trace streaming for perf record mode.

The part is essential to avoid memory leakage during deallocation of AIO
related trace data buffers.

Signed-off-by: Alexey Budankov <alexey.budankov@linux.intel.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: http://lkml.kernel.org/r/e5d3154e-1583-83bb-9527-28ddbc6dbf9d@linux.intel.com
[ No need to test for NULL before calling zfree() ]
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2018-12-17 14:59:34 -03:00
Andi Kleen
91b2b97025 perf vendor events intel: Fix Load_Miss_Real_Latency on SKL/SKX
Fix incorrect event names for the Load_Miss_Real_Latency metric for
Skylake and Skylake Server.

Fixes https://github.com/andikleen/pmu-tools/issues/158

Before:

  % perf stat -M Load_Miss_Real_Latency true
  event syntax error: '..ss.pending,mem_load_retired.l1_miss_ps,mem_load_retired.fb_hit_ps}:W'
                                    \___ parser error

   Usage: perf stat [<options>] [<command>]

      -M, --metrics <metric/metric group list>
                            monitor specified metrics or metric groups (separated by ,)

After:

  % perf stat -M Load_Miss_Real_Latency true

   Performance counter stats for 'true':

             279,204      l1d_pend_miss.pending     #     14.0 Load_Miss_Real_Latency
               4,784      mem_load_uops_retired.l1_miss
              15,188      mem_load_uops_retired.hit_lfb

         0.000899640 seconds time elapsed

Signed-off-by: Andi Kleen <ak@linux.intel.com>
Acked-by: Jiri Olsa <jolsa@kernel.org>
Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Link: http://lkml.kernel.org/r/20181120050635.4215-1-andi@firstfloor.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2018-12-17 14:59:32 -03:00
Arnaldo Carvalho de Melo
bd8d57fb7e perf parse-events: Fix unchecked usage of strncpy()
The strncpy() function may leave the destination string buffer
unterminated, better use strlcpy() that we have a __weak fallback
implementation for systems without it.

This fixes this warning on an Alpine Linux Edge system with gcc 8.2:

  util/parse-events.c: In function 'print_symbol_events':
  util/parse-events.c:2465:4: error: 'strncpy' specified bound 100 equals destination size [-Werror=stringop-truncation]
      strncpy(name, syms->symbol, MAX_NAME_LEN);
      ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
  In function 'print_symbol_events.constprop',
      inlined from 'print_events' at util/parse-events.c:2508:2:
  util/parse-events.c:2465:4: error: 'strncpy' specified bound 100 equals destination size [-Werror=stringop-truncation]
      strncpy(name, syms->symbol, MAX_NAME_LEN);
      ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
  In function 'print_symbol_events.constprop',
      inlined from 'print_events' at util/parse-events.c:2511:2:
  util/parse-events.c:2465:4: error: 'strncpy' specified bound 100 equals destination size [-Werror=stringop-truncation]
      strncpy(name, syms->symbol, MAX_NAME_LEN);
      ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
  cc1: all warnings being treated as errors

Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Fixes: 947b4ad1d1 ("perf list: Fix max event string size")
Link: https://lkml.kernel.org/n/tip-b663e33bm6x8hrkie4uxh7u2@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2018-12-17 14:59:30 -03:00
Arnaldo Carvalho de Melo
bef0b8970f perf probe: Fix unchecked usage of strncpy()
The strncpy() function may leave the destination string buffer
unterminated, better use strlcpy() that we have a __weak fallback
implementation for systems without it.

In this case the 'target' buffer is coming from a list of build-ids that
are expected to have a len of at most (SBUILD_ID_SIZE - 1) chars, so
probably we're safe, but since we're using strncpy() here, use strlcpy()
instead to provide the intended safety checking without the using the
problematic strncpy() function.

This fixes this warning on an Alpine Linux Edge system with gcc 8.2:

  util/probe-file.c: In function 'probe_cache__open.isra.5':
  util/probe-file.c:427:3: error: 'strncpy' specified bound 41 equals destination size [-Werror=stringop-truncation]
     strncpy(sbuildid, target, SBUILD_ID_SIZE);
     ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
  cc1: all warnings being treated as errors

Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Masami Hiramatsu <masami.hiramatsu.pt@hitachi.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Fixes: 1f3736c9c8 ("perf probe: Show all cached probes")
Link: https://lkml.kernel.org/n/tip-l7n8ggc9kl38qtdlouke5yp5@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2018-12-17 14:59:28 -03:00
Arnaldo Carvalho de Melo
4d0f16d059 perf ui helpline: Use strlcpy() as a shorter form of strncpy() + explicit set nul
The strncpy() function may leave the destination string buffer
unterminated, better use strlcpy() that we have a __weak fallback
implementation for systems without it.

In this case we are actually setting the null byte at the right place,
but since we pass the buffer size as the limit to strncpy() and not
it minus one, gcc ends up warning us about that, see below. So, lets
just switch to the shorter form provided by strlcpy().

This fixes this warning on an Alpine Linux Edge system with gcc 8.2:

  ui/tui/helpline.c: In function 'tui_helpline__push':
  ui/tui/helpline.c:27:2: error: 'strncpy' specified bound 512 equals destination size [-Werror=stringop-truncation]
    strncpy(ui_helpline__current, msg, sz)[sz - 1] = '\0';
    ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
  cc1: all warnings being treated as errors

Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Fixes: e6e9046879 ("perf ui: Introduce struct ui_helpline")
Link: https://lkml.kernel.org/n/tip-d1wz0hjjsh19xbalw69qpytj@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2018-12-17 14:59:23 -03:00
Arnaldo Carvalho de Melo
2f5302533f perf svghelper: Fix unchecked usage of strncpy()
The strncpy() function may leave the destination string buffer
unterminated, better use strlcpy() that we have a __weak fallback
implementation for systems without it.

In this specific case this would only happen if fgets() was buggy, as
its man page states that it should read one less byte than the size of
the destination buffer, so that it can put the nul byte at the end of
it, so it would never copy 255 non-nul chars, as fgets reads into the
orig buffer at most 254 non-nul chars and terminates it. But lets just
switch to strlcpy to keep the original intent and silence the gcc 8.2
warning.

This fixes this warning on an Alpine Linux Edge system with gcc 8.2:

  In function 'cpu_model',
      inlined from 'svg_cpu_box' at util/svghelper.c:378:2:
  util/svghelper.c:337:5: error: 'strncpy' output may be truncated copying 255 bytes from a string of length 255 [-Werror=stringop-truncation]
       strncpy(cpu_m, &buf[13], 255);
       ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~

Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Arjan van de Ven <arjan@linux.intel.com>
Fixes: f48d55ce78 ("perf: Add a SVG helper library file")
Link: https://lkml.kernel.org/n/tip-xzkoo0gyr56gej39ltivuh9g@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2018-12-17 14:59:20 -03:00
Arnaldo Carvalho de Melo
b6313899f4 perf help: Remove needless use of strncpy()
Since we make sure the destination buffer has at least strlen(orig) + 1,
no need to do a strncpy(dest, orig, strlen(orig)), just use strcpy(dest,
orig).

This silences this gcc 8.2 warning on Alpine Linux:

  In function 'add_man_viewer',
      inlined from 'perf_help_config' at builtin-help.c:284:3:
  builtin-help.c:192:2: error: 'strncpy' output truncated before terminating nul copying as many bytes from a string as its length [-Werror=stringop-truncation]
    strncpy((*p)->name, name, len);
    ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
  builtin-help.c: In function 'perf_help_config':
  builtin-help.c:187:15: note: length computed here
    size_t len = strlen(name);
                 ^~~~~~~~~~~~

Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Fixes: 0780060124 ("perf_counter tools: add in basic glue from Git")
Link: https://lkml.kernel.org/n/tip-2f69l7drca427ob4km8i7kvo@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2018-12-17 14:59:18 -03:00
Arnaldo Carvalho de Melo
5192bde7d9 perf header: Fix unchecked usage of strncpy()
The strncpy() function may leave the destination string buffer
unterminated, better use strlcpy() that we have a __weak fallback
implementation for systems without it.

This fixes this warning on an Alpine Linux Edge system with gcc 8.2:

  util/header.c: In function 'perf_event__synthesize_event_update_name':
  util/header.c:3625:2: error: 'strncpy' output truncated before terminating nul copying as many bytes from a string as its length [-Werror=stringop-truncation]
    strncpy(ev->data, evsel->name, len);
    ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
  util/header.c:3618:15: note: length computed here
    size_t len = strlen(evsel->name);
                 ^~~~~~~~~~~~~~~~~~~

Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Fixes: a6e5281780 ("perf tools: Add event_update event unit type")
Link: https://lkml.kernel.org/n/tip-wycz66iy8dl2z3yifgqf894p@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2018-12-17 14:59:08 -03:00
Arnaldo Carvalho de Melo
7572588085 perf header: Fix unchecked usage of strncpy()
The strncpy() function may leave the destination string buffer
unterminated, better use strlcpy() that we have a __weak fallback
implementation for systems without it.

This fixes this warning on an Alpine Linux Edge system with gcc 8.2:

  util/header.c: In function 'perf_event__synthesize_event_update_unit':
  util/header.c:3586:2: error: 'strncpy' output truncated before terminating nul copying as many bytes from a string as its length [-Werror=stringop-truncation]
    strncpy(ev->data, evsel->unit, size);
    ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
  util/header.c:3579:16: note: length computed here
    size_t size = strlen(evsel->unit);
                  ^~~~~~~~~~~~~~~~~~~

Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Fixes: a6e5281780 ("perf tools: Add event_update event unit type")
Link: https://lkml.kernel.org/n/tip-fiikh5nay70bv4zskw2aa858@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2018-12-17 14:59:06 -03:00
Arnaldo Carvalho de Melo
fca5085c15 perf dso: Fix unchecked usage of strncpy()
The strncpy() function may leave the destination string buffer
unterminated, better use strlcpy() that we have a __weak fallback
implementation for systems without it.

This fixes this warning on an Alpine Linux Edge system with gcc 8.2:

  In function 'decompress_kmodule',
      inlined from 'dso__decompress_kmodule_fd' at util/dso.c:305:9:
  util/dso.c:298:3: error: 'strncpy' destination unchanged after copying no bytes [-Werror=stringop-truncation]
     strncpy(pathname, tmpbuf, len);
     ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
    CC       /tmp/build/perf/util/values.o
    CC       /tmp/build/perf/util/debug.o
  cc1: all warnings being treated as errors

Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Fixes: c9a8a6131f ("perf tools: Move the temp file processing into decompress_kmodule")
Link: https://lkml.kernel.org/n/tip-tl2hdxj64tt4k8btbi6a0ugw@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2018-12-17 14:59:03 -03:00
Mathieu Poirier
15a5cd1962 perf cs-etm: Add support for PTMv1.1 decoding
This patch is re-using the mechanic set forth by ETMv3 to add support
for PTM decoding.  Configuration for both encoding protocol is similar
but the generated stream itself is very different, hence requiring
special handling.

Signed-off-by: Mathieu Poirier <mathieu.poirier@linaro.org>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: coresight@lists.linaro.org
Cc: linux-arm-kernel@lists.infradead.org
Link: http://lkml.kernel.org/r/1543955944-10042-4-git-send-email-mathieu.poirier@linaro.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2018-12-17 14:59:01 -03:00
Mathieu Poirier
7d0f4fefc4 perf cs-etm: Add support for ETMv3 trace decoding
Add support for the creation of packet printer and decoder for the ETMv3
trace architecture.  That way traces generated by tracers adhering to
that trace protocol can be handled properly by the perf infrastructure.

Signed-off-by: Mathieu Poirier <mathieu.poirier@linaro.org>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: coresight@lists.linaro.org
Cc: linux-arm-kernel@lists.infradead.org
Link: http://lkml.kernel.org/r/1543955944-10042-3-git-send-email-mathieu.poirier@linaro.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2018-12-17 14:58:59 -03:00
Mathieu Poirier
78688342c5 perf cs-etm: Add configuration for ETMv3 trace protocol
This patch deals with the proper initialisation of configuration
parameters for the ETMv3 trace protocol in order to properly handle
packets generated by tracers following this specification.

Signed-off-by: Mathieu Poirier <mathieu.poirier@linaro.org>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: coresight@lists.linaro.org
Cc: linux-arm-kernel@lists.infradead.org
Link: http://lkml.kernel.org/r/1543955944-10042-2-git-send-email-mathieu.poirier@linaro.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2018-12-17 14:58:53 -03:00
Jiri Olsa
8aa5c8eddc perf top: Move perf_top__reset_sample_counters() to after counts display
Move the perf_top__reset_sample_counters() call to right after we
display the counters so we can see the updated numbers for longer.

Signed-off-by: Jiri Olsa <jolsa@kernel.org>
Acked-by: David S. Miller <davem@davemloft.net>
Acked-by: Namhyung Kim <namhyung@kernel.org>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: https://lkml.kernel.org/n/tip-o72pyiwt05f3p2juprwmz2jo@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2018-12-17 14:58:47 -03:00
Jiri Olsa
d8590430fb perf top: Display slow reader warning when droping samples
Currently we display the "Too slow to read ring buffer.." helpline only
in the slow reader thread. This patch triggers it also when the
processing thread drops samples, because it has the same reason, which
is too many data on input.

Signed-off-by: Jiri Olsa <jolsa@kernel.org>
Acked-by: David S. Miller <davem@davemloft.net>
Acked-by: Namhyung Kim <namhyung@kernel.org>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: https://lkml.kernel.org/n/tip-bnev2mloavyurmgchcr3o24o@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2018-12-17 14:58:40 -03:00
Jiri Olsa
97f7e0b33d perf top: Save and display the drop count stats
Add drop count to 'perf top' headers:

  # perf top --stdio
   PerfTop:    3549 irqs/sec  kernel:51.8%  exact: 100.0% lost: 0/0 drop: 0/0 [4000Hz cycles:ppp],  (all, 8 CPUs)

  # perf top
  Samples: 0  of event 'cycles:ppp', 4000 Hz, Event count (approx.): 0 lost: 0/0 drop: 0/0

The format is: <current period drop>/<total drop>

Signed-off-by: Jiri Olsa <jolsa@kernel.org>
Acked-by: David S. Miller <davem@davemloft.net>
Acked-by: Namhyung Kim <namhyung@kernel.org>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: http://lkml.kernel.org/n/tip-2lj87zz8tq9ye1ntax3ulw0n@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2018-12-17 14:58:33 -03:00
Jiri Olsa
d63b9f6fea perf top: Drop samples which are behind the refresh rate
Drop samples from processing thread if they get behind the latest event
read from the kernel maps. If it gets behind more than the refresh rate
(-d option), drop the sample.

Signed-off-by: Jiri Olsa <jolsa@kernel.org>
Acked-by: David S. Miller <davem@davemloft.net>
Acked-by: Namhyung Kim <namhyung@kernel.org>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: https://lkml.kernel.org/n/tip-x533ra5c1pgofvbtsizzuydd@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2018-12-17 14:58:26 -03:00
Jiri Olsa
c94cef4beb perf top: Set the 'session_done' volatile variable when exiting
So we can get out of hist processing ASAP on user request.

Signed-off-by: Jiri Olsa <jolsa@kernel.org>
Acked-by: David S. Miller <davem@davemloft.net>
Acked-by: Namhyung Kim <namhyung@kernel.org>
Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: https://lkml.kernel.org/n/tip-r8aufbgbixr2f85s3wcoaw9v@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2018-12-17 14:58:19 -03:00
Jiri Olsa
94ad6e7e36 perf top: Use cond variable instead of a lock
Use conditional variable logic to synchronize between the reading and
processing threads. Currently it's done by having mutex around rotation
code.

Using a POSIX cond variable to sync both threads after queues rotation:

  Process thread:

    - Detects data
    - Switches queues
    - Sets rotate variable
    - Waits in pthread_cond_wait()

  Read thread:

    - Detects rotate is set
    - Kicks the process thread with a pthread_cond_signal()

After this rotation is safely completed and both threads can continue
with the new queue.

Signed-off-by: Jiri Olsa <jolsa@kernel.org>
Acked-by: David S. Miller <davem@davemloft.net>
Acked-by: Namhyung Kim <namhyung@kernel.org>
Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: http://lkml.kernel.org/n/tip-3rdeg23rv3brvy1pwt3igvyw@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2018-12-17 14:58:03 -03:00
Jiri Olsa
16c66bc167 perf top: Add processing thread
Add a new thread that takes care of the hist creating to alleviate the
main reader thread so it can keep perf mmaps served in time so that we
reduce the possibility of losing events.

The 'perf top' command now spawns 2 extra threads, the data processing
is the following:

  1) The main thread reads the data from mmaps and queues them to
     ordered events object;

  2) The processing threads takes the data from the ordered events
     object and create initial histogram;

  3) The GUI thread periodically sorts the initial histogram and
     presents it.

Passing the data between threads 1 and 2 is done by having 2 ordered
events queues. One is always being stored by thread 1 while the other is
flushed out in thread 2.

Passing the data between threads 2 and 3 stays the same as was initially
for threads 1 and 3.

Signed-off-by: Jiri Olsa <jolsa@kernel.org>
Acked-by: David S. Miller <davem@davemloft.net>
Acked-by: Namhyung Kim <namhyung@kernel.org>
Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: http://lkml.kernel.org/n/tip-hhf4hllgkmle9wl1aly1jli0@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2018-12-17 14:57:52 -03:00
Jiri Olsa
254de74cd1 perf top: Move lost events warning to helpline
We can't display the UI box saying that we are slow in the reader
thread.  That will make 'perf top' even slower and the user even more
angry ;-)

Move the UI box message from the reader thread to the UI thread and
change it to a helpline, so there's no need to 'press any key'.

Signed-off-by: Jiri Olsa <jolsa@kernel.org>
Acked-by: David S. Miller <davem@davemloft.net>
Acked-by: Namhyung Kim <namhyung@kernel.org>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: https://lkml.kernel.org/n/tip-x4k0iuw7tt6mywsaguq6jfwu@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2018-12-17 14:57:45 -03:00
Jiri Olsa
d24e3c98ac perf top: Save and display the lost count stats
Add a 'lost count' to 'perf top' headers:

  # perf top --stdio
   PerfTop:    3850 irqs/sec  kernel:49.0%  exact: 100.0% lost: 0/0 [4000Hz cycles:ppp],  (all, 8 CPUs)

  # perf top
  Samples: 0  of event 'cycles:ppp', 4000 Hz, Event count (approx.): 0 lost: 0/0

The format is: <current period lost>/<total lost>

Signed-off-by: Jiri Olsa <jolsa@kernel.org>
Acked-by: David S. Miller <davem@davemloft.net>
Acked-by: Namhyung Kim <namhyung@kernel.org>
Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: https://lkml.kernel.org/n/tip-zo11rn270gij5jtp8fknpf8u@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2018-12-17 14:57:36 -03:00
Jiri Olsa
a4a6668a62 perf ordered_events: Add private data member
We will need it in following patch, where we can't use the
container_of() trick to get the higher level object.

Signed-off-by: Jiri Olsa <jolsa@kernel.org>
Acked-by: David S. Miller <davem@davemloft.net>
Acked-by: Namhyung Kim <namhyung@kernel.org>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: http://lkml.kernel.org/n/tip-vgs9aoek21v14o3obza586yy@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2018-12-17 14:57:30 -03:00
Jiri Olsa
b8494f1df8 perf ordered_events: Rework show_progress for __ordered_events__flush
Decide to use the progress bar one level higher, we will need this in
following patch.

Signed-off-by: Jiri Olsa <jolsa@kernel.org>
Acked-by: David S. Miller <davem@davemloft.net>
Acked-by: Namhyung Kim <namhyung@kernel.org>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: http://lkml.kernel.org/n/tip-ocjdukp2a8ujikkmafd0j5zv@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2018-12-17 14:57:12 -03:00
Andi Kleen
dd2e18e9ac perf tools: Support 'srccode' output
When looking at PT or brstackinsn traces with 'perf script' it can be
very useful to see the source code. This adds a simple facility to print
them with 'perf script', if the information is available through dwarf

  % perf record ...
  % perf script -F insn,ip,sym,srccode
  ...

            4004c6 main
  5               for (i = 0; i < 10000000; i++)
             4004cd main
  5               for (i = 0; i < 10000000; i++)
             4004c6 main
  5               for (i = 0; i < 10000000; i++)
             4004cd main
  5               for (i = 0; i < 10000000; i++)
             4004cd main
  5               for (i = 0; i < 10000000; i++)
             4004cd main
  5               for (i = 0; i < 10000000; i++)
             4004cd main
  5               for (i = 0; i < 10000000; i++)
             4004cd main
  5               for (i = 0; i < 10000000; i++)
             4004b3 main
  6                       v++;

  % perf record -b ...
  % perf script -F insn,ip,sym,srccode,brstackinsn

  ...
         main+22:
          0000000000400543        insn: e8 ca ff ff ff            # PRED
  |18                     f1();
          f1:
          0000000000400512        insn: 55
  |10       {
          0000000000400513        insn: 48 89 e5
          0000000000400516        insn: b8 00 00 00 00
  |11             f2();
          000000000040051b        insn: e8 d6 ff ff ff            # PRED
          f2:
          00000000004004f6        insn: 55
  |5        {
          00000000004004f7        insn: 48 89 e5
          00000000004004fa        insn: 8b 05 2c 0b 20 00
  |6              c = a / b;
          0000000000400500        insn: 8b 0d 2a 0b 20 00
          0000000000400506        insn: 99
          0000000000400507        insn: f7 f9
          0000000000400509        insn: 89 05 29 0b 20 00
          000000000040050f        insn: 90
  |7        }
          0000000000400510        insn: 5d
          0000000000400511        insn: c3                        # PRED
          f1+14:
          0000000000400520        insn: b8 00 00 00 00
  |12             f2();
          0000000000400525        insn: e8 cc ff ff ff            # PRED
          f2:
          00000000004004f6        insn: 55
  |5        {
          00000000004004f7        insn: 48 89 e5
          00000000004004fa        insn: 8b 05 2c 0b 20 00
  |6              c = a / b;

Not supported for callchains currently, would need some layout changes
there.

Committer notes:

Fixed the build on Alpine Linux (3.4 .. 3.8) by addressing this
warning:

  In file included from util/srccode.c:19:0:
  /usr/include/sys/fcntl.h:1:2: error: #warning redirecting incorrect #include <sys/fcntl.h> to <fcntl.h> [-Werror=cpp]
   #warning redirecting incorrect #include <sys/fcntl.h> to <fcntl.h>
    ^~~~~~~
  cc1: all warnings being treated as errors

Signed-off-by: Andi Kleen <ak@linux.intel.com>
Tested-by: Jiri Olsa <jolsa@kernel.org>
Link: http://lkml.kernel.org/r/20181204001848.24769-1-andi@firstfloor.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2018-12-17 14:57:07 -03:00
Arnaldo Carvalho de Melo
42da438c1b perf trace: We need to consider "nr" if "__syscall_nr" is not there
To cope with older kernels that don't have this patch backported:

  026842d148 ("tracing/syscalls: Rename "/format" tracepoint field name "nr" to "__syscall_nr:")

This makes 'perf trace' work again in RHEL7 kernels.

Cc: Taeung Song <treeze.taeung@gmail.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: David Ahern <dsahern@gmail.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Wang Nan <wangnan0@huawei.com>
Link: https://lkml.kernel.org/n/tip-6h1syw2isegnhb1bjmtr9x9k@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2018-12-17 14:57:02 -03:00
Mark Drayton
3fcb10e496 perf tools: Allow specifying proc-map-timeout in config file
The default timeout of 500ms for parsing /proc/<pid>/maps files is too
short for profiling many of our services.

This can be overridden by passing --proc-map-timeout to the relevant
command but it'd be nice to globally increase our default value.

This patch permits setting a different default with the
core.proc-map-timeout config file parameter.

Signed-off-by: Mark Drayton <mbd@fb.com>
Acked-by: Song Liu <songliubraving@fb.com>
Acked-by: Namhyung Kim <namhyung@kernel.org>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: http://lkml.kernel.org/r/20181204203420.1683114-1-mbd@fb.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2018-12-17 14:56:57 -03:00
Ingo Molnar
adba163441 perf tools: Fix diverse comment typos
Go over the tools/ files that are maintained in Arnaldo's tree and
fix common typos: half of them were in comments, the other half
in JSON files.

No change in functionality intended.

Committer notes:

This was split from a larger patch as there are code that is,
additionally, maintained outside the kernel tree, so to ease
cherry-picking and/or backporting, split this into multiple patches.

Just typos in comments, no need to backport, reducing the possibility of
possible backporting artifacts.

Signed-off-by: Ingo Molnar <mingo@kernel.org>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: http://lkml.kernel.org/r/20181203102200.GA104797@gmail.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2018-12-17 14:56:47 -03:00
Ingo Molnar
e4a8b0af51 perf bpf-loader: Fix debugging message typo
Go over the tools/ files that are maintained in Arnaldo's tree and
fix common typos: half of them were in comments, the other half
in JSON files.

No change in functionality intended.

Committer notes:

This was split from a larger patch as there are code that is,
additionally, maintained outside the kernel tree, so to ease cherry
picking and/or backporting, split this into multiple patches.

This one has information that is presented to the user, albeit in debug
mode.

Signed-off-by: Ingo Molnar <mingo@kernel.org>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Wang Nan <wangnan0@huawei.com>
Link: http://lkml.kernel.org/r/20181203102200.GA104797@gmail.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2018-12-17 14:56:39 -03:00
Ingo Molnar
1a7ea3283f perf tools Documentation: Fix diverse typos
Go over the tools/ files that are maintained in Arnaldo's tree and
fix common typos: half of them were in comments, the other half
in JSON files.

No change in functionality intended.

Committer notes:

This was split from a larger patch as there are code that is,
additionally, maintained outside the kernel tree, so to ease cherry
picking and/or backporting, split this into multiple patches.

In this particular case, it affects documentation, so may be interesting
to cherry pick as it is information that is presented to the user.

Signed-off-by: Ingo Molnar <mingo@kernel.org>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: http://lkml.kernel.org/r/20181203102200.GA104797@gmail.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2018-12-17 14:56:36 -03:00
Ingo Molnar
b1d6f155e1 perf vendor events intel: Fix diverse typos
Go over the tools/ files that are maintained in Arnaldo's tree and
fix common typos: half of them were in comments, the other half
in JSON files.

( Care should be taken not to re-import these typos in the future,
  if the JSON files get updated by the vendor without fixing the typos. )

No change in functionality intended.

Committer notes:

This was split from a larger patch as there are code that is,
additionally, maintained outside the kernel tree, so to ease cherry
picking and/or backporting, split this into multiple patches.

Signed-off-by: Ingo Molnar <mingo@kernel.org>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Kan Liang <kan.liang@linux.intel.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: http://lkml.kernel.org/r/20181203102200.GA104797@gmail.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2018-12-17 14:56:31 -03:00
Florian Fainelli
24f967337f perf tests ARM: Disable breakpoint tests 32-bit
The breakpoint tests on the ARM 32-bit kernel are broken in several
ways.

The breakpoint length requested does not necessarily match whether the
function address has the Thumb bit (bit 0) set or not, and this does
matter to the ARM kernel hw_breakpoint infrastructure. See [1] for
background.

[1]: https://lkml.org/lkml/2018/11/15/205

As Will indicated, the overflow handling would require single-stepping
which is not supported at the moment. Just disable those tests for the
ARM 32-bit platforms and update the comment above to explain these
limitations.

Co-developed-by: Will Deacon <will.deacon@arm.com>
Signed-off-by: Florian Fainelli <f.fainelli@gmail.com>
Signed-off-by: Will Deacon <will.deacon@arm.com>
Acked-by: Jiri Olsa <jolsa@redhat.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: http://lkml.kernel.org/r/20181203191138.2419-1-f.fainelli@gmail.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2018-12-17 14:56:27 -03:00
Robert Walker
a7ee4d625e perf cs-etm: Support for ARM A32/T32 instruction sets in CoreSight trace
This patch adds support for generating instruction samples from trace of
AArch32 programs using the A32 and T32 instruction sets.

T32 has variable 2 or 4 byte instruction size, so the conversion between
addresses and instruction counts requires extra information from the
trace decoder, requiring version 0.10.0 of OpenCSD.  A check for the
OpenCSD library version has been added to the feature check for OpenCSD.

Signed-off-by: Robert Walker <robert.walker@arm.com>
Reviewed-by: Mathieu Poirier <mathieu.poirier@linaro.org>
Tested-by: Leo Yan <leo.yan@linaro.org>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: coresight@lists.linaro.org
Cc: linux-arm-kernel@lists.infradead.org
Link: http://lkml.kernel.org/r/1543839526-30348-1-git-send-email-robert.walker@arm.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2018-12-17 14:56:18 -03:00
Sihyeon Jang
00879763fc perf beauty mmap_flags: Fixed syntax error Fixed missing ']' error
Committer testing:

Before:

  # tools/perf/trace/beauty/mmap_flags.sh
  static const char *mmap_flags[] = {
	[ilog2(0x40) + 1] = "32BIT",
  tools/perf/trace/beauty/mmap_flags.sh: line 23: [: missing `]'
	[ilog2(0x01) + 1] = "SHARED",
	[ilog2(0x02) + 1] = "PRIVATE",
	[ilog2(0x10) + 1] = "FIXED",
	[ilog2(0x20) + 1] = "ANONYMOUS",
	[ilog2(0x100000) + 1] = "FIXED_NOREPLACE",
  tools/perf/trace/beauty/mmap_flags.sh: line 28: [: missing `]'
	[ilog2(0x0100) + 1] = "GROWSDOWN",
	[ilog2(0x0800) + 1] = "DENYWRITE",
	[ilog2(0x1000) + 1] = "EXECUTABLE",
	[ilog2(0x2000) + 1] = "LOCKED",
	[ilog2(0x4000) + 1] = "NORESERVE",
	[ilog2(0x8000) + 1] = "POPULATE",
	[ilog2(0x10000) + 1] = "NONBLOCK",
	[ilog2(0x20000) + 1] = "STACK",
	[ilog2(0x40000) + 1] = "HUGETLB",
	[ilog2(0x80000) + 1] = "SYNC",
  };
  #

After:

  $ tools/perf/trace/beauty/mmap_flags.sh
  static const char *mmap_flags[] = {
	[ilog2(0x40) + 1] = "32BIT",
	[ilog2(0x01) + 1] = "SHARED",
	[ilog2(0x02) + 1] = "PRIVATE",
	[ilog2(0x10) + 1] = "FIXED",
	[ilog2(0x20) + 1] = "ANONYMOUS",
	[ilog2(0x100000) + 1] = "FIXED_NOREPLACE",
	[ilog2(0x0100) + 1] = "GROWSDOWN",
	[ilog2(0x0800) + 1] = "DENYWRITE",
	[ilog2(0x1000) + 1] = "EXECUTABLE",
	[ilog2(0x2000) + 1] = "LOCKED",
	[ilog2(0x4000) + 1] = "NORESERVE",
	[ilog2(0x8000) + 1] = "POPULATE",
	[ilog2(0x10000) + 1] = "NONBLOCK",
	[ilog2(0x20000) + 1] = "STACK",
	[ilog2(0x40000) + 1] = "HUGETLB",
	[ilog2(0x80000) + 1] = "SYNC",
  };
  $

Signed-off-by: Sihyeon Jang <uneedsihyeon@gmail.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Fixes: aca70cc40a0b ("perf beauty mmap_flags: Check if the arch has a mmap.h file")
Link: http://lkml.kernel.org/r/20181202080651.4685-1-uneedsihyeon@gmail.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2018-12-17 14:56:13 -03:00
Tzvetomir Stoyanov
f0bba09ce3 perf tools: traceevent API cleanup, remove __tep_data2host*()
In order to make libtraceevent into a proper library, its API should be
straightforward. The __tep_data2host*() functions are going to no longer
be available as a libtraceevent API, tep_read_number() should be used
instead. This patch replaces __tep_data2host*() usage with
tep_read_number() in perf.

Signed-off-by: Tzvetomir Stoyanov <tstoyanov@vmware.com>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Link: http://lkml.kernel.org/r/20181130154647.743979275@goodmis.org
Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2018-12-17 14:56:08 -03:00
Tzvetomir Stoyanov
97fbf3f0e0 tools lib traceevent, perf tools: Rename 'struct tep_event_format' to 'struct tep_event'
In order to make libtraceevent into a proper library, variables, data
structures and functions require a unique prefix to prevent name space
conflicts.

This renames 'struct tep_event_format' to 'struct tep_event', which
describes more closely the purpose of the struct.

Signed-off-by: Tzvetomir Stoyanov <tstoyanov@vmware.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Link: http://lkml.kernel.org/r/20181130154647.436403995@goodmis.org
Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
[ Fixup conflict with 6e33c250a88f ("tools lib traceevent: Fix compile warnings in tools/lib/traceevent/event-parse.c") ]
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2018-12-17 14:56:02 -03:00
Jin Yao
239ca3e786 perf report: Documentation average IPC and IPC coverage
Add explanations for new columns "IPC" and "IPC coverage" in perf
documentation.

 v5:
 ---
 Update the description according to Ingo's comments.

Signed-off-by: Jin Yao <yao.jin@linux.intel.com>
Reviewed-by: Ingo Molnar <mingo@kernel.org>
Reviewed-by: Jiri Olsa <jolsa@kernel.org>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Kan Liang <kan.liang@linux.intel.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: http://lkml.kernel.org/r/1543586097-27632-5-git-send-email-yao.jin@linux.intel.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2018-12-17 14:55:49 -03:00
Jin Yao
ec6ae74fe8 perf report: Display average IPC and IPC coverage per symbol
Support displaying the average IPC and IPC coverage per symbol in 'perf
report' --tui and --stdio modes.

For example,

 $ perf record -b ...
 $ perf report -s symbol

 Overhead  Symbol                           IPC   [IPC Coverage]
   39.60%  [.] __random                     2.30  [ 54.8%]
   18.02%  [.] main                         0.43  [ 54.3%]
   14.21%  [.] compute_flag                 2.29  [100.0%]
   14.16%  [.] rand                         0.36  [100.0%]
    7.06%  [.] __random_r                   2.57  [ 70.5%]
    6.85%  [.] rand@plt                     0.00  [  0.0%]

Jiri Olsa <jolsa@redhat.com> provided the patch to support the --stdio
mode. I merged Jiri's code in this patch.

  $ perf report -s symbol --stdio

    # Overhead  Symbol                       IPC   [IPC Coverage]
    # ........  ...........................  ....................
    #
      39.60%  [.] __random                   2.30  [ 54.8%]
      18.02%  [.] main                       0.43  [ 54.3%]
      14.21%  [.] compute_flag               2.29  [100.0%]
      14.16%  [.] rand                       0.36  [100.0%]
       7.06%  [.] __random_r                 2.57  [ 70.5%]
       6.85%  [.] rand@plt                   0.00  [  0.0%]
       0.02%  [k] run_timer_softirq          1.60  [ 57.2%]

The columns "IPC" and "[IPC Coverage]" are automatically enabled when
the sort-key "symbol" is specified. If the perf.data file doesn't
contain timed LBR information, columns are filled with "-".

For example,

  # Overhead  Symbol                       IPC   [IPC Coverage]
  # ........  ...........................  ....................
  #
      46.57%  [.] main                     -      -
      17.60%  [.] rand                     -      -
      15.84%  [.] __random_r               -      -
      11.90%  [.] __random                 -      -
       6.50%  [.] compute_flag             -      -
       1.59%  [.] rand@plt                 -      -
       0.00%  [.] _dl_relocate_object      -      -
       0.00%  [k] tlb_flush_mmu            -      -
       0.00%  [k] perf_event_mmap          -      -
       0.00%  [k] native_sched_clock       -      -
       0.00%  [k] intel_pmu_handle_irq_v4  -      -
       0.00%  [k] native_write_msr         -      -

 v3:
 ---
 Removed the sortkey 'ipc' from command-line. The columns "IPC"
 and "[IPC Coverage]" are automatically enabled when "symbol"
 is specified.

 v2:
 ---
 Merge in Jiri's patch to support stdio mode

Signed-off-by: Jin Yao <yao.jin@linux.intel.com>
Reviewed-by: Ingo Molnar <mingo@kernel.org>
Reviewed-by: Jiri Olsa <jolsa@kernel.org>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Kan Liang <kan.liang@linux.intel.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: http://lkml.kernel.org/r/1543586097-27632-4-git-send-email-yao.jin@linux.intel.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2018-12-17 14:55:44 -03:00
Jin Yao
246fda09c1 perf annotate: Create a annotate2 flag in struct symbol
We often use the symbol__annotate2() to annotate a specified symbol.
While annotating may take some time, so in order to avoid annotating the
same symbol repeatedly, the patch creates a new flag to indicate the
symbol has been annotated.

Signed-off-by: Jin Yao <yao.jin@linux.intel.com>
Reviewed-by: Ingo Molnar <mingo@kernel.org>
Reviewed-by: Jiri Olsa <jolsa@kernel.org>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Kan Liang <kan.liang@linux.intel.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: http://lkml.kernel.org/r/1543586097-27632-3-git-send-email-yao.jin@linux.intel.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2018-12-17 14:55:40 -03:00
Jin Yao
ace4f8faea perf annotate: Compute average IPC and IPC coverage per symbol
Add support to 'perf report' annotate view or 'perf annotate --stdio2'
to aggregate the IPC derived from timed LBRs per symbol. We compute the
average IPC and the IPC coverage percentage.

For example:

  $ perf annotate --stdio2

  Percent  IPC Cycle (Average IPC: 2.30, IPC Coverage: 54.8%)

                          Disassembly of section .text:

                          000000000003aac0 <random@@GLIBC_2.2.5>:
    8.32  3.28              sub    $0x18,%rsp
          3.28              mov    $0x1,%esi
          3.28              xor    %eax,%eax
          3.28              cmpl   $0x0,argp_program_version_hook@@GLIBC_2.2.5+0x1e0
   11.57  3.28     1      ↓ je     20
                            lock   cmpxchg %esi,__abort_msg@@GLIBC_PRIVATE+0x8a0
                          ↓ jne    29
                          ↓ jmp    43
   11.57  1.10        20:   cmpxchg %esi,__abort_msg@@GLIBC_PRIVATE+0x8a0
    0.00  1.10     1      ↓ je     43
                      29:   lea    __abort_msg@@GLIBC_PRIVATE+0x8a0,%rdi
                            sub    $0x80,%rsp
                          → callq  __lll_lock_wait_private
                            add    $0x80,%rsp
    0.00  3.00        43:   lea    __ctype_b@GLIBC_2.2.5+0x38,%rdi
          3.00              lea    0xc(%rsp),%rsi
    8.49  3.00     1      → callq  __random_r
    7.91  1.94              cmpl   $0x0,argp_program_version_hook@@GLIBC_2.2.5+0x1e0
    0.00  1.94     1      ↓ je     68
                            lock   decl   __abort_msg@@GLIBC_PRIVATE+0x8a0
                          ↓ jne    70
                          ↓ jmp    8a
    0.00  2.00        68:   decl   __abort_msg@@GLIBC_PRIVATE+0x8a0
   21.56  2.00     1      ↓ je     8a
                      70:   lea    __abort_msg@@GLIBC_PRIVATE+0x8a0,%rdi
                            sub    $0x80,%rsp
                          → callq  __lll_unlock_wake_private
                            add    $0x80,%rsp
   21.56  2.90        8a:   movslq 0xc(%rsp),%rax
          2.90              add    $0x18,%rsp
    9.03  2.90     1      ← retq

It shows for this symbol the average IPC is 2.30 and the IPC coverage is
54.8%.

Signed-off-by: Jin Yao <yao.jin@linux.intel.com>
Reviewed-by: Ingo Molnar <mingo@kernel.org>
Reviewed-by: Jiri Olsa <jolsa@kernel.org>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Kan Liang <kan.liang@linux.intel.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: http://lkml.kernel.org/r/1543586097-27632-2-git-send-email-yao.jin@linux.intel.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2018-12-17 14:55:32 -03:00
Arnaldo Carvalho de Melo
a1c8cf293d perf beauty mmap_flags: Check if the arch has a mmap.h file
If not, then just use what is in asm-generic. This fixes the build for
my sh4, m68k and riscv64 perf test build containers that were failing
due to 80ee5668b8 ("perf beauty: Add a generator for MAP_ mmap's flag
constants"), that were not covered in the cset introducing those
tools/arch/*/include/uapi/asm/mman.h files.

  f3539c12d8 ("tools include: Add uapi mman.h for each architecture")

Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: David Ahern <dsahern@gmail.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Wang Nan <wangnan0@huawei.com>
Fixes: 80ee5668b8 ("perf beauty: Add a generator for MAP_ mmap's flag constants")
Link: https://lkml.kernel.org/n/tip-rpy9t2e0wxpnum1yvxhreafe@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2018-12-17 14:55:14 -03:00
Alexey Budankov
93f20c0fe3 perf record: Extend trace writing to multi AIO
Multi AIO trace writing allows caching more kernel data into userspace
memory postponing trace writing for the sake of overall profiling data
thruput increase. It could be seen as kernel data buffer extension into
userspace memory.

With an --aio option value different from 0 (default value is 1) the
tool has capability to cache more and more data into user space along
with delegating spill to AIO.

That allows avoiding to suspend at record__aio_sync() between calls of
record__mmap_read_evlist() and increases profiling data thruput at the
cost of userspace memory.

Signed-off-by: Alexey Budankov <alexey.budankov@linux.intel.com>
Reviewed-by: Jiri Olsa <jolsa@redhat.com>
Acked-by: Namhyung Kim <namhyung@kernel.org>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: http://lkml.kernel.org/r/050bb053-e7f3-aa83-fde7-f27ff90be7f6@linux.intel.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2018-12-17 14:55:11 -03:00
Alexey Budankov
d3d1af6f01 perf record: Enable asynchronous trace writing
The trace file offset is read once before mmaps iterating loop and
written back after all performance data is enqueued for aio writing.

The trace file offset is incremented linearly after every successful aio
write operation.

record__aio_sync() blocks till completion of the started AIO operation
and then proceeds.

record__aio_mmap_read_sync() implements a barrier for all incomplete
aio write requests.

Signed-off-by: Alexey Budankov <alexey.budankov@linux.intel.com>
Reviewed-by: Jiri Olsa <jolsa@redhat.com>
Acked-by: Namhyung Kim <namhyung@kernel.org>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: http://lkml.kernel.org/r/ce2d45e9-d236-871c-7c8f-1bed2d37e8ac@linux.intel.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2018-12-17 14:55:08 -03:00
Alexey Budankov
0b77383134 perf mmap: Map data buffer for preserving collected data
The map->data buffer is used to preserve map->base profiling data for
writing to disk. AIO map->cblock is used to queue corresponding
map->data buffer for asynchronous writing.

Signed-off-by: Alexey Budankov <alexey.budankov@linux.intel.com>
Reviewed-by: Jiri Olsa <jolsa@redhat.com>
Acked-by: Namhyung Kim <namhyung@kernel.org>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: http://lkml.kernel.org/r/5fcda10c-6c63-68df-383a-c6d9e5d1f918@linux.intel.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2018-12-17 14:55:01 -03:00
Alexey Budankov
2a07d81474 tools build feature: Check if libaio is available
This will be used by 'perf record' to speed up reading the perf ring
buffer.

Committer testing:

  $ make -C tools/perf O=/tmp/build/perf
  make: Entering directory '/home/acme/git/perf/tools/perf'
    BUILD:   Doing 'make -j8' parallel build

  Auto-detecting system features:
  ...                         dwarf: [ on  ]
  ...            dwarf_getlocations: [ on  ]
  ...                         glibc: [ on  ]
  ...                          gtk2: [ OFF ]
  ...                      libaudit: [ OFF ]
  ...                        libbfd: [ OFF ]
  ...                        libelf: [ on  ]
  ...                       libnuma: [ OFF ]
  ...        numa_num_possible_cpus: [ OFF ]
  ...                       libperl: [ OFF ]
  ...                     libpython: [ OFF ]
  ...                      libslang: [ on  ]
  ...                     libcrypto: [ on  ]
  ...                     libunwind: [ on  ]
  ...            libdw-dwarf-unwind: [ on  ]
  ...                          zlib: [ on  ]
  ...                          lzma: [ on  ]
  ...                     get_cpuid: [ on  ]
  ...                           bpf: [ on  ]
  ...                        libaio: [ on  ]

  $ ls -la /tmp/build/perf/feature/test-libaio.*
  -rwxrwxr-x. 1 acme acme 18296 Nov 26 08:49 /tmp/build/perf/feature/test-libaio.bin
  -rw-rw-r--. 1 acme acme  1165 Nov 26 08:49 /tmp/build/perf/feature/test-libaio.d
  -rw-rw-r--. 1 acme acme     0 Nov 26 08:49 /tmp/build/perf/feature/test-libaio.make.output
  $
  $ grep -i aio /tmp/build/perf/FEATURE-DUMP
  feature-libaio=1
  $

Signed-off-by: Alexey Budankov <alexey.budankov@linux.intel.com>
Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Reviewed-by: Jiri Olsa <jolsa@redhat.com>
Acked-by: Namhyung Kim <namhyung@kernel.org>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: http://lkml.kernel.org/r/5fcda10c-6c63-68df-383a-c6d9e5d1f918@linux.intel.com
[ split from a larger patch ]
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2018-12-17 14:54:54 -03:00
Adrian Hunter
1c6f709b9f perf intel-pt: Fix error with config term "pt=0"
Users should never use 'pt=0', but if they do it may give a meaningless
error:

	$ perf record -e intel_pt/pt=0/u uname
	Error:
	The sys_perf_event_open() syscall returned with 22 (Invalid argument) for
	event (intel_pt/pt=0/u).

Fix that by forcing 'pt=1'.

Committer testing:

  # perf record -e intel_pt/pt=0/u uname
  Error:
  The sys_perf_event_open() syscall returned with 22 (Invalid argument) for event (intel_pt/pt=0/u).
  /bin/dmesg | grep -i perf may provide additional information.

  # perf record -e intel_pt/pt=0/u uname
  pt=0 doesn't make sense, forcing pt=1
  Linux
  [ perf record: Woken up 1 times to write data ]
  [ perf record: Captured and wrote 0.020 MB perf.data ]
  #

Signed-off-by: Adrian Hunter <adrian.hunter@intel.com>
Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Link: http://lkml.kernel.org/r/b7c5b4e5-9497-10e5-fd43-5f3e4a0fe51d@intel.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2018-12-17 14:54:51 -03:00
Arnaldo Carvalho de Melo
1b3aae90c6 perf top: Allow passing a kallsyms file
This basically replicates what was done for 'perf report' in:

   b226a5a729 ("perf report: Allow user to specify path to kallsyms file")

This should help with resolving eBPF symbols, that are in kallsyms but,
of course, not in vmlinux.

Reported-by: Ivan Babrou <ibobrik@gmail.com>
Tested-by: Ivan Babrou <ibobrik@gmail.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Alexei Starovoitov <ast@kernel.org>
Cc: Daniel Borkmann <daniel@iogearbox.net>
Cc: David Ahern <dsahern@gmail.com>
Cc: David S. Miller <davem@davemloft.net>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Wang Nan <wangnan0@huawei.com>
Link: https://lkml.kernel.org/n/tip-x52mx1ybq8128rtg9hjrj5qk@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2018-12-17 14:54:40 -03:00
Wen Yang
19702894cd perf bpf: Use ERR_CAST instead of ERR_PTR(PTR_ERR())
Use ERR_CAST inlined function instead of ERR_PTR(PTR_ERR(...)).  This
makes it more readable and also fix this warning detected by
err_cast.cocci:

  tools/perf/util/bpf-loader.c:1606:11-18: WARNING: ERR_CAST can be used with op

Signed-off-by: Wen Yang <wen.yang99@zte.com.cn>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Julia Lawall <julia.lawall@lip6.fr>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Wen Yang <yellowriver2010@hotmail.com>
Cc: zhong.weidong@zte.com.cn
Link: http://lkml.kernel.org/r/20181127090610.28488-1-wen.yang99@zte.com.cn
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2018-12-17 14:54:36 -03:00
Adrian Hunter
741dad88dd perf test: Fix perf_event_attr test failure
Fix inconsistent use of tabs and spaces error:

  # perf test 16 -v
  16: Setup struct perf_event_attr                          :
  --- start ---
  test child forked, pid 20224
    File "/usr/libexec/perf-core/tests/attr.py", line 119
      log.warning("expected %s=%s, got %s" % (t, self[t], other[t]))
                                                                 ^
  TabError: inconsistent use of tabs and spaces in indentation
  test child finished with -1
  ---- end ----
  Setup struct perf_event_attr: FAILED!

Signed-off-by: Adrian Hunter <adrian.hunter@intel.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Link: http://lkml.kernel.org/r/20181122140456.16817-1-adrian.hunter@intel.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2018-12-17 14:54:32 -03:00
Adrian Hunter
2aac9f9d5b perf tests record: Allow for 'sleep' being 'coreutils'
If the 'sleep' command is provided by coreutils, then the "PERF_RECORD_*
events & perf_sample fields" test will fail because the MMAP name is
'coreutils' not 'sleep', and there is an extra COMM event. Fix the test
to detect that case.

Signed-off-by: Adrian Hunter <adrian.hunter@intel.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Link: http://lkml.kernel.org/r/20181122135545.16295-1-adrian.hunter@intel.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2018-12-17 14:54:26 -03:00
Adrian Hunter
692d0e6332 perf script: Use fallbacks for branch stacks
Branch stacks do not necessarily have the same cpumode as the 'ip'. Use
the fallback functions in those cases.

This patch depends on patch "perf tools: Add fallback functions for cases
where cpumode is insufficient".

Signed-off-by: Adrian Hunter <adrian.hunter@intel.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: David S. Miller <davem@davemloft.net>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Leo Yan <leo.yan@linaro.org>
Cc: Mathieu Poirier <mathieu.poirier@linaro.org>
Cc: stable@vger.kernel.org # 4.19
Link: http://lkml.kernel.org/r/20181106210712.12098-4-adrian.hunter@intel.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2018-12-17 14:54:18 -03:00