linux_dsm_epyc7002/tools/perf/util
Alexey Budankov 470530bbb8 perf record: Implement --mmap-flush=<number> option
Implement a --mmap-flush option that specifies the minimal number of bytes
that must accumulate in the mmaped kernel buffer before they are extracted
and stored into a trace. The default option value is 1 byte, which means
that every time the trace writing thread finds new data in the mmaped
buffer, the data is extracted, possibly compressed, and written to a trace.

  $ tools/perf/perf record --mmap-flush 1024 -e cycles -- matrix.gcc
  $ tools/perf/perf record --aio --mmap-flush 1K -e cycles -- matrix.gcc
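
The two invocations above pass the threshold once as a plain byte count
and once with a size suffix. A minimal sketch of how such a value could
be turned into bytes is shown below. parse_flush_size() is a hypothetical
helper, not perf's own parser; the K and M suffixes are taken from the
examples in this message, while B and G are assumptions, and overflow
checking is left out for brevity.

```c
#include <ctype.h>
#include <stdlib.h>

/*
 * Hypothetical helper (not perf code): convert "1024", "1K", "16M" or
 * "1G" into a byte count. Returns 0 on success, -1 on malformed input.
 * Overflow of the multiplication is not checked in this sketch.
 */
static int parse_flush_size(const char *str, unsigned long long *bytes)
{
	char *end;
	unsigned long long value;
	unsigned long long mult = 1;

	/* Require a leading digit so that "", "-1" and "K" are rejected. */
	if (str == NULL || !isdigit((unsigned char)str[0]))
		return -1;

	value = strtoull(str, &end, 10);

	switch (*end) {
	case '\0':		/* plain number of bytes */
		break;
	case 'B':
		end++;
		break;
	case 'K':
		mult = 1ULL << 10;
		end++;
		break;
	case 'M':
		mult = 1ULL << 20;
		end++;
		break;
	case 'G':
		mult = 1ULL << 30;
		end++;
		break;
	default:
		return -1;
	}

	/* Nothing may follow the optional suffix. */
	if (*end != '\0')
		return -1;

	*bytes = value * mult;
	return 0;
}
```

With this sketch, "1024" and "1K" both yield 1024 bytes, and "16M" yields
16777216 bytes.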

The option is independent of the -z setting, doesn't vary with compression
level, and can serve two purposes.

The first purpose is to increase the compression ratio of the trace data.
Larger data chunks compress more effectively, so the option lets the user
specify the size of the data chunk to compress. Also, in some cases
executing more write syscalls with smaller data sizes can take longer
than executing fewer write syscalls with bigger data sizes, due to syscall
overhead, so extracting bigger data chunks as specified by the option value
could additionally decrease runtime overhead.

The second purpose is to avoid a self-monitoring live-lock issue in
system-wide (-a) profiling mode. Profiling in system-wide mode with
compression (-a -z) can additionally induce data into the kernel buffers
along with the data from the monitored processes. If the performance data
rate and volume from the monitored processes are high, then trace streaming
and compression activity in the tool is also high. High tool process
activity can lead to a subtle live-lock effect, where compressing a single
new byte from one of the mmaped kernel buffers generates the next single
byte in some mmaped buffer. The perf tool process then ends up in endless
self-monitoring.

The implemented synch parameter is the means to force data to be moved
independently of the specified flush threshold value. Regardless of the
provided flush value, the tool needs the capability to unconditionally
drain the memory buffers, at least at the end of the collection.
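
The interplay between the flush threshold and the synch parameter can be
sketched as a single predicate. This is an illustration under stated
assumptions, not the code from this patch: should_flush() and its
parameter names are hypothetical, and the comparison is assumed to be
"at least flush bytes available".

```c
#include <stdbool.h>
#include <stddef.h>

/*
 * Hypothetical predicate (not perf code): decide whether the reader
 * should drain a buffer that currently holds 'available' unread bytes.
 *
 *  - nothing to read: never flush
 *  - synch set: drain unconditionally, ignoring the threshold
 *  - otherwise: flush only once at least 'flush' bytes have accumulated
 */
static bool should_flush(size_t available, size_t flush, bool synch)
{
	if (available == 0)
		return false;
	if (synch)
		return true;
	return available >= flush;
}
```

With the default threshold of 1 byte the predicate is true as soon as any
data is present, which matches the default behaviour described above. A
final call with synch set drains whatever is left below the threshold.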

Committer testing:

Running with the default value, i.e. consuming as soon as there is
something to read, we first write the synthesized events, small
chunks of about 128 bytes:

  # perf trace -m 2048 --call-graph dwarf -e write -- perf record
  <SNIP>
     101.142 ( 0.004 ms): perf/25821 write(fd: 3</root/perf.data>, buf: 0x210db60, count: 120) = 120
                                         __libc_write (/usr/lib64/libpthread-2.28.so)
                                         ion (/home/acme/bin/perf)
                                         record__write (inlined)
                                         process_synthesized_event (/home/acme/bin/perf)
                                         perf_tool__process_synth_event (inlined)
                                         perf_event__synthesize_mmap_events (/home/acme/bin/perf)

Then we move to reading the mmap buffers consuming the events put there
by the kernel perf infrastructure:

     107.561 ( 0.005 ms): perf/25821 write(fd: 3</root/perf.data>, buf: 0x7f1befc02000, count: 336) = 336
                                         __libc_write (/usr/lib64/libpthread-2.28.so)
                                         ion (/home/acme/bin/perf)
                                         record__write (inlined)
                                         record__pushfn (/home/acme/bin/perf)
                                         perf_mmap__push (/home/acme/bin/perf)
                                         record__mmap_read_evlist (inlined)
                                         record__mmap_read_all (inlined)
                                         __cmd_record (inlined)
                                         cmd_record (/home/acme/bin/perf)
     12919.953 ( 0.136 ms): perf/25821 write(fd: 3</root/perf.data>, buf: 0x7f1befc83150, count: 184984) = 184984
  <SNIP same backtrace as in the 107.561 timestamp>
     12920.094 ( 0.155 ms): perf/25821 write(fd: 3</root/perf.data>, buf: 0x7f1befc02150, count: 261816) = 261816
  <SNIP same backtrace as in the 107.561 timestamp>
     12920.253 ( 0.093 ms): perf/25821 write(fd: 3</root/perf.data>, buf: 0x7f1befb81120, count: 170832) = 170832
  <SNIP same backtrace as in the 107.561 timestamp>

If we ask it to write only when 16MB are available for reading, the
threshold is capped at a quarter of the mmap size set via --mmap-pages
for 'perf record', which by default is 528384 bytes, as found using
'record -v':

  mmap flush: 132096
  mmap size 528384B

With that in place, all the writes coming from
record__mmap_read_evlist(), i.e. from the mmap buffers set up by the
kernel perf infrastructure, were at least 132096 bytes long.

Trying with a bigger mmap size:

   perf trace -e write perf record -v -m 2048 --mmap-flush 16M
   74982.928 ( 2.471 ms): perf/26500 write(fd: 3</root/perf.data>, buf: 0x7ff94a6cc000, count: 3580888) = 3580888
   74985.406 ( 2.353 ms): perf/26500 write(fd: 3</root/perf.data>, buf: 0x7ff949ecb000, count: 3453256) = 3453256
   74987.764 ( 2.629 ms): perf/26500 write(fd: 3</root/perf.data>, buf: 0x7ff9496ca000, count: 3859232) = 3859232
   74990.399 ( 2.341 ms): perf/26500 write(fd: 3</root/perf.data>, buf: 0x7ff948ec9000, count: 3769032) = 3769032
   74992.744 ( 2.064 ms): perf/26500 write(fd: 3</root/perf.data>, buf: 0x7ff9486c8000, count: 3310520) = 3310520
   74994.814 ( 2.619 ms): perf/26500 write(fd: 3</root/perf.data>, buf: 0x7ff947ec7000, count: 4194688) = 4194688
   74997.439 ( 2.787 ms): perf/26500 write(fd: 3</root/perf.data>, buf: 0x7ff9476c6000, count: 4029760) = 4029760

The flush value was again limited to a quarter of the mmap size:

  mmap flush: 2098176
  mmap size 8392704B

A warning about that would be good to have but can be added later,
something like:

  "max flush is a quarter of the mmap size, if wanting to bump the mmap
   flush further, bump the mmap size as well using -m/--mmap-pages"
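
The cap observed in both runs, together with the suggested warning, can
be sketched as follows. clamp_mmap_flush() is a hypothetical name and
this is not the code from this patch; the warning text is the one
proposed above, and the sizes in the usage note are the ones printed by
'record -v'.

```c
#include <stddef.h>
#include <stdio.h>

/*
 * Hypothetical helper (not perf code): limit the requested flush
 * threshold to a quarter of the mmap size and warn when it was reduced.
 */
static size_t clamp_mmap_flush(size_t requested, size_t mmap_size)
{
	size_t max_flush = mmap_size / 4;

	if (requested > max_flush) {
		fprintf(stderr,
			"max flush is a quarter of the mmap size, if wanting to bump the mmap\n"
			"flush further, bump the mmap size as well using -m/--mmap-pages\n");
		return max_flush;
	}

	return requested;
}
```

For a 16M request this gives 528384 / 4 = 132096 with the default mmap
size and 8392704 / 4 = 2098176 with -m 2048, the two "mmap flush" values
reported above. A request of 1024 bytes is below the cap in both cases
and is returned unchanged.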

Also rename the 'sync' parameters to 'synch' to keep tools/perf building
with older glibcs:

  cc1: warnings being treated as errors
  builtin-record.c: In function 'record__mmap_read_evlist':
  builtin-record.c:775: warning: declaration of 'sync' shadows a global declaration
  /usr/include/unistd.h:933: warning: shadowed declaration is here
  builtin-record.c: In function 'record__mmap_read_all':
  builtin-record.c:856: warning: declaration of 'sync' shadows a global declaration
  /usr/include/unistd.h:933: warning: shadowed declaration is here
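
The warnings above come from a parameter named 'sync' clashing with the
sync() function declared in <unistd.h>. A minimal sketch of the renamed
form is shown below; drain_total() and its arguments are hypothetical
and only illustrate that 'synch' does not collide with the global
declaration.

```c
#include <stdbool.h>
#include <stddef.h>
#include <unistd.h>	/* declares the global function sync() */

/*
 * Hypothetical example (not perf code). Naming the last parameter
 * 'sync' would shadow sync() from <unistd.h>, which the build log above
 * reports as a warning treated as an error. 'synch' avoids the clash.
 *
 * Returns the number of bytes that would be drained from 'nr' buffers:
 * every buffer when synch is set, otherwise only those holding at least
 * 'flush' bytes.
 */
static size_t drain_total(const size_t *avail, size_t nr,
			  size_t flush, bool synch)
{
	size_t i, total = 0;

	for (i = 0; i < nr; i++) {
		if (synch || avail[i] >= flush)
			total += avail[i];
	}

	return total;
}
```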

Signed-off-by: Alexey Budankov <alexey.budankov@linux.intel.com>
Reviewed-by: Jiri Olsa <jolsa@kernel.org>
Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: http://lkml.kernel.org/r/f6600d72-ecfa-2eb7-7e51-f6954547d500@linux.intel.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2019-04-01 15:18:10 -03:00
..
c++ perf clang: Remove needless extra semicolon 2019-03-06 09:47:48 -03:00
cs-etm-decoder perf cs-etm: Add missing case value 2019-03-28 14:31:55 -03:00
include Remove 'type' argument from access_ok() function 2019-01-03 18:57:57 -08:00
intel-pt-decoder perf intel-pt: Fix TSC slip 2019-03-28 14:31:55 -03:00
libunwind
scripting-engines perf tools, tools lib traceevent: Rename "pevent" member of struct tep_event to "tep" 2019-04-01 15:18:10 -03:00
annotate.c perf annotate: Enable annotation of BPF programs 2019-03-20 16:43:15 -03:00
annotate.h perf annotate: Enable annotation of BPF programs 2019-03-20 16:43:15 -03:00
archinsn.h perf script: Support insn output for normal samples 2019-03-11 11:56:02 -03:00
arm-spe-pkt-decoder.c
arm-spe-pkt-decoder.h
arm-spe.c
arm-spe.h
auxtrace.c perf auxtrace: Improve address filter error message when there is no DSO 2019-03-01 14:47:06 -03:00
auxtrace.h perf auxtrace: Add timestamp to auxtrace errors 2019-02-06 11:20:32 -03:00
block-range.c perf block-range: Add missing headers 2019-01-25 15:12:09 +01:00
block-range.h perf block-range: Add missing headers 2019-01-25 15:12:09 +01:00
bpf_map.c perf bpf: Add bpf_map dumper 2019-02-19 16:11:56 -03:00
bpf_map.h perf bpf: Add bpf_map dumper 2019-02-19 16:11:56 -03:00
bpf-event.c perf bpf: Show more BPF program info in print_bpf_prog_info() 2019-03-21 11:27:04 -03:00
bpf-event.h perf bpf: Show more BPF program info in print_bpf_prog_info() 2019-03-21 11:27:04 -03:00
bpf-loader.c Merge branch 'perf-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip 2019-03-06 07:59:36 -08:00
bpf-loader.h perf bpf-loader: Remove unecessary includes from bpf-loader.h 2019-02-06 10:00:39 -03:00
bpf-prologue.c
bpf-prologue.h
branch.c
branch.h perf tools: Move branch structs to branch.h 2019-01-25 15:12:08 +01:00
Build perf bpf: Add bpf_map dumper 2019-02-19 16:11:56 -03:00
build-id.c perf build-id: Fix memory leak in print_sdt_events() 2019-03-19 16:52:04 -03:00
build-id.h perf namespaces: Remove namespaces.h from .h headers 2019-01-25 15:12:09 +01:00
cache.h
call-path.c
call-path.h
callchain.c perf hist: Remove symbol.h from hist.h, just fwd decls are needed 2019-02-06 10:00:38 -03:00
callchain.h perf symbols: Introduce map_symbol.h 2019-02-06 10:00:38 -03:00
cgroup.c
cgroup.h
cloexec.c
cloexec.h
color_config.c perf utils: Move perf_config using routines from color.c to separate object 2019-01-21 17:38:56 -03:00
color.c perf utils: Move perf_config using routines from color.c to separate object 2019-01-21 17:38:56 -03:00
color.h perf color: Add missing stdarg.g to color.h 2019-01-25 15:12:08 +01:00
comm.c perf comm: Remove needless headers from comm.h 2019-01-25 15:12:09 +01:00
comm.h perf comm: Remove needless headers from comm.h 2019-01-25 15:12:09 +01:00
compress.h
config.c perf config: Fix a memory leak in collect_config() 2019-03-19 16:52:04 -03:00
config.h
counts.c
counts.h
cpu-set-sched.h perf tools: Add fallback versions for CPU_{OR,EQUAL}() 2019-02-06 10:00:39 -03:00
cpumap.c perf cpumap: Increase debug level for cpu_map__snprint verbose output 2019-02-20 17:08:39 -03:00
cpumap.h perf record: Apply affinity masks when reading mmap buffers 2019-02-06 10:00:39 -03:00
cputopo.c perf tools: Use sysfs__mountpoint() when reading cpu topology 2019-02-19 12:21:10 -03:00
cputopo.h perf tools: Add numa_topology object 2019-02-19 12:21:06 -03:00
cs-etm.c perf cs-etm: Modularize auxtrace_buffer fetch function 2019-02-14 15:18:08 -03:00
cs-etm.h perf cs-etm: Introducing function cs_etm__init_trace_params() 2019-02-14 15:18:06 -03:00
ctype.c
data-convert-bt.c perf tools, tools lib traceevent: Rename "pevent" member of struct tep_event to "tep" 2019-04-01 15:18:10 -03:00
data-convert-bt.h
data-convert.h
data.c perf record: Allow to limit number of reported perf.data files 2019-03-19 11:56:20 -03:00
data.h perf record: Allow to limit number of reported perf.data files 2019-03-19 11:56:20 -03:00
db-export.c perf db-export: Add calls parent_id to enable creation of call trees 2019-03-01 14:50:47 -03:00
db-export.h perf db-export: Add calls parent_id to enable creation of call trees 2019-03-01 14:50:47 -03:00
debug.c
debug.h
demangle-java.c
demangle-java.h
demangle-rust.c
demangle-rust.h
dso.c perf symbols: Introduce DSO_BINARY_TYPE__BPF_PROG_INFO 2019-03-19 16:52:07 -03:00
dso.h perf symbols: Introduce DSO_BINARY_TYPE__BPF_PROG_INFO 2019-03-19 16:52:07 -03:00
dump-insn.c perf script: Fix LBR skid dump problems in brstackinsn 2018-12-28 16:33:02 -03:00
dump-insn.h perf script: Fix LBR skid dump problems in brstackinsn 2018-12-28 16:33:02 -03:00
dwarf-aux.c
dwarf-aux.h
dwarf-regs.c
env.c perf bpf: Save BTF in a rbtree in perf_env 2019-03-19 16:52:07 -03:00
env.h perf bpf: Save BTF in a rbtree in perf_env 2019-03-19 16:52:07 -03:00
event.c perf tools: Add missing include for symbols.h 2019-02-06 10:00:38 -03:00
event.h perf tools: Add header defining used namespace struct to event.h 2019-04-01 14:49:24 -03:00
evlist.c perf record: Implement --mmap-flush=<number> option 2019-04-01 15:18:10 -03:00
evlist.h perf record: Implement --mmap-flush=<number> option 2019-04-01 15:18:10 -03:00
evsel_fprintf.c perf tools: Support 'srccode' output 2018-12-17 14:57:07 -03:00
evsel.c perf evsel: Support printing evsel name for 'duration_time' 2019-04-01 14:49:24 -03:00
evsel.h perf stat: Implement duration_time as a proper event 2019-04-01 14:49:24 -03:00
expr.h
expr.y
find-map.c perf tools: Make find_vdso_map() more modular 2019-01-08 13:28:13 -03:00
genelf_debug.c
genelf.c
genelf.h
generate-cmdlist.sh
get_current_dir_name.c tools build feature: Check if get_current_dir_name() is available 2018-11-19 12:12:17 -08:00
group.h
header.c perf bpf: Show more BPF program info in print_bpf_prog_info() 2019-03-21 11:27:04 -03:00
header.h perf bpf: Save BTF information as headers to perf.data 2019-03-19 16:52:07 -03:00
help-unknown-cmd.c
help-unknown-cmd.h
hist.c perf hist: Add missing map__put() in error case 2019-03-19 16:52:04 -03:00
hist.h perf report: Implement browsing of individual samples 2019-03-11 16:33:19 -03:00
intel-bts.c perf thread: Generalize function to copy from thread addr space from intel-bts code 2019-03-06 17:55:35 -03:00
intel-bts.h
intel-pt.c perf intel-pt: Fix divide by zero when TSC is not available 2019-03-01 14:48:30 -03:00
intel-pt.h
intlist.c
intlist.h perf util: Use cached rbtree for rblists 2019-01-25 15:12:10 +01:00
jit.h
jitdump.c perf symbols: Remove some unnecessary includes from symbol.h 2019-01-25 15:12:09 +01:00
jitdump.h
kvm-stat.h perf kvm stat: Replace kvm-stat.h includes with forward declarations 2019-02-06 10:00:39 -03:00
levenshtein.c
levenshtein.h
llvm-utils.c
llvm-utils.h
lzma.c
machine.c perf machine: Update kernel map address and re-order properly 2019-03-28 14:41:21 -03:00
machine.h perf map: Move structs and prototypes for map groups to a separate header 2019-02-06 10:00:38 -03:00
map_groups.h perf map: Move structs and prototypes for map groups to a separate header 2019-02-06 10:00:38 -03:00
map_symbol.h perf symbols: Introduce map_symbol.h 2019-02-06 10:00:38 -03:00
map.c perf maps: Purge all maps from the 'names' tree 2019-03-19 16:52:05 -03:00
map.h perf map: Move structs and prototypes for map groups to a separate header 2019-02-06 10:00:38 -03:00
mem2node.c
mem2node.h
mem-events.c perf mem/c2c: Fix perf_mem_events to support powerpc 2019-02-04 11:32:14 -03:00
mem-events.h
memswap.c
memswap.h
metricgroup.c perf list: Display metric expressions for --details option 2019-02-14 15:18:09 -03:00
metricgroup.h perf list: Display metric expressions for --details option 2019-02-14 15:18:09 -03:00
mmap.c perf record: Implement --mmap-flush=<number> option 2019-04-01 15:18:10 -03:00
mmap.h perf record: Implement --mmap-flush=<number> option 2019-04-01 15:18:10 -03:00
namespaces.c perf tools: Restore proper cwd on return from mnt namespace 2018-11-19 12:12:26 -08:00
namespaces.h perf tools: Restore proper cwd on return from mnt namespace 2018-11-19 12:12:26 -08:00
ordered-events.c perf top: Fix global-buffer-overflow issue 2019-03-19 16:52:05 -03:00
ordered-events.h perf ordered_events: Add first_time() method 2018-12-17 15:02:17 -03:00
parse-branch-options.c
parse-branch-options.h
parse-events.c perf list: Output tool events 2019-04-01 14:49:25 -03:00
parse-events.h perf list: Output tool events 2019-04-01 14:49:25 -03:00
parse-events.l perf stat: Implement duration_time as a proper event 2019-04-01 14:49:24 -03:00
parse-events.y perf stat: Implement duration_time as a proper event 2019-04-01 14:49:24 -03:00
parse-regs-options.c
parse-regs-options.h
path.c
path.h
perf_regs.c
perf_regs.h
perf-hooks-list.h
perf-hooks.c
perf-hooks.h
PERF-VERSION-GEN
pmu.c perf pmu: Fix parser error for uncore event alias 2019-03-28 15:53:27 -03:00
pmu.h perf tools: Read and store caps/max_precise in perf_pmu 2019-03-06 18:18:17 -03:00
pmu.l
pmu.y
print_binary.c
print_binary.h
probe-event.c perf probe: Fix getting the kernel map 2019-03-11 11:56:03 -03:00
probe-event.h perf namespaces: Remove namespaces.h from .h headers 2019-01-25 15:12:09 +01:00
probe-file.c perf namespaces: Remove namespaces.h from .h headers 2019-01-25 15:12:09 +01:00
probe-file.h
probe-finder.c
probe-finder.h
pstack.c
pstack.h
python-ext-sources
python.c perf tools, tools lib traceevent: Rename "pevent" member of struct tep_event to "tep" 2019-04-01 15:18:10 -03:00
rb_resort.h perf util: Use cached rbtree for rblists 2019-01-25 15:12:10 +01:00
rblist.c perf util: Use cached rbtree for rblists 2019-01-25 15:12:10 +01:00
rblist.h perf util: Use cached rbtree for rblists 2019-01-25 15:12:10 +01:00
record.c
rwsem.c
rwsem.h
s390-cpumcf-kernel.h perf report: Display arch specific diagnostic counter sets, starting with s390 2019-01-21 17:00:48 -03:00
s390-cpumsf-kernel.h
s390-cpumsf.c perf report: Add s390 diagnosic sampling descriptor size 2019-02-14 13:31:08 -03:00
s390-cpumsf.h
s390-sample-raw.c perf report: Display names in s390 diagnostic counter sets 2019-01-21 17:00:56 -03:00
sample-raw.c perf report: Display arch specific diagnostic counter sets, starting with s390 2019-01-21 17:00:48 -03:00
sample-raw.h perf report: Display arch specific diagnostic counter sets, starting with s390 2019-01-21 17:00:48 -03:00
sane_ctype.h
session.c perf bpf: Save bpf_prog_info in a rbtree in perf_env 2019-03-19 16:52:06 -03:00
session.h
setns.c
setup.py perf record: Bind the AIO user space buffers to nodes 2019-02-06 10:00:39 -03:00
smt.c
smt.h
sort.c perf report: Show all sort keys in help output 2019-03-19 16:15:42 -03:00
sort.h perf report: Show all sort keys in help output 2019-03-19 16:15:42 -03:00
srccode.c perf tools: Support 'srccode' output 2018-12-17 14:57:07 -03:00
srccode.h perf srccode: Move struct definition from map.h to srccode.h 2019-02-06 10:00:38 -03:00
srcline.c perf report: Don't shadow inlined symbol with different addr range 2019-02-19 12:30:12 -03:00
srcline.h perf callchain: Use cached rbtrees 2019-01-25 15:12:09 +01:00
stat-display.c perf stat: Revert checks for duration_time 2019-04-01 14:49:24 -03:00
stat-shadow.c perf util: Use cached rbtree for rblists 2019-01-25 15:12:10 +01:00
stat.c perf stat: Fix --no-scale 2019-03-19 16:52:03 -03:00
stat.h
strbuf.c perf strbuf: Remove redundant va_end() in strbuf_addv() 2019-01-04 12:54:49 -03:00
strbuf.h
strfilter.c
strfilter.h
string2.h
string.c
strlist.c
strlist.h perf util: Use cached rbtree for rblists 2019-01-25 15:12:10 +01:00
svghelper.c perf svghelper: Fix unchecked usage of strncpy() 2018-12-17 14:59:20 -03:00
svghelper.h
symbol_conf.h perf report: Implement browsing of individual samples 2019-03-11 16:33:19 -03:00
symbol_fprintf.c perf symbols: Use cached rbtrees 2019-01-25 15:12:10 +01:00
symbol-elf.c perf/core improvements and fixes: 2019-02-09 13:16:01 +01:00
symbol-minimal.c perf symbols: Remove some unnecessary includes from symbol.h 2019-01-25 15:12:09 +01:00
symbol.c perf symbols: Introduce DSO_BINARY_TYPE__BPF_PROG_INFO 2019-03-19 16:52:07 -03:00
symbol.h perf symbols: Introduce map_symbol.h 2019-02-06 10:00:38 -03:00
syscalltbl.c
syscalltbl.h
target.c
target.h
term.c
term.h
thread_map.c
thread_map.h
thread-stack.c perf db-export: Add calls parent_id to enable creation of call trees 2019-03-01 14:50:47 -03:00
thread-stack.h perf db-export: Add calls parent_id to enable creation of call trees 2019-03-01 14:50:47 -03:00
thread.c perf thread: Generalize function to copy from thread addr space from intel-bts code 2019-03-06 17:55:35 -03:00
thread.h perf thread: Generalize function to copy from thread addr space from intel-bts code 2019-03-06 17:55:35 -03:00
time-utils.c perf time-utils: Add utility function to print time stamps in nanoseconds 2019-03-11 11:56:02 -03:00
time-utils.h perf time-utils: Add utility function to print time stamps in nanoseconds 2019-03-11 11:56:02 -03:00
tool.h perf tools: Handle PERF_RECORD_BPF_EVENT 2019-01-21 17:00:57 -03:00
top.c perf top: Move perf_top__reset_sample_counters() to after counts display 2018-12-17 14:58:47 -03:00
top.h perf top: Save and display the drop count stats 2018-12-17 14:58:33 -03:00
trace-event-info.c
trace-event-parse.c perf tools, tools lib traceevent: Rename "pevent" member of struct tep_event to "tep" 2019-04-01 15:18:10 -03:00
trace-event-read.c tools tools, tools lib traceevent: Make traceevent APIs more consistent 2019-04-01 15:18:09 -03:00
trace-event-scripting.c
trace-event.c tools tools, tools lib traceevent: Make traceevent APIs more consistent 2019-04-01 15:18:09 -03:00
trace-event.h tools lib traceevent, perf tools: Rename 'struct tep_event_format' to 'struct tep_event' 2018-12-17 14:56:02 -03:00
trigger.h
tsc.c
tsc.h
units.c
units.h
unwind-libdw.c perf tools: Add missing include for symbols.h 2019-02-06 10:00:38 -03:00
unwind-libdw.h
unwind-libunwind-local.c pref tools: Add missing map.h includes 2019-02-06 10:00:38 -03:00
unwind-libunwind.c pref tools: Add missing map.h includes 2019-02-06 10:00:38 -03:00
unwind.h
usage.c
util-cxx.h
util.c perf tools: Add perf_exe() helper to find perf binary 2019-02-25 10:58:28 -03:00
util.h perf tools: Add perf_exe() helper to find perf binary 2019-02-25 10:58:28 -03:00
values.c
values.h
vdso.c pref tools: Add missing map.h includes 2019-02-06 10:00:38 -03:00
vdso.h
xyarray.c
xyarray.h
zlib.c perf tools: Remove duplicate headers 2019-01-21 15:15:57 -03:00