linux_dsm_epyc7002/kernel/trace
Steven Rostedt (Red Hat) 79922b8009 ftrace: Optimize function graph to be called directly
Function graph tracing is a bit different from the function tracers, as
it is processed after either the ftrace_caller or ftrace_regs_caller.
Because there is only one place to modify the jump to ftrace_graph_caller,
that jump needs to happen after the registers are restored.

The function graph tracer is dependent on the function tracer: even when
function graph tracing is running by itself, the registers are still
saved and restored for function tracing, whether or not function tracing
is happening, before the function graph code is called.

If there's no function tracing happening, it is possible to just call
the function graph tracer directly, and avoid the wasted effort to save
and restore regs for function tracing.

This requires adding new flags to the dyn_ftrace records:

  FTRACE_FL_TRAMP
  FTRACE_FL_TRAMP_EN

The first is set if the count for the record is one and the ftrace_ops
associated with that record has its own trampoline. That way the mcount code
can call that trampoline directly.
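
The rule for the first flag can be sketched in plain C. This is only an
illustration of the condition described above, not the code from ftrace.c:
the struct names, the bit positions, and the helpers sketch_update_tramp()
and sketch_call_target() are all hypothetical, and FTRACE_FL_TRAMP_EN is
not modeled.

```c
#include <stddef.h>

/* Hypothetical bit layout for illustration; not the kernel's real values. */
#define SKETCH_FL_TRAMP      (1UL << 28)
#define SKETCH_FL_COUNT_MASK ((1UL << 27) - 1)

/* Stand-in for an ftrace_ops that may own a private trampoline. */
struct sketch_tramp_ops {
	unsigned long trampoline;	/* trampoline address, 0 if none */
};

/* Stand-in for a dyn_ftrace record. */
struct sketch_rec {
	unsigned long flags;		/* low bits: how many ops trace this function */
};

/* Set the TRAMP flag only when exactly one ops traces the function
 * and that ops has its own trampoline; clear it otherwise. */
static void sketch_update_tramp(struct sketch_rec *rec,
				const struct sketch_tramp_ops *ops)
{
	unsigned long count = rec->flags & SKETCH_FL_COUNT_MASK;

	if (count == 1 && ops != NULL && ops->trampoline != 0)
		rec->flags |= SKETCH_FL_TRAMP;
	else
		rec->flags &= ~SKETCH_FL_TRAMP;
}

/* Pick the address the call site should jump to: the ops trampoline
 * when the flag is set, else the generic caller (for example the
 * address of ftrace_caller). */
static unsigned long sketch_call_target(const struct sketch_rec *rec,
					const struct sketch_tramp_ops *ops,
					unsigned long default_caller)
{
	if ((rec->flags & SKETCH_FL_TRAMP) && ops != NULL)
		return ops->trampoline;
	return default_caller;
}
```

With a count of one and a non-zero trampoline, sketch_call_target() returns
the trampoline address. With a count of two, or with an ops that has no
trampoline, it falls back to the default caller.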

In the future, trampolines can be added to arbitrary ftrace_ops. When two
or more ftrace_ops are registered to ftrace (like kprobes and perf) and they
are not tracing the same functions, then instead of looping to check all
registered ftrace_ops against their hashes, ftrace can just call the
ftrace_ops trampoline directly, which in turn calls the registered
ftrace_ops function directly.
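
The difference between the two dispatch styles can be sketched as follows.
This is a simplified model under stated assumptions, not kernel code: a flat
array of addresses stands in for each ftrace_ops hash, and every name
prefixed with sketch_ is hypothetical.

```c
#include <stddef.h>

typedef void (*sketch_func_t)(unsigned long ip);

/* Stand-in for an ftrace_ops: a callback plus the functions it traces.
 * A flat array replaces the real hash for illustration. */
struct sketch_list_ops {
	sketch_func_t func;
	const unsigned long *filter;
	size_t nr_filter;
};

/* Example callbacks that only count how often they run. */
static int sketch_hits_a, sketch_hits_b;
static void sketch_cb_a(unsigned long ip) { (void)ip; sketch_hits_a++; }
static void sketch_cb_b(unsigned long ip) { (void)ip; sketch_hits_b++; }

/* Return 1 if this ops traces the function at ip, else 0. */
static int sketch_ops_traces(const struct sketch_list_ops *ops,
			     unsigned long ip)
{
	for (size_t i = 0; i < ops->nr_filter; i++)
		if (ops->filter[i] == ip)
			return 1;
	return 0;
}

/* Loop style: test every registered ops on every traced call and
 * invoke the ones that match. Returns the number of callbacks run. */
static int sketch_list_call(const struct sketch_list_ops *all, size_t n,
			    unsigned long ip)
{
	int calls = 0;

	for (size_t i = 0; i < n; i++) {
		if (sketch_ops_traces(&all[i], ip)) {
			all[i].func(ip);
			calls++;
		}
	}
	return calls;
}

/* Direct style, decided once at registration time: if exactly one ops
 * traces ip, return it so the call site can be pointed straight at
 * that ops. Returns NULL when zero or several ops match. */
static const struct sketch_list_ops *
sketch_sole_ops(const struct sketch_list_ops *all, size_t n, unsigned long ip)
{
	const struct sketch_list_ops *found = NULL;

	for (size_t i = 0; i < n; i++) {
		if (!sketch_ops_traces(&all[i], ip))
			continue;
		if (found != NULL)
			return NULL;	/* shared function: keep the loop */
		found = &all[i];
	}
	return found;
}
```

The point of the direct style is that the search in sketch_sole_ops() runs
once when the ops are registered, while sketch_list_call() repeats the
search on every traced function call.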

Without this patch perf showed:

  0.05%  hackbench  [kernel.kallsyms]  [k] ftrace_caller
  0.05%  hackbench  [kernel.kallsyms]  [k] arch_local_irq_save
  0.05%  hackbench  [kernel.kallsyms]  [k] native_sched_clock
  0.04%  hackbench  [kernel.kallsyms]  [k] __buffer_unlock_commit
  0.04%  hackbench  [kernel.kallsyms]  [k] preempt_trace
  0.04%  hackbench  [kernel.kallsyms]  [k] prepare_ftrace_return
  0.04%  hackbench  [kernel.kallsyms]  [k] __this_cpu_preempt_check
  0.04%  hackbench  [kernel.kallsyms]  [k] ftrace_graph_caller

Note that ftrace_caller took up more time than ftrace_graph_caller did.

With this patch:

  0.05%  hackbench  [kernel.kallsyms]  [k] __buffer_unlock_commit
  0.04%  hackbench  [kernel.kallsyms]  [k] call_filter_check_discard
  0.04%  hackbench  [kernel.kallsyms]  [k] ftrace_graph_caller
  0.04%  hackbench  [kernel.kallsyms]  [k] sched_clock

The ftrace_caller is nowhere to be found and ftrace_graph_caller still
takes up the same percentage.

Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
2014-07-01 07:13:31 -04:00
blktrace.c Most of the changes were largely clean ups, and some documentation. 2014-04-03 10:26:31 -07:00
ftrace.c ftrace: Optimize function graph to be called directly 2014-07-01 07:13:31 -04:00
Kconfig tracing: Add tracepoint benchmark tracepoint 2014-05-29 22:49:54 -04:00
Makefile tracing: Add tracepoint benchmark tracepoint 2014-05-29 22:49:54 -04:00
power-traces.c PM / tracing: remove deprecated power trace API 2013-01-26 00:39:12 +01:00
ring_buffer_benchmark.c trace: Replace hardcoding of 19 with MAX_NICE 2014-02-27 12:41:03 +01:00
ring_buffer.c ring-buffer: Check if buffer exists before polling 2014-06-10 09:46:00 -04:00
rpm-traces.c PM / Runtime: Introduce trace points for tracing rpm_* functions 2011-09-27 22:53:27 +02:00
trace_benchmark.c tracing: Only calculate stats of tracepoint benchmarks for 2^32 times 2014-06-06 00:41:38 -04:00
trace_benchmark.h tracing: Add tracepoint benchmark tracepoint 2014-05-29 22:49:54 -04:00
trace_branch.c tracing: Update event filters for multibuffer 2013-11-05 16:50:20 -05:00
trace_clock.c tracing: Add "uptime" trace clock that uses jiffies 2013-03-15 00:36:09 -04:00
trace_entries.h tracing: Add trace_puts() for even faster trace_printk() tracing 2013-03-15 00:35:55 -04:00
trace_event_perf.c kprobes, ftrace: Use NOKPROBE_SYMBOL macro in ftrace 2014-04-24 10:26:39 +02:00
trace_events_filter_test.h
trace_events_filter.c tracing: Add and use generic set_trigger_filter() implementation 2013-12-21 22:02:17 -05:00
trace_events_trigger.c tracing: Use rcu_dereference_sched() for trace event triggers 2014-05-02 23:12:42 -04:00
trace_events.c tracing: Return error if ftrace_trace_arrays list is empty 2014-06-06 04:47:46 -04:00
trace_export.c tracing: Fix anonymous unions in struct ftrace_event_call 2014-04-09 20:02:55 -04:00
trace_functions_graph.c tracing: Add funcgraph_tail option to print function name after closing braces 2014-05-20 23:29:32 -04:00
trace_functions.c tracing: Remove mock up poll wait function 2014-04-30 08:40:05 -04:00
trace_irqsoff.c tracing: Allow irq/preempt tracers to be used by instances 2014-04-21 13:59:29 -04:00
trace_kdb.c tracing: Consolidate max_tr into main trace_array structure 2013-03-15 00:35:40 -04:00
trace_kprobe.c Merge branch 'perf-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip 2014-06-12 19:18:49 -07:00
trace_mmiotrace.c tracing: Update event filters for multibuffer 2013-11-05 16:50:20 -05:00
trace_nop.c tracing: Remove mock up poll wait function 2014-04-30 08:40:05 -04:00
trace_output.c tracing: Add __bitmask() macro to trace events to cpumasks and other bitmasks 2014-05-15 11:29:37 -04:00
trace_output.h tracing: Rename trace_event_mutex to trace_event_sem 2013-03-15 13:22:10 -04:00
trace_printk.c tracing: Add __tracepoint_string() to export string pointers 2013-07-26 13:39:44 -04:00
trace_probe.c kprobes, ftrace: Use NOKPROBE_SYMBOL macro in ftrace 2014-04-24 10:26:39 +02:00
trace_probe.h kprobes, ftrace: Use NOKPROBE_SYMBOL macro in ftrace 2014-04-24 10:26:39 +02:00
trace_sched_switch.c tracing: Update event filters for multibuffer 2013-11-05 16:50:20 -05:00
trace_sched_wakeup.c tracing: Remove mock up poll wait function 2014-04-30 08:40:05 -04:00
trace_selftest_dynamic.c
trace_selftest.c tracing: Add static to local functions 2014-04-21 14:00:46 -04:00
trace_stack.c tracing: Print max callstack on stacktrace bug 2014-06-02 16:43:49 -04:00
trace_stat.c trace/trace_stat: use rbtree postorder iteration helper instead of opencoding 2013-11-05 16:01:47 -05:00
trace_stat.h
trace_syscalls.c tracing: Consolidate event trigger code 2014-01-09 21:20:07 -05:00
trace_uprobe.c Merge branch 'perf/kprobes' into perf/core 2014-06-05 12:26:50 +02:00
trace.c tracing: Fix leak of per cpu max data in instances 2014-06-10 12:06:30 -04:00
trace.h tracing: Fix check of ftrace_trace_arrays list_empty() check 2014-06-10 13:53:50 -04:00