ftrace: dynamic enabling/disabling of function calls
This patch adds a feature to dynamically replace the ftrace code
with the jmps to allow a kernel with ftrace configured to run
as fast as it can without it configured.
The way this works is that on bootup (if ftrace is enabled), a ftrace
function is registered to record the instruction pointer of all
places that call the function.
Later, if there's still any code to patch, a kthread is awoken
(rate limited to at most once a second) that performs a stop_machine,
and replaces all the code that was called with a jmp over the call
to ftrace. It only replaces what was found the previous time. Typically
the system reaches equilibrium quickly after bootup and there's no code
patching needed at all.
e.g.
call ftrace /* 5 bytes */
is replaced with
jmp 3f /* jmp is 2 bytes and we jump 3 forward */
3:
When we want to enable ftrace for function tracing, the IP recording
is removed, and stop_machine is called again to replace all the locations
that were recorded back to the call of ftrace. When it is disabled,
we replace the code back to the jmp.
Allocation is done by the kthread. If the ftrace recording function is
called, and we don't have any record slots available, then we simply
skip that call. Once a second a new page (if needed) is allocated for
recording new ftrace function calls. A large batch is allocated at
boot up to get most of the calls there.
Because we do this via stop_machine, we don't have to worry about another
CPU executing a ftrace call as we modify it. But we do need to worry
about NMIs, so all functions that might be called via NMI must be
annotated with notrace_nmi. When this code is configured in, the NMI code
will not call notrace.
Signed-off-by: Steven Rostedt <srostedt@redhat.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
2008-05-13 02:20:42 +07:00
|
|
|
/*
|
2015-12-28 15:35:07 +07:00
|
|
|
* Dynamic function tracing support.
|
2008-05-13 02:20:42 +07:00
|
|
|
*
|
|
|
|
* Copyright (C) 2007-2008 Steven Rostedt <srostedt@redhat.com>
|
|
|
|
*
|
|
|
|
* Thanks goes to Ingo Molnar, for suggesting the idea.
|
|
|
|
* Mathieu Desnoyers, for suggesting postponing the modifications.
|
|
|
|
* Arjan van de Ven, for keeping me straight, and explaining to me
|
|
|
|
* the dangers of modifying code on the run.
|
|
|
|
*/
|
|
|
|
|
2009-10-05 07:53:29 +07:00
|
|
|
#define pr_fmt(fmt) KBUILD_MODNAME ": " fmt
|
|
|
|
|
2008-05-13 02:20:42 +07:00
|
|
|
#include <linux/spinlock.h>
|
|
|
|
#include <linux/hardirq.h>
|
2008-08-20 23:55:07 +07:00
|
|
|
#include <linux/uaccess.h>
|
2008-05-13 02:20:42 +07:00
|
|
|
#include <linux/ftrace.h>
|
|
|
|
#include <linux/percpu.h>
|
2008-11-11 17:57:02 +07:00
|
|
|
#include <linux/sched.h>
|
ftrace/x86: Add dynamic allocated trampoline for ftrace_ops
The current method of handling multiple function callbacks is to register
a list function callback that calls all the other callbacks based on
their hash tables and compare it to the function that the callback was
called on. But this is very inefficient.
For example, if you are tracing all functions in the kernel and then
add a kprobe to a function such that the kprobe uses ftrace, the
mcount trampoline will switch from calling the function trace callback
to calling the list callback that will iterate over all registered
ftrace_ops (in this case, the function tracer and the kprobes callback).
That means for every function being traced it checks the hash of the
ftrace_ops for function tracing and kprobes, even though the kprobes
is only set at a single function. The kprobes ftrace_ops is checked
for every function being traced!
Instead of calling the list function for functions that are only being
traced by a single callback, we can call a dynamically allocated
trampoline that calls the callback directly. The function graph tracer
already uses a direct call trampoline when it is being traced by itself
but it is not dynamically allocated. Its trampoline is static in the
kernel core. The infrastructure that called the function graph trampoline
can also be used to call a dynamically allocated one.
For now, only ftrace_ops that are not dynamically allocated can have
a trampoline. That is, users such as function tracer or stack tracer.
kprobes and perf allocate their ftrace_ops, and until there's a safe
way to free the trampoline, it can not be used. The dynamically allocated
ftrace_ops may, however, use the trampoline if the kernel is not
compiled with CONFIG_PREEMPT. But that will come later.
Tested-by: Masami Hiramatsu <masami.hiramatsu.pt@hitachi.com>
Tested-by: Jiri Kosina <jkosina@suse.cz>
Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
2014-07-03 10:23:31 +07:00
|
|
|
#include <linux/slab.h>
|
2008-05-13 02:20:42 +07:00
|
|
|
#include <linux/init.h>
|
|
|
|
#include <linux/list.h>
|
2010-11-17 04:35:16 +07:00
|
|
|
#include <linux/module.h>
|
2008-05-13 02:20:42 +07:00
|
|
|
|
2009-04-09 01:40:59 +07:00
|
|
|
#include <trace/syscall.h>
|
|
|
|
|
2009-02-18 05:57:30 +07:00
|
|
|
#include <asm/cacheflush.h>
|
2012-05-04 20:26:16 +07:00
|
|
|
#include <asm/kprobes.h>
|
2008-06-22 01:17:27 +07:00
|
|
|
#include <asm/ftrace.h>
|
ftrace: use only 5 byte nops for x86
Mathieu Desnoyers revealed a bug in the original code. The nop that is
used to replace the mcount caller can be a two part nop. This runs the
risk where a process can be preempted after executing the first nop, but
before the second part of the nop.
The ftrace code calls kstop_machine to keep multiple CPUs from executing
code that is being modified, but it does not protect against a task preempting
in the middle of a two part nop.
If the above preemption happens and the tracer is enabled, after the
kstop_machine runs, all those nops will be calls to the trace function.
If the preempted process that was preempted between the two nops is executed
again, it will execute half of the call to the trace function, and this
might crash the system.
This patch instead uses what both the latest Intel and AMD specs suggest.
That is the P6_NOP5 sequence of "0x0f 0x1f 0x44 0x00 0x00".
Note, some older CPUs and QEMU might fault on this nop, so this nop
is executed with fault handling first. If it detects a fault, it will then
use the code "0x66 0x66 0x66 0x66 0x90". If that faults, it will then
default to a simple "jmp 1f; .byte 0x00 0x00 0x00; 1:". The jmp is
not optimal but will do if the first two can not be executed.
TODO: Examine the cpuid to determine the nop to use.
Signed-off-by: Steven Rostedt <srostedt@redhat.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2008-08-15 05:05:05 +07:00
|
|
|
#include <asm/nops.h>
|
2008-05-13 02:20:42 +07:00
|
|
|
|
2008-11-11 13:03:45 +07:00
|
|
|
#ifdef CONFIG_DYNAMIC_FTRACE
|
2008-05-13 02:20:42 +07:00
|
|
|
|
2009-02-18 05:57:30 +07:00
|
|
|
int ftrace_arch_code_modify_prepare(void)
|
|
|
|
{
|
|
|
|
set_kernel_text_rw();
|
2010-11-17 04:35:16 +07:00
|
|
|
set_all_modules_text_rw();
|
2009-02-18 05:57:30 +07:00
|
|
|
return 0;
|
|
|
|
}
|
|
|
|
|
|
|
|
int ftrace_arch_code_modify_post_process(void)
|
|
|
|
{
|
2010-11-17 04:35:16 +07:00
|
|
|
set_all_modules_text_ro();
|
2009-02-18 05:57:30 +07:00
|
|
|
set_kernel_text_ro();
|
|
|
|
return 0;
|
|
|
|
}
|
|
|
|
|
2008-05-13 02:20:42 +07:00
|
|
|
union ftrace_code_union {
|
2008-06-22 01:17:27 +07:00
|
|
|
char code[MCOUNT_INSN_SIZE];
|
2008-05-13 02:20:42 +07:00
|
|
|
struct {
|
2014-07-04 01:51:36 +07:00
|
|
|
unsigned char e8;
|
2008-05-13 02:20:42 +07:00
|
|
|
int offset;
|
|
|
|
} __attribute__((packed));
|
|
|
|
};
|
|
|
|
|
2008-10-23 20:33:08 +07:00
|
|
|
static int ftrace_calc_offset(long ip, long addr)
|
2008-05-13 02:20:43 +07:00
|
|
|
{
|
|
|
|
return (int)(addr - ip);
|
|
|
|
}
|
2008-05-13 02:20:42 +07:00
|
|
|
|
2008-11-15 07:21:19 +07:00
|
|
|
static unsigned char *ftrace_call_replace(unsigned long ip, unsigned long addr)
|
2008-05-13 02:20:43 +07:00
|
|
|
{
|
|
|
|
static union ftrace_code_union calc;
|
2008-05-13 02:20:42 +07:00
|
|
|
|
2008-05-13 02:20:43 +07:00
|
|
|
calc.e8 = 0xe8;
|
2008-06-22 01:17:27 +07:00
|
|
|
calc.offset = ftrace_calc_offset(ip + MCOUNT_INSN_SIZE, addr);
|
2008-05-13 02:20:43 +07:00
|
|
|
|
|
|
|
/*
|
|
|
|
* No locking needed, this must be called via kstop_machine
|
|
|
|
* which in essence is like running on a uniprocessor machine.
|
|
|
|
*/
|
|
|
|
return calc.code;
|
2008-05-13 02:20:42 +07:00
|
|
|
}
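The 5-byte instruction built above is an x86 near call: opcode 0xe8 followed by a 32-bit displacement measured from the end of the instruction. A minimal userspace sketch of that encoding follows. It assumes ftrace_calc_offset() returns the target minus the next-instruction address; INSN_SIZE, encode_call() and decode_call_target() are illustrative names that do not exist in the kernel.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

#define INSN_SIZE 5	/* stands in for MCOUNT_INSN_SIZE on x86 */

/* Encode "call target" as it would sit at address ip: 0xe8 + rel32. */
static void encode_call(unsigned char out[INSN_SIZE], uint64_t ip, uint64_t target)
{
	/* The displacement is relative to the end of the call instruction. */
	int64_t delta = (int64_t)(target - (ip + INSN_SIZE));
	int32_t rel = (int32_t)delta;

	assert(delta == rel);	/* target must be reachable with a rel32 */

	out[0] = 0xe8;
	/* Copied in host byte order, which matches x86 on a little-endian host. */
	memcpy(&out[1], &rel, sizeof(rel));
}

/* Recover the call target from an encoded instruction located at ip. */
static uint64_t decode_call_target(const unsigned char insn[INSN_SIZE], uint64_t ip)
{
	int32_t rel;

	memcpy(&rel, &insn[1], sizeof(rel));
	return ip + INSN_SIZE + (uint64_t)(int64_t)rel;
}
```

Encoding and then decoding at the same ip should return the original target, for both forward and backward calls.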
|
|
|
|
|
2009-10-29 09:46:57 +07:00
|
|
|
static inline int
|
|
|
|
within(unsigned long addr, unsigned long start, unsigned long end)
|
|
|
|
{
|
|
|
|
return addr >= start && addr < end;
|
|
|
|
}
|
|
|
|
|
2014-02-12 08:19:44 +07:00
|
|
|
static unsigned long text_ip_addr(unsigned long ip)
|
2008-10-31 03:08:32 +07:00
|
|
|
{
|
2009-10-29 09:46:57 +07:00
|
|
|
/*
|
2016-02-18 05:41:14 +07:00
|
|
|
* On x86_64, kernel text mappings are mapped read-only, so we use
|
|
|
|
* the kernel identity mapping instead of the kernel text mapping
|
|
|
|
* to modify the kernel text.
|
2009-10-29 09:46:57 +07:00
|
|
|
*
|
|
|
|
* For 32-bit kernels, these mappings are the same and we can use
|
|
|
|
* kernel identity mapping to modify code.
|
|
|
|
*/
|
|
|
|
if (within(ip, (unsigned long)_text, (unsigned long)_etext))
|
2012-11-17 04:57:32 +07:00
|
|
|
ip = (unsigned long)__va(__pa_symbol(ip));
|
2009-10-29 09:46:57 +07:00
|
|
|
|
2014-02-12 08:19:44 +07:00
|
|
|
return ip;
|
2008-10-31 03:08:32 +07:00
|
|
|
}
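The address translation above can be pictured as redirecting any address inside a read-only window to the same offset in a writable alias. The sketch below is a simplified userspace model, not the kernel mechanism: the kernel goes through the physical address with __va(__pa_symbol(ip)), whereas this version assumes a fixed alias base. struct text_map, in_range() and writable_addr() are hypothetical names.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical layout: a read-only text window and a writable alias of it. */
struct text_map {
	uintptr_t text_start;	/* inclusive */
	uintptr_t text_end;	/* exclusive */
	uintptr_t alias_start;	/* writable alias of text_start */
};

/* Half-open range check, same shape as within() above. */
static int in_range(uintptr_t addr, uintptr_t start, uintptr_t end)
{
	return addr >= start && addr < end;
}

/* Map a text address to its writable alias; leave other addresses alone. */
static uintptr_t writable_addr(const struct text_map *m, uintptr_t ip)
{
	if (in_range(ip, m->text_start, m->text_end))
		return m->alias_start + (ip - m->text_start);
	return ip;
}
```

Because the range is half-open, text_end itself is not translated.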
|
|
|
|
|
2011-04-19 05:19:51 +07:00
|
|
|
static const unsigned char *ftrace_nop_replace(void)
|
2008-11-11 13:03:45 +07:00
|
|
|
{
|
2011-04-19 05:19:51 +07:00
|
|
|
return ideal_nops[NOP_ATOMIC5];
|
2008-11-11 13:03:45 +07:00
|
|
|
}
|
|
|
|
|
2008-11-15 07:21:19 +07:00
|
|
|
static int
|
2012-05-31 00:36:38 +07:00
|
|
|
ftrace_modify_code_direct(unsigned long ip, unsigned const char *old_code,
|
2011-05-13 00:33:40 +07:00
|
|
|
unsigned const char *new_code)
|
2008-05-13 02:20:42 +07:00
|
|
|
{
|
2008-08-20 23:55:07 +07:00
|
|
|
unsigned char replaced[MCOUNT_INSN_SIZE];
|
2008-05-13 02:20:42 +07:00
|
|
|
|
2015-11-26 02:13:11 +07:00
|
|
|
ftrace_expected = old_code;
|
|
|
|
|
2008-05-13 02:20:42 +07:00
|
|
|
/*
|
2015-12-06 09:02:58 +07:00
|
|
|
* Note:
|
|
|
|
* We are paranoid about modifying text, as if a bug was to happen, it
|
|
|
|
* could cause us to read or write to someplace that could cause harm.
|
|
|
|
* Carefully read and modify the code with probe_kernel_*(), and make
|
|
|
|
* sure what we read is what we expected it to be before modifying it.
|
2008-05-13 02:20:42 +07:00
|
|
|
*/
|
2008-10-23 20:33:00 +07:00
|
|
|
|
|
|
|
/* read the text we want to modify */
|
2008-10-23 20:33:01 +07:00
|
|
|
if (probe_kernel_read(replaced, (void *)ip, MCOUNT_INSN_SIZE))
|
2008-10-23 20:32:59 +07:00
|
|
|
return -EFAULT;
|
2008-08-20 23:55:07 +07:00
|
|
|
|
2008-10-23 20:33:00 +07:00
|
|
|
/* Make sure it is what we expect it to be */
|
2008-08-20 23:55:07 +07:00
|
|
|
if (memcmp(replaced, old_code, MCOUNT_INSN_SIZE) != 0)
|
2008-10-23 20:32:59 +07:00
|
|
|
return -EINVAL;
|
2008-05-13 02:20:42 +07:00
|
|
|
|
2014-02-12 08:19:44 +07:00
|
|
|
ip = text_ip_addr(ip);
|
|
|
|
|
2008-10-23 20:33:00 +07:00
|
|
|
/* replace the text with the new text */
|
2014-02-12 08:19:44 +07:00
|
|
|
if (probe_kernel_write((void *)ip, new_code, MCOUNT_INSN_SIZE))
|
2008-10-23 20:32:59 +07:00
|
|
|
return -EPERM;
|
2008-08-20 23:55:07 +07:00
|
|
|
|
|
|
|
sync_core();
|
2008-05-13 02:20:42 +07:00
|
|
|
|
2008-08-20 23:55:07 +07:00
|
|
|
return 0;
|
2008-05-13 02:20:42 +07:00
|
|
|
}
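The read, compare, then write sequence above can be exercised outside the kernel on an ordinary byte buffer. This is only a sketch under that assumption: plain memcpy() stands in for probe_kernel_read() and probe_kernel_write(), so the -EFAULT and -EPERM paths cannot occur here, and patch_checked() is a hypothetical name.

```c
#include <assert.h>
#include <errno.h>
#include <string.h>

#define INSN_SIZE 5	/* stands in for MCOUNT_INSN_SIZE on x86 */

/*
 * Replace the instruction at site with replacement, but only if the
 * bytes currently there match expected. Returns 0 on success and
 * -EINVAL if the site holds something else, leaving it untouched.
 */
static int patch_checked(unsigned char *site,
			 const unsigned char *expected,
			 const unsigned char *replacement)
{
	unsigned char current[INSN_SIZE];

	/* Read the text we want to modify. */
	memcpy(current, site, INSN_SIZE);

	/* Make sure it is what we expect it to be. */
	if (memcmp(current, expected, INSN_SIZE) != 0)
		return -EINVAL;

	/* Replace the text with the new text. */
	memcpy(site, replacement, INSN_SIZE);
	return 0;
}
```

Applying the same patch twice is rejected the second time, because the site no longer holds the expected old bytes.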
|
|
|
|
|
2008-11-15 07:21:19 +07:00
|
|
|
int ftrace_make_nop(struct module *mod,
|
|
|
|
struct dyn_ftrace *rec, unsigned long addr)
|
|
|
|
{
|
2011-05-13 00:33:40 +07:00
|
|
|
unsigned const char *new, *old;
|
2008-11-15 07:21:19 +07:00
|
|
|
unsigned long ip = rec->ip;
|
|
|
|
|
|
|
|
old = ftrace_call_replace(ip, addr);
|
|
|
|
new = ftrace_nop_replace();
|
|
|
|
|
2012-05-31 00:36:38 +07:00
|
|
|
/*
|
|
|
|
* On boot up, and when modules are loaded, the MCOUNT_ADDR
|
|
|
|
* is converted to a nop, and will never become MCOUNT_ADDR
|
|
|
|
* again. This code is either running before SMP (on boot up)
|
|
|
|
* or before the code will ever be executed (module load).
|
|
|
|
* We do not want to use the breakpoint version in this case,
|
|
|
|
* just modify the code directly.
|
|
|
|
*/
|
|
|
|
if (addr == MCOUNT_ADDR)
|
|
|
|
return ftrace_modify_code_direct(rec->ip, old, new);
|
|
|
|
|
2015-11-26 02:13:11 +07:00
|
|
|
ftrace_expected = NULL;
|
|
|
|
|
2012-05-31 00:36:38 +07:00
|
|
|
/* Normal cases use add_brk_on_nop */
|
|
|
|
WARN_ONCE(1, "invalid use of ftrace_make_nop");
|
|
|
|
return -EINVAL;
|
2008-11-15 07:21:19 +07:00
|
|
|
}
|
|
|
|
|
|
|
|
int ftrace_make_call(struct dyn_ftrace *rec, unsigned long addr)
|
|
|
|
{
|
2011-05-13 00:33:40 +07:00
|
|
|
unsigned const char *new, *old;
|
2008-11-15 07:21:19 +07:00
|
|
|
unsigned long ip = rec->ip;
|
|
|
|
|
|
|
|
old = ftrace_nop_replace();
|
|
|
|
new = ftrace_call_replace(ip, addr);
|
|
|
|
|
2012-05-31 00:36:38 +07:00
|
|
|
/* Should only be called when module is loaded */
|
|
|
|
return ftrace_modify_code_direct(rec->ip, old, new);
|
2008-05-13 02:20:43 +07:00
|
|
|
}
|
|
|
|
|
2012-05-31 00:26:37 +07:00
|
|
|
/*
|
|
|
|
* The modifying_ftrace_code is used to tell the breakpoint
|
|
|
|
* handler to call ftrace_int3_handler(). If it fails to
|
|
|
|
* call this handler for a breakpoint added by ftrace, then
|
|
|
|
* the kernel may crash.
|
|
|
|
*
|
|
|
|
* As atomic_writes on x86 do not need a barrier, we do not
|
|
|
|
* need to add smp_mb()s for this to work. It is also considered
|
|
|
|
* that we can not read the modifying_ftrace_code before
|
|
|
|
* executing the breakpoint. That would be quite remarkable if
|
|
|
|
* it could do that. Here's the flow that is required:
|
|
|
|
*
|
|
|
|
* CPU-0 CPU-1
|
|
|
|
*
|
|
|
|
* atomic_inc(mfc);
|
|
|
|
* write int3s
|
|
|
|
* <trap-int3> // implicit (r)mb
|
|
|
|
* if (atomic_read(mfc))
|
|
|
|
* call ftrace_int3_handler()
|
|
|
|
*
|
|
|
|
* Then when we are finished:
|
|
|
|
*
|
|
|
|
* atomic_dec(mfc);
|
|
|
|
*
|
|
|
|
* If we hit a breakpoint that was not set by ftrace, it does not
|
|
|
|
* matter if ftrace_int3_handler() is called or not. It will
|
|
|
|
* simply be ignored. But it is crucial that a ftrace nop/caller
|
|
|
|
* breakpoint is handled. No other user should ever place a
|
|
|
|
* breakpoint on an ftrace nop/caller location. It must only
|
|
|
|
* be done by this code.
|
|
|
|
*/
|
|
|
|
atomic_t modifying_ftrace_code __read_mostly;
|
2011-08-16 20:57:10 +07:00
|
|
|
|
2012-05-31 00:36:38 +07:00
|
|
|
static int
|
|
|
|
ftrace_modify_code(unsigned long ip, unsigned const char *old_code,
|
|
|
|
unsigned const char *new_code);
|
|
|
|
|
2012-05-01 03:20:23 +07:00
|
|
|
/*
|
|
|
|
* Should never be called:
|
|
|
|
* As it is only called by __ftrace_replace_code() which is called by
|
|
|
|
* ftrace_replace_code() that x86 overrides, and by ftrace_update_code()
|
|
|
|
* which is called to turn mcount into nops or nops into function calls
|
|
|
|
* but not to convert a function from not using regs to one that uses
|
|
|
|
* regs, which ftrace_modify_call() is for.
|
|
|
|
*/
|
|
|
|
int ftrace_modify_call(struct dyn_ftrace *rec, unsigned long old_addr,
|
|
|
|
unsigned long addr)
|
|
|
|
{
|
|
|
|
WARN_ON(1);
|
2015-11-26 02:13:11 +07:00
|
|
|
ftrace_expected = NULL;
|
2012-05-01 03:20:23 +07:00
|
|
|
return -EINVAL;
|
|
|
|
}
|
|
|
|
|
2014-02-12 08:19:44 +07:00
|
|
|
static unsigned long ftrace_update_func;
|
|
|
|
|
|
|
|
static int update_ftrace_func(unsigned long ip, void *new)
|
2012-05-31 00:36:38 +07:00
|
|
|
{
|
2014-02-12 08:19:44 +07:00
|
|
|
unsigned char old[MCOUNT_INSN_SIZE];
|
2012-05-31 00:36:38 +07:00
|
|
|
int ret;
|
|
|
|
|
2014-02-12 08:19:44 +07:00
|
|
|
memcpy(old, (void *)ip, MCOUNT_INSN_SIZE);
|
|
|
|
|
|
|
|
ftrace_update_func = ip;
|
|
|
|
/* Make sure the breakpoints see the ftrace_update_func update */
|
|
|
|
smp_wmb();
|
2012-05-31 00:36:38 +07:00
|
|
|
|
|
|
|
/* See comment above by declaration of modifying_ftrace_code */
|
|
|
|
atomic_inc(&modifying_ftrace_code);
|
|
|
|
|
|
|
|
ret = ftrace_modify_code(ip, old, new);
|
|
|
|
|
2014-02-12 08:19:44 +07:00
|
|
|
atomic_dec(&modifying_ftrace_code);
|
|
|
|
|
|
|
|
return ret;
|
|
|
|
}
|
|
|
|
|
|
|
|
int ftrace_update_ftrace_func(ftrace_func_t func)
|
|
|
|
{
|
|
|
|
unsigned long ip = (unsigned long)(&ftrace_call);
|
|
|
|
unsigned char *new;
|
|
|
|
int ret;
|
|
|
|
|
|
|
|
new = ftrace_call_replace(ip, (unsigned long)func);
|
|
|
|
ret = update_ftrace_func(ip, new);
|
|
|
|
|
2012-05-01 03:20:23 +07:00
|
|
|
/* Also update the regs callback function */
|
|
|
|
if (!ret) {
|
|
|
|
ip = (unsigned long)(&ftrace_regs_call);
|
|
|
|
new = ftrace_call_replace(ip, (unsigned long)func);
|
2014-02-12 08:19:44 +07:00
|
|
|
ret = update_ftrace_func(ip, new);
|
2012-05-01 03:20:23 +07:00
|
|
|
}
|
|
|
|
|
2012-05-31 00:36:38 +07:00
|
|
|
return ret;
|
|
|
|
}
|
|
|
|
|
2013-10-23 19:58:16 +07:00
|
|
|
static int is_ftrace_caller(unsigned long ip)
|
|
|
|
{
|
2014-02-12 08:19:44 +07:00
|
|
|
if (ip == ftrace_update_func)
|
2013-10-23 19:58:16 +07:00
|
|
|
return 1;
|
|
|
|
|
|
|
|
return 0;
|
|
|
|
}
|
|
|
|
|
2011-08-16 20:57:10 +07:00
|
|
|
/*
|
|
|
|
* A breakpoint was added to the code address we are about to
|
|
|
|
* modify, and this is the handler that will just skip over it.
|
|
|
|
* We are either changing a nop into a trace call, or a trace
|
|
|
|
* call to a nop. While the change is taking place, we treat
|
|
|
|
* it just like it was a nop.
|
|
|
|
*/
|
|
|
|
int ftrace_int3_handler(struct pt_regs *regs)
|
|
|
|
{
|
2013-10-23 19:58:16 +07:00
|
|
|
unsigned long ip;
|
|
|
|
|
2011-08-16 20:57:10 +07:00
|
|
|
if (WARN_ON_ONCE(!regs))
|
|
|
|
return 0;
|
|
|
|
|
2013-10-23 19:58:16 +07:00
|
|
|
ip = regs->ip - 1;
|
|
|
|
if (!ftrace_location(ip) && !is_ftrace_caller(ip))
|
2011-08-16 20:57:10 +07:00
|
|
|
return 0;
|
|
|
|
|
|
|
|
regs->ip += MCOUNT_INSN_SIZE - 1;
|
|
|
|
|
|
|
|
return 1;
|
|
|
|
}
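The instruction-pointer arithmetic in the handler above is easy to get wrong by one byte, so here is a small userspace sketch of it. It assumes a single known patch site in place of the ftrace_location() and is_ftrace_caller() lookups; struct fake_regs and skip_patched_site() are hypothetical names.

```c
#include <assert.h>
#include <stdint.h>

#define INSN_SIZE 5	/* stands in for MCOUNT_INSN_SIZE on x86 */

/* Minimal stand-in for struct pt_regs: only the saved instruction pointer. */
struct fake_regs {
	uintptr_t ip;
};

/* Returns 1 if the trap came from patch_site and was skipped, else 0. */
static int skip_patched_site(struct fake_regs *regs, uintptr_t patch_site)
{
	/* int3 is one byte long, and the saved ip points just past it. */
	uintptr_t trap_addr = regs->ip - 1;

	if (trap_addr != patch_site)
		return 0;

	/*
	 * Resume after the whole 5-byte instruction, treating it as a nop:
	 * (site + 1) + (INSN_SIZE - 1) == site + INSN_SIZE.
	 */
	regs->ip += INSN_SIZE - 1;
	return 1;
}
```

A trap at a site starting at 0x1000 arrives with ip equal to 0x1001 and resumes at 0x1005; a trap from any other address leaves ip unchanged.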
|
|
|
|
|
|
|
|
static int ftrace_write(unsigned long ip, const char *val, int size)
|
|
|
|
{
|
2014-06-03 23:23:21 +07:00
|
|
|
ip = text_ip_addr(ip);
|
2011-08-16 20:57:10 +07:00
|
|
|
|
2014-02-26 09:33:59 +07:00
|
|
|
if (probe_kernel_write((void *)ip, val, size))
|
|
|
|
return -EPERM;
|
|
|
|
|
|
|
|
return 0;
|
2011-08-16 20:57:10 +07:00
|
|
|
}
|
|
|
|
|
|
|
|
static int add_break(unsigned long ip, const char *old)
|
|
|
|
{
|
|
|
|
unsigned char replaced[MCOUNT_INSN_SIZE];
|
|
|
|
unsigned char brk = BREAKPOINT_INSTRUCTION;
|
|
|
|
|
|
|
|
if (probe_kernel_read(replaced, (void *)ip, MCOUNT_INSN_SIZE))
|
|
|
|
return -EFAULT;
|
|
|
|
|
2015-11-26 02:13:11 +07:00
|
|
|
ftrace_expected = old;
|
|
|
|
|
2011-08-16 20:57:10 +07:00
|
|
|
/* Make sure it is what we expect it to be */
|
|
|
|
if (memcmp(replaced, old, MCOUNT_INSN_SIZE) != 0)
|
|
|
|
return -EINVAL;
|
|
|
|
|
2014-02-26 09:33:59 +07:00
|
|
|
return ftrace_write(ip, &brk, 1);
|
2011-08-16 20:57:10 +07:00
|
|
|
}
|
|
|
|
|
|
|
|
static int add_brk_on_call(struct dyn_ftrace *rec, unsigned long addr)
|
|
|
|
{
|
|
|
|
unsigned const char *old;
|
|
|
|
unsigned long ip = rec->ip;
|
|
|
|
|
|
|
|
old = ftrace_call_replace(ip, addr);
|
|
|
|
|
|
|
|
return add_break(rec->ip, old);
|
|
|
|
}
|
|
|
|
|
|
|
|
|
|
|
|
static int add_brk_on_nop(struct dyn_ftrace *rec)
|
|
|
|
{
|
|
|
|
unsigned const char *old;
|
|
|
|
|
|
|
|
old = ftrace_nop_replace();
|
|
|
|
|
|
|
|
return add_break(rec->ip, old);
|
|
|
|
}
|
|
|
|
|
|
|
|
static int add_breakpoints(struct dyn_ftrace *rec, int enable)
|
|
|
|
{
|
|
|
|
unsigned long ftrace_addr;
|
|
|
|
int ret;
|
|
|
|
|
2014-05-07 08:34:14 +07:00
|
|
|
ftrace_addr = ftrace_get_addr_curr(rec);
|
2011-08-16 20:57:10 +07:00
|
|
|
|
2014-05-07 03:26:39 +07:00
|
|
|
ret = ftrace_test_record(rec, enable);
|
2011-08-16 20:57:10 +07:00
|
|
|
|
|
|
|
switch (ret) {
|
|
|
|
case FTRACE_UPDATE_IGNORE:
|
|
|
|
return 0;
|
|
|
|
|
|
|
|
case FTRACE_UPDATE_MAKE_CALL:
|
|
|
|
/* converting nop to call */
|
|
|
|
return add_brk_on_nop(rec);
|
|
|
|
|
2012-05-01 03:20:23 +07:00
|
|
|
case FTRACE_UPDATE_MODIFY_CALL:
|
2011-08-16 20:57:10 +07:00
|
|
|
case FTRACE_UPDATE_MAKE_NOP:
|
|
|
|
/* converting a call to a nop */
|
|
|
|
return add_brk_on_call(rec, ftrace_addr);
|
|
|
|
}
|
|
|
|
return 0;
|
|
|
|
}
|
|
|
|
|
|
|
|
/*
|
|
|
|
* On error, we need to remove breakpoints. This needs to
|
|
|
|
* be done carefully. If the address does not currently have a
|
|
|
|
* breakpoint, we know we are done. Otherwise, we look at the
|
|
|
|
* remaining 4 bytes of the instruction. If it matches a nop
|
|
|
|
* we replace the breakpoint with the nop. Otherwise we replace
|
|
|
|
* it with the call instruction.
|
|
|
|
*/
|
|
|
|
static int remove_breakpoint(struct dyn_ftrace *rec)
|
|
|
|
{
|
|
|
|
unsigned char ins[MCOUNT_INSN_SIZE];
|
|
|
|
unsigned char brk = BREAKPOINT_INSTRUCTION;
|
|
|
|
const unsigned char *nop;
|
|
|
|
unsigned long ftrace_addr;
|
|
|
|
unsigned long ip = rec->ip;
|
|
|
|
|
|
|
|
/* If we fail the read, just give up */
|
|
|
|
if (probe_kernel_read(ins, (void *)ip, MCOUNT_INSN_SIZE))
|
|
|
|
return -EFAULT;
|
|
|
|
|
|
|
|
/* If this does not have a breakpoint, we are done */
|
|
|
|
if (ins[0] != brk)
|
2014-02-24 23:12:22 +07:00
|
|
|
return 0;
|
2011-08-16 20:57:10 +07:00
|
|
|
|
|
|
|
nop = ftrace_nop_replace();
|
|
|
|
|
|
|
|
/*
|
|
|
|
* If the last 4 bytes of the instruction do not match
|
|
|
|
* a nop, then we assume that this is a call to ftrace_addr.
|
|
|
|
*/
|
|
|
|
if (memcmp(&ins[1], &nop[1], MCOUNT_INSN_SIZE - 1) != 0) {
|
|
|
|
/*
|
|
|
|
* For extra paranoia, we check if the breakpoint is on
|
|
|
|
* a call that would actually jump to the ftrace_addr.
|
|
|
|
* If not, don't touch the breakpoint, we may just create
|
|
|
|
* a disaster.
|
|
|
|
*/
|
2014-05-07 08:34:14 +07:00
|
|
|
ftrace_addr = ftrace_get_addr_new(rec);
|
2012-05-01 03:20:23 +07:00
|
|
|
nop = ftrace_call_replace(ip, ftrace_addr);
|
|
|
|
|
|
|
|
if (memcmp(&ins[1], &nop[1], MCOUNT_INSN_SIZE - 1) == 0)
|
|
|
|
goto update;
|
|
|
|
|
|
|
|
/* Check both ftrace_addr and ftrace_old_addr */
|
2014-05-07 08:34:14 +07:00
|
|
|
ftrace_addr = ftrace_get_addr_curr(rec);
|
2011-08-16 20:57:10 +07:00
|
|
|
nop = ftrace_call_replace(ip, ftrace_addr);
|
|
|
|
|
2015-11-26 02:13:11 +07:00
|
|
|
ftrace_expected = nop;
|
|
|
|
|
2011-08-16 20:57:10 +07:00
|
|
|
if (memcmp(&ins[1], &nop[1], MCOUNT_INSN_SIZE - 1) != 0)
|
|
|
|
return -EINVAL;
|
|
|
|
}
|
|
|
|
|
2012-05-01 03:20:23 +07:00
|
|
|
update:
|
ftrace/x86: Run a sync after fixup on failure
If a failure occurs while enabling a trace, it bails out and will remove
the tracepoints to be back to what the code originally was. But the fix
up had some bugs in it. By injecting a failure in the code, the fix up
ran to completion, but shortly afterward the system rebooted.
There were two bugs here.
The first was that there was no final sync run across the CPUs after the
fix up was done, and before the ftrace int3 handler flag was reset. That
means that other CPUs could still see the breakpoint and trigger on it
long after the flag was cleared, and the int3 handler would think it was
a spurious interrupt. Worse yet, the int3 handler could hit other breakpoints
because the ftrace int3 handler flag would have prevented the int3 handler
from going further.
Here's a description of the issue:
CPU0 CPU1
---- ----
remove_breakpoint();
modifying_ftrace_code = 0;
[still sees breakpoint]
<takes trap>
[sees modifying_ftrace_code as zero]
[no breakpoint handler]
[goto failed case]
[trap exception - kernel breakpoint, no
handler]
BUG()
The second bug was that the removal of the breakpoints required the
"within()" logic updates instead of accessing the ip address directly.
As the kernel text is mapped read-only when CONFIG_DEBUG_RODATA is set, and
the removal of the breakpoint is a modification of the kernel text.
The ftrace_write() includes the "within()" logic, where as, the
probe_kernel_write() does not. This prevented the breakpoint from being
removed at all.
Link: http://lkml.kernel.org/r/1392650573-3390-1-git-send-email-pmladek@suse.cz
Reported-by: Petr Mladek <pmladek@suse.cz>
Tested-by: Petr Mladek <pmladek@suse.cz>
Acked-by: H. Peter Anvin <hpa@linux.intel.com>
Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
2014-02-21 22:43:12 +07:00
|
|
|
return ftrace_write(ip, nop, 1);
|
2011-08-16 20:57:10 +07:00
|
|
|
}
|
|
|
|
|
|
|
|
static int add_update_code(unsigned long ip, unsigned const char *new)
|
|
|
|
{
|
|
|
|
/* skip breakpoint */
|
|
|
|
ip++;
|
|
|
|
new++;
|
2014-02-26 09:33:59 +07:00
|
|
|
return ftrace_write(ip, new, MCOUNT_INSN_SIZE - 1);
|
2011-08-16 20:57:10 +07:00
|
|
|
}
|
|
|
|
|
|
|
|
static int add_update_call(struct dyn_ftrace *rec, unsigned long addr)
|
|
|
|
{
|
|
|
|
unsigned long ip = rec->ip;
|
|
|
|
unsigned const char *new;
|
|
|
|
|
|
|
|
new = ftrace_call_replace(ip, addr);
|
|
|
|
return add_update_code(ip, new);
|
|
|
|
}
|
|
|
|
|
|
|
|
static int add_update_nop(struct dyn_ftrace *rec)
|
|
|
|
{
|
|
|
|
unsigned long ip = rec->ip;
|
|
|
|
unsigned const char *new;
|
|
|
|
|
|
|
|
new = ftrace_nop_replace();
|
|
|
|
return add_update_code(ip, new);
|
|
|
|
}
|
|
|
|
|
|
|
|
static int add_update(struct dyn_ftrace *rec, int enable)
|
|
|
|
{
|
|
|
|
unsigned long ftrace_addr;
|
|
|
|
int ret;
|
|
|
|
|
|
|
|
ret = ftrace_test_record(rec, enable);
|
|
|
|
|
2014-05-07 08:34:14 +07:00
|
|
|
ftrace_addr = ftrace_get_addr_new(rec);
|
2011-08-16 20:57:10 +07:00
|
|
|
|
|
|
|
switch (ret) {
|
|
|
|
case FTRACE_UPDATE_IGNORE:
|
|
|
|
return 0;
|
|
|
|
|
2012-05-01 03:20:23 +07:00
|
|
|
case FTRACE_UPDATE_MODIFY_CALL:
|
2011-08-16 20:57:10 +07:00
|
|
|
case FTRACE_UPDATE_MAKE_CALL:
|
|
|
|
/* converting nop to call */
|
|
|
|
return add_update_call(rec, ftrace_addr);
|
|
|
|
|
|
|
|
case FTRACE_UPDATE_MAKE_NOP:
|
|
|
|
/* converting a call to a nop */
|
|
|
|
return add_update_nop(rec);
|
|
|
|
}
|
|
|
|
|
|
|
|
return 0;
|
|
|
|
}

static int finish_update_call(struct dyn_ftrace *rec, unsigned long addr)
{
	unsigned long ip = rec->ip;
	unsigned const char *new;

	new = ftrace_call_replace(ip, addr);

2014-02-26 09:33:59 +07:00
	return ftrace_write(ip, new, 1);
2011-08-16 20:57:10 +07:00
}

static int finish_update_nop(struct dyn_ftrace *rec)
{
	unsigned long ip = rec->ip;
	unsigned const char *new;

	new = ftrace_nop_replace();

2014-02-26 09:33:59 +07:00
	return ftrace_write(ip, new, 1);
2011-08-16 20:57:10 +07:00
}

static int finish_update(struct dyn_ftrace *rec, int enable)
{
	unsigned long ftrace_addr;
	int ret;

	ret = ftrace_update_record(rec, enable);

2014-05-07 08:34:14 +07:00
	ftrace_addr = ftrace_get_addr_new(rec);
2011-08-16 20:57:10 +07:00

	switch (ret) {
	case FTRACE_UPDATE_IGNORE:
		return 0;

2012-05-01 03:20:23 +07:00
	case FTRACE_UPDATE_MODIFY_CALL:
2011-08-16 20:57:10 +07:00
	case FTRACE_UPDATE_MAKE_CALL:
		/* converting nop to call */
		return finish_update_call(rec, ftrace_addr);

	case FTRACE_UPDATE_MAKE_NOP:
		/* converting a call to a nop */
		return finish_update_nop(rec);
	}

	return 0;
}

static void do_sync_core(void *data)
{
	sync_core();
}

static void run_sync(void)
{
2017-03-28 20:58:21 +07:00
	int enable_irqs;

	/* No need to sync if there's only one CPU */
	if (num_online_cpus() == 1)
		return;

	enable_irqs = irqs_disabled();
2011-08-16 20:57:10 +07:00

2017-03-10 07:16:31 +07:00
	/* We may be called with interrupts disabled (on bootup). */
2011-08-16 20:57:10 +07:00
	if (enable_irqs)
		local_irq_enable();
	on_each_cpu(do_sync_core, NULL, 1);
	if (enable_irqs)
		local_irq_disable();
}

2012-04-27 20:13:18 +07:00
void ftrace_replace_code(int enable)
2011-08-16 20:57:10 +07:00
{
	struct ftrace_rec_iter *iter;
	struct dyn_ftrace *rec;
	const char *report = "adding breakpoints";
	int count = 0;
	int ret;

	for_ftrace_rec_iter(iter) {
		rec = ftrace_rec_iter_record(iter);

		ret = add_breakpoints(rec, enable);
		if (ret)
			goto remove_breakpoints;
		count++;
	}

	run_sync();

	report = "updating code";
2015-09-16 23:19:42 +07:00
	count = 0;
2011-08-16 20:57:10 +07:00

	for_ftrace_rec_iter(iter) {
		rec = ftrace_rec_iter_record(iter);

		ret = add_update(rec, enable);
		if (ret)
			goto remove_breakpoints;
2015-09-16 23:19:42 +07:00
		count++;
2011-08-16 20:57:10 +07:00
	}

	run_sync();

	report = "removing breakpoints";
2015-09-16 23:19:42 +07:00
	count = 0;
2011-08-16 20:57:10 +07:00

	for_ftrace_rec_iter(iter) {
		rec = ftrace_rec_iter_record(iter);

		ret = finish_update(rec, enable);
		if (ret)
			goto remove_breakpoints;
2015-09-16 23:19:42 +07:00
		count++;
2011-08-16 20:57:10 +07:00
	}

	run_sync();

	return;

 remove_breakpoints:
2014-02-17 22:22:53 +07:00
	pr_warn("Failed on %s (%d):\n", report, count);
2014-10-25 04:56:04 +07:00
	ftrace_bug(ret, rec);
2011-08-16 20:57:10 +07:00
	for_ftrace_rec_iter(iter) {
		rec = ftrace_rec_iter_record(iter);
2014-02-24 23:12:22 +07:00
		/*
		 * Breakpoints are handled only when this function is in
		 * progress. The system could not work with them.
		 */
		if (remove_breakpoint(rec))
			BUG();
2011-08-16 20:57:10 +07:00
	}
ftrace/x86: Run a sync after fixup on failure
If a failure occurs while enabling a trace, it bails out and will remove
the tracepoints to be back to what the code originally was. But the fix
up had some bugs in it. By injecting a failure in the code, the fix up
ran to completion, but shortly afterward the system rebooted.
There were two bugs here.
The first was that there was no final sync run across the CPUs after the
fix up was done, and before the ftrace int3 handler flag was reset. That
means that other CPUs could still see the breakpoint and trigger on it
long after the flag was cleared, and the int3 handler would think it was
a spurious interrupt. Worse yet, the int3 handler could hit other breakpoints
because the ftrace int3 handler flag would have prevented the int3 handler
from going further.
Here's a description of the issue:
	CPU0				CPU1
	----				----
  remove_breakpoint();
  modifying_ftrace_code = 0;
				[still sees breakpoint]
				<takes trap>
				[sees modifying_ftrace_code as zero]
				[no breakpoint handler]
				[goto failed case]
				[trap exception - kernel breakpoint, no
				 handler]
				BUG()
The second bug was that removing the breakpoints required the
"within()" logic updates instead of accessing the ip address directly,
because the kernel text is mapped read-only when CONFIG_DEBUG_RODATA is set
and removing a breakpoint is a modification of the kernel text.
ftrace_write() includes the "within()" logic, whereas
probe_kernel_write() does not. This prevented the breakpoint from being
removed at all.
Link: http://lkml.kernel.org/r/1392650573-3390-1-git-send-email-pmladek@suse.cz
Reported-by: Petr Mladek <pmladek@suse.cz>
Tested-by: Petr Mladek <pmladek@suse.cz>
Acked-by: H. Peter Anvin <hpa@linux.intel.com>
Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
2014-02-21 22:43:12 +07:00
	run_sync();
2011-08-16 20:57:10 +07:00
}
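The three passes in ftrace_replace_code() (add breakpoints, update the instruction tail, replace the breakpoint) can be illustrated on an ordinary byte buffer. This is only a userspace sketch under stated assumptions: sim_add_break(), sim_add_update(), sim_finish_update(), INSN_SIZE and BRK_OPCODE are hypothetical names, not kernel interfaces, and the run_sync() calls that separate the passes in the real code appear here only as comments.

```c
#include <assert.h>
#include <string.h>

#define INSN_SIZE  5     /* hypothetical stand-in for MCOUNT_INSN_SIZE on x86 */
#define BRK_OPCODE 0xcc  /* int3 */

/* Pass 1: overwrite only the first byte with a breakpoint. */
static void sim_add_break(unsigned char *site)
{
	site[0] = BRK_OPCODE;
	/* the real code calls run_sync() after this pass */
}

/*
 * Pass 2: write every byte except the first one. This mirrors
 * add_update_code(), which skips the breakpoint with ip++ and new++
 * and then writes MCOUNT_INSN_SIZE - 1 bytes.
 */
static void sim_add_update(unsigned char *site, const unsigned char *new_insn)
{
	memcpy(site + 1, new_insn + 1, INSN_SIZE - 1);
	/* the real code calls run_sync() after this pass */
}

/* Pass 3: replace the breakpoint with the first byte of the new instruction. */
static void sim_finish_update(unsigned char *site, const unsigned char *new_insn)
{
	site[0] = new_insn[0];
	/* the real code calls run_sync() after this pass */
}
```

The point of the ordering is that a CPU fetching from the site between passes sees either the breakpoint byte or a complete instruction, never a mix of old and new operand bytes behind a valid opcode.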

2012-05-31 00:36:38 +07:00
static int
ftrace_modify_code(unsigned long ip, unsigned const char *old_code,
		   unsigned const char *new_code)
{
	int ret;

	ret = add_break(ip, old_code);
	if (ret)
		goto out;

	run_sync();

	ret = add_update_code(ip, new_code);
	if (ret)
		goto fail_update;

	run_sync();

	ret = ftrace_write(ip, new_code, 1);
2014-02-24 23:12:22 +07:00
	/*
	 * The breakpoint is handled only when this function is in progress.
	 * The system could not work if we could not remove it.
	 */
	BUG_ON(ret);
2012-05-31 00:36:38 +07:00
 out:
2014-02-24 23:12:20 +07:00
	run_sync();
2012-05-31 00:36:38 +07:00
	return ret;

 fail_update:
2014-02-24 23:12:22 +07:00
	/* Also here the system could not work with the breakpoint */
	if (ftrace_write(ip, old_code, 1))
		BUG();
2012-05-31 00:36:38 +07:00
	goto out;
}

2011-08-16 20:57:10 +07:00
void arch_ftrace_update_code(int command)
{
2012-05-31 00:26:37 +07:00
	/* See comment above by declaration of modifying_ftrace_code */
	atomic_inc(&modifying_ftrace_code);
2011-08-16 20:57:10 +07:00

2012-04-27 20:13:18 +07:00
	ftrace_modify_all_code(command);
2011-08-16 20:57:10 +07:00

2012-05-31 00:26:37 +07:00
	atomic_dec(&modifying_ftrace_code);
2011-08-16 20:57:10 +07:00
}

2014-02-25 01:59:59 +07:00
int __init ftrace_dyn_arch_init(void)
2008-05-13 02:20:42 +07:00
{
	return 0;
}
2008-11-26 12:16:24 +07:00

ftrace/x86: Add dynamic allocated trampoline for ftrace_ops
The current method of handling multiple function callbacks is to register
a list function callback that calls all the other callbacks based on
their hash tables and compare it to the function that the callback was
called on. But this is very inefficient.
For example, if you are tracing all functions in the kernel and then
add a kprobe to a function such that the kprobe uses ftrace, the
mcount trampoline will switch from calling the function trace callback
to calling the list callback that will iterate over all registered
ftrace_ops (in this case, the function tracer and the kprobes callback).
That means for every function being traced it checks the hash of the
ftrace_ops for function tracing and kprobes, even though the kprobes
is only set at a single function. The kprobes ftrace_ops is checked
for every function being traced!
Instead of calling the list function for functions that are only being
traced by a single callback, we can call a dynamically allocated
trampoline that calls the callback directly. The function graph tracer
already uses a direct call trampoline when it is being traced by itself
but it is not dynamically allocated. Its trampoline is static in the
kernel core. The infrastructure that called the function graph trampoline
can also be used to call a dynamically allocated one.
For now, only ftrace_ops that are not dynamically allocated can have
a trampoline. That is, users such as function tracer or stack tracer.
kprobes and perf allocate their ftrace_ops, and until there's a safe
way to free the trampoline, it can not be used. The dynamically allocated
ftrace_ops may, however, use the trampoline if the kernel is not
compiled with CONFIG_PREEMPT. But that will come later.
Tested-by: Masami Hiramatsu <masami.hiramatsu.pt@hitachi.com>
Tested-by: Jiri Kosina <jkosina@suse.cz>
Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
2014-07-03 10:23:31 +07:00
#if defined(CONFIG_X86_64) || defined(CONFIG_FUNCTION_GRAPH_TRACER)
2014-02-12 08:19:44 +07:00
static unsigned char *ftrace_jmp_replace(unsigned long ip, unsigned long addr)
2008-11-26 12:16:24 +07:00
{
2014-02-12 08:19:44 +07:00
	static union ftrace_code_union calc;
2008-11-26 12:16:24 +07:00

2014-02-12 08:19:44 +07:00
	/* Jmp not a call (ignore the .e8) */
	calc.e8 = 0xe9;
	calc.offset = ftrace_calc_offset(ip + MCOUNT_INSN_SIZE, addr);
2008-11-26 12:16:24 +07:00

2014-02-12 08:19:44 +07:00
	/*
	 * ftrace external locks synchronize the access to the static variable.
	 */
	return calc.code;
}
2014-07-03 10:23:31 +07:00
#endif
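ftrace_jmp_replace() builds a 5-byte near jump: opcode 0xe9 followed by a 32-bit displacement measured from the end of the instruction. The following is a userspace sketch of that arithmetic, assuming a little-endian x86 host. encode_jmp_rel32(), decode_jmp_target() and JMP_INSN_SIZE are hypothetical names used for illustration only; the kernel uses its own ftrace_code_union and ftrace_calc_offset() instead.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

#define JMP_INSN_SIZE 5  /* one opcode byte plus a 32-bit displacement */

/* Encode "jmp rel32" located at ip and targeting addr into out[]. */
static void encode_jmp_rel32(uint64_t ip, uint64_t addr,
			     unsigned char out[JMP_INSN_SIZE])
{
	/* The displacement is relative to the address of the next instruction. */
	int32_t rel = (int32_t)(addr - (ip + JMP_INSN_SIZE));

	out[0] = 0xe9;
	memcpy(out + 1, &rel, sizeof(rel));
}

/* Recover the jump target from an encoded instruction located at ip. */
static uint64_t decode_jmp_target(uint64_t ip,
				  const unsigned char in[JMP_INSN_SIZE])
{
	int32_t rel;

	memcpy(&rel, in + 1, sizeof(rel));
	return ip + JMP_INSN_SIZE + (int64_t)rel;
}
```

The sketch does not check that the target is within the signed 32-bit range of the jump site; a real caller would need that guarantee from wherever the trampoline memory is allocated.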

/* Currently only x86_64 supports dynamic trampolines */
#ifdef CONFIG_X86_64

#ifdef CONFIG_MODULES
#include <linux/moduleloader.h>
/* Module allocation simplifies allocating memory for code */
static inline void *alloc_tramp(unsigned long size)
{
	return module_alloc(size);
}
static inline void tramp_free(void *tramp)
{
2015-01-20 05:37:05 +07:00
	module_memfree(tramp);
2014-07-03 10:23:31 +07:00
}
#else
/* Trampolines can only be created if modules are supported */
static inline void *alloc_tramp(unsigned long size)
{
	return NULL;
}
static inline void tramp_free(void *tramp) { }
#endif

/* Defined as markers to the end of the ftrace default trampolines */
extern void ftrace_regs_caller_end(void);
2016-02-16 15:43:21 +07:00
extern void ftrace_epilogue(void);
2014-07-03 10:23:31 +07:00
extern void ftrace_caller_op_ptr(void);
extern void ftrace_regs_caller_op_ptr(void);

/* movq function_trace_op(%rip), %rdx */
/* 0x48 0x8b 0x15 <offset-to-ftrace_trace_op (4 bytes)> */
#define OP_REF_SIZE	7

/*
 * The ftrace_ops is passed to the function callback. Since the
 * trampoline only services a single ftrace_ops, we can pass in
 * that ops directly.
 *
 * The ftrace_op_code_union is used to create a pointer to the
 * ftrace_ops that will be passed to the callback function.
 */
union ftrace_op_code_union {
	char code[OP_REF_SIZE];
	struct {
		char op[3];
		int offset;
	} __attribute__((packed));
};

2014-11-19 09:14:11 +07:00
static unsigned long
create_trampoline(struct ftrace_ops *ops, unsigned int *tramp_size)
2014-07-03 10:23:31 +07:00
{
	unsigned const char *jmp;
	unsigned long start_offset;
	unsigned long end_offset;
	unsigned long op_offset;
	unsigned long offset;
	unsigned long size;
	unsigned long ip;
	unsigned long *ptr;
	void *trampoline;
	/* 48 8b 15 <offset> is movq <offset>(%rip), %rdx */
	unsigned const char op_ref[] = { 0x48, 0x8b, 0x15 };
	union ftrace_op_code_union op_ptr;
	int ret;

	if (ops->flags & FTRACE_OPS_FL_SAVE_REGS) {
		start_offset = (unsigned long)ftrace_regs_caller;
		end_offset = (unsigned long)ftrace_regs_caller_end;
		op_offset = (unsigned long)ftrace_regs_caller_op_ptr;
	} else {
		start_offset = (unsigned long)ftrace_caller;
2016-02-16 15:43:21 +07:00
		end_offset = (unsigned long)ftrace_epilogue;
2014-07-03 10:23:31 +07:00
		op_offset = (unsigned long)ftrace_caller_op_ptr;
	}

	size = end_offset - start_offset;

	/*
	 * Allocate enough size to store the ftrace_caller code,
2016-02-16 15:43:21 +07:00
	 * the jmp to ftrace_epilogue, as well as the address of
2014-07-03 10:23:31 +07:00
	 * the ftrace_ops this trampoline is used for.
	 */
	trampoline = alloc_tramp(size + MCOUNT_INSN_SIZE + sizeof(void *));
	if (!trampoline)
		return 0;

2014-11-19 09:14:11 +07:00
	*tramp_size = size + MCOUNT_INSN_SIZE + sizeof(void *);

2014-07-03 10:23:31 +07:00
	/* Copy ftrace_caller onto the trampoline memory */
	ret = probe_kernel_read(trampoline, (void *)start_offset, size);
	if (WARN_ON(ret < 0)) {
		tramp_free(trampoline);
		return 0;
	}

	ip = (unsigned long)trampoline + size;

2016-02-16 15:43:21 +07:00
	/* The trampoline ends with a jmp to ftrace_epilogue */
	jmp = ftrace_jmp_replace(ip, (unsigned long)ftrace_epilogue);
2014-07-03 10:23:31 +07:00
	memcpy(trampoline + size, jmp, MCOUNT_INSN_SIZE);

	/*
	 * The address of the ftrace_ops that is used for this trampoline
	 * is stored at the end of the trampoline. This will be used to
	 * load the third parameter for the callback. Basically, that
	 * location at the end of the trampoline takes the place of
	 * the global function_trace_op variable.
	 */

	ptr = (unsigned long *)(trampoline + size + MCOUNT_INSN_SIZE);
	*ptr = (unsigned long)ops;

	op_offset -= start_offset;
	memcpy(&op_ptr, trampoline + op_offset, OP_REF_SIZE);

	/* Are we pointing to the reference? */
	if (WARN_ON(memcmp(op_ptr.op, op_ref, 3) != 0)) {
		tramp_free(trampoline);
		return 0;
	}

	/* Load the contents of ptr into the callback parameter */
	offset = (unsigned long)ptr;
	offset -= (unsigned long)trampoline + op_offset + OP_REF_SIZE;

	op_ptr.offset = offset;

	/* put in the new offset to the ftrace_ops */
	memcpy(trampoline + op_offset, &op_ptr, OP_REF_SIZE);

	/* ALLOC_TRAMP flags lets us know we created it */
	ops->flags |= FTRACE_OPS_FL_ALLOC_TRAMP;

	return (unsigned long)trampoline;
}
|
|
|
|
|
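The offset arithmetic at the end of create_trampoline() can be tried out in user space. The sketch below makes assumptions that this excerpt does not show: that OP_REF_SIZE is 7 and that the patched reference is a `movq disp32(%rip), %rdx` instruction (opcode bytes 48 8b 15). It works on offsets into a byte buffer instead of kernel addresses, and `patch_op_ref()` and `op_ref_target()` are hypothetical helper names.

```c
#include <stdint.h>
#include <string.h>

#define OP_REF_SIZE 7	/* assumed: 3 opcode bytes + 4-byte displacement */

/* Opcode bytes of "movq disp32(%rip), %rdx" (assumed reference insn). */
static const unsigned char op_ref[3] = { 0x48, 0x8b, 0x15 };

/*
 * Patch the RIP-relative load at buf + op_offset so that it reads the
 * slot at buf + slot_offset. Returns 0 on success, or -1 if the bytes
 * at op_offset are not the expected instruction.
 */
static int patch_op_ref(unsigned char *buf, size_t op_offset, size_t slot_offset)
{
	int32_t disp;

	if (memcmp(buf + op_offset, op_ref, sizeof(op_ref)) != 0)
		return -1;

	/* RIP-relative displacements count from the end of the instruction. */
	disp = (int32_t)((long)slot_offset - (long)(op_offset + OP_REF_SIZE));
	memcpy(buf + op_offset + sizeof(op_ref), &disp, sizeof(disp));
	return 0;
}

/* Resolve the buffer offset that the patched instruction loads from. */
static size_t op_ref_target(const unsigned char *buf, size_t op_offset)
{
	int32_t disp;

	memcpy(&disp, buf + op_offset + sizeof(op_ref), sizeof(disp));
	return op_offset + OP_REF_SIZE + disp;
}
```

The opcode check mirrors the WARN_ON() in the kernel code: the displacement is only rewritten when the bytes at the computed offset really are the expected reference.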
2014-07-04 01:51:36 +07:00
static unsigned long calc_trampoline_call_offset(bool save_regs)
{
	unsigned long start_offset;
	unsigned long call_offset;

	if (save_regs) {
		start_offset = (unsigned long)ftrace_regs_caller;
		call_offset = (unsigned long)ftrace_regs_call;
	} else {
		start_offset = (unsigned long)ftrace_caller;
		call_offset = (unsigned long)ftrace_call;
	}

	return call_offset - start_offset;
}

2014-07-03 10:23:31 +07:00
void arch_ftrace_update_trampoline(struct ftrace_ops *ops)
{
	ftrace_func_t func;
	unsigned char *new;
	unsigned long offset;
	unsigned long ip;
2014-11-19 09:14:11 +07:00
	unsigned int size;
2014-07-03 10:23:31 +07:00
	int ret;

	if (ops->trampoline) {
		/*
		 * The ftrace_ops caller may set up its own trampoline.
		 * In such a case, this code must not modify it.
		 */
		if (!(ops->flags & FTRACE_OPS_FL_ALLOC_TRAMP))
			return;
	} else {
2014-11-19 09:14:11 +07:00
		ops->trampoline = create_trampoline(ops, &size);
2014-07-03 10:23:31 +07:00
		if (!ops->trampoline)
			return;
2014-11-19 09:14:11 +07:00
		ops->trampoline_size = size;
2014-07-03 10:23:31 +07:00
	}

2014-07-04 01:51:36 +07:00
	offset = calc_trampoline_call_offset(ops->flags & FTRACE_OPS_FL_SAVE_REGS);
2014-07-03 10:23:31 +07:00
	ip = ops->trampoline + offset;

	func = ftrace_ops_get_func(ops);

	/* Do a safe modify in case the trampoline is executing */
	new = ftrace_call_replace(ip, (unsigned long)func);
	ret = update_ftrace_func(ip, new);

	/* The update should never fail */
	WARN_ON(ret);
}
2014-07-04 01:51:36 +07:00

/* Return the address of the function the trampoline calls */
static void *addr_from_call(void *ptr)
{
	union ftrace_code_union calc;
	int ret;

	ret = probe_kernel_read(&calc, ptr, MCOUNT_INSN_SIZE);
	if (WARN_ON_ONCE(ret < 0))
		return NULL;

	/* Make sure this is a call */
	if (WARN_ON_ONCE(calc.e8 != 0xe8)) {
		pr_warn("Expected e8, got %x\n", calc.e8);
		return NULL;
	}

	return ptr + MCOUNT_INSN_SIZE + calc.offset;
}

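addr_from_call() relies on the layout of an x86 near call: the opcode byte 0xe8 followed by a 32-bit displacement that is relative to the address of the next instruction. A user-space sketch of that arithmetic, working on offsets into a byte buffer instead of kernel addresses, could look as follows. `call_target()` and `encode_call()` are hypothetical helpers, and a little-endian host is assumed so that memcpy() produces the x86 byte order.

```c
#include <stdint.h>
#include <string.h>

#define CALL_INSN_SIZE 5	/* 0xe8 opcode + 32-bit displacement */

/*
 * Return the buffer offset that the call at buf + pos targets, or -1
 * if the byte at pos is not a near call opcode.
 */
static long call_target(const unsigned char *buf, long pos)
{
	int32_t rel;

	if (buf[pos] != 0xe8)
		return -1;
	memcpy(&rel, buf + pos + 1, sizeof(rel));
	/* The displacement counts from the end of the call instruction. */
	return pos + CALL_INSN_SIZE + rel;
}

/* Encode a near call at buf + pos that targets buffer offset target. */
static void encode_call(unsigned char *buf, long pos, long target)
{
	int32_t rel = (int32_t)(target - (pos + CALL_INSN_SIZE));

	buf[pos] = 0xe8;
	memcpy(buf + pos + 1, &rel, sizeof(rel));
}
```

Encoding and decoding are inverses of each other, which is why the same "address after the instruction plus displacement" expression appears both where a call is written and where it is read back.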
2014-11-25 09:00:34 +07:00
void prepare_ftrace_return(unsigned long self_addr, unsigned long *parent,
2014-07-04 01:51:36 +07:00
			   unsigned long frame_pointer);

/*
 * If the ops->trampoline was not allocated, then it probably
 * has a static trampoline func, or is the ftrace caller itself.
 */
static void *static_tramp_func(struct ftrace_ops *ops, struct dyn_ftrace *rec)
{
	unsigned long offset;
	bool save_regs = rec->flags & FTRACE_FL_REGS_EN;
	void *ptr;

	if (ops && ops->trampoline) {
#ifdef CONFIG_FUNCTION_GRAPH_TRACER
		/*
		 * We only know about function graph tracer setting as static
		 * trampoline.
		 */
		if (ops->trampoline == FTRACE_GRAPH_ADDR)
			return (void *)prepare_ftrace_return;
#endif
		return NULL;
	}

	offset = calc_trampoline_call_offset(save_regs);

	if (save_regs)
		ptr = (void *)FTRACE_REGS_ADDR + offset;
	else
		ptr = (void *)FTRACE_ADDR + offset;

	return addr_from_call(ptr);
}

void *arch_ftrace_trampoline_func(struct ftrace_ops *ops, struct dyn_ftrace *rec)
{
	unsigned long offset;

	/* If we didn't allocate this trampoline, consider it static */
	if (!ops || !(ops->flags & FTRACE_OPS_FL_ALLOC_TRAMP))
		return static_tramp_func(ops, rec);

	offset = calc_trampoline_call_offset(ops->flags & FTRACE_OPS_FL_SAVE_REGS);
	return addr_from_call((void *)ops->trampoline + offset);
}

2014-07-04 02:48:16 +07:00
void arch_ftrace_trampoline_free(struct ftrace_ops *ops)
{
	if (!ops || !(ops->flags & FTRACE_OPS_FL_ALLOC_TRAMP))
		return;

	tramp_free((void *)ops->trampoline);
	ops->trampoline = 0;
}
2014-07-04 01:51:36 +07:00
2014-07-03 10:23:31 +07:00
#endif /* CONFIG_X86_64 */
#endif /* CONFIG_DYNAMIC_FTRACE */

#ifdef CONFIG_FUNCTION_GRAPH_TRACER

#ifdef CONFIG_DYNAMIC_FTRACE
extern void ftrace_graph_call(void);
2008-11-26 12:16:24 +07:00

2014-02-12 08:19:44 +07:00
static int ftrace_mod_jmp(unsigned long ip, void *func)
{
	unsigned char *new;
2008-11-26 12:16:24 +07:00

2014-02-12 08:19:44 +07:00
	new = ftrace_jmp_replace(ip, (unsigned long)func);
2008-11-26 12:16:24 +07:00

2014-02-12 08:19:44 +07:00
	return update_ftrace_func(ip, new);
2008-11-26 12:16:24 +07:00
}

int ftrace_enable_ftrace_graph_caller(void)
{
	unsigned long ip = (unsigned long)(&ftrace_graph_call);

2014-02-12 08:19:44 +07:00
	return ftrace_mod_jmp(ip, &ftrace_graph_caller);
2008-11-26 12:16:24 +07:00
}

int ftrace_disable_ftrace_graph_caller(void)
{
	unsigned long ip = (unsigned long)(&ftrace_graph_call);

2014-02-12 08:19:44 +07:00
	return ftrace_mod_jmp(ip, &ftrace_stub);
2008-11-26 12:16:24 +07:00
}

2008-11-16 12:02:06 +07:00
#endif /* CONFIG_DYNAMIC_FTRACE */

/*
 * Hook the return address and push it in the stack of return addrs
 * in current thread info.
 */
2014-11-25 09:00:34 +07:00
void prepare_ftrace_return(unsigned long self_addr, unsigned long *parent,
function-graph: add stack frame test
In case gcc does something funny with the stack frames, or the return
from function code, we would like to detect that.
An arch may implement passing of a variable that is unique to the
function and can be saved on entering a function and can be tested
when exiting the function. Usually the frame pointer can be used for
this purpose.
This patch also implements this for x86, where it passes in the stack
frame of the parent function and tests that frame on exit.
There was a case in x86_32 with optimize for size (-Os) where, for a
few functions, gcc would align the stack frame and place a copy of the
return address into it. The function graph tracer modified the copy and
not the actual return address. On return from the function, it did not go
to the tracer hook, but returned to the parent. This broke the function
graph tracer, because the return of the parent (where gcc did not do
this funky manipulation) returned to the location that the child function
was supposed to. This caused strange kernel crashes.
This test detected the problem and pointed out where the issue was.
This modifies the parameters of one of the functions that the arch
specific code calls, so it includes changes to arch code to accommodate
the new prototype.
Note, I noticed that the parisc arch implements its own push_return_trace.
This is now a generic function and the ftrace_push_return_trace should be
used instead. This patch does not touch that code.
Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Heiko Carstens <heiko.carstens@de.ibm.com>
Cc: Martin Schwidefsky <schwidefsky@de.ibm.com>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Helge Deller <deller@gmx.de>
Cc: Kyle McMartin <kyle@mcmartin.ca>
Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
2009-06-18 23:45:08 +07:00
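The idea of the frame test can be modelled with a small shadow stack in user-space C. The sketch below is a simplification under assumed names (`struct ret_entry`, `push_return()`, `pop_return()`), not the kernel's ftrace_push_return_trace(): on function entry the real return address is saved together with the frame pointer, and on exit the saved address is only handed back if the frame pointer still matches.

```c
#define RET_STACK_DEPTH 16

/* One saved return: where to go back to, and which frame it belongs to. */
struct ret_entry {
	unsigned long ret;	/* original return address */
	unsigned long fp;	/* frame pointer seen at function entry */
};

static struct ret_entry ret_stack[RET_STACK_DEPTH];
static int curr_ret_stack = -1;

/* On entry: remember the real return address and the caller's frame. */
static int push_return(unsigned long ret, unsigned long fp)
{
	if (curr_ret_stack >= RET_STACK_DEPTH - 1)
		return -1;	/* shadow stack full */
	curr_ret_stack++;
	ret_stack[curr_ret_stack].ret = ret;
	ret_stack[curr_ret_stack].fp = fp;
	return 0;
}

/*
 * On exit: hand back the saved return address, but only if the frame
 * pointer matches the one recorded on entry. Returns 0 on a mismatch
 * or on underflow, which a real tracer would treat as fatal.
 */
static unsigned long pop_return(unsigned long fp)
{
	struct ret_entry *e;

	if (curr_ret_stack < 0)
		return 0;
	e = &ret_stack[curr_ret_stack];
	if (e->fp != fp)
		return 0;
	curr_ret_stack--;
	return e->ret;
}
```

A mismatch leaves the entry on the shadow stack, so the caller can still inspect what was expected when reporting the problem.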
			   unsigned long frame_pointer)
2008-11-16 12:02:06 +07:00
{
	unsigned long old;
	int faulted;
2008-11-26 06:57:25 +07:00
	struct ftrace_graph_ent trace;
2008-11-16 12:02:06 +07:00
	unsigned long return_hooker = (unsigned long)
				&return_to_handler;

ftrace/x86: Fix triple fault with graph tracing and suspend-to-ram
On x86-32, with CONFIG_FIRMWARE and multiple CPUs, if you enable function
graph tracing and then suspend to RAM, it will triple fault and reboot when
it resumes.
The first fault happens when booting a secondary CPU:
startup_32_smp()
load_ucode_ap()
prepare_ftrace_return()
ftrace_graph_is_dead()
(accesses 'kill_ftrace_graph')
The early head_32.S code calls into load_ucode_ap(), which has an
ftrace hook, so it calls prepare_ftrace_return(), which calls
ftrace_graph_is_dead(), which tries to access the global
'kill_ftrace_graph' variable with a virtual address, causing a fault
because the CPU is still in real mode.
The fix is to add a check in prepare_ftrace_return() to make sure it's
running in protected mode before continuing. The check makes sure the
stack pointer is a virtual kernel address. It's a bit of a hack, but
it's not very intrusive and it works well enough.
For reference, here are a few other (more difficult) ways this could
have potentially been fixed:
- Move startup_32_smp()'s call to load_ucode_ap() down to *after* paging
is enabled. (No idea what that would break.)
- Track down load_ucode_ap()'s entire callee tree and mark all the
functions 'notrace'. (Probably not realistic.)
- Pause graph tracing in ftrace_suspend_notifier_call() or bringup_cpu()
or __cpu_up(), and ensure that the pause facility can be queried from
real mode.
Reported-by: Paul Menzel <pmenzel@molgen.mpg.de>
Signed-off-by: Josh Poimboeuf <jpoimboe@redhat.com>
Tested-by: Paul Menzel <pmenzel@molgen.mpg.de>
Reviewed-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
Cc: "Rafael J . Wysocki" <rjw@rjwysocki.net>
Cc: linux-acpi@vger.kernel.org
Cc: Borislav Petkov <bp@alien8.de>
Cc: stable@kernel.org
Cc: Len Brown <lenb@kernel.org>
Link: http://lkml.kernel.org/r/5c1272269a580660703ed2eccf44308e790c7a98.1492123841.git.jpoimboe@redhat.com
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
2017-04-14 05:53:55 +07:00
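The stack pointer test described in this commit message amounts to a sign check. A minimal sketch, assuming the kernel is mapped in the upper half of the address space so that kernel virtual addresses have their top bit set, while real-mode and physical stack addresses do not. `looks_like_kernel_vaddr()` is a hypothetical name, and the conversion to a signed type is assumed to wrap as it does on the usual two's complement compilers.

```c
#include <stdbool.h>
#include <stdint.h>

/*
 * Cheap test: treat an address as a kernel virtual address when its
 * most significant bit is set, i.e. when it is negative as a signed
 * integer of pointer width. This is far less precise than a real
 * address validity check, but it needs no memory access at all.
 */
static bool looks_like_kernel_vaddr(uintptr_t addr)
{
	return (intptr_t)addr < 0;
}
```

Because the check touches no global data, it is safe to run before paging is enabled, which is the property the fix depends on.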
	/*
	 * When resuming from suspend-to-ram, this function can be indirectly
	 * called from early CPU startup code while the CPU is in real mode,
	 * which would fail miserably. Make sure the stack pointer is a
	 * virtual address.
	 *
	 * This check isn't as accurate as virt_addr_valid(), but it should be
	 * good enough for this purpose, and it's fast.
	 */
	if (unlikely((long)__builtin_frame_address(0) >= 0))
		return;

2014-06-25 21:35:14 +07:00
	if (unlikely(ftrace_graph_is_dead()))
		return;

2008-12-06 09:43:41 +07:00
	if (unlikely(atomic_read(&current->tracing_graph_pause)))
2008-11-16 12:02:06 +07:00
		return;

	/*
	 * Protect against fault, even if it shouldn't
	 * happen. This tool is too intrusive to
	 * ignore such a protection.
	 */
	asm volatile(
2009-02-10 23:53:23 +07:00
		"1: " _ASM_MOV " (%[parent]), %[old]\n"
		"2: " _ASM_MOV " %[return_hooker], (%[parent])\n"
2008-11-16 12:02:06 +07:00
		" movl $0, %[faulted]\n"
2009-02-11 01:07:13 +07:00
		"3:\n"
2008-11-16 12:02:06 +07:00

		".section .fixup, \"ax\"\n"
2009-02-11 01:07:13 +07:00
		"4: movl $1, %[faulted]\n"
		" jmp 3b\n"
2008-11-16 12:02:06 +07:00
		".previous\n"

2009-02-11 01:07:13 +07:00
		_ASM_EXTABLE(1b, 4b)
		_ASM_EXTABLE(2b, 4b)
2008-11-16 12:02:06 +07:00

2009-05-14 00:52:19 +07:00
		: [old] "=&r" (old), [faulted] "=r" (faulted)
2009-02-10 23:53:23 +07:00
		: [parent] "r" (parent), [return_hooker] "r" (return_hooker)
2008-11-16 12:02:06 +07:00
		: "memory"
	);

2008-12-03 11:50:02 +07:00
	if (unlikely(faulted)) {
		ftrace_graph_stop();
		WARN_ON(1);
2008-11-16 12:02:06 +07:00
		return;
	}

2008-11-26 06:57:25 +07:00
	trace.func = self_addr;
2011-02-12 08:36:02 +07:00
	trace.depth = current->curr_ret_stack + 1;
2008-11-26 06:57:25 +07:00

2008-12-03 11:50:05 +07:00
	/* Only trace if the calling function expects to */
	if (!ftrace_graph_entry(&trace)) {
		*parent = old;
2011-02-12 08:36:02 +07:00
		return;
	}

	if (ftrace_push_return_trace(old, self_addr, &trace.depth,
2016-08-19 18:53:00 +07:00
				     frame_pointer, parent) == -EBUSY) {
2011-02-12 08:36:02 +07:00
		*parent = old;
		return;
2008-12-03 11:50:05 +07:00
	}
2008-11-16 12:02:06 +07:00
}
2008-11-26 03:07:04 +07:00
#endif /* CONFIG_FUNCTION_GRAPH_TRACER */