linux_dsm_epyc7002/kernel/trace
Steven Rostedt c1bf08ac26 ftrace: Be first to run code modification on modules
If some other kernel subsystem has a module notifier, and adds a kprobe
to a ftrace mcount point (now that kprobes work on ftrace points),
when the ftrace notifier runs it will fail and disable ftrace, as well
as kprobes that are attached to ftrace points.

Here's the error:

 WARNING: at kernel/trace/ftrace.c:1618 ftrace_bug+0x239/0x280()
 Hardware name: Bochs
 Modules linked in: fat(+) stap_56d28a51b3fe546293ca0700b10bcb29__8059(F) nfsv4 auth_rpcgss nfs dns_resolver fscache xt_nat iptable_nat nf_conntrack_ipv4 nf_defrag_ipv4 nf_nat_ipv4 nf_nat nf_conntrack lockd sunrpc ppdev parport_pc parport microcode virtio_net i2c_piix4 drm_kms_helper ttm drm i2c_core [last unloaded: bid_shared]
 Pid: 8068, comm: modprobe Tainted: GF            3.7.0-0.rc8.git0.1.fc19.x86_64 #1
 Call Trace:
  [<ffffffff8105e70f>] warn_slowpath_common+0x7f/0xc0
  [<ffffffff81134106>] ? __probe_kernel_read+0x46/0x70
  [<ffffffffa0180000>] ? 0xffffffffa017ffff
  [<ffffffffa0180000>] ? 0xffffffffa017ffff
  [<ffffffff8105e76a>] warn_slowpath_null+0x1a/0x20
  [<ffffffff810fd189>] ftrace_bug+0x239/0x280
  [<ffffffff810fd626>] ftrace_process_locs+0x376/0x520
  [<ffffffff810fefb7>] ftrace_module_notify+0x47/0x50
  [<ffffffff8163912d>] notifier_call_chain+0x4d/0x70
  [<ffffffff810882f8>] __blocking_notifier_call_chain+0x58/0x80
  [<ffffffff81088336>] blocking_notifier_call_chain+0x16/0x20
  [<ffffffff810c2a23>] sys_init_module+0x73/0x220
  [<ffffffff8163d719>] system_call_fastpath+0x16/0x1b
 ---[ end trace 9ef46351e53bbf80 ]---
 ftrace failed to modify [<ffffffffa0180000>] init_once+0x0/0x20 [fat]
  actual: cc:bb:d2:4b:e1

A kprobe was added to the init_once() function in the fat module on load.
But this happened before ftrace could have touched the code. As ftrace
didn't run yet, the kprobe system had no idea it was a ftrace point and
simply added a breakpoint to the code (0xcc in the cc:bb:d2:4b:e1).

Then when ftrace went to modify the location from a call to mcount/fentry
into a nop, it didn't see a call op, but instead it saw the breakpoint op
and not knowing what to do with it, ftrace shut itself down.

The solution is to simply give the ftrace module notifier the max priority.
This should have been done regardless, as the core code ftrace modification
also happens very early on in boot up. This makes the module modification
closer to core modification.

Link: http://lkml.kernel.org/r/20130107140333.593683061@goodmis.org

Cc: stable@vger.kernel.org
Acked-by: Masami Hiramatsu <masami.hiramatsu.pt@hitachi.com>
Reported-by: Frank Ch. Eigler <fche@redhat.com>
Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
2013-01-21 13:21:50 -05:00
..
blktrace.c simple_open: automatically convert to simple_open() 2012-04-05 15:25:50 -07:00
ftrace.c ftrace: Be first to run code modification on modules 2013-01-21 13:21:50 -05:00
Kconfig tracing: Use irq_work for wake ups and remove *_nowake_*() functions 2012-11-02 10:21:52 -04:00
Makefile trace: Stop compiling in trace_clock unconditionally 2012-09-13 22:52:08 -04:00
power-traces.c perf: Clean up power events by introducing new, more generic ones 2011-01-04 08:16:54 +01:00
ring_buffer_benchmark.c tracing: Use NUMA allocation for per-cpu ring buffer pages 2011-06-14 22:04:39 -04:00
ring_buffer.c Merge branch 'perf-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip 2012-12-11 18:18:58 -08:00
rpm-traces.c PM / Runtime: Introduce trace points for tracing rpm_* functions 2011-09-27 22:53:27 +02:00
trace_branch.c tracing: Cache comms only after an event occurred 2012-10-31 16:45:31 -04:00
trace_clock.c tracing: Add a counter clock for those that do not trust clocks 2011-09-19 11:35:58 -04:00
trace_entries.h Merge branch 'tip/perf/urgent' of git://git.kernel.org/pub/scm/linux/kernel/git/rostedt/linux-trace into perf/urgent 2012-03-24 08:19:09 +01:00
trace_event_perf.c perf/core improvements and fixes: 2012-08-21 11:27:00 +02:00
trace_events_filter_test.h tracing/filter: Add startup tests for events filter 2011-08-19 14:35:59 -04:00
trace_events_filter.c tracing: Replace strict_strto* with kstrto* 2012-10-31 16:45:23 -04:00
trace_events.c tracing: Use irq_work for wake ups and remove *_nowake_*() functions 2012-11-02 10:21:52 -04:00
trace_export.c tracing: Do not enable function event with enable 2012-05-10 15:55:43 -04:00
trace_functions_graph.c tracing: Cache comms only after an event occurred 2012-10-31 16:45:31 -04:00
trace_functions.c Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jikos/trivial 2012-12-13 12:00:02 -08:00
trace_irqsoff.c Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jikos/trivial 2012-12-13 12:00:02 -08:00
trace_kdb.c kdb,ftdump: Remove reference to internal kdb include 2010-10-22 15:34:11 -05:00
trace_kprobe.c tracing: Use irq_work for wake ups and remove *_nowake_*() functions 2012-11-02 10:21:52 -04:00
trace_mmiotrace.c atomic: use <linux/atomic.h> 2011-07-26 16:49:47 -07:00
trace_nop.c
trace_output.c tracing: Format non-nanosec times from tsc clock without a decimal point. 2012-11-13 15:48:40 -05:00
trace_output.h
trace_printk.c tracing: Add percpu buffers for trace_printk() 2012-04-23 21:15:55 -04:00
trace_probe.c tracing: Replace strict_strto* with kstrto* 2012-10-31 16:45:23 -04:00
trace_probe.h tracing: Provide trace events interface for uprobes 2012-05-07 14:30:17 +02:00
trace_sched_switch.c tracing: Use irq_work for wake ups and remove *_nowake_*() functions 2012-11-02 10:21:52 -04:00
trace_sched_wakeup.c Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jikos/trivial 2012-12-13 12:00:02 -08:00
trace_selftest_dynamic.c ftrace: Add self-tests for multiple function trace users 2011-05-18 19:24:51 -04:00
trace_selftest.c tracing: Use irq_work for wake ups and remove *_nowake_*() functions 2012-11-02 10:21:52 -04:00
trace_stack.c tracing: Remove unneeded checks from the stack tracer 2012-11-19 15:07:13 -05:00
trace_stat.c
trace_stat.h
trace_syscalls.c tracing: Cleanup unnecessary function declarations 2012-10-31 16:45:34 -04:00
trace_uprobe.c trace: use kbasename() 2012-12-17 17:15:17 -08:00
trace.c tracing: Fix regression of trace_pipe 2013-01-14 13:13:32 -05:00
trace.h tracing: Format non-nanosec times from tsc clock without a decimal point. 2012-11-13 15:48:40 -05:00