linux_dsm_epyc7002

mirror of https://github.com/AuxXxilium/linux_dsm_epyc7002.git synced 2024-12-13 21:56:47 +07:00

Author	SHA1	Message	Date
Steven Rostedt	7374e82771	tracing: Register the ftrace internal events during early boot All trace events including ftrace internel events (like trace_printk and function tracing), register functions that describe how to print their output. The events may be recorded as soon as the ring buffer is allocated, but they are just raw binary in the buffer. The mapping of event ids to how to print them are held within a structure that is registered on system boot. If a crash happens in boot up before these functions are registered then their output (via ftrace_dump_on_oops) will be useless: Dumping ftrace buffer: --------------------------------- <...>-1 0.... 319705us : Unknown type 6 --------------------------------- This can be quite frustrating for a kernel developer trying to see what is going wrong. There's no reason to register them so late in the boot up process. They can be registered by early_initcall(). Reported-by: Peter Zijlstra <a.p.zijlstra@chello.nl> Signed-off-by: Steven Rostedt <rostedt@goodmis.org>	2012-06-14 15:22:14 -04:00
Borislav Petkov	8d240dd88c	ftrace: Remove a superfluous check register_ftrace_function() checks ftrace_disabled and calls __register_ftrace_function which does it again. Drop the first check and add the unlikely hint to the second one. Also, drop the label as John correctly notices. No functional change. Link: http://lkml.kernel.org/r/20120329171140.GE6409@aftab Cc: Borislav Petkov <bp@amd64.org> Cc: John Kacur <jkacur@redhat.com> Signed-off-by: Borislav Petkov <borislav.petkov@amd.com> Signed-off-by: Steven Rostedt <rostedt@goodmis.org>	2012-06-14 15:22:12 -04:00
Eric Dumazet	047fe36052	splice: fix racy pipe->buffers uses Dave Jones reported a kernel BUG at mm/slub.c:3474! triggered by splice_shrink_spd() called from vmsplice_to_pipe() commit `35f3d14dbb` (pipe: add support for shrinking and growing pipes) added capability to adjust pipe->buffers. Problem is some paths don't hold pipe mutex and assume pipe->buffers doesn't change for their duration. Fix this by adding nr_pages_max field in struct splice_pipe_desc, and use it in place of pipe->buffers where appropriate. splice_shrink_spd() loses its struct pipe_inode_info argument. Reported-by: Dave Jones <davej@redhat.com> Signed-off-by: Eric Dumazet <edumazet@google.com> Cc: Jens Axboe <axboe@kernel.dk> Cc: Alexander Viro <viro@zeniv.linux.org.uk> Cc: Tom Herbert <therbert@google.com> Cc: stable <stable@vger.kernel.org> # 2.6.35 Tested-by: Dave Jones <davej@redhat.com> Signed-off-by: Jens Axboe <axboe@kernel.dk>	2012-06-13 21:16:42 +02:00
Steven Rostedt	f2bf1f6f5f	tracing: Have tracing_off() actually turn tracing off A recent update to have tracing_on/off() only affect the ftrace ring buffers instead of all ring buffers had a cut and paste error. The tracing_off() did the exact same thing as tracing_on() and would not actually turn off tracing. Unfortunately, tracing_off() is more important to be working than tracing_on() as this is a key development tool, as it lets the developer turn off tracing as soon as a problem is discovered. It is also used by panic and oops code. This bug also breaks the 'echo func:traceoff > set_ftrace_filter' Cc: <stable@vger.kernel.org> # 3.4 Signed-off-by: Steven Rostedt <rostedt@goodmis.org>	2012-06-06 22:15:14 -04:00
Linus Torvalds	65a50c951a	Merge branch 'perf-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull perf updates from Ingo Molnar. * 'perf-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (21 commits) perf ui browser: Stop using 'self' perf annotate browser: Read perf config file for settings perf config: Allow '_' in config file variable names perf annotate browser: Make feature toggles global perf annotate browser: The idx_asm field should be used in asm only view perf tools: Convert critical messages to ui__error() perf ui: Make --stdio default when TUI is not supported tools lib traceevent: Silence compiler warning on 32bit build perf record: Fix branch_stack type in perf_record_opts perf tools: Reconstruct event with modifiers from perf_event_attr perf top: Fix counter name fixup when fallbacking to cpu-clock perf tools: fix thread_map__new_by_pid_str() memory leak in error path perf tools: Do not use _FORTIFY_SOURCE when DEBUG=1 is specified tools lib traceevent: Fix signature of create_arg_item() tools lib traceevent: Use proper function parameter type tools lib traceevent: Fix freeing arg on process_dynamic_array() tools lib traceevent: Fix a possibly wrong memory dereference tools lib traceevent: Fix a possible memory leak tools lib traceevent: Allow expressions in __print_symbolic() fields perf evlist: Explicititely initialize input_name ...	2012-05-30 11:12:00 -07:00
Linus Torvalds	654443e20d	Merge branch 'perf-uprobes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull user-space probe instrumentation from Ingo Molnar: "The uprobes code originates from SystemTap and has been used for years in Fedora and RHEL kernels. This version is much rewritten, reviews from PeterZ, Oleg and myself shaped the end result. This tree includes uprobes support in 'perf probe' - but SystemTap (and other tools) can take advantage of user probe points as well. Sample usage of uprobes via perf, for example to profile malloc() calls without modifying user-space binaries. First boot a new kernel with CONFIG_UPROBE_EVENT=y enabled. If you don't know which function you want to probe you can pick one from 'perf top' or can get a list all functions that can be probed within libc (binaries can be specified as well): $ perf probe -F -x /lib/libc.so.6 To probe libc's malloc(): $ perf probe -x /lib64/libc.so.6 malloc Added new event: probe_libc:malloc (on 0x7eac0) You can now use it in all perf tools, such as: perf record -e probe_libc:malloc -aR sleep 1 Make use of it to create a call graph (as the flat profile is going to look very boring): $ perf record -e probe_libc:malloc -gR make [ perf record: Woken up 173 times to write data ] [ perf record: Captured and wrote 44.190 MB perf.data (~1930712 $ perf report \| less 32.03% git libc-2.15.so [.] malloc \| --- malloc 29.49% cc1 libc-2.15.so [.] malloc \| --- malloc \| \|--0.95%-- 0x208eb1000000000 \| \|--0.63%-- htab_traverse_noresize 11.04% as libc-2.15.so [.] malloc \| --- malloc \| 7.15% ld libc-2.15.so [.] malloc \| --- malloc \| 5.07% sh libc-2.15.so [.] malloc \| --- malloc \| 4.99% python-config libc-2.15.so [.] malloc \| --- malloc \| 4.54% make libc-2.15.so [.] malloc \| --- malloc \| \|--7.34%-- glob \| \| \| \|--93.18%-- 0x41588f \| \| \| --6.82%-- glob \| 0x41588f ... Or: $ perf report -g flat \| less # Overhead Command Shared Object Symbol # ........ ............. ............. .......... # 32.03% git libc-2.15.so [.] malloc 27.19% malloc 29.49% cc1 libc-2.15.so [.] malloc 24.77% malloc 11.04% as libc-2.15.so [.] malloc 11.02% malloc 7.15% ld libc-2.15.so [.] malloc 6.57% malloc ... The core uprobes design is fairly straightforward: uprobes probe points register themselves at (inode:offset) addresses of libraries/binaries, after which all existing (or new) vmas that map that address will have a software breakpoint injected at that address. vmas are COW-ed to preserve original content. The probe points are kept in an rbtree. If user-space executes the probed inode:offset instruction address then an event is generated which can be recovered from the regular perf event channels and mmap-ed ring-buffer. Multiple probes at the same address are supported, they create a dynamic callback list of event consumers. The basic model is further complicated by the XOL speedup: the original instruction that is probed is copied (in an architecture specific fashion) and executed out of line when the probe triggers. The XOL area is a single vma per process, with a fixed number of entries (which limits probe execution parallelism). The API: uprobes are installed/removed via /sys/kernel/debug/tracing/uprobe_events, the API is integrated to align with the kprobes interface as much as possible, but is separate to it. Injecting a probe point is privileged operation, which can be relaxed by setting perf_paranoid to -1. You can use multiple probes as well and mix them with kprobes and regular PMU events or tracepoints, when instrumenting a task." Fix up trivial conflicts in mm/memory.c due to previous cleanup of unmap_single_vma(). * 'perf-uprobes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (21 commits) perf probe: Detect probe target when m/x options are absent perf probe: Provide perf interface for uprobes tracing: Fix kconfig warning due to a typo tracing: Provide trace events interface for uprobes tracing: Extract out common code for kprobes/uprobes trace events tracing: Modify is_delete, is_return from int to bool uprobes/core: Decrement uprobe count before the pages are unmapped uprobes/core: Make background page replacement logic account for rss_stat counters uprobes/core: Optimize probe hits with the help of a counter uprobes/core: Allocate XOL slots for uprobes use uprobes/core: Handle breakpoint and singlestep exceptions uprobes/core: Rename bkpt to swbp uprobes/core: Make order of function parameters consistent across functions uprobes/core: Make macro names consistent uprobes: Update copyright notices uprobes/core: Move insn to arch specific structure uprobes/core: Remove uprobe_opcode_sz uprobes/core: Make instruction tables volatile uprobes: Move to kernel/events/ uprobes/core: Clean up, refactor and improve the code ...	2012-05-24 11:39:34 -07:00
Ingo Molnar	9ba0541453	Merge branch 'tip/perf/urgent' of git://git.kernel.org/pub/scm/linux/kernel/git/rostedt/linux-trace into perf/urgent Pull an ftrace ring-buffer fix from Steve Rostedt: * fix kernel crash when changing the size of the ring-buffer on boxes where possible_cpus != online_cpus. Signed-off-by: Ingo Molnar <mingo@kernel.org>	2012-05-24 09:06:24 +02:00
Steven Rostedt	6a31e1f135	ring-buffer: Check for valid buffer before changing size On some machines the number of possible CPUS is not the same as the number of CPUs that is on the machine. Ftrace uses possible_cpus to update the tracing structures but the ring buffer only allocates per cpu buffers for online CPUs when they come up. When the wakeup tracer was enabled in such a case, the ftrace code enabled all possible cpu buffers, but the code in ring_buffer_resize() did not check to see if the buffer in question was allocated. Since boot up CPUs did not match possible CPUs it caused the following crash: BUG: unable to handle kernel NULL pointer dereference at 00000020 IP: [<c1097851>] ring_buffer_resize+0x16a/0x28d *pde = 00000000 Oops: 0000 [#1] PREEMPT SMP Dumping ftrace buffer: (ftrace buffer empty) Modules linked in: [last unloaded: scsi_wait_scan] Pid: 1387, comm: bash Not tainted 3.4.0-test+ #13 /DG965MQ EIP: 0060:[<c1097851>] EFLAGS: 00010217 CPU: 0 EIP is at ring_buffer_resize+0x16a/0x28d EAX: f5a14340 EBX: f6026b80 ECX: 00000ff4 EDX: 00000ff3 ESI: 00000000 EDI: 00000002 EBP: f4275ecc ESP: f4275eb0 DS: 007b ES: 007b FS: 00d8 GS: 00e0 SS: 0068 CR0: 80050033 CR2: 00000020 CR3: 34396000 CR4: 000007d0 DR0: 00000000 DR1: 00000000 DR2: 00000000 DR3: 00000000 DR6: ffff0ff0 DR7: 00000400 Process bash (pid: 1387, ti=f4274000 task=f4380cb0 task.ti=f4274000) Stack: c109cf9a f6026b98 00000162 00160f68 00000006 00160f68 00000002 f4275ef0 c109d013 f4275ee8 c123b72a c1c0bf00 c1cc81dc 00000005 f4275f98 00000007 f4275f70 c109d0c7 7700000e 75656b61 00000070 f5e90900 f5c4e198 00000301 Call Trace: [<c109cf9a>] ? tracing_set_tracer+0x115/0x1e9 [<c109d013>] tracing_set_tracer+0x18e/0x1e9 [<c123b72a>] ? _copy_from_user+0x30/0x46 [<c109d0c7>] tracing_set_trace_write+0x59/0x7f [<c10ec01e>] ? fput+0x18/0x1c6 [<c11f8732>] ? security_file_permission+0x27/0x2b [<c10eaacd>] ? rw_verify_area+0xcf/0xf2 [<c10ec01e>] ? fput+0x18/0x1c6 [<c109d06e>] ? tracing_set_tracer+0x1e9/0x1e9 [<c10ead77>] vfs_write+0x8b/0xe3 [<c10ebead>] ? fget_light+0x30/0x81 [<c10eaf54>] sys_write+0x42/0x63 [<c1834fbf>] sysenter_do_call+0x12/0x28 This happens with the latency tracer as the ftrace code updates the saved max buffer via its cpumask and not with a global setting. Adding a check in ring_buffer_resize() to make sure the buffer being resized exists, fixes the problem. Cc: Vaibhav Nagarnaik <vnagarnaik@google.com> Signed-off-by: Steven Rostedt <rostedt@goodmis.org>	2012-05-23 15:35:17 -04:00
Linus Torvalds	e8650a0823	Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jikos/trivial Pull trivial updates from Jiri Kosina: "As usual, it's mostly typo fixes, redundant code elimination and some documentation updates." * 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jikos/trivial: (57 commits) edac, mips: don't change code that has been removed in edac/mips tree xtensa: Change mail addresses of Hannes Weiner and Oskar Schirmer lib: Change mail address of Oskar Schirmer net: Change mail address of Oskar Schirmer arm/m68k: Change mail address of Sebastian Hess i2c: Change mail address of Oskar Schirmer net: Fix tcp_build_and_update_options comment in struct tcp_sock atomic64_32.h: fix parameter naming mismatch Kconfig: replace "--- help ---" with "---help---" c2port: fix bogus Kconfig "default no" edac: Fix spelling errors. qla1280: Remove redundant NULL check before release_firmware() call remoteproc: remove redundant NULL check before release_firmware() qla2xxx: Remove redundant NULL check before release_firmware() call. aic94xx: Get rid of redundant NULL check before release_firmware() call tehuti: delete redundant NULL check before release_firmware() qlogic: get rid of a redundant test for NULL before call to release_firmware() bna: remove redundant NULL test before release_firmware() tg3: remove redundant NULL test before release_firmware() call typhoon: get rid of redundant conditional before all to release_firmware() ...	2012-05-22 19:22:50 -07:00
Linus Torvalds	2ff2b289a6	Merge branch 'perf-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull perf changes from Ingo Molnar: "Lots of changes: - (much) improved assembly annotation support in perf report, with jump visualization, searching, navigation, visual output improvements and more. - kernel support for AMD IBS PMU hardware features. Notably 'perf record -e cycles:p' and 'perf top -e cycles:p' should work without skid now, like PEBS does on the Intel side, because it takes advantage of IBS transparently. - the libtracevents library: it is the first step towards unifying tracing tooling and perf, and it also gives a tracing library for external tools like powertop to rely on. - infrastructure: various improvements and refactoring of the UI modules and related code - infrastructure: cleanup and simplification of the profiling targets code (--uid, --pid, --tid, --cpu, --all-cpus, etc.) - tons of robustness fixes all around - various ftrace updates: speedups, cleanups, robustness improvements. - typing 'make' in tools/ will now give you a menu of projects to build and a short help text to explain what each does. - ... and lots of other changes I forgot to list. The perf record make bzImage + perf report regression you reported should be fixed." * 'perf-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (166 commits) tracing: Remove kernel_lock annotations tracing: Fix initial buffer_size_kb state ring-buffer: Merge separate resize loops perf evsel: Create events initially disabled -- again perf tools: Split term type into value type and term type perf hists: Fix callchain ip printf format perf target: Add uses_mmap field ftrace: Remove selecting FRAME_POINTER with FUNCTION_TRACER ftrace/x86: Have x86 ftrace use the ftrace_modify_all_code() ftrace: Make ftrace_modify_all_code() global for archs to use ftrace: Return record ip addr for ftrace_location() ftrace: Consolidate ftrace_location() and ftrace_text_reserved() ftrace: Speed up search by skipping pages by address ftrace: Remove extra helper functions ftrace: Sort all function addresses, not just per page tracing: change CPU ring buffer state from tracing_cpumask tracing: Check return value of tracing_dentry_percpu() ring-buffer: Reset head page before running self test ring-buffer: Add integrity check at end of iter read ring-buffer: Make addition of pages in ring buffer atomic ...	2012-05-22 18:18:55 -07:00
Linus Torvalds	c54894cd46	Merge branch 'for-3.5' of git://git.kernel.org/pub/scm/linux/kernel/git/tj/wq Pull workqueue changes from Tejun Heo: "Nothing exciting. Most are updates to debug stuff and related fixes. Two not-too-critical bugs are fixed - WARN_ON() triggering spurious during cpu offlining and unlikely lockdep related oops." * 'for-3.5' of git://git.kernel.org/pub/scm/linux/kernel/git/tj/wq: lockdep: fix oops in processing workqueue workqueue: skip nr_running sanity check in worker_enter_idle() if trustee is active workqueue: Catch more locking problems with flush_work() workqueue: change BUG_ON() to WARN_ON() trace: Remove unused workqueue tracer	2012-05-22 17:36:56 -07:00
Ingo Molnar	6f5e3577d4	Merge branch 'tip/perf/core-2' of git://git.kernel.org/pub/scm/linux/kernel/git/rostedt/linux-trace into perf/core	2012-05-21 09:44:36 +02:00
Ingo Molnar	bb27f55eb9	Merge branch 'perf/core' of git://git.kernel.org/pub/scm/linux/kernel/git/acme/linux into perf/core Fixes for perf/core: - Rename some perf_target methods to avoid double negation, from Namhyung Kim. - Revert change to use per task events with inheritance, from Namhyung Kim. - Events should start disabled till children starts running, from David Ahern. Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com> Signed-off-by: Ingo Molnar <mingo@kernel.org>	2012-05-21 09:17:50 +02:00
Richard Weinberger	895b67fd58	tracing: Remove kernel_lock annotations The BKL is gone, these annotations are useless. Link: http://lkml.kernel.org/r/1320654202-4433-1-git-send-email-richard@nod.at Signed-off-by: Richard Weinberger <richard@nod.at> Signed-off-by: Steven Rostedt <rostedt@goodmis.org>	2012-05-19 08:28:51 -04:00
Vaibhav Nagarnaik	a591c73f12	tracing: Fix initial buffer_size_kb state Make sure that the state of buffer_size_kb is initialized correctly and returns actual size of the ring buffer. Link: http://lkml.kernel.org/r/1336066834-1673-1-git-send-email-vnagarnaik@google.com Cc: Frederic Weisbecker <fweisbec@gmail.com> Cc: Ingo Molnar <mingo@redhat.com> Cc: Laurent Chavey <chavey@google.com> Cc: Justin Teravest <teravest@google.com> Cc: David Sharp <dhsharp@google.com> Signed-off-by: Vaibhav Nagarnaik <vnagarnaik@google.com> Signed-off-by: Steven Rostedt <rostedt@goodmis.org>	2012-05-19 08:28:50 -04:00
Vaibhav Nagarnaik	05fdd70d2f	ring-buffer: Merge separate resize loops There are 2 separate loops to resize cpu buffers that are online and offline. Merge them to make the code look better. Also change the name from update_completion to update_done to allow shorter lines. Link: http://lkml.kernel.org/r/1337372991-14783-1-git-send-email-vnagarnaik@google.com Cc: Laurent Chavey <chavey@google.com> Cc: Justin Teravest <teravest@google.com> Cc: David Sharp <dhsharp@google.com> Signed-off-by: Vaibhav Nagarnaik <vnagarnaik@google.com> Signed-off-by: Steven Rostedt <rostedt@goodmis.org>	2012-05-19 08:28:50 -04:00
Arnaldo Carvalho de Melo	16ee6576e2	Merge remote-tracking branch 'tip/perf/urgent' into perf/core Merge reason: We are going to queue up a dependent patch: "perf tools: Move parse event automated tests to separated object" That depends on: commit `e7c72d8` perf tools: Add 'G' and 'H' modifiers to event parsing Conflicts: tools/perf/builtin-stat.c Conflicted with the recent 'perf_target' patches when checking the result of perf_evsel open routines to see if a retry is needed to cope with older kernels where the exclude guest/host perf_event_attr bits were not used. Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>	2012-05-18 13:13:33 -03:00
Steven Rostedt	b732d439cb	ftrace: Remove selecting FRAME_POINTER with FUNCTION_TRACER The function tracer will enable the -pg option with gcc, which requires that frame pointers. When FRAME_POINTER is defined in the kernel config it adds the gcc option -fno-omit-frame-pointer which causes some problems on some architectures. For those architectures, the FRAME_POINTER select was not set. When FUNCTION_TRACER was selected on these architectures that can not have -fno-omit-frame-pointer, the -pg option is still set. But when FRAME_POINTER is not selected, the kernel config would add the gcc option -fomit-frame-pointer. Adding this option is incompatible with -pg even on archs that do not need frame pointers with -pg. The answer to this was to just not add either -fno-omit-frame-pointer or -fomit-frame-pointer on these archs that want function tracing but do not set FRAME_POINTER. As it turns out, for archs that require frame pointers for function tracing, the same can be used. If gcc requires frame pointers with -pg, it will simply add it. The best thing to do is not select FRAME_POINTER when function tracing is selected, and let gcc add it if needed. Only add the -fno-omit-frame-pointer when something else selects FRAME_POINTER, but do not add -fomit-frame-pointer if function tracing is selected. Signed-off-by: Steven Rostedt <rostedt@goodmis.org>	2012-05-16 20:00:29 -04:00
Steven Rostedt	e4f5d5440b	ftrace/x86: Have x86 ftrace use the ftrace_modify_all_code() To remove duplicate code, have the ftrace arch_ftrace_update_code() use the generic ftrace_modify_all_code(). This requires that the default ftrace_replace_code() becomes a weak function so that an arch may override it. Signed-off-by: Steven Rostedt <rostedt@goodmis.org>	2012-05-16 20:00:27 -04:00
Steven Rostedt	8ed3e2cfe4	ftrace: Make ftrace_modify_all_code() global for archs to use Rename __ftrace_modify_code() to ftrace_modify_all_code() and make it global for all archs to use. This will remove the duplication of code, as archs that can modify code without stop_machine() can use it directly outside of the stop_machine() call. Signed-off-by: Steven Rostedt <rostedt@goodmis.org>	2012-05-16 20:00:26 -04:00
Steven Rostedt	f0cf973a22	ftrace: Return record ip addr for ftrace_location() ftrace_location() is passed an addr, and returns 1 if the addr is on a ftrace nop (or caller to ftrace_caller), and 0 otherwise. To let kprobes know if it should move a breakpoint or not, it must return the actual addr that is the start of the ftrace nop. This way a kprobe placed on the location of a ftrace nop, can instead be placed on the instruction after the nop. Even if the probe addr is on the second or later byte of the nop, it can simply be moved forward. Cc: Masami Hiramatsu <masami.hiramatsu.pt@hitachi.com> Signed-off-by: Steven Rostedt <rostedt@goodmis.org>	2012-05-16 19:58:49 -04:00
Steven Rostedt	a650e02a52	ftrace: Consolidate ftrace_location() and ftrace_text_reserved() Both ftrace_location() and ftrace_text_reserved() do basically the same thing. They search to see if an address is in the ftace table (contains an address that may change from nop to call ftrace_caller). The difference is that ftrace_location() searches a single address, but ftrace_text_reserved() searches a range. This also makes the ftrace_text_reserved() faster as it now uses a bsearch() instead of linearly searching all the addresses within a page. Cc: Masami Hiramatsu <masami.hiramatsu.pt@hitachi.com> Signed-off-by: Steven Rostedt <rostedt@goodmis.org>	2012-05-16 19:58:48 -04:00
Steven Rostedt	9644302e33	ftrace: Speed up search by skipping pages by address As all records in a page of the ftrace table are sorted, we can speed up the search algorithm by checking if the address to look for falls in between the first and last record ip on the page. This speeds up both the ftrace_location() and ftrace_text_reserved() algorithms, as it can skip full pages when the search address is not in them. Cc: Masami Hiramatsu <masami.hiramatsu.pt@hitachi.com> Signed-off-by: Steven Rostedt <rostedt@goodmis.org>	2012-05-16 19:58:46 -04:00
Steven Rostedt	706c81f87f	ftrace: Remove extra helper functions The ftrace_record_ip() and ftrace_alloc_dyn_node() were from the time of the ftrace daemon. Although they were still used, they still make things a bit more complex than necessary. Move the code into the one function that uses it, and remove the helper functions. Signed-off-by: Steven Rostedt <rostedt@goodmis.org>	2012-05-16 19:58:45 -04:00
Steven Rostedt	9fd49328fc	ftrace: Sort all function addresses, not just per page Instead of just sorting the ip's of the functions per ftrace page, sort the entire list before adding them to the ftrace pages. This will allow the bsearch algorithm to be sped up as it can also sort by pages, not just records within a page. Signed-off-by: Steven Rostedt <rostedt@goodmis.org>	2012-05-16 19:58:44 -04:00
Vaibhav Nagarnaik	71babb2705	tracing: change CPU ring buffer state from tracing_cpumask According to Documentation/trace/ftrace.txt: tracing_cpumask: This is a mask that lets the user only trace on specified CPUS. The format is a hex string representing the CPUS. The tracing_cpumask currently doesn't affect the tracing state of per-CPU ring buffers. This patch enables/disables CPU recording as its corresponding bit in tracing_cpumask is set/unset. Link: http://lkml.kernel.org/r/1336096792-25373-3-git-send-email-vnagarnaik@google.com Cc: Frederic Weisbecker <fweisbec@gmail.com> Cc: Ingo Molnar <mingo@redhat.com> Cc: Laurent Chavey <chavey@google.com> Cc: Justin Teravest <teravest@google.com> Cc: David Sharp <dhsharp@google.com> Signed-off-by: Vaibhav Nagarnaik <vnagarnaik@google.com> Signed-off-by: Steven Rostedt <rostedt@goodmis.org>	2012-05-16 19:50:38 -04:00
Namhyung Kim	0a3d7ce7e6	tracing: Check return value of tracing_dentry_percpu() If tracing_dentry_percpu() failed, tracing_init_debugfs_percpu() will try to create each cpu directories on debugfs' root directory as d_percpu is NULL. Link: http://lkml.kernel.org/r/1335143517-2285-1-git-send-email-namhyung.kim@lge.com Cc: Frederic Weisbecker <fweisbec@gmail.com> Cc: Ingo Molnar <mingo@redhat.com> Signed-off-by: Namhyung Kim <namhyung.kim@lge.com> Signed-off-by: Steven Rostedt <rostedt@goodmis.org>	2012-05-16 19:50:37 -04:00
Steven Rostedt	308f7eeb78	ring-buffer: Reset head page before running self test When the ring buffer does its consistency test on itself, it removes the head page, runs the tests, and then adds it back to what the "head_page" pointer was. But because the head_page pointer may lack behind the real head page (held by the link list pointer). The reset may be incorrect. Instead, if the head_page exists (it does not on first allocation) reset it back to the real head page before running the consistency tests. Then it will be put back to its original location after the tests are complete. Signed-off-by: Steven Rostedt <rostedt@goodmis.org>	2012-05-16 19:50:36 -04:00
Steven Rostedt	659f451ff2	ring-buffer: Add integrity check at end of iter read There use to be ring buffer integrity checks after updating the size of the ring buffer. But now that the ring buffer can modify the size while the system is running, the integrity checks were removed, as they require the ring buffer to be disabed to perform the check. Move the integrity check to the reading of the ring buffer via the iterator reads (the "trace" file). As reading via an iterator requires disabling the ring buffer, it is a perfect place to have it. If the ring buffer happens to be disabled when updating the size, we still perform the integrity check. Cc: Vaibhav Nagarnaik <vnagarnaik@google.com> Signed-off-by: Steven Rostedt <rostedt@goodmis.org>	2012-05-16 19:50:23 -04:00
Vaibhav Nagarnaik	5040b4b7bc	ring-buffer: Make addition of pages in ring buffer atomic This patch adds the capability to add new pages to a ring buffer atomically while write operations are going on. This makes it possible to expand the ring buffer size without reinitializing the ring buffer. The new pages are attached between the head page and its previous page. Link: http://lkml.kernel.org/r/1336096792-25373-2-git-send-email-vnagarnaik@google.com Cc: Frederic Weisbecker <fweisbec@gmail.com> Cc: Ingo Molnar <mingo@redhat.com> Cc: Laurent Chavey <chavey@google.com> Cc: Justin Teravest <teravest@google.com> Cc: David Sharp <dhsharp@google.com> Signed-off-by: Vaibhav Nagarnaik <vnagarnaik@google.com> Signed-off-by: Steven Rostedt <rostedt@goodmis.org>	2012-05-16 16:25:51 -04:00
Vaibhav Nagarnaik	83f40318da	ring-buffer: Make removal of ring buffer pages atomic This patch adds the capability to remove pages from a ring buffer without destroying any existing data in it. This is done by removing the pages after the tail page. This makes sure that first all the empty pages in the ring buffer are removed. If the head page is one in the list of pages to be removed, then the page after the removed ones is made the head page. This removes the oldest data from the ring buffer and keeps the latest data around to be read. To do this in a non-racey manner, tracing is stopped for a very short time while the pages to be removed are identified and unlinked from the ring buffer. The pages are freed after the tracing is restarted to minimize the time needed to stop tracing. The context in which the pages from the per-cpu ring buffer are removed runs on the respective CPU. This minimizes the events not traced to only NMI trace contexts. Link: http://lkml.kernel.org/r/1336096792-25373-1-git-send-email-vnagarnaik@google.com Cc: Frederic Weisbecker <fweisbec@gmail.com> Cc: Ingo Molnar <mingo@redhat.com> Cc: Laurent Chavey <chavey@google.com> Cc: Justin Teravest <teravest@google.com> Cc: David Sharp <dhsharp@google.com> Signed-off-by: Vaibhav Nagarnaik <vnagarnaik@google.com> Signed-off-by: Steven Rostedt <rostedt@goodmis.org>	2012-05-16 16:18:57 -04:00
Steven Rostedt	6edb2a8a38	tracing: Clean up tracing_mark_write() On gcc 4.5 the function tracing_mark_write() would give a warning of page2 being uninitialized. This is due to a bug in gcc because the logic prevents page2 from being used uninitialized, and gcc 4.6+ does not complain (correctly). Instead of adding a "unitialized" around page2, which could show a bug later on, I combined page1 and page2 into an array map_pages[]. This binds the two and the two are modified according to nr_pages (what gcc 4.5 seems to ignore). This no longer gives a warning with gcc 4.5 nor with gcc 4.6. Signed-off-by: Steven Rostedt <rostedt@goodmis.org>	2012-05-16 16:18:57 -04:00
Ingo Molnar	9cba26e66d	Merge branch 'perf/uprobes' of git://git.kernel.org/pub/scm/linux/kernel/git/acme/linux into perf/uprobes	2012-05-14 14:43:40 +02:00
Steven Rostedt	9b63776fa3	tracing: Do not enable function event with enable With the adding of function tracing event to perf, it caused a side effect that produces the following warning when enabling all events in ftrace: # echo 1 > /sys/kernel/debug/tracing/events/enable [console] event trace: Could not enable event function This is because when enabling all events via the debugfs system it ignores events that do not have a ->reg() function assigned. This was to skip over the ftrace internal events (as they are not TRACE_EVENTs). But as the ftrace function event now has a ->reg() function attached to it for use with perf, it is no longer ignored. Worse yet, this ->reg() function is being called when it should not be. It returns an error and causes the above warning to be printed. By adding a new event_call flag (TRACE_EVENT_FL_IGNORE_ENABLE) and have all ftrace internel event structures have it set, setting the events/enable will no longe try to incorrectly enable the function event and does not warn. Signed-off-by: Steven Rostedt <rostedt@goodmis.org>	2012-05-10 15:55:43 -04:00
Steven Rostedt	68179686ac	tracing: Remove ftrace_disable/enable_cpu() The ftrace_disable_cpu() and ftrace_enable_cpu() functions were needed back before the ring buffer was lockless. Now that the ring buffer is lockless (and has been for some time), these functions serve no purpose, and unnecessarily slow down operations of the tracer. Signed-off-by: Steven Rostedt <rostedt@goodmis.org>	2012-05-08 21:06:26 -04:00
Jiri Olsa	50e18b94c6	tracing: Use seq_*_private interface for some seq files It's appropriate to use __seq_open_private interface to open some of trace seq files, because it covers all steps we are duplicating in tracing code - zallocating the iterator and setting it as seq_file's private. Using this for following files: trace available_filter_functions enabled_functions Link: http://lkml.kernel.org/r/1335342219-2782-5-git-send-email-jolsa@redhat.com Signed-off-by: Jiri Olsa <jolsa@redhat.com> [ Fixed warnings for: kernel/trace/trace.c: In function '__tracing_open': kernel/trace/trace.c:2418:11: warning: unused variable 'ret' [-Wunused-variable] kernel/trace/trace.c:2417:19: warning: unused variable 'm' [-Wunused-variable] ] Signed-off-by: Steven Rostedt <rostedt@goodmis.org>	2012-05-08 21:04:12 -04:00
Srikar Dronamraju	f3f096cfed	tracing: Provide trace events interface for uprobes Implements trace_event support for uprobes. In its current form it can be used to put probes at a specified offset in a file and dump the required registers when the code flow reaches the probed address. The following example shows how to dump the instruction pointer and %ax a register at the probed text address. Here we are trying to probe zfree in /bin/zsh: # cd /sys/kernel/debug/tracing/ # cat /proc/`pgrep zsh`/maps \| grep /bin/zsh \| grep r-xp 00400000-0048a000 r-xp 00000000 08:03 130904 /bin/zsh # objdump -T /bin/zsh \| grep -w zfree 0000000000446420 g DF .text 0000000000000012 Base zfree # echo 'p /bin/zsh:0x46420 %ip %ax' > uprobe_events # cat uprobe_events p:uprobes/p_zsh_0x46420 /bin/zsh:0x0000000000046420 # echo 1 > events/uprobes/enable # sleep 20 # echo 0 > events/uprobes/enable # cat trace # tracer: nop # # TASK-PID CPU# TIMESTAMP FUNCTION # \| \| \| \| \| zsh-24842 [006] 258544.995456: p_zsh_0x46420: (0x446420) arg1=446421 arg2=79 zsh-24842 [007] 258545.000270: p_zsh_0x46420: (0x446420) arg1=446421 arg2=79 zsh-24842 [002] 258545.043929: p_zsh_0x46420: (0x446420) arg1=446421 arg2=79 zsh-24842 [004] 258547.046129: p_zsh_0x46420: (0x446420) arg1=446421 arg2=79 Signed-off-by: Srikar Dronamraju <srikar@linux.vnet.ibm.com> Acked-by: Steven Rostedt <rostedt@goodmis.org> Acked-by: Masami Hiramatsu <masami.hiramatsu.pt@hitachi.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Ananth N Mavinakayanahalli <ananth@in.ibm.com> Cc: Jim Keniston <jkenisto@linux.vnet.ibm.com> Cc: Linux-mm <linux-mm@kvack.org> Cc: Oleg Nesterov <oleg@redhat.com> Cc: Andi Kleen <andi@firstfloor.org> Cc: Christoph Hellwig <hch@infradead.org> Cc: Arnaldo Carvalho de Melo <acme@infradead.org> Cc: Anton Arapov <anton@redhat.com> Cc: Peter Zijlstra <peterz@infradead.org> Link: http://lkml.kernel.org/r/20120411103043.GB29437@linux.vnet.ibm.com Signed-off-by: Ingo Molnar <mingo@kernel.org>	2012-05-07 14:30:17 +02:00
Srikar Dronamraju	8ab83f5647	tracing: Extract out common code for kprobes/uprobes trace events Move parts of trace_kprobe.c that can be shared with upcoming trace_uprobe.c. Common code to kernel/trace/trace_probe.h and kernel/trace/trace_probe.c. There are no functional changes. Signed-off-by: Srikar Dronamraju <srikar@linux.vnet.ibm.com> Acked-by: Steven Rostedt <rostedt@goodmis.org> Acked-by: Masami Hiramatsu <masami.hiramatsu.pt@hitachi.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Ananth N Mavinakayanahalli <ananth@in.ibm.com> Cc: Jim Keniston <jkenisto@linux.vnet.ibm.com> Cc: Linux-mm <linux-mm@kvack.org> Cc: Oleg Nesterov <oleg@redhat.com> Cc: Andi Kleen <andi@firstfloor.org> Cc: Christoph Hellwig <hch@infradead.org> Cc: Arnaldo Carvalho de Melo <acme@infradead.org> Cc: Anton Arapov <anton@redhat.com> Cc: Peter Zijlstra <peterz@infradead.org> Link: http://lkml.kernel.org/r/20120409091144.8343.76218.sendpatchset@srdronam.in.ibm.com Signed-off-by: Ingo Molnar <mingo@kernel.org>	2012-05-07 14:29:57 +02:00
Srikar Dronamraju	3a6b76661d	tracing: Modify is_delete, is_return from int to bool is_delete and is_return can take utmost 2 values and are better of being a boolean than a int. There are no functional changes. Signed-off-by: Srikar Dronamraju <srikar@linux.vnet.ibm.com> Acked-by: Steven Rostedt <rostedt@goodmis.org> Acked-by: Masami Hiramatsu <masami.hiramatsu.pt@hitachi.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Ananth N Mavinakayanahalli <ananth@in.ibm.com> Cc: Jim Keniston <jkenisto@linux.vnet.ibm.com> Cc: Linux-mm <linux-mm@kvack.org> Cc: Oleg Nesterov <oleg@redhat.com> Cc: Andi Kleen <andi@firstfloor.org> Cc: Christoph Hellwig <hch@infradead.org> Cc: Arnaldo Carvalho de Melo <acme@infradead.org> Cc: Anton Arapov <anton@redhat.com> Cc: Peter Zijlstra <peterz@infradead.org> Link: http://lkml.kernel.org/r/20120409091133.8343.65289.sendpatchset@srdronam.in.ibm.com Signed-off-by: Ingo Molnar <mingo@kernel.org>	2012-05-07 14:29:35 +02:00
Vaibhav Nagarnaik	438ced1720	ring-buffer: Add per_cpu ring buffer control files Add a debugfs entry under per_cpu/ folder for each cpu called buffer_size_kb to control the ring buffer size for each CPU independently. If the global file buffer_size_kb is used to set size, the individual ring buffers will be adjusted to the given size. The buffer_size_kb will report the common size to maintain backward compatibility. If the buffer_size_kb file under the per_cpu/ directory is used to change buffer size for a specific CPU, only the size of the respective ring buffer is updated. When tracing/buffer_size_kb is read, it reports 'X' to indicate that sizes of per_cpu ring buffers are not equivalent. Link: http://lkml.kernel.org/r/1328212844-11889-1-git-send-email-vnagarnaik@google.com Cc: Frederic Weisbecker <fweisbec@gmail.com> Cc: Michael Rubin <mrubin@google.com> Cc: David Sharp <dhsharp@google.com> Cc: Justin Teravest <teravest@google.com> Signed-off-by: Vaibhav Nagarnaik <vnagarnaik@google.com> Signed-off-by: Steven Rostedt <rostedt@goodmis.org>	2012-04-23 21:17:51 -04:00
Dan Carpenter	5a26c8f0cf	tracing: Remove an unneeded check in trace_seq_buffer() memcpy() returns a pointer to "bug". Hopefully, it's not NULL here or we would already have Oopsed. Link: http://lkml.kernel.org/r/20120420063145.GA22649@elgon.mountain Cc: Frederic Weisbecker <fweisbec@gmail.com> Cc: Eduard - Gabriel Munteanu <eduard.munteanu@linux360.ro> Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com> Signed-off-by: Steven Rostedt <rostedt@goodmis.org>	2012-04-23 21:16:10 -04:00
Steven Rostedt	07d777fe8c	tracing: Add percpu buffers for trace_printk() Currently, trace_printk() uses a single buffer to write into to calculate the size and format needed to save the trace. To do this safely in an SMP environment, a spin_lock() is taken to only allow one writer at a time to the buffer. But this could also affect what is being traced, and add synchronization that would not be there otherwise. Ideally, using percpu buffers would be useful, but since trace_printk() is only used in development, having per cpu buffers for something never used is a waste of space. Thus, the use of the trace_bprintk() format section is changed to be used for static fmts as well as dynamic ones. Then at boot up, we can check if the section that holds the trace_printk formats is non-empty, and if it does contain something, then we know a trace_printk() has been added to the kernel. At this time the trace_printk per cpu buffers are allocated. A check is also done at module load time in case a module is added that contains a trace_printk(). Once the buffers are allocated, they are never freed. If you use a trace_printk() then you should know what you are doing. A buffer is made for each type of context: normal softirq irq nmi The context is checked and the appropriate buffer is used. This allows for totally lockless usage of trace_printk(), and they no longer even disable interrupts. Requested-by: Peter Zijlstra <a.p.zijlstra@chello.nl> Signed-off-by: Steven Rostedt <rostedt@goodmis.org>	2012-04-23 21:15:55 -04:00
Steven Rostedt	db4c75cbeb	tracing: Fix stacktrace of latency tracers (irqsoff and friends) While debugging a latency with someone on IRC (mirage335) on #linux-rt (OFTC), we discovered that the stacktrace output of the latency tracers (preemptirqsoff) was empty. This bug was caused by the creation of the dynamic length stack trace again (like commit `12b5da3` "tracing: Fix ent_size in trace output" was). This bug is caused by the latency tracers requiring the next event to determine the time between the current event and the next. But by grabbing the next event, the iter->ent_size is set to the next event instead of the current one. As the stacktrace event is the last event, this makes the ent_size zero and causes nothing to be printed for the stack trace. The dynamic stacktrace uses the ent_size to determine how much of the stack can be printed. The ent_size of zero means no stack. The simple fix is to save the iter->ent_size before finding the next event. Note, mirage335 asked to remain anonymous from LKML and git, so I will not add the Reported-by and Tested-by tags, even though he did report the issue and tested the fix. Cc: stable@vger.kernel.org # 3.1+ Signed-off-by: Steven Rostedt <rostedt@goodmis.org>	2012-04-19 17:00:13 -04:00
Masanari Iida	59bf896406	Fix "the the" in various Kconfig Fix typo "the the" in various Kconfig. Signed-off-by: Masanari Iida <standby24x7@gmail.com> Signed-off-by: Jiri Kosina <jkosina@suse.cz>	2012-04-18 14:12:27 +02:00
Steven Rostedt	348f0fc238	tracing: Fix regression with tracing_on The change to make tracing_on affect only the ftrace ring buffer, caused a bug where it wont affect any ring buffer. The problem was that the buffer of the trace_array was passed to the write function and not the trace array itself. The trace_array can change the buffer when running a latency tracer. If this happens, then the buffer being disabled may not be the buffer currently used by ftrace. This will cause the tracing_on file to become useless. The simple fix is to pass the trace_array to the write function instead of the buffer. Then the actual buffer may be changed. Signed-off-by: Steven Rostedt <rostedt@goodmis.org>	2012-04-16 15:41:28 -04:00
Mark Brown	6e48b550d1	tracing: Fix build breakage without CONFIG_PERF_EVENTS (again) Today's -next fails to link for me: kernel/built-in.o:(.data+0x178e50): undefined reference to `perf_ftrace_event_register' It looks like multiple fixes have been merged for the issue fixed by commit `fa73dc9` (tracing: Fix build breakage without CONFIG_PERF_EVENTS) though I can't identify the other changes that have gone in at the minute, it's possible that the changes which caused the breakage fixed by the previous commit got dropped but the fix made it in. Link: http://lkml.kernel.org/r/1334307179-21255-1-git-send-email-broonie@opensource.wolfsonmicro.com Cc: Frederic Weisbecker <fweisbec@gmail.com> Cc: Ingo Molnar <mingo@redhat.com> Signed-off-by: Mark Brown <broonie@opensource.wolfsonmicro.com> Signed-off-by: Steven Rostedt <rostedt@goodmis.org>	2012-04-13 21:37:04 -04:00
Stephen Boyd	d3283fb45c	trace: Remove unused workqueue tracer This tracer was temporarily removed in `6416669` (workqueue: temporarily remove workqueue tracing, 2010-06-29) but never reinstated after concurrency managed workqueues were completed. For almost two years it hasn't been compilable so it seems nobody is using it. Delete it. Signed-off-by: Stephen Boyd <sboyd@codeaurora.org> Acked-by: Frederic Weisbecker <fweisbec@gmail.com> Signed-off-by: Tejun Heo <tj@kernel.org>	2012-04-11 09:18:48 -07:00
Linus Torvalds	5d32c88f0b	Merge branch 'akpm' (Andrew's patch-bomb) Merge batch of fixes from Andrew Morton: "The simple_open() cleanup was held back while I wanted for laggards to merge things. I still need to send a few checkpoint/restore patches. I've been wobbly about merging them because I'm wobbly about the overall prospects for success of the project. But after speaking with Pavel at the LSF conference, it sounds like they're further toward completion than I feared - apparently davem is at the "has stopped complaining" stage regarding the net changes. So I need to go back and re-review those patchs and their (lengthy) discussion." * emailed from Andrew Morton <akpm@linux-foundation.org>: (16 patches) memcg swap: use mem_cgroup_uncharge_swap fix backlight: add driver for DA9052/53 PMIC v1 C6X: use set_current_blocked() and block_sigmask() MAINTAINERS: add entry for sparse checker MAINTAINERS: fix REMOTEPROC F: typo alpha: use set_current_blocked() and block_sigmask() simple_open: automatically convert to simple_open() scripts/coccinelle/api/simple_open.cocci: semantic patch for simple_open() libfs: add simple_open() hugetlbfs: remove unregister_filesystem() when initializing module drivers/rtc/rtc-88pm860x.c: fix rtc irq enable callback fs/xattr.c:setxattr(): improve handling of allocation failures fs/xattr.c:listxattr(): fall back to vmalloc() if kmalloc() failed fs/xattr.c: suppress page allocation failure warnings from sys_listxattr() sysrq: use SEND_SIG_FORCED instead of force_sig() proc: fix mount -t proc -o AAA	2012-04-05 15:30:34 -07:00
Stephen Boyd	234e340582	simple_open: automatically convert to simple_open() Many users of debugfs copy the implementation of default_open() when they want to support a custom read/write function op. This leads to a proliferation of the default_open() implementation across the entire tree. Now that the common implementation has been consolidated into libfs we can replace all the users of this function with simple_open(). This replacement was done with the following semantic patch: <smpl> @ open @ identifier open_f != simple_open; identifier i, f; @@ -int open_f(struct inode i, struct file f) -{ ( -if (i->i_private) -f->private_data = i->i_private; \| -f->private_data = i->i_private; ) -return 0; -} @ has_open depends on open @ identifier fops; identifier open.open_f; @@ struct file_operations fops = { ... -.open = open_f, +.open = simple_open, ... }; </smpl> [akpm@linux-foundation.org: checkpatch fixes] Signed-off-by: Stephen Boyd <sboyd@codeaurora.org> Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Cc: Al Viro <viro@zeniv.linux.org.uk> Cc: Julia Lawall <Julia.Lawall@lip6.fr> Acked-by: Ingo Molnar <mingo@elte.hu> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2012-04-05 15:25:50 -07:00
Steven Rostedt	12b5da349a	tracing: Fix ent_size in trace output When reading the trace file, the records of each of the per_cpu buffers are examined to find the next event to print out. At the point of looking at the event, the size of the event is recorded. But if the first event is chosen, the other events in the other CPU buffers will reset the event size that is stored in the iterator descriptor, causing the event size passed to the output functions to be incorrect. In most cases this is not a problem, but for the case of stack traces, it is. With the change to the stack tracing to record a dynamic number of back traces, the output depends on the size of the entry instead of the fixed 8 back traces. When the entry size is not correct, the back traces would not be fully printed. Note, reading from the per-cpu trace files were not affected. Reported-by: Thomas Gleixner <tglx@linutronix.de> Tested-by: Thomas Gleixner <tglx@linutronix.de> Cc: stable@vger.kernel.org Signed-off-by: Steven Rostedt <rostedt@goodmis.org>	2012-03-27 12:05:44 -04:00
Ingo Molnar	7fd52392c5	Merge branch 'linus' into perf/urgent Merge reason: we need to fix a non-trivial merge conflict. Signed-off-by: Ingo Molnar <mingo@elte.hu>	2012-03-26 17:19:03 +02:00
Ingo Molnar	04a54d27ce	Merge branch 'tip/perf/urgent' of git://git.kernel.org/pub/scm/linux/kernel/git/rostedt/linux-trace into perf/urgent	2012-03-24 08:19:09 +01:00
Wolfgang Mauerer	01de982abf	tracing: Fix ftrace stack trace entries 8 hex characters tell only half the tale for 64 bit CPUs, so use the appropriate length. Link: http://lkml.kernel.org/r/1332411501-8059-2-git-send-email-wolfgang.mauerer@siemens.com Cc: stable@vger.kernel.org Signed-off-by: Wolfgang Mauerer <wolfgang.mauerer@siemens.com> Signed-off-by: Steven Rostedt <rostedt@goodmis.org>	2012-03-22 12:19:23 -04:00
Linus Torvalds	e2a0883e40	Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs Pull vfs pile 1 from Al Viro: "This is _not_ all; in particular, Miklos' and Jan's stuff is not there yet." * 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs: (64 commits) ext4: initialization of ext4_li_mtx needs to be done earlier debugfs-related mode_t whack-a-mole hfsplus: add an ioctl to bless files hfsplus: change finder_info to u32 hfsplus: initialise userflags qnx4: new helper - try_extent() qnx4: get rid of qnx4_bread/qnx4_getblk take removal of PF_FORKNOEXEC to flush_old_exec() trim includes in inode.c um: uml_dup_mmap() relies on ->mmap_sem being held, but activate_mm() doesn't hold it um: embed ->stub_pages[] into mmu_context gadgetfs: list_for_each_safe() misuse ocfs2: fix leaks on failure exits in module_init ecryptfs: make register_filesystem() the last potential failure exit ntfs: forgets to unregister sysctls on register_filesystem() failure logfs: missing cleanup on register_filesystem() failure jfs: mising cleanup on register_filesystem() failure make configfs_pin_fs() return root dentry on success configfs: configfs_create_dir() has parent dentry in dentry->d_parent configfs: sanitize configfs_create() ...	2012-03-21 13:36:41 -07:00
Al Viro	38eff28926	constify path argument of trace_seq_path() Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>	2012-03-20 21:29:40 -04:00
Mark Brown	fa73dc9400	tracing: Fix build breakage without CONFIG_PERF_EVENTS Today's -next fails to build for me: CC kernel/trace/trace_export.o In file included from kernel/trace/trace_export.c:197: kernel/trace/trace_entries.h:58: error: 'perf_ftrace_event_register' undeclared here (not in a function) make[2]: * [kernel/trace/trace_export.o] Error 1 make[1]: * [kernel/trace] Error 2 make: *** [kernel] Error 2 because as of ced390 (ftrace, perf: Add support to use function tracepoint in perf) perf_trace_event_register() is declared in trace.h only if CONFIG_PERF_EVENTS is enabled but I don't have that set. Ensure that we always have a definition of perf_trace_event_register() by making the definition unconditional. Link: http://lkml.kernel.org/r/1330426967-17067-1-git-send-email-broonie@opensource.wolfsonmicro.com Cc: Frederic Weisbecker <fweisbec@gmail.com> Cc: Ingo Molnar <mingo@redhat.com> Cc: Jiri Olsa <jolsa@redhat.com> Signed-off-by: Mark Brown <broonie@opensource.wolfsonmicro.com> Signed-off-by: Steven Rostedt <rostedt@goodmis.org>	2012-03-13 18:34:59 -04:00
Rajesh Bhagat	db6544e007	ftrace: Fix function_graph for archs that test ftrace_trace_function When CONFIG_DYNAMIC_FTRACE is not set, some archs (ARM) test the variable function_trace_function to determine if it should call the function tracer. If it is not set to ftrace_stub, then it will call the function and return, and not call the function graph tracer. But some of these archs (ARM) do not have the assembly code to test if function tracing is enabled or not (quick stop of tracing) and it calls the helper routine ftrace_test_stop_func() instead. If function tracer is enabled and then disabled, the variable ftrace_trace_function is still set to the helper routine ftrace_test_stop_func(), and not to ftrace_stub. This will prevent the function graph tracer from ever running. Output before patch /debug/tracing # echo function > current_tracer /debug/tracing # echo function_graph > current_tracer /debug/tracing # cat trace Output after patch /debug/tracing # echo function > current_tracer /debug/tracing # echo function_graph > current_tracer /debug/tracing # cat trace 0) ! 253.375 us \| } /* irq_enter */ 0) \| generic_handle_irq() { 0) \| handle_fasteoi_irq() { 0) 9.208 us \| _raw_spin_lock(); 0) \| handle_irq_event() { 0) \| handle_irq_event_percpu() { Signed-off-by: Rajesh Bhagat <rajesh.lnx@gmail.com> Signed-off-by: Steven Rostedt <rostedt@goodmis.org>	2012-03-13 15:07:37 -04:00
Steven Rostedt	b892e5c897	tracing: Keep NMI watchdog from triggering when dumping trace As ftrace_dump() (called by ftrace_dump_on_oops) disables interrupts as it dumps its output to the console, it can keep interrupts disabled for long periods of time. This is likely to trigger the NMI watchdog, and it can disrupt the output of critical data. Add a touch_nmi_watchdog() to each event that is written to the screen to keep the NMI watchdog from affecting the output. Signed-off-by: Steven Rostedt <rostedt@goodmis.org>	2012-03-01 22:06:48 -05:00
Gerlando Falauto	8c9cf542b8	tracing: Do not select FRAME_POINTER on PPC On PowerPC, FUNCTION_TRACER selects FRAME_POINTER, even though the architecture does not support it. This causes the following warning: warning: (LOCKDEP && FAULT_INJECTION_STACKTRACE_FILTER && LATENCYTOP && FUNCTION_TRACER && KMEMCHECK) selects FRAME_POINTER which has unmet direct dependencies (DEBUG_KERNEL && (CRIS \|\| M68K \|\| FRV \|\| UML \|\| AVR32 \|\| SUPERH \|\| BLACKFIN \|\| MN10300) \|\| ARCH_WANT_FRAME_POINTERS) So remove the warning by adding the extra condition "if !PPC" to FUNCTION_TRACER for FRAME_POINTER selection Link: http://lkml.kernel.org/r/1330330101-8618-1-git-send-email-gerlando.falauto@keymile.com Cc: Frederic Weisbecker <fweisbec@gmail.com> Cc: Ingo Molnar <mingo@redhat.com> Signed-off-by: Gerlando Falauto <gerlando.falauto@keymile.com> Signed-off-by: Steven Rostedt <rostedt@goodmis.org>	2012-02-27 08:45:11 -05:00
Steven Rostedt	499e547057	tracing/ring-buffer: Only have tracing_on disable tracing buffers As the ring-buffer code is being used by other facilities in the kernel, having tracing_on file disable all buffers is not a desired affect. It should only disable the ftrace buffers that are being used. Move the code into the trace.c file and use the buffer disabling for tracing_on() and tracing_off(). This way only the ftrace buffers will be affected by them and other kernel utilities will not be confused to why their output suddenly stopped. Signed-off-by: Steven Rostedt <rostedt@goodmis.org>	2012-02-22 15:50:28 -05:00
Jiri Olsa	5500fa5119	ftrace, perf: Add filter support for function trace event Adding support to filter function trace event via perf interface. It is now possible to use filter interface in the perf tool like: perf record -e ftrace:function --filter="(ip == mm_)" ls The filter syntax is restricted to the the 'ip' field only, and following operators are accepted '==' '!=' '\|\|', ending up with the filter strings like: ip == f1[, ]f2 ... \|\| ip != f3[, ]f4 ... with comma ',' or space ' ' as a function separator. If the space ' ' is used as a separator, the right side of the assignment needs to be enclosed in double quotes '"', e.g.: perf record -e ftrace:function --filter '(ip == do_execve,sys_,ext)' ls perf record -e ftrace:function --filter '(ip == "do_execve,sys_,ext")' ls perf record -e ftrace:function --filter '(ip == "do_execve sys_ ext*")' ls The '==' operator adds trace filter with same effect as would be added via set_ftrace_filter file. The '!=' operator adds trace filter with same effect as would be added via set_ftrace_notrace file. The right side of the '!=', '==' operators is list of functions or regexp. to be added to filter separated by space. The '\|\|' operator is used for connecting multiple filter definitions together. It is possible to have more than one '==' and '!=' operators within one filter string. Link: http://lkml.kernel.org/r/1329317514-8131-8-git-send-email-jolsa@redhat.com Signed-off-by: Jiri Olsa <jolsa@redhat.com> Signed-off-by: Steven Rostedt <rostedt@goodmis.org>	2012-02-21 11:08:30 -05:00
Jiri Olsa	02aa3162ed	ftrace: Allow to specify filter field type for ftrace events Adding FILTER_TRACE_FN event field type for function tracepoint event, so it can be properly recognized within filtering code. Currently all fields of ftrace subsystem events share the common field type FILTER_OTHER. Since the function trace fields need special care within the filtering code we need to recognize it properly, hence adding the FILTER_TRACE_FN event type. Adding filter parameter to the FTRACE_ENTRY macro, to specify the filter field type for the event. Link: http://lkml.kernel.org/r/1329317514-8131-7-git-send-email-jolsa@redhat.com Signed-off-by: Jiri Olsa <jolsa@redhat.com> Signed-off-by: Steven Rostedt <rostedt@goodmis.org>	2012-02-21 11:08:29 -05:00
Jiri Olsa	ced39002f5	ftrace, perf: Add support to use function tracepoint in perf Adding perf registration support for the ftrace function event, so it is now possible to register it via perf interface. The perf_event struct statically contains ftrace_ops as a handle for function tracer. The function tracer is registered/unregistered in open/close actions. To be efficient, we enable/disable ftrace_ops each time the traced process is scheduled in/out (via TRACE_REG_PERF_(ADD\|DELL) handlers). This way tracing is enabled only when the process is running. Intentionally using this way instead of the event's hw state PERF_HES_STOPPED, which would not disable the ftrace_ops. It is now possible to use function trace within perf commands like: perf record -e ftrace:function ls perf stat -e ftrace:function ls Allowed only for root. Link: http://lkml.kernel.org/r/1329317514-8131-6-git-send-email-jolsa@redhat.com Acked-by: Frederic Weisbecker <fweisbec@gmail.com> Signed-off-by: Jiri Olsa <jolsa@redhat.com> Signed-off-by: Steven Rostedt <rostedt@goodmis.org>	2012-02-21 11:08:27 -05:00
Jiri Olsa	e59a0bff3e	ftrace: Add FTRACE_ENTRY_REG macro to allow event registration Adding FTRACE_ENTRY_REG macro so particular ftrace entries could specify registration function and thus become accesible via perf. This will be used in upcomming patch for function trace. Link: http://lkml.kernel.org/r/1329317514-8131-5-git-send-email-jolsa@redhat.com Acked-by: Frederic Weisbecker <fweisbec@gmail.com> Signed-off-by: Jiri Olsa <jolsa@redhat.com> Signed-off-by: Steven Rostedt <rostedt@goodmis.org>	2012-02-21 11:08:26 -05:00
Jiri Olsa	489c75c3b3	ftrace, perf: Add add/del tracepoint perf registration actions Adding TRACE_REG_PERF_ADD and TRACE_REG_PERF_DEL to handle perf event schedule in/out actions. The add action is invoked for when the perf event is scheduled in, while the del action is invoked when the event is scheduled out. Link: http://lkml.kernel.org/r/1329317514-8131-4-git-send-email-jolsa@redhat.com Acked-by: Frederic Weisbecker <fweisbec@gmail.com> Signed-off-by: Jiri Olsa <jolsa@redhat.com> Signed-off-by: Steven Rostedt <rostedt@goodmis.org>	2012-02-21 11:08:25 -05:00
Jiri Olsa	ceec0b6fc7	ftrace, perf: Add open/close tracepoint perf registration actions Adding TRACE_REG_PERF_OPEN and TRACE_REG_PERF_CLOSE to differentiate register/unregister from open/close actions. The register/unregister actions are invoked for the first/last tracepoint user when opening/closing the event. The open/close actions are invoked for each tracepoint user when opening/closing the event. Link: http://lkml.kernel.org/r/1329317514-8131-3-git-send-email-jolsa@redhat.com Acked-by: Frederic Weisbecker <fweisbec@gmail.com> Signed-off-by: Jiri Olsa <jolsa@redhat.com> Signed-off-by: Steven Rostedt <rostedt@goodmis.org>	2012-02-21 11:08:24 -05:00
Jiri Olsa	e248491ac2	ftrace: Add enable/disable ftrace_ops control interface Adding a way to temporarily enable/disable ftrace_ops. The change follows the same way as 'global' ftrace_ops are done. Introducing 2 global ftrace_ops - control_ops and ftrace_control_list which take over all ftrace_ops registered with FTRACE_OPS_FL_CONTROL flag. In addition new per cpu flag called 'disabled' is also added to ftrace_ops to provide the control information for each cpu. When ftrace_ops with FTRACE_OPS_FL_CONTROL is registered, it is set as disabled for all cpus. The ftrace_control_list contains all the registered 'control' ftrace_ops. The control_ops provides function which iterates ftrace_control_list and does the check for 'disabled' flag on current cpu. Adding 3 inline functions: ftrace_function_local_disable/ftrace_function_local_enable - enable/disable the ftrace_ops on current cpu ftrace_function_local_disabled - get disabled ftrace_ops::disabled value for current cpu Link: http://lkml.kernel.org/r/1329317514-8131-2-git-send-email-jolsa@redhat.com Acked-by: Frederic Weisbecker <fweisbec@gmail.com> Signed-off-by: Jiri Olsa <jolsa@redhat.com> Signed-off-by: Steven Rostedt <rostedt@goodmis.org>	2012-02-21 11:08:23 -05:00
Steven Rostedt	5b34926114	tracing: Don't use p->len field to determine output in __print_() functions If more than one __print_() function is used in a tracepoint (__print_flags(), __print_symbols(), etc), then the temp seq buffer will not be zero on entry. Using the temp seq buffer's length to know if data has been printed or not in the current function is incorrect and may produce incorrect results. Currently, no in-tree tracepoint causes this bug, but new ones may be created. Cc: Andrew Vagin <avagin@openvz.org> Signed-off-by: Steven Rostedt <rostedt@goodmis.org>	2012-02-21 11:08:13 -05:00
Andrey Vagin	e404b321db	tracing: Don't print an extra separator of flags If __print_flags() is used after another __print_*() function, the temp seq_file buffer will not be empty on entry, and the delimiter will be printed even though there's just one field. We get something like: \|S instead of just: S This is because the length of the temp seq buffer is used to determine if the delimiter is printed or not. But this algorithm fails when the seq buffer is not empty on entry, and the delimiter will be printed because it thinks that a previous field was already printed. Link: http://lkml.kernel.org/r/1329650167-480655-1-git-send-email-avagin@openvz.org Signed-off-by: Andrew Vagin <avagin@openvz.org> Cc: Ingo Molnar <mingo@redhat.com> Cc: Frederic Weisbecker <fweisbec@gmail.com> Signed-off-by: Steven Rostedt <rostedt@goodmis.org>	2012-02-20 20:33:31 -05:00
Thomas Meyer	47b0edcb59	tracing/trivial: Use kcalloc instead of kzalloc to allocate array The advantage of kcalloc is, that will prevent integer overflows which could result from the multiplication of number of elements and size and it is also a bit nicer to read. The semantic patch that makes this change is available in https://lkml.org/lkml/2011/11/25/107 Link: http://lkml.kernel.org/r/1322600880.1534.347.camel@localhost.localdomain Signed-off-by: Thomas Meyer <thomas@m3y3r.de> Signed-off-by: Steven Rostedt <rostedt@goodmis.org>	2012-02-13 13:48:11 -05:00
Geunsik Lim	1e42e83fde	ftrace: sched_switch plugin is deprecated Actually, sched_switch function tracer is merged into wakeup/wakeup_rt Update 'mini-HOWTO' for ftrace(Kernel function tracer). If we want to trace "sched:sched_switch" to trace sched_switch func, We may utilize event option.(e.g: trace-cmd list -e \| grep sched) This patch is based on Linux-3.3.rc2-SMP-PREEMPT Link: http://lkml.kernel.org/r/1328695537-15081-1-git-send-email-geunsik.lim@gmail.com Cc: Randy Dunlap <rdunlap@xenotime.net> Signed-off-by: Geunsik Lim <geunsik.lim@samsung.com> Signed-off-by: Steven Rostedt <rostedt@goodmis.org>	2012-02-13 09:14:47 -05:00
Jiri Olsa	ac483c446b	ftrace: Change filter/notrace set functions to return exit code Currently the ftrace_set_filter and ftrace_set_notrace functions do not return any return code. So there's no way for ftrace_ops user to tell wether the filter was correctly applied. The set_ftrace_filter interface returns error in case the filter did not match: # echo krava > set_ftrace_filter bash: echo: write error: Invalid argument Changing both ftrace_set_filter and ftrace_set_notrace functions to return zero if the filter was applied correctly or -E* values in case of error. Link: http://lkml.kernel.org/r/1325495060-6402-2-git-send-email-jolsa@redhat.com Acked-by: Frederic Weisbecker <fweisbec@gmail.com> Signed-off-by: Jiri Olsa <jolsa@redhat.com> Signed-off-by: Steven Rostedt <rostedt@goodmis.org>	2012-02-03 09:48:18 -05:00
Linus Torvalds	83c2f912b4	Merge branch 'perf-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip * 'perf-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (39 commits) perf tools: Fix compile error on x86_64 Ubuntu perf report: Fix --stdio output alignment when --showcpuutilization used perf annotate: Get rid of field_sep check perf annotate: Fix usage string perf kmem: Fix a memory leak perf kmem: Add missing closedir() calls perf top: Add error message for EMFILE perf test: Change type of '-v' option to INCR perf script: Add missing closedir() calls tracing: Fix compile error when static ftrace is enabled recordmcount: Fix handling of elf64 big-endian objects. perf tools: Add const.h to MANIFEST to make perf-tar-src-pkg work again perf tools: Add support for guest/host-only profiling perf kvm: Do guest-only counting by default perf top: Don't update total_period on process_sample perf hists: Stop using 'self' for struct hist_entry perf hists: Rename total_session to total_period x86: Add counter when debug stack is used with interrupts enabled x86: Allow NMIs to hit breakpoints in i386 x86: Keep current stack in NMI breakpoints ...	2012-01-15 11:26:35 -08:00
Linus Torvalds	972b2c7199	Merge branch 'for-linus2' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs * 'for-linus2' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs: (165 commits) reiserfs: Properly display mount options in /proc/mounts vfs: prevent remount read-only if pending removes vfs: count unlinked inodes vfs: protect remounting superblock read-only vfs: keep list of mounts for each superblock vfs: switch ->show_options() to struct dentry * vfs: switch ->show_path() to struct dentry * vfs: switch ->show_devname() to struct dentry * vfs: switch ->show_stats to struct dentry * switch security_path_chmod() to struct path * vfs: prefer ->dentry->d_sb to ->mnt->mnt_sb vfs: trim includes a bit switch mnt_namespace ->root to struct mount vfs: take /proc//mounts and friends to fs/proc_namespace.c vfs: opencode mntget() mnt_set_mountpoint() vfs: spread struct mount - remaining argument of next_mnt() vfs: move fsnotify junk to struct mount vfs: move mnt_devname vfs: move mnt_list to struct mount vfs: switch pnode.h macros to struct mount ...	2012-01-08 12:19:57 -08:00
Linus Torvalds	35b740e466	Merge branch 'perf-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip * 'perf-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (106 commits) perf kvm: Fix copy & paste error in description perf script: Kill script_spec__delete perf top: Fix a memory leak perf stat: Introduce get_ratio_color() helper perf session: Remove impossible condition check perf tools: Fix feature-bits rework fallout, remove unused variable perf script: Add generic perl handler to process events perf tools: Use for_each_set_bit() to iterate over feature flags perf tools: Unify handling of features when writing feature section perf report: Accept fifos as input file perf tools: Moving code in some files perf tools: Fix out-of-bound access to struct perf_session perf tools: Continue processing header on unknown features perf tools: Improve macros for struct feature_ops perf: builtin-record: Document and check that mmap_pages must be a power of two. perf: builtin-record: Provide advice if mmap'ing fails with EPERM. perf tools: Fix truncated annotation perf script: look up thread using tid instead of pid perf tools: Look up thread names for system wide profiling perf tools: Fix comm for processes with named threads ...	2012-01-06 08:02:58 -08:00
Al Viro	f4ae40a6a5	switch debugfs to umode_t Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>	2012-01-03 22:54:56 -05:00
Tejun Heo	38b78eb855	tracing: Factorize filter creation There are four places where new filter for a given filter string is created, which involves several different steps. This patch factors those steps into create_[system_]filter() functions which in turn make use of create_filter_{start\|finish}() for common parts. The only functional change is that if replace_filter_string() is requested and fails, creation fails without any side effect instead of being ignored. Note that system filter is now installed after the processing is complete which makes freeing before and then restoring filter string on error unncessary. -v2: Rebased to resolve conflict with `49aa29513e` and updated both create_filter() functions to always set *filterp instead of requiring the caller to clear it to %NULL on entry. Link: http://lkml.kernel.org/r/1323988305-1469-2-git-send-email-tj@kernel.org Signed-off-by: Tejun Heo <tj@kernel.org> Signed-off-by: Steven Rostedt <rostedt@goodmis.org>	2011-12-21 07:27:02 -05:00
Steven Rostedt	762e120788	tracing: Have stack tracing set filtered functions at boot Add stacktrace_filter= to the kernel command line that lets the user pick specific functions to check the stack on. Signed-off-by: Steven Rostedt <rostedt@goodmis.org>	2011-12-21 07:26:49 -05:00
Steven Rostedt	2a85a37f16	ftrace: Allow access to the boot time function enabling Change set_ftrace_early_filter() to ftrace_set_early_filter() and make it a global function. This will allow other subsystems in the kernel to be able to enable function tracing at start up and reuse the ftrace function parsing code. Signed-off-by: Steven Rostedt <rostedt@goodmis.org>	2011-12-21 07:26:35 -05:00
Steven Rostedt	d2d45c7a03	tracing: Have stack_tracer use a separate list of functions The stack_tracer is used to look at every function and check if the current stack is bigger than the last recorded max stack size. When a new max is found, then it saves that stack off. Currently the stack tracer is limited by the global_ops of the function tracer. As the stack tracer has nothing to do with the ftrace function tracer, except that it uses it as its internal engine, the stack tracer should have its own list. A new file is added to the tracing debugfs directory called: stack_trace_filter that can be used to select which functions you want to check the stack on. Signed-off-by: Steven Rostedt <rostedt@goodmis.org>	2011-12-21 07:25:57 -05:00
Steven Rostedt	69a3083c4a	ftrace: Decouple hash items from showing filtered functions The set_ftrace_filter shows "hashed" functions, which are functions that are added with operations to them (like traceon and traceoff). As other subsystems may be able to show what functions they are using for function tracing, the hash items should no longer be shown just because the FILTER flag is set. As they have nothing to do with other subsystems filters. Signed-off-by: Steven Rostedt <rostedt@goodmis.org>	2011-12-21 07:25:24 -05:00
Steven Rostedt	fc13cb0ce4	ftrace: Allow other users of function tracing to use the output listing The function tracer is set up to allow any other subsystem (like perf) to use it. Ftrace already has a way to list what functions are enabled by the global_ops. It would be very helpful to let other users of the function tracer to be able to use the same code. Signed-off-by: Steven Rostedt <rostedt@goodmis.org>	2011-12-21 07:25:06 -05:00
Steven Rostedt	06a51d9307	ftrace: Create ftrace_hash_empty() helper routine There are two types of hashes in the ftrace_ops; one type is the filter_hash and the other is the notrace_hash. Either one may be null, meaning it has no elements. But when elements are added, the hash is allocated. Throughout the code, a check needs to be made to see if a hash exists or the hash has elements, but the check if the hash exists is usually missing causing the possible "NULL pointer dereference bug". Add a helper routine called "ftrace_hash_empty()" that returns true if the hash doesn't exist or its count is zero. As they mean the same thing. Last-bug-reported-by: Jiri Olsa <jolsa@redhat.com> Signed-off-by: Steven Rostedt <rostedt@goodmis.org>	2011-12-21 07:23:11 -05:00
Steven Rostedt	c842e97552	ftrace: Fix ftrace hash record update with notrace When disabling the "notrace" records, that means we want to trace them. If the notrace_hash is zero, it means that we want to trace all records. But to disable a zero notrace_hash means nothing. The check for the notrace_hash count was incorrect with: if (hash && !hash->count) return With the correct comment above it that states that we do nothing if the notrace_hash has zero count. But !hash also means that the notrace hash has zero count. I think this was done to protect against dereferencing NULL. But if !hash is true, then we go through the following loop without doing a single thing. Fix it to: if (!hash \|\| !hash->count) return; Signed-off-by: Steven Rostedt <rostedt@goodmis.org>	2011-12-21 07:21:43 -05:00
Steven Rostedt	5855fead9c	ftrace: Use bsearch to find record ip Now that each set of pages in the function list are sorted by ip, we can use bsearch to find a record within each set of pages. This speeds up the ftrace_location() function by magnitudes. For archs (like x86) that need to add a breakpoint at every function that will be converted from a nop to a callback and vice versa, the breakpoint callback needs to know if the breakpoint was for ftrace or not. It requires finding the breakpoint ip within the records. Doing a linear search is extremely inefficient. It is a must to be able to do a fast binary search to find these locations. Signed-off-by: Steven Rostedt <rostedt@goodmis.org>	2011-12-21 07:20:50 -05:00
Steven Rostedt	68950619f8	ftrace: Sort the mcount records on each page Sort records by ip locations of the ftrace mcount calls on each of the set of pages in the function list. This helps in localizing cache usuage when updating the function locations, as well as gives us the ability to quickly find an ip location in the list. Signed-off-by: Steven Rostedt <rostedt@goodmis.org>	2011-12-21 07:19:58 -05:00
Steven Rostedt	85ae32ae01	ftrace: Replace record newlist with record page list As new functions come in to be initalized from mcount to nop, they are done by groups of pages. Whether it is the core kernel or a module. There's no need to keep track of these on a per record basis. At startup, and as any module is loaded, the functions to be traced are stored in a group of pages and added to the function list at the end. We just need to keep a pointer to the first page of the list that was added, and use that to know where to start on the list for initializing functions. Signed-off-by: Steven Rostedt <rostedt@goodmis.org>	2011-12-21 07:19:03 -05:00
Steven Rostedt	a790087554	ftrace: Allocate the mcount record pages as groups Allocate the mcount record pages as a group of pages as big as can be allocated and waste no more than a single page. Grouping the mcount pages as much as possible helps with cache locality, as we do not need to redirect with descriptors as we cross from page to page. It also allows us to do more with the records later on (sort them with bigger benefits). Signed-off-by: Steven Rostedt <rostedt@goodmis.org>	2011-12-21 07:18:30 -05:00
Steven Rostedt	3208230983	ftrace: Remove usage of "freed" records Records that are added to the function trace table are permanently there, except for modules. By separating out the modules to their own pages that can be freed in one shot we can remove the "freed" flag and simplify some of the record management. Another benefit of doing this is that we can also move the records around; sort them. Signed-off-by: Steven Rostedt <rostedt@goodmis.org>	2011-12-21 07:17:57 -05:00
Steven Rostedt	c88fd8634e	ftrace: Allow archs to modify code without stop machine The stop machine method to modify all functions in the kernel (some 20,000 of them) is the safest way to do so across all archs. But some archs may not need this big hammer approach to modify code on SMP machines, and can simply just update the code it needs. Adding a weak function arch_ftrace_update_code() that now does the stop machine, will also let any arch override this method. If the arch needs to check the system and then decide if it can avoid stop machine, it can still call ftrace_run_stop_machine() to use the old method. Signed-off-by: Steven Rostedt <rostedt@goodmis.org>	2011-12-21 07:16:58 -05:00
Jiri Olsa	30fb6aa740	ftrace: Fix unregister ftrace_ops accounting Multiple users of the function tracer can register their functions with the ftrace_ops structure. The accounting within ftrace will update the counter on each function record that is being traced. When the ftrace_ops filtering adds or removes functions, the function records will be updated accordingly if the ftrace_ops is still registered. When a ftrace_ops is removed, the counter of the function records, that the ftrace_ops traces, are decremented. When they reach zero the functions that they represent are modified to stop calling the mcount code. When changes are made, the code is updated via stop_machine() with a command passed to the function to tell it what to do. There is an ENABLE and DISABLE command that tells the called function to enable or disable the functions. But the ENABLE is really a misnomer as it should just update the records, as records that have been enabled and now have a count of zero should be disabled. The DISABLE command is used to disable all functions regardless of their counter values. This is the big off switch and is not the complement of the ENABLE command. To make matters worse, when a ftrace_ops is unregistered and there is another ftrace_ops registered, neither the DISABLE nor the ENABLE command are set when calling into the stop_machine() function and the records will not be updated to match their counter. A command is passed to that function that will update the mcount code to call the registered callback directly if it is the only one left. This means that the ftrace_ops that is still registered will have its callback called by all functions that have been set for it as well as the ftrace_ops that was just unregistered. Here's a way to trigger this bug. Compile the kernel with CONFIG_FUNCTION_PROFILER set and with CONFIG_FUNCTION_GRAPH not set: CONFIG_FUNCTION_PROFILER=y # CONFIG_FUNCTION_GRAPH is not set This will force the function profiler to use the function tracer instead of the function graph tracer. # cd /sys/kernel/debug/tracing # echo schedule > set_ftrace_filter # echo function > current_tracer # cat set_ftrace_filter schedule # cat trace # tracer: nop # # entries-in-buffer/entries-written: 692/68108025 #P:4 # # _-----=> irqs-off # / _----=> need-resched # \| / _---=> hardirq/softirq # \|\| / _--=> preempt-depth # \|\|\| / delay # TASK-PID CPU# \|\|\|\| TIMESTAMP FUNCTION # \| \| \| \|\|\|\| \| \| kworker/0:2-909 [000] .... 531.235574: schedule <-worker_thread <idle>-0 [001] .N.. 531.235575: schedule <-cpu_idle kworker/0:2-909 [000] .... 531.235597: schedule <-worker_thread sshd-2563 [001] .... 531.235647: schedule <-schedule_hrtimeout_range_clock # echo 1 > function_profile_enabled # echo 0 > function_porfile_enabled # cat set_ftrace_filter schedule # cat trace # tracer: function # # entries-in-buffer/entries-written: 159701/118821262 #P:4 # # _-----=> irqs-off # / _----=> need-resched # \| / _---=> hardirq/softirq # \|\| / _--=> preempt-depth # \|\|\| / delay # TASK-PID CPU# \|\|\|\| TIMESTAMP FUNCTION # \| \| \| \|\|\|\| \| \| <idle>-0 [002] ...1 604.870655: local_touch_nmi <-cpu_idle <idle>-0 [002] d..1 604.870655: enter_idle <-cpu_idle <idle>-0 [002] d..1 604.870656: atomic_notifier_call_chain <-enter_idle <idle>-0 [002] d..1 604.870656: __atomic_notifier_call_chain <-atomic_notifier_call_chain The same problem could have happened with the trace_probe_ops, but they are modified with the set_frace_filter file which does the update at closure of the file. The simple solution is to change ENABLE to UPDATE and call it every time an ftrace_ops is unregistered. Link: http://lkml.kernel.org/r/1323105776-26961-3-git-send-email-jolsa@redhat.com Cc: stable@vger.kernel.org # 3.0+ Signed-off-by: Jiri Olsa <jolsa@redhat.com> Signed-off-by: Steven Rostedt <rostedt@goodmis.org>	2011-12-21 07:09:14 -05:00
Paul E. McKenney	a8eecf2248	trace: Allow ftrace_dump() to be called from modules Add an EXPORT_SYMBOL_GPL() so that rcutorture can dump the trace buffer upon detection of an RCU error. Signed-off-by: Paul E. McKenney <paul.mckenney@linaro.org> Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com> Reviewed-by: Josh Triplett <josh@joshtriplett.org>	2011-12-11 10:31:25 -08:00
Ingo Molnar	cc991b83b3	Merge branch 'tip/perf/core' of git://git.kernel.org/pub/scm/linux/kernel/git/rostedt/linux-trace into perf/core	2011-12-06 19:09:15 +01:00
Ingo Molnar	d6c1c49de5	Merge branch 'perf/urgent' into perf/core Merge reason: Add these cherry-picked commits so that future changes on perf/core don't conflict. Signed-off-by: Ingo Molnar <mingo@elte.hu>	2011-12-06 06:43:49 +01:00
Steven Rostedt	ddf6e0e507	ftrace: Fix hash record accounting bug If the set_ftrace_filter is cleared by writing just whitespace to it, then the filter hash refcounts will be decremented but not updated. This causes two bugs: 1) No functions will be enabled for tracing when they all should be 2) If the users clears the set_ftrace_filter twice, it will crash ftrace: ------------[ cut here ]------------ WARNING: at /home/rostedt/work/git/linux-trace.git/kernel/trace/ftrace.c:1384 __ftrace_hash_rec_update.part.27+0x157/0x1a7() Modules linked in: Pid: 2330, comm: bash Not tainted 3.1.0-test+ #32 Call Trace: [<ffffffff81051828>] warn_slowpath_common+0x83/0x9b [<ffffffff8105185a>] warn_slowpath_null+0x1a/0x1c [<ffffffff810ba362>] __ftrace_hash_rec_update.part.27+0x157/0x1a7 [<ffffffff810ba6e8>] ? ftrace_regex_release+0xa7/0x10f [<ffffffff8111bdfe>] ? kfree+0xe5/0x115 [<ffffffff810ba51e>] ftrace_hash_move+0x2e/0x151 [<ffffffff810ba6fb>] ftrace_regex_release+0xba/0x10f [<ffffffff8112e49a>] fput+0xfd/0x1c2 [<ffffffff8112b54c>] filp_close+0x6d/0x78 [<ffffffff8113a92d>] sys_dup3+0x197/0x1c1 [<ffffffff8113a9a6>] sys_dup2+0x4f/0x54 [<ffffffff8150cac2>] system_call_fastpath+0x16/0x1b ---[ end trace 77a3a7ee73794a02 ]--- Link: http://lkml.kernel.org/r/20111101141420.GA4918@debian Reported-by: Rabin Vincent <rabin@rab.in> Signed-off-by: Steven Rostedt <rostedt@goodmis.org>	2011-12-05 13:28:47 -05:00
Steven Rostedt	c7c6ec8bec	ftrace: Remove force undef config value left for testing A forced undef of a config value was used for testing and was accidently left in during the final commit. This causes x86 to run slower than needed while running function tracing as well as causes the function graph selftest to fail when DYNMAIC_FTRACE is not set. This is because the code in MCOUNT expects the ftrace code to be processed with the config value set that happened to be forced not set. The forced config option was left in by: commit `6331c28c96` ftrace: Fix dynamic selftest failure on some archs Link: http://lkml.kernel.org/r/20111102150255.GA6973@debian Cc: stable@vger.kernel.org Reported-by: Rabin Vincent <rabin@rab.in> Signed-off-by: Steven Rostedt <rostedt@goodmis.org>	2011-12-05 13:28:45 -05:00
Li Zefan	27b14b56af	tracing: Restore system filter behavior Though not all events have field 'prev_pid', it was allowed to do this: # echo 'prev_pid == 100' > events/sched/filter but commit `75b8e98263` (tracing/filter: Swap entire filter of events) broke it without any reason. Link: http://lkml.kernel.org/r/4EAF46CF.8040408@cn.fujitsu.com Signed-off-by: Li Zefan <lizf@cn.fujitsu.com> Signed-off-by: Steven Rostedt <rostedt@goodmis.org>	2011-12-05 13:28:45 -05:00
Ilya Dryomov	cb59974742	tracing: fix event_subsystem ref counting Fix a bug introduced by `e9dbfae5`, which prevents event_subsystem from ever being released. Ref_count was added to keep track of subsystem users, not for counting events. Subsystem is created with ref_count = 1, so there is no need to increment it for every event, we have nr_events for that. Fix this by touching ref_count only when we actually have a new user - subsystem_open(). Cc: stable@vger.kernel.org Signed-off-by: Ilya Dryomov <idryomov@gmail.com> Link: http://lkml.kernel.org/r/1320052062-7846-1-git-send-email-idryomov@gmail.com Signed-off-by: Steven Rostedt <rostedt@goodmis.org>	2011-12-05 13:28:44 -05:00
Tejun Heo	d3d9acf646	trace_events_filter: Use rcu_assign_pointer() when setting ftrace_event_call->filter ftrace_event_call->filter is sched RCU protected but didn't use rcu_assign_pointer(). Use it. TODO: Add proper __rcu annotation to call->filter and all its users. -v2: Use RCU_INIT_POINTER() for %NULL clearing as suggested by Eric. Link: http://lkml.kernel.org/r/20111123164949.GA29639@google.com Cc: Eric Dumazet <eric.dumazet@gmail.com> Cc: Frederic Weisbecker <fweisbec@gmail.com> Cc: Jiri Olsa <jolsa@redhat.com> Cc: stable@kernel.org # (2.6.39+) Signed-off-by: Tejun Heo <tj@kernel.org> Signed-off-by: Steven Rostedt <rostedt@goodmis.org>	2011-12-01 22:16:47 -05:00
Steven Rostedt	39eaf7ef88	tracing: Add entries in buffer and total entries to default output header Knowing the number of event entries in the ring buffer compared to the total number that were written is useful information. The latency format gives this information and there's no reason that the default format does not. This information is now added to the default header, along with the number of online CPUs: # tracer: nop # # entries-in-buffer/entries-written: 159836/64690869 #P:4 # # _-----=> irqs-off # / _----=> need-resched # \| / _---=> hardirq/softirq # \|\| / _--=> preempt-depth # \|\|\| / delay # TASK-PID CPU# \|\|\|\| TIMESTAMP FUNCTION # \| \| \| \|\|\|\| \| \| <idle>-0 [000] ...2 49.442971: local_touch_nmi <-cpu_idle <idle>-0 [000] d..2 49.442973: enter_idle <-cpu_idle <idle>-0 [000] d..2 49.442974: atomic_notifier_call_chain <-enter_idle <idle>-0 [000] d..2 49.442976: __atomic_notifier_call_chain <-atomic_notifier The above shows that the trace contains 159836 entries, but 64690869 were written. One could figure out that there were 64531033 entries that were dropped. Signed-off-by: Steven Rostedt <rostedt@goodmis.org>	2011-11-17 11:10:43 -05:00
Steven Rostedt	77271ce4b2	tracing: Add irq, preempt-count and need resched info to default trace output People keep asking how to get the preempt count, irq, and need resched info and we keep telling them to enable the latency format. Some developers think that traces without this info is completely useless, and for a lot of tasks it is useless. The first option was to enable the latency trace as the default format, but the header for the latency format is pretty useless for most tracers and it also does the timestamp in straight microseconds from the time the trace started. This is sometimes more difficult to read as the default trace is seconds from the start of boot up. Latency format: # tracer: nop # # nop latency trace v1.1.5 on 3.2.0-rc1-test+ # -------------------------------------------------------------------- # latency: 0 us, #159771/64234230, CPU#1 \| (M:preempt VP:0, KP:0, SP:0 HP:0 #P:4) # ----------------- # \| task: -0 (uid:0 nice:0 policy:0 rt_prio:0) # ----------------- # # _------=> CPU# # / _-----=> irqs-off # \| / _----=> need-resched # \|\| / _---=> hardirq/softirq # \|\|\| / _--=> preempt-depth # \|\|\|\| / delay # cmd pid \|\|\|\|\| time \| caller # \ / \|\|\|\|\| \ \| / migratio-6 0...2 41778231us+: rcu_note_context_switch <-__schedule migratio-6 0...2 41778233us : trace_rcu_utilization <-rcu_note_context_switch migratio-6 0...2 41778235us+: rcu_sched_qs <-rcu_note_context_switch migratio-6 0d..2 41778236us+: rcu_preempt_qs <-rcu_note_context_switch migratio-6 0...2 41778238us : trace_rcu_utilization <-rcu_note_context_switch migratio-6 0...2 41778239us+: debug_lockdep_rcu_enabled <-__schedule default format: # tracer: nop # # TASK-PID CPU# TIMESTAMP FUNCTION # \| \| \| \| \| migration/0-6 [000] 50.025810: rcu_note_context_switch <-__schedule migration/0-6 [000] 50.025812: trace_rcu_utilization <-rcu_note_context_switch migration/0-6 [000] 50.025813: rcu_sched_qs <-rcu_note_context_switch migration/0-6 [000] 50.025815: rcu_preempt_qs <-rcu_note_context_switch migration/0-6 [000] 50.025817: trace_rcu_utilization <-rcu_note_context_switch migration/0-6 [000] 50.025818: debug_lockdep_rcu_enabled <-__schedule migration/0-6 [000] 50.025820: debug_lockdep_rcu_enabled <-__schedule The latency format header has latency information that is pretty meaningless for most tracers. Although some of the header is useful, and we can add that later to the default format as well. What is really useful with the latency format is the irqs-off, need-resched hard/softirq context and the preempt count. This commit adds the option irq-info which is on by default that adds this information: # tracer: nop # # _-----=> irqs-off # / _----=> need-resched # \| / _---=> hardirq/softirq # \|\| / _--=> preempt-depth # \|\|\| / delay # TASK-PID CPU# \|\|\|\| TIMESTAMP FUNCTION # \| \| \| \|\|\|\| \| \| <idle>-0 [000] d..2 49.309305: cpuidle_get_driver <-cpuidle_idle_call <idle>-0 [000] d..2 49.309307: mwait_idle <-cpu_idle <idle>-0 [000] d..2 49.309309: need_resched <-mwait_idle <idle>-0 [000] d..2 49.309310: test_ti_thread_flag <-need_resched <idle>-0 [000] d..2 49.309312: trace_power_start.constprop.13 <-mwait_idle <idle>-0 [000] d..2 49.309313: trace_cpu_idle <-mwait_idle <idle>-0 [000] d..2 49.309315: need_resched <-mwait_idle If a user wants the old format, they can disable the 'irq-info' option: # tracer: nop # # TASK-PID CPU# TIMESTAMP FUNCTION # \| \| \| \| \| <idle>-0 [000] 49.309305: cpuidle_get_driver <-cpuidle_idle_call <idle>-0 [000] 49.309307: mwait_idle <-cpu_idle <idle>-0 [000] 49.309309: need_resched <-mwait_idle <idle>-0 [000] 49.309310: test_ti_thread_flag <-need_resched <idle>-0 [000] 49.309312: trace_power_start.constprop.13 <-mwait_idle <idle>-0 [000] 49.309313: trace_cpu_idle <-mwait_idle <idle>-0 [000] 49.309315: need_resched <-mwait_idle Requested-by: Thomas Gleixner <tglx@linutronix.de> Signed-off-by: Steven Rostedt <rostedt@goodmis.org>	2011-11-17 09:58:48 -05:00
Ingo Molnar	efc96737bd	Merge branch 'tip/perf/core' of git://git.kernel.org/pub/scm/linux/kernel/git/rostedt/linux-trace into perf/core	2011-11-11 08:19:37 +01:00
Jiri Olsa	7e9a49ef54	tracing/latency: Fix header output for latency tracers In case the the graph tracer (CONFIG_FUNCTION_GRAPH_TRACER) or even the function tracer (CONFIG_FUNCTION_TRACER) are not set, the latency tracers do not display proper latency header. The involved/fixed latency tracers are: wakeup_rt wakeup preemptirqsoff preemptoff irqsoff The patch adds proper handling of tracer configuration options for latency tracers, and displaying correct header info accordingly. * The current output (for wakeup tracer) with both graph and function tracers disabled is: # tracer: wakeup # <idle>-0 0d.h5 1us+: 0:120:R + [000] 7: 0:R watchdog/0 <idle>-0 0d.h5 3us+: ttwu_do_activate.clone.1 <-try_to_wake_up ... * The fixed output is: # tracer: wakeup # # wakeup latency trace v1.1.5 on 3.1.0-tip+ # -------------------------------------------------------------------- # latency: 55 us, #4/4, CPU#0 \| (M:preempt VP:0, KP:0, SP:0 HP:0 #P:2) # ----------------- # \| task: migration/0-6 (uid:0 nice:0 policy:1 rt_prio:99) # ----------------- # # _------=> CPU# # / _-----=> irqs-off # \| / _----=> need-resched # \|\| / _---=> hardirq/softirq # \|\|\| / _--=> preempt-depth # \|\|\|\| / delay # cmd pid \|\|\|\|\| time \| caller # \ / \|\|\|\|\| \ \| / cat-1129 0d..4 1us : 1129:120:R + [000] 6: 0:R migration/0 cat-1129 0d..4 2us+: ttwu_do_activate.clone.1 <-try_to_wake_up * The current output (for wakeup tracer) with only function tracer enabled is: # tracer: wakeup # cat-1140 0d..4 1us+: 1140:120:R + [000] 6: 0:R migration/0 cat-1140 0d..4 2us : ttwu_do_activate.clone.1 <-try_to_wake_up * The fixed output is: # tracer: wakeup # # wakeup latency trace v1.1.5 on 3.1.0-tip+ # -------------------------------------------------------------------- # latency: 207 us, #109/109, CPU#1 \| (M:preempt VP:0, KP:0, SP:0 HP:0 #P:2) # ----------------- # \| task: watchdog/1-12 (uid:0 nice:0 policy:1 rt_prio:99) # ----------------- # # _------=> CPU# # / _-----=> irqs-off # \| / _----=> need-resched # \|\| / _---=> hardirq/softirq # \|\|\| / _--=> preempt-depth # \|\|\|\| / delay # cmd pid \|\|\|\|\| time \| caller # \ / \|\|\|\|\| \ \| / <idle>-0 1d.h5 1us+: 0:120:R + [001] 12: 0:R watchdog/1 <idle>-0 1d.h5 3us : ttwu_do_activate.clone.1 <-try_to_wake_up Link: http://lkml.kernel.org/r/20111107150849.GE1807@m.brq.redhat.com Cc: Frederic Weisbecker <fweisbec@gmail.com> Cc: Ingo Molnar <mingo@redhat.com> Signed-off-by: Jiri Olsa <jolsa@redhat.com> Signed-off-by: Steven Rostedt <rostedt@goodmis.org>	2011-11-07 13:48:35 -05:00
Steven Rostedt	d4d34b981a	ftrace: Fix hash record accounting bug If the set_ftrace_filter is cleared by writing just whitespace to it, then the filter hash refcounts will be decremented but not updated. This causes two bugs: 1) No functions will be enabled for tracing when they all should be 2) If the users clears the set_ftrace_filter twice, it will crash ftrace: ------------[ cut here ]------------ WARNING: at /home/rostedt/work/git/linux-trace.git/kernel/trace/ftrace.c:1384 __ftrace_hash_rec_update.part.27+0x157/0x1a7() Modules linked in: Pid: 2330, comm: bash Not tainted 3.1.0-test+ #32 Call Trace: [<ffffffff81051828>] warn_slowpath_common+0x83/0x9b [<ffffffff8105185a>] warn_slowpath_null+0x1a/0x1c [<ffffffff810ba362>] __ftrace_hash_rec_update.part.27+0x157/0x1a7 [<ffffffff810ba6e8>] ? ftrace_regex_release+0xa7/0x10f [<ffffffff8111bdfe>] ? kfree+0xe5/0x115 [<ffffffff810ba51e>] ftrace_hash_move+0x2e/0x151 [<ffffffff810ba6fb>] ftrace_regex_release+0xba/0x10f [<ffffffff8112e49a>] fput+0xfd/0x1c2 [<ffffffff8112b54c>] filp_close+0x6d/0x78 [<ffffffff8113a92d>] sys_dup3+0x197/0x1c1 [<ffffffff8113a9a6>] sys_dup2+0x4f/0x54 [<ffffffff8150cac2>] system_call_fastpath+0x16/0x1b ---[ end trace 77a3a7ee73794a02 ]--- Link: http://lkml.kernel.org/r/20111101141420.GA4918@debian Reported-by: Rabin Vincent <rabin@rab.in> Signed-off-by: Steven Rostedt <rostedt@goodmis.org>	2011-11-07 13:48:05 -05:00
Steven Rostedt	8ee3c92b7f	ftrace: Remove force undef config value left for testing A forced undef of a config value was used for testing and was accidently left in during the final commit. This causes x86 to run slower than needed while running function tracing as well as causes the function graph selftest to fail when DYNMAIC_FTRACE is not set. This is because the code in MCOUNT expects the ftrace code to be processed with the config value set that happened to be forced not set. The forced config option was left in by: commit `6331c28c96` ftrace: Fix dynamic selftest failure on some archs Link: http://lkml.kernel.org/r/20111102150255.GA6973@debian Cc: stable@vger.kernel.org Reported-by: Rabin Vincent <rabin@rab.in> Signed-off-by: Steven Rostedt <rostedt@goodmis.org>	2011-11-07 11:02:33 -05:00
Steven Rostedt	49aa29513e	tracing: Add boiler plate for subsystem filter The system filter can be used to set multiple event filters that exist within the system. But currently it displays the last filter written that does not necessarily correspond to the filters within the system. The system filter itself is not used to filter any events. The system filter is just a means to set filters of the events within it. Because this causes an ambiguous state when the system filter reads a filter string but the events within the system have different strings it is best to just show a boiler plate: ### global filter ### # Use this to set filters for multiple events. # Only events with the given fields will be affected. # If no events are modified, an error message will be displayed here. If an error occurs while writing to the system filter, the system filter will replace the boiler plate with the error message as it currently does. Signed-off-by: Steven Rostedt <rostedt@goodmis.org>	2011-11-04 21:25:36 -04:00
Li Zefan	ed0449af53	tracing: Restore system filter behavior Though not all events have field 'prev_pid', it was allowed to do this: # echo 'prev_pid == 100' > events/sched/filter but commit `75b8e98263` (tracing/filter: Swap entire filter of events) broke it without any reason. Link: http://lkml.kernel.org/r/4EAF46CF.8040408@cn.fujitsu.com Signed-off-by: Li Zefan <lizf@cn.fujitsu.com> Signed-off-by: Steven Rostedt <rostedt@goodmis.org>	2011-11-02 13:56:25 -04:00
Paul Gortmaker	6e5fdeedca	kernel: Fix files explicitly needing EXPORT_SYMBOL infrastructure These files were getting <linux/module.h> via an implicit non-obvious path, but we want to crush those out of existence since they cost time during compiles of processing thousands of lines of headers for no reason. Give them the lightweight header that just contains the EXPORT_SYMBOL infrastructure. Signed-off-by: Paul Gortmaker <paul.gortmaker@windriver.com>	2011-10-31 19:30:05 -04:00
Ilya Dryomov	d631097577	tracing: fix event_subsystem ref counting Fix a bug introduced by `e9dbfae5`, which prevents event_subsystem from ever being released. Ref_count was added to keep track of subsystem users, not for counting events. Subsystem is created with ref_count = 1, so there is no need to increment it for every event, we have nr_events for that. Fix this by touching ref_count only when we actually have a new user - subsystem_open(). Cc: stable@vger.kernel.org Signed-off-by: Ilya Dryomov <idryomov@gmail.com> Link: http://lkml.kernel.org/r/1320052062-7846-1-git-send-email-idryomov@gmail.com Signed-off-by: Steven Rostedt <rostedt@goodmis.org>	2011-10-31 13:59:23 -04:00
Paul Gortmaker	56d82e000c	kernel: Add <linux/module.h> to files using it implicitly These files are doing things like module_put and try_module_get so they need to call out the module.h for explicit inclusion, rather than getting it via <linux/device.h> which we ideally want to remove the module.h inclusion from. Signed-off-by: Paul Gortmaker <paul.gortmaker@windriver.com>	2011-10-31 09:20:12 -04:00
Linus Torvalds	7115e3fcf4	Merge branch 'perf-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip * 'perf-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (121 commits) perf symbols: Increase symbol KSYM_NAME_LEN size perf hists browser: Refuse 'a' hotkey on non symbolic views perf ui browser: Use libslang to read keys perf tools: Fix tracing info recording perf hists browser: Elide DSO column when it is set to just one DSO, ditto for threads perf hists: Don't consider filtered entries when calculating column widths perf hists: Don't decay total_period for filtered entries perf hists browser: Honour symbol_conf.show_{nr_samples,total_period} perf hists browser: Do not exit on tab key with single event perf annotate browser: Don't change selection line when returning from callq perf tools: handle endianness of feature bitmap perf tools: Add prelink suggestion to dso update message perf script: Fix unknown feature comment perf hists browser: Apply the dso and thread filters when merging new batches perf hists: Move the dso and thread filters from hist_browser perf ui browser: Honour the xterm colors perf top tui: Give color hints just on the percentage, like on --stdio perf ui browser: Make the colors configurable and change the defaults perf tui: Remove unneeded call to newtCls on startup perf hists: Don't format the percentage on hist_entry__snprintf ... Fix up conflicts in arch/x86/kernel/kprobes.c manually. Ingo's tree did the insane "add volatile to const array", which just doesn't make sense ("volatile const"?). But we could remove the const and make the array volatile to make doubly sure that gcc doesn't optimize it away.. Also fix up kernel/trace/ring_buffer.c non-data-conflicts manually: the reader_lock has been turned into a raw lock by the core locking merge, and there was a new user of it introduced in this perf core merge. Make sure that new use also uses the raw accessor functions.	2011-10-26 17:03:38 +02:00
Linus Torvalds	3cfef95246	Merge branch 'core-locking-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip * 'core-locking-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (27 commits) rtmutex: Add missing rcu_read_unlock() in debug_rt_mutex_print_deadlock() lockdep: Comment all warnings lib: atomic64: Change the type of local lock to raw_spinlock_t locking, lib/atomic64: Annotate atomic64_lock::lock as raw locking, x86, iommu: Annotate qi->q_lock as raw locking, x86, iommu: Annotate irq_2_ir_lock as raw locking, x86, iommu: Annotate iommu->register_lock as raw locking, dma, ipu: Annotate bank_lock as raw locking, ARM: Annotate low level hw locks as raw locking, drivers/dca: Annotate dca_lock as raw locking, powerpc: Annotate uic->lock as raw locking, x86: mce: Annotate cmci_discover_lock as raw locking, ACPI: Annotate c3_lock as raw locking, oprofile: Annotate oprofilefs lock as raw locking, video: Annotate vga console lock as raw locking, latencytop: Annotate latency_lock as raw locking, timer_stats: Annotate table_lock as raw locking, rwsem: Annotate inner lock as raw locking, semaphores: Annotate inner lock as raw locking, sched: Annotate thread_group_cputimer as raw ... Fix up conflicts in kernel/posix-cpu-timers.c manually: making cputimer->cputime a raw lock conflicted with the ABBA fix in commit `bcd5cff721` ("cputimer: Cure lock inversion").	2011-10-26 16:17:32 +02:00
Steven Rostedt	436fc28026	tracing: Fix returning of duplicate data after EOF in trace_pipe_raw The trace_pipe_raw handler holds a cached page from the time the file is opened to the time it is closed. The cached page is used to handle the case of the user space buffer being smaller than what was read from the ring buffer. The left over buffer is held in the cache so that the next read will continue where the data left off. After EOF is returned (no more data in the buffer), the index of the cached page is set to zero. If a user app reads the page again after EOF, the check in the buffer will see that the cached page is less than page size and will return the cached page again. This will cause reading the trace_pipe_raw again after EOF to return duplicate data, making the output look like the time went backwards but instead data is just repeated. The fix is to not reset the index right after all data is read from the cache, but to reset it after all data is read and more data exists in the ring buffer. Cc: stable <stable@kernel.org> Reported-by: Jeremy Eder <jeder@redhat.com> Signed-off-by: Steven Rostedt <rostedt@goodmis.org>	2011-10-14 10:44:25 -04:00
Geunsik Lim	9b5f8b31af	ftrace: Fix README to state tracing_on to start/stop tracing tracing_enabled option is deprecated. To start/stop tracing, write to /sys/kernel/debug/tracing/tracing_on without tracing_enabled. This patch is based on Linux 3.1.0-rc1 Signed-off-by: Geunsik Lim <geunsik.lim@samsung.com> Link: http://lkml.kernel.org/r/1313127022-23830-1-git-send-email-leemgs1@gmail.com Signed-off-by: Steven Rostedt <rostedt@goodmis.org>	2011-10-14 10:41:33 -04:00
Ingo Molnar	910e94dd0c	Merge branch 'tip/perf/core' of git://github.com/rostedt/linux into perf/core	2011-10-12 17:14:47 +02:00
Steven Rostedt	d696b58ca2	tracing: Do not allocate buffer for trace_marker When doing intense tracing, the kmalloc inside trace_marker can introduce side effects to what is being traced. As trace_marker() is used by userspace to inject data into the kernel ring buffer, it needs to do so with the least amount of intrusion to the operations of the kernel or the user space application. As the ring buffer is designed to write directly into the buffer without the need to make a temporary buffer, and userspace already went through the hassle of knowing how big the write will be, we can simply pin the userspace pages and write the data directly into the buffer. This improves the impact of tracing via trace_marker tremendously! Thanks to Peter Zijlstra and Thomas Gleixner for pointing out the use of get_user_pages_fast() and kmap_atomic(). Suggested-by: Thomas Gleixner <tglx@linutronix.de> Suggested-by: Peter Zijlstra <a.p.zijlstra@chello.nl> Signed-off-by: Steven Rostedt <rostedt@goodmis.org>	2011-10-11 09:13:53 -04:00
Steven Rostedt	e0a413f619	tracing: Warn on output if the function tracer was found corrupted As the function tracer is very intrusive, lots of self checks are performed on the tracer and if something is found to be strange it will shut itself down keeping it from corrupting the rest of the kernel. This shutdown may still allow functions to be traced, as the tracing only stops new modifications from happening. Trying to stop the function tracer itself can cause more harm as it requires code modification. Although a WARN_ON() is executed, a user may not notice it. To help the user see that something isn't right with the tracing of the system a big warning is added to the output of the tracer that lets the user know that their data may be incomplete. Reported-by: Thomas Gleixner <tglx@linutronix.de> Signed-off-by: Steven Rostedt <rostedt@goodmis.org>	2011-10-11 09:13:25 -04:00
Masami Hiramatsu	02ca1521ad	ftrace/kprobes: Fix not to delete probes if in use Fix kprobe-tracer not to delete a probe if the probe is in use. In that case, delete operation will return -EBUSY. This bug can cause a kernel panic if enabled probes are deleted during perf record. (Add some probes on functions) sh-4.2# perf probe --del probe:\* sh-4.2# exit (kernel panic) This is originally reported on the fedora bugzilla: https://bugzilla.redhat.com/show_bug.cgi?id=742383 I've also checked that this problem doesn't happen on tracepoints when module removing because perf event locks target module. $ sudo ./perf record -e xfs:\* -aR sh sh-4.2# rmmod xfs ERROR: Module xfs is in use sh-4.2# exit [ perf record: Woken up 1 times to write data ] [ perf record: Captured and wrote 0.203 MB perf.data (~8862 samples) ] Signed-off-by: Masami Hiramatsu <masami.hiramatsu.pt@hitachi.com> Cc: Arnaldo Carvalho de Melo <acme@redhat.com> Cc: Ingo Molnar <mingo@elte.hu> Cc: Frederic Weisbecker <fweisbec@gmail.com> Cc: Frank Ch. Eigler <fche@redhat.com> Cc: stable@kernel.org Link: http://lkml.kernel.org/r/20111004104438.14591.6553.stgit@fedora15 Signed-off-by: Steven Rostedt <rostedt@goodmis.org>	2011-10-10 15:13:03 -04:00
Rafael J. Wysocki	d727b60659	Merge branch 'pm-runtime' into pm-for-linus * pm-runtime: PM / Tracing: build rpm-traces.c only if CONFIG_PM_RUNTIME is set PM / Runtime: Replace dev_dbg() with trace_rpm_() PM / Runtime: Introduce trace points for tracing rpm_ functions PM / Runtime: Don't run callbacks under lock for power.irq_safe set USB: Add wakeup info to debugging messages PM / Runtime: pm_runtime_idle() can be called in atomic context PM / Runtime: Add macro to test for runtime PM events PM / Runtime: Add might_sleep() to runtime PM functions	2011-10-07 23:16:55 +02:00
Ming Lei	2a5306cc5f	PM / Tracing: build rpm-traces.c only if CONFIG_PM_RUNTIME is set Do not build kernel/trace/rpm-traces.c if CONFIG_PM_RUNTIME is not set, which avoids a build failure. [rjw: Added the changelog and modified the subject slightly.] Signed-off-by: Ming Lei <ming.lei@canonical.com> Signed-off-by: Rafael J. Wysocki <rjw@sisk.pl>	2011-09-29 22:07:23 +02:00
Ming Lei	53b615ccca	PM / Runtime: Introduce trace points for tracing rpm_* functions This patch introduces 3 trace points to prepare for tracing rpm_idle/rpm_suspend/rpm_resume functions, so we can use these trace points to replace the current dev_dbg(). Signed-off-by: Ming Lei <ming.lei@canonical.com> Acked-by: Steven Rostedt <rostedt@goodmis.org> Signed-off-by: Rafael J. Wysocki <rjw@sisk.pl>	2011-09-27 22:53:27 +02:00
Steven Rostedt	e36de1de4a	tracing: Fix preemptirqsoff tracer to not stop at preempt off If irqs are disabled when preemption count reaches zero, the preemptirqsoff tracer should not flag that as the end. When interrupts are enabled and preemption count is not zero the preemptirqsoff correctly continues its tracing. Signed-off-by: Steven Rostedt <rostedt@goodmis.org>	2011-09-22 11:11:51 -04:00
Steven Rostedt	6249687f76	tracing: Add a counter clock for those that do not trust clocks When debugging tight race conditions, it can be helpful to have a synchronized tracing method. Although in most cases the global clock provides this functionality, if timings is not the issue, it is more comforting to know that the order of events really happened in a precise order. Instead of using a clock, add a "counter" that is simply an incrementing atomic 64bit counter that orders the events as they are perceived to happen. The trace_clock_counter() is added from the attempt by Peter Zijlstra trying to convert the trace_clock_global() to it. I took Peter's counter code and made trace_clock_counter() instead, and added it to the choice of clocks. Just echo counter > /debug/tracing/trace_clock to activate it. Requested-by: Thomas Gleixner <tglx@linutronix.de> Requested-by: Peter Zijlstra <a.p.zijlstra@chello.nl> Reviewed-By: Valdis Kletnieks <valdis.kletnieks@vt.edu> Signed-off-by: Steven Rostedt <rostedt@goodmis.org>	2011-09-19 11:35:58 -04:00
Thomas Gleixner	5389f6fad2	locking, tracing: Annotate tracing locks as raw The tracing locks can be taken in atomic context and therefore cannot be preempted on -rt - annotate it. In mainline this change documents the low level nature of the lock - otherwise there's no functional difference. Lockdep and Sparse checking will work as usual. Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Signed-off-by: Ingo Molnar <mingo@elte.hu>	2011-09-13 11:11:52 +02:00
Vaibhav Nagarnaik	c64e148a3b	trace: Add ring buffer stats to measure rate of events The stats file under per_cpu folder provides the number of entries, overruns and other statistics about the CPU ring buffer. However, the numbers do not provide any indication of how full the ring buffer is in bytes compared to the overall size in bytes. Also, it is helpful to know the rate at which the cpu buffer is filling up. This patch adds an entry "bytes: " in printed stats for per_cpu ring buffer which provides the actual bytes consumed in the ring buffer. This field includes the number of bytes used by recorded events and the padding bytes added when moving the tail pointer to next page. It also adds the following time stamps: "oldest event ts:" - the oldest timestamp in the ring buffer "now ts:" - the timestamp at the time of reading The field "now ts" provides a consistent time snapshot to the userspace when being read. This is read from the same trace clock used by tracing event timestamps. Together, these values provide the rate at which the buffer is filling up, from the formula: bytes / (now_ts - oldest_event_ts) Signed-off-by: Vaibhav Nagarnaik <vnagarnaik@google.com> Cc: Michael Rubin <mrubin@google.com> Cc: David Sharp <dhsharp@google.com> Link: http://lkml.kernel.org/r/1313531179-9323-3-git-send-email-vnagarnaik@google.com Signed-off-by: Steven Rostedt <rostedt@goodmis.org>	2011-08-30 12:27:45 -04:00
Vaibhav Nagarnaik	f81ab074c3	trace: Add a new readonly entry to report total buffer size The current file "buffer_size_kb" reports the size of per-cpu buffer and not the overall memory allocated which could be misleading. A new file "buffer_total_size_kb" adds up all the enabled CPU buffer sizes and reports it. This is only a readonly entry. Signed-off-by: Vaibhav Nagarnaik <vnagarnaik@google.com> Cc: Michael Rubin <mrubin@google.com> Cc: David Sharp <dhsharp@google.com> Link: http://lkml.kernel.org/r/1313531179-9323-2-git-send-email-vnagarnaik@google.com Signed-off-by: Steven Rostedt <rostedt@goodmis.org>	2011-08-30 12:27:44 -04:00
Steven Rostedt	86b6ef21b8	tracing: Add preempt disable for filter self test The self testing for event filters does not really need preemption disabled as there are no races at the time of testing, but the functions it calls uses rcu_dereference_sched() which will complain if preemption is not disabled. Cc: Jiri Olsa <jolsa@redhat.com> Signed-off-by: Steven Rostedt <rostedt@goodmis.org>	2011-08-30 12:27:07 -04:00
Jiri Olsa	1d0e78e380	tracing/filter: Add startup tests for events filter Adding automated tests running as late_initcall. Tests are compiled in with CONFIG_FTRACE_STARTUP_TEST option. Adding test event "ftrace_test_filter" used to simulate filter processing during event occurance. String filters are compiled and tested against several test events with different values. Also testing that evaluation of explicit predicates is ommited due to the lazy filter evaluation. Signed-off-by: Jiri Olsa <jolsa@redhat.com> Link: http://lkml.kernel.org/r/1313072754-4620-11-git-send-email-jolsa@redhat.com Signed-off-by: Steven Rostedt <rostedt@goodmis.org>	2011-08-19 14:35:59 -04:00
Jiri Olsa	f30120fce1	tracing/filter: Change filter_match_preds function to use walk_pred_tree Changing filter_match_preds function to use unified predicates tree processing. Signed-off-by: Jiri Olsa <jolsa@redhat.com> Link: http://lkml.kernel.org/r/1313072754-4620-10-git-send-email-jolsa@redhat.com Signed-off-by: Steven Rostedt <rostedt@goodmis.org>	2011-08-19 14:35:58 -04:00
Jiri Olsa	96bc293a97	tracing/filter: Change fold_pred function to use walk_pred_tree Changing fold_pred_tree function to use unified predicates tree processing. Signed-off-by: Jiri Olsa <jolsa@redhat.com> Link: http://lkml.kernel.org/r/1313072754-4620-9-git-send-email-jolsa@redhat.com Signed-off-by: Steven Rostedt <rostedt@goodmis.org>	2011-08-19 14:35:57 -04:00
Jiri Olsa	1b797fe5aa	tracing/filter: Change fold_pred_tree function to use walk_pred_tree Changing fold_pred_tree function to use unified predicates tree processing. Signed-off-by: Jiri Olsa <jolsa@redhat.com> Link: http://lkml.kernel.org/r/1313072754-4620-8-git-send-email-jolsa@redhat.com Signed-off-by: Steven Rostedt <rostedt@goodmis.org>	2011-08-19 14:35:56 -04:00
Jiri Olsa	c00b060f36	tracing/filter: Change count_leafs function to use walk_pred_tree Changing count_leafs function to use unified predicates tree processing. Signed-off-by: Jiri Olsa <jolsa@redhat.com> Link: http://lkml.kernel.org/r/1313072754-4620-7-git-send-email-jolsa@redhat.com Signed-off-by: Steven Rostedt <rostedt@goodmis.org>	2011-08-19 14:35:55 -04:00
Jiri Olsa	f03f597994	tracing/filter: Unify predicate tree walking, change check_pred_tree function to use it Adding walk_pred_tree function to be used for walking throught the filter predicates. For each predicate the callback function is called, allowing users to add their own functionality or customize their way through the filter predicates. Changing check_pred_tree function to use walk_pred_tree. Signed-off-by: Jiri Olsa <jolsa@redhat.com> Link: http://lkml.kernel.org/r/1313072754-4620-6-git-send-email-jolsa@redhat.com Signed-off-by: Steven Rostedt <rostedt@goodmis.org>	2011-08-19 14:35:54 -04:00
Jiri Olsa	3f78f935e7	tracing/filter: Simplify tracepoint event lookup We dont need to perform lookup through the ftrace_events list, instead we can use the 'tp_event' field. Each perf_event contains tracepoint event field 'tp_event', which got initialized during the tracepoint event initialization. Signed-off-by: Jiri Olsa <jolsa@redhat.com> Link: http://lkml.kernel.org/r/1313072754-4620-5-git-send-email-jolsa@redhat.com Signed-off-by: Steven Rostedt <rostedt@goodmis.org>	2011-08-19 14:35:53 -04:00
Jiri Olsa	61aaef5530	tracing/filter: Remove field_name from filter_pred struct The field_name was used just for finding event's fields. This way we don't need to care about field_name allocation/free. Signed-off-by: Jiri Olsa <jolsa@redhat.com> Link: http://lkml.kernel.org/r/1313072754-4620-4-git-send-email-jolsa@redhat.com Signed-off-by: Steven Rostedt <rostedt@goodmis.org>	2011-08-19 14:35:52 -04:00
Jiri Olsa	9d96cd1743	tracing/filter: Separate predicate init and filter addition Making the code cleaner by having one function to fully prepare the predicate (create_pred), and another to add the predicate to the filter (filter_add_pred). As a benefit, this way the dry_run flag stays only inside the replace_preds function and is not passed deeper. Signed-off-by: Jiri Olsa <jolsa@redhat.com> Link: http://lkml.kernel.org/r/1313072754-4620-3-git-send-email-jolsa@redhat.com Signed-off-by: Steven Rostedt <rostedt@goodmis.org>	2011-08-19 14:35:51 -04:00
Jiri Olsa	81570d9caa	tracing/filter: Use static allocation for filter predicates Don't dynamically allocate filter_pred struct, use static memory. This way we can get rid of the code managing the dynamic filter_pred struct object. The create_pred function integrates create_logical_pred function. This way the static predicate memory is returned only from one place. Signed-off-by: Jiri Olsa <jolsa@redhat.com> Link: http://lkml.kernel.org/r/1313072754-4620-2-git-send-email-jolsa@redhat.com Signed-off-by: Steven Rostedt <rostedt@goodmis.org>	2011-08-19 14:35:51 -04:00
Linus Torvalds	5ccc38740a	Merge branch 'for-linus' of git://git.kernel.dk/linux-block * 'for-linus' of git://git.kernel.dk/linux-block: (23 commits) Revert "cfq: Remove special treatment for metadata rqs." block: fix flush machinery for stacking drivers with differring flush flags block: improve rq_affinity placement blktrace: add FLUSH/FUA support Move some REQ flags to the common bio/request area allow blk_flush_policy to return REQ_FSEQ_DATA independent of *FLUSH xen/blkback: Make description more obvious. cfq-iosched: Add documentation about idling block: Make rq_affinity = 1 work as expected block: swim3: fix unterminated of_device_id table block/genhd.c: remove useless cast in diskstats_show() drivers/cdrom/cdrom.c: relax check on dvd manufacturer value drivers/block/drbd/drbd_nl.c: use bitmap_parse instead of __bitmap_parse bsg-lib: add module.h include cfq-iosched: Reduce linked group count upon group destruction blk-throttle: correctly determine sync bio loop: fix deadlock when sysfs and LOOP_CLR_FD race against each other loop: add BLK_DEV_LOOP_MIN_COUNT=%i to allow distros 0 pre-allocated loop devices loop: add management interface for on-demand device allocation loop: replace linked list of allocated devices with an idr index ...	2011-08-19 10:47:07 -07:00
Namhyung Kim	c09c47caed	blktrace: add FLUSH/FUA support Add FLUSH/FUA support to blktrace. As FLUSH precedes WRITE and/or FUA follows WRITE, use the same 'F' flag for both cases and distinguish them by their (relative) position. The end results look like (other flags might be shown also): - WRITE: W - WRITE_FLUSH: FW - WRITE_FUA: WF - WRITE_FLUSH_FUA: FWF Note that we reuse TC_BARRIER due to lack of bit space of act_mask so that the older versions of blktrace tools will report flush requests as barriers from now on. Cc: Steven Rostedt <rostedt@goodmis.org> Cc: Frederic Weisbecker <fweisbec@gmail.com> Cc: Ingo Molnar <mingo@redhat.com> Signed-off-by: Namhyung Kim <namhyung@gmail.com> Reviewed-by: Jeff Moyer <jmoyer@redhat.com> Signed-off-by: Jens Axboe <jaxboe@fusionio.com>	2011-08-11 10:36:05 +02:00
Steven Rostedt	3a301d7c1c	tracing: Clean up tb_fmt to not give faulty compile warning gcc incorrectly states that the variable "fmt" is uninitialized when CC_OPITMIZE_FOR_SIZE is set. Instead of just blindly setting fmt to NULL, the code is cleaned up a little to be a bit easier for humans to follow, as well as gcc to know the variables are initialized. Signed-off-by: Steven Rostedt <rostedt@goodmis.org>	2011-08-10 20:36:32 -04:00
Ingo Molnar	3272cab406	Merge branch 'linus' into perf/urgent Merge reason: Include most of the merge window trees, to do fixes on top. Signed-off-by: Ingo Molnar <mingo@elte.hu>	2011-08-05 10:33:55 +02:00
Arun Sharma	60063497a9	atomic: use <linux/atomic.h> This allows us to move duplicated code in <asm/atomic.h> (atomic_inc_not_zero() for now) to <linux/atomic.h> Signed-off-by: Arun Sharma <asharma@fb.com> Reviewed-by: Eric Dumazet <eric.dumazet@gmail.com> Cc: Ingo Molnar <mingo@elte.hu> Cc: David Miller <davem@davemloft.net> Cc: Eric Dumazet <eric.dumazet@gmail.com> Acked-by: Mike Frysinger <vapier@gentoo.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2011-07-26 16:49:47 -07:00
Jesper Juhl	f629299b54	trace events: Update version number reference to new 3.x scheme for EVENT_POWER_TRACING_DEPRECATED What was scheduled to be 2.6.41 is now going to be 3.1 . Signed-off-by: Jesper Juhl <jj@chaosbits.net> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Cc: Arnaldo Carvalho de Melo <acme@redhat.com> Cc: Frederic Weisbecker <fweisbec@gmail.com> Cc: Steven Rostedt <rostedt@goodmis.org> Link: http://lkml.kernel.org/r/alpine.LNX.2.00.1107250929370.8080@swampdragon.chaosbits.net Signed-off-by: Ingo Molnar <mingo@elte.hu>	2011-07-25 09:37:21 +02:00
Ingo Molnar	40bcea7bbe	Merge branch 'tip/perf/core' of git://git.kernel.org/pub/scm/linux/kernel/git/rostedt/linux-2.6-trace into perf/core	2011-07-21 09:32:40 +02:00
Ingo Molnar	492f73a303	Merge branch 'perf/urgent' into perf/core Merge reason: pick up the latest fixes - they won't make v3.0. Signed-off-by: Ingo Molnar <mingo@elte.hu>	2011-07-21 09:29:21 +02:00
Masami Hiramatsu	7f6878a3d7	tracing/kprobe: Update symbol reference when loading module Since the address of a module-local variable can only be solved after the target module is loaded, the symbol fetch-argument should be updated when loading target module. Signed-off-by: Masami Hiramatsu <masami.hiramatsu.pt@hitachi.com> Cc: Ingo Molnar <mingo@elte.hu> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Frederic Weisbecker <fweisbec@gmail.com> Cc: Arnaldo Carvalho de Melo <acme@redhat.com> Link: http://lkml.kernel.org/r/20110627072703.6528.75042.stgit@fedora15 Signed-off-by: Steven Rostedt <rostedt@goodmis.org>	2011-07-15 15:45:32 -04:00
Masami Hiramatsu	6142431810	tracing/kprobes: Support module init function probing To support probing module init functions, kprobe-tracer allows user to define a probe on non-existed function when it is given with a module name. This also enables user to set a probe on a function on a specific module, even if a same name (but different) function is locally defined in another module. The module name must be in the front of function name and separated by a ':'. e.g. btrfs:btrfs_init_sysfs Signed-off-by: Masami Hiramatsu <masami.hiramatsu.pt@hitachi.com> Cc: Ingo Molnar <mingo@elte.hu> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Frederic Weisbecker <fweisbec@gmail.com> Cc: Arnaldo Carvalho de Melo <acme@redhat.com> Link: http://lkml.kernel.org/r/20110627072656.6528.89970.stgit@fedora15 Signed-off-by: Steven Rostedt <rostedt@goodmis.org>	2011-07-15 15:17:14 -04:00
Masami Hiramatsu	1538f888f1	tracing/kprobes: Merge trace probe enable/disable functions Merge redundant enable/disable functions into enable_trace_probe() and disable_trace_probe(). Signed-off-by: Masami Hiramatsu <masami.hiramatsu.pt@hitachi.com> Cc: Arnaldo Carvalho de Melo <acme@redhat.com> Cc: Ingo Molnar <mingo@elte.hu> Cc: Peter Zijlstra <peterz@infradead.org> Cc: yrl.pp-manager.tt@hitachi.com Cc: Frederic Weisbecker <fweisbec@gmail.com> Cc: Ingo Molnar <mingo@redhat.com> Link: http://lkml.kernel.org/r/20110627072644.6528.26910.stgit@fedora15 [ converted kprobe selftest to use enable_trace_probe ] Signed-off-by: Steven Rostedt <rostedt@goodmis.org>	2011-07-15 15:10:58 -04:00
Steven Rostedt	f7bc8b61f6	ftrace: Fix regression where ftrace breaks when modules are loaded Enabling function tracer to trace all functions, then load a module and then disable function tracing will cause ftrace to fail. This can also happen by enabling function tracing on the command line: ftrace=function and during boot up, modules are loaded, then you disable function tracing with 'echo nop > current_tracer' you will trigger a bug in ftrace that will shut itself down. The reason is, the new ftrace code keeps ref counts of all ftrace_ops that are registered for tracing. When one or more ftrace_ops are registered, all the records that represent the functions that the ftrace_ops will trace have a ref count incremented. If this ref count is not zero, when the code modification runs, that function will be enabled for tracing. If the ref count is zero, that function will be disabled from tracing. To make sure the accounting was working, FTRACE_WARN_ON()s were added to updating of the ref counts. If the ref count hits its max (> 2^30 ftrace_ops added), or if the ref count goes below zero, a FTRACE_WARN_ON() is triggered which disables all modification of code. Since it is common for ftrace_ops to trace all functions in the kernel, instead of creating > 20,000 hash items for the ftrace_ops, the hash count is just set to zero, and it represents that the ftrace_ops is to trace all functions. This is where the issues arrise. If you enable function tracing to trace all functions, and then add a module, the modules function records do not get the ref count updated. When the function tracer is disabled, all function records ref counts are subtracted. Since the modules never had their ref counts incremented, they go below zero and the FTRACE_WARN_ON() is triggered. The solution to this is rather simple. When modules are loaded, and their functions are added to the the ftrace pool, look to see if any ftrace_ops are registered that trace all functions. And for those, update the ref count for the module function records. Reported-by: Thomas Gleixner <tglx@linutronix.de> Signed-off-by: Steven Rostedt <rostedt@goodmis.org>	2011-07-14 23:02:27 -04:00
Masami Hiramatsu	7143f168e2	tracing/kprobes: Rename probe_* to trace_probe_* Rename probe_* to trace_probe_* for avoiding namespace confliction. This also fixes improper names of find_probe_event() and cleanup_all_probes() to find_trace_probe() and release_all_trace_probes() respectively. Signed-off-by: Masami Hiramatsu <masami.hiramatsu.pt@hitachi.com> Cc: Arnaldo Carvalho de Melo <acme@redhat.com> Cc: Ingo Molnar <mingo@elte.hu> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Frederic Weisbecker <fweisbec@gmail.com> Link: http://lkml.kernel.org/r/20110627072636.6528.60374.stgit@fedora15 Signed-off-by: Steven Rostedt <rostedt@goodmis.org>	2011-07-14 17:44:43 -04:00
Steven Rostedt	4a9bd3f134	tracing: Have dynamic size event stack traces Currently the stack trace per event in ftace is only 8 frames. This can be quite limiting and sometimes useless. Especially when the "ignore frames" is wrong and we also use up stack frames for the event processing itself. Change this to be dynamic by adding a percpu buffer that we can write a large stack frame into and then copy into the ring buffer. For interrupts and NMIs that come in while another event is being process, will only get to use the 8 frame stack. That should be enough as the task that it interrupted will have the full stack frame anyway. Requested-by: Thomas Gleixner <tglx@linutronix.de> Signed-off-by: Steven Rostedt <rostedt@goodmis.org>	2011-07-14 16:36:53 -04:00
Steven Rostedt	6331c28c96	ftrace: Fix dynamic selftest failure on some archs Archs that do not implement CONFIG_HAVE_FUNCTION_TRACE_MCOUNT_TEST, will fail the dynamic ftrace selftest. The function tracer has a quick 'off' variable that will prevent the call back functions from being called. This variable is called function_trace_stop. In x86, this is implemented directly in the mcount assembly, but for other archs, an intermediate function is used called ftrace_test_stop_func(). In dynamic ftrace, the function pointer variable ftrace_trace_function is used to update the caller code in the mcount caller. But for archs that do not have CONFIG_HAVE_FUNCTION_TRACE_MCOUNT_TEST set, it only calls ftrace_test_stop_func() instead, which in turn calls __ftrace_trace_function. When more than one ftrace_ops is registered, the function it calls is ftrace_ops_list_func(), which will iterate over all registered ftrace_ops and call the callbacks that have their hash matching. The issue happens when two ftrace_ops are registered for different functions and one is then unregistered. The __ftrace_trace_function is then pointed to the remaining ftrace_ops callback function directly. This mean it will be called for all functions that were registered to trace by both ftrace_ops that were registered. This is not an issue for archs with CONFIG_HAVE_FUNCTION_TRACE_MCOUNT_TEST, because the update of ftrace_trace_function doesn't happen until after all functions have been updated, and then the mcount caller is updated. But for those archs that do use the ftrace_test_stop_func(), the update is immediate. The dynamic selftest fails because it hits this situation, and the ftrace_ops that it registers fails to only trace what it was suppose to and instead traces all other functions. The solution is to delay the setting of __ftrace_trace_function until after all the functions have been updated according to the registered ftrace_ops. Also, function_trace_stop is set during the update to prevent function tracing from calling code that is caused by the function tracer itself. Signed-off-by: Steven Rostedt <rostedt@goodmis.org>	2011-07-13 22:25:09 -04:00
Steven Rostedt	072126f452	ftrace: Update filter when tracing enabled in set_ftrace_filter() Currently, if set_ftrace_filter() is called when the ftrace_ops is active, the function filters will not be updated. They will only be updated when tracing is disabled and re-enabled. Update the functions immediately during set_ftrace_filter(). Signed-off-by: Steven Rostedt <rostedt@goodmis.org>	2011-07-13 22:10:05 -04:00
Steven Rostedt	41fb61c2d0	ftrace: Balance records when updating the hash Whenever the hash of the ftrace_ops is updated, the record counts must be balance. This requires disabling the records that are set in the original hash, and then enabling the records that are set in the updated hash. Moving the update into ftrace_hash_move() removes the bug where the hash was updated but the records were not, which results in ftrace triggering a warning and disabling itself because the ftrace_ops filter is updated while the ftrace_ops was registered, and then the failure happens when the ftrace_ops is unregistered. The current code will not trigger this bug, but new code will. Signed-off-by: Steven Rostedt <rostedt@goodmis.org>	2011-07-13 22:00:50 -04:00
Steven Rostedt	4376cac667	ftrace: Do not disable interrupts for modules in mcount update When I mounted an NFS directory, it caused several modules to be loaded. At the time I was running the preemptirqsoff tracer, and it showed the following output: # tracer: preemptirqsoff # # preemptirqsoff latency trace v1.1.5 on 2.6.33.9-rt30-mrg-test # -------------------------------------------------------------------- # latency: 1177 us, #4/4, CPU#3 \| (M:preempt VP:0, KP:0, SP:0 HP:0 #P:4) # ----------------- # \| task: modprobe-19370 (uid:0 nice:0 policy:0 rt_prio:0) # ----------------- # => started at: ftrace_module_notify # => ended at: ftrace_module_notify # # # _------=> CPU# # / _-----=> irqs-off # \| / _----=> need-resched # \|\| / _---=> hardirq/softirq # \|\|\| / _--=> preempt-depth # \|\|\|\| /_--=> lock-depth # \|\|\|\|\|/ delay # cmd pid \|\|\|\|\|\| time \| caller # \ / \|\|\|\|\|\| \ \| / modprobe-19370 3d.... 0us!: ftrace_process_locs <-ftrace_module_notify modprobe-19370 3d.... 1176us : ftrace_process_locs <-ftrace_module_notify modprobe-19370 3d.... 1178us : trace_hardirqs_on <-ftrace_module_notify modprobe-19370 3d.... 1178us : <stack trace> => ftrace_process_locs => ftrace_module_notify => notifier_call_chain => __blocking_notifier_call_chain => blocking_notifier_call_chain => sys_init_module => system_call_fastpath That's over 1ms that interrupts are disabled on a Real-Time kernel! Looking at the cause (being the ftrace author helped), I found that the interrupts are disabled before the code modification of mcounts into nops. The interrupts only need to be disabled on start up around this code, not when modules are being loaded. Signed-off-by: Steven Rostedt <rostedt@goodmis.org>	2011-07-07 22:39:38 -04:00
Steven Rostedt	e4a3f541f0	tracing: Still trace filtered irq functions when irq trace is disabled If a function is set to be traced by the set_graph_function, but the option funcgraph-irqs is zero, and the traced function happens to be called from a interrupt, it will not be traced. The point of funcgraph-irqs is to not trace interrupts when we are preempted by an irq, not to not trace functions we want to trace that happen to be in a irq. Luckily the current->trace_recursion element is perfect to add a flag to help us be able to trace functions within an interrupt even when we are not tracing interrupts that preempt the trace. Reported-by: Heiko Carstens <heiko.carstens@de.ibm.com> Tested-by: Heiko Carstens <heiko.carstens@de.ibm.com> Signed-off-by: Steven Rostedt <rostedt@goodmis.org>	2011-07-07 22:26:27 -04:00
Steven Rostedt	43dd61c9a0	ftrace: Fix regression of :mod:module function enabling The new code that allows different utilities to pick and choose what functions they trace broke the :mod: hook that allows users to trace only functions of a particular module. The reason is that the :mod: hook bypasses the hash that is setup to allow individual users to trace their own functions and uses the global hash directly. But if the global hash has not been set up, it will cause a bug: echo ':mod:radeon' > /sys/kernel/debug/set_ftrace_filter produces: [drm:drm_mode_getfb] ERROR* invalid framebuffer id [drm:radeon_crtc_page_flip] ERROR failed to reserve new rbo buffer before flip BUG: unable to handle kernel paging request at ffffffff8160ec90 IP: [<ffffffff810d9136>] add_hash_entry+0x66/0xd0 PGD 1a05067 PUD 1a09063 PMD 80000000016001e1 Oops: 0003 [#1] SMP Jul 7 04:02:28 phyllis kernel: [55303.858604] CPU 1 Modules linked in: cryptd aes_x86_64 aes_generic binfmt_misc rfcomm bnep ip6table_filter hid radeon r8169 ahci libahci mii ttm drm_kms_helper drm video i2c_algo_bit intel_agp intel_gtt Pid: 10344, comm: bash Tainted: G WC 3.0.0-rc5 #1 Dell Inc. Inspiron N5010/0YXXJJ RIP: 0010:[<ffffffff810d9136>] [<ffffffff810d9136>] add_hash_entry+0x66/0xd0 RSP: 0018:ffff88003a96bda8 EFLAGS: 00010246 RAX: ffff8801301735c0 RBX: ffffffff8160ec80 RCX: 0000000000306ee0 RDX: 0000000000000000 RSI: 0000000000000000 RDI: ffff880137c92940 RBP: ffff88003a96bdb8 R08: ffff880137c95680 R09: 0000000000000000 R10: 0000000000000001 R11: 0000000000000000 R12: ffffffff81c9df78 R13: ffff8801153d1000 R14: 0000000000000000 R15: 0000000000000000 FS: 00007f329c18a700(0000) GS:ffff880137c80000(0000) knlGS:0000000000000000 CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 CR2: ffffffff8160ec90 CR3: 000000003002b000 CR4: 00000000000006e0 DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000 DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400 Process bash (pid: 10344, threadinfo ffff88003a96a000, task ffff88012fcfc470) Stack: 0000000000000fd0 00000000000000fc ffff88003a96be38 ffffffff810d92f5 ffff88011c4c4e00 ffff880000000000 000000000b69f4d0 ffffffff8160ec80 ffff8800300e6f06 0000000081130295 0000000000000282 ffff8800300e6f00 Call Trace: [<ffffffff810d92f5>] match_records+0x155/0x1b0 [<ffffffff810d940c>] ftrace_mod_callback+0xbc/0x100 [<ffffffff810dafdf>] ftrace_regex_write+0x16f/0x210 [<ffffffff810db09f>] ftrace_filter_write+0xf/0x20 [<ffffffff81166e48>] vfs_write+0xc8/0x190 [<ffffffff81167001>] sys_write+0x51/0x90 [<ffffffff815c7e02>] system_call_fastpath+0x16/0x1b Code: 48 8b 33 31 d2 48 85 f6 75 33 49 89 d4 4c 03 63 08 49 8b 14 24 48 85 d2 48 89 10 74 04 48 89 42 08 49 89 04 24 4c 89 60 08 31 d2 RIP [<ffffffff810d9136>] add_hash_entry+0x66/0xd0 RSP <ffff88003a96bda8> CR2: ffffffff8160ec90 ---[ end trace a5d031828efdd88e ]--- Reported-by: Brian Marete <marete@toshnix.com> Signed-off-by: Steven Rostedt <rostedt@goodmis.org>	2011-07-07 11:30:08 -04:00
Steven Rostedt	40ee4dffff	tracing: Have "enable" file use refcounts like the "filter" file The "enable" file for the event system can be removed when a module is unloaded and the event system only has events from that module. As the event system nr_events count goes to zero, it may be freed if its ref_count is also set to zero. Like the "filter" file, the "enable" file may be opened by a task and referenced later, after a module has been unloaded and the events for that event system have been removed. Although the "filter" file referenced the event system structure, the "enable" file only references a pointer to the event system name. Since the name is freed when the event system is removed, it is possible that an access to the "enable" file may reference a freed pointer. Update the "enable" file to use the subsystem_open() routine that the "filter" file uses, to keep a reference to the event system structure while the "enable" file is opened. Cc: <stable@kernel.org> Reported-by: Johannes Berg <johannes.berg@intel.com> Signed-off-by: Steven Rostedt <rostedt@goodmis.org>	2011-07-07 11:22:29 -04:00
Steven Rostedt	e9dbfae53e	tracing: Fix bug when reading system filters on module removal The event system is freed when its nr_events is set to zero. This happens when a module created an event system and then later the module is removed. Modules may share systems, so the system is allocated when it is created and freed when the modules are unloaded and all the events under the system are removed (nr_events set to zero). The problem arises when a task opened the "filter" file for the system. If the module is unloaded and it removed the last event for that system, the system structure is freed. If the task that opened the filter file accesses the "filter" file after the system has been freed, the system will access an invalid pointer. By adding a ref_count, and using it to keep track of what is using the event system, we can free it after all users are finished with the event system. Cc: <stable@kernel.org> Reported-by: Johannes Berg <johannes.berg@intel.com> Signed-off-by: Steven Rostedt <rostedt@goodmis.org>	2011-07-07 11:19:18 -04:00
Ingo Molnar	931da6137e	Merge branch 'tip/perf/core-2' of git://git.kernel.org/pub/scm/linux/kernel/git/rostedt/linux-2.6-trace into perf/core	2011-07-05 11:55:43 +02:00
Masami Hiramatsu	1fd8df2c39	tracing/kprobes: Fix kprobe-tracer to support stack trace Fix to support kernel stack trace correctly on kprobe-tracer. Since the execution path of kprobe-based dynamic events is different from other tracepoint-based events, normal ftrace_trace_stack() doesn't work correctly. To fix that, this introduces ftrace_trace_stack_regs() which traces stack via pt_regs instead of current stack register. e.g. # echo p schedule+4 > /sys/kernel/debug/tracing/kprobe_events # echo 1 > /sys/kernel/debug/tracing/options/stacktrace # echo 1 > /sys/kernel/debug/tracing/events/kprobes/enable # head -n 20 /sys/kernel/debug/tracing/trace bash-2968 [000] 10297.050245: p_schedule_4: (schedule+0x4/0x4ca) bash-2968 [000] 10297.050247: <stack trace> => schedule_timeout => n_tty_read => tty_read => vfs_read => sys_read => system_call_fastpath kworker/0:1-2940 [000] 10297.050265: p_schedule_4: (schedule+0x4/0x4ca) kworker/0:1-2940 [000] 10297.050266: <stack trace> => worker_thread => kthread => kernel_thread_helper sshd-1132 [000] 10297.050365: p_schedule_4: (schedule+0x4/0x4ca) sshd-1132 [000] 10297.050365: <stack trace> => sysret_careful Note: Even with this fix, the first entry will be skipped if the probe is put on the function entry area before the frame pointer is set up (usually, that is 4 bytes (push %bp; mov %sp %bp) on x86), because stack unwinder depends on the frame pointer. Signed-off-by: Masami Hiramatsu <masami.hiramatsu.pt@hitachi.com> Cc: Frederic Weisbecker <fweisbec@gmail.com> Cc: yrl.pp-manager.tt@hitachi.com Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Cc: Namhyung Kim <namhyung@gmail.com> Link: http://lkml.kernel.org/r/20110608070934.17777.17116.stgit@fedora15 Signed-off-by: Steven Rostedt <rostedt@goodmis.org>	2011-06-14 22:48:53 -04:00
Vaibhav Nagarnaik	d7ec4bfed6	ring-buffer: Set __GFP_NORETRY flag for ring buffer allocating process The tracing ring buffer is allocated from kernel memory. While allocating a large chunk of memory, OOM might happen which destabilizes the system. Thus random processes might get killed during the allocation. This patch adds __GFP_NORETRY flag to the ring buffer allocation calls to make it fail more gracefully if the system will not be able to complete the allocation request. Acked-by: David Rientjes <rientjes@google.com> Signed-off-by: Vaibhav Nagarnaik <vnagarnaik@google.com> Cc: Ingo Molnar <mingo@redhat.com> Cc: Frederic Weisbecker <fweisbec@gmail.com> Cc: Michael Rubin <mrubin@google.com> Cc: David Sharp <dhsharp@google.com> Link: http://lkml.kernel.org/r/1307491302-9236-1-git-send-email-vnagarnaik@google.com Signed-off-by: Steven Rostedt <rostedt@goodmis.org>	2011-06-14 22:48:51 -04:00
Peter Huewe	22fe9b54d8	tracing: Convert to kstrtoul_from_user This patch replaces the code for getting an unsigned long from a userspace buffer by a simple call to kstroul_from_user. This makes it easier to read and less error prone. Signed-off-by: Peter Huewe <peterhuewe@gmx.de> Link: http://lkml.kernel.org/r/1307476707-14762-1-git-send-email-peterhuewe@gmx.de Signed-off-by: Steven Rostedt <rostedt@goodmis.org>	2011-06-14 22:48:50 -04:00
Jiri Olsa	749230b06a	tracing, function_graph: Add context-info support for function_graph tracer The function_graph tracer does not follow global context-info option. Adding TRACE_ITER_CONTEXT_INFO trace_flags check to enable it. With following commands: # echo function_graph > ./current_tracer # echo 0 > options/context-info # cat trace This is what it looked like before: # tracer: function_graph # # TIME CPU DURATION FUNCTION CALLS # \| \| \| \| \| \| \| \| 1) 0.079 us \| } /* __vma_link_rb / 1) 0.056 us \| copy_page_range(); 1) \| security_vm_enough_memory() { ... This is what it looks like now: # tracer: function_graph # } / update_ts_time_stats */ timekeeping_max_deferment(); ... Signed-off-by: Jiri Olsa <jolsa@redhat.com> Link: http://lkml.kernel.org/r/1307113131-10045-6-git-send-email-jolsa@redhat.com Signed-off-by: Steven Rostedt <rostedt@goodmis.org>	2011-06-14 22:48:49 -04:00
Jiri Olsa	199abfab40	tracing, function_graph: Remove lock-depth from latency trace The lock_depth was removed in commit `e6e1e25` tracing: Remove lock_depth from event entry Removing the lock_depth info from function_graph latency header. With following commands: # echo function_graph > ./current_tracer # echo 1 > options/latency-format # cat trace This is what it looked like before: # tracer: function_graph # # function_graph latency trace v1.1.5 on 3.0.0-rc1-tip+ # -------------------------------------------------------------------- # latency: 0 us, #59756/311298, CPU#0 \| (M:preempt VP:0, KP:0, SP:0 HP:0 #P:2) # ----------------- # \| task: -0 (uid:0 nice:0 policy:0 rt_prio:0) # ----------------- # # _-----=> irqs-off # / _----=> need-resched # \| / _---=> hardirq/softirq # \|\| / _--=> preempt-depth # \|\|\| / _-=> lock-depth # \|\|\|\| / # CPU\|\|\|\|\| DURATION FUNCTION CALLS # \| \|\|\|\|\| \| \| \| \| \| \| 0) .... 0.068 us \| } /* __rcu_read_unlock / ... This is what it looks like now: # tracer: function_graph # # function_graph latency trace v1.1.5 on 3.0.0-rc1-tip+ # -------------------------------------------------------------------- # latency: 0 us, #59747/1744610, CPU#0 \| (M:preempt VP:0, KP:0, SP:0 HP:0 #P:2) # ----------------- # \| task: -0 (uid:0 nice:0 policy:0 rt_prio:0) # ----------------- # # _-----=> irqs-off # / _----=> need-resched # \| / _---=> hardirq/softirq # \|\| / _--=> preempt-depth # \|\|\| / # CPU\|\|\|\| DURATION FUNCTION CALLS # \| \|\|\|\| \| \| \| \| \| \| 0) ..s. 1.641 us \| } / __rcu_process_callbacks */ ... Signed-off-by: Jiri Olsa <jolsa@redhat.com> Link: http://lkml.kernel.org/r/1307113131-10045-5-git-send-email-jolsa@redhat.com Signed-off-by: Steven Rostedt <rostedt@goodmis.org>	2011-06-14 22:48:49 -04:00
Jiri Olsa	f56e7f8efb	tracing, function: Fix trace header to follow context-info option The header display of function tracer does not follow the context-info option, so field names are displayed even if this option is off. Added check for TRACE_ITER_CONTEXT_INFO trace_flags. With following commands: # echo function > ./current_tracer # echo 0 > options/context-info # cat trace This is what it looked like before: # tracer: function # # TASK-PID CPU# TIMESTAMP FUNCTION # \| \| \| \| \| add_preempt_count <-schedule rcu_note_context_switch <-schedule ... This is what it looks like now: # tracer: function # _raw_spin_unlock_irqrestore <-hrtimer_try_to_cancel ... Signed-off-by: Jiri Olsa <jolsa@redhat.com> Link: http://lkml.kernel.org/r/1307113131-10045-4-git-send-email-jolsa@redhat.com Signed-off-by: Steven Rostedt <rostedt@goodmis.org>	2011-06-14 22:48:48 -04:00
Jiri Olsa	ffeb80fc30	tracing, function_graph: Merge overhead and duration display functions Functions print_graph_overhead() and print_graph_duration() displays data for one field - DURATION. I merged them into single function print_graph_duration(), and added a way to display the empty parts of the field. This way the print_graph_irq() function can use this column to display the IRQ signs if needed and the DURATION field details stays inside the print_graph_duration() function. Signed-off-by: Jiri Olsa <jolsa@redhat.com> Link: http://lkml.kernel.org/r/1307113131-10045-3-git-send-email-jolsa@redhat.com Signed-off-by: Steven Rostedt <rostedt@goodmis.org>	2011-06-14 22:48:47 -04:00
Jiri Olsa	321e68b095	tracing, function_graph: Remove dependency of abstime and duration fields on latency The display of absolute time and duration fields is based on the latency field. This was added during the irqsoff/wakeup tracers graph support changes. It's causing confusion in what fields will be displayed for the function_graph tracer itself. So I'm removing this depency, and adding absolute time and duration fields to the preemptirqsoff preemptoff irqsoff wakeup tracers. With following commands: # echo function_graph > ./current_tracer # cat trace This is what it looked like before: # tracer: function_graph # # TIME CPU DURATION FUNCTION CALLS # \| \| \| \| \| \| \| \| 0) 0.068 us \| } /* page_add_file_rmap / 0) \| _raw_spin_unlock() { ... This is what it looks like now: # tracer: function_graph # # CPU DURATION FUNCTION CALLS # \| \| \| \| \| \| \| 0) 0.068 us \| } / add_preempt_count / 0) 0.993 us \| } / vfsmount_lock_local_lock */ ... For preemptirqsoff preemptoff irqsoff wakeup tracers, this is what it looked like before: SNIP # _-----=> irqs-off # / _----=> need-resched # \| / _---=> hardirq/softirq # \|\| / _--=> preempt-depth # \|\|\| / _-=> lock-depth # \|\|\|\| / # CPU TASK/PID \|\|\|\|\| DURATION FUNCTION CALLS # \| \| \| \|\|\|\|\| \| \| \| \| \| \| 1) <idle>-0 \| d..1 0.000 us \| acpi_idle_enter_simple(); ... This is what it looks like now: SNIP # # _-----=> irqs-off # / _----=> need-resched # \| / _---=> hardirq/softirq # \|\| / _--=> preempt-depth # \|\|\| / # TIME CPU TASK/PID \|\|\|\| DURATION FUNCTION CALLS # \| \| \| \| \|\|\|\| \| \| \| \| \| \| 19.847735 \| 1) <idle>-0 \| d..1 0.000 us \| acpi_idle_enter_simple(); ... Signed-off-by: Jiri Olsa <jolsa@redhat.com> Link: http://lkml.kernel.org/r/1307113131-10045-2-git-send-email-jolsa@redhat.com Signed-off-by: Steven Rostedt <rostedt@goodmis.org>	2011-06-14 22:48:47 -04:00
Paul McQuade	bd38c0e6f9	ftrace: Fixed an include coding style issue Removed <asm/ftrace.h> because <linux/ftrace.h> was already declared. Braces of struct's coding style fixed. Cc: Frederic Weisbecker <fweisbec@gmail.com> Signed-off-by: Paul McQuade <tungstentide@gmail.com> Link: http://lkml.kernel.org/r/4DE59711.3090900@gmail.com Signed-off-by: Steven Rostedt <rostedt@goodmis.org>	2011-06-14 22:48:45 -04:00
Steven Rostedt	cf30cf67d6	tracing: Add disable_on_free option Add a trace option to disable tracing on free. When this option is set, a write into the free_buffer file will not only shrink the ring buffer down to zero, but it will also disable tracing. Cc: Vaibhav Nagarnaik <vnagarnaik@google.com> Signed-off-by: Steven Rostedt <rostedt@goodmis.org>	2011-06-14 22:48:45 -04:00
Vaibhav Nagarnaik	4f271a2a60	tracing: Add a proc file to stop tracing and free buffer The proc file entry buffer_size_kb is used to set the size of tracing buffer. The memory to expand the buffer size is kernel memory. Consider a use case where tracing is handled by a user space utility, which acts as a gate keeper for tracing requests. In an OOM condition, tracing is considered a low priority task and if the utility gets killed the ring buffer memory cannot be released back to the kernel. This patch adds a proc file called "free_buffer" whose purpose is to stop tracing and free up the ring buffer when it is closed. The user space process can then set the desired size in buffer_size_kb file and open the fd to the "free_buffer" file. Under OOM condition, if the process gets killed, the kernel closes the file descriptor. The release handler stops the tracing and releases the kernel memory automatically. Cc: Ingo Molnar <mingo@redhat.com> Cc: Frederic Weisbecker <fweisbec@gmail.com> Cc: Michael Rubin <mrubin@google.com> Cc: David Sharp <dhsharp@google.com> Signed-off-by: Vaibhav Nagarnaik <vnagarnaik@google.com> Link: http://lkml.kernel.org/r/1308012717-11148-1-git-send-email-vnagarnaik@google.com Signed-off-by: Steven Rostedt <rostedt@goodmis.org>	2011-06-14 22:48:37 -04:00
Vaibhav Nagarnaik	7ea5906405	tracing: Use NUMA allocation for per-cpu ring buffer pages The tracing ring buffer is a group of per-cpu ring buffers where allocation and logging is done on a per-cpu basis. The events that are generated on a particular CPU are logged in the corresponding buffer. This is to provide wait-free writes between CPUs and good NUMA node locality while accessing the ring buffer. However, the allocation routines consider NUMA locality only for buffer page metadata and not for the actual buffer page. This causes the pages to be allocated on the NUMA node local to the CPU where the allocation routine is running at the time. This patch fixes the problem by using a NUMA node specific allocation routine so that the pages are allocated from a NUMA node local to the logging CPU. I tested with the getuid_microbench from autotest. It is a simple binary that calls getuid() in a loop and measures the average time for the syscall to complete. The following command was used to test: $ getuid_microbench 1000000 Compared the numbers found on kernel with and without this patch and found that logging latency decreases by 30-50 ns/call. tracing with non-NUMA allocation - 569 ns/call tracing with NUMA allocation - 512 ns/call Signed-off-by: Vaibhav Nagarnaik <vnagarnaik@google.com> Cc: Frederic Weisbecker <fweisbec@gmail.com> Cc: Ingo Molnar <mingo@redhat.com> Cc: Michael Rubin <mrubin@google.com> Cc: David Sharp <dhsharp@google.com> Link: http://lkml.kernel.org/r/1304470602-20366-1-git-send-email-vnagarnaik@google.com Signed-off-by: Steven Rostedt <rostedt@goodmis.org>	2011-06-14 22:04:39 -04:00
Vaibhav Nagarnaik	e7e2ee89a9	tracing: Schedule a delayed work to call wakeup() In using syscall tracing by concurrent processes, the wakeup() that is called in the event commit function causes contention on the spin lock of the waitqueue. I enabled sys_enter_getuid and sys_exit_getuid tracepoints, and by running getuid_microbench from autotest in parallel I found that the contention causes exponential latency increase in the tracing path. The autotest binary getuid_microbench calls getuid() in a tight loop for the given number of iterations and measures the average time required to complete a single invocation of syscall. The patch schedules a delayed work after 2 ms once an event commit calls to wake up the trace wait_queue. This removes the delay caused by contention on spin lock in wakeup() and amortizes the wakeup() calls scheduled over the 2 ms period. In the following example, the script enables the sys_enter_getuid and sys_exit_getuid tracepoints and runs the getuid_microbench in parallel with the given number of processes. The output clearly shows the latency increase caused by contentions. $ ~/getuid.sh 1 1000000 calls in 0.720974253 s (720.974253 ns/call) $ ~/getuid.sh 2 1000000 calls in 1.166457554 s (1166.457554 ns/call) 1000000 calls in 1.168933765 s (1168.933765 ns/call) $ ~/getuid.sh 3 1000000 calls in 1.783827516 s (1783.827516 ns/call) 1000000 calls in 1.795553270 s (1795.553270 ns/call) 1000000 calls in 1.796493376 s (1796.493376 ns/call) $ ~/getuid.sh 4 1000000 calls in 4.483041796 s (4483.041796 ns/call) 1000000 calls in 4.484165388 s (4484.165388 ns/call) 1000000 calls in 4.484850762 s (4484.850762 ns/call) 1000000 calls in 4.485643576 s (4485.643576 ns/call) $ ~/getuid.sh 5 1000000 calls in 6.497521653 s (6497.521653 ns/call) 1000000 calls in 6.502000236 s (6502.000236 ns/call) 1000000 calls in 6.501709115 s (6501.709115 ns/call) 1000000 calls in 6.502124100 s (6502.124100 ns/call) 1000000 calls in 6.502936358 s (6502.936358 ns/call) After the patch, the latencies scale better. 1000000 calls in 0.728720455 s (728.720455 ns/call) 1000000 calls in 0.842782857 s (842.782857 ns/call) 1000000 calls in 0.883803135 s (883.803135 ns/call) 1000000 calls in 0.902077764 s (902.077764 ns/call) 1000000 calls in 0.902838202 s (902.838202 ns/call) 1000000 calls in 0.908896885 s (908.896885 ns/call) 1000000 calls in 0.932523515 s (932.523515 ns/call) 1000000 calls in 0.958009672 s (958.009672 ns/call) 1000000 calls in 0.986188020 s (986.188020 ns/call) 1000000 calls in 0.989771102 s (989.771102 ns/call) 1000000 calls in 0.933518391 s (933.518391 ns/call) 1000000 calls in 0.958897947 s (958.897947 ns/call) 1000000 calls in 1.031038897 s (1031.038897 ns/call) 1000000 calls in 1.089516025 s (1089.516025 ns/call) 1000000 calls in 1.141998347 s (1141.998347 ns/call) Signed-off-by: Vaibhav Nagarnaik <vnagarnaik@google.com> Cc: Frederic Weisbecker <fweisbec@gmail.com> Cc: Ingo Molnar <mingo@redhat.com> Cc: Michael Rubin <mrubin@google.com> Cc: David Sharp <dhsharp@google.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Link: http://lkml.kernel.org/r/1305059241-7629-1-git-send-email-vnagarnaik@google.com Signed-off-by: Steven Rostedt <rostedt@goodmis.org>	2011-06-14 21:59:41 -04:00
Steven Rostedt	db5e7ecc4a	tracing: Fix regression in printk_formats file The fix to fix the printk_formats of modules broke the printk_formats of trace_printks in the kernel. The update of what to show via the seq_file was only updated if the passed in fmt was NULL, which happens only on the first iteration. The result was showing the first format every time instead of iterating through the available formats. Signed-off-by: Steven Rostedt <rostedt@goodmis.org>	2011-06-09 08:42:15 -04:00
Steven Rostedt	a4f18ed11a	ftrace: Revert `8ab2b7efd` ftrace: Remove unnecessary disabling of irqs Revert the commit that removed the disabling of interrupts around the initial modifying of mcount callers to nops, and update the comment. The original comment was outdated and stated that the interrupts were being disabled to prevent kstop machine, which was required with the old ftrace daemon, but was no longer the case. What the comment failed to mention was that interrupts needed to be disabled to keep interrupts from preempting the modifying of the code and then executing the code that was partially modified. Revert the commit and update the comment. Reported-by: Richard W.M. Jones <rjones@redhat.com> Tested-by: Richard W.M. Jones <rjones@redhat.com> Signed-off-by: Steven Rostedt <rostedt@goodmis.org>	2011-06-07 14:49:19 -04:00
Steven Rostedt	265a5b7ee3	kprobes/trace: Fix kprobe selftest for gcc 4.6 With gcc 4.6, the self test kprobe function: kprobe_trace_selftest_target() is optimized such that kallsyms does not list it. The kprobes test uses this function to insert a probe and test it. But it will fail the test if the function is not listed in kallsyms. Adding a __used annotation keeps the symbol in the kallsyms table. Suggested-by: David Daney <ddaney@caviumnetworks.com> Cc: Masami Hiramatsu <masami.hiramatsu.pt@hitachi.com> Signed-off-by: Steven Rostedt <rostedt@goodmis.org>	2011-06-07 14:47:36 -04:00
GuoWen Li	0aff1c0cef	ftrace: Fix possible undefined return code kernel/trace/ftrace.c: In function 'ftrace_regex_write.clone.15': kernel/trace/ftrace.c:2743:6: warning: 'ret' may be used uninitialized in this function Signed-off-by: GuoWen Li <guowen.li.linux@gmail.com> Link: http://lkml.kernel.org/r/201106011918.47939.guowen.li.linux@gmail.com Signed-off-by: Steven Rostedt <rostedt@goodmis.org>	2011-06-06 22:34:25 -04:00
Steven Rostedt	b1cff0ad10	ftrace: Add internal recursive checks Witold reported a reboot caused by the selftests of the dynamic function tracer. He sent me a config and I used ktest to do a config_bisect on it (as my config did not cause the crash). It pointed out that the problem config was CONFIG_PROVE_RCU. What happened was that if multiple callbacks are attached to the function tracer, we iterate a list of callbacks. Because the list is managed by synchronize_sched() and preempt_disable, the access to the pointers uses rcu_dereference_raw(). When PROVE_RCU is enabled, the rcu_dereference_raw() calls some debugging functions, which happen to be traced. The tracing of the debug function would then call rcu_dereference_raw() which would then call the debug function and then... well you get the idea. I first wrote two different patches to solve this bug. 1) add a __rcu_dereference_raw() that would not do any checks. 2) add notrace to the offending debug functions. Both of these patches worked. Talking with Paul McKenney on IRC, he suggested to add recursion detection instead. This seemed to be a better solution, so I decided to implement it. As the task_struct already has a trace_recursion to detect recursion in the ring buffer, and that has a very small number it allows, I decided to use that same variable to add flags that can detect the recursion inside the infrastructure of the function tracer. I plan to change it so that the task struct bit can be checked in mcount, but as that requires changes to all archs, I will hold that off to the next merge window. Cc: Ingo Molnar <mingo@elte.hu> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Frederic Weisbecker <fweisbec@gmail.com> Cc: Paul E. McKenney <paulmck@linux.vnet.ibm.com> Link: http://lkml.kernel.org/r/1306348063.1465.116.camel@gandalf.stny.rr.com Reported-by: Witold Baryluk <baryluk@smp.if.uj.edu.pl> Signed-off-by: Steven Rostedt <rostedt@goodmis.org>	2011-05-25 22:13:49 -04:00
liubo	2fc1b6f0d0	tracing: Add __print_symbolic_u64 to avoid warnings on 32bit machine Filesystem, like Btrfs, has some "ULL" macros, and when these macros are passed to tracepoints'__print_symbolic(), there will be 64->32 truncate WARNINGS during compiling on 32bit box. Signed-off-by: Liu Bo <liubo2009@cn.fujitsu.com> Link: http://lkml.kernel.org/r/4DACE6E0.7000507@cn.fujitsu.com Signed-off-by: Steven Rostedt <rostedt@goodmis.org>	2011-05-25 22:13:44 -04:00
Steven Rostedt	3b6cfdb171	ftrace: Set ops->flag to enabled even on static function tracing When dynamic ftrace is not configured, the ops->flags still needs to have its FTRACE_OPS_FL_ENABLED bit set in ftrace_startup(). Signed-off-by: Steven Rostedt <rostedt@goodmis.org>	2011-05-25 22:13:42 -04:00
Steven Rostedt	17bb615ad4	tracing: Have event with function tracer check error return The self tests for event tracer does not check if the function tracing was successfully activated. It needs to before it continues the tests, otherwise the wrong errors may be reported. Signed-off-by: Steven Rostedt <rostedt@goodmis.org>	2011-05-25 22:13:39 -04:00
Steven Rostedt	a1cd617359	ftrace: Have ftrace_startup() return failure code The register_ftrace_function() returns an error code on failure except if the call to ftrace_startup() fails. Add a error return to ftrace_startup() if it fails to start, allowing register_ftrace_funtion() to return a proper error value. Signed-off-by: Steven Rostedt <rostedt@goodmis.org>	2011-05-25 22:13:37 -04:00
Linus Torvalds	80fe02b5da	Merge branches 'sched-core-for-linus' and 'sched-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip * 'sched-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip: (60 commits) sched: Fix and optimise calculation of the weight-inverse sched: Avoid going ahead if ->cpus_allowed is not changed sched, rt: Update rq clock when unthrottling of an otherwise idle CPU sched: Remove unused parameters from sched_fork() and wake_up_new_task() sched: Shorten the construction of the span cpu mask of sched domain sched: Wrap the 'cfs_rq->nr_spread_over' field with CONFIG_SCHED_DEBUG sched: Remove unused 'this_best_prio arg' from balance_tasks() sched: Remove noop in alloc_rt_sched_group() sched: Get rid of lock_depth sched: Remove obsolete comment from scheduler_tick() sched: Fix sched_domain iterations vs. RCU sched: Next buddy hint on sleep and preempt path sched: Make set__buddy() work on non-task entities sched: Remove need_migrate_task() sched: Move the second half of ttwu() to the remote cpu sched: Restructure ttwu() some more sched: Rename ttwu_post_activation() to ttwu_do_wakeup() sched: Remove rq argument from ttwu_stat() sched: Remove rq->lock from the first half of ttwu() sched: Drop rq->lock from sched_exec() ... 'sched-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip: sched: Fix rt_rq runtime leakage bug	2011-05-19 17:41:22 -07:00
Steven Rostedt	95950c2ecb	ftrace: Add self-tests for multiple function trace users Add some basic sanity tests for multiple users of the function tracer at startup. Signed-off-by: Steven Rostedt <rostedt@goodmis.org>	2011-05-18 19:24:51 -04:00
Steven Rostedt	936e074b28	ftrace: Modify ftrace_set_filter/notrace to take ops Since users of the function tracer can now pick and choose which functions they want to trace agnostically from other users of the function tracer, we need to pass the ops struct to the ftrace_set_filter() functions. The functions ftrace_set_global_filter() and ftrace_set_global_notrace() is added to keep the old filter functions which are used to modify the generic function tracers. Signed-off-by: Steven Rostedt <rostedt@goodmis.org>	2011-05-18 19:22:52 -04:00
Steven Rostedt	cdbe61bfe7	ftrace: Allow dynamically allocated function tracers Now that functions may be selected individually, it only makes sense that we should allow dynamically allocated trace structures to be traced. This will allow perf to allocate a ftrace_ops structure at runtime and use it to pick and choose which functions that structure will trace. Note, a dynamically allocated ftrace_ops will always be called indirectly instead of being called directly from the mcount in entry.S. This is because there's no safe way to prevent mcount from being preempted before calling the function, unless we modify every entry.S to do so (not likely). Thus, dynamically allocated functions will now be called by the ftrace_ops_list_func() that loops through the ops that are allocated if there are more than one op allocated at a time. This loop is protected with a preempt_disable. To determine if an ftrace_ops structure is allocated or not, a new util function was added to the kernel/extable.c called core_kernel_data(), which returns 1 if the address is between _sdata and _edata. Cc: Paul E. McKenney <paulmck@linux.vnet.ibm.com> Signed-off-by: Steven Rostedt <rostedt@goodmis.org>	2011-05-18 15:29:51 -04:00
Steven Rostedt	b848914ce3	ftrace: Implement separate user function filtering ftrace_ops that are registered to trace functions can now be agnostic to each other in respect to what functions they trace. Each ops has their own hash of the functions they want to trace and a hash to what they do not want to trace. A empty hash for the functions they want to trace denotes all functions should be traced that are not in the notrace hash. Cc: Paul E. McKenney <paulmck@linux.vnet.ibm.com> Signed-off-by: Steven Rostedt <rostedt@goodmis.org>	2011-05-18 15:29:50 -04:00
Steven Rostedt	07fd5515f3	ftrace: Free hash with call_rcu_sched() When a hash is modified and might be in use, we need to perform a schedule RCU operation on it, as the hashes will soon be used directly in the function tracer callback. Cc: Paul E. McKenney <paulmck@linux.vnet.ibm.com> Signed-off-by: Steven Rostedt <rostedt@goodmis.org>	2011-05-18 15:29:50 -04:00
Steven Rostedt	2b499381bc	ftrace: Have global_ops store the functions that are to be traced This is a step towards each ops structure defining its own set of functions to trace. As the current code with pid's and such are specific to the global_ops, it is restructured to be used with the global ops. Signed-off-by: Steven Rostedt <rostedt@goodmis.org>	2011-05-18 15:29:49 -04:00
Steven Rostedt	bd69c30b1d	ftrace: Add ops parameter to ftrace_startup/shutdown functions In order to allow different ops to enable different functions, the ftrace_startup() and ftrace_shutdown() functions need the ops parameter passed to them. Signed-off-by: Steven Rostedt <rostedt@goodmis.org>	2011-05-18 15:29:48 -04:00
Steven Rostedt	647bcd03d5	ftrace: Add enabled_functions file Add the enabled_functions file that is used to show all the functions that have been enabled for tracing as well as their ref counts. This helps seeing if any function has been registered and what functions are being traced. Signed-off-by: Steven Rostedt <rostedt@goodmis.org>	2011-05-18 15:29:47 -04:00
Steven Rostedt	ed926f9b35	ftrace: Use counters to enable functions to trace Every function has its own record that stores the instruction pointer and flags for the function to be traced. There are only two flags: enabled and free. The enabled flag states that tracing for the function has been enabled (actively traced), and the free flag states that the record no longer points to a function and can be used by new functions (loaded modules). These flags are now moved to the MSB of the flags (actually just the top 32bits). The rest of the bits (30 bits) are now used as a ref counter. Everytime a tracer register functions to trace, those functions will have its counter incremented. When tracing is enabled, to determine if a function should be traced, the counter is examined, and if it is non-zero it is set to trace. When a ftrace_ops is registered to trace functions, its hashes are examined. If the ftrace_ops filter_hash count is zero, then all functions are set to be traced, otherwise only the functions in the hash are to be traced. The exception to this is if a function is also in the ftrace_ops notrace_hash. Then that function's counter is not incremented for this ftrace_ops. Signed-off-by: Steven Rostedt <rostedt@goodmis.org>	2011-05-18 15:29:47 -04:00
Steven Rostedt	33dc9b1267	ftrace: Separate hash allocation and assignment When filtering, allocate a hash to insert the function records. After the filtering is complete, assign it to the ftrace_ops structure. This allows the ftrace_ops structure to have a much smaller array of hash buckets instead of wasting a lot of memory. A read only empty_hash is created to be the minimum size that any ftrace_ops can point to. When a new hash is created, it has the following steps: o Allocate a default hash. o Walk the function records assigning the filtered records to the hash o Allocate a new hash with the appropriate size buckets o Move the entries from the default hash to the new hash. Signed-off-by: Steven Rostedt <rostedt@goodmis.org>	2011-05-18 15:29:46 -04:00
Steven Rostedt	f45948e898	ftrace: Create a global_ops to hold the filter and notrace hashes Combine the filter and notrace hashes to be accessed by a single entity, the global_ops. The global_ops is a ftrace_ops structure that is passed to different functions that can read or modify the filtering of the function tracer. The ftrace_ops structure was modified to hold a filter and notrace hashes so that later patches may allow each ftrace_ops to have its own set of rules to what functions may be filtered. Signed-off-by: Steven Rostedt <rostedt@goodmis.org>	2011-05-18 15:29:45 -04:00
Steven Rostedt	1cf41dd799	ftrace: Use hash instead for FTRACE_FL_FILTER When multiple users are allowed to have their own set of functions to trace, having the FTRACE_FL_FILTER flag will not be enough to handle the accounting of those users. Each user will need their own set of functions. Replace the FTRACE_FL_FILTER with a filter_hash instead. This is temporary until the rest of the function filtering accounting gets in. Signed-off-by: Steven Rostedt <rostedt@goodmis.org>	2011-05-18 15:29:44 -04:00
Steven Rostedt	b448c4e3ae	ftrace: Replace FTRACE_FL_NOTRACE flag with a hash of ignored functions To prepare for the accounting system that will allow multiple users of the function tracer, having the FTRACE_FL_NOTRACE as a flag in the dyn_trace record does not make sense. All ftrace_ops will soon have a hash of functions they should trace and not trace. By making a global hash of functions not to trace makes this easier for the transition. Signed-off-by: Steven Rostedt <rostedt@goodmis.org>	2011-05-18 15:29:44 -04:00
Ingo Molnar	9cb5baba5e	Merge commit 'v2.6.39-rc7' into sched/core	2011-05-12 09:36:18 +02:00
Ingo Molnar	932fed4e2e	Merge commit 'v2.6.39-rc7' into perf/core Merge reason: pull in the latest fixes. Signed-off-by: Ingo Molnar <mingo@elte.hu>	2011-05-10 17:05:45 +02:00
Arjan van de Ven	a3a4a5acd3	Regression: partial revert "tracing: Remove lock_depth from event entry" This partially reverts commit `e6e1e25935`. That commit changed the structure layout of the trace structure, which in turn broke PowerTOP (1.9x generation) quite badly. I appreciate not wanting to expose the variable in question, and PowerTOP was not using it, so I've replaced the variable with just a padding field - that way if in the future a new field is needed it can just use this padding field. Signed-off-by: Arjan van de Ven <arjan@linux.intel.com> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2011-05-06 13:20:59 -07:00
Ingo Molnar	ac0a3260f3	Merge branch 'tip/perf/core' of git://git.kernel.org/pub/scm/linux/kernel/git/rostedt/linux-2.6-trace into perf/core	2011-05-01 19:11:42 +02:00

... 2 3 4 5 6 ...

2224 Commits