linux_dsm_epyc7002/kernel/trace
Petr Mladek d303de1fcf tracing: Initialize iter->seq after zeroing in tracing_read_pipe()
A customer reported the following softlockup:

[899688.160002] NMI watchdog: BUG: soft lockup - CPU#0 stuck for 22s! [test.sh:16464]
[899688.160002] CPU: 0 PID: 16464 Comm: test.sh Not tainted 4.12.14-6.23-azure #1 SLE12-SP4
[899688.160002] RIP: 0010:up_write+0x1a/0x30
[899688.160002] Kernel panic - not syncing: softlockup: hung tasks
[899688.160002] RIP: 0010:up_write+0x1a/0x30
[899688.160002] RSP: 0018:ffffa86784d4fde8 EFLAGS: 00000257 ORIG_RAX: ffffffffffffff12
[899688.160002] RAX: ffffffff970fea00 RBX: 0000000000000001 RCX: 0000000000000000
[899688.160002] RDX: ffffffff00000001 RSI: 0000000000000080 RDI: ffffffff970fea00
[899688.160002] RBP: ffffffffffffffff R08: ffffffffffffffff R09: 0000000000000000
[899688.160002] R10: 0000000000000000 R11: 0000000000000000 R12: ffff8b59014720d8
[899688.160002] R13: ffff8b59014720c0 R14: ffff8b5901471090 R15: ffff8b5901470000
[899688.160002]  tracing_read_pipe+0x336/0x3c0
[899688.160002]  __vfs_read+0x26/0x140
[899688.160002]  vfs_read+0x87/0x130
[899688.160002]  SyS_read+0x42/0x90
[899688.160002]  do_syscall_64+0x74/0x160

The watchdog caught the process in the middle of trace_access_unlock().
That function contains no loop, so the process must be looping in the
caller, tracing_read_pipe(), via the "waitagain" label.

Crashdump analysis revealed that iter->seq was completely zeroed
at this point, including iter->seq.seq.size. As a result,
print_trace_line() was never able to print anything and
there was no forward progress.
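
To illustrate the failure mode, here is a minimal userspace sketch. It is not
the kernel code: struct demo_seq, demo_seq_init(), demo_seq_puts() and
demo_read_attempts() are hypothetical names invented for this example, and the
retry loop is bounded so that it terminates. It only shows that a buffer whose
capacity field has been zeroed rejects every write, so a caller that retries
until something is printed can never make progress.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-in for a sequence buffer that tracks its own size. */
struct demo_seq {
	char   data[64];
	size_t size;	/* capacity; must be non-zero for writes to succeed */
	size_t len;	/* bytes used so far */
};

static void demo_seq_init(struct demo_seq *s)
{
	s->size = sizeof(s->data);
	s->len = 0;
}

/* Returns the number of bytes written, or 0 if the text does not fit. */
static size_t demo_seq_puts(struct demo_seq *s, const char *str)
{
	size_t n = strlen(str);

	assert(s->len <= s->size);
	if (n > s->size - s->len)
		return 0;
	memcpy(s->data + s->len, str, n);
	s->len += n;
	return n;
}

/*
 * Retry until one line has been printed, loosely modelled on a
 * "try again" path. Bounded by max_tries so the demo terminates.
 * Returns the number of attempts used, or -1 if none succeeded.
 */
static int demo_read_attempts(struct demo_seq *s, int max_tries)
{
	for (int i = 1; i <= max_tries; i++) {
		if (demo_seq_puts(s, "event\n") > 0)
			return i;
	}
	return -1;
}
```

With a properly initialized buffer the first attempt succeeds. After a plain
memset() of the whole structure, size is 0 and every attempt fails, which
in an unbounded loop would look like the reported lockup.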

The culprit seems to be in the code:

	/* reset all but tr, trace, and overruns */
	memset(&iter->seq, 0,
	       sizeof(struct trace_iterator) -
	       offsetof(struct trace_iterator, seq));

It was added by commit 53d0aa7730 ("ftrace:
add logic to record overruns"), which landed in v2.6.27-rc1.
At that time, iter->seq looked like:

	struct trace_seq {
		unsigned char		buffer[PAGE_SIZE];
		unsigned int		len;
	};

There was no "size" field, so zeroing was perfectly fine.

The solution is to reinitialize the structure after or without
zeroing.
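
The "reinitialize after zeroing" option can be sketched in userspace as
follows. This is an illustration under stated assumptions, not the actual
patch: struct demo_iter, struct demo_seq, demo_seq_init() and
demo_iter_reset() are hypothetical names, and the field layout is invented.
The point is only that the init call must follow the partial memset() so that
the size field is restored.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical buffer that carries its own capacity. */
struct demo_seq {
	char   data[64];
	size_t size;
	size_t len;
};

/* Hypothetical iterator: fields placed before 'seq' survive a reset. */
struct demo_iter {
	int             overruns;	/* preserved across resets */
	struct demo_seq seq;		/* first field that gets zeroed */
	long            pos;		/* zeroed together with 'seq' */
};

static void demo_seq_init(struct demo_seq *s)
{
	s->size = sizeof(s->data);
	s->len = 0;
}

static void demo_iter_reset(struct demo_iter *it)
{
	size_t off = offsetof(struct demo_iter, seq);

	/* Zero everything from 'seq' to the end of the structure. */
	memset((char *)it + off, 0, sizeof(*it) - off);

	/* Restore the invariant that zeroing destroyed: a non-zero size. */
	demo_seq_init(&it->seq);

	assert(it->seq.size == sizeof(it->seq.data));
}
```

After demo_iter_reset(), the leading field keeps its value, the tail is
cleared, and the buffer is usable again because its size is non-zero.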

Link: http://lkml.kernel.org/r/20191011142134.11997-1-pmladek@suse.com

Signed-off-by: Petr Mladek <pmladek@suse.com>
Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
2019-10-12 20:49:34 -04:00
blktrace.c
bpf_trace.c Merge git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net 2019-09-28 17:47:33 -07:00
fgraph.c ftrace: Look up the address of return_to_handler() using helpers 2019-09-18 12:24:47 +10:00
ftrace_internal.h treewide: Rename rcu_dereference_raw_notrace() to _check() 2019-08-01 14:16:21 -07:00
ftrace.c tracing: Add locked_down checks to the open calls of files created for tracefs 2019-10-12 20:48:06 -04:00
Kconfig Merge git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net-next 2019-09-18 12:34:53 -07:00
Makefile
power-traces.c
preemptirq_delay_test.c
ring_buffer_benchmark.c tracing: Use CONFIG_PREEMPTION 2019-07-31 19:03:35 +02:00
ring_buffer.c
rpm-traces.c
trace_benchmark.c
trace_benchmark.h
trace_branch.c
trace_clock.c
trace_dynevent.c tracing: Add tracing_check_open_get_tr() 2019-10-12 20:44:07 -04:00
trace_dynevent.h tracing/dynevent: Pass extra arguments to match operation 2019-08-31 12:19:38 -04:00
trace_entries.h
trace_event_perf.c tracing: Pass type into tracing_generic_entry_update() 2019-07-16 15:14:48 -04:00
trace_events_filter_test.h
trace_events_filter.c tracing: Have error path in predicate_parse() free its allocated memory 2019-09-28 17:13:39 -04:00
trace_events_hist.c tracing: Add locked_down checks to the open calls of files created for tracefs 2019-10-12 20:48:06 -04:00
trace_events_trigger.c tracing: Add locked_down checks to the open calls of files created for tracefs 2019-10-12 20:48:06 -04:00
trace_events.c tracing: Add locked_down checks to the open calls of files created for tracefs 2019-10-12 20:48:06 -04:00
trace_export.c
trace_functions_graph.c fgraph: Remove redundant ftrace_graph_notrace_addr() test 2019-07-30 21:50:03 -04:00
trace_functions.c
trace_hwlat.c tracing/hwlat: Don't ignore outer-loop duration when calculating max_latency 2019-10-12 20:49:33 -04:00
trace_irqsoff.c
trace_kdb.c
trace_kprobe_selftest.c
trace_kprobe_selftest.h
trace_kprobe.c tracing: Add locked_down checks to the open calls of files created for tracefs 2019-10-12 20:48:06 -04:00
trace_mmiotrace.c
trace_nop.c
trace_output.c tracing: Be more clever when dumping hex in __print_hex() 2019-09-17 11:21:28 -04:00
trace_output.h
trace_preemptirq.c
trace_printk.c tracing: Add locked_down checks to the open calls of files created for tracefs 2019-10-12 20:48:06 -04:00
trace_probe_tmpl.h
trace_probe.c tracing/probe: Fix to check the difference of nr_args before adding probe 2019-09-28 17:07:53 -04:00
trace_probe.h tracing/probe: Reject exactly same probe event 2019-09-19 11:09:16 -04:00
trace_sched_switch.c
trace_sched_wakeup.c sched/core: Convert get_task_struct() to return the task 2019-07-25 15:51:54 +02:00
trace_selftest_dynamic.c
trace_selftest.c
trace_seq.c
trace_stack.c tracing: Add locked_down checks to the open calls of files created for tracefs 2019-10-12 20:48:06 -04:00
trace_stat.c tracing: Add locked_down checks to the open calls of files created for tracefs 2019-10-12 20:48:06 -04:00
trace_stat.h
trace_syscalls.c
trace_uprobe.c tracing: Add locked_down checks to the open calls of files created for tracefs 2019-10-12 20:48:06 -04:00
trace.c tracing: Initialize iter->seq after zeroing in tracing_read_pipe() 2019-10-12 20:49:34 -04:00
trace.h tracing: Add tracing_check_open_get_tr() 2019-10-12 20:44:07 -04:00
tracing_map.c
tracing_map.h