The structure of the ret_stack array on the task struct is going to
change, and accessing it directly via the curr_ret_stack index will no
longer give the ret_stack entry that holds the return address. To access
that, architectures must now use ftrace_graph_get_ret_stack() to get the
associated ret_stack that matches the saved return address.
Cc: linux-arm-kernel@lists.infradead.org
Cc: Will Deacon <will.deacon@arm.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Catalin Marinas <catalin.marinas@arm.com>
Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
do_task_stat() calls get_wchan(), which further does unwind_frame().
unwind_frame() restores frame->pc to original value in case function
graph tracer has modified a return address (LR) in a stack frame to hook
a function return. However, if function graph tracer has hit a filtered
function, then we can't unwind it as ftrace_push_return_trace() has
biased the index(frame->graph) with a 'huge negative'
offset(-FTRACE_NOTRACE_DEPTH).
Moreover, arm64 stack walker defines index(frame->graph) as unsigned
int, which can not compare a -ve number.
Similar problem we can have with calling of walk_stackframe() from
save_stack_trace_tsk() or dump_backtrace().
This patch fixes unwind_frame() to test the index for -ve value and
restore index accordingly before we can restore frame->pc.
Reproducer:
cd /sys/kernel/debug/tracing/
echo schedule > set_graph_notrace
echo 1 > options/display-graph
echo wakeup > current_tracer
ps -ef | grep -i agent
Above commands result in:
Unable to handle kernel paging request at virtual address ffff801bd3d1e000
pgd = ffff8003cbe97c00
[ffff801bd3d1e000] *pgd=0000000000000000, *pud=0000000000000000
Internal error: Oops: 96000006 [#1] SMP
[...]
CPU: 5 PID: 11696 Comm: ps Not tainted 4.11.0+ #33
[...]
task: ffff8003c21ba000 task.stack: ffff8003cc6c0000
PC is at unwind_frame+0x12c/0x180
LR is at get_wchan+0xd4/0x134
pc : [<ffff00000808892c>] lr : [<ffff0000080860b8>] pstate: 60000145
sp : ffff8003cc6c3ab0
x29: ffff8003cc6c3ab0 x28: 0000000000000001
x27: 0000000000000026 x26: 0000000000000026
x25: 00000000000012d8 x24: 0000000000000000
x23: ffff8003c1c04000 x22: ffff000008c83000
x21: ffff8003c1c00000 x20: 000000000000000f
x19: ffff8003c1bc0000 x18: 0000fffffc593690
x17: 0000000000000000 x16: 0000000000000001
x15: 0000b855670e2b60 x14: 0003e97f22cf1d0f
x13: 0000000000000001 x12: 0000000000000000
x11: 00000000e8f4883e x10: 0000000154f47ec8
x9 : 0000000070f367c0 x8 : 0000000000000000
x7 : 00008003f7290000 x6 : 0000000000000018
x5 : 0000000000000000 x4 : ffff8003c1c03cb0
x3 : ffff8003c1c03ca0 x2 : 00000017ffe80000
x1 : ffff8003cc6c3af8 x0 : ffff8003d3e9e000
Process ps (pid: 11696, stack limit = 0xffff8003cc6c0000)
Stack: (0xffff8003cc6c3ab0 to 0xffff8003cc6c4000)
[...]
[<ffff00000808892c>] unwind_frame+0x12c/0x180
[<ffff000008305008>] do_task_stat+0x864/0x870
[<ffff000008305c44>] proc_tgid_stat+0x3c/0x48
[<ffff0000082fde0c>] proc_single_show+0x5c/0xb8
[<ffff0000082b27e0>] seq_read+0x160/0x414
[<ffff000008289e6c>] __vfs_read+0x58/0x164
[<ffff00000828b164>] vfs_read+0x88/0x144
[<ffff00000828c2e8>] SyS_read+0x60/0xc0
[<ffff0000080834a0>] __sys_trace_return+0x0/0x4
Fixes: 20380bb390 (arm64: ftrace: fix a stack tracer's output under function graph tracer)
Signed-off-by: Pratyush Anand <panand@redhat.com>
Signed-off-by: Jerome Marchand <jmarchan@redhat.com>
[catalin.marinas@arm.com: replace WARN_ON with WARN_ON_ONCE]
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
The unwind code sets the sp member of struct stackframe to
'frame pointer + 0x10' unconditionally, without regard for whether
doing so produces a legal value. So let's simply remove it now that
we have stopped using it anyway.
Signed-off-by: Ard Biesheuvel <ard.biesheuvel@linaro.org>
Signed-off-by: Mark Rutland <mark.rutland@arm.com>
Cc: Catalin Marinas <catalin.marinas@arm.com>
Cc: James Morse <james.morse@arm.com>
Cc: Will Deacon <will.deacon@arm.com>
The function name is now renamed to 'timer_probe' for consistency with
the CLOCKSOURCE_OF_DECLARE => TIMER_OF_DECLARE change.
Signed-off-by: Daniel Lezcano <daniel.lezcano@linaro.org>
Acked-by: Viresh Kumar <viresh.kumar@linaro.org>
Acked-by: Heiko Stuebner <heiko@sntech.de>
Reviewed-by: Linus Walleij <linus.walleij@linaro.org>
Function graph tracer modifies a return address (LR) in a stack frame
to hook a function return. This will result in many useless entries
(return_to_handler) showing up in
a) a stack tracer's output
b) perf call graph (with perf record -g)
c) dump_backtrace (at panic et al.)
For example, in case of a),
$ echo function_graph > /sys/kernel/debug/tracing/current_tracer
$ echo 1 > /proc/sys/kernel/stack_trace_enabled
$ cat /sys/kernel/debug/tracing/stack_trace
Depth Size Location (54 entries)
----- ---- --------
0) 4504 16 gic_raise_softirq+0x28/0x150
1) 4488 80 smp_cross_call+0x38/0xb8
2) 4408 48 return_to_handler+0x0/0x40
3) 4360 32 return_to_handler+0x0/0x40
...
In case of b),
$ echo function_graph > /sys/kernel/debug/tracing/current_tracer
$ perf record -e mem:XXX:x -ag -- sleep 10
$ perf report
...
| | |--0.22%-- 0x550f8
| | | 0x10888
| | | el0_svc_naked
| | | sys_openat
| | | return_to_handler
| | | return_to_handler
...
In case of c),
$ echo function_graph > /sys/kernel/debug/tracing/current_tracer
$ echo c > /proc/sysrq-trigger
...
Call trace:
[<ffffffc00044d3ac>] sysrq_handle_crash+0x24/0x30
[<ffffffc000092250>] return_to_handler+0x0/0x40
[<ffffffc000092250>] return_to_handler+0x0/0x40
...
This patch replaces such entries with real addresses preserved in
current->ret_stack[] at unwind_frame(). This way, we can cover all
the cases.
Reviewed-by: Jungseok Lee <jungseoklee85@gmail.com>
Signed-off-by: AKASHI Takahiro <takahiro.akashi@linaro.org>
[will: fixed minor context changes conflicting with irq stack bits]
Signed-off-by: Will Deacon <will.deacon@arm.com>
Function graph tracer modifies a return address (LR) in a stack frame
to hook a function's return. This will result in many useless entries
(return_to_handler) showing up in a call stack list.
We will fix this problem in a later patch ("arm64: ftrace: fix a stack
tracer's output under function graph tracer"). But since real return
addresses are saved in ret_stack[] array in struct task_struct,
unwind functions need to be notified of, in addition to a stack pointer
address, which task is being traced in order to find out real return
addresses.
This patch extends unwind functions' interfaces by adding an extra
argument of a pointer to task_struct.
Signed-off-by: AKASHI Takahiro <takahiro.akashi@linaro.org>
Signed-off-by: Will Deacon <will.deacon@arm.com>
Seeing the 'of' characters in a symbol that is being called from
ACPI seems to freak out people. So let's do a bit of pointless
renaming so that these folks do feel at home.
Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
Acked-by: Catalin Marinas <catalin.marinas@arm.com>
Reviewed-by: Hanjun Guo <hanjun.guo@linaro.org>
Acked-by: Thomas Gleixner <tglx@linutronix.de>
Tested-by: Hanjun Guo <hanjun.guo@linaro.org>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
It is now absolutely trivial to convert the arch timer driver to
use ACPI probing, just like its DT counterpart.
Let's enjoy another crapectomy.
Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
Acked-by: Catalin Marinas <catalin.marinas@arm.com>
Acked-by: Thomas Gleixner <tglx@linutronix.de>
Tested-by: Hanjun Guo <hanjun.guo@linaro.org>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Nobody seems to be producing !SMP systems anymore, so this is just
becoming a source of kernel bugs, particularly if people want to use
coherent DMA with non-shared pages.
This patch forces CONFIG_SMP=y for arm64, removing a modest amount of
code in the process.
Signed-off-by: Will Deacon <will.deacon@arm.com>
Using the information presented by GTDT (Generic Timer Description Table)
to initialize the arch timer (not memory-mapped).
CC: Daniel Lezcano <daniel.lezcano@linaro.org>
CC: Thomas Gleixner <tglx@linutronix.de>
Originally-by: Amit Daniel Kachhap <amit.daniel@samsung.com>
Tested-by: Suravee Suthikulpanit <Suravee.Suthikulpanit@amd.com>
Tested-by: Yijing Wang <wangyijing@huawei.com>
Tested-by: Mark Langsdorf <mlangsdo@redhat.com>
Tested-by: Jon Masters <jcm@redhat.com>
Tested-by: Timur Tabi <timur@codeaurora.org>
Tested-by: Robert Richter <rrichter@cavium.com>
Acked-by: Robert Richter <rrichter@cavium.com>
Acked-by: Daniel Lezcano <daniel.lezcano@linaro.org>
Reviewed-by: Grant Likely <grant.likely@linaro.org>
Signed-off-by: Hanjun Guo <hanjun.guo@linaro.org>
Signed-off-by: Will Deacon <will.deacon@arm.com>
On platforms implementing CPU power management, the CPUidle subsystem
can allow CPUs to enter idle states where local timers logic is lost on power
down. To keep the software timers functional the kernel relies on an
always-on broadcast timer to be present in the platform to relay the
interrupt signalling the timer expiries.
For platforms implementing CPU core gating that do not implement an always-on
HW timer or implement it in a broken way, this patch adds code to initialize
the kernel hrtimer based clock event device upon boot (which can be chosen as
tick broadcast device by the kernel).
It relies on a dynamically chosen CPU to be always powered-up. This CPU then
relays the timer interrupt to CPUs in deep-idle states through its HW local
timer device.
Having a CPU always-on has implications on power management platform
capabilities and makes CPUidle suboptimal, since at least a CPU is kept
always in a shallow idle state by the kernel to relay timer interrupts,
but at least leaves the kernel with a functional system with some working
power management capabilities.
The hrtimer based clock event device is unconditionally registered, but
has the lowest possible rating such that any broadcast-capable HW clock
event device present will be chosen in preference as the tick broadcast
device.
Reviewed-by: Preeti U Murthy <preeti@linux.vnet.ibm.com>
Acked-by: Will Deacon <will.deacon@arm.com>
Acked-by: Mark Rutland <mark.rutland@arm.com>
Signed-off-by: Lorenzo Pieralisi <lorenzo.pieralisi@arm.com>
Signed-off-by: Will Deacon <will.deacon@arm.com>
Clock providers should be initialized before clocksource_of_init.
If not, Clock source initialization can be fail to get the clock.
Acked-by: Will Deacon <will.deacon@arm.com>
Signed-off-by: Chanho Min <chanho.min@lge.com>
Signed-off-by: Will Deacon <will.deacon@arm.com>
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
Register with the generic sched_clock framework now that it
supports 64 bits. This fixes two problems with the current
sched_clock support for machines using the architected timers.
First off, we don't subtract the start value from subsequent
sched_clock calls so we can potentially start off with
sched_clock returning gigantic numbers. Second, there is no
support for suspend/resume handling so problems such as discussed
in 6a4dae5 (ARM: 7565/1: sched: stop sched_clock() during
suspend, 2012-10-23) can happen without this patch. Finally, it
allows us to move the sched_clock setup into drivers clocksource
out of the arch ports.
Cc: Christopher Covington <cov@codeaurora.org>
Cc: Catalin Marinas <catalin.marinas@arm.com>
Acked-by: Will Deacon <will.deacon@arm.com>
Signed-off-by: Stephen Boyd <sboyd@codeaurora.org>
Signed-off-by: John Stultz <john.stultz@linaro.org>
Under arm64, we will calibrate the delay loop statically using a known
timer frequency, so delete read_current_timer(), or it will cause
compiling issue with allmodconfig.
The related error:
ERROR: "read_current_timer" [lib/rbtree_test.ko] undefined!
ERROR: "read_current_timer" [lib/interval_tree_test.ko] undefined!
ERROR: "read_current_timer" [fs/ext4/ext4.ko] undefined!
ERROR: "read_current_timer" [crypto/tcrypt.ko] undefined!
Signed-off-by: Chen Gang <gang.chen@asianux.com>
Acked-by: Marc Zyngier <marc.zyngier@arm.com>
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
This converts arm and arm64 to use CLKSRC_OF DT based initialization for
the arch timer. A new function arch_timer_arch_init is added to allow for
arch specific setup.
This has a side effect of enabling sched_clock on omap5 and exynos5. There
should not be any reason not to use the arch timers for sched_clock.
Signed-off-by: Rob Herring <rob.herring@calxeda.com>
Cc: Russell King <linux@arm.linux.org.uk>
Cc: Kukjin Kim <kgene.kim@samsung.com>
Cc: Tony Lindgren <tony@atomide.com>
Cc: Simon Horman <horms@verge.net.au>
Cc: Magnus Damm <magnus.damm@gmail.com>
Cc: Catalin Marinas <catalin.marinas@arm.com>
Cc: Will Deacon <will.deacon@arm.com>
Cc: John Stultz <john.stultz@linaro.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: linux-samsung-soc@vger.kernel.org
Cc: linux-omap@vger.kernel.org
Cc: linux-sh@vger.kernel.org
Acked-by: Santosh Shilimkar <santosh.shilimkar@ti.com>
The arch_timer driver supports a superset of the functionality of the
arm_generic driver, and is not tied to a particular arch.
This patch moves arm64 to use the arch_timer driver, gaining additional
functionality in doing so, and removes the (now unused) arm_generic
driver. Timer-related hooks specific to arm64 are moved into
arch/arm64/kernel/time.c.
Signed-off-by: Mark Rutland <mark.rutland@arm.com>
Acked-by: Catalin Marinas <catalin.marinas@arm.com>
Acked-by: Marc Zyngier <marc.zyngier@arm.com>
Acked-by: Santosh Shilimkar <santosh.shilimkar@ti.com>
This patch adds support for the ARM generic timers with A64 instructions
for accessing the timer registers. It uses the physical counter as the
clock source and the virtual counter as sched_clock.
The timer frequency can be specified via DT or read from the CNTFRQ_EL0
register. The physical counter is also accessible from user space
allowing fast gettimeofday() implementation.
Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
Signed-off-by: Will Deacon <will.deacon@arm.com>
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
Acked-by: Tony Lindgren <tony@atomide.com>
Acked-by: Nicolas Pitre <nico@linaro.org>
Acked-by: Olof Johansson <olof@lixom.net>
Acked-by: Santosh Shilimkar <santosh.shilimkar@ti.com>
Acked-by: Arnd Bergmann <arnd@arndb.de>