This allows us to move duplicated code in <asm/atomic.h>
(atomic_inc_not_zero() for now) to <linux/atomic.h>
Signed-off-by: Arun Sharma <asharma@fb.com>
Reviewed-by: Eric Dumazet <eric.dumazet@gmail.com>
Cc: Ingo Molnar <mingo@elte.hu>
Cc: David Miller <davem@davemloft.net>
Cc: Eric Dumazet <eric.dumazet@gmail.com>
Acked-by: Mike Frysinger <vapier@gentoo.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
If an ARM system has multiple cpus in the same socket and the
kernel is booted with maxcpus=1, secondary cpus are possible but
not present due to how platform_smp_prepare_cpus() is called.
Since most typical ARM processors don't actually support physical
hotplug, initialize the present map to be equal to the possible
map in generic ARM SMP code. Also, always call
platform_smp_prepare_cpus() as long as max_cpus is non-zero (0
means no SMP) to allow platform code to do any SMP setup.
After applying this patch it's possible to boot an ARM system
with maxcpus=1 on the command line and then hotplug in secondary
cpus via sysfs. This is more in line with how x86 does things.
Signed-off-by: Stephen Boyd <sboyd@codeaurora.org>
Cc: Paul Mundt <lethal@linux-sh.org>
Cc: Kukjin Kim <kgene.kim@samsung.com>
Cc: David Brown <davidb@codeaurora.org>
Cc: Tony Lindgren <tony@atomide.com>
Cc: Srinidhi Kasagar <srinidhi.kasagar@stericsson.com>
Cc: Linus Walleij <linus.walleij@stericsson.com>
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
When we bring a CPU online, we should wait for it to become active
before entering the idle thread, so we know that the scheduler and
thread migration is going to work.
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
This patch makes TTBR1 point to swapper_pg_dir so that global, kernel
mappings can be used exclusively on v6 and v7 cores where they are
needed.
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
Signed-off-by: Will Deacon <will.deacon@arm.com>
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
Rather than having each platform class provide a mach/smp.h header for
smp_cross_call(), arrange for them to register the function with the
core ARM SMP code instead.
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
This function is only called by percpu_timer_setup() which is
also __cpuinit marked. Thus it's safe to mark this function as
__cpuinit as well.
Signed-off-by: Stephen Boyd <sboyd@codeaurora.org>
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
For future rework of try_to_wake_up() we'd like to push part of that
function onto the CPU the task is actually going to run on.
In order to do so we need a generic callback from the existing scheduler IPI.
This patch introduces such a generic callback: scheduler_ipi() and
implements it as a NOP.
BenH notes: PowerPC might use this IPI on offline CPUs under rare conditions!
Acked-by: Russell King <rmk+kernel@arm.linux.org.uk>
Acked-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
Acked-by: Chris Metcalf <cmetcalf@tilera.com>
Acked-by: Jesper Nilsson <jesper.nilsson@axis.com>
Acked-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Signed-off-by: Ralf Baechle <ralf@linux-mips.org>
Reviewed-by: Frank Rowand <frank.rowand@am.sony.com>
Cc: Mike Galbraith <efault@gmx.de>
Cc: Nick Piggin <npiggin@kernel.dk>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
Link: http://lkml.kernel.org/r/20110405152728.744338123@chello.nl
The current code support of dummy timers in absence of local
timer is compile time. This is an attempt to convert it to runtime
so that on few SOC version if the local timers aren't supported
kernel can switch to dummy timers. OMAP4430 ES1.0 does suffer from
this limitation.
This patch should not have any functional impact on affected
files.
Cc: Daniel Walker <dwalker@codeaurora.org>
Cc: Bryan Huntsman <bryanh@codeaurora.org>
Cc: Tony Lindgren <tony@atomide.com>
Cc: Kukjin Kim <kgene.kim@samsung.com>
Cc: Paul Mundt <lethal@linux-sh.org>
Cc: Magnus Damm <magnus.damm@gmail.com>
Cc: Colin Cross <ccross@android.com>
Cc: Erik Gilling <konkers@android.com>
Cc: Srinidhi Kasagar <srinidhi.kasagar@stericsson.com>
Cc: Linus Walleij <linus.walleij@stericsson.com>
Signed-off-by: Santosh Shilimkar <santosh.shilimkar@ti.com>
Acked-by: David Brown <davidb@codeaurora.org>
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
We have two places where we create identity mappings - one when we bring
secondary CPUs online, and one where we setup some mappings for soft-
reboot. Combine these two into a single implementation. Also collect
the identity mapping deletion function.
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
The MMU is always configured to read page tables from the L2 cache
so there's little point flushing them out of the L2 cache back to
RAM. Remove these flushes.
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
When we soft-CPU hotplug a CPU, we reset the stack pointer and
jump back to start_secondary(). This allows us to restart as if
the CPU was actually reset.
However, we weren't resetting the frame pointer, which could cause
problems with backtracing. Reset the frame pointer to zero (which
means no parent frame) just like the early assembly code also does.
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
smp.c is becoming too large, so split out the TLB maintainence
broadcasting into a separate smp_tlb.c file.
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
When a CPU is hot unplugged, the generic tick code cleans up the
clock event device, but fails to call down to the device's set_mode
function to actually shut the device down.
To work around this, we've historically had a local_timer_stop()
callback out of the hotplug code. However, this adds needless
complexity when we have the clock event device itself available.
Explicitly call the clock event device's set_mode function with
CLOCK_EVT_MODE_UNUSED, so that the hardware can be cleanly shutdown
without any special external callbacks. When/if the generic code
is fixed, percpu_timer_stop() can be killed off.
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
We used to print a bland error message which gave no clue as to the
failure when we failed to bring up a secondary CPU. Resolve this by
separating the two failure cases.
If boot_secondary() fails, we print a message indicating the returned
error code from boot_secondary():
"CPU%u: failed to boot: %d\n", cpu, ret.
However, if boot_secondary() succeeded, but the CPU did not appear to
mark itself online within the timeout, indicate that it failed to come
online:
"CPU%u: failed to come online\n", cpu
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
Don't call idle_task_exit() with interrupts disabled, and ensure
that we have a memory barrier after interrupts are disabled but
before signalling that this CPU has shut down.
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
We always need to wait for the dying CPU to reach a safe state before
taking it down, irrespective of the requirements of the platform.
Move the completion code into the ARM SMP hotplug code rather than
having each platform re-implement this.
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
All platforms call trace_hardirqs_off() in their secondary startup code,
so move this into the core SMP code - it doesn't need to be in the
per-platform code.
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
There is a certain amount of smp_prepare_cpus() which doesn't belong
in the platform support code - that is, code which is invariant to the
SMP implementation. Move this code into arch/arm/kernel/smp.c, and
add a platform_ prefix to the original function.
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
Wait for CPUs to indicate that they've stopped, after sending the
stop IPI, rather than blindly continuing on and hoping that they've
stopped in time. Print a warning if we fail to stop the other CPUs.
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
The IPI and local timer interrupts weren't being properly accounted
for in /proc/stat. Collect them from the irq_stat structure, and
return their sum.
Reviewed-by: Catalin Marinas <catalin.marinas@arm.com>
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
This separates out the individual IPI interrupt counts from the
total IPI count, which allows better visibility of what IPIs are
being used for.
Reviewed-by: Catalin Marinas <catalin.marinas@arm.com>
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
As per x86, align the initial column according to how many IRQs we
have. Also, provide an english explaination for the 'LOC:' and
'IPI:' lines.
Reviewed-by: Catalin Marinas <catalin.marinas@arm.com>
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
Move the ipi_count into irq_stat, which allows the ipi_data structure
to be entirely removed.
Reviewed-by: Catalin Marinas <catalin.marinas@arm.com>
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
Provide __inc_irq_stat() and __get_irq_stat() to increment and
read the irq stat counters.
Reviewed-by: Catalin Marinas <catalin.marinas@arm.com>
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
send_ipi_message() does nothing except call smp_cross_call(). As
this is a static function, nothing external to this file calls it,
so we can easily clean up this now unnecessary indirection.
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
We should not be incrementing mm_users when we startup a secondary
CPU - doing so results in mm_users incrementing by one each time we
hotplug a CPU, which will eventually wrap, and will cause problems.
Other architectures such as x86 do not increment mm_users, but only
mm_count, so we follow that pattern.
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
As we've now removed the spinlock and bitmask, we have nothing left
which requires interrupts to be disabled when sending an IPI. All
current IPI-sending implementations use the GIC, which also does not
require interrupts disabled when calling gic_raise_softirq().
Remove the now unnecessary IRQ disable.
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
Avoid using bitmasks and locks in the percpu area for IPIs, and instead
use individual software generated interrupts to identify the reason for
the IPI. This avoids the problems of having spinlocks in the percpu
area.
Reviewed-by: Catalin Marinas <catalin.marinas@arm.com>
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
This allows us to use smp_cross_call() to trigger a number of different
software generated interrupts, rather than combining them all on one
SGI. Recover the SGI number via do_IPI.
Reviewed-by: Catalin Marinas <catalin.marinas@arm.com>
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
When FUNCTION_GRAPH_TRACER is enabled, place do_IRQ() and friends in the
IRQ_ENTRY section so that the irq-related features of the function graph
tracer work.
Signed-off-by: Rabin Vincent <rabin@rab.in>
Make the entire kernel image available for secondary CPUs rather
than just the first MB of memory. This allows the startup code
to appear in the cpuinit sections.
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
No need to send IPI if there's one CPU, especially when booting
systems with CONFIG_SMP_ON_UP that may not even support IPI.
Signed-off-by: Tony Lindgren <tony@atomide.com>
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
x86 calls machine_shutdown() from the various machine_*() calls which
take the machine down ready for halting, restarting, etc, and uses
this to bring the system safely to a point where those actions can be
performed. Such actions are stopping the secondary CPUs.
So, change the ARM implementation of these to reflect what x86 does.
This solves kexec problems on ARM SMP platforms, where the secondary
CPUs were left running across the kexec call.
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
The TWD local timers are unable to wake up the CPU when it is placed
into a low power mode, eg. C3. Therefore, we need to adapt things
such that the TWD code can cope with this.
We do this by always providing a broadcast tick function, and marking
the fact that the TWD local timer will stop in low power modes. This
means that when the CPU is placed into a low power mode, the core
timer code marks this fact, and allows an IPI to be given to the core.
Tested-by: Santosh Shilimkar <santosh.shilimkar@ti.com>
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
Cc: Catalin Marinas <catalin.marinas@arm.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
This patch fixes the preempt leak in the cpuidle path invoked from
cpu-hotplug. The fix is suggested by Russell King and is based
on x86 idea of calling init_idle() on the idle task when it's
re-used which also resets the preempt count amongst other things
dump:
BUG: scheduling while atomic: swapper/0/0x00000002
Modules linked in:
Backtrace:
[<c0024f90>] (dump_backtrace+0x0/0x110) from [<c0173bc4>] (dump_stack+0x18/0x1c)
r7:c02149e4 r6:c033df00 r5:c7836000 r4:00000000
[<c0173bac>] (dump_stack+0x0/0x1c) from [<c003b4f0>] (__schedule_bug+0x60/0x70)
[<c003b490>] (__schedule_bug+0x0/0x70) from [<c0174214>] (schedule+0x98/0x7b8)
r5:c7836000 r4:c7836000
[<c017417c>] (schedule+0x0/0x7b8) from [<c00228c4>] (cpu_idle+0xb4/0xd4)
# [<c0022810>] (cpu_idle+0x0/0xd4) from [<c0171dd8>] (secondary_start_kernel+0xe0/0xf0)
r5:c7836000 r4:c0205f40
[<c0171cf8>] (secondary_start_kernel+0x0/0xf0) from [<c002d57c>] (prm_rmw_mod_reg_bits+0x88/0xa4)
r7:c02149e4 r6:00000001 r5:00000001 r4:c7836000
Backtrace aborted due to bad frame pointer <c7837fbc>
Cc: Catalin Marinas <catalin.marinas@arm.com>
Signed-off-by: Santosh Shilimkar <santosh.shilimkar@ti.com>
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
The page table and secondary data which we're asking the secondary CPU
to make use of has to hit RAM to ensure that the secondary CPU can see
it since it may not be taking part in coherency or cache searches at
this point.
Acked-by: Santosh Shilimkar <santosh.shilimkar@ti.com>
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
Fix:
WARNING: vmlinux.o(.text+0x247c): Section mismatch in reference from the function cpu_idle() to the function .cpuexit.text:cpu_die()
The function cpu_idle() references a function in an exit section.
Often the function cpu_die() has valid usage outside the exit section
and the fix is to remove the __cpuexit annotation of cpu_die.
WARNING: vmlinux.o(.cpuexit.text+0x3c): Section mismatch in reference from the function cpu_die() to the function .cpuinit.text:secondary_start_kernel()
The function __cpuexit cpu_die() references
a function __cpuinit secondary_start_kernel().
This is often seen when error handling in the exit function
uses functionality in the init path.
The fix is often to remove the __cpuinit annotation of
secondary_start_kernel() so it may be used outside an init section.
Sam says:
> The annotation of cpu_die() is wrong.
> To be annotated __cpuexit the function shall:
> - be used in exit context and only in exit context with HOTPLUG_CPU=n
> - be used outside exit context with HOTPLUG_CPU=y
So, this also means __cpu_disable(), __cpu_die() and twd_timer_stop() are
also wrong. However, removing __cpuexit from cpu_die() creates:
WARNING: vmlinux.o(.text+0x6834): Section mismatch in reference from the function cpu_die() to the function .cpuinit.text:secondary_start_kernel()
The function cpu_die() references
the function __cpuinit secondary_start_kernel().
This is often because cpu_die lacks a __cpuinit
annotation or the annotation of secondary_start_kernel is wrong.
so fix this using __ref.
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
Acked-by: Sam Ravnborg <sam@ravnborg.org>
We suffer an unfortunate combination of "features" which makes highmem
support on platforms without hardware TLB maintainence broadcast difficult:
- we need kmap_high_get() support for DMA cache coherence
- this requires kmap_high() to take a spinlock with IRQs disabled
- kmap_high() occasionally calls flush_all_zero_pkmaps() to clear
out old mappings
- flush_all_zero_pkmaps() calls flush_tlb_kernel_range(), which
on s/w IPI'd systems eventually calls smp_call_function_many()
- smp_call_function_many() must not be called with IRQs disabled:
WARNING: at kernel/smp.c:380 smp_call_function_many+0xc4/0x240()
Modules linked in:
Backtrace:
[<c00306f0>] (dump_backtrace+0x0/0x108) from [<c0286e6c>] (dump_stack+0x18/0x1c)
r6:c007cd18 r5:c02ff228 r4:0000017c
[<c0286e54>] (dump_stack+0x0/0x1c) from [<c0053e08>] (warn_slowpath_common+0x50/0x80)
[<c0053db8>] (warn_slowpath_common+0x0/0x80) from [<c0053e50>] (warn_slowpath_null+0x18/0x1c)
r7:00000003 r6:00000001 r5:c1ff4000 r4:c035fa34
[<c0053e38>] (warn_slowpath_null+0x0/0x1c) from [<c007cd18>] (smp_call_function_many+0xc4/0x240)
[<c007cc54>] (smp_call_function_many+0x0/0x240) from [<c007cec0>] (smp_call_function+0x2c/0x38)
[<c007ce94>] (smp_call_function+0x0/0x38) from [<c005980c>] (on_each_cpu+0x1c/0x38)
[<c00597f0>] (on_each_cpu+0x0/0x38) from [<c0031788>] (flush_tlb_kernel_range+0x50/0x58)
r6:00000001 r5:00000800 r4:c05f3590
[<c0031738>] (flush_tlb_kernel_range+0x0/0x58) from [<c009c600>] (flush_all_zero_pkmaps+0xc0/0xe8)
[<c009c540>] (flush_all_zero_pkmaps+0x0/0xe8) from [<c009c6b4>] (kmap_high+0x8c/0x1e0)
[<c009c628>] (kmap_high+0x0/0x1e0) from [<c00364a8>] (kmap+0x44/0x5c)
[<c0036464>] (kmap+0x0/0x5c) from [<c0109dfc>] (cramfs_readpage+0x3c/0x194)
[<c0109dc0>] (cramfs_readpage+0x0/0x194) from [<c0090c14>] (__do_page_cache_readahead+0x1f0/0x290)
[<c0090a24>] (__do_page_cache_readahead+0x0/0x290) from [<c0090ce4>] (ra_submit+0x30/0x38)
[<c0090cb4>] (ra_submit+0x0/0x38) from [<c0089384>] (filemap_fault+0x3dc/0x438)
r4:c1819988
[<c0088fa8>] (filemap_fault+0x0/0x438) from [<c009d21c>] (__do_fault+0x58/0x43c)
[<c009d1c4>] (__do_fault+0x0/0x43c) from [<c009e8cc>] (handle_mm_fault+0x104/0x318)
[<c009e7c8>] (handle_mm_fault+0x0/0x318) from [<c0033c98>] (do_page_fault+0x188/0x1e4)
[<c0033b10>] (do_page_fault+0x0/0x1e4) from [<c0033ddc>] (do_translation_fault+0x7c/0x84)
[<c0033d60>] (do_translation_fault+0x0/0x84) from [<c002b474>] (do_DataAbort+0x40/0xa4)
r8:c1ff5e20 r7:c0340120 r6:00000805 r5:c1ff5e54 r4:c03400d0
[<c002b434>] (do_DataAbort+0x0/0xa4) from [<c002bcac>] (__dabt_svc+0x4c/0x60)
...
So we disable highmem support on these systems.
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
Makes code futureproof against the impending change to mm->cpu_vm_mask.
It's also a chance to use the new cpumask_ ops which take a pointer
(the older ones are deprecated, but there's no hurry for arch code).
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>