Commit Graph

900 Commits

Author SHA1 Message Date
Thomas Gleixner
b8f8c3cf0a nohz: prevent tick stop outside of the idle loop
Jack Ren and Eric Miao tracked down the following long standing
problem in the NOHZ code:

	scheduler switch to idle task
	enable interrupts

Window starts here

	----> interrupt happens (does not set NEED_RESCHED)
	      	irq_exit() stops the tick

	----> interrupt happens (does set NEED_RESCHED)

	return from schedule()
	
	cpu_idle(): preempt_disable();

Window ends here

The interrupts can happen at any point inside the race window. The
first interrupt stops the tick, the second one causes the scheduler to
rerun and switch away from idle again and we end up with the tick
disabled.

The fact that it needs two interrupts where the first one does not set
NEED_RESCHED and the second one does made the bug obscure and extremly
hard to reproduce and analyse. Kudos to Jack and Eric.

Solution: Limit the NOHZ functionality to the idle loop to make sure
that we can not run into such a situation ever again.

cpu_idle()
{
	preempt_disable();

	while(1) {
		 tick_nohz_stop_sched_tick(1); <- tell NOHZ code that we
		 			          are in the idle loop

		 while (!need_resched())
		       halt();

		 tick_nohz_restart_sched_tick(); <- disables NOHZ mode
		 preempt_enable_no_resched();
		 schedule();
		 preempt_disable();
	}
}

In hindsight we should have done this forever, but ... 

/me grabs a large brown paperbag.

Debugged-by: Jack Ren <jack.ren@marvell.com>, 
Debugged-by: eric miao <eric.y.miao@gmail.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
2008-07-18 18:10:28 +02:00
David S. Miller
ada44a0430 sparc64: Prevent stack backtrace false positives on trap frames.
When we fully commit to returning back to kernel mode from
a trap, zero out the regs->magic value to prevent false
positives during stack backtraces.

Signed-off-by: David S. Miller <davem@davemloft.net>
2008-05-21 21:50:01 -07:00
David S. Miller
14d2c68baa sparc64: Fix stack tracing through trap frames.
The offset to the pt_regs area was wrong, so we weren't
looking at the right location for the magic cookie.

A trap frame is composed of a "struct sparc_stackf" then
a "struct pt_regs", the code was using "struct reg_window"
instead of "struct sparc_stackf".

Signed-off-by: David S. Miller <davem@davemloft.net>
2008-05-21 18:15:53 -07:00
David S. Miller
a051bc5bb1 sparc64: Fix kernel thread stack termination.
Because of the silly way I set up the initial stack for
new kernel threads, there is a loop at the top of the
stack.

To fix this, properly add another stack frame that is copied
from the parent and terminate it in the child by setting
the frame pointer in that frame to zero.

Signed-off-by: David S. Miller <davem@davemloft.net>
2008-05-21 18:14:28 -07:00
David S. Miller
93dae5b70e sparc64: Add global register dumping facility.
When a cpu really is stuck in the kernel, it can be often
impossible to figure out which cpu is stuck where.  The
worst case is when the stuck cpu has interrupts disabled.

Therefore, implement a global cpu state capture that uses
SMP message interrupts which are not disabled by the
normal IRQ enable/disable APIs of the kernel.

As long as we can get a sysrq 'y' to the kernel, we can
get a dump.  Even if the console interrupt cpu is wedged,
we can trigger it from userspace using /proc/sysrq-trigger

The output is made compact so that this facility is more
useful on high cpu count systems, which is where this
facility will likely find itself the most useful :)

Signed-off-by: David S. Miller <davem@davemloft.net>
2008-05-20 00:33:45 -07:00
Adrian Bunk
b00dc83764 sparc64: remove CVS keywords
This patch removes the CVS keywords that weren't updated for a long time
from comments.

Signed-off-by: Adrian Bunk <bunk@kernel.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-05-20 00:33:43 -07:00
Al Viro
f52111b154 [PATCH] take init_files to fs/file.c
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2008-05-16 17:22:20 -04:00
David S. Miller
9a28dbf8af sparc64: Use a TS_RESTORE_SIGMASK
This mirrors x86 changeset 5a8da0ea82
("signals: x86 TS_RESTORE_SIGMASK") on sparc64.

Signed-off-by: David S. Miller <davem@davemloft.net>
2008-05-12 22:45:15 -07:00
David S. Miller
94d149c34c sparc: Fix mremap address range validation.
Just like mmap, we need to validate address ranges regardless
of MAP_FIXED.

sparc{,64}_mmap_check()'s flag argument is unused, remove.

Based upon a report and preliminary patch by
Jan Lieskovsky <jlieskov@redhat.com>

Signed-off-by: David S. Miller <davem@davemloft.net>
2008-05-12 16:33:33 -07:00
David S. Miller
28e6103665 sparc: Fix debugger syscall restart interactions.
So, forever, we've had this ptrace_signal_deliver implementation
which tries to handle all of the nasties that can occur when the
debugger looks at a process about to take a signal.  It's meant
to address all of these issues inside of the kernel so that the
debugger need not be mindful of such things.

Problem is, this doesn't work.

The idea was that we should do the syscall restart business first, so
that the debugger captures that state.  Otherwise, if the debugger for
example saves the child's state, makes the child execute something
else, then restores the saved state, we won't handle the syscall
restart properly because we lose the "we're in a syscall" state.

The code here worked for most cases, but if the debugger actually
passes the signal through to the child unaltered, it's possible that
we would do a syscall restart when we shouldn't have.

In particular this breaks the case of debugging a process under a gdb
which is being debugged by yet another gdb.  gdb uses sigsuspend
to wait for SIGCHLD of the inferior, but if gdb itself is being
debugged by a top-level gdb we get a ptrace_stop().  The top-level gdb
does a PTRACE_CONT with SIGCHLD to let the inferior gdb see the
signal.  But ptrace_signal_deliver() assumed the debugger would cancel
out the signal and therefore did a syscall restart, because the return
error was ERESTARTNOHAND.

Fix this by simply making ptrace_signal_deliver() a nop, and providing
a way for the debugger to control system call restarting properly:

1) Report a "in syscall" software bit in regs->{tstate,psr}.
   It is set early on in trap entry to a system call and is fully
   visible to the debugger via ptrace() and regsets.

2) Test this bit right before doing a syscall restart.  We have
   to do a final recheck right after get_signal_to_deliver() in
   case the debugger cleared the bit during ptrace_stop().

3) Clear the bit in trap return so we don't accidently try to set
   that bit in the real register.

As a result we also get a ptrace_{is,clear}_syscall() for sparc32 just
like sparc64 has.

M68K has this same exact bug, and is now the only other user of the
ptrace_signal_deliver hook.  It needs to be fixed in the same exact
way as sparc.

Signed-off-by: David S. Miller <davem@davemloft.net>
2008-05-11 02:07:19 -07:00
David S. Miller
986bef854f sparc: Fix ptrace() detach.
Forever we had a PTRACE_SUNOS_DETACH which was unconditionally
recognized, regardless of the personality of the process.

Unfortunately, this value is what ended up in the GLIBC sys/ptrace.h
header file on sparc as PTRACE_DETACH and PT_DETACH.

So continue to recognize this old value.  Luckily, it doesn't conflict
with anything we actually care about.

Signed-off-by: David S. Miller <davem@davemloft.net>
2008-05-11 01:59:05 -07:00
David S. Miller
dc5dc7e6d7 sparc: Fix SA_ONSTACK signal handling.
We need to be more liberal about the alignment of the buffer given to
us by sigaltstack().  The user should not need to be mindful of all of
the alignment constraints we have for the stack frame.

This mirrors how we handle this situation in clone() as well.

Also, we align the stack even in non-SA_ONSTACK cases so that signals
due to bad stack alignment can be delivered properly.  This makes such
errors easier to debug and recover from.

Finally, add the sanity check x86 has to make sure we won't overflow
the signal stack.

This fixes glibc testcases nptl/tst-cancel20.c and
nptl/tst-cancelx20.c

Signed-off-by: David S. Miller <davem@davemloft.net>
2008-05-07 18:54:05 -07:00
David S. Miller
1e38c126c9 sparc: Fix fork/clone/vfork system call restart.
We clobber %i1 as well as %i0 for these system calls,
because they give two return values.

Therefore, on error, we have to restore %i1 properly
or else the restart explodes since it uses the wrong
arguments.

This fixes glibc's nptl/tst-eintr1.c testcase.

Signed-off-by: David S. Miller <davem@davemloft.net>
2008-05-07 16:21:28 -07:00
David S. Miller
5816339310 sparc: Fix mmap VA span checking.
We should not conditionalize VA range checks on MAP_FIXED.

Signed-off-by: David S. Miller <davem@davemloft.net>
2008-05-07 02:24:28 -07:00
David S. Miller
8376005ea4 sparc64: use compat_sys_utimes instead of home-grown local copy.
Noticed by Christoph Hellwig.

Signed-off-by: David S. Miller <davem@davemloft.net>
2008-05-05 12:32:39 -07:00
David S. Miller
81d6ec6b36 Revert "[SPARC64]: Wrap SMP IPIs with irq_enter()/irq_exit()."
This reverts commit 2664ef44cf.

Ingo moved around where the softlockup dependency sits
so this change is no longer necessary.

Signed-off-by: David S. Miller <davem@davemloft.net>
2008-05-03 21:00:55 -07:00
David S. Miller
2678fefedb sparc64: Fix syscall restart, for real...
The change I put into copy_thread() just papered over the real
problem.

When we are looking to see if we should do a syscall restart, when
deliverying a signal, we should only interpret the syscall return
value as an error if the carry condition code(s) are set.

Otherwise it's a success return.

Also, sigreturn paths should do a pt_regs_clear_trap_type().

It turns out that doing a syscall restart when returning from a fork()
does and should happen, from time to time.  Even if copy_thread()
returns success, copy_process() can still unwind and signal
-ERESTARTNOINTR in the parent.

Signed-off-by: David S. Miller <davem@davemloft.net>
2008-05-02 05:22:52 -07:00
David S. Miller
c26d3c0138 sparc64: Stop creating dummy root PCI host controller devices.
It just creates confusion, errors, and bugs.

For one thing, this can cause dup sysfs or procfs nodes to get
created:

[    1.198015] proc_dir_entry '00.0' already registered
[    1.198036] Call Trace:
[    1.198052]  [00000000004f2534] create_proc_entry+0x7c/0x98
[    1.198092]  [00000000005719e4] pci_proc_attach_device+0xa4/0xd4
[    1.198126]  [00000000007d991c] pci_proc_init+0x64/0x88
[    1.198158]  [00000000007c62a4] kernel_init+0x190/0x330
[    1.198183]  [0000000000426cf8] kernel_thread+0x38/0x48
[    1.198210]  [00000000006a0d90] rest_init+0x18/0x5c

Signed-off-by: David S. Miller <davem@davemloft.net>
2008-05-02 05:22:50 -07:00
Huang Weiyi
8cd0ae3acc sparc64: remove duplicated include
Remove dulicated include file <asm/timer.h> in arch/sparc64/kernel/smp.c.

Signed-off-by: Huang Weiyi <hwy@cn.fujitsu.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-04-29 03:19:38 -07:00
David S. Miller
e2fdd7fd99 sparc: Add kgdb support.
Current limitations:

1) On SMP single stepping has some fundamental issues,
   shared with other sw single-step architectures such
   as mips and arm.

2) On 32-bit sparc we don't support SMP kgdb yet.  That
   requires some reworking of the IPI mechanisms and
   infrastructure on that platform.

Signed-off-by: David S. Miller <davem@davemloft.net>
2008-04-29 02:38:50 -07:00
David S. Miller
6eda3a7592 sparc64: Split entry.S up into seperate files.
entry.S was a hodge-podge of several totally unrelated
sets of assembler routines, ranging from FPU trap handlers
to hypervisor call functions.

Split it up into topic-sized pieces.

Signed-off-by: David S. Miller <davem@davemloft.net>
2008-04-28 00:47:20 -07:00
David S. Miller
fd7354108a sparc64: Fix accidental syscall restart on child return from clone/fork/vfork.
This fixes a regression added by
238468b2ac ("[SPARC64]: Use trap type
stored in pt_regs to handle syscall restart.")

Because we now encode the "returning from syscall" status in the
pt_regs area, we have to be mindful to zap it out in the child
of a fork.

During a parallel kernel build I saw an accidental -EINTR return
from vfork() in 'make' because of this bug.

Signed-off-by: David S. Miller <davem@davemloft.net>
2008-04-27 15:09:15 -07:00
David S. Miller
90888816ba sparc64: Clean up handling of pt_regs trap type encoding.
If we use this from more than one place, it's better to
have helpers instead of twiddling magic constants all
over.

Add pt_regs_trap_type(), pt_regs_clear_trap_type(), and
pt_regs_is_syscall().

Use them in do_signal().

Signed-off-by: David S. Miller <davem@davemloft.net>
2008-04-27 14:52:51 -07:00
David S. Miller
5526b7e451 sparc: Remove old style signal frame support.
Back around the same time we were bootstrapping the first 32-bit sparc
Linux kernel with a SunOS userland, we made the signal frame match
that of SunOS.

By the time we even started putting together a native Linux userland
for 32-bit Sparc we realized this layout wasn't sufficient for Linux's
needs.

Therefore we changed the layout, yet kept support for the old style
signal frame layout in there.  The detection mechanism is that we had
sys_sigaction() start passing in a negative signal number to indicate
"new style signal frames please".

Anyways, no binaries exist in the world that use the old stuff.  In
fact, I bet Jakub Jelinek and myself are the only two people who ever
had such binaries to be honest.

So let's get rid of this stuff.

I added an assertion using WARN_ON_ONCE() that makes sure 32-bit
applications are passing in that negative signal number still.

Signed-off-by: David S. Miller <davem@davemloft.net>
2008-04-27 02:26:36 -07:00
David S. Miller
7cf069955f sparc64: Kill bogus RT_ALIGNEDSZ macro from signal.c
The structure has to be 8-byte aligned in size, so
this macro is just noise.

Signed-off-by: David S. Miller <davem@davemloft.net>
2008-04-27 00:25:30 -07:00
David S. Miller
5da496e4b9 sparc64: Kill unused local ISA bus layer.
No more drivers use this, and therefore it can die.

Signed-off-by: David S. Miller <davem@davemloft.net>
2008-04-26 21:41:23 -07:00
David S. Miller
dc8ca2a111 sparc64: Do not ignore 'pmu' device ranges.
I must have disabled this due to other bugs which were fixed over
time.  And this is needed in order for child devices of "pmu"
to get proper resource values.

Signed-off-by: David S. Miller <davem@davemloft.net>
2008-04-26 21:41:20 -07:00
David S. Miller
09337f501e sparc64: Kill CONFIG_SPARC32_COMPAT
It's completely superfluous, CONFIG_COMPAT is sufficient.

What this used to be is an umbrella for enabling code shared
by all 32-bit compat binary support types.  But with the
removal of SunOS and Solaris support, the only one left is
Linux 32-bit ELF.

Update defconfig.

Signed-off-by: David S. Miller <davem@davemloft.net>
2008-04-26 21:41:19 -07:00
David S. Miller
227c331178 sparc64: Fix wedged irq regression.
Kernel bugzilla 10273

As reported by Jos van der Ende, ever since commit
5a606b72a4 ("[SPARC64]: Do not ACK an
INO if it is disabled or inprogress.") sun4u interrupts
can get stuck.

What this changset did was add the following conditional to
the various IRQ chip ->enable() handlers on sparc64:

	if (unlikely(desc->status & (IRQ_DISABLED|IRQ_INPROGRESS)))
		return;

which is correct, however it means that special care is needed
in the ->enable() method.

Specifically we must put the interrupt into IDLE state during
an enable, or else it might never be sent out again.

Setting the INO interrupt state to IDLE resets the state machine,
the interrupt input to the INO is retested by the hardware, and
if an interrupt is being signalled by the device, the INO
moves back into TRANSMIT state, and an interrupt vector is sent
to the cpu.

The two sun4v IRQ chip handlers were already doing this properly,
only sun4u got it wrong.

Signed-off-by: David S. Miller <davem@davemloft.net>
2008-04-26 21:41:15 -07:00
David S. Miller
2664ef44cf [SPARC64]: Wrap SMP IPIs with irq_enter()/irq_exit().
Otherwise all sorts of bad things can happen, including
spurious softlockup reports.

Other platforms have this same bug, in one form or
another, just don't see the issue because they
don't sleep as long as sparc64 can in NOHZ.

Thanks to some brilliant debugging by Peter Zijlstra.

Signed-off-by: David S. Miller <davem@davemloft.net>
2008-04-25 03:11:37 -07:00
David S. Miller
020cfb05f2 [SPARC64]: Fix args to 64-bit sys_semctl() via sys_ipc().
Second and third arguments were swapped for whatever reason.

Reported by Tom Callaway.

Signed-off-by: David S. Miller <davem@davemloft.net>
2008-04-25 02:12:05 -07:00
David S. Miller
77c664fa58 [SPARC64]: Detect trap frames in stack backtraces.
Now that we have a magic cookie in the pt_regs, we can
properly detect trap frames in stack bactraces.

Signed-off-by: David S. Miller <davem@davemloft.net>
2008-04-24 03:28:52 -07:00
David S. Miller
7697daaa89 [SPARC64]: %l6 trap return handling no longer necessary.
Now that we indicate the "restart system call" in the
trap type field of pt_regs->magic, we don't need to
set the %l6 boolean in all of the trap return paths.

And we therefore don't need to pass it to do_notify_resume().

Signed-off-by: David S. Miller <davem@davemloft.net>
2008-04-24 03:15:22 -07:00
David S. Miller
238468b2ac [SPARC64]: Use trap type stored in pt_regs to handle syscall restart.
Now that we can check the trap type directly, we don't need the
funny restart_syscall indication from the trap return paths.

Signed-off-by: David S. Miller <davem@davemloft.net>
2008-04-24 03:01:48 -07:00
David S. Miller
8243e40acb [SPARC64]: Store magic cookie and trap type in pt_regs.
This sets us up for several simplifications and facilities:

1) The magic cookie lets us identify trap frames more
   accurately in stack backtraces.

2) The trap type lets us simplify all of the "are we in
   a syscall" state management and checks.

3) We can now see if a task off the cpu is sleeping in
   a system call or not.  In fact, we can see what
   trap it is sleeping in whatever the type.  The utrace
   guys will use this.

Based upon some discussions with Roland McGrath.

Signed-off-by: David S. Miller <davem@davemloft.net>
2008-04-23 23:32:20 -07:00
David S. Miller
db9a7fb12c [SPARC64]: PROM debug console can be CON_ANYTIME.
No per-cpu or similar resources need to be setup before
we can use this console device.

Signed-off-by: David S. Miller <davem@davemloft.net>
2008-04-23 23:32:20 -07:00
Adrian Bunk
c6ca978370 sparc64: cleanup after SunOS/Solaris binary emulation removal
The following cleanups are now possible:
- arch/sparc64/kernel/entry.S:ret_sys_call no longer has to be global
- arch/sparc64/kernel/sparc64_ksyms.c:
  remove no longer used prototypes

Signed-off-by: Adrian Bunk <bunk@kernel.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-04-23 23:32:19 -07:00
David S. Miller
919ee677b6 [SPARC64]: Add NUMA support.
Currently there is only code to parse NUMA attributes on
sun4v/niagara systems, but later on we will add such parsing
for older systems.

Signed-off-by: David S. Miller <davem@davemloft.net>
2008-04-23 23:32:17 -07:00
David S. Miller
c1b1a5f1f1 [SPARC64]: NUMA device infrastructure.
Record and propagate NUMA information for devices.

Signed-off-by: David S. Miller <davem@davemloft.net>
2008-04-23 23:32:16 -07:00
David S. Miller
0c49a573ea [SPARC64]: Kill pci_iommu_table_init() declaration.
No longer exists.

Signed-off-by: David S. Miller <davem@davemloft.net>
2008-04-23 23:32:15 -07:00
David S. Miller
ce3b1d47a8 [SPARC64]: Once we have the boot cmdline, call parse_early_param()
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-04-23 23:32:14 -07:00
David S. Miller
4a28333984 [SPARC64]: Initialize MDESC earlier and use lmb_alloc()
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-04-23 23:32:12 -07:00
David S. Miller
ad072004ca [SPARC64]: Use lmb_alloc() for PROM device tree.
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-04-23 23:32:11 -07:00
David S. Miller
b97094560b [SPARC64]: Call real_setup_per_cpu_areas() earlier and use lmb_alloc().
We have to do it like this before we can move the PROM and MDESC device
tree code over to using lmb_alloc().

Signed-off-by: David S. Miller <davem@davemloft.net>
2008-04-23 23:32:11 -07:00
Linus Torvalds
8a32272688 Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/sparc-2.6
* git://git.kernel.org/pub/scm/linux/kernel/git/davem/sparc-2.6:
  [SPARC]: Remove SunOS and Solaris binary support.
2008-04-21 17:20:53 -07:00
David S. Miller
ec98c6b9b4 [SPARC]: Remove SunOS and Solaris binary support.
As per Documentation/feature-removal-schedule.txt

Signed-off-by: David S. Miller <davem@davemloft.net>
2008-04-21 15:10:15 -07:00
Matthew Wilcox
950e4da324 arch: Remove unnecessary inclusions of asm/semaphore.h
None of these files use any of the functionality promised by
asm/semaphore.h.  It's possible that they rely on it dragging in some
unrelated header file, but I can't build all these files, so we'll have
fix any build failures as they come up.

Signed-off-by: Matthew Wilcox <willy@linux.intel.com>
2008-04-18 22:14:49 -04:00
Matthew Wilcox
64ac24e738 Generic semaphore implementation
Semaphores are no longer performance-critical, so a generic C
implementation is better for maintainability, debuggability and
extensibility.  Thanks to Peter Zijlstra for fixing the lockdep
warning.  Thanks to Harvey Harrison for pointing out that the
unlikely() was unnecessary.

Signed-off-by: Matthew Wilcox <willy@linux.intel.com>
Acked-by: Ingo Molnar <mingo@elte.hu>
2008-04-17 10:42:34 -04:00
David S. Miller
d786a4a659 [SPARC]: Fix several regset and ptrace bugs.
1) ptrace should pass 'current' to task_user_regset_view()

2) When fetching general registers using a 64-bit view, and
   the target is 32-bit, we have to convert.

3) Skip the whole register window get/set code block if
   the user isn't asking to access anything in there.

   Otherwise we have problems if the user doesn't have
   an address space setup.  Fetching ptrace register is
   still valid at such a time, and ptrace does not try
   to access the register window area of the regset.

Signed-off-by: David S. Miller <davem@davemloft.net>
2008-04-09 19:39:25 -07:00
David S. Miller
ad4f957640 [SPARC64]: Fix user accesses in regset code.
If target is not current we need to use access_process_vm().

Noticed by Roland McGrath.

Signed-off-by: David S. Miller <davem@davemloft.net>
2008-04-03 16:55:14 -07:00