Some architectures provide an execve function that does not set errno, but
instead returns the result code directly. Rename these to kernel_execve to
get the right semantics there. Moreover, there is no reasone for any of these
architectures to still provide __KERNEL_SYSCALLS__ or _syscallN macros, so
remove these right away.
[akpm@osdl.org: build fix]
[bunk@stusta.de: build fix]
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Cc: Andi Kleen <ak@muc.de>
Acked-by: Paul Mackerras <paulus@samba.org>
Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Cc: Richard Henderson <rth@twiddle.net>
Cc: Ivan Kokshaysky <ink@jurassic.park.msu.ru>
Cc: Russell King <rmk@arm.linux.org.uk>
Cc: Ian Molton <spyro@f2s.com>
Cc: Mikael Starvik <starvik@axis.com>
Cc: David Howells <dhowells@redhat.com>
Cc: Yoshinori Sato <ysato@users.sourceforge.jp>
Cc: Hirokazu Takata <takata.hirokazu@renesas.com>
Cc: Ralf Baechle <ralf@linux-mips.org>
Cc: Kyle McMartin <kyle@mcmartin.ca>
Cc: Heiko Carstens <heiko.carstens@de.ibm.com>
Cc: Martin Schwidefsky <schwidefsky@de.ibm.com>
Cc: Paul Mundt <lethal@linux-sh.org>
Cc: Kazumoto Kojima <kkojima@rr.iij4u.or.jp>
Cc: Richard Curnow <rc@rc0.org.uk>
Cc: William Lee Irwin III <wli@holomorphy.com>
Cc: "David S. Miller" <davem@davemloft.net>
Cc: Jeff Dike <jdike@addtoit.com>
Cc: Paolo 'Blaisorblade' Giarrusso <blaisorblade@yahoo.it>
Cc: Miles Bader <uclinux-v850@lsi.nec.co.jp>
Cc: Chris Zankel <chris@zankel.net>
Cc: "Luck, Tony" <tony.luck@intel.com>
Cc: Geert Uytterhoeven <geert@linux-m68k.org>
Cc: Roman Zippel <zippel@linux-m68k.org>
Signed-off-by: Adrian Bunk <bunk@stusta.de>
Cc: Arnd Bergmann <arnd@arndb.de>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
On systems running with virtual cpus there is optimization potential in
regard to spinlocks and rw-locks. If the virtual cpu that has taken a lock
is known to a cpu that wants to acquire the same lock it is beneficial to
yield the timeslice of the virtual cpu in favour of the cpu that has the
lock (directed yield).
With CONFIG_PREEMPT="n" this can be implemented by the architecture without
common code changes. Powerpc already does this.
With CONFIG_PREEMPT="y" the lock loops are coded with _raw_spin_trylock,
_raw_read_trylock and _raw_write_trylock in kernel/spinlock.c. If the lock
could not be taken cpu_relax is called. A directed yield is not possible
because cpu_relax doesn't know anything about the lock. To be able to
yield the lock in favour of the current lock holder variants of cpu_relax
for spinlocks and rw-locks are needed. The new _raw_spin_relax,
_raw_read_relax and _raw_write_relax primitives differ from cpu_relax
insofar that they have an argument: a pointer to the lock structure.
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
Cc: Ingo Molnar <mingo@elte.hu>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Haavard Skinnemoen <hskinnemoen@atmel.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
One of the changes necessary for shared page tables is to standardize the
pxx_page macros. pte_page and pmd_page have always returned the struct
page associated with their entry, while pte_page_kernel and pmd_page_kernel
have returned the kernel virtual address. pud_page and pgd_page, on the
other hand, return the kernel virtual address.
Shared page tables needs pud_page and pgd_page to return the actual page
structures. There are very few actual users of these functions, so it is
simple to standardize their usage.
Since this is basic cleanup, I am submitting these changes as a standalone
patch. Per Hugh Dickins' comments about it, I am also changing the
pxx_page_kernel macros to pxx_page_vaddr to clarify their meaning.
Signed-off-by: Dave McCracken <dmccr@us.ibm.com>
Cc: Hugh Dickins <hugh@veritas.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
Remove definitions of PAGE_* from the user view
Delete unnecessary comments referring to the size of pages
Only include <asm-generic> if we're in __KERNEL__
Signed-off-by: David Woodhouse <dwmw2@infradead.org>
Signed-off-by: Matthew Wilcox <matthew@wil.cx>
Signed-off-by: Kyle McMartin <kyle@parisc-linux.org>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
set_wmb should not be used in the kernel because it just confuses the
code more and has no benefit. Since it is not currently used in the
kernel this patch removes it so that new code does not include it.
All archs define set_wmb(var, value) to do { var = value; wmb(); }
while(0) except ia64 and sparc which use a mb() instead. But this is
still moot since it is not used anyway.
Hasn't been tested on any archs but x86 and x86_64 (and only compiled
tested)
Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
Use the new IRQF_ constants and remove the SA_INTERRUPT define
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Cc: Ingo Molnar <mingo@elte.hu>
Cc: "David S. Miller" <davem@davemloft.net>
Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Cc: Kyle McMartin <kyle@mcmartin.ca>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
This patch implements an API whereby an application can determine the
label of its peer's Unix datagram sockets via the auxiliary data mechanism of
recvmsg.
Patch purpose:
This patch enables a security-aware application to retrieve the
security context of the peer of a Unix datagram socket. The application
can then use this security context to determine the security context for
processing on behalf of the peer who sent the packet.
Patch design and implementation:
The design and implementation is very similar to the UDP case for INET
sockets. Basically we build upon the existing Unix domain socket API for
retrieving user credentials. Linux offers the API for obtaining user
credentials via ancillary messages (i.e., out of band/control messages
that are bundled together with a normal message). To retrieve the security
context, the application first indicates to the kernel such desire by
setting the SO_PASSSEC option via getsockopt. Then the application
retrieves the security context using the auxiliary data mechanism.
An example server application for Unix datagram socket should look like this:
toggle = 1;
toggle_len = sizeof(toggle);
setsockopt(sockfd, SOL_SOCKET, SO_PASSSEC, &toggle, &toggle_len);
recvmsg(sockfd, &msg_hdr, 0);
if (msg_hdr.msg_controllen > sizeof(struct cmsghdr)) {
cmsg_hdr = CMSG_FIRSTHDR(&msg_hdr);
if (cmsg_hdr->cmsg_len <= CMSG_LEN(sizeof(scontext)) &&
cmsg_hdr->cmsg_level == SOL_SOCKET &&
cmsg_hdr->cmsg_type == SCM_SECURITY) {
memcpy(&scontext, CMSG_DATA(cmsg_hdr), sizeof(scontext));
}
}
sock_setsockopt is enhanced with a new socket option SOCK_PASSSEC to allow
a server socket to receive security context of the peer.
Testing:
We have tested the patch by setting up Unix datagram client and server
applications. We verified that the server can retrieve the security context
using the auxiliary data mechanism of recvmsg.
Signed-off-by: Catherine Zhang <cxzhang@watson.ibm.com>
Acked-by: Acked-by: James Morris <jmorris@namei.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
* master.kernel.org:/pub/scm/linux/kernel/git/kyle/parisc-2.6: (23 commits)
[PARISC] Move os_id_to_string() inside #ifndef __ASSEMBLY__
[PARISC] Fix do_gettimeofday() hang
[PARISC] Fix PCREL22F relocation problem for most modules
[PARISC] Refactor show_regs in traps.c
[PARISC] Add os_id_to_string helper
[PARISC] OS_ID_LINUX == 0x0006
[PARISC] Ensure Space ID hashing is turned off
[PARISC] Match show_cache_info with reality
[PARISC] Remove unused macro fixup_branch in syscall.S
[PARISC] Add is_compat_task() helper
[PARISC] Update Thibaut Varene's CREDITS entry
[PARISC] Reduce data footprint in pdc_stable.c
[PARISC] pdc_stable version 0.30
[PARISC] Work around machines which do not support chassis warnings
[PARISC] PDC_CHASSIS is implemented on all machines
[PARISC] Remove unconditional #define PIC in syscall macros
[PARISC] Use MFIA in current_text_addr on pa2.0 processors
[PARISC] Remove dead function pc_in_user_space
[PARISC] Test ioc_needs_fdc variable instead of open coding
[PARISC] Fix gcc 4.1 warnings in sba_iommu.c
...
Add ->retrigger() irq op to consolidate hw_irq_resend() implementations.
(Most architectures had it defined to NOP anyway.)
NOTE: ia64 needs testing. i386 and x86_64 tested.
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
Add a helper to asm/pdc.h to translate OS_ID values to strings
and use it in the pdc_stable driver.
Signed-off-by: Kyle McMartin <kyle@parisc-linux.org>
We were assigned an OS_ID of 0x0006. Consistently use OS_ID_LINUX
instead of using the magic number. Also update the OS_ID_ defines in
asm/pdc.h to reflect this.
Signed-off-by: Kyle McMartin <kyle@parisc-linux.org>
Check PDC_CACHE to see if spaceid hashing is turned on, and fail to
boot if that is the case.
However, some old machines do not implement the PDC_CACHE_RET_SPID
firmware call, so continue to boot if the call fails because of
PDC_BAD_OPTION (but fail in all other error returns).
Signed-off-by: Kyle McMartin <kyle@parisc-linux.org>
show_cache_info and struct pdc_cache_cf were out of sync with
published documentation. Fix the reporting of cache associativity
and update the pdc_cache_cf bitfields to match documentation.
Signed-off-by: Kyle McMartin <kyle@parisc-linux.org>
This patch removes a limitation of the original code, so that CHASSIS
codes can be sent to all machines. On machines with a LCD panel, this
code displays "INI" during bootup, "RUN" when the system is booted and
running, "FLT" when a panic occurs, etc.
This part of the code can be enabled/disabled through CONFIG_PDC_CHASSIS
This patch also adds minimalistic support for Chassis warnings, through
a proc entry '/proc/chassis', which will reflect the warnings status (PSU
or fans failure when they happen, NVRAM battery level and temperature
thresholds overflows).
This part of the code can be enabled/disabled through CONFIG_PDC_CHASSIS_WARN
Signed-off-by: Thibaut VARENE <varenet@parisc-linux.org>
Signed-off-by: Kyle McMartin <kyle@parisc-linux.org>
Joel Soete noticed correctly that the fixup's clobbers must be listed
as the ASM clobbers. FIXUP_BRANCH in unaligned.c has a new macro which
lists all the clobbers in the fixup, we use this throughout the file
to simplify the process of listing clobbers in the future.
A missing "r1" clobber is added to our uaccess.h for the 64-bit
__put_kernel_asm. Interestingly this is a pretty serious bug since gcc
generates pretty good use of r1 as a temporary and the uses of
__put_kernel_asm are varied and dangerous if r1 is scratched during
an invalid write.
Signed-off-by: Joel Soete <soete.joel@tiscali.be>
Signed-off-by: Carlos O'Donell <carlos@parisc-linux.org>
Signed-off-by: Kyle McMartin <kyle@parisc-linux.org>
ldcw,co should always be used on pa2.0, otherwise the strict cache
width alignment requirement is not relaxed.
Signed-off-by: Kyle McMartin <kyle@parisc-linux.org>
The floppy driver is already calling add_disk_randomness as it should, so this
was redundant.
Signed-off-by: Matt Mackall <mpm@selenic.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
These include nothing more than the basic set of files listed in
asm-generic/Kbuild.asm. Any extra arch-specific files will need to be
added.
Signed-off-by: David Woodhouse <dwmw2@infradead.org>
These aren't needed by glibc or klibc, and they're broken in some cases
anyway. The uClibc folks are apparently switching over to stop using
them too (now that we agreed that they should be dropped, at least).
Signed-off-by: David Woodhouse <dwmw2@infradead.org>
Since it is way more work to change most drivers to comply with parisc, take
the easy way out and make ioremap _NO_CACHE by default. This is in line with
what powerpc does.
Signed-off-by: Kyle McMartin <kyle@parisc-linux.org>
Most are easy, but sync_file_range needed special handling when entering
through the 32-bit syscall table.
Signed-off-by: Kyle McMartin <kyle@parisc-linux.org>
More work towards supporing multiple page sizes on 64-bit. Convert
some assumptions that 64bit uses 3 level page tables into testing
PT_NLEVELS. Also some BUG() to BUG_ON() conversions and some cleanups
to assembler.
Signed-off-by: Helge Deller <deller@parisc-linux.org>
Signed-off-by: Kyle McMartin <kyle@parisc-linux.org>
Current implementations define NODES_SHIFT in include/asm-xxx/numnodes.h for
each arch. Its definition is sometimes configurable. Indeed, ia64 defines 5
NODES_SHIFT values in the current git tree. But it looks a bit messy.
SGI-SN2(ia64) system requires 1024 nodes, and the number of nodes already has
been changeable by config. Suitable node's number may be changed in the
future even if it is other architecture. So, I wrote configurable node's
number.
This patch set defines just default value for each arch which needs multi
nodes except ia64. But, it is easy to change to configurable if necessary.
On ia64 the number of nodes can be already configured in generic ia64 and SN2
config. But, NODES_SHIFT is defined for DIG64 and HP'S machine too. So, I
changed it so that all platforms can be configured via CONFIG_NODES_SHIFT. It
would be simpler.
See also: http://marc.theaimsgroup.com/?l=linux-kernel&m=114358010523896&w=2
Signed-off-by: Yasunori Goto <y-goto@jp.fujitsu.com>
Cc: Hirokazu Takata <takata@linux-m32r.org>
Cc: "Luck, Tony" <tony.luck@intel.com>
Cc: Andi Kleen <ak@muc.de>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Cc: Ivan Kokshaysky <ink@jurassic.park.msu.ru>
Cc: Richard Henderson <rth@twiddle.net>
Cc: Kyle McMartin <kyle@mcmartin.ca>
Cc: Russell King <rmk@arm.linux.org.uk>
Cc: Ralf Baechle <ralf@linux-mips.org>
Cc: Jack Steiner <steiner@sgi.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
Fix up some ISA/EISA stuff.
(Note: isa_ accessors have been removed from asm/io.h)
Signed-off-by: Helge Deller <deller@parisc-linux.org>
Signed-off-by: Kyle McMartin <kyle@parisc-linux.org>
Remove CONFIG_DEBUG_IOREMAP, it's now obsolete and won't work anyway.
Remove it from lib/KConfig since it was only available on parisc.
Signed-off-by: Helge Deller <deller@parisc-linux.org>
Signed-off-by: Kyle McMartin <kyle@parisc-linux.org>
Enable CONFIG_HPPA_IOREMAP by default and remove all now unnecessary code.
Signed-off-by: Helge Deller <deller@parisc-linux.org>
Signed-off-by: Kyle McMartin <kyle@parisc-linux.org>
Instead of making it a #define in asm/io.h, allow user to select
to turn on IOREMAP from the config menu.
Signed-off-by: Helge Deller <deller@parisc-linux.org>
Signed-off-by: Kyle McMartin <kyle@parisc-linux.org>
We need to do a little renaming of our original syntax because
of the difference in arguments.
Signed-off-by: James Bottomley <jejb@parisc-linux.org>
Signed-off-by: Kyle McMartin <kyle@parisc-linux.org>
This should now allow SG_IO and fuse to function correctly on our
platform.
Signed-off-by: James Bottomley <jejb@parisc-linux.org>
Signed-off-by: Kyle McMartin <kyle@parisc-linux.org>
Fix a lot of typos. Eyeballed by jmc@ in OpenBSD.
Signed-off-by: Alexey Dobriyan <adobriyan@gmail.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
Noticed by Michael Tokarev
add missing ()-pair in __ffz() macro for parisc
Signed-off-by: Akinobu Mita <mita@miraclelinux.com>
Cc: Kyle McMartin <kyle@mcmartin.ca>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
Implement the half-closed devices notifiation, by adding a new POLLRDHUP
(and its alias EPOLLRDHUP) bit to the existing poll/select sets. Since the
existing POLLHUP handling, that does not report correctly half-closed
devices, was feared to be changed, this implementation leaves the current
POLLHUP reporting unchanged and simply add a new bit that is set in the few
places where it makes sense. The same thing was discussed and conceptually
agreed quite some time ago:
http://lkml.org/lkml/2003/7/12/116
Since this new event bit is added to the existing Linux poll infrastruture,
even the existing poll/select system calls will be able to use it. As far
as the existing POLLHUP handling, the patch leaves it as is. The
pollrdhup-2.6.16.rc5-0.10.diff defines the POLLRDHUP for all the existing
archs and sets the bit in the six relevant files. The other attached diff
is the simple change required to sys/epoll.h to add the EPOLLRDHUP
definition.
There is "a stupid program" to test POLLRDHUP delivery here:
http://www.xmailserver.org/pollrdhup-test.c
It tests poll(2), but since the delivery is same epoll(2) will work equally.
Signed-off-by: Davide Libenzi <davidel@xmailserver.org>
Cc: "David S. Miller" <davem@davemloft.net>
Cc: Michael Kerrisk <mtk-manpages@gmx.net>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
unused isa_...() helpers removed.
Adrian Bunk:
The asm-sh part was rediffed due to unrelated changes.
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
Signed-off-by: Adrian Bunk <bunk@stusta.de>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
Seems like needless clutter having a bunch of #if defined(CONFIG_$ARCH) in
include/linux/cache.h. Move the per architecture section definition to
asm/cache.h, and keep the if-not-defined dummy case in linux/cache.h to
catch architectures which don't implement the section.
Verified that symbols still go in .data.read_mostly on parisc,
and the compile doesn't break.
Signed-off-by: Kyle McMartin <kyle@parisc-linux.org>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
Make new MADV_REMOVE, MADV_DONTFORK, MADV_DOFORK consistent across all
arches. The idea is to make it possible to use them portably even before
distros include them in libc headers.
Move common flags to asm-generic/mman.h
Signed-off-by: Michael S. Tsirkin <mst@mellanox.co.il>
Cc: Roland Dreier <rolandd@cisco.com>
Cc: Badari Pulavarty <pbadari@us.ibm.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
Currently, copy-on-write may change the physical address of a page even if the
user requested that the page is pinned in memory (either by mlock or by
get_user_pages). This happens if the process forks meanwhile, and the parent
writes to that page. As a result, the page is orphaned: in case of
get_user_pages, the application will never see any data hardware DMA's into
this page after the COW. In case of mlock'd memory, the parent is not getting
the realtime/security benefits of mlock.
In particular, this affects the Infiniband modules which do DMA from and into
user pages all the time.
This patch adds madvise options to control whether memory range is inherited
across fork. Useful e.g. for when hardware is doing DMA from/into these
pages. Could also be useful to an application wanting to speed up its forks
by cutting large areas out of consideration.
Signed-off-by: Michael S. Tsirkin <mst@mellanox.co.il>
Acked-by: Hugh Dickins <hugh@veritas.com>
Cc: Michael Kerrisk <mtk-manpages@gmx.net>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
Wire up some new syscalls that have been merged upstream,
o inotify
o openat et al
o pselect6/ppoll
o migrate_pages
Signed-off-by: Kyle McMartin <kyle@parisc-linux.org>
Add enough arch-specific compat signals code to enable parisc64
to compile and boot out of the mainline tree. There are likely still
many dragons here, but this is a start to squashing the last
big difference between the mainline tree and the parisc-linux tree.
The remaining bugs can be squashed as they come up.
Signed-off-by: Kyle McMartin <kyle@parisc-linux.org>
Add the parisc version of the "mark rodata section read only" patches.
Based on code from and Signed-off-by Arjan van de Ven
<arjan@infradead.org>, Ingo Molnar <mingo@elte.hu>, Andi Kleen <ak@muc.de>,
Andrew Morton <akpm@osdl.org>, Linus Torvalds <torvalds@osdl.org>.
Signed-off-by: Helge Deller <deller@parisc-linux.org>
Signed-off-by: Kyle McMartin <kyle@parisc-linux.org>
Implement atomic64_t so atomic_long_t works on parisc. Also
clean up some of the coding style in atomic.h, and make sure
ATOMIC_INIT is cast properly.
Signed-off-by: Kyle McMartin <kyle@parisc-linux.org>
Drop the unused do_check_pgt_cache routine from mm/init.c and its
prototype in asm/pgalloc.h
Signed-off-by: Helge Deller <deller@parisc-linux.org>
Signed-off-by: Kyle McMartin <kyle@parisc-linux.org>
Remove two unnecessary extern declarations from asm/pci.h.
They collide with what gcc4.0 assumed was static (and should be static).
Found by Joel Soete.
Signed-off-by: Grant Grundler <grundler@parisc-linux.org>
Signed-off-by: Kyle McMartin <kyle@parisc-linux.org>
Helge,
o Convert a bunch of kmalloc/memset uses to kzalloc.
o pci.c: Add some __read_mostly annotations.
o pci.c: Move constant pci_post_reset_delay to asm/pci.h
o grfioctl.h: Add A4450A to comment of CRT_ID_VISUALIZE_EG.
o Add some consts to perf.c/perf_images.h
Matthew,
o sticore.c: Add some consts to suppress compile warnings.
Signed-off-by: Helge Deller <deller@parisc-linux.org>
Signed-off-by: Matthew Wilcox <willy@parisc-linux.org>
Signed-off-by: Kyle McMartin <kyle@parisc-linux.org>
{get,put}_thread_info() were introduced in 2.5.4 and never
had been called by anything in the tree.
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
Add per-arch sched_cacheflush() which is a write-back cacheflush used by
the migration-cost calibration code at bootup time.
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Cc: Nick Piggin <nickpiggin@yahoo.com.au>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
Cleanup asm-parisc/processor.h to use C99 initializers in
INIT_THREAD().
Signed-off-by: Alexey Dobriyan <adobriyan@gmail.com>
Signed-off-by: Matthew Wilcox <matthew@wil.cx>
Signed-off-by: Kyle McMartin <kyle@parisc-linux.org>
Change to asm-parisc/pci.h makes the define of PCI_HOST_ADDR symmetrical
with PCI_BUS_ADDR. Also add a comment about PA_VIEW and LMMIO/ELMMIO/GMMIO.
Signed-off-by: Grant Grundler <grundler@parisc-linux.org>
Signed-off-by: Kyle McMartin <kyle@parisc-linux.org>
Define some constants for HugeTLB pages, not that parisc-linux supports
it yet.
Signed-off-by: Helge Deller <deller@parisc-linux.org>
Signed-off-by: Kyle McMartin <kyle@parisc-linux.org>
Make flush_data_cache_local, flush_instruction_cache_local and
flush_tlb_all_local take a void * so they don't have to be cast
when using on_each_cpu(). This becomes a problem when on_each_cpu
is a macro (as it is in current -mm).
Also move the prototype of flush_tlb_all_local into tlbflush.h and
remove its declaration from .c files.
Signed-off-by: Matthew Wilcox <willy@parisc-linux.org>
Signed-off-by: Kyle McMartin <kyle@parisc-linux.org>
add the per-arch mutex.h files for the remaining architectures.
We default to asm-generic/mutex-dec.h, because that performs
quite well on most arches. Arches that do not have atomic
decrement/increment instructions should switch to mutex-xchg.h
instead. Arches can also provide their own implementation for
the mutex fastpath primitives.
Signed-off-by: Arjan van de Ven <arjan@infradead.org>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
add atomic_xchg() to all the architectures. Needed by the new mutex code.
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Arjan van de Ven <arjan@infradead.org>
Most of the architectures have the same asm/futex.h. This consolidates them
into asm-generic, with the arches including it from their own asm/futex.h.
In the case of UML, this reverts the old broken futex.h and goes back to using
the same one as almost everyone else.
Signed-off-by: Jeff Dike <jdike@addtoit.com>
Cc: Paolo 'Blaisorblade' Giarrusso <blaisorblade@yahoo.it>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
Kill L1_CACHE_SHIFT from all arches. Since L1_CACHE_SHIFT_MAX is not used
anymore with the introduction of INTERNODE_CACHE, kill L1_CACHE_SHIFT_MAX.
Signed-off-by: Ravikiran Thirumalai <kiran@scalex86.org>
Signed-off-by: Shai Fultheim <shai@scalex86.org>
Signed-off-by: Andi Kleen <ak@suse.de>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
Several counters already have the need to use 64 atomic variables on 64 bit
platforms (see mm_counter_t in sched.h). We have to do ugly ifdefs to fall
back to 32 bit atomic on 32 bit platforms.
The VM statistics patch that I am working on will also make more extensive
use of atomic64.
This patch introduces a new type atomic_long_t by providing definitions in
asm-generic/atomic.h that works similar to the c "long" type. Its 32 bits
on 32 bit platforms and 64 bits on 64 bit platforms.
Also cleans up the determination of the mm_counter_t in sched.h.
Signed-off-by: Christoph Lameter <clameter@sgi.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
Here is the patch to implement madvise(MADV_REMOVE) - which frees up a
given range of pages & its associated backing store. Current
implementation supports only shmfs/tmpfs and other filesystems return
-ENOSYS.
"Some app allocates large tmpfs files, then when some task quits and some
client disconnect, some memory can be released. However the only way to
release tmpfs-swap is to MADV_REMOVE". - Andrea Arcangeli
Databases want to use this feature to drop a section of their bufferpool
(shared memory segments) - without writing back to disk/swap space.
This feature is also useful for supporting hot-plug memory on UML.
Concerns raised by Andrew Morton:
- "We have no plan for holepunching! If we _do_ have such a plan (or
might in the future) then what would the API look like? I think
sys_holepunch(fd, start, len), so we should start out with that."
- Using madvise is very weird, because people will ask "why do I need to
mmap my file before I can stick a hole in it?"
- None of the other madvise operations call into the filesystem in this
manner. A broad question is: is this capability an MM operation or a
filesytem operation? truncate, for example, is a filesystem operation
which sometimes has MM side-effects. madvise is an mm operation and with
this patch, it gains FS side-effects, only they're really, really
significant ones."
Comments:
- Andrea suggested the fs operation too but then it's more efficient to
have it as a mm operation with fs side effects, because they don't
immediatly know fd and physical offset of the range. It's possible to
fixup in userland and to use the fs operation but it's more expensive,
the vmas are already in the kernel and we can use them.
Short term plan & Future Direction:
- We seem to need this interface only for shmfs/tmpfs files in the short
term. We have to add hooks into the filesystem for correctness and
completeness. This is what this patch does.
- In the future, plan is to support both fs and mmap apis also. This
also involves (other) filesystem specific functions to be implemented.
- Current patch doesn't support VM_NONLINEAR - which can be addressed in
the future.
Signed-off-by: Badari Pulavarty <pbadari@us.ibm.com>
Cc: Hugh Dickins <hugh@veritas.com>
Cc: Andrea Arcangeli <andrea@suse.de>
Cc: Michael Kerrisk <mtk-manpages@gmx.net>
Cc: Ulrich Drepper <drepper@redhat.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
Since taking a spinlock disables preempt, and we need to spinlock tlb flush
on SMP for N class, we might as well just spinlock on uniprocessor machines
too.
Signed-off-by: Matthew Wilcox <willy@parisc-linux.org>
Signed-off-by: Kyle McMartin <kyle@parisc-linux.org>
We actually have two separate bad bugs
1. The read_lock implementation spins with disabled interrupts. This is
completely wrong
2. Our spin_lock_irqsave should check to see if interrupts were enabled
before the call and re-enable interrupts around the inner spin loop.
The problem is that if we spin with interrupts off, we can't receive
IPIs. This has resulted in a bug where SMP machines suddenly spit
smp_call_function timeout messages and hang.
The scenario I've caught is
CPU0 does a flush_tlb_all holding the vmlist_lock for write.
CPU1 tries a cat of /proc/meminfo which tries to acquire vmlist_lock for
read
CPU1 is now spinning with interrupts disabled
CPU0 tries to execute a smp_call_function to flush the local tlb caches
This is now a deadlock because CPU1 is spinning with interrupts disabled
and can never receive the IPI
Signed-off-by: James Bottomley <jejb@parisc-linux.org>
Signed-off-by: Kyle McMartin <kyle@parisc-linux.org>
This really only adds them for the machines I can check SMP on, which
is CPU interrupts and IOSAPIC (so not any of the GSC based machines).
With this patch, irqbalanced can be used to maintain irq balancing.
Unfortunately, irqbalanced is a bit x86 centric, so it doesn't do an
incredibly good job, but it does work.
Signed-off-by: James Bottomley <jejb@parisc-linux.org>
Signed-off-by: Kyle McMartin <kyle@parisc-linux.org>
Since irq.c uses smp_send_all_nop, we must define it for UP builds
as well. Make it a static inline so it gets optimized away. This forces
irq.c to include <asm/smp.h> though.
Signed-off-by: Kyle McMartin <kyle@parisc-linux.org>
Fix our interrupts not to use smp_call_function
On K and D class smp, the generic code calls this under an irq
spinlock, which causes the WARN_ON() message in smp_call_function()
(and is also illegal because it could deadlock).
The fix is to use a new scheme based on the IPI_NOP.
Signed-off-by: James Bottomley <jejb@parisc-linux.org>
Signed-off-by: Kyle McMartin <kyle@parisc-linux.org>
Introduce an atomic_inc_not_zero operation. Make this a special case of
atomic_add_unless because lockless pagecache actually wants
atomic_inc_not_negativeone due to its offset refcount.
Signed-off-by: Nick Piggin <npiggin@suse.de>
Cc: "Paul E. McKenney" <paulmck@us.ibm.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
Fix more include file problems that surfaced since I submitted the previous
fix-missing-includes.patch. This should now allow not to include sched.h
from module.h, which is done by a followup patch.
Signed-off-by: Tim Schmielau <tim@physik3.uni-rostock.de>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
__MUTEX_INITIALIZER() has no users, and equates to the more commonly used
DECLARE_MUTEX(), thus making it pretty much redundant. Remove it for good.
Signed-off-by: Arthur Othieno <a.othieno@bluewin.ch>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
Make the pid argument a long as on every other arcihtecture. Despite pid_t
beeing a 32bit type even on 64bit parisc this is not an ABI change due to
the parisc calling conventions. And even if it did it wouldn't matter too
much because 64bit userspace on parisc is in an embrionic stage.
Acked-by: Matthew Wilcox <matthew@wil.cx>
Signed-off-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
Removed some more references to check_region().
I checked these changes into the 'checkreg' branch of
rsync://rsync.kernel.org/pub/scm/linux/kernel/git/jgarzik/misc-2.6.git
The only valid references remaining are in:
drivers/scsi/advansys.c
drivers/scsi/BusLogic.c
drivers/cdrom/sbpcd.c
sound/oss/pss.c
Remove last vestiges of ide_check_region()
drivers/char/specialix: trim trailing whitespace
drivers/char/specialix: eliminate use of check_region()
Remove outdated and unused references to check_region()
[sound oss] remove check_region() usage from cs4232, wavfront
[netdrvr eepro] trim trailing whitespace
[netdrvr eepro] remove check_region() usage
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
The following series implements memory hot-add for ppc64 and i386. There are
x86_64 and ia64 implementations that will be submitted shortly as well,
through the normal maintainers.
This patch:
local_mapnr is unused, except for in an alpha header. Keep the alpha one,
kill the rest.
Signed-off-by: Dave Hansen <haveblue@us.ibm.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
There's a worrying function translation_exists in parisc cacheflush.h,
unaffected by split ptlock since flush_dcache_page is using it on some other
mm, without any relevant lock. Oh well, make it a slightly more robust by
factoring the pfn check within it. And it looked liable to confuse a
camouflaged swap or file entry with a good pte: fix that too.
Signed-off-by: Hugh Dickins <hugh@veritas.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
There was one small but very significant change in the previous patch:
mprotect's flush_tlb_range fell outside the page_table_lock: as it is in 2.4,
but that doesn't prove it safe in 2.6.
On some architectures flush_tlb_range comes to the same as flush_tlb_mm, which
has always been called from outside page_table_lock in dup_mmap, and is so
proved safe. Others required a deeper audit: I could find no reliance on
page_table_lock in any; but in ia64 and parisc found some code which looks a
bit as if it might want preemption disabled. That won't do any actual harm,
so pending a decision from the maintainers, disable preemption there.
Remove comments on page_table_lock from flush_tlb_mm, flush_tlb_range and
flush_tlb_page entries in cachetlb.txt: they were rather misleading (what
generic code does is different from what usually happens), the rules are now
changing, and it's not yet clear where we'll end up (will the generic
tlb_flush_mmu happen always under lock? never under lock? or sometimes under
and sometimes not?).
Signed-off-by: Hugh Dickins <hugh@veritas.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
fixup.S needs to specify .level and use correct LDREG macro.
New binutils has a bug where it doesn't "promote" from PA1.0 to PA1.1
correctly when using ",s" completer.
remove use of __LP64__ in assembly.h and add some white space.
Signed-off-by: Grant Grundler <grundler@parisc-linux.org>
Signed-off-by: Kyle McMartin <kyle@parisc-linux.org>
drivers/infiniband depends on definition of pgprot_noncached() macro.
Someone else will have to fix it's wrong.
Signed-off-by: Grant Grundler <grundler@parisc-linux.org
Signed-off-by: Kyle McMartin <kyle@parisc-linux.org>
add ECANCELED - SuSv3 wants one L. IB/SDP actually returns this error.
Signed-off-by: Grant Grundler <grundler@parisc-linux.org>
Signed-off-by: Kyle McMartin <kyle@parisc-linux.org>
Fix the alloc_slabmgmt panic
Hopefully this should also fix a lot of other intermittent kernel bugs.
The problem has been around since 2.6.9-rc2-pa6 when we allowed
floating point registers to be used in kernel code. The essence of
the problem is that gcc prefers to use floating point for integer
divides and multiples. Further, it can rely on the values in the no
clobber fp regs being correct across a function call. Unfortunately,
our task switch function only saves the integer no clobber registers,
not the fp ones, so if gcc makes a function call to any function in
the kernel which could sleep, the values it is relying on in any no
clobber floating point register may be lost. In the case of
alloc_slabmgmt, the value of the page offset is being stored in %fr12
across a call to kmem_getpages(), which sleeps if no pages are
available. Thus, the offset can be trashed and the slab code can end
up with a completely bogus address leading to corruption.
Kudos to Randolph who came up with the program to trip this problem at
will and thus allowed it to be tracked and fixed.
Signed-off-by: James Bottomley <jejb@parisc-linux.org>
Signed-off-by: Kyle McMartin <kyle@parisc-linux.org>
2.6.12-rc1-pa6 use work queue in LED/LCD driver instead of tasklet.
Main advantage is it allows use of msleep() in the led_LCD_driver to
"atomically" perform two MMIO writes (CMD, then DATA).
Lead to nice cleanup of the main led_work_func() and led_LCD_driver().
Kudos to David for being persistent.
From: David Pye <dmp@davidmpye.dyndns.org>
Signed-off-by: Grant Grundler <grundler@parisc-linux.org>
Signed-off-by: Kyle McMartin <kyle@parisc-linux.org>
Optimize ext2_find_next_zero_bit. Gives about 25% perf improvement with a
rsync test with ext3.
Signed-off-by: Randolph Chung <tausq@parisc-linux.org>
fix ext3 performance - ext2_find_next_zero() was culprit.
Kudos to jejb for pointing out the the possibility that ext2_test_bit
and ext2_find_next_zero() may in fact not be enumerating bits in
the bitmap because of endianess. Took sparc64 implementation and
adapted it to our tree. I suspect the real problem is ffz() wants
an unsigned long and was getting garbage in the top half of the
unsigned int. Not confirmed but that's what I suspect.
Signed-off-by: Grant Grundler <grundler@parisc-linux.org>
Fix find_next_bit for 32-bit
Make masking consistent for bitops
From: Joel Soete <soete.joel@tiscali.be>
Signed-off-by: Randolph Chung <tausq@parisc-linux.org>
Add back incorrectly removed ext2_find_first_zero_bit definition
Signed-off-by: James Bottomley <jejb@parisc-linux.org>
Fixup bitops.h to use volatile for *_bit() ops
Based on this email thread:
http://marc.theaimsgroup.com/?t=108826637900003
In a nutshell:
*_bit() want use of volatile.
__*_bit() are "relaxed" and don't use spinlock or volatile.
other minor changes:
o replaces hweight64() macro with alias to generic_hweight64() (Joel Soete)
o cleanup ext2* macros so (a) it's obvious what the XOR magic is about
and (b) one version that works for both 32/64-bit.
o replace 2 uses of CONFIG_64BIT with __LP64__. bitops.h used both.
I think header files that might go to user space should use
something userspace will know about (__LP64__).
Signed-off-by: Grant Grundler <grundler@parisc-linux.org>
Move SHIFT_PER_LONG to standard location for BITS_PER_LONG (asm/types.h)
and ditch the second definition of BITS_PER_LONG in bitops.h
Signed-off-by: Grant Grundler <grundler@parisc-linux.org>
Signed-off-by: Kyle McMartin <kyle@parisc-linux.org>
export profile_pc() symbol - oprofile needs it when built as a module.
Signed-off-by: Grant Grundler <grundler@parisc-linux.org>
Take into account nullified insn and lock functions for profiling
This is needed at the end of functions; it is typical that the return
branch nullifies the next insn, which is in the next function. This
causes profiling data to show up against the "wrong" function.
We also count lock times against the locker. This is consistent with
other architectures.
Signed-off-by: Randolph Chung <tausq@parisc-linux.org>
Signed-off-by: Kyle McMartin <kyle@parisc-linux.org>
Neaten up the CONFIG_PA20 ifdefs
More merge fixes, this time for SMP
Signed-off-by: Matthew Wilcox <willy@parisc-linux.org>
Prettify the CONFIG_DEBUG_SPINLOCK __SPIN_LOCK_UNLOCKED initializers.
Clean up some warnings with CONFIG_DEBUG_SPINLOCK enabled.
Fix build with spinlock debugging turned on. Patch is cleaner like this,
too.
Remove mandatory 16-byte alignment requirement on PA2.0 processors by
using the ldcw,CO completer. Provides a nice insn savings.
Signed-off-by: Kyle McMartin <kyle@parisc-linux.org>
move pa_tlb_lock and it's primary consumers to tlb_flush.h
Future step will be to move spinlock_t definition out of system.h.
Signed-off-by: Grant Grundler <grundler@parisc-linux.org>
Signed-off-by: Kyle McMartin <kyle@parisc-linux.org>
2.6.12-rc4-pa3 : first pass at making sure use of RFI conforms to
PA 2.0 arch pages F-4 and F-5, PA 1.1 Arch page 3-19 and 3-20.
The discussion revolves around all the rules for clearing PSW Q-bit.
The hard part is meeting all the rules for "relied upon translation".
.align directive is used to guarantee the critical sequence ends more than
8 instructions (32 bytes) from the end of page.
Signed-off-by: Grant Grundler <grundler@parisc-linux.org>
Signed-off-by: Kyle McMartin <kyle@parisc-linux.org>
Convert pa_dev->hpa from an unsigned long to a struct resource.
Signed-off-by: Matthew Wilcox <willy@parisc-linux.org>
Fix up users of ->hpa to use ->hpa.start instead.
Signed-off-by: Matthew Wilcox <willy@parisc-linux.org>
Signed-off-by: Kyle McMartin <kyle@parisc-linux.org>
Fix parse_tree_node. much more needs to be done to fix this file.
Signed-off-by: Matthew Wilcox <willy@parisc-linux.org>
Make drivers.c compile based on a patch from Pat Mochel.
From: Patrick Mochel <mochel@digitalimplant.org>
Signed-off-by: Kyle McMartin <kyle@parisc-linux.org>
Fix drivers.c to create new device tree nodes when no match is found.
Signed-off-by: Richard Hirst <rhirst@parisc-linux.org>
Do a proper depth-first search returning parents before children, using the
new klist infrastructure.
Signed-off-by: Richard Hirst <rhirst@parisc-linux.org>
Fixed parisc_device traversal so that pdc_stable works again
Fixed check_dev so it doesn't dereference a parisc_device until it
has verified the bus type
Signed-off-by: Randolph Chung <tausq@parisc-linux.org>
Convert pa_dev->hpa from an unsigned long to a struct resource.
Use insert_resource() instead of request_mem_region().
Request resources at bus walk time instead of driver probe time.
Don't release the resources as we don't have any hotplug parisc_device
support yet.
Add parisc_pathname() to conveniently get the textual representation
of the hwpath used in sysfs.
Inline the remnants of claim_device() into its caller.
Signed-off-by: Matthew Wilcox <willy@parisc-linux.org>
I noticed that some of the STI regions weren't showing up in iomem.
Reading the STI spec indicated that all STI devices occupy at least 32MB.
So check for STI HPAs and give them 32MB instead of 4kB.
Signed-off-by: Matthew Wilcox <willy@parisc-linux.org>
Signed-off-by: Kyle McMartin <kyle@parisc-linux.org>
As recently done by Russell King for ARM, commit
4732efbeb9 introduces a generic asm/futex.h copied
along most arches, which includes a "-ENOSYS support" to be changed if needed.
However, it includes an unused var (taken from the "real" version) which GCC
warns about.
Remove it from all arches having that file version (i.e. same GIT id).
$ git-diff-tree -r HEAD
and
$ git-ls-tree -r HEAD include/|grep 9feff4ce14
may be more interesting than looking at the patch itself, to make sure I've
just copied the arm header to all other archs having the original dummy version
of this file.
Cc: Jakub Jelinek <jakub@redhat.com>
Signed-off-by: Paolo 'Blaisorblade' Giarrusso <blaisorblade@yahoo.it>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
As written in Documentation/feature-removal-schedule.txt, remove the
io_remap_page_range() kernel API.
Signed-off-by: Randy Dunlap <rdunlap@xenotime.net>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
This patch (written by me and also containing many suggestions of Arjan van
de Ven) does a major cleanup of the spinlock code. It does the following
things:
- consolidates and enhances the spinlock/rwlock debugging code
- simplifies the asm/spinlock.h files
- encapsulates the raw spinlock type and moves generic spinlock
features (such as ->break_lock) into the generic code.
- cleans up the spinlock code hierarchy to get rid of the spaghetti.
Most notably there's now only a single variant of the debugging code,
located in lib/spinlock_debug.c. (previously we had one SMP debugging
variant per architecture, plus a separate generic one for UP builds)
Also, i've enhanced the rwlock debugging facility, it will now track
write-owners. There is new spinlock-owner/CPU-tracking on SMP builds too.
All locks have lockup detection now, which will work for both soft and hard
spin/rwlock lockups.
The arch-level include files now only contain the minimally necessary
subset of the spinlock code - all the rest that can be generalized now
lives in the generic headers:
include/asm-i386/spinlock_types.h | 16
include/asm-x86_64/spinlock_types.h | 16
I have also split up the various spinlock variants into separate files,
making it easier to see which does what. The new layout is:
SMP | UP
----------------------------|-----------------------------------
asm/spinlock_types_smp.h | linux/spinlock_types_up.h
linux/spinlock_types.h | linux/spinlock_types.h
asm/spinlock_smp.h | linux/spinlock_up.h
linux/spinlock_api_smp.h | linux/spinlock_api_up.h
linux/spinlock.h | linux/spinlock.h
/*
* here's the role of the various spinlock/rwlock related include files:
*
* on SMP builds:
*
* asm/spinlock_types.h: contains the raw_spinlock_t/raw_rwlock_t and the
* initializers
*
* linux/spinlock_types.h:
* defines the generic type and initializers
*
* asm/spinlock.h: contains the __raw_spin_*()/etc. lowlevel
* implementations, mostly inline assembly code
*
* (also included on UP-debug builds:)
*
* linux/spinlock_api_smp.h:
* contains the prototypes for the _spin_*() APIs.
*
* linux/spinlock.h: builds the final spin_*() APIs.
*
* on UP builds:
*
* linux/spinlock_type_up.h:
* contains the generic, simplified UP spinlock type.
* (which is an empty structure on non-debug builds)
*
* linux/spinlock_types.h:
* defines the generic type and initializers
*
* linux/spinlock_up.h:
* contains the __raw_spin_*()/etc. version of UP
* builds. (which are NOPs on non-debug, non-preempt
* builds)
*
* (included on UP-non-debug builds:)
*
* linux/spinlock_api_up.h:
* builds the _spin_*() APIs.
*
* linux/spinlock.h: builds the final spin_*() APIs.
*/
All SMP and UP architectures are converted by this patch.
arm, i386, ia64, ppc, ppc64, s390/s390x, x64 was build-tested via
crosscompilers. m32r, mips, sh, sparc, have not been tested yet, but should
be mostly fine.
From: Grant Grundler <grundler@parisc-linux.org>
Booted and lightly tested on a500-44 (64-bit, SMP kernel, dual CPU).
Builds 32-bit SMP kernel (not booted or tested). I did not try to build
non-SMP kernels. That should be trivial to fix up later if necessary.
I converted bit ops atomic_hash lock to raw_spinlock_t. Doing so avoids
some ugly nesting of linux/*.h and asm/*.h files. Those particular locks
are well tested and contained entirely inside arch specific code. I do NOT
expect any new issues to arise with them.
If someone does ever need to use debug/metrics with them, then they will
need to unravel this hairball between spinlocks, atomic ops, and bit ops
that exist only because parisc has exactly one atomic instruction: LDCW
(load and clear word).
From: "Luck, Tony" <tony.luck@intel.com>
ia64 fix
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Arjan van de Ven <arjanv@infradead.org>
Signed-off-by: Grant Grundler <grundler@parisc-linux.org>
Cc: Matthew Wilcox <willy@debian.org>
Signed-off-by: Hirokazu Takata <takata@linux-m32r.org>
Signed-off-by: Mikael Pettersson <mikpe@csd.uu.se>
Signed-off-by: Benoit Boissinot <benoit.boissinot@ens-lyon.org>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
There were three changes necessary in order to allow
sparc64 to use setup-res.c:
1) Sparc64 roots the PCI I/O and MEM address space using
parent resources contained in the PCI controller structure.
I'm actually surprised no other platforms do this, especially
ones like Alpha and PPC{,64}. These resources get linked into the
iomem/ioport tree when PCI controllers are probed.
So the hierarchy looks like this:
iomem --|
PCI controller 1 MEM space --|
device 1
device 2
etc.
PCI controller 2 MEM space --|
...
ioport --|
PCI controller 1 IO space --|
...
PCI controller 2 IO space --|
...
You get the idea. The drivers/pci/setup-res.c code allocates
using plain iomem_space and ioport_space as the root, so that
wouldn't work with the above setup.
So I added a pcibios_select_root() that is used to handle this.
It uses the PCI controller struct's io_space and mem_space on
sparc64, and io{port,mem}_resource on every other platform to
keep current behavior.
2) quirk_io_region() is buggy. It takes in raw BUS view addresses
and tries to use them as a PCI resource.
pci_claim_resource() expects the resource to be fully formed when
it gets called. The sparc64 implementation would do the translation
but that's absolutely wrong, because if the same resource gets
released then re-claimed we'll adjust things twice.
So I fixed up quirk_io_region() to do the proper pcibios_bus_to_resource()
conversion before passing it on to pci_claim_resource().
3) I was mistakedly __init'ing the function methods the PCI controller
drivers provide on sparc64 to implement some parts of these
routines. This was, of course, easy to fix.
So we end up with the following, and that nasty SPARC64 makefile
ifdef in drivers/pci/Makefile is finally zapped.
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
This patch gathers all the struct flock64 definitions (and the operations),
puts them under !CONFIG_64BIT and cleans up the arch files.
Signed-off-by: Stephen Rothwell <sfr@canb.auug.org.au>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
This patch just gathers together all the struct flock definitions except
xtensa into asm-generic/fcntl.h.
Signed-off-by: Stephen Rothwell <sfr@canb.auug.org.au>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
This patch puts the most popular of each fcntl operation/flag into
asm-generic/fcntl.h and cleans up the arch files.
Signed-off-by: Stephen Rothwell <sfr@canb.auug.org.au>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
This patch puts the most popular of each open flag into asm-generic/fcntl.h
and cleans up the arch files.
Signed-off-by: Stephen Rothwell <sfr@canb.auug.org.au>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
This set of patches creates asm-generic/fcntl.h and consolidates as much as
possible from the asm-*/fcntl.h files into it.
This patch just gathers all the identical bits of the asm-*/fcntl.h files into
asm-generic/fcntl.h.
Signed-off-by: Stephen Rothwell <sfr@canb.auug.org.au>
Signed-off-by: Yoichi Yuasa <yuasa@hh.iij4u.or.jp>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
unused and useless..
Signed-off-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
IRQ_PER_CPU is not used by all architectures. This patch introduces the
macros ARCH_HAS_IRQ_PER_CPU and CHECK_IRQ_PER_CPU() to avoid the generation
of dead code in __do_IRQ().
ARCH_HAS_IRQ_PER_CPU is defined by architectures using IRQ_PER_CPU in their
include/asm_ARCH/irq.h file.
Through grepping the tree I found the following architectures currently use
IRQ_PER_CPU:
cris, ia64, ppc, ppc64 and parisc.
Signed-off-by: Karsten Wiese <annabellesgarden@yahoo.de>
Acked-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
The size of auxiliary vector is fixed at 42 in linux/sched.h. But it isn't
very obvious when looking at linux/elf.h. This patch adds AT_VECTOR_SIZE
so that we can change it if necessary when a new vector is added.
Because of include file ordering problems, doing this necessitated the
extraction of the AT_* symbols into a standalone header file.
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
When I first wrote the compat layer patches, I was somewhat cavalier about
the definition of compat_uid_t and compat_gid_t (or maybe I just
misunderstood :-)). This patch makes the compat types much more consistent
with the types we are being compatible with and hopefully will fix a few
bugs along the way.
compat type type in compat arch
__compat_[ug]id_t __kernel_[ug]id_t
__compat_[ug]id32_t __kernel_[ug]id32_t
compat_[ug]id_t [ug]id_t
The difference is that compat_uid_t is always 32 bits (for the archs we
care about) but __compat_uid_t may be 16 bits on some.
Signed-off-by: Stephen Rothwell <sfr@canb.auug.org.au>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
ATM pthread_cond_signal is unnecessarily slow, because it wakes one waiter
(which at least on UP usually means an immediate context switch to one of
the waiter threads). This waiter wakes up and after a few instructions it
attempts to acquire the cv internal lock, but that lock is still held by
the thread calling pthread_cond_signal. So it goes to sleep and eventually
the signalling thread is scheduled in, unlocks the internal lock and wakes
the waiter again.
Now, before 2003-09-21 NPTL was using FUTEX_REQUEUE in pthread_cond_signal
to avoid this performance issue, but it was removed when locks were
redesigned to the 3 state scheme (unlocked, locked uncontended, locked
contended).
Following scenario shows why simply using FUTEX_REQUEUE in
pthread_cond_signal together with using lll_mutex_unlock_force in place of
lll_mutex_unlock is not enough and probably why it has been disabled at
that time:
The number is value in cv->__data.__lock.
thr1 thr2 thr3
0 pthread_cond_wait
1 lll_mutex_lock (cv->__data.__lock)
0 lll_mutex_unlock (cv->__data.__lock)
0 lll_futex_wait (&cv->__data.__futex, futexval)
0 pthread_cond_signal
1 lll_mutex_lock (cv->__data.__lock)
1 pthread_cond_signal
2 lll_mutex_lock (cv->__data.__lock)
2 lll_futex_wait (&cv->__data.__lock, 2)
2 lll_futex_requeue (&cv->__data.__futex, 0, 1, &cv->__data.__lock)
# FUTEX_REQUEUE, not FUTEX_CMP_REQUEUE
2 lll_mutex_unlock_force (cv->__data.__lock)
0 cv->__data.__lock = 0
0 lll_futex_wake (&cv->__data.__lock, 1)
1 lll_mutex_lock (cv->__data.__lock)
0 lll_mutex_unlock (cv->__data.__lock)
# Here, lll_mutex_unlock doesn't know there are threads waiting
# on the internal cv's lock
Now, I believe it is possible to use FUTEX_REQUEUE in pthread_cond_signal,
but it will cost us not one, but 2 extra syscalls and, what's worse, one of
these extra syscalls will be done for every single waiting loop in
pthread_cond_*wait.
We would need to use lll_mutex_unlock_force in pthread_cond_signal after
requeue and lll_mutex_cond_lock in pthread_cond_*wait after lll_futex_wait.
Another alternative is to do the unlocking pthread_cond_signal needs to do
(the lock can't be unlocked before lll_futex_wake, as that is racy) in the
kernel.
I have implemented both variants, futex-requeue-glibc.patch is the first
one and futex-wake_op{,-glibc}.patch is the unlocking inside of the kernel.
The kernel interface allows userland to specify how exactly an unlocking
operation should look like (some atomic arithmetic operation with optional
constant argument and comparison of the previous futex value with another
constant).
It has been implemented just for ppc*, x86_64 and i?86, for other
architectures I'm including just a stub header which can be used as a
starting point by maintainers to write support for their arches and ATM
will just return -ENOSYS for FUTEX_WAKE_OP. The requeue patch has been
(lightly) tested just on x86_64, the wake_op patch on ppc64 kernel running
32-bit and 64-bit NPTL and x86_64 kernel running 32-bit and 64-bit NPTL.
With the following benchmark on UP x86-64 I get:
for i in nptl-orig nptl-requeue nptl-wake_op; do echo time elf/ld.so --library-path .:$i /tmp/bench; \
for j in 1 2; do echo ( time elf/ld.so --library-path .:$i /tmp/bench ) 2>&1; done; done
time elf/ld.so --library-path .:nptl-orig /tmp/bench
real 0m0.655s user 0m0.253s sys 0m0.403s
real 0m0.657s user 0m0.269s sys 0m0.388s
time elf/ld.so --library-path .:nptl-requeue /tmp/bench
real 0m0.496s user 0m0.225s sys 0m0.271s
real 0m0.531s user 0m0.242s sys 0m0.288s
time elf/ld.so --library-path .:nptl-wake_op /tmp/bench
real 0m0.380s user 0m0.176s sys 0m0.204s
real 0m0.382s user 0m0.175s sys 0m0.207s
The benchmark is at:
http://sourceware.org/ml/libc-alpha/2005-03/txt00001.txt
Older futex-requeue-glibc.patch version is at:
http://sourceware.org/ml/libc-alpha/2005-03/txt00002.txt
Older futex-wake_op-glibc.patch version is at:
http://sourceware.org/ml/libc-alpha/2005-03/txt00003.txt
Will post a new version (just x86-64 fixes so that the patch
applies against pthread_cond_signal.S) to libc-hacker ml soon.
Attached is the kernel FUTEX_WAKE_OP patch as well as a simple-minded
testcase that will not test the atomicity of the operation, but at least
check if the threads that should have been woken up are woken up and
whether the arithmetic operation in the kernel gave the expected results.
Acked-by: Ingo Molnar <mingo@redhat.com>
Cc: Ulrich Drepper <drepper@redhat.com>
Cc: Jamie Lokier <jamie@shareable.org>
Cc: Rusty Russell <rusty@rustcorp.com.au>
Signed-off-by: Yoichi Yuasa <yuasa@hh.iij4u.or.jp>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
This is used only in slab.c and each architecture gets to define whcih
underlying type is to be used.
Seems a bit silly - move it to slab.c and use the same type for all
architectures: unsigned int.
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
Someone mentioned that almost all the architectures used basically the same
implementation of get_order. This patch consolidates them into
asm-generic/page.h and includes that in the appropriate places. The
exceptions are ia64 and ppc which have their own (presumably optimised)
versions.
Signed-off-by: Stephen Rothwell <sfr@canb.auug.org.au>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
In yenta_socket, we default to using the resource setting of the CardBus
bridge. However, this is a PCI-bus-centric view of resources and thus needs
to be converted to generic resources first. Therefore, add a call to
pcibios_bus_to_resource() call in between. This function is a mere wrapper on
x86 and friends, however on some others it already exists, is added in this
patch (alpha, arm, ppc, ppc64) or still needs to be provided (parisc -- where
is its pcibios_resource_to_bus() ?).
Signed-off-by: Dominik Brodowski <linux@dominikbrodowski.net>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
When the kernel is working well and we want to restart cleanly
kernel_restart is the function to use. But in many instances
the kernel wants to reboot when thing are expected to be working
very badly such as from panic or a software watchdog handler.
This patch adds the function emergency_restart() so that
callers can be clear what semantics they expect when calling
restart. emergency_restart() is expected to be callable
from interrupt context and possibly reliable in even more
trying circumstances.
This is an initial generic implementation for all architectures.
Signed-off-by: Eric W. Biederman <ebiederm@xmission.com>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
Remove legacy ISA serial ports for Accent, Boca, Fourport, Hub6 and MCA
from the architecture specific serial.h include.
The only ports which remain in asm-*/serial.h are the platform specific
entries. These should really be converted by platform maintainers to
use a platform device, such as can be found in
arch/arm/mach-footbridge/isa.c
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
With CONFIG_PCI=n:
In file included from include/linux/pci.h:917,
from lib/iomap.c:6:
include/asm/pci.h:104: warning: `enum pci_dma_burst_strategy' declared inside parameter list
include/asm/pci.h:104: warning: its scope is only this definition or declaration, which is probably not what you want.
include/asm/pci.h: In function `pci_dma_burst_advice':
include/asm/pci.h:106: dereferencing pointer to incomplete type
include/asm/pci.h:106: `PCI_DMA_BURST_INFINITY' undeclared (first use in this function)
include/asm/pci.h:106: (Each undeclared identifier is reported only once
include/asm/pci.h:106: for each function it appears in.)
make[1]: *** [lib/iomap.o] Error 1
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
After seeing, at best, "guesses" as to the following kind
of information in several drivers, I decided that we really
need a way for platforms to specifically give advice in this
area for what works best with their PCI controller implementation.
Basically, this new interface gives DMA bursting advice on
PCI. There are three forms of the advice:
1) Burst as much as possible, it is not necessary to end bursts
on some particular boundary for best performance.
2) Burst on some byte count multiple. A DMA burst to some multiple of
number of bytes may be done, but it is important to end the burst
on an exact multiple for best performance.
The best example of this I am aware of are the PPC64 PCI
controllers, where if you end a burst mid-cacheline then
chip has to refetch the data and the IOMMU translations
which hurts performance a lot.
3) Burst on a single byte count multiple. Bursts shall end
exactly on the next multiple boundary for best performance.
Sparc64 and Alpha's PCI controllers operate this way. They
disconnect any device which tries to burst across a cacheline
boundary.
Actually, newer sparc64 PCI controllers do not have this behavior.
That is why the "pdev" is passed into the interface, so I can
add code later to check which PCI controller the system is using
and give advice accordingly.
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
This patch is based on work by Carlos O'Donell and Matthew Wilcox. It
introduces/updates the compat_time_t type and uses it for compat siginfo
structures. I have built this on ppc64 and x86_64.
Signed-off-by: Stephen Rothwell <sfr@canb.auug.org.au>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
The preempt_count member of struct thread_info is currently either defined
as int, unsigned int or __s32 depending on arch. This patch makes the type
of preempt_count an int on all archs.
Having preempt_count be an unsigned type prevents the catching of
preempt_count < 0 bugs, and using int on some archs and __s32 on others is
not exactely "neat" - much nicer when it's just int all over.
A previous version of this patch was already ACK'ed by Robert Love, and the
only change in this version of the patch compared to the one he ACK'ed is
that this one also makes sure the preempt_count member is consistently
commented.
Signed-off-by: Jesper Juhl <juhl-lkml@dif.dk>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
This patch effectively eliminates direct use of pgdat->node_mem_map outside
of the DISCONTIG code. On a flat memory system, these fields aren't
currently used, neither are they on a sparsemem system.
There was also a node_mem_map(nid) macro on many architectures. Its use
along with the use of ->node_mem_map itself was not consistent. It has
been removed in favor of two new, more explicit, arch-independent macros:
pgdat_page_nr(pgdat, pagenr)
nid_page_nr(nid, pagenr)
I called them "pgdat" and "nid" because we overload the term "node" to mean
"NUMA node", "DISCONTIG node" or "pg_data_t" in very confusing ways. I
believe the newer names are much clearer.
These macros can be overridden in the sparsemem case with a theoretically
slower operation using node_start_pfn and pfn_to_page(), instead. We could
make this the only behavior if people want, but I don't want to change too
much at once. One thing at a time.
This patch removes more code than it adds.
Compile tested on alpha, alpha discontig, arm, arm-discontig, i386, i386
generic, NUMAQ, Summit, ppc64, ppc64 discontig, and x86_64. Full list
here: http://sr71.net/patches/2.6.12/2.6.12-rc1-mhp2/configs/
Boot tested on NUMAQ, x86 SMP and ppc64 power4/5 LPARs.
Signed-off-by: Dave Hansen <haveblue@us.ibm.com>
Signed-off-by: Martin J. Bligh <mbligh@aracnet.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
This patch implements a number of smp_processor_id() cleanup ideas that
Arjan van de Ven and I came up with.
The previous __smp_processor_id/_smp_processor_id/smp_processor_id API
spaghetti was hard to follow both on the implementational and on the
usage side.
Some of the complexity arose from picking wrong names, some of the
complexity comes from the fact that not all architectures defined
__smp_processor_id.
In the new code, there are two externally visible symbols:
- smp_processor_id(): debug variant.
- raw_smp_processor_id(): nondebug variant. Replaces all existing
uses of _smp_processor_id() and __smp_processor_id(). Defined
by every SMP architecture in include/asm-*/smp.h.
There is one new internal symbol, dependent on DEBUG_PREEMPT:
- debug_smp_processor_id(): internal debug variant, mapped to
smp_processor_id().
Also, i moved debug_smp_processor_id() from lib/kernel_lock.c into a new
lib/smp_processor_id.c file. All related comments got updated and/or
clarified.
I have build/boot tested the following 8 .config combinations on x86:
{SMP,UP} x {PREEMPT,!PREEMPT} x {DEBUG_PREEMPT,!DEBUG_PREEMPT}
I have also build/boot tested x64 on UP/PREEMPT/DEBUG_PREEMPT. (Other
architectures are untested, but should work just fine.)
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Arjan van de Ven <arjan@infradead.org>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
This patch makes some needlessly global identifiers static.
Signed-off-by: Adrian Bunk <bunk@stusta.de>
Acked-by: Arjan van de Ven <arjanv@infradead.org>
Acked-by: Trond Myklebust <trond.myklebust@fys.uio.no>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
There were still a few comments left refering to verify_area, and two
functions, verify_area_skas & verify_area_tt that just wrap corresponding
access_ok_skas & access_ok_tt functions, just like verify_area does for
access_ok - deprecate those.
There was also a few places that still used verify_area in commented-out
code, fix those up to use access_ok.
After applying this one there should not be anything left but finally
removing verify_area completely, which will happen after a kernel release
or two.
Signed-off-by: Jesper Juhl <juhl-lkml@dif.dk>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
Add EOWNERDEAD and ENOTRECOVERABLE to all architectures. This is to
support the upcoming patches for robust mutexes.
We normally don't reserve parts of the name/number space for external
patches, but robust mutexes are sufficiently popular and important to
justify it in this case.
Signed-off-by: Joe Korty <joe.korty@ccur.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
The attached patch moves the IRQ-related SA_xxx flags (namely, SA_PROBE,
SA_SAMPLE_RANDOM and SA_SHIRQ) from all the arch-specific headers to
linux/signal.h. This looks like a left-over after the irq-handling code
was consolidated. The code was moved to kernel/irq/*, but the flags are
still left per-arch.
Right now, adding a new IRQ flag to the arch-specific header, like this
patch does:
http://cvs.sourceforge.net/viewcvs.py/*checkout*/alsa/alsa-driver/utils/patches/pcsp-kernel-2.6.10-03.diff?rev=1.1
no longer works, it breaks the compilation for all other arches, unless you
add that flag to all the other arch-specific headers too. So I think such
a clean-up makes sense.
Signed-off-by: Stas Sergeev <stsp@aknet.ru>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
This patch eliminates all kernel BUGs, trims about 35k off the typical
kernel, and makes the system slightly faster.
Signed-off-by: Matt Mackall <mpm@selenic.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
Replace misleading definition of FIRST_USER_PGD_NR 0 by definition of
FIRST_USER_ADDRESS 0 in all the MMU architectures beyond arm and arm26.
Signed-off-by: Hugh Dickins <hugh@veritas.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
Initial git repository build. I'm not bothering with the full history,
even though we have it. We can create a separate "historical" git
archive of that later if we want to, and in the meantime it's about
3.2GB when imported into git - space that would just make the early
git days unnecessarily complicated, when we don't have a lot of good
infrastructure for it.
Let it rip!