Commit Graph

123 Commits

Author SHA1 Message Date
Anton Blanchard
9b83ecb0a3 powerpc: Optimise 64bit csum_partial
The main loop of csum_partial runs very slowly on recent POWER CPUs. After some
analysis on both POWER6 and POWER7 I came up with routine below. First we get
the source aligned to a double word, ignoring any odd alignment to keep things
simple. Then we do 64 bytes at a time, with an entry and exit limb of a further
64 bytes. On both POWER6 and POWER7 this should be as fast as we can go since
we are limited by the latency of the adde instructions.

To test this I forced checksumming on over loopback and ran socklib (a
simple TCP benchmark). On a POWER6 575 throughput improved by 11% with
this patch.

Signed-off-by: Anton Blanchard <anton@samba.org>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2010-09-02 14:07:29 +10:00
Benjamin Herrenschmidt
5f07aa7524 Merge commit 'paulus-perf/master' into next 2010-07-09 11:25:48 +10:00
Stephen Rothwell
3880ecb05b powerpc: Fix feature-fixup tests for gcc 4.5
The feature-fixup test declare some extern void variables and then take
their addresses.  Fix this by declaring them as extern u8 instead.

Fixes these warnings (treated as errors):

  CC      arch/powerpc/lib/feature-fixups.o
cc1: warnings being treated as errors
arch/powerpc/lib/feature-fixups.c: In function 'test_cpu_macros':
arch/powerpc/lib/feature-fixups.c:293:23: error: taking address of expression of type 'void'
arch/powerpc/lib/feature-fixups.c:294:9: error: taking address of expression of type 'void'
arch/powerpc/lib/feature-fixups.c:297:2: error: taking address of expression of type 'void'
arch/powerpc/lib/feature-fixups.c:297:2: error: taking address of expression of type 'void'
arch/powerpc/lib/feature-fixups.c: In function 'test_fw_macros':
arch/powerpc/lib/feature-fixups.c:306:23: error: taking address of expression of type 'void'
arch/powerpc/lib/feature-fixups.c:307:9: error: taking address of expression of type 'void'
arch/powerpc/lib/feature-fixups.c:310:2: error: taking address of expression of type 'void'
arch/powerpc/lib/feature-fixups.c:310:2: error: taking address of expression of type 'void'
arch/powerpc/lib/feature-fixups.c: In function 'test_lwsync_macros':
arch/powerpc/lib/feature-fixups.c:321:23: error: taking address of expression of type 'void'
arch/powerpc/lib/feature-fixups.c:322:9: error: taking address of expression of type 'void'
arch/powerpc/lib/feature-fixups.c:326:3: error: taking address of expression of type 'void'
arch/powerpc/lib/feature-fixups.c:326:3: error: taking address of expression of type 'void'
arch/powerpc/lib/feature-fixups.c:329:3: error: taking address of expression of type 'void'
arch/powerpc/lib/feature-fixups.c:329:3: error: taking address of expression of type 'void'

Signed-off-by: Stephen Rothwell <sfr@canb.auug.org.au>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2010-07-08 18:11:41 +10:00
Stephen Rothwell
7fca5dc8aa powerpc: Fix module building for gcc 4.5 and 64 bit
Gcc 4.5 is now generating out of line register save and restore
in the function prefix and postfix when we use -Os.

Signed-off-by: Stephen Rothwell <sfr@canb.auug.org.au>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2010-07-08 18:11:38 +10:00
K.Prasad
5aae8a5370 powerpc, hw_breakpoints: Implement hw_breakpoints for 64-bit server processors
Implement perf-events based hw-breakpoint interfaces for PowerPC
64-bit server (Book III S) processors.  This allows access to a
given location to be used as an event that can be counted or
profiled by the perf_events subsystem.

This is done using the DABR (data breakpoint register), which can
also be used for process debugging via ptrace.  When perf_event
hw_breakpoint support is configured in, the perf_event subsystem
manages the DABR and arbitrates access to it, and ptrace then
creates a perf_event when it is requested to set a data breakpoint.

[Adopted suggestions from Paul Mackerras <paulus@samba.org> to
- emulate_step() all system-wide breakpoints and single-step only the
  per-task breakpoints
- perform arch-specific cleanup before unregistration through
  arch_unregister_hw_breakpoint()
]

Signed-off-by: K.Prasad <prasad@linux.vnet.ibm.com>
Signed-off-by: Paul Mackerras <paulus@samba.org>
2010-06-22 19:40:50 +10:00
Paul Mackerras
0016a4cf55 powerpc: Emulate most Book I instructions in emulate_step()
This extends the emulate_step() function to handle a large proportion
of the Book I instructions implemented on current 64-bit server
processors.  The aim is to handle all the load and store instructions
used in the kernel, plus all of the instructions that appear between
l[wd]arx and st[wd]cx., so this handles the Altivec/VMX lvx and stvx
and the VSX lxv2dx and stxv2dx instructions (implemented in POWER7).

The new code can emulate user mode instructions, and checks the
effective address for a load or store if the saved state is for
user mode.  It doesn't handle little-endian mode at present.

For floating-point, Altivec/VMX and VSX instructions, it checks
that the saved MSR has the enable bit for the relevant facility
set, and if so, assumes that the FP/VMX/VSX registers contain
valid state, and does loads or stores directly to/from the
FP/VMX/VSX registers, using assembly helpers in ldstfp.S.

Instructions supported now include:
* Loads and stores, including some but not all VMX and VSX instructions,
  and lmw/stmw
* Atomic loads and stores (l[dw]arx, st[dw]cx.)
* Arithmetic instructions (add, subtract, multiply, divide, etc.)
* Compare instructions
* Rotate and mask instructions
* Shift instructions
* Logical instructions (and, or, xor, etc.)
* Condition register logical instructions
* mtcrf, cntlz[wd], exts[bhw]
* isync, sync, lwsync, ptesync, eieio
* Cache operations (dcbf, dcbst, dcbt, dcbtst)

The overflow-checking arithmetic instructions are not included, but
they appear not to be ever used in C code.

This uses decimal values for the minor opcodes in the switch statements
because that is what appears in the Power ISA specification, thus it is
easier to check that they are correct if they are in decimal.

If this is used to single-step an instruction where a data breakpoint
interrupt occurred, then there is the possibility that the instruction
is a lwarx or ldarx.  In that case we have to be careful not to lose the
reservation until we get to the matching st[wd]cx., or we'll never make
forward progress.  One alternative is to try to arrange that we can
return from interrupts and handle data breakpoint interrupts without
losing the reservation, which means not using any spinlocks, mutexes,
or atomic ops (including bitops).  That seems rather fragile.  The
other alternative is to emulate the larx/stcx and all the instructions
in between.  This is why this commit adds support for a wide range
of integer instructions.

Signed-off-by: Paul Mackerras <paulus@samba.org>
2010-06-22 19:40:29 +10:00
Andreas Schwab
ca5d0674c3 powerpc: Fix string library functions
The powerpc strncmp implementation does not correctly handle a zero
length, despite the claim in 0119536cd3
(Add hand-coded assembly strcmp).

Additionally, all the length arguments are size_t, not int, so use
PPC_LCMPI and eq instead of cmpwi and le throughout.

Signed-off-by: Andreas Schwab <schwab@linux-m68k.org>
Acked-by: Paul Mackerras <paulus@samba.org>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2010-05-21 17:31:08 +10:00
Jeff Mahoney
637a99022f powerpc: Fix handling of strncmp with zero len
Commit 0119536c, which added the assembly version of strncmp to
powerpc, mentions that it adds two instructions to the version from
boot/string.S to allow it to handle len=0. Unfortunately, it doesn't
always return 0 when that is the case. The length is passed in r5, but
the return value is passed back in r3. In certain cases, this will
happen to work. Otherwise it will pass back the address of the first
string as the return value.

This patch lifts the len <= 0 handling code from memcpy to handle that
case.

Reported by: Christian_Sellars@symantec.com
Signed-off-by: Jeff Mahoney <jeffm@suse.com>
CC: <stable@kernel.org>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2010-04-07 18:00:39 +10:00
Tejun Heo
5a0e3ad6af include cleanup: Update gfp.h and slab.h includes to prepare for breaking implicit slab.h inclusion from percpu.h
percpu.h is included by sched.h and module.h and thus ends up being
included when building most .c files.  percpu.h includes slab.h which
in turn includes gfp.h making everything defined by the two files
universally available and complicating inclusion dependencies.

percpu.h -> slab.h dependency is about to be removed.  Prepare for
this change by updating users of gfp and slab facilities include those
headers directly instead of assuming availability.  As this conversion
needs to touch large number of source files, the following script is
used as the basis of conversion.

  http://userweb.kernel.org/~tj/misc/slabh-sweep.py

The script does the followings.

* Scan files for gfp and slab usages and update includes such that
  only the necessary includes are there.  ie. if only gfp is used,
  gfp.h, if slab is used, slab.h.

* When the script inserts a new include, it looks at the include
  blocks and try to put the new include such that its order conforms
  to its surrounding.  It's put in the include block which contains
  core kernel includes, in the same order that the rest are ordered -
  alphabetical, Christmas tree, rev-Xmas-tree or at the end if there
  doesn't seem to be any matching order.

* If the script can't find a place to put a new include (mostly
  because the file doesn't have fitting include block), it prints out
  an error message indicating which .h file needs to be added to the
  file.

The conversion was done in the following steps.

1. The initial automatic conversion of all .c files updated slightly
   over 4000 files, deleting around 700 includes and adding ~480 gfp.h
   and ~3000 slab.h inclusions.  The script emitted errors for ~400
   files.

2. Each error was manually checked.  Some didn't need the inclusion,
   some needed manual addition while adding it to implementation .h or
   embedding .c file was more appropriate for others.  This step added
   inclusions to around 150 files.

3. The script was run again and the output was compared to the edits
   from #2 to make sure no file was left behind.

4. Several build tests were done and a couple of problems were fixed.
   e.g. lib/decompress_*.c used malloc/free() wrappers around slab
   APIs requiring slab.h to be added manually.

5. The script was run on all .h files but without automatically
   editing them as sprinkling gfp.h and slab.h inclusions around .h
   files could easily lead to inclusion dependency hell.  Most gfp.h
   inclusion directives were ignored as stuff from gfp.h was usually
   wildly available and often used in preprocessor macros.  Each
   slab.h inclusion directive was examined and added manually as
   necessary.

6. percpu.h was updated not to include slab.h.

7. Build test were done on the following configurations and failures
   were fixed.  CONFIG_GCOV_KERNEL was turned off for all tests (as my
   distributed build env didn't work with gcov compiles) and a few
   more options had to be turned off depending on archs to make things
   build (like ipr on powerpc/64 which failed due to missing writeq).

   * x86 and x86_64 UP and SMP allmodconfig and a custom test config.
   * powerpc and powerpc64 SMP allmodconfig
   * sparc and sparc64 SMP allmodconfig
   * ia64 SMP allmodconfig
   * s390 SMP allmodconfig
   * alpha SMP allmodconfig
   * um on x86_64 SMP allmodconfig

8. percpu.h modifications were reverted so that it could be applied as
   a separate patch and serve as bisection point.

Given the fact that I had only a couple of failures from tests on step
6, I'm fairly confident about the coverage of this conversion patch.
If there is a breakage, it's likely to be something in one of the arch
headers which should be easily discoverable easily on most builds of
the specific arch.

Signed-off-by: Tejun Heo <tj@kernel.org>
Guess-its-ok-by: Christoph Lameter <cl@linux-foundation.org>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Lee Schermerhorn <Lee.Schermerhorn@hp.com>
2010-03-30 22:02:32 +09:00
Benjamin Herrenschmidt
3d98ffbffb powerpc: Fix lwsync feature fixup vs. modules on 64-bit
Anton's commit enabling the use of the lwsync fixup mechanism on 64-bit
breaks modules. The lwsync fixup section uses .long instead of the
FTR_ENTRY_OFFSET macro used by other fixups sections, and thus will
generate 32-bit relocations that our module loader cannot resolve.

This changes it to use the same type as other feature sections.

Note however that we might want to consider using 32-bit for all the
feature fixup offsets and add support for R_PPC_REL32 to module_64.c
instead as that would reduce the size of the kernel image. I'll leave
that as an exercise for the reader for now...

Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2010-02-26 18:29:17 +11:00
Anton Blanchard
789c299ca2 powerpc: Improve 64bit copy_tofrom_user
Here is a patch from Paul Mackerras that improves the ppc64 copy_tofrom_user.
The loop now does 32 bytes at a time and as well as pairing loads and stores.

A quick test case that reads 8kB over and over shows the improvement:

POWER6: 53% faster
POWER7: 51% faster

#define _XOPEN_SOURCE 500
#include <stdlib.h>
#include <stdio.h>
#include <unistd.h>
#include <fcntl.h>
#include <sys/types.h>
#include <sys/stat.h>

#define BUFSIZE (8 * 1024)
#define ITERATIONS 10000000

int main()
{
	char tmpfile[] = "/tmp/copy_to_user_testXXXXXX";
	int fd;
	char *buf[BUFSIZE];
	unsigned long i;

	fd = mkstemp(tmpfile);
	if (fd < 0) {
		perror("open");
		exit(1);
	}

	if (write(fd, buf, BUFSIZE) != BUFSIZE) {
		perror("open");
		exit(1);
	}

	for (i = 0; i < 10000000; i++) {
		if (pread(fd, buf, BUFSIZE, 0) != BUFSIZE) {
			perror("pread");
			exit(1);
		}
	}

	unlink(tmpfile);

	return 0;
}

Signed-off-by: Anton Blanchard <anton@samba.org>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2010-02-17 14:03:16 +11:00
Anton Blanchard
63e6c5b810 powerpc: Pair loads and stores in copy_4k_page
A number of our chips like loads and stores to be paired. A small kernel
module testcase shows the improvement of pairing loads and stores in
copy_4k_page:

POWER6: +9%
POWER7: +1.5%

#include <linux/module.h>
#include <linux/mm.h>

#define ITERATIONS 10000000

static int __init copypage_init(void)
{
	struct timespec before, after;
	unsigned long i;
	struct page *destpage, *srcpage;
	char *dest, *src;

	destpage = alloc_page(GFP_KERNEL);
	srcpage = alloc_page(GFP_KERNEL);

	dest = page_address(destpage);
	src = page_address(srcpage);

	getnstimeofday(&before);

	for (i = 0; i < ITERATIONS; i++)
		copy_4K_page(dest, src);

	getnstimeofday(&after);

	free_page((unsigned long)dest);
	free_page((unsigned long)src);

	printk(KERN_DEBUG "copy_4K_page loop took %lu ns\n",
		(after.tv_sec - before.tv_sec) * NSEC_PER_SEC +
		(after.tv_nsec - before.tv_nsec));

	return 0;
}

static void __exit copypage_exit(void)
{
}

module_init(copypage_init)
module_exit(copypage_exit)
MODULE_LICENSE("GPL");
MODULE_AUTHOR("Anton Blanchard");

Signed-off-by: Anton Blanchard <anton@samba.org>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2010-02-17 14:03:16 +11:00
Anton Blanchard
53eae2281a powerpc: Fix lwsync patching code on 64bit
do_lwsync_fixups doesn't work on 64bit, we end up writing lwsyncs to the
wrong addresses:

0:mon> di c0000001000bfacc
c0000001000bfacc  7c2004ac      lwsync

Since the lwsync section has negative offsets we need to use a signed int
pointer so we sign extend the value.

Signed-off-by: Anton Blanchard <anton@samba.org>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2010-02-17 14:03:15 +11:00
Thomas Gleixner
fb3a6bbc91 locking: Convert raw_rwlock to arch_rwlock
Not strictly necessary for -rt as -rt does not have non sleeping
rwlocks, but it's odd to not have a consistent naming convention.

No functional change.

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Acked-by: Peter Zijlstra <peterz@infradead.org>
Acked-by: David S. Miller <davem@davemloft.net>
Acked-by: Ingo Molnar <mingo@elte.hu>
Cc: linux-arch@vger.kernel.org
2009-12-14 23:55:32 +01:00
Thomas Gleixner
0199c4e68d locking: Convert __raw_spin* functions to arch_spin*
Name space cleanup. No functional change.

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Acked-by: Peter Zijlstra <peterz@infradead.org>
Acked-by: David S. Miller <davem@davemloft.net>
Acked-by: Ingo Molnar <mingo@elte.hu>
Cc: linux-arch@vger.kernel.org
2009-12-14 23:55:32 +01:00
Thomas Gleixner
445c89514b locking: Convert raw_spinlock to arch_spinlock
The raw_spin* namespace was taken by lockdep for the architecture
specific implementations. raw_spin_* would be the ideal name space for
the spinlocks which are not converted to sleeping locks in preempt-rt.

Linus suggested to convert the raw_ to arch_ locks and cleanup the
name space instead of using an artifical name like core_spin,
atomic_spin or whatever

No functional change.

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Acked-by: Peter Zijlstra <peterz@infradead.org>
Acked-by: David S. Miller <davem@davemloft.net>
Acked-by: Ingo Molnar <mingo@elte.hu>
Cc: linux-arch@vger.kernel.org
2009-12-14 23:55:32 +01:00
Benjamin Herrenschmidt
bcd6acd51f Merge commit 'origin/master' into next
Conflicts:
	include/linux/kvm.h
2009-12-09 17:14:38 +11:00
Joakim Tjernlund
15d914d72a powerpc/8xx: Start using dcbX instructions in various copy routines
Now that 8xx can fixup dcbX instructions, start using them
where possible like every other PowerPc arch do.

Signed-off-by: Joakim Tjernlund <Joakim.Tjernlund@transmode.se>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2009-12-09 17:10:37 +11:00
Anton Blanchard
3cd980dbc1 powerpc: perf_event: Cleanup copy_page output by hiding setup symbol
A lot of hits in "setup" doesn't make much sense, so hide this symbol and
allow all the hits to end up in copy_4k_page.

Signed-off-by: Anton Blanchard <anton@samba.org>
Signed-off-by: Paul Mackerras <paulus@samba.org>
2009-10-28 16:13:05 +11:00
Michael Ellerman
ba55bd7436 powerpc: Add configurable -Werror for arch/powerpc
Add the option to build the code under arch/powerpc with -Werror.

The intention is to make it harder for people to inadvertantly introduce
warnings in the arch/powerpc code. It needs to be configurable so that
if a warning is introduced, people can easily work around it while it's
being fixed.

The option is a negative, ie. don't enable -Werror, so that it will be
turned on for allyes and allmodconfig builds.

The default is n, in the hope that developers will build with -Werror,
that will probably lead to some build breaks, I am prepared to be flamed.

It's not enabled for math-emu, which is a steaming pile of warnings.

Signed-off-by: Michael Ellerman <michael@ellerman.id.au>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2009-06-16 14:15:45 +10:00
Benjamin Herrenschmidt
b16e7766d6 powerpc: Move dma-noncoherent.c from arch/powerpc/lib to arch/powerpc/mm
(pre-requisite to make the next patches more palatable)

Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2009-05-27 16:32:05 +10:00
Benjamin Herrenschmidt
84532a0fc3 Revert "powerpc: Rework dma-noncoherent to use generic vmalloc layer"
This reverts commit 33f00dcedb.

    While it was a good idea to try to use the mm/vmalloc.c allocator instead
    of our own (in fact, ours is itself a dup on an old variant of the vmalloc
    one), unfortunately, the approach is terminally busted since
    dma_alloc_coherent() can be called at interrupt time or in atomic contexts
    and there's little chances we'll make the code in mm/vmalloc.c cope with\       that :-(

    Until we can get the generic code to forbid that idiocy and fix all
    drivers abusing it, we pretty much have no choice but revert to
    our custom virtual space allocator.

    There's also a problem with SMP safety since freeing such mapping
    would require an IPI which cannot be done at interrupt time.

    However, right now, I don't think we support any platform that is
    both SMP and has non-coherent DMA (don't laugh, I know such things
    do exist !) so we can sort that out later.

    Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2009-05-27 13:33:14 +10:00
Benjamin Herrenschmidt
e14eee56c2 Merge commit 'origin/master' into next 2009-03-11 17:10:07 +11:00
Mark Nelson
f72b728bf1 powerpc: Fix 64bit __copy_tofrom_user() regression
This fixes a regression introduced by commit
a4e22f02f5 ("powerpc: Update 64bit
__copy_tofrom_user() using CPU_FTR_UNALIGNED_LD_STD").

The same bug that existed in the 64bit memcpy() also exists here so fix
it here too. The fix is the same as that applied to memcpy() with the
addition of fixes for the exception handling code required for
__copy_tofrom_user().

This stops us reading beyond the end of the source region we were told
to copy.

Signed-off-by: Mark Nelson <markn@au1.ibm.com>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2009-02-26 14:02:54 +11:00
Mark Nelson
e423b9ecd6 powerpc: Fix 64bit memcpy() regression
This fixes a regression introduced by commit
25d6e2d7c5 ("powerpc: Update 64bit memcpy()
using CPU_FTR_UNALIGNED_LD_STD").

This commit allowed CPUs that have the CPU_FTR_UNALIGNED_LD_STD CPU
feature bit present to do the memcpy() with unaligned load doubles. But,
along with this came a bug where our final load double would read bytes
beyond a page boundary and into the next (unmapped) page. This was caught
by enabling CONFIG_DEBUG_PAGEALLOC,

The fix was to read only the number of bytes that we need to store rather
than reading a full 8-byte doubleword and storing only a portion of that.

In order to minimise the amount of existing code touched we use the
original do_tail for the src_unaligned case.

Below is an example of the regression, as reported by Sachin Sant:

Unable to handle kernel paging request for data at address 0xc00000003f380000
Faulting instruction address: 0xc000000000039574
cpu 0x1: Vector: 300 (Data Access) at [c00000003baf3020]
    pc: c000000000039574: .memcpy+0x74/0x244
    lr: d00000000244916c: .ext3_xattr_get+0x288/0x2f4 [ext3]
    sp: c00000003baf32a0
   msr: 8000000000009032
   dar: c00000003f380000
 dsisr: 40000000
  current = 0xc00000003e54b010
  paca    = 0xc000000000a53680
    pid   = 1840, comm = readahead
enter ? for help
[link register   ] d00000000244916c .ext3_xattr_get+0x288/0x2f4 [ext3]
[c00000003baf32a0] d000000002449104 .ext3_xattr_get+0x220/0x2f4 [ext3]
(unreliab
le)
[c00000003baf3390] d00000000244a6e8 .ext3_xattr_security_get+0x40/0x5c [ext3]
[c00000003baf3400] c000000000148154 .generic_getxattr+0x74/0x9c
[c00000003baf34a0] c000000000333400 .inode_doinit_with_dentry+0x1c4/0x678
[c00000003baf3560] c00000000032c6b0 .security_d_instantiate+0x50/0x68
[c00000003baf35e0] c00000000013c818 .d_instantiate+0x78/0x9c
[c00000003baf3680] c00000000013ced0 .d_splice_alias+0xf0/0x120
[c00000003baf3720] d00000000243e05c .ext3_lookup+0xec/0x134 [ext3]
[c00000003baf37c0] c000000000131e74 .do_lookup+0x110/0x260
[c00000003baf3880] c000000000134ed0 .__link_path_walk+0xa98/0x1010
[c00000003baf3970] c0000000001354a0 .path_walk+0x58/0xc4
[c00000003baf3a20] c000000000135720 .do_path_lookup+0x138/0x1e4
[c00000003baf3ad0] c00000000013645c .path_lookup_open+0x6c/0xc8
[c00000003baf3b70] c000000000136780 .do_filp_open+0xcc/0x874
[c00000003baf3d10] c0000000001251e0 .do_sys_open+0x80/0x140
[c00000003baf3dc0] c00000000016aaec .compat_sys_open+0x24/0x38
[c00000003baf3e30] c00000000000855c syscall_exit+0x0/0x40

Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2009-02-26 14:02:53 +11:00
Ilya Yanok
33f00dcedb powerpc: Rework dma-noncoherent to use generic vmalloc layer
This patch rewrites consistent dma allocations support to use vmalloc
layer to allocate virtual memory space from vmalloc pool and get rid
of CONFIG_CONSISTENT_{START,SIZE}.

This greatly simplifies the code by effectively removing a custom
allocator we had for virtual space.

Signed-off-by: Ilya Yanok <yanok@emcraft.com>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2009-02-23 10:48:57 +11:00
Kumar Gala
16c57b3620 powerpc: Unify opcode definitions and support
Create a new header that becomes a single location for defining PowerPC
opcodes used by code that is either generationg instructions
at runtime (fixups, debug, etc.), emulating instructions, or just
compiling instructions old assemblers don't know about.

We currently don't handle the floating point emulation or alignment decode
as both are better handled by the specific decode support they already
have.

Added support for the new dcbzl, dcbal, msgsnd, tlbilx, & wait instructions
since older assemblers don't know about them.

Signed-off-by: Kumar Gala <galak@kernel.crashing.org>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2009-02-23 10:48:56 +11:00
Ananth N Mavinakayanahalli
eef336189b powerpc: Don't emulate mr. instructions
Currently emulate_step() emulates mr. instructions without updating cr0
and this can be disastrous. Don't emulate mr.

This bug has been around for a while, but I am not sure if its a worthy
-stable candidate. I'll leave it to Ben do decide.

Signed-off-by: Ananth N Mavinakayanahalli <ananth@in.ibm.com>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2009-02-10 14:39:07 +11:00
Linus Torvalds
3c92ec8ae9 Merge branch 'next' of git://git.kernel.org/pub/scm/linux/kernel/git/paulus/powerpc
* 'next' of git://git.kernel.org/pub/scm/linux/kernel/git/paulus/powerpc: (144 commits)
  powerpc/44x: Support 16K/64K base page sizes on 44x
  powerpc: Force memory size to be a multiple of PAGE_SIZE
  powerpc/32: Wire up the trampoline code for kdump
  powerpc/32: Add the ability for a classic ppc kernel to be loaded at 32M
  powerpc/32: Allow __ioremap on RAM addresses for kdump kernel
  powerpc/32: Setup OF properties for kdump
  powerpc/32/kdump: Implement crash_setup_regs() using ppc_save_regs()
  powerpc: Prepare xmon_save_regs for use with kdump
  powerpc: Remove default kexec/crash_kernel ops assignments
  powerpc: Make default kexec/crash_kernel ops implicit
  powerpc: Setup OF properties for ppc32 kexec
  powerpc/pseries: Fix cpu hotplug
  powerpc: Fix KVM build on ppc440
  powerpc/cell: add QPACE as a separate Cell platform
  powerpc/cell: fix build breakage with CONFIG_SPUFS disabled
  powerpc/mpc5200: fix error paths in PSC UART probe function
  powerpc/mpc5200: add rts/cts handling in PSC UART driver
  powerpc/mpc5200: Make PSC UART driver update serial errors counters
  powerpc/mpc5200: Remove obsolete code from mpc5200 MDIO driver
  powerpc/mpc5200: Add MDMA/UDMA support to MPC5200 ATA driver
  ...

Fix trivial conflict in drivers/char/Makefile as per Paul's directions
2008-12-28 16:54:33 -08:00
David Howells
8168b5400b powerpc: Rename struct vm_region to avoid conflict with NOMMU
Rename PowerPC's struct vm_region so that I can introduce my own
global version for NOMMU.  It's feasible that the PowerPC version may
wish to use my global one instead.

The NOMMU vm_region struct defines areas of the physical memory map
that are under mmap.  This may include chunks of RAM or regions of
memory mapped devices, such as flash.  It is also used to retain
copies of file content so that shareable private memory mappings of
files can be made.  As such, it may be compatible with what is
described in the banner comment for PowerPC's vm_region struct.

Signed-off-by: David Howells <dhowells@redhat.com>
Signed-off-by: Paul Mackerras <paulus@samba.org>
2008-12-21 14:21:14 +11:00
Ingo Molnar
30cd324e97 Merge branches 'tracing/ftrace', 'tracing/ring-buffer' and 'tracing/urgent' into tracing/core
Conflicts:
	include/linux/ftrace.h
2008-12-19 09:42:40 +01:00
Paul Mackerras
c280266a32 Merge branch 'linux-2.6' into next 2008-12-18 11:06:12 +11:00
Guillaume Knispel
af4d364386 powerpc: Fix corruption error in rh_alloc_fixed()
There is an error in rh_alloc_fixed() of the Remote Heap code:
If there is at least one free block blk won't be NULL at the end of the
search loop, so -ENOMEM won't be returned and the else branch of
"if (bs == s || be == e)" will be taken, corrupting the management
structures.

Signed-off-by: Guillaume Knispel <gknispel@proformatique.com>
Acked-by: Timur Tabi <timur@freescale.com>
Signed-off-by: Kumar Gala <galak@kernel.crashing.org>
2008-12-17 10:06:14 -06:00
Steven Rostedt
f1eecf0e4f powerpc/ppc32: static ftrace fixes for PPC32
Impact: fix for PowerPC 32 code

There were some early init code that was not safe for static
ftrace to boot on my PowerBook. This code must only use relative
addressing, and static mcount performs a compare of the
ftrace_trace_function pointer, and gets that with an absolute address.
In the early init boot up code, this will cause a fault.

This patch removes tracing from the files containing the offending
functions.

Signed-off-by: Steven Rostedt <srostedt@redhat.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2008-11-28 14:08:07 +01:00
Mark Nelson
a4e22f02f5 powerpc: Update 64bit __copy_tofrom_user() using CPU_FTR_UNALIGNED_LD_STD
In exactly the same way that we updated memcpy() with new feature
sections in commit 25d6e2d7c5 ("powerpc:
Update 64bit memcpy() using CPU_FTR_UNALIGNED_LD_STD"), we do the same
thing here for __copy_tofrom_user().  Once again this is purely a
performance tweak for Cell and Power6 - this has no effect on all the
other 64bit powerpc chips.

We can make these same changes to __copy_tofrom_user() because the
basic copy algorithm is the same as in memcpy() - this version just
has all the exception handling logic needed when copying to or from
userspace as well as a special case for copying whole 4K pages that
are page aligned.

CPU_FTR_UNALIGNED_LD_STD CPU was added in commit
4ec577a289 ("powerpc: Add new CPU
feature: CPU_FTR_UNALIGNED_LD_STD").

We also make the same simple one line change from cmpldi r1,... to
cmpldi cr1,... for consistency.

Signed-off-by: Mark Nelson <markn@au1.ibm.com>
Signed-off-by: Paul Mackerras <paulus@samba.org>
2008-11-19 16:04:54 +11:00
Hollis Blanchard
7526ff76f8 powerpc: Remove superfluous WARN_ON() from dma-noncoherent.c
I can't tell why this WARN_ON exists, and there's no comment
explaining it.  Whether the pmd is present or not, pte_alloc_kernel()
seems to handle both cases.

Booting a 440 kernel with 64K PAGE_SIZE triggers the warning, but boot
successfully completes and I see no problems beyond that.

Signed-off-by: Hollis Blanchard <hollisb@us.ibm.com>
Acked-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Signed-off-by: Paul Mackerras <paulus@samba.org>
2008-11-19 16:04:52 +11:00
Mark Nelson
25d6e2d7c5 powerpc: Update 64bit memcpy() using CPU_FTR_UNALIGNED_LD_STD
Update memcpy() to add two new feature sections: one for aligning the
destination before copying and one for copying using aligned load
and store doubles.

These new feature sections will only affect Power6 and Cell because
the CPU feature bit was only added to these two processors.

Power6 gets its best performance in memcpy() when aligning neither the
source nor the destination, while Cell gets its best performance when
just the destination is aligned. But in order to save on CPU feature
bits we can use the previously added CPU_FTR_CP_USE_DCBTZ feature bit
to differentiate between Power6 and Cell (because CPU_FTR_CP_USE_DCBTZ
was added to Cell but not Power6).

The first feature section acts to nop out the branch that takes us to
the code that aligns us to an eight byte boundary for the destination.
We only want to nop out this branch on Power6.

So the ALT_FTR_SECTION_END() for this feature section creates a test
mask of the two feature bits ORed together and provides an expected
result of just CPU_FTR_UNALIGNED_LD_STD, thus we nop out the branch
if we're on a CPU that has CPU_FTR_UNALIGNED_LD_STD set and
CPU_FTR_CP_USE_DCBTZ unset.

For the second feature section added, if we're on a CPU that has the
CPU_FTR_UNALIGNED_LD_STD bit set then we don't want to do the copy
with aligned loads and stores (and the appropriate shifting left and
right instructions), so we want to nop out the branch to
.Lsrc_unaligned.

The andi. used for this branch is moved to just above the branch
because this allows us to nop out both instructions with just one
feature section which gives us better performance and doesn't hurt
readability which two separate feature sections did.

Moving the andi. to just above the branch doesn't have any noticeable
negative effect on the remaining 64bit processors (the ones that
didn't have this feature bit added).

On Cell this simple modification results in an improvement to measured
memcpy() bandwidth of up to 50% in the hot cache case and up to 15% in
the cold cache case.

On Power6 we get memory bandwidth results that are up to three times
faster in the hot cache case and up to 50% faster in the cold cache
case.

Commit 2a9294369b ("powerpc: Add new CPU
feature: CPU_FTR_CP_USE_DCBTZ") was where CPU_FTR_CP_USE_DCBTZ was
added.

To say that Cell gets its best performance in memcpy() with just the
destination aligned is true but only for the reason that the indirect
shift and rotate instructions, sld and srd, are microcoded on Cell.
This means that either the destination or the source can be aligned,
but not both, and seeing as we get better performance with the
destination aligned we choose this option.

While we're at it make a one line change from cmpldi r1,... to
cmpldi cr1,... for consistency.

Signed-off-by: Mark Nelson <markn@au1.ibm.com>
Signed-off-by: Paul Mackerras <paulus@samba.org>
2008-11-05 22:08:29 +11:00
Benjamin Herrenschmidt
8aa2659009 powerpc: Fix DMA offset for non-coherent DMA
After Becky's work we can almost have different DMA offsets
between on-chip devices and PCI. Almost because there's a
problem with the non-coherent DMA code that basically ignores
the programmed offset to use the global one for everything.
This fixes it.

Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2008-10-14 10:35:26 +11:00
Mark Nelson
57dda6ef5b powerpc: New copy_4K_page()
This new copy_4K_page() function was originally tuned for the best
performance on the Cell processor, but after testing on more 64bit
powerpc chips it was found that with a small modification it either
matched the performance offered by the current mainline version or
bettered it by a small amount.

It was found that on a Cell-based QS22 blade the amount of system
time measured when compiling a 2.6.26 pseries_defconfig decreased
by 4%. Using the same test, a 4-way 970MP machine saw a decrease of
2% in system time. No noticeable change was seen on Power4, Power5
or Power6.

The 4096 byte page is copied in thirty-two 128 byte strides. An
initial setup loop executes dcbt instructions for the whole source
page and dcbz instructions for the whole destination page. To do
this, the cache line size is retrieved from ppc64_caches.

A new CPU feature bit, CPU_FTR_CP_USE_DCBTZ, (introduced in the
previous patch) is used to make the modification to this new copy
routine - on Power4, 970 and Cell the feature bit is set so the
setup loop is executed, but on all other 64bit chips the setup
loop is nop'ed out.

Signed-off-by: Mark Nelson <markn@au1.ibm.com>
Signed-off-by: Paul Mackerras <paulus@samba.org>
2008-09-15 11:07:42 -07:00
Kumar Gala
9c4cb82515 powerpc: Remove use of CONFIG_PPC_MERGE
Now that arch/ppc is gone and CONFIG_PPC_MERGE is always set, remove
the dead code associated with !CONFIG_PPC_MERGE from arch/powerpc
and include/asm-powerpc.

Signed-off-by: Kumar Gala <galak@kernel.crashing.org>
Signed-off-by: Paul Mackerras <paulus@samba.org>
2008-08-04 13:18:17 +10:00
Andrea Righi
27ac792ca0 PAGE_ALIGN(): correctly handle 64-bit values on 32-bit architectures
On 32-bit architectures PAGE_ALIGN() truncates 64-bit values to the 32-bit
boundary. For example:

	u64 val = PAGE_ALIGN(size);

always returns a value < 4GB even if size is greater than 4GB.

The problem resides in PAGE_MASK definition (from include/asm-x86/page.h for
example):

#define PAGE_SHIFT      12
#define PAGE_SIZE       (_AC(1,UL) << PAGE_SHIFT)
#define PAGE_MASK       (~(PAGE_SIZE-1))
...
#define PAGE_ALIGN(addr)       (((addr)+PAGE_SIZE-1)&PAGE_MASK)

The "~" is performed on a 32-bit value, so everything in "and" with
PAGE_MASK greater than 4GB will be truncated to the 32-bit boundary.
Using the ALIGN() macro seems to be the right way, because it uses
typeof(addr) for the mask.

Also move the PAGE_ALIGN() definitions out of include/asm-*/page.h in
include/linux/mm.h.

See also lkml discussion: http://lkml.org/lkml/2008/6/11/237

[akpm@linux-foundation.org: fix drivers/media/video/uvc/uvc_queue.c]
[akpm@linux-foundation.org: fix v850]
[akpm@linux-foundation.org: fix powerpc]
[akpm@linux-foundation.org: fix arm]
[akpm@linux-foundation.org: fix mips]
[akpm@linux-foundation.org: fix drivers/media/video/pvrusb2/pvrusb2-dvb.c]
[akpm@linux-foundation.org: fix drivers/mtd/maps/uclinux.c]
[akpm@linux-foundation.org: fix powerpc]
Signed-off-by: Andrea Righi <righi.andrea@gmail.com>
Cc: <linux-arch@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2008-07-24 10:47:21 -07:00
Michael Ellerman
76bfdcf71c powerpc: Use PPC_LONG and PPC_LONG_ALIGN in lib/string.S
Replace ifdef clutter with the PPC_LONG and PPC_LONG_ALIGN macros
for readability.

No change to the generated code.

Signed-off-by: Michael Ellerman <michael@ellerman.id.au>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2008-07-22 10:39:35 +10:00
Michael Ellerman
1856c02040 powerpc: Use WARN_ON(1) instead of __WARN()
__WARN() is not defined for all configs, use WARN_ON(1) instead.

Signed-off-by: Michael Ellerman <michael@ellerman.id.au>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2008-07-22 10:39:34 +10:00
Kumar Gala
2d1b202762 powerpc: Fixup lwsync at runtime
To allow for a single kernel image on e500 v1/v2/mc we need to fixup lwsync
at runtime.  On e500v1/v2 lwsync causes an illop so we need to patch up
the code.  We default to 'sync' since that is always safe and if the cpu
is capable we will replace 'sync' with 'lwsync'.

We introduce CPU_FTR_LWSYNC as a way to determine at runtime if this is
needed.  This flag could be moved elsewhere since we dont really use it
for the normal CPU_FTR purpose.

Finally we only store the relative offset in the fixup section to keep it
as small as possible rather than using a full fixup_entry.

Signed-off-by: Kumar Gala <galak@kernel.crashing.org>
Signed-off-by: Paul Mackerras <paulus@samba.org>
2008-07-03 16:58:10 +10:00
Kumar Gala
5888da1876 powerpc: Fix building of feature-fixup tests on ppc32
We need to use PPC_LCMPI otherwise we get compile errors like:

arch/powerpc/lib/feature-fixups-test.S: Assembler messages:
arch/powerpc/lib/feature-fixups-test.S:142: Error: Unrecognized opcode: `cmpdi'
arch/powerpc/lib/feature-fixups-test.S:149: Error: Unrecognized opcode: `cmpdi'
arch/powerpc/lib/feature-fixups-test.S:164: Error: Unrecognized opcode: `cmpdi'

Signed-off-by: Kumar Gala <galak@kernel.crashing.org>
Signed-off-by: Paul Mackerras <paulus@samba.org>
2008-07-03 16:58:09 +10:00
Andrew Lewis
03d70617b8 powerpc: Prevent memory corruption due to cache invalidation of unaligned DMA buffer
On PowerPC processors with non-coherent cache architectures the DMA
subsystem calls invalidate_dcache_range() before performing a DMA read
operation.  If the address and length of the DMA buffer are not aligned
to a cache-line boundary this can result in memory outside of the DMA
buffer being invalidated in the cache.  If this memory has an
uncommitted store then the data will be lost and a subsequent read of
that address will result in an old value being returned from main memory.

Only when the DMA buffer starts on a cache-line boundary and is an exact
mutiple of the cache-line size can invalidate_dcache_range() be called,
otherwise flush_dcache_range() must be called.  flush_dcache_range()
will first flush uncommitted writes, and then invalidate the cache.

Signed-off-by: Andrew Lewis <andrew-lewis at netspace.net.au>
Signed-off-by: Paul Mackerras <paulus@samba.org>
2008-07-01 11:28:54 +10:00
Michael Ellerman
362e7701fd powerpc: Add self-tests of the feature fixup code
This commit adds tests of the feature fixup code, they are run during
boot if CONFIG_FTR_FIXUP_SELFTEST=y. Some of the tests manually invoke
the patching routines to check their behaviour, and others use the
macros and so are patched during the normal patching done during boot.

Because we have two sets of macros with different names, we use a macro
to generate the test of the macros, very niiiice.

Signed-off-by: Michael Ellerman <michael@ellerman.id.au>
Signed-off-by: Paul Mackerras <paulus@samba.org>
2008-07-01 11:28:30 +10:00
Michael Ellerman
9b1a735de6 powerpc: Add logic to patch alternative feature sections
This commit adds the logic to patch alternative sections.  This is fairly
straightforward, except for branches.  Relative branches that jump from
inside the else section to outside of it need to be translated as they're
moved, otherwise they will jump to the wrong location.

Signed-off-by: Michael Ellerman <michael@ellerman.id.au>
Signed-off-by: Paul Mackerras <paulus@samba.org>
2008-07-01 11:28:29 +10:00
Michael Ellerman
fac23fe4be powerpc: Introduce infrastructure for feature sections with alternatives
The current feature section logic only supports nop'ing out code, this means
if you want to choose at runtime between instruction sequences, one or both
cases will have to execute the nop'ed out contents of the other section, eg:

BEGIN_FTR_SECTION
	or	1,1,1
END_FTR_SECTION_IFSET(FOO)
BEGIN_FTR_SECTION
	or	2,2,2
END_FTR_SECTION_IFCLR(FOO)

and the resulting code will be either,

	or	1,1,1
	nop

or,
	nop
	or	2,2,2

For small code segments this is fine, but for larger code blocks and in
performance criticial code segments, it would be nice to avoid the nops.
This commit starts to implement logic to allow the following:

BEGIN_FTR_SECTION
	or	1,1,1
FTR_SECTION_ELSE
	or	2,2,2
ALT_FTR_SECTION_END_IFSET(FOO)

and the resulting code will be:

	or	1,1,1
or,
	or	2,2,2

We achieve this by extending the existing FTR macros. The current feature
section semantic just becomes a special case, ie. if the else case is empty
we nop out the default case.

The key limitation is that the size of the else case must be less than or
equal to the size of the default case. If the else case is smaller the
remainder of the section is nop'ed.

We let the linker put the else case code in with the rest of the text,
so that relative branches from the else case are more likley to link,
this has the disadvantage that we can't free the unused else cases.

This commit introduces the required macro and linker script changes, but
does not enable the patching of the alternative sections.

We also need to update two hand-made section entries in reg.h and timex.h

Signed-off-by: Michael Ellerman <michael@ellerman.id.au>
Signed-off-by: Paul Mackerras <paulus@samba.org>
2008-07-01 11:28:28 +10:00
Michael Ellerman
51c52e8669 powerpc: Split out do_feature_fixups() from cputable.c
The logic to patch CPU feature sections lives in cputable.c, but these days
it's used for CPU features as well as firmware features.  Move it into
it's own file for neatness and as preparation for some additions.

While we're moving the code, we pull the loop body logic into a separate
routine, and remove a comment which doesn't apply anymore.

Signed-off-by: Michael Ellerman <michael@ellerman.id.au>
Acked-by: Kumar Gala <galak@kernel.crashing.org>
Signed-off-by: Paul Mackerras <paulus@samba.org>
2008-07-01 11:28:24 +10:00