So we tried to speed things up a bit using flush_hash_pages() directly
but that falls over on 603 of course meaning we fail to flush the TLB
properly and we may even end up having it corrupt memory randomly by
accessing a hash table that doesn't exist.
This removes the "optimization" by always going through flush_tlb_page()
for now at least.
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Currently we always call start-cpu irrespective of if the CPU is
stopped or not. Unfortunatley on POWER7, firmware seems to not like
start-cpu being called when a cpu already been started. This was not
the case on POWER6 and earlier.
This patch checks to see if the CPU is stopped or not via an
query-cpu-stopped-state call, and only calls start-cpu on CPUs which
are stopped.
This fixes a bug with kexec on POWER7 on PHYP where only the primary
thread would make it to the second kernel.
Reported-by: Ankita Garg <ankita@linux.vnet.ibm.com>
Cc: <stable@kernel.org>
Signed-off-by: Michael Neuling <mikey@neuling.org>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
This moves query_cpu_stopped() out of the hotplug cpu code and into
smp.c so it can called in other places and renames it to
smp_query_cpu_stopped().
It also cleans up the return values by adding some #defines
Cc: <stable@kernel.org>
Signed-off-by: Michael Neuling <mikey@neuling.org>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
By setting "reset_type" to one of the following values, the default
software reset mechanism may be overidden. Here the possible values of
"reset_type":
1 - PPC4xx core reset
2 - PPC4xx chip reset
3 - PPC4xx system reset (default)
This will be used by a new PPC440SPe board port, which needs a "chip
reset" instead of the default "system reset" to be asserted.
Signed-off-by: Stefan Roese <sr@denx.de>
Cc: Josh Boyer <jwboyer@linux.vnet.ibm.com>
Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Acked-by: Josh Boyer <jwboyer@linux.vnet.ibm.com>
Signed-off-by: Josh Boyer <jwboyer@linux.vnet.ibm.com>
A defconfig for the IBM ISS 476 simulator
Signed-off-by: Dave Kleikamp <shaggy@linux.vnet.ibm.com>
Signed-off-by: Josh Boyer <jwboyer@linux.vnet.ibm.com>
This is a trivial 4xx plaform that uses the new simple bsp from
Josh and is handy to use in simulators such as ISS or even Mambo
who don't properly implement most of the actual devices in the
SoC but really only the core.
Signed-off-by: Torez Smith <lnxtorez@linux.vnet.ibm.com>
Signed-off-by: Dave Kleikamp <shaggy@linux.vnet.ibm.com>
Signed-off-by: Josh Boyer <jwboyer@linux.vnet.ibm.com>
476 requires an isync after loading MMU and debug related SPR's. Some of
these are in performance-critical paths and may need to be optimized, but
initially, we're playing it safe.
Signed-off-by: Torez Smith <lnxtorez@linux.vnet.ibm.com>
Signed-off-by: Dave Kleikamp <shaggy@linux.vnet.ibm.com>
Signed-off-by: Josh Boyer <jwboyer@linux.vnet.ibm.com>
The 47x core's MCSR varies from 44x, so it needs it's own machine check
handler.
Signed-off-by: Dave Kleikamp <shaggy@linux.vnet.ibm.com>
Signed-off-by: Josh Boyer <jwboyer@linux.vnet.ibm.com>
This patch adds the base support for the 476 processor. The code was
primarily written by Ben Herrenschmidt and Torez Smith, but I've been
maintaining it for a while.
The goal is to have a single binary that will run on 44x and 47x, but
we still have some details to work out. The biggest is that the L1 cache
line size differs on the two platforms, but it's currently a compile-time
option.
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Signed-off-by: Torez Smith <lnxtorez@linux.vnet.ibm.com>
Signed-off-by: Dave Kleikamp <shaggy@linux.vnet.ibm.com>
Signed-off-by: Josh Boyer <jwboyer@linux.vnet.ibm.com>
The 47x platform supports multiple cores and shares code with 44x.
Break out code that is common for initializing the primary and secondary
cpus into a function which can be called for both.
Signed-off-by: Dave Kleikamp <shaggy@linux.vnet.ibm.com>
Signed-off-by: Josh Boyer <jwboyer@linux.vnet.ibm.com>
This patch adds a marker to the exception stack frame to aid in debugging.
It's already inserted on other platforms and xmon recognizes it and
identifies exception frames when showing stack traces.
Signed-off-by: Torez Smith <lnxtorez@linux.vnet.ibm.com>
Signed-off-by: Dave Kleikamp <shaggy@linux.vnet.ibm.com>
Signed-off-by: Josh Boyer <jwboyer@linux.vnet.ibm.com>
This patch ports the kprobe-based event tracer to powerpc. This patch
is based on x86 port. This brings powerpc on par with x86.
Signed-off-by: Mahesh Salgaonkar <mahesh@linux.vnet.ibm.com>
Acked-by: Masami Hiramatsu <mhiramat@redhat.com>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Adds support for suspend/resume for VIO devices. This is needed for
support for HMC initiated hibernation.
Signed-off-by: Brian King <brking@linux.vnet.ibm.com>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Replace open-coded rate limiting logic with __ratelimit().
Signed-off-by: Akinobu Mita <akinobu.mita@gmail.com>
Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Cc: Paul Mackerras <paulus@samba.org>
Cc: linuxppc-dev@ozlabs.org
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
The current setting for SECTION_SIZE_BITS is quite small compared to
everyone else:
arch/powerpc/include/asm/sparsemem.h:#define SECTION_SIZE_BITS 24
arch/sparc/include/asm/sparsemem.h:#define SECTION_SIZE_BITS 30
arch/ia64/include/asm/sparsemem.h:#define SECTION_SIZE_BITS (30)
arch/s390/include/asm/sparsemem.h:#define SECTION_SIZE_BITS 28
arch/x86/include/asm/sparsemem.h:# define SECTION_SIZE_BITS 27
And it has proven to be an issue during boot on very large machines.
If hotplug memory is enabled, drivers/base/memory.c does this:
for (i = 0; i < NR_MEM_SECTIONS; i++) {
if (!present_section_nr(i))
continue;
err = add_memory_block(0, __nr_to_section(i), MEM_ONLINE,
0, BOOT);
if (!ret)
ret = err;
}
Which creates a sysfs directory for every 16MB of memory. As a result
I'm seeing up to 30 minutes spent here during boot:
c000000000248ee0 .__sysfs_add_one+0x28/0x128
c0000000002492a8 .sysfs_add_one+0x38/0x188
c000000000249c88 .create_dir+0x70/0x138
c000000000249d98 .sysfs_create_dir+0x48/0x78
c00000000032bad8 .kobject_add_internal+0x140/0x308
c00000000032beb4 .kobject_init_and_add+0x4c/0x68
c00000000046c2c0 .sysdev_register+0xa0/0x220
c00000000047b1dc .add_memory_block+0x124/0x1e8
c0000000008d1f28 .memory_dev_init+0xf4/0x168
c0000000008d1b64 .driver_init+0x50/0x64
c000000000890378 .do_basic_setup+0x40/0xd4
I assume there are some O(n^2) issues in sysfs as we add all the memory
nodes. Bumping SECTION_SIZE_BITS to 256 MB drops the time to about 10
seconds and results in a much smaller /sys.
Signed-off-by: Anton Blanchard <anton@samba.org>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
We have had issues in the past with ibm,os-term initiating shutdown of a
partition. This is confusing to the user, especially if panic_timeout is
non zero.
The temporary fix was to avoid calling ibm,os-term if a panic_timeout was set
and since we set it on every boot we basically never call ibm,os-term.
An extended version of ibm,os-term has since been implemented which gives us
the behaviour we want:
"When the platform supports extended ibm,os-term behavior, the return to the
RTAS will always occur unless there is a kernel assisted dump active as
initiated by an ibm,configure-kernel-dump call."
This patch checks for the ibm,extended-os-term property and calls ibm,os-term
if it exists.
Signed-off-by: Anton Blanchard <anton@samba.org>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
I noticed /proc/sys/vm/zone_reclaim_mode was 0 on a ppc64 NUMA box. It gets
enabled via this:
/*
* If another node is sufficiently far away then it is better
* to reclaim pages in a zone before going off node.
*/
if (distance > RECLAIM_DISTANCE)
zone_reclaim_mode = 1;
Since we use the default value of 20 for REMOTE_DISTANCE and 20 for
RECLAIM_DISTANCE it never kicks in.
The local to remote bandwidth ratios can be quite large on System p
machines so it makes sense for us to reclaim clean pagecache locally before
going off node.
The patch below sets a smaller value for RECLAIM_DISTANCE and thus enables
zone reclaim.
Signed-off-by: Anton Blanchard <anton@samba.org>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Fix I2C-drivers which missed setting clientdata to NULL before freeing the
structure it points to. Also fix drivers which do this _after_ the structure
was freed already.
Signed-off-by: Wolfram Sang <w.sang@pengutronix.de>
Cc: Colin Leroy <colin@colino.net>
Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Use set_cpus_allowed_ptr rather than set_cpus_allowed.
The semantic patch that makes this change is as follows:
(http://coccinelle.lip6.fr/)
// <smpl>
@@
expression E1,E2;
@@
- set_cpus_allowed(E1, cpumask_of_cpu(E2))
+ set_cpus_allowed_ptr(E1, cpumask_of(E2))
@@
expression E;
identifier I;
@@
- set_cpus_allowed(E, I)
+ set_cpus_allowed_ptr(E, &I)
// </smpl>
Signed-off-by: Julia Lawall <julia@diku.dk>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
In some error handling cases the lock is not unlocked.
A simplified version of the semantic patch that finds this problem is as
follows: (http://coccinelle.lip6.fr/)
// <smpl>
@r exists@
expression E1;
identifier f;
@@
f (...) { <+...
* spin_lock_irqsave (E1,...);
... when != E1
* return ...;
...+> }
// </smpl>
Signed-off-by: Julia Lawall <julia@diku.dk>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Add an unlock before exiting the function.
A simplified version of the semantic patch that finds this problem is as
follows: (http://coccinelle.lip6.fr/)
// <smpl>
@r exists@
expression E1;
identifier f;
@@
f (...) { <+...
* spin_lock_irq (E1,...);
... when != E1
* return ...;
...+> }
// </smpl>
Signed-off-by: Julia Lawall <julia@diku.dk>
Acked-by: Stephen Rothwell <sfr@canb.auug.org.au>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
kasprintf combines kmalloc and sprintf, and takes care of the size
calculation itself.
The semantic patch that makes this change is as follows:
(http://coccinelle.lip6.fr/)
// <smpl>
@@
expression a,flag;
expression list args;
statement S;
@@
a =
- \(kmalloc\|kzalloc\)(...,flag)
+ kasprintf(flag,args)
<... when != a
if (a == NULL || ...) S
...>
- sprintf(a,args);
// </smpl>
Signed-off-by: Julia Lawall <julia@diku.dk>
Acked-by: Nathan Fontenot <nfont@austin.ibm.com>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
dlpar_free_cc_nodes frees its argument, so dlpar_online_cpu should not be
called on the same value. Skip over the call to dlpar_online_cpu by
jumping directly to out.
A simplified version of the semantic patch that finds this problem is as
follows: (http://coccinelle.lip6.fr/)
// <smpl>
@@
expression E,E2;
@@
dlpar_free_cc_nodes(E)
...
(
E = E2
|
* E
)
// </smpl>
Signed-off-by: Julia Lawall <julia@diku.dk>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
The conditionals were testing different values, but then all freeing the
same one, which could result in a double free.
A simplified version of the semantic match that finds this problem is as
follows: (http://coccinelle.lip6.fr/)
// <smpl>
@@
expression x,e;
identifier f;
iterator I;
statement S;
@@
*kfree(x);
... when != &x
when != x = e
when != I(x,...) S
*x
// </smpl>
Signed-off-by: Julia Lawall <julia@diku.dk>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Commit 0119536c, which added the assembly version of strncmp to
powerpc, mentions that it adds two instructions to the version from
boot/string.S to allow it to handle len=0. Unfortunately, it doesn't
always return 0 when that is the case. The length is passed in r5, but
the return value is passed back in r3. In certain cases, this will
happen to work. Otherwise it will pass back the address of the first
string as the return value.
This patch lifts the len <= 0 handling code from memcpy to handle that
case.
Reported by: Christian_Sellars@symantec.com
Signed-off-by: Jeff Mahoney <jeffm@suse.com>
CC: <stable@kernel.org>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Fix minor nits found by cppcheck
[./macintosh/windfarm_pm81.c:760]: (style) Redundant condition. It is safe to deallocate a NULL pointer
[./macintosh/windfarm_pm81.c:762]: (style) Redundant condition. It is safe to deallocate a NULL pointer
Signed-off-by: David Binderman <dcb314@hotmail.com>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Fix minor nits found by cppcheck
[./arch/powerpc/platforms/powermac/low_i2c.c:594]: (style) The scope of the variable chans can be reduced
[./arch/powerpc/platforms/powermac/low_i2c.c:594]: (style) The scope of the variable i can be reduced
[./arch/powerpc/platforms/powermac/low_i2c.c:1260]: (style) Redundant condition. It is safe to deallocate a NULL pointer
Signed-off-by: David Binderman <dcb314@hotmail.com>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
This avoids storing these registers in memory.
CPU6 errata will still use the old way.
Remove some G2 leftover accesses from 2.4
Signed-off-by: Joakim Tjernlund <Joakim.Tjernlund@transmode.se>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Only the swap function cares about the ACCESSED bit in
the pte. Do not waste cycles updateting ACCESSED when swap
is not compiled into the kernel.
Signed-off-by: Joakim Tjernlund <Joakim.Tjernlund@transmode.se>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Only modules will cause ITLB Misses as we always pin
the first 8MB of kernel memory.
Signed-off-by: Joakim Tjernlund <Joakim.Tjernlund@transmode.se>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
This removes a couple of insn's from the TLB Miss
handlers whithout changing functionality.
Signed-off-by: Joakim Tjernlund <Joakim.Tjernlund@transmode.se>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Instead of referencing mem_map directly, use pfn_to_page. Otherwise
the kernel crashes when trying to start userspace if ARCH_PFN_OFFSET is
non-zero and CONFIG_BOOKE is not defined
Signed-off-by: Jason Gunthorpe <jgunthorpe@obsidianresearch.com>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Add support for H_EM_GET_PARMS hcall that will return data
related to power modes from the platform. Export the data
directly to user space for administrative tools to interpret
and use.
cat /proc/powerpc/lparcfg will export power mode data
Signed-off-by: Vaidyanathan Srinivasan <svaidy@linux.vnet.ibm.com>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Data address breakpoint exceptions are currently handled along with page-faults
which require interrupts to remain in enabled state. Since exception handling
for data breakpoints aren't pre-empt safe, we handle them separately.
Signed-off-by: K.Prasad <prasad@linux.vnet.ibm.com>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
BenH: Added to vio_cmo_dev_attrs as well
Provide a modalias entry for VIO devices in sysfs. I believe
this was another initrd generation bugfix for anaconda.
Signed-off-by: David Woodhouse <David.Woodhouse@intel.com>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
We can't just clear the user read permission in book3e pte, because
that will also clear supervisor read permission. This surely isn't
desired. Fix the problem by adding the supervisor read back.
BenH: Slightly simplified the ifdef and applied to ppc64 too
Signed-off-by: Li Yang <leoli@freescale.com>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
The mpsc serial driver needx to set the port's device tree element
to register properly.
Signed-off-by: Corey Minyard <cminyard@mvista.com>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
The maple platform failed to load because it's firmware could not take a
link address of 0x4000000. A new platform type with a link address of
0x400000 had to be created for the maple.
Signed-off-by: Corey Minyard <cminyard@mvista.com>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
This driver seems to be specific to a "Sky CPU" board for which we
don't appear to have upstream support (or not any more). No Kconfig
file in the kernel ever enables it. So remove it.
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
* 'irq-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip:
genirq: Force MSI irq handlers to run with interrupts disabled
* git://git.kernel.org/pub/scm/linux/kernel/git/wim/linux-2.6-watchdog:
[WATCHDOG] hpwdt - fix lower timeout limit
[WATCHDOG] iTCO_wdt: TCO Watchdog patch for additional Intel Cougar Point DeviceIDs
[WATCHDOG] doc: Fix use of WDIOC_SETOPTIONS ioctl.
[WATCHDOG] doc: watchdog simple example: don't fail on fsync()
[WATCHDOG] set max63xx driver as ARM only
[WATCHDOG] powerpc: pika_wdt ident cannot be const
* git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-2.6: (37 commits)
smc91c92_cs: fix the problem of "Unable to find hardware address"
r8169: clean up my printk uglyness
net: Hook up cxgb4 to Kconfig and Makefile
cxgb4: Add main driver file and driver Makefile
cxgb4: Add remaining driver headers and L2T management
cxgb4: Add packet queues and packet DMA code
cxgb4: Add HW and FW support code
cxgb4: Add register, message, and FW definitions
netlabel: Fix several rcu_dereference() calls used without RCU read locks
bonding: fix potential deadlock in bond_uninit()
net: check the length of the socket address passed to connect(2)
stmmac: add documentation for the driver.
stmmac: fix kconfig for crc32 build error
be2net: fix bug in vlan rx path for big endian architecture
be2net: fix flashing on big endian architectures
be2net: fix a bug in flashing the redboot section
bonding: bond_xmit_roundrobin() fix
drivers/net: Add missing unlock
net: gianfar - align BD ring size console messages
net: gianfar - initialize per-queue statistics
...
copy_to_user() returns the number of bytes left to be copied.
This was a typo from: d82ef020cf "proc: pagemap: Hold mmap_sem during
page walk".
Signed-off-by: Dan Carpenter <error27@gmail.com>
Acked-by: Matt Mackall <mpm@selenic.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Some BIOSes don't configure HPA during boot but do so while resuming.
This causes harddrives to shrink during resume making libata detach
and reattach them. This can be worked around by unlocking HPA if old
size equals native size.
Add ATA_DFLAG_UNLOCK_HPA so that HPA unlocking can be controlled
per-device and update ata_dev_revalidate() such that it sets
ATA_DFLAG_UNLOCK_HPA and fails with -EIO when the above condition is
detected.
This patch fixes the following bug.
https://bugzilla.kernel.org/show_bug.cgi?id=15396
Signed-off-by: Tejun Heo <tj@kernel.org>
Reported-by: Oleksandr Yermolenko <yaa.bta@gmail.com>
Signed-off-by: Jeff Garzik <jgarzik@redhat.com>
Crucial said,
Thank you for contacting us. We know that with our M225 line of SSDs
you sometimes need to disable NCQ (native command queuing) to avoid
just the type of errors you're seeing. Our recommendation for the
M225 is to add libata.force=noncq to your Linux kernel boot options,
under the kernel ATA library option.
I have sent your feedback to the engineers working on the C300, and
asked them to please pass it on to the firmware team. I have been
notified that they are in the process of testing and finalizing a
new firmware version, that you can expect to see released around the
end of April. We’ll keep you posted as to when it will be available
for download.
So, turn off NCQ on the drive w/ the current firmware revision.
Reported in the following bug.
https://bugzilla.kernel.org/show_bug.cgi?id=15573
Signed-off-by: Tejun Heo <tj@kernel.org>
Reported-by: lethalwp@scarlet.be
Reported-by: Luke Macken <lmacken@redhat.com>
Signed-off-by: Jeff Garzik <jgarzik@redhat.com>
On configurations where IRQ line is shared with a different
controller, spurious IRQs may happen continuously. The message was
put there primarily for debugging anyway. Kill it.
Signed-off-by: Tejun Heo <tj@kernel.org>
Signed-off-by: Jeff Garzik <jgarzik@redhat.com>