Commit Graph

1192 Commits

Author SHA1 Message Date
Markus F.X.J. Oberhumer
66761522a7 [IA64] fix stack alignment for ia32 signal handlers
This fixes the setup of the alignment of the signal frame, so that all
signal handlers are run with a properly aligned stack frame.

The current code "over-aligns" the stack pointer so that the stack frame
is effectively always mis-aligned by 4 bytes.  But what we really want
is that on function entry ((sp + 4) & 15) == 0, which matches what would
happen if the stack were aligned before a "call" instruction.

i386 and x86_64 are already fixed by d347f37227

Signed-off-by: Markus F.X.J. Oberhumer <markus@oberhumer.com>
Signed-off-by: Tony Luck <tony.luck@intel.com>
2007-05-08 11:22:59 -07:00
Bernhard Walle
d217c265dc Add IRQF_IRQPOLL flag on IA64
Add IRQF_IRQPOLL for the timer interrupt on IA64.

Signed-off-by: Bernhard Walle <bwalle@suse.de>
Cc: "Luck, Tony" <tony.luck@intel.com>
Cc: Ingo Molnar <mingo@elte.hu>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Alan Cox <alan@lxorguk.ukuu.org.uk>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2007-05-08 11:15:22 -07:00
Ananth N Mavinakayanahalli
bf8f6e5b3e Kprobes: The ON/OFF knob thru debugfs
This patch provides a debugfs knob to turn kprobes on/off

o A new file /debug/kprobes/enabled indicates if kprobes is enabled or
  not (default enabled)
o Echoing 0 to this file will disarm all installed probes
o Any new probe registration when disabled will register the probe but
  not arm it. A message will be printed out in such a case.
o When a value 1 is echoed to the file, all probes (including ones
  registered in the intervening period) will be enabled
o Unregistration will happen irrespective of whether probes are globally
  enabled or not.
o Update Documentation/kprobes.txt to reflect these changes. While there
  also update the doc to make it current.

We are also looking at providing sysrq key support to tie to the disabling
feature provided by this patch.

[akpm@linux-foundation.org: Use bool like a bool!]
[akpm@linux-foundation.org: add printk facility levels]
[cornelia.huck@de.ibm.com: Add the missing arch_trampoline_kprobe() for s390]
Signed-off-by: Ananth N Mavinakayanahalli <ananth@in.ibm.com>
Signed-off-by: Srinivasa DS <srinivasa@in.ibm.com>
Signed-off-by: Cornelia Huck <cornelia.huck@de.ibm.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2007-05-08 11:15:19 -07:00
Christoph Hellwig
4c4308cb93 kprobes: kretprobes simplifications
- consolidate duplicate code in all arch_prepare_kretprobe instances
   into common code
 - replace various odd helpers that use hlist_for_each_entry to get
   the first elemenet of a list with either a hlist_for_each_entry_save
   or an opencoded access to the first element in the caller
 - inline add_rp_inst into it's only remaining caller
 - use kretprobe_inst_table_head instead of opencoding it

Signed-off-by: Christoph Hellwig <hch@lst.de>
Cc: Prasanna S Panchamukhi <prasanna@in.ibm.com>
Acked-by: Ananth N Mavinakayanahalli <ananth@in.ibm.com>
Cc: Anil S Keshavamurthy <anil.s.keshavamurthy@intel.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2007-05-08 11:15:19 -07:00
Bjorn Helgaas
873ec74615 EFI: warn only for pre-1.00 system tables
We used to warn unless the EFI system table major revision was exactly 1.
But EFI 2.00 firmware is starting to appear, and the 2.00 changes don't
affect anything in Linux.

Signed-off-by: Bjorn Helgaas <bjorn.helgaas@hp.com>
Cc: Andi Kleen <ak@suse.de>
Cc: "Luck, Tony" <tony.luck@intel.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2007-05-08 11:15:10 -07:00
Ananth N Mavinakayanahalli
0f95b7fc83 Kprobes: print details of kretprobe on assertion failure
In certain cases like when the real return address can't be found or when
the number of tracked calls to a kretprobed function is less than the
number of returns, we may not be able to find the correct return address
after processing a kretprobe.  Currently we just do a BUG_ON, but no
information is provided about the actual failing kretprobe.

Print out details of the kretprobe before calling BUG().

Signed-off-by: Ananth N Mavinakayanahalli <ananth@in.ibm.com>
Cc: Prasanna S Panchamukhi <prasanna@in.ibm.com>
Cc: Jim Keniston <jkenisto@us.ibm.com>
Cc: Anil S Keshavamurthy <anil.s.keshavamurthy@intel.com>
Cc: Maneesh Soni <maneesh@in.ibm.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2007-05-08 11:15:08 -07:00
Simon Horman
6672f76a5a kdump/kexec: calculate note size at compile time
Currently the size of the per-cpu region reserved to save crash notes is
set by the per-architecture value MAX_NOTE_BYTES.  Which in turn is
currently set to 1024 on all supported architectures.

While testing ia64 I recently discovered that this value is in fact too
small.  The particular setup I was using actually needs 1172 bytes.  This
lead to very tedious failure mode where the tail of one elf note would
overwrite the head of another if they ended up being alocated sequentially
by kmalloc, which was often the case.

It seems to me that a far better approach is to caclculate the size that
the area needs to be.  This patch does just that.

If a simpler stop-gap patch for ia64 to be squeezed into 2.6.21(.X) is
needed then this should be as easy as making MAX_NOTE_BYTES larger in
arch/asm-ia64/kexec.h.  Perhaps 2048 would be a good choice.  However, I
think that the approach in this patch is a much more robust idea.

Acked-by:  Vivek Goyal <vgoyal@in.ibm.com>
Signed-off-by: Simon Horman <horms@verge.net.au>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2007-05-08 11:15:07 -07:00
Randy Dunlap
e63340ae6b header cleaning: don't include smp_lock.h when not used
Remove includes of <linux/smp_lock.h> where it is not used/needed.
Suggested by Al Viro.

Builds cleanly on x86_64, i386, alpha, ia64, powerpc, sparc,
sparc64, and arm (all 59 defconfigs).

Signed-off-by: Randy Dunlap <randy.dunlap@oracle.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2007-05-08 11:15:07 -07:00
Christoph Hellwig
1eeb66a1bb move die notifier handling to common code
This patch moves the die notifier handling to common code.  Previous
various architectures had exactly the same code for it.  Note that the new
code is compiled unconditionally, this should be understood as an appel to
the other architecture maintainer to implement support for it aswell (aka
sprinkling a notify_die or two in the proper place)

arm had a notifiy_die that did something totally different, I renamed it to
arm_notify_die as part of the patch and made it static to the file it's
declared and used at.  avr32 used to pass slightly less information through
this interface and I brought it into line with the other architectures.

[akpm@linux-foundation.org: build fix]
[akpm@linux-foundation.org: fix vmalloc_sync_all bustage]
[bryan.wu@analog.com: fix vmalloc_sync_all in nommu]
Signed-off-by: Christoph Hellwig <hch@lst.de>
Cc: <linux-arch@vger.kernel.org>
Cc: Russell King <rmk@arm.linux.org.uk>
Signed-off-by: Bryan Wu <bryan.wu@analog.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2007-05-08 11:15:04 -07:00
Akinobu Mita
0e6b9c98be use SLAB_PANIC flag cleanup
Use SLAB_PANIC and delete duplicated panic().

Signed-off-by: Akinobu Mita <akinobu.mita@gmail.com>
Cc: Ian Molton <spyro@f2s.com>
Cc: David Howells <dhowells@redhat.com>
Cc: Andi Kleen <ak@suse.de>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Cc: Paul Mundt <lethal@linux-sh.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2007-05-08 11:14:57 -07:00
Yasunori Goto
a3142c8e1d Fix section mismatch of memory hotplug related code.
This is to fix many section mismatches of code related to memory hotplug.
I checked compile with memory hotplug on/off on ia64 and x86-64 box.

Signed-off-by: Yasunori Goto <y-goto@jp.fujitsu.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2007-05-08 11:14:57 -07:00
John Keller
0e17b56098 [IA64] - Altix: hotplug after intr redirect can crash system
When redirecting a device interrupt on SN, not all links
between platform specific structures are being updated.
This can result in a system crash if an interrupt
redirection is followed by an unplug of that device.

The complete fix also requires a prom update. Though,
this patch is backward compatable and not dependent on
the prom patch.

Signed-off-by: John Keller <jpk@sgi.com>
Signed-off-by: Tony Luck <tony.luck@intel.com>
2007-05-08 11:06:41 -07:00
Siddha, Suresh B
bb8416bf8b [IA64] save and restore cpus_allowed in cpu_idle_wait
Save and Restore the task's cpus_allowed mask, across the set_cpus_allowed()
in cpu_idle_wait(). Without this, we will endup corrupting task's cpu affinity.

Signed-off-by: Suresh Siddha <suresh.b.siddha@intel.com>
Signed-off-by: Tony Luck <tony.luck@intel.com>
2007-05-08 10:01:25 -07:00
Tony Luck
9d1d4cc89f [IA64] Removal of percpu TR cleanup in kexec code
The kexec code wasn't in the kernel when Ken Chen created
this patch (00b65985fb) to
change the mapping of percpu area from a TR to a TC.

Zou Nanhai spotted this extra piece, and Simon Horman
concurred.

Signed-off-by: Tony Luck <tony.luck@intel.com>
2007-05-08 10:00:28 -07:00
Tony Luck
0f7ac29e59 [IA64] Fix some section mismatch errors
Section mismatch: reference to ...

 .init.text:prefill_possible_map from .text between 'setup_per_cpu_areas' and 'cpu_init'
 .init.text:iosapic_override_isa_irq from .text between 'iosapic_init' and 'iosapic_remove'

Signed-off-by: Tony Luck <tony.luck@intel.com>
2007-05-07 13:17:00 -07:00
Linus Torvalds
a989705c4c Merge git://git.kernel.org/pub/scm/linux/kernel/git/aegl/linux-2.6
* git://git.kernel.org/pub/scm/linux/kernel/git/aegl/linux-2.6:
  [IA64] update memory attribute aliasing documentation & test cases
  [IA64] fail mmaps that span areas with incompatible attributes
  [IA64] allow WB /sys/.../legacy_mem mmaps
  [IA64] make ioremap avoid unsupported attributes
  [IA64] rename ioremap variables to match i386
  [IA64] relax per-cpu TLB requirement to DTC
  [IA64] remove per-cpu ia64_phys_stacked_size_p8
  [IA64] Fix example error injection program
  [IA64] Itanium MC Error Injection Tool: pal_mc_error_inject() interface
  [IA64] Itanium MC Error Injection Tool: Makefile changes
  [IA64] Itanium MC Error Injection Tool: Driver sysfs interface
  [IA64] Itanium MC Error Injection Tool: Doc and sample application
  [IA64] Itanium MC Error Injection Tool: Kernel configuration
2007-05-07 12:34:57 -07:00
Benjamin Herrenschmidt
afa37394d6 get_unmapped_area handles MAP_FIXED on ia64
Handle MAP_FIXED in ia64 arch_get_unmapped_area and
hugetlb_get_unmapped_area(), just call prepare_hugepage_range in the later and
is_hugepage_only_range() in the former.

Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Acked-by: William Irwin <bill.irwin@oracle.com>
Cc: "Luck, Tony" <tony.luck@intel.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2007-05-07 12:12:56 -07:00
Christoph Lameter
d85f33855c Make page->private usable in compound pages
If we add a new flag so that we can distinguish between the first page and the
tail pages then we can avoid to use page->private in the first page.
page->private == page for the first page, so there is no real information in
there.

Freeing up page->private makes the use of compound pages more transparent.
They become more usable like real pages.  Right now we have to be careful f.e.
 if we are going beyond PAGE_SIZE allocations in the slab on i386 because we
can then no longer use the private field.  This is one of the issues that
cause us not to support debugging for page size slabs in SLAB.

Having page->private available for SLUB would allow more meta information in
the page struct.  I can probably avoid the 16 bit ints that I have in there
right now.

Also if page->private is available then a compound page may be equipped with
buffer heads.  This may free up the way for filesystems to support larger
blocks than page size.

We add PageTail as an alias of PageReclaim.  Compound pages cannot currently
be reclaimed.  Because of the alias one needs to check PageCompound first.

The RFC for the this approach was discussed at
http://marc.info/?t=117574302800001&r=1&w=2

[nacc@us.ibm.com: fix hugetlbfs]
Signed-off-by: Christoph Lameter <clameter@sgi.com>
Signed-off-by: Nishanth Aravamudan <nacc@us.ibm.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2007-05-07 12:12:53 -07:00
Michael Ellerman
7fe3730de7 MSI: arch must connect the irq and the msi_desc
set_irq_msi() currently connects an irq_desc to an msi_desc. The archs call
it at some point in their setup routine, and then the generic code sets up the
reverse mapping from the msi_desc back to the irq.

set_irq_msi() should do both connections, making it the one and only call
required to connect an irq with it's MSI desc and vice versa.

The arch code MUST call set_irq_msi(), and it must do so only once it's sure
it's not going to fail the irq allocation.

Given that there's no need for the arch to return the irq anymore, the return
value from the arch setup routine just becomes 0 for success and anything else
for failure.

Signed-off-by: Michael Ellerman <michael@ellerman.id.au>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
2007-05-02 19:02:38 -07:00
Dan Williams
f282b97021 msi: introduce ARCH_SUPPORTS_MSI Kconfig option (rev2)
Allows architectures to advertise that they support MSI rather than listing
each architecture as a PCI_MSI dependency.

Signed-off-by: Dan Williams <dan.j.williams@intel.com>
Acked-by: "Eric W. Biederman" <ebiederm@xmission.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
2007-05-02 19:02:37 -07:00
Jean Delvare
6473d160b4 PCI: Cleanup the includes of <linux/pci.h>
I noticed that many source files include <linux/pci.h> while they do
not appear to need it. Here is an attempt to clean it all up.

In order to find all possibly affected files, I searched for all
files including <linux/pci.h> but without any other occurence of "pci"
or "PCI". I removed the include statement from all of these, then I
compiled an allmodconfig kernel on both i386 and x86_64 and fixed the
false positives manually.

My tests covered 66% of the affected files, so there could be false
positives remaining. Untested files are:

arch/alpha/kernel/err_common.c
arch/alpha/kernel/err_ev6.c
arch/alpha/kernel/err_ev7.c
arch/ia64/sn/kernel/huberror.c
arch/ia64/sn/kernel/xpnet.c
arch/m68knommu/kernel/dma.c
arch/mips/lib/iomap.c
arch/powerpc/platforms/pseries/ras.c
arch/ppc/8260_io/enet.c
arch/ppc/8260_io/fcc_enet.c
arch/ppc/8xx_io/enet.c
arch/ppc/syslib/ppc4xx_sgdma.c
arch/sh64/mach-cayman/iomap.c
arch/xtensa/kernel/xtensa_ksyms.c
arch/xtensa/platform-iss/setup.c
drivers/i2c/busses/i2c-at91.c
drivers/i2c/busses/i2c-mpc.c
drivers/media/video/saa711x.c
drivers/misc/hdpuftrs/hdpu_cpustate.c
drivers/misc/hdpuftrs/hdpu_nexus.c
drivers/net/au1000_eth.c
drivers/net/fec_8xx/fec_main.c
drivers/net/fec_8xx/fec_mii.c
drivers/net/fs_enet/fs_enet-main.c
drivers/net/fs_enet/mac-fcc.c
drivers/net/fs_enet/mac-fec.c
drivers/net/fs_enet/mac-scc.c
drivers/net/fs_enet/mii-bitbang.c
drivers/net/fs_enet/mii-fec.c
drivers/net/ibm_emac/ibm_emac_core.c
drivers/net/lasi_82596.c
drivers/parisc/hppb.c
drivers/sbus/sbus.c
drivers/video/g364fb.c
drivers/video/platinumfb.c
drivers/video/stifb.c
drivers/video/valkyriefb.c
include/asm-arm/arch-ixp4xx/dma.h
sound/oss/au1550_ac97.c

I would welcome test reports for these files. I am fine with removing
the untested files from the patch if the general opinion is that these
changes aren't safe. The tested part would still be nice to have.

Note that this patch depends on another header fixup patch I submitted
to LKML yesterday:
  [PATCH] scatterlist.h needs types.h
  http://lkml.org/lkml/2007/3/01/141

Signed-off-by: Jean Delvare <khali@linux-fr.org>
Cc: Badari Pulavarty <pbadari@us.ibm.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
2007-05-02 19:02:35 -07:00
Tony Luck
d29182534c Pull mem-attribute into release branch 2007-04-30 13:56:17 -07:00
Tony Luck
b643b0fdbc Pull percpu-dtc into release branch 2007-04-30 13:56:00 -07:00
Tony Luck
e0cc09e295 Pull error-inject into release branch 2007-04-30 13:55:43 -07:00
David Howells
b1bdb691c3 [AF_RXRPC/AFS]: Arch-specific fixes.
Fixes for various arch compilation problems:

 (*) Missing module exports.

 (*) Variable name collision when rxkad and af_rxrpc both built in
     (rxrpc_debug).

 (*) Large constant representation problem (AFS_UUID_TO_UNIX_TIME).

 (*) Configuration dependencies.

 (*) printk() format warnings.

Signed-off-by: David Howells <dhowells@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2007-04-27 15:28:45 -07:00
Arnaldo Carvalho de Melo
27d7ff46a3 [SK_BUFF]: Introduce skb_copy_to_linear_data{_offset}
To clearly state the intent of copying to linear sk_buffs, _offset being a
overly long variant but interesting for the sake of saving some bytes.

Signed-off-by: Arnaldo Carvalho de Melo <acme@ghostprotocols.net>
2007-04-25 22:28:29 -07:00
Arnaldo Carvalho de Melo
d626f62b11 [SK_BUFF]: Introduce skb_copy_from_linear_data{_offset}
To clearly state the intent of copying from linear sk_buffs, _offset being a
overly long variant but interesting for the sake of saving some bytes.

Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2007-04-25 22:28:23 -07:00
Arnaldo Carvalho de Melo
4305b54135 [SK_BUFF]: Convert skb->end to sk_buff_data_t
Now to convert the last one, skb->data, that will allow many simplifications
and removal of some of the offset helpers.

Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2007-04-25 22:26:29 -07:00
Arnaldo Carvalho de Melo
27a884dc3c [SK_BUFF]: Convert skb->tail to sk_buff_data_t
So that it is also an offset from skb->head, reduces its size from 8 to 4 bytes
on 64bit architectures, allowing us to combine the 4 bytes hole left by the
layer headers conversion, reducing struct sk_buff size to 256 bytes, i.e. 4
64byte cachelines, and since the sk_buff slab cache is SLAB_HWCACHE_ALIGN...
:-)

Many calculations that previously required that skb->{transport,network,
mac}_header be first converted to a pointer now can be done directly, being
meaningful as offsets or pointers.

Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2007-04-25 22:26:28 -07:00
Arnaldo Carvalho de Melo
4c13eb6657 [ETH]: Make eth_type_trans set skb->dev like the other *_type_trans
One less thing for drivers writers to worry about.

Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2007-04-25 22:24:30 -07:00
Mike Habeck
2e0d232bff [IA64] SGI Altix : fix pcibr_dmamap_ate32() bug
On a SGI Altix TIOCP based PCI bus we need to include the ATE_PIO attribute
bit if we're mapping a 32bit MSI address.

Signed-off-by: Mike Habeck <habeck@sgi.com>
Signed-off-by: Tony Luck <tony.luck@intel.com>
2007-04-06 15:38:12 -07:00
Venki Pallipadi
8a3a78d149 [IA64] Fix CPU freq displayed in /proc/cpuinfo
My patch: git commit=95235ca2c20ac0b31a8eb39e2d599bcc3e9c9a10 introduced a bug
in IA64 cpuinfo output.

Patch changed the proc_freq from 1HZ resolution to 1KHz resolution, but left
format string unchanged at " %lu.%06lu". Below is the fix.

Thanks to Bjorn for catching this.

Signed-off-by: Venkatesh Pallipadi <venkatesh.pallipadi@intel.com>
Signed-off-by: Tony Luck <tony.luck@intel.com>
2007-04-06 15:37:45 -07:00
Ishimatsu Yasuaki
9438a1218e [IA64] Fix wrong assumption about irq and vector in msi_ia64.c
This patch fixes a wrong assumption in ia64 MSI code that IRQ equals
vector.

Signed-off-by: Kenji Kaneshige <kaneshige.kenji@jp.fujitsu.com>
Signed-off-by: Yasuaki Ishimatsu <isimatu.yasuaki@jp.fujitsu.com>
Signed-off-by: Tony Luck <tony.luck@intel.com>
2007-04-06 15:37:06 -07:00
Russ Anderson
fbff71e1ec [IA64] BTE error timer fix
The bte recovery_timer was not being set correctly.

Signed-off-by: Russ Anderson <rja@sgi.com>
Signed-off-by: Tony Luck <tony.luck@intel.com>
2007-04-06 15:31:33 -07:00
Bjorn Helgaas
6d40fc514c [IA64] fail mmaps that span areas with incompatible attributes
Example memory map (from HP sx1000 with VGA enabled):
    0x00000 - 0x9FFFF supports only WB (cacheable) access
    0xA0000 - 0xBFFFF supports only UC (uncacheable) access
    0xC0000 - 0xFFFFF supports only WB (cacheable) access

Some versions of X map the entire 0x00000-0xFFFFF area at once.  With the
example above, this mmap must fail because there's no memory attribute that's
safe for the entire area.

Prior to this patch, we performed the mmap with a UC mapping.  When X
accessed the WB memory at 0xC0000, it caused an MCA.  The crash can happen
when mapping 0xC0000 from either /dev/mem or a /sys/.../legacy_mem file.

Signed-off-by: Bjorn Helgaas <bjorn.helgaas@hp.com>
Signed-off-by: Tony Luck <tony.luck@intel.com>
2007-03-30 09:38:25 -07:00
Bjorn Helgaas
2cb22e23a5 [IA64] allow WB /sys/.../legacy_mem mmaps
Allow cacheable mmaps of legacy_mem if WB access is supported for the region.
The "legacy_mem" file often contains a shadow option ROM, and some versions of
X depend on this.

Tim Yamin <plasm@roo.me.uk> reported that this change fixes X on a Dell
PowerEdge 3250.

Signed-off-by: Bjorn Helgaas <bjorn.helgaas@hp.com>
Signed-off-by: Tony Luck <tony.luck@intel.com>
2007-03-30 09:38:03 -07:00
Bjorn Helgaas
9b50ffb0c0 [IA64] make ioremap avoid unsupported attributes
Example memory map (from HP sx1000 with VGA enabled):
    0x00000 - 0x9FFFF supports only WB (cacheable) access
    0xA0000 - 0xBFFFF supports only UC (uncacheable) access
    0xC0000 - 0xFFFFF supports only WB (cacheable) access

pci_read_rom() indirectly uses ioremap(0xC0000) to read the shadow VGA option
ROM.  ioremap() used to default to a 16MB or 64MB UC kernel identity mapping,
which would cause an MCA when reading 0xC0000 since only WB is supported there.

X uses reads the option ROM to initialize devices.  A smaller test case is:
  # echo 1 > /sys/bus/pci/devices/0000:aa:03.0/rom
  # cp /sys/bus/pci/devices/0000:aa:03.0/rom x

To avoid this, we can use the same ioremap_page_range() strategy that most
architectures use for all ioremaps.  These page table mappings come out of the
vmalloc area.  On ia64, these are in region 5 (0xA... addresses) and typically
use 16KB or 64KB mappings instead of 16MB or 64MB mappings.  The smaller
mappings give more flexibility to use the correct attributes.

Signed-off-by: Bjorn Helgaas <bjorn.helgaas@hp.com>
Signed-off-by: Tony Luck <tony.luck@intel.com>
2007-03-30 09:37:41 -07:00
Bjorn Helgaas
c4add2e537 [IA64] rename ioremap variables to match i386
No functional change, just use the same names as i386.

Signed-off-by: Bjorn Helgaas <bjorn.helgaas@hp.com>
Signed-off-by: Tony Luck <tony.luck@intel.com>
2007-03-30 09:37:17 -07:00
Tony Luck
dbfc2f6f95 [IA64] Fix arch/ia64/pci/pci.c:571: warning: `return' with a value
Typo/thinko in bba6f6fc68

Signed-off-by: Tony Luck <tony.luck@intel.com>
2007-03-29 15:41:37 -07:00
Jack Steiner
ead6caae1e [IA64] Speed up boot - skip unnecessary clock calibration
Skip clock calibration if cpu being brought online is exactly the same
speed, stepping, etc., as the previous cpu. This significantly reduces
the time to boot very large systems.

Signed-off-by: Jack Steiner <steiner@sgi.com>
Signed-off-by: Tony Luck <tony.luck@intel.com>
2007-03-29 15:17:11 -07:00
KAMEZAWA Hiroyuki
83d2cd3de4 [IA64] bugfix stack layout upside-down
ia64 expects following vm layout:

== low memory
[register-stack grows up]
[memory-stack grows down]
== high memory

But the code assigns the base of the register stack at the
maximum stack size offset from the fixed address where the
stack *might* start.  Stack randomization will result in the
memory stack starting at a lower address than this, and if the
user has set a low stack limit with "ulimit -s", then you can
end up with the register stack above the memory stack (or if
you were very unlucky right on top of it!).

Fix: Calculate the base address for the register stack starting
from the actual address of the memory stack.

Signed-off-by: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>
Signed-off-by: Tony Luck <tony.luck@intel.com>
2007-03-29 15:15:24 -07:00
Kenji Kaneshige
8a3a0ee736 [IA64] Fix possible invalid memory access in ia64_setup_msi_irq()
The following 'if' statement in ia64_setup_msi_irq() always fails even
if create_irq() returns <0 value, because variable 'irq' is defined as
unsigned int. It would cause invalid memory access.

        irq = create_irq();
        if (irq < 0)
                return irq;

Signed-off-by: Kenji Kaneshige <kaneshige.kenji@jp.fujitsu.com>
Signed-off-by: Tony Luck <tony.luck@intel.com>
2007-03-29 15:02:58 -07:00
Eric W. Biederman
bba6f6fc68 [PATCH] MSI-X: fix resume crash
So I think the right solution is to simply make pci_enable_device just
flip enable bits and move the rest of the work someplace else.

However a thorough cleanup is a little extreme for this point in the
release cycle, so I think a quick hack that makes the code not stomp the
irq when msi irq's are enabled should be the first fix.  Then we can
later make the code not change the irqs at all.

Signed-off-by: Eric W. Biederman <ebiederm@xmission.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2007-03-28 13:59:37 -07:00
Linus Torvalds
37c70d0d09 Merge branch 'release' of git://git.kernel.org/pub/scm/linux/kernel/git/lenb/linux-acpi-2.6
* 'release' of git://git.kernel.org/pub/scm/linux/kernel/git/lenb/linux-acpi-2.6:
  ACPI: IA64: fix %ll build warnings
  ACPI: IA64: fix allnoconfig build
  ACPI: Only use IPI on known broken machines (AMD, Dothan/BaniasPentium M)
  ACPI: ibm-acpi: allow module to load when acpi notifiers can't be set (v2)
  ACPI: parse 2nd MADT by default
  ACPICA: revert "acpi_serialize" changes
  sony-laptop: MAINTAINERS fix entry, add L: and W:
  ACPI: resolve HP nx6125 S3 immediate wakeup regression
  ACPI: Add support to parse 2nd MADT
2007-03-22 19:43:02 -07:00
Bernhard Walle
58a69c367c [IA64] Fix wrong /proc/iomem on SGI Altix
In sn_io_slot_fixup(), the parent is re-set from the bus to
io(port|mem)_resource because the address is changed in a way that it's not
child of the bus any more.

However, only the root is set but not the parent/child/sibling relationship in
the resource tree which causes 'cat /proc/iomem' to stop after this memory
area. Depding on the poition in the tree the iomem may be nearly completely
empty.

Signed-off-by: Bernhard Walle <bwalle@suse.de>
Acked-by: John Keller <jpk@sgi.com>
Signed-off-by: Tony Luck <tony.luck@intel.com>
2007-03-20 13:54:44 -07:00
John Keller
0bdfc19007 [IA64] Altix: ioremap vga_console_iobase
When booting an SN system without specifing a console
(i.e., no "console=" on boot line), the system will hang during
boot at the point where /sbin/init is run.

The problem is that vga_console_iobase is not converted to a
virtual address before storing in io_space[0].mmio_base.
The conversion was happening in sn_scan_pcdp(), but not in
setup_vga_console().

Signed-off-by: John Keller <jpk@sgi.com>
Signed-off-by: Tony Luck <tony.luck@intel.com>
2007-03-20 13:49:53 -07:00
Jay Lan
60b548dfe4 [IA64] Fix typo/thinko in crash.c
Clearly should be checking for "val == DIE_INIT_SLAVE_ENTER".

Signed-off-by: Jay Lan <jlan@sgi.com>
Acked-by: Simon Horman <horms@verge.net.au>
Signed-off-by: Tony Luck <tony.luck@intel.com>
2007-03-20 13:47:47 -07:00
Jack Steiner
c5e83e3f42 [IA64] Fix get_model_name() for mixed cpu type systems
If a system consists of mixed processor types, kmalloc()
can be called before the per-cpu data page is initialized.
If the slab contains sufficient memory, then kmalloc() works
ok. However, if the slabs are empty, slab calls the memory
allocator. This requires per-cpu data (NODE_DATA()) & the
cpu dies.

Also noted by Russ Anderson who had a very similar patch.

Signed-off-by: Jack Steiner <steiner@sgi.com>
Signed-off-by: Tony Luck <tony.luck@intel.com>
2007-03-20 13:42:23 -07:00
Zou Nan hai
a3f5c338b9 [IA64] min_low_pfn and max_low_pfn calculation fix
We have seen bad_pte_print when testing crashdump on an SN machine in
recent 2.6.20 kernel.  There are tons of bad pte print (pfn < max_low_pfn)
reports when the crash kernel boots up, all those reported bad pages
are inside initmem range; That is because if the crash kernel code and
data happens to be at the beginning of the 1st node. build_node_maps in
discontig.c will bypass reserved regions with filter_rsvd_memory. Since
min_low_pfn is calculated in build_node_map, so in this case, min_low_pfn
will be greater than kernel code and data.

Because pages inside initmem are freed and reused later, we saw
pfn_valid check fail on those pages.

I think this theoretically happen on a normal kernel. When I check
min_low_pfn and max_low_pfn calculation in contig.c and discontig.c.
I found more issues than this.

1. min_low_pfn and max_low_pfn calculation is inconsistent between
contig.c and discontig.c,
min_low_pfn is calculated as the first page number of boot memmap in
contig.c (Why? Though this may work at the most of the time, I don't
think it is the right logic). It is calculated as the lowest physical
memory page number bypass reserved regions in discontig.c.
max_low_pfn is calculated include reserved regions in contig.c. It is
calculated exclude reserved regions in discontig.c.

2. If kernel code and data region is happen to be at the begin or the
end of physical memory, when min_low_pfn and max_low_pfn calculation is
bypassed kernel code and data, pages in initmem will report bad.

3. initrd is also in reserved regions, if it is at the begin or at the
end of physical memory, kernel will refuse to reuse the memory. Because
the virt_addr_valid check in free_initrd_mem.

So it is better to fix and clean up those issues.
Calculate min_low_pfn and max_low_pfn in a consistent way.

Signed-off-by:	Zou Nan hai <nanhai.zou@intel.com>
Acked-by: Jay Lan <jlan@sgi.com>
Signed-off-by: Tony Luck <tony.luck@intel.com>
2007-03-20 13:41:57 -07:00
Len Brown
54b8c39fbd Pull misc-for-upstream into release branch 2007-03-20 11:05:41 -04:00
Len Brown
8140a90ec1 ACPI: IA64: fix allnoconfig build
The evils of Kconfig's select bite us once again...
ia64/Kconfig selects ACPI, which depends on PM.
But select ignores dependencies, allnoconfig
chooses CONFIG_PM=n, and thus the menu of sub-options
under ACPI vanish, which breaks the build.

Manually select PM along with ACPI for now.
Some day, we should delete them both, or fix select.

Cc: Tony Luck <tony.luck@intel.com>
Signed-off-by: Len Brown <len.brown@intel.com>
2007-03-19 23:41:51 -04:00
Bernhard Walle
6471572559 [PATCH] Fix wrong /proc/iomem on SGI Altix
In sn_io_slot_fixup(), the parent is re-set from the bus to
io(port|mem)_resource because the address is changed in a way that it's not
child of the bus any more.

However, only the root is set but not the parent/child/sibling relationship
in the resource tree which causes 'cat /proc/iomem' to stop after this
memory area.  Depding on the poition in the tree the iomem may be nearly
completely empty.

Signed-off-by: Bernhard Walle <bwalle@suse.de>
Cc: John Keller <jpk@sgi.com>
Cc: Jay Lan <jlan@engr.sgi.com>
Acked-by: "Luck, Tony" <tony.luck@intel.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2007-03-18 11:35:07 -07:00
Linus Torvalds
f4cd87aabb Merge branch 'release' of git://git.kernel.org/pub/scm/linux/kernel/git/aegl/linux-2.6
* 'release' of git://git.kernel.org/pub/scm/linux/kernel/git/aegl/linux-2.6:
  [IA64] refresh config files
  [IA64] put kdump_find_rsvd_region in __init
  [IA64] Remove sparse warning from unwind code
  [IA64] add missing syscall trace clear
  [IA64] Cleanup in crash.c
  [IA64] kexec: declare ia64_mca_pal_base in mca.h rather than kexec.h
  [IA64] pci_get_legacy_ide_irq should return irq (not GSI)
  [IA64] whitespace fixes for include/asm-ia64/sal.h
  [IA64] Cache error recovery
  [IA64] Proper handling of TLB errors from duplicate itr.d dropins
2007-03-09 22:00:51 -08:00
Len Brown
c207908fcc Pull altix into release branch 2007-03-09 23:17:39 -05:00
Tony Luck
e3a696e03c [IA64] refresh config files
Bring defconfig, tiger_defconfig and zx1_defconfig up to date. Also
sprinkle KEXEC and KDUMP combinations around liberally so that my
usual regression test builds will see all combinations:

 tiger_defconfig gets KEXEC=y, CRASH_DUMP=n
 zx1_defconfig   gets KEXEC=n, CRASH_DUMP=y
 defconfig       gets KEXEC=y, CRASH_DUMP=y
 others remain at     KEXEC=n, CRASH_DUMP=n

Signed-off-by: Tony Luck <tomy.luck@intel.com>
2007-03-08 11:20:17 -08:00
Horms
2a3a2827c7 [IA64] put kdump_find_rsvd_region in __init
kdump_find_rsvd_region() is only called by
reserve_memory() which is in __init, so it seems that
kdump_find_rsvd_region() should also be in there.

Signed-off-by: Simon Horman <horms@verge.net.au>
Signed-off-by: Tony Luck <tony.luck@intel.com>
2007-03-08 10:29:58 -08:00
Akiyama, Nobuyuki
8e43d75ad0 [IA64] add missing syscall trace clear
The ptrace misses clearing the syscall trace flag.
The increased syscall overhead is retained after the trace is finished.
This case happens when strace is terminated by force.

Signed-off-by: Akiyama, Nobuyuki <akiyama.nobuyuk@jp.fujitsu.com>
Signed-off-by: Tony Luck <tony.luck@intel.com>
2007-03-08 10:27:24 -08:00
Simon Horman
0ac1faca4a [IA64] Cleanup in crash.c
Grammatical fixes (s/freezed/frozen/)
Make some variables static
Change a C++ "//" comment to "/* ... */"

Signed-off-by: Simon Horman <horms@verge.net.au>
Signed-off-by: Tony Luck <tony.luck@intel.com>
2007-03-08 10:25:06 -08:00
Russ Anderson
396e8e76c5 [IA64] Cache error recovery
Similar to memory error recovery, when a cache error is consumed
by a user process terminate the user instead of crashing the system.

Signed-off-by: Russ Anderson (rja@sgi.com)
Acked-by: Hidetoshi Seto <seto.hidetoshi@jp.fujitsu.com>
Signed-off-by: Tony Luck <tony.luck@intel.com>
2007-03-08 09:44:45 -08:00
Russ Anderson
618b206f0b [IA64] Proper handling of TLB errors from duplicate itr.d dropins
Jack Steiner noticed that duplicate TLB DTC entries do not cause a
linux panic.  See discussion:

http://www.gelato.unsw.edu.au/archives/linux-ia64/0307/6108.html

The current TLB recovery code is recovering from the duplicate itr.d
dropins, masking the underlying problem.  This change modifies
the MCA recovery code to look for the TLB check signature of the
duplicate TLB entry and panic in that case.

Signed-off-by: Russ Anderson (rja@sgi.com)
Signed-off-by: Tony Luck <tony.luck@intel.com>
2007-03-08 09:41:46 -08:00
Fenghua Yu
3bc207d2b7 [IA64] fsys_getcpu for IA64
On 1.6GHz Montectio Tiger4, the following performance data is measured with
kernel built with defconfig which has NUMA configured:

Fastest sys_getcpu: 502 itc counts.
Fastest fsys_getcpu: 28 itc counts.

fsys_getcpu performance is largly impacted by whether data (node_to_cpu_map
etc) is in cache. It can take fsys_getcpu up to ~150 itc counts in cold
cache case.

Signed-off-by: Fenghua Yu <fenghua.yu@intel.com>
Signed-off-by: Tony Luck <tony.luck@intel.com>
2007-03-07 16:27:09 -08:00
Horms
ddbad07630 [IA64] remove duplicate declaration of efi_initialize_iomem_resources
efi_initialize_iomem_resources() is declared in both include/linux/efi.h
and arch/ia64/kernel/setup.c. This patch removes the latter.

Signed-off-by: Simon Horman <horms@verge.net.au>
Signed-off-by: Tony Luck <tony.luck@intel.com>
2007-03-07 16:18:38 -08:00
Tony Luck
e55fdf11f3 [IA64] Pick highest possible saved_max_pfn for crash_dump
Berhhard Walle noted that on his HP rx8640 he ended up with saved_max_pfn
smaller than the highest address of system ram in /proc/iomem and proposed
a patch to base the address on the unrounded and unfiltered EFI memory
map address.  Simon Horman and Magnus Damm suggested that the whole test
be moved earlier in the function.  This is the combination of both of
these patches.

Signed-off-by: Tony Luck <tony.luck@intel.com>
2007-03-07 16:13:25 -08:00
KAMEZAWA Hiroyuki
e253eb0c08 [IA64] fix NULL pointer in ia64/irq_chip-mask/unmask function
This patch fixes boot failure because irq_desc->mask() is NULL.

- Added mask/unmask functions to ia64's irq desc function table.
- rename hw_interrupt_type to irq_chip. hw_interrupt_type is old name.
- Tony: Added same change to arch/ia64/sn/kernel/irq.c as pointed out
  by Eric Biederman ... mask/unmask functions there can be no-op.

Signed-off-by: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Tony Luck <tony.luck@intel.com>
2007-03-07 14:57:35 -08:00
Magnus Damm
cee87af2a5 [IA64] kexec: Use EFI_LOADER_DATA for ELF core header
The address where the ELF core header is stored is passed to the secondary
kernel as a kernel command line option.  The memory area for this header is
also marked as a separate EFI memory descriptor on ia64.

The separate EFI memory descriptor is at the moment of the type
EFI_UNUSABLE_MEMORY.  With such a type the secondary kernel skips over the
entire memory granule (config option, 16M or 64M) when detecting memory.
If we are lucky we will just lose some memory, but if we happen to have
data in the same granule (such as an initramfs image), then this data will
never get mapped and the kernel bombs out when trying to access it.

So this is an attempt to fix this by changing the EFI memory descriptor
type into EFI_LOADER_DATA.  This type is the same type used for the kernel
data and for initramfs.  In the secondary kernel we then handle the ELF
core header data the same way as we handle the initramfs image.

This patch contains the kernel changes to make this happen.  Pretty
straightforward, we reserve the area in reserve_memory().  The address for
the area comes from the kernel command line and the size comes from the
specialized EFI parsing function vmcore_find_descriptor_size().

The kexec-tools-testing code for this can be found here:
http://lists.osdl.org/pipermail/fastboot/2007-February/005983.html

Signed-off-by: Magnus Damm <magnus@valinux.co.jp>
Cc: Simon Horman <horms@verge.net.au>
Cc: Vivek Goyal <vgoyal@in.ibm.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Tony Luck <tony.luck@intel.com>
2007-03-06 14:50:33 -08:00
Nick Piggin
41d5e5d73e [IA64] permon use-after-free fix
Perfmon associates vmalloc()ed memory with a file descriptor, and installs
a vma mapping that memory.  Unfortunately, the vm_file field is not filled
in, so processes with mappings to that memory do not prevent the file from
being closed and the memory freed.  This results in use-after-free bugs and
multiple freeing of pages, etc.

I saw this bug on an Altix on SLES9.  Haven't reproduced upstream but it
looks like the same issue is there.

Signed-off-by: Nick Piggin <npiggin@suse.de>
Cc: Stephane Eranian <eranian@hpl.hp.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Tony Luck <tony.luck@intel.com>
2007-03-06 14:49:52 -08:00
Alexandr Andreev
50157b09b3 [IA64] sync compat getdents
Add VERIFY_WRITE check in the beginning like compat_sys_getdents() (EINVAL vs
EFAULT).

Signed-off-by: Alexandr Andreev <aandreev@openvz.org>
Signed-off-by: Alexey Dobriyan <adobriyan@openvz.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Tony Luck <tony.luck@intel.com>
2007-03-06 14:49:24 -08:00
Lee Schermerhorn
a27e5a13d5 [IA64] always build arch/ia64/lib/xor.o
Always build ia64 xor.o because multiple config options now depend on it.

Necessary to build .20-mm* on ia64 when, e.g., CONFIG_ASYNC_TX_DMA is
defined.  Don't know if '_ASYNC_TX_DMA makes sense on ia64.  If not, maybe
Kconfig should preclude it.

Could have defined a Kconfig option that defaults to true if MD_RAID456 ||
ASYNC_TX_DMA to control building of xor.o, but xor.o is only 848 bytes and
this IS ia64...

Signed-off-by:  Lee Schermerhorn <lee.schermerhorn@hp.com>
Cc: Bob Picco <bob.picco@hp.com>
Cc: Eric Whitney <eric.whitney@hp.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Tony Luck <tony.luck@intel.com>
2007-03-06 14:48:52 -08:00
Horms
f4a570997e [IA64] point saved_max_pfn to the max_pfn of the entire system
Make saved_max_pfn point to max_pfn of entire system.

Without this patch is so that vmcore is zero length on ia64.  This is
because saved_max_pfn was wrongly being set to the max_pfn of the crash
kernel's address space, rather than the max_pfg on the physical memory of
the machine - the whole purpose of vmcore is to access physical memory that
is not part of the crash kernel's addresss space.

Signed-off-by: Simon Horman <horms@verge.net.au>
Signed-off-by: Zou Nan hai <nanhai.zou@intel.com>
Sort-Of-Acked-By: Jay Lan <jlan@sgi.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Tony Luck <tony.luck@intel.com>
2007-03-06 14:47:54 -08:00
John Keller
3fd0b2d9ad ACPI: Altix: reinitialize acpi tables
To provide compatibilty with SN kernels that do and do not
have ACPI IO support, the SN PROM must build different
versions of some ACPI tables based on which kernel is booting.
As such, the tables may have to change at kernel boot time.
By default, prior to kernel boot, the PROM builds an empty
DSDT (header only) and no SSDTs. If an ACPI capable kernel
boots, the kernel will notify the PROM, at platform setup time,
and the PROM will build full DSDT and SSDT tables.

With the latest changes to acpi_table_init(), the table lengths
are saved, and when our PROM changes them, the changes are not seen,
and the kernel will crash on boot. Because of issues with kexec support,
we are not able to create the tables prior to acpi_table_init().
As a result, we are making a second call to acpi_table_init() to
process the rebuilt DSDT and SSDTs.

Signed-off-by: John Keller <jpk@sgi.com>
Signed-off-by: Len Brown <len.brown@intel.com>
2007-03-01 02:32:58 -05:00
Eric W. Biederman
9f0a5ba550 [PATCH] irq: Remove set_native_irq_info
This patch replaces all instances of "set_native_irq_info(irq, mask)"
with "irq_desc[irq].affinity = mask".  The latter form is clearer
uses fewer abstractions, and makes access to this field uniform
accross different architectures.

Signed-off-by: Eric W. Biederman <ebiederm@xmission.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2007-02-26 10:34:07 -08:00
John Keller
690b8d9d54 ACPI: Altix: cannot register acpi bus driver before bus scan
SN code to initialize the Hub/TIO infrastructure needs to
execute before bus scanning. This was previously done with
an early call to acpi_bus_register_driver().  But now that
ACPI is using the Linux driver model, a driver cannot be registered
that early. Make changes to have the init routines invoked via
calls to acpi_get_devices().

Signed-off-by: John Keller <jpk@sgi.com>
Signed-off-by: Len Brown <len.brown@intel.com>
2007-02-23 23:06:59 -05:00
Linus Torvalds
874ff01bd9 Merge git://git.kernel.org/pub/scm/linux/kernel/git/bunk/trivial
* git://git.kernel.org/pub/scm/linux/kernel/git/bunk/trivial: (25 commits)
  Documentation/kernel-docs.txt update.
  arch/cris: typo in KERN_INFO
  Storage class should be before const qualifier
  kernel/printk.c: comment fix
  update I/O sched Kconfig help texts - CFQ is now default, not AS.
  Remove duplicate listing of Cris arch from README
  kbuild: more doc. cleanups
  doc: make doc. for maxcpus= more visible
  drivers/net/eexpress.c: remove duplicate comment
  add a help text for BLK_DEV_GENERIC
  correct a dead URL in the IP_MULTICAST help text
  fix the BAYCOM_SER_HDX help text
  fix SCSI_SCAN_ASYNC help text
  trivial documentation patch for platform.txt
  Fix typos concerning hierarchy
  Fix comment typo "spin_lock_irqrestore".
  Fix misspellings of "agressive".
  drivers/scsi/a100u2w.c: trivial typo patch
  Correct trivial typo in log2.h.
  Remove useless FIND_FIRST_BIT() macro from cardbus.c.
  ...
2007-02-19 13:29:02 -08:00
Robert P. J. Day
85d1fe095c Fix comment typo "spin_lock_irqrestore".
Fix "spin_lock_irqrestore" to "spin_unlock_irqrestore."

Signed-off-by: Robert P. J. Day <rpjday@mindspring.com>
Signed-off-by: Adrian Bunk <bunk@stusta.de>
2007-02-17 19:21:17 +01:00
Len Brown
c0cd79d114 Pull fluff into release branch
Conflicts:

	arch/x86_64/pci/mmconfig.c
	drivers/acpi/bay.c

Signed-off-by: Len Brown <len.brown@intel.com>
2007-02-16 22:10:32 -05:00
John Keller
db2d4ccdc8 ACPI: IA64: react to acpi_table_parse() return value change
acpi_boot_init() is making a bad check on the return
status from acpi_table_parse(). acpi_table_parse() now
returns zero on success, one on failure.

Signed-off-by: Aaron Young <ayoung@sgi.com>
Signed-off-by: Len Brown <len.brown@intel.com>
2007-02-16 22:07:36 -05:00
Zhang, Yanmin
9f271d576a ATA convert GSI to irq on ia64
If an ATA drive uses legacy mode, ata driver will choose 14 and 15
as the fixed irq number. On ia64 platform, such numbers are GSI and
should be converted to irq vector.

Below patch against kernel 2.6.20 fixes it.

Signed-off-by: Zhang Yanmin <yanmin.zhang@intel.com>
Signed-off-by: Jeff Garzik <jeff@garzik.org>
2007-02-15 18:04:53 -05:00
Eric W. Biederman
0b4d414714 [PATCH] sysctl: remove insert_at_head from register_sysctl
The semantic effect of insert_at_head is that it would allow new registered
sysctl entries to override existing sysctl entries of the same name.  Which is
pain for caching and the proc interface never implemented.

I have done an audit and discovered that none of the current users of
register_sysctl care as (excpet for directories) they do not register
duplicate sysctl entries.

So this patch simply removes the support for overriding existing entries in
the sys_sysctl interface since no one uses it or cares and it makes future
enhancments harder.

Signed-off-by: Eric W. Biederman <ebiederm@xmission.com>
Acked-by: Ralf Baechle <ralf@linux-mips.org>
Acked-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
Cc: Russell King <rmk@arm.linux.org.uk>
Cc: David Howells <dhowells@redhat.com>
Cc: "Luck, Tony" <tony.luck@intel.com>
Cc: Ralf Baechle <ralf@linux-mips.org>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Martin Schwidefsky <schwidefsky@de.ibm.com>
Cc: Andi Kleen <ak@muc.de>
Cc: Jens Axboe <axboe@kernel.dk>
Cc: Corey Minyard <minyard@acm.org>
Cc: Neil Brown <neilb@suse.de>
Cc: "John W. Linville" <linville@tuxdriver.com>
Cc: James Bottomley <James.Bottomley@steeleye.com>
Cc: Jan Kara <jack@ucw.cz>
Cc: Trond Myklebust <trond.myklebust@fys.uio.no>
Cc: Mark Fasheh <mark.fasheh@oracle.com>
Cc: David Chinner <dgc@sgi.com>
Cc: "David S. Miller" <davem@davemloft.net>
Cc: Patrick McHardy <kaber@trash.net>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2007-02-14 08:09:59 -08:00
Eric W. Biederman
4e00990118 [PATCH] sysctl: C99 convert arch/ia64/kernel/perfmon and remove ABI breakage
This convters the sysctl ctl_tables to use C99 initializers.  While I was
looking at it I discovered it was using a portion of the sysctl binary
addresses space under CTL_KERN KERN_OSTYPE which was completely inappropriate.
 So I completely removed all of the sysctl binary names, to remove and avoid
the ABI conflict.

Signed-off-by: Eric W. Biederman <ebiederm@xmission.com>
Cc: "Luck, Tony" <tony.luck@intel.com>
Cc: Stephane Eranian <eranian@hpl.hp.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2007-02-14 08:09:56 -08:00
Eric W. Biederman
68cbf07536 [PATCH] sysctl: C99 Convert arch/ia64/sn/kernel/xpc_main.c
Signed-off-by: Eric W. Biederman <ebiederm@xmission.com>
Cc: "Luck, Tony" <tony.luck@intel.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2007-02-14 08:09:56 -08:00
Eric W. Biederman
79eec3d3d9 [PATCH] sysctl: sn: remove sysctl ABI BREAKAGE
By not using the enumeration in sysctl.h (or even understanding it) the SN
platform placed their arch specific xpc directory on top of CTL_KERN and only
because they didn't have 4 entries in their xpc directory got lucky and didn't
break glibc.

This is totally irresponsible.  So this patch entirely removes sys_sysctl
support from their sysctl code.  Hopefully they don't have ascii name
conflicts as well.

And now that they have no ABI numbers add them to the end instead of the
sysctl list instead of the head so nothing else will be overridden.

Cc: Tony Luck <tony.luck@intel.com>
Signed-off-by: Eric W. Biederman <ebiederm@xmission.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2007-02-14 08:09:56 -08:00
Thomas Gleixner
38515e908b [PATCH] Scheduled removal of SA_xxx interrupt flags fixups
The obsolete SA_xxx interrupt flags have been used despite the scheduled
removal.  Fixup the remaining users.

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Acked-by: Ingo Molnar <mingo@elte.hu>
Cc: "Luck, Tony" <tony.luck@intel.com>
Cc: Roman Zippel <zippel@linux-m68k.org>
Cc: Geert Uytterhoeven <geert@linux-m68k.org>
Cc: Jeff Garzik <jeff@garzik.org>
Cc: Wim Van Sebroeck <wim@iguana.be>
Cc: Roland Dreier <rolandd@cisco.com>
Cc: Alessandro Zummo <a.zummo@towertech.it>
Cc: James Bottomley <James.Bottomley@steeleye.com>
Cc: Greg KH <greg@kroah.com>
Cc: Dave Airlie <airlied@linux.ie>
Cc: James Simmons <jsimmons@infradead.org>
Cc: "Antonino A. Daplas" <adaplas@pol.net>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2007-02-14 08:09:54 -08:00
Arjan van de Ven
5dfe4c964a [PATCH] mark struct file_operations const 2
Many struct file_operations in the kernel can be "const".  Marking them const
moves these to the .rodata section, which avoids false sharing with potential
dirty data.  In addition it'll catch accidental writes at compile time to
these shared resources.

[akpm@osdl.org: sparc64 fix]
Signed-off-by: Arjan van de Ven <arjan@linux.intel.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2007-02-12 09:48:44 -08:00
Alon Bar-Lev
7a3a06d0e1 [PATCH] Dynamic kernel command-line: fixups
Remove in-source externs, linux/init.h is included in all cases.
This is a fixups for "Dynamic kernel command-line" patch.

It also includes some uml __init fixups so that we can __initdata also its
command_line.

Signed-off-by: Alon Bar-Lev <alon.barlev@gmail.com>
Cc: Jeff Dike <jdike@addtoit.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2007-02-12 09:48:39 -08:00
Alon Bar-Lev
a8d91b8477 [PATCH] Dynamic kernel command-line: ia64
1. Rename saved_command_line into boot_command_line.
2. Set command_line as __initdata.

[akpm@osdl.org: move some declarations to the right place]
Signed-off-by: Alon Bar-Lev <alon.barlev@gmail.com>
Cc: "Luck, Tony" <tony.luck@intel.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2007-02-12 09:48:38 -08:00
Kirill Korotaev
cefc8be824 [PATCH] Consolidate bust_spinlocks()
Part of long forgotten patch
http://groups.google.com/group/fa.linux.kernel/msg/e98e941ce1cf29f6?dmode=source
Since then, m32r grabbed two copies.

Leave s390 copy because of important absence of CONFIG_VT, but remove
references to non-existent timerlist_lock.  ia64 also loses timerlist_lock.

Signed-off-by: Alexey Dobriyan <adobriyan@openvz.org>
Acked-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
Cc: Andi Kleen <ak@muc.de>
Cc: "Luck, Tony" <tony.luck@intel.com>
Cc: Hirokazu Takata <takata@linux-m32r.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2007-02-11 10:51:34 -08:00
Alexey Dobriyan
85cc9b1144 [PATCH] sn2: use static ->proc_fops
fix-rmmod-read-write-races-in-proc-entries.patch doesn't want dynamically
allocated ->proc_fops, because it will set it to NULL at module unload time.

Regardless of module status, switch to statically allocated ->proc_fops which
leads to simpler code without wrappers.

AFAICS, also fix the following bug: "sn_force_interrupt" proc entry set
->write for itself, but was created with 0444 permissions. Change to 0644.

Signed-off-by: Alexey Dobriyan <adobriyan@openvz.org>
Cc: Al Viro <viro@zeniv.linux.org.uk>
Cc: "Eric W. Biederman" <ebiederm@xmission.com>
Cc: "Luck, Tony" <tony.luck@intel.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2007-02-11 10:51:34 -08:00
Kyle McMartin
d4d23add3a [PATCH] Common compat_sys_sysinfo
I noticed that almost all architectures implemented exactly the same
sys32_sysinfo...  except parisc, where a bug was to be found in handling of
the uptime.  So let's remove a whole whack of code for fun and profit.
Cribbed compat_sys_sysinfo from x86_64's implementation, since I figured it
would be the best tested.

This patch incorporates Arnd's suggestion of not using set_fs/get_fs, but
instead extracting out the common code from sys_sysinfo.

Cc: Christoph Hellwig <hch@infradead.org>
Cc: <linux-arch@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2007-02-11 10:51:32 -08:00
Robert P. J. Day
c376222960 [PATCH] Transform kmem_cache_alloc()+memset(0) -> kmem_cache_zalloc().
Replace appropriate pairs of "kmem_cache_alloc()" + "memset(0)" with the
corresponding "kmem_cache_zalloc()" call.

Signed-off-by: Robert P. J. Day <rpjday@mindspring.com>
Cc: "Luck, Tony" <tony.luck@intel.com>
Cc: Andi Kleen <ak@muc.de>
Cc: Roland McGrath <roland@redhat.com>
Cc: James Bottomley <James.Bottomley@steeleye.com>
Cc: Greg KH <greg@kroah.com>
Acked-by: Joel Becker <Joel.Becker@oracle.com>
Cc: Steven Whitehouse <swhiteho@redhat.com>
Cc: Jan Kara <jack@ucw.cz>
Cc: Michael Halcrow <mhalcrow@us.ibm.com>
Cc: "David S. Miller" <davem@davemloft.net>
Cc: Stephen Smalley <sds@tycho.nsa.gov>
Cc: James Morris <jmorris@namei.org>
Cc: Chris Wright <chrisw@sous-sol.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2007-02-11 10:51:27 -08:00
Jean-Paul Saman
67d38229df [PATCH] disable init/initramfs.c: architectures
Update all arch/*/kernel/vmlinux.lds.S to not include space for initramfs
when CONFIG_BLK_DEV_INITRAMFS is not selected.  This saves another 4 kbytes
on most platfoms (some reserve PAGE_SIZE for initramfs).

Signed-off-by: Jean-Paul Saman <jean-paul.saman@nxp.com>
Cc: Al Viro <viro@zeniv.linux.org.uk>
Cc: <linux-arch@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2007-02-11 10:51:25 -08:00
Christoph Lameter
09ae1f585e [PATCH] optional ZONE_DMA: optional ZONE_DMA for ia64
ZONE_DMA less operation for IA64 SGI platform

Disable ZONE_DMA for SGI SN2.  All memory is addressable by all devices and we
do not need any special memory pool.

Signed-off-by: Christoph Lameter <clameter@sgi.com>
Cc: "Luck, Tony" <tony.luck@intel.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2007-02-11 10:51:18 -08:00
Christoph Lameter
66701b1499 [PATCH] optional ZONE_DMA: introduce CONFIG_ZONE_DMA
This patch simply defines CONFIG_ZONE_DMA for all arches.  We later do special
things with CONFIG_ZONE_DMA after the VM and an arch are prepared to work
without ZONE_DMA.

CONFIG_ZONE_DMA can be defined in two ways depending on how an architecture
handles ISA DMA.

First if CONFIG_GENERIC_ISA_DMA is set by the arch then we know that the arch
needs ZONE_DMA because ISA DMA devices are supported.  We can catch this in
mm/Kconfig and do not need to modify arch code.

Second, arches may use ZONE_DMA in an unknown way.  We set CONFIG_ZONE_DMA for
all arches that do not set CONFIG_GENERIC_ISA_DMA in order to insure backwards
compatibility.  The arches may later undefine ZONE_DMA if their arch code has
been verified to not depend on ZONE_DMA.

Signed-off-by: Christoph Lameter <clameter@sgi.com>
Cc: Andi Kleen <ak@suse.de>
Cc: "Luck, Tony" <tony.luck@intel.com>
Cc: Kyle McMartin <kyle@mcmartin.ca>
Cc: Matthew Wilcox <willy@debian.org>
Cc: James Bottomley <James.Bottomley@steeleye.com>
Cc: Paul Mundt <lethal@linux-sh.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2007-02-11 10:51:18 -08:00
Christoph Lameter
9195481d2f [PATCH] Drop nr_free_pages_pgdat()
Function is unnecessary now.  We can use the summing features of the ZVCs to
get the values we need.

Signed-off-by: Christoph Lameter <clameter@sgi.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2007-02-11 10:51:18 -08:00
Al Viro
ccbebdaccf [PATCH] arch/ia64: ansify
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2007-02-09 09:14:06 -08:00
Linus Torvalds
78149df6d5 Merge master.kernel.org:/pub/scm/linux/kernel/git/gregkh/pci-2.6
* master.kernel.org:/pub/scm/linux/kernel/git/gregkh/pci-2.6: (41 commits)
  Revert "PCI: remove duplicate device id from ata_piix"
  msi: Make MSI useable more architectures
  msi: Kill the msi_desc array.
  msi: Remove attach_msi_entry.
  msi: Fix msi_remove_pci_irq_vectors.
  msi: Remove msi_lock.
  msi: Kill msi_lookup_irq
  MSI: Combine pci_(save|restore)_msi/msix_state
  MSI: Remove pci_scan_msi_device()
  MSI: Replace pci_msi_quirk with calls to pci_no_msi()
  PCI: remove duplicate device id from ipr
  PCI: remove duplicate device id from ata_piix
  PCI: power management: remove noise on non-manageable hw
  PCI: cleanup MSI code
  PCI: make isa_bridge Alpha-only
  PCI: remove quirk_sis_96x_compatible()
  PCI: Speed up the Intel SMBus unhiding quirk
  PCI Quirk: 1k I/O space IOBL_ADR fix on P64H2
  shpchp: delete trailing whitespace
  shpchp: remove DBG_XXX_ROUTINE
  ...
2007-02-07 19:23:44 -08:00
Eric W. Biederman
f7feaca77d msi: Make MSI useable more architectures
The arch hooks arch_setup_msi_irq and arch_teardown_msi_irq are now
responsible for allocating and freeing the linux irq in addition to
setting up the the linux irq to work with the interrupt.

arch_setup_msi_irq now takes a pci_device and a msi_desc and returns
an irq.

With this change in place this code should be useable by all platforms
except those that won't let the OS touch the hardware like ppc RTAS.

Signed-off-by: Eric W. Biederman <ebiederm@xmission.com>
Acked-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
2007-02-07 15:50:08 -08:00
Eric W. Biederman
5b912c108c msi: Kill the msi_desc array.
We need to be able to get from an irq number to a struct msi_desc.
The msi_desc array in msi.c had several short comings the big one was
that it could not be used outside of msi.c.  Using irq_data in struct
irq_desc almost worked except on some architectures irq_data needs to
be used for something else.

So this patch adds a msi_desc pointer to irq_desc, adds the appropriate
wrappers and changes all of the msi code to use them.

The dynamic_irq_init/cleanup code was tweaked to ensure the new
field is left in a well defined state.

Signed-off-by: Eric W. Biederman <ebiederm@xmission.com>
Acked-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
2007-02-07 15:50:08 -08:00
Linus Torvalds
21d37bbc65 Merge branch 'release' of git://git.kernel.org/pub/scm/linux/kernel/git/lenb/linux-acpi-2.6
* 'release' of git://git.kernel.org/pub/scm/linux/kernel/git/lenb/linux-acpi-2.6: (140 commits)
  ACPICA: reduce table header messages to fit within 80 columns
  asus-laptop: merge with ACPICA table update
  ACPI: bay: Convert ACPI Bay driver to be compatible with sysfs update.
  ACPI: bay: new driver is EXPERIMENTAL
  ACPI: bay: make drive_bays static
  ACPI: bay: make bay a platform driver
  ACPI: bay: remove prototype procfs code
  ACPI: bay: delete unused variable
  ACPI: bay: new driver adding removable drive bay support
  ACPI: dock: check if parent is on dock
  ACPICA: fix gcc build warnings
  Altix: Add ACPI SSDT PCI device support (hotplug)
  Altix: ACPI SSDT PCI device support
  ACPICA: reduce conflicts with Altix patch series
  ACPI_NUMA: fix HP IA64 simulator issue with extended memory domain
  ACPI: fix HP RX2600 IA64 boot
  ACPI: build fix for IBM x440 - CONFIG_X86_SUMMIT
  ACPICA: Update version to 20070126
  ACPICA: Fix for incorrect parameter passed to AcpiTbDeleteTable during table load.
  ACPICA: Update copyright to 2007.
  ...
2007-02-07 15:36:08 -08:00
Chen, Kenneth W
00b65985fb [IA64] relax per-cpu TLB requirement to DTC
Instead of pinning per-cpu TLB into a DTR, use DTC.  This will free up
one TLB entry for application, or even kernel if access pattern to
per-cpu data area has high temporal locality.

Since per-cpu is mapped at the top of region 7 address, we just need to
add special case in alt_dtlb_miss.  The physical address of per-cpu data
is already conveniently stored in IA64_KR(PER_CPU_DATA).  Latency for
alt_dtlb_miss is not affected as we can hide all the latency.  It was
measured that alt_dtlb_miss handler has 23 cycles latency before and
after the patch.

The performance effect is massive for applications that put lots of tlb
pressure on CPU.  Workload environment like database online transaction
processing or application uses tera-byte of memory would benefit the most.
Measurement with industry standard database benchmark shown an upward
of 1.6% gain.  While smaller workloads like cpu, java also showing small
improvement.

Signed-off-by: Ken Chen <kenneth.w.chen@intel.com>
Signed-off-by: Tony Luck <tony.luck@intel.com>
2007-02-06 15:04:48 -08:00
Chen, Kenneth W
a0776ec8e9 [IA64] remove per-cpu ia64_phys_stacked_size_p8
It's not efficient to use a per-cpu variable just to store
how many physical stack register a cpu has.  Ever since the
incarnation of ia64 up till upcoming Montecito processor, that
variable has "glued" to 96. Having a variable in memory means
that the kernel is burning an extra cacheline access on every
syscall and kernel exit path.  Such "static" value is better
served with the instruction patching utility exists today.
Convert ia64_phys_stacked_size_p8 into dynamic insn patching.

This also has a pleasant side effect of eliminating access to
per-cpu area while psr.ic=0 in the kernel exit path. (fixable
for per-cpu DTC work, but why bother?)

There are some concerns with the default value that the instruc-
tion encoded in the kernel image.  It shouldn't be concerned.
The reasons are:

(1) cpu_init() is called at CPU initialization.  In there, we
    find out physical stack register size from PAL and patch
    two instructions in kernel exit code.  The code in question
    can not be executed before the patching is done.

(2) current implementation stores zero in ia64_phys_stacked_size_p8,
    and that's what the current kernel exit path loads the value with.
    With the new code, it is equivalent that we store reg size 96
    in ia64_phys_stacked_size_p8, thus creating a better safety net.
    Given (1) above can never fail, having (2) is just a bonus.

All in all, this patch allow one less memory reference in the kernel
exit path, thus reducing syscall and interrupt return latency; and
avoid polluting potential useful data in the CPU cache.

Signed-off-by: Ken Chen <kenneth.w.chen@intel.com>
Signed-off-by: Tony Luck <tony.luck@intel.com>
2007-02-06 15:04:18 -08:00
Len Brown
57e1c5c87d Pull test into release branch 2007-02-06 15:31:00 -05:00
Jan Beulich
cde14bbfb3 [IA64] swiotlb bug fixes
This patch fixes
- marking I-cache clean of pages DMAed to now only done for IA64
- broken multiple inclusion in include/asm-x86_64/swiotlb.h
- missing call to mark_clean in swiotlb_sync_sg()
- a (perhaps only theoretical) issue in swiotlb_dma_supported() when
io_tlb_end is exactly at the end of memory

Signed-off-by: Jan Beulich <jbeulich@novell.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Tony Luck <tony.luck@intel.com>
2007-02-05 18:46:40 -08:00
Fenghua Yu
86afa9eb88 [IA64] Hook up getcpu system call for IA64
getcpu system call returns cpu# and node# on which this system call and
its caller are running. This patch hooks up its implementation on IA64.

Signed-off-by: Fenghua Yu <fenghua.yu@intel.com>
Signed-off-by: Tony Luck <tony.luck@intel.com>
2007-02-05 16:56:36 -08:00
Bob Picco
524fd988bb [IA64] clean up sparsemem memory_present call
Eliminate arch specific memory_present call ia64 NUMA by utilizing
sparse_memory_present_with_active_regions.

Acked-by: Mel Gorman <mel@csn.ul.ie>
Signed-off-by: Bob Picco <bob.picco@hp.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Tony Luck <tony.luck@intel.com>
2007-02-05 16:54:11 -08:00
George Beshers
f1c0afa2e8 [IA64] show_mem() for IA64 sparsemem NUMA
On the ia64 architecture only this patch upgrades show_mem() for sparse
memory to be the same as it was for discontig memory.  It has been shown to
work on NUMA and flatmem architectures.

Signed-off-by: George Beshers <gbeshers@sgi.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Tony Luck <tony.luck@intel.com>
2007-02-05 16:51:59 -08:00
Jan Beulich
671496affd [IA64] missing exports hwsw_sync_...
Add missing exports to allow several drivers to be built as module with
CONFIG_IA64_HP_ZX1_SWIOTLB.

Signed-off-by: Jan Beulich <jbeulich@novell.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Tony Luck <tony.luck@intel.com>
2007-02-05 16:50:11 -08:00
Kirill Korotaev
d00195ebc1 [IA64] alignment bug in ldscript
Occasionally the FSYS_RETURN patch list can have an odd length, causing other
data structures to get out of alignment.  In OpenVZ it is odd and we get
misaligned kernel image, which does not boot.

Signed-off-by: Alexey Kuznetsov <kuznet@ms2.inr.ac.ru>
Signed-off-by: Kirill Korotaev <dev@openvz.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Tony Luck <tony.luck@intel.com>
2007-02-05 16:45:42 -08:00
Bob Picco
139b830477 [IA64] register memory ranges in a consistent manner
While pursuing and unrelated issue with 64Mb granules I noticed a problem
related to inconsistent use of add_active_range.  There doesn't appear any
reason to me why FLATMEM versus DISCONTIG_MEM should register memory to
add_active_range with different code.  So I've changed the code into a
common implementation.

The other subtle issue fixed by this patch was calling add_active_range in
count_node_pages before granule aligning is performed.  We were lucky with
16MB granules but not so with 64MB granules.  count_node_pages has reserved
regions filtered out and as a consequence linked kernel text and data
aren't covered by calls to count_node_pages.  So linked kernel regions
wasn't reported to add_active_regions.  This resulted in free_initmem
causing numerous bad_page reports.  This won't occur with this patch
because now all known memory regions are reported by
register_active_ranges.

Acked-by: Mel Gorman <mel@csn.ul.ie>
Signed-off-by: Bob Picco <bob.picco@hp.com>
Acked-by: Simon Horman <horms@verge.net.au>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Tony Luck <tony.luck@intel.com>
2007-02-05 15:07:47 -08:00
Jan Beulich
d1598e05fa [IA64] Enable SWIOTLB only when needed
Don't force CONFIG_SWIOTLB on when not actually needed (i.e. HP_ZX1 and
SGI_SN2).

Signed-off-by: Jan Beulich <jbeulich@novell.com>
Signed-off-by: Tony Luck <tony.luck@intel.com>
2007-02-05 14:33:08 -08:00
Russ Anderson
980dbfd421 [IA64-SGI] Check for TIO errors on shub2 Altix
The shub2 error interrupt handler must check for TIO errors.

Signed-off-by: Russ Anderson (rja@sgi.com)
Signed-off-by: Tony Luck <tony.luck@intel.com>
2007-02-05 14:27:54 -08:00
Alex Williamson
451fe00cf7 [IA64] Clear IRQ affinity when unregistered
When we offline a CPU, migrate_irqs() tries to determine whether the
affinity bits of the IRQ descriptor match any of the remaining online
CPUs.  If not, it fixes up the interrupt to point somewhere else.
Unfortunately, if an IRQ is unregistered the IRQ descriptor may still
have affinity to the CPU being offlined, but the no_irq_chip handler
doesn't provide a set_affinity function.  This causes us to hit the
WARN_ON in migrate_irqs().

The easiest solution seems to be setting all the bits in the affinity
mask when the last interrupt is removed from the vector.  I hit this on
an older kernel with Xen/ia64 using driver domains (so it probably needs
more testing on upstream).  Xen essentially uses the bind/unbind
interface in sysfs to unregister a device from a driver and thus
unregister the interrupt.

Signed-off-by: Alex Williamson <alex.williamson@hp.com>
Signed-off-by: Tony Luck <tony.luck@intel.com>
2007-02-05 14:09:51 -08:00
Len Brown
06f87adff1 [IA64] fix ACPI Kconfig issues
All IA64 systems except IA64_HP_SIM include ACPI and PCI.
So prevent IA64 Kconfigs that try to do irritating things like building
PCI without building ACPI.

Signed-off-by: Len Brown <len.brown@intel.com>
Signed-off-by: Tony Luck <tony.luck@intel.com>
2007-02-05 14:07:50 -08:00
Bernhard Walle
c2c77fe8df [IA64] Fix NULL-pointer dereference in ia64_machine_kexec()
This patch fixes a NULL-pointer dereference in ia64_machine_kexec().

The variable ia64_kimage is set in machine_kexec_prepare() which is
called from sys_kexec_load(). If kdump wasn't configured before,
ia64_kimage is NULL.  machine_kdump_on_init() passes ia64_kimage() to
machine_kexec() which assumes a valid value.

The patch also adds a few sanity checks for the image to simplify
debugging of similar problems in future.

Signed-off-by: Bernhard Walle <bwalle@suse.de>
Signed-off-by: Tony Luck <tony.luck@intel.com>
2007-02-05 14:06:44 -08:00
bibo,mao
87f76d3aaf [IA64] find thread for user rbs address
I encountered one problem when running ptrace test case the situation
is this: traced process's syscall parameter needs to be accessed, but
for sys_clone system call with clone_flag (CLONE_VFORK | CLONE_VM |
SIGCHLD) parameter.  This syscall's parameter accessing result is wrong.

The reason is that vforked child process mm point is the same, but
tgid is different. Without this patch find_thread_for_addr will return
vforked process if vforked process is also stopped, but not the thread
which calls vfork syscall.

Signed-off-by: Tony Luck <tony.luck@intel.com>
2007-02-05 14:04:21 -08:00
Aron Griffis
ae0af3e346 [IA64] use snprintf() on features field of /proc/cpuinfo
Some patches have turned up on xen-devel recently to convert strcpy()
to safer alternatives and so forth.  While reviewing those patches
I noticed that the features string building could be cleaned up.

This patch uses snprintf() instead of strcpy() and direct character
pointer manipulation.  It makes the features string building safe and
gets rid of the special case for features output in show_cpuinfo()

Additionally I removed the (int) cast of ARRAY_SIZE, which seems to
serve no purpose.

Signed-off-by: Aron Griffis <aron@hp.com>
Signed-off-by: Tony Luck <tony.luck@intel.com>
2007-02-05 13:54:31 -08:00
bibo,mao
90f9d70a58 [IA64] enable singlestep on system call
As is pointed out in
http://www.gelato.org/community/view_linear.php?id=1_1036&from=authors&value=Ian%20Wienand#1_1039,
if single step on break instruction, the break fault has higher
priority than the single-step trap. When the break fault handler
is entered, it advances the IP by 1 instruction so break instruction
single-stepping is skipped, actually it is next instruction which
is single stepped.

This patch modifies this, it adds TIF_SINGLESTEP bit for thread
flags, and generate a fake sigtrap when single stepping break
instruction. Test case in attachment can verify this. Any comments
is welcome.

Signed-off-by: bibo, mao <bibo.mao@intel.com>
Signed-off-by: Tony Luck <tony.luck@intel.com>
2007-02-05 13:49:29 -08:00
Horms
c237508afa [IA64] kexec: Move machine_shutdown from machine_kexec.c to process.c
This moves the ia64 implementation of machine_shutdown() from
machine_kexec.c to process.c, which is in keeping with the implelmentation
on other architectures, and seems like a much more appropriate home for it.

Signed-off-by: Simon Horman <horms@verge.net.au>
Signed-off-by: Tony Luck <tony.luck@intel.com>
2007-02-05 13:49:10 -08:00
Horms
9473252f20 [IA64] add newline to PAL-code warning message
Signed-off-by: Simon Horman <horms@verge.net.au>
Signed-off-by: Tony Luck <tony.luck@intel.com>
2007-02-05 11:32:59 -08:00
Horms
abac08dbb4 [IA64] kexec: Remove inline declaration of efi_get_pal_addr()
Remove the Remove inline declaration of efi_get_pal_addr() as it is
declared in linux/efi.h.

Signed-Off-By: Simon Horman <horms@verge.net.au>
Signed-off-by: Tony Luck <tony.luck@intel.com>
2007-02-05 11:31:43 -08:00
Horms
8a697d0a4c [IA64] kexec: Minor enhancement to includes in crash.c
linux/uaccess.h was being included, but it seems that
really the following includes are needed.

asm/page.h: for __va() and PAGE_SHIFT
asm/uaccess.h: for copy_to_user()

I guess that linux/uaccess.h pulls in both asm/page.h and asm/uaccess.h.
I notices this while backporting the code to xen's linux-2.6.16.33,
which does not have linux/uaccess.h. I'm posting it as I think it is a
correct, though somewhat cosmetic fix.

Signed-off-by: Simon Horman <horms@verge.net.au>
Signed-off-by: Tony Luck <tony.luck@intel.com>
2007-02-05 11:31:04 -08:00
Horms
233c2f99d6 [IA64] kexec: typo in the saved_max_pfn description in contig.c
Fix a typo in the saved_max_pfn description in contig.c

Signed-off-by: Simon Horman <horms@verge.net.au>
Signed-off-by: Tony Luck <tony.luck@intel.com>
2007-02-05 11:30:25 -08:00
Horms
475c63bded [IA64] Zero size /proc/vmcore on ia64
Set saved_max_pfn when discontig memory is in use.

This sets up saved_max_pfn when disctontig memory is in use.
This mirrors the code for contig memory.

This patch does not entirely solve the problem of making vmcore work,
however it does appear to be neccessary. Please consider applying.

Signed-off-by: Simon Horman <horms@verge.net.au>
Signed-off-by: Tony Luck <tony.luck@intel.com>
2007-02-05 11:29:33 -08:00
Magnus Damm
bcb9b99d1f [IA64] kexec: Fix CONFIG_SMP=n compilation
Kexec support for 2.6.20 on ia64 does not build properly using a config
made up by CONFIG_SMP=n and CONFIG_HOTPLUG_CPU=n:

Signed-off-by: Magnus Damm <magnus@valinux.co.jp>
Acked-by: Simon Horman <horms@verge.net.au>
Acked-by: Jay Lan <jlan@sgi.com>
Signed-off-by: Tony Luck <tony.luck@intel.com>
2007-02-05 11:27:21 -08:00
John Keller
72253943f7 [PATCH] Altix: more ACPI PRT support
The SN Altix platform does not conform to the IOSAPIC IRQ routing model.
Add code in acpi_unregister_gsi() to check if (acpi_irq_model ==
ACPI_IRQ_MODEL_PLATFORM) and return.

Due to an oversight, this code was not added previously when
similar code was added to acpi_register_gsi().

http://marc.theaimsgroup.com/?l=linux-acpi&m=116680983430121&w=2

Signed-off-by: John Keller <jpk@sgi.com>
Acked-by: Len Brown <lenb@kernel.org>
Cc: "Luck, Tony" <tony.luck@intel.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2007-02-03 11:26:06 -08:00
Magnus Damm
29a002776b [PATCH] kexec: Avoid migration of already disabled irqs (ia64)
This patch fixes up ia64 kexec support for HP rx2620 hardware.  It does
this by skipping migration of already disabled irqs.  This is most likely a
problem on other ia64 platforms as well, but I've only been able to
reproduce it on one machine so far.

The full story is that handle_bad_irq() gets invoked before starting the
new kernel without this patch.  This seems to happen when fixup_irqs()
calls generic_handle_irq() on already migrated (and disabled) irqs.  So by
avoiding migration of disabled irqs we stay away of handle_bad_irq().

The code has been tested on three different ia64 machines, all with good
results.  It is possible to trigger the same bug by offlining a processor
using echo 0 > /sys/devices/system/cpu/cpuX/online.

More detailed information is available in the following mail thread:
http://lists.osdl.org/pipermail/fastboot/2007-January/thread.html#5774

Signed-off-by: Magnus Damm <magnus@valinux.co.jp>
Acked-by: Simon Horman <horms@verge.net.au>
Acked-by: Zou, Nanhai <nanhai.zou@intel.com>
Acked-by: Jay Lan <jlan@sgi.com>
Acked-by: "Luck, Tony" <tony.luck@intel.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2007-02-03 11:26:06 -08:00
John Keller
6f09a9250a Altix: ACPI SSDT PCI device support
Add SN platform support for running with an ACPI
capable PROM that defines PCI devices in SSDT
tables. There is a SSDT table for every occupied
slot on a root bus, containing info for every
PPB and/or device on the bus. The SSDTs will be
dynamically loaded/unloaded at hotplug enable/disable.

Platform specific information that is currently
passed via a SAL call, will now be passed via the
Vendor resource in the ACPI Device object(s) defined
in each SSDT.

Signed-off-by: John Keller <jpk@sgi.com>
Cc: Greg KH <greg@kroah.com>
Cc: "Luck, Tony" <tony.luck@intel.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Len Brown <len.brown@intel.com>
2007-02-02 22:14:35 -05:00
Len Brown
647fb47dfa ACPICA: reduce conflicts with Altix patch series
Syntax only -- no functional changes.

Signed-off-by: Len Brown <len.brown@intel.com>
2007-02-02 22:14:22 -05:00
Alexey Starikovskiy
defad23020 ACPI_NUMA: fix HP IA64 simulator issue with extended memory domain
ACPI 3.0 incorporated the SRAT spec, upping the table version to 2,
and extending the size of the proximity domain from 1-byte to 4-bytes.
This extension was into a reserved field that firmware should
set to 0, but the HP simulator had non-zero values there
resulting in unexpected huge numbers.

So mask the domain down to 8-bits for now.
A more general fix will be to check the table version
supplied by firmware and get paranoid about reserved fields.

Signed-off-by: Alexey Starikovskiy <alexey.y.starikovskiy@linux.intel.com>
Signed-off-by: Len Brown <len.brown@intel.com>
2007-02-02 22:02:55 -05:00
Alexey Starikovskiy
f18c5a08bf ACPICA: Allow ACPI id to be u32 instead of u8.
Allow ACPI id to be u32 instead of u8.
Requires drop of conversion tables with the acpiid as index.

Signed-off-by: Len Brown <len.brown@intel.com>
2007-02-02 21:14:31 -05:00
Alexey Starikovskiy
5f3b1a8b67 ACPICA: Remove duplicate table definitions (non-conflicting)
Signed-off-by: Len Brown <len.brown@intel.com>
2007-02-02 21:14:29 -05:00
Alexey Starikovskiy
cee324b145 ACPICA: use new ACPI headers.
Signed-off-by: Len Brown <len.brown@intel.com>
2007-02-02 21:14:28 -05:00
Alexey Starikovskiy
ad71860a17 ACPICA: minimal patch to integrate new tables into Linux
Signed-off-by: Len Brown <len.brown@intel.com>
2007-02-02 21:14:22 -05:00
Fenghua Yu
539d517ad1 [IA64] Itanium MC Error Injection Tool: Makefile changes
This patch has Makefile changes.

Signed-off-by: Fenghua Yu <fenghua.yu@intel.com>
Signed-off-by: Tony Luck <tony.luck@intel.com>
2007-01-29 15:27:07 -08:00
Fenghua Yu
62fa562af3 [IA64] Itanium MC Error Injection Tool: Driver sysfs interface
This kernel driver patch provides sysfs interface for user application to
call pal_mc_error_inject() procedure.

Signed-off-by: Fenghua Yu <fenghua.yu@intel.com>
Signed-off-by: Tony Luck <tony.luck@intel.com>
2007-01-29 15:26:40 -08:00
Fenghua Yu
e9ef08bdc1 [IA64] Itanium MC Error Injection Tool: Kernel configuration
This patch has kenrel configuration changes for the MC Error Injection
Tool.

Signed-off-by: Fenghua Yu <fenghua.yu@intel.com>
Signed-off-by: Tony Luck <tony.luck@intel.com>
2007-01-29 15:25:49 -08:00
Linus Torvalds
e947382ed3 Merge branch 'release' of git://git.kernel.org/pub/scm/linux/kernel/git/lenb/linux-acpi-2.6
* 'release' of git://git.kernel.org/pub/scm/linux/kernel/git/lenb/linux-acpi-2.6:
  Revert "ACPI: ibm-acpi: make non-generic bay support optional"
  ACPI: update MAINTAINERS
  ACPI: schedule obsolete features for deletion
  ACPI: delete two spurious ACPI messages
  ACPI: rename cstate_entry_s to cstate_entry
  ACPI: ec: enable printk on cmdline use
  ACPI: Altix: ACPI _PRT support
2007-01-11 18:25:44 -08:00
Dave Hansen
a2f3aa0257 [PATCH] Fix sparsemem on Cell
Fix an oops experienced on the Cell architecture when init-time functions,
early_*(), are called at runtime.  It alters the call paths to make sure
that the callers explicitly say whether the call is being made on behalf of
a hotplug even, or happening at boot-time.

It has been compile tested on ppc64, ia64, s390, i386 and x86_64.

Acked-by: Arnd Bergmann <arndb@de.ibm.com>
Signed-off-by: Dave Hansen <haveblue@us.ibm.com>
Cc: Yasunori Goto <y-goto@jp.fujitsu.com>
Acked-by: Andy Whitcroft <apw@shadowen.org>
Cc: Christoph Lameter <clameter@engr.sgi.com>
Cc: Martin Schwidefsky <schwidefsky@de.ibm.com>
Acked-by: Heiko Carstens <heiko.carstens@de.ibm.com>
Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Cc: Paul Mackerras <paulus@samba.org>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2007-01-11 18:18:20 -08:00
John Keller
3948ec9406 ACPI: Altix: ACPI _PRT support
Provide ACPI _PRT support for SN Altix systems.

The SN Altix platform does not conform to the
IOSAPIC IRQ routing model, so a new acpi_irq_model
(ACPI_IRQ_MODEL_PLATFORM) has been defined. The SN
platform specific code sets acpi_irq_model to
this new value, and keys off of it in acpi_register_gsi()
to avoid the iosapic code path.

Signed-off-by: John Keller <jpk@sgi.com>
Signed-off-by: Len Brown <len.brown@intel.com>
2007-01-04 12:18:19 -05:00
Linus Torvalds
18ed1c0513 Merge branch 'release' of git://git.kernel.org/pub/scm/linux/kernel/git/lenb/linux-acpi-2.6
* 'release' of git://git.kernel.org/pub/scm/linux/kernel/git/lenb/linux-acpi-2.6: (68 commits)
  ACPI: replace kmalloc+memset with kzalloc
  ACPI: Add support for acpi_load_table/acpi_unload_table_id
  fbdev: update after backlight argument change
  ACPI: video: Add dev argument for backlight_device_register
  ACPI: Implement acpi_video_get_next_level()
  ACPI: Kconfig - depend on PM rather than selecting it
  ACPI: fix NULL check in drivers/acpi/osl.c
  ACPI: make drivers/acpi/ec.c:ec_ecdt static
  ACPI: prevent processor module from loading on failures
  ACPI: fix single linked list manipulation
  ACPI: ibm_acpi: allow clean removal
  ACPI: fix git automerge failure
  ACPI: ibm_acpi: respond to workqueue update
  ACPI: dock: add uevent to indicate change in device status
  ACPI: ec: Lindent once again
  ACPI: ec: Change #define to enums there possible.
  ACPI: ec: Style changes.
  ACPI: ec: Acquire Global Lock under EC mutex.
  ACPI: ec: Drop udelay() from poll mode. Loop by reading status field instead.
  ACPI: ec: Rename gpe_bit to gpe
  ...
2006-12-22 18:46:56 -08:00
Ingo Molnar
0888f06ac9 [PATCH] sched: fix bad missed wakeups in the i386, x86_64, ia64, ACPI and APM idle code
Fernando Lopez-Lezcano reported frequent scheduling latencies and audio
xruns starting at the 2.6.18-rt kernel, and those problems persisted all
until current -rt kernels. The latencies were serious and unjustified by
system load, often in the milliseconds range.

After a patient and heroic multi-month effort of Fernando, where he
tested dozens of kernels, tried various configs, boot options,
test-patches of mine and provided latency traces of those incidents, the
following 'smoking gun' trace was captured by him:

                 _------=> CPU#
                / _-----=> irqs-off
               | / _----=> need-resched
               || / _---=> hardirq/softirq
               ||| / _--=> preempt-depth
               |||| /
               |||||     delay
   cmd     pid ||||| time  |   caller
      \   /    |||||   \   |   /
  IRQ_19-1479  1D..1    0us : __trace_start_sched_wakeup (try_to_wake_up)
  IRQ_19-1479  1D..1    0us : __trace_start_sched_wakeup <<...>-5856> (37 0)
  IRQ_19-1479  1D..1    0us : __trace_start_sched_wakeup (c01262ba 0 0)
  IRQ_19-1479  1D..1    0us : resched_task (try_to_wake_up)
  IRQ_19-1479  1D..1    0us : __spin_unlock_irqrestore (try_to_wake_up)
  ...
  <idle>-0     1...1   11us!: default_idle (cpu_idle)
  ...
  <idle>-0     0Dn.1  602us : smp_apic_timer_interrupt (c0103baf 1 0)
  ...
   <...>-5856  0D..2  618us : __switch_to (__schedule)
   <...>-5856  0D..2  618us : __schedule <<idle>-0> (20 162)
   <...>-5856  0D..2  619us : __spin_unlock_irq (__schedule)
   <...>-5856  0...1  619us : trace_stop_sched_switched (__schedule)
   <...>-5856  0D..1  619us : trace_stop_sched_switched <<...>-5856> (37 0)

what is visible in this trace is that CPU#1 ran try_to_wake_up() for
PID:5856, it placed PID:5856 on CPU#0's runqueue and ran resched_task()
for CPU#0. But it decided to not send an IPI that no CPU - due to
TS_POLLING. But CPU#0 never woke up after its NEED_RESCHED bit was set,
and only rescheduled to PID:5856 upon the next lapic timer IRQ. The
result was a 600+ usecs latency and a missed wakeup!

the bug turned out to be an idle-wakeup bug introduced into the mainline
kernel this summer via an optimization in the x86_64 tree:

    commit 495ab9c045
    Author: Andi Kleen <ak@suse.de>
    Date:   Mon Jun 26 13:59:11 2006 +0200

    [PATCH] i386/x86-64/ia64: Move polling flag into thread_info_status

    During some profiling I noticed that default_idle causes a lot of
    memory traffic. I think that is caused by the atomic operations
    to clear/set the polling flag in thread_info. There is actually
    no reason to make this atomic - only the idle thread does it
    to itself, other CPUs only read it. So I moved it into ti->status.

the problem is this type of change:

        if (!hlt_counter && boot_cpu_data.hlt_works_ok) {
-               clear_thread_flag(TIF_POLLING_NRFLAG);
+               current_thread_info()->status &= ~TS_POLLING;
                smp_mb__after_clear_bit();
                while (!need_resched()) {
                        local_irq_disable();

this changes clear_thread_flag() to an explicit clearing of TS_POLLING.
clear_thread_flag() is defined as:

        clear_bit(flag, &ti->flags);

and clear_bit() is a LOCK-ed atomic instruction on all x86 platforms:

  static inline void clear_bit(int nr, volatile unsigned long * addr)
  {
          __asm__ __volatile__( LOCK_PREFIX
                  "btrl %1,%0"

hence smp_mb__after_clear_bit() is defined as a simple compile barrier:

  #define smp_mb__after_clear_bit()       barrier()

but the explicit TS_POLLING clearing introduced by the patch:

+               current_thread_info()->status &= ~TS_POLLING;

is not an atomic op! So the clearing of the TS_POLLING bit is freely
reorderable with the reading of the NEED_RESCHED bit - and both now
reside in different memory addresses.

CPU idle wakeup very much depends on ordered memory ops, the clearing of
the TS_POLLING flag must always be done before we test need_resched()
and hit the idle instruction(s). [Symmetrically, the wakeup code needs
to set NEED_RESCHED before it tests the TS_POLLING flag, so memory
ordering is paramount.]

Fernando's dual-core Athlon64 system has a sufficiently advanced memory
ordering model so that it triggered this scenario very often.

( And it also turned out that the reason why these latencies never
  triggered on my testsystems is that i routinely use idle=poll, which
  was the only idle variant not affected by this bug. )

The fix is to change the smp_mb__after_clear_bit() to an smp_mb(), to
act as an absolute barrier between the TS_POLLING write and the
NEED_RESCHED read. This affects almost all idling methods (default,
ACPI, APM), on all 3 x86 architectures: i386, x86_64, ia64.

Signed-off-by: Ingo Molnar <mingo@elte.hu>
Tested-by: Fernando Lopez-Lezcano <nando@ccrma.Stanford.EDU>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-12-22 08:55:51 -08:00
Burman Yan
36bcbec7ce ACPI: replace kmalloc+memset with kzalloc
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Len Brown <len.brown@intel.com>
2006-12-20 16:54:54 -05:00
Robert P. J. Day
5cbded585d [PATCH] getting rid of all casts of k[cmz]alloc() calls
Run this:

	#!/bin/sh
	for f in $(grep -Erl "\([^\)]*\) *k[cmz]alloc" *) ; do
	  echo "De-casting $f..."
	  perl -pi -e "s/ ?= ?\([^\)]*\) *(k[cmz]alloc) *\(/ = \1\(/" $f
	done

And then go through and reinstate those cases where code is casting pointers
to non-pointers.

And then drop a few hunks which conflicted with outstanding work.

Cc: Russell King <rmk@arm.linux.org.uk>, Ian Molton <spyro@f2s.com>
Cc: Mikael Starvik <starvik@axis.com>
Cc: Yoshinori Sato <ysato@users.sourceforge.jp>
Cc: Roman Zippel <zippel@linux-m68k.org>
Cc: Geert Uytterhoeven <geert@linux-m68k.org>
Cc: Ralf Baechle <ralf@linux-mips.org>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Kyle McMartin <kyle@mcmartin.ca>
Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Cc: Martin Schwidefsky <schwidefsky@de.ibm.com>
Cc: "David S. Miller" <davem@davemloft.net>
Cc: Jeff Dike <jdike@addtoit.com>
Cc: Greg KH <greg@kroah.com>
Cc: Jens Axboe <jens.axboe@oracle.com>
Cc: Paul Fulghum <paulkf@microgate.com>
Cc: Alan Cox <alan@lxorguk.ukuu.org.uk>
Cc: Karsten Keil <kkeil@suse.de>
Cc: Mauro Carvalho Chehab <mchehab@infradead.org>
Cc: Jeff Garzik <jeff@garzik.org>
Cc: James Bottomley <James.Bottomley@steeleye.com>
Cc: Ian Kent <raven@themaw.net>
Cc: Steven French <sfrench@us.ibm.com>
Cc: David Woodhouse <dwmw2@infradead.org>
Cc: Neil Brown <neilb@cse.unsw.edu.au>
Cc: Jaroslav Kysela <perex@suse.cz>
Cc: Takashi Iwai <tiwai@suse.de>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-12-13 09:05:58 -08:00
bibo,mao
df3e0d1c69 [IA64] kprobe clears qp bits for special instructions
On IA64 there exists some special instructions which
always need to be executed regradless of qp bits, such
as com.crel.unc, tbit.trel.unc etc.
This patch clears qp bits when inserting kprobe trap code
and disables probepoint on slot 1 for these special
instructions.

Signed-off-by: bibo,mao <bibo.mao@intel.com>
Signed-off-by: Tony Luck <tony.luck@intel.com>
2006-12-12 12:04:42 -08:00
Tony Luck
08ed38b680 [IA64] enable trap code on slot 1
Because slot 1 of one instr bundle crosses border of two consecutive
8-bytes, kprobe on slot 1 is disabled. This patch enables kprobe on
slot1, it only replaces higher 8-bytes of the instruction bundle and
changes the exception code to ignore the low 12 bits of the break
number (which is across the border in the lower 8-bytes of the bundle).

For those instructions which must execute regardless qp bits,
kprobe on slot 1 is still disabled.

Signed-off-by: bibo,mao <bibo.mao@intel.com>
Signed-off-by: Tony Luck <tony.luck@intel.com>
2006-12-12 12:00:55 -08:00
Tony Luck
75f6a1de41 [IA64] Take defensive stance on ia64_pal_get_brand_info()
Stephane thought he saw a problem here (but was just confused
by the return value from ia64_pal_get_brand_info()).  But we
should be more defensive here in case an prototype PAL for
a future processor doesn't implement this PAL call.

Signed-off-by: Tony Luck <tony.luck@intel.com>
2006-12-12 11:56:36 -08:00
Dean Nelson
a460ef8d0a [IA64] fix possible XPC deadlock when disconnecting
This patch eliminates a potential deadlock that is possible when XPC
disconnects a channel to a partition that has gone down. This deadlock will
occur if at least one of the kthreads created by XPC for the purpose of making
callouts to the channel's registerer is detained in the registerer and will
not be returning back to XPC until some registerer request occurs on the now
downed partition. The potential for a deadlock is removed by ensuring that
there always is a kthread available to make the channel disconnecting callout
to the registerer.

Signed-off-by: Dean Nelson <dcn@sgi.com>
Signed-off-by: Tony Luck <tony.luck@intel.com>
2006-12-12 11:48:53 -08:00
Jack Steiner
1cf24bdbbb [IA64] - Reduce overhead of FP exception logging messages
Improve the scalability of the fpswa code that rate-limits
logging of messages.

There are 2 distinctly different problems in this code.

1) If prctl is used to disable logging, last_time is never
   updated. The result is that fpu_swa_count is zeroed out on
   EVERY fp fault. This causes a very very hot cache line.
   The fix reduces the wallclock time of a 1024p FP exception test
   from 28734 sec to 19 sec!!!

2) On VERY large systems, excessive messages are logged because
   multiple cpus can each reset or increment fpu_swa_count at
   about the same time. The result is that hundreds of messages
   are logged each second. The fixes reduces the logging rate
   to ~1 per second.

Signed-off-by: Jack Steiner <steiner@sgi.com>
Signed-off-by: Tony Luck <tony.luck@intel.com>
2006-12-12 11:47:09 -08:00
Tony Luck
8b9c106856 [IA64] fix arch/ia64/mm/contig.c:235: warning: unused variable `nid'
This warning only shows up with CONFIG_VIRTUAL_MEM_MAP=y and
CONFIG_FLATMEM=y.

There is only one caller left for register_active_ranges() from the
contig.c code ... so it doesn't need to pick up the node number, the
node number is always zero.

Signed-off-by: Tony Luck <tony.luck@intel.com>
2006-12-12 11:18:55 -08:00
Tony Luck
f889a26a70 [IA64] s/termios/ktermios/ in simserial.c
This got missed in 606d099cdd

Signed-off-by: Tony Luck <tony.luck@intel.com>
2006-12-12 10:47:36 -08:00
Horms
53da5763bf [IA64] kexec/kdump: tidy up declaration of relocate_new_kernel_t
* Make NORET_TYPE and ATTRIB_NORET in line with the
  declaration for other architectures
* Add parameter names

Signed-Off-By: Simon Horman <horms@verge.net.au>
Signed-off-by: Tony Luck <tony.luck@intel.com>
2006-12-12 10:12:08 -08:00
Horms
ad1c3ba7e5 [IA64] Kexec/Kdump: honour non-zero crashkernel offset.
There seems to be a value in both allowing the kernel to determine
the base offset of the crashkernel automatically and allowing
users's to sepcify it.

The old behaviour on ia64, which is still the current behaviour on
most architectures is for the user to always specify the address.
Recently ia64 was changed so that it is always automatically determined.

With this patch the kernel automatically determines the offset if
the supplied value is 0, otherwise it uses the value provided.

This should probably be backed by a documentation change.

Signed-Off-By: Simon Horman <horms@verge.net.au>
Signed-off-by: Tony Luck <tony.luck@intel.com>
2006-12-12 10:11:37 -08:00
Horms
45a98fc622 [IA64] CONFIG_KEXEC/CONFIG_CRASH_DUMP permutations
Actually, on reflection I think that there is a good case for
keeping the options separate. I am thinking particularly of people
who want a very small crashdump kernel and thus don't want to compile
in kexec.

The patch below should fix things up so that all valid combinations of
KEXEC, CRASH_DUMP and VMCORE compile cleanly - VMCORE depends on
CRASH_DUMP which is why I said valid combinations. In a nutshell
it just untangles unrelated code and switches around a few defines.

Please note that it creats a new file, arch/ia64/kernel/crash_dump.c
This is in keeping with the i386 implementation.

Signed-off-by: Simon Horman <horms@verge.net.au>
Signed-off-by: Tony Luck <tony.luck@intel.com>
2006-12-12 10:11:00 -08:00
Jay Lan
adf142e379 [IA64] Do not call SN_SAL_SET_CPU_NUMBER twice on cpu 0
This is an SN specific patch.

Architectually, cpu_init is always called twice on cpu 0
and thus resulted in two SN_SAL_SET_CPU_NUMBER calls.

This was harmless in production kernel; however, it can
cause problem on booting up a crashdump kernel at Altix.

Here is the patch that detects the second sn_cpu_init
call and skips the second call to SN_SAL_SET_CPU_NUMBER.

Signed-Off-By: Jay Lan <jlan@sgi.com>
Signed-off-by: Tony Luck <tony.luck@intel.com>
2006-12-12 10:09:39 -08:00
David Howells
f0d1b0b30d [PATCH] LOG2: Implement a general integer log2 facility in the kernel
This facility provides three entry points:

	ilog2()		Log base 2 of unsigned long
	ilog2_u32()	Log base 2 of u32
	ilog2_u64()	Log base 2 of u64

These facilities can either be used inside functions on dynamic data:

	int do_something(long q)
	{
		...;
		y = ilog2(x)
		...;
	}

Or can be used to statically initialise global variables with constant values:

	unsigned n = ilog2(27);

When performing static initialisation, the compiler will report "error:
initializer element is not constant" if asked to take a log of zero or of
something not reducible to a constant.  They treat negative numbers as
unsigned.

When not dealing with a constant, they fall back to using fls() which permits
them to use arch-specific log calculation instructions - such as BSR on
x86/x86_64 or SCAN on FRV - if available.

[akpm@osdl.org: MMC fix]
Signed-off-by: David Howells <dhowells@redhat.com>
Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Herbert Xu <herbert@gondor.apana.org.au>
Cc: David Howells <dhowells@redhat.com>
Cc: Wojtek Kaniewski <wojtekka@toxygen.net>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-12-08 08:28:51 -08:00
Josef Sipek
b66ffad904 [PATCH] struct path: convert ia64
Signed-off-by: Josef Sipek <jsipek@fsl.cs.sunysb.edu>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-12-08 08:28:45 -08:00
Linus Torvalds
6ee7e78e7c Merge branch 'release' of master.kernel.org:/pub/scm/linux/kernel/git/aegl/linux-2.6
* 'release' of master.kernel.org:/pub/scm/linux/kernel/git/aegl/linux-2.6:
  [IA64] replace kmalloc+memset with kzalloc
  [IA64] resolve name clash by renaming is_available_memory()
  [IA64] Need export for csum_ipv6_magic
  [IA64] Fix DISCONTIGMEM without VIRTUAL_MEM_MAP
  [PATCH] Add support for type argument in PAL_GET_PSTATE
  [IA64] tidy up return value of ip_fast_csum
  [IA64] implement csum_ipv6_magic for ia64.
  [IA64] More Itanium PAL spec updates
  [IA64] Update processor_info features
  [IA64] Add se bit to Processor State Parameter structure
  [IA64] Add dp bit to cache and bus check structs
  [IA64] SN: Correctly update smp_affinty mask
  [IA64] sparse cleanups
  [IA64] IA64 Kexec/kdump
2006-12-07 15:39:22 -08:00
Yan Burman
52fd91088b [IA64] replace kmalloc+memset with kzalloc
Replace kmalloc+memset with kzalloc

Signed-off-by: Yan Burman <burman.yan@gmail.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Tony Luck <tony.luck@intel.com>
2006-12-07 13:46:43 -08:00
Christoph Lameter
66888a6e5f [IA64] resolve name clash by renaming is_available_memory()
There is a name clash with ia64 arch code in Andrew's tree. Rename
is_avialable_memory() to is_memory_available() to avoid the clash.

Signed-off-by: Christoph Lameter <clameter@sgi.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Tony Luck <tony.luck@intel.com>
2006-12-07 13:46:12 -08:00
Tony Luck
a5f8ee0291 [IA64] Need export for csum_ipv6_magic
Now we have our own highly optimized assembly code version of
this routine (Thanks Ken!) we should export it so that it can
be used.

Signed-off-by: Tony Luck <tony.luck@intel.com>
2006-12-07 13:18:57 -08:00
Venkatesh Pallipadi
17e77b1cc3 [PATCH] Add support for type argument in PAL_GET_PSTATE
PAL_GET_PSTATE accepts a type argument to return different kinds of
frequency information.
Refer: Intel ItaniumArchitecture Software Developer's Manual -
Volume 2: System Architecture, Revision 2.2
(http://developer.intel.com/design/itanium/manuals/245318.htm)

Add the support for type argument and use Instantaneous frequency
in the acpi driver.

Also fix a bug, where in return value of PAL_GET_PSTATE was getting compared
with 'control' bits instead of 'status' bits.

Signed-off-by: Venkatesh Pallipadi <venkatesh.pallipadi@intel.com>
Signed-off-by: Tony Luck <tony.luck@intel.com>
2006-12-07 11:21:55 -08:00
Chen, Kenneth W
6dbfc19b7e [IA64] tidy up return value of ip_fast_csum
While working on implementing csum_ipv6_magic, I noticed that current
version of ip_fast_csum will potentially return bits above "unsigned
short" as 1.  While no harm is done right now because all call sites
will chop off the upper bits when it uses the return value.  However,
this is still dangerous and buggy.  Here is a patch to enforce that the
function really returns unsigned short in the native register format.

The fix is free as there are plenty open slot to add one more asm instruction.

Signed-off-by: Ken Chen <kenneth.w.chen@intel.com>
Signed-off-by: Tony Luck <tony.luck@intel.com>
2006-12-07 11:19:59 -08:00
Chen, Kenneth W
007d77d0c5 [IA64] implement csum_ipv6_magic for ia64.
The asm version is 4.4 times faster than the generic C version and
10X smaller in code size.

Signed-off-by: Ken Chen <kenneth.w.chen@intel.com>
Signed-off-by: Tony Luck <tony.luck@intel.com>
2006-12-07 11:17:26 -08:00
Russ Anderson
5b4d5681ff [IA64] More Itanium PAL spec updates
Additional updates to conform with Rev 2.2 of Volume 2 of "Intel
Itanium Architecture Software Developer's Manual" (January 2006).

Add pal_bus_features_s bits 52 & 53 (page 2:347)
Add pal_vm_info_2_s field max_purges (page 2:2:451)
Add PAL_GET_HW_POLICY call (page 2:381)
Add PAL_SET_HW_POLICY call (page 2:439)

Sample output before:
---------------------------------------------------------------------
cobra:~ # cat /proc/pal/cpu0/vm_info
Physical Address Space         : 50 bits
Virtual Address Space          : 61 bits
Protection Key Registers(PKR)  : 16
Implemented bits in PKR.key    : 24
Hash Tag ID                    : 0x2
Size of RR.rid                 : 24
Supported memory attributes    : WB, UC, UCE, WC, NaTPage
---------------------------------------------------------------------

Sample output after:
---------------------------------------------------------------------
cobra:~ # cat /proc/pal/cpu0/vm_info
Physical Address Space         : 50 bits
Virtual Address Space          : 61 bits
Protection Key Registers(PKR)  : 16
Implemented bits in PKR.key    : 24
Hash Tag ID                    : 0x2
Max Purges                     : 1
Size of RR.rid                 : 24
Supported memory attributes    : WB, UC, UCE, WC, NaTPage
---------------------------------------------------------------------

Signed-off-by: Russ Anderson (rja@sgi.com)
Signed-off-by: Tony Luck <tony.luck@intel.com>
2006-12-07 11:10:16 -08:00
Russ Anderson
895309ff6f [IA64] Update processor_info features
Add the printing of additional processor features to proc_features.

Based on Rev 2.2 of Volume 2 of "Intel Itanium Architecture Software
Developer's Manual" (January 2006) fields (pages 2:430-2:432).
This patch gets the features back in sync with the spec.

Sample output before:
--------------------------------------------------------------
cobra:~ # cat /proc/pal/cpu0/processor_info
XIP,XPSR,XFS implemented                 : On NoCtrl
XR1-XR3 implemented                      : On NoCtrl
Disable dynamic predicate prediction     : NotImpl
Disable processor physical number        : NotImpl
Disable dynamic data cache prefetch      : NotImpl
Disable dynamic inst cache prefetch      : NotImpl
Disable dynamic branch prediction        : NotImpl
Disable BINIT on processor time-out      : On Ctrl
Disable dynamic power management (DPM)   : NotImpl
Disable coherency                        : NotImpl
Disable cache                            : NotImpl
Enable CMCI promotion                    : Off Ctrl
Enable MCA to BINIT promotion            : Off Ctrl
Enable MCA promotion                     : NotImpl
Enable BERR promotion                    : NotImpl
cobra:~ #
--------------------------------------------------------------

Sample output after:
--------------------------------------------------------------
cobra:~ # cat /proc/pal/cpu0/processor_info
Unimplemented instruction address fault  : NotImpl
INIT, PMI, and LINT pins                 : NotImpl
Simple unimplimented instr addresses     : On NoCtrl
Variable P-state performance             : NotImpl
Virtual machine features implemeted      : On NoCtrl
XIP,XPSR,XFS implemented                 : On NoCtrl
XR1-XR3 implemented                      : On NoCtrl
Disable dynamic predicate prediction     : NotImpl
Disable processor physical number        : NotImpl
Disable dynamic data cache prefetch      : NotImpl
Disable dynamic inst cache prefetch      : NotImpl
Disable dynamic branch prediction        : NotImpl
Disable P-states                         : Off Ctrl
Enable MCA on Data Poisoning             : Off Ctrl
Enable vmsw instruction                  : On Ctrl
Enable extern environmental notification : NotImpl
Disable BINIT on processor time-out      : On Ctrl
Disable dynamic power management (DPM)   : NotImpl
Disable coherency                        : NotImpl
Disable cache                            : NotImpl
Enable CMCI promotion                    : Off Ctrl
Enable MCA to BINIT promotion            : Off Ctrl
Enable MCA promotion                     : NotImpl
Enable BERR promotion                    : NotImpl
cobra:~ #
--------------------------------------------------------------

Signed-off-by: Russ Anderson (rja@sgi.com)
Signed-off-by: Tony Luck <tony.luck@intel.com>
2006-12-07 11:06:35 -08:00
John Keller
c69577711a [IA64] SN: Correctly update smp_affinty mask
On Altix systems, the /proc/irq/nn/smp_affinity mask is not being setup
at device iniitalization, or updated after an interrupt redirection.
This patch resolves those issues.

Signed-off-by: John Keller <jpk@sgi.com>
Signed-off-by: Tony Luck <tony.luck@intel.com>
2006-12-07 10:50:09 -08:00
Matthew Wilcox
d61b49c1aa [IA64] sparse cleanups
0/NULL confusion and some missing UL on constants.

Signed-off-by: Matthew Wilcox <matthew@wil.cx>
Signed-off-by: Tony Luck <tony.luck@intel.com>
2006-12-07 10:48:19 -08:00
Zou Nan hai
a79561134f [IA64] IA64 Kexec/kdump
Changes and updates.

1. Remove fake rendz path and related code according to discuss with Khalid Aziz.
2. fc.i offset fix in relocate_kernel.S.
3. iospic shutdown code eoi and mask race fix from Fujitsu.
4. Warm boot hook in machine_kexec to SN SAL code from Jack Steiner.
5. Send slave to SAL slave loop patch from Jay Lan.
6. Kdump on non-recoverable MCA event patch from Jay Lan
7. Use CTL_UNNUMBERED in kdump_on_init sysctl.

Signed-off-by: Zou Nan hai <nanhai.zou@intel.com>
Signed-off-by: Tony Luck <tony.luck@intel.com>
2006-12-07 09:51:35 -08:00
Linus Torvalds
4522d58275 Merge branch 'for-linus' of git://one.firstfloor.org/home/andi/git/linux-2.6
* 'for-linus' of git://one.firstfloor.org/home/andi/git/linux-2.6: (156 commits)
  [PATCH] x86-64: Export smp_call_function_single
  [PATCH] i386: Clean up smp_tune_scheduling()
  [PATCH] unwinder: move .eh_frame to RODATA
  [PATCH] unwinder: fully support linker generated .eh_frame_hdr section
  [PATCH] x86-64: don't use set_irq_regs()
  [PATCH] x86-64: check vector in setup_ioapic_dest to verify if need setup_IO_APIC_irq
  [PATCH] x86-64: Make ix86 default to HIGHMEM4G instead of NOHIGHMEM
  [PATCH] i386: replace kmalloc+memset with kzalloc
  [PATCH] x86-64: remove remaining pc98 code
  [PATCH] x86-64: remove unused variable
  [PATCH] x86-64: Fix constraints in atomic_add_return()
  [PATCH] x86-64: fix asm constraints in i386 atomic_add_return
  [PATCH] x86-64: Correct documentation for bzImage protocol v2.05
  [PATCH] x86-64: replace kmalloc+memset with kzalloc in MTRR code
  [PATCH] x86-64: Fix numaq build error
  [PATCH] x86-64: include/asm-x86_64/cpufeature.h isn't a userspace header
  [PATCH] unwinder: Add debugging output to the Dwarf2 unwinder
  [PATCH] x86-64: Clarify error message in GART code
  [PATCH] x86-64: Fix interrupt race in idle callback (3rd try)
  [PATCH] x86-64: Remove unwind stack pointer alignment forcing again
  ...

Fixed conflict in include/linux/uaccess.h manually

Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-12-07 08:59:11 -08:00
Ingo Molnar
0231606785 [PATCH] hotplug CPU: clean up hotcpu_notifier() use
There was lots of #ifdef noise in the kernel due to hotcpu_notifier(fn,
prio) not correctly marking 'fn' as used in the !HOTPLUG_CPU case, and thus
generating compiler warnings of unused symbols, hence forcing people to add
#ifdefs.

the compiler can skip truly unused functions just fine:

    text    data     bss     dec     hex filename
 1624412  728710 3674856 6027978  5bfaca vmlinux.before
 1624412  728710 3674856 6027978  5bfaca vmlinux.after

[akpm@osdl.org: topology.c fix]
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-12-07 08:39:39 -08:00
Masami Hiramatsu
b4c6c34a53 [PATCH] kprobes: enable booster on the preemptible kernel
When we are unregistering a kprobe-booster, we can't release its
instruction buffer immediately on the preemptive kernel, because some
processes might be preempted on the buffer.  The freeze_processes() and
thaw_processes() functions can clean most of processes up from the buffer.
There are still some non-frozen threads who have the PF_NOFREEZE flag.  If
those threads are sleeping (not preempted) at the known place outside the
buffer, we can ensure safety of freeing.

However, the processing of this check routine takes a long time.  So, this
patch introduces the garbage collection mechanism of insn_slot.  It also
introduces the "dirty" flag to free_insn_slot because of efficiency.

The "clean" instruction slots (dirty flag is cleared) are released
immediately.  But the "dirty" slots which are used by boosted kprobes, are
marked as garbages.  collect_garbage_slots() will be invoked to release
"dirty" slots if there are more than INSNS_PER_PAGE garbage slots or if
there are no unused slots.

Cc: "Keshavamurthy, Anil S" <anil.s.keshavamurthy@intel.com>
Cc: Ananth N Mavinakayanahalli <ananth@in.ibm.com>
Cc: "bibo,mao" <bibo.mao@intel.com>
Cc: Prasanna S Panchamukhi <prasanna@in.ibm.com>
Cc: Yumiko Sugita <yumiko.sugita.yf@hitachi.com>
Cc: Satoshi Oshima <soshima@redhat.com>
Cc: Hideo Aoki <haoki@redhat.com>
Signed-off-by: Masami Hiramatsu <masami.hiramatsu.pt@hitachi.com>
Acked-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-12-07 08:39:38 -08:00
Magnus Damm
386d9a7edd [PATCH] elf: Always define elf_addr_t in linux/elf.h
Define elf_addr_t in linux/elf.h.  The size of the type is determined using
ELF_CLASS.  This allows us to remove the defines that today are spread all
over .c and .h files.

Signed-off-by: Magnus Damm <magnus@valinux.co.jp>
Cc: Daniel Jacobowitz <drow@false.org>
Cc: Roland McGrath <roland@redhat.com>
Cc: Jakub Jelinek <jakub@redhat.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-12-07 08:39:38 -08:00
Christoph Lameter
e18b890bb0 [PATCH] slab: remove kmem_cache_t
Replace all uses of kmem_cache_t with struct kmem_cache.

The patch was generated using the following script:

	#!/bin/sh
	#
	# Replace one string by another in all the kernel sources.
	#

	set -e

	for file in `find * -name "*.c" -o -name "*.h"|xargs grep -l $1`; do
		quilt add $file
		sed -e "1,\$s/$1/$2/g" $file >/tmp/$$
		mv /tmp/$$ $file
		quilt refresh
	done

The script was run like this

	sh replace kmem_cache_t "struct kmem_cache"

Signed-off-by: Christoph Lameter <clameter@sgi.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-12-07 08:39:25 -08:00
Christoph Lameter
e94b176609 [PATCH] slab: remove SLAB_KERNEL
SLAB_KERNEL is an alias of GFP_KERNEL.

Signed-off-by: Christoph Lameter <clameter@sgi.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-12-07 08:39:24 -08:00
Chen, Kenneth W
39dde65c99 [PATCH] shared page table for hugetlb page
Following up with the work on shared page table done by Dave McCracken.  This
set of patch target shared page table for hugetlb memory only.

The shared page table is particular useful in the situation of large number of
independent processes sharing large shared memory segments.  In the normal
page case, the amount of memory saved from process' page table is quite
significant.  For hugetlb, the saving on page table memory is not the primary
objective (as hugetlb itself already cuts down page table overhead
significantly), instead, the purpose of using shared page table on hugetlb is
to allow faster TLB refill and smaller cache pollution upon TLB miss.

With PT sharing, pte entries are shared among hundreds of processes, the cache
consumption used by all the page table is smaller and in return, application
gets much higher cache hit ratio.  One other effect is that cache hit ratio
with hardware page walker hitting on pte in cache will be higher and this
helps to reduce tlb miss latency.  These two effects contribute to higher
application performance.

Signed-off-by: Ken Chen <kenneth.w.chen@intel.com>
Acked-by: Hugh Dickins <hugh@veritas.com>
Cc: Dave McCracken <dmccr@us.ibm.com>
Cc: William Lee Irwin III <wli@holomorphy.com>
Cc: "Luck, Tony" <tony.luck@intel.com>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Cc: David Gibson <david@gibson.dropbear.id.au>
Cc: Adam Litke <agl@us.ibm.com>
Cc: Paul Mundt <lethal@linux-sh.org>
Cc: "David S. Miller" <davem@davemloft.net>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-12-07 08:39:21 -08:00
Siddha, Suresh B
72486f1f8f [PATCH] i386: change the 'no_control' field to 'hotpluggable' in the struct cpu
Change the 'no_control' field in the cpu struct to a more positive
and better term 'hotpluggable'. And change(/cleanup) the logic accordingly.

Signed-off-by: Suresh Siddha <suresh.b.siddha@intel.com>
Signed-off-by: Andi Kleen <ak@suse.de>
Cc: Andi Kleen <ak@suse.de>
Cc: "Li, Shaohua" <shaohua.li@intel.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
2006-12-07 02:14:10 +01:00
Peter Chubb
c7f570a5ec [IA64] Fix pci.c kernel compilation breakage.
The recent change to convert the is_enabled flag in the PCI device to an
atomic count broke the IA64 compilation.

As pcibios_disable_device is only ever called if the reference count
is zero, convert the if to a BUG_ON.

Signed-off-by: Peter Chubb <peterc@gelato.unsw.edu.au>
Signed-off-by: Tony Luck <tony.luck@intel.com>
2006-12-06 14:13:38 -08:00
David Howells
6d5aefb8ea WorkQueue: Fix up arch-specific work items where possible
Fix up arch-specific work items where possible to use the new work_struct and
delayed_work structs.

Three places that enqueue bits of their stack and then return have been marked
with #error as this is not permitted.

Signed-Off-By: David Howells <dhowells@redhat.com>
2006-12-05 19:36:26 +00:00
Al Viro
322529961e [NET]: IA64 checksum annotations and cleanups.
* sanitize prototypes, annotate
* ntohs -> shift in checksum calculations
* kill access_ok() in csum_partial_copy_from_user
* collapse do_csum_partial_copy_from_user

Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
Signed-off-by: David S. Miller <davem@davemloft.net>
2006-12-02 21:23:05 -08:00
Linus Torvalds
72a73a69f6 Merge master.kernel.org:/pub/scm/linux/kernel/git/gregkh/pci-2.6
* master.kernel.org:/pub/scm/linux/kernel/git/gregkh/pci-2.6: (28 commits)
  PCI: make arch/i386/pci/common.c:pci_bf_sort static
  PCI: ibmphp_pci.c: fix NULL dereference
  pciehp: remove unnecessary pci_disable_msi
  pciehp: remove unnecessary free_irq
  PCI: rpaphp: change device tree examination
  PCI: Change memory allocation for acpiphp slots
  i2c-i801: SMBus patch for Intel ICH9
  PCI: irq: irq and pci_ids patch for Intel ICH9
  PCI: pci_{enable,disable}_device() nestable ports
  PCI: switch pci_{enable,disable}_device() to be nestable
  PCI: arch/i386/kernel/pci-dma.c: ioremap balanced with iounmap
  pci/i386: style cleanups
  PCI: Block on access to temporarily unavailable pci device
  pci: fix __pci_register_driver error handling
  pci: clear osc support flags if no _OSC method
  acpiphp: fix missing acpiphp_glue_exit()
  acpiphp: fix use of list_for_each macro
  Altix: Initial ACPI support - ROM shadowing.
  Altix: SN ACPI hotplug support.
  Altix: Add initial ACPI IO support
  ...
2006-12-01 16:41:27 -08:00
John Keller
a2302c68d9 Altix: Initial ACPI support - ROM shadowing.
Support a shadowed ROM when running with an ACPI capable PROM.

Define a new dev.resource flag IORESOURCE_ROM_BIOS_COPY to
describe the case of a BIOS shadowed ROM, which can then
be used to avoid pci_map_rom() making an unneeded call to
pci_enable_rom().


Signed-off-by: John Keller <jpk@sgi.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
2006-12-01 14:36:58 -08:00
John Keller
8ea6091f50 Altix: Add initial ACPI IO support
First phase in introducing ACPI support to SN.
In this phase, when running with an ACPI capable PROM,
the DSDT will define the root busses and all SN nodes
(SGIHUB, SGITIO). An ACPI bus driver will be registered
for the node devices, with the acpi_pci_root_driver being
used for the root busses. An ACPI vendor descriptor is
now used to pass platform specific information for both
nodes and busses, eliminating the need for the current
SAL calls. Also, with ACPI support, SN fixup code is no longer
needed to initiate the PCI bus scans, as the acpi_pci_root_driver
does that.

However, to maintain backward compatibility with non-ACPI capable
PROMs, none of the current 'fixup' code can been deleted, though
much restructuring has been done. For example, the bulk of the code
in io_common.c is relocated code that is now common regardless
of what PROM is running, while io_acpi_init.c and io_init.c contain
routines specific to an ACPI or non ACPI capable PROM respectively.

A new pci bus fixup platform vector has been created to provide
a hook for invoking platform specific bus fixup from pcibios_fixup_bus().

The size of io_space[] has been increased to support systems with
large IO configurations.


Signed-off-by: John Keller <jpk@sgi.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
2006-12-01 14:36:57 -08:00
Matthew Wilcox
3efe2d84c8 PCI: Use pci_generic_prep_mwi on ia64
The pci_generic_prep_mwi() code does everything that pcibios_prep_mwi()
does on ia64.  All we need to do is be sure that pci_cache_line_size
is set appropriately, and we can delete pcibios_prep_mwi().

Using SMP_CACHE_BYTES as the default was wrong on uniprocessor machines
as it is only 8 bytes.  The default in the generic code of L1_CACHE_BYTES
is at least as good.

Signed-off-by: Matthew Wilcox <matthew@wil.cx>
Acked-by: Jeff Garzik <jeff@garzik.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
2006-12-01 14:36:56 -08:00
Matt LaPlante
0779bf2d2e Fix misc .c/.h comment typos
Fix various .c/.h typos in comments (no code changes).

Signed-off-by: Matt LaPlante <kernel1@cyberdogtech.com>
Signed-off-by: Adrian Bunk <bunk@stusta.de>
2006-11-30 05:24:39 +01:00
Luck, Tony
cea196bb2e [IA64] a fix towards allmodconfig build
The HP_SIMSCSI driver can't be built as a module (unhealthy dependencies on
things that shouldn't really be exported).

Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Tony Luck <tony.luck@intel.com>
2006-11-16 11:25:12 -08:00
Ingo Molnar
5fbb004aba [IA64] use generic_handle_irq()
Use generic_handle_irq() to handle mixed-type irq handling.

Signed-off-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Tony Luck <tony.luck@intel.com>
2006-11-16 09:38:35 -08:00
Ingo Molnar
06344db316 [IA64] typename -> name conversion
convert irq chip typename -> name.

Signed-off-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Tony Luck <tony.luck@intel.com>
2006-11-16 09:38:02 -08:00
Andrew Morton
351a58390a [IA64] irqs: use name' not typename'
`typename' is going away and is usually uninitialised anwyay.

Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Tony Luck <tony.luck@intel.com>
2006-11-16 09:37:45 -08:00
Robin Holt
cbf093e8c7 [IA64] bte_unaligned_copy() transfers one extra cache line.
When called to do a transfer that has a start offset within the cache
line which is uneven between source and destination and a length which
terminates the source of the copy exactly on a cache line, one extra
line gets copied into a temporary buffer.  This is normally not an issue
since the buffer is a kernel buffer and only the requested information
gets copied into the user buffer.

The problem arises when the source ends at the very last physical page
of memory.  That last cache line does not exist and results in the SHUB
chip raising an MCA.

Signed-off-by: Robin Holt <holt@sgi.com>
Signed-off-by: Dean Nelson <dcn@sgi.com>
Signed-off-by: Tony Luck <tony.luck@intel.com>
2006-11-15 10:12:15 -08:00
Hugh Dickins
68589bc353 [PATCH] hugetlb: prepare_hugepage_range check offset too
(David:)

If hugetlbfs_file_mmap() returns a failure to do_mmap_pgoff() - for example,
because the given file offset is not hugepage aligned - then do_mmap_pgoff
will go to the unmap_and_free_vma backout path.

But at this stage the vma hasn't been marked as hugepage, and the backout path
will call unmap_region() on it.  That will eventually call down to the
non-hugepage version of unmap_page_range().  On ppc64, at least, that will
cause serious problems if there are any existing hugepage pagetable entries in
the vicinity - for example if there are any other hugepage mappings under the
same PUD.  unmap_page_range() will trigger a bad_pud() on the hugepage pud
entries.  I suspect this will also cause bad problems on ia64, though I don't
have a machine to test it on.

(Hugh:)

prepare_hugepage_range() should check file offset alignment when it checks
virtual address and length, to stop MAP_FIXED with a bad huge offset from
unmapping before it fails further down.  PowerPC should apply the same
prepare_hugepage_range alignment checks as ia64 and all the others do.

Then none of the alignment checks in hugetlbfs_file_mmap are required (nor
is the check for too small a mapping); but even so, move up setting of
VM_HUGETLB and add a comment to warn of what David Gibson discovered - if
hugetlbfs_file_mmap fails before setting it, do_mmap_pgoff's unmap_region
when unwinding from error will go the non-huge way, which may cause bad
behaviour on architectures (powerpc and ia64) which segregate their huge
mappings into a separate region of the address space.

Signed-off-by: Hugh Dickins <hugh@veritas.com>
Cc: "Luck, Tony" <tony.luck@intel.com>
Cc: "David S. Miller" <davem@davemloft.net>
Acked-by: Adam Litke <agl@us.ibm.com>
Acked-by: David Gibson <david@gibson.dropbear.id.au>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-11-14 09:09:27 -08:00
Jes Sorensen
1a4b0fc503 [PATCH] mspec driver build fix
Fix MSPEC driver to build for non SN2 enabled configs as the driver should
work in cached and uncached modes (no fetchop) on these systems.  In
addition make MSPEC select IA64_UNCACHED_ALLOCATOR, which is required for
it and move it to arch/ia64/Kconfig to avoid warnings on non ia64
architectures running allmodconfig.  Once the Kconfig code is fixed, we can
move it back.

Signed-off-by: Jes Sorensen <jes@sgi.com>
Cc: Fernando Luis Vzquez Cao <fernando@oss.ntt.co.jp>
Cc: "Luck, Tony" <tony.luck@intel.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-11-13 07:40:42 -08:00
KAMEZAWA Hiroyuki
6c33eb3997 [PATCH] ia64: select ACPI_NUMA if ACPI
When ACPI && NUMA, pxm_to_node is used and it exists in drivers/acpi/numa.c

Tony said:

  The patch makes sense ...  if you pick both of "ACPI" and "NUMA", then you
  need (and should automatically be given) ACPI_NUMA too.

  The only open question is whether there is a better way of getting there.
  Perhaps with less configuration options in the first place?  We are heading
  towards a future where so many systems will be NUMA that there would seem to
  be little benefit in keeping ACPI_NUMA separate from ACPI ...  but perhaps
  we aren't quite there yet.

Signed-off-by: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujtisu.com>
Cc: Len Brown <lenb@kernel.org>
Acked-by: "Luck, Tony" <tony.luck@intel.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-11-08 18:29:24 -08:00
Linus Torvalds
d5b9b787b5 Merge branch 'release' of master.kernel.org:/pub/scm/linux/kernel/git/aegl/linux-2.6
* 'release' of master.kernel.org:/pub/scm/linux/kernel/git/aegl/linux-2.6:
  [IA64] Correct definition of handle_IPI
  [IA64] move SAL_CACHE_FLUSH check later in boot
  [IA64] MCA recovery: Montecito support
  [IA64] cpu-hotplug: Fixing confliction between CPU hot-add and IPI
  [IA64] don't double >> PAGE_SHIFT pointer for /dev/kmem access
2006-10-31 17:03:50 -08:00
Keith Owens
024e4f2c51 [IA64] Correct definition of handle_IPI
The declaration of handle_IPI in arch/ia64/kernel/smp.c was changed but
not the definition of this function.  Remove struct pt_regs from
handle_IPI().

Signed-off-by: Keith Owens <kaos@sgi.com>
Signed-off-by: Tony Luck <tony.luck@intel.com>
2006-10-31 14:38:15 -08:00
Troy Heber
fa1d19e5d9 [IA64] move SAL_CACHE_FLUSH check later in boot
The check to see if the firmware drops interrupts during a
SAL_CACHE_FLUSH is done to early in the boot. SAL_CACHE_FLUSH expects
to be able to make PAL calls in virtual mode, on some cell based
machines a fault occurs causing a MCA. This patch moves the check
after mmu_context_init so the TLB and VHPT are properly setup.

Signed-off-by Troy Heber <troy.heber@hp.com>
Signed-off-by: Tony Luck <tony.luck@intel.com>
2006-10-31 14:32:10 -08:00
Russ Anderson
264b0f9930 [IA64] MCA recovery: Montecito support
The information in MCA records is filled in slightly differently on
Montecito than on Madison/McKinley.  Usually, the cache check and bus
check target identifiers have the same address.   On Montecito the
cache check and bus check target identifiers can be different if 
a corrected error (ie SBE or unconsumed poison data) was encountered and
then an uncorrected error (ie DBE) was consumed.  In that case, the 
cache check target identifier is the physical address of the DBE (that
caused the MCA to surface) while the bus check target identifier is the 
physical address of the SBE.  This patch correctly finds the target
identifier that triggered the MCA.

If there are multiple valid cache target identifiers in the same
error record then use the one with the lowest cache level.

Signed-off-by: Russ Anderson (rja@sgi.com)
Signed-off-by: Tony Luck <tony.luck@intel.com>
2006-10-31 14:30:34 -08:00
Kenji Kaneshige
5ee7737379 [IA64] cpu-hotplug: Fixing confliction between CPU hot-add and IPI
Signed-off-by: Kenji Kaneshige <kaneshige.kenji@jp.fujitsu.com>
Acked-by: Satoru Takeuchi <takeuchi_satoru@jp.fujitsu.com>
Acked-by: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>
Signed-off-by: Tony Luck <tony.luck@intel.com>
2006-10-31 14:17:27 -08:00
Linus Torvalds
fe31eb6797 Merge master.kernel.org:/pub/scm/linux/kernel/git/gregkh/pci-2.6
* master.kernel.org:/pub/scm/linux/kernel/git/gregkh/pci-2.6:
  PCI: Remove quirk_via_abnormal_poweroff
  PCI: reset pci device state to unknown state for resume
  PCI: x86-64: mmconfig missing printk levels
  PCI: fix pci_fixup_video as it blows up on sparc64
  acpiphp: fix latch status
2006-10-27 15:35:28 -07:00
Andrew Morton
61ce1efe6e [PATCH] vmlinux.lds: consolidate initcall sections
Add a vmlinux.lds.h helper macro for defining the eight-level initcall table,
teach all the architectures to use it.

This is a prerequisite for a patch which performs initcall synchronisation for
multithreaded-probing.

Cc: Greg KH <greg@kroah.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
[ Added AVR32 as well ]
Signed-off-by: Haavard Skinnemoen <hskinnemoen@atmel.com>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-10-27 15:34:51 -07:00
Eiichiro Oiwa
6b5c76b8e2 PCI: fix pci_fixup_video as it blows up on sparc64
This reverts much of the original pci_fixup_video change and makes it
work for all arches that need it.

fixed, and tested on x86, x86_64 and IA64 dig.

Signed-off-by: Eiichiro Oiwa <eiichiro.oiwa.nm@hitachi.com>
Acked-by: David Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
2006-10-27 11:20:33 -07:00
Jack Steiner
9b3377f992 [IA64] Count resched interrupts
Count the number of "resched" interrupts that each cpu receives.

Signed-off-by: Jack Steiner <steiner@sgi.com>
Signed-off-by: Tony Luck <tony.luck@intel.com>
2006-10-17 15:03:08 -07:00