Commit Graph

911 Commits

Author SHA1 Message Date
David S. Miller
83292e0a9c [SPARC64]: Fix MD property lifetime bugs.
Property values cannot be referenced outside of
mdesc_grab()/mdesc_release() pairs.  The only major
offender was the VIO bus layer, easily fixed.

Add some commentary to mdesc.h describing these rules.

Signed-off-by: David S. Miller <davem@davemloft.net>
2007-07-16 04:04:33 -07:00
David S. Miller
43fdf27470 [SPARC64]: Abstract out mdesc accesses for better MD update handling.
Since we have to be able to handle MD updates, having an in-tree
set of data structures representing the MD objects actually makes
things more painful.

The MD itself is easy to parse, and we can implement the existing
interfaces using direct parsing of the MD binary image.

The MD is now reference counted, so accesses have to now take the
form:

	handle = mdesc_grab();

	... operations on MD ...

	mdesc_release(handle);

The only remaining issue are cases where code holds on to references
to MD property values.  mdesc_get_property() returns a direct pointer
to the property value, most cases just pull in the information they
need and discard the pointer, but there are few that use the pointer
directly over a long lifetime.  Those will be fixed up in a subsequent
changeset.

A preliminary handler for MD update events from domain services is
there, it is rudimentry but it works and handles all of the reference
counting.  It does not check the generation number of the MDs,
and it does not generate a "add/delete" list for notification to
interesting parties about MD changes but that will be forthcoming.

Signed-off-by: David S. Miller <davem@davemloft.net>
2007-07-16 04:04:28 -07:00
David S. Miller
133f09a169 [SPARC64]: Use more mearningful names for IRQ registry.
All of the interrupts say "LDX RX" and "LDX TX" currently
which is next to useless.  Put a device specific prefix
before "RX" and "TX" instead which makes it much more
useful.

Signed-off-by: David S. Miller <davem@davemloft.net>
2007-07-16 04:04:24 -07:00
David S. Miller
e450992d13 [SPARC64]: Initial domain-services driver.
Signed-off-by: David S. Miller <davem@davemloft.net>
2007-07-16 04:04:20 -07:00
David S. Miller
13077d8028 [SPARC64]: Export powerd facilities for external entities.
Besides the existing usage for power-button interrupts, we'll
want to make use of this code for domain-services where the
LDOM manager can send reboot requests to the guest node.

Signed-off-by: David S. Miller <davem@davemloft.net>
2007-07-16 04:04:16 -07:00
David S. Miller
2c4f4ecb7a [SPARC64]: Add domain-services nodes to VIO device tree.
They sit under the root of the MD tree unlike the rest of
the LDC channel based virtual devices.

Signed-off-by: David S. Miller <davem@davemloft.net>
2007-07-16 04:04:13 -07:00
David S. Miller
cb48123584 [SPARC64]: Assorted LDC bug cures.
1) LDC_MODE_RELIABLE is deprecated an unused by anything, plus
   it and LDC_MODE_STREAM were mis-numbered.

2) read_stream() should try to read as much as possible into
   the per-LDC stream buffer area, so do not trim the read_nonraw()
   length by the caller's size parameter.

3) Send data ACKs when necessary in read_nonraw().

4) In read_nonraw() when we get a pure ACK, advance the RX head
   unconditionally past it.

5) Provide the ACKID field in the ldcdgb() packet dump in read_nonraw().
   This helps debugging stream mode LDC channel problems.

6) Decrease verbosity of rx_data_wait() so that it is more useful.
   A debugging message each loop iteration is too much.

7) In process_data_ack() stop the loop checking when we hit lp->tx_tail
   not lp->tx_head.

8) Set the seqid field properly in send_data_nack().

Signed-off-by: David S. Miller <davem@davemloft.net>
2007-07-16 04:04:09 -07:00
David S. Miller
5a606b72a4 [SPARC64]: Do not ACK an INO if it is disabled or inprogress.
This is also a partial workaround for a bug in the LDOM firmware which
double-transmits RX inos during high load.  Without this, such an
event causes the kernel to loop forever in the interrupt call chain
ACK'ing but never actually running the IRQ handler (and thus clearing
the interrupt condition in the device).

There is still a bad potential effect when double INOs occur,
not covered by this changeset.  Namely, if the INO is already on
the per-cpu INO vector list, we still blindly re-insert it and
thus we can end up losing interrupts already linked in after
it.

We could deal with that by traversing the list before insertion,
but that's too expensive for this edge case.

Signed-off-by: David S. Miller <davem@davemloft.net>
2007-07-16 04:04:05 -07:00
David S. Miller
e53e97ce3c [SPARC64]: Add LDOM virtual channel driver and VIO device layer.
Virtual devices on Sun Logical Domains are built on top
of a virtual channel framework.  This, with help of hypervisor
interfaces, provides a link layer protocol with basic
handshaking over which virtual device clients and servers
communicate.

Built on top of this is a VIO device protocol which has it's
own handshaking and message types.  At this layer attributes
are exchanged (disk size, network device addresses, etc.)
descriptor rings are registered, and data transfers are
triggers and replied to.

Signed-off-by: David S. Miller <davem@davemloft.net>
2007-07-16 04:03:18 -07:00
Matthew Wilcox
36e235901f PCI: Only build PCI syscalls on architectures that want them
The PCI syscalls are built on every architecture except X86, but only
a few have ever hooked them up.  Use a new Kconfig symbol to save a
couple of kB on the architectures that have never used the syscalls.
Tested on x86 and ia64 only.

Signed-off-by: Matthew Wilcox <matthew@wil.cx>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
2007-07-11 16:02:13 -07:00
Auke Kok
b8a3a5214d PCI: read revision ID by default
Currently there are 97 occurrences where drivers need the pci
revision ID. We can do this once for all devices. Even the pci
subsystem needs the revision several times for quirks. The extra
u8 member pads out nicely in the pci_dev struct.

Signed-off-by: Auke Kok <auke-jan.h.kok@intel.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
2007-07-11 16:02:09 -07:00
Ingo Molnar
0437e109e1 sched: zap the migration init / cache-hot balancing code
the SMP load-balancer uses the boot-time migration-cost estimation
code to attempt to improve the quality of balancing. The reason for
this code is that the discrete priority queues do not preserve
the order of scheduling accurately, so the load-balancer skips
tasks that were running on a CPU 'recently'.

this code is fundamental fragile: the boot-time migration cost detector
doesnt really work on systems that had large L3 caches, it caused boot
delays on large systems and the whole cache-hot concept made the
balancing code pretty undeterministic as well.

(and hey, i wrote most of it, so i can say it out loud that it sucks ;-)

under CFS the same purpose of cache affinity can be achieved without
any special cache-hot special-case: tasks are sorted in the 'timeline'
tree and the SMP balancer picks tasks from the left side of the
tree, thus the most cache-cold task is balanced automatically.

Signed-off-by: Ingo Molnar <mingo@elte.hu>
2007-07-09 18:51:57 +02:00
David S. Miller
a357b8f42e [SPARC64]: Need to set state to IDLE during sun4v IRQ enable.
This fixes hypervisor console interrupts on LDOM guests.

Signed-off-by: David S. Miller <davem@davemloft.net>
2007-06-26 00:13:31 -07:00
David S. Miller
1245088400 [SPARC64]: Fix VIRQ enabling.
We were doing the wrong call to turn them on, and also
when enabling we need to forcefully set the state to IDLE.

Signed-off-by: David S. Miller <davem@davemloft.net>
2007-06-26 00:13:09 -07:00
David S. Miller
fc395f8d58 [SPARC64]: Fix args to sun4v_ldc_revoke().
First argument is LDC channel ID, then mapping cookie,
then the MTE revoke cookie.

Signed-off-by: David S. Miller <davem@davemloft.net>
2007-06-13 00:01:27 -07:00
David S. Miller
56f5c0bd50 [SPARC64]: Fix IO/MEM space sizing for PCI.
In pci_determine_mem_io_space(), do not hard code the region sizes.
Instead, use the values given to us in the ranges property.

Thanks goes to Mikael Petterson for the original Xorg failure
bug repoert, and strace dumps from Mikael and Dmitry Artamonow.

Signed-off-by: David S. Miller <davem@davemloft.net>
2007-06-13 00:01:19 -07:00
David S. Miller
4a907dec98 [SPARC64]: Wire up cookie based sun4v interrupt registry.
This will be used for logical domain channel interrupts.

Signed-off-by: David S. Miller <davem@davemloft.net>
2007-06-13 00:01:04 -07:00
David S. Miller
8c2786cfa6 [SPARC64]: Handle PCI bridges without 'ranges' property.
This fixes the IDE controller not showing up on Netra-T1
systems.

Just like Simba bridges, some PCI bridges can lack the
'ranges' OBP property.  So we handle this similarly to
the existing Simba code:

1) In of_device register address resolving, we push the
   translation to the parent.

2) In PCI device scanning, we interrogate the PCI config
   space registers of the PCI bus device in order to resolve
   the resources, just like the generic Linux PCI probing
   code does.

With much help and testing from Fabio, who also reported
the initial problem.

Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Fabio Massimo Di Nitto <fabbione@ubuntu.com>
2007-06-07 21:59:44 -07:00
Robert P. J. Day
ea1ff19ce0 [SPARC64]: Include <linux/rwsem.h> instead of <asm/rwsem.h>.
To be consistent with other architectures, include the generic version
of rwsem.h.

Signed-off-by: Robert P. J. Day <rpjday@mindspring.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2007-06-07 20:24:50 -07:00
David S. Miller
ec4d18f219 [SPARC64]: Fix SBUS IRQ regression caused by PCI-E driver.
We used to access the 64-bit IRQ IMAP and ICLR registers of bus
controllers 4-bytes in and as a 32-bit register word, since only the
low 32-bits were relevant.  This seemed like a good idea at the time.

But the PCI-E controller requires full 8-byte 64-bit access to
these registers, so we switched over to accessing them fully.

SBUS was not adjusted properly, which broke interrupts completely.

Signed-off-by: David S. Miller <davem@davemloft.net>
2007-06-07 16:59:51 -07:00
David S. Miller
321566c250 [SPARC64]: Fix 2 bugs in PCI Sabre bus scanning.
If we are on hummingbird, bus runs at 66MHZ.

pbm->pci_bus should be setup with the result of pci_scan_one_pbm()
or else we deref NULL pointers in the error interrupt handlers.

Signed-off-by: David S. Miller <davem@davemloft.net>
2007-06-07 16:59:46 -07:00
David S. Miller
a2f9f6bbb3 [SPARC64]: Fix {mc,smt}_capable().
It's not just sun4v hypervisor platforms that should return true
for this, sun4u with UltraSPARC-IV should return true too.

Signed-off-by: David S. Miller <davem@davemloft.net>
2007-06-04 21:50:05 -07:00
David S. Miller
5cd342df96 [SPARC64]: Make core and sibling groups equal on UltraSPARC-IV.
Signed-off-by: David S. Miller <davem@davemloft.net>
2007-06-04 21:50:02 -07:00
David S. Miller
f78eae2e6f [SPARC64]: Proper multi-core scheduling support.
The scheduling domain hierarchy is:

   all cpus -->
      cpus that share an instruction cache -->
          cpus that share an integer execution unit

Signed-off-by: David S. Miller <davem@davemloft.net>
2007-06-04 21:50:00 -07:00
David Miller
d887ab3a9b [SPARC64]: Provide mmu statistics via sysfs.
If the system supports hypervisor based statistics, allow them to
be fetched, enabled, and disabled via sysfs.

Enable and disable via the boolean:

/sys/devices/systems/cpu/cpuN/mmustat_enable

Statistic values are provided under:

/sys/devices/systems/cpu/cpuN/mmu_status/

Signed-off-by: David S. Miller <davem@davemloft.net>
2007-06-04 21:49:57 -07:00
David Miller
48b6735640 [SPARC64]: Fix service channel hypervisor function names.
sed 's/scv/svc/'

Signed-off-by: David S. Miller <davem@davemloft.net>
2007-06-04 21:49:54 -07:00
David S. Miller
d1f253e60a [SPARC64]: Export basic cpu properties via sysfs.
Cache sizes, udelay_val, and clock_tick.

Signed-off-by: David S. Miller <davem@davemloft.net>
2007-06-04 21:49:51 -07:00
David S. Miller
eff3414b72 [SPARC64]: Move topology init code into new file, sysfs.c
Also, use per-cpu data for struct cpu.  Calling kmalloc for
each cpu in topology_init() is just plain clumsy.

Signed-off-by: David S. Miller <davem@davemloft.net>
2007-06-04 21:49:50 -07:00
Linus Torvalds
54ca412336 Merge git://git.kernel.org/pub/scm/linux/kernel/git/sam/kbuild-fix
* git://git.kernel.org/pub/scm/linux/kernel/git/sam/kbuild-fix:
  sparc64: fix alignment bug in linker definition script
2007-05-31 12:33:16 -07:00
David S. Miller
dbbe3cb8cf [SPARC64]: Add missing NCS and SVC hypervisor interfaces.
Signed-off-by: David S. Miller <davem@davemloft.net>
2007-05-31 01:52:48 -07:00
Sam Ravnborg
4096b46f01 sparc64: fix alignment bug in linker definition script
The RO_DATA section were hardcoded to a specific
alignment in include/asm-generic/vmlinux.h.
But for sparc64 this did not match the PAGE_SIZE.

Introduce a new section definition named:
RO_DATA that takes actual alignment as parameter.
RODATA are provided for backward compatibility.

On top of this avoid hardcoding alignment for
sparc64 in reset of the script
Fix is build-tested on sparc64 + x86_64.

Signed-off-by: Sam Ravnborg <sam@ravnborg.org>
2007-05-29 21:29:00 +02:00
David S. Miller
7db35f31cb [SPARC64]: Fill holes in hypervisor APIs and fix KTSB registry.
Several interfaces were missing and others misnumbered or
improperly documented.

Also, make sure to check the return value when registering
the kernel TSBs with the hypervisor.  This helped to find
the 4MB kernel TSB alignment bug fixed in a previous changeset.

Signed-off-by: David S. Miller <davem@davemloft.net>
2007-05-29 02:52:15 -07:00
David S. Miller
2d9e2763c2 [SPARC64]: Fix two bugs wrt. kernel 4MB TSB.
1) The TSB lookup was not using the correct hash mask.

2) It was not aligned on a boundary equal to it's size,
   which is required by the sun4v Hypervisor.

wasn't having it's return value checked, and that bug will be fixed up
as well in a subsequent changeset.

Signed-off-by: David S. Miller <davem@davemloft.net>
2007-05-29 02:51:38 -07:00
David S. Miller
679292993c [SPARC64]: Fix _PAGE_EXEC_4U check in sun4u I-TLB miss handler.
It was using an immediate _PAGE_EXEC_4U value in an 'and'
instruction to perform the test.  This doesn't work because
the immediate field is signed 13-bit, this the mask being
tested against the PTE was 0x1000 sign-extended to 32-bits
instead of just plain 0x1000.

Signed-off-by: David S. Miller <davem@davemloft.net>
2007-05-29 02:50:15 -07:00
Horst H. von Brand
7189859f28 [SPARC64]: arch/sparc64/time.c doesn't compile on Ultra 1 (no PCI)
This is bug 8540 on bugzilla.kernel.org

arch/sparc64/time.c contains references to assorted bq4802 stuff if
CONFIG_PCI is not set, and compile fails. I #ifdef'ed out everything
that looks PCI-ish in that file.

Signed-off-by: David S. Miller <davem@davemloft.net>
2007-05-29 02:50:02 -07:00
David S. Miller
22adb358e8 [SPARC64]: Eliminate NR_CPUS limitations.
Cheetah systems can have cpuids as large as 1023, although physical
systems don't have that many cpus.

Only three limitations existed in the kernel preventing arbitrary
NR_CPUS values:

1) dcache dirty cpu state stored in page->flags on
   D-cache aliasing platforms.  With some build time
   calculations and some build-time BUG checks on
   page->flags layout, this one was easily solved.

2) The cheetah XCALL delivery code could only handle
   a cpumask with up to 32 cpus set.  Some simple looping
   logic clears that up too.

3) thread_info->cpu was a u8, easily changed to a u16.

There are a few spots in the kernel that still put NR_CPUS
sized arrays on the kernel stack, but that's not a sparc64
specific problem.

Signed-off-by: David S. Miller <davem@davemloft.net>
2007-05-29 02:49:49 -07:00
David S. Miller
5cbc307373 [SPARC64]: Use machine description and OBP properly for cpu probing.
Signed-off-by: David S. Miller <davem@davemloft.net>
2007-05-29 02:49:41 -07:00
David S. Miller
e01c0d6d8c [SPARC64]: Negotiate hypervisor API for PCI services.
Signed-off-by: David S. Miller <davem@davemloft.net>
2007-05-29 02:49:34 -07:00
David S. Miller
22d6a1cba3 [SPARC64]: Report proper system soft state to the hypervisor.
Signed-off-by: David S. Miller <davem@davemloft.net>
2007-05-29 02:49:29 -07:00
David S. Miller
36b48973b8 [SPARC64]: Fix typo in sun4v_hvapi_register error handling.
Signed-off-by: David S. Miller <davem@davemloft.net>
2007-05-29 02:49:21 -07:00
David S. Miller
5840fc66bb [SPARC64]: PCI device scan is way too verbose by default.
These messages were very useful when bringing up the
OBP based PCI device scan code, but it's just a lot
of noise every bootup now especially on big machines.

The messages can be re-enabled via 'ofpci_debug=1' on
the kernel command line.

Signed-off-by: David S. Miller <davem@davemloft.net>
2007-05-29 02:49:18 -07:00
David S. Miller
59db8102bd [SPARC64]: Don't be picky about virtual-dma values on sun4v.
Handle arbitrary base and length values as long as they
are multiples of IO_PAGE_SIZE.

Bug found by Arun Kumar Rao.

Signed-off-by: David S. Miller <davem@davemloft.net>
2007-05-29 02:49:15 -07:00
Sam Ravnborg
ca967258b6 all-archs: consolidate .data section definition in asm-generic
With this consolidation we can now modify the .data
section definition in one spot for all archs.

Signed-off-by: Sam Ravnborg <sam@ravnborg.org>
2007-05-19 09:11:57 +02:00
Sam Ravnborg
7664709b44 all-archs: consolidate .text section definition in asm-generic
Move definition of .text section to asm-generic.

Signed-off-by: Sam Ravnborg <sam@ravnborg.org>
2007-05-19 09:11:57 +02:00
David S. Miller
03983ab858 [SPARC64]: Fix sched_clock() et al.
SPARC64_NSEC_PER_CYC_SHIFT was set too high.

Signed-off-by: David S. Miller <davem@davemloft.net>
2007-05-17 22:55:26 -07:00
David S. Miller
c7754d465b [SPARC64]: Add hypervisor API negotiation and fix console bugs.
Hypervisor interfaces need to be negotiated in order to use
some API calls reliably.  So add a small set of interfaces
to request API versions and query current settings.

This allows us to fix some bugs in the hypervisor console:

1) If we can negotiate API group CORE of at least major 1
   minor 1 we can use con_read and con_write which can improve
   console performance quite a bit.

2) When we do a console write request, we should hold the
   spinlock around the whole request, not a byte at a time.
   What would happen is that it's easy for output from
   different cpus to get mixed with each other.

3) Use consistent udelay() based polling, udelay(1) each
   loop with a limit of 1000 polls to handle stuck hypervisor
   console.

Signed-off-by: David S. Miller <davem@davemloft.net>
2007-05-15 20:23:02 -07:00
David S. Miller
be35cf01a9 [SPARC64]: Update defconfig.
Signed-off-by: David S. Miller <davem@davemloft.net>
2007-05-14 04:19:01 -07:00
David S. Miller
17f34f0ec9 [SPARC64]: Add missing cpus_empty() check in hypervisor xcall handling.
Signed-off-by: David S. Miller <davem@davemloft.net>
2007-05-14 02:01:52 -07:00
David S. Miller
49d23cfcec [SPARC64]: Be more resiliant with PCI I/O space regs.
If we miss on the ranges, just toss the translation up to the parent
instead of failing.

Signed-off-by: David S. Miller <davem@davemloft.net>
2007-05-13 22:01:18 -07:00
David S. Miller
8354c5b726 [SPARC]: Wire up signalfd/timerfd/eventfd syscalls.
Signed-off-by: David S. Miller <davem@davemloft.net>
2007-05-11 22:06:51 -07:00
David S. Miller
d037e0532e [SPARC64]: Add support for bq4802 TOD chip, as found on ultra45.
Signed-off-by: David S. Miller <davem@davemloft.net>
2007-05-11 21:39:27 -07:00
David S. Miller
95d71e663e [SPARC64]: Correct FIRE_IOMMU_FLUSHINV register offset.
Signed-off-by: David S. Miller <davem@davemloft.net>
2007-05-11 21:39:26 -07:00
David S. Miller
d77311f942 [SPARC64]: Update defconfig.
Signed-off-by: David S. Miller <davem@davemloft.net>
2007-05-11 21:39:24 -07:00
David S. Miller
f16537bac7 [SPARC64]: pci_resource_adjust() cannot be __init.
Noticed by Meelis Roos.

Signed-off-by: David S. Miller <davem@davemloft.net>
2007-05-11 21:39:22 -07:00
Simon Arlott
e5dd42e4fb [SPARC64]: Spelling fixes.
Spelling fixes in arch/sparc64/.

Signed-off-by: Simon Arlott <simon@fire.lp0.eu>
Signed-off-by: David S. Miller <davem@davemloft.net>
2007-05-11 21:39:21 -07:00
David S. Miller
e9429eacd7 [SPARC64]: Kill LARGE_ALLOCS and update defconfig.
Let's use SLUB, since it works now, in order to get it
tested a bit.

Signed-off-by: David S. Miller <davem@davemloft.net>
2007-05-11 21:39:19 -07:00
Amy Griffis
e54dc2431d [PATCH] audit signal recipients
When auditing syscalls that send signals, log the pid and security
context for each target process. Optimize the data collection by
adding a counter for signal-related rules, and avoiding allocating an
aux struct unless we have more than one target process. For process
groups, collect pid/context data in blocks of 16. Move the
audit_signal_info() hook up in check_kill_permission() so we audit
attempts where permission is denied.

Signed-off-by: Amy Griffis <amy.griffis@hp.com>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2007-05-11 05:38:25 -04:00
Amy Griffis
7f13da40e3 [PATCH] add SIGNAL syscall class (v3)
Add a syscall class for sending signals.

Signed-off-by: Amy Griffis <amy.griffis@hp.com>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2007-05-11 05:38:25 -04:00
Linus Torvalds
0ab598099c Merge master.kernel.org:/pub/scm/linux/kernel/git/davem/sparc-2.6
* master.kernel.org:/pub/scm/linux/kernel/git/davem/sparc-2.6:
  [SPARC64]: Use alloc_pci_dev() in PCI bus probes.
  [SPARC64]: Bump PROMINTR_MAX to 32.
  [SPARC64]: Fix recursion in PROM tree building.
  [SERIAL] sunzilog: Interrupt enable before ISR handler installed
  [SPARC64] PCI: Consolidate PCI access code into pci_common.c
2007-05-10 13:32:05 -07:00
David S. Miller
26e6385f14 [SPARC64]: Use alloc_pci_dev() in PCI bus probes.
Otherwise MSI explodes because pci_msi_init_pci_dev() does not
get invoked.

Signed-off-by: David S. Miller <davem@davemloft.net>
2007-05-10 02:16:27 -07:00
David S. Miller
aa5242e78f [SPARC64]: Fix recursion in PROM tree building.
Use iteration for scanning of PROM node siblings.

Based upon a patch by Greg Onufer, who found this bug.

Signed-off-by: David S. Miller <davem@davemloft.net>
2007-05-10 00:53:29 -07:00
Fernando Luis Vazquez Cao
2f4dfe206a Remove hardcoding of hard_smp_processor_id on UP systems
With the advent of kdump, the assumption that the boot CPU when booting an UP
kernel is always the CPU with a particular hardware ID (often 0) (usually
referred to as BSP on some architectures) is not valid anymore.  The reason
being that the dump capture kernel boots on the crashed CPU (the CPU that
invoked crash_kexec), which may be or may not be that particular CPU.

Move definition of hard_smp_processor_id for the UP case to
architecture-specific code ("asm/smp.h") where it belongs, so that each
architecture can provide its own implementation.

Signed-off-by: Fernando Luis Vazquez Cao <fernando@oss.ntt.co.jp>
Cc: "Luck, Tony" <tony.luck@intel.com>
Acked-by: Andi Kleen <ak@suse.de>
Cc: "Eric W. Biederman" <ebiederm@xmission.com>
Cc: Vivek Goyal <vgoyal@in.ibm.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2007-05-09 12:30:48 -07:00
David S. Miller
ca3dd88e41 [SPARC64] PCI: Consolidate PCI access code into pci_common.c
All the sun4u controllers do the same thing to compute the physical
I/O address to poke, and we can move the sun4v code into this common
location too.

This one needs a bit of testing, in particular the Sabre code had some
funny stuff that would break up u16 and/or u32 accesses into pieces
and I didn't think that was needed any more.  If it is we need to find
out why and add back code to do it again.

Signed-off-by: David S. Miller <davem@davemloft.net>
2007-05-09 02:35:27 -07:00
David S. Miller
127cda1e8c [SPARC64]: Optimize fault kprobe handling just like powerpc.
And eliminate DIE_GPF while we're at it.

Signed-off-by: David S. Miller <davem@davemloft.net>
2007-05-08 18:25:14 -07:00
David S. Miller
6c1142602c [SPARC]: Wire up utimensat syscall.
Signed-off-by: David S. Miller <davem@davemloft.net>
2007-05-08 17:50:14 -07:00
David S. Miller
af80318eb7 [SPARC64]: Fix request_irq() ignored result warnings in PCI controller code.
Signed-off-by: David S. Miller <davem@davemloft.net>
2007-05-08 17:23:31 -07:00
David S. Miller
c57c2ffb15 [SPARC64]: Kill asm-sparc64/pbm.h
Everything it contains can be hidden in pci_impl.h

Signed-off-by: David S. Miller <davem@davemloft.net>
2007-05-08 16:43:08 -07:00
David S. Miller
28113a9941 [SPARC64]: Removal of trivial pci_controller_info uses.
Signed-off-by: David S. Miller <davem@davemloft.net>
2007-05-08 16:41:44 -07:00
David S. Miller
6c108f1299 [SPARC64]: Move index info pci_pbm_info.
Signed-off-by: David S. Miller <davem@davemloft.net>
2007-05-08 16:41:40 -07:00
David S. Miller
e9870c4c0a [SPARC64]: Move {setup,teardown}_msi_irq into pci_pbm_info.
Signed-off-by: David S. Miller <davem@davemloft.net>
2007-05-08 16:41:36 -07:00
David S. Miller
f1cd8de2c9 [SPARC64]: Move pci_ops into pci_pbm_info.
Signed-off-by: David S. Miller <davem@davemloft.net>
2007-05-08 16:41:32 -07:00
David S. Miller
96a496fd49 [SPARC64] SBUS: Error interrupt registry cleanups.
Do not use IRQF_SHARED, these interrupt numbers should all
be unique.

Also use name strings without spaces in them just like
PCI controller drivers do, for consistency.

Signed-off-by: David S. Miller <davem@davemloft.net>
2007-05-08 16:41:28 -07:00
David S. Miller
34768bc832 [SPARC64] PCI: Use root list of pbm's instead of pci_controller_info's
The idea is to move more and more things into the pbm,
with the eventual goal of eliminating the pci_controller_info
entirely as there really isn't any need for it.

This stage of the transformations requires some reworking of
the PCI error interrupt handling.

It might be tricky to get rid of the pci_controller_info parenting for
a few reasons:

1) When we get an uncorrectable or correctable error we want
   to interrogate the IOMMU and streaming cache of both
   PBMs for error status.  These errors come from the UPA
   front-end which is shared between the two PBM PCI bus
   segments.

   Historically speaking this is why I choose the datastructure
   hierarchy of pci_controller_info-->pci_pbm_info

2) The probing does a portid/devhandle match to look for the
   'other' pbm, but this is entirely an artifact and can be
   eliminated trivially.

What we could do to solve #1 is to have a "buddy" pointer from one pbm
to another.

Signed-off-by: David S. Miller <davem@davemloft.net>
2007-05-08 16:41:24 -07:00
David S. Miller
cfa0652c4e [SPARC64] PCI: Use common routine to fetch PBM properties.
Namely bus-range and ino-bitmap.

This allows us also to eliminate pci_controller_info's
pci_{first,last}_busno fields as only the pbm ones are
used now.

Signed-off-by: David S. Miller <davem@davemloft.net>
2007-05-08 16:41:12 -07:00
Ulrich Drepper
1c710c896e utimensat implementation
Implement utimensat(2) which is an extension to futimesat(2) in that it

a) supports nano-second resolution for the timestamps
b) allows to selectively ignore the atime/mtime value
c) allows to selectively use the current time for either atime or mtime
d) supports changing the atime/mtime of a symlink itself along the lines
   of the BSD lutimes(3) functions

For this change the internally used do_utimes() functions was changed to
accept a timespec time value and an additional flags parameter.

Additionally the sys_utime function was changed to match compat_sys_utime
which already use do_utimes instead of duplicating the work.

Also, the completely missing futimensat() functionality is added.  We have
such a function in glibc but we have to resort to using /proc/self/fd/* which
not everybody likes (chroot etc).

Test application (the syscall number will need per-arch editing):

#include <errno.h>
#include <fcntl.h>
#include <time.h>
#include <sys/time.h>
#include <stddef.h>
#include <syscall.h>

#define __NR_utimensat 280

#define UTIME_NOW       ((1l << 30) - 1l)
#define UTIME_OMIT      ((1l << 30) - 2l)

int
main(void)
{
  int status = 0;

  int fd = open("ttt", O_RDWR|O_CREAT|O_EXCL, 0666);
  if (fd == -1)
    error (1, errno, "failed to create test file \"ttt\"");

  struct stat64 st1;
  if (fstat64 (fd, &st1) != 0)
    error (1, errno, "fstat failed");

  struct timespec t[2];
  t[0].tv_sec = 0;
  t[0].tv_nsec = 0;
  t[1].tv_sec = 0;
  t[1].tv_nsec = 0;
  if (syscall(__NR_utimensat, AT_FDCWD, "ttt", t, 0) != 0)
    error (1, errno, "utimensat failed");

  struct stat64 st2;
  if (fstat64 (fd, &st2) != 0)
    error (1, errno, "fstat failed");

  if (st2.st_atim.tv_sec != 0 || st2.st_atim.tv_nsec != 0)
    {
      puts ("atim not reset to zero");
      status = 1;
    }
  if (st2.st_mtim.tv_sec != 0 || st2.st_mtim.tv_nsec != 0)
    {
      puts ("mtim not reset to zero");
      status = 1;
    }
  if (status != 0)
    goto out;

  t[0] = st1.st_atim;
  t[1].tv_sec = 0;
  t[1].tv_nsec = UTIME_OMIT;
  if (syscall(__NR_utimensat, AT_FDCWD, "ttt", t, 0) != 0)
    error (1, errno, "utimensat failed");

  if (fstat64 (fd, &st2) != 0)
    error (1, errno, "fstat failed");

  if (st2.st_atim.tv_sec != st1.st_atim.tv_sec
      || st2.st_atim.tv_nsec != st1.st_atim.tv_nsec)
    {
      puts ("atim not set");
      status = 1;
    }
  if (st2.st_mtim.tv_sec != 0 || st2.st_mtim.tv_nsec != 0)
    {
      puts ("mtim changed from zero");
      status = 1;
    }
  if (status != 0)
    goto out;

  t[0].tv_sec = 0;
  t[0].tv_nsec = UTIME_OMIT;
  t[1] = st1.st_mtim;
  if (syscall(__NR_utimensat, AT_FDCWD, "ttt", t, 0) != 0)
    error (1, errno, "utimensat failed");

  if (fstat64 (fd, &st2) != 0)
    error (1, errno, "fstat failed");

  if (st2.st_atim.tv_sec != st1.st_atim.tv_sec
      || st2.st_atim.tv_nsec != st1.st_atim.tv_nsec)
    {
      puts ("mtim changed from original time");
      status = 1;
    }
  if (st2.st_mtim.tv_sec != st1.st_mtim.tv_sec
      || st2.st_mtim.tv_nsec != st1.st_mtim.tv_nsec)
    {
      puts ("mtim not set");
      status = 1;
    }
  if (status != 0)
    goto out;

  sleep (2);

  t[0].tv_sec = 0;
  t[0].tv_nsec = UTIME_NOW;
  t[1].tv_sec = 0;
  t[1].tv_nsec = UTIME_NOW;
  if (syscall(__NR_utimensat, AT_FDCWD, "ttt", t, 0) != 0)
    error (1, errno, "utimensat failed");

  if (fstat64 (fd, &st2) != 0)
    error (1, errno, "fstat failed");

  struct timeval tv;
  gettimeofday(&tv,NULL);

  if (st2.st_atim.tv_sec <= st1.st_atim.tv_sec
      || st2.st_atim.tv_sec > tv.tv_sec)
    {
      puts ("atim not set to NOW");
      status = 1;
    }
  if (st2.st_mtim.tv_sec <= st1.st_mtim.tv_sec
      || st2.st_mtim.tv_sec > tv.tv_sec)
    {
      puts ("mtim not set to NOW");
      status = 1;
    }

  if (symlink ("ttt", "tttsym") != 0)
    error (1, errno, "cannot create symlink");

  t[0].tv_sec = 0;
  t[0].tv_nsec = 0;
  t[1].tv_sec = 0;
  t[1].tv_nsec = 0;
  if (syscall(__NR_utimensat, AT_FDCWD, "tttsym", t, AT_SYMLINK_NOFOLLOW) != 0)
    error (1, errno, "utimensat failed");

  if (lstat64 ("tttsym", &st2) != 0)
    error (1, errno, "lstat failed");

  if (st2.st_atim.tv_sec != 0 || st2.st_atim.tv_nsec != 0)
    {
      puts ("symlink atim not reset to zero");
      status = 1;
    }
  if (st2.st_mtim.tv_sec != 0 || st2.st_mtim.tv_nsec != 0)
    {
      puts ("symlink mtim not reset to zero");
      status = 1;
    }
  if (status != 0)
    goto out;

  t[0].tv_sec = 1;
  t[0].tv_nsec = 0;
  t[1].tv_sec = 1;
  t[1].tv_nsec = 0;
  if (syscall(__NR_utimensat, fd, NULL, t, 0) != 0)
    error (1, errno, "utimensat failed");

  if (fstat64 (fd, &st2) != 0)
    error (1, errno, "fstat failed");

  if (st2.st_atim.tv_sec != 1 || st2.st_atim.tv_nsec != 0)
    {
      puts ("atim not reset to one");
      status = 1;
    }
  if (st2.st_mtim.tv_sec != 1 || st2.st_mtim.tv_nsec != 0)
    {
      puts ("mtim not reset to one");
      status = 1;
    }

  if (status == 0)
     puts ("all OK");

 out:
  close (fd);
  unlink ("ttt");
  unlink ("tttsym");

  return status;
}

[akpm@linux-foundation.org: add missing i386 syscall table entry]
Signed-off-by: Ulrich Drepper <drepper@redhat.com>
Cc: Alexey Dobriyan <adobriyan@openvz.org>
Cc: Michael Kerrisk <mtk-manpages@gmx.net>
Cc: <linux-arch@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2007-05-08 11:15:18 -07:00
Randy Dunlap
e63340ae6b header cleaning: don't include smp_lock.h when not used
Remove includes of <linux/smp_lock.h> where it is not used/needed.
Suggested by Al Viro.

Builds cleanly on x86_64, i386, alpha, ia64, powerpc, sparc,
sparc64, and arm (all 59 defconfigs).

Signed-off-by: Randy Dunlap <randy.dunlap@oracle.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2007-05-08 11:15:07 -07:00
Christoph Hellwig
1eeb66a1bb move die notifier handling to common code
This patch moves the die notifier handling to common code.  Previous
various architectures had exactly the same code for it.  Note that the new
code is compiled unconditionally, this should be understood as an appel to
the other architecture maintainer to implement support for it aswell (aka
sprinkling a notify_die or two in the proper place)

arm had a notifiy_die that did something totally different, I renamed it to
arm_notify_die as part of the patch and made it static to the file it's
declared and used at.  avr32 used to pass slightly less information through
this interface and I brought it into line with the other architectures.

[akpm@linux-foundation.org: build fix]
[akpm@linux-foundation.org: fix vmalloc_sync_all bustage]
[bryan.wu@analog.com: fix vmalloc_sync_all in nommu]
Signed-off-by: Christoph Hellwig <hch@lst.de>
Cc: <linux-arch@vger.kernel.org>
Cc: Russell King <rmk@arm.linux.org.uk>
Signed-off-by: Bryan Wu <bryan.wu@analog.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2007-05-08 11:15:04 -07:00
Christoph Hellwig
ab1b6f03a1 simplify the stacktrace code
Simplify the stacktrace code:

 - remove the unused task argument to save_stack_trace, it's always
   current
 - remove the all_contexts flag, it's alwasy 0

Signed-off-by: Christoph Hellwig <hch@lst.de>
Cc: Paul Mundt <lethal@linux-sh.org>
Cc: Ralf Baechle <ralf@linux-mips.org>
Cc: Martin Schwidefsky <schwidefsky@de.ibm.com>
Cc: "David S. Miller" <davem@davemloft.net>
Cc: Andi Kleen <ak@suse.de>
Cc: Akinobu Mita <akinobu.mita@gmail.com>
Acked-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2007-05-08 11:14:58 -07:00
Linus Torvalds
ef93127e4c Merge master.kernel.org:/pub/scm/linux/kernel/git/davem/sparc-2.6
* master.kernel.org:/pub/scm/linux/kernel/git/davem/sparc-2.6:
  [SERIAL] sunsu: Fix section mismatch warnings.
  [SPARC64]: pgtable_cache_init() should be __init.
  [SPARC64]: Fix section mismatch warnings in arch/sparc64/kernel/prom.c
  [SPARC64]: Fix section mismatch warnings in arch/sparc64/kernel/pci.c
  [SPARC64]: Fix section mismatch warnings in arch/sparc64/kernel/console.c
  [MM]: sparse_init() should be __init.
  [SPARC64]: Update defconfig.
  [VIDEO]: Add Sun XVR-2500 framebuffer driver.
  [VIDEO]: Add Sun XVR-500 framebuffer driver.
  [SPARC64]: SUN4U PCI-E controller support.
  [SPARC]: Fix comment typo in smp4m_blackbox_current().
  [SCSI] SUNESP: sun_esp.c needs linux/delay.h

Fix up conflict in arch/sparc64/mm/init.c manually due to removal of
pgtable_cache_init() through the -mm patches (even though that patch was
also by David ;)

Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2007-05-07 12:22:48 -07:00
Benjamin Herrenschmidt
ac35ee484d get_unmapped_area handles MAP_FIXED on sparc64
Handle MAP_FIXED in hugetlb_get_unmapped_area on sparc64 by just using
prepare_hugepage_range()

Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Acked-by: William Irwin <bill.irwin@oracle.com>
Cc: "David S. Miller" <davem@davemloft.net>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2007-05-07 12:12:56 -07:00
Christoph Lameter
f0f3980b21 slab allocators: remove multiple alignment specifications
It is not necessary to tell the slab allocators to align to a cacheline
if an explicit alignment was already specified. It is rather confusing
to specify multiple alignments.

Make sure that the call sites only use one form of alignment.

Signed-off-by: Christoph Lameter <clameter@sgi.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2007-05-07 12:12:55 -07:00
Christoph Lameter
5af6083990 slab allocators: Remove obsolete SLAB_MUST_HWCACHE_ALIGN
This patch was recently posted to lkml and acked by Pekka.

The flag SLAB_MUST_HWCACHE_ALIGN is

1. Never checked by SLAB at all.

2. A duplicate of SLAB_HWCACHE_ALIGN for SLUB

3. Fulfills the role of SLAB_HWCACHE_ALIGN for SLOB.

The only remaining use is in sparc64 and ppc64 and their use there
reflects some earlier role that the slab flag once may have had. If
its specified then SLAB_HWCACHE_ALIGN is also specified.

The flag is confusing, inconsistent and has no purpose.

Remove it.

Acked-by: Pekka Enberg <penberg@cs.helsinki.fi>
Signed-off-by: Christoph Lameter <clameter@sgi.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2007-05-07 12:12:55 -07:00
David Miller
3a2cba993b Quicklist support for sparc64
I ported this to sparc64 as per the patch below, tested on UP SunBlade1500 and
24 cpu Niagara T1000.

Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Christoph Lameter <clameter@sgi.com>
Cc: "David S. Miller" <davem@davemloft.net>
Cc: Andi Kleen <ak@suse.de>
Cc: "Luck, Tony" <tony.luck@intel.com>
Cc: William Lee Irwin III <wli@holomorphy.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2007-05-07 12:12:54 -07:00
David S. Miller
e7f11aeed0 [SPARC64]: pgtable_cache_init() should be __init.
Signed-off-by: David S. Miller <davem@davemloft.net>
2007-05-07 00:02:46 -07:00
David S. Miller
c35a376d60 [SPARC64]: Fix section mismatch warnings in arch/sparc64/kernel/prom.c
The IRQ translation init routines should all be __init.

Signed-off-by: David S. Miller <davem@davemloft.net>
2007-05-07 00:02:24 -07:00
David S. Miller
a6009dda97 [SPARC64]: Fix section mismatch warnings in arch/sparc64/kernel/pci.c
apb_calc_first_last(), apb_fake_ranges(), pci_of_scan_bus(),
of_scan_pci_bridge(), pci_of_scan_bus(), and pci_scan_one_pbm()
should all be __devinit.

Signed-off-by: David S. Miller <davem@davemloft.net>
2007-05-07 00:01:38 -07:00
David S. Miller
23abc9ec6a [SPARC64]: Fix section mismatch warnings in arch/sparc64/kernel/console.c
probe_other_fhcs() and central_probe() should be __init

Signed-off-by: David S. Miller <davem@davemloft.net>
2007-05-07 00:00:37 -07:00
David S. Miller
7db00552d9 [SPARC64]: Update defconfig.
Signed-off-by: David S. Miller <davem@davemloft.net>
2007-05-06 22:47:14 -07:00
David S. Miller
861fe90656 [SPARC64]: SUN4U PCI-E controller support.
Some minor refactoring in the generic code was necessary for
this:

1) This controller requires 8-byte access to the interrupt map
   and clear register.  They are 64-bits on all the other
   SBUS and PCI controllers anyways, so this was easy to cure.

2) The IMAP register has a different layout and some bits that we
   need to preserve, so use a read/modify/write when making
   changes to the IMAP register in generic code.

3) Flushing the entire IOMMU TLB is best done with a single write
   to a register on this PCI controller, add a iommu->iommu_flushinv
   for this.

Still lacks MSI support, that will come later.

Signed-off-by: David S. Miller <davem@davemloft.net>
2007-05-06 22:44:06 -07:00
Linus Torvalds
ea62ccd00f Merge branch 'for-linus' of git://one.firstfloor.org/home/andi/git/linux-2.6
* 'for-linus' of git://one.firstfloor.org/home/andi/git/linux-2.6: (231 commits)
  [PATCH] i386: Don't delete cpu_devs data to identify different x86 types in late_initcall
  [PATCH] i386: type may be unused
  [PATCH] i386: Some additional chipset register values validation.
  [PATCH] i386: Add missing !X86_PAE dependincy to the 2G/2G split.
  [PATCH] x86-64: Don't exclude asm-offsets.c in Documentation/dontdiff
  [PATCH] i386: avoid redundant preempt_disable in __unlazy_fpu
  [PATCH] i386: white space fixes in i387.h
  [PATCH] i386: Drop noisy e820 debugging printks
  [PATCH] x86-64: Fix allnoconfig error in genapic_flat.c
  [PATCH] x86-64: Shut up warnings for vfat compat ioctls on other file systems
  [PATCH] x86-64: Share identical video.S between i386 and x86-64
  [PATCH] x86-64: Remove CONFIG_REORDER
  [PATCH] x86-64: Print type and size correctly for unknown compat ioctls
  [PATCH] i386: Remove copy_*_user BUG_ONs for (size < 0)
  [PATCH] i386: Little cleanups in smpboot.c
  [PATCH] x86-64: Don't enable NUMA for a single node in K8 NUMA scanning
  [PATCH] x86: Use RDTSCP for synchronous get_cycles if possible
  [PATCH] i386: Add X86_FEATURE_RDTSCP
  [PATCH] i386: Implement X86_FEATURE_SYNC_RDTSC on i386
  [PATCH] i386: Implement alternative_io for i386
  ...

Fix up trivial conflict in include/linux/highmem.h manually.

Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2007-05-05 14:55:20 -07:00
Linus Torvalds
7e20ef030d Merge master.kernel.org:/pub/scm/linux/kernel/git/davem/net-2.6
* master.kernel.org:/pub/scm/linux/kernel/git/davem/net-2.6: (49 commits)
  [SCTP]: Set assoc_id correctly during INIT collision.
  [SCTP]: Re-order SCTP initializations to avoid race with sctp_rcv()
  [SCTP]: Fix the SO_REUSEADDR handling to be similar to TCP.
  [SCTP]: Verify all destination ports in sctp_connectx.
  [XFRM] SPD info TLV aggregation
  [XFRM] SAD info TLV aggregationx
  [AF_RXRPC]: Sort out MTU handling.
  [AF_IUCV/IUCV] : Add missing section annotations
  [AF_IUCV]: Implementation of a skb backlog queue
  [NETLINK]: Remove bogus BUG_ON
  [IPV6]: Some cleanups in include/net/ipv6.h
  [TCP]: zero out rx_opt in tcp_disconnect()
  [BNX2]: Fix TSO problem with small MSS.
  [NET]: Rework dev_base via list_head (v3)
  [TCP] Highspeed: Limited slow-start is nowadays in tcp_slow_start
  [BNX2]: Update version and reldate.
  [BNX2]: Print bus information for PCIE devices.
  [BNX2]: Add 1-shot MSI handler for 5709.
  [BNX2]: Restructure PHY event handling.
  [BNX2]: Add indirect spinlock.
  ...
2007-05-04 19:36:58 -07:00
Pavel Emelianov
7562f876cd [NET]: Rework dev_base via list_head (v3)
Cleanup of dev_base list use, with the aim to simplify making device
list per-namespace. In almost every occasion, use of dev_base variable
and dev->next pointer could be easily replaced by for_each_netdev
loop. A few most complicated places were converted to using
first_netdev()/next_netdev().

Signed-off-by: Pavel Emelianov <xemul@openvz.org>
Acked-by: Kirill Korotaev <dev@openvz.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2007-05-03 15:13:45 -07:00
Michael Ellerman
7fe3730de7 MSI: arch must connect the irq and the msi_desc
set_irq_msi() currently connects an irq_desc to an msi_desc. The archs call
it at some point in their setup routine, and then the generic code sets up the
reverse mapping from the msi_desc back to the irq.

set_irq_msi() should do both connections, making it the one and only call
required to connect an irq with it's MSI desc and vice versa.

The arch code MUST call set_irq_msi(), and it must do so only once it's sure
it's not going to fail the irq allocation.

Given that there's no need for the arch to return the irq anymore, the return
value from the arch setup routine just becomes 0 for success and anything else
for failure.

Signed-off-by: Michael Ellerman <michael@ellerman.id.au>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
2007-05-02 19:02:38 -07:00
Dan Williams
f282b97021 msi: introduce ARCH_SUPPORTS_MSI Kconfig option (rev2)
Allows architectures to advertise that they support MSI rather than listing
each architecture as a PCI_MSI dependency.

Signed-off-by: Dan Williams <dan.j.williams@intel.com>
Acked-by: "Eric W. Biederman" <ebiederm@xmission.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
2007-05-02 19:02:37 -07:00
Jeremy Fitzhardinge
b6e3590f81 [PATCH] x86: Allow percpu variables to be page-aligned
Let's allow page-alignment in general for per-cpu data (wanted by Xen, and
Ingo suggested KVM as well).

Because larger alignments can use more room, we increase the max per-cpu
memory to 64k rather than 32k: it's getting a little tight.

Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
Signed-off-by: Jeremy Fitzhardinge <jeremy@xensource.com>
Signed-off-by: Andi Kleen <ak@suse.de>
Acked-by: Ingo Molnar <mingo@elte.hu>
Cc: Andi Kleen <ak@suse.de>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2007-05-02 19:27:12 +02:00
David S. Miller
16ce82d846 [SPARC64]: Convert PCI over to generic struct iommu/strbuf.
Signed-off-by: David S. Miller <davem@davemloft.net>
2007-04-26 21:08:21 -07:00
David S. Miller
3e4d26508a [SPARC64]: Convert SBUS over to generic iommu/strbuf structs.
Signed-off-by: David S. Miller <davem@davemloft.net>
2007-04-26 01:55:44 -07:00
David S. Miller
9b3627f389 [SPARC64]: Consolidate {sbus,pci}_iommu_arena.
Move to asm-sparc64/iommu.h and rename to plain "iommu_arena".

Signed-off-by: David S. Miller <davem@davemloft.net>
2007-04-26 01:55:42 -07:00
Stephen Rothwell
3dfe10ee7c [SPARC64]: constify some paramaters of OF routines
This starts bringing the PowerPC and Sparc64 implemetations back closer
together.

Signed-off-by: Stephen Rothwell <sfr@canb.auug.org.au>
Signed-off-by: David S. Miller <davem@davemloft.net>
2007-04-26 01:55:40 -07:00
David S. Miller
a165b4205e [SPARC64]: Fix PCI rework to adhere to of_get_property() const return.
Signed-off-by: David S. Miller <davem@davemloft.net>
2007-04-26 01:55:37 -07:00
David S. Miller
f1cfdb55f1 [SPARC64]: Document and fix calculation of pages_avail.
It should be set to the total number of pages that the
system will really have available after things like
initmem, the bootmem map, and initrd are freed up.

Signed-off-by: David S. Miller <davem@davemloft.net>
2007-04-26 01:55:36 -07:00
David S. Miller
0f3e25049e [SPARC64]: Make sure pbm->prom_node is setup easly enough in psycho.c
It needs to be ready before we invoke pci_determine_mem_io_space().

Signed-off-by: David S. Miller <davem@davemloft.net>
2007-04-26 01:55:35 -07:00
David S. Miller
3996465392 [SPARC64]: Use bootmem_bootmap_pages() in choose_bootmap_pfn().
Signed-off-by: David S. Miller <davem@davemloft.net>
2007-04-26 01:55:34 -07:00
David S. Miller
b93f262023 [SPARC64]: Add proper header file extern for cmdline_memory_size.
Signed-off-by: David S. Miller <davem@davemloft.net>
2007-04-26 01:55:33 -07:00
David S. Miller
9753f0d650 [SPARC64]: Kill sparc_ultra_dump_{i,d}tlb()
While useful in odd circumstances to debug something, they are
normally totally unused and anyone can fetch this code out of the
history if they really need it.

And in any event, the person who needs this kind of code is usually me
:-)

Signed-off-by: David S. Miller <davem@davemloft.net>
2007-04-26 01:55:32 -07:00
David S. Miller
85f1e1f660 [SPARC64]: Use DECLARE_BITMAP and BITS_TO_LONGS in mm/init.c
Signed-off-by: David S. Miller <davem@davemloft.net>
2007-04-26 01:55:31 -07:00
David S. Miller
5be4a96367 [SPARC64]: Give move verbose show_mem() output just like i386.
We now report everything i386 does except for highmem which
doesn't apply.

Signed-off-by: David S. Miller <davem@davemloft.net>
2007-04-26 01:55:30 -07:00
David S. Miller
28256ca2e0 [SPARC64]: Mark show_mem() printk's with KERN_INFO.
Signed-off-by: David S. Miller <davem@davemloft.net>
2007-04-26 01:55:28 -07:00
David S. Miller
a94aa25306 [SPARC64]: Kill kvaddr_to_phys() and friends.
Just inline it into flush_icache_range() which is the only
user.

Signed-off-by: David S. Miller <davem@davemloft.net>
2007-04-26 01:55:27 -07:00
David S. Miller
4be5c34dc4 [SPARC64]: Privatize sun4u_get_pte() and fix name.
__get_phys is only called from init.c as is prom_virt_to_phys(),
__get_iospace() is not called at all, and sun4u_get_pte() is largely
misnamed.

Privatize the implementation and helper functions of
sun4u_get_phys() to mm/init.c, and rename to
kvaddr_to_paddr().

The only used of this thing is flush_icache_range(), and thus
things can be considerably further simplified.  For example,
we should only see module or PAGE_OFFSET kernel addresses here,
so we don't need the OBP firmware range handling at all.

Signed-off-by: David S. Miller <davem@davemloft.net>
2007-04-26 01:55:26 -07:00
David S. Miller
a0963bdfb9 [SPARC64]: Kill _start[]/_end[] declarations in mm/init.c
We already get those from asm/sections.h

Signed-off-by: David S. Miller <davem@davemloft.net>
2007-04-26 01:55:25 -07:00
David S. Miller
0015d3d68c [SPARC64]: Simplify read_obp_memory().
Kick out empty entries as soon as we spot them, and use memmove()
instead of a silly loop to make the operation more clear.

Signed-off-by: David S. Miller <davem@davemloft.net>
2007-04-26 01:55:23 -07:00
David S. Miller
d78d0891d3 [SPARC64]: Use SPARSEMEM_STATIC
Decrease the SECTION_SIZE_BITS --> MAX_PHYSADDR_BITS
range a little bit.

The cost of going to SPARSEMEM_STATIC becomes 8K of BSS space, and in
return we save a pointer dereferences on every page struct lookup.
Even better we hit the main kernel image for the base address which is
in a hugepage locked TLB entry.

Signed-off-by: David S. Miller <davem@davemloft.net>
2007-04-26 01:55:22 -07:00
David S. Miller
28f57e774d [SPARC64]: Force dummy host controller onto bus zero.
This helps deal with the invisible bridge that sits between
the host controller and the top-most visisble PCI devices
on hypervisor systems.

For example, on T1000 the bus-range property says 2 --> 4
and so there is a PCI express bridge at bus 2, devfn 0, etc.

So if we don't force the dummy host controller to bus zero,
we'll try to create two devices with the same domain/bus/devfn
triplet.

Also, add some more log diagnostics to make debugging stuff like this
easyer.

Signed-off-by: David S. Miller <davem@davemloft.net>
2007-04-26 01:55:20 -07:00
David S. Miller
97b3cf050b [SPARC64]: Add dummy host controller to root of all PCI domains.
We fake up a dummy one in all cases because that is the simplest
thing to do and it happens to be necessary for hypervisor systems.

Signed-off-by: David S. Miller <davem@davemloft.net>
2007-04-26 01:55:19 -07:00
David S. Miller
c6e87566ea [SPARC64]: Const'ify pci_iommu_ops.
Based upon a similar patch for x86_64 written by
Stephen Hemminger.

Signed-off-by: David S. Miller <davem@davemloft.net>
2007-04-26 01:55:18 -07:00
David S. Miller
0bba2dd823 [SPARC64]: Kill pbm->pci_first_slot.
Set but never used.

Signed-off-by: David S. Miller <davem@davemloft.net>
2007-04-26 01:55:17 -07:00
David S. Miller
3875c5c02d [SPARC64]: Kill pci_controller->pbms_same_domain
We don't do the "Simba APB is a PBM" bogosity for Sabre
controllers any longer, so this pbms_same_domain thing
is no longer necessary.

Signed-off-by: David S. Miller <davem@davemloft.net>
2007-04-26 01:55:16 -07:00
David S. Miller
8d3aee9375 [SPARC64]: Kill pci_controller->base_address_update().
Implemented but never actually used.

Signed-off-by: David S. Miller <davem@davemloft.net>
2007-04-26 01:55:15 -07:00
David S. Miller
0bae5f81b6 [SPARC64]: Kill pci_controller->resource_adjust()
All the implementations can be identical and generic, so
no need for controller specific methods.

Signed-off-by: David S. Miller <davem@davemloft.net>
2007-04-26 01:55:14 -07:00
David S. Miller
3487a1f9e7 [SPARC64]: Kill PBM ranges software state.
It is only used in one spot and we can just fetch the
OF property right there.

Signed-off-by: David S. Miller <davem@davemloft.net>
2007-04-26 01:55:13 -07:00
David S. Miller
229177c7f3 [SPARC64]: Kill PBM intmap software state.
Set but never used.

Signed-off-by: David S. Miller <davem@davemloft.net>
2007-04-26 01:55:12 -07:00
David S. Miller
9fd8b64761 [SPARC64]: Consolidate PCI mem/io resource determination.
It can be done for every PCI configuration using OF properties.

Signed-off-by: David S. Miller <davem@davemloft.net>
2007-04-26 01:55:11 -07:00
David S. Miller
01f94c4a6c [SPARC64]: Fix sabre pci controllers with new probing scheme.
The SIMBA APB bridge is strange, it is a PCI bridge but it lacks
some standard OF properties, in particular it lacks a 'ranges'
property.

What you have to do is read the IO and MEM range registers in
the APB bridge to determine the ranges handled by each bridge.
So fill in the bus resources by doing that.

Since we now handle this quirk in the generic PCI and OF device
probing layers, we can flat out eliminate all of that code from
the sabre pci controller driver.

In fact we can thus eliminate completely another quirk of the sabre
driver.  It tried to make the two APB bridges look like PBMs but that
makes zero sense now (and it's questionable whether it ever made sense).
So now just use pbm_A and probe the whole PCI hierarchy using that as
the root.

This simplification allows many future cleanups to occur.

Also, I've found yet another quirk that needs to be worked around
while testing this.  You can't use the 'class-code' OF firmware
property, especially for IDE controllers.  We have to read the value
out of PCI config space or else we'll see the value the device was
showing before it was programmed into native mode.

I'm starting to think it might be wise to just read all of the values
out of PCI config space instead of using the OF properties. :-/

Signed-off-by: David S. Miller <davem@davemloft.net>
2007-04-26 01:55:10 -07:00
David S. Miller
a378fd0ee8 [SPARC64]: Fix obppath pci device sysfs creation.
Need to traverse recursively down child busses else we only
get the file created under devices at the top-level.

Signed-off-by: David S. Miller <davem@davemloft.net>
2007-04-26 01:55:09 -07:00
David S. Miller
bc606f3c91 [SPARC64]: Minor cleanups to schizo pci controller driver.
Signed-off-by: David S. Miller <davem@davemloft.net>
2007-04-26 01:55:08 -07:00
David S. Miller
1e8a8cc52d [SPARC64]: Internalize pci_memspace_mask.
The only user was bus_dvma_to_mem() which is no longer used
by any driver, so kill that, and the export of pci_memspace_mask.

The only user now is the PCI mmap support code.

Signed-off-by: David S. Miller <davem@davemloft.net>
2007-04-26 01:55:07 -07:00
David S. Miller
a2fb23af1c [SPARC64]: Probe PCI bus using OF device tree.
Almost entirely taken from the 64-bit PowerPC PCI code.

This allowed to eliminate a ton of cruft from the sparc64
PCI layer.

Signed-off-by: David S. Miller <davem@davemloft.net>
2007-04-26 01:55:06 -07:00
David S. Miller
deb66c4521 [SPARC64] isa: Convert to use pci_device_to_OF_node().
Also, do not try to compute resources by hand, instead use
the pre-computed ones in the of_device.

Signed-off-by: David S. Miller <davem@davemloft.net>
2007-04-26 01:55:05 -07:00
David S. Miller
1327e9b62f [SPARC64] ebus: Convert to use pci_device_to_OF_node().
Also, we don't need to store or use the PBM so kill that
from the linux_ebus.

Signed-off-by: David S. Miller <davem@davemloft.net>
2007-04-26 01:55:04 -07:00
David S. Miller
a8b8814bdf [SPARC]: Use strcasecmp for OFW property name comparisons.
This allows us to simplify sharing code with powerpc which
has properties that have various forms of capitalization
when on the sparc64 side the property is all lower-case.

Signed-off-by: David S. Miller <davem@davemloft.net>
2007-04-26 01:54:41 -07:00
Stephen Rothwell
64b94701c0 [SPARC/64]: constify of_get_property return
Finally, we actually change the functions themselves.

Signed-off-by: Stephen Rothwell <sfr@canb.auug.org.au>
Signed-off-by: David S. Miller <davem@davemloft.net>
2007-04-26 01:54:35 -07:00
Stephen Rothwell
6a23acf390 [SPARC64]: constify of_get_property return: arch/sparc64
Signed-off-by: Stephen Rothwell <sfr@canb.auug.org.au>
Signed-off-by: David S. Miller <davem@davemloft.net>
2007-04-26 01:54:24 -07:00
Tony Breeds
644923d4a5 [SPARC64]: Small cleanups time.c
- Removes days_in_mo[], as it's almost identical to month_days[]
- Use the leapyear() macro
- Line length wrapping.

Signed-off-by: Tony Breeds <tony@bakeyournoodle.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2007-04-26 01:54:20 -07:00
David S. Miller
d62c6f093a [SPARC64]: Fix sparc64_next_event() error return.
It should return an error code not a boolean.

Based upon an hpet timer fix by Thomas Gleixner.

Signed-off-by: David S. Miller <davem@davemloft.net>
2007-04-26 01:54:18 -07:00
David S. Miller
112f48716d [SPARC64]: Add clocksource/clockevents support.
I'd like to thank John Stul and others for helping
me along the way.

A lot of cleanups fell out of this.  For example, the get_compare()
tick_op was totally unused, so was deleted.  And the most often used
tick_op members were grouped together for cache-friendlyness.

The sparc64 TSC is given to the kernel as a one-shot timer.

tick_ops->init_timer() simply turns off the privileged bit in
the tick register (when possible), and disables the interrupt
by setting bit 63 in the compare register.  The ->disable_irq()
op also sets this bit.

tick_ops->add_compare() is changed to:

1) Add the given delta to "tick" not to "compare"
2) Return a boolean which, if true, means that the tick
   value read after writing the compare value was found
   to have incremented past the initial tick value.  This
   mirrors logic used in the HPET driver's ->next_event()
   method.

Each tick_ops implementation also now provides a name string.
And we feed this into the clocksource and clockevents layers.

Signed-off-by: David S. Miller <davem@davemloft.net>
2007-04-26 01:54:15 -07:00
David S. Miller
038cb01ea6 [SPARC64]: Add tick_nohz_{stop,restart}_sched_tick() calls to cpu_idle().
Signed-off-by: David S. Miller <davem@davemloft.net>
2007-04-26 01:54:13 -07:00
David S. Miller
777a447529 [SPARC64]: Unify timer interrupt handler.
Things were scattered all over the place, split between
SMP and non-SMP.

Unify it all so that dyntick support is easier to add.

Signed-off-by: David S. Miller <davem@davemloft.net>
2007-04-26 01:54:11 -07:00
David S. Miller
a58c9f3c1e [SPARC64]: Synchronize RTC clock via timer just like x86.
Signed-off-by: David S. Miller <davem@davemloft.net>
2007-04-26 01:54:08 -07:00
Tom "spot" Callaway
24fc6f00b6 [SPARC64]: Fix inline directive in pci_iommu.c
While building a test kernel for the new esp driver (against
git-current), I hit this bug. Trivial fix, put the inline declaration
in the right place. :)

Signed-off-by: Tom "spot" Callaway <tcallawa@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2007-04-13 13:35:35 -07:00
David S. Miller
5c7aa6ffae [SPARC64]: Fix arg passing to compat_sys_ipc().
Do not sign extend args using the sys32_ipc stub, that is
buggy and unnecessary.

Based upon an excellent report by Mikael Pettersson.

Signed-off-by: David S. Miller <davem@davemloft.net>
2007-04-13 13:27:08 -07:00
Robert Reif
f6b45da129 [SPARC]: Fix section mismatch warnings in pci.c and pcic.c
Fix section mismatch in arch/sparc/kernel/pcic.c and 
arch/sparc64/kernel/pci.c.

Signed-off-by: Robert Reif <reif@earthlink.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
2007-04-12 13:47:37 -07:00
Roland McGrath
1d51c69fb6 [SPARC]: avoid CHILD_MAX and OPEN_MAX constants
I don't figure anyone really cares about SunOS syscall emulation, and I
certainly don't.  But I'm getting rid of uses of the OPEN_MAX and CHILD_MAX
compile-time constant, and these are almost the only ones.  OPEN_MAX is a
bogus constant with no meaning about anything.  The RLIMIT_NOFILE resource
limit is what sysconf (_SC_OPEN_MAX) actually wants to return.

The CHILD_MAX cases weren't actually using anything I want to get rid of,
but I noticed that they are there and are wrong too.  The CHILD_MAX value
is not really unlimited as a -1 return from sysconf indicates.  The
RLIMIT_NPROC resource limit is what sysconf (_SC_CHILD_MAX) wants to return.

Signed-off-by: Roland McGrath <roland@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2007-04-12 13:13:42 -07:00
David S. Miller
2f3a2efd85 [SPARC64]: Fix SBUS IOMMU allocation code.
There are several IOMMU allocator bugs.  Instead of trying to fix this
overly complicated code, just mirror the PCI IOMMU arena allocator
which is very stable and well stress tested.

I tried to make the code as identical as possible so we can switch
sun4u PCI and SBUS over to a common piece of IOMMU code.  All that
will be need are two callbacks, one to do a full IOMMU flush and one
to do a streaming buffer flush.

This patch gets rid of a lot of hangs and mysterious crashes on SBUS
sparc64 systems, at least for me.

Signed-off-by: David S. Miller <davem@davemloft.net>
2007-04-11 23:56:10 -07:00
David S. Miller
24d559cac4 [SPARC64]: store-init needs trailing membar.
The manual says that it is required and we actually have crash reports
where loads see stale data due to not having membars here.

In one case the networking does:

	memset(skb, 0, offsetof(struct sk_buff, truesize));

and then some code later checks skb->nohdr for zero, but it's still
the value that was there before the memset().

Note that arch/sparc64/lib/xor.S already got this right.

Signed-off-by: David S. Miller <davem@davemloft.net>
2007-03-19 13:27:33 -07:00
David S. Miller
db2f9f6d91 [SPARC64]: Use Kconfig.preempt
Signed-off-by: David S. Miller <davem@davemloft.net>
2007-03-17 15:23:22 -07:00
David S. Miller
d1acb4210a [SPARC64]: Get DEBUG_PAGEALLOC working again.
We have to make sure to use base-pagesize TLB entries even during the
early transition period where we need TLB miss handling but don't have
the kernel page tables setup yet for the linear region.

Also, it is necessary therefore to not use the 4MB TSB for these
translations, and instead use the normal kernel TSB.  This allows us
to also get rid of the 4MB tsb for debug builds which shrinks the
kernel a little bit.

Signed-off-by: David S. Miller <davem@davemloft.net>
2007-03-16 17:20:28 -07:00
David S. Miller
bb8236f2b9 [SPARC64]: Add missing HPAGE_MASK masks on address parameters.
These pte loops all assume the passed in address is HPAGE
aligned, make sure that is actually true.

Signed-off-by: David S. Miller <davem@davemloft.net>
2007-03-12 22:55:39 -07:00
David S. Miller
50d266a3a1 [SPARC]: Hook up missing syscalls.
sys_mbind
sys_get_mempolicy
sys_set_mempolicy
sys_kexec_load
sys_move_pages
sys_getcpu
sys_epoll_pwait

This work is largely a result of David Woodhouse's most
excellent missing syscalls patch.

Signed-off-by: David S. Miller <davem@davemloft.net>
2007-03-12 19:58:18 -07:00
Mathieu Desnoyers
c0a79b229a [SPARC64]: Fix atomicity of TIF update in flush_thread()
Fix atomicity of TIF update in flush_thread() for sparc64

Fixes correctly the race by using *_ti_thread_flag.

Race :

parent process executing :
sys_ptrace()
 (lock_kernel())
 (ptrace_get_task_struct(pid))
 arch_ptrace()
   ptrace_detach()
     ptrace_disable(child);
       clear_singlestep(child);
         clear_tsk_thread_flag(child, TIF_SINGLESTEP);
         (which clears the TIF_SINGLESTEP flag atomically from a different
          process)
 (put_task_struct(child))
 (unlock_kernel())

And at the same time, in the child process :
sys_execve()
 do_execve()
   search_binary_handler()
     load_elf_binary()
       flush_old_exec()
         flush_thread()
           doing a non-atomic thread flag update

Signed-off-by: Mathieu Desnoyers <mathieu.desnoyers@polymtl.ca>
Signed-off-by: David S. Miller <davem@davemloft.net>
2007-03-10 00:19:49 -08:00