* git://git.kernel.org/pub/scm/linux/kernel/git/tglx/linux-2.6-hrt:
clocksource: make clocksource watchdog cycle through online CPUs
Documentation: move timer related documentation to a single place
clockevents: optimise tick_nohz_stop_sched_tick() a bit
locking: remove unused double_spin_lock()
hrtimers: simplify lockdep handling
timers: simplify lockdep handling
posix-timers: fix shadowed variables
timer_list: add annotations to workqueue.c
hrtimer: use nanosleep specific restart_block fields
hrtimer: add nanosleep specific restart_block member
* 'semaphore' of git://git.kernel.org/pub/scm/linux/kernel/git/willy/misc:
Remove DEBUG_SEMAPHORE from Kconfig
Improve semaphore documentation
Simplify semaphore implementation
Add down_timeout and change ACPI to use it
Introduce down_killable()
Generic semaphore implementation
Add semaphore.h to kernel_lock.c
Fix quota.h includes
* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/roland/infiniband: (104 commits)
IB/iser: Don't change itt endianness
IB/mlx4: Update module version and release date
IPoIB: Handle case when P_Key is deleted and re-added at same index
IB/iser: Release connection resources on RDMA_CM_EVENT_DEVICE_REMOVAL event
IB/mlx4: Fix incorrect comment
IB/mlx4: Fix race when detaching a QP from a multicast group
IB/ehca: Support all ibv_devinfo values in query_device() and query_port()
RDMA/nes: Free IRQ before killing tasklet
IB/mthca: Update module version and release date
IB/mlx4: Update QP state if query QP succeeds
IB/mthca: Update QP state if query QP succeeds
RDMA/amso1100: Add check for NULL reply_msg in c2_intr()
IB/mlx4: Add support for resizing CQs
IB/mlx4: Add support for modifying CQ moderation parameters
IPoIB: Support modifying IPoIB CQ event moderation
IB/core: Add support for modify CQ
IPoIB: Add basic ethtool support
mlx4_core: Increase max number of QPs to 128K
RDMA/amso1100: Add support for "send with invalidate" work requests
IB/core: Add support for "send with invalidate" work requests
...
* 'for-linus' of git://git390.osdl.marist.edu/pub/scm/linux-2.6: (36 commits)
[S390] Remove code duplication from monreader / dcssblk.
[S390] kernel: show last breaking-event-address on oops
[S390] lowcore: Change type of lowcores softirq_pending to __u32.
[S390] zcrypt: Comments and kernel-doc cleanup
[S390] uaccess: Always access the correct address space.
[S390] Fix a lot of sparse warnings.
[S390] Convert s390 to GENERIC_CLOCKEVENTS.
[S390] genirq/clockevents: move irq affinity prototypes/inlines to interrupt.h
[S390] Convert monitor calls to function calls.
[S390] qdio (new feature): enhancing info-retrieval from QDIO-adapters
[S390] replace remaining __FUNCTION__ occurrences
[S390] remove redundant display of free swap space in show_mem()
[S390] qdio: remove outdated developerworks link.
[S390] Add debug_register_mode() function to debug feature API
[S390] crypto: use more descriptive function names for init/exit routines.
[S390] switch sched_clock to store-clock-extended.
[S390] zcrypt: add support for large random numbers
[S390] hw_random: allow rng_dev_read() to return hardware errors.
[S390] Vertical cpu management.
[S390] cpu topology support for s390.
...
* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/penberg/slab-2.6:
slub: No need for per node slab counters if !SLUB_DEBUG
slub: Move map/flag clearing to __free_slab
slub: Fixes to per cpu stat output in sysfs
slub: Deal with config variable dependencies
slub: Reduce #ifdef ZONE_DMA by moving kmalloc_caches_dma near dma logic
slub: Initialize per-cpu stats
In order to not trip the clocksource watchdog, kgdb must touch the
clocksource watchdog on the return to normal system run state.
Signed-off-by: Jason Wessel <jason.wessel@windriver.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
This patch fixes the hang regression with kgdb when the NMI interrupt
comes in while the master core is returning from an exception.
Adjust the NMI logic such that KGDB will not stop NMI exceptions from
occurring by in general returning NOTIFY_DONE. It is not possible to
distinguish the debug NMI sync vs the normal NMI apic interrupt so
kgdb needs to catch the unknown NMI if it the debugger was previously
active on one of the cpus.
Signed-off-by: Jason Wessel <jason.wessel@windriver.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
simplified and streamlined kgdb support on x86, both 32-bit and 64-bit,
based on patch from:
Subject: kgdb: core-lite
From: Jason Wessel <jason.wessel@windriver.com>
[ and countless other authors - see the patch for details. ]
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Jason Wessel <jason.wessel@windriver.com>
Signed-off-by: Jan Kiszka <jan.kiszka@web.de>
Reviewed-by: Thomas Gleixner <tglx@linutronix.de>
polled console handling support, to access a console in an irq-less
way while in debug or irq context.
absolutely zero impact as long as CONFIG_CONSOLE_POLL is disabled.
(which is the default)
[ jan.kiszka@siemens.com: lots of cleanups ]
[ mingo@elte.hu: redesign, splitups, cleanups. ]
Signed-off-by: Jason Wessel <jason.wessel@windriver.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Jan Kiszka <jan.kiszka@web.de>
Reviewed-by: Thomas Gleixner <tglx@linutronix.de>
kgdb core code. Handles the protocol and the arch details.
[ mingo@elte.hu: heavily modified, simplified and cleaned up. ]
[ xemul@openvz.org: use find_task_by_pid_ns ]
Signed-off-by: Jason Wessel <jason.wessel@windriver.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Jan Kiszka <jan.kiszka@web.de>
Reviewed-by: Thomas Gleixner <tglx@linutronix.de>
add probe_kernel_read() and probe_kernel_write().
Uninlined and restricted to kernel range memory only, as suggested
by Linus.
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Reviewed-by: Thomas Gleixner <tglx@linutronix.de>
Move wakeup code to .c, so that video mode setting code can be shared
between boot and wakeup. Remove nasty assembly code in 64-bit case by
re-using trampoline code. Stack setup was fixed to clear high 16bits
of %esp, maybe that fixes some machines.
.c code sharing and morse code was done H. Peter Anvin, Sam Ravnborg
reviewed kbuild related stuff, and it seems okay to him. Rafael did
some cleanups.
[rjw:
* Made the patch stop breaking compilation on x86-32
* Added arch/x86/kernel/acpi/sleep.h
* Got rid of compiler warnings in arch/x86/kernel/acpi/sleep.c
* Fixed 32-bit compilation on x86-64 systems
* Added include/asm-x86/trampoline.h and fixed the non-SMP
compilation on 64-bit x86
* Removed arch/x86/kernel/acpi/sleep_32.c which was not used
* Fixed some breakage caused by the integration of smpboot.c done
under us in the meantime]
Signed-off-by: Pavel Machek <pavel@suse.cz>
Signed-off-by: H. Peter Anvin <hpa@zytor.com>
Reviewed-by: Sam Ravnborg <sam@ravnborg.org>
Signed-off-by: Rafael J. Wysocki <rjw@sisk.pl>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Cleanup references to the early cpu maps for the non-SMP configuration
and remove some functions called for SMP configurations only.
Cc: Andi Kleen <ak@suse.de>
Cc: Christoph Lameter <clameter@sgi.com>
Signed-off-by: Mike Travis <travis@sgi.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
UV supports really big systems. So big, in fact, that the APICID register
does not contain enough bits to contain an APICID that is unique across all
cpus.
The UV BIOS supports 3 APICID modes:
- legacy mode. This mode uses the old APIC mode where
APICID is in bits [31:24] of the APICID register.
- x2apic mode. This mode is whitebox-compatible. APICIDs
are unique across all cpus. Standard x2apic APIC operations
(Intel-defined) can be used for IPIs. The node identifier
fits within the Intel-defined portion of the APICID register.
- x2apic-uv mode. In this mode, the APICIDs on each node have
unique IDs, but IDs on different node are not unique. For example,
if each mode has 32 cpus, the APICIDs on each node might be
0 - 31. Every node has the same set of IDs.
The UV hub is used to route IPIs/interrupts to the correct node.
Traditional APIC operations WILL NOT WORK.
In x2apic-uv mode, the ACPI tables all contain a full unique ID (note:
exact bit layout still changing but the following is close):
nnnnnnnnnnlc0cch
n = unique node number
l = socket number on board
c = core
h = hyperthread
Only the "lc0cch" bits are written to the APICID register. The remaining bits are
supplied by having the get_apic_id() function "OR" the extra bits into the value
read from the APICID register. (Hmmm.. why not keep the ENTIRE APICID register
in per-cpu data....)
The x2apic-uv mode is recognized by the MADT table containing:
oem_id = "SGI"
oem_table_id = "UV-X"
Signed-off-by: Jack Steiner <steiner@sgi.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Add UV macros for converting between cpu numbers, blade numbers
and node numbers. Note that these are used ONLY within x86_64 UV
modules, and are not for general kernel use.
Signed-off-by: Jack Steiner <steiner@sgi.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Definitions of UV MMRs.
Note: this file is auto-generated by hardware design tools.
Signed-off-by: Jack Steiner <steiner@sgi.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Increase the number of bits in an apicid from 8 to 32.
By default, MP_processor_info() gets the APICID from the
mpc_config_processor structure. However, this structure limits
the size of APICID to 8 bits. This patch allows the caller of
MP_processor_info() to optionally pass a larger APICID that will
be used instead of the one in the mpc_config_processor struct.
Signed-off-by: Jack Steiner <steiner@sgi.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Add functions that can be used to determine if an x86_64
system is a SGI "UV" system. UV systems come in 3 types and
are identified by the OEM ID in the MADT.
Signed-off-by: Jack Steiner <steiner@sgi.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Introduce a function to read the local APIC_ID.
This change is in preparation for additional changes to
the APICID functions that will come in a later patch.
Signed-off-by: Jack Steiner <steiner@sgi.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
This patch renames VM_MASK to X86_VM_MASK (which
in turn defined as alias to X86_EFLAGS_VM) to better
distinguish from virtual memory flags. We can't just
use X86_EFLAGS_VM instead because it is also used
for conditional compilation
Signed-off-by: Cyrill Gorcunov <gorcunov@gmail.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
A 1G section size makes memory hotplug too coarse in a virtual
environment. Retuce it by a factor of 2 to 512M. I would have liked
to make it smaller, but it runs out of reserved flags in the page flags.
Signed-off-by: Jeremy Fitzhardinge <jeremy.fitzhardinge@citrix.com>
Cc: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>
Cc: Yasunori Goto <y-goto@jp.fujitsu.com>
Cc: Christoph Lameter <clameter@sgi.com>
Cc: Dave Hansen <dave@linux.vnet.ibm.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Merge what's left from smp_32.h and smp_64.h into smp.h
By now, they're basically extern definitions.
Signed-off-by: Glauber Costa <gcosta@redhat.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
we merge everything that is inside CONFIG_SMP
to smp.h. They differ a little bit, so we use
CONFIG_X86_32_SMP and CONFIG_X86_64_SMP as markers.
Signed-off-by: Glauber Costa <gcosta@redhat.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
This implementation in x86_64 is clean and consistent, but we
sacrifice it for the sake of being equal to i386 (since the other
way around would be harder).
Signed-off-by: Glauber Costa <gcosta@redhat.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Although those constants are always defined in x86_64,
and will have the effect of just including the headers
in the very way we did before, I'm doing this in a separate
patch to be conservative and avoid surprises.
Signed-off-by: Glauber Costa <gcosta@redhat.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
The code is now the same between i386 and x86_64. We already
know what happens when it reaches this point: They go away
from the arch-specific headers, and suddenly appears in the common
header.
Signed-off-by: Glauber Costa <gcosta@redhat.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
We provide a bogus macro for x86_64 in case CONFIG_X86_LOCAL_APIC
is not set. It will always be set for x86_64, so the effect
is just to make the code equal to i386.
Signed-off-by: Glauber Costa <gcosta@redhat.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
APIC_DEFINITION is not defined in x86_64, so in practice, we keep
our old code here. But as a nice side effect, the code is now
equal to smp_32.h.
Signed-off-by: Glauber Costa <gcosta@redhat.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
The new cacheflush.h API's didn't have any comments describing
how they're to be used yet and the conventions around these functions.
This patch adds comments to this effect; in order for that to be
a logical series, some prototypes had to move around.
Signed-off-by: Arjan van de Ven <arjan@linux.intel.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Using a naked parameterless macro could lead to other tokens being
unexpectedly replaced.
Signed-off-by: Jeremy Fitzhardinge <jeremy.fitzhardinge@citrix.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
When compilers became generally better at optimizing code than humans, the
register keyword became mostly useless. For the floppy driver it certainly
is since it's so slow compared to the rest of the system that optimizing
access to a single variable or two isn't going to make any real difference
So let's just leave it to the compiler - it'll do a better job anyway.
This patch does away with a few register keywords in the x86 floppy driver.
Signed-off-by: Jesper Juhl <jesper.juhl@gmail.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
On AMD SMM protected memory is part of the address map, but handled
internally like an MTRR. That leads to large pages getting split
internally which has some performance implications. Check for the
AMD TSEG MSR and split the large page mapping on that area
explicitely if it is part of the direct mapping.
There is also SMM ASEG, but it is in the first 1MB and already covered by
the earlier split first page patch.
Idea for this came from an earlier patch by Andreas Herrmann
On a RevF dual Socket Opteron system kernbench shows a clear
improvement from this:
(together with the earlier patches in this series, especially the
split first 2MB patch)
[lower is better]
no split stddev split stddev delta
Elapsed Time 87.146 (0.727516) 84.296 (1.09098) -3.2%
User Time 274.537 (4.05226) 273.692 (3.34344) -0.3%
System Time 34.907 (0.42492) 34.508 (0.26832) -1.1%
Percent CPU 322.5 (38.3007) 326.5 (44.5128) +1.2%
=> About 3.2% improvement in elapsed time for kernbench.
With GB pages on AMD Fam1h the impact of splitting is much higher of course,
since it would split two full GB pages (together with the first
1MB split patch) instead of two 2MB pages. I could not benchmark
a clear difference in kernbench on gbpages, so I kept it disabled
for that case
That was only limited benchmarking of course, so if someone
was interested in running more tests for the gbpages case
that could be revisited (contributions welcome)
I didn't bother implementing this for 32bit because it is very
unlikely the 32bit lowmem mapping overlaps into the TSEG near 4GB
and the 2MB low split is already handled for both.
[ mingo@elte.hu: do it on gbpages kernels too, there's no clear reason
why it shouldnt help there. ]
Signed-off-by: Andi Kleen <ak@suse.de>
Acked-by: andreas.herrmann3@amd.com
Cc: mingo@elte.hu
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
RDMSR for 64bit values with exception handling.
Makes it easier to deal with 64bit valued MSRs. The old 64bit code
base had that too as checking_rdmsrl(), but it got dropped somehow.
Signed-off-by: Andi Kleen <andi@firstfloor.org>
Cc: andreas.herrmann3@amd.com
Cc: mingo@elte.hu
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Ingo Molnar <mingo@elte.hu>