Commit Graph

2388 Commits

Author SHA1 Message Date
Lasse Collin
303148045a x86: support XZ-compressed kernel
This integrates the XZ decompression code to the x86 pre-boot code.

mkpiggy.c is updated to reserve about 32 KiB more buffer safety margin for
kernel decompression.  It is done unconditionally for all decompressors to
keep the code simpler.

The XZ decompressor needs around 30 KiB of heap, so the heap size is
increased to 32 KiB on both x86-32 and x86-64.

Documentation/x86/boot.txt is updated to list the XZ magic number.

With the x86 BCJ filter in XZ, XZ-compressed x86 kernel tends to be a few
percent smaller than the equivalent LZMA-compressed kernel.

Signed-off-by: Lasse Collin <lasse.collin@tukaani.org>
Cc: "H. Peter Anvin" <hpa@zytor.com>
Cc: Alain Knaff <alain@knaff.lu>
Cc: Albin Tonnerre <albin.tonnerre@free-electrons.com>
Cc: Phillip Lougher <phillip@lougher.demon.co.uk>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2011-01-13 08:03:25 -08:00
Andres Salomon
7637c9259f drivers/staging/olpc_dcon: convert to new cs5535 gpio API
Drop the old geode_gpio crud, as well as the raw outl() calls; instead,
use the Linux GPIO API where possible, and the cs5535_gpio API in other
places.

Note that we don't actually clean up the driver properly yet (once loaded,
it always remains loaded).  That'll come later..

This patch is necessary for building the driver.

Signed-off-by: Andres Salomon <dilinger@queued.net>
Cc: Greg KH <greg@kroah.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2011-01-13 08:03:13 -08:00
Linus Torvalds
16ee8db6a9 Merge branch 'x86-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip
* 'x86-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip:
  x86: Fix Moorestown VRTC fixmap placement
  x86/gpio: Implement x86 gpio_to_irq convert function
  x86, UV: Fix APICID shift for Westmere processors
  x86: Use PCI method for enabling AMD extended config space before MSR method
  x86: tsc: Prevent delayed init if initial tsc calibration failed
  x86, lapic-timer: Increase the max_delta to 31 bits
  x86: Fix sparse non-ANSI function warnings in smpboot.c
  x86, numa: Fix CONFIG_DEBUG_PER_CPU_MAPS without NUMA emulation
  x86, AMD, PCI: Add AMD northbridge PCI device id for CPU families 12h and 14h
  x86, numa: Fix cpu to node mapping for sparse node ids
  x86, numa: Fake node-to-cpumask for NUMA emulation
  x86, numa: Fake apicid and pxm mappings for NUMA emulation
  x86, numa: Avoid compiling NUMA emulation functions without CONFIG_NUMA_EMU
  x86, numa: Reduce minimum fake node size to 32M

Fix up trivial conflict in arch/x86/kernel/apic/x2apic_uv_x.c
2011-01-11 11:11:46 -08:00
Linus Torvalds
42776163e1 Merge branch 'perf-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip
* 'perf-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip: (28 commits)
  perf session: Fix infinite loop in __perf_session__process_events
  perf evsel: Support perf_evsel__open(cpus > 1 && threads > 1)
  perf sched: Use PTHREAD_STACK_MIN to avoid pthread_attr_setstacksize() fail
  perf tools: Emit clearer message for sys_perf_event_open ENOENT return
  perf stat: better error message for unsupported events
  perf sched: Fix allocation result check
  perf, x86: P4 PMU - Fix unflagged overflows handling
  dynamic debug: Fix build issue with older gcc
  tracing: Fix TRACE_EVENT power tracepoint creation
  tracing: Fix preempt count leak
  tracepoint: Add __rcu annotation
  tracing: remove duplicate null-pointer check in skb tracepoint
  tracing/trivial: Add missing comma in TRACE_EVENT comment
  tracing: Include module.h in define_trace.h
  x86: Save rbp in pt_regs on irq entry
  x86, dumpstack: Fix unused variable warning
  x86, NMI: Clean-up default_do_nmi()
  x86, NMI: Allow NMI reason io port (0x61) to be processed on any CPU
  x86, NMI: Remove DIE_NMI_IPI
  x86, NMI: Add priorities to handlers
  ...
2011-01-11 11:02:13 -08:00
Arjan van de Ven
fa36e956c5 x86: Fix Moorestown VRTC fixmap placement
The x86 fixmaps need to be all together... unfortunately the
VRTC one was misplaced.

This patch makes sure the MRST VRTC fixmap is put prior to the
__end_of_permanent_fixed_addresses marker.

Signed-off-by: Arjan van de Ven <arjan@linux.intel.com>
Signed-off-by: Alan Cox <alan@linux.intel.com>
LKML-Reference: <20110111105544.24448.27607.stgit@bob.linux.org.uk>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2011-01-11 12:46:16 +01:00
Alek Du
718c45bd1a x86/gpio: Implement x86 gpio_to_irq convert function
We need this for x86 MID platforms where GPIO interrupts are
used. No special magic is needed so the default 1:1 behaviour
will do nicely.

Signed-off-by: Alek Du <alek.du@intel.com>
Signed-off-by: Jacob Pan <jacob.jun.pan@intel.com>
Signed-off-by: Alan Cox <alan@linux.intel.com>
LKML-Reference: <20110111105439.24448.69863.stgit@bob.linux.org.uk>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2011-01-11 12:46:15 +01:00
Jan Beulich
24d9b70b8c x86: Use PCI method for enabling AMD extended config space before MSR method
While both methods should work equivalently well for the native
case, the Xen Dom0 case can't reliably work with the MSR one,
since there's no guarantee that the virtual CPUs it has
available fully cover all necessary physical ones.

As per the suggestion of Robert Richter the patch only adds the
PCI method, but leaves the MSR one as a fallback to cover new
systems the PCI IDs of which may not have got added to the code
base yet.

The only change in v2 is the breaking out of the new CPI
initialization method into a separate function, as requested by
Ingo.

Signed-off-by: Jan Beulich <jbeulich@novell.com>
Acked-by: Robert Richter <robert.richter@amd.com>
Cc: Andreas Herrmann3 <Andreas.Herrmann3@amd.com>
Cc: Joerg Roedel <joerg.roedel@amd.com>
Cc: Jeremy Fitzhardinge <jeremy@goop.org>
LKML-Reference: <4D2B3FD7020000780002B67D@vpn.id2.novell.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2011-01-11 12:43:41 +01:00
Linus Torvalds
8c8ae4e8cd Merge branch 'stable/generic' of git://git.kernel.org/pub/scm/linux/kernel/git/konrad/xen
* 'stable/generic' of git://git.kernel.org/pub/scm/linux/kernel/git/konrad/xen:
  xen: HVM X2APIC support
  apic: Move hypervisor detection of x2apic to hypervisor.h
2011-01-10 08:48:29 -08:00
Ingo Molnar
9adcc4a127 Merge branch 'x86/numa' into x86/urgent
Merge reason: Topic is ready for upstream.

Signed-off-by: Ingo Molnar <mingo@elte.hu>
2011-01-10 09:36:07 +01:00
Ingo Molnar
4385428a47 Merge branch 'tip/perf/core' of git://git.kernel.org/pub/scm/linux/kernel/git/rostedt/linux-2.6-trace into perf/urgent 2011-01-09 10:42:21 +01:00
Cyrill Gorcunov
047a3772fe perf, x86: P4 PMU - Fix unflagged overflows handling
Don found that P4 PMU reads CCCR register instead of counter
itself (in attempt to catch unflagged event) this makes P4
NMI handler to consume all NMIs it observes. So the other
NMI users such as kgdb simply have no chance to get NMI
on their hands.

Side note: at moment there is no way to run nmi-watchdog
together with perf tool. This is because both 'perf top' and
nmi-watchdog use same event. So while nmi-watchdog reserves
one event/counter for own needs there is no room for perf tool
left (there is a way to disable nmi-watchdog on boot of course).

Ming has tested this patch with the following results

 | 1. watchdog disabled
 |
 | kgdb tests on boot OK
 | perf works OK
 |
 | 2. watchdog enabled, without patch perf-x86-p4-nmi-4
 |
 | kgdb tests on boot hang
 |
 | 3. watchdog enabled, without patch perf-x86-p4-nmi-4 and do not run kgdb
 | tests on boot
 |
 | "perf top" partialy works
 |   cpu-cycles            no
 |   instructions          yes
 |   cache-references      no
 |   cache-misses          no
 |   branch-instructions   no
 |   branch-misses         yes
 |   bus-cycles            no
 |
 | 4. watchdog enabled, with patch perf-x86-p4-nmi-4 applied
 |
 | kgdb tests on boot OK
 | perf does not work, NMI "Dazed and confused" messages show up
 |

Which means we still have problems with p4 box due to 'unknown'
nmi happens but at least it should fix kgdb test cases.

Reported-by: Jason Wessel <jason.wessel@windriver.com>
Reported-by: Don Zickus <dzickus@redhat.com>
Signed-off-by: Cyrill Gorcunov <gorcunov@openvz.org>
Acked-by: Don Zickus <dzickus@redhat.com>
Acked-by: Lin Ming <ming.m.lin@intel.com>
Cc: Stephane Eranian <eranian@google.com>
Cc: Peter Zijlstra <peterz@infradead.org>
LKML-Reference: <4D275E7E.3040903@gmail.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2011-01-09 10:40:52 +01:00
Linus Torvalds
72eb6a7914 Merge branch 'for-2.6.38' of git://git.kernel.org/pub/scm/linux/kernel/git/tj/percpu
* 'for-2.6.38' of git://git.kernel.org/pub/scm/linux/kernel/git/tj/percpu: (30 commits)
  gameport: use this_cpu_read instead of lookup
  x86: udelay: Use this_cpu_read to avoid address calculation
  x86: Use this_cpu_inc_return for nmi counter
  x86: Replace uses of current_cpu_data with this_cpu ops
  x86: Use this_cpu_ops to optimize code
  vmstat: User per cpu atomics to avoid interrupt disable / enable
  irq_work: Use per cpu atomics instead of regular atomics
  cpuops: Use cmpxchg for xchg to avoid lock semantics
  x86: this_cpu_cmpxchg and this_cpu_xchg operations
  percpu: Generic this_cpu_cmpxchg() and this_cpu_xchg support
  percpu,x86: relocate this_cpu_add_return() and friends
  connector: Use this_cpu operations
  xen: Use this_cpu_inc_return
  taskstats: Use this_cpu_ops
  random: Use this_cpu_inc_return
  fs: Use this_cpu_inc_return in buffer.c
  highmem: Use this_cpu_xx_return() operations
  vmstat: Use this_cpu_inc_return for vm statistics
  x86: Support for this_cpu_add, sub, dec, inc_return
  percpu: Generic support for this_cpu_add, sub, dec, inc_return
  ...

Fixed up conflicts: in arch/x86/kernel/{apic/nmi.c, apic/x2apic_uv_x.c, process.c}
as per Tejun.
2011-01-07 17:02:58 -08:00
Sheng Yang
d9b8ca8474 xen: HVM X2APIC support
This patch is similiar to Gleb Natapov's patch for KVM, which enable the
hypervisor to emulate x2apic feature for the guest. By this way, the emulation
of lapic would be simpler with x2apic interface(MSR), and faster.

[v2: Re-organized 'xen_hvm_need_lapic' per Ian Campbell suggestion]

Acked-by: Jeremy Fitzhardinge <jeremy.fitzhardinge@citrix.com>
Signed-off-by: Sheng Yang <sheng@linux.intel.com>
Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
2011-01-07 10:03:50 -05:00
Sheng Yang
2904ed8dd5 apic: Move hypervisor detection of x2apic to hypervisor.h
Then we can reuse it for Xen later.

Acked-by: Jeremy Fitzhardinge <jeremy.fitzhardinge@citrix.com>
Acked-by: Avi Kivity <avi@redhat.com>
Acked-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Sheng Yang <sheng@linux.intel.com>
Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
2011-01-07 10:03:49 -05:00
Don Zickus
c410b83077 x86, NMI: Remove DIE_NMI_IPI
With priorities in place and no one really understanding the difference between
DIE_NMI and DIE_NMI_IPI, just remove DIE_NMI_IPI and convert everyone to DIE_NMI.

This also simplifies default_do_nmi() a little bit.  Instead of calling the
die_notifier in both the if and else part, just pull it out and call it before
the if-statement.  This has the side benefit of avoiding a call to the ioport
to see if there is an external NMI sitting around until after the (more frequent)
internal NMIs are dealt with.

Patch-Inspired-by: Huang Ying <ying.huang@intel.com>
Signed-off-by: Don Zickus <dzickus@redhat.com>
Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
LKML-Reference: <1294348732-15030-5-git-send-email-dzickus@redhat.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2011-01-07 15:08:53 +01:00
Don Zickus
166d751479 x86, NMI: Add priorities to handlers
In order to consolidate the NMI die_chain events, we need to setup the priorities
for the die notifiers.

I started by defining a bunch of common priorities that can be used by the
notifier blocks.  Then I modified the notifier blocks to use the newly created
priorities.

Now that the priorities are straightened out, it should be easier to remove the
event DIE_NMI_IPI.

Signed-off-by: Don Zickus <dzickus@redhat.com>
Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
LKML-Reference: <1294348732-15030-4-git-send-email-dzickus@redhat.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2011-01-07 15:08:52 +01:00
Huang Ying
1c7b74d46f x86, NMI: Add NMI symbol constants and rename memory parity to PCI SERR
Replace the NMI related magic numbers with symbol constants.

Memory parity error is only valid for IBM PC-AT, newer machine use
bit 7 (0x80) of 0x61 port for PCI SERR. While memory error is usually
reported via MCE. So corresponding function name and kernel log string
is changed.

But on some machines, PCI SERR line is still used to report memory
errors. This is used by EDAC, so corresponding EDAC call is reserved.

Signed-off-by: Huang Ying <ying.huang@intel.com>
Signed-off-by: Don Zickus <dzickus@redhat.com>
Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
LKML-Reference: <1294348732-15030-2-git-send-email-dzickus@redhat.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2011-01-07 15:08:51 +01:00
Ingo Molnar
1c2a48cf65 Merge branch 'linus' into x86/apic-cleanups
Conflicts:
	arch/x86/include/asm/io_apic.h

Merge reason: Resolve the conflict, update to a more recent -rc base

Signed-off-by: Ingo Molnar <mingo@elte.hu>
2011-01-07 14:14:15 +01:00
Linus Torvalds
47935a731b Merge branches 'x86-alternatives-for-linus', 'x86-fpu-for-linus', 'x86-hwmon-for-linus', 'x86-paravirt-for-linus', 'core-locking-for-linus' and 'irq-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip
* 'x86-alternatives-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip:
  x86, suspend: Avoid unnecessary smp alternatives switch during suspend/resume

* 'x86-fpu-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip:
  x86-64, asm: Use fxsaveq/fxrestorq in more places

* 'x86-hwmon-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip:
  x86, hwmon: Add core threshold notification to therm_throt.c

* 'x86-paravirt-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip:
  x86, paravirt: Use native_halt on a halt, not native_safe_halt

* 'core-locking-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip:
  locking, lockdep: Convert sprintf_symbol to %pS

* 'irq-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip:
  irq: Better struct irqaction layout
2011-01-06 11:11:50 -08:00
Linus Torvalds
77a0dd54ba Merge branch 'x86-uv-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip
* 'x86-uv-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip:
  x86, UV, BAU: Extend for more than 16 cpus per socket
  x86, UV: Fix the effect of extra bits in the hub nodeid register
  x86, UV: Add common uv_early_read_mmr() function for reading MMRs
2011-01-06 11:09:57 -08:00
Linus Torvalds
4f00b901d4 Merge branch 'x86-security-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip
* 'x86-security-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip:
  module: Move RO/NX module protection to after ftrace module update
  x86: Resume trampoline must be executable
  x86: Add RO/NX protection for loadable kernel modules
  x86: Add NX protection for kernel data
  x86: Fix improper large page preservation
2011-01-06 11:07:33 -08:00
Linus Torvalds
b4c6e2ea5e Merge branch 'x86-platform-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip
* 'x86-platform-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip:
  x86, earlyprintk: Move mrst early console to platform/ and fix a typo
  x86, apbt: Setup affinity for apb timers acting as per-cpu timer
  ce4100: Add errata fixes for UART on CE4100
  x86: platform: Move iris to x86/platform where it belongs
  x86, mrst: Check platform_device_register() return code
  x86/platform: Add Eurobraille/Iris power off support
  x86, mrst: Add explanation for using 1960 as the year offset for vrtc
  x86, mrst: Fix dependencies of "select INTEL_SCU_IPC"
  x86, mrst: The shutdown for MRST requires the SCU IPC mechanism
  x86: Ce4100: Add reboot_fixup() for CE4100
  ce4100: Add PCI register emulation for CE4100
  x86: Add CE4100 platform support
  x86: mrst: Set vRTC's IRQ to level trigger type
  x86: mrst: Add audio driver bindings
  rtc: Add drivers/rtc/rtc-mrst.c
  x86: mrst: Add vrtc driver which serves as a wall clock device
  x86: mrst: Add Moorestown specific reboot/shutdown support
  x86: mrst: Parse SFI timer table for all timer configs
  x86/mrst: Add SFI platform device parsing code
2011-01-06 11:06:31 -08:00
Linus Torvalds
6f46b120a9 Merge branch 'x86-microcode-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip
* 'x86-microcode-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip:
  x86, microcode, AMD: Cleanup code a bit
  x86, microcode, AMD: Replace vmalloc+memset with vzalloc
2011-01-06 11:06:09 -08:00
Linus Torvalds
017892c341 Merge branch 'x86-apic-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip
* 'x86-apic-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip:
  x86: Fix APIC ID sizing bug on larger systems, clean up MAX_APICS confusion
  x86, acpi: Parse all SRAT cpu entries even above the cpu number limitation
  x86, acpi: Add MAX_LOCAL_APIC for 32bit
  x86: io_apic: Split setup_ioapic_ids_from_mpc()
  x86: io_apic: Fix CONFIG_X86_IO_APIC=n breakage
  x86: apic: Move probe_nr_irqs_gsi() into ioapic_init_mappings()
  x86: Allow platforms to force enable apic
2011-01-06 10:51:36 -08:00
Linus Torvalds
42cbd8efb0 Merge branch 'x86-amd-nb-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip
* 'x86-amd-nb-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip:
  x86, cacheinfo: Cleanup L3 cache index disable support
  x86, amd-nb: Cleanup AMD northbridge caching code
  x86, amd-nb: Complete the rename of AMD NB and related code
2011-01-06 10:50:28 -08:00
Yinghai Lu
cb2ded37fd x86: Fix APIC ID sizing bug on larger systems, clean up MAX_APICS confusion
Found one x2apic pre-enabled system, x2apic_mode suddenly get
corrupted after register some cpus, when compiled
CONFIG_NR_CPUS=255 instead of 512.

It turns out that generic_processor_info() ==> phyid_set(apicid,
phys_cpu_present_map) causes the problem.

phys_cpu_present_map is sized by MAX_APICS bits, and pre-enabled
system some cpus have an apic id > 255.

The variable after phys_cpu_present_map may get corrupted
silently:

 ffffffff828e8420 B phys_cpu_present_map
 ffffffff828e8440 B apic_verbosity
 ffffffff828e8444 B local_apic_timer_c2_ok
 ffffffff828e8448 B disable_apic
 ffffffff828e844c B x2apic_mode
 ffffffff828e8450 B x2apic_disabled
 ffffffff828e8454 B num_processors
 ...

Actually phys_cpu_present_map is referenced via apic id, instead
index. We should use MAX_LOCAL_APIC instead MAX_APICS.

For 64-bit it will be 32768 in all cases. BSS will increase by 4k bytes
on 64-bit:

	text		data		bss		dec		filename
	21696943	4193748		12787712	38678403	vmlinux.before
	21696943	4193748		12791808	38682499	vmlinux.after

No change on 32bit.

Finally we can remove MAX_APCIS that was rather confusing.

Signed-off-by: Yinghai Lu <yinghai@kernel.org>
Cc: H. Peter Anvin <hpa@linux.intel.com>
Cc: "Eric W. Biederman" <ebiederm@xmission.com>
LKML-Reference: <4D23BD9C.3070102@kernel.org>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2011-01-05 14:09:23 +01:00
Ingo Molnar
bc030d6cb9 Merge commit 'v2.6.37-rc8' into x86/apic
Conflicts:
	arch/x86/include/asm/io_apic.h

Merge reason: move to a fresh -rc, resolve the conflict.

Signed-off-by: Ingo Molnar <mingo@elte.hu>
2011-01-04 09:43:42 +01:00
Cliff Wickman
cfa60917f0 x86, UV, BAU: Extend for more than 16 cpus per socket
Fix a hard-coded limit of a maximum of 16 cpu's per socket.

The UV Broadcast Assist Unit code initializes by scanning the
cpu topology of the system and assigning a master cpu for each
socket and UV hub. That scan had an assumption of a limit of 16
cpus per socket. With Westmere we are going over that limit.
The UV hub hardware will allow up to 32.

If the scan finds the system has gone over that limit it returns
an error and we print a warning and fall back to doing TLB
shootdowns without the BAU.

Signed-off-by: Cliff Wickman <cpw@sgi.com>
Cc: <stable@kernel.org> # .37.x
LKML-Reference: <E1PZol7-0000mM-77@eag09.americas.sgi.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2011-01-03 20:35:03 +01:00
R, Durgadoss
9e76a97efd x86, hwmon: Add core threshold notification to therm_throt.c
This patch adds code to therm_throt.c to notify core thermal threshold
events. These thresholds are supported by the IA32_THERM_INTERRUPT register.
The status/log for the same is monitored using the IA32_THERM_STATUS register.
The necessary #defines are in msr-index.h. A call back is added to mce.h, to
further notify the thermal stack, about the threshold events.

Signed-off-by: Durgadoss R <durgadoss.r@intel.com>
LKML-Reference: <D6D887BA8C9DFF48B5233887EF04654105C1251710@bgsmsx502.gar.corp.intel.com>
Signed-off-by: H. Peter Anvin <hpa@linux.intel.com>
2011-01-03 08:30:30 -08:00
Tejun Heo
7b543a5334 x86: Replace uses of current_cpu_data with this_cpu ops
Replace all uses of current_cpu_data with this_cpu operations on the
per cpu structure cpu_info.  The scala accesses are replaced with the
matching this_cpu ops which results in smaller and more efficient
code.

In the long run, it might be a good idea to remove cpu_data() macro
too and use per_cpu macro directly.

tj: updated description

Cc: Yinghai Lu <yinghai@kernel.org>
Cc: Ingo Molnar <mingo@elte.hu>
Acked-by: H. Peter Anvin <hpa@zytor.com>
Acked-by: Tejun Heo <tj@kernel.org>
Signed-off-by: Christoph Lameter <cl@linux.com>
Signed-off-by: Tejun Heo <tj@kernel.org>
2010-12-30 12:22:03 +01:00
Tejun Heo
0a3aee0da4 x86: Use this_cpu_ops to optimize code
Go through x86 code and replace __get_cpu_var and get_cpu_var
instances that refer to a scalar and are not used for address
determinations.

Cc: Yinghai Lu <yinghai@kernel.org>
Cc: Ingo Molnar <mingo@elte.hu>
Acked-by: Tejun Heo <tj@kernel.org>
Acked-by: "H. Peter Anvin" <hpa@zytor.com>
Signed-off-by: Christoph Lameter <cl@linux.com>
Signed-off-by: Tejun Heo <tj@kernel.org>
2010-12-30 12:20:28 +01:00
Ingo Molnar
56f4c40034 Merge branch 'core' of git://git.kernel.org/pub/scm/linux/kernel/git/rric/oprofile into perf/core 2010-12-30 11:26:45 +01:00
H. Peter Anvin
d50e8fc7e3 Merge branch 'x86/apic-cleanups' into x86/numa 2010-12-29 11:36:26 -08:00
Cliff Wickman
c8217b8305 x86, paravirt: Use native_halt on a halt, not native_safe_halt
halt() should use native_halt()
safe_halt() uses native_safe_halt()

If CONFIG_PARAVIRT=y, halt() is defined in arch/x86/include/asm/paravirt.h as

static inline void halt(void)
{
        PVOP_VCALL0(pv_irq_ops.safe_halt);
}

Otherwise (no CONFIG_PARAVIRT) halt() in arch/x86/include/asm/irqflags.h is

static inline void halt(void)
{
        native_halt();
}

So it looks to me like the CONFIG_PARAVIRT case of using native_safe_halt()
for a halt() is an oversight.
Am I missing something?

It probably hasn't shown up as a problem because the local apic is disabled
on a shutdown or restart.  But if we disable interrupts and call halt()
we shouldn't expect that the halt() will re-enable interrupts.

Signed-off-by: Cliff Wickman <cpw@sgi.com>
LKML-Reference: <E1PSBcz-0001g1-FM@eag09.americas.sgi.com>
Signed-off-by: H. Peter Anvin <hpa@linux.intel.com>
2010-12-27 14:02:11 -08:00
David Rientjes
a387e95a49 x86, numa: Fix cpu to node mapping for sparse node ids
NUMA boot code assumes that physical node ids start at 0, but the DIMMs
that the apic id represents may not be reachable.  If this is the case,
node 0 is never online and cpus never end up getting appropriately
assigned to a node.  This causes the cpumask of all online nodes to be
empty and machines crash with kernel code assuming online nodes have
valid cpus.

The fix is to appropriately map all the address ranges for physical nodes
and ensure the cpu to node mapping function checks all possible nodes (up
to MAX_NUMNODES) instead of simply checking nodes 0-N, where N is the
number of physical nodes, for valid address ranges.

This requires no longer "compressing" the address ranges of nodes in the
physical node map from 0-N, but rather leave indices in physnodes[] to
represent the actual node id of the physical node.  Accordingly, the
topology exported by both amd_get_nodes() and acpi_get_nodes() no longer
must return the number of nodes to iterate through; all such iterations
will now be to MAX_NUMNODES.

This change also passes the end address of system RAM (which may be
different from normal operation if mem= is specified on the command line)
before the physnodes[] array is populated.  ACPI parsed nodes are
truncated to fit within the address range that respect the mem=
boundaries and even some physical nodes may become unreachable in such
cases.

When NUMA emulation does succeed, any apicid to node mapping that exists
for unreachable nodes are given default values so that proximity domains
can still be assigned.  This is important for node_distance() to
function as desired.

Signed-off-by: David Rientjes <rientjes@google.com>
LKML-Reference: <alpine.DEB.2.00.1012221702090.3701@chino.kir.corp.google.com>
Signed-off-by: H. Peter Anvin <hpa@linux.intel.com>
2010-12-23 15:27:16 -08:00
David Rientjes
f51bf3073a x86, numa: Fake apicid and pxm mappings for NUMA emulation
This patch adds the equivalent of acpi_fake_nodes() for AMD Northbridge
platforms.  The goal is to fake the apicid-to-node mappings for NUMA
emulation so the physical topology of the machine is correctly maintained
within the kernel.

This change also fakes proximity domains for both ACPI and k8 code so the
physical distance between emulated nodes is maintained via
node_distance().  This exports the correct distances via
/sys/devices/system/node/.../distance based on the underlying topology.

A new helper function, fake_physnodes(), is introduced to correctly
invoke the correct NUMA code to fake these two mappings based on the
system type.  If there is no underlying NUMA configuration, all cpus are
mapped to node 0 for local distance.

Since acpi_fake_nodes() is no longer called with CONFIG_ACPI_NUMA, it's
prototype can be removed from the header file for such a configuration.

Signed-off-by: David Rientjes <rientjes@google.com>
LKML-Reference: <alpine.DEB.2.00.1012221701360.3701@chino.kir.corp.google.com>
Signed-off-by: H. Peter Anvin <hpa@linux.intel.com>
2010-12-23 15:27:14 -08:00
David Rientjes
4e76f4e67a x86, numa: Avoid compiling NUMA emulation functions without CONFIG_NUMA_EMU
Both acpi_get_nodes() and amd_get_nodes() are only necessary when
CONFIG_NUMA_EMU is enabled, so avoid compiling them when the option is
disabled.

Signed-off-by: David Rientjes <rientjes@google.com>
LKML-Reference: <alpine.DEB.2.00.1012221701210.3701@chino.kir.corp.google.com>
Signed-off-by: H. Peter Anvin <hpa@linux.intel.com>
2010-12-23 15:27:12 -08:00
David Rientjes
34dc9e7496 x86, numa: Reduce minimum fake node size to 32M
This patch changes the minimum fake node size from 64MB to 32MB so it is
possible to test NUMA code at a greater scale on smaller machines
(64 nodes on a 2G machine, 1024 nodes on 32G machine with
CONFIG_NODES_SHIFT=10).

Signed-off-by: David Rientjes <rientjes@google.com>
LKML-Reference: <alpine.DEB.2.00.1012221700590.3701@chino.kir.corp.google.com>
Signed-off-by: H. Peter Anvin <hpa@linux.intel.com>
2010-12-23 15:27:10 -08:00
Yinghai Lu
56d91f132c x86, acpi: Add MAX_LOCAL_APIC for 32bit
We should use MAX_LOCAL_APIC for max apic ids and MAX_APICS as number
of local apics.

Also apic_version[] array should use MAX_LOCAL_APICs.

Signed-off-by: Yinghai Lu <yinghai@kernel.org>
LKML-Reference: <4D0AD464.2020408@kernel.org>
Signed-off-by: H. Peter Anvin <hpa@linux.intel.com>
2010-12-23 13:15:53 -08:00
Ingo Molnar
26e20a108c Merge commit 'v2.6.37-rc7' into x86/security 2010-12-23 09:48:41 +01:00
Don Zickus
4a7863cc2e x86, nmi_watchdog: Remove ARCH_HAS_NMI_WATCHDOG and rely on CONFIG_HARDLOCKUP_DETECTOR
The x86 arch has shifted its use of the nmi_watchdog from a
local implementation to the global one provide by
kernel/watchdog.c.  This shift has caused a whole bunch of
compile problems under different config options.  I attempt to
simplify things with the patch below.

In order to simplify things, I had to come to terms with the
meaning of two terms ARCH_HAS_NMI_WATCHDOG and
CONFIG_HARDLOCKUP_DETECTOR.  Basically they mean the same thing,
the former on a local level and the latter on a global level.

With the old x86 nmi watchdog gone, there is no need to rely on
defining the ARCH_HAS_NMI_WATCHDOG variable because it doesn't
make sense any more.  x86 will now use the global
implementation.

The changes below do a few things.  First it changes the few
places that relied on ARCH_HAS_NMI_WATCHDOG to use
CONFIG_X86_LOCAL_APIC (the former was an alias for the latter
anyway, so nothing unusual here).  Those pieces of code were
relying more on local apic functionality the nmi watchdog
functionality, so the change should make sense.

Second, I removed the x86 implementation of
touch_nmi_watchdog().  It isn't need now, instead x86 will rely
on kernel/watchdog.c's implementation.

Third, I removed the #define ARCH_HAS_NMI_WATCHDOG itself from
x86.  And tweaked the include/linux/nmi.h file to tell users to
look for an externally defined touch_nmi_watchdog in the case of
ARCH_HAS_NMI_WATCHDOG _or_ CONFIG_HARDLOCKUP_DETECTOR. This
changes removes some of the ugliness in that file.

Finally, I added a Kconfig dependency for
CONFIG_HARDLOCKUP_DETECTOR that said you can't have
ARCH_HAS_NMI_WATCHDOG _and_ CONFIG_HARDLOCKUP_DETECTOR.  You can
only have one nmi_watchdog.

Tested with
ARCH=i386: allnoconfig, defconfig, allyesconfig, (various broken
configs) ARCH=x86_64: allnoconfig, defconfig, allyesconfig,
(various broken configs)

Hopefully, after this patch I won't get any more compile broken
emails. :-)

v3:
  changed a couple of 'linux/nmi.h' -> 'asm/nmi.h' to pick-up correct function
  prototypes when CONFIG_HARDLOCKUP_DETECTOR is not set.

Signed-off-by: Don Zickus <dzickus@redhat.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: fweisbec@gmail.com
LKML-Reference: <1293044403-14117-1-git-send-email-dzickus@redhat.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2010-12-22 22:15:32 +01:00
Ingo Molnar
6c529a266b Merge commit 'v2.6.37-rc7' into perf/core
Merge reason: Pick up the latest -rc.

Signed-off-by: Ingo Molnar <mingo@elte.hu>
2010-12-22 11:53:23 +01:00
Robert Richter
da169f5df2 oprofile, x86: Add support for 6 counters (AMD family 15h)
This patch adds support for up to 6 hardware counters for AMD family
15h cpus. There is a new MSR range for hardware counters beginning at
MSRC001_0200 Performance Event Select (PERF_CTL0).

Signed-off-by: Robert Richter <robert.richter@amd.com>
2010-12-19 11:43:08 +01:00
Linus Torvalds
46bdfe6a50 Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jbarnes/pci-2.6
* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jbarnes/pci-2.6:
  x86: avoid high BIOS area when allocating address space
  x86: avoid E820 regions when allocating address space
  x86: avoid low BIOS area when allocating address space
  resources: add arch hook for preventing allocation in reserved areas
  Revert "resources: support allocating space within a region from the top down"
  Revert "PCI: allocate bus resources from the top down"
  Revert "x86/PCI: allocate space from the end of a region, not the beginning"
  Revert "x86: allocate space within a region top-down"
  Revert "PCI: fix pci_bus_alloc_resource() hang, prefer positive decode"
  PCI: Update MCP55 quirk to not affect non HyperTransport variants
2010-12-18 10:13:24 -08:00
Christoph Lameter
8270137a0d cpuops: Use cmpxchg for xchg to avoid lock semantics
Use cmpxchg instead of xchg to realize this_cpu_xchg.

xchg will cause LOCK overhead since LOCK is always implied but cmpxchg
will not.

Baselines:

xchg()		= 18 cycles (no segment prefix, LOCK semantics)
__this_cpu_xchg = 1 cycle

(simulated using this_cpu_read/write, two prefixes. Looks like the
cpu can use loop optimization to get rid of most of the overhead)

Cycles before:

this_cpu_xchg	 = 37 cycles (segment prefix and LOCK (implied by xchg))

After:

this_cpu_xchg	= 11 cycle (using cmpxchg without lock semantics)

Signed-off-by: Christoph Lameter <cl@linux.com>
Signed-off-by: Tejun Heo <tj@kernel.org>
2010-12-18 15:54:04 +01:00
Christoph Lameter
7296e08aba x86: this_cpu_cmpxchg and this_cpu_xchg operations
Provide support as far as the hardware capabilities of the x86 cpus
allow.

Define CONFIG_CMPXCHG_LOCAL in Kconfig.cpu to allow core code to test for
fast cpuops implementations.

V1->V2:
	- Take out the definition for this_cpu_cmpxchg_8 and move it into
	  a separate patch.

tj: - Reordered ops to better follow this_cpu_* organization.
    - Renamed macro temp variables similar to their existing
      neighbours.

Signed-off-by: Christoph Lameter <cl@linux.com>
Signed-off-by: Tejun Heo <tj@kernel.org>
2010-12-18 15:54:04 +01:00
Bjorn Helgaas
a2c606d53a x86: avoid high BIOS area when allocating address space
This prevents allocation of the last 2MB before 4GB.

The experiment described here shows Windows 7 ignoring the last 1MB:
https://bugzilla.kernel.org/show_bug.cgi?id=23542#c27

This patch ignores the top 2MB instead of just 1MB because H. Peter Anvin
says "There will be ROM at the top of the 32-bit address space; it's a fact
of the architecture, and on at least older systems it was common to have a
shadow 1 MiB below."

Acked-by: H. Peter Anvin <hpa@zytor.com>
Signed-off-by: Bjorn Helgaas <bjorn.helgaas@hp.com>
Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
2010-12-17 10:01:30 -08:00
Tejun Heo
403047754c percpu,x86: relocate this_cpu_add_return() and friends
- include/linux/percpu.h: this_cpu_add_return() and friends were
  located next to __this_cpu_add_return().  However, the overall
  organization is to first group by preemption safeness.  Relocate
  this_cpu_add_return() and friends to preemption-safe area.

- arch/x86/include/asm/percpu.h: Relocate percpu_add_return_op() after
  other more basic operations.  Relocate [__]this_cpu_add_return_8()
  so that they're first grouped by preemption safeness.

Signed-off-by: Tejun Heo <tj@kernel.org>
Cc: Christoph Lameter <cl@linux.com>
2010-12-17 16:13:22 +01:00
Christoph Lameter
8f1d97c79e x86: Support for this_cpu_add, sub, dec, inc_return
Supply an implementation for x86 in order to generate more efficient code.

V2->V3:
	- Cleanup
	- Remove strange type checking from percpu_add_return_op.

tj: - Dropped unused typedef from percpu_add_return_op().
    - Renamed ret__ to paro_ret__ in percpu_add_return_op().
    - Minor indentation adjustments.

Acked-by: H. Peter Anvin <hpa@zytor.com>
Signed-off-by: Christoph Lameter <cl@linux.com>
Signed-off-by: Tejun Heo <tj@kernel.org>
2010-12-17 15:15:28 +01:00
Ingo Molnar
006b20fe4c Merge branch 'perf/urgent' into perf/core
Merge reason: We want to apply a dependent patch.

Signed-off-by: Ingo Molnar <mingo@elte.hu>
2010-12-16 11:22:27 +01:00