Commit Graph

4359 Commits

Author SHA1 Message Date
Ingo Molnar
a5ebc0b1a7 Merge commit 'v2.6.29' into timers/core 2009-03-26 15:45:22 +01:00
Ravikiran G Thirumalai
70511134f6 Revert "x86: don't compile vsmp_64 for 32bit"
Partial revert of commit 129d8bc828
titled 'x86: don't compile vsmp_64 for 32bit'

Commit reverted to compile vsmp_64.c if CONFIG_X86_64 is defined,
since is_vsmp_box() needs to indicate that TSCs are not synchronized, and
hence, not a valid time source, even when CONFIG_X86_VSMP is not defined.

Signed-off-by: Ravikiran Thirumalai <kiran@scalex86.org>
Cc: Yinghai Lu <yinghai@kernel.org>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: shai@scalex86.org
LKML-Reference: <20090324061429.GH7278@localdomain>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-03-25 21:34:28 +01:00
Masami Hiramatsu
fee039a1d0 x86: kretprobe-booster interrupt emulation code fix
Fix interrupt emulation code in kretprobe-booster according to
pt_regs update (es/ds change and gs adding).

This issue has been reported on systemtap-bugzilla:

  http://sources.redhat.com/bugzilla/show_bug.cgi?id=9965

  | On a -tip kernel on x86_32, kretprobe_example (from samples) triggers the
  | following backtrace when its retprobing a class of functions that cause a
  | copy_from/to_user().
  |
  | BUG: sleeping function called from invalid context at mm/memory.c:3196
  | in_atomic(): 0, irqs_disabled(): 1, pid: 2286, name: cat

Signed-off-by: Masami Hiramatsu <mhiramat@redhat.com>
Acked-by: Ananth N Mavinakayanahalli <ananth@in.ibm.com>
Tested-by: Bharata B Rao <bharata@linux.vnet.ibm.com>
Cc: systemtap-ml <systemtap@sources.redhat.com>
LKML-Reference: <49C7995C.2010601@redhat.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-03-25 18:53:29 +01:00
Rusty Russell
e06b1b56f9 x86: Correct behaviour of irq affinity
Impact: get correct smp_affinity as user requested

The effect of setting desc->affinity (ie. from userspace via sysfs) has
varied over time.  In 2.6.27, the 32-bit code anded the value with
cpu_online_map, and both 32 and 64-bit did that anding whenever a cpu
was unplugged.

2.6.29 consolidated this into one routine (and fixed hotplug) but
introduced another variation: anding the affinity with cfg->domain.

We should just set it to what the user said - if possible.

(cpu_mask_to_apicid_and already takes cpu_online_mask into account)

Signed-off-by: Yinghai Lu <yinghai@kernel.org>
Acked-by: "Eric W. Biederman" <ebiederm@xmission.com>
Cc: Andrew Morton <akpm@linux-foundation.org>
LKML-Reference: <49C94DDF.2010703@kernel.org>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-03-25 18:48:29 +01:00
Yinghai Lu
f56e503412 x86: use default_cpu_mask_to_apicid for 64bit
Impact: cleanup

Use online_mask directly on 64bit too.

Signed-off-by: Yinghai Lu <yinghai@kernel.org>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: "Eric W. Biederman" <ebiederm@xmission.com>
Cc: Rusty Russell <rusty@rustcorp.com.au>
LKML-Reference: <49C94DAE.9070300@kernel.org>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-03-24 22:28:38 +01:00
Yinghai Lu
fa74c90733 x86: fix set_extra_move_desc calling
Impact: fix bug with irq-descriptor moving when logical flat

Rusty observed:

> The effect of setting desc->affinity (ie. from userspace via sysfs) has varied
> over time.  In 2.6.27, the 32-bit code anded the value with cpu_online_map,
> and both 32 and 64-bit did that anding whenever a cpu was unplugged.
>
> 2.6.29 consolidated this into one routine (and fixed hotplug) but introduced
> another variation: anding the affinity with cfg->domain.  Is this right, or
> should we just set it to what the user said?  Or as now, indicate that we're
> restricting it.

Eric pointed out that desc->affinity should be what the user requested,
if it is at all possible to honor the user space request.

This bug got introduced by commit 22f65d31b "x86: Update io_apic.c to use
new cpumask API".

Fix it by moving the masking to before the descriptor moving ...

Reported-by: Rusty Russell <rusty@rustcorp.com.au>
Reported-by: Eric W. Biederman <ebiederm@xmission.com>
LKML-Reference: <49C94134.4000408@kernel.org>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-03-24 22:12:10 +01:00
Ingo Molnar
5c8cd82ed7 Merge branch 'x86/core' of git://git.kernel.org/pub/scm/linux/kernel/git/jaswinder/linux-2.6-tiptop into x86/cleanups 2009-03-24 15:20:51 +01:00
Ingo Molnar
29219683c4 Merge branches 'x86/apic', 'x86/cleanups', 'x86/mm', 'x86/pat', 'x86/setup' and 'x86/signal'; commit 'v2.6.29' into x86/core 2009-03-24 15:19:45 +01:00
Steven Rostedt
5d1a03dc54 function-graph: moved the timestamp from arch to generic code
This patch move the timestamp from happening in the arch specific
code into the general code. This allows for better control by the tracer
to time manipulation.

Signed-off-by: Steven Rostedt <srostedt@redhat.com>
2009-03-24 09:31:34 -04:00
Sheng Yang
dbb9fd8630 iommu: Add domain_has_cap iommu_ops
This iommu_op can tell if domain have a specific capability, like snooping
control for Intel IOMMU, which can be used by other components of kernel to
adjust the behaviour.

Signed-off-by: Sheng Yang <sheng@linux.intel.com>
Signed-off-by: David Woodhouse <David.Woodhouse@intel.com>
2009-03-24 09:42:51 +00:00
Benjamin Herrenschmidt
9e41d9597e Merge commit 'origin/master' into next 2009-03-24 13:38:30 +11:00
Jaswinder Singh Rajput
ba639039d6 x86: e820 fix various signedness issues in setup.c and e820.c
Impact: cleanup

This fixed various signedness issues in setup.c and e820.c:
arch/x86/kernel/setup.c:455:53: warning: incorrect type in argument 3 (different signedness)
arch/x86/kernel/setup.c:455:53:    expected int *pnr_map
arch/x86/kernel/setup.c:455:53:    got unsigned int extern [toplevel] *<noident>
arch/x86/kernel/setup.c:639:53: warning: incorrect type in argument 3 (different signedness)
arch/x86/kernel/setup.c:639:53:    expected int *pnr_map
arch/x86/kernel/setup.c:639:53:    got unsigned int extern [toplevel] *<noident>
arch/x86/kernel/setup.c:820:54: warning: incorrect type in argument 3 (different signedness)
arch/x86/kernel/setup.c:820:54:    expected int *pnr_map
arch/x86/kernel/setup.c:820:54:    got unsigned int extern [toplevel] *<noident>

arch/x86/kernel/e820.c:670:53: warning: incorrect type in argument 3 (different signedness)
arch/x86/kernel/e820.c:670:53:    expected int *pnr_map
arch/x86/kernel/e820.c:670:53:    got unsigned int [toplevel] *<noident>

Signed-off-by: Jaswinder Singh Rajput <jaswinderrajput@gmail.com>
2009-03-23 15:02:05 +05:30
Jaswinder Singh Rajput
a1e38ca5ce x86: apic/io_apic.c define msi_ir_chip and ir_ioapic_chip all the time
move out msi_ir_chip and ir_ioapic_chip from CONFIG_INTR_REMAP shadow

Fix:
 arch/x86/kernel/apic/io_apic.c:1431: warning: ‘msi_ir_chip’ defined but not used

Signed-off-by: Jaswinder Singh Rajput <jaswinderrajput@gmail.com>
2009-03-23 02:11:25 +05:30
Jaswinder Singh Rajput
474e56b82c x86: irq.c keep CONFIG_X86_LOCAL_APIC interrupts together
Impact: cleanup

keep CONFIG_X86_LOCAL_APIC interrupts together to avoid extra ifdef

Signed-off-by: Jaswinder Singh Rajput <jaswinderrajput@gmail.com>
2009-03-23 02:08:34 +05:30
Jaswinder Singh Rajput
ce2d8bfd44 x86: irq.c use same path for show_interrupts
Impact: cleanup

SMP and !SMP will use same path for show_interrupts

Signed-off-by: Jaswinder Singh Rajput <jaswinderrajput@gmail.com>
2009-03-23 02:08:00 +05:30
Jaswinder Singh Rajput
f2362e6f1b x86: cpu/cpu.h cleanup
Impact: cleanup

 - Fix various style issues

Signed-off-by: Jaswinder Singh Rajput <jaswinderrajput@gmail.com>
2009-03-23 02:06:51 +05:30
Dmitri Vorobiev
1cc185211a x86: Fix a couple of sparse warnings in arch/x86/kernel/apic/io_apic.c
Impact: cleanup

This patch fixes the following sparse warnings:

 arch/x86/kernel/apic/io_apic.c:3602:17: warning: symbol 'hpet_msi_type'
 was not declared. Should it be static?

 arch/x86/kernel/apic/io_apic.c:3467:30: warning: Using plain integer as
 NULL pointer

Signed-off-by: Dmitri Vorobiev <dmitri.vorobiev@movial.com>
LKML-Reference: <1237741871-5827-2-git-send-email-dmitri.vorobiev@movial.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-03-22 18:15:14 +01:00
Ingo Molnar
648340dff0 Merge branch 'x86/core' of git://git.kernel.org/pub/scm/linux/kernel/git/jaswinder/linux-2.6-tip into x86/cleanups 2009-03-21 17:37:35 +01:00
Jaswinder Singh Rajput
1894e36754 x86: pci-nommu.c cleanup
Impact: cleanup

Signed-off-by: Jaswinder Singh Rajput <jaswinderrajput@gmail.com>
2009-03-21 17:01:25 +05:30
Jaswinder Singh Rajput
d53a444460 x86: io_delay.c cleanup
Impact: cleanup

 - fix header file issues

Signed-off-by: Jaswinder Singh Rajput <jaswinderrajput@gmail.com>
2009-03-21 16:57:04 +05:30
Jaswinder Singh Rajput
8383d821e7 x86: rtc.c cleanup
Impact: cleanup

 - fix various style problems
  - fix header file issues

Signed-off-by: Jaswinder Singh Rajput <jaswinderrajput@gmail.com>
2009-03-21 16:56:37 +05:30
Jaswinder Singh Rajput
c8344bc218 x86: i8253 cleanup
Impact: cleanup

 - fix various style problems
  - fix header file issues

Signed-off-by: Jaswinder Singh Rajput <jaswinderrajput@gmail.com>
2009-03-21 16:56:10 +05:30
Jaswinder Singh Rajput
390cd85c8a x86: kdebugfs.c cleanup
Impact: cleanup

Signed-off-by: Jaswinder Singh Rajput <jaswinderrajput@gmail.com>
2009-03-21 16:55:45 +05:30
Jaswinder Singh Rajput
271eb5c588 x86: topology.c cleanup
Impact: cleanup

Signed-off-by: Jaswinder Singh Rajput <jaswinderrajput@gmail.com>
2009-03-21 16:55:24 +05:30
Jaswinder Singh Rajput
0b3ba0c3cc x86: mpparse.c introduce check_physptr helper function
To reduce the size of the oversized function __get_smp_config()

There should be no impact to functionality.

Signed-off-by: Jaswinder Singh Rajput <jaswinderrajput@gmail.com>
2009-03-21 14:15:43 +05:30
Jaswinder Singh Rajput
5a5737eac2 x86: mpparse.c introduce smp_dump_mptable helper function
smp_read_mpc() and replace_intsrc_all() can use same smp_dump_mptable()

Signed-off-by: Jaswinder Singh Rajput <jaswinderrajput@gmail.com>
2009-03-21 14:15:11 +05:30
Bartlomiej Zolnierkiewicz
04c93ce499 x86: fix IO APIC resource allocation error message
Impact: fix incorrect error message

- IO APIC resource allocation error message contains one too many "be".

- Print the error message iff there are IO APICs in the system.

I've seen this error message for some time on my x86-32 laptop...

Signed-off-by: Bartlomiej Zolnierkiewicz <bzolnier@gmail.com>
Cc: Alan Bartlett <ajb.stxsl@googlemail.com>
LKML-Reference: <200903202100.30789.bzolnier@gmail.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-03-20 21:02:55 +01:00
Hiroshi Shimamoto
14fc9fbc70 x86: signal: check signal stack overflow properly
Impact: cleanup

Check alternate signal stack overflow with proper stack pointer.
The stack pointer of the next signal frame is different if that
task has i387 state.

On x86_64, redzone would be included.

No need to check SA_ONSTACK if we're already using alternate signal stack.

Signed-off-by: Hiroshi Shimamoto <h-shimamoto@ct.jp.nec.com>
Cc: Roland McGrath <roland@redhat.com>
LKML-Reference: <49C2874D.3080002@ct.jp.nec.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-03-20 19:01:31 +01:00
Matthew Wilcox
1c8d7b0a56 PCI MSI: Add support for multiple MSI
Add the new API pci_enable_msi_block() to allow drivers to
request multiple MSI and reimplement pci_enable_msi in terms of
pci_enable_msi_block.  Ensure that the architecture back ends don't
have to know about multiple MSI.

Signed-off-by: Matthew Wilcox <willy@linux.intel.com>
Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
2009-03-20 10:48:14 -07:00
Bjorn Helgaas
13bf757669 x86: use dev_printk in quirk message
This patch changes a VIA PCI quirk to use dev_info() rather than printk().

Signed-off-by: Bjorn Helgaas <bjorn.helgaas@hp.com>
Signed-off-by: Jesse Barnes <jbarnes@virtuousgek.org>
2009-03-20 10:48:10 -07:00
Ingo Molnar
7f00a2495b Merge branches 'x86/cleanups', 'x86/mm', 'x86/setup' and 'linus' into x86/core 2009-03-20 10:34:22 +01:00
Jeremy Fitzhardinge
71ff49d71b x86: with the last user gone, remove set_pte_present
Impact: cleanup

set_pte_present() is no longer used, directly or indirectly,
so remove it.

Signed-off-by: Jeremy Fitzhardinge <jeremy.fitzhardinge@citrix.com>
Cc: Xen-devel <xen-devel@lists.xensource.com>
Cc: Jeremy Fitzhardinge <jeremy.fitzhardinge@citrix.com>
Cc: Alok Kataria <akataria@vmware.com>
Cc: Marcelo Tosatti <mtosatti@redhat.com>
Cc: Avi Kivity <avi@redhat.com>
LKML-Reference: <1237406613-2929-2-git-send-email-jeremy@goop.org>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-03-19 14:04:19 +01:00
Ingo Molnar
c58603e81b x86: mpparse: clean up code by introducing a few helper functions, fix
Impact: fix boot crash

This fixes commit a683027856.

Signed-off-by: Jaswinder Singh Rajput <jaswinderrajput@gmail.com>
Cc: Yinghai Lu <yinghai@kernel.org>
LKML-Reference: <1237403503.22438.21.camel@ht.satnam>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-03-19 08:52:13 +01:00
Lai Jiangshan
e9d9df4473 ftrace: protect running nmi (V3)
When I review the sensitive code ftrace_nmi_enter(), I found
the atomic variable nmi_running does protect NMI VS do_ftrace_mod_code(),
but it can not protects NMI(entered nmi) VS NMI(ftrace_nmi_enter()).

cpu#1                   | cpu#2                 | cpu#3
ftrace_nmi_enter()      | do_ftrace_mod_code()  |
  not modify            |                       |
------------------------|-----------------------|--
executing               | set mod_code_write = 1|
executing             --|-----------------------|--------------------
executing               |                       | ftrace_nmi_enter()
executing               |                       |    do modify
------------------------|-----------------------|-----------------
ftrace_nmi_exit()       |                       |

cpu#3 may be being modified the code which is still being executed on cpu#1,
it will have undefined results and possibly take a GPF, this patch
prevents it occurred.

Signed-off-by: Lai Jiangshan <laijs@cn.fujitsu.com>
LKML-Reference: <49C0B411.30003@cn.fujitsu.com>
Signed-off-by: Steven Rostedt <srostedt@redhat.com>
2009-03-18 20:36:59 -04:00
Jaswinder Singh Rajput
a683027856 x86: mpparse: clean up code by introducing a few helper functions
Impact: cleanup

Refactor the MP-table parsing code via the introduction of the
following helper functions:

  skip_entry()
  smp_reserve_bootmem()
  check_irq_src()
  check_slot()

To simplify the code flow and to reduce the size of the
following oversized functions: smp_read_mpc(), smp_scan_config().

There should be no impact to functionality.

Signed-off-by: Jaswinder Singh Rajput <jaswinderrajput@gmail.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-03-18 17:15:05 +01:00
Rusty Russell
89bd55d185 x86: cpumask: update 32-bit APM not to mug current->cpus_allowed
Impact: cleanup, avoid cpumask games

The APM code wants to run on CPU 0: we create an "on_cpu0" wrapper
which uses work_on_cpu() if we're not already on cpu 0.

This introduces a new failure mode: -ENOMEM, so we add an explicit
err arg and handle Linux-style errnos in apm_err().

Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Stephen Rothwell <sfr@canb.auug.org.au>
LKML-Reference: <200903111631.29787.rusty@rustcorp.com.au>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-03-18 13:51:45 +01:00
Ingo Molnar
4bae196735 x86: microcode: cleanup
Impact: cleanup

Cc: Rusty Russell <rusty@rustcorp.com.au>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Dmitry Adamushko <dmitry.adamushko@gmail.com>
Cc: Peter Oruba <peter.oruba@amd.com>
LKML-Reference: <200903111632.37279.rusty@rustcorp.com.au>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-03-18 13:51:17 +01:00
Rusty Russell
af5c820a31 x86: cpumask: use work_on_cpu in arch/x86/kernel/microcode_core.c
Impact: don't play with current's cpumask

Straightforward indirection through work_on_cpu().  One change is
that the error code from microcode_update_cpu() is now actually
plumbed back to microcode_init_cpu(), so now we printk if it fails
on cpu hotplug.

Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Dmitry Adamushko <dmitry.adamushko@gmail.com>
Cc: Peter Oruba <peter.oruba@amd.com>
LKML-Reference: <200903111632.37279.rusty@rustcorp.com.au>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-03-18 13:50:47 +01:00
Jaswinder Singh Rajput
cde5edbda8 x86: kprobes.c fix compilation warning
arch/x86/kernel/kprobes.c:196: warning: passing argument 1 of ‘search_exception_tables’ makes integer from pointer without a cast

Signed-off-by: Jaswinder Singh Rajput <jaswinderrajput@gmail.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
LKML-Reference:<49BED952.2050809@redhat.com>
LKML-Reference: <1237378065.13488.2.camel@localhost.localdomain>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-03-18 13:21:01 +01:00
Ingo Molnar
705bb9dc72 Merge branches 'x86/cleanups', 'x86/cpu', 'x86/debug', 'x86/mce2', 'x86/mm', 'x86/mtrr', 'x86/setup', 'x86/setup-memory', 'x86/urgent', 'x86/uv', 'x86/x2apic' and 'linus' into x86/core
Conflicts:
	arch/parisc/kernel/irq.c
2009-03-18 13:19:49 +01:00
Jaswinder Singh Rajput
4e16c88875 x86: cpu/mttr/cleanup.c fix compilation warning
arch/x86/kernel/cpu/mtrr/cleanup.c:197: warning: format ‘%d’ expects type ‘int’, but argument 2 has type ‘long unsigned int’

Signed-off-by: Jaswinder Singh Rajput <jaswinderrajput@gmail.com>
Cc: Yinghai Lu <yinghai@kernel.org>
LKML-Reference: <1237378015.13488.1.camel@localhost.localdomain>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-03-18 13:14:31 +01:00
Ingo Molnar
95f3c4ebff Merge branch 'dma-api/debug' of git://git.kernel.org/pub/scm/linux/kernel/git/joro/linux-2.6-iommu into core/iommu 2009-03-18 10:37:48 +01:00
Ingo Molnar
04dfcfcb54 Merge branch 'linus' into core/iommu 2009-03-18 10:37:43 +01:00
Ingo Molnar
37ba317c9e Merge branches 'sched/cleanups' and 'linus' into sched/core 2009-03-18 09:57:02 +01:00
Rusty Russell
2c74d66624 x86, uv: fix cpumask iterator in uv_bau_init()
Impact: fix boot crash on UV systems

Commit 76ba0ecda0 "cpumask: use
cpumask_var_t in uv_flush_tlb_others" used cur_cpu as an iterator;
it was supposed to be zero for the code below it.

Reported-by: Cliff Wickman <cpw@sgi.com>
Original-From: Cliff Wickman <cpw@sgi.com>
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
Acked-by: Mike Travis <travis@sgi.com>
Cc: steiner@sgi.com
Cc: <stable@kernel.org>
LKML-Reference: <200903180822.31196.rusty@rustcorp.com.au>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-03-18 09:47:54 +01:00
Rusty Russell
30e1e6d1af cpumask: fix CONFIG_CPUMASK_OFFSTACK=y cpu hotunplug crash
Impact: Fix cpu offline when CONFIG_MAXSMP=y

Changeset bc9b83dd1f "cpumask: convert
c1e_mask in arch/x86/kernel/process.c to cpumask_var_t" contained a
bug: c1e_mask is manipulated even if C1E isn't detected (and hence
not allocated).

This is simply fixed by checking for NULL (which gcc optimizes out
anyway of CONFIG_CPUMASK_OFFSTACK=n, since it knows ce1_mask can never
be NULL).

In addition, fix a leak where select_idle_routine re-allocates
(and re-clears) c1e_mask on every cpu init.

Reported-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
Cc: Mike Travis <travis@sgi.com>
LKML-Reference: <200903171450.34549.rusty@rustcorp.com.au>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-03-18 09:40:35 +01:00
Suresh Siddha
ce4e240c27 x86: add x2apic_wrmsr_fence() to x2apic flush tlb paths
Impact: optimize APIC IPI related barriers

Uncached MMIO accesses for xapic are inherently serializing and hence
we don't need explicit barriers for xapic IPI paths.

x2apic MSR writes/reads don't have serializing semantics and hence need
a serializing instruction or mfence, to make all the previous memory
stores globally visisble before the x2apic msr write for IPI.

Add x2apic_wrmsr_fence() in flush tlb path to x2apic specific paths.

Signed-off-by: Suresh Siddha <suresh.b.siddha@intel.com>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Oleg Nesterov <oleg@redhat.com>
Cc: Jens Axboe <jens.axboe@oracle.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: "Paul E. McKenney" <paulmck@linux.vnet.ibm.com>
Cc: Rusty Russell <rusty@rustcorp.com.au>
Cc: Steven Rostedt <rostedt@goodmis.org>
Cc: "steiner@sgi.com" <steiner@sgi.com>
Cc: Nick Piggin <npiggin@suse.de>
LKML-Reference: <1237313814.27006.203.camel@localhost.localdomain>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-03-18 09:36:14 +01:00
Andrew Morton
a6b6a14e0c x86: use smp_call_function_single() in arch/x86/kernel/cpu/mcheck/mce_amd_64.c
Attempting to rid us of the problematic work_on_cpu().  Just use
smp_call_function_single() here.

Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
LKML-Reference: <20090318042217.EF3F1DDF39@ozlabs.org>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-03-18 07:03:12 +01:00
Ingo Molnar
03418c7efa Merge branches 'tracing/ftrace' and 'linus' into tracing/core 2009-03-18 06:58:45 +01:00
Suresh Siddha
68a8ca593f x86: fix broken irq migration logic while cleaning up multiple vectors
Impact: fix spurious IRQs

During irq migration, we send a low priority interrupt to the previous
irq destination. This happens in non interrupt-remapping case after interrupt
starts arriving at new destination and in interrupt-remapping case after
modifying and flushing the interrupt-remapping table entry caches.

This low priority irq cleanup handler can cleanup multiple vectors, as
multiple irq's can be migrated at almost the same time. While
there will be multiple invocations of irq cleanup handler (one cleanup
IPI for each irq migration), first invocation of the cleanup handler
can potentially cleanup more than one vector (as the first invocation can
see the requests for more than vector cleanup). When we cleanup multiple
vectors during the first invocation of the smp_irq_move_cleanup_interrupt(),
other vectors that are to be cleanedup can still be pending in the local
cpu's IRR (as smp_irq_move_cleanup_interrupt() runs with interrupts disabled).

When we are ready to unhook a vector corresponding to an irq, check if that
vector is registered in the local cpu's IRR. If so skip that cleanup and
do a self IPI with the cleanup vector, so that we give a chance to
service the pending vector interrupt and then cleanup that vector
allocation once we execute the lowest priority handler.

This fixes spurious interrupts seen when migrating multiple vectors
at the same time.

[ This is apparently possible even on conventional xapic, although to
  the best of our knowledge it has never been seen.  The stable
  maintainers may wish to consider this one for -stable. ]

Signed-off-by: Suresh Siddha <suresh.b.siddha@intel.com>
Cc: Eric W. Biederman <ebiederm@xmission.com>
Signed-off-by: H. Peter Anvin <hpa@linux.intel.com>
Cc: stable@kernel.org
2009-03-17 16:49:30 -07:00
Suresh Siddha
05c3dc2c4b x86, ioapic: Fix non atomic allocation with interrupts disabled
Impact: fix possible race

save_mask_IO_APIC_setup() was using non atomic memory allocation while getting
called with interrupts disabled. Fix this by splitting this into two different
function. Allocation part save_IO_APIC_setup() now happens before
disabling interrupts.

Signed-off-by: Suresh Siddha <suresh.b.siddha@intel.com>
Signed-off-by: H. Peter Anvin <hpa@linux.intel.com>
2009-03-17 15:45:29 -07:00
Suresh Siddha
29b61be65a x86, x2apic: cleanup ifdef CONFIG_INTR_REMAP in io_apic code
Impact: cleanup

Clean up #ifdefs and replace them with helper functions.

Signed-off-by: Suresh Siddha <suresh.b.siddha@intel.com>
Signed-off-by: H. Peter Anvin <hpa@linux.intel.com>
2009-03-17 15:45:07 -07:00
Suresh Siddha
0280f7c416 x86, x2apic: cleanup the IO-APIC level migration with interrupt-remapping
Impact: simplification

In the current code, for level triggered migration, we need to modify the
io-apic RTE with the update vector information, along with modifying interrupt
remapping table entry(IRTE) with vector and destination. This is to ensure that
remote IRR bit inthe IOAPIC RTE gets cleared when the cpu does EOI.

With this patch, for level triggered, we eliminate the io-apic RTE modification
(with the updated vector information), by using a virtual vector (io-apic pin
number).  Real vector that is used for interrupting cpu will be coming from
the interrupt-remapping table entry. Trigger mode in the IRTE will always be
edge, and the actual level or edge trigger will be setup in the IO-APIC RTE.
So a level triggered interrupt will appear as an edge to the local apic
cpu but still as level to the IO-APIC.

With this change, level irq migration can be done by simply modifying
the interrupt-remapping table entry with out changing the io-apic RTE.
And as the interrupt appears as edge at the cpu, in addition to do the
local apic EOI, we need to do IO-APIC directed EOI to clear the remote
IRR bit in  the IO-APIC RTE.

This simplies the irq migration in the presence of interrupt-remapping.

Idea-by: Rajesh Sankaran <rajesh.sankaran@intel.com>
Signed-off-by: Suresh Siddha <suresh.b.siddha@intel.com>
Cc: Eric W. Biederman <ebiederm@xmission.com>
Signed-off-by: H. Peter Anvin <hpa@linux.intel.com>
2009-03-17 15:44:27 -07:00
Suresh Siddha
cf6567fe40 x86, x2apic: fix clear_local_APIC() in the presence of x2apic
Impact: cleanup, paranoia

We were not clearing the local APIC in clear_local_APIC() in the
presence of x2apic. Fix it.

Signed-off-by: Suresh Siddha <suresh.b.siddha@intel.com>
Signed-off-by: H. Peter Anvin <hpa@linux.intel.com>
2009-03-17 15:43:51 -07:00
Suresh Siddha
7c6d9f9785 x86, x2apic: use virtual wire A mode in disable_IO_APIC() with interrupt-remapping
Impact: make kexec work with x2apic

disable_IO_APIC() gets called during crashdump aswell, which configures the
IO-APIC/LAPIC so that legacy interrupts can be delivered for the kexec'd kernel.

In the presence of interrupt-remapping, we need to change the
interrupt-remapping configuration aswell as modifying IO-APIC for virtual wire
B mode.

To keep things simple during the crash, use virtual wire A mode
(for which we don't need to touch io-apic and interrupt-remapping tables).

Signed-off-by: Suresh Siddha <suresh.b.siddha@intel.com>
Cc: Eric W. Biederman <ebiederm@xmission.com>
Signed-off-by: H. Peter Anvin <hpa@linux.intel.com>
2009-03-17 15:42:28 -07:00
Suresh Siddha
9d783ba042 x86, x2apic: enable fault handling for intr-remapping
Impact: interface augmentation (not yet used)

Enable fault handling flow for intr-remapping aswell. Fault handling
code now shared by both dma-remapping and intr-remapping.

Signed-off-by: Suresh Siddha <suresh.b.siddha@intel.com>
Signed-off-by: H. Peter Anvin <hpa@linux.intel.com>
2009-03-17 15:38:59 -07:00
H. Peter Anvin
0a699af8e6 x86-32: move _end to a dummy section
Impact: build fix with CONFIG_RELOCATABLE

Move _end into a dummy section, so that relocs.c will know it is a
relocatable symbol.

Signed-off-by: H. Peter Anvin <hpa@linux.intel.com>
Cc: Jeremy Fitzhardinge <jeremy.fitzhardinge@citrix.com>
Cc: Jeremy Fitzhardinge <jeremy@goop.org>
2009-03-17 14:16:02 -07:00
Jeremy Fitzhardinge
704439ddf9 x86/brk: put the brk reservations in their own section
Impact: disambiguate real .bss variables from .brk storage

Add a .brk section after the .bss section.  This has no effect
on the final vmlinux, but it more clearly distinguishes the space
taken by actual .bss symbols, and the variable space reserved
by .brk users.

Signed-off-by: Jeremy Fitzhardinge <jeremy.fitzhardinge@citrix.com>
2009-03-17 12:58:15 -07:00
H. Peter Anvin
60ac982139 x86-32: tighten the bound on additional memory to map
Impact: Tighten bound to avoid masking errors

The definition of MAPPING_BEYOND_END was excessive; this has a nasty
tendency to mask bugs.  We have learned over time that this kind of
bug hiding can cause some very strange errors.  Therefore, tighten the
bound to only need to map the actual kernel area.

Signed-off-by: H. Peter Anvin <hpa@linux.intel.com>
Cc: Jeremy Fitzhardinge <jeremy.fitzhardinge@citrix.com>
Cc: Yinghai Lu <yinghai@kernel.org>
2009-03-17 11:52:10 -07:00
Jeremy Fitzhardinge
b8a22a6273 x86-32: remove ALLOCATOR_SLOP from head_32.S
Impact: cleanup

ALLOCATOR_SLOP is a vestigial remain from when we used the
bootmem allocator to allocate the kernel's linear memory mapping.
Now we directly reserve pages from the e820 mapping, and no
longer require secondary structures to keep track of allocated
pages.

Signed-off-by: Jeremy Fitzhardinge <jeremy.fitzhardinge@citrix.com>
Signed-off-by: H. Peter Anvin <hpa@linux.intel.com>
2009-03-17 11:46:01 -07:00
Jeremy Fitzhardinge
c090f532db x86-32: make sure we map enough to fit linear map pagetables
Impact: crash fix

head_32.S needs to map the kernel itself, and enough space so
that mm/init.c can allocate space from the e820 allocator
for the linear map of low memory.

Signed-off-by: Jeremy Fitzhardinge <jeremy.fitzhardinge@citrix.com>
Signed-off-by: H. Peter Anvin <hpa@linux.intel.com>
2009-03-17 11:42:05 -07:00
Masami Hiramatsu
30390880de prevent boosting kprobes on exception address
Don't boost at the addresses which are listed on exception tables,
because major page fault will occur on those addresses.  In that case,
kprobes can not ensure that when instruction buffer can be freed since
some processes will sleep on the buffer.

kprobes-ia64 already has same check.

Signed-off-by: Masami Hiramatsu <mhiramat@redhat.com>
Cc: Ananth N Mavinakayanahalli <ananth@in.ibm.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2009-03-17 09:11:48 -07:00
Linus Torvalds
9e8912e04e Fast TSC calibration: calculate proper frequency error bounds
In order for ntpd to correctly synchronize the clocks, the frequency of
the system clock must not be off by more than 500 ppm (or, put another
way, 1:2000), or ntpd will end up giving up on trying to synchronize
properly, and ends up reseting the clock in jumps instead.

The fast TSC PIT calibration sometimes failed this test - it was
assuming that the PIT reads always took about one microsecond each (2us
for the two reads to get a 16-bit timer), and that calibrating TSC to
the PIT over 15ms should thus be sufficient to get much closer than
500ppm (max 2us error on both sides giving 4us over 15ms: a 270 ppm
error value).

However, that assumption does not always hold: apparently some hardware
is either very much slower at reading the PIT registers, or there was
other noise causing at least one machine to get 700+ ppm errors.

So instead of using a fixed 15ms timing loop, this changes the fast PIT
calibration to read the TSC delta over the individual PIT timer reads,
and use the result to calculate the error bars on the PIT read timing
properly.  We then successfully calibrate the TSC only if the maximum
error bars fall below 500ppm.

In the process, we also relax the timing to allow up to 25ms for the
calibration, although it can happen much faster depending on hardware.

Reported-and-tested-by: Jesper Krogh <jesper@krogh.cc>
Cc: john stultz <johnstul@us.ibm.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Acked-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2009-03-17 08:13:17 -07:00
Linus Torvalds
a6a80e1d8c Fix potential fast PIT TSC calibration startup glitch
During bootup, when we reprogram the PIT (programmable interval timer)
to start counting down from 0xffff in order to use it for the fast TSC
calibration, we should also make sure to delay a bit afterwards to allow
the PIT hardware to actually start counting with the new value.

That will happens at the next CLK pulse (1.193182 MHz), so the easiest
way to do that is to just wait at least one microsecond after
programming the new PIT counter value.  We do that by just reading the
counter value back once - which will take about 2us on PC hardware.

Reported-and-tested-by: john stultz <johnstul@us.ibm.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2009-03-17 07:58:26 -07:00
Joerg Roedel
86f3195293 dma-debug/x86: register pci bus for dma-debug leak detection
Impact: detect dma memory leaks for pci devices

Signed-off-by: Joerg Roedel <joerg.roedel@amd.com>
2009-03-17 12:56:49 +01:00
Joerg Roedel
2118d0c548 dma-debug: x86 architecture bindings
Impact: make use of DMA-API debugging code in x86

Signed-off-by: Joerg Roedel <joerg.roedel@amd.com>
2009-03-17 12:56:46 +01:00
Yinghai Lu
f0348c438c x86: MTRR workaround for system with stange var MTRRs
Impact: don't trim e820 according to wrong mtrr

Ozan reports that his server emits strange warning.
it turns out the BIOS sets the MTRRs incorrectly.

Ignore those strange ranges, and don't trim e820,
just emit one warning about BIOS

Reported-by: Ozan Çağlayan <ozan@pardus.org.tr>
Signed-off-by: Yinghai Lu <yinghai@kernel.org>
LKML-Reference: <49BEE1E7.7020706@kernel.org>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-03-17 10:47:47 +01:00
Thomas Gleixner
250981e6e1 x86: reduce preemption off section in exit thread
Impact: latency improvement

No need to keep preemption disabled over the kfree call.

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
2009-03-16 15:32:28 +01:00
Hidetoshi Seto
514ec49a5f x86, mce: remove incorrect __cpuinit for intel_init_cmci()
Impact: Bug fix on UP

Referring commit cc3ca22063,
Peter removed __cpuinit annotations for mce_cpu_features()
and its successor functions, which caused troubles on UP
configurations.

However the intel_init_cmci() was introduced after that and
it also has __cpuinit annotation even though it is called from
mce_cpu_features(). Remove the annotation from that function
too.

Signed-off-by: Hidetoshi Seto <seto.hidetoshi@jp.fujitsu.com>
Cc: H. Peter Anvin <hpa@zytor.com>
Cc: Andi Kleen <ak@linux.intel.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-03-16 09:15:32 +01:00
Yinghai Lu
c61cf4cfe7 x86: print out more info in e820_update_range()
Impact: help debug e820 bugs

Try to print out more info, to catch wrong call parameters.

Signed-off-by: Yinghai Lu <yinghai@kernel.org>
LKML-Reference: <49BCB557.3030000@kernel.org>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-03-15 10:01:59 +01:00
Yinghai Lu
6d7942dc2a x86: fix 64k corruption-check
Impact: fix boot crash

Need to exit early if the addr is far above 64k.

The crash got exposed by:

  78a8b35: x86: make e820_update_range() handle small range update

Signed-off-by: Yinghai Lu <yinghai@kernel.org>
Cc: <stable@kernel.org>
LKML-Reference: <49BC2279.2030101@kernel.org>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-03-15 07:03:15 +01:00
Yinghai Lu
2bd2753ff4 x86: put initial_pg_tables into .bss
Impact: makes vmlinux section information more useful

Don't use ram after _end blindly for pagetables. aka init pages is before _end
put those pg table into .bss

[Adapted to use brk segment - Jeremy]

v2: keep initial page table up to 512M only.
v4: put initial page tables just before _end

Signed-off-by: Yinghai Lu <yinghai@kernel.org>
Signed-off-by: Jeremy Fitzhardinge <jeremy.fitzhardinge@citrix.com>
Signed-off-by: H. Peter Anvin <hpa@zytor.com>
2009-03-14 17:23:47 -07:00
Jeremy Fitzhardinge
796216a57f x86: allow extend_brk users to reserve brk space
Impact: new interface; remove hard-coded limit

Add RESERVE_BRK(name, size) macro to reserve space in the brk
area.  This should be a conservative (ie, larger) estimate of
how much space might possibly be required from the brk area.
Any unused space will be freed, so there's no real downside
on making the reservation too large (within limits).

The name should be unique within a given file, and somewhat
descriptive.

The C definition of RESERVE_BRK() ends up being more complex than
one would expect to work around a cluster of gcc infelicities:

  The first attempt was to simply try putting __section(.brk_reservation)
  on a variable.  This doesn't work because it ends up making it a
  @progbits section, which gets actual space allocated in the vmlinux
  executable.

  The second attempt was to emit the space into a section using asm,
  but gcc doesn't allow arguments to be passed to file-level asm()
  statements, making it hard to pass in the size.

  The final attempt is to wrap the asm() in a function to allow
  it to have arguments, and put the function itself into the
  .discard section, which vmlinux*.lds drops entirely from the
  emitted vmlinux.

Signed-off-by: Jeremy Fitzhardinge <jeremy.fitzhardinge@citrix.com>
Signed-off-by: H. Peter Anvin <hpa@zytor.com>
2009-03-14 17:23:47 -07:00
Yinghai Lu
7543c1de84 x86-32: compute initial mapping size more accurately
Impact: simplification

We only need to map the kernel in head_32.S, not the whole of
lowmem.  We use 512MB as a reasonable (but arbitrary) limit on
the maximum size of the kernel image.

Signed-off-by: Yinghai Lu <yinghai@kernel.org>
Signed-off-by: Jeremy Fitzhardinge <jeremy.fitzhardinge@citrix.com>
Signed-off-by: H. Peter Anvin <hpa@zytor.com>
2009-03-14 17:23:47 -07:00
Jeremy Fitzhardinge
6de6cb442e x86: use brk allocation for DMI
Impact: use new interface instead of previous ad hoc implementation

Use extend_brk() to allocate memory for DMI rather than having an
ad-hoc allocator.

Signed-off-by: Jeremy Fitzhardinge <jeremy.fitzhardinge@citrix.com>
Signed-off-by: H. Peter Anvin <hpa@zytor.com>
2009-03-14 17:23:47 -07:00
Jeremy Fitzhardinge
ccf3fe02e3 x86-32: use brk segment for allocating initial kernel pagetable
Impact: use new interface instead of previous ad hoc implementation

Rather than having special purpose init_pg_table_start/end variables
to delimit the kernel pagetable built by head_32.S, just use the brk
mechanism to extend the bss for the new pagetable.

This patch removes init_pg_table_start/end and pg0, defines __brk_base
(which is page-aligned and immediately follows _end), initializes
the brk region to start there, and uses it for the 32-bit pagetable.

Signed-off-by: Jeremy Fitzhardinge <jeremy.fitzhardinge@citrix.com>
Signed-off-by: H. Peter Anvin <hpa@zytor.com>
2009-03-14 17:23:47 -07:00
H. Peter Anvin
5368a2be34 x86: move brk initialization out of #ifdef CONFIG_BLK_DEV_INITRD
Impact: build fix

The brk initialization functions were incorrectly located inside
an #ifdef CONFIG_VLK_DEV_INITRD block, causing the obvious build failure in
minimal configurations.

Signed-off-by: H. Peter Anvin <hpa@zytor.com>
Cc: Jeremy Fitzhardinge <jeremy.fitzhardinge@citrix.com>
2009-03-14 17:23:41 -07:00
Jeremy Fitzhardinge
93dbda7cbc x86: add brk allocation for very, very early allocations
Impact: new interface

Add a brk()-like allocator which effectively extends the bss in order
to allow very early code to do dynamic allocations.  This is better than
using statically allocated arrays for data in subsystems which may never
get used.

The space for brk allocations is in the bss ELF segment, so that the
space is mapped properly by the code which maps the kernel, and so
that bootloaders keep the space free rather than putting a ramdisk or
something into it.

The bss itself, delimited by __bss_stop, ends before the brk area
(__brk_base to __brk_limit).  The kernel text, data and bss is reserved
up to __bss_stop.

Any brk-allocated data is reserved separately just before the kernel
pagetable is built, as that code allocates from unreserved spaces
in the e820 map, potentially allocating from any unused brk memory.
Ultimately any unused memory in the brk area is used in the general
kernel memory pool.

Initially the brk space is set to 1MB, which is probably much larger
than any user needs (the largest current user is i386 head_32.S's code
to build the pagetables to map the kernel, which can get fairly large
with a big kernel image and no PSE support).  So long as the system
has sufficient memory for the bootloader to reserve the kernel+1MB brk,
there are no bad effects resulting from an over-large brk.

Signed-off-by: Jeremy Fitzhardinge <jeremy.fitzhardinge@citrix.com>
Signed-off-by: H. Peter Anvin <hpa@zytor.com>
2009-03-14 15:37:14 -07:00
Jeremy Fitzhardinge
b9719a4d9c x86: make section delimiter symbols part of their section
Impact: cleanup

Move the symbols delimiting a section part of the section
(section relative) rather than absolute.  This avoids any
unexpected gaps between the section-start symbol and the first
data in the section, which could be caused by implicit
alignment of the section data.  It also makes the general
form of vmlinux_64.lds.S consistent with vmlinux_32.lds.S.

Signed-off-by: Jeremy Fitzhardinge <jeremy.fitzhardinge@citrix.com>
Cc: Yinghai Lu <yinghai@kernel.org>
Cc: "Eric W. Biederman" <ebiederm@xmission.com>
Signed-off-by: H. Peter Anvin <hpa@zytor.com>
2009-03-14 15:37:14 -07:00
Jaswinder Singh Rajput
f4c3c4cdb1 x86: cpu_debug add support for various AMD CPUs
Impact: Added AMD CPUs support

Added flags for various AMD CPUs.

Signed-off-by: Jaswinder Singh Rajput <jaswinderrajput@gmail.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-03-14 18:07:58 +01:00
Sebastian Andrzej Siewior
48f4c485c2 x86/centaur: merge 32 & 64 bit version
there should be no difference, except:

 * the 64bit variant now also initializes the padlock unit.
 * ->c_early_init() is executed again from ->c_init()
 * the 64bit fixups made into 32bit path.

Signed-off-by: Sebastian Andrzej Siewior <sebastian@breakpoint.cc>
Cc: herbert@gondor.apana.org.au
LKML-Reference: <1237029843-28076-2-git-send-email-sebastian@breakpoint.cc>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-03-14 16:27:29 +01:00
Ingo Molnar
0ca0f16fd1 Merge branches 'x86/apic', 'x86/asm', 'x86/cleanups', 'x86/debug', 'x86/kconfig', 'x86/mm', 'x86/ptrace', 'x86/setup' and 'x86/urgent'; commit 'v2.6.29-rc8' into x86/core 2009-03-14 16:25:40 +01:00
Yinghai Lu
d4c90e37a2 x86: print the continous part of fixed mtrrs together
Impact: print out fewer lines

 1. print continuous range with same type together
 2. change _INFO to _DEBUG

Signed-off-by: Yinghai Lu <yinghai@kernel.org>
LKML-Reference: <49BACB61.8000302@kernel.org>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-03-14 12:27:06 +01:00
Yinghai Lu
63516ef6d6 x86: fix get_mtrr() warning about smp_processor_id() with CONFIG_PREEMPT=y
Impact: fix debug warning

Jaswinder noticed that there is a warning about smp_processor_id()
in get_mtrr().

Fix it by wrapping the printout into a get/put_cpu() pair.

Reported-by: Jaswinder Singh Rajput <jaswinder@kernel.org>
Signed-off-by: Yinghai Lu <yinghai@kernel.org>
Cc: Andrew Morton <akpm@linux-foundation.org>
LKML-Reference: <49BAB7FF.4030107@kernel.org>
[ changed to get/put_cpu(), cleaned up surrounding code a it. ]
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-03-14 12:27:06 +01:00
Yinghai Lu
78a8b35bc7 x86: make e820_update_range() handle small range update
Impact: enhance e820 code to handle more cases

Try to handle new range which could be covered by one entry.

Signed-off-by: Yinghai Lu <yinghai@kernel.org>
Cc: jbeulich@novell.com
LKML-Reference: <49B9F0C1.10402@kernel.org>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-03-14 12:20:07 +01:00
Ingo Molnar
0f3fa48a7e x86: cpu/common.c more cleanups
Complete/fix the cleanups of cpu/common.c:

 - fix ugly warning due to asm/topology.h -> linux/topology.h change
 - standardize the style across the file
 - simplify/refactor the code flow where possible

Cc: Jaswinder Singh Rajput <jaswinder@kernel.org>
LKML-Reference: <1237009789.4387.2.camel@localhost.localdomain>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-03-14 10:37:34 +01:00
Ingo Molnar
c550033ced Merge branch 'core/percpu' into x86/core 2009-03-14 09:50:10 +01:00
Ingo Molnar
62395efdb0 Merge branch 'x86/asm' into tracing/syscalls
We need the wider TIF work-mask checks in entry_32.S.

Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-03-14 09:44:08 +01:00
Jaswinder Singh Rajput
88200bc28d x86: entry_32.S fix compile warnings - fix work mask bit width
Fix:

 arch/x86/kernel/entry_32.S:446: Warning: 00000000080001d1 shortened to 00000000000001d1
 arch/x86/kernel/entry_32.S:457: Warning: 000000000800feff shortened to 000000000000feff
 arch/x86/kernel/entry_32.S:527: Warning: 00000000080001d1 shortened to 00000000000001d1
 arch/x86/kernel/entry_32.S:541: Warning: 000000000800feff shortened to 000000000000feff
 arch/x86/kernel/entry_32.S:676: Warning: 0000000008000091 shortened to 0000000000000091

TIF_SYSCALL_FTRACE is 0x08000000 and until now we checked the
first 16 bits of the work mask - bit 27 falls outside of that.

Update the entry_32.S code to check the full 32-bit mask.

[ %cx => %ecx fix from Cyrill Gorcunov <gorcunov@gmail.com> ]

Signed-off-by: Jaswinder Singh Rajput <jaswinderrajput@gmail.com>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Steven Rostedt <rostedt@goodmis.org>
Cc: "H. Peter Anvin" <hpa@kernel.org>
LKML-Reference: <1237012693.18733.3.camel@ht.satnam>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-03-14 09:42:51 +01:00
Jaswinder Singh Rajput
9766cdbcb2 x86: cpu/common.c cleanups
- fix various style problems
 - declare varibles before they get used
 - introduced clear_all_debug_regs
 - fix header files issues

LKML-Reference: <1237009789.4387.2.camel@localhost.localdomain>
Signed-off-by: Jaswinder Singh Rajput <jaswinder@kernel.org>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-03-14 08:59:50 +01:00
Ingo Molnar
ccd50dfd92 tracing/syscalls: support for syscalls tracing on x86, fix
Impact: build fix

 kernel/built-in.o: In function `ftrace_syscall_exit':
 (.text+0x76667): undefined reference to `syscall_nr_to_meta'

ftrace.o is built:

obj-$(CONFIG_DYNAMIC_FTRACE)    += ftrace.o
obj-$(CONFIG_FUNCTION_GRAPH_TRACER) += ftrace.o

But now a CONFIG_FTRACE_SYSCALLS dependency is needed too.

Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Steven Rostedt <rostedt@goodmis.org>
Cc: Lai Jiangshan <laijs@cn.fujitsu.com>
LKML-Reference: <1236401580-5758-3-git-send-email-fweisbec@gmail.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-03-13 17:02:17 +01:00
Frederic Weisbecker
f58ba10067 tracing/syscalls: support for syscalls tracing on x86
Extend x86 architecture syscall tracing support with syscall
metadata table details.

(The upcoming core syscall tracing modifications rely on this.)

Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Steven Rostedt <rostedt@goodmis.org>
LKML-Reference: <1236955332-10133-3-git-send-email-fweisbec@gmail.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-03-13 16:57:42 +01:00
Américo Wang
5a8ac9d28d x86: ptrace, bts: fix an unreachable statement
Commit c2724775ce put a statement
after return, which makes that statement unreachable.

Move that statement before return.

Signed-off-by: WANG Cong <xiyou.wangcong@gmail.com>
Cc: Roland McGrath <roland@redhat.com>
Cc: Markus Metzger <markus.t.metzger@intel.com>
LKML-Reference: <20090313075622.GB8933@hack>
Cc: <stable@kernel.org> # .29 only
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-03-13 10:27:57 +01:00
Andreas Herrmann
3ff42da504 x86: mtrr: don't modify RdDram/WrDram bits of fixed MTRRs
Impact: bug fix + BIOS workaround

BIOS is expected to clear the SYSCFG[MtrrFixDramModEn] on AMD CPUs
after fixed MTRRs are configured.

Some BIOSes do not clear SYSCFG[MtrrFixDramModEn] on BP (and on APs).

This can lead to obfuscation in Linux when this bit is not cleared on
BP but cleared on APs. A consequence of this is that the saved
fixed-MTRR state (from BP) differs from the fixed-MTRRs of APs --
because RdDram/WrDram bits are read as zero when
SYSCFG[MtrrFixDramModEn] is cleared -- and Linux tries to sync
fixed-MTRR state from BP to AP. This implies that Linux sets
SYSCFG[MtrrFixDramEn] and activates those bits.

More important is that (some) systems change these bits in SMM when
ACPI is enabled. Hence it is racy if Linux modifies RdMem/WrMem bits,
too.

(1) The patch modifies an old fix from Bernhard Kaindl to get
    suspend/resume working on some Acer Laptops. Bernhard's patch
    tried to sync RdMem/WrMem bits of fixed MTRR registers and that
    helped on those old Laptops. (Don't ask me why -- can't test it
    myself). But this old problem was not the motivation for the
    patch. (See http://lkml.org/lkml/2007/4/3/110)

(2) The more important effect is to fix issues on some more current systems.

    On those systems Linux panics or just freezes, see

    http://bugzilla.kernel.org/show_bug.cgi?id=11541
    (and also duplicates of this bug:
    http://bugzilla.kernel.org/show_bug.cgi?id=11737
    http://bugzilla.kernel.org/show_bug.cgi?id=11714)

    The affected systems boot only using acpi=ht, acpi=off or
    when the kernel is built with CONFIG_MTRR=n.

    The acpi options prevent full enablement of ACPI.  Obviously when
    ACPI is enabled the BIOS/SMM modfies RdMem/WrMem bits.  When
    CONFIG_MTRR=y Linux also accesses and modifies those bits when it
    needs to sync fixed-MTRRs across cores (Bernhard's fix, see (1)).
    How do you synchronize that? You can't. As a consequence Linux
    shouldn't touch those bits at all (Rationale are AMD's BKDGs which
    recommend to clear the bit that makes RdMem/WrMem accessible).
    This is the purpose of this patch. And (so far) this suffices to
    fix (1) and (2).

I suggest not to touch RdDram/WrDram bits of fixed-MTRRs and
SYSCFG[MtrrFixDramEn] and to clear SYSCFG[MtrrFixDramModEn] as
suggested by AMD K8, and AMD family 10h/11h BKDGs.
BIOS is expected to do this anyway. This should avoid that
Linux and SMM tread on each other's toes ...

Signed-off-by: Andreas Herrmann <andreas.herrmann3@amd.com>
Cc: trenn@suse.de
Cc: Yinghai Lu <yinghai@kernel.org>
LKML-Reference: <20090312163937.GH20716@alberich.amd.com>
Cc: <stable@kernel.org>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-03-13 10:19:27 +01:00
Frederic Weisbecker
1b3fa2ce64 tracing/x86: basic implementation of syscall tracing for x86
Provide the x86 trace callbacks to trace syscalls.

Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com>
Acked-by: Steven Rostedt <rostedt@goodmis.org>
Cc: Lai Jiangshan <laijs@cn.fujitsu.com>
LKML-Reference: <1236401580-5758-3-git-send-email-fweisbec@gmail.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-03-13 06:25:44 +01:00
Yinghai Lu
773e673de2 x86: fix e820_update_range()
Impact: fix left range size on head

| commit 5c0e6f035d
|    x86: fix code paths used by update_mptable
|    Impact: fix crashes under Xen due to unrobust e820 code

fixes one e820 bug, but introduces another bug.

Need to update size for left range at first in case it is header.

also add __e820_add_region take more parameter.

Signed-off-by: Yinghai Lu <yinghai@kernel.org>
Cc: jbeulich@novell.com
LKML-Reference: <49B9E286.502@kernel.org>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-03-13 05:38:29 +01:00
Rusty Russell
73e907de7d cpumask: remove x86 cpumask_t uses.
Impact: cleanup

We are removing cpumask_t in favour of struct cpumask: mainly as a
marker of what code is now CONFIG_CPUMASK_OFFSTACK-safe.

The only non-trivial change here is vector_allocation_domain():
explicitly clear the mask and set the first word, rather than using
assignment.

Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
2009-03-13 14:49:57 +10:30
Rusty Russell
76ba0ecda0 cpumask: use cpumask_var_t in uv_flush_tlb_others.
Impact: remove cpumask_t, reduce per-cpu size for CONFIG_CPUMASK_OFFSTACK=y

Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
2009-03-13 14:49:57 +10:30
Rusty Russell
5c6cb5e2b1 cpumask: remove cpumask_t assignment from vector_allocation_domain()
Impact: cleanup

It's not legal to do assignments into cpumask_var_t; they will soon be of
variable length.

So explicitly clear the mask and set the first word, rather than using
assignment.

Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
2009-03-13 14:49:56 +10:30
Rusty Russell
70ba2b6a70 cpumask: clean up summit's send_IPI functions
Impact: cleanup, remove cpumask from stack

summit_send_IPI_allbutself might as well call
default_send_IPI_mask_allbutself_logical().  Also change cpumask_t to
struct cpumask and &cpu_online_map to cpu_online_mask while here.

Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
2009-03-13 14:49:55 +10:30