Commit Graph

3938 Commits

Author SHA1 Message Date
Linus Torvalds
f9b3bcfbc4 Merge branch 'x86-mm-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull x86 mm changes from Ingo Molnar:
 "Misc smaller changes all over the map"

* 'x86-mm-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
  x86/iommu/dmar: Remove warning for HPET scope type
  x86/mm/gart: Drop unnecessary check
  x86/mm/hotplug: Put kernel_physical_mapping_remove() declaration in CONFIG_MEMORY_HOTREMOVE
  x86/mm/fixmap: Remove unused FIX_CYCLONE_TIMER
  x86/mm/numa: Simplify some bit mangling
  x86/mm: Re-enable DEBUG_TLBFLUSH for X86_32
  x86/mm/cpa: Cleanup split_large_page() and its callee
  x86: Drop always empty .text..page_aligned section
2013-04-30 08:40:35 -07:00
Linus Torvalds
01c7cd0ef5 Merge branch 'x86-kaslr-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull perparatory x86 kasrl changes from Ingo Molnar:
 "This contains changes from the ongoing KASLR work, by Kees Cook.

  The main changes are the use of a read-only IDT on x86 (which
  decouples the userspace visible virtual IDT address from the physical
  address), and a rework of ELF relocation support, in preparation of
  random, boot-time kernel image relocation."

* 'x86-kaslr-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
  x86, relocs: Refactor the relocs tool to merge 32- and 64-bit ELF
  x86, relocs: Build separate 32/64-bit tools
  x86, relocs: Add 64-bit ELF support to relocs tool
  x86, relocs: Consolidate processing logic
  x86, relocs: Generalize ELF structure names
  x86: Use a read-only IDT alias on all CPUs
2013-04-30 08:37:24 -07:00
Linus Torvalds
df8edfa9af Merge branch 'x86-cpu-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull x86 cpuid changes from Ingo Molnar:
 "The biggest change is x86 CPU bug handling refactoring and cleanups,
  by Borislav Petkov"

* 'x86-cpu-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
  x86, CPU, AMD: Drop useless label
  x86, AMD: Correct {rd,wr}msr_amd_safe warnings
  x86: Fold-in trivial check_config function
  x86, cpu: Convert AMD Erratum 400
  x86, cpu: Convert AMD Erratum 383
  x86, cpu: Convert Cyrix coma bug detection
  x86, cpu: Convert FDIV bug detection
  x86, cpu: Convert F00F bug detection
  x86, cpu: Expand cpufeature facility to include cpu bugs
2013-04-30 08:34:38 -07:00
Linus Torvalds
874f6d1be7 Merge branch 'x86-cleanups-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull x86 cleanups from Ingo Molnar:
 "Misc smaller cleanups"

* 'x86-cleanups-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
  x86/lib: Fix spelling, put space between a numeral and its units
  x86/lib: Fix spelling in the comments
  x86, quirks: Shut-up a long-standing gcc warning
  x86, msr: Unify variable names
  x86-64, docs, mm: Add vsyscall range to virtual address space layout
  x86: Drop KERNEL_IMAGE_START
  x86_64: Use __BOOT_DS instead_of __KERNEL_DS for safety
2013-04-30 08:34:07 -07:00
Linus Torvalds
ab86e974f0 Merge branch 'timers-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull core timer updates from Ingo Molnar:
 "The main changes in this cycle's merge are:

   - Implement shadow timekeeper to shorten in kernel reader side
     blocking, by Thomas Gleixner.

   - Posix timers enhancements by Pavel Emelyanov:

   - allocate timer ID per process, so that exact timer ID allocations
     can be re-created be checkpoint/restore code.

   - debuggability and tooling (/proc/PID/timers, etc.) improvements.

   - suspend/resume enhancements by Feng Tang: on certain new Intel Atom
     processors (Penwell and Cloverview), there is a feature that the
     TSC won't stop in S3 state, so the TSC value won't be reset to 0
     after resume.  This can be taken advantage of by the generic via
     the CLOCK_SOURCE_SUSPEND_NONSTOP flag: instead of using the RTC to
     recover/approximate sleep time, the main (and precise) clocksource
     can be used.

   - Fix /proc/timer_list for 4096 CPUs by Nathan Zimmer: on so many
     CPUs the file goes beyond 4MB of size and thus the current
     simplistic seqfile approach fails.  Convert /proc/timer_list to a
     proper seq_file with its own iterator.

   - Cleanups and refactorings of the core timekeeping code by John
     Stultz.

   - International Atomic Clock time is managed by the NTP code
     internally currently but not exposed externally.  Separate the TAI
     code out and add CLOCK_TAI support and TAI support to the hrtimer
     and posix-timer code, by John Stultz.

   - Add deep idle support enhacement to the broadcast clockevents core
     timer code, by Daniel Lezcano: add an opt-in CLOCK_EVT_FEAT_DYNIRQ
     clockevents feature (which will be utilized by future clockevents
     driver updates), which allows the use of IRQ affinities to avoid
     spurious wakeups of idle CPUs - the right CPU with an expiring
     timer will be woken.

   - Add new ARM bcm281xx clocksource driver, by Christian Daudt

   - ... various other fixes and cleanups"

* 'timers-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (52 commits)
  clockevents: Set dummy handler on CPU_DEAD shutdown
  timekeeping: Update tk->cycle_last in resume
  posix-timers: Remove unused variable
  clockevents: Switch into oneshot mode even if broadcast registered late
  timer_list: Convert timer list to be a proper seq_file
  timer_list: Split timer_list_show_tickdevices
  posix-timers: Show sigevent info in proc file
  posix-timers: Introduce /proc/PID/timers file
  posix timers: Allocate timer id per process (v2)
  timekeeping: Make sure to notify hrtimers when TAI offset changes
  hrtimer: Fix ktime_add_ns() overflow on 32bit architectures
  hrtimer: Add expiry time overflow check in hrtimer_interrupt
  timekeeping: Shorten seq_count region
  timekeeping: Implement a shadow timekeeper
  timekeeping: Delay update of clock->cycle_last
  timekeeping: Store cycle_last value in timekeeper struct as well
  ntp: Remove ntp_lock, using the timekeeping locks to protect ntp state
  timekeeping: Simplify tai updating from do_adjtimex
  timekeeping: Hold timekeepering locks in do_adjtimex and hardpps
  timekeeping: Move ADJ_SETOFFSET to top level do_adjtimex()
  ...
2013-04-30 08:15:40 -07:00
Linus Torvalds
8700c95adb Merge branch 'smp-hotplug-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull SMP/hotplug changes from Ingo Molnar:
 "This is a pretty large, multi-arch series unifying and generalizing
  the various disjunct pieces of idle routines that architectures have
  historically copied from each other and have grown in random, wildly
  inconsistent and sometimes buggy directions:

   101 files changed, 455 insertions(+), 1328 deletions(-)

  this went through a number of review and test iterations before it was
  committed, it was tested on various architectures, was exposed to
  linux-next for quite some time - nevertheless it might cause problems
  on architectures that don't read the mailing lists and don't regularly
  test linux-next.

  This cat herding excercise was motivated by the -rt kernel, and was
  brought to you by Thomas "the Whip" Gleixner."

* 'smp-hotplug-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (40 commits)
  idle: Remove GENERIC_IDLE_LOOP config switch
  um: Use generic idle loop
  ia64: Make sure interrupts enabled when we "safe_halt()"
  sparc: Use generic idle loop
  idle: Remove unused ARCH_HAS_DEFAULT_IDLE
  bfin: Fix typo in arch_cpu_idle()
  xtensa: Use generic idle loop
  x86: Use generic idle loop
  unicore: Use generic idle loop
  tile: Use generic idle loop
  tile: Enter idle with preemption disabled
  sh: Use generic idle loop
  score: Use generic idle loop
  s390: Use generic idle loop
  powerpc: Use generic idle loop
  parisc: Use generic idle loop
  openrisc: Use generic idle loop
  mn10300: Use generic idle loop
  mips: Use generic idle loop
  microblaze: Use generic idle loop
  ...
2013-04-30 07:50:17 -07:00
Linus Torvalds
16fa94b532 Merge branch 'sched-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull scheduler changes from Ingo Molnar:
 "The main changes in this development cycle were:

   - full dynticks preparatory work by Frederic Weisbecker

   - factor out the cpu time accounting code better, by Li Zefan

   - multi-CPU load balancer cleanups and improvements by Joonsoo Kim

   - various smaller fixes and cleanups"

* 'sched-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (45 commits)
  sched: Fix init NOHZ_IDLE flag
  sched: Prevent to re-select dst-cpu in load_balance()
  sched: Rename load_balance_tmpmask to load_balance_mask
  sched: Move up affinity check to mitigate useless redoing overhead
  sched: Don't consider other cpus in our group in case of NEWLY_IDLE
  sched: Explicitly cpu_idle_type checking in rebalance_domains()
  sched: Change position of resched_cpu() in load_balance()
  sched: Fix wrong rq's runnable_avg update with rt tasks
  sched: Document task_struct::personality field
  sched/cpuacct/UML: Fix header file dependency bug on the UML build
  cgroup: Kill subsys.active flag
  sched/cpuacct: No need to check subsys active state
  sched/cpuacct: Initialize cpuacct subsystem earlier
  sched/cpuacct: Initialize root cpuacct earlier
  sched/cpuacct: Allocate per_cpu cpuusage for root cpuacct statically
  sched/cpuacct: Clean up cpuacct.h
  sched/cpuacct: Remove redundant NULL checks in cpuacct_acount_field()
  sched/cpuacct: Remove redundant NULL checks in cpuacct_charge()
  sched/cpuacct: Add cpuacct_acount_field()
  sched/cpuacct: Add cpuacct_init()
  ...
2013-04-30 07:43:28 -07:00
Linus Torvalds
e0972916e8 Merge branch 'perf-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull perf updates from Ingo Molnar:
 "Features:

   - Add "uretprobes" - an optimization to uprobes, like kretprobes are
     an optimization to kprobes.  "perf probe -x file sym%return" now
     works like kretprobes.  By Oleg Nesterov.

   - Introduce per core aggregation in 'perf stat', from Stephane
     Eranian.

   - Add memory profiling via PEBS, from Stephane Eranian.

   - Event group view for 'annotate' in --stdio, --tui and --gtk, from
     Namhyung Kim.

   - Add support for AMD NB and L2I "uncore" counters, by Jacob Shin.

   - Add Ivy Bridge-EP uncore support, by Zheng Yan

   - IBM zEnterprise EC12 oprofile support patchlet from Robert Richter.

   - Add perf test entries for checking breakpoint overflow signal
     handler issues, from Jiri Olsa.

   - Add perf test entry for for checking number of EXIT events, from
     Namhyung Kim.

   - Add perf test entries for checking --cpu in record and stat, from
     Jiri Olsa.

   - Introduce perf stat --repeat forever, from Frederik Deweerdt.

   - Add --no-demangle to report/top, from Namhyung Kim.

   - PowerPC fixes plus a couple of cleanups/optimizations in uprobes
     and trace_uprobes, by Oleg Nesterov.

  Various fixes and refactorings:

   - Fix dependency of the python binding wrt libtraceevent, from
     Naohiro Aota.

   - Simplify some perf_evlist methods and to allow 'stat' to share code
     with 'record' and 'trace', by Arnaldo Carvalho de Melo.

   - Remove dead code in related to libtraceevent integration, from
     Namhyung Kim.

   - Revert "perf sched: Handle PERF_RECORD_EXIT events" to get 'perf
     sched lat' back working, by Arnaldo Carvalho de Melo

   - We don't use Newt anymore, just plain libslang, by Arnaldo Carvalho
     de Melo.

   - Kill a bunch of die() calls, from Namhyung Kim.

   - Fix build on non-glibc systems due to libio.h absence, from Cody P
     Schafer.

   - Remove some perf_session and tracing dead code, from David Ahern.

   - Honor parallel jobs, fix from Borislav Petkov

   - Introduce tools/lib/lk library, initially just removing duplication
     among tools/perf and tools/vm.  from Borislav Petkov

  ... and many more I missed to list, see the shortlog and git log for
  more details."

* 'perf-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (136 commits)
  perf/x86/intel/P4: Robistify P4 PMU types
  perf/x86/amd: Fix AMD NB and L2I "uncore" support
  perf/x86/amd: Remove old-style NB counter support from perf_event_amd.c
  perf/x86: Check all MSRs before passing hw check
  perf/x86/amd: Add support for AMD NB and L2I "uncore" counters
  perf/x86/intel: Add Ivy Bridge-EP uncore support
  perf/x86/intel: Fix SNB-EP CBO and PCU uncore PMU filter management
  perf/x86: Avoid kfree() in CPU_{STARTING,DYING}
  uprobes/perf: Avoid perf_trace_buf_prepare/submit if ->perf_events is empty
  uprobes/tracing: Don't pass addr=ip to perf_trace_buf_submit()
  uprobes/tracing: Change create_trace_uprobe() to support uretprobes
  uprobes/tracing: Make seq_printf() code uretprobe-friendly
  uprobes/tracing: Make register_uprobe_event() paths uretprobe-friendly
  uprobes/tracing: Make uprobe_{trace,perf}_print() uretprobe-friendly
  uprobes/tracing: Introduce is_ret_probe() and uretprobe_dispatcher()
  uprobes/tracing: Introduce uprobe_{trace,perf}_print() helpers
  uprobes/tracing: Generalize struct uprobe_trace_entry_head
  uprobes/tracing: Kill the pointless local_save_flags/preempt_count calls
  uprobes/tracing: Kill the pointless seq_print_ip_sym() call
  uprobes/tracing: Kill the pointless task_pt_regs() calls
  ...
2013-04-30 07:41:01 -07:00
Gerald Schaefer
106c992a5e mm/hugetlb: add more arch-defined huge_pte functions
Commit abf09bed3c ("s390/mm: implement software dirty bits")
introduced another difference in the pte layout vs.  the pmd layout on
s390, thoroughly breaking the s390 support for hugetlbfs.  This requires
replacing some more pte_xxx functions in mm/hugetlbfs.c with a
huge_pte_xxx version.

This patch introduces those huge_pte_xxx functions and their generic
implementation in asm-generic/hugetlb.h, which will now be included on
all architectures supporting hugetlbfs apart from s390.  This change
will be a no-op for those architectures.

[akpm@linux-foundation.org: fix warning]
Signed-off-by: Gerald Schaefer <gerald.schaefer@de.ibm.com>
Cc: Mel Gorman <mgorman@suse.de>
Cc: Hugh Dickins <hughd@google.com>
Cc: Hillf Danton <dhillf@gmail.com>
Acked-by: Michal Hocko <mhocko@suse.cz>	[for !s390 parts]
Cc: Tony Luck <tony.luck@intel.com>
Cc: Fenghua Yu <fenghua.yu@intel.com>
Cc: Ralf Baechle <ralf@linux-mips.org>
Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Paul Mundt <lethal@linux-sh.org>
Cc: "David S. Miller" <davem@davemloft.net>
Cc: Chris Metcalf <cmetcalf@tilera.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: "H. Peter Anvin" <hpa@zytor.com>
Cc: Martin Schwidefsky <schwidefsky@de.ibm.com>
Cc: Heiko Carstens <heiko.carstens@de.ibm.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2013-04-29 15:54:33 -07:00
Ingo Molnar
5ac2b5c272 perf/x86/intel/P4: Robistify P4 PMU types
Linus found, while extending integer type extension checks in the
sparse static code checker, various fragile patterns of mixed
signed/unsigned  64-bit/32-bit integer use in perf_events_p4.c.

The relevant hardware register ABI is 64 bit wide on 32-bit
kernels as  well, so clean it all up a bit, remove unnecessary
casts, and make sure we  use 64-bit unsigned integers in these
places.

[ Unfortunately this patch was not tested on real P4 hardware,
  those are pretty rare already. If this patch causes any
  problems on P4 hardware then please holler ... ]

Reported-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Cc: David Miller <davem@davemloft.net>
Cc: Theodore Ts'o <tytso@mit.edu>
Cc: Oleg Nesterov <oleg@redhat.com>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Cyrill Gorcunov <gorcunov@gmail.com>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Link: http://lkml.kernel.org/r/20130424072630.GB1780@gmail.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2013-04-26 09:31:41 +02:00
Thomas Gleixner
6402c7dc2a Merge branch 'linus' into timers/core
Reason: Get upstream fixes before adding conflicting code.

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
2013-04-24 20:33:54 +02:00
Jacob Shin
c43ca5091a perf/x86/amd: Add support for AMD NB and L2I "uncore" counters
Add support for AMD Family 15h [and above] northbridge
performance counters. MSRs 0xc0010240 ~ 0xc0010247 are shared
across all cores that share a common northbridge.

Add support for AMD Family 16h L2 performance counters. MSRs
0xc0010230 ~ 0xc0010237 are shared across all cores that share a
common L2 cache.

We do not enable counter overflow interrupts. Sampling mode and
per-thread events are not supported.

Signed-off-by: Jacob Shin <jacob.shin@amd.com>
Cc: Arnaldo Carvalho de Melo <acme@ghostprotocols.net>
Cc: Stephane Eranian <eranian@google.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Link: http://lkml.kernel.org/r/20130419213428.GA8229@jshin-Toonie
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2013-04-21 11:01:24 +02:00
Ingo Molnar
73e21ce28d Merge branch 'perf/urgent' into perf/core
Conflicts:
	arch/x86/kernel/cpu/perf_event_intel.c

Merge in the latest fixes before applying new patches, resolve the conflict.

Signed-off-by: Ingo Molnar <mingo@kernel.org>
2013-04-21 10:57:33 +02:00
Linus Torvalds
db93f8b420 Merge branch 'x86-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull x86 fixes from Peter Anvin:
 "Three groups of fixes:

   1. Make sure we don't execute the early microcode patching if family
      < 6, since it would touch MSRs which don't exist on those
      families, causing crashes.

   2. The Xen partial emulation of HyperV can be dealt with more
      gracefully than just disabling the driver.

   3. More EFI variable space magic.  In particular, variables hidden
      from runtime code need to be taken into account too."

* 'x86-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
  x86, microcode: Verify the family before dispatching microcode patching
  x86, hyperv: Handle Xen emulation of Hyper-V more gracefully
  x86,efi: Implement efi_no_storage_paranoia parameter
  efi: Export efi_query_variable_store() for efivars.ko
  x86/Kconfig: Make EFI select UCS2_STRING
  efi: Distinguish between "remaining space" and actually used space
  efi: Pass boot services variable info to runtime code
  Move utf16 functions to kernel core and rename
  x86,efi: Check max_size only if it is non-zero.
  x86, efivars: firmware bug workarounds should be in platform code
2013-04-20 18:38:48 -07:00
H. Peter Anvin
c0a9f451e4 Merge remote-tracking branch 'efi/urgent' into x86/urgent
Matt Fleming (1):
      x86, efivars: firmware bug workarounds should be in platform
      code

Matthew Garrett (3):
      Move utf16 functions to kernel core and rename
      efi: Pass boot services variable info to runtime code
      efi: Distinguish between "remaining space" and actually used
      space

Richard Weinberger (2):
      x86,efi: Check max_size only if it is non-zero.
      x86,efi: Implement efi_no_storage_paranoia parameter

Sergey Vlasov (2):
      x86/Kconfig: Make EFI select UCS2_STRING
      efi: Export efi_query_variable_store() for efivars.ko

Signed-off-by: H. Peter Anvin <hpa@linux.intel.com>
2013-04-19 17:09:03 -07:00
Ingo Molnar
b5210b2a34 Merge branch 'uprobes/core' of git://git.kernel.org/pub/scm/linux/kernel/git/oleg/misc into perf/core
Pull uprobes updates from Oleg Nesterov:

 - "uretprobes" - an optimization to uprobes, like kretprobes are an optimization
   to kprobes. "perf probe -x file sym%return" now works like kretprobes.

 - PowerPC fixes plus a couple of cleanups/optimizations in uprobes and trace_uprobes.

Signed-off-by: Ingo Molnar <mingo@kernel.org>
2013-04-16 11:04:10 +02:00
Matthew Garrett
cc5a080c5d efi: Pass boot services variable info to runtime code
EFI variables can be flagged as being accessible only within boot services.
This makes it awkward for us to figure out how much space they use at
runtime. In theory we could figure this out by simply comparing the results
from QueryVariableInfo() to the space used by all of our variables, but
that fails if the platform doesn't garbage collect on every boot. Thankfully,
calling QueryVariableInfo() while still inside boot services gives a more
reliable answer. This patch passes that information from the EFI boot stub
up to the efi platform code.

Signed-off-by: Matthew Garrett <matthew.garrett@nebula.com>
Signed-off-by: Matt Fleming <matt.fleming@intel.com>
2013-04-15 21:31:09 +01:00
Linus Torvalds
6c4c4d4bda Merge branch 'x86-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull x86 fixes from Ingo Molnar:
 "Misc fixes"

* 'x86-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
  x86/mm: Flush lazy MMU when DEBUG_PAGEALLOC is set
  x86/mm/cpa/selftest: Fix false positive in CPA self test
  x86/mm/cpa: Convert noop to functional fix
  x86, mm: Patch out arch_flush_lazy_mmu_mode() when running on bare metal
  x86, mm, paravirt: Fix vmalloc_fault oops during lazy MMU updates
2013-04-14 11:13:24 -07:00
Anton Arapov
791eca1010 uretprobes/x86: Hijack return address
Hijack the return address and replace it with a trampoline address.

Signed-off-by: Anton Arapov <anton@redhat.com>
Acked-by: Srikar Dronamraju <srikar@linux.vnet.ibm.com>
Signed-off-by: Oleg Nesterov <oleg@redhat.com>
2013-04-13 15:31:55 +02:00
Dave Hansen
1de14c3c5c x86-32: Fix possible incomplete TLB invalidate with PAE pagetables
This patch attempts to fix:

	https://bugzilla.kernel.org/show_bug.cgi?id=56461

The symptom is a crash and messages like this:

	chrome: Corrupted page table at address 34a03000
	*pdpt = 0000000000000000 *pde = 0000000000000000
	Bad pagetable: 000f [#1] PREEMPT SMP

Ingo guesses this got introduced by commit 611ae8e3f5 ("x86/tlb:
enable tlb flush range support for x86") since that code started to free
unused pagetables.

On x86-32 PAE kernels, that new code has the potential to free an entire
PMD page and will clear one of the four page-directory-pointer-table
(aka pgd_t entries).

The hardware aggressively "caches" these top-level entries and invlpg
does not actually affect the CPU's copy.  If we clear one we *HAVE* to
do a full TLB flush, otherwise we might continue using a freed pmd page.
(note, we do this properly on the population side in pud_populate()).

This patch tracks whenever we clear one of these entries in the 'struct
mmu_gather', and ensures that we follow up with a full tlb flush.

BTW, I disassembled and checked that:

	if (tlb->fullmm == 0)
and
	if (!tlb->fullmm && !tlb->need_flush_all)

generate essentially the same code, so there should be zero impact there
to the !PAE case.

Signed-off-by: Dave Hansen <dave.hansen@linux.intel.com>
Cc: Peter Anvin <hpa@zytor.com>
Cc: Ingo Molnar <mingo@kernel.org>
Cc: Artem S Tashkinov <t.artem@mailcity.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2013-04-12 16:56:47 -07:00
Paul Bolle
a7e6567585 x86/mm/fixmap: Remove unused FIX_CYCLONE_TIMER
The last users of FIX_CYCLONE_TIMER were removed in v2.6.18. We
can remove this unneeded constant.

Signed-off-by: Paul Bolle <pebolle@tiscali.nl>
Link: http://lkml.kernel.org/r/1365698982.1427.3.camel@x61.thuisdomein
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2013-04-12 07:21:18 +02:00
Kees Cook
4eefbe792b x86: Use a read-only IDT alias on all CPUs
Make a copy of the IDT (as seen via the "sidt" instruction) read-only.
This primarily removes the IDT from being a target for arbitrary memory
write attacks, and has the added benefit of also not leaking the kernel
base offset, if it has been relocated.

We already did this on vendor == Intel and family == 5 because of the
F0 0F bug -- regardless of if a particular CPU had the F0 0F bug or
not.  Since the workaround was so cheap, there simply was no reason to
be very specific.  This patch extends the readonly alias to all CPUs,
but does not activate the #PF to #UD conversion code needed to deliver
the proper exception in the F0 0F case except on Intel family 5
processors.

Signed-off-by: Kees Cook <keescook@chromium.org>
Link: http://lkml.kernel.org/r/20130410192422.GA17344@www.outflux.net
Cc: Eric Northup <digitaleric@google.com>
Signed-off-by: H. Peter Anvin <hpa@linux.intel.com>
2013-04-11 13:53:19 -07:00
Boris Ostrovsky
511ba86e1d x86, mm: Patch out arch_flush_lazy_mmu_mode() when running on bare metal
Invoking arch_flush_lazy_mmu_mode() results in calls to
preempt_enable()/disable() which may have performance impact.

Since lazy MMU is not used on bare metal we can patch away
arch_flush_lazy_mmu_mode() so that it is never called in such
environment.

[ hpa: the previous patch "Fix vmalloc_fault oops during lazy MMU
  updates" may cause a minor performance regression on
  bare metal.  This patch resolves that performance regression.  It is
  somewhat unclear to me if this is a good -stable candidate. ]

Signed-off-by: Boris Ostrovsky <boris.ostrovsky@oracle.com>
Link: http://lkml.kernel.org/r/1364045796-10720-2-git-send-email-konrad.wilk@oracle.com
Tested-by: Josh Boyer <jwboyer@redhat.com>
Tested-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
Acked-by: Borislav Petkov <bp@suse.de>
Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
Signed-off-by: H. Peter Anvin <hpa@linux.intel.com>
Cc: <stable@vger.kernel.org> SEE NOTE ABOVE
2013-04-10 11:25:10 -07:00
Borislav Petkov
5952886bfe x86/mm/cpa: Cleanup split_large_page() and its callee
So basically we're generating the pte_t * from a struct page and
we're handing it down to the __split_large_page() internal version
which then goes and gets back struct page * from it because it
needs it.

Change the caller to hand down struct page * directly and the
callee can compute the pte_t itself.

Net save is one virt_to_page() call and simpler code. While at
it, make __split_large_page() static.

Signed-off-by: Borislav Petkov <bp@suse.de>
Acked-by: Thomas Gleixner <tglx@linutronix.de>
Link: http://lkml.kernel.org/r/1363886217-24703-1-git-send-email-bp@alien8.de
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2013-04-10 14:39:08 +02:00
Thomas Gleixner
ee761f629d arch: Consolidate tsk_is_polling()
Move it to a common place. Preparatory patch for implementing
set/clear for the idle need_resched poll implementation.

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Rusty Russell <rusty@rustcorp.com.au>
Cc: Paul McKenney <paulmck@linux.vnet.ibm.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Reviewed-by: Cc: Srivatsa S. Bhat <srivatsa.bhat@linux.vnet.ibm.com>
Cc: Magnus Damm <magnus.damm@gmail.com>
Link: http://lkml.kernel.org/r/20130321215233.446034505@linutronix.de
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
2013-04-08 17:39:22 +02:00
Ingo Molnar
529801898b Merge branch 'for-tip' of git://git.kernel.org/pub/scm/linux/kernel/git/rric/oprofile into perf/core
Pull IBM zEnterprise EC12 support patchlet from Robert Richter.

Signed-off-by: Ingo Molnar <mingo@kernel.org>
2013-04-08 11:43:30 +02:00
Borislav Petkov
1423bed239 x86, msr: Unify variable names
Make sure all MSR-accessing primitives which split MSR values in
two 32-bit parts have their variables called 'low' and 'high' for
consistence with the rest of the code and for ease of staring.

Signed-off-by: Borislav Petkov <bp@suse.de>
Link: http://lkml.kernel.org/r/1362428180-8865-5-git-send-email-bp@alien8.de
Signed-off-by: H. Peter Anvin <hpa@linux.intel.com>
2013-04-02 16:03:32 -07:00
Borislav Petkov
8e3c2a8cf6 x86: Drop KERNEL_IMAGE_START
We have KERNEL_IMAGE_START and __START_KERNEL_map which both contain the
start of the kernel text mapping's virtual address. Remove the prior one
which has been replicated a lot less times around the tree.

No functionality change.

Signed-off-by: Borislav Petkov <bp@alien8.de>
Link: http://lkml.kernel.org/r/1362428180-8865-3-git-send-email-bp@alien8.de
Signed-off-by: H. Peter Anvin <hpa@linux.intel.com>
2013-04-02 16:03:29 -07:00
Paul Moore
8b4b9f27e5 x86: remove the x32 syscall bitmask from syscall_get_nr()
Commit fca460f95e simplified the x32
implementation by creating a syscall bitmask, equal to 0x40000000, that
could be applied to x32 syscalls such that the masked syscall number
would be the same as a x86_64 syscall.  While that patch was a nice
way to simplify the code, it went a bit too far by adding the mask to
syscall_get_nr(); returning the masked syscall numbers can cause
confusion with callers that expect syscall numbers matching the x32
ABI, e.g. unmasked syscall numbers.

This patch fixes this by simply removing the mask from syscall_get_nr()
while preserving the other changes from the original commit.  While
there are several syscall_get_nr() callers in the kernel, most simply
check that the syscall number is greater than zero, in this case this
patch will have no effect.  Of those remaining callers, they appear
to be few, seccomp and ftrace, and from my testing of seccomp without
this patch the original commit definitely breaks things; the seccomp
filter does not correctly filter the syscalls due to the difference in
syscall numbers in the BPF filter and the value from syscall_get_nr().
Applying this patch restores the seccomp BPF filter functionality on
x32.

I've tested this patch with the seccomp BPF filters as well as ftrace
and everything looks reasonable to me; needless to say general usage
seemed fine as well.

Signed-off-by: Paul Moore <pmoore@redhat.com>
Link: http://lkml.kernel.org/r/20130215172143.12549.10292.stgit@localhost
Cc: <stable@vger.kernel.org>
Cc: Will Drewry <wad@chromium.org>
Cc: H. Peter Anvin <hpa@zytor.com>
Signed-off-by: H. Peter Anvin <hpa@linux.intel.com>
2013-04-02 14:38:09 -07:00
Borislav Petkov
7d7dc116e5 x86, cpu: Convert AMD Erratum 400
Convert AMD erratum 400 to the bug infrastructure. Then, retract all
exports for modules since they're not needed now and make the AMD
erratum checking machinery local to amd.c. Use forward declarations to
avoid shuffling too much code around needlessly.

Signed-off-by: Borislav Petkov <bp@suse.de>
Link: http://lkml.kernel.org/r/1363788448-31325-7-git-send-email-bp@alien8.de
Signed-off-by: H. Peter Anvin <hpa@zytor.com>
2013-04-02 10:12:55 -07:00
Borislav Petkov
e6ee94d58d x86, cpu: Convert AMD Erratum 383
Convert the AMD erratum 383 testing code to the bug infrastructure. This
allows keeping the AMD-specific erratum testing machinery private to
amd.c and not export symbols to modules needlessly.

Signed-off-by: Borislav Petkov <bp@suse.de>
Link: http://lkml.kernel.org/r/1363788448-31325-6-git-send-email-bp@alien8.de
Signed-off-by: H. Peter Anvin <hpa@zytor.com>
2013-04-02 10:12:54 -07:00
Borislav Petkov
c5b41a6750 x86, cpu: Convert Cyrix coma bug detection
... to the new facility.

Signed-off-by: Borislav Petkov <bp@suse.de>
Link: http://lkml.kernel.org/r/1363788448-31325-5-git-send-email-bp@alien8.de
Signed-off-by: H. Peter Anvin <hpa@zytor.com>
2013-04-02 10:12:54 -07:00
Borislav Petkov
93a829e8e2 x86, cpu: Convert FDIV bug detection
... to the new facility. Add a reference to the wikipedia article
explaining the FDIV test we're doing here.

Signed-off-by: Borislav Petkov <bp@suse.de>
Link: http://lkml.kernel.org/r/1363788448-31325-4-git-send-email-bp@alien8.de
Signed-off-by: H. Peter Anvin <hpa@zytor.com>
2013-04-02 10:12:53 -07:00
Borislav Petkov
e2604b49e8 x86, cpu: Convert F00F bug detection
... to using the new facility and drop the cpuinfo_x86 member.

Signed-off-by: Borislav Petkov <bp@suse.de>
Link: http://lkml.kernel.org/r/1363788448-31325-3-git-send-email-bp@alien8.de
Signed-off-by: H. Peter Anvin <hpa@zytor.com>
2013-04-02 10:12:52 -07:00
Borislav Petkov
65fc985b37 x86, cpu: Expand cpufeature facility to include cpu bugs
We add another 32-bit vector at the end of the ->x86_capability
bitvector which collects bugs present in CPUs. After all, a CPU bug is a
kind of a capability, albeit a strange one.

Signed-off-by: Borislav Petkov <bp@suse.de>
Link: http://lkml.kernel.org/r/1363788448-31325-2-git-send-email-bp@alien8.de
Signed-off-by: H. Peter Anvin <hpa@zytor.com>
2013-04-02 10:12:52 -07:00
Stephane Eranian
f20093eef5 perf/x86: Add memory profiling via PEBS Load Latency
This patch adds support for memory profiling using the
PEBS Load Latency facility.

Load accesses are sampled by HW and the instruction
address, data address, load latency, data source, tlb,
locked information can be saved in the sampling buffer
if using the PERF_SAMPLE_COST (for latency),
PERF_SAMPLE_ADDR, PERF_SAMPLE_DATA_SRC types.

To enable PEBS Load Latency, users have to use the
model specific event:

 - on NHM/WSM: MEM_INST_RETIRED:LATENCY_ABOVE_THRESHOLD
 - on SNB/IVB: MEM_TRANS_RETIRED:LATENCY_ABOVE_THRESHOLD

To make things easier, this patch also exports a generic
alias via sysfs: mem-loads. It export the right event
encoding based on the host CPU and can be used directly
by the perf tool.

Loosely based on Intel's Lin Ming patch posted on LKML
in July 2011.

Signed-off-by: Stephane Eranian <eranian@google.com>
Cc: peterz@infradead.org
Cc: ak@linux.intel.com
Cc: acme@redhat.com
Cc: jolsa@redhat.com
Cc: namhyung.kim@lge.com
Link: http://lkml.kernel.org/r/1359040242-8269-9-git-send-email-eranian@google.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2013-04-01 12:16:31 -03:00
Linus Torvalds
dfca53fb16 ACPI and power management fixes for 3.9-rc5
- Fix for a recent cpufreq regression related to acpi-cpufreq and
   suspend/resume from Viresh Kumar.
 
 - cpufreq stats reference counting fix from Viresh Kumar.
 
 - intel_pstate driver fixes from Dirk Brandewie and
   Konrad Rzeszutek Wilk.
 
 - New ACPI suspend blacklist entry for Sony Vaio VGN-FW21M from
   Fabio Valentini.
 
 - ACPI Platform Error Interface (APEI) fix from Chen Gong.
 
 - PCI root bridge hotplug locking fix from Yinghai Lu.
 
 /
 -----BEGIN PGP SIGNATURE-----
 Version: GnuPG v2.0.19 (GNU/Linux)
 
 iQIcBAABAgAGBQJRVETOAAoJEKhOf7ml8uNs30kP/3GsKWacHsaIPdhIiHQC3f91
 HMLabrW7NE7ldrOoXzj1lTHsIc1TQHm722vyI+aF061HErfkF8Jkdi5rkIai8VMq
 IJXe4CtwuuCi0SeKQsV9ymiQanTrgsP/AlGV5x/KM/As8dvAVW/1+Ln/gXAnH0IJ
 /Onqf3eA4NBw/1Hjg7AGHGeCmOlDHvcetHF7eX4MaiYZHEwuy/a7jswH4aNOjwgx
 GZtbrnwUO6OtDKv6ie//1EbP753VrkHDtK3jzIy2lUA5YyLmr0XOTvy4uQh2n/r7
 tVTqsVoNZNA4En0YUspfsWwBruUic3ra9qVTrJqn7Fzymyr+TgyCQQzSUGrOGy2a
 wY0vwMAwm1dMwAsZWPhnui6aqvu0bbg0u7sxCZQs8WapdtjxPdD7iIhRk2YU4wOZ
 omtejW0thUIwEmHWgBPo9rFvfZmxy9hb044UfhkLI9xBmuTVrDb/HqeVPA767ZoO
 k7IVg1DG4Ye6xboCIILfluoUAsc3DvkHpCIvWVujK3pF5j/M9ptt3d8eXDFIzmWD
 J6tm9ARkQoUPRAs6751cG1N0nP++ZlErYseU/h6eXoC0rkeC/WbGyxIumii4xJhg
 Gs6GGeM8OgQ/7Fat68kA2Z7jriY+MTteLbq1Sl3PBlfdURaceOXkTIVrxXo33Itq
 jQiEKa1CbJDi6OBKog8K
 =0bjZ
 -----END PGP SIGNATURE-----

Merge tag 'pm+acpi-3.9-rc5' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm

Pull ACPI and power management fixes from Rafael J Wysocki:

 - Fix for a recent cpufreq regression related to acpi-cpufreq and
   suspend/resume from Viresh Kumar.

 - cpufreq stats reference counting fix from Viresh Kumar.

 - intel_pstate driver fixes from Dirk Brandewie and Konrad Rzeszutek
   Wilk.

 - New ACPI suspend blacklist entry for Sony Vaio VGN-FW21M from Fabio
   Valentini.

 - ACPI Platform Error Interface (APEI) fix from Chen Gong.

 - PCI root bridge hotplug locking fix from Yinghai Lu.

* tag 'pm+acpi-3.9-rc5' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm:
  PCI / ACPI: hold acpi_scan_lock during root bus hotplug
  ACPI / APEI: fix error status check condition for CPER
  ACPI / PM: fix suspend and resume on Sony Vaio VGN-FW21M
  cpufreq: acpi-cpufreq: Don't set policy->related_cpus from .init()
  cpufreq: stats: do cpufreq_cpu_put() corresponding to cpufreq_cpu_get()
  intel-pstate: Use #defines instead of hard-coded values.
  cpufreq / intel_pstate: Fix calculation of current frequency
  cpufreq / intel_pstate: Add function to check that all MSRs are valid
2013-03-28 13:47:31 -07:00
Linus Torvalds
33b65f1e9c Bug-fixes:
- Regression fixes for C-and-P states not being parsed properly.
  - Fix possible security issue with guests triggering DoS via non-assigned MSI-Xs.
  - Fix regression (introduced in v3.7) with raising an event (v2).
  - Fix hastily introduced band-aid during c0 for the CR3 blowup.
 -----BEGIN PGP SIGNATURE-----
 Version: GnuPG v1.4.13 (GNU/Linux)
 
 iQEcBAABAgAGBQJRUxlVAAoJEFjIrFwIi8fJiUsH/2a3A8EVqS7OYDNgT0ZFb1VI
 rMLNiA50sRJNDsq0NbGl1Y+Lubus1czc0c7HXFQ557OakN6WqcmPPjCKp4JT6NnV
 Jz/IZ0iimdoHiPru1Qe4ah3fSgzUtht2LB48Z/a0Is4k3LsRP2W3/niVC3ypnyuJ
 52HjjuxeFAfXIkNeqsrO2a6cUXZeXzUyR4g9GNxDozi4jHpoPQ4j9okZbo218xH+
 /pRnFeMD7t7dFkgNeyeGXUiJn2AkNPHi3Hx+RH5nN9KXQ1eem9R4p7Qpez1dUEWF
 YEc/bs7MyOYezzTVHPYk77Yt8baOHJt7UbHjM6jfi1aGYYINTRr3m5mORd3rCmc=
 =61IX
 -----END PGP SIGNATURE-----

Merge tag 'stable/for-linus-3.9-rc4-tag' of git://git.kernel.org/pub/scm/linux/kernel/git/konrad/xen

Pull Xen bug-fixes from Konrad Rzeszutek Wilk:
 "This is mostly just the last stragglers of the regression bugs that
  this merge window had.  There are also two bug-fixes: one that adds an
  extra layer of security, and a regression fix for a change that was
  added in v3.7 (the v1 was faulty, the v2 works).

   - Regression fixes for C-and-P states not being parsed properly.
   - Fix possible security issue with guests triggering DoS via
     non-assigned MSI-Xs.
   - Fix regression (introduced in v3.7) with raising an event (v2).
   - Fix hastily introduced band-aid during c0 for the CR3 blowup."

* tag 'stable/for-linus-3.9-rc4-tag' of git://git.kernel.org/pub/scm/linux/kernel/git/konrad/xen:
  xen/events: avoid race with raising an event in unmask_evtchn()
  xen/mmu: Move the setting of pvops.write_cr3 to later phase in bootup.
  xen/acpi-stub: Disable it b/c the acpi_processor_add is no longer called.
  xen-pciback: notify hypervisor about devices intended to be assigned to guests
  xen/acpi-processor: Don't dereference struct acpi_processor on all CPUs.
2013-03-27 12:56:25 -07:00
Konrad Rzeszutek Wilk
05e99c8cf9 intel-pstate: Use #defines instead of hard-coded values.
They are defined in coreboot (MSR_PLATFORM) and the other
one is already defined in msr-index.h.

Let's use those.

Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
Acked-by: Viresh Kumar <viresh.kumar@linaro.org>
Acked-by: Dirk Brandewie <dirk.j.brandewie@intel.com>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2013-03-25 15:13:07 +01:00
Jan Beulich
909b3fdb0d xen-pciback: notify hypervisor about devices intended to be assigned to guests
For MSI-X capable devices the hypervisor wants to write protect the
MSI-X table and PBA, yet it can't assume that resources have been
assigned to their final values at device enumeration time. Thus have
pciback do that notification, as having the device controlled by it is
a prerequisite to assigning the device to guests anyway.

This is the kernel part of hypervisor side commit 4245d33 ("x86/MSI:
add mechanism to fully protect MSI-X table from PV guest accesses") on
the master branch of git://xenbits.xen.org/xen.git.

CC: stable@vger.kernel.org
Signed-off-by: Jan Beulich <jbeulich@suse.com>
Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
2013-03-22 10:20:55 -04:00
Linus Torvalds
cd82346934 Merge branch 'perf-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull perf fixes from Ingo Molnar:
 "A fair chunk of the linecount comes from a fix for a tracing bug that
  corrupts latency tracing buffers when the overwrite mode is changed on
  the fly - the rest is mostly assorted fewliner fixlets."

* 'perf-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
  perf/x86: Add SNB/SNB-EP scheduling constraints for cycle_activity event
  kprobes/x86: Check Interrupt Flag modifier when registering probe
  kprobes: Make hash_64() as always inlined
  perf: Generate EXIT event only once per task context
  perf: Reset hwc->last_period on sw clock events
  tracing: Prevent buffer overwrite disabled for latency tracers
  tracing: Keep overwrite in sync between regular and snapshot buffers
  tracing: Protect tracer flags with trace_types_lock
  perf tools: Fix LIBNUMA build with glibc 2.12 and older.
  tracing: Fix free of probe entry by calling call_rcu_sched()
  perf/POWER7: Create a sysfs format entry for Power7 events
  perf probe: Fix segfault
  libtraceevent: Remove hard coded include to /usr/local/include in Makefile
  perf record: Fix -C option
  perf tools: check if -DFORTIFY_SOURCE=2 is allowed
  perf report: Fix build with NO_NEWT=1
  perf annotate: Fix build with NO_NEWT=1
  tracing: Fix race in snapshot swapping
2013-03-21 08:29:11 -07:00
Linus Torvalds
ea4a0ce111 Merge git://git.kernel.org/pub/scm/virt/kvm/kvm
Pull kvm fixes from Marcelo Tosatti.

* git://git.kernel.org/pub/scm/virt/kvm/kvm:
  KVM: Fix bounds checking in ioapic indirect register reads (CVE-2013-1798)
  KVM: x86: Convert MSR_KVM_SYSTEM_TIME to use gfn_to_hva_cache functions (CVE-2013-1797)
  KVM: x86: fix for buffer overflow in handling of MSR_KVM_SYSTEM_TIME (CVE-2013-1796)
  KVM: x86: fix deadlock in clock-in-progress request handling
  KVM: allow host header to be included even for !CONFIG_KVM
2013-03-19 18:24:12 -07:00
Andy Honig
0b79459b48 KVM: x86: Convert MSR_KVM_SYSTEM_TIME to use gfn_to_hva_cache functions (CVE-2013-1797)
There is a potential use after free issue with the handling of
MSR_KVM_SYSTEM_TIME.  If the guest specifies a GPA in a movable or removable
memory such as frame buffers then KVM might continue to write to that
address even after it's removed via KVM_SET_USER_MEMORY_REGION.  KVM pins
the page in memory so it's unlikely to cause an issue, but if the user
space component re-purposes the memory previously used for the guest, then
the guest will be able to corrupt that memory.

Tested: Tested against kvmclock unit test

Signed-off-by: Andrew Honig <ahonig@google.com>
Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
2013-03-19 14:17:35 -03:00
Masami Hiramatsu
9a556ab998 kprobes/x86: Check Interrupt Flag modifier when registering probe
Currently kprobes check whether the copied instruction modifies
IF (interrupt flag) on each probe hit. This results not only in
introducing overhead but also involving
inat_get_opcode_attribute into the kprobes hot path, and it can
cause an infinite recursive call (and kernel panic in the end).

Actually, since the copied instruction itself can never be modified
on the buffer, it is needless to analyze the instruction on every
probe hit.

To fix this issue, we check it only once when registering probe
and store the result on ainsn->if_modifier.

Reported-by: Timo Juhani Lindfors <timo.lindfors@iki.fi>
Signed-off-by: Masami Hiramatsu <masami.hiramatsu.pt@hitachi.com>
Acked-by: Ananth N Mavinakayanahalli <ananth@in.ibm.com>
Cc: yrl.pp-manager.tt@hitachi.com
Cc: Steven Rostedt <rostedt@goodmis.org>
Cc: David S. Miller <davem@davemloft.net>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Link: http://lkml.kernel.org/r/20130314115242.19690.33573.stgit@mhiramat-M0-7522
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2013-03-18 10:21:23 +01:00
Feng Tang
c54fdbb282 x86: Add cpu capability flag X86_FEATURE_NONSTOP_TSC_S3
On some new Intel Atom processors (Penwell and Cloverview), there is
a feature that the TSC won't stop in S3 state, say the TSC value
won't be reset to 0 after resume. This feature makes TSC a more reliable
clocksource and could benefit the timekeeping code during system
suspend/resume cycle, so add a flag for it.

Signed-off-by: Feng Tang <feng.tang@intel.com>
[jstultz: Fix checkpatch warning]
Signed-off-by: John Stultz <john.stultz@linaro.org>
2013-03-15 16:50:26 -07:00
Frederic Weisbecker
56dd9470d7 context_tracking: Move exception handling to generic code
Exceptions handling on context tracking should share common
treatment: on entry we exit user mode if the exception triggered
in that context. Then on exception exit we return to that previous
context.

Generalize this to avoid duplication across archs.

Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Li Zhong <zhong@linux.vnet.ibm.com>
Cc: Kevin Hilman <khilman@linaro.org>
Cc: Mats Liljegren <mats.liljegren@enea.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Ingo Molnar <mingo@kernel.org>
Cc: Steven Rostedt <rostedt@goodmis.org>
Cc: Namhyung Kim <namhyung.kim@lge.com>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
2013-03-07 17:09:25 +01:00
Peter Jones
3c4aff6b9a x86, doc: Be explicit about what the x86 struct boot_params requires
If the sentinel triggers, we do not want the boot loader authors to
just poke it and make the error go away, we want them to actually fix
the problem.

This should help avoid making the incorrect change in non-compliant
bootloaders.

[ hpa: dropped the Documentation/x86/boot.txt hunk pending
  clarifications ]

Signed-off-by: Peter Jones <pjones@redhat.com>
Link: http://lkml.kernel.org/r/1362592823-28967-1-git-send-email-pjones@redhat.com
Signed-off-by: H. Peter Anvin <hpa@zytor.com>
2013-03-06 20:34:43 -08:00
Josh Boyer
2e604c0f19 x86: Don't clear efi_info even if the sentinel hits
When boot_params->sentinel is set, all we really know is that some
undefined set of fields in struct boot_params contain garbage.  In the
particular case of efi_info, however, there is a private magic for
that substructure, so it is generally safe to leave it even if the
bootloader is broken.

kexec (for which we did the initial analysis) did not initialize this
field, but of course all the EFI bootloaders do, and most EFI
bootloaders are broken in this respect (and should be fixed.)

Reported-by: Robin Holt <holt@sgi.com>
Link: http://lkml.kernel.org/r/CA%2B5PVA51-FT14p4CRYKbicykugVb=PiaEycdQ57CK2km_OQuRQ@mail.gmail.com
Tested-by: Josh Boyer <jwboyer@gmail.com>
Signed-off-by: H. Peter Anvin <hpa@zytor.com>
2013-03-06 20:23:30 -08:00
Linus Torvalds
14cc0b55b7 Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/signal
Pull signal/compat fixes from Al Viro:
 "Fixes for several regressions introduced in the last signal.git pile,
  along with fixing bugs in truncate and ftruncate compat (on just about
  anything biarch at least one of those two had been done wrong)."

* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/signal:
  compat: restore timerfd settime and gettime compat syscalls
  [regression] braino in "sparc: convert to ksignal"
  fix compat truncate/ftruncate
  switch lseek to COMPAT_SYSCALL_DEFINE
  lseek() and truncate() on sparc really need sign extension
2013-03-02 08:34:06 -08:00
Linus Torvalds
e3c4877de8 Merge branch 'x86-efi-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull x86/EFI changes from Peter Anvin:

 - Improve the initrd handling in the EFI boot stub by allowing forward
   slashes in the pathname - from Chun-Yi Lee.

 - Cleanup code duplication in the EFI mixed kernel/firmware code - from
   Satoru Takeuchi.

 - efivarfs bug fixes for more strict filename validation, with lots of
   input from Al Viro.

* 'x86-efi-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
  x86, efi: remove duplicate code in setup_arch() by using, efi_is_native()
  efivarfs: guid part of filenames are case-insensitive
  efivarfs: Validate filenames much more aggressively
  efivarfs: Use sizeof() instead of magic number
  x86, efi: Allow slash in file path of initrd
2013-02-27 16:17:42 -08:00