Commit Graph

10017 Commits

Author SHA1 Message Date
Linus Torvalds
b8c0aa46b3 This pull request has a lot of work done. The main thing is the changes
to the ftrace function callback infrastructure. It's introducing a
 way to allow different functions to call directly different trampolines
 instead of all calling the same "mcount" one.
 
 The only user of this for now is the function graph tracer, which always
 had a different trampoline, but the function tracer trampoline was called
 and did basically nothing, and then the function graph tracer trampoline
 was called. The difference now, is that the function graph tracer
 trampoline can be called directly if a function is only being traced by
 the function graph trampoline. If function tracing is also happening on
 the same function, the old way is still done.
 
 The accounting for this takes up more memory when function graph tracing
 is activated, as it needs to keep track of which functions it uses.
 I have a new way that wont take as much memory, but it's not ready yet
 for this merge window, and will have to wait for the next one.
 
 Another big change was the removal of the ftrace_start/stop() calls that
 were used by the suspend/resume code that stopped function tracing when
 entering into suspend and resume paths. The stop of ftrace was done
 because there was some function that would crash the system if one called
 smp_processor_id()! The stop/start was a big hammer to solve the issue
 at the time, which was when ftrace was first introduced into Linux.
 Now ftrace has better infrastructure to debug such issues, and I found
 the problem function and labeled it with "notrace" and function tracing
 can now safely be activated all the way down into the guts of suspend
 and resume.
 
 Other changes include clean ups of uprobe code.
 Clean up of the trace_seq() code.
 And other various small fixes and clean ups to ftrace and tracing.
 -----BEGIN PGP SIGNATURE-----
 Version: GnuPG v1
 
 iQEcBAABAgAGBQJT35zXAAoJEKQekfcNnQGuOz0H/38zqM0nLFhrgvz3EPk2UOjn
 xqpX8qyb2V7TJZL+IqeXU2a5cQZl5ba0D4WtBGpxbTae3CJYiuQ87iKUNFoH0om5
 FDpn80igb368k8V3qRdRsziKVCCf0XBd/NkHJXc0ZkfXGyzB2Ga4bBxALxp2gj9y
 bnO+vKo6+tWYKG4hyQb4P3LRXUrK8/LWEsPr39cH2QH1Rdj69Lx9CgrCdUVJmwcb
 Bj8hEiLXL/RYCFNn79A3wNTUvW0rG/AOIf4SLqXtasSRZ0ToaU0ZyDnrNv+0Ol47
 rX8tSk+LfXchL9hpIvjCf1vlAYq3pO02favteR/jip3lx/dTjEDE4RJ9qtJzZ4Q=
 =fwQY
 -----END PGP SIGNATURE-----

Merge tag 'trace-3.17' of git://git.kernel.org/pub/scm/linux/kernel/git/rostedt/linux-trace

Pull tracing updates from Steven Rostedt:
 "This pull request has a lot of work done.  The main thing is the
  changes to the ftrace function callback infrastructure.  It's
  introducing a way to allow different functions to call directly
  different trampolines instead of all calling the same "mcount" one.

  The only user of this for now is the function graph tracer, which
  always had a different trampoline, but the function tracer trampoline
  was called and did basically nothing, and then the function graph
  tracer trampoline was called.  The difference now, is that the
  function graph tracer trampoline can be called directly if a function
  is only being traced by the function graph trampoline.  If function
  tracing is also happening on the same function, the old way is still
  done.

  The accounting for this takes up more memory when function graph
  tracing is activated, as it needs to keep track of which functions it
  uses.  I have a new way that wont take as much memory, but it's not
  ready yet for this merge window, and will have to wait for the next
  one.

  Another big change was the removal of the ftrace_start/stop() calls
  that were used by the suspend/resume code that stopped function
  tracing when entering into suspend and resume paths.  The stop of
  ftrace was done because there was some function that would crash the
  system if one called smp_processor_id()! The stop/start was a big
  hammer to solve the issue at the time, which was when ftrace was first
  introduced into Linux.  Now ftrace has better infrastructure to debug
  such issues, and I found the problem function and labeled it with
  "notrace" and function tracing can now safely be activated all the way
  down into the guts of suspend and resume

  Other changes include clean ups of uprobe code, clean up of the
  trace_seq() code, and other various small fixes and clean ups to
  ftrace and tracing"

* tag 'trace-3.17' of git://git.kernel.org/pub/scm/linux/kernel/git/rostedt/linux-trace: (57 commits)
  ftrace: Add warning if tramp hash does not match nr_trampolines
  ftrace: Fix trampoline hash update check on rec->flags
  ring-buffer: Use rb_page_size() instead of open coded head_page size
  ftrace: Rename ftrace_ops field from trampolines to nr_trampolines
  tracing: Convert local function_graph functions to static
  ftrace: Do not copy old hash when resetting
  tracing: let user specify tracing_thresh after selecting function_graph
  ring-buffer: Always run per-cpu ring buffer resize with schedule_work_on()
  tracing: Remove function_trace_stop and HAVE_FUNCTION_TRACE_MCOUNT_TEST
  s390/ftrace: remove check of obsolete variable function_trace_stop
  arm64, ftrace: Remove check of obsolete variable function_trace_stop
  Blackfin: ftrace: Remove check of obsolete variable function_trace_stop
  metag: ftrace: Remove check of obsolete variable function_trace_stop
  microblaze: ftrace: Remove check of obsolete variable function_trace_stop
  MIPS: ftrace: Remove check of obsolete variable function_trace_stop
  parisc: ftrace: Remove check of obsolete variable function_trace_stop
  sh: ftrace: Remove check of obsolete variable function_trace_stop
  sparc64,ftrace: Remove check of obsolete variable function_trace_stop
  tile: ftrace: Remove check of obsolete variable function_trace_stop
  ftrace: x86: Remove check of obsolete variable function_trace_stop
  ...
2014-08-04 11:50:00 -07:00
Linus Torvalds
f74ad8df4e PCI changes for the v3.17 merge window:
Resource management
     - Support BAR sizes up to 128GB (Yinghai Lu)
     - Keep original resource if we fail to expand it (Guo Chao)
     - Return conventional error values from pci_revert_fw_address() (Bjorn Helgaas)
     - Tidy resource assignment messages (Bjorn Helgaas)
     - Don't exclude low BIOS area for non-PCI cards (Christoph Schulz)
 
   PCI device hotplug
     - Prevent NULL dereference during pciehp probe (Andreas Noever)
     - Make pciehp pcie_wait_cmd() self-contained (Bjorn Helgaas)
     - Wait for pciehp hotplug command completion lazily (Bjorn Helgaas)
     - Compute pciehp timeout from hotplug command start time (Bjorn Helgaas)
     - Remove pciehp assumptions about which commands cause completion events (Bjorn Helgaas)
     - Clear pciehp Data Link Layer State Changed during init (Myron Stowe)
     - Remove pciehp struct controller.no_cmd_complete (Rajat Jain)
     - Remove cpqphp unnecessary null test (Fabian Frederick)
     - Remove "invalid IRQ" warning for hot-added PCIe ports (Jiang Liu)
 
   IOMMU
     - Add DMA alias quirk for Intel 82801 bridge (Alex Williamson)
 
   MSI
     - Add internal msix_clear_and_set_ctrl() (Yijing Wang)
     - Remove unused msi_enabled_mask() (Yijing Wang)
     - Cache Multiple Message Capable in struct msi_desc (Yijing Wang)
     - Add msi_setup_entry() to clean up initialization (Yijing Wang)
     - Remove unused msi_remove_pci_irq_vectors() (Yijing Wang)
     - Retrieve first MSI IRQ from msi_desc rather than pci_dev (Yijing Wang)
     - Remove unused list access in __pci_restore_msix_state() (Yijing Wang)
     - Use irq_get_msi_desc() to simplify code (Yijing Wang)
 
   Generic host bridge driver
     - Fix GPL v2 license string typo (Bjorn Helgaas)
 
   Marvell MVEBU
     - Fix GPL v2 license string typo (Thierry Reding)
 
   NVIDIA Tegra
     - Use correct initial HW settings (Phil Edworthy)
     - Remove rcar_pcie_setup_window() resource argument (Phil Edworthy)
     - Fix GPL v2 license string typo (Thierry Reding)
 
   Renesas R-Car
     - Remove redundant config accessor register checks (Sergei Shtylyov)
     - Fix GPL v2 license string typo (Bjorn Helgaas)
 
   Virtualization
     - Factor secondary bus reset logic (Gavin Shan)
     - Remove duplicate powerpc reset logic (Gavin Shan)
 
   Miscellaneous
     - Rework default VGA detection for EFI (Bruno Prémont)
     - Fix sysfs "acpi_index" and "label" errors for NIC renaming (Simone Gotti)
     - Configure ASPM at pci_enable_device()-time (Vidya Sagar)
     - Add include/linux/pci_ids.h include guard (Rasmus Villemoes)
 -----BEGIN PGP SIGNATURE-----
 Version: GnuPG v1.4.11 (GNU/Linux)
 
 iQIcBAABAgAGBQJTzppBAAoJEFmIoMA60/r8mtQP/jgVWCSU+0ulHjoxVSRLu4Lc
 UGKQFjS03oWWflHdvW6wZFqN82Ynva9fYCLMtiKdPg7cgTosSRT3I4DjAIm80ZI/
 kZvHxSmi6DBYmchZBsWzj60zxNiYZeEgd7CevzcJRHuwbKNMr2y12s6hjJbyl5lF
 ygaXWpDveKjsEDjyk9vKjUGwul/NJKynar253Yh178XaoypdGuiEIw3D1lQFMZZp
 ADcRijIi+CD2BENtDr6fbldbj+yQ93yyUSloEnaKtWZD+Ao5IsHngN0IyRu+l1Wl
 LFob0AsopeYVFKdw22Gn1KAq9Jj01acsSBRXjgrauU+tLY512Vkbp1lFYl85B/38
 /Z0VNHncmIh29rq9Tl2xQwEeI3Ja27FfnMjC70dLM5YjWf8vsYnDEQZHyxAAe15D
 p3H3YuuDjmvHkoSrHY/68DLfDl9ubw3/BFUlCMqijL7444ZWLXathrnCV8ZJimmr
 PlF/m7GtXYF4wIw19m9KQqNBUPJJEsVHExKzICOY4v5/nMlvx4ZkBDR3tPNEH1sk
 3AYKjLDw21Nle7yKcAlxDI/TYWZqxuph23UpevzlQd16tutq2i2FqpauiqI3DFm4
 VfYVbOVQwfeUJt11VOCgxvE7RsTxCk5QefB+YKVAdVK6vMZHeZxsetYvrCDptnea
 cId/NfiEFnmr+u3mAyPM
 =U5Ip
 -----END PGP SIGNATURE-----

Merge tag 'pci-v3.17-changes' of git://git.kernel.org/pub/scm/linux/kernel/git/helgaas/pci

Pull PCI updates from Bjorn Helgaas:
 "I'll be on vacation until Aug 11, and I suspect the merge window will
  open before then, so I'm sending this to you early.  There are more
  things I'd like to get into v3.17, so I hope to send another pull
  request soon after I return.

  The most notable pieces here are:

   - Support BARs up to 128GB (up from 8GB)
   - Fix SR-IOV resource assignment when we fail to expand a resource
   - Rework pciehp to handle a common hardware erratum
   - Cleanup MSI
   - Fix NIC renaming issue
   - Fix VGA default device issue on EFI systems
   - Fix ASPM configuration (previously we didn't enable it as expected)

  Alex Williamson has graciously agreed to take care of any major issues
  with this if you take it before I return.

  Details:

  Resource management
    - Support BAR sizes up to 128GB (Yinghai Lu)
    - Keep original resource if we fail to expand it (Guo Chao)
    - Return conventional error values from pci_revert_fw_address() (Bjorn Helgaas)
    - Tidy resource assignment messages (Bjorn Helgaas)
    - Don't exclude low BIOS area for non-PCI cards (Christoph Schulz)

  PCI device hotplug
    - Prevent NULL dereference during pciehp probe (Andreas Noever)
    - Make pciehp pcie_wait_cmd() self-contained (Bjorn Helgaas)
    - Wait for pciehp hotplug command completion lazily (Bjorn Helgaas)
    - Compute pciehp timeout from hotplug command start time (Bjorn Helgaas)
    - Remove pciehp assumptions about which commands cause completion events (Bjorn Helgaas)
    - Clear pciehp Data Link Layer State Changed during init (Myron Stowe)
    - Remove pciehp struct controller.no_cmd_complete (Rajat Jain)
    - Remove cpqphp unnecessary null test (Fabian Frederick)
    - Remove "invalid IRQ" warning for hot-added PCIe ports (Jiang Liu)

  IOMMU
    - Add DMA alias quirk for Intel 82801 bridge (Alex Williamson)

  MSI
    - Add internal msix_clear_and_set_ctrl() (Yijing Wang)
    - Remove unused msi_enabled_mask() (Yijing Wang)
    - Cache Multiple Message Capable in struct msi_desc (Yijing Wang)
    - Add msi_setup_entry() to clean up initialization (Yijing Wang)
    - Remove unused msi_remove_pci_irq_vectors() (Yijing Wang)
    - Retrieve first MSI IRQ from msi_desc rather than pci_dev (Yijing Wang)
    - Remove unused list access in __pci_restore_msix_state() (Yijing Wang)
    - Use irq_get_msi_desc() to simplify code (Yijing Wang)

  Generic host bridge driver
    - Fix GPL v2 license string typo (Bjorn Helgaas)

  Marvell MVEBU
    - Fix GPL v2 license string typo (Thierry Reding)

  NVIDIA Tegra
    - Use correct initial HW settings (Phil Edworthy)
    - Remove rcar_pcie_setup_window() resource argument (Phil Edworthy)
    - Fix GPL v2 license string typo (Thierry Reding)

  Renesas R-Car
    - Remove redundant config accessor register checks (Sergei Shtylyov)
    - Fix GPL v2 license string typo (Bjorn Helgaas)

  Virtualization
    - Factor secondary bus reset logic (Gavin Shan)
    - Remove duplicate powerpc reset logic (Gavin Shan)

  Miscellaneous
    - Rework default VGA detection for EFI (Bruno Prémont)
    - Fix sysfs "acpi_index" and "label" errors for NIC renaming (Simone Gotti)
    - Configure ASPM at pci_enable_device()-time (Vidya Sagar)
    - Add include/linux/pci_ids.h include guard (Rasmus Villemoes)"

* tag 'pci-v3.17-changes' of git://git.kernel.org/pub/scm/linux/kernel/git/helgaas/pci: (38 commits)
  PCI/MSI: Use irq_get_msi_desc() to simplify code
  PCI/MSI: Remove unused list access in __pci_restore_msix_state()
  PCI/MSI: Retrieve first MSI IRQ from msi_desc rather than pci_dev
  PCI/MSI: Remove unused function msi_remove_pci_irq_vectors()
  PCI/MSI: Add msi_setup_entry() to clean up MSI initialization
  PCI: Configure ASPM when enabling device
  x86: don't exclude low BIOS area when allocating address space for non-PCI cards
  PCI: generic: Fix GPL v2 license string typo
  PCI: rcar: Fix GPL v2 license string typo
  PCI: tegra: Fix GPL v2 license string typo
  PCI: mvebu: Fix GPL v2 license string typo
  PCI: Add include guard to include/linux/pci_ids.h
  x86, ia64: Move EFI_FB vga_default_device() initialization to pci_vga_fixup()
  PCI: Tidy resource assignment messages
  PCI: Return conventional error values from pci_revert_fw_address()
  PCI: Cleanup control flow
  PCI: Support BAR sizes up to 128GB
  PCI: cpqphp: Remove unnecessary null test before debugfs_remove()
  PCI: pciehp: Clear Data Link Layer State Changed during init
  PCI: Add bridge DMA alias quirk for Intel 82801 bridge
  ...
2014-08-04 09:29:37 -07:00
Andy Lutomirski
7209a75d20 x86_64/entry/xen: Do not invoke espfix64 on Xen
This moves the espfix64 logic into native_iret.  To make this work,
it gets rid of the native patch for INTERRUPT_RETURN:
INTERRUPT_RETURN on native kernels is now 'jmp native_iret'.

This changes the 16-bit SS behavior on Xen from OOPSing to leaking
some bits of the Xen hypervisor's RSP (I think).

[ hpa: this is a nonzero cost on native, but probably not enough to
  measure. Xen needs to fix this in their own code, probably doing
  something equivalent to espfix64. ]

Signed-off-by: Andy Lutomirski <luto@amacapital.net>
Link: http://lkml.kernel.org/r/7b8f1d8ef6597cb16ae004a43c56980a7de3cf94.1406129132.git.luto@amacapital.net
Signed-off-by: H. Peter Anvin <hpa@linux.intel.com>
Cc: <stable@vger.kernel.org>
2014-07-28 15:25:40 -07:00
Linus Torvalds
9dae0a3fc4 Merge branch 'perf-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull perf fixes from Thomas Gleixner:
 "A bunch of fixes for perf and kprobes:
   - revert a commit that caused a perf group regression
   - silence dmesg spam
   - fix kprobe probing errors on ia64 and ppc64
   - filter kprobe faults from userspace
   - lockdep fix for perf exit path
   - prevent perf #GP in KVM guest
   - correct perf event and filters"

* 'perf-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
  kprobes: Fix "Failed to find blacklist" probing errors on ia64 and ppc64
  kprobes/x86: Don't try to resolve kprobe faults from userspace
  perf/x86/intel: Avoid spamming kernel log for BTS buffer failure
  perf/x86/intel: Protect LBR and extra_regs against KVM lying
  perf: Fix lockdep warning on process exit
  perf/x86/intel/uncore: Fix SNB-EP/IVT Cbox filter mappings
  perf/x86/intel: Use proper dTLB-load-misses event on IvyBridge
  perf: Revert ("perf: Always destroy groups on exit")
2014-07-27 09:57:16 -07:00
H. Peter Anvin
bf72f5dee0 Promote one fix for 3.16
This fix was necessary after
 
 9c15a24b03 ("x86/mce: Improve mcheck_init_device() error handling")
 
 went in. What this patch did was, among others, check the return value
 of misc_register and exit early if it encountered an error. Original
 code sloppily didn't do that.
 
 However,
 
         cef12ee52b ("xen/mce: Add mcelog support for Xen platform")
 
 made it so that xen's init routine xen_late_init_mcelog runs first. This
 was needed for the xen mcelog device which is supposed to be independent
 from the baremetal one.
 
 Initially it was reported that misc_register() fails often on xen and
 that's why it needed fixing. However, it is *supposed* to fail by
 design, when running in dom0 so that the xen mcelog device file gets
 registered first.
 
 And *then* you need the notifier *not* unregistered on the error path so
 that the timer does get deleted properly in the CPU hotplug notifier.
 
 Btw, this fix is needed also on baremetal in the unlikely event that
 misc_register(&mce_chrdev_device) fails there too.
 
 I was unsure whether to rush it in now and decided to delay it to 3.17.
 However, xen people wanted it promoted as it breaks xen when doing cpu
 hotplug there. So, after a bit of simmering in tip/master for initial
 smoke testing, let's move it to 3.16. It fixes a semi-regression which
 got introduced in 3.16 so no need for stable tagging.
 
 tip/x86/ras contains that exact same commit but we can't remove it
 there as it is not the last one. It won't cause any merge issues, as I
 confirmed locally but I should state here the special situation of this
 one fix explicitly anyway.
 
 Thanks.
 -----BEGIN PGP SIGNATURE-----
 Version: GnuPG v1
 
 iQIcBAABAgAGBQJTzUMOAAoJEBLB8Bhh3lVKDC0P/3azgACl7qXkpUoFIUW4RSGJ
 3psmzjYd8RU2Trh7+8NFTUccLYIe6+4wyE8R4fgx1iV3kuUPOuAIRn0sRk4/TgCC
 IwA+j9v28jilFQvXuv+v105mIdiTmKJotcL3+k4I3HUR8RehnHK+BrlxR8e/zQzU
 W/Vk9cgz8wzPxMmmpvUE8OHmWYEk6h/rkCz6TF0bAJbgSq9iN37mmpKfXwwvaIRi
 CFP/UXIduipEuYUd+8fJqUg14roDFvA5hXPz7abM0gSr1NjR/k724f2G4iDnez0V
 pm8I+PmyEaZYI6udUrbyZ0X9wTcgHLe6GtwyEIBaOOqkMZbjC4anEe7VUJFIKr5/
 dNN538DGXxNfrf0u5lkHF/SCiE/T3eoP8imaMeZXwE04HHjWOfvqJs7miUPfGjKE
 VYvbY2fVbjbYCFjgqc2XkJgPpwGpOF2WCaXva2hyIRKrXsgt725wvoBCnNMX+AD7
 l71AeexVCN3TGmP7Twi7W8d23QPR0+M7x5SNkFaUTepJ2f+s3Fpeopq9mfbAtM6i
 p6fRDVncOBalNW3G5hED3YYcjfjQgpiKMQ7zm84K71NkSCBOsjiatFrPD+Y0lBwM
 LNtOViTXFvXW5xMBDDjQq3+InIbzt7MlQMVP4nP6rkEjOLfR60+ivr9sPUuqTb//
 hqI5bCPgWhxEBrzA7iIj
 =z/P/
 -----END PGP SIGNATURE-----

x86: Merge tag 'ras_urgent' into x86/urgent

Promote one fix for 3.16

This fix was necessary after

9c15a24b03 ("x86/mce: Improve mcheck_init_device() error handling")

went in. What this patch did was, among others, check the return value
of misc_register and exit early if it encountered an error. Original
code sloppily didn't do that.

However,

        cef12ee52b ("xen/mce: Add mcelog support for Xen platform")

made it so that xen's init routine xen_late_init_mcelog runs first. This
was needed for the xen mcelog device which is supposed to be independent
from the baremetal one.

Initially it was reported that misc_register() fails often on xen and
that's why it needed fixing. However, it is *supposed* to fail by
design, when running in dom0 so that the xen mcelog device file gets
registered first.

And *then* you need the notifier *not* unregistered on the error path so
that the timer does get deleted properly in the CPU hotplug notifier.

Btw, this fix is needed also on baremetal in the unlikely event that
misc_register(&mce_chrdev_device) fails there too.

I was unsure whether to rush it in now and decided to delay it to 3.17.
However, xen people wanted it promoted as it breaks xen when doing cpu
hotplug there. So, after a bit of simmering in tip/master for initial
smoke testing, let's move it to 3.16. It fixes a semi-regression which
got introduced in 3.16 so no need for stable tagging.

tip/x86/ras contains that exact same commit but we can't remove it
there as it is not the last one. It won't cause any merge issues, as I
confirmed locally but I should state here the special situation of this
one fix explicitly anyway.

Thanks.

Signed-off-by: H. Peter Anvin <hpa@linux.intel.com>
2014-07-24 16:32:31 -07:00
Peter Zijlstra
2a2261553d x86, cpu: Fix cache topology for early P4-SMT
P4 systems with cpuid level < 4 can have SMT, but the cache topology
description available (cpuid2) does not include SMP information.

Now we know that SMT shares all cache levels, and therefore we can
mark all available cache levels as shared.

We do this by setting cpu_llc_id to ->phys_proc_id, since that's
the same for each SMT thread. We can do this unconditional since if
there's no SMT its still true, the one CPU shares cache with only
itself.

This fixes a problem where such CPUs report an incorrect LLC CPU mask.

This in turn fixes a crash in the scheduler where the topology was
build wrong, it assumes the LLC mask to include at least the SMT CPUs.

Cc: Josh Boyer <jwboyer@redhat.com>
Cc: Dietmar Eggemann <dietmar.eggemann@arm.com>
Tested-by: Bruno Wolff III <bruno@wolff.to>
Signed-off-by: Peter Zijlstra <peterz@infradead.org>
Link: http://lkml.kernel.org/r/20140722133514.GM12054@laptop.lan
Signed-off-by: H. Peter Anvin <hpa@zytor.com>
2014-07-23 08:16:17 -07:00
Sven Wegener
8142b21550 x86_32, entry: Store badsys error code in %eax
Commit 554086d ("x86_32, entry: Do syscall exit work on badsys
(CVE-2014-4508)") introduced a regression in the x86_32 syscall entry
code, resulting in syscall() not returning proper errors for undefined
syscalls on CPUs supporting the sysenter feature.

The following code:

> int result = syscall(666);
> printf("result=%d errno=%d error=%s\n", result, errno, strerror(errno));

results in:

> result=666 errno=0 error=Success

Obviously, the syscall return value is the called syscall number, but it
should have been an ENOSYS error. When run under ptrace it behaves
correctly, which makes it hard to debug in the wild:

> result=-1 errno=38 error=Function not implemented

The %eax register is the return value register. For debugging via ptrace
the syscall entry code stores the complete register context on the
stack. The badsys handlers only store the ENOSYS error code in the
ptrace register set and do not set %eax like a regular syscall handler
would. The old resume_userspace call chain contains code that clobbers
%eax and it restores %eax from the ptrace registers afterwards. The same
goes for the ptrace-enabled call chain. When ptrace is not used, the
syscall return value is the passed-in syscall number from the untouched
%eax register.

Use %eax as the return value register in syscall_badsys and
sysenter_badsys, like a real syscall handler does, and have the caller
push the value onto the stack for ptrace access.

Signed-off-by: Sven Wegener <sven.wegener@stealer.net>
Link: http://lkml.kernel.org/r/alpine.LNX.2.11.1407221022380.31021@titan.int.lan.stealer.net
Reviewed-and-tested-by: Andy Lutomirski <luto@amacapital.net>
Cc: <stable@vger.kernel.org> # If 554086d is backported
Signed-off-by: H. Peter Anvin <hpa@zytor.com>
2014-07-22 02:34:05 -07:00
Borislav Petkov
51cbe7e7c4 x86, MCE: Robustify mcheck_init_device
BorisO reports that misc_register() fails often on xen. The current code
unregisters the CPU hotplug notifier in that case. If then a CPU is
offlined and onlined back again, we end up with a second timer running
on that CPU, leading to soft lockups and system hangs.

So let's leave the hotcpu notifier always registered - even if
mce_device_create failed for some cores and never unreg it so that we
can deal with the timer handling accordingly.

Reported-and-Tested-by: Boris Ostrovsky <boris.ostrovsky@oracle.com>
Link: http://lkml.kernel.org/r/1403274493-1371-1-git-send-email-boris.ostrovsky@oracle.com
Signed-off-by: Borislav Petkov <bp@suse.de>
2014-07-21 18:14:32 +02:00
Linus Torvalds
1b9f0efd61 Merge branch 'x86-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull x86 fixes from Peter Anvin:
 "A couple of key fixes and a few less critical ones.  The main ones
  are:

   - add a .bss section to the PE/COFF headers when building with EFI
     stub

   - invoke the correct paravirt magic when building the espfix page
     tables

  Unfortunately both of these areas also have at least one additional
  fix each still in thie pipeline, but which are not yet ready to push"

* 'x86-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
  x86: Remove unused variable "polling"
  x86/espfix/xen: Fix allocation of pages for paravirt page tables
  x86/efi: Include a .bss section within the PE/COFF headers
  efi: fdt: Do not report an error during boot if UEFI is not available
  efi/arm64: efistub: remove local copy of linux_banner
2014-07-18 20:46:55 -10:00
Steven Rostedt (Red Hat)
fdc841b58c ftrace: x86: Remove check of obsolete variable function_trace_stop
Nothing sets function_trace_stop to disable function tracing anymore.
Remove the check for it in the arch code.

Link: http://lkml.kernel.org/r/53C54D32.6000000@zytor.com

Acked-by: H. Peter Anvin <hpa@linux.intel.com>
Reviewed-by: Masami Hiramatsu <masami.hiramatsu.pt@hitachi.com>
Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
2014-07-18 13:57:02 -04:00
Steven Rostedt (Red Hat)
84b2bc7fa0 ftrace/x86: Add call to ftrace_graph_is_dead() in function graph code
ftrace_stop() is going away as it disables parts of function tracing
that affects users that should not be affected. But ftrace_graph_stop()
is built on ftrace_stop(). Here's another example of killing all of
function tracing because something went wrong with function graph
tracing.

Instead of disabling all users of function tracing on function graph
error, disable only function graph tracing. To do this, the arch code
must call ftrace_graph_is_dead() before it implements function graph.

Link: http://lkml.kernel.org/r/53C54D18.3020602@zytor.com

Acked-by: H. Peter Anvin <hpa@linux.intel.com>
Reviewed-by: Masami Hiramatsu <masami.hiramatsu.pt@hitachi.com>
Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
2014-07-17 09:45:08 -04:00
Linus Torvalds
bcf44bfe5e Merge branch 'sched-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull scheduler fixes from Ingo Molnar:
 "A cpufreq lockup fix and a compiler warning fix"

* 'sched-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
  sched: Fix compiler warnings
  x86, tsc: Fix cpufreq lockup
2014-07-16 10:11:02 -10:00
Linus Torvalds
d14aef3872 Merge branch 'perf-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull perf fixes from Ingo Molnar:
 "Tooling fixes and an Intel PMU driver fixlet"

* 'perf-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
  perf: Do not allow optimized switch for non-cloned events
  perf/x86/intel: ignore CondChgd bit to avoid false NMI handling
  perf symbols: Get kernel start address by symbol name
  perf tools: Fix segfault in cumulative.callchain report
2014-07-16 10:10:27 -10:00
Christoph Schulz
cbace46a97 x86: don't exclude low BIOS area when allocating address space for non-PCI cards
Commit 30919b0bf3 ("x86: avoid low BIOS area when allocating address
space") moved the test for resource allocations that fall within the first
1MB of address space from the PCI-specific path to a generic path, such
that all resource allocations will avoid this area.  However, this breaks
ISA cards which need to allocate a memory region within the first 1MB.  An
example is the i82365 PCMCIA controller and derivatives like the Ricoh
RF5C296/396 which map part of the PCMCIA socket memory address space into
the first 1MB of system memory address space.  They do not work anymore as
no usable memory region exists due to this change:

  Intel ISA PCIC probe: Ricoh RF5C296/396 ISA-to-PCMCIA at port 0x3e0 ofs 0x00, 2 sockets
  host opts [0]: none
  host opts [1]: none
  ISA irqs (scanned) = 3,4,5,9,10 status change on irq 10
  pcmcia_socket pcmcia_socket1: pccard: PCMCIA card inserted into slot 1
  pcmcia_socket pcmcia_socket0: cs: IO port probe 0xc00-0xcff: excluding 0xcf8-0xcff
  pcmcia_socket pcmcia_socket0: cs: IO port probe 0xa00-0xaff: clean.
  pcmcia_socket pcmcia_socket0: cs: IO port probe 0x100-0x3ff: excluding 0x170-0x177 0x1f0-0x1f7 0x2f8-0x2ff 0x370-0x37f 0x3c0-0x3e7 0x3f0-0x3ff
  pcmcia_socket pcmcia_socket0: cs: memory probe 0x0a0000-0x0affff: excluding 0xa0000-0xaffff
  pcmcia_socket pcmcia_socket0: cs: memory probe 0x0b0000-0x0bffff: excluding 0xb0000-0xbffff
  pcmcia_socket pcmcia_socket0: cs: memory probe 0x0c0000-0x0cffff: excluding 0xc0000-0xcbfff
  pcmcia_socket pcmcia_socket0: cs: memory probe 0x0d0000-0x0dffff: clean.
  pcmcia_socket pcmcia_socket0: cs: memory probe 0x0e0000-0x0effff: clean.
  pcmcia_socket pcmcia_socket0: cs: memory probe 0x60000000-0x60ffffff: clean.
  pcmcia_socket pcmcia_socket0: cs: memory probe 0xa0000000-0xa0ffffff: clean.
  pcmcia_socket pcmcia_socket1: cs: IO port probe 0xc00-0xcff: excluding 0xcf8-0xcff
  pcmcia_socket pcmcia_socket1: cs: IO port probe 0xa00-0xaff: clean.
  pcmcia_socket pcmcia_socket1: cs: IO port probe 0x100-0x3ff: excluding 0x170-0x177 0x1f0-0x1f7 0x2f8-0x2ff 0x370-0x37f 0x3c0-0x3e7 0x3f0-0x3ff
  pcmcia_socket pcmcia_socket1: cs: memory probe 0x0a0000-0x0affff: excluding 0xa0000-0xaffff
  pcmcia_socket pcmcia_socket1: cs: memory probe 0x0b0000-0x0bffff: excluding 0xb0000-0xbffff
  pcmcia_socket pcmcia_socket1: cs: memory probe 0x0c0000-0x0cffff: excluding 0xc0000-0xcbfff
  pcmcia_socket pcmcia_socket1: cs: memory probe 0x0d0000-0x0dffff: clean.
  pcmcia_socket pcmcia_socket1: cs: memory probe 0x0e0000-0x0effff: clean.
  pcmcia_socket pcmcia_socket1: cs: memory probe 0x60000000-0x60ffffff: clean.
  pcmcia_socket pcmcia_socket1: cs: memory probe 0xa0000000-0xa0ffffff: clean.
  pcmcia_socket pcmcia_socket1: cs: memory probe 0x0cc000-0x0effff: excluding 0xe0000-0xeffff
  pcmcia_socket pcmcia_socket1: cs: unable to map card memory!

If filtering out the first 1MB is reverted, everything works as expected.

Tested-by: Robert Resch <fli4l@robert.reschpara.de>
Signed-off-by: Christoph Schulz <develop@kristov.de>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
CC: stable@vger.kernel.org	# v2.6.37+
2014-07-16 12:29:36 -06:00
Andy Lutomirski
0cdd192cf4 kprobes/x86: Don't try to resolve kprobe faults from userspace
This commit:

    commit 6f6343f53d
    Author: Masami Hiramatsu <masami.hiramatsu.pt@hitachi.com>
    Date:   Thu Apr 17 17:17:33 2014 +0900

        kprobes/x86: Call exception handlers directly from do_int3/do_debug

appears to have inadvertently dropped a check that the int3 came
from kernel mode.  Trying to dereference addr when addr is
user-controlled is completely bogus.

Signed-off-by: Andy Lutomirski <luto@amacapital.net>
Acked-by: Masami Hiramatsu <masami.hiramatsu.pt@hitachi.com>
Link: http://lkml.kernel.org/r/c4e339882c121aa76254f2adde3fcbdf502faec2.1405099506.git.luto@amacapital.net
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2014-07-16 14:16:32 +02:00
David Rientjes
4485154138 perf/x86/intel: Avoid spamming kernel log for BTS buffer failure
It's unnecessary to excessively spam the kernel log anytime the BTS buffer
cannot be allocated, so make this allocation __GFP_NOWARN.

The user probably will want to at least find some artifact that the
allocation has failed in the past, probably due to fragmentation because
of its large size, when it's not allocated at bootstrap.  Thus, add a
WARN_ONCE() so something is left behind for them to understand why perf
commnads that require PEBS is not working properly.

Signed-off-by: David Rientjes <rientjes@google.com>
Signed-off-by: Peter Zijlstra <peterz@infradead.org>
Cc: Arnaldo Carvalho de Melo <acme@kernel.org>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Link: http://lkml.kernel.org/r/alpine.DEB.2.02.1406301600460.26302@chino.kir.corp.google.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2014-07-16 13:31:30 +02:00
Kan Liang
338b522ca4 perf/x86/intel: Protect LBR and extra_regs against KVM lying
With -cpu host, KVM reports LBR and extra_regs support, if the host has
support.

When the guest perf driver tries to access LBR or extra_regs MSR,
it #GPs all MSR accesses,since KVM doesn't handle LBR and extra_regs support.
So check the related MSRs access right once at initialization time to avoid
the error access at runtime.

For reproducing the issue, please build the kernel with CONFIG_KVM_INTEL = y
(for host kernel).
And CONFIG_PARAVIRT = n and CONFIG_KVM_GUEST = n (for guest kernel).
Start the guest with -cpu host.
Run perf record with --branch-any or --branch-filter in guest to trigger LBR
Run perf stat offcore events (E.g. LLC-loads/LLC-load-misses ...) in guest to
trigger offcore_rsp #GP

Signed-off-by: Kan Liang <kan.liang@intel.com>
Signed-off-by: Peter Zijlstra <peterz@infradead.org>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Arnaldo Carvalho de Melo <acme@kernel.org>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Maria Dimakopoulou <maria.n.dimakopoulou@gmail.com>
Cc: Mark Davies <junk@eslaf.co.uk>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Stephane Eranian <eranian@google.com>
Cc: Yan, Zheng <zheng.z.yan@intel.com>
Link: http://lkml.kernel.org/r/1405365957-20202-1-git-send-email-kan.liang@intel.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2014-07-16 13:18:43 +02:00
Stephane Eranian
7711fe4fc2 perf/x86/intel/uncore: Fix SNB-EP/IVT Cbox filter mappings
This patch fixes the SNB-EP and IVT Cbox filter mapping
table. The table controls which filters are supported by
which events. There were several mistakes in those tables
causing some filters to be ignored, such as NID on
TOR_INSERTS.

Signed-off-by: Stephane Eranian <eranian@google.com>
Signed-off-by: Peter Zijlstra <peterz@infradead.org>
Cc: zheng.z.yan@intel.com
Cc: Arnaldo Carvalho de Melo <acme@kernel.org>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Link: http://lkml.kernel.org/r/20140630144624.GA2604@quad
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2014-07-16 13:18:41 +02:00
Vince Weaver
1996388e9f perf/x86/intel: Use proper dTLB-load-misses event on IvyBridge
This was discussed back in February:

	https://lkml.org/lkml/2014/2/18/956

But I never saw a patch come out of it.

On IvyBridge we share the SandyBridge cache event tables, but the
dTLB-load-miss event is not compatible.  Patch it up after
the fact to the proper DTLB_LOAD_MISSES.DEMAND_LD_MISS_CAUSES_A_WALK

Signed-off-by: Vince Weaver <vincent.weaver@maine.edu>
Signed-off-by: Peter Zijlstra <peterz@infradead.org>
Cc: Arnaldo Carvalho de Melo <acme@kernel.org>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Link: http://lkml.kernel.org/r/alpine.DEB.2.11.1407141528200.17214@vincent-weaver-1.umelst.maine.edu
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2014-07-16 13:18:40 +02:00
Paul Bolle
d3f44fbabe x86: Remove unused variable "polling"
Compile tested. "polling" is unused since commit f80c5b39b8
("sched/idle, x86: Switch from TS_POLLING to TIF_POLLING_NRFLAG").

Signed-off-by: Paul Bolle <pebolle@tiscali.nl>
Signed-off-by: Peter Zijlstra <peterz@infradead.org>
Cc: Jiri Kosina <jkosina@suse.cz>
Link: http://lkml.kernel.org/r/1404138749.2978.6.camel@x41
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2014-07-16 12:58:47 +02:00
Boris Ostrovsky
8762e50928 x86/espfix/xen: Fix allocation of pages for paravirt page tables
init_espfix_ap() is currently off by one level when informing hypervisor
that allocated pages will be used for ministacks' page tables.

The most immediate effect of this on a PV guest is that if
'stack_page = __get_free_page()' returns a non-zeroed-out page the hypervisor
will refuse to use it for a page table (which it shouldn't be anyway). This will
result in warnings by both Xen and Linux.

More importantly, a subsequent write to that page (again, by a PV guest) is
likely to result in fatal page fault.

Signed-off-by: Boris Ostrovsky <boris.ostrovsky@oracle.com>
Link: http://lkml.kernel.org/r/1404926298-5565-1-git-send-email-boris.ostrovsky@oracle.com
Reviewed-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
Signed-off-by: H. Peter Anvin <hpa@linux.intel.com>
Cc: <stable@vger.kernel.org>
2014-07-14 13:47:32 -07:00
HATAYAMA Daisuke
b292d7a104 perf/x86/intel: ignore CondChgd bit to avoid false NMI handling
Currently, any NMI is falsely handled by a NMI handler of NMI watchdog
if CondChgd bit in MSR_CORE_PERF_GLOBAL_STATUS MSR is set.

For example, we use external NMI to make system panic to get crash
dump, but in this case, the external NMI is falsely handled do to the
issue.

This commit deals with the issue simply by ignoring CondChgd bit.

Here is explanation in detail.

On x86 NMI watchdog uses performance monitoring feature to
periodically signal NMI each time performance counter gets overflowed.

intel_pmu_handle_irq() is called as a NMI_LOCAL handler from a NMI
handler of NMI watchdog, perf_event_nmi_handler(). It identifies an
owner of a given NMI by looking at overflow status bits in
MSR_CORE_PERF_GLOBAL_STATUS MSR. If some of the bits are set, then it
handles the given NMI as its own NMI.

The problem is that the intel_pmu_handle_irq() doesn't distinguish
CondChgd bit from other bits. Unlike the other status bits, CondChgd
bit doesn't represent overflow status for performance counters. Thus,
CondChgd bit cannot be thought of as a mark indicating a given NMI is
NMI watchdog's.

As a result, if CondChgd bit is set, any NMI is falsely handled by the
NMI handler of NMI watchdog. Also, if type of the falsely handled NMI
is either NMI_UNKNOWN, NMI_SERR or NMI_IO_CHECK, the corresponding
action is never performed until CondChgd bit is cleared.

I noticed this behavior on systems with Ivy Bridge processors: Intel
Xeon CPU E5-2630 v2 and Intel Xeon CPU E7-8890 v2. On both systems,
CondChgd bit in MSR_CORE_PERF_GLOBAL_STATUS MSR has already been set
in the beginning at boot. Then the CondChgd bit is immediately cleared
by next wrmsr to MSR_CORE_PERF_GLOBAL_CTRL MSR and appears to remain
0.

On the other hand, on older processors such as Nehalem, Xeon E7540,
CondChgd bit is not set in the beginning at boot.

I'm not sure about exact behavior of CondChgd bit, in particular when
this bit is set. Although I read Intel System Programmer's Manual to
figure out that, the descriptions I found are:

  In 18.9.1:

  "The MSR_PERF_GLOBAL_STATUS MSR also provides a ¡sticky bit¢ to
   indicate changes to the state of performancmonitoring hardware"

  In Table 35-2 IA-32 Architectural MSRs

  63 CondChg: status bits of this register has changed.

These are different from the bahviour I see on the actual system as I
explained above.

At least, I think ignoring CondChgd bit should be enough for NMI
watchdog perspective.

Signed-off-by: HATAYAMA Daisuke <d.hatayama@jp.fujitsu.com>
Acked-by: Don Zickus <dzickus@redhat.com>
Signed-off-by: Peter Zijlstra <peterz@infradead.org>
Cc: <stable@vger.kernel.org>
Cc: Arnaldo Carvalho de Melo <acme@kernel.org>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: linux-kernel@vger.kernel.org
Link: http://lkml.kernel.org/r/20140625.103503.409316067.d.hatayama@jp.fujitsu.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2014-07-02 08:35:55 +02:00
Peter Zijlstra
3896c329df x86, tsc: Fix cpufreq lockup
Mauro reported that his AMD X2 using the powernow-k8 cpufreq driver
locked up when doing cpu hotplug.

Because we called set_cyc2ns_scale() from the time_cpufreq_notifier()
unconditionally, it gets called multiple times for each freq change,
instead of only the once, when the tsc_khz value actually changes.

Because it gets called more than once, we run out of cyc2ns data slots
and stall, waiting for a free one, but because we're half way offline,
there's no consumers to free slots.

By placing the call inside the condition that actually changes tsc_khz
we avoid superfluous calls and avoid the problem.

Reported-by: Mauro <registosites@hotmail.com>
Tested-by: Mauro <registosites@hotmail.com>
Fixes: 20d1c86a57 ("sched/clock, x86: Rewrite cyc2ns() to avoid the need to disable IRQs")
Cc: <stable@vger.kernel.org>
Cc: "Rafael J. Wysocki" <rjw@rjwysocki.net>
Cc: Viresh Kumar <viresh.kumar@linaro.org>
Cc: Bin Gao <bin.gao@intel.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Mika Westerberg <mika.westerberg@linux.intel.com>
Cc: Paul Gortmaker <paul.gortmaker@windriver.com>
Cc: Stefani Seibold <stefani@seibold.net>
Cc: linux-kernel@vger.kernel.org
Signed-off-by: Peter Zijlstra <peterz@infradead.org>
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2014-07-02 08:33:47 +02:00
Steven Rostedt (Red Hat)
79922b8009 ftrace: Optimize function graph to be called directly
Function graph tracing is a bit different than the function tracers, as
it is processed after either the ftrace_caller or ftrace_regs_caller
and we only have one place to modify the jump to ftrace_graph_caller,
the jump needs to happen after the restore of registeres.

The function graph tracer is dependent on the function tracer, where
even if the function graph tracing is going on by itself, the save and
restore of registers is still done for function tracing regardless of
if function tracing is happening, before it calls the function graph
code.

If there's no function tracing happening, it is possible to just call
the function graph tracer directly, and avoid the wasted effort to save
and restore regs for function tracing.

This requires adding new flags to the dyn_ftrace records:

  FTRACE_FL_TRAMP
  FTRACE_FL_TRAMP_EN

The first is set if the count for the record is one, and the ftrace_ops
associated to that record has its own trampoline. That way the mcount code
can call that trampoline directly.

In the future, trampolines can be added to arbitrary ftrace_ops, where you
can have two or more ftrace_ops registered to ftrace (like kprobes and perf)
and if they are not tracing the same functions, then instead of doing a
loop to check all registered ftrace_ops against their hashes, just call the
ftrace_ops trampoline directly, which would call the registered ftrace_ops
function directly.

Without this patch perf showed:

  0.05%  hackbench  [kernel.kallsyms]  [k] ftrace_caller
  0.05%  hackbench  [kernel.kallsyms]  [k] arch_local_irq_save
  0.05%  hackbench  [kernel.kallsyms]  [k] native_sched_clock
  0.04%  hackbench  [kernel.kallsyms]  [k] __buffer_unlock_commit
  0.04%  hackbench  [kernel.kallsyms]  [k] preempt_trace
  0.04%  hackbench  [kernel.kallsyms]  [k] prepare_ftrace_return
  0.04%  hackbench  [kernel.kallsyms]  [k] __this_cpu_preempt_check
  0.04%  hackbench  [kernel.kallsyms]  [k] ftrace_graph_caller

See that the ftrace_caller took up more time than the ftrace_graph_caller
did.

With this patch:

  0.05%  hackbench  [kernel.kallsyms]  [k] __buffer_unlock_commit
  0.04%  hackbench  [kernel.kallsyms]  [k] call_filter_check_discard
  0.04%  hackbench  [kernel.kallsyms]  [k] ftrace_graph_caller
  0.04%  hackbench  [kernel.kallsyms]  [k] sched_clock

The ftrace_caller is no where to be found and ftrace_graph_caller still
takes up the same percentage.

Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
2014-07-01 07:13:31 -04:00
Linus Torvalds
d1fc98ba96 Merge branch 'x86/urgent' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull x86 fixes from Peter Anvin:
 "A pile of fixes related to the VDSO, EFI and 32-bit badsys handling.

  It turns out that removing the section headers from the VDSO breaks
  gdb, so this puts back most of them.  A very simple typo broke
  rt_sigreturn on some versions of glibc, with obviously disastrous
  results.  The rest is pretty much fixes for the corresponding fallout.

  The EFI fixes fixes an arithmetic overflow on 32-bit systems and
  quiets some build warnings.

  Finally, when invoking an invalid system call number on x86-32, we
  bypass a bunch of handling, which can make the audit code oops"

* 'x86/urgent' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
  efi-pstore: Fix an overflow on 32-bit builds
  x86/vdso: Error out in vdso2c if DT_RELA is present
  x86/vdso: Move DISABLE_BRANCH_PROFILING into the vdso makefile
  x86_32, signal: Fix vdso rt_sigreturn
  x86_32, entry: Do syscall exit work on badsys (CVE-2014-4508)
  x86/vdso: Create .build-id links for unstripped vdso files
  x86/vdso: Remove some redundant in-memory section headers
  x86/vdso: Improve the fake section headers
  x86/vdso2c: Use better macros for ELF bitness
  x86/vdso: Discard the __bug_table section
  efi: Fix compiler warnings (unused, const, type)
2014-06-27 18:43:03 -07:00
Aaron Tomlin
f3aca3d095 nmi: provide the option to issue an NMI back trace to every cpu but current
Sometimes it is preferred not to use the trigger_all_cpu_backtrace()
routine when one wants to avoid capturing a back trace for current.  For
instance if one was previously captured recently.

This patch provides a new routine namely
trigger_allbutself_cpu_backtrace() which offers the flexibility to issue
an NMI to every cpu but current and capture a back trace accordingly.

Patch x86 and sparc to support new routine.

[dzickus@redhat.com: add stub in #else clause]
[dzickus@redhat.com: don't print message in single processor case, wrap with get/put_cpu based on Oleg's suggestion]
[sfr@canb.auug.org.au: undo C99ism]
Signed-off-by: Aaron Tomlin <atomlin@redhat.com>
Signed-off-by: Don Zickus <dzickus@redhat.com>
Acked-by: David S. Miller <davem@davemloft.net>
Cc: Mateusz Guzik <mguzik@redhat.com>
Cc: Oleg Nesterov <oleg@redhat.com>
Signed-off-by: Stephen Rothwell <sfr@canb.auug.org.au>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2014-06-23 16:47:44 -07:00
Andy Lutomirski
6ba19a670c x86_32, signal: Fix vdso rt_sigreturn
This commit:

    commit 6f121e548f
    Author: Andy Lutomirski <luto@amacapital.net>
    Date:   Mon May 5 12:19:34 2014 -0700

        x86, vdso: Reimplement vdso.so preparation in build-time C

Contained this obvious typo:

-               restorer = VDSO32_SYMBOL(current->mm->context.vdso, rt_sigreturn);
+               restorer = current->mm->context.vdso +
+                       selected_vdso32->sym___kernel_sigreturn;

Note the missing 'rt_' in the new code.  Fix it.

Signed-off-by: Andy Lutomirski <luto@amacapital.net>
Link: http://lkml.kernel.org/r/1eb40ad923acde2e18357ef2832867432e70ac42.1403361010.git.luto@amacapital.net
Signed-off-by: H. Peter Anvin <hpa@linux.intel.com>
2014-06-23 15:54:42 -07:00
Andy Lutomirski
554086d85e x86_32, entry: Do syscall exit work on badsys (CVE-2014-4508)
The bad syscall nr paths are their own incomprehensible route
through the entry control flow.  Rearrange them to work just like
syscalls that return -ENOSYS.

This fixes an OOPS in the audit code when fast-path auditing is
enabled and sysenter gets a bad syscall nr (CVE-2014-4508).

This has probably been broken since Linux 2.6.27:
af0575bba0 i386 syscall audit fast-path

Cc: stable@vger.kernel.org
Cc: Roland McGrath <roland@redhat.com>
Reported-by: Toralf Förster <toralf.foerster@gmx.de>
Signed-off-by: Andy Lutomirski <luto@amacapital.net>
Link: http://lkml.kernel.org/r/e09c499eade6fc321266dd6b54da7beb28d6991c.1403558229.git.luto@amacapital.net
Signed-off-by: H. Peter Anvin <hpa@linux.intel.com>
2014-06-23 14:59:26 -07:00
Masami Hiramatsu
4cdf77a828 x86/kprobes: Fix build errors and blacklist context_track_user
This essentially reverts commit:

  ecd50f714c ("kprobes, x86: Call exception_enter after kprobes handled")

since it causes build errors with CONFIG_CONTEXT_TRACKING and
that has been made from misunderstandings;
context_track_user_*() don't involve much in interrupt context,
it just returns if in_interrupt() is true.

Instead of changing the do_debug/int3(), this just adds
context_track_user_*() to kprobes blacklist, since those are
still can be called right before kprobes handles int3 and debug
exceptions, and probing those will cause an infinite loop.

Reported-by: Frederic Weisbecker <fweisbec@gmail.com>
Signed-off-by: Masami Hiramatsu <masami.hiramatsu.pt@hitachi.com>
Cc: Borislav Petkov <bp@suse.de>
Cc: Kees Cook <keescook@chromium.org>
Cc: Jiri Kosina <jkosina@suse.cz>
Cc: Rusty Russell <rusty@rustcorp.com.au>
Cc: Steven Rostedt <rostedt@goodmis.org>
Cc: Seiji Aguchi <seiji.aguchi@hds.com>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Kees Cook <keescook@chromium.org>
Link: http://lkml.kernel.org/r/20140614064711.7865.45957.stgit@kbuild-fedora.novalocal
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2014-06-14 09:07:44 +02:00
Linus Torvalds
71998d1be4 Merge branch 'x86-irq-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull x86 irq fixes from Ingo Molnar:
 "Two changes: a cpu-hotplug/irq race fix, plus a HyperV related fix"

* 'x86-irq-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
  x86/irq: Fix fixup_irqs() error handling
  x86, irq, pic: Probe for legacy PIC and set legacy_pic appropriately
2014-06-12 20:03:47 -07:00
Linus Torvalds
3737a12761 Merge branch 'perf-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull more perf updates from Ingo Molnar:
 "A second round of perf updates:

   - wide reaching kprobes sanitization and robustization, with the hope
     of fixing all 'probe this function crashes the kernel' bugs, by
     Masami Hiramatsu.

   - uprobes updates from Oleg Nesterov: tmpfs support, corner case
     fixes and robustization work.

   - perf tooling updates and fixes from Jiri Olsa, Namhyung Ki, Arnaldo
     et al:
        * Add support to accumulate hist periods (Namhyung Kim)
        * various fixes, refactorings and enhancements"

* 'perf-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (101 commits)
  perf: Differentiate exec() and non-exec() comm events
  perf: Fix perf_event_comm() vs. exec() assumption
  uprobes/x86: Rename arch_uprobe->def to ->defparam, minor comment updates
  perf/documentation: Add description for conditional branch filter
  perf/x86: Add conditional branch filtering support
  perf/tool: Add conditional branch filter 'cond' to perf record
  perf: Add new conditional branch filter 'PERF_SAMPLE_BRANCH_COND'
  uprobes: Teach copy_insn() to support tmpfs
  uprobes: Shift ->readpage check from __copy_insn() to uprobe_register()
  perf/x86: Use common PMU interrupt disabled code
  perf/ARM: Use common PMU interrupt disabled code
  perf: Disable sampled events if no PMU interrupt
  perf: Fix use after free in perf_remove_from_context()
  perf tools: Fix 'make help' message error
  perf record: Fix poll return value propagation
  perf tools: Move elide bool into perf_hpp_fmt struct
  perf tools: Remove elide setup for SORT_MODE__MEMORY mode
  perf tools: Fix "==" into "=" in ui_browser__warning assignment
  perf tools: Allow overriding sysfs and proc finding with env var
  perf tools: Consider header files outside perf directory in tags target
  ...
2014-06-12 19:18:49 -07:00
Linus Torvalds
682b7c1c8e Merge branch 'drm-next' of git://people.freedesktop.org/~airlied/linux
Pull drm updates from Dave Airlie:
 "This is the main drm merge window pull request, changes all over the
  place, mostly normal levels of churn.

  Highlights:

  Core drm:
     More cleanups, fix race on connector/encoder naming, docs updates,
     object locking rework in prep for atomic modeset

  i915:
     mipi DSI support, valleyview power fixes, cursor size fixes,
     execlist refactoring, vblank improvements, userptr support, OOM
     handling improvements

  radeon:
     GPUVM tuning and large page size support, gart fixes, deep color
     HDMI support, HDMI audio cleanups

  nouveau:
     - displayport rework should fix lots of issues
     - initial gk20a support
     - gk110b support
     - gk208 fixes

  exynos:
     probe order fixes, HDMI changes, IPP consolidation

  msm:
     debugfs updates, misc fixes

  ast:
     ast2400 support, sync with UMS driver

  tegra:
     cleanups, hdmi + hw cursor for Tegra 124.

  panel:
     fixes existing panels add some new ones.

  ipuv3:
     moved from staging to drivers/gpu"

* 'drm-next' of git://people.freedesktop.org/~airlied/linux: (761 commits)
  drm/nouveau/disp/dp: fix tmds passthrough on dp connector
  drm/nouveau/dp: probe dpcd to determine connectedness
  drm/nv50-: trigger update after all connectors disabled
  drm/nv50-: prepare for attaching a SOR to multiple heads
  drm/gf119-/disp: fix debug output on update failure
  drm/nouveau/disp/dp: make use of postcursor when its available
  drm/g94-/disp/dp: take max pullup value across all lanes
  drm/nouveau/bios/dp: parse lane postcursor data
  drm/nouveau/dp: fix support for dpms
  drm/nouveau: register a drm_dp_aux channel for each dp connector
  drm/g94-/disp: add method to power-off dp lanes
  drm/nouveau/disp/dp: maintain link in response to hpd signal
  drm/g94-/disp: bash and wait for something after changing lane power regs
  drm/nouveau/disp/dp: split link config/power into two steps
  drm/nv50/disp: train PIOR-attached DP from second supervisor
  drm/nouveau/disp/dp: make use of existing output data for link training
  drm/gf119/disp: start removing direct vbios parsing from supervisor
  drm/nv50/disp: start removing direct vbios parsing from supervisor
  drm/nouveau/disp/dp: maintain receiver caps in response to hpd signal
  drm/nouveau/disp/dp: create subclass for dp outputs
  ...
2014-06-12 11:32:30 -07:00
Linus Torvalds
214b931320 Lots of tweaks, small fixes, optimizations, and some helper functions
to help out the rest of the kernel to ease their use of trace events.
 
 The big change for this release is the allowing of other tracers,
 such as the latency tracers, to be used in the trace instances and allow
 for function or function graph tracing to be in the top level
 simultaneously.
 -----BEGIN PGP SIGNATURE-----
 Version: GnuPG v1
 
 iQEcBAABAgAGBQJTlbUqAAoJEKQekfcNnQGuP+8H+wTBG06beHsqe6XcaeXcKNkt
 Mimm0O04oQdw89CBWeJvXyOwRTtiN4M/4hxHXBTDtChxM9oUyWw1o0IpSMMuQ16O
 w9r3DfC8e1air+ufEuYWM0QNtyzHi8EfDSNia55ON5jvtkCZTXOEKZD+n8M9w28p
 I7PVgr0PDztsCpethCpg0M8beK9zuQPWMzsHAQCsKI06Xl5z33kPIJR15Exh+Kr1
 uVVTZW7JFVAPuSnteLSIx9pN6OjsVGzOZCljg+O+9/v/02u5nkMiS2nURxae86kg
 RTSiRYT6Hvl/MCBhdss/w5kgSk6BYiZ0hXbLtwetvre+vQrOR5CnDw2DxZ7e+gU=
 =oudH
 -----END PGP SIGNATURE-----

Merge tag 'trace-3.16' of git://git.kernel.org/pub/scm/linux/kernel/git/rostedt/linux-trace

Pull tracing updates from Steven Rostedt:
 "Lots of tweaks, small fixes, optimizations, and some helper functions
  to help out the rest of the kernel to ease their use of trace events.

  The big change for this release is the allowing of other tracers, such
  as the latency tracers, to be used in the trace instances and allow
  for function or function graph tracing to be in the top level
  simultaneously"

* tag 'trace-3.16' of git://git.kernel.org/pub/scm/linux/kernel/git/rostedt/linux-trace: (44 commits)
  tracing: Fix memory leak on instance deletion
  tracing: Fix leak of ring buffer data when new instances creation fails
  tracing/kprobes: Avoid self tests if tracing is disabled on boot up
  tracing: Return error if ftrace_trace_arrays list is empty
  tracing: Only calculate stats of tracepoint benchmarks for 2^32 times
  tracing: Convert stddev into u64 in tracepoint benchmark
  tracing: Introduce saved_cmdlines_size file
  tracing: Add __get_dynamic_array_len() macro for trace events
  tracing: Remove unused variable in trace_benchmark
  tracing: Eliminate double free on failure of allocation on boot up
  ftrace/x86: Call text_ip_addr() instead of the duplicated code
  tracing: Print max callstack on stacktrace bug
  tracing: Move locking of trace_cmdline_lock into start/stop seq calls
  tracing: Try again for saved cmdline if failed due to locking
  tracing: Have saved_cmdlines use the seq_read infrastructure
  tracing: Add tracepoint benchmark tracepoint
  tracing: Print nasty banner when trace_printk() is in use
  tracing: Add funcgraph_tail option to print function name after closing braces
  tracing: Eliminate duplicate TRACE_GRAPH_PRINT_xx defines
  tracing: Add __bitmask() macro to trace events to cpumasks and other bitmasks
  ...
2014-06-09 16:39:15 -07:00
Linus Torvalds
3f17ea6dea Merge branch 'next' (accumulated 3.16 merge window patches) into master
Now that 3.15 is released, this merges the 'next' branch into 'master',
bringing us to the normal situation where my 'master' branch is the
merge window.

* accumulated work in next: (6809 commits)
  ufs: sb mutex merge + mutex_destroy
  powerpc: update comments for generic idle conversion
  cris: update comments for generic idle conversion
  idle: remove cpu_idle() forward declarations
  nbd: zero from and len fields in NBD_CMD_DISCONNECT.
  mm: convert some level-less printks to pr_*
  MAINTAINERS: adi-buildroot-devel is moderated
  MAINTAINERS: add linux-api for review of API/ABI changes
  mm/kmemleak-test.c: use pr_fmt for logging
  fs/dlm/debug_fs.c: replace seq_printf by seq_puts
  fs/dlm/lockspace.c: convert simple_str to kstr
  fs/dlm/config.c: convert simple_str to kstr
  mm: mark remap_file_pages() syscall as deprecated
  mm: memcontrol: remove unnecessary memcg argument from soft limit functions
  mm: memcontrol: clean up memcg zoneinfo lookup
  mm/memblock.c: call kmemleak directly from memblock_(alloc|free)
  mm/mempool.c: update the kmemleak stack trace for mempool allocations
  lib/radix-tree.c: update the kmemleak stack trace for radix tree allocations
  mm: introduce kmemleak_update_trace()
  mm/kmemleak.c: use %u to print ->checksum
  ...
2014-06-08 11:31:16 -07:00
Linus Torvalds
bb077d6006 Revert "x86/smpboot: Initialize secondary CPU only if master CPU will wait for it"
This reverts commit 3e1a878b7c.

It came in very late, and already has one reported failure: Sitsofe
reports that the current tree fails to boot on his EeePC, and bisected
it down to this.  Rather than waste time trying to figure out what's
wrong, just revert it.

Reported-by: Sitsofe Wheeler <sitsofe@gmail.com>
Cc: Igor Mammedov <imammedo@redhat.com>
Cc: Toshi Kani <toshi.kani@hp.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Acked-by: Ingo Molnar <mingo@kernel.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2014-06-08 10:09:49 -07:00
Ingo Molnar
ec00010972 Merge branch 'perf/urgent' into perf/core, to resolve conflict and to prepare for new patches
Conflicts:
	arch/x86/kernel/traps.c

Signed-off-by: Ingo Molnar <mingo@kernel.org>
2014-06-06 07:55:06 +02:00
Linus Torvalds
a0abcf2e8f Merge branch 'x86/vdso' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip into next
Pull x86 cdso updates from Peter Anvin:
 "Vdso cleanups and improvements largely from Andy Lutomirski.  This
  makes the vdso a lot less ''special''"

* 'x86/vdso' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
  x86/vdso, build: Make LE access macros clearer, host-safe
  x86/vdso, build: Fix cross-compilation from big-endian architectures
  x86/vdso, build: When vdso2c fails, unlink the output
  x86, vdso: Fix an OOPS accessing the HPET mapping w/o an HPET
  x86, mm: Replace arch_vma_name with vm_ops->name for vsyscalls
  x86, mm: Improve _install_special_mapping and fix x86 vdso naming
  mm, fs: Add vm_ops->name as an alternative to arch_vma_name
  x86, vdso: Fix an OOPS accessing the HPET mapping w/o an HPET
  x86, vdso: Remove vestiges of VDSO_PRELINK and some outdated comments
  x86, vdso: Move the vvar and hpet mappings next to the 64-bit vDSO
  x86, vdso: Move the 32-bit vdso special pages after the text
  x86, vdso: Reimplement vdso.so preparation in build-time C
  x86, vdso: Move syscall and sysenter setup into kernel/cpu/common.c
  x86, vdso: Clean up 32-bit vs 64-bit vdso params
  x86, mm: Ensure correct alignment of the fixmap
2014-06-05 08:05:29 -07:00
Linus Torvalds
2071b3e34f Merge branch 'x86/espfix' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip into next
Pull x86-64 espfix changes from Peter Anvin:
 "This is the espfix64 code, which fixes the IRET information leak as
  well as the associated functionality problem.  With this code applied,
  16-bit stack segments finally work as intended even on a 64-bit
  kernel.

  Consequently, this patchset also removes the runtime option that we
  added as an interim measure.

  To help the people working on Linux kernels for very small systems,
  this patchset also makes these compile-time configurable features"

* 'x86/espfix' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
  Revert "x86-64, modify_ldt: Make support for 16-bit segments a runtime option"
  x86, espfix: Make it possible to disable 16-bit support
  x86, espfix: Make espfix64 a Kconfig option, fix UML
  x86, espfix: Fix broken header guard
  x86, espfix: Move espfix definitions into a separate header file
  x86-32, espfix: Remove filter for espfix32 due to race
  x86-64, espfix: Don't leak bits 31:16 of %esp returning to 16-bit stack
2014-06-05 07:46:15 -07:00
Ingo Molnar
8c6e549a44 Merge branch 'uprobes/core' of git://git.kernel.org/pub/scm/linux/kernel/git/oleg/misc into perf/core
Pull uprobes tmpfs support patches from Oleg Nesterov.

Signed-off-by: Ingo Molnar <mingo@kernel.org>
2014-06-05 16:35:30 +02:00
Igor Mammedov
3e1a878b7c x86/smpboot: Initialize secondary CPU only if master CPU will wait for it
Hang is observed on virtual machines during CPU hotplug,
especially in big guests with many CPUs. (It reproducible
more often if host is over-committed).

It happens because master CPU gives up waiting on
secondary CPU and allows it to run wild. As result
AP causes locking or crashing system. For example
as described here:

   https://lkml.org/lkml/2014/3/6/257

If master CPU have sent STARTUP IPI successfully,
and AP signalled to master CPU that it's ready
to start initialization, make master CPU wait
indefinitely till AP is onlined.
To ensure that AP won't ever run wild, make it
wait at early startup till master CPU confirms its
intention to wait for AP. If AP doesn't respond in 10
seconds, the master CPU will timeout and cancel
AP onlining.

Signed-off-by: Igor Mammedov <imammedo@redhat.com>
Acked-by: Toshi Kani <toshi.kani@hp.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Link: http://lkml.kernel.org/r/1401975765-22328-4-git-send-email-imammedo@redhat.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2014-06-05 16:33:08 +02:00
Igor Mammedov
feef1e8ecb x86/smpboot: Log error on secondary CPU wakeup failure at ERR level
If system is running without debug level logging,
it will not log error if do_boot_cpu() failed to
wakeup AP. It may lead to silent AP bringup
failures at boot time.
Change message level to KERN_ERR to make error
visible to user as it's done on other architectures.

Signed-off-by: Igor Mammedov <imammedo@redhat.com>
Acked-by: Toshi Kani <toshi.kani@hp.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Link: http://lkml.kernel.org/r/1401975765-22328-3-git-send-email-imammedo@redhat.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2014-06-05 16:33:07 +02:00
Igor Mammedov
89f898c1e1 x86: Fix list/memory corruption on CPU hotplug
currently if AP wake up is failed, master CPU marks AP as not
present in do_boot_cpu() by calling set_cpu_present(cpu, false).
That leads to following list corruption on the next physical CPU
hotplug:

[  418.107336] WARNING: CPU: 1 PID: 45 at lib/list_debug.c:33 __list_add+0xbe/0xd0()
[  418.115268] list_add corruption. prev->next should be next (ffff88003dc57600), but was ffff88003e20c3a0. (prev=ffff88003e20c3a0).
[  418.123693] Modules linked in: nf_conntrack_netbios_ns nf_conntrack_broadcast ipt_MASQUERADE ip6t_REJECT ipt_REJECT cfg80211 xt_conntrack rfkill ee
[  418.138979] CPU: 1 PID: 45 Comm: kworker/u10:1 Not tainted 3.14.0-rc6+ #387
[  418.149989] Hardware name: Red Hat KVM, BIOS 0.5.1 01/01/2007
[  418.165750] Workqueue: kacpi_hotplug acpi_hotplug_work_fn
[  418.166433]  0000000000000021 ffff880038ca7988 ffffffff8159b22d 0000000000000021
[  418.176460]  ffff880038ca79d8 ffff880038ca79c8 ffffffff8106942c ffff880038ca79e8
[  418.177453]  ffff88003e20c3a0 ffff88003dc57600 ffff88003e20c3a0 00000000ffffffea
[  418.178445] Call Trace:
[  418.185811]  [<ffffffff8159b22d>] dump_stack+0x49/0x5c
[  418.186440]  [<ffffffff8106942c>] warn_slowpath_common+0x8c/0xc0
[  418.187192]  [<ffffffff81069516>] warn_slowpath_fmt+0x46/0x50
[  418.191231]  [<ffffffff8136ef51>] ? acpi_ns_get_node+0xb7/0xc7
[  418.193889]  [<ffffffff812f796e>] __list_add+0xbe/0xd0
[  418.196649]  [<ffffffff812e2aa9>] kobject_add_internal+0x79/0x200
[  418.208610]  [<ffffffff812e2e18>] kobject_add_varg+0x38/0x60
[  418.213831]  [<ffffffff812e2ef4>] kobject_add+0x44/0x70
[  418.229961]  [<ffffffff813e2c60>] device_add+0xd0/0x550
[  418.234991]  [<ffffffff813f0e95>] ? pm_runtime_init+0xe5/0xf0
[  418.250226]  [<ffffffff813e32be>] device_register+0x1e/0x30
[  418.255296]  [<ffffffff813e82a3>] register_cpu+0xe3/0x130
[  418.266539]  [<ffffffff81592be5>] arch_register_cpu+0x65/0x150
[  418.285845]  [<ffffffff81355c0d>] acpi_processor_hotadd_init+0x5a/0x9b
...
Which is caused by the fact that generic_processor_info() allocates
logical CPU id by calling:

 cpu = cpumask_next_zero(-1, cpu_present_mask);

which returns id of previously failed to wake up CPU, since its
bit is cleared by do_boot_cpu() and as result register_cpu()
tries to register another CPU with the same id as already
present but failed to be onlined CPU.

Taking in account that AP will not do anything if master CPU
failed to wake it up, there is no reason to mark that AP as not
present and break next cpu hotplug attempts. As a side effect of
not marking AP as not present, user would be allowed to online
it again later.

Also fix memory corruption in acpi_unmap_lsapic()

if during CPU hotplug master CPU failed to wake up AP
it set percpu x86_cpu_to_apicid to BAD_APICID=0xFFFF for AP.

However following attempt to unplug that CPU will lead to
out of bound write access to __apicid_to_node[] which is
32768 items long on x86_64 kernel.

So with above fix of cpu_present_mask make sure that a present
CPU has a valid APIC ID by not setting x86_cpu_to_apicid
to BAD_APICID in do_boot_cpu() on failure and allow
acpi_processor_remove()->acpi_unmap_lsapic() cleanly remove CPU.

Signed-off-by: Igor Mammedov <imammedo@redhat.com>
Acked-by: Toshi Kani <toshi.kani@hp.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Link: http://lkml.kernel.org/r/1401975765-22328-2-git-send-email-imammedo@redhat.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2014-06-05 16:33:07 +02:00
Oleg Nesterov
5cdb76d6f0 uprobes/x86: Rename arch_uprobe->def to ->defparam, minor comment updates
Purely cosmetic, no changes in .o,

1. As Jim pointed out arch_uprobe->def looks ambiguous, rename it to
   ->defparam.

2. Add the comment into default_post_xol_op() to explain "regs->sp +=".

3. Remove the stale part of the comment in arch_uprobe_analyze_insn().

Suggested-by: Jim Keniston <jkenisto@us.ibm.com>
Reviewed-by: Masami Hiramatsu <masami.hiramatsu.pt@hitachi.com>
Acked-by: Srikar Dronamraju <srikar@linux.vnet.ibm.com>
Signed-off-by: Oleg Nesterov <oleg@redhat.com>
2014-06-05 16:21:57 +02:00
Anshuman Khandual
37548914fb perf/x86: Add conditional branch filtering support
This patch adds conditional branch filtering support,
enabling it for PERF_SAMPLE_BRANCH_COND in perf branch
stack sampling framework by utilizing an available
software filter X86_BR_JCC.

Signed-off-by: Anshuman Khandual <khandual@linux.vnet.ibm.com>
Reviewed-by: Stephane Eranian <eranian@google.com>
Reviewed-by: Andi Kleen <ak@linux.intel.com>
Signed-off-by: Peter Zijlstra <peterz@infradead.org>
Cc: mpe@ellerman.id.au
Cc: benh@kernel.crashing.org
Cc: Arnaldo Carvalho de Melo <acme@kernel.org>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Link: http://lkml.kernel.org/r/1400743210-32289-3-git-send-email-khandual@linux.vnet.ibm.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2014-06-05 12:30:23 +02:00
Vince Weaver
c184c980de perf/x86: Use common PMU interrupt disabled code
Make the x86 perf code use the new common PMU interrupt disabled code.

Typically most x86 machines have working PMU interrupts, although
some older p6-class machines had this problem.

Signed-off-by: Vince Weaver <vincent.weaver@maine.edu>
Signed-off-by: Peter Zijlstra <peterz@infradead.org>
Cc: Arnaldo Carvalho de Melo <acme@kernel.org>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Link: http://lkml.kernel.org/r/alpine.DEB.2.10.1405161715560.11099@vincent-weaver-1.umelst.maine.edu
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2014-06-05 12:30:03 +02:00
Dave Airlie
8d4ad9d4bb Merge commit '9e9a928eed8796a0a1aaed7e0b676db86ba84594' into drm-next
Merge drm-fixes into drm-next.

Both i915 and radeon need this done for later patches.

Conflicts:
	drivers/gpu/drm/drm_crtc_helper.c
	drivers/gpu/drm/i915/i915_drv.h
	drivers/gpu/drm/i915/i915_gem.c
	drivers/gpu/drm/i915/i915_gem_execbuffer.c
	drivers/gpu/drm/i915/i915_gem_gtt.c
2014-06-05 20:28:59 +10:00
Ingo Molnar
10b0256496 Merge branch 'perf/kprobes' into perf/core
Conflicts:
	arch/x86/kernel/traps.c

The kprobes enhancements are fully cooked, ship them upstream.

Signed-off-by: Ingo Molnar <mingo@kernel.org>
2014-06-05 12:26:50 +02:00
Ingo Molnar
c56d34064b Merge branch 'perf/uprobes' into perf/core
These bits from Oleg are fully cooked, ship them to Linus.

Signed-off-by: Ingo Molnar <mingo@kernel.org>
2014-06-05 12:26:27 +02:00
Linus Torvalds
00170fdd08 Merge branch 'akpm' (patchbomb from Andrew) into next
Merge misc updates from Andrew Morton:

 - a few fixes for 3.16.  Cc'ed to stable so they'll get there somehow.

 - various misc fixes and cleanups

 - most of the ocfs2 queue.  Review is slow...

 - most of MM.  The MM queue is pretty huge this time, but not much in
   the way of feature work.

 - some tweaks under kernel/

 - printk maintenance work

 - updates to lib/

 - checkpatch updates

 - tweaks to init/

* emailed patches from Andrew Morton <akpm@linux-foundation.org>: (276 commits)
  fs/autofs4/dev-ioctl.c: add __init to autofs_dev_ioctl_init
  fs/ncpfs/getopt.c: replace simple_strtoul by kstrtoul
  init/main.c: remove an ifdef
  kthreads: kill CLONE_KERNEL, change kernel_thread(kernel_init) to avoid CLONE_SIGHAND
  init/main.c: add initcall_blacklist kernel parameter
  init/main.c: don't use pr_debug()
  fs/binfmt_flat.c: make old_reloc() static
  fs/binfmt_elf.c: fix bool assignements
  fs/efs: convert printk(KERN_DEBUG to pr_debug
  fs/efs: add pr_fmt / use __func__
  fs/efs: convert printk to pr_foo()
  scripts/checkpatch.pl: device_initcall is not the only __initcall substitute
  checkpatch: check stable email address
  checkpatch: warn on unnecessary void function return statements
  checkpatch: prefer kstrto<foo> to sscanf(buf, "%<lhuidx>", &bar);
  checkpatch: add warning for kmalloc/kzalloc with multiply
  checkpatch: warn on #defines ending in semicolon
  checkpatch: make --strict a default for files in drivers/net and net/
  checkpatch: always warn on missing blank line after variable declaration block
  checkpatch: fix wildcard DT compatible string checking
  ...
2014-06-04 16:55:13 -07:00
Borislav Petkov
a8fe19ebfb kernel/printk: use symbolic defines for console loglevels
... instead of naked numbers.

Stuff in sysrq.c used to set it to 8 which is supposed to mean above
default level so set it to DEBUG instead as we're terminating/killing all
tasks and we want to be verbose there.

Also, correct the check in x86_64_start_kernel which should be >= as
we're clearly issuing the string there for all debug levels, not only
the magical 10.

Signed-off-by: Borislav Petkov <bp@suse.de>
Acked-by: Kees Cook <keescook@chromium.org>
Acked-by: Randy Dunlap <rdunlap@infradead.org>
Cc: Joe Perches <joe@perches.com>
Cc: Valdis Kletnieks <Valdis.Kletnieks@vt.edu>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2014-06-04 16:54:17 -07:00