Commit Graph

80375 Commits

Author SHA1 Message Date
Michael Neuling
29ce3c5073 powerpc: Add isync to copy_and_flush
In __after_prom_start we copy the kernel down to zero in two calls to
copy_and_flush.  After the first call (copy from 0 to copy_to_here:)
we jump to the newly copied code soon after.

Unfortunately there's no isync between the copy of this code and the
jump to it.  Hence it's possible that stale instructions could still be
in the icache or pipeline before we branch to it.

We've seen this on real machines and it's results in no console output
after:
  calling quiesce...
  returning from prom_init

The below adds an isync to ensure that the copy and flushing has
completed before any branching to the new instructions occurs.

Signed-off-by: Michael Neuling <mikey@neuling.org>
CC: <stable@vger.kernel.org>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2013-04-26 16:08:17 +10:00
Michael Neuling
2171364d1a powerpc: Add HWCAP2 aux entry
We are currently out of free bits in AT_HWCAP. With POWER8, we have
several hardware features that we need to advertise.

Tested on POWER and x86.

Signed-off-by: Michael Neuling <michael@neuling.org>
Signed-off-by: Nishanth Aravamudan <nacc@linux.vnet.ibm.com>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2013-04-26 16:08:16 +10:00
Benjamin Herrenschmidt
6263fb3bd7 powerpc/powernv: Fix missing Kconfig dependency for MSIs
We need PPC_MSI_BITMAP support

Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2013-04-24 15:15:33 +10:00
Benjamin Herrenschmidt
234d15def9 Merge remote-tracking branch 'origin/master' into next
Merge upstream to get the audit fixes
2013-04-24 14:43:36 +10:00
Michael Ellerman
6747e83235 powerpc/spufs: Initialise inode->i_ino in spufs_new_inode()
In commit 85fe402 (fs: do not assign default i_ino in new_inode), the
initialisation of i_ino was removed from new_inode() and pushed down
into the callers. However spufs_new_inode() was not updated.

This exhibits as no files appearing in /spu, because all our dirents
have a zero inode, which readdir() seems to dislike.

Cc: stable@vger.kernel.org
Signed-off-by: Michael Ellerman <michael@ellerman.id.au>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2013-04-24 14:22:32 +10:00
Vasant Hegde
fd2d37c854 powerpc/rtas_flash: New return code to indicate FW entitlement expiry
Add new return code to rtas_flash to indicate firmware entitlement
expiry. Strictly we don't need this update. But to keep it in sync
with PAPR, this was added.

Signed-off-by: Ananth N Mavinakayanahalli <ananth@in.ibm.com>
Signed-off-by: Vasant Hegde <hegdevasant@linux.vnet.ibm.com>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2013-04-24 14:22:31 +10:00
Vasant Hegde
f51005d7a4 powerpc/rtas_flash: Update return token comments
Add proper comment to ibm,validate-flash-image RTAS call
update result tokens.

Note: Only comment section is modified, no code change.

Signed-off-by: Vasant Hegde <hegdevasant@linux.vnet.ibm.com>
Signed-off-by: Ananth N Mavinakayanahalli <ananth@in.ibm.com>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2013-04-24 14:22:31 +10:00
Michael Ellerman
51e0eaf98d powerpc/cell: Only iterate over online nodes in cbe_init_pm_irq()
None of the cell platforms support CPU hotplug, so we should iterate
only over online nodes when setting PMU interrupts.

This also fixes a warning during boot when NODES_SHIFT is large enough:

WARNING: at /scratch/michael/src/kmk/linus/kernel/irq/irqdomain.c:766
...
NIP [c0000000000db278] .irq_linear_revmap+0x30/0x58
LR [c0000000000dc2a0] .irq_create_mapping+0x38/0x1a8
Call Trace:
[c0000003fc9c3af0] [c0000000000dc2a0] .irq_create_mapping+0x38/0x1a8 (unreliable)
[c0000003fc9c3b80] [c000000000655c1c] .__machine_initcall_cell_cbe_init_pm_irq+0x84/0x158
[c0000003fc9c3c20] [c00000000000afb4] .do_one_initcall+0x5c/0x1e0
[c0000003fc9c3cd0] [c000000000644580] .kernel_init_freeable+0x238/0x328
[c0000003fc9c3db0] [c00000000000b784] .kernel_init+0x1c/0x120
[c0000003fc9c3e30] [c000000000009fb8] .ret_from_kernel_thread+0x64/0xac

This is caused by us overflowing our linear revmap because we're
requesting too many interrupts.

Reported-by: Dennis Schridde <devurandom@gmx.net>
Signed-off-by: Michael Ellerman <michael@ellerman.id.au>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2013-04-24 14:22:30 +10:00
Benjamin Herrenschmidt
973e2abd15 Merge remote-tracking branch 'mpe/master' into next
Merge the tentative powerpc-next branch maintained by Michael
while I was on vacation
2013-04-24 14:11:20 +10:00
Chen Gang
5676005acf powerpc/pseries/lparcfg: Fix possible overflow are more than 1026
need set '\0' for 'local_buffer'.

SPLPAR_MAXLENGTH is 1026, RTAS_DATA_BUF_SIZE is 4096. so the contents of
rtas_data_buf may truncated in memcpy.

if contents are really truncated.
  the splpar_strlen is more than 1026. the next while loop checking will
  not find the end of buffer. that will cause memory access violation.

Signed-off-by: Chen Gang <gang.chen@asianux.com>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2013-04-23 16:39:35 +10:00
Oleg Nesterov
28d170abad ptrace/powerpc: Don't flush_ptrace_hw_breakpoint() on fork()
arch_dup_task_struct() does flush_ptrace_hw_breakpoint(src), this
destroys the parent's breakpoints for no reason. We should clear
child->thread.ptrace_bps[] copied by dup_task_struct() instead.

Signed-off-by: Oleg Nesterov <oleg@redhat.com>
Acked-by: Michael Neuling <mikey@neuling.org>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2013-04-23 16:05:06 +10:00
Adhemerval Zanella
fcb41a2030 powerpc: Add VDSO version of time
On 04/18/2013 07:38 PM, Anton Blanchard wrote:
> Since you are only reading one long you shouldn't need to check the
> update count and loop, you will always see a consistent value. The
> system call version of time() just does an unprotected load for example.

Fixed.

> With the above change and with Michael's comments covered (decent
> changelog entry and Signed-off-by):
>
> Acked-by: Anton Blanchard <anton@samba.org>

Thanks for the review, below the updated patch:

From: Adhemerval Zanella <azanella@linux.vnet.ibm.com>

This patch implement the time syscall as vDSO. The performance speedups
are:

Baseline PPC32: 380 nsec
Baseline PPC64: 350 nsec
vdso PPC32:      20 nsec
vsdo PPC64:      20 nsec

Tested on 64 bit build with both 32 bit and 64 bit userland.

Acked-by: Anton Blanchard <anton@samba.org>
Signed-off-by: Adhemerval Zanella <azanella@linux.vnet.ibm.com>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2013-04-23 16:05:05 +10:00
Linus Torvalds
3125929454 Merge branch 'perf-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull perf fixes from Ingo Molnar:
 "Misc fixes"

* 'perf-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
  perf/x86: Fix offcore_rsp valid mask for SNB/IVB
  perf: Treat attr.config as u64 in perf_swevent_init()
2013-04-21 10:25:42 -07:00
Linus Torvalds
830ac8524f Merge branch 'x86-kdump-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull kdump fixes from Peter Anvin:
 "The kexec/kdump people have found several problems with the support
  for loading over 4 GiB that was introduced in this merge cycle.  This
  is partly due to a number of design problems inherent in the way the
  various pieces of kdump fit together (it is pretty horrifically manual
  in many places.)

  After a *lot* of iterations this is the patchset that was agreed upon,
  but of course it is now very late in the cycle.  However, because it
  changes both the syntax and semantics of the crashkernel option, it
  would be desirable to avoid a stable release with the broken
  interfaces."

I'm not happy with the timing, since originally the plan was to release
the final 3.9 tomorrow.  But apparently I'm doing an -rc8 instead...

* 'x86-kdump-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
  kexec: use Crash kernel for Crash kernel low
  x86, kdump: Change crashkernel_high/low= to crashkernel=,high/low
  x86, kdump: Retore crashkernel= to allocate under 896M
  x86, kdump: Set crashkernel_low automatically
2013-04-20 18:40:36 -07:00
Linus Torvalds
db93f8b420 Merge branch 'x86-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull x86 fixes from Peter Anvin:
 "Three groups of fixes:

   1. Make sure we don't execute the early microcode patching if family
      < 6, since it would touch MSRs which don't exist on those
      families, causing crashes.

   2. The Xen partial emulation of HyperV can be dealt with more
      gracefully than just disabling the driver.

   3. More EFI variable space magic.  In particular, variables hidden
      from runtime code need to be taken into account too."

* 'x86-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
  x86, microcode: Verify the family before dispatching microcode patching
  x86, hyperv: Handle Xen emulation of Hyper-V more gracefully
  x86,efi: Implement efi_no_storage_paranoia parameter
  efi: Export efi_query_variable_store() for efivars.ko
  x86/Kconfig: Make EFI select UCS2_STRING
  efi: Distinguish between "remaining space" and actually used space
  efi: Pass boot services variable info to runtime code
  Move utf16 functions to kernel core and rename
  x86,efi: Check max_size only if it is non-zero.
  x86, efivars: firmware bug workarounds should be in platform code
2013-04-20 18:38:48 -07:00
Linus Torvalds
8c3a13c84b Merge branch 'fixes' of git://git.linaro.org/people/rmk/linux-arm
Pull ARM fixes from Russell King:
 "A set of fixes from various people - Will Deacon gets a prize for
  removing code this time around.  The biggest fix in this lot is
  sorting out the ARM740T mess.  The rest are relatively small fixes."

* 'fixes' of git://git.linaro.org/people/rmk/linux-arm:
  ARM: 7699/1: sched_clock: Add more notrace to prevent recursion
  ARM: 7698/1: perf: fix group validation when using enable_on_exec
  ARM: 7697/1: hw_breakpoint: do not use __cpuinitdata for dbg_cpu_pm_nb
  ARM: 7696/1: Fix kexec by setting outer_cache.inv_all for Feroceon
  ARM: 7694/1: ARM, TCM: initialize TCM in paging_init(), instead of setup_arch()
  ARM: 7692/1: iop3xx: move IOP3XX_PERIPHERAL_VIRT_BASE
  ARM: modules: don't export cpu_set_pte_ext when !MMU
  ARM: mm: remove broken condition check for v4 flushing
  ARM: mm: fix numerous hideous errors in proc-arm740.S
  ARM: cache: remove ARMv3 support code
  ARM: tlbflush: remove ARMv3 support
2013-04-20 18:38:06 -07:00
Linus Torvalds
851b3f3238 Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/sparc
Pull sparc fixes from David Miller:

 1) Fix race in sparc64 TLB shootdowns, we have to synchronize with the
    sibling cpus completing if we are passing them a reference via
    pointer to a data structure.

 2) Fix cleaning of bitmaps in sparc32, from Akinobu Mita.

 3) Fix various sparc header mistakes, some of which resulted in
    userland build breakage.  From Sam Ravnborg.

 4) Kill ghost declarations and defines missed when several bits of code
    got deleted recently.

* git://git.kernel.org/pub/scm/linux/kernel/git/davem/sparc:
  sparc64: Fix race in TLB batch processing.
  sparc: use asm-generic version of types.h
  bbc_i2c: fix section mismatch warning
  sparc: use generic headers
  sparc:cleanup unused code in smp_32.h
  sparc/iommu: fix typo s/265KB/256KB/
  sparc/srmmu: clear trailing edge of bitmap properly
  sparc:remove unused declaration smp_boot_cpus()
2013-04-20 18:23:08 -07:00
H. Peter Anvin
c0a9f451e4 Merge remote-tracking branch 'efi/urgent' into x86/urgent
Matt Fleming (1):
      x86, efivars: firmware bug workarounds should be in platform
      code

Matthew Garrett (3):
      Move utf16 functions to kernel core and rename
      efi: Pass boot services variable info to runtime code
      efi: Distinguish between "remaining space" and actually used
      space

Richard Weinberger (2):
      x86,efi: Check max_size only if it is non-zero.
      x86,efi: Implement efi_no_storage_paranoia parameter

Sergey Vlasov (2):
      x86/Kconfig: Make EFI select UCS2_STRING
      efi: Export efi_query_variable_store() for efivars.ko

Signed-off-by: H. Peter Anvin <hpa@linux.intel.com>
2013-04-19 17:09:03 -07:00
H. Peter Anvin
74c3e3fcf3 x86, microcode: Verify the family before dispatching microcode patching
For each CPU vendor that implements CPU microcode patching, there will
be a minimum family for which this is implemented.  Verify this
minimum level of support.

This can be done in the dispatch function or early in the application
functions.  Doing the latter turned out to be somewhat awkward because
of the ineviable split between the BSP and the AP paths, and rather
than pushing deep into the application functions, do this in
the dispatch function.

Reported-by: "Bryan O'Donoghue" <bryan.odonoghue.lkml@nexus-software.ie>
Suggested-by: Borislav Petkov <bp@alien8.de>
Cc: Fenghua Yu <fenghua.yu@intel.com>
Link: http://lkml.kernel.org/r/1366392183-4149-1-git-send-email-bryan.odonoghue.lkml@nexus-software.ie
2013-04-19 16:36:03 -07:00
David S. Miller
f36391d279 sparc64: Fix race in TLB batch processing.
As reported by Dave Kleikamp, when we emit cross calls to do batched
TLB flush processing we have a race because we do not synchronize on
the sibling cpus completing the cross call.

So meanwhile the TLB batch can be reset (tb->tlb_nr set to zero, etc.)
and either flushes are missed or flushes will flush the wrong
addresses.

Fix this by using generic infrastructure to synchonize on the
completion of the cross call.

This first required getting the flush_tlb_pending() call out from
switch_to() which operates with locks held and interrupts disabled.
The problem is that smp_call_function_many() cannot be invoked with
IRQs disabled and this is explicitly checked for with WARN_ON_ONCE().

We get the batch processing outside of locked IRQ disabled sections by
using some ideas from the powerpc port. Namely, we only batch inside
of arch_{enter,leave}_lazy_mmu_mode() calls.  If we're not in such a
region, we flush TLBs synchronously.

1) Get rid of xcall_flush_tlb_pending and per-cpu type
   implementations.

2) Do TLB batch cross calls instead via:

	smp_call_function_many()
		tlb_pending_func()
			__flush_tlb_pending()

3) Batch only in lazy mmu sequences:

	a) Add 'active' member to struct tlb_batch
	b) Define __HAVE_ARCH_ENTER_LAZY_MMU_MODE
	c) Set 'active' in arch_enter_lazy_mmu_mode()
	d) Run batch and clear 'active' in arch_leave_lazy_mmu_mode()
	e) Check 'active' in tlb_batch_add_one() and do a synchronous
           flush if it's clear.

4) Add infrastructure for synchronous TLB page flushes.

	a) Implement __flush_tlb_page and per-cpu variants, patch
	   as needed.
	b) Likewise for xcall_flush_tlb_page.
	c) Implement smp_flush_tlb_page() to invoke the cross-call.
	d) Wire up global_flush_tlb_page() to the right routine based
           upon CONFIG_SMP

5) It turns out that singleton batches are very common, 2 out of every
   3 batch flushes have only a single entry in them.

   The batch flush waiting is very expensive, both because of the poll
   on sibling cpu completeion, as well as because passing the tlb batch
   pointer to the sibling cpus invokes a shared memory dereference.

   Therefore, in flush_tlb_pending(), if there is only one entry in
   the batch perform a completely asynchronous global_flush_tlb_page()
   instead.

Reported-by: Dave Kleikamp <dave.kleikamp@oracle.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Acked-by: Dave Kleikamp <dave.kleikamp@oracle.com>
2013-04-19 17:26:26 -04:00
Stephen Boyd
cea15092f0 ARM: 7699/1: sched_clock: Add more notrace to prevent recursion
cyc_to_sched_clock() is called by sched_clock() and cyc_to_ns()
is called by cyc_to_sched_clock(). I suspect that some compilers
inline both of these functions into sched_clock() and so we've
been getting away without having a notrace marking. It seems that
my compiler isn't inlining cyc_to_sched_clock() though, so I'm
hitting a recursion bug when I enable the function graph tracer,
causing my system to crash. Marking these functions notrace fixes
it. Technically cyc_to_ns() doesn't need the notrace because it's
already marked inline, but let's just add it so that if we ever
remove inline from that function it doesn't blow up.

Signed-off-by: Stephen Boyd <sboyd@codeaurora.org>
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
2013-04-19 22:23:55 +01:00
Linus Torvalds
f068f5e158 ARM: arm-soc fixes for 3.9
Only one remaining fix for arm-soc platforms at this time, a small
 bugfix for cpu hotplug on highbank platforms that has become much
 easier to hit as of late. Details in the patch description, but it's
 small and well-contained and definitely impacts users of the platform,
 so 3.9 seems appropriate.
 -----BEGIN PGP SIGNATURE-----
 Version: GnuPG v1.4.11 (GNU/Linux)
 
 iQIcBAABAgAGBQJRcYGyAAoJEIwa5zzehBx3sYYP/iiongz+96eVHVxBMGMWHg0f
 BcTtHw2ef0h4vCDMNUXmPEmVu74hN5VMCutUd2qeKvckyC7gEIrbVnLFwej05RiA
 AFwIJ0A2pAjdvsAkRfTNXPpFs7bvZ+8r6cUJd4JGCP7IC8Zi4sj4dET1ZNEsaccD
 vMQ+Y8O3dvVu1lEFfGT87huppQgCr/jzc6O9oc3eHDZ242v8tVLS31PpuZe8Qt25
 8QvKKsY/UG82/aiz+ijlcddDJz132byNzrOvmY0DkcZ5ZMxbTnGUNFUqFszFkBYu
 FrtAyJyQEyUYwY7r6geogfgtj17mQzbq1/4azcApmDQHadBhVbXdDuTW0EPO63QC
 sQzf9JgWR66H5hOYeDp2ka1RbBh00k6byvh7T5adzgoJDtbHtJJ8OxW16OR/eoCQ
 umCZ2rQxAQCpw11qRnA8LDwnmujr5qXFMuj5NqepUaDCLbWAq2VWeNTicu0LMgCN
 RJZ7ifk94o+uCQETd0D2ZrZZtqvrbLtaLAgfa54PiMYecV0rRF44FUVDCIbUBRKq
 5ouN76JfvbOQutLj41nA/QBr0ATCAw4mPoxaeaIwie1G7lXvZvMRRkjKLxDhe/qs
 +3gZivBhQQuKEe3CYooLx3xoC/bs1VoMsyiRXa8YgdKsZbY0bIsQAUZp8dYjAgoW
 hYf3zRgRd69+k0uqWA1f
 =j3Hq
 -----END PGP SIGNATURE-----

Merge tag 'fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/arm/arm-soc

Pull ARM SoC fixes from Olof Johansson:
 "Only one remaining fix for arm-soc platforms at this time, a small
  bugfix for cpu hotplug on highbank platforms that has become much
  easier to hit as of late.

  Details in the patch description, but it's small and well-contained
  and definitely impacts users of the platform, so 3.9 seems
  appropriate."

* tag 'fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/arm/arm-soc:
  ARM: highbank: fix cache flush ordering for cpu hotplug
2013-04-19 11:38:36 -07:00
Rob Herring
73053d973d ARM: highbank: fix cache flush ordering for cpu hotplug
The L1 data cache flush needs to be after highbank_set_cpu_jump call which
pollutes the cache with the l2x0_lock. This causes other cores to deadlock
waiting for the l2x0_lock. Moving the flush of the entire data cache after
highbank_set_cpu_jump fixes the problem. Use flush_cache_louis instead of
flush_cache_all are that is sufficient to flush only the L1 data cache.
flush_cache_louis did not exist when highbank_cpu_die was originally
written.

With PL310 errata 769419 enabled, a wmb is inserted into idle which takes
the l2x0_lock. This makes the problem much more easily hit and causes
reset to hang.

Reported-by: Paolo Pisati <p.pisati@gmail.com>
Signed-off-by: Rob Herring <rob.herring@calxeda.com>
Signed-off-by: Olof Johansson <olof@lixom.net>
2013-04-18 09:37:46 -07:00
K. Y. Srinivasan
7eff7ded02 x86, hyperv: Handle Xen emulation of Hyper-V more gracefully
Install the Hyper-V specific interrupt handler only when needed. This would
permit us to get rid of the Xen check. Note that when the vmbus drivers invokes
the call to register its handler, we are sure to be running on Hyper-V.

Signed-off-by: K. Y. Srinivasan <kys@microsoft.com>
Link: http://lkml.kernel.org/r/1366299886-6399-1-git-send-email-kys@microsoft.com
Acked-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: H. Peter Anvin <hpa@linux.intel.com>
2013-04-18 08:59:20 -07:00
Li Zhong
016af59f0f powerpc: Try to insert the hptes repeatedly in kernel_map_linear_page()
This patch fixes the following oops, which could be trigged by build the kernel
with many concurrent threads, under CONFIG_DEBUG_PAGEALLOC.

hpte_insert() might return -1, indicating that the bucket (primary here)
is full. We are not necessarily reporting a BUG in this case. Instead, we could
try repeatedly (try secondary, remove and try again) until we find a slot.

[  543.075675] ------------[ cut here ]------------
[  543.075701] kernel BUG at arch/powerpc/mm/hash_utils_64.c:1239!
[  543.075714] Oops: Exception in kernel mode, sig: 5 [#1]
[  543.075722] PREEMPT SMP NR_CPUS=16 DEBUG_PAGEALLOC NUMA pSeries
[  543.075741] Modules linked in: binfmt_misc ehea
[  543.075759] NIP: c000000000036eb0 LR: c000000000036ea4 CTR: c00000000005a594
[  543.075771] REGS: c0000000a90832c0 TRAP: 0700   Not tainted  (3.8.0-next-20130222)
[  543.075781] MSR: 8000000000029032 <SF,EE,ME,IR,DR,RI>  CR: 22224482  XER: 00000000
[  543.075816] SOFTE: 0
[  543.075823] CFAR: c00000000004c200
[  543.075830] TASK = c0000000e506b750[23934] 'cc1' THREAD: c0000000a9080000 CPU: 1
GPR00: 0000000000000001 c0000000a9083540 c000000000c600a8 ffffffffffffffff
GPR04: 0000000000000050 fffffffffffffffa c0000000a90834e0 00000000004ff594
GPR08: 0000000000000001 0000000000000000 000000009592d4d8 c000000000c86854
GPR12: 0000000000000002 c000000006ead300 0000000000a51000 0000000000000001
GPR16: f000000003354380 ffffffffffffffff ffffffffffffff80 0000000000000000
GPR20: 0000000000000001 c000000000c600a8 0000000000000001 0000000000000001
GPR24: 0000000003354380 c000000000000000 0000000000000000 c000000000b65950
GPR28: 0000002000000000 00000000000cd50e 0000000000bf50d9 c000000000c7c230
[  543.076005] NIP [c000000000036eb0] .kernel_map_pages+0x1e0/0x3f8
[  543.076016] LR [c000000000036ea4] .kernel_map_pages+0x1d4/0x3f8
[  543.076025] Call Trace:
[  543.076033] [c0000000a9083540] [c000000000036ea4] .kernel_map_pages+0x1d4/0x3f8 (unreliable)
[  543.076053] [c0000000a9083640] [c000000000167638] .get_page_from_freelist+0x6cc/0x8dc
[  543.076067] [c0000000a9083800] [c000000000167a48] .__alloc_pages_nodemask+0x200/0x96c
[  543.076082] [c0000000a90839c0] [c0000000001ade44] .alloc_pages_vma+0x160/0x1e4
[  543.076098] [c0000000a9083a80] [c00000000018ce04] .handle_pte_fault+0x1b0/0x7e8
[  543.076113] [c0000000a9083b50] [c00000000018d5a8] .handle_mm_fault+0x16c/0x1a0
[  543.076129] [c0000000a9083c00] [c0000000007bf1dc] .do_page_fault+0x4d0/0x7a4
[  543.076144] [c0000000a9083e30] [c0000000000090e8] handle_page_fault+0x10/0x30
[  543.076155] Instruction dump:
[  543.076163] 7c630038 78631d88 e80a0000 f8410028 7c0903a6 e91f01de e96a0010 e84a0008
[  543.076192] 4e800421 e8410028 7c7107b4 7a200fe0 <0b000000> 7f63db78 48785781 60000000
[  543.076224] ---[ end trace bd5807e8d6ae186b ]---

Signed-off-by: Li Zhong <zhong@linux.vnet.ibm.com>
Signed-off-by: Michael Ellerman <michael@ellerman.id.au>
2013-04-18 16:00:00 +10:00
Li Zhong
b170bd3de6 powerpc: Split the code trying to insert hpte repeatedly as an helper function
Move the logic trying to insert hpte in __hash_page_huge() to an helper
function, so it could also be used by others.

Signed-off-by: Li Zhong <zhong@linux.vnet.ibm.com>
Signed-off-by: Michael Ellerman <michael@ellerman.id.au>
2013-04-18 15:59:59 +10:00
Li Zhong
2c3c0693d9 powerpc: Move the setting of rflags out of loop in __hash_page_huge
It seems that new_pte and rflags don't get changed in the repeating loop, so
move their assignment out of the loop.

Signed-off-by: Li Zhong <zhong@linux.vnet.ibm.com>
Signed-off-by: Michael Ellerman <michael@ellerman.id.au>
2013-04-18 15:59:59 +10:00
Brian King
c2e1d84523 powerpc: Set default VGA device
Add a PCI quirk for VGA devices on Power to set the default VGA device.
Ensures a default VGA is always set if a graphics adapter is present,
even if firmware did not initialize it. If more than one graphics
adapter is present, ensure the one initialized by firmware is set
as the default VGA device. This ensures that X autoconfiguration
will work.

Signed-off-by: Brian King <brking@linux.vnet.ibm.com>
Signed-off-by: Michael Ellerman <michael@ellerman.id.au>
2013-04-18 15:59:58 +10:00
Yuanquan Chen
37f02195be powerpc/pci: fix PCI-e devices rescan issue on powerpc platform
Powerpc initializes the DMA and IRQ information in pci_scan_child_bus()->
pcibios_fixup_bus()->pcibios_setup_bus_devices(). But for the devices
which are hotpluged, bus->is added has been set for the first scan of the
PCI-e bus, so the initialization code won't be called. Then the hotpluged
devices' driver will fail to load.

For example :
The PCI-e device 0001:03:00.0 is the Intel PCI-e e1000e network card, remove
it from the system:

    # echo 1 > /sys/bus/pci/devices/0001\:03\:00.0/remove
    # e1000e 0001:03:00.0 eth0: removed PHC

Rescan it from it's bus:

    # echo 1 > /sys/bus/pci/devices/0001\:02\:00.0/rescan
    ...
    e1000e 0001:03:00.0: Disabling ASPM L0s L1
    e1000e 0001:03:00.0: No usable DMA configuration, aborting
    e1000e: probe of 0001:03:00.0 failed with error -5

So we move the DMA & IRQ initialization code from pcibios_setup_devices() and
construct a new function pcibios_enable_device. We call this function in
pcibios_enable_device, which will be called by PCI-e rescan code. At the
meanwhile, we avoid the the impact on cardbus. I also validate this patch with
silicon's PCIe-sata which encounters the IRQ issue.

Signed-off-by: Yuanquan Chen <Yuanquan.Chen@freescale.com>
Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Cc: Hiroo Matsumoto <matsumoto.hiroo@jp.fujitsu.com>
Signed-off-by: Michael Ellerman <michael@ellerman.id.au>
2013-04-18 15:59:57 +10:00
Stephen Rothwell
55671f3cc2 powerpc: fix annotation of fake_numa_create_new_node()
This function has always been marked as __cpuinit, but is only called
from functions marked as __init and references an __initdata variable.
So change its annotation to __init.

Fixes this build warning:

WARNING: arch/powerpc/mm/built-in.o(.cpuinit.text+0x86): Section mismatch in reference from the function .fake_numa_create_new_node() to the variable .init.data:cmdline
The function __cpuinit .fake_numa_create_new_node() references
a variable __initdata cmdline.
If cmdline is only used by .fake_numa_create_new_node then
annotate cmdline with a matching annotation.

Signed-off-by: Stephen Rothwell <sfr@canb.auug.org.au>
Signed-off-by: Michael Ellerman <michael@ellerman.id.au>
2013-04-18 15:59:56 +10:00
Vaidyanathan Srinivasan
7122beeee7 powerpc: fix numa distance for form0 device tree
The following commit breaks numa distance setup for old powerpc
systems that use form0 encoding in device tree.

commit 41eab6f88f
powerpc/numa: Use form 1 affinity to setup node distance

Device tree node /rtas/ibm,associativity-reference-points would
index into /cpus/PowerPCxxxx/ibm,associativity based on form0 or
form1 encoding detected by ibm,architecture-vec-5 property.

All modern systems use form1 and current kernel code is correct.
However, on older systems with form0 encoding, the numa distance
will get hard coded as LOCAL_DISTANCE for all nodes.  This causes
task scheduling anomaly since scheduler will skip building numa
level domain (topmost domain with all cpus) if all numa distances
are same.  (value of 'level' in sched_init_numa() will remain 0)

Prior to the above commit:
((from) == (to) ? LOCAL_DISTANCE : REMOTE_DISTANCE)

Restoring compatible behavior with this patch for old powerpc systems
with device tree where numa distance are encoded as form0.

Signed-off-by: Vaidyanathan Srinivasan <svaidy@linux.vnet.ibm.com>
Signed-off-by: Michael Ellerman <michael@ellerman.id.au>
2013-04-18 15:59:56 +10:00
Michael Neuling
517b731477 powerpc/ptrace: Add DAWR debug feature info for userspace
This adds new debug feature information so that the DAWR can be
identified by userspace tools like GDB.

Unfortunately the DAWR doesn't sit nicely into the current description
that ptrace provides to userspace via struct ppc_debug_info.  It doesn't
allow for specifying that only some ranges are possible or even the end
alignment constraints (DAWR only allows 512 byte wide ranges which can't
cross a 512 byte boundary).

After talking to Edjunior Machado (GDB ppc developer), it was decided
this was the best approach.  Just mark it as debug feature DAWR and
tools like GDB can internally decide the constraints.

Signed-off-by: Michael Neuling <mikey@neuling.org>
Signed-off-by: Michael Ellerman <michael@ellerman.id.au>
2013-04-18 15:59:55 +10:00
Ian Munsie
a6a058e52a powerpc: Add accounting for Doorbell interrupts
This patch adds a new line to /proc/interrupts to account for the
doorbell interrupts that each hardware thread has received. The total
interrupt count in /proc/stat will now also include doorbells.

 # cat /proc/interrupts
           CPU0       CPU1       CPU2       CPU3
 16:        551       1267        281        175      XICS Level     IPI
LOC:       2037       1503       1688       1625   Local timer interrupts
SPU:          0          0          0          0   Spurious interrupts
CNT:          0          0          0          0   Performance monitoring interrupts
MCE:          0          0          0          0   Machine check exceptions
DBL:         42        550         20         91   Doorbell interrupts

Signed-off-by: Ian Munsie <imunsie@au1.ibm.com>
Signed-off-by: Michael Ellerman <michael@ellerman.id.au>
2013-04-18 15:59:55 +10:00
Nishanth Aravamudan
61435690a9 powerpc/pseries: close DDW race between functions of adapter
Given a PCI device with multiple functions in a DDW capable slot, the
following situation can be encountered: When the first function sets a
64-bit DMA mask, enable_ddw() will be called and we can fail to properly
configure DDW (the most common reason being the new DMA window's size is
not large enough to map all of an LPAR's memory). With the recent
changes to DDW, we remove the base window in order to determine if the
new window is of sufficient size to cover an LPAR's memory. We correctly
replace the base window if we find that not to be the case. However,
once we go through and re-configured 32-bit DMA via the IOMMU, the next
function of the adapter will go through the same process. And since DDW
is a characteristic of the slot itself, we are most likely going to fail
again. But to determine we are going to fail the second slot, we again
remove the base window -- but that is now in-use by the first
function/driver, which might be issuing I/O already.

To close this window, keep a list of all the failed struct device_nodes
that have failed to configure DDW. If the current device_node is in that
list, just fail out immediately and fall back to 32-bit DMA without
doing any DDW manipulation.

Signed-off-by: Nishanth Aravamudan <nacc@linux.vnet.ibm.com>
Signed-off-by: Michael Ellerman <michael@ellerman.id.au>
2013-04-18 13:04:00 +10:00
Gavin Shan
fb1b55d654 powerpc/powernv: Use MSI bitmap to manage IRQs
As Michael Ellerman mentioned, arch/powerpc/sysdev/msi_bitmap.c
already implemented bitmap to manage (alloc/free) MSI interrupts.
The patch intends to use that mechanism to manage MSI interrupts
for PowerNV platform.

Signed-off-by: Gavin Shan <shangw@linux.vnet.ibm.com>
Signed-off-by: Michael Ellerman <michael@ellerman.id.au>
2013-04-18 13:03:59 +10:00
Michael Neuling
2a3563b023 powerpc: Setup in HFSCR for POWER8
Setup the HFSCR (Hypervisor Facility Status and Control Register) for POWER8
when running HV=1.  The HFSCR is the same as the FSCR except it's for
hypervisors.  It controls the available of various facilities in OS and
userspace levels.  It also indicates the cause of a hypervisor facility
unavailable interrupt (although we are not using this here).

This patch sets the facilities Linux knows about incase the firmware doesn't.

Signed-off-by: Michael Neuling <mikey@neuling.org>
Signed-off-by: Michael Ellerman <michael@ellerman.id.au>
2013-04-18 13:03:59 +10:00
Michael Neuling
04b418c97f powerpc: Add HFSCR SPR definitions
Add SPR number and bit definitions for the HFSCR (Hypervisor Facility Status
and Control Register).

Signed-off-by: Michael Neuling <mikey@neuling.org>
Signed-off-by: Michael Ellerman <michael@ellerman.id.au>
2013-04-18 13:03:58 +10:00
Alexey Kardashevskiy
ee4a391661 powerpc: fixing ptrace_get_reg to return an error
Currently ptrace_get_reg returns error as a value
what make impossible to tell whether it is a correct value or error code.

The patch adds a parameter which points to the real return data and
returns an error code.

As get_user_msr() never fails and it is used in multiple places so it has not
been changed by this patch.

Signed-off-by: Alexey Kardashevskiy <aik@ozlabs.ru>
Acked-by: Michael Neuling <mikey@neuling.org>
Signed-off-by: Michael Ellerman <michael@ellerman.id.au>
2013-04-18 13:03:57 +10:00
Paul Mackerras
3cc33d50f5 powerpc: Fix build errors with UP configs in HV-style KVM
This fixes these errors when building UP with CONFIG_KVM_BOOK3S_64_HV=y:

arch/powerpc/kvm/book3s_hv.c:1855:2: error: implicit declaration of function 'inhibit_secondary_onlining' [-Werror=implicit-function-declaration]
arch/powerpc/kvm/book3s_hv.c:1862:2: error: implicit declaration of function 'uninhibit_secondary_onlining' [-Werror=implicit-function-declaration]
cc1: all warnings being treated as errors

and this error (with CONFIG_KVM_BOOK3S_64=m, or a vmlinux link error
with CONFIG_KVM_BOOK3S_64=y):

ERROR: "smp_send_reschedule" [arch/powerpc/kvm/kvm.ko] undefined!
make[2]: *** [__modpost] Error 1

The fix for the link error is suboptimal; ideally we want a self_ipi()
function from irq.c, connected at least to the MPIC code, to initiate
an IPI to this cpu.  The fix here at least lets the code build, and it
will work, just with interrupts being delayed sometimes.

Signed-off-by: Paul Mackerras <paulus@samba.org>
Signed-off-by: Michael Ellerman <michael@ellerman.id.au>
2013-04-18 13:03:57 +10:00
Zhang Yanfei
c843be8a54 powerpc: remove cast for kmalloc/kzalloc return value
remove cast for kmalloc/kzalloc return value.

Signed-off-by: Zhang Yanfei <zhangyanfei@cn.fujitsu.com>
Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: linuxppc-dev@lists.ozlabs.org
Signed-off-by: Michael Ellerman <michael@ellerman.id.au>
2013-04-18 13:03:56 +10:00
Alex Grad
c0b52c143e powerpc/kgdb: Removed kmalloc returned value cast
Signed-off-by: Alex Grad <alex.grad@gmail.com>
Signed-off-by: Michael Ellerman <michael@ellerman.id.au>
2013-04-18 13:03:56 +10:00
Paul Bolle
46c74751c2 powerpc: drop even more unused Kconfig symbols
When I submitted commit 6805ab6daa
("powerpc: drop unused Kconfig symbols") I apparently failed to notice
that my patch also made PREP_RESIDUAL and PPC_A2_DD2 unused. Drop these
now.

Signed-off-by: Paul Bolle <pebolle@tiscali.nl>
Signed-off-by: Michael Ellerman <michael@ellerman.id.au>
2013-04-18 13:03:55 +10:00
Paul Bolle
0eaf2255c3 powerpc: remove CONFIG_MPC10X_OPENPIC
The last users of Kconfig symbol MPC10X_OPENPIC were removed in v2.6.27.
Its Kconfig entry can be removed now.

Signed-off-by: Paul Bolle <pebolle@tiscali.nl>
Signed-off-by: Michael Ellerman <michael@ellerman.id.au>
2013-04-18 13:03:55 +10:00
Paul Bolle
89d269ff2e powerpc: remove unused CONFIG_405EP
All users of Kconfig symbol 405EP were removed in release v2.6.27.
Remove this symbol (and a useless select of it) too.

Signed-off-by: Paul Bolle <pebolle@tiscali.nl>
Acked-by: Josh Boyer <jwboyer@gmail.com>
Signed-off-by: Michael Ellerman <michael@ellerman.id.au>
2013-04-18 13:03:54 +10:00
Paul Bolle
aac3d0c875 powerpc: Fix typo "CONFIG_ICSWX_PID"
Untested. As this typo was introduced in v3.3, with commit
9d67028090 ("powerpc: Split ICSWX ACOP and
PID processing"), which actually added PPC_ICSWX_PID, this surely needs
testing.

Signed-off-by: Paul Bolle <pebolle@tiscali.nl>
Signed-off-by: Michael Ellerman <michael@ellerman.id.au>
2013-04-18 13:03:54 +10:00
Paul Bolle
45b02f8d94 memblock: kill "config MAX_ACTIVE_REGIONS"
The Kconfig symbol MAX_ACTIVE_REGIONS is unused. Commit
0ee332c145 ("memblock: Kill
early_node_map[]") removed the only place were it was actually used. But
it did not remove its Kconfig entries (for powerpc and sh).

Remove those two entries (and the entry for metag, that popped up in
v3.9-rc1).

Signed-off-by: Paul Bolle <pebolle@tiscali.nl>
Acked-by: Paul Mundt <lethal@linux-sh.org>
Acked-by: James Hogan <james.hogan@imgtec.com>
Signed-off-by: Michael Ellerman <michael@ellerman.id.au>
2013-04-18 13:03:53 +10:00
Paul Bolle
933ee7119f powerpc: remove PReP platform
PPC_PREP is marked as BROKEN since v2.6.15. Remove all PReP specific
code now.

Signed-off-by: Paul Bolle <pebolle@tiscali.nl>
Signed-off-by: Michael Ellerman <michael@ellerman.id.au>
2013-04-18 13:03:53 +10:00
Paul Bolle
9850baed30 powerpc: remove dead CONFIG_HVC_SCOM code
Commit c1fb6816fb ("powerpc: Add
relocation on exception vector handlers") added two lines of code that
depend on the macro CONFIG_HVC_SCOM. That macro doesn't exist. Perhaps
it was intended to use CONFIG_PPC_SCOM here. But since
"maintence_interrupt" is a typo and there's nothing in arch/powerpc that
looks like maintenance_interrupt it seems best to just delete these
lines.

Signed-off-by: Paul Bolle <pebolle@tiscali.nl>
Acked-by: Michael Ellerman <michael@ellerman.id.au>
Signed-off-by: Michael Ellerman <michael@ellerman.id.au>
2013-04-18 13:03:52 +10:00
Paul Bolle
7d5480fe00 powerpc/40x: remove unused "config 405GPR"
The last user of Kconfig symbol 405GPR got removed in release v3.2.
Remove this symbol too.

Signed-off-by: Paul Bolle <pebolle@tiscali.nl>
Acked-by: Josh Boyer <jwboyer@gmail.com>
Signed-off-by: Michael Ellerman <michael@ellerman.id.au>
2013-04-18 13:03:52 +10:00
Paul Bolle
d6301775db powerpc: remove outdated default on PCI_PERMEDIA
The Kconfig symbol PCI_PERMEDIA got removed in v2.6.24, through commit
e6b6e3ffb9 ("[POWERPC] Remove APUS support
from arch/ppc"). Remove its last occurrence.

Signed-off-by: Paul Bolle <pebolle@tiscali.nl>
Signed-off-by: Michael Ellerman <michael@ellerman.id.au>
2013-04-18 13:03:51 +10:00