Because 64-bit powerpc uses lazy (soft) interrupt disabling, it is
possible for a performance monitor exception to come in when the
kernel thinks interrupts are disabled (i.e. when they are
soft-disabled but hard-enabled). In such a situation the performance
monitor exception handler might have some processing to do (such as
process wakeups) which can't be done in what is effectively an NMI
handler.
This provides a way to defer that work until interrupts get enabled,
either in raw_local_irq_restore() or by returning from an interrupt
handler to code that had interrupts enabled. We have a per-processor
flag that indicates that there is work pending to do when interrupts
subsequently get re-enabled. This flag is checked in the interrupt
return path and in raw_local_irq_restore(), and if it is set,
perf_counter_do_pending() is called to do the pending work.
Signed-off-by: Paul Mackerras <paulus@samba.org>
* 'next' of git://git.kernel.org/pub/scm/linux/kernel/git/paulus/powerpc: (144 commits)
powerpc/44x: Support 16K/64K base page sizes on 44x
powerpc: Force memory size to be a multiple of PAGE_SIZE
powerpc/32: Wire up the trampoline code for kdump
powerpc/32: Add the ability for a classic ppc kernel to be loaded at 32M
powerpc/32: Allow __ioremap on RAM addresses for kdump kernel
powerpc/32: Setup OF properties for kdump
powerpc/32/kdump: Implement crash_setup_regs() using ppc_save_regs()
powerpc: Prepare xmon_save_regs for use with kdump
powerpc: Remove default kexec/crash_kernel ops assignments
powerpc: Make default kexec/crash_kernel ops implicit
powerpc: Setup OF properties for ppc32 kexec
powerpc/pseries: Fix cpu hotplug
powerpc: Fix KVM build on ppc440
powerpc/cell: add QPACE as a separate Cell platform
powerpc/cell: fix build breakage with CONFIG_SPUFS disabled
powerpc/mpc5200: fix error paths in PSC UART probe function
powerpc/mpc5200: add rts/cts handling in PSC UART driver
powerpc/mpc5200: Make PSC UART driver update serial errors counters
powerpc/mpc5200: Remove obsolete code from mpc5200 MDIO driver
powerpc/mpc5200: Add MDMA/UDMA support to MPC5200 ATA driver
...
Fix trivial conflict in drivers/char/Makefile as per Paul's directions
This adds support for 16k and 64k page sizes on PowerPC 44x processors.
The PGDIR table is much smaller than a page when using 16k or 64k
pages (512 and 32 bytes respectively) so we allocate the PGDIR with
kzalloc() instead of __get_free_pages().
One PTE table covers rather a large memory area when using 16k or 64k
pages (32MB or 512MB respectively), so we can easily put FIXMAP and
PKMAP in the area covered by one PTE table.
Signed-off-by: Yuri Tikhonov <yur@emcraft.com>
Signed-off-by: Vladimir Panfilov <pvr@emcraft.com>
Signed-off-by: Ilya Yanok <yanok@emcraft.com>
Acked-by: Josh Boyer <jwboyer@linux.vnet.ibm.com>
Signed-off-by: Paul Mackerras <paulus@samba.org>
arch_setup_additional_pages currently gets two arguments, the binary
format descripton and an indication if the process uses an executable
stack or not. The second argument is not used by anybody, it could
be removed without replacement.
What actually does make sense is to pass an indication if the process
uses the elf interpreter or not. The glibc code will not use anything
from the vdso if the process does not use the dynamic linker, so for
statically linked binaries the architecture backend can choose not
to map the vdso.
Acked-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
Wire up the trampoline code for ppc32 to relay exceptions from the
vectors at address 0 to vectors at address 32MB, and modify Kconfig
to enable Kdump support for all classic powerpcs.
Signed-off-by: Dale Farnsworth <dale@farnsworth.org>
Signed-off-by: Anton Vorontsov <avorontsov@ru.mvista.com>
Signed-off-by: Paul Mackerras <paulus@samba.org>
Add the ability for a classic ppc kernel to be loaded at an address
of 32MB. This done by fixing a few places that assume we are loaded
at address 0, and by changing several uses of KERNELBASE to use
PAGE_OFFSET, instead.
Signed-off-by: Dale Farnsworth <dale@farnsworth.org>
Signed-off-by: Anton Vorontsov <avorontsov@ru.mvista.com>
Signed-off-by: Paul Mackerras <paulus@samba.org>
This replaces the dummy crash_setup_regs function with full-fledged
crash_setup_regs implementation. On PPC32 we simply use the new
ppc_save_regs function to dump the registers.
Signed-off-by: Anton Vorontsov <avorontsov@ru.mvista.com>
Signed-off-by: Paul Mackerras <paulus@samba.org>
Today the arch/powerpc/xmon/setjmp.S file contains only the
xmon_save_regs function. We want to use it for kdump purposes, so
let's move the file into arch/powerpc/kernel/ and give the function a
more generic name (ppc_save_regs).
Signed-off-by: Anton Vorontsov <avorontsov@ru.mvista.com>
Signed-off-by: Paul Mackerras <paulus@samba.org>
Add RTS/CTS-support for the PSC of the MPC5200B. Tested with a Phytec
MPC5200B-IO.
Signed-off-by: Wolfram Sang <w.sang@pengutronix.de>
Signed-off-by: Grant Likely <grant.likely@secretlab.ca>
This patch adds documentation to the mpc5200 interrupt controller
driver and cleans up some minor coding conventions. It also moves the
contents of mpc52xx_pic.h into the driver proper (except for a small
common bit that is moved to the common mpc52xx.h) because the
information encoded there is not required by any other part of kernel
code. Finally for code readability sake, the L2_OFFSET shift value
is removed because the code using it resolves to a noop.
Signed-off-by: Grant Likely <grant.likely@secretlab.ca>
Add const qualifier to device_node argument for
dcr_resource_{start,len} as of_get_property also const-qualifies this
argument.
Signed-off-by: Grant Erickson <gerickson@nuovations.com>
Signed-off-by: Paul Mackerras <paulus@samba.org>
Currently, we never set _PAGE_COHERENT in the PTEs, we just OR it in
in the hash code based on some CPU feature bit. We also manipulate
_PAGE_NO_CACHE and _PAGE_GUARDED by hand in all sorts of places.
This changes the logic so that instead, the PTE now contains
_PAGE_COHERENT for all normal RAM pages thay have I = 0 on platforms
that need it. The hash code clears it if the feature bit is not set.
It also adds some clean accessors to setup various valid combinations
of access flags and change various bits of code to use them instead.
This should help having the PTE actually containing the bit
combinations that we really want.
I also removed _PAGE_GUARDED from _PAGE_BASE on 44x and instead
set it explicitely from the TLB miss. I will ultimately remove it
completely as it appears that it might not be needed after all
but in the meantime, having it in the TLB miss makes things a
lot easier.
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Acked-by: Kumar Gala <galak@kernel.crashing.org>
Signed-off-by: Paul Mackerras <paulus@samba.org>
Currently, the various forms of low level TLB invalidations are all
implemented in misc_32.S for 32-bit processors, in a fairly scary
mess of #ifdef's and with interesting duplication such as a whole
bunch of code for FSL _tlbie and _tlbia which are no longer used.
This moves things around such that _tlbie is now defined in
hash_low_32.S and is only used by the 32-bit hash code, and all
nohash CPUs use the various _tlbil_* forms that are now moved to
a new file, tlb_nohash_low.S.
I moved all the definitions for that stuff out of
include/asm/tlbflush.h as they are really internal mm stuff, into
mm/mmu_decl.h
The code should have no functional changes. I kept some variants
inline for trivial forms on things like 40x and 8xx.
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Acked-by: Kumar Gala <galak@kernel.crashing.org>
Signed-off-by: Paul Mackerras <paulus@samba.org>
This commit moves the whole no-hash TLB handling out of line into a
new tlb_nohash.c file, and implements some basic SMP support using
IPIs and/or broadcast tlbivax instructions.
Note that I'm using local invalidations for D->I cache coherency.
At worst, if another processor is trying to execute the same and
has the old entry in its TLB, it will just take a fault and re-do
the TLB flush locally (it won't re-do the cache flush in any case).
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Acked-by: Kumar Gala <galak@kernel.crashing.org>
Signed-off-by: Paul Mackerras <paulus@samba.org>
We're soon running out of CPU features and I need to add some new
ones for various MMU related bits, so this patch separates the MMU
features from the CPU features. I moved over the 32-bit MMU related
ones, added base features for MMU type families, but didn't move
over any 64-bit only feature yet.
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Acked-by: Kumar Gala <galak@kernel.crashing.org>
Signed-off-by: Paul Mackerras <paulus@samba.org>
This reworks the context management code used by 4xx,8xx and
freescale BookE. It adds support for SMP by implementing a
concept of stale context map to lazily flush the TLB on
processors where a context may have been invalidated. This
also contains the ground work for generalizing such lazy TLB
flushing by just picking up a new PID and marking the old one
stale. This will be implemented later.
This is a first implementation that uses a global spinlock.
Ideally, we should try to get at least the fast path (context ID
already assigned) lockless or limited to a per context lock,
but for now this will do.
I tried to keep the UP case reasonably simple to avoid adding
too much overhead to 8xx which does a lot of context stealing
since it effectively has only 16 PIDs available.
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Acked-by: Kumar Gala <galak@kernel.crashing.org>
Signed-off-by: Paul Mackerras <paulus@samba.org>
This splits the mmu_context handling between 32-bit hash based
processors, 64-bit hash based processors and everybody else. This is
preliminary work for adding SMP support for BookE processors.
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Acked-by: Kumar Gala <galak@kernel.crashing.org>
Signed-off-by: Paul Mackerras <paulus@samba.org>
This adds supports to the "extended" DCR addressing via the indirect
mfdcrx/mtdcrx instructions supported by some 4xx cores (440H6 and
later).
I enabled the feature for now only on AMCC 460 chips.
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Acked-by: Josh Boyer <jwboyer@linux.vnet.ibm.com>
Acked-by: Kumar Gala <galak@kernel.crashing.org>
Signed-off-by: Paul Mackerras <paulus@samba.org>
We have more than one piece of code that looks up cache nodes manually
using the "l2-cache" property. Add a common helper routine which does
this and handles ePAPR's "next-level-cache" property as well as
powermac.
Signed-off-by: Nathan Lynch <ntl@pobox.com>
Signed-off-by: Paul Mackerras <paulus@samba.org>
This adds a local_flush_tlb_mm() call as a pre-requisite for some
SMP work for BookE processors.
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Acked-by: Kumar Gala <galak@kernel.crashing.org>
Signed-off-by: Paul Mackerras <paulus@samba.org>
Instead of not defining it at all, this defines the macro as
being empty, thus avoiding ifdef's in call sites when CONFIG_BUG
is not set.
Also removes an extra whitespace in the existing definition.
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Signed-off-by: Paul Mackerras <paulus@samba.org>
The block layer dropped the virtual merge feature
(b8b3e16cfe). BIO_VMERGE_BOUNDARY
definition is meaningless now (For POWER, BIO_VMERGE_BOUNDARY has been
meaningless for a long time since POWER disables the virtual merge
feature).
Signed-off-by: FUJITA Tomonori <fujita.tomonori@lab.ntt.co.jp>
Acked-by: Jens Axboe <jens.axboe@oracle.com>
Signed-off-by: Paul Mackerras <paulus@samba.org>
Currently there are a number of platforms that open code access to
the ppc_pci_flags global variable. However, that variable is not
present if CONFIG_PCI is not set, which can lead to a build break.
This introduces a number of accessor functions that are defined
to be empty in the case of CONFIG_PCI being disabled. The
various platform files in the kernel are updated to use these.
Signed-off-by: Josh Boyer <jwboyer@linux.vnet.ibm.com>
Signed-off-by: Paul Mackerras <paulus@samba.org>
Since "Factor out cpu joining/unjoining the GIQ"
(b4963255ad) the WARN_ON in
xics_set_cpu_giq() is being triggered during boot on JS20 because the
GIQ indicator is not available on that platform. While the warning is
harmless and the system runs normally, it's nicer to check for the
existence of the indicator before trying to manipulate it.
Implement rtas_indicator_present(), which searches the
/rtas/rtas-indicators property for the given indicator token, and use
this function in xics_set_cpu_giq().
Also use a WARN statement in xics_set_cpu_giq to get better
information on failure.
Signed-off-by: Nathan Lynch <ntl@pobox.com>
Acked-by: Milton Miller <miltonm@bga.com>
Signed-off-by: Paul Mackerras <paulus@samba.org>
The `have_of' variable is a relic from the arch/ppc time, it isn't
useful nowadays.
Signed-off-by: Anton Vorontsov <avorontsov@ru.mvista.com>
Acked-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Signed-off-by: Paul Mackerras <paulus@samba.org>
Change #define stubs of dma_sync ops to be empty static inlines
to avoid build warning.
Signed-off-by: Becky Bruce <beckyb@kernel.crashing.org>
Acked-by: Stephen Rothwell <sfr@canb.auug.org.au>
Signed-off-by: Paul Mackerras <paulus@samba.org>
commit 059e4938f8 ("powerpc/ps3: Add a sub-match
id to ps3_system_bus") forgot to update the module alias support:
- Add the sub-match ids to the module aliases, so udev can distinguish
between different types of sub-devices.
- Rename PS3_MODULE_ALIAS_GRAPHICS to PS3_MODULE_ALIAS_GPU_FB, as ps3fb
binds to the "FB" sub-device.
Signed-off-by: Geert Uytterhoeven <Geert.Uytterhoeven@sonycom.com>
Signed-off-by: Geoff Levand <geoffrey.levand@am.sony.com>
Signed-off-by: Paul Mackerras <paulus@samba.org>
Fix a minor comment typo in pgtable-ppc64.h.
Signed-off-by: Geoff Levand <geoffrey.levand@am.sony.com>
Signed-off-by: Paul Mackerras <paulus@samba.org>
We were missing the CPU_FTR_NOEXECUTE bit in our cputable for all
these processors. The result is that update_mmu_cache() would flush
the cache for all pages mapped to userspace which is totally
unnecessary on those processors since we already handle flushing
on execute in the page fault path.
This should provide a nice speed up ;-)
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Acked-by: Josh Boyer <jwboyer@linux.vnet.ibm.com>
Signed-off-by: Kumar Gala <galak@kernel.crashing.org>
The initial TLB mapping for the kernel boot didn't set the memory coherent
attribute, MAS2[M], in SMP mode.
If this code supported booting a secondary processor, which it doesn't yet,
but if it did, then when a secondary processor boots, it would probably signal
the primary processor by setting a variable called something like
__secondary_hold_acknowledge. However, due to the lack of the M bit, the
primary processor would not snoop the transaction (even if a transaction were
broadcast). If primary CPU's L1 D-cache had a copy, it would not be flushed
and the CPU would never see the ack. Which would have resulted in the primary
CPU spinning for a long time, perhaps a full second before it gives up, while
it would have waited for the ack from the secondary CPU that it wouldn't have
been able to see because of the stale cache.
The value of MAS2 for the boot page TLB1 entry is a compile time constant,
so there is no need to calculate it in powerpc assembly language.
Also, from the MPC8572 manual section 6.12.5.3, "Bits that represent
offsets within a page are ignored and should be cleared." Existing code
didn't clear them, this code does.
The same when the page of KERNELBASE is found; we don't need to use asm to
mask the lower 12 bits off.
In the code that computes the address to rfi from, don't hard code the
offset to 24 bytes, but have the assembler figure that out for us.
Signed-off-by: Trent Piepho <tpiepho@freescale.com>
Signed-off-by: Kumar Gala <galak@kernel.crashing.org>
This patch add the handlers of SPE/EFP exceptions.
The code is used to emulate float point arithmetic,
when MSR(SPE) is enabled and receive EFP data interrupt or EFP round interrupt.
This patch has no conflict with or dependence on FP math-emu.
The code has been tested by TestFloat.
Now the code doesn't support SPE/EFP instructions emulation
(it won't be called when receive program interrupt),
but it could be easily added.
Signed-off-by: Liu Yu <yu.liu@freescale.com>
Signed-off-by: Kumar Gala <galak@kernel.crashing.org>
Move to using the same macro definition for _FP_CHOOSENAN as s390,
sh, sparc32/64. The original author didn't understand this and
matched what sparc64 was doing and they have updated to this definition.
Signed-off-by: Liu Yu <yu.liu@freescale.com>
Signed-off-by: Kumar Gala <galak@kernel.crashing.org>
PowerPC float point division emulation is derived from gcc.
I reported this problem on gcc maillist and got this reply:
http://gcc.gnu.org/ml/gcc/2008-03/msg00543.html
Since UDIV_NEEDS_NORMALIZATION is not used by kernel, we should use
_FP_DIV_MEAT_1_udiv_norm to make sure the single float point
is normalized before udiv_qrnnd.
Signed-off-by: Liu Yu <yu.liu@freescale.com>
Signed-off-by: Kumar Gala <galak@kernel.crashing.org>
The name of the device_node field differ across the platforms, so we
have to implement inlined accessors. This is needed to avoid ugly
#ifdef in the generic code.
Signed-off-by: Anton Vorontsov <avorontsov@ru.mvista.com>
Acked-by: David S. Miller <davem@davemloft.net>
Acked-by: Grant Likely <grant.likely@secretlab.ca>
Signed-off-by: Paul Mackerras <paulus@samba.org>
We need to swap these out once we start using swiotlb, so add
them to dma_ops. Create CONFIG_PPC_NEED_DMA_SYNC_OPS Kconfig
option; this is currently enabled automatically if we're
CONFIG_NOT_COHERENT_CACHE. In the future, this will also
be enabled for builds that need swiotlb. If PPC_NEED_DMA_SYNC_OPS
is not defined, the dma_sync_*_for_* ops compile to nothing.
Otherwise, they access the dma_ops pointers for the sync ops.
This patch also changes dma_sync_single_range_* to actually
sync the range - previously it was using a generous
dma_sync_single. dma_sync_single_* is now implemented
as a dma_sync_single_range with an offset of 0.
Signed-off-by: Becky Bruce <becky.bruce@freescale.com>
Signed-off-by: Paul Mackerras <paulus@samba.org>
Refactor the RCU based pte free code that was used on ppc64 to be used
on all powerpc.
Additionally refactor pte_free() & pte_free_kernel() into common code
between ppc32 & ppc64.
Signed-off-by: Kumar Gala <galak@kernel.crashing.org>
Signed-off-by: Paul Mackerras <paulus@samba.org>
The tlb invalidates in kmap_atomic/kunmap_atomic can be called from
IRQ context, however they are only local invalidates (on the processor
that the kmap was called on). In the future we want to use IPIs to
do tlb invalidates this causes issue since flush_tlb_page() is considered
a broadcast invalidate.
Add local_flush_tlb_page() as a non-broadcast invalidate and use it in
kmap_atomic() since we don't have enough information in the
flush_tlb_page() call to determine its local.
Signed-off-by: Kumar Gala <galak@kernel.crashing.org>
Acked-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Signed-off-by: Paul Mackerras <paulus@samba.org>
The 32-bit hash code didn't need it so far so we don't update
mm->cpu_vm_mask on context switch. This however will break when we
merge the RCU based page table freeing patch and other upcoming 32-bit
embedded SMP work, so this adds the update.
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Signed-off-by: Paul Mackerras <paulus@samba.org>
* 'kvm-updates/2.6.28' of git://git.kernel.org/pub/scm/linux/kernel/git/avi/kvm:
KVM: MMU: avoid creation of unreachable pages in the shadow
KVM: ppc: stop leaking host memory on VM exit
KVM: MMU: fix sync of ptes addressed at owner pagetable
KVM: ia64: Fix: Use correct calling convention for PAL_VPS_RESUME_HANDLER
KVM: ia64: Fix incorrect kbuild CFLAGS override
KVM: VMX: Fix interrupt loss during race with NMI
KVM: s390: Fix problem state handling in guest sigp handler
All architectures now use the generic compat_sys_ptrace, as should every
new architecture that needs 32bit compat (if we'll ever get another).
Remove the now superflous __ARCH_WANT_COMPAT_SYS_PTRACE define, and also
kill a comment about __ARCH_SYS_PTRACE that was added after
__ARCH_SYS_PTRACE was already gone.
Signed-off-by: Christoph Hellwig <hch@lst.de>
Acked-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
called only from __init, calls __init. Incidentally, it ought to be static
in file.
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
When the VM exits, we must call put_page() for every page referenced in the
shadow TLB.
Without this patch, we usually leak 30-50 host pages (120 - 200 KiB with 4 KiB
pages). The maximum number of pages leaked is the size of our shadow TLB, 64
pages.
Signed-off-by: Hollis Blanchard <hollisb@us.ibm.com>
Signed-off-by: Avi Kivity <avi@redhat.com>
Impact: add ability to trace modules on 32 bit PowerPC
This patch performs the necessary trampoline calls to handle
modules with dynamic ftrace on 32 bit PowerPC.
Signed-off-by: Steven Rostedt <srostedt@redhat.com>
Impact: Allow 64 bit PowerPC to trace modules with dynamic ftrace
This adds code to handle the PPC64 module trampolines, and allows for
PPC64 to use dynamic ftrace.
Thanks to Paul Mackerras for these updates:
- fix the mod and rec->arch.mod NULL checks.
- fix to is_bl_op compare.
Thanks to Milton Miller for:
- finding the nasty race with using two nops, and recommending
instead that I use a branch 8 forward.
Signed-off-by: Steven Rostedt <srostedt@redhat.com>
Impact: update to PowerPC ftrace arch API
This patch converts PowerPC to use the new dynamic ftrace arch API.
Thanks to Paul Mackennas for pointing out the mistakes of my original
test_24bit_addr function.
Signed-off-by: Steven Rostedt <srostedt@redhat.com>