We ended up with an ugly conflict between fixes and next in ftrace.h
involving multiple nested ifdefs, and the automatic resolution is
wrong. So merge fixes into next so we can fix it up.
Fix the below crash on Book3E 64. pgtable_page_dtor expects struct
page *arg.
Also call the destructor on non book3s platforms correctly. This frees
up the split PTL locks correctly if we had allocated them before.
Call Trace:
.kmem_cache_free+0x9c/0x44c (unreliable)
.ptlock_free+0x1c/0x30
.tlb_remove_table+0xdc/0x224
.free_pgd_range+0x298/0x500
.shift_arg_pages+0x10c/0x1e0
.setup_arg_pages+0x200/0x25c
.load_elf_binary+0x450/0x16c8
.search_binary_handler.part.11+0x9c/0x248
.do_execveat_common.isra.13+0x868/0xc18
.run_init_process+0x34/0x4c
.try_to_run_init_process+0x1c/0x68
.kernel_init+0xdc/0x130
.ret_from_kernel_thread+0x58/0x7c
Fixes: 702346768 ("powerpc/mm/nohash: Remove pte fragment dependency from nohash")
Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
The toc field in the mod_arch_specific struct isn't actually used
anywhere, so remove it.
Also the ftrace-specific fields are now common between 32-bit and
64-bit, so simplify the struct definition a bit by moving them out of
the __powerpc64__ #ifdef.
Signed-off-by: Josh Poimboeuf <jpoimboe@redhat.com>
Reviewed-by: Kamalesh Babulal <kamalesh@linux.vnet.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Add one missing prototype for function rh_dump_blk. Fix warning treated as
error in W=1:
arch/powerpc/lib/rheap.c:740:6: error: no previous prototype for ‘rh_dump_blk’ [-Werror=missing-prototypes]
Suggested-by: Christophe Leroy <christophe.leroy@c-s.fr>
Signed-off-by: Mathieu Malaterre <malat@debian.org>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
The function prototypes were declared within a `#ifdef CONFIG_PPC_LITE5200`
block which would prevent them from being visible when compiling
`mpc52xx_pm.c`. Move the prototypes outside of the `#ifdef` block to fix
the following warnings treated as errors with W=1:
arch/powerpc/platforms/52xx/mpc52xx_pm.c:58:5: error: no previous prototype for ‘mpc52xx_pm_prepare’ [-Werror=missing-prototypes]
arch/powerpc/platforms/52xx/mpc52xx_pm.c:113:5: error: no previous prototype for ‘mpc52xx_pm_enter’ [-Werror=missing-prototypes]
arch/powerpc/platforms/52xx/mpc52xx_pm.c:181:6: error: no previous prototype for ‘mpc52xx_pm_finish’ [-Werror=missing-prototypes]
Suggested-by: Christophe Leroy <christophe.leroy@c-s.fr>
Signed-off-by: Mathieu Malaterre <malat@debian.org>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
The pmac_pfunc_base_install prototype was declared in powermac/smp.c since
function was used there, move it to pmac_pfunc.h header to be visible in
pfunc_base.c. Fix a warning treated as error with W=1:
arch/powerpc/platforms/powermac/pfunc_base.c:330:12: error: no previous prototype for ‘pmac_pfunc_base_install’ [-Werror=missing-prototypes]
Suggested-by: Christophe Leroy <christophe.leroy@c-s.fr>
Signed-off-by: Mathieu Malaterre <malat@debian.org>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Trivial fix to remove the following sparse warnings:
arch/powerpc/kernel/module_32.c:112:74: warning: Using plain integer as NULL pointer
arch/powerpc/kernel/module_32.c:117:74: warning: Using plain integer as NULL pointer
drivers/macintosh/via-pmu.c:1155:28: warning: Using plain integer as NULL pointer
drivers/macintosh/via-pmu.c:1230:20: warning: Using plain integer as NULL pointer
drivers/macintosh/via-pmu.c:1385:36: warning: Using plain integer as NULL pointer
drivers/macintosh/via-pmu.c:1752:23: warning: Using plain integer as NULL pointer
drivers/macintosh/via-pmu.c:2084:19: warning: Using plain integer as NULL pointer
drivers/macintosh/via-pmu.c:2110:32: warning: Using plain integer as NULL pointer
drivers/macintosh/via-pmu.c:2167:19: warning: Using plain integer as NULL pointer
drivers/macintosh/via-pmu.c:2183:19: warning: Using plain integer as NULL pointer
drivers/macintosh/via-pmu.c:277:20: warning: Using plain integer as NULL pointer
arch/powerpc/platforms/powermac/setup.c:155:67: warning: Using plain integer as NULL pointer
arch/powerpc/platforms/powermac/setup.c:247:27: warning: Using plain integer as NULL pointer
arch/powerpc/platforms/powermac/setup.c:249:27: warning: Using plain integer as NULL pointer
arch/powerpc/platforms/powermac/setup.c:252:37: warning: Using plain integer as NULL pointer
arch/powerpc/mm/tlb_hash32.c:127:21: warning: Using plain integer as NULL pointer
arch/powerpc/mm/tlb_hash32.c:148:21: warning: Using plain integer as NULL pointer
arch/powerpc/mm/tlb_hash32.c:44:21: warning: Using plain integer as NULL pointer
arch/powerpc/mm/tlb_hash32.c:57:21: warning: Using plain integer as NULL pointer
arch/powerpc/mm/tlb_hash32.c:87:21: warning: Using plain integer as NULL pointer
arch/powerpc/kernel/btext.c:160:31: warning: Using plain integer as NULL pointer
arch/powerpc/kernel/btext.c:167:22: warning: Using plain integer as NULL pointer
arch/powerpc/kernel/btext.c:274:21: warning: Using plain integer as NULL pointer
arch/powerpc/kernel/btext.c:285:31: warning: Using plain integer as NULL pointer
arch/powerpc/include/asm/hugetlb.h:204:16: warning: Using plain integer as NULL pointer
arch/powerpc/mm/ppc_mmu_32.c:170:21: warning: Using plain integer as NULL pointer
arch/powerpc/platforms/powermac/pci.c:1227:23: warning: Using plain integer as NULL pointer
arch/powerpc/platforms/powermac/pci.c:65:24: warning: Using plain integer as NULL pointer
Also use `--fix` command line option from `script/checkpatch --strict` to
remove the following:
CHECK: Comparison to NULL could be written "!dispDeviceBase"
#72: FILE: arch/powerpc/kernel/btext.c:160:
+ if (dispDeviceBase == NULL)
CHECK: Comparison to NULL could be written "!vbase"
#80: FILE: arch/powerpc/kernel/btext.c:167:
+ if (vbase == NULL)
CHECK: Comparison to NULL could be written "!base"
#89: FILE: arch/powerpc/kernel/btext.c:274:
+ if (base == NULL)
CHECK: Comparison to NULL could be written "!dispDeviceBase"
#98: FILE: arch/powerpc/kernel/btext.c:285:
+ if (dispDeviceBase == NULL)
CHECK: Comparison to NULL could be written "strstr"
#117: FILE: arch/powerpc/kernel/module_32.c:117:
+ if (strstr(secstrings + sechdrs[i].sh_name, ".debug") != NULL)
CHECK: Comparison to NULL could be written "!Hash"
#130: FILE: arch/powerpc/mm/ppc_mmu_32.c:170:
+ if (Hash == NULL)
CHECK: Comparison to NULL could be written "Hash"
#143: FILE: arch/powerpc/mm/tlb_hash32.c:44:
+ if (Hash != NULL) {
CHECK: Comparison to NULL could be written "!Hash"
#152: FILE: arch/powerpc/mm/tlb_hash32.c:57:
+ if (Hash == NULL) {
CHECK: Comparison to NULL could be written "!Hash"
#161: FILE: arch/powerpc/mm/tlb_hash32.c:87:
+ if (Hash == NULL) {
CHECK: Comparison to NULL could be written "!Hash"
#170: FILE: arch/powerpc/mm/tlb_hash32.c:127:
+ if (Hash == NULL) {
CHECK: Comparison to NULL could be written "!Hash"
#179: FILE: arch/powerpc/mm/tlb_hash32.c:148:
+ if (Hash == NULL) {
ERROR: space required after that ';' (ctx:VxV)
#192: FILE: arch/powerpc/platforms/powermac/pci.c:65:
+ for (; node != NULL;node = node->sibling) {
CHECK: Comparison to NULL could be written "node"
#192: FILE: arch/powerpc/platforms/powermac/pci.c:65:
+ for (; node != NULL;node = node->sibling) {
CHECK: Comparison to NULL could be written "!region"
#201: FILE: arch/powerpc/platforms/powermac/pci.c:1227:
+ if (region == NULL)
CHECK: Comparison to NULL could be written "of_get_property"
#214: FILE: arch/powerpc/platforms/powermac/setup.c:155:
+ if (of_get_property(np, "cache-unified", NULL) != NULL && dc) {
CHECK: Comparison to NULL could be written "!np"
#223: FILE: arch/powerpc/platforms/powermac/setup.c:247:
+ if (np == NULL)
CHECK: Comparison to NULL could be written "np"
#226: FILE: arch/powerpc/platforms/powermac/setup.c:249:
+ if (np != NULL) {
CHECK: Comparison to NULL could be written "l2cr"
#230: FILE: arch/powerpc/platforms/powermac/setup.c:252:
+ if (l2cr != NULL) {
CHECK: Comparison to NULL could be written "via"
#243: FILE: drivers/macintosh/via-pmu.c:277:
+ if (via != NULL)
CHECK: Comparison to NULL could be written "current_req"
#252: FILE: drivers/macintosh/via-pmu.c:1155:
+ if (current_req != NULL) {
CHECK: Comparison to NULL could be written "!req"
#261: FILE: drivers/macintosh/via-pmu.c:1230:
+ if (req == NULL || pmu_state != idle
CHECK: Comparison to NULL could be written "!req"
#270: FILE: drivers/macintosh/via-pmu.c:1385:
+ if (req == NULL) {
CHECK: Comparison to NULL could be written "!pp"
#288: FILE: drivers/macintosh/via-pmu.c:2084:
+ if (pp == NULL)
CHECK: Comparison to NULL could be written "!pp"
#297: FILE: drivers/macintosh/via-pmu.c:2110:
+ if (count < 1 || pp == NULL)
CHECK: Comparison to NULL could be written "!pp"
#306: FILE: drivers/macintosh/via-pmu.c:2167:
+ if (pp == NULL)
CHECK: Comparison to NULL could be written "pp"
#315: FILE: drivers/macintosh/via-pmu.c:2183:
+ if (pp != NULL) {
Link: https://github.com/linuxppc/linux/issues/37
Signed-off-by: Mathieu Malaterre <malat@debian.org>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Some functions prototypes were missing for the non-altivec code. Add the
missing prototypes in a new header file, fix warnings treated as errors
with W=1:
arch/powerpc/lib/xor_vmx_glue.c:18:6: error: no previous prototype for ‘xor_altivec_2’ [-Werror=missing-prototypes]
arch/powerpc/lib/xor_vmx_glue.c:29:6: error: no previous prototype for ‘xor_altivec_3’ [-Werror=missing-prototypes]
arch/powerpc/lib/xor_vmx_glue.c:40:6: error: no previous prototype for ‘xor_altivec_4’ [-Werror=missing-prototypes]
arch/powerpc/lib/xor_vmx_glue.c:52:6: error: no previous prototype for ‘xor_altivec_5’ [-Werror=missing-prototypes]
The prototypes were already present in <asm/xor.h> but this header file is
meant to be included after <include/linux/raid/xor.h>. Trying to re-use
<asm/xor.h> directly would lead to warnings such as:
arch/powerpc/include/asm/xor.h:39:15: error: variable ‘xor_block_altivec’ has initializer but incomplete type
Trying to re-use <asm/xor.h> after <include/linux/raid/xor.h> in
xor_vmx_glue.c would in turn trigger the following warnings:
include/asm-generic/xor.h:688:34: error: ‘xor_block_32regs’ defined but not used [-Werror=unused-variable]
Signed-off-by: Mathieu Malaterre <malat@debian.org>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
This allows the compiler to verify the format strings vs the types of
the arguments.
Update the other prototype declarations in asm/xmon.h.
Silence warnings (triggered at W=1) by adding relevant __printf
attribute. Move #define at bottom of the file to prevent conflict with
gcc attribute.
Solves the original warning:
arch/powerpc/xmon/nonstdio.c:178:2: error: function might be
possible candidate for ‘gnu_printf’ format attribute
In turn this uncovered many formatting errors in xmon.c, all fixed in
this patch.
Signed-off-by: Mathieu Malaterre <malat@debian.org>
[mpe: Always use px not p, fixup the 44x specific code, tweak change log]
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Use symbolic names defined in asm/ppc-opcode.h
instead of hardcoded values.
Signed-off-by: Christophe Leroy <christophe.leroy@c-s.fr>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
This patch exports tm_enable()/tm_disable/tm_abort() APIs, which
will be used for PR KVM transactional memory logic.
Signed-off-by: Simon Guo <wei.guo.simon@gmail.com>
Reviewed-by: Paul Mackerras <paulus@ozlabs.org>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
This patches add some macros for CR0/TEXASR bits so that PR KVM TM
logic (tbegin./treclaim./tabort.) can make use of them later.
Signed-off-by: Simon Guo <wei.guo.simon@gmail.com>
Reviewed-by: Paul Mackerras <paulus@ozlabs.org>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
This patch adds support to read 64-bit sensor values. This method is
used to read energy sensors and counters which are of type u64.
Signed-off-by: Shilpasri G Bhat <shilpa.bhat@linux.vnet.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Add byte-swapping versions of __raw_writeq() and __raw_rm_writeq().
This allows us to avoid sparse warnings caused by passing __be64 to
__raw_writeq(), which takes unsigned long:
arch/powerpc/platforms/powernv/pci-ioda.c:1981:38:
warning: incorrect type in argument 1 (different base types)
expected unsigned long [unsigned] v
got restricted __be64 [usertype] <noident>
It's also generally preferable to use a byte-swapping accessor rather
than doing it by hand in the code, which is more bug prone.
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Reviewed-by: Samuel Mendoza-Jonas <sam@mendozajonas.com>
arch/powerpc/Makefile activates -mmultiple on BE PPC32 configs
in order to use multiple word instructions in functions entry/exit.
The patch does the same for the asm parts, for consistency.
On processors like the 8xx on which insn fetching is pretty slow,
this speeds up registers save/restore.
Signed-off-by: Christophe Leroy <christophe.leroy@c-s.fr>
[mpe: PPC32 is BE only, so drop the endian checks]
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
This reverts commit 6ad966d730.
That commit was pointless, because csum_add() sums two 32 bits
values, so the sum is 0x1fffffffe at the maximum.
And then when adding upper part (1) and lower part (0xfffffffe),
the result is 0xffffffff which doesn't carry.
Any lower value will not carry either.
And behind the fact that this commit is useless, it also kills the
whole purpose of having an arch specific inline csum_add()
because the resulting code gets even worse than what is obtained
with the generic implementation of csum_add()
0000000000000240 <.csum_add>:
240: 38 00 ff ff li r0,-1
244: 7c 84 1a 14 add r4,r4,r3
248: 78 00 00 20 clrldi r0,r0,32
24c: 78 89 00 22 rldicl r9,r4,32,32
250: 7c 80 00 38 and r0,r4,r0
254: 7c 09 02 14 add r0,r9,r0
258: 78 09 00 22 rldicl r9,r0,32,32
25c: 7c 00 4a 14 add r0,r0,r9
260: 78 03 00 20 clrldi r3,r0,32
264: 4e 80 00 20 blr
In comparison, the generic implementation of csum_add() gives:
0000000000000290 <.csum_add>:
290: 7c 63 22 14 add r3,r3,r4
294: 7f 83 20 40 cmplw cr7,r3,r4
298: 7c 10 10 26 mfocrf r0,1
29c: 54 00 ef fe rlwinm r0,r0,29,31,31
2a0: 7c 60 1a 14 add r3,r0,r3
2a4: 78 63 00 20 clrldi r3,r3,32
2a8: 4e 80 00 20 blr
And the reverted implementation for PPC64 gives:
0000000000000240 <.csum_add>:
240: 7c 84 1a 14 add r4,r4,r3
244: 78 80 00 22 rldicl r0,r4,32,32
248: 7c 80 22 14 add r4,r0,r4
24c: 78 83 00 20 clrldi r3,r4,32
250: 4e 80 00 20 blr
Fixes: 6ad966d730 ("powerpc/64: Fix checksum folding in csum_add()")
Signed-off-by: Christophe Leroy <christophe.leroy@c-s.fr>
Acked-by: Paul Mackerras <paulus@ozlabs.org>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
PMD_PAGE_SIZE() is nowhere used and _PMD_SIZE is only
used by PMD_PAGE_SIZE().
This patch removes them.
Signed-off-by: Christophe Leroy <christophe.leroy@c-s.fr>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Implement a local TLB flush for invalidating an LPID with variants for
process or partition scope. And a global TLB flush for invalidating
a partition scoped page of an LPID.
These will be used by KVM in subsequent patches.
Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Instead of encoding shift in the table address, use an enumerated index value.
This allow us to do different things in the callback for pte and pmd.
Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
4K config use one full page at level 4 of the pagetable. Add support for single
fragment allocation in pagetable fragment code and and use that for 4K config.
This makes both 4k and 64k use the same code path. Later we will switch pmd to
use the page table fragment code. This is done only for 64bit platforms which
is using page table fragment support.
Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Now that we have removed 64K page size support, the RCU page table free can
be much simpler for nohash. Make a copy of the the rcu callback to pgalloc.h
header similar to nohash 32. We could possibly merge 32 and 64 bit there. But
that is for a later patch
We also move the book3s specific handler to pgtable_book3s64.c. This will be
updated in a later patch to handle split pmd ptlock.
Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
We have in Kconfig
config PPC_64K_PAGES
bool "64k page size"
depends on !PPC_FSL_BOOK3E && (44x || PPC_BOOK3S_64 || PPC_BOOK3E_64)
select HAVE_ARCH_SOFT_DIRTY if PPC_BOOK3S_64
Only supported BOOK3E 64 bit platforms is FSL_BOOK3E. Remove the dead 64k page
support code from 64bit nohash.
Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
it had always been pointless - compat_sys_select() sign-extends
the first argument just fine on its own.
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
[mpe: Use COMPAT_SPU_NEW() to keep systbl_chk.sh happy]
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
[mpe: Fix sys_debug_setcontext() prototype to return long]
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
The hcall_exit() tracepoint has retval defined as unsigned long. That
leads to humours results like:
bash-3686 [009] d..2 854.134094: hcall_entry: opcode=24
bash-3686 [009] d..2 854.134095: hcall_exit: opcode=24 retval=18446744073709551609
It's normal for some hcalls to return negative values, displaying them
as unsigned isn't very helpful. So change it to signed.
bash-3711 [001] d..2 471.691008: hcall_entry: opcode=24
bash-3711 [001] d..2 471.691008: hcall_exit: opcode=24 retval=-7
Which can be more easily compared to H_NOT_FOUND in hvcall.h
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Acked-by: Anton Blanchard <anton@samba.org>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Tested-by: Ravi Bangoria <ravi.bangoria@linux.ibm.com>
The build is failing with CONFIG_NUMA=n and some compiler versions:
arch/powerpc/platforms/pseries/hotplug-cpu.o: In function `dlpar_online_cpu':
hotplug-cpu.c:(.text+0x12c): undefined reference to `timed_topology_update'
arch/powerpc/platforms/pseries/hotplug-cpu.o: In function `dlpar_cpu_remove':
hotplug-cpu.c:(.text+0x400): undefined reference to `timed_topology_update'
Fix it by moving the empty version of timed_topology_update() into the
existing #ifdef block, which has the right guard of SPLPAR && NUMA.
Fixes: cee5405da4 ("powerpc/hotplug: Improve responsiveness of hotplug change")
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Some syscall entry functions on powerpc are prefixed with
ppc_/ppc32_/ppc64_ rather than the usual sys_/__se_sys prefix. fork(),
clone(), swapcontext() are some examples of syscalls with such entry
points. We need to match against these names when initializing ftrace
syscall tracing.
Signed-off-by: Naveen N. Rao <naveen.n.rao@linux.vnet.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
On powerpc64 ABIv1, we are enabling syscall tracing for only ~20
syscalls. This is due to commit e145242ea0 ("syscalls/core,
syscalls/x86: Clean up syscall stub naming convention") which has
changed the syscall entry wrapper prefix from "SyS" to "__se_sys".
Update the logic for ABIv1 to not just skip the initial dot, but also
the "__se_sys" prefix.
Fixes: commit e145242ea0 ("syscalls/core, syscalls/x86: Clean up syscall stub naming convention")
Reported-by: Michael Ellerman <mpe@ellerman.id.au>
Signed-off-by: Naveen N. Rao <naveen.n.rao@linux.vnet.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
In commit 4e26bc4a4e ("powerpc/64: Rename soft_enabled to
irq_soft_mask") we renamed paca->soft_enabled. But then in commit
8e0b634b13 ("powerpc/64s: Do not allocate lppaca if we are not
virtualized") we added it back. Oops. This happened because the two
patches were in flight at the same time and rebased vs each other
multiple times, and we missed it in review.
Fixes: 8e0b634b13 ("powerpc/64s: Do not allocate lppaca if we are not virtualized")
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
By using IS_ENABLED() we can simplify __set_pte_at() by removing
redundant *ptep = pte.
Signed-off-by: Christophe Leroy <christophe.leroy@c-s.fr>
Reviewed-by: Michael Ellerman <mpe@ellerman.id.au>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
When nohash and book3s header were split, some hash related stuff
remained in the nohash header. This patch removes them.
Signed-off-by: Christophe Leroy <christophe.leroy@c-s.fr>
[mpe: Duplicate pte_young() to avoid circular header dependency]
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
FADump capture kernel boots in restricted memory environment preserving
the context of previous kernel to save vmcore. Supporting hugepages in
such environment makes things unnecessarily complicated, as hugepages
need memory set aside for them. This means most of the capture kernel's
memory is used in supporting hugepages. In most cases, this results in
out-of-memory issues while booting FADump capture kernel. But hugepages
are not of much use in capture kernel whose only job is to save vmcore.
So, disabling hugepages support, when fadump is active, is a reliable
solution for the out of memory issues. Introducing a flag variable to
disable HugeTLB support when fadump is active.
Signed-off-by: Hari Bathini <hbathini@linux.vnet.ibm.com>
Reviewed-by: Mahesh Salgaonkar <mahesh@linux.vnet.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
We've had dynamic ftrace support for over 9 years since Steve first
wrote it, all the distros use dynamic, and static is basically
untested these days, so drop support for static ftrace.
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Acked-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
With -mprofile-kernel, we always save the full register state in
ftrace_caller(). While this works, this is inefficient if we're not
interested in the register state, such as when we're using the function
tracer.
Rename the existing ftrace_caller() as ftrace_regs_caller() and provide
a simpler implementation for ftrace_caller() that is used when registers
are not required to be saved.
Signed-off-by: Naveen N. Rao <naveen.n.rao@linux.vnet.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Add some helpers to enable/disable ftrace through paca->ftrace_enabled.
Signed-off-by: Naveen N. Rao <naveen.n.rao@linux.vnet.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Re-arrange the last #ifdef section in preparation for a subsequent
change.
Signed-off-by: Naveen N. Rao <naveen.n.rao@linux.vnet.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
We have some C code that we call into from real mode where we cannot
take any exceptions. Though the C functions themselves are mostly safe,
if these functions are traced, there is a possibility that we may take
an exception. For instance, in certain conditions, the ftrace code uses
WARN(), which uses a 'trap' to do its job.
For such scenarios, introduce a new field in paca 'ftrace_enabled',
which is checked on ftrace entry before continuing. This field can then
be set to zero to disable/pause ftrace, and set to a non-zero value to
resume ftrace.
Signed-off-by: Naveen N. Rao <naveen.n.rao@linux.vnet.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
There is a single npu context per set of callback parameters. Callers
should be prevented from overwriting existing callback values so
instead return an error if different parameters are passed.
Fixes: 1ab66d1fba ("powerpc/powernv: Introduce address translation services for Nvlink2")
Cc: stable@vger.kernel.org # v4.12+
Signed-off-by: Alistair Popple <alistair@popple.id.au>
Reviewed-by: Mark Hairgrove <mhairgrove@nvidia.com>
Tested-by: Mark Hairgrove <mhairgrove@nvidia.com>
Reviewed-by: Balbir Singh <bsingharora@gmail.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
- Fix crashes when loading modules built with a different CONFIG_RELOCATABLE
value by adding CONFIG_RELOCATABLE to vermagic.
- Fix busy loops in the OPAL NVRAM driver if we get certain error conditions
from firmware.
- Remove tlbie trace points from KVM code that's called in real mode, because
it causes crashes.
- Fix checkstops caused by invalid tlbiel on Power9 Radix.
- Ensure the set of CPU features we "know" are always enabled is actually the
minimal set when we build with support for firmware supplied CPU features.
Thanks to:
Aneesh Kumar K.V, Anshuman Khandual, Nicholas Piggin.
-----BEGIN PGP SIGNATURE-----
iQIwBAABCAAaBQJa0oWBExxtcGVAZWxsZXJtYW4uaWQuYXUACgkQUevqPMjhpYAV
EA//UQB7n33HlroMBL3841VswUrIgY36gSi9+QZPjxHiTYGVStI+u0FHq5hm8OMm
1FkronD0sfbN7t8LJ9FS6vCoAlW15vMBj95pWJXAL7NuQneO7cM5JvRbowVKbNbq
GESn4EkUj9bAXl7aTzX2yA7jruzMJIwE/2bgl1J6YkpByHx5x/fgzKTuoCRggQqY
Xd656ZZyWR/uIuWhAjCPFdrPnekVvo7/Gy5Yo5kcR6LbX4MO0JFNOHpiT3wPGGv/
LJedJtdpH1biMk7SrqLfWD7mpMsVJeMelpkftEbe8zUXf17wBd1rl4JkybLTDwZD
HznZ0nbnh0j5Q/KRHwOh0sLDSisJF6pQHH/Bnc4Xqrsn4OERwEk40/Ym/O0kDsjH
n2F3X5a5QLWoBWRsdV3BMyEzc0bgnb++tXeBWOV4jJBEOhoqxwimCU+3fmLiy95A
urJYf8Sljwve9I5kJyXKpruopgeoJ1aumRwopE8YCG9lfBgzvU/90u6+uKqAdkOv
kpZxS5FDT2lNIwbCP9WjEelAOGrtA6JQNWU0iPGwsVAvAxX/p1RwwbQeWVlWs3Ek
UdVF8eGezClpLPa/FQFdDyXfGE6WIf1vK9YnG7k3rUwwkSqoYxKgVQ3o1Vetp0Lg
fzPsDdDi2V2I3KDYNav8s6kohAIIS/a9XYk4BGvT/htRnUg=
=0e7c
-----END PGP SIGNATURE-----
Merge tag 'powerpc-4.17-2' of git://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux
Pull powerpc fixes from Michael Ellerman:
- Fix crashes when loading modules built with a different
CONFIG_RELOCATABLE value by adding CONFIG_RELOCATABLE to vermagic.
- Fix busy loops in the OPAL NVRAM driver if we get certain error
conditions from firmware.
- Remove tlbie trace points from KVM code that's called in real mode,
because it causes crashes.
- Fix checkstops caused by invalid tlbiel on Power9 Radix.
- Ensure the set of CPU features we "know" are always enabled is
actually the minimal set when we build with support for firmware
supplied CPU features.
Thanks to: Aneesh Kumar K.V, Anshuman Khandual, Nicholas Piggin.
* tag 'powerpc-4.17-2' of git://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux:
powerpc/64s: Fix CPU_FTRS_ALWAYS vs DT CPU features
powerpc/mm/radix: Fix checkstops caused by invalid tlbiel
KVM: PPC: Book3S HV: trace_tlbie must not be called in realmode
powerpc/8xx: Fix build with hugetlbfs enabled
powerpc/powernv: Fix OPAL NVRAM driver OPAL_BUSY loops
powerpc/powernv: define a standard delay for OPAL_BUSY type retry loops
powerpc/fscr: Enable interrupts earlier before calling get_user()
powerpc/64s: Fix section mismatch warnings from setup_rfi_flush()
powerpc/modules: Fix crashes by adding CONFIG_RELOCATABLE to vermagic
As arch_kexec_kernel_image_{probe,load}(),
arch_kimage_file_post_load_cleanup() and arch_kexec_kernel_verify_sig()
are almost duplicated among architectures, they can be commonalized with
an architecture-defined kexec_file_ops array. So let's factor them out.
Link: http://lkml.kernel.org/r/20180306102303.9063-3-takahiro.akashi@linaro.org
Signed-off-by: AKASHI Takahiro <takahiro.akashi@linaro.org>
Acked-by: Dave Young <dyoung@redhat.com>
Tested-by: Dave Young <dyoung@redhat.com>
Cc: Vivek Goyal <vgoyal@redhat.com>
Cc: Baoquan He <bhe@redhat.com>
Cc: Michael Ellerman <mpe@ellerman.id.au>
Cc: Thiago Jung Bauermann <bauerman@linux.vnet.ibm.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
The cpu_has_feature() mechanism has an optimisation where at build
time we construct a mask of the CPU feature bits that will always be
true for the given .config, based on the platform/bitness/etc. that we
are building for.
That is incompatible with DT CPU features, where the set of CPU
features is dependent on feature flags that are given to us by
firmware.
The result is that some feature bits can not be *disabled* by DT CPU
features. Or more accurately, they can be disabled but they will still
appear in the ALWAYS mask, meaning cpu_has_feature() will always
return true for them.
In the past this hasn't really been a problem because on Book3S
64 (where we support DT CPU features), the set of ALWAYS bits has been
very small. That was because we always built for POWER4 and later,
meaning the set of common bits was small.
The only bit that could be cleared by DT CPU features that was also in
the ALWAYS mask was CPU_FTR_NODSISRALIGN, and that was only used in
the alignment handler to create a fake DSISR. That code was itself
deleted in 31bfdb036f ("powerpc: Use instruction emulation
infrastructure to handle alignment faults") (Sep 2017).
However the set of ALWAYS features changed with the recent commit
db5ae1c155 ("powerpc/64s: Refine feature sets for little endian
builds") which restricted the set of feature flags when building
little endian to Power7 or later. That caused the ALWAYS mask to
become much larger for little endian builds.
The result is that the following feature bits can currently not
be *disabled* by DT CPU features:
CPU_FTR_REAL_LE, CPU_FTR_MMCRA, CPU_FTR_CTRL, CPU_FTR_SMT,
CPU_FTR_PURR, CPU_FTR_SPURR, CPU_FTR_DSCR, CPU_FTR_PKEY,
CPU_FTR_VMX_COPY, CPU_FTR_CFAR, CPU_FTR_HAS_PPR.
To fix it we need to mask the set of ALWAYS features with the base set
of DT CPU features, ie. the features that are always enabled by DT CPU
features. That way there are no bits in the ALWAYS mask that are not
also always set by DT CPU features.
Fixes: db5ae1c155 ("powerpc/64s: Refine feature sets for little endian builds")
Reviewed-by: Nicholas Piggin <npiggin@gmail.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
This is the start of an effort to tidy up and standardise all the
delays. Existing loops have a range of delay/sleep periods from 1ms
to 20ms, and some have no delay. They all loop forever except rtc,
which times out after 10 retries, and that uses 10ms delays. So use
10ms as our standard delay. The OPAL maintainer agrees 10ms is a
reasonable starting point.
The idea is to use the same recipe everywhere, once this is proven to
work then it will be documented as an OPAL API standard. Then both
firmware and OS can agree, and if a particular call needs something
else, then that can be documented with reasoning.
This is not the end-all of this effort, it's just a relatively easy
change that fixes some existing high latency delays. There should be
provision for standardising timeouts and/or interruptible loops where
possible, so non-fatal firmware errors don't cause hangs.
Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
- VHE optimizations
- EL2 address space randomization
- speculative execution mitigations ("variant 3a", aka execution past invalid
privilege register access)
- bugfixes and cleanups
PPC:
- improvements for the radix page fault handler for HV KVM on POWER9
s390:
- more kvm stat counters
- virtio gpu plumbing
- documentation
- facilities improvements
x86:
- support for VMware magic I/O port and pseudo-PMCs
- AMD pause loop exiting
- support for AMD core performance extensions
- support for synchronous register access
- expose nVMX capabilities to userspace
- support for Hyper-V signaling via eventfd
- use Enlightened VMCS when running on Hyper-V
- allow userspace to disable MWAIT/HLT/PAUSE vmexits
- usual roundup of optimizations and nested virtualization bugfixes
Generic:
- API selftest infrastructure (though the only tests are for x86 as of now)
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v2.0.22 (GNU/Linux)
iQEcBAABAgAGBQJay19UAAoJEL/70l94x66DGKYIAIu9PTHAEwaX0et15fPW5y2x
rrtS355lSAmMrPJ1nePRQ+rProD/1B0Kizj3/9O+B9OTKKRsorRYNa4CSu9neO2k
N3rdE46M1wHAPwuJPcYvh3iBVXtgbMayk1EK5aVoSXaMXEHh+PWZextkl+F+G853
kC27yDy30jj9pStwnEFSBszO9ua/URdKNKBATNx8WUP6d9U/dlfm5xv3Dc3WtKt2
UMGmog2wh0i7ecXo7hRkMK4R7OYP3ZxAexq5aa9BOPuFp+ZdzC/MVpN+jsjq2J/M
Zq6RNyA2HFyQeP0E9QgFsYS2BNOPeLZnT5Jg1z4jyiD32lAZ/iC51zwm4oNKcDM=
=bPlD
-----END PGP SIGNATURE-----
Merge tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm
Pull kvm updates from Paolo Bonzini:
"ARM:
- VHE optimizations
- EL2 address space randomization
- speculative execution mitigations ("variant 3a", aka execution past
invalid privilege register access)
- bugfixes and cleanups
PPC:
- improvements for the radix page fault handler for HV KVM on POWER9
s390:
- more kvm stat counters
- virtio gpu plumbing
- documentation
- facilities improvements
x86:
- support for VMware magic I/O port and pseudo-PMCs
- AMD pause loop exiting
- support for AMD core performance extensions
- support for synchronous register access
- expose nVMX capabilities to userspace
- support for Hyper-V signaling via eventfd
- use Enlightened VMCS when running on Hyper-V
- allow userspace to disable MWAIT/HLT/PAUSE vmexits
- usual roundup of optimizations and nested virtualization bugfixes
Generic:
- API selftest infrastructure (though the only tests are for x86 as
of now)"
* tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm: (174 commits)
kvm: x86: fix a prototype warning
kvm: selftests: add sync_regs_test
kvm: selftests: add API testing infrastructure
kvm: x86: fix a compile warning
KVM: X86: Add Force Emulation Prefix for "emulate the next instruction"
KVM: X86: Introduce handle_ud()
KVM: vmx: unify adjacent #ifdefs
x86: kvm: hide the unused 'cpu' variable
KVM: VMX: remove bogus WARN_ON in handle_ept_misconfig
Revert "KVM: X86: Fix SMRAM accessing even if VM is shutdown"
kvm: Add emulation for movups/movupd
KVM: VMX: raise internal error for exception during invalid protected mode state
KVM: nVMX: Optimization: Dont set KVM_REQ_EVENT when VMExit with nested_run_pending
KVM: nVMX: Require immediate-exit when event reinjected to L2 and L1 event pending
KVM: x86: Fix misleading comments on handling pending exceptions
KVM: x86: Rename interrupt.pending to interrupt.injected
KVM: VMX: No need to clear pending NMI/interrupt on inject realmode interrupt
x86/kvm: use Enlightened VMCS when running on Hyper-V
x86/hyper-v: detect nested features
x86/hyper-v: define struct hv_enlightened_vmcs and clean field bits
...
If you build the kernel with CONFIG_RELOCATABLE=n, then install the
modules, rebuild the kernel with CONFIG_RELOCATABLE=y and leave the
old modules installed, we crash something like:
Unable to handle kernel paging request for data at address 0xd000000018d66cef
Faulting instruction address: 0xc0000000021ddd08
Oops: Kernel access of bad area, sig: 11 [#1]
Modules linked in: x_tables autofs4
CPU: 2 PID: 1 Comm: systemd Not tainted 4.16.0-rc6-gcc_ubuntu_le-g99fec39 #1
...
NIP check_version.isra.22+0x118/0x170
Call Trace:
__ksymtab_xt_unregister_table+0x58/0xfffffffffffffcb8 [x_tables] (unreliable)
resolve_symbol+0xb4/0x150
load_module+0x10e8/0x29a0
SyS_finit_module+0x110/0x140
system_call+0x58/0x6c
This happens because since commit 71810db27c ("modversions: treat
symbol CRCs as 32 bit quantities"), a relocatable kernel encodes and
handles symbol CRCs differently from a non-relocatable kernel.
Although it's possible we could try and detect this situation and
handle it, it's much more robust to simply make the state of
CONFIG_RELOCATABLE part of the module vermagic.
Fixes: 71810db27c ("modversions: treat symbol CRCs as 32 bit quantities")
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>