Commit Graph

810677 Commits

Author SHA1 Message Date
Christoph Hellwig
74ebe3e733 net: pasemi: set a 64-bit DMA mask on the DMA device
The pasemi driver never set a DMA mask, and given that the powerpc
DMA mapping routines never check it this worked ok so far.  But the
generic dma-direct code which I plan to switch on for powerpc checks
the DMA mask and fails unsupported mapping requests, so we need to
make sure the proper 64-bit mask is set.

Reported-by: Christian Zigotzky <chzigotzky@xenosoft.de>
Signed-off-by: Christoph Hellwig <hch@lst.de>
Tested-by: Christian Zigotzky <chzigotzky@xenosoft.de>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2019-02-18 22:41:01 +11:00
Michael Ellerman
a58007621b powerpc/64s: Fix possible corruption on big endian due to pgd/pud_present()
In v4.20 we changed our pgd/pud_present() to check for _PAGE_PRESENT
rather than just checking that the value is non-zero, e.g.:

  static inline int pgd_present(pgd_t pgd)
  {
 -       return !pgd_none(pgd);
 +       return (pgd_raw(pgd) & cpu_to_be64(_PAGE_PRESENT));
  }

Unfortunately this is broken on big endian, as the result of the
bitwise & is truncated to int, which is always zero because
_PAGE_PRESENT is 0x8000000000000000ul. This means pgd_present() and
pud_present() are always false at compile time, and the compiler
elides the subsequent code.

Remarkably with that bug present we are still able to boot and run
with few noticeable effects. However under some work loads we are able
to trigger a warning in the ext4 code:

  WARNING: CPU: 11 PID: 29593 at fs/ext4/inode.c:3927 .ext4_set_page_dirty+0x70/0xb0
  CPU: 11 PID: 29593 Comm: debugedit Not tainted 4.20.0-rc1 #1
  ...
  NIP .ext4_set_page_dirty+0x70/0xb0
  LR  .set_page_dirty+0xa0/0x150
  Call Trace:
   .set_page_dirty+0xa0/0x150
   .unmap_page_range+0xbf0/0xe10
   .unmap_vmas+0x84/0x130
   .unmap_region+0xe8/0x190
   .__do_munmap+0x2f0/0x510
   .__vm_munmap+0x80/0x110
   .__se_sys_munmap+0x14/0x30
   system_call+0x5c/0x70

The fix is simple: we need to convert the result of the bitwise & to
an int before returning it.
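
The truncation can be reproduced outside the kernel. The following is a
minimal user-space sketch, not the kernel code: PAGE_PRESENT_BE,
present_truncated() and present_fixed() are illustrative names, the byte
swap is left out because cpu_to_be64() is a no-op on big endian, and the
use of !! is an assumption about one way to reduce the result to 0 or 1
before it is narrowed. It relies on GCC/Clang behaviour for the
out-of-range conversion to int.

```c
#include <stdint.h>

/* On big endian cpu_to_be64() is a no-op, so the mask keeps bit 63 set. */
#define PAGE_PRESENT_BE 0x8000000000000000ULL

/* Buggy shape: the 64-bit result is narrowed to int, dropping bit 63. */
int present_truncated(uint64_t raw)
{
	return (int)(raw & PAGE_PRESENT_BE);
}

/* One possible fix: reduce the result to 0 or 1 before it is narrowed. */
int present_fixed(uint64_t raw)
{
	return !!(raw & PAGE_PRESENT_BE);
}
```

With the present bit set, present_truncated() still returns 0 while
present_fixed() returns 1.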

Thanks to Erhard, Jan Kara and Aneesh for help with debugging.

Fixes: da7ad366b4 ("powerpc/mm/book3s: Update pmd_present to look at _PAGE_PRESENT bit")
Cc: stable@vger.kernel.org # v4.20+
Reported-by: Erhard F. <erhard_f@mailbox.org>
Reviewed-by: Aneesh Kumar K.V <aneesh.kumar@linux.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2019-02-17 15:24:45 +11:00
Oliver O'Halloran
b174b4fb91 powerpc/powernv: Escalate reset when IODA reset fails
The IODA reset is used to flush out any OS controlled state from the PHB.
This reset can fail if a PHB fatal error has occurred in early boot,
probably because of a bad device. We already do a fundamental
reset of the device in some cases, so this patch just adds a test to force
a full reset if firmware reports an error when performing the IODA reset.

Signed-off-by: Oliver O'Halloran <oohall@gmail.com>
Reviewed-by: Alexey Kardashevskiy <aik@ozlabs.ru>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2019-02-07 00:29:20 +11:00
Breno Leitao
ebb0e13ead powerpc/ptrace: Mitigate potential Spectre v1
'regno' is directly controlled by user space, hence leading to a potential
exploitation of the Spectre variant 1 vulnerability.

On PTRACE_SETREGS and PTRACE_GETREGS requests, user space passes the
register number that would be read or written. This register number is
called 'regno' which is part of the 'addr' syscall parameter.

This 'regno' value is checked against the maximum pt_regs structure size,
and then used to dereference it, which matches the initial part of a
Spectre v1 (and Spectre v1.1) attack. The dereferenced value, then,
is returned to userspace in the GETREGS case.

This patch sanitizes 'regno' before using it to dereference pt_regs.
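
The sanitizing step can be illustrated in user space. This is a sketch
only: the mask arithmetic is modelled on the generic branchless index
clamp used in the kernel, the names index_mask(), clamp_index() and
read_reg() are hypothetical, and plain C built for user space gives no
real guarantee about speculation. It assumes 'size' is non-zero and at
most LONG_MAX, and that right-shifting a negative long is arithmetic (as
it is with GCC and Clang).

```c
#include <limits.h>
#include <stddef.h>

/* All ones when index < size, zero otherwise, computed without a branch. */
unsigned long index_mask(unsigned long index, unsigned long size)
{
	long ok = ~(long)(index | (size - 1UL - index));

	return (unsigned long)(ok >> (sizeof(long) * CHAR_BIT - 1));
}

/* Out-of-range indexes are forced to 0, so they cannot reach past the array. */
unsigned long clamp_index(unsigned long index, unsigned long size)
{
	return index & index_mask(index, size);
}

/* Bounds check first, then clamp the index before it is used for the load. */
int read_reg(const long *regs, unsigned long nregs, unsigned long regno,
	     long *val)
{
	if (regno >= nregs)
		return -1;
	*val = regs[clamp_index(regno, nregs)];
	return 0;
}
```

The bounds check alone decides the architectural result; the clamp only
matters on a mispredicted path, where it turns a wild index into index 0.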

Notice that given that speculation windows are large, the policy is
to kill the speculation on the first load and not worry if it can be
completed with a dependent load/store [1].

[1] https://marc.info/?l=linux-kernel&m=152449131114778&w=2

Signed-off-by: Breno Leitao <leitao@debian.org>
Acked-by: Gustavo A. R. Silva <gustavo@embeddedor.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2019-02-07 00:29:20 +11:00
Joel Stanley
98ecc6768e powerpc/32: Include .branch_lt in data section
When building a 32 bit powerpc kernel with Binutils 2.31.1 this warning
is emitted:

 powerpc-linux-gnu-ld: warning: orphan section `.branch_lt' from
 `arch/powerpc/kernel/head_44x.o' being placed in section `.branch_lt'

As of binutils commit 2d7ad24e8726 ("Support PLT16 relocs against local
symbols")[1], 32 bit targets can produce .branch_lt sections in their
output.

Include these symbols in the .data section as the ppc64 kernel does.

[1] https://sourceware.org/git/gitweb.cgi?p=binutils-gdb.git;a=commitdiff;h=2d7ad24e8726ba4c45c9e67be08223a146a837ce
Signed-off-by: Joel Stanley <joel@jms.id.au>
Reviewed-by: Alan Modra <amodra@gmail.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2019-02-05 15:47:16 +11:00
Sam Bobroff
195482c363 powerpc/eeh: Correct retries in eeh_pe_reset_full()
Currently, eeh_pe_reset_full() will only attempt to reset a PE more
than once if activating the reset state and deactivating it both
succeed, but later polling shows that it hasn't become active.

Change this so that it will try up to three times for any reason other
than an unrecoverable slot error and adjust the message generation so
that it's clear whether the reset has ultimately succeeded or failed.
This allows the reset to succeed in some situations where it would
currently fail.
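
The retry policy described above can be sketched as follows. This is an
illustration under stated assumptions, not the eeh_pe_reset_full() code:
the status values, struct fake_pe and fake_reset_once() are hypothetical
stand-ins so that the loop can be exercised in user space.

```c
enum reset_status {
	RESET_OK = 0,
	RESET_FAILED = -1,        /* any failure that is worth another try */
	RESET_UNRECOVERABLE = -2, /* stands in for an unrecoverable slot error */
};

#define MAX_RESET_TRIES 3

/* Test double: fails 'failures_left' times with 'failure', then succeeds. */
struct fake_pe {
	int failures_left;
	enum reset_status failure;
	int attempts;
};

enum reset_status fake_reset_once(struct fake_pe *pe)
{
	pe->attempts++;
	if (pe->failures_left > 0) {
		pe->failures_left--;
		return pe->failure;
	}
	return RESET_OK;
}

/* Retry on any failure except an unrecoverable one, at most three times. */
enum reset_status reset_full(struct fake_pe *pe)
{
	enum reset_status ret = RESET_FAILED;

	for (int i = 0; i < MAX_RESET_TRIES; i++) {
		ret = fake_reset_once(pe);
		if (ret == RESET_OK || ret == RESET_UNRECOVERABLE)
			break;
	}
	return ret;
}
```

A PE that fails twice and then resets cleanly succeeds on the third
attempt, while an unrecoverable error stops the loop after one attempt.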

Signed-off-by: Sam Bobroff <sbobroff@linux.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2019-02-05 11:55:45 +11:00
Sam Bobroff
1ef52073fd powerpc/eeh: Improve recovery of passed-through devices
Currently, the EEH recovery process considers passed-through devices
as if they were not EEH-aware, which can cause them to be removed as
part of recovery.  Because device removal requires cooperation from
the guest, this may lead to the process stalling or deadlocking.
Also, if devices are removed on the host side, they will be removed
from their IOMMU group, making recovery in the guest impossible.

Therefore, alter the recovery process so that passed-through devices
are not removed but are instead left frozen (and marked isolated)
until the guest performs its own recovery.  If firmware thaws a
passed-through PE because its parent PE has been thawed (because it
was not passed through), re-freeze it.

Signed-off-by: Sam Bobroff <sbobroff@linux.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2019-02-05 11:55:44 +11:00
Sam Bobroff
4d8e325d9d powerpc/eeh: Add include_passed to eeh_clear_pe_frozen_state()
Add a parameter to eeh_clear_pe_frozen_state() that allows
passed-through PEs to be excluded. Update callers to always pass true
so that there is no change in behaviour.

This is to prepare for follow-up work for passed-through devices.

Signed-off-by: Sam Bobroff <sbobroff@linux.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2019-02-05 11:55:44 +11:00
Sam Bobroff
9ed5ca66aa powerpc/eeh: Add include_passed to eeh_pe_state_clear()
Add a parameter to eeh_pe_state_clear() that allows passed-through PEs
to be excluded. Update callers to always pass true so that there is no
change in behaviour.

Also refactor to use direct traversal, to allow the removal of some
boilerplate.

This is to prepare for follow-up work for passed-through devices.

Signed-off-by: Sam Bobroff <sbobroff@linux.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2019-02-05 11:55:43 +11:00
Sam Bobroff
188fdea69f powerpc/eeh: remove sw_state from eeh_unfreeze_pe()
eeh_unfreeze_pe() performs two operations: unfreezing a PE (which may
cause firmware to unfreeze child PEs as well) and de-isolating the PE
and its children.

To simplify this and support future work, separate out the
de-isolation and perform it at the call sites (when necessary).

There should be no change in behaviour.

Signed-off-by: Sam Bobroff <sbobroff@linux.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2019-02-05 11:55:42 +11:00
Sam Bobroff
3376cb91ed powerpc/eeh: Cleanup eeh_pe_clear_frozen_state()
The 'clear_sw_state' parameter for eeh_pe_clear_frozen_state() is
redundant because it has no effect (except in the rare case of a
hardware error part way through unfreezing a tree of PEs, where it
would dangerously allow partial de-isolation before returning
failure).

It is passed down to __eeh_pe_clear_frozen_state(), and from there to
eeh_unfreeze_pe(), where it causes EEH_PE_ISOLATED to be removed
from the state of each PE during the traversal.  However, when the
traversal finishes, EEH_PE_ISOLATED is unconditionally removed by a
call to eeh_pe_state_clear() regardless of the parameter's value.

So remove the flag and pass false to eeh_unfreeze_pe() (to avoid the
rare case described above, as it was before the flag was introduced).
Also, perform the recursion directly in the function and eliminate a
bit of boilerplate.

There should be no change in functionality, except as mentioned above.

Signed-off-by: Sam Bobroff <sbobroff@linux.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2019-02-05 11:55:41 +11:00
Christophe Leroy
26b523356f powerpc: Drop page_is_ram() and walk_system_ram_range()
Since commit c40dd2f766 ("powerpc: Add System RAM to /proc/iomem")
it is possible to use the generic walk_system_ram_range() and
the generic page_is_ram().

To enable the use of walk_system_ram_range() by the IBM EHEA ethernet
driver, we still need an export of the generic function.

As powerpc was the only user of CONFIG_ARCH_HAS_WALK_MEMORY, the
ifdef around the generic walk_system_ram_range() has become useless
and can be dropped.

Fixes: c40dd2f766 ("powerpc: Add System RAM to /proc/iomem")
Signed-off-by: Christophe Leroy <christophe.leroy@c-s.fr>
[mpe: Keep the EXPORT_SYMBOL_GPL in powerpc code]
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2019-02-04 21:22:06 +11:00
Mathieu Malaterre
8e0f973575 Move static keyword at beginning of declaration
Move the static keyword around to remove the following warnings (W=1):

  arch/powerpc/platforms/ps3/os-area.c:212:1: error: 'static' is not at beginning of declaration [-Werror=old-style-declaration]
  arch/powerpc/platforms/ps3/system-bus.c:45:1: error: 'static' is not at beginning of declaration [-Werror=old-style-declaration]

Signed-off-by: Mathieu Malaterre <malat@debian.org>
Acked-by: Geoff Levand <geoff@infradead.org>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2019-02-03 20:44:19 +11:00
Mathieu Malaterre
e5c27ef7a5 powerpc: Remove trailing semicolon after curly brace
There is no point in having a trailing semicolon after a closing curly
brace. Remove it.

Signed-off-by: Mathieu Malaterre <malat@debian.org>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2019-02-03 20:01:03 +11:00
Christian Lamparter
423bfc69d7 powerpc: Enable kernel XZ compression option on 44x
Enable kernel XZ compression option on 44x.
Tested on a Western Digital - MyBook Live NAS.
It takes 22 seconds for the 800 MHz CPU to decompress
and boot a 2.63 MiB XZ-compressed kernel simpleImage.

Signed-off-by: Christian Lamparter <chunkeey@gmail.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2019-02-01 22:04:55 +11:00
Oliver O'Halloran
5a3840a470 powerpc/papr_scm: Use the correct bind address
When binding an SCM volume to a physical address the hypervisor has the
option to return early with a continue token with the expectation that
the guest will resume the bind operation until it completes. A quirk of
this interface is that the bind address will only be returned by the
first bind h-call and the subsequent calls will return
0xFFFF_FFFF_FFFF_FFFF for the bind address.

We currently do not save the address returned by the first h-call. As a
result we will use the junk address as the base of the bound region if
the hypervisor decides to split the bind across multiple h-calls. This
bug was found when testing with very large SCM volumes where the bind
process would take more time than the hypervisor's internal h-call time
limit would allow. This patch fixes the issue by saving the bind address
from the first call.
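
A minimal sketch of the idea, assuming a fake hypervisor in place of the
real h-call: struct fake_hv, fake_bind_call(), bind_volume() and the
BIND_* values are all hypothetical names used only to show why the
address from the first call has to be kept.

```c
#include <stdint.h>

#define BIND_DONE     0
#define BIND_CONTINUE 1
#define BIND_NO_ADDR  UINT64_MAX /* 0xFFFF_FFFF_FFFF_FFFF */

/* Fake hypervisor: needs 'calls_needed' calls, reports the address once. */
struct fake_hv {
	int calls_needed;
	int calls_made;
	uint64_t real_addr;
};

int fake_bind_call(struct fake_hv *hv, uint64_t *addr)
{
	hv->calls_made++;
	*addr = (hv->calls_made == 1) ? hv->real_addr : BIND_NO_ADDR;
	return (hv->calls_made < hv->calls_needed) ? BIND_CONTINUE : BIND_DONE;
}

/* Keep the address from the first call; later calls only report progress. */
uint64_t bind_volume(struct fake_hv *hv)
{
	uint64_t saved = BIND_NO_ADDR, addr;
	int rc, first = 1;

	do {
		rc = fake_bind_call(hv, &addr);
		if (first) {
			saved = addr;
			first = 0;
		}
	} while (rc == BIND_CONTINUE);

	return saved;
}
```

Using the address from the last call instead would return BIND_NO_ADDR
whenever the bind is split across more than one call.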

Cc: stable@vger.kernel.org
Fixes: b5beae5e22 ("powerpc/pseries: Add driver for PAPR SCM regions")
Signed-off-by: Oliver O'Halloran <oohall@gmail.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2019-02-01 10:13:51 +11:00
Aneesh Kumar K.V
579b9239c1 powerpc/radix: Fix kernel crash with mremap()
With support for split pmd lock, we use the pmd page's pmd_huge_pte
pointer to store the deposited page table. In those configs, when we move
page tables we need to make sure we move the deposited page table to the
correct pmd page. Otherwise this can result in a crash when we withdraw
the deposited page table, because we can find pmd_huge_pte NULL.

eg:

  __split_huge_pmd+0x1070/0x1940
  __split_huge_pmd+0xe34/0x1940 (unreliable)
  vma_adjust_trans_huge+0x110/0x1c0
  __vma_adjust+0x2b4/0x9b0
  __split_vma+0x1b8/0x280
  __do_munmap+0x13c/0x550
  sys_mremap+0x220/0x7e0
  system_call+0x5c/0x70

Fixes: 675d995297 ("powerpc/book3s64: Enable split pmd ptlock.")
Cc: stable@vger.kernel.org # v4.18+
Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2019-01-31 20:10:15 +11:00
Joe Lawrence
3de27dcf81 powerpc/livepatch: return -ERRNO values in save_stack_trace_tsk_reliable()
To match its x86 counterpart, save_stack_trace_tsk_reliable() should
return -EINVAL in cases that it is currently returning 1.  No caller is
currently differentiating non-zero error codes, but let's keep the
arch-specific implementations consistent.

Signed-off-by: Joe Lawrence <joe.lawrence@redhat.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2019-01-31 16:43:38 +11:00
Joe Lawrence
29a77bbb0c powerpc/livepatch: small cleanups in save_stack_trace_tsk_reliable()
Mostly cosmetic changes:

- Group common stack pointer code at the top
- Simplify the first frame logic
- Code stackframe iteration into for...loop construct
- Check for trace->nr_entries overflow before adding any into the array

Suggested-by: Nicolai Stange <nstange@suse.de>
Signed-off-by: Joe Lawrence <joe.lawrence@redhat.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2019-01-31 16:43:38 +11:00
Joe Lawrence
18be37603d powerpc/livepatch: relax reliable stack tracer checks for first-frame
The bottom-most stack frame (the first to be unwound) may be largely
uninitialized, for the "Power Architecture 64-Bit ELF V2 ABI" only
requires its backchain pointer to be set.

The reliable stack tracer should be careful when verifying this frame:
skip checks on STACK_FRAME_LR_SAVE and STACK_FRAME_MARKER offsets that
may contain uninitialized residual data.

Fixes: df78d3f614 ("powerpc/livepatch: Implement reliable stack tracing for the consistency model")
Signed-off-by: Joe Lawrence <joe.lawrence@redhat.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2019-01-31 16:43:38 +11:00
Nicolai Stange
a50d3250d7 powerpc/64s: Make reliable stacktrace dependency clearer
Make the HAVE_RELIABLE_STACKTRACE Kconfig option depend on
PPC_BOOK3S_64 for documentation purposes. Before this patch, it
depended on PPC64 && CPU_LITTLE_ENDIAN and because CPU_LITTLE_ENDIAN
implies PPC_BOOK3S_64, there's no functional change here.

Signed-off-by: Nicolai Stange <nstange@suse.de>
Signed-off-by: Joe Lawrence <joe.lawrence@redhat.com>
[mpe: Split out of larger patch]
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2019-01-31 16:43:29 +11:00
Nicolai Stange
eddd0b3323 powerpc/64s: Clear on-stack exception marker upon exception return
The ppc64 specific implementation of the reliable stacktracer,
save_stack_trace_tsk_reliable(), bails out and reports an "unreliable
trace" whenever it finds an exception frame on the stack. Stack frames
are classified as exception frames if the STACK_FRAME_REGS_MARKER
magic, as written by exception prologues, is found at a particular
location.

However, as observed by Joe Lawrence, it is possible in practice that
non-exception stack frames can alias with prior exception frames and
thus, that the reliable stacktracer can find a stale
STACK_FRAME_REGS_MARKER on the stack. It in turn falsely reports an
unreliable stacktrace and blocks any live patching transition from
finishing. Said condition lasts until the stack frame is
overwritten/initialized by function call or other means.

In principle, we could mitigate this by making the exception frame
classification condition in save_stack_trace_tsk_reliable() stronger:
in addition to testing for STACK_FRAME_REGS_MARKER, we could also take
into account that for all exceptions executing on the kernel stack
  - their stack frames' backlink pointers always match what is saved
    in their pt_regs instance's ->gpr[1] slot and that
  - their exception frame size equals STACK_INT_FRAME_SIZE, a value
    uncommonly large for non-exception frames.

However, while these are currently true, relying on them would make
the reliable stacktrace implementation more sensitive towards future
changes in the exception entry code. Note that false negatives, i.e.
not detecting exception frames, would silently break the live patching
consistency model.

Furthermore, certain other places (diagnostic stacktraces, perf, xmon)
rely on STACK_FRAME_REGS_MARKER as well.

Make the exception exit code clear the on-stack
STACK_FRAME_REGS_MARKER for those exceptions running on the "normal"
kernel stack and returning to kernelspace: because the topmost frame
is ignored by the reliable stack tracer anyway, returns to userspace
don't need to take care of clearing the marker.

Furthermore, as I don't have the ability to test this on Book 3E or 32
bits, limit the change to Book 3S and 64 bits.

Fixes: df78d3f614 ("powerpc/livepatch: Implement reliable stack tracing for the consistency model")
Reported-by: Joe Lawrence <joe.lawrence@redhat.com>
Signed-off-by: Nicolai Stange <nstange@suse.de>
Signed-off-by: Joe Lawrence <joe.lawrence@redhat.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2019-01-31 16:40:25 +11:00
Madhavan Srinivasan
ab4510e9ac powerpc/perf: Add mem access events to sysfs
Add mem-loads/mem-stores events to sysfs.
The event is formed based on raw event encoding.
Primary PMU event used here is PM_MRK_INST_CMPL
along with MMCRA[SM] modes and Thresholding bit.

Signed-off-by: Madhavan Srinivasan <maddy@linux.vnet.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2019-01-31 10:38:27 +11:00
Reza Arbab
865a9432d1 powerpc/mm: Add _PAGE_SAO to _PAGE_CACHE_CTL mask
In htab_convert_pte_flags(), _PAGE_CACHE_CTL is used to check for the
_PAGE_SAO flag:

  else if ((pteflags & _PAGE_CACHE_CTL) == _PAGE_SAO)
          rflags |= (HPTE_R_W | HPTE_R_I | HPTE_R_M);

But, it isn't defined to include that flag:

  #define _PAGE_CACHE_CTL (_PAGE_NON_IDEMPOTENT | _PAGE_TOLERANT)

This happens to work, but only because of the flag values:

  #define _PAGE_SAO               0x00010 /* Strong access order */
  #define _PAGE_NON_IDEMPOTENT    0x00020 /* non idempotent memory */
  #define _PAGE_TOLERANT          0x00030 /* tolerant memory, cache inhibited */

To prevent any issues if these particulars ever change, add _PAGE_SAO to
the mask.
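
The arithmetic behind "happens to work" can be checked directly. The
sketch below copies the three flag values quoted above; the names without
the leading underscore, CACHE_CTL_OLD, CACHE_CTL_NEW and cache_ctl_is()
are illustrative, and the 0x40 value in the note that follows is a made-up
"what if" rather than a real flag.

```c
#define PAGE_SAO            0x00010UL
#define PAGE_NON_IDEMPOTENT 0x00020UL
#define PAGE_TOLERANT       0x00030UL

#define CACHE_CTL_OLD (PAGE_NON_IDEMPOTENT | PAGE_TOLERANT)
#define CACHE_CTL_NEW (PAGE_SAO | PAGE_NON_IDEMPOTENT | PAGE_TOLERANT)

/* Same shape as the check above, with the mask and flag as parameters. */
int cache_ctl_is(unsigned long pteflags, unsigned long mask,
		 unsigned long flag)
{
	return (pteflags & mask) == flag;
}
```

Both masks evaluate to 0x30 today because _PAGE_TOLERANT already covers
the _PAGE_SAO bit. If the flag lived at a bit outside 0x30, for example a
hypothetical 0x40, the comparison against the old mask could never match.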

Suggested-by: Charles Johns <crjohns@us.ibm.com>
Signed-off-by: Reza Arbab <arbab@linux.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2019-01-31 00:36:06 +11:00
Sabyasachi Gupta
45a202a3fe powerpc/cell: Remove duplicate header
Remove linux/syscalls.h which is included more than once

Signed-off-by: Sabyasachi Gupta <sabyasachi.linux@gmail.com>
Acked-by: Souptick Joarder <jrdr.linux@gmail.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2019-01-31 00:00:23 +11:00
Sabyasachi Gupta
f069a062ec powerpc/powernv: Remove duplicate header
Remove linux/printk.h which is included more than once.

Signed-off-by: Sabyasachi Gupta <sabyasachi.linux@gmail.com>
Acked-by: Souptick Joarder <jrdr.linux@gmail.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2019-01-30 23:59:28 +11:00
Brajeswar Ghosh
75f8a37580 powerpc/kernel/time: Remove duplicate header
Remove linux/rtc.h which is included more than once

Signed-off-by: Brajeswar Ghosh <brajeswar.linux@gmail.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2019-01-30 23:42:31 +11:00
Vaibhav Jain
edeb304f65 cxl: Wrap iterations over afu slices inside 'afu_list_lock'
Within the cxl module, iteration over the array 'adapter->afu' may be racy
at a few points, as it might be read during an EEH at the same time as its
contents are being set to NULL while the driver is being unloaded or
unbound from the adapter. This might result in a NULL pointer to
'struct afu' being dereferenced during an EEH, thereby causing a kernel oops.

This patch fixes this by making sure that all access to the array
'adapter->afu' is wrapped within the context of spin-lock
'adapter->afu_list_lock'.
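
The locking pattern can be sketched in user space. This is an analogy
under stated assumptions, not the cxl code: a C11 atomic_flag stands in
for the kernel spinlock, and adapter_init(), adapter_set_slice() and
adapter_count_slices() are hypothetical helpers.

```c
#include <stdatomic.h>
#include <stddef.h>

#define MAX_SLICES 4

struct afu {
	int id;
};

struct adapter {
	atomic_flag afu_list_lock;   /* stand-in for a kernel spinlock */
	struct afu *afu[MAX_SLICES]; /* entries may be NULL */
};

static void list_lock(struct adapter *a)
{
	while (atomic_flag_test_and_set_explicit(&a->afu_list_lock,
						 memory_order_acquire))
		; /* spin until the holder clears the flag */
}

static void list_unlock(struct adapter *a)
{
	atomic_flag_clear_explicit(&a->afu_list_lock, memory_order_release);
}

void adapter_init(struct adapter *a)
{
	atomic_flag_clear(&a->afu_list_lock);
	for (int i = 0; i < MAX_SLICES; i++)
		a->afu[i] = NULL;
}

/* Writers (for example unbind setting a slot to NULL) take the lock. */
void adapter_set_slice(struct adapter *a, int slice, struct afu *afu)
{
	list_lock(a);
	a->afu[slice] = afu;
	list_unlock(a);
}

/* Readers check each entry for NULL while the lock is still held. */
int adapter_count_slices(struct adapter *a)
{
	int count = 0;

	list_lock(a);
	for (int i = 0; i < MAX_SLICES; i++)
		if (a->afu[i] != NULL)
			count++;
	list_unlock(a);
	return count;
}
```

Because the NULL check and the use of the entry happen under the same
lock, a concurrent writer cannot clear the slot between the two.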

Fixes: 9e8df8a219 ("cxl: EEH support")
Cc: stable@vger.kernel.org # v4.3+
Acked-by: Andrew Donnellan <andrew.donnellan@au1.ibm.com>
Acked-by: Frederic Barrat <fbarrat@linux.ibm.com>
Acked-by: Christophe Lombard <clombard@linux.vnet.ibm.com>
Signed-off-by: Vaibhav Jain <vaibhav@linux.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2019-01-30 23:36:53 +11:00
Christophe Leroy
9bf3d3c4e4 powerpc/traps: Fix the message printed when stack overflows
Today's message is useless:

  [   42.253267] Kernel stack overflow in process (ptrval), r1=c65500b0

This patch fixes it:

  [   66.905235] Kernel stack overflow in process sh[356], r1=c65560b0
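
The shape of the new message can be sketched with snprintf(). This is a
user-space illustration: format_overflow_msg() is a hypothetical helper,
and the task name and PID are passed in as plain arguments rather than
read from a task structure.

```c
#include <stdio.h>
#include <string.h>

/* Identify the task by name and PID instead of printing a hashed pointer. */
int format_overflow_msg(char *buf, size_t len, const char *comm, int pid,
			unsigned long r1)
{
	return snprintf(buf, len,
			"Kernel stack overflow in process %s[%d], r1=%lx",
			comm, pid, r1);
}
```

Called with "sh", 356 and 0xc65560b0, it produces the text of the second
message above, without the timestamp prefix.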

Fixes: ad67b74d24 ("printk: hash addresses printed with %p")
Cc: stable@vger.kernel.org # v4.15+
Signed-off-by: Christophe Leroy <christophe.leroy@c-s.fr>
[mpe: Use task_pid_nr()]
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2019-01-30 23:31:44 +11:00
Nathan Fontenot
81b6132492 powerpc/pseries: Perform full re-add of CPU for topology update post-migration
On pseries systems, performing a partition migration can result in
altering the nodes a CPU is assigned to on the destination system. For
example, pre-migration on the source system CPUs are in nodes 1 and 3,
post-migration on the destination system CPUs are in nodes 2 and 3.

Handling the node change for a CPU can cause corruption in the slab
cache if we hit a timing where a CPU's node is changed while cache_reap()
is invoked. The corruption occurs because the slab cache code appears
to rely on the CPU and slab cache pages being on the same node.

The current dynamic updating of a CPU's node done in arch/powerpc/mm/numa.c
does not prevent us from hitting this scenario.

Changing the device tree property update notification handler that
recognizes an affinity change for a CPU to do a full DLPAR remove and
add of the CPU instead of dynamically changing its node resolves this
issue.

Signed-off-by: Nathan Fontenot <nfont@linux.vnet.ibm.com>
Signed-off-by: Michael W. Bringmann <mwb@linux.vnet.ibm.com>
Tested-by: Michael W. Bringmann <mwb@linux.vnet.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2019-01-30 23:28:56 +11:00
Michael Ellerman
7bea7ac0ca powerpc/syscalls: Fix syscall tracing
Recently in commit fbf508da74 ("powerpc: split compat syscall table
out from native table") we changed the layout of the system call
table. Instead of having two entries for each syscall number, one for
the regular entry point and one for the compat entry point, we now
have separate tables for regular and compat entry points.

This inadvertently broke syscall tracing (CONFIG_FTRACE_SYSCALLS),
because our implementation of arch_syscall_addr() knew about the
layout of the table (it did nr * 2).

We can fix it just by dropping our version of arch_syscall_addr() and
using the generic version which does:

	return (unsigned long)sys_call_table[nr];
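
The effect of the layout change can be shown with toy tables. Everything
here is illustrative: the table names, the three-entry size and the
address values are made up, and addr_stale() models the old nr * 2 lookup
applied to the new, split table.

```c
#define NR_SYSCALLS 3

/* Old layout: native and compat entries interleaved in a single table. */
static const unsigned long interleaved_table[NR_SYSCALLS * 2] = {
	0x100, 0x900, /* nr 0: native, compat */
	0x110, 0x910, /* nr 1 */
	0x120, 0x920, /* nr 2 */
};

/* New layout: one table per ABI, indexed directly by syscall number. */
static const unsigned long native_table[NR_SYSCALLS] = {
	0x100, 0x110, 0x120,
};

unsigned long addr_old_layout(int nr)
{
	return interleaved_table[nr * 2];
}

unsigned long addr_new_layout(int nr)
{
	return native_table[nr];
}

/* Stale lookup: still doubles nr, so it lands on the wrong entry. */
unsigned long addr_stale(int nr)
{
	if (nr * 2 >= NR_SYSCALLS)
		return 0; /* would read past the end of the split table */
	return native_table[nr * 2];
}
```

For syscall 1 the old and new lookups agree on 0x110, while the stale
lookup returns the entry for syscall 2.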

Fixes: fbf508da74 ("powerpc: split compat syscall table out from native table")
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2019-01-15 21:32:25 +11:00
Jason A. Donenfeld
da727097a4 powerpc/pseries: Fix build break due to pnv_npu2_init()
Commit 3be2df00e2 ("powerpc/pseries/npu: Enable platform support")
added a call to pnv_npu2_init() in pseries code. This causes a build
break if we build with CONFIG_PPC_PSERIES && !CONFIG_PPC_POWERNV:

  powerpc64le-pc-linux-gnu-ld: arch/powerpc/platforms/pseries/pci.o: in function `pSeries_final_fixup':
  pci.c:(.init.text+0x1b0): undefined reference to `pnv_npu2_init'

This commit therefore wraps that line in an ifdef, so that pseries
builds without powernv.

Fixes: 3be2df00e2 ("powerpc/pseries/npu: Enable platform support")
Signed-off-by: Jason A. Donenfeld <Jason@zx2c4.com>
Reviewed-by: Alexey Kardashevskiy <aik@ozlabs.ru>
[mpe: Frob change log a bit to blame a different commit]
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2019-01-15 21:27:47 +11:00
Igor Stoppa
63da6caeb8 powerpc: remove unnecessary unlikely()
WARN_ON() already contains an unlikely(), so it's not necessary to
wrap it into another.

Signed-off-by: Igor Stoppa <igor.stoppa@huawei.com>
Cc: Arseny Solokha <asolokha@kb.kras.ru>
Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Michael Ellerman <mpe@ellerman.id.au>
Cc: linuxppc-dev@lists.ozlabs.org
Cc: linux-kernel@vger.kernel.org
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2019-01-15 11:38:05 +11:00
Mathieu Malaterre
9bd10b6498 powerpc: Allow CPU selection of G4/74xx variant
GCC supports -mcpu=G4

This patch gives the opportunity to select ALTIVEC for this variant.

Signed-off-by: Mathieu Malaterre <malat@debian.org>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2019-01-15 11:19:45 +11:00
Michael Ellerman
16842516ea powerpc/64s: Add MMU type to __die() output
On Power9 machines (64-bit Book3S), we can be running with either the
Hash table or Radix tree MMU enabled. So add some text to the __die()
output to tell us which is enabled, for the case where all you have is
the oops output and no other information.

Example output:

  kernel BUG at drivers/misc/lkdtm/bugs.c:63!
  Oops: Exception in kernel mode, sig: 5 [#1]
  LE PAGE_SIZE=64K MMU=Hash SMP NR_CPUS=2048 NUMA pSeries
  Modules linked in: kvm vmx_crypto binfmt_misc ip_tables x_tables

Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Reviewed-by: Christophe Leroy <christophe.leroy@c-s.fr>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2019-01-15 11:17:10 +11:00
Michael Ellerman
184051396b powerpc: Show PAGE_SIZE in __die() output
The page size the kernel is built with is useful info when debugging a
crash, so add it to the output in __die().

Result looks like eg:

  kernel BUG at drivers/misc/lkdtm/bugs.c:63!
  Oops: Exception in kernel mode, sig: 5 [#1]
  LE PAGE_SIZE=64K SMP NR_CPUS=2048 NUMA pSeries
  Modules linked in: vmx_crypto kvm binfmt_misc ip_tables

Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Reviewed-by: Christophe Leroy <christophe.leroy@c-s.fr>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2019-01-15 11:17:09 +11:00
Michael Ellerman
782274434d powerpc: Stop using pr_cont() in __die()
Using pr_cont() risks having our output interleaved with other output
from other CPUs. Instead print everything in a single printk() call.
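
A user-space sketch of the single-call approach, assuming a hypothetical
struct die_info and format_die_banner(): every optional piece is resolved
to a string first, and the whole line is then produced by one formatting
call instead of a series of continuation prints.

```c
#include <stdio.h>
#include <string.h>

struct die_info {
	int little_endian;
	unsigned long page_size; /* bytes */
	int nr_cpus;             /* 0 means a non-SMP build */
	int numa;
	const char *platform;
};

/* Assemble the whole banner in one buffer so it can be emitted at once. */
int format_die_banner(char *buf, size_t len, const struct die_info *info)
{
	char smp[32] = "";

	if (info->nr_cpus > 0)
		snprintf(smp, sizeof(smp), " SMP NR_CPUS=%d", info->nr_cpus);

	return snprintf(buf, len, "%s PAGE_SIZE=%luK%s%s %s",
			info->little_endian ? "LE" : "BE",
			info->page_size / 1024, smp,
			info->numa ? " NUMA" : "", info->platform);
}
```

Since the banner reaches the log as one string, output from another CPU
cannot land in the middle of it.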

Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Reviewed-by: Christophe Leroy <christophe.leroy@c-s.fr>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2019-01-15 11:17:09 +11:00
Breno Leitao
a65329aa7d selftests/powerpc: New TM signal self test
A new self test that forces MSR[TS] to be set without calling any TM
instruction. This test also tries to cause a page fault at a signal
handler, exactly between MSR[TS] set and tm_recheckpoint(), forcing
thread->texasr to be rewritten with TEXASR[FS] = 0, which will cause a BUG
when tm_recheckpoint() is called.

This test is not deterministic, since it is hard to guarantee that the page
access will cause a page fault. In order to force more page faults at
signal context, the signal handler and the ucontext are being mapped into
MADV_DONTNEED memory chunks.

Tests have shown that the bug could be exposed with few interactions in a
buggy kernel. This test is configured to loop 5000x, having a good chance
to hit the kernel issue in just one run.  This self test takes less than
two seconds to run.

This test uses set/getcontext because the kernel will recheckpoint
zeroed structures, causing the test to segfault, which is undesired because
the test needs to rerun, so, there is a signal handler for SIGSEGV which
will restart the test.

v2: Uses the MADV_DONTNEED memory advice
v3: Fix memcpy and 32-bits compilation
v4: Does not define unused macros

Signed-off-by: Breno Leitao <leitao@debian.org>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2019-01-15 11:17:09 +11:00
Jonathan Neuschäfer
8de7547e03 powerpc: wii.dts: Add GPIO keys
The Wii has POWER and EJECT buttons, which are connected through
normalization logic to the GPIO controller (the length of an assertion
of these signals is always the same, regardless of how long the user
pressed the buttons).

Signed-off-by: Jonathan Neuschäfer <j.neuschaefer@gmx.net>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2019-01-15 11:17:09 +11:00
Jonathan Neuschäfer
f4ddc19a71 powerpc: wii.dts: Add interrupt-related properties to GPIO node
The Hollywood GPIO controller is connected to the Hollywood PIC (&PIC1)
at IRQs 10 and 11; IRQ 10 for GPIO lines that are configured for access
by the PPC, 11 for GPIO lines that are configured for access by the
ARM926.

Signed-off-by: Jonathan Neuschäfer <j.neuschaefer@gmx.net>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2019-01-15 11:17:09 +11:00
Alexey Kardashevskiy
797eadd9c8 powerpc/powernv/npu: Remove obsolete comment about TCE_KILL_INVAL_ALL
TCE_KILL_INVAL_ALL has moved long ago but the comment was forgotten so
finish the move and remove the comment.

Fixes: 0bbcdb437d "powerpc/powernv/npu: TCE Kill helpers cleanup"
Signed-off-by: Alexey Kardashevskiy <aik@ozlabs.ru>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2019-01-15 11:17:09 +11:00
Alexey Kardashevskiy
c35f78d7a4 powerpc/powernv: Remove never used pnv_power9_force_smt4
This removes the never-used symbol pnv_power9_force_smt4.

Note that we might still want to add stubs for:
	void pnv_power9_force_smt4_catch(void);
	void pnv_power9_force_smt4_release(void);

Fixes: 7672691a08 "powerpc/powernv: Provide a way to force a core into SMT4 mode"
Signed-off-by: Alexey Kardashevskiy <aik@ozlabs.ru>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2019-01-15 11:17:09 +11:00
Alexey Kardashevskiy
cd6b8a631c powerpc/mm: Fix compile when CONFIG_PPC_RADIX_MMU is not defined
This adds some stubs for hash only configs.

Signed-off-by: Alexey Kardashevskiy <aik@ozlabs.ru>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2019-01-15 11:17:09 +11:00
Joel Stanley
a652758ac1 powerpc: Use ALIGN instead of BLOCK
In the ld documentation under Builtin Functions:

  BLOCK(exp)

    This is a synonym for ALIGN, for compatibility with older linker scripts.

Clang's linker (lld) doesn't know about BLOCK so remove this use of
it.

Suggested-by: George Rimar <grimar@accesssoftek.com>
Signed-off-by: Joel Stanley <joel@jms.id.au>
Reviewed-by: Nick Desaulniers <ndesaulniers@google.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2019-01-15 11:12:10 +11:00
Corentin Labbe
acef5e0165 powerpc/dts: Build virtex dtbs
I wanted to test the virtex440-ml507 qemu machine and found that the
dtb for it was not built.

All powerpc dtbs are only built when CONFIG_OF_ALL_DTBS is set which
depends on COMPILE_TEST.

This patch enables building of the virtex dtbs when
CONFIG_XILINX_VIRTEX440_GENERIC_BOARD is enabled.

Signed-off-by: Corentin Labbe <clabbe@baylibre.com>
[mpe: Put both targets on a single line]
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2019-01-14 20:39:27 +11:00
Christophe Leroy
8acb88682c powerpc/ipic: drop unused functions
ipic_set_highest_priority(), ipic_enable_mcp() and ipic_disable_mcp()
are unused. This patch drops them.

Signed-off-by: Christophe Leroy <christophe.leroy@c-s.fr>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2019-01-14 20:39:27 +11:00
Gustavo A. R. Silva
00def7130a powerpc/spufs: use struct_size() in kmalloc()
One of the more common cases of allocation size calculations is finding
the size of a structure that has a zero-sized array at the end, along
with memory for some number of elements for that array. For example:

struct foo {
    int stuff;
    void *entry[];
};

instance = kmalloc(sizeof(struct foo) + sizeof(void *) * count, GFP_KERNEL);

Instead of leaving these open-coded and prone to type mistakes, we can
now use the new struct_size() helper:

instance = kmalloc(struct_size(instance, entry, count), GFP_KERNEL);
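
What an overflow-checked size calculation buys can be sketched in user
space. This is not the kernel helper: flex_size() and foo_alloc() are
hypothetical, malloc() stands in for kmalloc(), and returning SIZE_MAX on
overflow is an assumption made here so that the allocation is refused
rather than silently undersized.

```c
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

struct foo {
	int stuff;
	void *entry[];
};

/* base + elem * count, or SIZE_MAX if the result would not fit. */
size_t flex_size(size_t base, size_t elem, size_t count)
{
	if (elem != 0 && count > (SIZE_MAX - base) / elem)
		return SIZE_MAX;
	return base + elem * count;
}

/* Allocate a struct foo with room for 'count' entries, or return NULL. */
struct foo *foo_alloc(size_t count)
{
	size_t bytes = flex_size(sizeof(struct foo), sizeof(void *), count);

	if (bytes == SIZE_MAX)
		return NULL;
	return malloc(bytes);
}
```

The open-coded sum can wrap around for a huge count and yield a small
allocation; the checked version turns that case into a failed allocation.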

This code was detected with the help of Coccinelle.

Signed-off-by: Gustavo A. R. Silva <gustavo@embeddedor.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2019-01-14 20:39:27 +11:00
Masahiro Yamada
fbe3ab014f powerpc: math-emu: remove unneeded header search paths
The header search path -I. in kernel Makefiles is very suspicious;
it allows the compiler to search for headers in the top of $(srctree),
where obviously no header file exists.

-Iinclude/math-emu seems unnecessary because all files include headers
in the form of #include <math-emu/...>.

I was able to build without these header search paths.

Signed-off-by: Masahiro Yamada <yamada.masahiro@socionext.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2019-01-14 20:39:27 +11:00
Masahiro Yamada
b00899b895 powerpc: remove redundant header search path additions
The same path -Iarch/$(ARCH) is passed to KBUILD_CPPFLAGS,
KBUILD_AFLAGS, and KBUILD_CFLAGS.

As you see in scripts/Makefile.lib, KBUILD_CPPFLAGS is passed
to c_flags and a_flags as well.

Passing it to KBUILD_CPPFLAGS is enough.

Signed-off-by: Masahiro Yamada <yamada.masahiro@socionext.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2019-01-14 20:39:27 +11:00
Masahiro Yamada
c142e9741e KVM: powerpc: remove -I. header search paths
The header search path -I. in kernel Makefiles is very suspicious;
it allows the compiler to search for headers in the top of $(srctree),
where obviously no header file exists.

Commit 46f43c6ee0 ("KVM: powerpc: convert marker probes to event
trace") first added these options, but they are completely useless.

Signed-off-by: Masahiro Yamada <yamada.masahiro@socionext.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2019-01-14 20:39:27 +11:00