linux_dsm_epyc7002

mirror of https://github.com/AuxXxilium/linux_dsm_epyc7002.git synced 2024-12-24 16:38:21 +07:00

Author	SHA1	Message	Date
Nicholas Piggin	a2e366832f	powerpc/64: mark emergency stacks valid to unwind Before: WARNING: CPU: 0 PID: 494 at arch/powerpc/kernel/irq.c:343 CPU: 0 PID: 494 Comm: a Tainted: G W NIP: c00000000001ed2c LR: c000000000d13190 CTR: c00000000003f910 REGS: c0000001fffd3870 TRAP: 0700 Tainted: G W MSR: 8000000000021003 <SF,ME,RI,LE> CR: 28000488 XER: 00000000 CFAR: c00000000001ec90 IRQMASK: 0 GPR00: c000000000aeb12c c0000001fffd3b00 c0000000012ba300 0000000000000000 GPR04: 0000000000000000 0000000000000000 000000010bd207c8 6b00696e74657272 GPR08: 0000000000000000 0000000000000000 0000000000000000 efbeadde00000000 GPR12: 0000000000000000 c0000000014a0000 0000000000000000 0000000000000000 GPR16: 0000000000000000 0000000000000000 0000000000000000 0000000000000000 GPR20: 0000000000000000 0000000000000000 0000000000000000 0000000000000000 GPR24: 0000000000000000 0000000000000000 0000000000000000 000000010bd207bc GPR28: 0000000000000000 c00000000148a898 0000000000000000 c0000001ffff3f50 NIP [c00000000001ed2c] arch_local_irq_restore.part.0+0xac/0x100 LR [c000000000d13190] _raw_spin_unlock_irqrestore+0x50/0xc0 Call Trace: Instruction dump: 60000000 7d2000a6 71298000 41820068 39200002 7d210164 4bffff9c 60000000 60000000 7d2000a6 71298000 4c820020 <0fe00000> 4e800020 60000000 60000000 After: WARNING: CPU: 0 PID: 499 at arch/powerpc/kernel/irq.c:343 CPU: 0 PID: 499 Comm: a Not tainted NIP: c00000000001ed2c LR: c000000000d13210 CTR: c00000000003f980 REGS: c0000001fffd3870 TRAP: 0700 Not tainted MSR: 8000000000021003 <SF,ME,RI,LE> CR: 28000488 XER: 00000000 CFAR: c00000000001ec90 IRQMASK: 0 GPR00: c000000000aeb1ac c0000001fffd3b00 c0000000012ba300 0000000000000000 GPR04: 0000000000000000 0000000000000000 00000001347607c8 6b00696e74657272 GPR08: 0000000000000000 0000000000000000 0000000000000000 efbeadde00000000 GPR12: 0000000000000000 c0000000014a0000 0000000000000000 0000000000000000 GPR16: 0000000000000000 0000000000000000 0000000000000000 0000000000000000 GPR20: 0000000000000000 0000000000000000 0000000000000000 0000000000000000 GPR24: 0000000000000000 0000000000000000 0000000000000000 00000001347607bc GPR28: 0000000000000000 c00000000148a898 0000000000000000 c0000001ffff3f50 NIP [c00000000001ed2c] arch_local_irq_restore.part.0+0xac/0x100 LR [c000000000d13210] _raw_spin_unlock_irqrestore+0x50/0xc0 Call Trace: [c0000001fffd3b20] [c000000000aeb1ac] of_find_property+0x6c/0x90 [c0000001fffd3b70] [c000000000aeb1f0] of_get_property+0x20/0x40 [c0000001fffd3b90] [c000000000042cdc] rtas_token+0x3c/0x70 [c0000001fffd3bb0] [c0000000000dc318] fwnmi_release_errinfo+0x28/0x70 [c0000001fffd3c10] [c0000000000dcd8c] pseries_machine_check_realmode+0x1dc/0x540 [c0000001fffd3cd0] [c00000000003fe04] machine_check_early+0x54/0x70 [c0000001fffd3d00] [c000000000008384] machine_check_early_common+0x134/0x1f0 --- interrupt: 200 at 0x1347607c8 LR = 0x7fffafbd8328 Instruction dump: 60000000 7d2000a6 71298000 41820068 39200002 7d210164 4bffff9c 60000000 60000000 7d2000a6 71298000 4c820020 <0fe00000> 4e800020 60000000 60000000 Signed-off-by: Nicholas Piggin <npiggin@gmail.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/20200325104144.158362-1-npiggin@gmail.com	2020-04-01 13:42:09 +11:00
Michael Ellerman	c7def7fbde	powerpc/64/tm: Don't let userspace set regs->trap via sigreturn In restore_tm_sigcontexts() we take the trap value directly from the user sigcontext with no checking: err \|= __get_user(regs->trap, &sc->gp_regs[PT_TRAP]); This means we can be in the kernel with an arbitrary regs->trap value. Although that's not immediately problematic, there is a risk we could trigger one of the uses of CHECK_FULL_REGS(): #define CHECK_FULL_REGS(regs) BUG_ON(regs->trap & 1) It can also cause us to unnecessarily save non-volatile GPRs again in save_nvgprs(), which shouldn't be problematic but is still wrong. It's also possible it could trick the syscall restart machinery, which relies on regs->trap not being == 0xc00 (see `9a81c16b52` ("powerpc: fix double syscall restarts")), though I haven't been able to make that happen. Finally it doesn't match the behaviour of the non-TM case, in restore_sigcontext() which zeroes regs->trap. So change restore_tm_sigcontexts() to zero regs->trap. This was discovered while testing Nick's upcoming rewrite of the syscall entry path. In that series the call to save_nvgprs() prior to signal handling (do_notify_resume()) is removed, which leaves the low-bit of regs->trap uncleared which can then trigger the FULL_REGS() WARNs in setup_tm_sigcontexts(). Fixes: `2b0a576d15` ("powerpc: Add new transactional memory state to the signal context") Cc: stable@vger.kernel.org # v3.9+ Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/20200401023836.3286664-1-mpe@ellerman.id.au	2020-04-01 13:42:00 +11:00
Aneesh Kumar K.V	233ba54618	powerpc/64: Avoid isync in flush_dcache_range() As per ISA an isync is only needed on instruction cache block invalidate. Remove the same from dcache invalidate. Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.ibm.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/20200320103242.229223-1-aneesh.kumar@linux.ibm.com	2020-03-27 17:37:07 +11:00
Fangrui Song	968339fad4	powerpc/boot: Delete unneeded .globl _zimage_start .globl sets the symbol binding to STB_GLOBAL while .weak sets the binding to STB_WEAK. GNU as let .weak override .globl since binutils-gdb 5ca547dc2399a0a5d9f20626d4bf5547c3ccfddd (1996). Clang integrated assembler let the last win but it may error in the future. Since it is a convention that only one binding directive is used, just delete .globl. Fixes: `ee9d21b3b3` ("powerpc/boot: Ensure _zimage_start is a weak symbol") Signed-off-by: Fangrui Song <maskray@google.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/20200325164257.170229-1-maskray@google.com	2020-03-27 15:50:06 +11:00
Ganesh Goudar	efbc4303b2	powerpc/pseries: Handle UE event for memcpy_mcsafe memcpy_mcsafe has been implemented for power machines which is used by pmem infrastructure, so that an UE encountered during memcpy from pmem devices would not result in panic instead a right error code is returned. The implementation expects machine check handler to ignore the event and set nip to continue the execution from fixup code. Appropriate changes are already made to powernv machine check handler, make similar changes to pseries machine check handler to ignore the the event and set nip to continue execution at the fixup entry if we hit UE at an instruction with a fixup entry. while we are at it, have a common function which searches the exception table entry and updates nip with fixup address, and any future common changes can be made in this function that are valid for both architectures. powernv changes are made by commit `895e3dceeb` ("powerpc/mce: Handle UE event for memcpy_mcsafe") Reviewed-by: Mahesh Salgaonkar <mahesh@linux.vnet.ibm.com> Reviewed-by: Santosh S <santosh@fossix.org> Signed-off-by: Ganesh Goudar <ganeshgr@linux.ibm.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/20200326184916.31172-1-ganeshgr@linux.ibm.com	2020-03-27 14:59:35 +11:00
Michael Ellerman	c72e8da062	powerpc/smp: Use IS_ENABLED() to avoid #ifdef We can avoid the #ifdef by using IS_ENABLED() in the existing condition check. Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Reviewed-by: Srikar Dronamraju <srikar@linux.vnet.ibm.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/20200313112020.28235-2-mpe@ellerman.id.au	2020-03-27 01:15:09 +11:00
Michael Ellerman	4b4d181d63	powerpc/smp: Drop superfluous NULL check We don't need the NULL check of np, the result is the same because the OF helpers cope with NULL, of_node_to_nid(NULL) == NUMA_NO_NODE (-1). Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Reviewed-by: Srikar Dronamraju <srikar@linux.vnet.ibm.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/20200313112020.28235-1-mpe@ellerman.id.au	2020-03-27 01:15:09 +11:00
Douglas Miller	8ec26c25c3	powerpc/xmon: Add ASCII dump to d1,d2,d4,d8 commands. The reason debuggers add an ASCII dump to other types of memory dumps is to give the user visual reference points in the case that ASCII strings are adjacent to other structures or element. For example, when examining the task_struct structure one can look for the comm[] string and use it to locate other important elements. ASCII strings do not have endianess, they exist in memory in the same order regardless of CPU endianess. ASCII strings are, by definition, human readable and so should be presented in a human readable format. For these reasons, the supplemental ASCII dump does not re-order the strings from memory to match the endianess of the corresponding 16, 32, or 64 bit words. That would make the ASCII dump much less useful. Signed-off-by: Douglas Miller <dougmill@linux.vnet.ibm.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/1488205694-13337-1-git-send-email-dougmill@linux.vnet.ibm.com	2020-03-27 00:49:44 +11:00
Cédric Le Goater	930914b7d5	powerpc/xive: Add a debugfs file to dump internal XIVE state As does XMON, the debugfs file /sys/kernel/debug/powerpc/xive exposes the XIVE internal state of the machine CPUs and interrupts. Available on the PowerNV and sPAPR platforms. Signed-off-by: Cédric Le Goater <clg@kaod.org> [mpe: Make the debugfs file 0400] Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/20200306150143.5551-5-clg@kaod.org	2020-03-27 00:20:38 +11:00
Cédric Le Goater	5191e0ba51	powerpc/xmon: Add source flags to output of XIVE interrupts Some firmwares or hypervisors can advertise different source characteristics. Track their value under XMON. What we are mostly interested in is the StoreEOI flag. Signed-off-by: Cédric Le Goater <clg@kaod.org> Reviewed-by: Greg Kurz <groug@kaod.org> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/20200306150143.5551-4-clg@kaod.org	2020-03-27 00:19:05 +11:00
Cédric Le Goater	97ef275077	powerpc/xive: Fix xmon support on the PowerNV platform The PowerNV platform has multiple IRQ chips and the xmon command dumping the state of the XIVE interrupt should only operate on the XIVE IRQ chip. Fixes: `5896163f7f` ("powerpc/xmon: Improve output of XIVE interrupts") Cc: stable@vger.kernel.org # v5.4+ Signed-off-by: Cédric Le Goater <clg@kaod.org> Reviewed-by: Greg Kurz <groug@kaod.org> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/20200306150143.5551-3-clg@kaod.org	2020-03-27 00:19:05 +11:00
Cédric Le Goater	b1a504a650	powerpc/xive: Use XIVE_BAD_IRQ instead of zero to catch non configured IPIs When a CPU is brought up, an IPI number is allocated and recorded under the XIVE CPU structure. Invalid IPI numbers are tracked with interrupt number 0x0. On the PowerNV platform, the interrupt number space starts at 0x10 and this works fine. However, on the sPAPR platform, it is possible to allocate the interrupt number 0x0 and this raises an issue when CPU 0 is unplugged. The XIVE spapr driver tracks allocated interrupt numbers in a bitmask and it is not correctly updated when interrupt number 0x0 is freed. It stays allocated and it is then impossible to reallocate. Fix by using the XIVE_BAD_IRQ value instead of zero on both platforms. Reported-by: David Gibson <david@gibson.dropbear.id.au> Fixes: `eac1e731b5` ("powerpc/xive: guest exploitation of the XIVE interrupt controller") Cc: stable@vger.kernel.org # v4.14+ Signed-off-by: Cédric Le Goater <clg@kaod.org> Reviewed-by: David Gibson <david@gibson.dropbear.id.au> Tested-by: David Gibson <david@gibson.dropbear.id.au> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/20200306150143.5551-2-clg@kaod.org	2020-03-27 00:19:04 +11:00
Nick Desaulniers	a7032637b5	powerpc: Prefer __section and __printf from compiler_attributes.h Reported-by: Sedat Dilek <sedat.dilek@gmail.com> Suggested-by: Josh Poimboeuf <jpoimboe@redhat.com> Signed-off-by: Nick Desaulniers <ndesaulniers@google.com> [mpe: Drop changes to a/p/boot which doesn't use linux headers] Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/20190812215052.71840-10-ndesaulniers@google.com	2020-03-27 00:16:32 +11:00
Fabiano Rosas	7074695ac6	powerpc/prom_init: Remove leftover comment The if statement that this comment referred to was removed in commit `11fdb30934` ("powerpc/prom_init: Remove support for OPAL v2"). Signed-off-by: Fabiano Rosas <farosas@linux.ibm.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/20200324182912.1048906-1-farosas@linux.ibm.com	2020-03-25 21:15:01 +11:00
Christophe Leroy	21f8b2fa3c	powerpc/kprobes: Ignore traps that happened in real mode When a program check exception happens while MMU translation is disabled, following Oops happens in kprobe_handler() in the following code: } else if (*addr != BREAKPOINT_INSTRUCTION) { BUG: Unable to handle kernel data access on read at 0x0000e268 Faulting instruction address: 0xc000ec34 Oops: Kernel access of bad area, sig: 11 [#1] BE PAGE_SIZE=16K PREEMPT CMPC885 Modules linked in: CPU: 0 PID: 429 Comm: cat Not tainted 5.6.0-rc1-s3k-dev-00824-g84195dc6c58a #3267 NIP: c000ec34 LR: c000ecd8 CTR: c019cab8 REGS: ca4d3b58 TRAP: 0300 Not tainted (5.6.0-rc1-s3k-dev-00824-g84195dc6c58a) MSR: 00001032 <ME,IR,DR,RI> CR: 2a4d3c52 XER: 00000000 DAR: 0000e268 DSISR: c0000000 GPR00: c000b09c ca4d3c10 c66d0620 00000000 ca4d3c60 00000000 00009032 00000000 GPR08: 00020000 00000000 c087de44 c000afe0 c66d0ad0 100d3dd6 fffffff3 00000000 GPR16: 00000000 00000041 00000000 ca4d3d70 00000000 00000000 0000416d 00000000 GPR24: 00000004 c53b6128 00000000 0000e268 00000000 c07c0000 c07bb6fc ca4d3c60 NIP [c000ec34] kprobe_handler+0x128/0x290 LR [c000ecd8] kprobe_handler+0x1cc/0x290 Call Trace: [ca4d3c30] [c000b09c] program_check_exception+0xbc/0x6fc [ca4d3c50] [c000e43c] ret_from_except_full+0x0/0x4 --- interrupt: 700 at 0xe268 Instruction dump: 913e0008 81220000 38600001 3929ffff 91220000 80010024 bb410008 7c0803a6 38210020 4e800020 38600000 4e800020 <813b0000> 6d2a7fe0 2f8a0008 419e0154 ---[ end trace 5b9152d4cdadd06d ]--- kprobe is not prepared to handle events in real mode and functions running in real mode should have been blacklisted, so kprobe_handler() can safely bail out telling 'this trap is not mine' for any trap that happened while in real-mode. If the trap happened with MSR_IR or MSR_DR cleared, return 0 immediately. Reported-by: Larry Finger <Larry.Finger@lwfinger.net> Fixes: `6cc89bad60` ("powerpc/kprobes: Invoke handlers directly") Cc: stable@vger.kernel.org # v4.10+ Signed-off-by: Christophe Leroy <christophe.leroy@c-s.fr> Reviewed-by: Masami Hiramatsu <mhiramat@kernel.org> Reviewed-by: Naveen N. Rao <naveen.n.rao@linux.vnet.ibm.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/424331e2006e7291a1bfe40e7f3fa58825f565e1.1582054578.git.christophe.leroy@c-s.fr	2020-03-25 12:09:51 +11:00
Nathan Chancellor	af6cf95c4d	powerpc/maple: Fix declaration made after definition When building ppc64 defconfig, Clang errors (trimmed for brevity): arch/powerpc/platforms/maple/setup.c:365:1: error: attribute declaration must precede definition [-Werror,-Wignored-attributes] machine_device_initcall(maple, maple_cpc925_edac_setup); ^ machine_device_initcall expands to __define_machine_initcall, which in turn has the macro machine_is used in it, which declares mach_##name with an __attribute__((weak)). define_machine actually defines mach_##name, which in this file happens before the declaration, hence the warning. To fix this, move define_machine after machine_device_initcall so that the declaration occurs before the definition, which matches how machine_device_initcall and define_machine work throughout arch/powerpc. While we're here, remove some spaces before tabs. Fixes: `8f101a051e` ("edac: cpc925 MC platform device setup") Reported-by: Nick Desaulniers <ndesaulniers@google.com> Suggested-by: Ilie Halip <ilie.halip@gmail.com> Signed-off-by: Nathan Chancellor <natechancellor@gmail.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/20200323222729.15365-1-natechancellor@gmail.com	2020-03-25 12:09:48 +11:00
Nicholas Piggin	adde8715cf	powerpc/pseries: Avoid harmless preempt warning Signed-off-by: Nicholas Piggin <npiggin@gmail.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/20200320152436.1468651-1-npiggin@gmail.com	2020-03-25 12:09:39 +11:00
Oliver O'Halloran	e86350f70a	powerpc/eeh: Rework eeh_ops->probe() With the EEH early probe now being pseries specific there's no need for eeh_ops->probe() to take a pci_dn. Instead, we can make it take a pci_dev and use the probe function to map a pci_dev to an eeh_dev. This allows the platform to implement it's own method for finding (or creating) an eeh_dev for a given pci_dev which also removes a use of pci_dn in generic EEH code. This patch also renames eeh_device_add_late() to eeh_device_probe(). This better reflects what it does does and removes the last vestiges of the early/late EEH probe split. Reviewed-by: Sam Bobroff <sbobroff@linux.ibm.com> Signed-off-by: Oliver O'Halloran <oohall@gmail.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/20200306073904.4737-6-oohall@gmail.com	2020-03-25 12:09:39 +11:00
Oliver O'Halloran	b6eebb093c	powerpc/eeh: Make early EEH init pseries specific The eeh_ops->probe() function is called from two different contexts: 1. On pseries, where we set EEH_PROBE_MODE_DEVTREE, it's called in eeh_add_device_early() which is supposed to run before we create a pci_dev. 2. On PowerNV, where we set EEH_PROBE_MODE_DEV, it's called in eeh_device_add_late() which is supposed to run after the pci_dev is created. The "early" probe is required because PAPR requires that we perform an RTAS call to enable EEH support on a device before we start interacting with it via config space or MMIO. This requirement doesn't exist on PowerNV and shoehorning two completely separate initialisation paths into a common interface just results in a convoluted code everywhere. Additionally the early probe requires the probe function to take an pci_dn rather than a pci_dev argument. We'd like to make pci_dn a pseries specific data structure since there's no real requirement for them on PowerNV. To help both goals move the early probe into the pseries containment zone so the platform depedence is more explicit. Reviewed-by: Sam Bobroff <sbobroff@linux.ibm.com> Signed-off-by: Oliver O'Halloran <oohall@gmail.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/20200306073904.4737-5-oohall@gmail.com	2020-03-25 12:09:39 +11:00
Oliver O'Halloran	3ff32efb62	powerpc/eeh: Remove PHB check in probe This check for a missing PHB has existing in various forms since the initial PPC64 port was upstreamed in 2002. The idea seems to be that we need to guard against creating pci-specific data structures for the non-pci children of a PCI device tree node (e.g. USB devices). However, we only create pci_dn structures for DT nodes that correspond to PCI devices so there's not much point in doing this check in the eeh_probe path. Reviewed-by: Sam Bobroff <sbobroff@linux.ibm.com> Signed-off-by: Oliver O'Halloran <oohall@gmail.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/20200306073904.4737-4-oohall@gmail.com	2020-03-25 12:09:39 +11:00
Oliver O'Halloran	a4b4f61db8	powerpc/eeh: Do early EEH init only when required The pci hotplug helper (pci_hp_add_devices()) calls eeh_add_device_tree_early() to scan the device-tree for new PCI devices and do the early EEH probe before the device is scanned. This early probe is a no-op in a lot of cases because: a) The early init is only required to satisfy a PAPR requirement that EEH be configured before we start doing config accesses. On PowerNV it is a no-op. b) It's a no-op for devices that have already had their eeh_dev initialised. There are four callers of pci_hp_add_devices(): 1. arch/powerpc/kernel/eeh_driver.c Here the hotplug helper is called when re-scanning pci_devs that were removed during an EEH recovery pass. The EEH stat for each removed device (the eeh_dev) is retained across a recovery pass so the early init is a no-op in this case. 2. drivers/pci/hotplug/pnv_php.c This is also a no-op since the PowerNV hotplug driver is, suprisingly, PowerNV specific. 3. drivers/pci/hotplug/rpaphp_core.c 4. drivers/pci/hotplug/rpaphp_pci.c In these two cases new devices have been hotplugged and FW has provided new DT nodes for each. These are the only two cases where the EEH we might have new PCI device nodes in the DT so these are the only two cases where the early EEH probe needs to be done. We can move the calls to eeh_add_device_tree_early() to the locations where it's needed and remove it from the generic path. This is preparation for making the early EEH probe pseries specific. Reviewed-by: Sam Bobroff <sbobroff@linux.ibm.com> Signed-off-by: Oliver O'Halloran <oohall@gmail.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/20200306073904.4737-3-oohall@gmail.com	2020-03-25 12:09:38 +11:00
Oliver O'Halloran	2d0953f7d5	powerpc/eeh: Remove eeh_add_device_tree_late() On pseries and PowerNV pcibios_bus_add_device() calls eeh_add_device_late() so there's no need to do a separate tree traversal to bind the eeh_dev and pci_dev together setting up the PHB at boot. As a result we can remove eeh_add_device_tree_late(). Reviewed-by: Sam Bobroff <sbobroff@linux.ibm.com> Signed-off-by: Oliver O'Halloran <oohall@gmail.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/20200306073904.4737-2-oohall@gmail.com	2020-03-25 12:09:38 +11:00
Oliver O'Halloran	8645aaa879	powerpc/eeh: Add sysfs files in late probe Move creating the EEH specific sysfs files into eeh_add_device_late() rather than being open-coded all over the place. Calling the function is generally done immediately after calling eeh_add_device_late() anyway. This is also a correctness fix since currently the sysfs files will be added even if the EEH probe happens to fail. Similarly, on pseries we currently add the sysfs files before calling eeh_add_device_late(). This is flat-out broken since the sysfs files require the pci_dev->dev.archdata.edev pointer to be set, and that is done in eeh_add_device_late(). Reviewed-by: Sam Bobroff <sbobroff@linux.ibm.com> Signed-off-by: Oliver O'Halloran <oohall@gmail.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/20200306073904.4737-1-oohall@gmail.com	2020-03-25 12:09:38 +11:00
Michael Ellerman	7053f80d96	powerpc/64: Prevent stack protection in early boot The previous commit reduced the amount of code that is run before we setup a paca. However there are still a few remaining functions that run with no paca, or worse, with an arbitrary value in r13 that will be used as a paca pointer. In particular the stack protector canary is stored in the paca, so if stack protector is activated for any of these functions we will read the stack canary from wherever r13 points. If r13 happens to point outside of memory we will get a machine check / checkstop. For example if we modify initialise_paca() to trigger stack protection, and then boot in the mambo simulator with r13 poisoned in skiboot before calling the kernel: DEBUG: 19952232: (19952232): INSTRUCTION: PC=0xC0000000191FC1E8: [0x3C4C006D]: addis r2,r12,0x6D [fetch] DEBUG: 19952236: (19952236): INSTRUCTION: PC=0xC00000001807EAD8: [0x7D8802A6]: mflr r12 [fetch] FATAL ERROR: 19952276: (19952276): Check Stop for 0:0: Machine Check with ME bit of MSR off DEBUG: 19952276: (19952276): INSTRUCTION: PC=0xC0000000191FCA7C: [0xE90D0CF8]: ld r8,0xCF8(r13) [Instruction Failed] INFO: 19952276: (19952277): Execution stopped: Mambo Error, Machine Check Stop, systemsim % bt pc: 0xC0000000191FCA7C initialise_paca+0x54 lr: 0xC0000000191FC22C early_setup+0x44 stack:0x00000000198CBED0 0x0 +0x0 stack:0x00000000198CBF00 0xC0000000191FC22C early_setup+0x44 stack:0x00000000198CBF90 0x1801C968 +0x1801C968 So annotate the relevant functions to ensure stack protection is never enabled for them. Fixes: `06ec27aea9` ("powerpc/64: add stack protector support") Cc: stable@vger.kernel.org # v4.20+ Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/20200320032116.1024773-2-mpe@ellerman.id.au	2020-03-25 12:09:38 +11:00
Daniel Axtens	d4a8e98621	powerpc/64: Setup a paca before parsing device tree etc. Currently we set up the paca after parsing the device tree for CPU features. Prior to that, r13 contains random data, which means there is random data in r13 while we're running the generic dt parsing code. This random data varies depending on whether we boot through a vmlinux or a zImage: for the vmlinux case it's usually around zero, but for zImages we see random values like 912a72603d420015. This is poor practice, and can also lead to difficult-to-debug crashes. For example, when kcov is enabled, the kcov instrumentation attempts to read preempt_count out of the current task, which goes via the paca. This then crashes in the zImage case. Similarly stack protector can cause crashes if r13 is bogus, by reading from the stack canary in the paca. To resolve this: - move the paca setup to before the CPU feature parsing. - because we no longer have access to CPU feature flags in paca setup, change the HV feature test in the paca setup path to consider the actual value of the MSR rather than the CPU feature. Translations get switched on once we leave early_setup, so I think we'd already catch any other cases where the paca or task aren't set up. Boot tested on a P9 guest and host. Fixes: `fb0b0a73b2` ("powerpc: Enable kcov") Fixes: `06ec27aea9` ("powerpc/64: add stack protector support") Cc: stable@vger.kernel.org # v4.20+ Reviewed-by: Andrew Donnellan <ajd@linux.ibm.com> Suggested-by: Michael Ellerman <mpe@ellerman.id.au> Signed-off-by: Daniel Axtens <dja@axtens.net> [mpe: Reword comments & change log a bit to mention stack protector] Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/20200320032116.1024773-1-mpe@ellerman.id.au	2020-03-25 12:09:37 +11:00
Aneesh Kumar K.V	36b78402d9	powerpc/hash64/devmap: Use H_PAGE_THP_HUGE when setting up huge devmap PTE entries H_PAGE_THP_HUGE is used to differentiate between a THP hugepage and hugetlb hugepage entries. The difference is WRT how we handle hash fault on these address. THP address enables MPSS in segments. We want to manage devmap hugepage entries similar to THP pt entries. Hence use H_PAGE_THP_HUGE for devmap huge PTE entries. With current code while handling hash PTE fault, we do set is_thp = true when finding devmap PTE huge PTE entries. Current code also does the below sequence we setting up huge devmap entries. entry = pmd_mkhuge(pfn_t_pmd(pfn, prot)); if (pfn_t_devmap(pfn)) entry = pmd_mkdevmap(entry); In that case we would find both H_PAGE_THP_HUGE and PAGE_DEVMAP set for huge devmap PTE entries. This results in false positive error like below. kernel BUG at /home/kvaneesh/src/linux/mm/memory.c:4321! Oops: Exception in kernel mode, sig: 5 [#1] LE PAGE_SIZE=64K MMU=Hash SMP NR_CPUS=2048 NUMA pSeries Modules linked in: CPU: 56 PID: 67996 Comm: t_mmap_dio Not tainted 5.6.0-rc4-59640-g371c804dedbc #128 .... NIP [c00000000044c9e4] __follow_pte_pmd+0x264/0x900 LR [c0000000005d45f8] dax_writeback_one+0x1a8/0x740 Call Trace: str_spec.74809+0x22ffb4/0x2d116c (unreliable) dax_writeback_one+0x1a8/0x740 dax_writeback_mapping_range+0x26c/0x700 ext4_dax_writepages+0x150/0x5a0 do_writepages+0x68/0x180 __filemap_fdatawrite_range+0x138/0x180 file_write_and_wait_range+0xa4/0x110 ext4_sync_file+0x370/0x6e0 vfs_fsync_range+0x70/0xf0 sys_msync+0x220/0x2e0 system_call+0x5c/0x68 This is because our pmd_trans_huge check doesn't exclude _PAGE_DEVMAP. To make this all consistent, update pmd_mkdevmap to set H_PAGE_THP_HUGE and pmd_trans_huge check now excludes _PAGE_DEVMAP correctly. Fixes: `ebd3119793` ("powerpc/mm: Add devmap support for ppc64") Cc: stable@vger.kernel.org # v4.13+ Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.ibm.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/20200313094842.351830-1-aneesh.kumar@linux.ibm.com	2020-03-25 12:09:30 +11:00
Christophe Leroy	697ece78f8	powerpc/32s: reorder Linux PTE bits to better match Hash PTE bits. Reorder Linux PTE bits to (almost) match Hash PTE bits. RW Kernel : PP = 00 RO Kernel : PP = 00 RW User : PP = 01 RO User : PP = 11 So naturally, we should have _PAGE_USER = 0x001 _PAGE_RW = 0x002 Today 0x001 and 0x002 and _PAGE_PRESENT and _PAGE_HASHPTE which both are software only bits. Switch _PAGE_USER and _PAGE_PRESET Switch _PAGE_RW and _PAGE_HASHPTE This allows to remove a few insns. Signed-off-by: Christophe Leroy <christophe.leroy@c-s.fr> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/c4d6c18a7f8d9d3b899bc492f55fbc40ef38896a.1583861325.git.christophe.leroy@c-s.fr	2020-03-25 12:09:27 +11:00
Christophe Leroy	af92bad615	powerpc/kasan: Fix kasan_remap_early_shadow_ro() At the moment kasan_remap_early_shadow_ro() does nothing, because k_end is 0 and k_cur < 0 is always true. Change the test to k_cur != k_end, as done in kasan_init_shadow_page_tables() Signed-off-by: Christophe Leroy <christophe.leroy@c-s.fr> Fixes: `cbd18991e2` ("powerpc/mm: Fix an Oops in kasan_mmu_init()") Cc: stable@vger.kernel.org Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/4e7b56865e01569058914c991143f5961b5d4719.1583507333.git.christophe.leroy@c-s.fr	2020-03-25 12:09:27 +11:00
Christophe Leroy	eb4f8e259a	powerpc/kprobes: Remove redundant code At the time being we have something like if (something) { p = get(); if (p) { if (something_wrong) goto out; ... return; } else if (a != b) { if (some_error) goto out; ... } goto out; } p = get(); if (!p) { if (a != b) { if (some_error) goto out; ... } goto out; } This is similar to p = get(); if (!p) { if (a != b) { if (some_error) goto out; ... } goto out; } if (something) { if (something_wrong) goto out; ... return; } Signed-off-by: Christophe Leroy <christophe.leroy@c-s.fr> [mpe: Reflow the comment that was moved] Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/07a17425743600460ce35fa9432d42487a825583.1582099499.git.christophe.leroy@c-s.fr	2020-03-25 12:09:08 +11:00
Michael Ellerman	6eeb9b3b9c	powerpc/64s: Fix section mismatch warnings from boot code We currently have two section mismatch warnings: The function __boot_from_prom() references the function __init prom_init(). The function start_here_common() references the function __init start_kernel(). The warnings are correct, we do have branches from non-init code into init code, which is freed after boot. But we don't expect to ever execute any of that early boot code after boot, if we did that would be a bug. In particular calling into OF after boot would be fatal because OF is no longer resident. So for now fix the warnings by marking the relevant functions as __REF, which puts them in the ".ref.text" section. This causes some reordering of the functions in the final link: @@ -217,10 +217,9 @@ c00000000000b088 t generic_secondary_common_init c00000000000b124 t __mmu_off c00000000000b14c t __start_initialization_multiplatform -c00000000000b1ac t __boot_from_prom -c00000000000b1ec t __after_prom_start -c00000000000b260 t p_end -c00000000000b27c T copy_and_flush +c00000000000b1ac t __after_prom_start +c00000000000b220 t p_end +c00000000000b23c T copy_and_flush c00000000000b300 T __secondary_start c00000000000b300 t copy_to_here c00000000000b344 t start_secondary_prolog @@ -228,8 +227,9 @@ c00000000000b36c t enable_64b_mode c00000000000b388 T relative_toc c00000000000b3a8 t p_toc -c00000000000b3b0 t start_here_common -c00000000000b3d0 t start_here_multiplatform +c00000000000b3b0 t __boot_from_prom +c00000000000b3f0 t start_here_multiplatform +c00000000000b480 t start_here_common c00000000000b880 T system_call_common c00000000000b974 t system_call c00000000000b9dc t system_call_exit In particular __boot_from_prom moves after copy_to_here, which means it's not copied to zero in the first stage of copy of the kernel to zero. But that's OK, because we only call __boot_from_prom before we do the copy, so it makes no difference when it's copied. The call sequence is: __start -> __start_initialization_multiplatform -> __boot_from_prom -> __start -> __start_initialization_multiplatform -> __after_prom_start -> copy_and_flush -> copy_and_flush (relocated to 0) -> start_here_multiplatform -> early_setup Reported-by: Mauricio Faria de Oliveira <mauricfo@linux.ibm.com> Reported-by: Roman Bolshakov <r.bolshakov@yadro.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/20200225031328.14676-1-mpe@ellerman.id.au	2020-03-25 12:07:59 +11:00
Michael Ellerman	d64c7dbb4d	powerpc/xmon: Lower limits on nidump and ndump In xmon we have two variables that are used by the dump commands. There's ndump which is the number of bytes to dump using 'd', and nidump which is the number of instructions to dump using 'di'. ndump starts as 64 and nidump starts as 16, but both can be set by the user. It's fairly common to be pasting addresses into xmon when trying to debug something, and if you inadvertently double paste an address like so: 0:mon> di c000000002101f6c c000000002101f6c The second value is interpreted as the number of instructions to dump. Luckily it doesn't dump 13 quintrillion instructions, the value is limited to MAX_DUMP (128K). But as each instruction is dumped on a single line, that's still a lot of output. If you're on a slow console that can take multiple minutes to print. If you were "just popping in and out of xmon quickly before the RCU/hardlockup detector fires" you are now having a bad day. Things are not as bad with 'd' because we print 16 bytes per line, so it's fewer lines. But it's still quite a lot. So shrink the maximum for 'd' to 64K (one page), which is 4096 lines. For 'di' add a new limit which is the above / 4 - because instructions are 4 bytes, meaning again we can dump one page. Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/20200219110007.31195-1-mpe@ellerman.id.au	2020-03-25 12:07:58 +11:00
Alexey Kardashevskiy	74bb84e511	powerpc/prom_init: Pass the "os-term" message to hypervisor The "os-term" RTAS calls has one argument with a message address of OS termination cause. rtas_os_term() already passes it but the recently added prom_init's version of that missed it; it also does not fill args correctly. This passes the message address and initializes the number of arguments. Fixes: `6a9c930bd7` ("powerpc/prom_init: Add the ESM call to prom_init") Signed-off-by: Alexey Kardashevskiy <aik@ozlabs.ru> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/20200312074404.87293-1-aik@ozlabs.ru	2020-03-25 12:07:58 +11:00
afzal mohammed	b4f00d5b20	powerpc: Replace setup_irq() by request_irq() request_irq() is preferred over setup_irq(). Invocations of setup_irq() occur after memory allocators are ready. Per tglx[1], setup_irq() existed in olden days when allocators were not ready by the time early interrupts were initialized. Hence replace setup_irq() by request_irq(). [1] https://lkml.kernel.org/r/alpine.DEB.2.20.1710191609480.1971@nanos Signed-off-by: afzal mohammed <afzal.mohd.ma@gmail.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/20200312064256.18735-1-afzal.mohd.ma@gmail.com	2020-03-25 12:07:57 +11:00
Joe Perches	addf3727ad	powerpc/cell: Use fallthrough; Convert the various uses of fallthrough comments to fallthrough; Signed-off-by: Joe Perches <joe@perches.com> Acked-by: Arnd Bergmann <arnd@arndb.de> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/03073a9a269010ca439e9e658629c44602b0cc9f.1583896348.git.joe@perches.com	2020-03-25 12:07:57 +11:00
Balamuruhan S	3e74a0e163	powerpc/sstep: Fix DS operand in ld encoding to appropriate value ld instruction should have 14 bit immediate field (DS) concatenated with 0b00 on the right, encode it accordingly. Introduce macro `IMM_DS()` to encode DS form instructions with 14 bit immediate field. Fixes: `4ceae137bd` ("powerpc: emulate_step() tests for load/store instructions") Reviewed-by: Sandipan Das <sandipan@linux.ibm.com> Signed-off-by: Balamuruhan S <bala24@linux.ibm.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/20200311102405.392263-1-bala24@linux.ibm.com	2020-03-25 12:06:46 +11:00
Tyrel Datwyler	c5e76fa05b	powerpc/pseries: Fix of_read_drc_info_cell() to point at next record The expectation is that when calling of_read_drc_info_cell() repeatedly to parse multiple drc-info records that the in/out curval parameter points at the start of the next record on return. However, the current behavior has curval still pointing at the final value of the record just parsed. The result of which is that if the ibm,drc-info property contains multiple properties the parsed value of the drc_type for any record after the first has the power_domain value of the previous record appended to the type string. eg: observed the following 0xffffffff prepended to PHB drc-info: type: \xff\xff\xff\xffPHB, prefix: PHB , index_start: 0x20000001 drc-info: suffix_start: 1, sequential_elems: 3072, sequential_inc: 1 drc-info: power-domain: 0xffffffff, last_index: 0x20000c00 In practice PHBs are the only type of connector in the ibm,drc-info property that has multiple records. So, it breaks PHB hotplug, but by chance not PCI, CPU, slot, or memory because they happen to only ever be a single record. Fix by incrementing curval past the power_domain value to point at drc_type string of next record. Fixes: `e83636ac33` ("pseries/drc-info: Search DRC properties for CPU indexes") Signed-off-by: Tyrel Datwyler <tyreld@linux.ibm.com> Acked-by: Nathan Lynch <nathanl@linux.ibm.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/20200307024547.5748-1-tyreld@linux.ibm.com	2020-03-25 12:06:43 +11:00
Michael Ellerman	61da50b76b	powerpc/kuap: PPC_KUAP_DEBUG should depend on PPC_KUAP Currently you can enable PPC_KUAP_DEBUG when PPC_KUAP is disabled, even though the former has not effect without the latter. Fix it so that PPC_KUAP_DEBUG can only be enabled when PPC_KUAP is enabled, not when the platform could support KUAP (PPC_HAVE_KUAP). Fixes: `890274c2dc` ("powerpc/64s: Implement KUAP for Radix MMU") Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/20200301111738.22497-1-mpe@ellerman.id.au	2020-03-20 13:10:22 +11:00
Nicholas Piggin	59ed2adf39	powerpc/lib: Fix emulate_step() std test We should be checking that the instruction was stepped and that the target register has the right value. Signed-off-by: Nicholas Piggin <npiggin@gmail.com> Reviewed-by: Ravi Bangoria <ravi.bangoria@linux.ibm.com> [mpe: Write change log] Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/20200226055302.1577954-1-npiggin@gmail.com	2020-03-18 00:05:54 +11:00
Nicholas Piggin	993cfecc59	powerpc/64s/radix: Fix CONFIG_SMP=n build Signed-off-by: Nicholas Piggin <npiggin@gmail.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/20200302010410.2957362-1-npiggin@gmail.com	2020-03-18 00:05:54 +11:00
YueHaibing	a4037d1f1f	powerpc/pmac/smp: Drop unnecessary volatile qualifier core99_l2_cache/core99_l3_cache do not need to be marked as volatile, remove it. Signed-off-by: YueHaibing <yuehaibing@huawei.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/20200303085604.24952-1-yuehaibing@huawei.com	2020-03-17 23:40:36 +11:00
Ilie Halip	9451c79bc3	powerpc/pmac/smp: Avoid unused-variable warnings When building with ppc64_defconfig, the compiler reports that these 2 variables are not used: warning: unused variable 'core99_l2_cache' [-Wunused-variable] warning: unused variable 'core99_l3_cache' [-Wunused-variable] They are only used when CONFIG_PPC64 is not defined. Move them into a section which does the same macro check. Reported-by: Nathan Chancellor <natechancellor@gmail.com> Signed-off-by: Ilie Halip <ilie.halip@gmail.com> [mpe: Move them into core99_init_caches() which is their only user] Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/20190920153951.25762-1-ilie.halip@gmail.com	2020-03-17 23:40:36 +11:00
Laurentiu Tudor	aa4113340a	powerpc/fsl_booke: Avoid creating duplicate tlb1 entry In the current implementation, the call to loadcam_multi() is wrapped between switch_to_as1() and restore_to_as0() calls so, when it tries to create its own temporary AS=1 TLB1 entry, it ends up duplicating the existing one created by switch_to_as1(). Add a check to skip creating the temporary entry if already running in AS=1. Fixes: `d9e1831a42` ("powerpc/85xx: Load all early TLB entries at once") Cc: stable@vger.kernel.org # v4.4+ Signed-off-by: Laurentiu Tudor <laurentiu.tudor@nxp.com> Acked-by: Scott Wood <oss@buserror.net> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/20200123111914.2565-1-laurentiu.tudor@nxp.com	2020-03-17 23:40:35 +11:00
Joe Lawrence	ffd3eaf178	powerpc/vdso: remove deprecated VDS64_HAS_DESCRIPTORS references The original 2005 patch that introduced the powerpc vdso, pre-git ("ppc64: Implement a vDSO and use it for signal trampoline") notes that: ... symbols exposed by the vDSO aren't "normal" function symbols, apps can't be expected to link against them directly, the vDSO's are both seen as if they were linked at 0 and the symbols just contain offsets to the various functions. This is done on purpose to avoid a relocation step (ppc64 functions normally have descriptors with abs addresses in them). When glibc uses those functions, it's expected to use it's own trampolines that know how to reach them. Despite that explanation, there remains dead #ifdef VDS64_HAS_DESCRIPTORS code-blocks that provide alternate function definitions that setup function descriptors. Since VDS64_HAS_DESCRIPTORS has been unused for all these years, we might as well finally remove it from the codebase. Signed-off-by: Joe Lawrence <joe.lawrence@redhat.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/20200224211848.26087-1-joe.lawrence@redhat.com	2020-03-13 21:13:06 +11:00
Christophe Leroy	cc6f0e3900	powerpc/32: Fix missing NULL pmd check in virt_to_kpte() Commit `2efc7c085f` ("powerpc/32: drop get_pteptr()"), replaced get_pteptr() by virt_to_kpte(). But virt_to_kpte() lacks a NULL pmd check and returns an invalid non NULL pointer when there is no page table. Reported-by: Nick Desaulniers <ndesaulniers@google.com> Fixes: `2efc7c085f` ("powerpc/32: drop get_pteptr()") Signed-off-by: Christophe Leroy <christophe.leroy@c-s.fr> Tested-by: Nathan Chancellor <natechancellor@gmail.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/b1177cdfc6af74a3e277bba5d9e708c4b3315ebe.1583575707.git.christophe.leroy@c-s.fr	2020-03-13 21:13:05 +11:00
Michael Ellerman	819723a8a2	Merge branch 'fixes' into next Merge in our fixes branch. In particular we want to merge the TM and KUAP fixes, so we can add selftests for them in next.	2020-03-10 15:16:42 +11:00
Michael Ellerman	59bee45b97	powerpc/mm: Fix missing KUAP disable in flush_coherent_icache() Stefan reported a strange kernel fault which turned out to be due to a missing KUAP disable in flush_coherent_icache() called from flush_icache_range(). The fault looks like: Kernel attempted to access user page (7fffc30d9c00) - exploit attempt? (uid: 1009) BUG: Unable to handle kernel data access on read at 0x7fffc30d9c00 Faulting instruction address: 0xc00000000007232c Oops: Kernel access of bad area, sig: 11 [#1] LE PAGE_SIZE=64K MMU=Radix SMP NR_CPUS=2048 NUMA PowerNV CPU: 35 PID: 5886 Comm: sigtramp Not tainted 5.6.0-rc2-gcc-8.2.0-00003-gfc37a1632d40 #79 NIP: c00000000007232c LR: c00000000003b7fc CTR: 0000000000000000 REGS: c000001e11093940 TRAP: 0300 Not tainted (5.6.0-rc2-gcc-8.2.0-00003-gfc37a1632d40) MSR: 900000000280b033 <SF,HV,VEC,VSX,EE,FP,ME,IR,DR,RI,LE> CR: 28000884 XER: 00000000 CFAR: c0000000000722fc DAR: 00007fffc30d9c00 DSISR: 08000000 IRQMASK: 0 GPR00: c00000000003b7fc c000001e11093bd0 c0000000023ac200 00007fffc30d9c00 GPR04: 00007fffc30d9c18 0000000000000000 c000001e11093bd4 0000000000000000 GPR08: 0000000000000000 0000000000000001 0000000000000000 c000001e1104ed80 GPR12: 0000000000000000 c000001fff6ab380 c0000000016be2d0 4000000000000000 GPR16: c000000000000000 bfffffffffffffff 0000000000000000 0000000000000000 GPR20: 00007fffc30d9c00 00007fffc30d8f58 00007fffc30d9c18 00007fffc30d9c20 GPR24: 00007fffc30d9c18 0000000000000000 c000001e11093d90 c000001e1104ed80 GPR28: c000001e11093e90 0000000000000000 c0000000023d9d18 00007fffc30d9c00 NIP flush_icache_range+0x5c/0x80 LR handle_rt_signal64+0x95c/0xc2c Call Trace: 0xc000001e11093d90 (unreliable) handle_rt_signal64+0x93c/0xc2c do_notify_resume+0x310/0x430 ret_from_except_lite+0x70/0x74 Instruction dump: 409e002c 7c0802a6 3c62ff31 3863f6a0 f8010080 48195fed 60000000 48fe4c8d 60000000 e8010080 7c0803a6 7c0004ac <7c00ffac> 7c0004ac 4c00012c 38210070 This path through handle_rt_signal64() to setup_trampoline() and flush_icache_range() is only triggered by 64-bit processes that have unmapped their VDSO, which is rare. flush_icache_range() takes a range of addresses to flush. In flush_coherent_icache() we implement an optimisation for CPUs where we know we don't actually have to flush the whole range, we just need to do a single icbi. However we still execute the icbi on the user address of the start of the range we're flushing. On CPUs that also implement KUAP (Power9) that leads to the spurious fault above. We should be able to pass any address, including a kernel address, to the icbi on these CPUs, which would avoid any interaction with KUAP. But I don't want to make that change in a bug fix, just in case it surfaces some strange behaviour on some CPU. So for now just disable KUAP around the icbi. Note the icbi is treated as a load, so we allow read access, not write as you'd expect. Fixes: `890274c2dc` ("powerpc/64s: Implement KUAP for Radix MMU") Cc: stable@vger.kernel.org # v5.2+ Reported-by: Stefan Berger <stefanb@linux.ibm.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/20200303235708.26004-1-mpe@ellerman.id.au	2020-03-05 17:15:08 +11:00
Srikar Dronamraju	247257b03b	powerpc/numa: Remove late request for home node associativity With commit ("powerpc/numa: Early request for home node associativity"), commit `2ea6263068` ("powerpc/topology: Get topology for shared processors at boot") which was requesting home node associativity becomes redundant. Hence remove the late request for home node associativity. Signed-off-by: Srikar Dronamraju <srikar@linux.vnet.ibm.com> Reported-by: Abdul Haleem <abdhalee@linux.vnet.ibm.com> Reviewed-by: Nathan Lynch <nathanl@linux.ibm.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/20200129135301.24739-6-srikar@linux.vnet.ibm.com	2020-03-04 22:44:31 +11:00
Srikar Dronamraju	dc909d8b0c	powerpc/numa: Early request for home node associativity Currently the kernel detects if its running on a shared lpar platform and requests home node associativity before the scheduler sched_domains are setup. However between the time NUMA setup is initialized and the request for home node associativity, workqueue initializes its per node cpumask. The per node workqueue possible cpumask may turn invalid after home node associativity resulting in weird situations like workqueue possible cpumask being a subset of workqueue online cpumask. This can be fixed by requesting home node associativity earlier just before NUMA setup. However at the NUMA setup time, kernel may not be in a position to detect if its running on a shared lpar platform. So request for home node associativity and if the request fails, fallback on the device tree property. Signed-off-by: Srikar Dronamraju <srikar@linux.vnet.ibm.com> Reported-by: Abdul Haleem <abdhalee@linux.vnet.ibm.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/20200129135301.24739-5-srikar@linux.vnet.ibm.com	2020-03-04 22:44:30 +11:00
Srikar Dronamraju	413e40550c	powerpc/numa: Use cpu node map of first sibling thread All the sibling threads of a core have to be part of the same node. To ensure that all the sibling threads map to the same node, always lookup/update the cpu-to-node map of the first thread in the core. Signed-off-by: Srikar Dronamraju <srikar@linux.vnet.ibm.com> Reported-by: Abdul Haleem <abdhalee@linux.vnet.ibm.com> Reviewed-by: Nathan Lynch <nathanl@linux.ibm.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/20200129135301.24739-4-srikar@linux.vnet.ibm.com	2020-03-04 22:44:30 +11:00
Srikar Dronamraju	76b7bfb173	powerpc/numa: Handle extra hcall_vphn error cases Currently code handles H_FUNCTION, H_SUCCESS, H_HARDWARE return codes. However hcall_vphn can return other return codes. Now it also handles H_PARAMETER return code. Also the rest return codes are handled under the default case. Signed-off-by: Srikar Dronamraju <srikar@linux.vnet.ibm.com> Reported-by: Abdul Haleem <abdhalee@linux.vnet.ibm.com> Reviewed-by: Nathan Lynch <nathanl@linux.ibm.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/20200129135301.24739-3-srikar@linux.vnet.ibm.com	2020-03-04 22:44:30 +11:00
Srikar Dronamraju	e7214ae9d8	powerpc/vphn: Check for error from hcall_vphn There is no value in unpacking associativity, if H_HOME_NODE_ASSOCIATIVITY hcall has returned an error. Signed-off-by: Srikar Dronamraju <srikar@linux.vnet.ibm.com> Reported-by: Abdul Haleem <abdhalee@linux.vnet.ibm.com> Reviewed-by: Nathan Lynch <nathanl@linux.ibm.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/20200129135301.24739-2-srikar@linux.vnet.ibm.com	2020-03-04 22:44:30 +11:00
Srikar Dronamraju	a05f0e5be4	powerpc/smp: Use nid as fallback for package_id package_id is to match cores that are part of the same chip. On PowerNV machines, package_id defaults to chip_id. However ibm,chip_id property is not present in device-tree of PowerVM LPARs. Hence lscpu output shows one core per socket and multiple cores. To overcome this, use nid as the package_id on PowerVM LPARs. Before the patch: Architecture: ppc64le Byte Order: Little Endian CPU(s): 128 On-line CPU(s) list: 0-127 Thread(s) per core: 8 Core(s) per socket: 1 <---------------------- Socket(s): 16 <---------------------- NUMA node(s): 2 Model: 2.2 (pvr 004e 0202) Model name: POWER9 (architected), altivec supported Hypervisor vendor: pHyp Virtualization type: para L1d cache: 32K L1i cache: 32K L2 cache: 512K L3 cache: 10240K NUMA node0 CPU(s): 0-63 NUMA node1 CPU(s): 64-127 # # cat /sys/devices/system/cpu/cpu0/topology/physical_package_id -1 After the patch: Architecture: ppc64le Byte Order: Little Endian CPU(s): 128 On-line CPU(s) list: 0-127 Thread(s) per core: 8 <--------------------- Core(s) per socket: 8 <--------------------- Socket(s): 2 NUMA node(s): 2 Model: 2.2 (pvr 004e 0202) Model name: POWER9 (architected), altivec supported Hypervisor vendor: pHyp Virtualization type: para L1d cache: 32K L1i cache: 32K L2 cache: 512K L3 cache: 10240K NUMA node0 CPU(s): 0-63 NUMA node1 CPU(s): 64-127 # # cat /sys/devices/system/cpu/cpu0/topology/physical_package_id 0 Now lscpu output is more in line with the system configuration. Signed-off-by: Srikar Dronamraju <srikar@linux.vnet.ibm.com> [mpe: Use pkg_id instead of ppid, tweak change log and comment] Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/20200129135121.24617-1-srikar@linux.vnet.ibm.com	2020-03-04 22:44:30 +11:00
Christophe Leroy	532d43a73c	powerpc/irq: Use current_stack_pointer in do_IRQ() Until commit `7306e83ccf` ("powerpc: Don't use CURRENT_THREAD_INFO to find the stack"), the current stack base address was obtained by calling current_thread_info(). That inline function was simply masking out the value of r1. In that commit, it was changed to using current_stack_pointer() (since renamed current_stack_frame()), which is a heavier function as it is an outline assembly function which cannot be inlined and which reads the content of the stack at 0(r1). Convert to using current_stack_pointer for geting r1 and masking out its value to obtain the base address of the stack pointer as before. Fixes: `7306e83ccf` ("powerpc: Don't use CURRENT_THREAD_INFO to find the stack") Signed-off-by: Christophe Leroy <christophe.leroy@c-s.fr> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/20200220115141.2707-5-mpe@ellerman.id.au	2020-03-04 22:44:30 +11:00
Christophe Leroy	0dec6e1cca	powerpc/irq: use IS_ENABLED() in check_stack_overflow() Instead of #ifdef, use IS_ENABLED(CONFIG_DEBUG_STACKOVERFLOW). This enable GCC to check for code validity even when the option is not selected. Signed-off-by: Christophe Leroy <christophe.leroy@c-s.fr> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/20200220115141.2707-4-mpe@ellerman.id.au	2020-03-04 22:44:29 +11:00
Christophe Leroy	84ab148930	powerpc/irq: Use current_stack_pointer in check_stack_overflow() The purpose of check_stack_overflow() is to verify that the stack has not overflowed. To really know whether the stack pointer is still within boundaries, the check must be done directly on the value of r1. So use current_stack_pointer, which returns the current value of r1, rather than current_stack_frame() which causes a frame to be created and then returns that value. Signed-off-by: Christophe Leroy <christophe.leroy@c-s.fr> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/20200220115141.2707-3-mpe@ellerman.id.au	2020-03-04 22:44:28 +11:00
Christophe Leroy	0e63f01517	powerpc: Add current_stack_pointer as a register global current_stack_frame() doesn't return the stack pointer, but the caller's stack frame. See commit `bfe9a2cfe9` ("powerpc: Reimplement __get_SP() as a function not a define") and commit `acf620ecf5` ("powerpc: Rename __get_SP() to current_stack_pointer()") for details. In some cases this is overkill or incorrect, as it doesn't return the current value of r1. So add a current_stack_pointer register global to get the value of r1 directly. Signed-off-by: Christophe Leroy <christophe.leroy@c-s.fr> [mpe: Split out of other patch, tweak change log] Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/20200220115141.2707-2-mpe@ellerman.id.au	2020-03-04 22:44:28 +11:00
Michael Ellerman	3d13e839e8	powerpc: Rename current_stack_pointer() to current_stack_frame() current_stack_pointer(), which was called __get_SP(), used to just return the value in r1. But that caused problems in some cases, so it was turned into a function in commit `bfe9a2cfe9` ("powerpc: Reimplement __get_SP() as a function not a define"). Because it's a function in a separate compilation unit to all its callers, it has the effect of causing a stack frame to be created, and then returns the address of that frame. This is good in some cases like those described in the above commit, but in other cases it's overkill, we just need to know what stack page we're on. On some other arches current_stack_pointer is just a register global giving the stack pointer, and we'd like to do that too. So rename our current_stack_pointer() to current_stack_frame() to make that possible. Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Reviewed-by: Christophe Leroy <christophe.leroy@c-s.fr> Link: https://lore.kernel.org/r/20200220115141.2707-1-mpe@ellerman.id.au	2020-03-04 22:44:28 +11:00
Kajol Jain	22697da36d	powerpc/kernel/sysfs: Add new config option PMU_SYSFS to enable PMU SPRs sysfs file creation Many of the performance monitoring unit (PMU) SPRs are exposed in the sysfs. This may not be a desirable since "perf" API is the primary interface to program PMU and collect counter data in the system. But that said, we cant remove these sysfs files since we dont whether anyone/anything is using them. So the patch adds a new CONFIG option 'CONFIG_PMU_SYSFS' (user selectable) to be used in sysfs file creation for PMU SPRs. New option by default is disabled, but can be enabled if user needs it. Tested this patch behaviour in powernv and pseries machines. Patch is also tested for pmac32_defconfig. Signed-off-by: Kajol Jain <kjain@linux.ibm.com> Tested-by: Nageswara R Sastry <nasastry@in.ibm.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/20200214080606.26872-2-kjain@linux.ibm.com	2020-03-04 22:44:28 +11:00
Madhavan Srinivasan	fcdb524d44	powerpc/kernel/sysfs: Refactor current sysfs.c An attempt to refactor the current sysfs.c file. To start with a big chuck of macro #defines and dscr functions are moved to start of the file. Secondly, HAS_ #define macros are cleanup based on CONFIG_ options Finally new HAS_ macro added: 1. HAS_PPC_PA6T (for PA6T) to separate out non-PMU SPRs. 2. HAS_PPC_PMC56 to separate out PMC SPR's from HAS_PPC_PMC_CLASSIC which come under CONFIG_PPC64. Signed-off-by: Madhavan Srinivasan <maddy@linux.ibm.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/20200214080606.26872-1-kjain@linux.ibm.com	2020-03-04 22:44:28 +11:00
Oliver O'Halloran	672e480aa2	powerpc/powernv: Add explicit fast-reboot support Add a way to manually invoke a fast-reboot rather than setting the NVRAM flag. The idea is to allow userspace to invoke a fast-reboot using the optional string argument to the reboot() system call, or using the xmon zr command so we don't need to leave around a persistent changes on a system to use the feature. Signed-off-by: Oliver O'Halloran <oohall@gmail.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/20200217024833.30580-2-oohall@gmail.com	2020-03-04 22:44:27 +11:00
Oliver O'Halloran	16985f2d25	powerpc/powernv: Treat an empty reboot string as default Treat an empty reboot cmd string the same as a NULL string. This squashes a spurious unsupported reboot message that sometimes gets out when using xmon. Signed-off-by: Oliver O'Halloran <oohall@gmail.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/20200217024833.30580-1-oohall@gmail.com	2020-03-04 22:44:27 +11:00
Michael Ellerman	d42c6d0f8d	powerpc/Makefile: Mark phony targets as PHONY Some of our phony targets are not marked as such. This can lead to confusing errors, eg: $ make clean $ touch install $ make install make: 'install' is up to date. $ Fix it by adding them to the PHONY variable which is marked phony in the top-level Makefile, or in scripts/Makefile.build for the boot Makefile. Suggested-by: Masahiro Yamada <masahiroy@kernel.org> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Reviewed-by: Masahiro Yamada <masahiroy@kernel.org> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/20200219000434.15872-1-mpe@ellerman.id.au	2020-03-04 22:44:27 +11:00
Christophe Leroy	6453f9ed9d	powerpc/mm: Don't kmap_atomic() in pte_offset_map() on PPC32 On PPC32, pte_offset_map() does a kmap_atomic() in order to support page tables allocated in high memory, just like ARM and x86/32. But since at least 2008 and commit `8054a3428f` ("powerpc: Remove dead CONFIG_HIGHPTE"), page tables are never allocated in high memory. When the page is in low mem, kmap_atomic() just returns the page address but still disable preemption and pagefault. And it is not an inlined function, so we suffer function call for no reason. Make pte_offset_map() the same as pte_offset_kernel() and make pte_unmap() void, in the same way as PPC64 which doesn't have HIGHMEM. Signed-off-by: Christophe Leroy <christophe.leroy@c-s.fr> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/03c97f0f6b3790d164822563be80f2fd4713a955.1581932480.git.christophe.leroy@c-s.fr	2020-03-04 22:44:27 +11:00
Alexey Kardashevskiy	c4b78169e3	powerpc/book3s64: Fix error handling in mm_iommu_do_alloc() The last jump to free_exit in mm_iommu_do_alloc() happens after page pointers in struct mm_iommu_table_group_mem_t were already converted to physical addresses. Thus calling put_page() on these physical addresses will likely crash. This moves the loop which calculates the pageshift and converts page struct pointers to physical addresses later after the point when we cannot fail; thus eliminating the need to convert pointers back. Fixes: `eb9d7a62c3` ("powerpc/mm_iommu: Fix potential deadlock") Reported-by: Jan Kara <jack@suse.cz> Signed-off-by: Alexey Kardashevskiy <aik@ozlabs.ru> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/20191223060351.26359-1-aik@ozlabs.ru	2020-03-04 22:44:27 +11:00
Greg Kroah-Hartman	f344f0ab99	powerpc/powernv: no need to check return value of debugfs_create functions When calling debugfs functions, there is no need to ever check the return value. The function can work or not, but the code logic should never do something different based on this. Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Reviewed-by: Oliver O'Halloran <oohall@gmail.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/20200209105901.1620958-6-gregkh@linuxfoundation.org	2020-03-04 22:44:26 +11:00
Greg Kroah-Hartman	e04906aa1f	powerpc/cell/axon_msi: no need to check return value of debugfs_create functions When calling debugfs functions, there is no need to ever check the return value. The function can work or not, but the code logic should never do something different based on this. Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/20200209105901.1620958-5-gregkh@linuxfoundation.org	2020-03-04 22:44:25 +11:00
Greg Kroah-Hartman	f3c0520195	powerpc/mm: ptdump: no need to check return value of debugfs_create functions When calling debugfs functions, there is no need to ever check the return value. The function can work or not, but the code logic should never do something different based on this. Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/20200209105901.1620958-4-gregkh@linuxfoundation.org	2020-03-04 22:44:25 +11:00
Greg Kroah-Hartman	08f6a7974a	powerpc/mm: book3s64: hash_utils: no need to check return value of debugfs_create functions When calling debugfs functions, there is no need to ever check the return value. The function can work or not, but the code logic should never do something different based on this. Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/20200209105901.1620958-3-gregkh@linuxfoundation.org	2020-03-04 22:44:25 +11:00
Greg Kroah-Hartman	c4fd527f52	powerpc/kvm: no need to check return value of debugfs_create functions When calling debugfs functions, there is no need to ever check the return value. The function can work or not, but the code logic should never do something different based on this. Because of this cleanup, we get to remove a few fields in struct kvm_arch that are now unused. Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> [mpe: Fix build error in kvm/timing.c, adapt kvmppc_remove_cpu_debugfs()] Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/20200209105901.1620958-2-gregkh@linuxfoundation.org	2020-03-04 22:44:25 +11:00
Greg Kroah-Hartman	860286cf33	powerpc/kernel: no need to check return value of debugfs_create functions When calling debugfs functions, there is no need to ever check the return value. The function can work or not, but the code logic should never do something different based on this. Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/20200209105901.1620958-1-gregkh@linuxfoundation.org	2020-03-04 22:44:25 +11:00
Christophe JAILLET	88654d5b44	powerpc/83xx: Add some error handling in 'quirk_mpc8360e_qe_enet10()' In some error handling path, we should call "of_node_put(np_par)" or some resource may be leaking in case of error. Fixes: `8159df72d4` ("83xx: add support for the kmeter1 board.") Signed-off-by: Christophe JAILLET <christophe.jaillet@wanadoo.fr> Acked-by: Scott Wood <oss@buserror.net> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/20200208140920.7652-1-christophe.jaillet@wanadoo.fr	2020-03-04 22:44:25 +11:00
Christophe JAILLET	365ad0b60d	powerpc/83xx: Fix some typo in some warning message "couldn;t" should be "couldn't". Signed-off-by: Christophe JAILLET <christophe.jaillet@wanadoo.fr> Acked-by: Scott Wood <oss@buserror.net> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/20200208140904.7521-1-christophe.jaillet@wanadoo.fr	2020-03-04 22:44:17 +11:00
Desnes A. Nunes do Rosario	fc37a1632d	powerpc: fix hardware PMU exception bug on PowerVM compatibility mode systems PowerVM systems running compatibility mode on a few Power8 revisions are still vulnerable to the hardware defect that loses PMU exceptions arriving prior to a context switch. The software fix for this issue is enabled through the CPU_FTR_PMAO_BUG cpu_feature bit, nevertheless this bit also needs to be set for PowerVM compatibility mode systems. Fixes: `68f2f0d431` ("powerpc: Add a cpu feature CPU_FTR_PMAO_BUG") Signed-off-by: Desnes A. Nunes do Rosario <desnesn@linux.ibm.com> Reviewed-by: Leonardo Bras <leonardo@linux.ibm.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/20200227134715.9715-1-desnesn@linux.ibm.com	2020-02-28 10:09:33 +11:00
Christophe Leroy	2efc7c085f	powerpc/32: drop get_pteptr() Commit `8d30c14cab` ("powerpc/mm: Rework I$/D$ coherency (v3)") and commit `90ac19a8b2` ("[POWERPC] Abolish iopa(), mm_ptov(), io_block_mapping() from arch/powerpc") removed the use of get_pteptr() outside of mm/pgtable_32.c In mm/pgtable_32.c, the only user of get_pteptr() is change_page_attr() which operates on kernel context and on lowmem pages only. Make virt_to_kpte() available outside of mm/mem.c and use it instead of get_pteptr(), and drop get_pteptr() Signed-off-by: Christophe Leroy <christophe.leroy@c-s.fr> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/788378c6c3ba5c5298caab7c7f95e6c3c88244b8.1578558199.git.christophe.leroy@c-s.fr	2020-02-26 10:34:41 +11:00
Christophe Leroy	0b1c524caa	powerpc/32: refactor pmd_offset(pud_offset(pgd_offset... At several places pmd pointer is retrieved through the same action: pmd = pmd_offset(pud_offset(pgd_offset(mm, addr), addr), addr); or pmd = pmd_offset(pud_offset(pgd_offset_k(addr), addr), addr); Refactor this by implementing two helpers pmd_ptr() and pmd_ptr_k() This will help when adding the p4d level. Signed-off-by: Christophe Leroy <christophe.leroy@c-s.fr> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/7b065c5be35726af4066cab238ee35cabceda1fa.1578558199.git.christophe.leroy@c-s.fr	2020-02-26 10:34:40 +11:00
Christophe Leroy	05642cf728	powerpc/32: don't restore r0, r6-r8 on exception entry path after trace_hardirqs_off() Since commit `b86fb88855` ("powerpc/32: implement fast entry for syscalls on non BOOKE") and commit `1a4b739bbb` ("powerpc/32: implement fast entry for syscalls on BOOKE"), syscalls don't use the exception entry path anymore. It is therefore pointless to restore r0 and r6-r8 after calling trace_hardirqs_off(). In the meantime, drop the '2:' label which is unused and misleading. Signed-off-by: Christophe Leroy <christophe.leroy@c-s.fr> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/d2c6dc65d27e83964eb05f16a126161ab6455eea.1578388585.git.christophe.leroy@c-s.fr	2020-02-26 10:34:40 +11:00
Naveen N. Rao	cb0cc635c7	powerpc: Include .BTF section Selecting CONFIG_DEBUG_INFO_BTF results in the below warning from ld: ld: warning: orphan section `.BTF' from `.btf.vmlinux.bin.o' being placed in section `.BTF' Include .BTF section in vmlinux explicitly to fix the same. Signed-off-by: Naveen N. Rao <naveen.n.rao@linux.vnet.ibm.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/20200220113132.857132-1-naveen.n.rao@linux.vnet.ibm.com	2020-02-24 22:04:07 +11:00
Ravi Bangoria	e08658a657	powerpc/watchpoint: Don't call dar_within_range() for Book3S DAR is set to the first byte of overlap between actual access and watched range at DSI on Book3S processor. But actual access range might or might not be within user asked range. So for Book3S, it must not call dar_within_range(). This revert portion of commit `39413ae009` ("powerpc/hw_breakpoints: Rewrite 8xx breakpoints to allow any address range size."). Before patch: # ./tools/testing/selftests/powerpc/ptrace/perf-hwbreak ... TESTED: No overlap FAILED: Partial overlap: 0 != 2 TESTED: Partial overlap TESTED: No overlap FAILED: Full overlap: 0 != 2 failure: perf_hwbreak After patch: TESTED: No overlap TESTED: Partial overlap TESTED: Partial overlap TESTED: No overlap TESTED: Full overlap success: perf_hwbreak Fixes: `39413ae009` ("powerpc/hw_breakpoints: Rewrite 8xx breakpoints to allow any address range size.") Reported-by: Michael Ellerman <mpe@ellerman.id.au> Signed-off-by: Ravi Bangoria <ravi.bangoria@linux.ibm.com> Reviewed-by: Christophe Leroy <christophe.leroy@c-s.fr> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/20200222082049.330435-1-ravi.bangoria@linux.ibm.com	2020-02-24 11:19:35 +11:00
Christophe Leroy	e1347a020b	powerpc/32s: Slenderize _tlbia() for powerpc 603/603e _tlbia() is a function used only on 603/603e core, ie on CPUs which don't have a hash table. _tlbia() uses the tlbia macro which implements a loop of 1024 tlbie. On the 603/603e core, flushing the entire TLB requires no more than 32 tlbie. Replace tlbia by a loop of 32 tlbie. Signed-off-by: Christophe Leroy <christophe.leroy@c-s.fr> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/12f4f4f0ff89aeab3b937fc96c84fb35e1b2517e.1580748445.git.christophe.leroy@c-s.fr	2020-02-19 22:46:11 +11:00
Libor Pechacek	a83836dbc5	powerpc/pseries: Avoid NULL pointer dereference when drmem is unavailable In guests without hotplugagble memory drmem structure is only zero initialized. Trying to manipulate DLPAR parameters results in a crash. $ echo "memory add count 1" > /sys/kernel/dlpar Oops: Kernel access of bad area, sig: 11 [#1] LE PAGE_SIZE=64K MMU=Hash SMP NR_CPUS=2048 NUMA pSeries ... NIP: c0000000000ff294 LR: c0000000000ff248 CTR: 0000000000000000 REGS: c0000000fb9d3880 TRAP: 0300 Tainted: G E (5.5.0-rc6-2-default) MSR: 8000000000009033 <SF,EE,ME,IR,DR,RI,LE> CR: 28242428 XER: 20000000 CFAR: c0000000009a6c10 DAR: 0000000000000010 DSISR: 40000000 IRQMASK: 0 ... NIP dlpar_memory+0x6e4/0xd00 LR dlpar_memory+0x698/0xd00 Call Trace: dlpar_memory+0x698/0xd00 (unreliable) handle_dlpar_errorlog+0xc0/0x190 dlpar_store+0x198/0x4a0 kobj_attr_store+0x30/0x50 sysfs_kf_write+0x64/0x90 kernfs_fop_write+0x1b0/0x290 __vfs_write+0x3c/0x70 vfs_write+0xd0/0x260 ksys_write+0xdc/0x130 system_call+0x5c/0x68 Taking closer look at the code, I can see that for_each_drmem_lmb is a macro expanding into `for (lmb = &drmem_info->lmbs[0]; lmb <= &drmem_info->lmbs[drmem_info->n_lmbs - 1]; lmb++)`. When drmem_info->lmbs is NULL, the loop would iterate through the whole address range if it weren't stopped by the NULL pointer dereference on the next line. This patch aligns for_each_drmem_lmb and for_each_drmem_lmb_in_range macro behavior with the common C semantics, where the end marker does not belong to the scanned range, and alters get_lmb_range() semantics. As a side effect, the wraparound observed in the crash is prevented. Fixes: `6c6ea53725` ("powerpc/mm: Separate ibm, dynamic-memory data from DT format") Cc: stable@vger.kernel.org # v4.16+ Signed-off-by: Libor Pechacek <lpechacek@suse.cz> Signed-off-by: Michal Suchanek <msuchanek@suse.de> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/20200131132829.10281-1-msuchanek@suse.de	2020-02-19 22:46:11 +11:00
Christophe Leroy	c06f0aff03	powerpc: Don't use thread struct for saving SRR0/1 on syscall. CR0 can be saved later, and CTR can also be used for saving. Keep SRR1 in r9 and stash SRR0 in CTR, this avoids using thread_struct in memory for that. Saves 3 cycles (ie 1%) in null_syscall selftest on 8xx. Signed-off-by: Christophe Leroy <christophe.leroy@c-s.fr> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/b94c3bc03bac9431fec2dadb686384c481889422.1580470483.git.christophe.leroy@c-s.fr	2020-02-19 22:46:08 +11:00
Christophe Leroy	9e27086292	powerpc/32: Warn and return ENOSYS on syscalls from kernel Since commit `b86fb88855` ("powerpc/32: implement fast entry for syscalls on non BOOKE") and commit `1a4b739bbb` ("powerpc/32: implement fast entry for syscalls on BOOKE"), syscalls from kernel are unexpected and can have catastrophic consequences as it will destroy the kernel stack. Test MSR_PR on syscall entry. In case syscall is from kernel, emit a warning and return ENOSYS error. Signed-off-by: Christophe Leroy <christophe.leroy@c-s.fr> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/8ee3bdbbdfdfc64ca7001e90c43b2aee6f333578.1580470482.git.christophe.leroy@c-s.fr	2020-02-19 22:46:08 +11:00
Christophe Leroy	030e347430	powerpc/32s: Don't flush all TLBs when flushing one page When flushing any memory range, the flushing function flushes all TLBs. When (start) and (end - 1) are in the same memory page, flush that page instead. Signed-off-by: Christophe Leroy <christophe.leroy@c-s.fr> Reviewed-by: Segher Boessenkool <segher@kernel.crashing.org> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/b30b2eae6960502eaf0d9e36c60820b839693c33.1580542939.git.christophe.leroy@c-s.fr	2020-02-19 22:46:08 +11:00
Sourabh Jain	d8e73458f3	powerpc/fadump: sysfs for fadump memory reservation Add a sys interface to allow querying the memory reserved by FADump for saving the crash dump. Also added Documentation/ABI for the new sysfs file. Signed-off-by: Sourabh Jain <sourabhjain@linux.ibm.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/20191211160910.21656-7-sourabhjain@linux.ibm.com	2020-02-19 22:46:07 +11:00
Sourabh Jain	8852c07a88	powerpc/powernv: Move core and fadump_release_opalcore under new kobject The /sys/firmware/opal/core and /sys/kernel/fadump_release_opalcore sysfs files are used to export and release the OPAL memory on PowerNV platform. let's organize them into a new kobject under /sys/firmware/opal/mpipl/ directory. A symlink is added to maintain the backward compatibility for /sys/firmware/opal/core sysfs file. Signed-off-by: Sourabh Jain <sourabhjain@linux.ibm.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/20191211160910.21656-5-sourabhjain@linux.ibm.com	2020-02-19 21:07:10 +11:00
Sourabh Jain	d418b19f34	powerpc/fadump: Reorganize /sys/kernel/fadump_* sysfs files As the number of FADump sysfs files increases it is hard to manage all of them inside /sys/kernel directory. It's better to have all the FADump related sysfs files in a dedicated directory /sys/kernel/fadump. But in order to maintain backward compatibility a symlink has been added for every sysfs that has moved to new location. As the FADump sysfs files are now part of a dedicated directory there is no need to prefix their name with fadump_, hence sysfs file names are also updated. For example fadump_enabled sysfs file is now referred as enabled. Also consolidate ABI documentation for all the FADump sysfs files in a single file Documentation/ABI/testing/sysfs-kernel-fadump. Signed-off-by: Sourabh Jain <sourabhjain@linux.ibm.com> Tested-by: Michal Suchanek <msuchanek@suse.de> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/20191211160910.21656-4-sourabhjain@linux.ibm.com	2020-02-19 21:07:09 +11:00
Christophe Leroy	ba32f4b021	powerpc/process: Remove unneccessary #ifdef CONFIG_PPC64 in copy_thread_tls() is_32bit_task() exists on both PPC64 and PPC32, no need of an ifdefery. Signed-off-by: Christophe Leroy <christophe.leroy@c-s.fr> Reviewed-by: Michal Suchanek <msuchanek@suse.de> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/6ecbda05b4119c40222dc8ec284604e1597c9bff.1580327381.git.christophe.leroy@c-s.fr	2020-02-19 21:07:09 +11:00
Vaibhav Jain	72c4ebbac4	powerpc/papr_scm: Mark papr_scm_ndctl() as static Function papr_scm_ndctl() is neither exported from the module nor called directly from outside 'papr.c' hence should be marked 'static'. Signed-off-by: Vaibhav Jain <vaibhav@linux.ibm.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/20200130040206.79998-1-vaibhav@linux.ibm.com	2020-02-19 21:07:09 +11:00
Oliver O'Halloran	8cbb00a901	powerpc/pseries/Makefile: Remove CONFIG_PPC_PSERIES check The pseries Makefile (arch/powerpc/platforms/pseries/Makefile) is only included by the platform Makefile (arch/powerpc/platform/Makefile) when CONFIG_PPC_PSERIES is selected, so checking for CONFIG_PPC_PSERIES in the pseries Makefile is pointless. Signed-off-by: Oliver O'Halloran <oohall@gmail.com> Reviewed-by: Tyrel Datwyler <tyreld@linux.ibm.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/20200130063153.19915-2-oohall@gmail.com	2020-02-19 21:07:08 +11:00
Oliver O'Halloran	f98df5ed0a	powerpc/pseries/vio: Remove stray #ifdef CONFIG_PPC_PSERIES vio.c is in platforms/pseries, which is only built if PPC_PSERIES=y. In other words, this ifdef is pointless. Signed-off-by: Oliver O'Halloran <oohall@gmail.com> Reviewed-by: Tyrel Datwyler <tyreld@linux.ibm.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/20200130063153.19915-1-oohall@gmail.com	2020-02-19 21:07:08 +11:00
Christophe Leroy	9eb425b2e0	powerpc/entry: Fix an #if which should be an #ifdef in entry_32.S Fixes: `12c3f1fd87` ("powerpc/32s: get rid of CPU_FTR_601 feature") Cc: stable@vger.kernel.org # v5.4+ Signed-off-by: Christophe Leroy <christophe.leroy@c-s.fr> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/a99fc0ad65b87a1ba51cfa3e0e9034ee294c3e07.1582034961.git.christophe.leroy@c-s.fr	2020-02-19 10:35:22 +11:00
Oliver O'Halloran	066bc3576e	powerpc/xmon: Fix whitespace handling in getstring() The ls (lookup symbol) and zr (reboot) commands use xmon's getstring() helper to read a string argument from the xmon prompt. This function skips over leading whitespace, but doesn't check if the first "non-whitespace" character is a newline which causes some odd behaviour (<enter> indicates a the enter key was pressed): 0:mon> ls printk<enter> printk: c0000000001680c4 0:mon> ls<enter> printk<enter> Symbol ' printk' not found. 0:mon> With commit `2d9b332d99` ("powerpc/xmon: Allow passing an argument to ppc_md.restart()") we have a similar problem with the zr command. Previously zr took no arguments so "zr<enter> would trigger a reboot. With that patch applied a second newline needs to be sent in order for the reboot to occur. Fix this by checking if the leading whitespace ended on a newline: 0:mon> ls<enter> Symbol '' not found. Fixes: `2d9b332d99` ("powerpc/xmon: Allow passing an argument to ppc_md.restart()") Reported-by: Michael Ellerman <mpe@ellerman.id.au> Signed-off-by: Oliver O'Halloran <oohall@gmail.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/20200217041343.2454-1-oohall@gmail.com	2020-02-18 21:31:12 +11:00
Christophe Leroy	477f3488a9	powerpc/6xx: Fix power_save_ppc32_restore() with CONFIG_VMAP_STACK power_save_ppc32_restore() is called during exception entry, before re-enabling the MMU. It substracts KERNELBASE from the address of nap_save_msscr0 to access it. With CONFIG_VMAP_STACK enabled, data MMU translation has already been re-enabled, so power_save_ppc32_restore() has to access nap_save_msscr0 by its virtual address. Reported-by: Larry Finger <Larry.Finger@lwfinger.net> Signed-off-by: Christophe Leroy <christophe.leroy@c-s.fr> Fixes: `cd08f109e2` ("powerpc/32s: Enable CONFIG_VMAP_STACK") Tested-by: Larry Finger <Larry.Finger@lwfinger.net> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/7bce32ccbab3ba3e3e0f27da6961bf6313df97ed.1581663140.git.christophe.leroy@c-s.fr	2020-02-18 21:31:12 +11:00
Christophe Leroy	5a528eb679	powerpc/chrp: Fix enter_rtas() with CONFIG_VMAP_STACK With CONFIG_VMAP_STACK, data MMU has to be enabled to read data on the stack. Fixes: `cd08f109e2` ("powerpc/32s: Enable CONFIG_VMAP_STACK") Signed-off-by: Christophe Leroy <christophe.leroy@c-s.fr> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/d2330584f8c42d3039896e2b56f5d39676dc919c.1581669558.git.christophe.leroy@c-s.fr	2020-02-18 21:31:11 +11:00
Christophe Leroy	232ca1eeca	powerpc/32s: Fix DSI and ISI exceptions for CONFIG_VMAP_STACK hash_page() needs to read page tables from kernel memory. When entire kernel memory is mapped by BATs, which is normally the case when CONFIG_STRICT_KERNEL_RWX is not set, it works even if the page hosting the page table is not referenced in the MMU hash table. However, if the page where the page table resides is not covered by a BAT, a DSI fault can be encountered from hash_page(), and it loops forever. This can happen when CONFIG_STRICT_KERNEL_RWX is selected and the alignment of the different regions is too small to allow covering the entire memory with BATs. This also happens when CONFIG_DEBUG_PAGEALLOC is selected or when booting with 'nobats' flag. Also, if the page containing the kernel stack is not present in the MMU hash table, registers cannot be saved and a recursive DSI fault is encountered. To allow hash_page() to properly do its job at all time and load the MMU hash table whenever needed, it must run with data MMU disabled. This means it must be called before re-enabling data MMU. To allow this, registers clobbered by hash_page() and create_hpte() have to be saved in the thread struct together with SRR0, SSR1, DAR and DSISR. It is also necessary to ensure that DSI prolog doesn't overwrite regs saved by prolog of the current running exception. That means: - DSI can only use SPRN_SPRG_SCRATCH0 - Exceptions must free SPRN_SPRG_SCRATCH0 before writing to the stack. This also fixes the Oops reported by Erhard when create_hpte() is called by add_hash_page(). Due to prolog size increase, a few more exceptions had to get split in two parts. Fixes: `cd08f109e2` ("powerpc/32s: Enable CONFIG_VMAP_STACK") Reported-by: Erhard F. <erhard_f@mailbox.org> Signed-off-by: Christophe Leroy <christophe.leroy@c-s.fr> Tested-by: Erhard F. <erhard_f@mailbox.org> Tested-by: Larry Finger <Larry.Finger@lwfinger.net> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://bugzilla.kernel.org/show_bug.cgi?id=206501 Link: https://lore.kernel.org/r/64a4aa44686e9fd4b01333401367029771d9b231.1581761633.git.christophe.leroy@c-s.fr	2020-02-18 21:31:11 +11:00
Gustavo Luiz Duarte	2464cc4c34	powerpc/tm: Fix clearing MSR[TS] in current when reclaiming on signal delivery After a treclaim, we expect to be in non-transactional state. If we don't clear the current thread's MSR[TS] before we get preempted, then tm_recheckpoint_new_task() will recheckpoint and we get rescheduled in suspended transaction state. When handling a signal caught in transactional state, handle_rt_signal64() calls get_tm_stackpointer() that treclaims the transaction using tm_reclaim_current() but without clearing the thread's MSR[TS]. This can cause the TM Bad Thing exception below if later we pagefault and get preempted trying to access the user's sigframe, using __put_user(). Afterwards, when we are rescheduled back into do_page_fault() (but now in suspended state since the thread's MSR[TS] was not cleared), upon executing 'rfid' after completion of the page fault handling, the exception is raised because a transition from suspended to non-transactional state is invalid. Unexpected TM Bad Thing exception at c00000000000de44 (msr 0x8000000302a03031) tm_scratch=800000010280b033 Oops: Unrecoverable exception, sig: 6 [#1] LE PAGE_SIZE=64K MMU=Hash SMP NR_CPUS=2048 NUMA pSeries CPU: 25 PID: 15547 Comm: a.out Not tainted 5.4.0-rc2 #32 NIP: c00000000000de44 LR: c000000000034728 CTR: 0000000000000000 REGS: c00000003fe7bd70 TRAP: 0700 Not tainted (5.4.0-rc2) MSR: 8000000302a03031 <SF,VEC,VSX,FP,ME,IR,DR,LE,TM[SE]> CR: 44000884 XER: 00000000 CFAR: c00000000000dda4 IRQMASK: 0 PACATMSCRATCH: 800000010280b033 GPR00: c000000000034728 c000000f65a17c80 c000000001662800 00007fffacf3fd78 GPR04: 0000000000001000 0000000000001000 0000000000000000 c000000f611f8af0 GPR08: 0000000000000000 0000000078006001 0000000000000000 000c000000000000 GPR12: c000000f611f84b0 c00000003ffcb200 0000000000000000 0000000000000000 GPR16: 0000000000000000 0000000000000000 0000000000000000 0000000000000000 GPR20: 0000000000000000 0000000000000000 0000000000000000 c000000f611f8140 GPR24: 0000000000000000 00007fffacf3fd68 c000000f65a17d90 c000000f611f7800 GPR28: c000000f65a17e90 c000000f65a17e90 c000000001685e18 00007fffacf3f000 NIP [c00000000000de44] fast_exception_return+0xf4/0x1b0 LR [c000000000034728] handle_rt_signal64+0x78/0xc50 Call Trace: [c000000f65a17c80] [c000000000034710] handle_rt_signal64+0x60/0xc50 (unreliable) [c000000f65a17d30] [c000000000023640] do_notify_resume+0x330/0x460 [c000000f65a17e20] [c00000000000dcc4] ret_from_except_lite+0x70/0x74 Instruction dump: 7c4ff120 e8410170 7c5a03a6 38400000 f8410060 e8010070 e8410080 e8610088 60000000 60000000 e8810090 e8210078 <4c000024> 48000000 e8610178 88ed0989 ---[ end trace 93094aa44b442f87 ]--- The simplified sequence of events that triggers the above exception is: ... # userspace in NON-TRANSACTIONAL state tbegin # userspace in TRANSACTIONAL state signal delivery # kernelspace in SUSPENDED state handle_rt_signal64() get_tm_stackpointer() treclaim # kernelspace in NON-TRANSACTIONAL state __put_user() page fault happens. We will never get back here because of the TM Bad Thing exception. page fault handling kicks in and we voluntarily preempt ourselves do_page_fault() __schedule() __switch_to(other_task) our task is rescheduled and we recheckpoint because the thread's MSR[TS] was not cleared __switch_to(our_task) switch_to_tm() tm_recheckpoint_new_task() trechkpt # kernelspace in SUSPENDED state The page fault handling resumes, but now we are in suspended transaction state do_page_fault() completes rfid <----- trying to get back where the page fault happened (we were non-transactional back then) TM Bad Thing # illegal transition from suspended to non-transactional This patch fixes that issue by clearing the current thread's MSR[TS] just after treclaim in get_tm_stackpointer() so that we stay in non-transactional state in case we are preempted. In order to make treclaim and clearing the thread's MSR[TS] atomic from a preemption perspective when CONFIG_PREEMPT is set, preempt_disable/enable() is used. It's also necessary to save the previous value of the thread's MSR before get_tm_stackpointer() is called so that it can be exposed to the signal handler later in setup_tm_sigcontexts() to inform the userspace MSR at the moment of the signal delivery. Found with tm-signal-context-force-tm kernel selftest. Fixes: `2b0a576d15` ("powerpc: Add new transactional memory state to the signal context") Cc: stable@vger.kernel.org # v3.9 Signed-off-by: Gustavo Luiz Duarte <gustavold@linux.ibm.com> Acked-by: Michael Neuling <mikey@neuling.org> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/20200211033831.11165-1-gustavold@linux.ibm.com	2020-02-18 21:30:42 +11:00
Christophe Leroy	a4031afb9d	powerpc/8xx: Fix clearing of bits 20-23 in ITLB miss In ITLB miss handled the line supposed to clear bits 20-23 on the L2 ITLB entry is buggy and does indeed nothing, leading to undefined value which could allow execution when it shouldn't. Properly do the clearing with the relevant instruction. Fixes: `74fabcadfd` ("powerpc/8xx: don't use r12/SPRN_SPRG_SCRATCH2 in TLB Miss handlers") Cc: stable@vger.kernel.org # v5.0+ Signed-off-by: Christophe Leroy <christophe.leroy@c-s.fr> Reviewed-by: Leonardo Bras <leonardo@linux.ibm.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/4f70c2778163affce8508a210f65d140e84524b4.1581272050.git.christophe.leroy@c-s.fr	2020-02-17 12:47:06 +11:00
Christophe Leroy	50a175dd18	powerpc/hugetlb: Fix 8M hugepages on 8xx With HW assistance all page tables must be 4k aligned, the 8xx drops the last 12 bits during the walk. Redefine HUGEPD_SHIFT_MASK to mask last 12 bits out. HUGEPD_SHIFT_MASK is used to for alignment of page table cache. Fixes: `22569b881d` ("powerpc/8xx: Enable 8M hugepage support with HW assistance") Cc: stable@vger.kernel.org # v5.0+ Signed-off-by: Christophe Leroy <christophe.leroy@c-s.fr> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/778b1a248c4c7ca79640eeff7740044da6a220a0.1581264115.git.christophe.leroy@c-s.fr	2020-02-17 12:47:06 +11:00
Christophe Leroy	f2b67ef90b	powerpc/hugetlb: Fix 512k hugepages on 8xx with 16k page size Commit `55c8fc3f49` ("powerpc/8xx: reintroduce 16K pages with HW assistance") redefined pte_t as a struct of 4 pte_basic_t, because in 16K pages mode there are four identical entries in the page table. But the size of hugepage tables is calculated based of the size of (void *). Therefore, we end up with page tables of size 1k instead of 4k for 512k pages. As 512k hugepage tables are the same size as standard page tables, ie 4k, use the standard page tables instead of PGT_CACHE tables. Fixes: `3fb69c6a1a` ("powerpc/8xx: Enable 512k hugepage support with HW assistance") Cc: stable@vger.kernel.org # v5.0+ Signed-off-by: Christophe Leroy <christophe.leroy@c-s.fr> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/90ec56a2315be602494619ed0223bba3b0b8d619.1580997007.git.christophe.leroy@c-s.fr	2020-02-17 12:47:05 +11:00
Sam Bobroff	d4f194ed9e	powerpc/eeh: Fix deadlock handling dead PHB Recovering a dead PHB can currently cause a deadlock as the PCI rescan/remove lock is taken twice. This is caused as part of an existing bug in eeh_handle_special_event(). The pe is processed while traversing the PHBs even though the pe is unrelated to the loop. This causes the pe to be, incorrectly, processed more than once. Untangling this section can move the pe processing out of the loop and also outside the locked section, correcting both problems. Fixes: `2e25505147` ("powerpc/eeh: Fix crash when edev->pdev changes") Cc: stable@vger.kernel.org # 5.4+ Signed-off-by: Sam Bobroff <sbobroff@linux.ibm.com> Reviewed-by: Frederic Barrat <fbarrat@linux.ibm.com> Tested-by: Frederic Barrat <fbarrat@linux.ibm.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/0547e82dbf90ee0729a2979a8cac5c91665c621f.1581051445.git.sbobroff@linux.ibm.com	2020-02-17 12:47:05 +11:00

1 2 3 4 5 ...

21325 Commits